I am parsing XML records using Ruby. The XML file has the following data structure:
<row Id="27" PostTypeId="2" ParentId="11" CreationDate="2008-08-01T12:17:19.357" Score="13" Body="<p>@jeff</p>

<p>IMHO yours seems a little long. However it does seem a lit
tle more robust with support for "yesterday" and "years". But in my experience when this is used the person is most likely to view the content in the first 30 days. It is only the really har
dcore people that come after that. So that is why I usually elect to keep this short and simple.</p>

<p>This is the method I am currently using on one of my websites. This only re
turns a relative day, hour, time. And then the user has to slap on "ago" in the output.</p>

<pre><code>public static string ToLongString(this TimeSpan time)<br&g
t;{<br> string output = String.Empty;<br><br> if (time.Days &gt; 0)<br> output += time.Days + " days ";<br><br> if ((time.Days == 0 || time.Days =
= 1) &amp;&amp; time.Hours &gt; 0)<br> output += time.Hours + " hr ";<br><br> if (time.Days == 0 &amp;&amp; time.Minutes &gt; 0)<br> outp
ut += time.Minutes + " min ";<br><br> if (output.Length == 0)<br> output += time.Seconds + " sec";<br><br> return output.Trim();<br>}<br>
</code></pre>" OwnerUserId="17" LastEditorUserId="17" LastEditorDisplayName="Nick Berardi" LastEditDate="2008-08-01T13:16:49.127" LastActivityDate="2008-08-01T13:16:49.127" CommentCount="1" CommunityO
wnedDate="2009-09-04T13:15:59.820" />
But some records don't have all of the attributes:
<row Id="29" PostTypeId="2" ParentId="13" CreationDate="2008-08-01T12:19:17.417" Score="18" Body="<p>There are no HTTP headers that will report the clients timezone so far although it has been suggested t
o include it in the HTTP specification.</p>

<p>If it was me, I would probably try to fetch the timezone using clientside JavaScript and then submit it to the server using Ajax or so
mething.</p>" OwnerUserId="19" LastActivityDate="2008-08-01T12:19:17.417" CommentCount="0" />
My Ruby parser goes through these XML records and inserts them into a MySQL database:
def on_start_element(element, attributes)
  if element == 'row'
    @post_st.execute(attributes['Id'], attributes['PostTypeId'], attributes['AcceptedAnswerId'], attributes['ParentId'], attributes['Score'], attributes['ViewCount'],
      attributes['Body'], attributes['OwnerUserId'] == nil ? -1 : attributes['OwnerUserId'], attributes['LastEditorUserId'], attributes['LastEditorDisplayName'],
      DateTime.parse(attributes['LastEditDate']).to_time.strftime("%F %T"), DateTime.parse(attributes['LastActivityDate']).to_time.strftime("%F %T"), attributes['Title'] == nil ? '' : attributes['Title'],
      attributes['AnswerCount'] == nil ? 0 : attributes['AnswerCount'], attributes['CommentCount'] == nil ? 0 : attributes['CommentCount'],
      attributes['FavoriteCount'] == nil ? 0 : attributes['FavoriteCount'], DateTime.parse(attributes['CreationDate']).to_time.strftime("%F %T"))
    post_id = attributes['Id']
    tags = attributes['Tags'] == nil ? '' : attributes['Tags']
    tags.scan(/<(.*?)>/).each do |tag_name|
      tag_id = insert_or_find_tag(tag_name[0])
      @post_ot_tag_insert_st.execute(post_id, tag_id)
    end
  end
end
But while processing the second record I get the following error. I know it is the second record because of what has been inserted into my database: the last row inserted is the one with Id=27.
/format.rb:1031:in `dup': can't dup NilClass (TypeError)
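The error can be reproduced in isolation. This is only a minimal sketch, and it assumes the missing attribute comes back as nil from the attributes hash. The hash below is a made-up stand-in for what the parser passes in, and the exact message text depends on the Ruby version:

```ruby
require 'date'

# Stand-in for the attributes of the second record: there is no
# 'LastEditDate' key, so the lookup returns nil.
attributes = { 'Id' => '29', 'LastActivityDate' => '2008-08-01T12:19:17.417' }

begin
  DateTime.parse(attributes['LastEditDate']) # nil is passed to DateTime.parse
  result = :parsed
rescue TypeError => e
  result = :type_error
  puts "TypeError: #{e.message}"
end
```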
I was wondering if it is related to the missing attributes. If a record is missing some of the values that I expect in the database, how should I handle that, or how do I set them to some kind of default value? For example, if a date is missing, set it to some default date value.
This is the line the error points to:
DateTime.parse(attributes['LastEditDate']).to_time.strftime("%F %T"), DateTime.parse(attributes['LastActivityDate']).to_time.strftime("%F %T"), attributes['Title'] == nil ? '' : attributes['Title'],
I think it is complaining about LastEditDate?
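One way the default-value idea could look is sketched below. It is not a definitive fix: `format_date` and `DEFAULT_DATE` are hypothetical names that are not part of my parser, and the placeholder date is an assumption (passing nil so that MySQL stores NULL may be the better choice if the column allows it). The sketch also skips the `to_time` call from my code so that the result does not depend on the local time zone.

```ruby
require 'date'

# Assumed placeholder for missing dates; not taken from the real schema.
DEFAULT_DATE = '1970-01-01 00:00:00'

# Hypothetical helper: returns the date formatted as "YYYY-MM-DD HH:MM:SS",
# or the default when the attribute is missing or blank.
def format_date(value, default = DEFAULT_DATE)
  return default if value.nil? || value.strip.empty?
  DateTime.parse(value).strftime('%F %T')
end

# Stand-in for the attributes of the second record (no 'LastEditDate').
attributes = { 'LastActivityDate' => '2008-08-01T12:19:17.417' }

last_edit     = format_date(attributes['LastEditDate'])     # falls back to the default
last_activity = format_date(attributes['LastActivityDate']) # parsed normally

puts last_edit
puts last_activity
```

In the `execute` call, each `DateTime.parse(...)` argument would then become `format_date(attributes['LastEditDate'])` and so on, in the same way the other attributes already use `== nil ? default : value`.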