I'm writing an XML parser in C# .Net and for this particular chunk of XML I'm having a difficult time conceptualizing how I would parse and store it's relationship in a database. See sample xml below:
<Article name="x" date="y">
<Words>
<Category ID="1000" Name="Person">
<Keyword ID="1124" Name="Adult">
<Keyword ID="1125" Name="Female" />
</Keyword>
</Category>
<Category ID="1000" Name="Person">
<Keyword ID="1124" Name="Adult">
<Keyword ID="1126" Name="Male" />
</Keyword>
</Category>
</Words>
</Article>
So you can see there are N Categories each containing N nested keyword nodes. What is the best way to represent this relational data?
I have a master XML doc of all the possible Keyword and Category combinations. This is my first thought but please correct me if there is a better way:
- Table for all Keywords/Categories: ID, Name, ParentID
- Table for Article: ID, Name, Date,
- Go to bottom most keyword node in each category node and add to lookup table: ID, ArticleID, KeywordID
Then I could just do a join and recursively find all the parents of the bottom most keyword node.