0

I've been using GUIDs to identify elements in an XML document for editing. It seems that GUIDs take a lot more space than a plain integer ID field. In SQL there is auto-increment. Is there something similar, or a decent way to auto-increment IDs for XML elements in LINQ to XML?

The only constraint may be that once a number is used it cannot be used again.

Thanks.

Joopk.com
  • "The only constraint may be that once a number is used it cannot be used again" – in the same document, or across documents? More to the point, do you have a problem that needs solving, or is this an opinion-based poll question? – millimoose Jan 13 '14 at 04:52
  • It's across a few documents. Consider a gallery and an image list, or some similar case. One file has all of the galleries, and the other file has all of the images. Each image has a gallery ID, so when I pull a gallery it fetches all of the images with gallery ID x as an attribute. – Joopk.com Jan 14 '14 at 16:55

2 Answers

3

I don't know what the "best" way is, but I'd think using a GUID would be a sure-fire way of getting a unique value for your ID field.

If you wanted an alternative method that uses a smaller number, you could try checking the file each time prior to inserting, and getting the next available ID that's one larger than the previous:

// requires: using System.Linq; using System.Xml.Linq;
private int GenerateNextId()
{
   var file = XDocument.Load("yourFile.xml");  // or pass an XDocument in
                                               // so you don't have to reload it

   // Highest existing ElementId plus one; yields 1 when no elements exist yet.
   return file.Descendants("SomeElement")
              .Select(x => (int)x.Attribute("ElementId"))
              .DefaultIfEmpty(0)
              .Max() + 1;
}

This is just posted as an alternative. I don't know how efficient this is as your XML grows in size. YMMV
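
One caveat with "highest ID plus one": if the element holding the highest ID is deleted, that number gets handed out again, which conflicts with the "cannot be used again" constraint in the question. A minimal sketch of one way around that, assuming you are free to add a counter attribute to the root element (the `NextId` attribute name and the `XmlIdCounter` class are made up for illustration):

```csharp
using System;
using System.Xml.Linq;

public static class XmlIdCounter
{
    // Hypothetical attribute name; it stores the next ID to hand out.
    private const string CounterAttribute = "NextId";

    // Returns a fresh ID and advances the counter stored on the root element.
    // The caller is responsible for saving the document afterwards.
    public static int TakeNextId(XDocument doc)
    {
        XElement root = doc.Root;
        if (root == null)
        {
            throw new InvalidOperationException("Document has no root element.");
        }

        // A missing attribute converts to null, so a new document starts at 1.
        int next = (int?)root.Attribute(CounterAttribute) ?? 1;

        root.SetAttributeValue(CounterAttribute, next + 1);
        return next;
    }
}
```

Because the counter only ever moves forward, deleting elements never frees an ID, and there is no need to scan or sort the existing elements. Each file (galleries, images, and so on) would carry its own counter.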


If you decide to keep using the GUID, there are ways to shorten it, such as this SO post:

Convert.ToBase64String(Guid.NewGuid().ToByteArray());

I tried it out - the generated ID shrinks from 36 characters to 24:

0b427c5a-1541-4cb4-8995-4e67dac61654
WnxCC0EVtEyJlU5n2sYWVA==

d1205a49-f64b-4418-8449-b1cd52f06624
SVog0Uv2GESESbHNUvBmJA==
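
A 16-byte GUID always encodes to 22 significant Base64 characters followed by "==", so the padding can be dropped and restored later. A rough sketch of that round trip (the `ShortGuid` class and its method names are hypothetical helpers, not part of the framework):

```csharp
using System;

public static class ShortGuid
{
    // Encodes a GUID as 22 characters by dropping the trailing "==" padding.
    public static string Encode(Guid guid)
    {
        return Convert.ToBase64String(guid.ToByteArray()).TrimEnd('=');
    }

    // Reverses Encode by putting the padding back before decoding.
    public static Guid Decode(string value)
    {
        return new Guid(Convert.FromBase64String(value + "=="));
    }
}
```

For the first GUID above, `ShortGuid.Encode` gives `WnxCC0EVtEyJlU5n2sYWVA`, and `ShortGuid.Decode` turns that back into the original GUID.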
Grant Winney
  • That's what I was thinking as well. My only concern was as it grows. Mainly size. I've been using GUIDs, but when you get 1000 records or more the GUIDs take tons of space. – Joopk.com Jan 13 '14 at 04:33
  • @Joopk.com "Tons of space" is nonsense. When you have 1000 records you're using 36000 bytes of space for GUIDs. How much hard drive space do you have available again? – millimoose Jan 13 '14 at 04:51
  • (For perspective: the HTML source of this page alone is 80kB, and SO isn't exactly a heavyweight site.) – millimoose Jan 13 '14 at 05:03
  • 1
    It's not nonsense, and it's not the drive space, it's the processor time. Also ordering. You can order by int but not by GUID, so you have to add either a date or an order field, which adds to the file size as well. Not as efficient. I know the GUID is unique, it just seems like there is a better way. I like this answer the best; I think adding 1 to the last ID is the most efficient. I also like the Convert.ToBase64String(Guid.NewGuid().ToByteArray()); if ordering is not necessary. Thanks! – Joopk.com Jan 14 '14 at 16:51
  • 1
    "What is the best way" was probably the wrong way to ask the question. It should have been "what is the most efficient way". – Joopk.com Jan 14 '14 at 16:57
  • 1
    @Joopk.com Generally, a claim about processor time should be backed up with some investigation saying that's where you're actually losing performance. Especially in this case, where you need the ID to be unique across several documents – this means you need to read and sort them all every time you need to start generating new IDs. This can easily be slower in the long run than generating GUIDs. – millimoose Jan 14 '14 at 19:54
  • 1
    @Joopk.com "The best" and "the most efficient" are inherently problematic questions, because the answers depend a lot on usage patterns and the general context of your app. The only good way to get to the answer is to implement several alternatives and actually measure their performance impact in your app, and until you do that, saying "I have performance concerns" is spreading FUD. Or rather, saying "I have concerns about the performance" in general – you should have a provable performance problem. – millimoose Jan 14 '14 at 19:56
  • I agree with you, but saying I need to implement several alternatives and actually measure the impact in my app is stating the painfully obvious. Firstly, it's pretty simple: if ints are 1/4 the size of GUIDs, you're saving a TON of space in comparison. I obviously have a speed issue, I just figured I'd save the readers the boring details. I have an XML file with approx. 2700 products in it, and each product has 7 GUIDs referring to other XML files for information such as status and whatnot. That's 18900 or so GUIDs. That's a lot of space and it's starting to slow down. – Joopk.com Jan 15 '14 at 22:00
  • Since the file shouldn't get much larger, I was trying to figure out a more efficient way to pull a little more speed out of it. In retrospect the details might have added a little :) I really appreciate the help millimoose. – Joopk.com Jan 15 '14 at 22:01
0

I agree with @GrantWinney that GUIDs are a surefire way to get unique IDs.

However, you can also use DateTime.Now.ToFileTimeUtc(). Unlike GUIDs, though, it's not guaranteed to be unique: e.g. computers in different time zones adding XML records using this method, or even different computers in the same office, could end up with the same value.
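
If you go the timestamp route, you can at least rule out duplicates from a single process by never returning a value that is not larger than the last one. A small sketch of that idea (the `TimestampId` class is a hypothetical helper, and it does nothing for the multi-machine case mentioned above):

```csharp
using System;
using System.Threading;

public static class TimestampId
{
    // Last value handed out by this process.
    private static long _last;

    // Returns the current UTC file time, bumped past the previous value if
    // needed, so successive calls in this process never return the same ID.
    public static long Next()
    {
        while (true)
        {
            long previous = Interlocked.Read(ref _last);
            long candidate = Math.Max(DateTime.UtcNow.ToFileTimeUtc(), previous + 1);

            // Only publish the candidate if no other thread got in first.
            if (Interlocked.CompareExchange(ref _last, candidate, previous) == previous)
            {
                return candidate;
            }
        }
    }
}
```

The values are plain longs, so they still sort in creation order within that one process.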

Keeler
  • 1
    http://infiniteundo.com/post/25509354022/more-falsehoods-programmers-believe-about-time-wisdom - point #77. (Also, the other ones.) – millimoose Jan 13 '14 at 04:57
  • Yeah, that's my point. Nice list (upvoted)! Current time works in a pinch, and you're not likely to get the exact same file time, but it's never guaranteed to return distinct values. – Keeler Jan 13 '14 at 04:59