I have a stored procedure that returns a large result set (nearly 20 million records). I need to save this result to multiple XML files. I am currently using ADO.NET to fill a `DataSet`, but it quickly throws `System.OutOfMemoryException`. What other methods can I use to accomplish this?
-
Use an XML library that can stream, i.e. one that emits XML without needing the entire data structure to be in memory: read a record, write out the XML for that record, repeat. – Marc B Jun 07 '16 at 21:43
-
@Marc: I'm not quite following your idea. Do you have an example? I'm trying `FOR XML PATH` right now, and it puts the whole result into one huge XML document. I'm looking for a way to split this into multiple XML records that I can write to XML files. – T L Jun 07 '16 at 22:15
-
Rather than filling a `DataSet`, can you get an appropriate [`DataReader`](https://msdn.microsoft.com/en-us/library/haa3afyz.aspx) and stream from that to a file using an `XmlWriter`? – dbc Jun 08 '16 at 00:22
-
I would use sqlcmd.exe, which comes with SQL Server: https://msdn.microsoft.com/en-us/library/ms162773.aspx. 20M rows is not a very large output for SQL Server; it can be very quick or can take up to an hour, depending on the size of the database and how the data is stored. Processing the command through C# can increase the processing time 10x. I don't know how long your query takes, but if it takes a long time I would recommend sqlcmd.exe. I had an application written as a C# form, and the application froze. I ended up creating a `BackgroundWorker` that spawns a process calling sqlcmd.exe. I have code. – jdweng Jun 08 '16 at 03:31
-
Do I understand this correctly: your SP returns millions of rows, and you want to write the result to many separate XML files? What is the grouping decision? How are the files separated (something like *one per customer*)? – Shnugo Jun 08 '16 at 07:26
-
My SQL query takes around 15 minutes. Following this post, http://stackoverflow.com/questions/13982871/tsql-one-row-per-element-with-for-xml, I am able to break these records into a few hundred XML records that I can write to XML files. However, when calling the stored procedure from C#, I still get a timeout exception even though I set connection timeout = 60000. – T L Jun 08 '16 at 16:16
-
@TL, is there a decent criterion to separate your groups? I'm imagining something like *per month*, *per customer*, *per product* or similar. In that case you might add an indexed column holding your group discriminator and load the data group by group. The simplest would be a running number, loading by ranges. In any case, you should not take a *load everything, then prepare the XML files* approach. A table-valued function might serve you better than a stored procedure, but that depends. From SQL Server 2012 on, you might read about `OFFSET FETCH`. – Shnugo Jun 08 '16 at 19:54
-
Thanks @Shnugo. The data is separated by Vendor/Customer, as you said. I was using `FOR XML PATH` to get a list of about 100 XML rows, one per Vendor/Customer. Even so, I still get an `OutOfMemoryException` when I try to retrieve the XML data to write it to a file. I think I will try the /3GB option next. – T L Jun 08 '16 at 20:39
2 Answers
Are you using SQL Server?
If so, there is a SQL clause (`FOR XML`) that automatically converts the result of a query into an XML structure; you would then get it as a string in the application.
Options:
- Split the string into several pieces and save them to files (in the app).
- Modify the stored procedure to split the result into several XML objects, return them as separate strings, one per row (1 row => 1 object), and save each of them to a file.
- Write a new stored procedure that calls the original one, splits the result into X XML objects, and returns X XML strings that you only have to save in the application.
Not using SQL Server?
- Do the XML formatting in the stored procedure, or write a new one that does it.
Either way, I think it will be easier to do the XML formatting server side.
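
As a rough sketch of the second option (one XML object per row, each saved to its own file), the rows can be streamed with a `SqlDataReader` instead of being loaded into a `DataSet`. The procedure name `dbo.GetVendorXml` and its two columns (`VendorId` as `int`, then `VendorXml` as `nvarchar(max)`) are assumptions made for illustration; they do not come from the question.

```csharp
using System;
using System.Data;
using System.Data.SqlClient;
using System.IO;

static class VendorXmlExport
{
    // Writes one file per row. "dbo.GetVendorXml", "VendorId" and "VendorXml"
    // are hypothetical names; adjust them to the real procedure and columns.
    public static void Export(string connectionString, string outputDir)
    {
        using (var conn = new SqlConnection(connectionString))
        using (var cmd = new SqlCommand("dbo.GetVendorXml", conn))
        {
            cmd.CommandType = CommandType.StoredProcedure;
            // Execution time is limited by CommandTimeout (in seconds),
            // not by the connection timeout. 0 means no limit.
            cmd.CommandTimeout = 0;
            conn.Open();

            // SequentialAccess lets a large column value be read as a stream
            // instead of being buffered whole in memory.
            using (var reader = cmd.ExecuteReader(CommandBehavior.SequentialAccess))
            {
                var buffer = new char[8192];
                while (reader.Read())
                {
                    // Under SequentialAccess, columns must be read in ordinal order.
                    int vendorId = reader.GetInt32(0);
                    string path = Path.Combine(outputDir, "vendor_" + vendorId + ".xml");

                    using (TextReader xml = reader.GetTextReader(1))
                    using (var file = new StreamWriter(path))
                    {
                        int read;
                        // Copy the XML text to disk in small chunks.
                        while ((read = xml.Read(buffer, 0, buffer.Length)) > 0)
                        {
                            file.Write(buffer, 0, read);
                        }
                    }
                }
            }
        }
    }
}
```

Setting `CommandTimeout` matters here because the connection timeout only covers opening the connection; a command's own timeout defaults to 30 seconds, which a 15-minute query will exceed.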

Assuming you are using SQL Server, you can use paging in your stored procedure. `ROW_NUMBER` is one option; SQL Server 2012 and above also support `OFFSET` and `FETCH`.
Also, how many DataTables are you filling? There are row limits for DataTables.
The maximum number of rows that a DataTable can store is 16,777,216
https://msdn.microsoft.com/en-us/library/system.data.datatable.aspx
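
A minimal sketch of how the paging idea could look from the C# side, assuming a hypothetical procedure `dbo.GetRowsPage` that takes `@Offset` and `@PageSize` and applies `ORDER BY ... OFFSET @Offset ROWS FETCH NEXT @PageSize ROWS ONLY` internally. Each page is streamed from a data reader straight into its own XML file, so no `DataTable` is filled. The element names `rows` and `row` and the file naming are likewise illustrative.

```csharp
using System;
using System.Data;
using System.Data.SqlClient;
using System.Globalization;
using System.IO;
using System.Xml;

static class PagedXmlExport
{
    // "dbo.GetRowsPage", "@Offset" and "@PageSize" are hypothetical names.
    public static void Export(string connectionString, string outputDir, int pageSize)
    {
        using (var conn = new SqlConnection(connectionString))
        {
            conn.Open();
            for (long offset = 0, page = 0; ; offset += pageSize, page++)
            {
                using (var cmd = new SqlCommand("dbo.GetRowsPage", conn))
                {
                    cmd.CommandType = CommandType.StoredProcedure;
                    cmd.CommandTimeout = 0; // seconds; 0 means no limit
                    cmd.Parameters.Add("@Offset", SqlDbType.BigInt).Value = offset;
                    cmd.Parameters.Add("@PageSize", SqlDbType.Int).Value = pageSize;

                    using (var reader = cmd.ExecuteReader())
                    {
                        if (!reader.HasRows) break; // no more pages
                        WritePage(reader, Path.Combine(outputDir, "page_" + page + ".xml"));
                    }
                }
            }
        }
    }

    // Writes every row of the current result set as a <row> element,
    // with one child element per non-null column.
    static void WritePage(IDataReader reader, string path)
    {
        var settings = new XmlWriterSettings { Indent = true };
        using (var writer = XmlWriter.Create(path, settings))
        {
            writer.WriteStartElement("rows");
            while (reader.Read())
            {
                writer.WriteStartElement("row");
                for (int i = 0; i < reader.FieldCount; i++)
                {
                    if (reader.IsDBNull(i)) continue;
                    // Column names may not be valid XML names, so encode them.
                    string name = XmlConvert.EncodeLocalName(reader.GetName(i));
                    string value = Convert.ToString(reader.GetValue(i), CultureInfo.InvariantCulture);
                    writer.WriteElementString(name, value);
                }
                writer.WriteEndElement();
            }
            writer.WriteEndElement();
        }
    }
}
```

One caveat with this design: offset paging re-runs the underlying query for every page, so if the query itself is slow it may be worth storing its result in a staging table once and paging over that instead.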
