2

I am developing a package in BIDS 2005 that starts with a XML (XSLT) Task against a very large input file and keep getting the following error

[XML Task] Error: An error occurred with the following error message: "Exception of type 'System.OutOfMemoryException' was thrown.

My workstation is Win7 64 bit - 16 GIG Ram - i7 8 core processor 3.5 gHZ. I run task manager during the process and it only gets up to about 7.5 gigs of ram used during the run until it crashes with the error. I am running against my local instance of SQL 2008 R2 64 bit. The package is configured in BIDS to run in 64 bit mode. I have googled around for a while but found nothing that looks promising - anyone run into this and resolve before? - Thanks.

Wayne
  • 59,728
  • 15
  • 131
  • 126
Jim Evans
  • 6,285
  • 10
  • 37
  • 60
  • When you say "very large", what exactly do you mean? 100Mb? 100Gb? – Michael Kay Dec 19 '11 at 19:24
  • Please, provide complete (but minimal possible) example of the source XML document and the transformation, so that people could repro the problem. In the absence of this information this doesn't qualify as a "question" at all. There are numerous reasons for getting out of memory and this can be either an endless recursion or a huge XML document, the parsed instance of which cannot fit in the total available memory -- which of these exactly is the reason can be determined using the code that you forgot to show to us. – Dimitre Novatchev Dec 19 '11 at 19:26
  • It would also be helpful if you could disclose some general details about the developed package as well – cfrag Dec 19 '11 at 19:26
  • Also, if the string representation of the XML document is around or larger than 3GB, chances are that its parsed XmlDocument isn't going to fit into 16GB of RAM. – Dimitre Novatchev Dec 19 '11 at 19:28
  • @ Michael Kay - It's about 50 gig – Jim Evans Dec 19 '11 at 19:39
  • @cfrag - What details would you like? this is just the first step of the package - transforming the incomming XML file to a format we can use. – Jim Evans Dec 19 '11 at 19:40
  • @Dimitre Novatchev - unfortunately - this is private data provided by an external source so I cannot provide you the requested data. – Jim Evans Dec 19 '11 at 19:41

2 Answers2

4

It is a wellknown fact that unless an XSLT processor implements streaming (a feature that isn't part of either the W3C XSLT 1.0 or XSLT 2.0 specifications), the whole parsed XML document is as a rule kept in the available RAM.

There are different recommendations to calculate the necessary available RAM for a given XML document string size. One I know is that:

RAM ~= 5 * string-size

Based on this formula, for an XML document with 50GB string size, the needed RAM is around 250GB and this exceeds the available RAM (16GB) by a factor of 15.

You could try to use Saxon 9.4 which offers some streaming extensions.

Note: Even the currently specified streaming feature in the W3C XSLT 3.0 WD doesn't guarantee that every possible transformation can be performed in streaming mode -- in fact, there are significant limitations on what XPath expressions an XSLT transformation may use, in order for this transformation to be guaranteed successful execution in streaming mode.

Dimitre Novatchev
  • 240,661
  • 26
  • 293
  • 431
3

I actually found a resolution. I used a script task in place of the XML task to do the transform. It only requires 3 lines of code and worked very well on all the feeds including the large ones. Example code for the script follows:

Imports System
Imports System.Data
Imports System.Math
Imports System.Xml.Xsl
Imports Microsoft.SqlServer.Dts.Runtime

Public Class ScriptMain

Public Sub Main()
        Dim xslt As New XslTransform()

        xslt.Load("C:\VVV XML\your-xml-stylesheet.xsl")
        xslt.Transform("C:\VVV XML\your-source-file.xml", "C:\VVV XML\your-destination-file.xml")

        Dts.TaskResult = Dts.Results.Success
End Sub

End Class
Valentino Vranken
  • 5,597
  • 1
  • 26
  • 28
Jim Evans
  • 6,285
  • 10
  • 37
  • 60
  • This is realy weird that the standard XSLT block (that should probably do the same as this code) throws the Out of memory exception. And this code works smooth.However, keep in mind that the results are slightly different, probably in relation to different formating or something like that. The file sizes are slightly bigger than files created using this code. – przemaz Jun 05 '19 at 21:38