I have some XML created by a SAS V8 routine that I am de-serialising into an object. For some reason, SAS seems to add whitespace to the start and end of every value.

<ROWSET>
 <ROW>
  <value1> 1 </value1>
  <value2> SOMEVALUE </value2>
  <value3 />
 </ROW>
</ROWSET>

I thought that I could deserialise from an XmlReader with the setting to ignore whitespace, but it doesn't seem to work (the whitespace remains).

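' Requires: Imports System.Xml.Serialization
Public Function GetData(FileName As String) As ObjectModel

    ' IgnoreWhitespace is set here, but the padding inside each value still remains.
    Using r As Xml.XmlReader = Xml.XmlReader.Create(FileName, New Xml.XmlReaderSettings With {.IgnoreWhitespace = True})
        Dim x As New XmlSerializer(GetType(ObjectModel))
        ' Explicit cast so this also compiles with Option Strict On.
        Return CType(x.Deserialize(r), ObjectModel)
    End Using

End Function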
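
A likely explanation, as far as I can tell, is that `IgnoreWhitespace` only skips whitespace-only nodes between markup (the indentation and line breaks), not whitespace that sits inside an element's text content. The following is a minimal sketch for checking that, assuming a console project; the module name `IgnoreWhitespaceDemo` and the inline sample string are made up for illustration and are not part of my real code.

```vbnet
Imports System
Imports System.IO
Imports System.Xml

' Hypothetical demo module: shows what the reader reports for a padded value.
Module IgnoreWhitespaceDemo
    Sub Main()
        ' Sample shaped like the SAS output: padding inside the element value.
        Dim xml As String = "<ROW> <value1> 1 </value1> </ROW>"
        Dim settings As New XmlReaderSettings With {.IgnoreWhitespace = True}

        Using r As XmlReader = XmlReader.Create(New StringReader(xml), settings)
            While r.Read()
                ' Only print text nodes, wrapped in brackets so padding is visible.
                If r.NodeType = XmlNodeType.Text Then
                    Console.WriteLine("[" & r.Value & "]")
                End If
            End While
        End Using
    End Sub
End Module
```

If that reading of the setting is right, the bracketed value should still show the spaces around `1`, which would match what I see after deserialization.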

This answer to a similar question suggests trimming the string while reading it, but how can I achieve the same during deserialization?

I am open to suggestions, including changing the SAS V8 code that creates the XML, but it must be SAS V8 code, not V9.

The SAS code that creates the XML is as follows. I am using xmltype=oracle because it seems to be the nicest output option for V8.

libname myxml xml "&output..\xmldata.xml"  xmltype=oracle;
data myxml.xmldata;
  set area.xmldata;
run;

Please feel free to give an answer in C# or VB.

EDIT: Although the answer below works, using find and replace just feels wrong to me. I would always prefer to make the change either during the initial write or while reading into .NET.

I found a good answer here, and as such this question is probably a duplicate of this question.

My resulting code is as follows:

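' Reader that trims the padding SAS adds around every text value.
Public Class SasXmlTextReader
    Inherits Xml.XmlTextReader
    Public Sub New(stream As IO.Stream)
        MyBase.New(stream)
    End Sub

    ' Trim each string as it is read, before the serializer sees it.
    Public Overrides Function ReadString() As String
        Return MyBase.ReadString().Trim()
    End Function
End Class

Public Function GetDefects(FileName As String) As ObjectModel
    ' Open the file as a stream directly; no StreamReader is needed.
    Using s As IO.Stream = IO.File.OpenRead(FileName)
        Using r As New SasXmlTextReader(s)
            Dim x As New XmlSerializer(GetType(ObjectModel))
            ' Explicit cast so this also compiles with Option Strict On.
            Return CType(x.Deserialize(r), ObjectModel)
        End Using
    End Using
End Function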
    Possible duplicate of [XML Deserialization of string elements with newlines in C#](http://stackoverflow.com/questions/7838481/xml-deserialization-of-string-elements-with-newlines-in-c-sharp) –  Nov 17 '15 at 16:44

1 Answer

A crude way of fixing this in SAS would be as follows:

libname myxml xml "c:\temp\xmldata.xml"  xmltype=oracle;
data myxml.xmldata;
  set sashelp.class;
run;

data _null_;
  infile "c:\temp\xmldata.xml";
  file "c:\temp\xmldata_trimmed.xml";
  input;
  _INFILE_ = tranwrd(_INFILE_,'> ','>');
  _INFILE_ = tranwrd(_INFILE_,' <','<');
  put _INFILE_;
run;

This is all base SAS code that should work fine in V8. At face value, it assumes that your data do not contain the strings '> ' or ' <'. However, SAS escapes XML-ish characters when exporting to XML unless you specify xmlprocess=relax in the libname statement, so this is unlikely to be a concern.

Sample row before trimming:

<Name> Alfred </Name>

Sample row after trimming:

 <Name>Alfred</Name>

Example of xml escaping - code:

data myxml.example;
  str='>';
  output;
run;

Resulting xml:

  <str> &gt; </str>
user667489
  • Hmm, that is a little less elegant a solution than I would like, is that just making SAS do a find and replace on text? It would need to replace the strings `' >'` and `'< '`, not `'> '` and `' <'` –  Nov 17 '15 at 13:48
  • Yes, this is simple string replacement, and no, the replacements made are correct. – user667489 Nov 17 '15 at 13:53
  • yeah true, brain fart moment lol - I have upvoted for the effort but I can't be sure that I don't have `'> '` in my data, so I am going to have to try something more elegant –  Nov 17 '15 at 14:37
  • Actually, you can - SAS escapes XML-ish characters when exporting to xml unless you specify `xmlprocess=relax` in the `libname` statement. – user667489 Nov 17 '15 at 14:56
  • It wasn't the solution I wanted in this case, but I think it is a great solution for someone who wants to get xml data out of sas v8 without the padding, so I changed the title a bit and gave you the answer :) –  Nov 17 '15 at 21:43