I am using VB.NET and fetching an XML document from a URL with the following code:

    Dim PMIDList As String = "25241892,25451079"

    Dim sb As New StringBuilder
    Dim sw As New StringWriter(sb)
    Dim writer As JsonWriter = New JsonTextWriter(sw)

    Dim url As String = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=pubmed&id=" + PMIDList + "&rettype=fasta&retmode=xml"
    Dim pmid As String = ""
    Dim pmcid As String = ""
    Dim nihmsid As String = ""



    Dim inStream As StreamReader
    Dim webRequest As WebRequest
    Dim webresponse As WebResponse
    webRequest = webRequest.Create(url)
    webresponse = webRequest.GetResponse()
    inStream = New StreamReader(webresponse.GetResponseStream())

    Dim response As String = inStream.ReadToEnd
    Dim pubXML As String = ""



    Using reader As XmlTextReader = New XmlTextReader(New StringReader(response))

        While reader.ReadToFollowing("PubmedArticle") 'Read till citation

I can pull out the elements that I want with:

            reader.ReadToFollowing("ArticleIds") 'Go to the first ArticleIds
            While reader.Read()

                If reader.Value = "pubmed" Then 'Get the PMID
                    reader.ReadToFollowing("Value")
                    pmid = reader.ReadInnerXml()
                End If

                If reader.Value = "pmc" Then
                    reader.ReadToFollowing("Value")
                    pmcid = reader.ReadInnerXml()
                End If

                If reader.Value = "mid" Then
                    reader.ReadToFollowing("Value")
                    nihmsid = reader.ReadInnerXml()
                End If
                If reader.Name = "History" Then Exit While 'Exit loop End of ArticleIds

            End While

But I also want to save the entire PubmedArticle node. I know that XmlTextReader is forward-only, but is there a way to create another reader from the pubXML string below?

     pubXML = "<PubmedArticle>" + reader.ReadInnerXml() + "</PubmedArticle>"

I ended up with a hack:

    Private Sub parseXMLPMID()
    Dim PMIDList As String = "25241892,25451079"

    Dim sb As New StringBuilder
    Dim sw As New StringWriter(sb)
    Dim writer As JsonWriter = New JsonTextWriter(sw)

    Dim url As String = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=pubmed&id=" + PMIDList + "&rettype=fasta&retmode=xml"
    Dim pmid As String = ""
    Dim pmcid As String = ""
    Dim nihmsid As String = ""



    Dim inStream As StreamReader
    Dim webRequest As WebRequest
    Dim webresponse As WebResponse
    webRequest = webRequest.Create(url)
    webresponse = webRequest.GetResponse()
    inStream = New StreamReader(webresponse.GetResponseStream())

    Dim response As String = inStream.ReadToEnd
    Dim pubXML As String = ""
    Dim myEncoder As New System.Text.UTF8Encoding


    Using reader As XmlTextReader = New XmlTextReader(New StringReader(response))

        While reader.ReadToFollowing("PubmedArticle") 'Read till citation
            pubXML = reader.ReadOuterXml()
            Dim bytes As Byte() = myEncoder.GetBytes(pubXML)
            Dim ms As MemoryStream = New MemoryStream(bytes)
            Dim stream_reader As New StreamReader(ms)

            While stream_reader.Peek() >= 0
                Try
                    Dim line As String = stream_reader.ReadLine()
                    If line.Contains("<ArticleId IdType=""pubmed"">") Then
                        pmid = Strip_Line(line)
                    End If
                    If line.Contains("<ArticleId IdType=""pmc"">") Then
                        pmcid = Strip_Line(line)
                    End If
                    If line.Contains("<ArticleId IdType=""mid"">") Then
                        nihmsid = Strip_Line(line)
                    End If

                Catch ex As Exception

                End Try

            End While
            MessageBox.Show(pmid + " " + pmcid + " " + nihmsid + " " + pubXML)
        End While
    End Using



    End Sub

Strip_Line just pulls out the inner text of the line. I'd rather have clean code.
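One direction that might be cleaner than scanning line by line, sketched here under stated assumptions: capture each article with `ReadOuterXml()` and then walk that string with a second `XmlReader` built over a `StringReader`. The module name `PubmedReaderSketch`, the helper `ReadArticleIds`, and the sample XML are made up for illustration. The sketch also assumes the article's own IDs sit in the first `ArticleIdList` of each `PubmedArticle`, and it has not been run against the live efetch service.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.IO
Imports System.Xml

Module PubmedReaderSketch

    ' Hypothetical helper: collects IdType -> value pairs from the first
    ' <ArticleIdList> of one article, using a second, independent reader.
    Function ReadArticleIds(articleXml As String) As Dictionary(Of String, String)
        Dim ids As New Dictionary(Of String, String)
        Using inner As XmlReader = XmlReader.Create(New StringReader(articleXml))
            While Not inner.EOF
                If inner.NodeType = XmlNodeType.Element AndAlso inner.Name = "ArticleId" Then
                    Dim idType As String = inner.GetAttribute("IdType")
                    ' ReadElementContentAsString already moves past </ArticleId>,
                    ' so no extra Read() is needed on this branch.
                    Dim value As String = inner.ReadElementContentAsString()
                    If idType IsNot Nothing Then ids(idType) = value
                ElseIf inner.NodeType = XmlNodeType.EndElement AndAlso inner.Name = "ArticleIdList" Then
                    ' Assumption: stop after the first list so IDs of cited
                    ' references (if any follow) are not picked up.
                    Exit While
                Else
                    inner.Read()
                End If
            End While
        End Using
        Return ids
    End Function

    ' Hypothetical helper: returns "" when the ID type is absent.
    Function ValueOrEmpty(ids As Dictionary(Of String, String), key As String) As String
        Return If(ids.ContainsKey(key), ids(key), "")
    End Function

    Sub Main()
        ' Hand-written sample standing in for the efetch response (not real data).
        Dim response As String =
            "<PubmedArticleSet>" &
            "<PubmedArticle><PubmedData><ArticleIdList>" &
            "<ArticleId IdType=""pubmed"">111</ArticleId>" &
            "<ArticleId IdType=""pmc"">PMC222</ArticleId>" &
            "</ArticleIdList></PubmedData></PubmedArticle>" &
            "<PubmedArticle><PubmedData><ArticleIdList>" &
            "<ArticleId IdType=""pubmed"">333</ArticleId>" &
            "</ArticleIdList></PubmedData></PubmedArticle>" &
            "</PubmedArticleSet>"

        ' XmlReader.Create rejects a DOCTYPE by default; Ignore skips it instead.
        Dim settings As New XmlReaderSettings With {.DtdProcessing = DtdProcessing.Ignore}

        Using reader As XmlReader = XmlReader.Create(New StringReader(response), settings)
            While Not reader.EOF
                If reader.NodeType = XmlNodeType.Element AndAlso reader.Name = "PubmedArticle" Then
                    ' ReadOuterXml returns the whole node and advances past it.
                    Dim pubXML As String = reader.ReadOuterXml()
                    Dim ids As Dictionary(Of String, String) = ReadArticleIds(pubXML)
                    Console.WriteLine(ValueOrEmpty(ids, "pubmed") & " " &
                                      ValueOrEmpty(ids, "pmc") & " " &
                                      ValueOrEmpty(ids, "mid") & " " &
                                      pubXML.Length.ToString() & " chars saved")
                Else
                    reader.Read()
                End If
            End While
        End Using
    End Sub

End Module
```

Both loops check the current node before calling `Read()` because `ReadOuterXml()` and `ReadElementContentAsString()` already advance the reader past the element. Pairing them with `ReadToFollowing` can skip an element that directly follows the previous one with no whitespace in between.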

Markus
Bill
  • Is the document you fetch so large that you can't use LINQ to XML or DOM to select and extract the parts you need? – Martin Honnen Nov 29 '16 at 14:28
  • I've got hundreds of pulls to do, with maybe as many as 200 PMIDs in the URL string. I have the code to do it with a DOM, but now I'm working on speed. – Bill Nov 29 '16 at 14:35
  • There is the `ReadNode` method (https://msdn.microsoft.com/en-us/library/system.xml.xmldocument.readnode(v=vs.110).aspx) to combine XmlReader with DOM. That might help if your task is to read through until you have identified the node you want and then get a DOM representation of it. LINQ to XML has a similar method. – Martin Honnen Nov 29 '16 at 14:42
  • Thanks, but I don't want to use an XmlDocument. I'd rather use a standard TextReader and loop line by line if I have to. – Bill Nov 29 '16 at 14:47

0 Answers