
I am extracting a value from an XML file and checking whether that value appears in a PDF file.

The XML I have is:

<RealTimeLetter>
 <Customer>
    <RTLtr_Acct>0163426</RTLtr_Acct>
    <RTLtr_CustomerName>LSIH JHTWVZ</RTLtr_CustomerName>
    <RTLtr_CustomerAddress1>887 YPCLY THYZO SU</RTLtr_CustomerAddress1>
    <RTLtr_CustomerAddress2 />
    <RTLtr_CustomerCity>WOODSTOCK,</RTLtr_CustomerCity>
    <RTLtr_CustomerState>GA</RTLtr_CustomerState>
    <RTLtr_CustomerZip>30188</RTLtr_CustomerZip>
    <RTLtr_ADAPreference>NONE</RTLtr_ADAPreference>
    <RTLtr_Addressee>0</RTLtr_Addressee>
 </Customer>
</RealTimeLetter>

The PDF file contains the customer name and address:

LSIH JHTWVZ
887 YPCLY THYZO SU
WOODSTOCK, GA 30188

I am using the PDF Reader and Nokogiri gems to read the text from the PDF, extract the customer name from the XML, and check whether the PDF includes that name.

The PDF is parsed as:

require 'pdf_reader'

  def parse_pdf
    PDF::Reader.new(@filename)
  end

 @reader = file('C:\Users\ecz560\Desktop\30004_Standard.pdf').parse_pdf

require 'nokogiri'

@xml = Nokogiri::XML(File.open('C:\Users\ecz560\Desktop\30004Standard.xml'))

@CustName = @xml.xpath("//Customer[RTLtr_Loancust='0163426']//RTLtr_CustomerName").map(&:text).to_s
 page_index = 0
 @reader.pages.each do |page|
 page_index = page_index+1
   if expect(page.text).to include  @CustName
     valid_text = "Given text is present in -- #{page_index}"
     puts valid_text
   end
 end

But I am getting an error:

RSpec::Expectations::ExpectationNotMetError: expected "LSIH JHTWVZ\n        887 YPCLY THYZO SU\n               WOODSTOCK, GA 30188\n                                                                                                                                  Page 1 of 1" to include "[\"LSIH JHTWVZ\"]"
Diff:
@@ -1,2 +1,80 @@
-["LSIH JHTWVZ"]

+        LSIH JHTWVZ
+        887 YPCLY THYZO SU                                                   
+        WOODSTOCK, GA 30188                                                  

./features/step_definitions/Letters/Test1_Letters.rb:372:in `block (2 levels) in <top (required)>'
./features/step_definitions/Letters/Test1_Letters.rb:370:in `each'
./features/step_definitions/Letters/Test1_Letters.rb:370:in `/^I validate the PDF content$/'
C:\Users\ecz560\Documents\GitHub\ATDD Local\features\FeatureFiles\Letters\Test1_Letters.feature:72:in `Then I validate the PDF content'

My understanding is that the issue is with the way I am comparing @CustName. How do I resolve this?

the Tin Man
S.Bala
  • Welcome to Stack Overflow. Please read "[mcve]". We don't have enough information to help you and we can't duplicate the problem. You didn't show us the full error information. As a side note, your use of instance variables shows you don't understand how to use them. Also, in Ruby we don't use names like `@CustName` for a variable or a constant. – the Tin Man Mar 01 '16 at 18:13
  • I edited the question with all the required info. – S.Bala Mar 01 '16 at 19:42

2 Answers


if expect(page.text).to include @CustName

`expect` is not meant to be used this way.

  1. An expectation is used in tests to verify that your code works the way it should. It should not be used in normal application code.

  2. An expectation raises an exception when it fails. It doesn't return true/false, so it can't drive an `if`: the exception propagates (correctly, as it did in your code) and the rest of the step stops running.

What you probably want to do is just check the value like this:

if page.text.include?(@CustName)

(Note: I have not tested this against your PDF, so you may need to adjust it.)
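A minimal sketch of that kind of check with `String#include?`, assuming each page object responds to `text` the way `PDF::Reader` pages do. The `Page` struct and the `pages_containing` helper are hypothetical stand-ins so the example runs without a PDF file:

```ruby
# Hypothetical stand-in for a PDF page; only the `text` method matters here.
Page = Struct.new(:text)

# Returns the 1-based numbers of the pages whose text contains `needle`.
def pages_containing(pages, needle)
  pages.each_with_index
       .select { |page, _index| page.text.include?(needle) }
       .map { |_page, index| index + 1 }
end

pages = [
  Page.new("LSIH JHTWVZ\n887 YPCLY THYZO SU\nWOODSTOCK, GA 30188"),
  Page.new("Page 2 of 2")
]

pages_containing(pages, 'LSIH JHTWVZ').each do |page_number|
  puts "Given text is present in -- #{page_number}"
end
```

Because `include?` returns true or false instead of raising, the loop keeps going after a page that does not match.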

Taryn East
  • The issue here is the value I am storing in @custName is an Array ["LSIH JHTWVZ"]. But I want to store it as string "LSIH JHTWVZ". I am not able to find how to extract elements from XML as strings. – S.Bala Mar 01 '16 at 23:37
  • 1) that is not your only problem ;) 2) why not just do something like: `@custName = @custName.first` to extract the string out of the array? – Taryn East Mar 02 '16 at 00:44
  • or `@CustName = @xml.xpath("//Customer[RTLtr_Loancust='0163426']//RTLtr_CustomerName").map(&:text).first` – Taryn East Mar 02 '16 at 03:31
  • Using `@CustName = @xml.xpath("//Customer[RTLtr_Loancust='0163426']//RTLtr_CustomerName").map(&:text).first` worked. – S.Bala Mar 02 '16 at 17:19
  • If you can put that in your answer I can accept it. Thank you :) – S.Bala Mar 02 '16 at 17:20
  • It's technically a second question ;) – Taryn East Mar 02 '16 at 23:06
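To illustrate the difference discussed in these comments: `map(&:text)` returns an Array, and calling `to_s` on an Array produces its inspected form, brackets and quotes included, which is why the `include` check could never match. A small sketch with a plain array standing in for the Nokogiri result (the variable names are illustrative):

```ruby
# Stand-in for what `xpath(...).map(&:text)` returns: an Array of Strings.
names = ['LSIH JHTWVZ']

as_inspected = names.to_s   # the Array rendered as text: ["LSIH JHTWVZ"]
as_string    = names.first  # the first element itself: LSIH JHTWVZ

page_text = "LSIH JHTWVZ\n887 YPCLY THYZO SU\nWOODSTOCK, GA 30188"

puts page_text.include?(as_inspected) # false
puts page_text.include?(as_string)    # true
```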

One thing I see is that your XPath selector isn't working against the sample XML, which has no RTLtr_Loancust element; the account number is in RTLtr_Acct.

//Customer[RTLtr_Loancust='0163426']//RTLtr_CustomerName

Testing it:

require 'nokogiri'

doc = Nokogiri::XML(<<EOT)
<RealTimeLetter>
<Customer>
    <RTLtr_Acct>0163426</RTLtr_Acct>
    <RTLtr_CustomerName>LSIH JHTWVZ</RTLtr_CustomerName>
    <RTLtr_CustomerAddress1>887 YPCLY THYZO SU</RTLtr_CustomerAddress1>
    <RTLtr_CustomerAddress2 />
    <RTLtr_CustomerCity>WOODSTOCK,</RTLtr_CustomerCity>
    <RTLtr_CustomerState>GA</RTLtr_CustomerState>
    <RTLtr_CustomerZip>30188</RTLtr_CustomerZip>
    <RTLtr_ADAPreference>NONE</RTLtr_ADAPreference>
    <RTLtr_Addressee>0</RTLtr_Addressee>
</Customer>
</RealTimeLetter>
EOT

doc.search("//Customer[RTLtr_Loancust='0163426']//RTLtr_CustomerName").to_xml # => ""

Using a bit of a modification finds the <Customer> node:

doc.search('//Customer/RTLtr_Acct/text()[contains(., "0163426")]/../..').to_xml

# => "<Customer>\n    <RTLtr_Acct>0163426</RTLtr_Acct>\n    <RTLtr_CustomerName>LSIH JHTWVZ</RTLtr_CustomerName>\n    <RTLtr_CustomerAddress1>887 YPCLY THYZO SU</RTLtr_CustomerAddress1>\n    <RTLtr_CustomerAddress2/>\n    <RTLtr_CustomerCity>WOODSTOCK,</RTLtr_CustomerCity>\n    <RTLtr_CustomerState>GA</RTLtr_CustomerState>\n    <RTLtr_CustomerZip>30188</RTLtr_CustomerZip>\n    <RTLtr_ADAPreference>NONE</RTLtr_ADAPreference>\n    <RTLtr_Addressee>0</RTLtr_Addressee>\n</Customer>"

At that point it's easy to grab content from elements inside <Customer>:

customer = doc.search('//Customer/RTLtr_Acct/text()[contains(., "0163426")]/../..')
customer.at('RTLtr_Acct').text # => "0163426"
customer.at('RTLtr_CustomerAddress1').text # => "887 YPCLY THYZO SU"
the Tin Man
  • I was able to resolve the issue by using `@CustName = @xml.xpath("//Customer[RTLtr_Loancust='0163426']//RTLtr_CustomerName").map(&:text).first` – S.Bala Mar 02 '16 at 17:20
  • Yes, that will work, but the selector can be tighter. `//` is shorthand for "this node or any of its descendants": at the start of the expression it searches the whole document, and after `Customer[...]` it searches everything below the matching `Customer`. Since `RTLtr_CustomerName` is a direct child of `Customer`, a single `/` between them says exactly that and avoids an unnecessary deep search. – the Tin Man Mar 02 '16 at 20:27
  • I tried with multiple customers in the XML file and it was giving me the element I require. My XML is always going to be simple. If I have any issues I will try out your suggestion. Thank you :) – S.Bala Mar 03 '16 at 15:50