<ul class='pagination'>
    <li class='active'>
        <a class='btnPage' href='#' page='1'></a>
    </li>
    <li class>
        <a class='btnPage' href='#' page='2'>Next</a>
    </li>
</ul>

I have the HTML above and I need to get the link to the next page with XPath.

The URL of the first page is: https://example.com/Search?

And the URL after I click the Next button (the `a` element) is: https://example.com/Search?Page=2

Just a note if it's useful, I'm using Scrapy with Splash.

Can I get, with XPath alone, the link that is loaded when I click the Next button (the `a` element)?

Gallaecio
João Koritar
  • Those links aren't part of the markup; they are likely produced by onClick actions via JavaScript. If you know what those actions do, you could construct the URL yourself by selecting the value of the `ul/li/a/@page` attribute. – Mads Hansen Jul 21 '20 at 20:26
  • I cannot change the code logic, because I could break other sites in the pipeline; all the data is extracted with XPath. – João Koritar Jul 21 '20 at 20:32
  • You need to combine the string literal 'https://example.com/Search?Page=' with the result returned from "ul/li/a/@page" - what environment/language are you working with? – Bryn Lewis Jul 22 '20 at 04:28
  • I'm using Python with Scrapy-Splash. I should have mentioned this in the question, my bad. And as I said in the comment above, I can't change the code logic; I can only work with XPath. – João Koritar Jul 22 '20 at 05:57
  • Sometimes it's not that straightforward. If you can't find the next page href in the page source, you need to reconsider your logic. – Moein Kameli Jul 23 '20 at 06:50
  • Yes, that's true, but I will skip this website and any other websites that don't have the URL in @href. Thank you guys. – João Koritar Jul 24 '20 at 06:22
  • Is there any reason you can't do as I suggested? – Bryn Lewis Aug 06 '20 at 09:07
  • Yes, I'm building a generic crawler that extracts useful information from a certain kind of website, and by 'generic' I mean that I don't have logic for specific websites such as the one in the question. I know that your suggestion works well (thanks for it), but I just don't want to add site-specific logic to my code. My bad. – João Koritar Aug 07 '20 at 18:41
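
For reference, the suggestion from the comments (joining the base URL with the `page` attribute) can be done inside a single XPath 1.0 expression with `concat()`, so no extra Python logic is needed. Below is a minimal sketch, assuming the site really builds the next URL as `https://example.com/Search?Page=<page>` (inferred from the two URLs in the question, not confirmed by the markup). It uses `lxml` to run standalone; `PAGE`, `NEXT_PAGE_XPATH` and `next_page_url` are names made up for this illustration.

```python
from lxml import html

# Markup copied from the question, wrapped in a minimal document.
PAGE = """
<html><body>
<ul class='pagination'>
    <li class='active'>
        <a class='btnPage' href='#' page='1'></a>
    </li>
    <li class>
        <a class='btnPage' href='#' page='2'>Next</a>
    </li>
</ul>
</body></html>
"""

# Assumption: the next URL is the base URL plus the value of the `page`
# attribute on the link that follows the active pagination item.
NEXT_PAGE_XPATH = (
    "concat('https://example.com/Search?Page=', "
    "//ul[@class='pagination']/li[@class='active']"
    "/following-sibling::li[1]/a/@page)"
)


def next_page_url(page_source):
    """Evaluate the XPath against the page source and return the URL string."""
    tree = html.fromstring(page_source)
    # concat() makes the XPath return a string instead of a node list.
    return str(tree.xpath(NEXT_PAGE_XPATH))


print(next_page_url(PAGE))  # https://example.com/Search?Page=2
```

In Scrapy the same expression should work as `response.xpath(NEXT_PAGE_XPATH).get()`. One caveat for a generic crawler: on the last page there is no following `li`, so `concat()` returns the base URL with an empty `Page=` value instead of nothing, and that case would need to be treated as "no next page".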

0 Answers