I'm having a problem. From the page at https://www.avisosdeocasion.com/Resultados-Inmuebles.aspx?n=venta-casas-nuevo-leon&PlazaBusqueda=2&Plaza=2.html I'm trying to get the first piece of information from every table ('2 plantas', '3 plantas', etc.), but the code below gives me an empty list:

from lxml import html
import requests
mark=2
page = requests.get('https://www.avisosdeocasion.com/Resultados-Inmuebles.aspx?n=venta-casas-nuevo-leon&PlazaBusqueda=2&Plaza=2.html')
tree = html.fromstring(page.content)
while mark<25:
    plantas=tree.xpath('//*[@id="divDetalleResultados"]/table/tbody/tr/td/table[mark]/tbody/tr[1]/td/table/tbody/tr/td[2]/table/tbody/tr[2]/td/table/tbody/tr[1]/td[1]/text()')
    mark=mark+1
print(plantas)

Does anyone know how to fix this?

Daniel Haley
Luis

1 Answer

You need to fix your XPath expression. Something like this should work:

from lxml import html
import requests
page = requests.get('https://www.avisosdeocasion.com/Resultados-Inmuebles.aspx?n=venta-casas-nuevo-leon&PlazaBusqueda=2&Plaza=2.html')
tree = html.fromstring(page.content)
plantas = tree.xpath('//td[contains(text(),"terreno")]/preceding-sibling::td/text()')
plantas2 = [item.strip() for item in plantas]
print(plantas2)

Output:

['2 plantas', '3 plantas', '3 plantas', '2 plantas', '2 plantas', '2 plantas', '2 plantas', '2 plantas', '2 plantas', '2 plantas', '2 plantas', '2 plantas', '2 plantas', '2 plantas', '2 plantas', '2 plantas', '2 plantas', '2 plantas', '2 plantas', '2 plantas', '2 plantas', '3 plantas', '2 plantas', '2 plantas']
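To try the expression without hitting the live site, here is a minimal sketch run against a made-up HTML fragment. The SAMPLE markup is an assumption for illustration only and is much simpler than the real page; the XPath is the same one used above. It anchors on the cell whose text contains "terreno" and then steps back to the sibling cell before it, so it does not depend on how deeply the tables are nested.

```python
from lxml import html

# Hypothetical fragment: each listing row has a "plantas" cell followed by a
# "terreno" cell. The real page's markup is more deeply nested.
SAMPLE = """
<table>
  <tr><td> 2 plantas </td><td>120 m2 terreno</td></tr>
  <tr><td> 3 plantas </td><td>200 m2 terreno</td></tr>
  <tr><td>Precio</td><td>$1,000,000</td></tr>
</table>
"""

tree = html.fromstring(SAMPLE)

# Find every <td> whose text contains "terreno", then take the text of the
# <td> siblings that come before it in the same row.
cells = tree.xpath('//td[contains(text(),"terreno")]/preceding-sibling::td/text()')

# Strip the surrounding whitespace left over from the markup.
plantas = [cell.strip() for cell in cells]
print(plantas)  # ['2 plantas', '3 plantas']
```

The "Precio" row is skipped because neither of its cells contains "terreno".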
E.Wiest
  • Thank you for your answer. But I was wondering, where did you get that XPath? It is not the one Chrome gives me when I click Copy XPath. – Luis Jun 03 '20 at 04:22
  • I wrote the XPath myself. You can learn the language or use an extension like "Chropath" to generate such expressions. The XPaths produced by Chrome's built-in Copy XPath function are not the best to use. – E.Wiest Jun 03 '20 at 13:23
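For anyone wondering why the copied XPath in the question returns an empty list, two things are likely at play. First, `mark` sits inside the string literal, so it never receives the value of the Python variable; XPath reads `table[mark]` as "a table that has a child element named mark". Second, browsers add `tbody` elements to the DOM even when the downloaded HTML does not contain them, so a path copied from Chrome may not match what `requests` receives (this is common, though not verified for this particular page). The sketch below illustrates the first point only. It uses the standard library's `xml.etree.ElementTree` and an invented, simplified document so it runs offline; the same predicate semantics apply to `tree.xpath` in lxml.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified stand-in for the page's sibling tables.
DOC = ET.fromstring(
    "<div>"
    "<table><tr><td>header</td></tr></table>"
    "<table><tr><td>2 plantas</td></tr></table>"
    "<table><tr><td>3 plantas</td></tr></table>"
    "</div>"
)

# Inside the string, `mark` is not the Python variable. The predicate means
# "tables with a child element named <mark>", and no such table exists.
literal = DOC.findall("./table[mark]/tr/td")
print(literal)  # []

# Interpolating the loop variable produces a positional predicate instead.
found = []
for mark in range(2, 4):
    cells = DOC.findall(f"./table[{mark}]/tr/td")
    # Accumulate the results; reassigning one variable on each pass would
    # keep only the last table's cells.
    found.extend(cell.text for cell in cells)
print(found)  # ['2 plantas', '3 plantas']
```

Even with the interpolation fixed, a long positional path like the one Chrome copies breaks as soon as the layout shifts, which is why a short path anchored on the cell text tends to hold up better.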