How to extract the content of an HTML tag attribute with Python?

Question

I am running a Scrapy project. I need to extract the content of a tag attribute, like this:

<meta itemprop="datePublished" content="2018-07-08">

In this case, it would be the date in the content attribute. So far I have only been able to extract the text between tags.

thanks!

score 3 · Accepted Answer · answered Jul 09 '18 at 05:15

You can select an attribute value with the ::attr() pseudo-element, for example:

response.css("time::attr(title)").extract()

Reference ->

EDIT

In your case, the code should be:

response.css("meta::attr(content)").extract()

Thanks

Utkarsh Dubey

It worked! response.css("[itemprop=datePublished]::attr(content)").extract() – Rodrigo Jul 09 '18 at 05:59
Congratulations :) – Utkarsh Dubey Jul 09 '18 at 06:41

score 3 · Answer 2 · answered Jul 09 '18 at 12:21

Here is the XPath way:

content = response.xpath('//meta[@itemprop="datePublished"]/@content').extract_first()

gangabass
