
HTML:

<span class="number"> - Sep 15, 1991<br><strong>Some Number: </strong>123, 123, 145</span>

Scrapy:

samples = response.css('ul li.somthing')
for sample in samples:
    loader = ItemLoader(item=CatelogItem(), selector=sample)
    loader.add_css('some', 'span.number::text')
    yield loader.load_item()

Item.py

some = Field(
    input_processor=MapCompose(str.strip),
    output_processor=Join()
)

Result

- Sep 15, 1991

Expected

- Sep 15, 1991 Some Number: 123, 123, 145

Why does this happen? How do I get the full value loaded in the ItemLoader?

Satscreate

1 Answer


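The ::text pseudo-element on span.number returns only the text nodes that are direct children of the span, so text inside nested elements such as <strong> is skipped. Select the text nodes of the span and all of its descendants instead: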

loader.add_css('some', 'span.number *::text')
Arundeep Chohan