I am very new to Python and exploring the possibility of porting a long-developed project from another language; a buddy swears that Python is my answer. I have the IDE up and running, and Scrapy is working properly, writing the 'name' and 'rank' listed on the website to a .csv.

The problem is that I have spent the last hour trying to figure out how to extract the 'team position' field on the website. It is a span, and it is the first instance I have encountered with Scrapy that has a space in the class name, which seems ill-advised.

Below is my code. Everything works fine aside from the last line, which is supposed to pull "team position". The code presented is just one of the many iterations I have been through trying to get this. Any help would be greatly appreciated.

import scrapy


class CBS200Spider(scrapy.Spider):
    name = "expr"
    start_urls = [
        'https://www.cbssports.com/fantasy/football/rankings/ppr/top200/',
        #'https://www.cbssports.com/fantasy/football/rankings/standard/top200/',
    ]

    def parse(self, response):
        for plyr in response.css('div.player-row'):
            yield {
                'name': plyr.css('.player-name::text').get(),
                'rank': plyr.css('.rank::text').get(),
                'team': plyr.css('team position::text').get(),
            }
RKRK
Ben
  • Worth noting: I have also been going nuts in the Scrapy shell trying all kinds of things, and I am still not getting any results – Ben Aug 14 '19 at 02:27
  • There are third-party apps and browser extensions (Chrome, Firefox) that can help you as well – hadesfv Aug 14 '19 at 03:23

1 Answer

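In CSS, team and position are two separate classes, so you have to put a dot before each one, with no space between them. With a space and no dots, the selector looks for a <position> element nested inside a <team> element, which does not exist on the page.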

 '.team.position::text'

BTW: XPath compares the class attribute as a plain string, so it treats "team position" as one value.

furas