-2
import requests
from bs4 import BeautifulSoup

'''
A web crawler for eBay that collects the data of every single item.
'''

def ebay_spider(max_pages):
    page = 1
    while page <= max_pages:
        url = 'http://www.ebay.co.uk/sch/Apple-Laptops/111422/i.html?_pgn=' \
              + str(page)
        source_code = requests.get(url)

        plain_text = source_code.text
        soup = BeautifulSoup(plain_text)
        for link in soup.findAll('a', {'class': 'vip'}):
            href = 'http://www.ebay.co.uk' + link.get('href')
            title = link.string
            get_single_item_data(href)
        page += 1


def get_single_item_data(item_url):
    source_code = requests.get(item_url)
    plain_text = source_code.text
    soup = BeautifulSoup(plain_text)
    for item_name in soup.findAll('h1', {'id': "itemTitle"}):
        print(item_name.string)

ebay_spider(3)

The error says: https://i.stack.imgur.com/zbJ6y.jpg
I tried to fix it, but nothing seems to work. Any tips or answers on how to fix it?

EDIT: Sorry, everyone, for the faulty title and tag; both have been fixed.

Auginis
  • have you tried what it tells you? `soup = BeautifulSoup(plain_text,"html.parser",markup_type=markup_type)`. And please post text version of the error, not an unreadable image. – Jean-François Fabre Aug 22 '16 at 18:54
  • This has nothing to do with the `requests` module. – DeepSpace Aug 22 '16 at 18:55
  • @Jean-François Fabre sorry mate for the bad pic, but you read the error right. The problem is that when I pasted that line into my code, a new error appeared: SyntaxError: invalid character in identifier. For some weird reason I can't find what's wrong with it. And here is the previous error, the one the post was about: http://pastebin.com/HNL1ENG0 – Auginis Aug 22 '16 at 20:05

2 Answers

1

When you create the BeautifulSoup object, instead of this:

soup = BeautifulSoup(plain_text)

do this:

soup = BeautifulSoup(plain_text, 'html.parser')

Note: your problem concerns the bs4 module, not requests.
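
A minimal sketch of the same fix on a tiny inline document, so it can be tried without hitting eBay. The HTML string and the names html, titles and links are made up for illustration; only the explicit 'html.parser' argument comes from the answer above.

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for a downloaded page (not real eBay HTML)
html = (
    '<html><body>'
    '<h1 id="itemTitle">MacBook Pro 13</h1>'
    '<a class="vip" href="/itm/1">Item</a>'
    '</body></html>'
)

# Naming the parser explicitly avoids the "no parser was explicitly specified" warning
soup = BeautifulSoup(html, 'html.parser')

# Same lookups as in the question, using the find_all spelling
titles = [h1.get_text() for h1 in soup.find_all('h1', {'id': 'itemTitle'})]
links = [a.get('href') for a in soup.find_all('a', {'class': 'vip'})]

print(titles)
print(links)
```

The parser has to be named in every BeautifulSoup(...) call in the script, not only in one of the two functions.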

dannyxn
  • Excuse me sir, sorry for the faulty title and tag (my bad) and thanks for the answer, but when I write this line it says: SyntaxError: invalid character in identifier. imgur.com/a/8NBDi – Auginis Aug 22 '16 at 20:09
  • If you don't need to specify markup_type, do this: soup = BeautifulSoup(plain_text, 'html.parser') instead of this: soup = BeautifulSoup(plain_text, 'html.parser', markup_type=markup_type). And if my response was useful, mark it as useful. – dannyxn Aug 22 '16 at 20:14
  • I have changed my answer to fit your needs. Have a look at the documentation: https://www.crummy.com/software/BeautifulSoup/bs4/doc/ . – dannyxn Aug 22 '16 at 20:23
  • I tried the version that you edited and that is on crummy.com, but it still says the same. Here is the code: http://pastebin.com/8mrjy6Mc and here is the error: http://pastebin.com/HrDLi6G6 Please help me if you can – Auginis Aug 23 '16 at 08:10
  • Look at the first function: you didn't change all of the soup definitions. Look at the 14th line and do the same as you did with the 25th line. – dannyxn Aug 23 '16 at 09:56
0

This is entirely unrelated to the requests module. As Jean-François stated, do what the warning tells you and move along.

soup = BeautifulSoup(plain_text,"html.parser",markup_type=markup_type)

Carter Smith
  • Excuse me sir, sorry for the faulty title and tag (my bad) and thanks for the answer, but when I write this line it says: SyntaxError: invalid character in identifier. http://imgur.com/a/8NBDi – Auginis Aug 22 '16 at 19:48
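
One possible cause of the "SyntaxError: invalid character in identifier" reported in these comments is an invisible character (such as a zero-width space) picked up when copying code from a web page. The thread does not confirm this, so treat it as an assumption. The sketch below uses only the standard library to list non-ASCII characters in a pasted line and to strip invisible format characters. The helper names find_suspicious_chars and strip_invisible, and the sample string pasted, are hypothetical.

```python
import unicodedata


def find_suspicious_chars(line):
    """Return (index, code point, Unicode name) for each non-ASCII character."""
    return [
        (i, 'U+%04X' % ord(ch), unicodedata.name(ch, 'UNKNOWN'))
        for i, ch in enumerate(line)
        if ord(ch) > 127
    ]


def strip_invisible(line):
    """Drop characters of Unicode category Cf (format), e.g. zero-width spaces."""
    return ''.join(ch for ch in line if unicodedata.category(ch) != 'Cf')


# Hypothetical pasted line with two zero-width characters hidden inside an identifier
pasted = 'soup = BeautifulSoup(plain_text, "html.parser", markup_ty\u200c\u200bpe)'

hits = find_suspicious_chars(pasted)
for hit in hits:
    print(hit)

cleaned = strip_invisible(pasted)
print(cleaned)
```

Retyping the line by hand instead of pasting it has the same effect as strip_invisible.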