Problem while using bs4: 'NoneType' object is not subscriptable

Question

I'm trying to scrape a Goodreads page to get all editions of a book, but when I run the code I get this error:

Traceback (most recent call last):
  File "C:/xxx/PycharmProjects/wikipedia_pageview/isbn.py", line 141, in <module>
    ed_details = get_editions_details(isbn) 
  File "C:/xxx/PycharmProjects/wikipedia_pageview/isbn.py", line 79, in get_editions_details
    if ed_link := f"https://www.goodreads.com{ed_item['href']}":...
TypeError: 'NoneType' object is not subscriptable

I tried adding checks for this in the relevant places, but they don't work. Code:

def get_editions_details(isbn):
    # Create the search URL with the ISBN of the book
    data = {'q': isbn}
    book_url = get_page("https://www.goodreads.com/search", data)
    #print(book_url)
    # Parse the markup with Beautiful Soup
    soup = bs(book_url.text, 'lxml')
    # Retrieve from the book's page the link for other editions
    # and the total number of editions
    if ed_item := soup.find("div", class_="otherEditionsLink"):
        if ed_item := ed_item.find("a"):
            print(ed_item)
        else:
            pass

    if ed_item:
        ed_num = ed_item.text.strip().split(' ')[-1].strip('()')

    if ed_link := f"https://www.goodreads.com{ed_item['href']}":#capire...
        print(ed_link)
    else:
        pass
    return((ed_link, int(ed_num), isbn))



if __name__ == "__main__":
    try:
        os.mkdir('./urls_files')
    except Exception:
        pass

    isbns = get_isbn()

    for isbn in isbns:
        ed_details = get_editions_details(isbn)
        get_editions_urls(ed_details)

I don't see where in your code that error line is. I see no `if ed_link := ...` in what you've posted. — ddejohn, Feb 08 '22 at 18:28
Right... see my edit. I posted another part of the code it wasn't really necessary — Flaskappdonotwork, Feb 08 '22 at 18:32
Does this answer your question? [Why do I get AttributeError: 'NoneType' object has no attribute 'something'?](https://stackoverflow.com/questions/8949252/why-do-i-get-attributeerror-nonetype-object-has-no-attribute-something) — Ulrich Eckhardt, Feb 08 '22 at 21:24

HedgeHog · Accepted Answer · 2022-02-08T21:00:54.480

What happens?

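The indentation in your example seems to be incorrect, and the code does not handle a wrong or missing ISBN or a missing editions link. When soup.find() finds no match it returns None, so ed_item is None and ed_item['href'] raises the TypeError.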
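
To see the failure in isolation, here is a minimal sketch that needs no network access. `ed_item` is set to `None` by hand as a stand-in for a `soup.find(...)` call that matched nothing, and `build_link` is a hypothetical helper used only for illustration. It also suggests why the `if ed_link := f"..."` check cannot help: the subscript is evaluated while the f-string is being built, and the resulting non-empty string is always truthy.

```python
# Stand-in for soup.find(...) returning no match (illustrative assumption).
ed_item = None


def build_link(item):
    # The subscript runs while the f-string is being built,
    # so a None item raises before any `if` can test the result.
    return f"https://www.goodreads.com{item['href']}"


try:
    build_link(ed_item)
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print(error_message)

# Even with an empty href the string is non-empty, so the check is always true.
always_truthy = bool(build_link({"href": ""}))
print(always_truthy)
```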

How to fix?

Assign ed_link and ed_num only once you can be sure that ed_item exists and has an href. Otherwise leave them at a default such as None or 0, or handle the missing result in some other way:

import requests
from bs4 import BeautifulSoup as bs


def get_editions_details(isbn):
    # Search Goodreads for the ISBN and parse the response
    data = {'q': isbn}
    book_url = requests.get("https://www.goodreads.com/search", data)
    soup = bs(book_url.text, 'lxml')

    # Defaults used when no editions link is found
    ed_link = None
    ed_num = 0

    # Read href and the count only after both lookups have succeeded
    if ed_item := soup.find("div", class_="otherEditionsLink"):
        if ed_item := ed_item.find("a"):
            ed_link = f"https://www.goodreads.com{ed_item['href']}"
            ed_num = ed_item.text.strip().split(' ')[-1].strip('()')

    return ed_link, int(ed_num), isbn


if __name__ == "__main__":
    # Just an example: an invalid ISBN to simulate a search with no editions link
    ed_details = get_editions_details(1)
    if ed_details[0]:
        get_editions_urls(ed_details)
    else:
        print(f'no editionlinks for isbn:{ed_details[2]}')
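
Note that `int(ed_num)` can still raise a `ValueError` if the link text does not end in a plain number. One possible way to harden that step is sketched below, assuming the link text ends with the count in parentheses. `parse_edition_count` is a hypothetical helper, not part of the original code, and the sample strings are made up for illustration.

```python
import re


def parse_edition_count(link_text):
    """Return the trailing "(N)" count from link_text, or 0 if there is none."""
    # Accept digits with optional thousands separators, e.g. "(1,234)".
    match = re.search(r"\(([\d,]+)\)\s*$", link_text or "")
    if match is None:
        return 0
    digits = match.group(1).replace(",", "")
    return int(digits) if digits else 0


# Made-up sample strings, not real Goodreads output.
for sample in ["All Editions (12)", "All Editions (1,234)", "All Editions", ""]:
    print(repr(sample), parse_edition_count(sample))
```

Returning 0 here matches the default used for `ed_num` above, so the caller can keep treating 0 as "no editions found".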

I have an error with that: requests.exceptions.MissingSchema: Invalid URL 'None': No schema supplied. Perhaps you meant http://None? — Flaskappdonotwork, Feb 08 '22 at 20:26
As mentioned, you have to handle the case where there are no results. Take a look: I added a check that skips the call to `get_editions_urls(ed_details)`. — HedgeHog, Feb 08 '22 at 20:56
