I want to update this function:

def has_class_but_no_id(tag):
    return tag.has_key('class') and not tag.has_key('id')

This function was written for Python 2 and does not work in Python 3.

My idea was to turn the HTML document into a list like this:

list_of_descendants = list(soup.descendants)

That way I could get the tags that have a class but no id. In other words, I want to find all tags with class = ... but without id = .... I have no idea how to handle this problem.

Keyur Potdar
Tae

2 Answers

0

The documentation says:

I renamed one method for compatibility with Python 3:

  • Tag.has_key() -> Tag.has_attr()

Also, the exact same function is available in the documentation here:

If none of the other matches work for you, define a function that takes an element as its only argument. The function should return True if the argument matches, and False otherwise.

Here’s a function that returns True if a tag defines the “class” attribute but doesn’t define the “id” attribute:

def has_class_but_no_id(tag):
    return tag.has_attr('class') and not tag.has_attr('id')
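
A function like this can be passed directly to `find_all()`, which calls it on tags only, so text nodes never reach it. Below is a minimal sketch; the HTML snippet and the names `sample_html` and `matched_names` are made up for illustration and are not from the question.

```python
from bs4 import BeautifulSoup

# Hypothetical sample document, used only for this illustration.
sample_html = """
<div class="box" id="main">
  <p class="intro">Hello</p>
  <p id="only-id">World</p>
  <span class="note">!</span>
</div>
"""


def has_class_but_no_id(tag):
    return tag.has_attr('class') and not tag.has_attr('id')


soup = BeautifulSoup(sample_html, "html.parser")

# find_all() calls the function once per tag and keeps the tags
# for which it returns True.
matches = soup.find_all(has_class_but_no_id)
matched_names = [tag.name for tag in matches]
print(matched_names)  # ['p', 'span']
```

The `div` is skipped because it has both attributes, and the second `p` is skipped because it has no class.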
Keyur Potdar
  • OK, I used your suggestion, but then I hit another problem: AttributeError: 'NavigableString' object has no attribute 'has_attr'. As I said, I converted all the elements in the BeautifulSoup object into a list and applied has_attr to them, and that error came up. Can you please advise me further? – Tae Apr 06 '18 at 08:46
  • You're using the function on text (NavigableString) as well as tags. Using `.has_attr` on text will raise the above error. Before calling the function, check whether the list item is a tag. You can use `isinstance(item, Tag)` for that. Also, don't forget to import `Tag` from the `bs4` module. – Keyur Potdar Apr 06 '18 at 09:24
  • If you can't implement that, add the code you're trying to the question and I'll edit my answer accordingly. – Keyur Potdar Apr 06 '18 at 09:52
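
The check suggested in the comments could look roughly like this. It is only a sketch: the HTML string and the names `sample_html`, `tags_only`, and `result_names` are assumptions for illustration, since the asker's actual code was not posted.

```python
from bs4 import BeautifulSoup, Tag

# Hypothetical input; the real document from the question is unknown.
sample_html = '<p class="a">text <b id="x">bold</b><i class="c">it</i></p>'

soup = BeautifulSoup(sample_html, "html.parser")
list_of_descendants = list(soup.descendants)

# The list mixes Tag and NavigableString objects. Only Tag has has_attr(),
# so drop everything that is not a Tag before checking attributes.
tags_only = [item for item in list_of_descendants if isinstance(item, Tag)]

result = [tag for tag in tags_only
          if tag.has_attr('class') and not tag.has_attr('id')]
result_names = [tag.name for tag in result]
print(result_names)  # ['p', 'i']
```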
-1

I solved this problem.

Here is what I had to do:

1. Collect all the tags in the BeautifulSoup object, including the children of each tag (contents):

from bs4 import BeautifulSoup

soup = BeautifulSoup(html_doc, "html.parser")
list_of_descendants = list(soup.descendants)

2. Remove all NavigableString objects, because they have no has_attr() method:

from bs4.element import Tag

def terminate_navis(list_of_some):
    # Keep only Tag objects; NavigableString items have no has_attr().
    new_list = []
    for elem in list_of_some:
        if isinstance(elem, Tag):
            new_list.append(elem)
    return new_list

new_list = terminate_navis(list_of_descendants)


def contents_adding(arg_list):
    # Add the direct Tag children of every tag to the list, skipping
    # tags that are already in it.
    new_list = list(arg_list)  # copy, so arg_list is not changed while looping
    seen = {id(elem) for elem in new_list}
    for elem in arg_list:
        for child in terminate_navis(elem.contents):
            if id(child) not in seen:
                seen.add(id(child))
                new_list.append(child)
    return new_list

3. Keep only the tags that have a 'class' attribute and no 'id' attribute (both checked with has_attr()):

def justcl(tag_lists):
    # Keep only the tags that define a 'class' attribute.
    class_lists = []
    for elem in tag_lists:
        if elem.has_attr('class'):
            class_lists.append(elem)
    return class_lists

def notids(class_lists):
    # Keep only the tags that do not define an 'id' attribute.
    no_id_lists = []
    for elem in class_lists:
        if not elem.has_attr('id'):
            no_id_lists.append(elem)
    return no_id_lists

4. Collect the remaining tags in a list and print them, either with a single print call or in a for loop.
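
The steps above can also be condensed into a single pass. This is a sketch, not the code from this answer: the function name `tags_with_class_but_no_id` and the sample HTML are hypothetical. It relies on the fact that `soup.descendants` already walks the tree recursively, so nested children are reached without collecting `contents` separately.

```python
from bs4 import BeautifulSoup, Tag

# Hypothetical sample document with nested tags.
html_doc = (
    '<div class="outer">'
    '<ul id="menu">'
    '<li class="item">one</li>'
    '<li class="item" id="second">two</li>'
    '</ul>'
    '</div>'
)


def tags_with_class_but_no_id(soup):
    """Return every tag in the tree that has a class but no id."""
    return [
        elem for elem in soup.descendants
        if isinstance(elem, Tag)          # step 2: skip NavigableString
        and elem.has_attr('class')        # step 3: must have a class
        and not elem.has_attr('id')       # step 3: must not have an id
    ]


soup = BeautifulSoup(html_doc, "html.parser")
found = tags_with_class_but_no_id(soup)

# Step 4: print the result.
for tag in found:
    print(tag.name, tag['class'])

found_names = [tag.name for tag in found]
```

Here the nested `li` without an id is found even though it sits two levels below the `div`.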

Ali
Tae
  • Can you please format all the code in your answer and update your question with what you want to accomplish, specifically with an example? I doubt you need all this messy code from your answer to achieve your goal. – radzak Apr 07 '18 at 19:52