I am trying to find every li element that has class="A" or class="B", in the order they appear in the document. In other words, I want an OR match that returns the results in their proper order. Here are my attempts:

# Attempt 1
print(soup.find_all("li", attrs={"class": re.compile(r"Some Text A|Some Text B")}))

# Attempt 2
print(soup.findAll("li", {"class": ["Some Text A", "Some Text B"]}))

# Attempt 3
print(soup.find_all("li", class_=re.compile(r"Some Text A|Some Text B")))

Every attempt returns an empty list, yet there should be 46 results. I can search for each class individually, but I can't figure out how to search for both at once. Note that the two classes never appear on the same li; they are two different classes that match different elements.

None of the Stack Overflow answers I found have worked so far. I am using Python 3.4 and Beautiful Soup 4.

Berzark

1 Answer

I have found a partial solution. For some reason, the regex does not match when string "A" and/or string "B" contains spaces. For example:

This doesn't work:

print(soup.find_all("li", attrs={"class": re.compile(r"Some Text A|Some Text B")}))

However, this works:

print(soup.find_all("li", attrs={"class": re.compile(r"A|B")}))

Thankfully, my pattern was still specific enough after dropping everything before the last space. I would appreciate an explanation of, or a workaround for, regex searches on class strings that contain spaces.

Berzark
  • That's because `Some Text A` is treated as three separate CSS classes: `Some`, `Text`, and `A`. An earlier discussion of the same issue: http://stackoverflow.com/a/13794740/2998271 – har07 Jul 22 '15 at 06:58
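
Building on the comment above, one possible workaround is to skip the regex and pass `find_all` a filter function that compares class tokens as sets. This is only a minimal sketch: the markup, the `TARGETS` list, and the `has_target_classes` helper are made up for illustration and are not taken from the original page. It also assumes that matching the tokens in any order is acceptable.

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for the real page.
html = """
<ul>
  <li class="Some Text A">first</li>
  <li class="Other">skipped</li>
  <li class="Some Text B">second</li>
  <li class="Some Text A">third</li>
</ul>
"""

soup = BeautifulSoup(html, "html.parser")

# Each target is the set of class tokens that a single <li> must carry.
TARGETS = [{"Some", "Text", "A"}, {"Some", "Text", "B"}]


def has_target_classes(tag):
    """Return True for an <li> whose classes include every token of any target."""
    if tag.name != "li":
        return False
    # The parser stores class="Some Text A" as the list ["Some", "Text", "A"].
    classes = set(tag.get("class") or [])
    return any(target <= classes for target in TARGETS)


# find_all walks the tree top to bottom, so matches come back in document order.
matches = soup.find_all(has_target_classes)
texts = [li.get_text() for li in matches]
print(texts)
```

Because both groups are collected in a single pass, the A and B items stay interleaved in the order they appear on the page. In Beautiful Soup versions whose `select()` accepts comma-separated selector groups, `soup.select("li.Some.Text.A, li.Some.Text.B")` should give a similar result.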