I have something like this in an HTML page:

<ul>
    <li>
        <span data-name-en="data1">Value1</span>
        <span data-view-en="test1"><span class="fa fa-gear"></span></span>
    </li>
    <li>
        <span data-name-en="data2">Value2</span>
        <span data-view-en="test2"><span class="fa fa-gear"></span></span>
    </li>
    <li>
        <span data-name-en="data3">Value3</span>
        <span data-view-en="test3"><span class="fa fa-gear"></span></span>
    </li>
    <li>
        <span data-name-en="data4">Value4</span>
        <span data-view-en="test4"><span class="fa fa-gear"></span></span>
    </li>
</ul>

How can I get all elements that have a data-name-en attribute?

Chalist

2 Answers

from bs4 import BeautifulSoup as bs

s = '''
<ul>
    <li>
        <span data-name-en="data1">Value1</span>
        <span data-view-en="test1"><span class="fa fa-gear"></span></span>
    </li>
    <li>
        <span data-name-en="data2">Value2</span>
        <span data-view-en="test2"><span class="fa fa-gear"></span></span>
    </li>
    <li>
        <span data-name-en="data3">Value3</span>
        <span data-view-en="test3"><span class="fa fa-gear"></span></span>
    </li>
    <li>
        <span data-name-en="data4">Value4</span>
        <span data-view-en="test4"><span class="fa fa-gear"></span></span>
    </li>
</ul>
'''

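# Parse as HTML; the built-in 'html.parser' needs no extra dependency
soup = bs(s, 'html.parser')
# Collect the data-name-en value of every <span> that has the attribute
result = [x['data-name-en'] for x in soup('span') if x.has_attr('data-name-en')]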

print(result)
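
The snippet above returns the attribute values, while the question asks for the elements themselves. A minimal sketch of two ways to get the matching tags with BeautifulSoup, assuming the built-in `html.parser` and a shortened copy of the markup (the names `html_doc`, `elements`, `same_elements` and `pairs` are illustrative, not from the original post):

```python
from bs4 import BeautifulSoup

# Shortened copy of the markup from the question (assumption: two items suffice)
html_doc = """
<ul>
    <li>
        <span data-name-en="data1">Value1</span>
        <span data-view-en="test1"><span class="fa fa-gear"></span></span>
    </li>
    <li>
        <span data-name-en="data2">Value2</span>
        <span data-view-en="test2"><span class="fa fa-gear"></span></span>
    </li>
</ul>
"""

soup = BeautifulSoup(html_doc, "html.parser")

# CSS attribute selector: every element that carries data-name-en, whatever its value
elements = soup.select("[data-name-en]")

# Same filter via find_all: a value of True matches any value of the attribute
same_elements = soup.find_all(attrs={"data-name-en": True})

# Each match is a Tag, so both the attribute and the text are available
pairs = [(el["data-name-en"], el.get_text()) for el in elements]
print(pairs)  # [('data1', 'Value1'), ('data2', 'Value2')]
```

Either call yields `Tag` objects in document order; narrowing the selector to `li span[data-name-en]` would restrict the match to spans inside list items.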
Dmitry Erohin

I found the correct answer:

from pyquery import PyQuery

s = '''
<ul>
    <li>
        <span data-name-en="data1">Value1</span>
        <span data-view-en="test1"><span class="fa fa-gear"></span></span>
    </li>
    <li>
        <span data-name-en="data2">Value2</span>
        <span data-view-en="test2"><span class="fa fa-gear"></span></span>
    </li>
    <li>
        <span data-name-en="data3">Value3</span>
        <span data-view-en="test3"><span class="fa fa-gear"></span></span>
    </li>
    <li>
        <span data-name-en="data4">Value4</span>
        <span data-view-en="test4"><span class="fa fa-gear"></span></span>
    </li>
</ul>
'''

html = PyQuery(s)
# CSS selector: every <span> inside an <li> that has a data-name-en attribute
items = html.find('li span[data-name-en]')

To get the attribute value of each item, do this:

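for item in items:
    # Iterating a PyQuery object yields lxml elements, so wrap each one again
    print(PyQuery(item).attr("data-name-en"))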
Chalist