Please help: I'm trying to run a second XPath query on an element returned by an earlier XPath query in lxml, but it doesn't work as I expect.

Here's my code:

from lxml import html,etree
import pprint
import requests
url="http://thuvienphapluat.vn"
page = requests.get(url)
tree=html.fromstring(page.content)
vbplm=tree.xpath('//div[@id="VBPLMOI"]//div[@class="left-col"]')
for vb in vbplm:
    print etree.tostring(vb)
    print ">>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>"
    print etree.tostring(vb.xpath('//a')[0],encoding='utf-8')
    break

The line vb.xpath('//a')[0] returns the a tag that wraps the site logo image, not the first a tag inside vb. Here is the output:

<div class="left-col">&#13;
                <div class="number">&#13;
                    1</div>&#13;
                <div class="nq">&#13;
                    <p class="nqTitle" lawid="342726">&#13;
                        <a onclick="Doc_CT(MemberGA)" href="http://thuvienphapluat.vn/van-ban/Giao-duc/Van-ban-hop-nhat-02-VBHN-BGDDT-huong-dan-152-2007-QD-TTg-hoc-bong-chinh-sach-hoc-sinh-sinh-vien-342726.aspx">V&#259;n b&#7843;n h&#7907;p nh&#7845;t 02/VBHN-BGD&#272;T n&#259;m 2017 h&#432;&#7899;ng d&#7851;n Quy&#7871;t &#273;&#7883;nh 152/2007/Q&#272;-TTg v&#7873; h&#7885;c b&#7893;ng ch&#237;nh s&#225;ch &#273;&#7889;i v&#7899;i h&#7885;c sinh, sinh vi&#234;n h&#7885;c t&#7841;i c&#417; s&#7903; gi&#225;o d&#7909;c thu&#7897;c h&#7879; th&#7889;ng gi&#225;o d&#7909;c qu&#7889;c d&#226;n do B&#7897; Gi&#225;o d&#7909;c v&#224; &#272;&#224;o t&#7841;o ban h&#224;nh</a>&#13;
                    </p>&#13;
                    <p class="links-bot">&#13;
                        <a onmouseover="LS_Tip_New(13,0,1)" onmouseout="hideddrivetip();" style="color:#AFAFAF;">Ti&#7871;ng Anh</a>&#13;
                        |&#13;
                        <a onmouseover="LS_Tip_New(13,0,2)" onmouseout="hideddrivetip();" style="color:#AFAFAF;">V&#259;n b&#7843;n g&#7889;c</a>&#13;
                        |&#13;
                        <a onclick="Doc_Rel(MemberGA)" onmouseover="LS_Tip_New(13,0,4)" onmouseout="hideddrivetip();" href="http://thuvienphapluat.vn/van-ban/Giao-duc/Van-ban-hop-nhat-02-VBHN-BGDDT-huong-dan-152-2007-QD-TTg-hoc-bong-chinh-sach-hoc-sinh-sinh-vien-342726.aspx?tab=3">L&#432;&#7907;c &#273;&#7891;</a>&#13;
                        |&#13;
                        <a onclick="Doc_ST(MemberGA)" onmouseover="LS_Tip_New(13,0,3)" onmouseout="hideddrivetip();" href="http://thuvienphapluat.vn/van-ban/Giao-duc/Van-ban-hop-nhat-02-VBHN-BGDDT-huong-dan-152-2007-QD-TTg-hoc-bong-chinh-sach-hoc-sinh-sinh-vien-342726.aspx?tab=4">Li&#234;n quan hi&#7879;u l&#7921;c</a>&#13;
                        |&#13;
                        <a onclick="Doc_DL(MemberGA)" href="http://thuvienphapluat.vn/van-ban/Giao-duc/Van-ban-hop-nhat-02-VBHN-BGDDT-huong-dan-152-2007-QD-TTg-hoc-bong-chinh-sach-hoc-sinh-sinh-vien-342726.aspx?tab=7">T&#7843;i v&#7873;</a>&#13;
                        &#13;
                        &#13;
                    </p>&#13;
                    &#13;
                </div>&#13;
                &#13;
            </div>&#13;

>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
<a href="/" title="TH&#x1AF; VI&#x1EC6;N PH&#xC1;P LU&#x1EAC;T">&#13;
        <img src="/images/logo_xuan.png" alt="Logo" class="logo"/></a> 
[Finished in 0.7s]
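
For what it's worth, the likely cause is that an XPath expression starting with // is absolute: it searches from the document root even when it is called on a sub-element such as vb. Prefixing it with a dot (.//a) makes it relative to the context element. Below is a minimal sketch of the difference. The SAMPLE markup and the doc-1.aspx link are made-up stand-ins for the real page, so treat it as an illustration rather than a tested fix for the actual site:

```python
from lxml import html

# Hypothetical miniature of the page: a logo link outside the target div
# and a document link inside it (not the real site's markup).
SAMPLE = """
<html><body>
  <a href="/" title="home"><img src="/images/logo.png" alt="Logo"/></a>
  <div id="VBPLMOI">
    <div class="left-col">
      <p class="nqTitle"><a href="/van-ban/doc-1.aspx">Document 1</a></p>
    </div>
  </div>
</body></html>
"""

tree = html.fromstring(SAMPLE)
vb = tree.xpath('//div[@id="VBPLMOI"]//div[@class="left-col"]')[0]

# '//a' is absolute: it starts at the document root, even when called on vb.
absolute_first = vb.xpath('//a')[0]

# './/a' is relative: it only looks at descendants of vb.
relative_first = vb.xpath('.//a')[0]

print(absolute_first.get('href'))
print(relative_first.get('href'))
```

In the original loop, that would mean replacing vb.xpath('//a') with vb.xpath('.//a').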
phuong