I have a file called "name.txt".
Its content is below:

<td>    
    <input class="name" value="Michael">    
    <input class="age" value="22">    
    <input class="location" value="hebei">

</td>


<td>
    <input class="name" value="Jack">
    <input class="age" value="23">
    <input class="location" value="NewYo">
</td>

I want to use pyquery to get all the input tags and then traverse them.

For each one, I use '.filter' to pick out the elements with the name class and the age class.

Finally, I want to get the value of name and age and write all the results into a file called 'name_file.txt'.

My code is below

# -*- coding: utf-8 -*-
from pyquery import PyQuery as pq
doc = pq(filename='name.txt')

input = doc('input')

for result in input.items():
    name_result = result.filter('.name')
    age_result = result.filter('.age')
    name = name_result.attr('value')
    age = age_result.attr('value')
    print "%s:%s" %(name,age)
    c = "%s:%s" %(name,age)
    f = file('name_file.txt','w')
    f.write(c) 
    f.close()

But I ran into 2 issues:

1. The results I get are not "Michael:22"; they are "Michael:None" and "None:22".

2. The content written to 'name_file.txt' is just 'None:None', not all the results I got.

Michael

2 Answers

The first problem stems from the fact that you're looping through every individual <input ... > element (collected by doc('input')), so on each iteration filter() can match either the name or the age, but never both (and neither for the location input). What you can do instead is loop through the individual <td> ... </td> blocks and extract the matching children. This is a bit wasteful, but it keeps in line with your idea:

from pyquery import PyQuery as pq

doc = pq(filename='name.txt')  # open our document from `name.txt` file

for result in doc('td').items():  # loop through all <td> ... </td> items
    name_result = result.find('.name')  # grab a tag with class="name"
    age_result = result.find('.age')  # grab a tag with class="age"
    name = name_result.attr('value')  # get the name's `value` attribute value
    age = age_result.attr('value')  # get the age's `value` attribute value
    print("{}:{}".format(name, age))  # print it to the STDOUT as name:age

As for the second part: you're opening name_file.txt in write mode, writing a line and then closing it on each loop iteration. Opening a file in write mode truncates everything in it, so every iteration overwrites the previous one and only the last record survives. Try doing this instead:

from pyquery import PyQuery as pq

doc = pq(filename='name.txt')  # open our document from `name.txt` file

with open("name_file.txt", "w") as f:  # open name_file.txt for writing
    for result in doc('td').items():  # loop through all <td> ... </td> items
        name_result = result.find('.name')  # grab a tag with class="name"
        age_result = result.find('.age')  # grab a tag with class="age"
        name = name_result.attr('value')  # get the name's `value` attribute value
        age = age_result.attr('value')  # get the age's `value` attribute value
        print("{}:{}".format(name, age))  # print values to the STDOUT as name:age
        f.write("{}:{}\n".format(name, age))  # write to the file as name:age + a new line 
zwer
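from pyquery import PyQuery as pq

doc = pq(filename='name.txt')  # the input file from the question

body = doc.children('body')  # renamed from `input` to avoid shadowing the built-in

f = open('name_file.txt', 'w')  # open once, before the loop

for x in [result.html() for result in body.items('td')]:
    x = pq(x)  # re-parse the inner HTML of this <td>
    name = x('input').eq(0).attr('value')  # first <input> holds the name
    age = x('input').eq(1).attr('value')  # second <input> holds the age
    c = "%s:%s" % (name, age)
    print(c)
    f.write(c + "\n")  # newline so each record lands on its own line

f.close()  # close once, after the loop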

The statement that opens the file cannot go inside the loop; otherwise the file is truncated on every iteration and ends up holding just one record.

Similarly, close it once after the loop, not after writing each record.
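
To see the effect in isolation, here is a minimal sketch that uses only the standard library. The two record strings are the results the question expects; the temporary directory and both file names are made up for illustration and are not part of the original code.

```python
import os
import tempfile

records = ["Michael:22", "Jack:23"]  # expected results from the question

with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical file names, used only for this comparison.
    inside_path = os.path.join(tmp, "opened_inside_loop.txt")
    outside_path = os.path.join(tmp, "opened_once.txt")

    # Reopening in "w" mode on every iteration truncates the file each time.
    for record in records:
        with open(inside_path, "w") as f:
            f.write(record + "\n")

    # Opening once, before the loop, keeps every record.
    with open(outside_path, "w") as f:
        for record in records:
            f.write(record + "\n")

    with open(inside_path) as f:
        inside_lines = f.read().splitlines()
    with open(outside_path) as f:
        outside_lines = f.read().splitlines()

print(inside_lines)   # only the last record is left
print(outside_lines)  # both records are present
```

If the file really must be reopened on each iteration, append mode ("a") would avoid the truncation, but opening once is simpler.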

Aakash Verma
  • This ultimately suffers from the same problem as the OP's example: it lists individual `input` fields regardless of their grouping in `<td> ... </td>`, so it will just produce one `class:value` pair in each loop (e.g. `name:Michael`, `age:22` etc.) instead of the OP's desired `Michael:22`, `Jack:23` etc. – zwer Jul 08 '17 at 09:46
  • @zwer I am so sorry, I didn't focus on what my code did. Here's the 110% working one. – Aakash Verma Jul 08 '17 at 11:55