
I have a text file containing a list of names and job descriptions, for example:

Jim Salesman
Dwight Salesman
Pam Receptionist
Michael Manager
Oscar Accountant

I wanted to add the lines for everyone whose job is "Salesman" to a list. At the same time, I would also like to print out the full list of names and job descriptions. I wrote the following Python code:

employee_file = open("employees.txt", "r")
matching = [sales for sales in employee_file if "Salesman" in sales]
print (matching)

print (employee_file.read())

employee_file.close()

The result I got is:

['Jim Salesman\n', 'Dwight Salesman\n']


Process finished with exit code 0

However, when I comment out the 2nd and 3rd lines of code, print (employee_file.read()) prints the full list of names and job descriptions.

Can someone explain why print (employee_file.read()) is blank when the 2nd and 3rd lines of code are left in? I suspect it is because employee_file is an empty variable. But I can't understand why that is the case.

Do I need to define a new variable employee_file2 and reopen the "employees.txt" file before executing the print function, for example:

employee_file2 = open("employees.txt", "r")
print (employee_file2.read())

Thanks in advance for your help.

wjandrea
  • The list comprehension read the entire file, leaving the current position at the very end of the file. Any further attempt to read from the file will return nothing. Reopening the file would solve this, but just calling `employee_file.seek(0)` to move the current position back to the start of the file would be simpler. – jasonharper Sep 06 '20 at 00:04
  • See https://docs.python.org/3/tutorial/inputoutput.html#methods-of-file-objects: "If the end of the file has been reached, f.read() will return an empty string ('')" – jarmod Sep 06 '20 at 00:11
  • Your list comprehension read to the end of the file. Load the file into a list first, then operate on it. `employee_file = open("employees.txt", "r").readlines()`. – TheLazyScripter Sep 06 '20 at 00:11
  • Thanks to everyone who answered. I did not realise the "pointer/cursor" would not reset to the start of the file. It makes sense now. TheLazyScripter's solution is a good simple solution but I also like the idea of resetting it with `employee_file.seek(0)` – Edward Styles Sep 06 '20 at 00:40

2 Answers

2

This is because the list comprehension

matching = [sales for sales in employee_file if "Salesman" in sales]

reads the file all the way through, which leaves the file pointer at the end of the file, so there is nothing left for read() to return. If you open the file again and print, it will print all the contents.

Do I need to define a new variable employee_file2 and reopen the "employees.txt" file before executing the print function

You certainly can, and it will work. Alternatively, call employee_file.seek(0) on the file object you already have open to move the pointer back to the start, so that the next read() returns the whole file again.
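As a minimal sketch of the seek(0) approach: the snippet below assumes the five sample lines from the question, and writes them to a temporary file (a stand-in for employees.txt, used only so the example runs on its own). The names SAMPLE, path, at_end and full_text are made up for illustration.

```python
import os
import tempfile

# Hypothetical sample data standing in for employees.txt.
SAMPLE = (
    "Jim Salesman\n"
    "Dwight Salesman\n"
    "Pam Receptionist\n"
    "Michael Manager\n"
    "Oscar Accountant\n"
)

# Write the sample to a temporary file so the sketch is self-contained.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write(SAMPLE)
    path = tmp.name

with open(path, "r") as employee_file:
    # Iterating consumes the file and leaves the position at the end.
    matching = [line for line in employee_file if "Salesman" in line]
    at_end = employee_file.read()     # empty string: nothing left to read
    employee_file.seek(0)             # move the position back to the start
    full_text = employee_file.read()  # the whole file again

os.remove(path)  # clean up the temporary file

print(matching)
print(full_text)
```

Reopening the file gives the same result; seek(0) just avoids the second open() call.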

wjandrea
Javin
  • Thanks wjandrea. I understand now. I hadn't thought of it that way...I didn't realise there was a "pointer" and it needed to be reset. – Edward Styles Sep 06 '20 at 00:42
  • @EdwardStyles you're welcome :D Btw my name is Javin, wjandrea is the other answerer & his name is on this answer because he suggested edits xD – Javin Sep 06 '20 at 04:11
  • Sorry Javin...new to this. Appreciate you for taking the time to answer my query. – Edward Styles Sep 06 '20 at 06:20
0

Python keeps track of where it is in a file with a pointer. When you iterate over all the lines of a file, like in your list comprehension, the pointer goes to the end of the file. Then, per the documentation:

If the end of the file has been reached, f.read() will return an empty string ('').

>>> f.read()
'This is the entire file.\n'
>>> f.read()
''

Instead, get all the data from the file as a list, then process it, not touching the file again.

with open("employees.txt") as f:
    employees = f.read().splitlines()

salespeople = [e for e in employees if "Salesman" in e]

print(salespeople)
# -> ['Jim Salesman', 'Dwight Salesman']
print(employees)
# -> ['Jim Salesman', 'Dwight Salesman', 'Pam Receptionist', 'Michael Manager', 'Oscar Accountant']

BTW, it's best practice to open files with a with statement. Then you don't need to close the file manually: it is closed automatically when the block ends, even if an exception is raised inside it.
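To illustrate the point about the with statement, here is a small sketch that checks the file's closed attribute inside and after the block. It assumes a throwaway temporary file with two made-up lines rather than the real employees.txt, and the names open_inside and closed_after are only for illustration.

```python
import os
import tempfile

# Create a small temporary file so the example does not depend on employees.txt.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("Jim Salesman\nPam Receptionist\n")
    path = tmp.name

with open(path) as f:
    employees = f.read().splitlines()
    open_inside = not f.closed  # still open inside the block

closed_after = f.closed  # closed automatically once the block ends

os.remove(path)  # clean up the temporary file

print(employees)
print(open_inside, closed_after)
```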

Due credit to jasonharper, jarmod, and TheLazyScripter, who posted comments that inspired this answer.

wjandrea