How to extract only lines with a specific word from a text file and write them to a new one?

Question

What is the way to extract only the lines containing a specific word from a requests response (an online text file) and write them to a new text file? I am stuck here...

This is my code:

r = requests.get('http://website.com/file.txt'.format(x))
with open('data.txt', 'a') as f:
    if 'word' in line:
        f.write('\n')
        f.writelines(str(r.text))
        f.write('\n')

If I remove the line if 'word' in line:, it works, but for all lines, so it just copies every line from one file to the other.

Any idea what the correct way is to extract (filter) only the lines with a specific word?

Update: This is working, but if that word exists in the requested file, it starts copying ALL lines. I need to copy only the line with 'SOME WORD'.

I have added this code:

for line in r.text.split('\n'):
    if 'SOME WORD' in line:

Thank you all for the answers, and sorry if I didn't make myself clear.

Please post a minimally reproducible example. We understand that you might need to obfuscate the URL but even so, that's not runnable — DarkKnight, May 03 '22 at 15:57
Why not after checking if 'word' in line, write f.write(line + "\n")? — , May 03 '22 at 16:09
@martineau The source of the text is irrelevant. My point is that the code cannot possibly be executed no matter how *r* is populated. Don't we always try to encourage inquisitors to provide minimally reproducible examples of their problems? — DarkKnight, May 03 '22 at 16:17
@LancelotduLac: My comment was primarily directed at the OP, not you — their question is unclear about what the issue is — getting the lines via requests or extracting the lines. — martineau, May 03 '22 at 16:20
I am getting the lines from requests, but I want to extract only the ones with a specific word and write them to a new txt file, not all lines. — Ali Alica, May 03 '22 at 16:23
Do you want to know how to get lines of text via requests or how to filter them when writing them to the file? — martineau, May 03 '22 at 16:41
I think you copied the code from somewhere and a piece is missing: you check if 'word' in line, but where did you define line? You should probably add a for loop first, with for line in r:, and then continue with your statement as is. — Ozgur Oz, May 03 '22 at 16:43

DarkKnight · Answer 1 · 2022-05-03T17:26:51.740

1

Perhaps this will help.

Whenever you invoke POST/GET or whatever, always check the HTTP response code.

Now let's assume that the lines within the response text are delimited with newline ('\n') and that you want to write a new file (change the mode to 'a' if you want to append). Then:

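import requests

# Fetch the file and raise an exception on any HTTP error status
(r := requests.get('SOME URL')).raise_for_status()

with open('SOME FILENAME', 'w') as outfile:
    for line in r.text.split('\n'):
        if 'SOME WORD' in line:
            print(line, file=outfile)
            break  # stops after the first match; remove it to write every matching line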

Note:

You will need Python 3.8+ in order to take advantage of the walrus operator (:=) in this code.
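
For interpreters older than 3.8, the same filter can be sketched without the walrus operator. This is only an illustration: the helper names matching_lines and write_matching and the sample text are assumptions, and in real use the text would come from r.text after calling r.raise_for_status(). Unlike a loop that stops at the first hit, this version keeps every matching line.

```python
import os
import tempfile


def matching_lines(text, word):
    """Return every line of text that contains word (hypothetical helper)."""
    return [line for line in text.splitlines() if word in line]


def write_matching(text, word, path):
    """Write the matching lines to path, one per line; return how many were written."""
    lines = matching_lines(text, word)
    with open(path, "w") as outfile:
        for line in lines:
            print(line, file=outfile)  # print() appends the newline
    return len(lines)


# Sample data standing in for r.text (an assumption, not a real response).
sample = "alpha\nSOME WORD here\nbeta\nanother SOME WORD\n"

# A throwaway directory keeps the demo from leaving files behind.
with tempfile.TemporaryDirectory() as tmp:
    out_path = os.path.join(tmp, "filtered.txt")
    count = write_matching(sample, "SOME WORD", out_path)
    with open(out_path) as f:
        written = f.read()
```

With a real response, the call would look like write_matching(r.text, 'SOME WORD', 'SOME FILENAME').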

edited May 03 '22 at 17:26

answered May 03 '22 at 16:52

DarkKnight


This is working, but if that word exists it starts copying all lines; I need only the line with the word. I added this code: for line in r.text.split('\n'): if 'SOME WORD' in line: – Ali Alica May 03 '22 at 17:09
Do you mean that you only want the first line containing 'SOME WORD'? This code will **only** write lines that contain 'SOME WORD'. If you only want the first line then add a *break* after the *print()* – DarkKnight May 03 '22 at 17:21
I have added this code: "for line in r.text.split('\n'): if 'SOME WORD' in line: " but it starts copying all the lines if 'SOME WORD' exists in any line. I want to copy ONLY the line with 'SOME WORD' – Ali Alica May 03 '22 at 17:26

score 0 · Answer 2 · answered May 03 '22 at 16:50

I would suggest these steps for handling the file properly:

Step 1: Stream the download to a temporary file
Step 2: Read lines from the temporary file
Step 3: Generate the main file based on your filter
Step 4: Delete the temporary file

Below is the code that performs these steps:

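import os
import requests


def read_lines(file_name):
    # Lazily yield the lines of a text file one at a time
    with open(file_name, 'r') as fp:
        for line in fp:
            yield line


if __name__ == "__main__":
    word = 'ipsum'
    temp_file = 'temp_file.txt'
    main_file = 'main_file.txt'
    url = 'https://filesamples.com/samples/document/txt/sample3.txt'

    # Step 1: stream the download to the temporary file in chunks
    with requests.get(url, stream=True) as response:
        response.raise_for_status()
        with open(temp_file, 'wb') as out_file:
            for chunk in response.iter_content(chunk_size=8192):
                out_file.write(chunk)

    # Steps 2 and 3: keep only the lines that contain the word
    with open(main_file, 'w') as mf:
        for line in filter(lambda x: word in x, read_lines(temp_file)):
            mf.write(line)

    # Step 4: delete the temporary file
    os.remove(temp_file)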
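
The four steps can also be sketched with the standard tempfile module, so that the temporary file is cleaned up automatically even if the filtering fails partway. This is a sketch under assumptions: filter_via_temp_file is a hypothetical helper, and the bytes passed to it stand in for the downloaded content so the example runs without a network connection.

```python
import os
import tempfile


def filter_via_temp_file(content, word, main_file):
    """Spool bytes to a temp file, then copy lines containing word to main_file."""
    # Hypothetical helper; content stands in for the downloaded bytes.
    with tempfile.TemporaryDirectory() as tmp_dir:
        temp_path = os.path.join(tmp_dir, "download.txt")
        # Step 1: write the download to the temporary file
        with open(temp_path, "wb") as temp_file:
            temp_file.write(content)
        # Steps 2 and 3: read the temp file line by line and keep the matches
        with open(temp_path, "r", encoding="utf-8") as src, \
                open(main_file, "w", encoding="utf-8") as dst:
            for line in src:
                if word in line:
                    dst.write(line)  # line still carries its newline
    # Step 4: leaving the with-block removed the directory and the temp file
    return temp_path


# Demo with made-up sample bytes (an assumption, not real downloaded data).
with tempfile.TemporaryDirectory() as work_dir:
    result_path = os.path.join(work_dir, "main_file.txt")
    leftover = filter_via_temp_file(
        b"lorem ipsum\ndolor\nipsum sit\n", "ipsum", result_path
    )
    with open(result_path, encoding="utf-8") as f:
        result = f.read()
    temp_was_removed = not os.path.exists(leftover)
```

With requests, the first argument would typically be the content attribute of the response, for example requests.get(url).content.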

Ozgur Oz · Answer 3 · 2022-05-03T17:20:29.490

-1

Well, there is a missing line: you need a loop that defines line before you can check it with the if statement.

import requests

r = requests.get('http://website.com/file.txt').text
with open('data.txt', 'a') as f:
    for line in r.splitlines():  # this loop is where line gets defined
        if 'word' in line:  # now you can check for your 'word'
            f.write(line + '\n')  # splitlines() strips the newline, so add it back
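
One detail worth checking when choosing between split('\n') and splitlines(): they treat line endings differently, and splitlines() removes them, so a newline has to be added back when writing the matches out. A small sketch with made-up sample text (an assumption, not data from the question):

```python
# Made-up text mixing Windows (\r\n) and Unix (\n) line endings.
text = "first word\r\nsecond\nthird word\n"

# split('\n') leaves the carriage return on Windows-style lines
# and yields a trailing empty string when the text ends with '\n'.
by_split = text.split("\n")

# splitlines() strips every kind of line ending and adds no empty tail.
by_splitlines = text.splitlines()

# Add the newline back when writing, otherwise the lines run together.
filtered = "".join(line + "\n" for line in by_splitlines if "word" in line)
```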

edited May 03 '22 at 17:20

answered May 03 '22 at 16:53

Ozgur Oz


This will not work. When you iterate over *r* you will get instances of bytes which will induce a TypeError when you try *f.write(line)* – DarkKnight May 03 '22 at 16:58
Fixed and tested. Probably not the best solution, but it is based on the OP's code – Ozgur Oz May 03 '22 at 17:21
That code will induce AttributeError because the response object (*r*) doesn't have a *splitlines()* attribute – DarkKnight May 03 '22 at 17:25
You are right, but r.text is a str with the response text. – Ozgur Oz May 03 '22 at 17:28
