I'm extracting text from a PDF and writing it to a .txt file so that I can afterwards clean it up and select the parts I want to keep. To do this I installed the PyPDF2 library. I managed to extract the text from the PDF and copy it into a .txt file. But when I print the lines of the .txt file, the first line printed is always a "/" followed by a file name. The code is the following:
import re

# Print every non-blank line of the file, stripped of surrounding whitespace.
with open('/Users/kenny/Documents/Atomtest1/Analizador_sintaxis/cleanpage.txt', 'r') as f:
    for h in f:
        h = h.strip()
        if re.search(r'\S+', h):
            print(h)
This is the content of the .txt file, cleanpage.txt:
hello
my name is alfred
And this is the output I receive when I run the code in a virtual environment that has PyPDF2 installed:
/trial.py
hello
my name is alfred
But if I run the program in a virtual environment that doesn't have PyPDF2 installed, the output is the following:
hello
my name is alfred
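One way to rule out the file itself as the source of the extra line is to collect the non-blank lines into a list and inspect them with repr before printing anything. This is only a sketch: `nonblank_lines` is a helper name made up here, and the temporary file is a stand-in for cleanpage.txt so the snippet runs anywhere.

```python
import os
import re
import tempfile


def nonblank_lines(path):
    """Return the stripped, non-blank lines of the file at `path` (hypothetical helper)."""
    with open(path, 'r') as f:
        return [line.strip() for line in f if re.search(r'\S', line)]


# Stand-in for cleanpage.txt (assumption: same two lines of content).
with tempfile.NamedTemporaryFile('w', suffix='.txt', delete=False) as tmp:
    tmp.write('hello\n\nmy name is alfred\n')
    sample_path = tmp.name

lines = nonblank_lines(sample_path)
# repr() makes hidden characters visible, if there are any.
print([repr(line) for line in lines])
os.remove(sample_path)
```

If "/trial.py" is not in this list but still shows up on the terminal, it is being printed by something other than the loop over the file.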
Does anyone know what is causing this difference in the output of the same program when it is run in different virtual environments? My best guess is that there is some overlap of keywords between basic Python and PyPDF2 that I'm not aware of. Any responses are greatly appreciated.