I want to read the text between two characters (“#*”
and “#@”
) from a file. My file contains thousands of records in the above-mentioned format. I have tried using the code below, but it is not returning the required output. My data contains thousands of records in the given format.
import re
start = '#*'
end = '#@'
myfile = open('lorem.txt')
for line in fhand:
text = text.rstrip()
print (line[line.find(start)+len(start):line.rfind(end)])
myfile.close()
My Input:
\#*OQL[C++]: Extending C++ with an Object Query Capability
\#@José A. Blakeley
\#t1995
\#cModern Database Systems
\#index0
\#*Transaction Management in Multidatabase Systems
\#@Yuri Breitbart,Hector Garcia-Molina,Abraham Silberschatz
\#t1995
\#cModern Database Systems
\#index1
My Output:
51103
OQL[C++]: Extending C++ with an Object Query Capability
t199
cModern Database System
index
...
Expected output:
OQL[C++]: Extending C++ with an Object Query Capability
Transaction Management in Multidatabase Systems