
I have data in a file that looks like this:

"johnnyboy"=splice(23):15,00,30,00,31,00,32,02,39,00,62,00,a3,00,33,00,2d,0f,39,00,\
      00,5c,00,6d,00,65,00,64,00,69,00,61,00,5c,00,57,00,69,00,6e,00,64,00,6f,00,\
      77,00,73,00,20,00,41,00,61,00,63,00,6b,00,65,aa,72,00,6f,00,75,00,6e,dd,64,\
      00,2e,00,77,00,61,00,76,00,ff,00

"johnnyboy"="gotwastedatthehouse"

"johnnyboy"=splice(23):15,00,30,00,31,00,32,02,39,00,62,00,a3,00,33,00,2d,0f,39,00,\
      00,5c,00,6d,00,65,00,64,00,69,00,61,00,5c,00,57,00,69,00,6e,00,64,00,6f,00,\
      77,00,73,00,20,00,41,00,61,00,63,00,6b,00,65,aa,72,00,6f,00,75,00,6e,dd,64,\
      00,2e,00,77,00,61,00,76,00,ff,00


[mattplayhouse\wherecanwego\tothepoolhall]

How can I read each "johnnyboy"=splice(23) entry as a single line, like this:

"johnnyboy"=splice(23):15,00,30,00,31,00,32,02,39,00,62,00,a3,00,33,00,2d,0f,39,00,00,5c,00,6d,00,65,00,64,00,69,00,61,00,5c,00,57,00,69,00,6e,00,64,00,6f,00,77,00,73,00,20,00,41,00,61,00,63,00,6b,00,65,aa,72,00,6f,00,75,00,6e,dd,64,00,2e,00,77,00,61,00,76,00,ff,00

I am currently matching lines with a regex based on splice(23): as follows:

import re

re_johnny = re.compile('splice')
with open("file.txt", 'r') as file:
    read = file.readlines()
    for line in read:
        # search() scans the whole line; match() only matches at the start of the line
        if re_johnny.search(line):
            print(line)

I think I need to remove the backslashes and the leading spaces to merge the lines, but I am unfamiliar with how to do that without also picking up the blank lines or the lines that do not match my regex. When I tried the first suggested solution, my last row was pulled in incorrectly. Any assistance would be great.

johnnyb
  • Possible duplicate of [Read a file with line continuation characters in Python](http://stackoverflow.com/questions/16480495/read-a-file-with-line-continuation-characters-in-python) – AChampion Feb 19 '17 at 05:36

2 Answers


Input file: fin

"johnnyboy"=splice(23):15,00,30,00,31,00,32,02,39,00,62,00,a3,00,33,00,2d,0f,39,00,\
      00,5c,00,6d,00,65,00,64,00,69,00,61,00,5c,00,57,00,69,00,6e,00,64,00,6f,00,\
      77,00,73,00,20,00,41,00,61,00,63,00,6b,00,65,aa,72,00,6f,00,75,00,6e,dd,64,\
      00,2e,00,77,00,61,00,76,00,ff,00

"johnnyboy"="gotwastedatthehouse"

"johnnyboy"=splice(23):15,00,30,00,31,00,32,02,39,00,62,00,a3,00,33,00,2d,0f,39,00,\
      00,5c,00,6d,00,65,00,64,00,69,00,61,00,5c,00,57,00,69,00,6e,00,64,00,6f,00,\
      77,00,73,00,20,00,41,00,61,00,63,00,6b,00,65,aa,72,00,6f,00,75,00,6e,dd,64,\
      00,2e,00,77,00,61,00,76,00,ff,00


[mattplayhouse\wherecanwego\tothepoolhall]

Building on TigerhawkT3's suggestion, you can try something like this:

Code:

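with open('fin', 'r') as f:
    # Records are separated by blank lines.
    for record in f.read().split('\n\n'):
        # Drop all whitespace and strip the continuation backslash from each piece.
        line = ''.join(piece.strip('\\') for piece in record.split())
        if 'splice' in line:
            print(line)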

Output:

"johnnyboy"=splice(23):15,00,30,00,31,00,32,02,39,00,62,00,a3,00,33,00,2d,0f,39,00,00,5c,00,6d,00,65,00,64,00,69,00,61,00,5c,00,57,00,69,00,6e,00,64,00,6f,00,77,00,73,00,20,00,41,00,61,00,63,00,6b,00,65,aa,72,00,6f,00,75,00,6e,dd,64,00,2e,00,77,00,61,00,76,00,ff,00
"johnnyboy"=splice(23):15,00,30,00,31,00,32,02,39,00,62,00,a3,00,33,00,2d,0f,39,00,00,5c,00,6d,00,65,00,64,00,69,00,61,00,5c,00,57,00,69,00,6e,00,64,00,6f,00,77,00,73,00,20,00,41,00,61,00,63,00,6b,00,65,aa,72,00,6f,00,75,00,6e,dd,64,00,2e,00,77,00,61,00,76,00,ff,00
Mohammad Yusuf

With regex you have multiplied your problems. Instead, keep it simple:

  • If a line starts with ", it begins a record.
  • Else, append it to the previous record.

You can implement parsing for such a scheme in just a few lines in Python. And you don't need regex.
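A minimal sketch of that scheme follows. The function name `merge_continued_lines` and the in-memory `sample` are made up for illustration. It also assumes that a line starting with `[` (like the section header in the question) begins its own record, which goes slightly beyond the two rules above, so that the header is not glued onto the last splice entry.

```python
def merge_continued_lines(lines):
    """Join backslash-continued lines into single records (hypothetical helper)."""
    records = []
    for raw in lines:
        line = raw.strip()  # drop the newline and the leading indentation
        if not line:
            continue  # skip blank lines
        # Assumption: '"' starts a key/value record and '[' starts a section header.
        if line.startswith(('"', '[')) or not records:
            records.append(line.rstrip('\\'))
        else:
            # Continuation line: append it to the previous record.
            records[-1] += line.rstrip('\\')
    return records


# Shortened stand-in for the file contents in the question.
sample = [
    '"johnnyboy"=splice(23):15,00,\\\n',
    '      00,5c,\\\n',
    '      ff,00\n',
    '\n',
    '"johnnyboy"="gotwastedatthehouse"\n',
    '\n',
    '[mattplayhouse\\wherecanwego]\n',
]

for record in merge_continued_lines(sample):
    if 'splice' in record:
        print(record)
```

To run it against the real file, pass the open file object instead of `sample`, for example `merge_continued_lines(open('file.txt'))`, since iterating over a file yields its lines.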

John Zwinck
  • It's probably easier to do `results = [''.join(item.split()) for item in file.read().split('\n\n')]`, which seems to match the input data. – TigerhawkT3 Feb 19 '17 at 04:26
  • That ended up putting everything onto a single line. I am trying to get three separate lines based on the input data. – johnnyb Feb 19 '17 at 04:41