I have data like this, which I read from a file line by line:

{0 1,1 1,4 1,6 'text text'}
{0 1,1 1,4 1,5 1,6 'text text text text'}
{0 1,5 1,6 'text texttext text'}
{1 1,6 'text text texttexttext text'}

I want to get all the text between the single quotes (' '), so that the result looks like this:

'text text'
'text text text text'
'text texttext text'
'text text texttexttext text'

I tried to use re.sub to remove the characters before the first ':

line=re.sub(r'.*\'', '', line)

but it removed all the characters. Thanks.

FaisalAlsalm

2 Answers

Try this:

import re
lines = ["{0 1,1 1,4 1,6 'text text'}", 
         "{0 1,1 1,4 1,5 1,6 'text text text text'}",
         "{0 1,5 1,6 'text texttext text'}",
         "{1 1,6 'text text texttexttext text'}"]
for line in lines:
    print(re.sub(r"[^']*('[^']*').*", r"\1", line))

and the output:

'text text'
'text text text text'
'text texttext text'
'text text texttexttext text'
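
As a side note on why the original attempt fails: `.*` is greedy, so `.*'` runs up to the last quote in the line and the substitution deletes everything through the closing quote, leaving only `}`. A minimal sketch comparing the two patterns on one sample line (the variable names `attempt` and `fixed` are only illustrative):

```python
import re

line = "{0 1,1 1,4 1,6 'text text'}"

# Greedy .* extends to the LAST quote, so the quoted text is deleted too.
attempt = re.sub(r".*'", "", line)

# Capture the quoted part and replace the whole line with that capture.
fixed = re.sub(r"[^']*('[^']*').*", r"\1", line)

print(repr(attempt))  # '}'
print(fixed)          # 'text text'
```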
Amin Guermazi

You can use re.findall or re.search:

import re

value = "{0 1,1 1,4 1,6 'text text'}"
# Match from the first quote to the last quote on the line
content = re.search(r"('.*')", value).group(0)
content  # 'text text'
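
The findall variant could look roughly like the sketch below. The helper name `quoted_parts` is hypothetical, and the pattern `'[^']*'` is a swapped-in alternative to `'.*'`: it cannot run past a closing quote, which matters only if a line ever contains more than one quoted string.

```python
import re

def quoted_parts(line):
    """Return every single-quoted substring in line, quotes included."""
    # [^']* matches any run of non-quote characters, so each match
    # stops at the nearest closing quote.
    return re.findall(r"'[^']*'", line)

lines = ["{0 1,1 1,4 1,6 'text text'}",
         "{0 1,5 1,6 'text texttext text'}"]
for line in lines:
    # Each sample line holds exactly one quoted string.
    print(quoted_parts(line)[0])
```

With the sample data both patterns give the same result; they differ on input such as `a 'x' b 'y'`, where `'.*'` returns the single span `'x' b 'y'`.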
azro