I have been writing a Ruby script that goes through a text file, locates every line that begins with output_path, and stores that line in a string (linefromtextfile). Typically it finds lines like these:

"output_path":"/data/server/output/1/test_file.txt","text":
"output_path":"/data/server/output/2/test_file.txt","text":

From each line I want to extract only the directory part of the path (pathtokeep) and write it out to a file, i.e.:

/data/server/output/1/
/data/server/output/2/

I have tried this regex, but it's not working:

pathtokeep=linefromtextfile.split(?:\$/.*?/)([^/]*?\.\S*)

Can someone please advise on my regex? Is split the right way to go, or is there an easier way to do this?

Kaspar Lee
adamjth
  • You don't need to thank the author of each answer. If you do, one day you'll face the choice of thanking someone for a poor answer or omitting the thank you for just that one person, leaving the obvious implication. If you look at other questions you'll see it's just not done at SO. – Cary Swoveland Apr 11 '16 at 18:33
  • In future, when you give an example it's helpful to assign a variable to each input object. Here that might be `str = '"output_path":"...xt":'`. That way, readers can refer to those variables in answers and comments without having to define them. – Cary Swoveland Apr 11 '16 at 18:40

3 Answers

If your file always has the same structure, you could split each line on its delimiters and let Ruby's File methods do the rest, instead of writing one complex regex.

line = '"output_path":"/data/server/output/1/test_file.txt","text":'

path = line.split(/:"|",/)[1]
# => "/data/server/output/1/test_file.txt"

basename = File.basename(path)
# => "test_file.txt"

File.dirname(path) + '/'
# => "/data/server/output/1/"
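
The steps above can be wrapped in a small helper and applied to both sample lines from the question. This is only a sketch: the method name dir_from_line is hypothetical, and it assumes every line follows the "output_path":"...","text": layout shown in the question.

```ruby
# Hypothetical helper (not part of the original answer): returns the
# directory of the quoted path with a trailing slash, or nil when the
# line does not have the assumed layout.
def dir_from_line(line)
  path = line.split(/:"|",/)[1]   # text between :" and ",
  return nil if path.nil?         # the line had no such delimiters
  File.dirname(path) + '/'
end

lines = [
  '"output_path":"/data/server/output/1/test_file.txt","text":',
  '"output_path":"/data/server/output/2/test_file.txt","text":'
]

lines.each { |line| puts dir_from_line(line) }
# prints:
# /data/server/output/1/
# /data/server/output/2/
```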
guitarman

I suggest using Ruby methods to the extent that you can, employing a regex only to extract the path from the string.

str = '"output_path":"/data/server/output/1/test_file.txt","text":'

r = /
    :"      # match a colon and double quote
    (.+?)   # match one or more of any character, lazily, in capture group 1 
    "       # match a double quote
    /x      # free-spacing regex definition mode

File.dirname(str[r,1])
  #=> "/data/server/output/1"

If you really want the trailing forward slash,

File.dirname(str[r,1]) << "/"
  #=> "/data/server/output/1/"

Should you need it,

File.basename(str[r,1])
  #=> "test_file.txt"

I will leave it to the OP to read and write to files.
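
For completeness, here is one way the reading and writing might look. It is a minimal sketch rather than part of the approach above: the names PATH_RE, dirs_from_lines, write_dirs and both file names are made up for illustration, and it assumes the lines of interest start with "output_path" (quotes included), possibly after leading whitespace.

```ruby
# Same idea as the regex above: capture the text between :" and the next ".
PATH_RE = /:"(.+?)"/

# Hypothetical helper: takes any enumerable of lines and returns the
# directory (with trailing slash) for each line starting with "output_path".
def dirs_from_lines(lines)
  lines.map do |line|
    next unless line.lstrip.start_with?('"output_path"')
    path = line[PATH_RE, 1]
    File.dirname(path) + '/' if path
  end.compact
end

# Hypothetical wrapper: reads input_file line by line and writes one
# directory per line to output_file.
def write_dirs(input_file, output_file)
  dirs = dirs_from_lines(File.foreach(input_file))
  File.write(output_file, dirs.map { |d| d + "\n" }.join)
end

# Usage (file names are placeholders):
#   write_dirs('input.txt', 'paths.txt')
```

Keeping the extraction separate from the file handling makes it easy to try dirs_from_lines on an array of strings before pointing it at a real file.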

If you insist on using a single regex, you could write:

r = /
    (?<=:") # match a colon followed by a double-quote in a positive lookbehind
    .+      # match one or more characters, greedily
    \/      # match a forward slash
    /x

str[r]
  #=> "/data/server/output/1/"

Note that .+, being greedy, gobbles up all characters until it reaches the last forward slash in the string.
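
The difference between the greedy and lazy quantifier can be checked directly. The sketch below also shows a caveat that follows from the greediness: if a later field on the line happened to contain a forward slash, .+ would run on to that slash. The second sample string and the [^"]+ variant are illustrative assumptions, not part of the question's data.

```ruby
str = '"output_path":"/data/server/output/1/test_file.txt","text":'

greedy = str[/(?<=:").+\//]    # backtracks to the LAST slash in the string
lazy   = str[/(?<=:").+?\//]   # stops at the first slash it can reach

puts greedy  # /data/server/output/1/
puts lazy    # /data/

# Made-up line whose later field also contains a slash.
tricky = '"output_path":"/a/b.txt","text":"x/y"'

overrun = tricky[/(?<=:").+\//]      # runs past the closing quote
bounded = tricky[/(?<=:")[^"]+\//]   # [^"]+ cannot cross a double quote

puts overrun  # /a/b.txt","text":"x/
puts bounded  # /a/
```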

Cary Swoveland

Try this RegEx:

(?<="output_path":")(.*?)(?=")

Live Demo on Regex101

How it works:

(?<="output_path":")     # Lookbehind for "output_path":"
(.*?)                    # Data inside "" (Lazy)
(?=")                    # Lookahead for closing "
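
Note that this pattern matches the whole path, file name included, so the directory still has to be derived from it. A possible way to use it from Ruby, assuming the same line layout as in the question (the method name dir_from_output_path is hypothetical):

```ruby
OUTPUT_PATH_RE = /(?<="output_path":")(.*?)(?=")/

# Hypothetical helper: returns the directory with a trailing slash,
# or nil when the line has no "output_path" field.
def dir_from_output_path(line)
  full_path = line[OUTPUT_PATH_RE]   # e.g. /data/server/output/1/test_file.txt
  full_path && File.dirname(full_path) + '/'
end

line = '"output_path":"/data/server/output/1/test_file.txt","text":'
puts dir_from_output_path(line)
# prints: /data/server/output/1/
```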

Kaspar Lee