
I am trying to extract, from some notes, words that end in .story. These words always appear in links such as bla:///bla/bla/bla/.../word.story. The notes may contain multiple links and their format may vary, but there will always be entries of the form bla///../..../bla.story.

Until now I've used the following expression: `[string]$story_name = Select-String \w+..story -input $notes -AllMatches | Foreach {$_.matches -replace ('\.story','')}`. However, I'm now facing an issue: if a link contains entries such as bla:///bla/blablaistory/bla/bla/word.story, the expression also selects the word that contains 'istory', which I do not want. What should I use to avoid this?

pandoJohn
  • With your above `-replace ('\.story','')` that shouldn't happen, as `\.` escapes the dot to match it literally. – LotPings May 26 '17 at 10:54
  • Well, it happens :(. Check this: `$notes = "alalala/bla//blablahistory/somethingnice.story"; [string]$storyName = Select-String \w+...story -input $notes -AllMatches | Foreach {$_.matches -replace ('\.story','')}; write-host $storyName` – pandoJohn May 26 '17 at 11:09
  • there isn't `istory` in the string and if you include it there is no match - so what? – LotPings May 26 '17 at 11:15
  • If you don't want it selected at all, include the escaping backslash in the pattern: `Select-String \w+.\.story` – LotPings May 26 '17 at 11:18
  • there is `istory` in the string and there is a match - the output of the above commands is: `blablahistory somethingnice` but what i want to have is just `somethingnice` – pandoJohn May 26 '17 at 11:19
  • hmmm, that might work :) – pandoJohn May 26 '17 at 11:21
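
The difference discussed in the comments above can be reproduced with a minimal sketch. It calls `[regex]::Matches` directly instead of `Select-String`, and the variable names `$loose` and `$strict` are made up for illustration. An unescaped dot matches any character, so `\w+...story` also accepts `blablahistory`, while `\w+\.story` requires a literal `.story` suffix:

```powershell
$notes = 'alalala/bla//blablahistory/somethingnice.story'

# Unescaped dots match ANY character, so "blablahistory" qualifies as well.
$loose = [regex]::Matches($notes, '\w+...story') |
  ForEach-Object { $_.Value }

# Escaping the dot demands a literal ".story"; the suffix is stripped afterwards.
$strict = [regex]::Matches($notes, '\w+\.story') |
  ForEach-Object { $_.Value -replace '\.story$', '' }

"loose : $($loose -join ', ')"
"strict: $($strict -join ', ')"
```

With this input, the loose pattern should report both `blablahistory` and `somethingnice.story`, while the strict one reports only `somethingnice`.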

1 Answer

$notes = @"
alalala/bla//blablahistory/somethingnice.istory
alalala/bla//blablahistory/somethingnice.story
alalala/bla//blablahistory/somethingverynice.story
"@

$RE = [RegEx]'/([^/]+)\.story'

$storyName = $notes -split "`n" |
  Select-String $RE -AllMatches | 
    ForEach-Object { $_.Matches | ForEach-Object { $_.Groups[1].Value } }  # one name per match, even with several links on a line

$storyName -split "`n" 

Sample output:

> .\SF_852359.ps1
somethingnice
somethingverynice

The RegEx, which is more complex than the one in the question, does the following:

  • [^/] is a negated class matching everything but a slash
  • [^/]+ the trailing plus means at least one of the previous.
  • ([^/]+) the enclosing parentheses mark the first (and here only) capture group
  • /([^/]+)\.story the leading slash and the trailing literal .story frame the word we are after.
  • Results of a Regular Expression survive at least one pipe level and are accessible through the $_.Matches object; each match's Groups[0] holds the whole match, and the capture groups are numbered from 1.
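
As a rough sketch of the last point, assume a hypothetical note line with two links on it (the sample string and the names `$line` and `$names` are invented for illustration, not taken from the original notes). Iterating over every match and reading `Groups[1].Value` yields one name per link:

```powershell
# Hypothetical input: two links on a single line.
$line = 'bla:///a/blablahistory/first.story and bla:///b/c/second.story'

$RE = [regex]'/([^/]+)\.story'

$names = $line |
  Select-String $RE -AllMatches |
    ForEach-Object { $_.Matches } |            # unroll every match on the line
      ForEach-Object { $_.Groups[1].Value }    # Groups[0] would be the whole match

$names
```

For this input the pipeline should emit `first` and `second`; the path segment `blablahistory` is skipped because it is not followed by a literal `.story`.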
LotPings
  • thanks for this answer. before marking it as the correct answer could you please put an explanation for the `[regex]` and `foreach` parts? this way other users will easily understand your answer :) I would also mention that `Select-String \w+.\.story -input $notes -AllMatches | Foreach {$_.matches -replace ('\.story','')}` works as expected in this case :D – pandoJohn May 26 '17 at 13:02
  • Ack, I simplified the ForEach a bit and added an explanation. This comes in handy when working with multiple groups on a line, for example to rearrange dates. – LotPings May 26 '17 at 13:35