
I'm writing a simple script that will take URLs pointing to Evernote notes online, and convert them to the evernote:/// protocol. The regex I'm using matches and modifies the URL correctly when I try it out in a regex tester (I'm using Patterns for OS X). However, when I use it with sed, it just returns the original string.

echo "https://www.evernote.com/shard/s2/nl/227468/1875e55a-e512-4cf9-9b18-9e93c6a27359/" | sed 's#https?:_/_/www_.evernote_.com_/shard_/(..)_/nl_/(......)_/(.+_/)#evernote:_/_/_/view_/$2_/$1_/$3$3#'

Any idea why this isn't working? Thanks!

fort

[Edit: In case anyone's interested, this was for the AppleScript bit of a Keyboard Maestro macro:

set theURL to the clipboard
set ENcode to "echo \"" & theURL & "\" | sed -E 's#https?://www.evernote.com/shard/(..)/nl/(.*)/(.+/)#evernote:///view/\\2/\\1/\\3\\3#' | pbcopy"
do shell script ENcode

Thanks to @DreadPirateShawn for helping me fix the regex. ]

Casimir et Hippolyte

2 Answers


Using the extended regex flag -E, removing the underscores, and replacing each $N reference with \N (sed writes backreferences as \1, \2, \3, not $1, $2, $3) yields a working command:

$ echo "https://www.evernote.com/shard/s2/nl/227468/1875e55a-e512-4cf9-9b18-9e93c6a27359/" | sed -E 's#https?://www\.evernote\.com/shard/(..)/nl/(......)/(.+/)#evernote:///view/\2/\1/\3\3#'
evernote:///view/227468/s2/1875e55a-e512-4cf9-9b18-9e93c6a27359/1875e55a-e512-4cf9-9b18-9e93c6a27359/

(Confirmed on Ubuntu 12.04 and OS X.)
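
To see why the original command printed its input unchanged, here is a small sketch that runs three variants against the sample URL from the question. The variable names (url, unchanged, literal, fixed) are only illustrative, and the behaviour shown assumes a sed that accepts -E, such as GNU sed or the one shipped with OS X.

```shell
url="https://www.evernote.com/shard/s2/nl/227468/1875e55a-e512-4cf9-9b18-9e93c6a27359/"
pattern='https?://www\.evernote\.com/shard/(..)/nl/(......)/(.+/)'

# 1. No -E: in basic regex "?", "(", ")" and "+" are literal characters,
#    so the pattern never matches and sed echoes the input back.
unchanged=$(printf '%s\n' "$url" | sed "s#${pattern}#evernote:///view/\\2/\\1/\\3\\3#")

# 2. -E but $N in the replacement: the pattern matches, yet sed treats
#    "$2", "$1" and "$3" as plain text rather than backreferences.
literal=$(printf '%s\n' "$url" | sed -E "s#${pattern}#evernote:///view/\$2/\$1/\$3\$3#")

# 3. -E with \N backreferences: the intended rewrite.
fixed=$(printf '%s\n' "$url" | sed -E "s#${pattern}#evernote:///view/\\2/\\1/\\3\\3#")

printf '%s\n' "$unchanged" "$literal" "$fixed"
```

The first variant reproduces the symptom in the question: nothing matches, so the line passes through untouched.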

If you don't use -E, then you also need to change s? to [s]* (in basic regex, ? is a literal character rather than a quantifier) and escape the grouping parentheses:

$ echo "https://www.evernote.com/shard/s2/nl/227468/1875e55a-e512-4cf9-9b18-9e93c6a27359/" | sed  's#http[s]*://www\.evernote\.com/shard/\(.*\)/nl/\(.*\)/\(.*/\)#evernote:///view/\2/\1/\3\3#'
evernote:///view/227468/s2/1875e55a-e512-4cf9-9b18-9e93c6a27359/1875e55a-e512-4cf9-9b18-9e93c6a27359/

In the latter example, I also replaced each fixed-length group such as (......) with (.*) -- unless you're absolutely positive of the length of each sequence (and perhaps even then), the (.*) approach will be a bit more flexible.
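
As a possible middle ground between fixed-length groups and (.*), each group can be limited to a single path segment with [^/]+. The sketch below wraps that in a shell function; the name to_evernote_url is hypothetical, and it assumes every captured field is one slash-free segment, as in the sample URL.

```shell
# Hypothetical helper: rewrite an Evernote web URL to the evernote:/// form.
# Each [^/]+ matches exactly one path segment, so no group can run across a "/".
to_evernote_url() {
  printf '%s\n' "$1" |
    sed -E 's#https?://www\.evernote\.com/shard/([^/]+)/nl/([^/]+)/([^/]+/)#evernote:///view/\2/\1/\3\3#'
}

to_evernote_url "https://www.evernote.com/shard/s2/nl/227468/1875e55a-e512-4cf9-9b18-9e93c6a27359/"
```

A URL that does not match the pattern is printed back unchanged, which is the normal behaviour of a sed substitution that finds nothing to replace.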

DreadPirateShawn
  • PS: Not being familiar with the evernote protocol, I'm assuming that the repeated group 3 -- `$3$3` -- is intentional. – DreadPirateShawn Jul 15 '14 at 18:15
  • Thanks, that worked. My final version is: ```echo "https://www.evernote.com/shard/s2/nl/227468/1875e55a-e512-4cf9-9b18-9e93c6a27359/" | /usr/local/bin/sed -E 's#https?:\/\/www\.evernote\.com\/shard\/(..)\/nl\/(......)\/(.+\/)#evernote:\/\/\/view\/\2\/\1\/\3\3#'``` – fort Jul 15 '14 at 18:20
  • And yes, the output should look like this: ```evernote:///view/227468/s2/1875e55a-e512-4cf9-9b18-9e93c6a27359/1875e55a-e512-4cf9-9b18-9e93c6a27359/``` – fort Jul 15 '14 at 18:22
  • Escaping the dots just occurred to me right before your comment :-) so I've updated the answer to include that as well. That said, you *don't* need to escape your front-slashes, specifically because you're using `#` as your regex segment delineator. Saves on visual noise to omit those. – DreadPirateShawn Jul 15 '14 at 18:24
  • Actually, I took your advice re. the ```(.*)``` as well. Thanks! – fort Jul 15 '14 at 18:27

I think you're trying to do this:

echo "https://www.evernote.com/shard/s2/nl/227468/1875e55a-e512-4cf9-9b18-9e93c6a27359/" | sed -re 's#https://www.evernote.com/shard/(..)/nl/(......)/(.+)/#evernote://view/\2/\1/\3#'
evernote://view/227468/s2/1875e55a-e512-4cf9-9b18-9e93c6a27359

Without extended regex (note that \+ is a GNU extension to basic regex, so it may not work in other seds):

echo "https://www.evernote.com/shard/s2/nl/227468/1875e55a-e512-4cf9-9b18-9e93c6a27359/" | sed  's#https://www.evernote.com/shard/\(..\)/nl/\(......\)/\(.\+\)/#evernote://view/\2/\1/\3#'
evernote://view/227468/s2/1875e55a-e512-4cf9-9b18-9e93c6a27359
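
If the basic-regex form has to run on a sed without the GNU \+ extension, one option is the POSIX interval \{1,\}, which also means "one or more". This is only a sketch of that substitution on the same sample URL; the variable names url and result are illustrative.

```shell
url="https://www.evernote.com/shard/s2/nl/227468/1875e55a-e512-4cf9-9b18-9e93c6a27359/"

# \{1,\} is the basic-regex spelling of "one or more", replacing GNU's \+.
result=$(printf '%s\n' "$url" |
  sed 's#https://www\.evernote\.com/shard/\(..\)/nl/\(......\)/\(.\{1,\}\)/#evernote://view/\2/\1/\3#')

printf '%s\n' "$result"
```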
Tiago Lopo