I have a project with many Markdown files that contain both internal links and external links (those starting with http). Some of the internal links lack the .md file extension and so don't work when rendered outside of Jekyll.

Examples:

[link text 1](internal-link)
[link text 2](internal-link-2.md)
[link text 3](http://external-link...)

I am looking for a regular expression that matches only the first of these three cases: an internal link without the .md file extension.

janpio

1 Answer

After refining, this could be it:

\[[^]]+\]\((?!http:)(?!.+\.md).+\)

https://regex101.com/r/0uW1cl/5

(Removed the capture groups again.)
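
A quick way to check the pattern against the three examples from the question is a short Python script. This is only a sketch: the sample strings are copied from the question as-is, and it assumes Python's `re` module treats this pattern the same way as the flavor used on regex101.

```python
import re

# Pattern from the answer: a Markdown link whose target does not start
# with "http:" and has no ".md" anywhere after the opening parenthesis.
PATTERN = re.compile(r"\[[^]]+\]\((?!http:)(?!.+\.md).+\)")

# The three example lines from the question, unchanged.
samples = [
    "[link text 1](internal-link)",
    "[link text 2](internal-link-2.md)",
    "[link text 3](http://external-link...)",
]

# Keep only the lines on which the pattern finds a match.
matches = [s for s in samples if PATTERN.search(s)]
print(matches)
```

Only the first sample should be reported. Note that `(?!http:)` rejects targets beginning with `http:` exactly, so an `https:` target would still match.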

Patrick Artner
  • Similar to what I played around with as well; unfortunately it only works by accident. Check this with more examples: http://regexr.com/3h0lq It somehow excludes everything with last character `.` - but not links beginning with `http`... – janpio Oct 20 '17 at 13:47
  • This here shows what is happening with your suggested regexes (I only added some capture groups): https://regex101.com/r/0uW1cl/3 It is excluding the individual characters. (But thanks for the serious answer anyway!) – janpio Oct 20 '17 at 13:55
  • You will have to re-add your capturing groups though – Patrick Artner Oct 20 '17 at 14:07
  • Yep, that looks good. First part doesn't work in JS mode, but then `\[.*\]\((?!http:)(?!.+\.md).+\)` does: https://regex101.com/r/0uW1cl/6 – janpio Oct 20 '17 at 14:24
  • Grml... in Visual Studio Code where I want to use this it only works in "search this file", not in "all files search". I hate computers sometimes. – janpio Oct 20 '17 at 14:25
  • Try Notepad++: it is free, supports regex in searches, can search in files or mark occurrences, and has a nice hit list and some code coloring capabilities. Not sure if its regex can do negative lookaheads though – Patrick Artner Oct 20 '17 at 14:33
  • Thanks, that works. (Remove the `:` from the answer maybe, should also exclude `https`). Has minor problems with multiple links per line, but not a deal breaker. – janpio Oct 20 '17 at 14:39
  • Adding a negative lookahead for `#` also excludes pure anchor links: https://regex101.com/r/0uW1cl/7 `\[.*\]\((?!http:)(?!#)(?!.+\.md).+\)` (Minor problems with `.md` inside filenames as well, but edge case and not relevant) – janpio Oct 20 '17 at 14:41
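
The remaining issues raised in the comments (`https` links, pure anchor links, several links on one line, `.md` inside a filename) could be addressed along the following lines. This is a hypothetical variant, not a pattern posted in the thread: it swaps `.+` for negated character classes so each match stays inside one link, and it only rejects targets that end in `.md`. The name `find_links_missing_md` and the sample line, including `https://example.com`, are made up for illustration.

```python
import re

# Hypothetical variant, not taken from the thread:
#   (?!https?:)       skip http and https targets
#   (?!#)             skip pure anchor links
#   (?![^)]*\.md\))   skip targets that end in ".md"
#   [^\]]+ / [^)]+    keep each match inside a single link
LINK_WITHOUT_MD = re.compile(
    r"\[[^\]]+\]\((?!https?:)(?!#)(?![^)]*\.md\))[^)]+\)"
)


def find_links_missing_md(text: str) -> list[str]:
    """Return every internal Markdown link in text whose target lacks .md."""
    return LINK_WITHOUT_MD.findall(text)


# Made-up sample line with four links on it.
line = "[a](page-one) [b](page-two.md) [c](https://example.com) [d](#top)"
print(find_links_missing_md(line))
```

This sketch assumes link targets contain no `)` and does not treat a target such as `page.md#section` specially, so it would need adjusting for those cases.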