
How can one use @match and @exclude rules in a userscript to match URLs that have an arbitrary path at a certain level (a "level" being each new / in the path, not sure if there's a more technical name for it), but not at sublevels (i.e. having subsequent forward-slashes /)?

Example

A userscript is intended to match the homepage of any subreddit on Reddit.com, but not any sub-pages of that subreddit. This is what I've tried:

@match    https://*.reddit.com/r/*/
@exclude  https://*.reddit.com/r/*/*/

My understanding is that this SHOULD match, for example, https://reddit.com/r/funny but NOT match https://reddit.com/r/funny/submit, https://reddit.com/r/funny/new, https://reddit.com/r/funny/comments/blahblahblah, etc. However, my userscript seems to be firing on those sub-pages anyway.

I'm basing this construction off the Mozilla documentation for Match Patterns, which states:

Pattern: https://mozilla.org/*/b/*/

Match HTTPS URLs hosted on "mozilla.org", whose path contains a component "b" somewhere in the middle. Will match URLs with query strings, if the string ends in a /.

Example matches:

  • https://mozilla.org/a/b/c/
  • https://mozilla.org/d/b/f/
  • https://mozilla.org/a/b/c/d/
  • https://mozilla.org/a/b/c/d/#section1
  • https://mozilla.org/a/b/c/d/?foo=/
  • https://mozilla.org/a?foo=21314&bar=/b/&extra=c/

Example non-matches:

  • https://mozilla.org/b/*/ (unmatched path)
  • https://mozilla.org/a/b/ (unmatched path)
  • https://mozilla.org/a/b/c/d/?foo=bar (unmatched path due to URL query string)

I know I could solve this using the deprecated @include with a regex (or a regex in @exclude, for that matter), but regex in userscript rules generally reduces performance. I also know I could add logic to my userscript that validates the pathname before executing the rest of the script, but I'd like an elegant solution using only the basic @match and @exclude if possible.
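For reference, the in-script pathname check mentioned above might look roughly like the sketch below. The function name `isSubredditHome` is hypothetical, and the regex assumes that a subreddit homepage path is `/r/<name>` with an optional trailing slash and nothing after it.

```javascript
// Hypothetical helper: true only for a subreddit homepage path
// such as "/r/funny" or "/r/funny/".
function isSubredditHome(pathname) {
  // "/r/", then one segment containing no "/", then an optional
  // trailing slash, then the end of the path.
  return /^\/r\/[^/]+\/?$/.test(pathname);
}

// In a userscript, the main body could be guarded like this.
// The typeof check only keeps the sketch runnable outside a browser.
if (typeof location !== 'undefined' && isSubredditHome(location.pathname)) {
  console.log('Subreddit homepage', location.href);
}
```

This is the fallback rather than the wildcard-only solution being asked for: the script is still injected on sub-pages and simply does nothing there.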

Is there a better way to form these rules to make that happen?

ETL
  • The URLs you want to exclude don't have a **trailing** `/` so you need to remove it from `@exclude`. – wOxxOm Jul 18 '23 at 05:33
  • Removing the trailing slash does not work because the asterisk wildcard can represent nothing/null, so `@exclude https://reddit.com/r/*/*` would match (and therefore exclude) "https://reddit.com/r/funny/". To make matters more complicated, pages often have trailing slashes that aren't shown in the browser's urlbar for aesthetic reasons, so it's even harder to figure out whether a given url path will match. – ETL Jul 18 '23 at 21:49
  • The browser never hides the trailing slash in a *non-empty path* of a URL, it's the server's doing, so you'll have to use a regex or list all excluded pages explicitly e.g. you can add 26 exclusions for each alphabet letter + `*`. – wOxxOm Jul 19 '23 at 05:39

1 Answer


Match Patterns are working correctly, but there are other considerations.

Test Script

Note: Some sites, like Reddit, use JavaScript to navigate between pages. The page is not reloaded on such navigation, so the script is not injected again and needs additional handling to run.
For testing, open each URL in a new tab.

// ==UserScript==
// @name          Match Test Script
// @match         https://*.reddit.com/r/*/
// @exclude       https://*.reddit.com/r/*/*/
// ==/UserScript==

console.log('Match Test Script', location.href);

Result

Tested the above script with FireMonkey (FM), Greasemonkey (GM), Tampermonkey (TM), and Violentmonkey (VM); all worked as expected.

Match

https://www.reddit.com/r/funny/

No Match

https://www.reddit.com/r/funny/submit
https://www.reddit.com/r/funny/submit/
https://www.reddit.com/r/funny/comments/14jmh7e/forging_a_return_to_productive_conversation_an/
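For sites that navigate with JavaScript, as noted for Reddit, one possible approach is to poll the URL and react when it changes. The sketch below is an assumption, not something tested in this answer: `makeUrlWatcher` is a hypothetical helper, the 500 ms interval is arbitrary, and the script would need a broader @match (for example the whole site) so that it is already injected when the in-page navigation happens.

```javascript
// Hypothetical helper: returns a function that, each time it is called,
// compares the current URL with the last one seen and reports a change.
function makeUrlWatcher(getHref, onChange) {
  let last = getHref();
  return function checkUrl() {
    const current = getHref();
    if (current !== last) {
      last = current;
      onChange(current);
    }
  };
}

// In a browser, poll twice per second (arbitrary choice).
// The typeof check only keeps the sketch runnable outside a browser.
if (typeof location !== 'undefined') {
  const checkUrl = makeUrlWatcher(
    () => location.href,
    (href) => console.log('URL changed to', href)
  );
  setInterval(checkUrl, 500);
}
```

Polling is used here because it does not depend on how the userscript manager isolates the script from the page; the `onChange` callback is where a pathname check would decide whether to run the rest of the script.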

UserScript matching in manifest v3

Once MV3 scripting is fully implemented for userscripts, all userscript managers are expected to use Match Patterns.

UserScript matching in manifest v2

Currently, userscript managers use different methods to match & inject userscripts.

  • GM|TM|VM convert match/include/exclude rules to regular expressions and manually check whether a URL matches before injecting
    (TM has started to use the dedicated API in some cases on Firefox)
  • FireMonkey uses the dedicated API and Firefox checks matching for injection based on Match Patterns
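To illustrate the regular-expression approach in the first bullet, here is a simplified sketch. It is not taken from any userscript manager's source: `pathGlobToRegExp` and `shouldRun` are hypothetical names, only the path part of a pattern is handled, and scheme, host, and query string are ignored.

```javascript
// Hypothetical helper: turn the path part of a match pattern into a RegExp.
// Regex metacharacters are escaped, then each "*" becomes ".*", which
// matches any run of characters, including "/" and the empty string.
function pathGlobToRegExp(glob) {
  const escaped = glob.replace(/[.+?^${}()|[\]\\]/g, '\\$&');
  return new RegExp('^' + escaped.replace(/\*/g, '.*') + '$');
}

// Hypothetical helper: a path runs the script if it matches the @match
// glob and does not match the @exclude glob.
function shouldRun(path, matchGlob, excludeGlob) {
  return pathGlobToRegExp(matchGlob).test(path) &&
    !pathGlobToRegExp(excludeGlob).test(path);
}

for (const path of ['/r/funny/', '/r/funny/submit', '/r/funny/submit/']) {
  console.log(path, shouldRun(path, '/r/*/', '/r/*/*/'));
}
```

Under these assumptions only `/r/funny/` passes: `/r/funny/submit` fails the @match glob because it lacks the trailing slash, and `/r/funny/submit/` is caught by the @exclude glob.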
erosman
  • (reading your answer) "Oh cool, they mentioned FireMonkey! So many people don't even know it exists." (reading your username) "Oh yep, that'll do it." – ETL Jul 18 '23 at 21:52
  • And so, the match rules do work accurately in the sense that they match on "reddit.com/r/funny/submit", but I don't want it to match that. I only want it to match "reddit.com/r/funny/" and no sublevels like "/submit". Is this possible using asterisk wildcards, or do I need to find every possible directory name for the subpages and @exclude those individually? (I know subreddits have "/submit" and "/about" but I'm not sure what else. Discovering them might just require clicking around randomly, and trial and error, so I'd certainly prefer if there's a wildcard-only solution) – ETL Jul 18 '23 at 22:36
  • @ETL I tested the above script in FireMonkey and it matches `reddit.com/r/funny/` and doesn't match `reddit.com/r/funny/submit`, as desired. – erosman Jul 19 '23 at 05:26