-4

I want to match the following pattern (including the square brackets, equals sign, and quotes):

[fixedtext="sometext"]

What would be the correct regular expression?

Anything can occur inside quotes. 'fixedtext' is fixed.

Wiktor Stribiżew
megan adams
  • 2
    What have you tried? What's the expected output? Do any edge-cases exist? What are the pools of characters that make up each part of your pattern? Can escaped `\"` be found inside the `""`? Can single quotes also be used, or just double quotes? Is there a possibility of whitespace after `[` or before `]` or even around `=`? Do you have samples of strings that should not be matched (i.e. unterminated `"` or illegal characters)? Can you have nesting like `[abc=[abc="x"]]`? Many questions are left unanswered, so it's difficult to provide a clear and concise answer. Also, is it java or python??? – ctwheels Mar 27 '18 at 18:12
  • “Anything can occur inside quotes.” That could be a problem then. What about quotes inside quotes? – Patrick Parker Mar 27 '18 at 18:25
  • @megan what escape sequences are allowed inside that string literal (\\, \") ? If so, see [this answer](https://stackoverflow.com/a/37032779/7098259). – Patrick Parker Mar 30 '18 at 16:50

2 Answers

-1

Your basic solution (although I'd be skeptical of this, per the comments) is essentially:

  "\\[fixedtext=\\\"(.*)\\\"\\]"

which resolves to:

  "\[fixedtext=\"(.*)\"\]"

This simply escapes the square brackets and the quotes. The (.*) captures everything between the quotes as a capture group, available as matcher.group(1).

Bear in mind that (.*) is greedy. Given a string such as '[fixedtext="abc\"]def"]', it captures abc\"]def (everything up to the last "]), not just abc\.

If you know the ending bracket ends the line, then use:

  "\\[fixedtext=\\\"(.*)\\\"\\]$"

(add the $ at the end to mark end of line) and that should be fairly reliable.
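As a minimal sketch of how the pattern above might be used from Java, the following compiles it once and returns the captured group. The class name FixedTextMatch and the helper extract are made up for illustration, and find() is an assumption about how the OP wants to search (use matches() instead if the whole input must be the tag).

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class FixedTextMatch {
    // Same regex as in the answer: \[fixedtext=\"(.*)\"\]
    public static final Pattern TAG =
            Pattern.compile("\\[fixedtext=\\\"(.*)\\\"\\]");

    /** Returns the text between the quotes, or null if no tag is found. */
    public static String extract(String input) {
        Matcher m = TAG.matcher(input);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        // Plain case from the question.
        System.out.println(extract("[fixedtext=\"sometext\"]"));     // sometext
        // Greedy .* runs to the last "] in the input.
        System.out.println(extract("[fixedtext=\"abc\\\"]def\"]")); // abc\"]def
        // A different key does not match.
        System.out.println(extract("[othertext=\"sometext\"]"));     // null
    }
}
```

Compiling the Pattern once into a constant avoids recompiling the regex on every call.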

user1676075
  • This `(.*)` will greedily span across multiple `[tags]` – Patrick Parker Mar 28 '18 at 19:58
  • @PatrickParker it's the difference between greedy and possessive. Yes, it will capture, and then it will back off to complete the match. Greedy will backtrack, possessive won't. – user1676075 Mar 30 '18 at 14:06
  • Ah, yes, my comment about abc/def was backwards in that it grabs everything. But the regex still works to correctly grab the contents, barring nested/duplicate entries. The OP wasn't clear if that was a requirement or not. so you don't know if this solution actually works for him or not (given just the example provided, it does). – user1676075 Apr 01 '18 at 13:19
  • hint: try `[^"]` instead of `.` however, if escape sequences are allowed (OP never responded so we don't know) then you'd need something like [this](https://stackoverflow.com/a/37032779/7098259) – Patrick Parker Apr 01 '18 at 14:08
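The difference discussed in these comments can be sketched in Java as follows. This is an illustration only: the class QuoteMatching, its constants, and the helper values are hypothetical names, and the escape-aware variant assumes backslash escapes such as \" and \\ are allowed inside the quotes, which the OP never confirmed.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class QuoteMatching {
    // Greedy: .* runs to the last "] in the input.
    public static final Pattern GREEDY =
            Pattern.compile("\\[fixedtext=\"(.*)\"\\]");
    // Negated class: the value stops at the first double quote.
    public static final Pattern NO_QUOTES =
            Pattern.compile("\\[fixedtext=\"([^\"]*)\"\\]");
    // Escape-aware: a backslash consumes the character that follows it.
    public static final Pattern ESCAPED =
            Pattern.compile("\\[fixedtext=\"((?:[^\"\\\\]|\\\\.)*)\"\\]");

    /** Collects group 1 of every match of p in input. */
    public static List<String> values(Pattern p, String input) {
        List<String> out = new ArrayList<>();
        Matcher m = p.matcher(input);
        while (m.find()) {
            out.add(m.group(1));
        }
        return out;
    }

    public static void main(String[] args) {
        String two = "[fixedtext=\"a\"] [fixedtext=\"b\"]";
        System.out.println(values(GREEDY, two).size());    // 1 (spans both tags)
        System.out.println(values(NO_QUOTES, two).size()); // 2 (a and b)

        String esc = "[fixedtext=\"abc\\\"]def\"]";
        System.out.println(values(NO_QUOTES, esc).get(0)); // abc\
        System.out.println(values(ESCAPED, esc).get(0));   // abc\"]def
    }
}
```

If quotes can never appear inside the value, the negated class is the simplest choice; the escape-aware form only matters when escaped quotes are legal.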
-1

I suggest using named capturing groups. You can find more details in the Pattern documentation: https://docs.oracle.com/javase/7/docs/api/java/util/regex/Pattern.html

Here's an example for your input:

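import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class NamedGroupExample {
  public static void main(String[] args) {
    String input = "[fixedtext=\"sometext\"]";

    // "field" captures the text before '=', "value" the text between the quotes.
    Pattern pattern = Pattern.compile("\\[(?<field>.*)=\"(?<value>.*)\"]");
    Matcher matcher = pattern.matcher(input);

    // matches() requires the whole input to fit the pattern.
    if (matcher.matches()) {
      System.out.println(matcher.group("field"));
      System.out.println(matcher.group("value"));
    } else {
      System.err.println(input + " doesn't match " + pattern);
    }
  }
}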