
Is there a way to set the u flag and thus enable unicode regex patterns?

I need to match names like Straßer, Müller, Adèle, Yiğit.

/\p{L}+/u or new RegExp('\\p{L}+', 'u') would work in my case if I could use plain JS in JSON schema.

The specification says

6.3.3. pattern
The value of this keyword MUST be a string. This string SHOULD be a valid regular expression, according to the ECMA-262 regular expression dialect.

I found this: How to match a Unicode letter with a JSON Schema pattern (regular expression). The resulting pattern is too obfuscated. JavaScript/ECMAScript can handle \p{L} as expected if the u flag is set.
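
To make the difference concrete, here is a small sketch in plain JavaScript of how the same pattern string behaves with and without the u flag. The anchors and the NFC normalization are my additions for the illustration, not something JSON Schema does for you.

```javascript
// Names from the question, normalized to NFC so that each accented letter is
// a single code point (a combining mark would be \p{M}, not \p{L}).
const names = ["Straßer", "Müller", "Adèle", "Yiğit"].map((n) => n.normalize("NFC"));

// With the u flag, \p{L} is a Unicode property escape matching any letter.
const withFlag = new RegExp("^\\p{L}+$", "u");

// Without the u flag, \p is just an escaped "p", so the pattern matches the
// literal text "p{L}" instead of letters.
const withoutFlag = new RegExp("^\\p{L}+$");

const matchedWithFlag = names.map((n) => withFlag.test(n));
const matchedWithoutFlag = names.map((n) => withoutFlag.test(n));

console.log(matchedWithFlag);          // one boolean per name, u flag set
console.log(matchedWithoutFlag);       // one boolean per name, no flag
console.log(withoutFlag.test("p{L}")); // the literal string the flagless regex accepts
```

So the pattern string itself is fine; what matters is whether the validator compiles it in Unicode mode.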

nuiun

1 Answer


The 2020-12 version of JSON Schema (which you reference) has a separate, more detailed changelog (informative) that spells out something that may not be obvious from the specification itself:

Regular expressions are now expected (but not strictly required) to support unicode characters. Previously, this was unspecified and implementations may or may not support this unicode in regular expressions. - https://json-schema.org/draft/2020-12/release-notes.html

If you are using an implementation which supports JSON Schema draft 2020-12, you should be able to use unicode in regex, as that flag should be enabled.

You cannot specify flags with the regular expression because the actual requirements for regular expression support are only SHOULD and not MUST. In the specification world, this means you cannot rely on this to be interoperable. If you only plan to use the schemas internally and you test it and it works (it should, given it sounds like you're working with js/node), then you'll probably be OK, but sharing the schemas with others may not work as expected.
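
As a rough sketch of what "the flag should be enabled" means in practice: the schema carries only the pattern string, and the implementation decides how to compile it. The helper names below (`checkPattern`, `supportsUnicodePatterns`) are hypothetical and not taken from any particular validator; a real library may do this differently.

```javascript
// Hypothetical helper: how a JS-based validator might evaluate "pattern".
// The schema holds only the pattern string; the flag is chosen here.
function checkPattern(pattern, value) {
  // JSON Schema patterns are not implicitly anchored, so test() is used as-is.
  return new RegExp(pattern, "u").test(value);
}

// Hypothetical probe: does this engine accept Unicode property escapes?
function supportsUnicodePatterns() {
  try {
    return new RegExp("^\\p{L}$", "u").test("ß");
  } catch (err) {
    return false; // engines without \p{...} support throw a SyntaxError
  }
}

// Example schema fragment; in a JSON file the backslash is escaped the same way.
const schema = { type: "string", pattern: "^\\p{L}+$" };

console.log(supportsUnicodePatterns());
console.log(checkPattern(schema.pattern, "Müller".normalize("NFC")));
console.log(checkPattern(schema.pattern, "M1ller"));
```

A probe like this only tells you about the engine you run it on, which is why it does not help with schemas you hand to someone else.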

Some implementations in other languages use a port of the ECMA-262 regular expression engine, but not all do, and sometimes there isn't a port available.

Relequestual
  • We are evaluating JSON Schema as an alternative to XML Schema. The platform or language is not defined yet. We have systems built in Java and C#. Schema validation is used for in house server to server communication to synchronize heterogeneous systems. – nuiun Aug 19 '21 at 08:56
  • It's not an alternative. XML Schema if you have XML, JSON Schema if you have JSON. Why are you considering changing? – Relequestual Aug 19 '21 at 09:05
  • We are changing the data format for message exchanges. JSON is preferred by employees working with microcontrollers for its lightness. By using JSON, we gain more homogeneous communication standards in house. We do not need the text markup capabilities of XML. The message consists of several data fields. Therefore JSON is a good fit for that use case. I like to work with Rust. There is no XML Schema library for Rust yet. But there is a JSON Schema lib =) JSON appears to be more readable for persons without programming knowledge. – nuiun Aug 19 '21 at 09:22
  • OK. Given that, I would not rely on the regular expression engine supporting unicode. You will benefit if you can build a test suite against your schemas which you can run in multiple programming languages to ensure your schemas are as interoperable as you expect. You could do this by defining a single set of tests in JSON, much like we do for JSON Schema itself. It's good to test that your schemas do what you expect anyway. – Relequestual Aug 19 '21 at 09:38
  • If your JSON Schema evaluator of choice doesn't support unicode character matching, you could send a patch to add that support :) Open source software thrives on contributions from interested parties to add features and fix bugs. – Ether Aug 19 '21 at 16:05
  • Problem is VSCode does not have unicode pattern support xD working with JSON Schema is built in though. VSCode uses it for its configuration files a lot. This site is working with unicode patterns: https://www.jsonschemavalidator.net/ . I'll see what I can contribute tomorrow. XML Schema support with the red hat plugin for VSCode is marvelous. Just in case you didn't know. Thank you for your answer and kind replies – nuiun Aug 19 '21 at 20:02
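
The shared test file suggested in the comments could look roughly like the sketch below. The layout (a schema plus cases with `data` and `valid`) follows the shape used by the JSON Schema test suite; the concrete cases, `validate`, and `runSuite` are made up for illustration. `validate` is only a stand-in that understands `type: "string"` and `pattern`; in each language you would swap in the real validator you are evaluating.

```javascript
// Hypothetical shared test file, kept as JSON text so runners in other
// languages (Java, C#, Rust, ...) could read the very same content.
const suiteJson = String.raw`[
  {
    "description": "names may contain any Unicode letter",
    "schema": { "type": "string", "pattern": "^\\p{L}+$" },
    "tests": [
      { "description": "sharp s", "data": "Straßer", "valid": true },
      { "description": "g with breve, JSON-escaped", "data": "Yi\u011fit", "valid": true },
      { "description": "digit in name", "data": "M1ller", "valid": false },
      { "description": "not a string", "data": 42, "valid": false }
    ]
  }
]`;

// Stand-in for a real validator (assumption: only "type" and "pattern" matter here).
function validate(schema, data) {
  if (schema.type === "string" && typeof data !== "string") return false;
  if (schema.pattern !== undefined && typeof data === "string") {
    return new RegExp(schema.pattern, "u").test(data);
  }
  return true;
}

// Runs every case and collects the ones whose result differs from "valid".
function runSuite(suite) {
  const failures = [];
  for (const group of suite) {
    for (const t of group.tests) {
      if (validate(group.schema, t.data) !== t.valid) {
        failures.push(`${group.description}: ${t.description}`);
      }
    }
  }
  return failures;
}

const failures = runSuite(JSON.parse(suiteJson));
console.log(failures.length === 0 ? "all cases pass" : failures);
```

Running the same file against each candidate implementation shows quickly which ones compile `pattern` in Unicode mode and which do not.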