So your input is 1
,2
,3
,4
,5
and you are expecting [1-9]
. Wouldn't [1-5]
be a better fit? Or does [1|2|3|4|5]
cover your data space best? What happens if you add a 8
? Would it rather be [1-8]
or [1|2|3|4|5|8]
. The latter is more optimal in terms of coverage of your observed data.
If you are looking for a regular expression that minimally covers your data records a expression of type [record_1|...|record_n]
is always a good choice. However, this is probably not what you want.
Finding a suitable (=minimal in terms of scope and readability) regex based on input data is very difficult or even impossible since it depends on a subjective quantification of data coverage vs. readability. As a consequence there is no such functionality in SAP HANA (and probably also not in other dbms).
What you can do is to define a set of regexes, that you would expect (e.g. one for numbers, one for text, one for dates, etc.) and programmatically determine the one that best fits your input data.