Is ANTLR right for this project?
I'm looking to process and transform a string entered by a user that may include custom functions. For example, the user might write something like $CAPITALIZE('word') in a string, and I want to perform the actual transformation in the background using StringUtils.
I would imagine the users will sometimes write nested functions like:
$RIGHT_PAD($RIGHT($CAPITALIZE('a123456789'),6),3,'0')
Where the expected output would be a string value of '456789000' (since $RIGHT keeps the rightmost 6 characters).
I tried using regex to split the functions apart, but that approach broke down once the functions were nested. I figured I might try writing my own parser, and while researching I came across an article that suggested using ANTLR instead.
Is this something ANTLR would be right for? If so, are there any similar examples I could look at? Or would someone be kind enough to show how I might write this in ANTLR, so that the custom functions can be processed both individually and when nested?
Functions:
- $CAPITALIZE(String str)
- $INDEX_OF(String seq, String searchSeq)
- $LEFT(String str, int len)
- $LEFT_PAD(String str, int size, char padChar)
- $LOWERCASE(String str)
- $RIGHT(String str, int len)
- $RIGHT_PAD(String str, int size, char padChar)
- $STRIP(String str)
- $STRIP_ACCENTS(String input)
- $SUBSTRING(String str, int start)
- $SUBSTRING(String str, int start, int end)
- $TRIM(String str)
- $TRUNCATE(String str, int maxWidth)
- $UPPERCASE(String str)
Basic Examples:
- $CAPITALIZE('word') → 'Word'
- $INDEX_OF('word', 'r') → 2
- $LEFT('0123456789',6) → '012345'
- $LEFT_PAD('0123456789',3, '0') → '0000123456789'
- $LOWERCASE('WoRd') → 'word'
- $RIGHT('0123456789',6) → '456789'
- $RIGHT_PAD('0123456789',3, '0') → '0123456789000'
- $STRIP(' word ') → 'word'
- $STRIP_ACCENTS('wórd') → 'word'
- $SUBSTRING('word', 1) → 'ord'
- $SUBSTRING('word', 0, 2) → 'wo'
- $TRIM('word ') → 'word'
- $TRUNCATE('more words', 3) → 'mor'
- $UPPERCASE('word') → 'WORD'
Nested Examples:
- $LEFT_PAD($LEFT('123456789',6),3,'0') → '000123456'
- $RIGHT_PAD($RIGHT($CAPITALIZE('a123456789'),6),3,'0') → '456789000'
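The nested cases can be illustrated with a small hand-written recursive-descent evaluator, which is roughly the kind of work a parser generator such as ANTLR would take over. This is only a sketch under stated assumptions, not a definitive implementation: the class name `FunctionEvaluator` and its `eval` method are hypothetical, string arguments are single-quoted with no escape sequences, integers are unsigned, and only five of the listed functions are wired up. It uses plain JDK calls as stand-ins for StringUtils so it runs without dependencies, and the padding functions add `size` pad characters as in the examples in this post (StringUtils.leftPad and rightPad instead treat `size` as the total width).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch: evaluates one $NAME(arg, ...) expression, nested calls included. */
final class FunctionEvaluator {
    private final String src;
    private int pos = 0;

    private FunctionEvaluator(String src) { this.src = src; }

    /** Evaluates a single expression such as $LEFT('abc',2). */
    static String eval(String expression) {
        FunctionEvaluator p = new FunctionEvaluator(expression.trim());
        String result = p.parseValue();
        if (p.pos != p.src.length()) throw p.error("unexpected trailing input");
        return result;
    }

    // value := call | 'quoted string' | integer
    private String parseValue() {
        skipSpaces();
        if (pos >= src.length()) throw error("unexpected end of input");
        char c = src.charAt(pos);
        if (c == '$') return parseCall();
        if (c == '\'') return parseQuoted();
        return parseInteger();
    }

    // call := '$' NAME '(' value (',' value)* ')'
    private String parseCall() {
        pos++; // consume '$'
        int start = pos;
        while (pos < src.length()
                && (Character.isLetterOrDigit(src.charAt(pos)) || src.charAt(pos) == '_')) pos++;
        String name = src.substring(start, pos);
        expect('(');
        List<String> args = new ArrayList<>();
        do {
            args.add(parseValue()); // recursion is what handles nesting
            skipSpaces();
        } while (consumeIf(','));
        expect(')');
        return apply(name, args);
    }

    // Assumption: no escaped quotes inside string literals.
    private String parseQuoted() {
        int end = src.indexOf('\'', pos + 1);
        if (end < 0) throw error("unterminated string");
        String text = src.substring(pos + 1, end);
        pos = end + 1;
        return text;
    }

    private String parseInteger() {
        int start = pos;
        while (pos < src.length() && Character.isDigit(src.charAt(pos))) pos++;
        if (start == pos) throw error("expected a value");
        return src.substring(start, pos);
    }

    // Plain-JDK stand-ins; a real version would presumably delegate to StringUtils.
    private String apply(String name, List<String> a) {
        String s = a.get(0);
        switch (name) {
            case "CAPITALIZE":
                return s.isEmpty() ? s : Character.toUpperCase(s.charAt(0)) + s.substring(1);
            case "LEFT":
                return s.substring(0, Math.min(Integer.parseInt(a.get(1)), s.length()));
            case "RIGHT":
                return s.substring(s.length() - Math.min(Integer.parseInt(a.get(1)), s.length()));
            case "LEFT_PAD":  // adds `size` pad characters, following the post's examples
                return pad(a.get(2).charAt(0), Integer.parseInt(a.get(1))) + s;
            case "RIGHT_PAD":
                return s + pad(a.get(2).charAt(0), Integer.parseInt(a.get(1)));
            default:
                throw error("unknown function " + name);
        }
    }

    private static String pad(char c, int count) {
        char[] buf = new char[count];
        Arrays.fill(buf, c);
        return new String(buf);
    }

    private void skipSpaces() {
        while (pos < src.length() && src.charAt(pos) == ' ') pos++;
    }

    private boolean consumeIf(char c) {
        if (pos < src.length() && src.charAt(pos) == c) { pos++; return true; }
        return false;
    }

    private void expect(char c) {
        skipSpaces();
        if (!consumeIf(c)) throw error("expected '" + c + "'");
    }

    private IllegalArgumentException error(String msg) {
        return new IllegalArgumentException(msg + " at position " + pos + " in: " + src);
    }
}
```

With this sketch, `FunctionEvaluator.eval("$LEFT_PAD($LEFT('123456789',6),3,'0')")` returns `000123456`: the inner $LEFT call is evaluated first and its result becomes the first argument of $LEFT_PAD.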
Actual Example: by this I mean what I expect a real string value to look like. You will notice variables written like ${var}. These will be replaced with actual string values using Apache Commons StringSubstitutor before the string is passed to ANTLR (if it turns out I should use it).
Initial String Written By User
\HomeDir\Students\$RIGHT(${graduation.year},2)\$LEFT_PAD($LEFT(${state.id},6),3,'0')
String After Being Processed By StringSubstitutor
\HomeDir\Students\$RIGHT('2020',2)\$LEFT_PAD($LEFT('123456789',6),3,'0')
String After Being Processed By ANTLR (And my final output)
\HomeDir\Students\20\000123456
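Because the function calls here are embedded in ordinary path text, one step of the processing is locating each top-level call, including its nested arguments, which is where a plain regex tends to fall short. The following is a minimal sketch of that step only, assuming function names consist of uppercase letters and underscores and that string literals contain no escaped quotes. The names `TemplateExpander`, `expand`, and `evaluateCall` are hypothetical; `evaluateCall` stands in for whatever evaluates a single call, whether an ANTLR-generated parser or a hand-written one.

```java
import java.util.function.UnaryOperator;

/** Hypothetical sketch: finds each top-level $NAME(...) span in a template and replaces it. */
final class TemplateExpander {

    /**
     * Copies literal text through unchanged and hands every top-level call,
     * e.g. "$LEFT_PAD($LEFT('1',6),3,'0')", to evaluateCall.
     */
    static String expand(String template, UnaryOperator<String> evaluateCall) {
        StringBuilder out = new StringBuilder();
        int i = 0;
        while (i < template.length()) {
            int open = callOpenParen(template, i);
            if (open < 0) {                    // not the start of a call: literal text
                out.append(template.charAt(i++));
                continue;
            }
            int close = matchingParen(template, open);
            out.append(evaluateCall.apply(template.substring(i, close + 1)));
            i = close + 1;
        }
        return out.toString();
    }

    /** Returns the index of '(' if a call like $NAME( starts at i, otherwise -1. */
    private static int callOpenParen(String s, int i) {
        if (s.charAt(i) != '$') return -1;
        int j = i + 1;
        while (j < s.length() && (Character.isUpperCase(s.charAt(j)) || s.charAt(j) == '_')) j++;
        return (j > i + 1 && j < s.length() && s.charAt(j) == '(') ? j : -1;
    }

    /** Finds the ')' matching the '(' at index open, ignoring parentheses inside quotes. */
    private static int matchingParen(String s, int open) {
        int depth = 0;
        boolean inQuote = false;
        for (int k = open; k < s.length(); k++) {
            char c = s.charAt(k);
            if (c == '\'') inQuote = !inQuote;
            else if (!inQuote && c == '(') depth++;
            else if (!inQuote && c == ')' && --depth == 0) return k;
        }
        throw new IllegalArgumentException("unbalanced parentheses starting at index " + open);
    }
}
```

Tracking parenthesis depth is what lets the nested $LEFT call stay inside the span handed to the evaluator, so the surrounding path segments are copied through untouched.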
Does ANTLR seem like something I should use for this project, or would something else be better suited?