I'm writing a parser than converts LaTeX
math in to python eval()
compatible string.
I get to a point where I have a string that looks like this:
\sqrt{4m/s} - \frac{3v+10.5v}{20a-8a} +1/2
Notice the still mostly LaTeX
syntax, as well as some arbitrary "unit" letters thrown in.
I then use the following Negated set, to replace everything except what is in the negated set.
mathstr = re.sub('[^0-9*()/+\-.Q]','',mathstr)
How Can I include a substring "sqrt" so that it can work in a similar fashion, and preferably in the same regular expression?
Right now my work around is replacing '\sqrt
' with 'Q
', doing the line of code above, and then setting 'Q
' to 'sqrt
' after, my full routine for going from the above syntax to eval()
syntax is as follows:
mathstr = mathstr.replace(" ","")
if pwrRe.search(mathstr):
mathstr = re.sub(pwrRe,'**',mathstr)
if MultiplyRe.search(mathstr):
mathstr = re.sub(MultiplyRe,'*',mathstr)
if DivideRe.search(mathstr) or sqrtRe.search(mathstr):
mathstr = re.sub('\\\\frac{','(',mathstr)
mathstr = re.sub('\\\\sqrt{','\\\\sqrt(',mathstr)
mathstr = re.sub('}{',')/(',mathstr)
mathstr = re.sub('}',')',mathstr)
mathstr = re.sub('[/*+\-^][a-zA-Z]','',mathstr)
mathstr = re.sub('\\\\sqrt','Q',mathstr)
mathstr = re.sub('[^0-9*()/+\-.Q]','',mathstr)
mathstr = re.sub(r'Q','sqrt',mathstr)
Which results in the eval()
syntax'd:
sqrt(4)-(3+10.5)/(20-8)+1/2
But this is sloppy, and it would be useful in many areas if I could 'whitelist' characters and substrings in one line, blowing away all other characters that come up.
EDIT:
As I continue expanding my script this list will get longer but for now I want to match the following and discard everything else:
0123456789()/*+-^sqrt <-- only sqrt when it's a substring
Here are a few examples:
Before: sqrt(5s+2s)+(3s**2/9s)
After: sqrt(5+2)+(3**2/9)
Before: sqrt(4*(5+2)/(2))\$
After: sqrt(4*(5+2)/(2))
Before: sqrt(4v/a)-(3v+10.5v)/(20a-8a)+1/2ohms
After: sqrt(4)-(3+10.5)/(20-8)+1/2
There is some nuance to this beyond simply matching only those characters as well. In my first example you can see I have v/a, even though there is an '/' there, I remove that as well.