
I know this is a bad one, but I really want to know whether this is possible in Python. I have strings containing arithmetic expressions, and I want to place each name in them inside df[ ], where df is a DataFrame. Is this possible?

X = "'cars'+'bikes'*'planes'"

This should then become the following:

X = df['cars']+df['bikes']*df['planes']

If it is possible, how do I do it?

krish

1 Answer


I am assuming that you know the consequences of using eval.

import re

s = "'cars'+'bikes'*'planes'"

# Wrap each quoted column name in df[...], then evaluate the resulting expression
df['out'] = eval(re.sub(r"([^+\-*\/]+)", r'df[\1]', s))

What it does is wrap each quoted column name in df[...]: it changes 'cars'+'bikes'*'planes' to df['cars']+df['bikes']*df['planes']. If you don't want to use eval, you can parse the column names and operators like this:

columns = re.findall(r"'([^+\-*\/]+)'", s)    # ['cars', 'bikes', 'planes']
operators = re.findall(r'([+\-*\/]+)', s)     # ['+', '*']

But in this case, you need to define operator precedence and build an expression tree to calculate the result.
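As a rough sketch of that eval-free route, one option (a technique swapped in here, not something from the answer above) is to let Python's standard `ast` module build the tree, since its parser already applies the usual operator precedence. The names `evaluate`, `table`, and `_BINARY_OPS` are hypothetical, and only `+`, `-`, `*`, and `/` are handled. The demo uses a plain dict of numbers so it runs without pandas; the assumption is that a DataFrame can be passed the same way, because `df['cars']` returns a Series that supports these operators.

```python
import ast
import operator

# Map AST operator node types to the functions that implement them.
_BINARY_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}


def evaluate(expr, table):
    """Evaluate expr, looking up every quoted name as table[name]."""

    def visit(node):
        # Binary operation: evaluate both sides, then combine them.
        if isinstance(node, ast.BinOp) and type(node.op) in _BINARY_OPS:
            return _BINARY_OPS[type(node.op)](visit(node.left), visit(node.right))
        if isinstance(node, ast.Constant):
            if isinstance(node.value, str):
                return table[node.value]  # quoted name -> column lookup
            if isinstance(node.value, (int, float)):
                return node.value  # plain number, used as-is
        # Anything else (function calls, attribute access, ...) is rejected.
        raise ValueError(f"unsupported expression element: {ast.dump(node)}")

    # mode="eval" parses a single expression; .body is the root of its tree.
    return visit(ast.parse(expr, mode="eval").body)


table = {"cars": 2, "bikes": 3, "planes": 4}
print(evaluate("'cars'+'bikes'*'planes'", table))        # 14
print(evaluate("'cars'+30*'bikes'-'planes'+20", table))  # 108
```

Because only whitelisted node types are evaluated, anything outside them, such as a function call, raises `ValueError` instead of being executed.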


Update

import re
import pandas as pd

s =  "'cars'+30*'bikes'-'planes'+20"
s2 = re.sub(r"('[^+\-*\/'\d]+')", r'df[\1]', s)

pd.eval(s2)
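To see what the substitution produces before anything is evaluated, here is a small sketch that only rewrites the string. The helper name `wrap_columns` and its `frame_name` parameter are hypothetical; the pattern is the one from the update.

```python
import re

# Pattern from the update: a quoted run of characters that contains
# no arithmetic operator, no quote, and no digit.
COLUMN_PATTERN = r"('[^+\-*\/'\d]+')"


def wrap_columns(expr, frame_name="df"):
    """Return expr with every quoted column name wrapped in frame_name[...]."""
    return re.sub(COLUMN_PATTERN, lambda m: f"{frame_name}[{m.group(1)}]", expr)


s = "'cars'+30*'bikes'-'planes'+20"
s2 = wrap_columns(s)
print(s2)  # df['cars']+30*df['bikes']-df['planes']+20
```

The numbers 30 and 20 are left alone because they are not quoted. Note that the pattern also excludes digits inside the quotes, so a column name such as 'some_32' would be left unwrapped.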
Epsi95
  • What if I have s = "'cars'*30+'bikes'*'planes'+20"? How do I avoid adding df to 20 and 30? Is it possible to get s = df['cars']*30+df['bikes']*df['planes']+20? – krish Feb 16 '21 at 11:37
  • Is there a `*` between `'bikes''planes'`, or is it empty? – Epsi95 Feb 16 '21 at 12:15
  • Oops! It got misplaced. It is s = "'cars'+30*'bikes'-'planes'+20" and the expected output is s = df['cars']+30*df['bikes']-df['planes']+20. Is it possible to do so? – krish Feb 16 '21 at 14:40
  • I was about to write about the " ' ". It is working fine now. Thanks a lot, brother! You are the best :) – krish Feb 16 '21 at 15:02
  • I have one question. What if the expression contains conditional/comparison symbols? Will it work the same way? When I tried it, df was added to the symbols rather than to the word. This is what I supplied: s = "'cars'>2" s2 = re.sub(r"('[>=<=<>!=^+\-*\/'\d]+')", r'df[\1]', s) and this is what I got: s = "'cars' df[>] 2' – krish Feb 16 '21 at 16:33
  • The `^` symbol should be at the beginning of the character class, like `r"('[^>=<=<>!=+\-*\/'\d]+')"`. It means: match everything except `>=<=<>!=+\-*\/'\d`. If you know that the column names are purely alphabetical, like `aa`, `bb`, `aBa`, and never like `aa_1` or `some_32`, simply use `r"('[A-Za-z]+')"`, which matches only the letters `a` to `z` and `A` to `Z`. – Epsi95 Feb 16 '21 at 16:37
  • Yes, I got the solution for that in another question. Thanks a lot. – krish Feb 16 '21 at 18:24
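For reference, the two patterns suggested in the comments can be checked against the comparison example with a short sketch. The variable names `negated` and `letters_only` are made up for illustration; the patterns themselves are taken from the comment thread.

```python
import re

s = "'cars'>2"

# Negated character class: the ^ must come first inside the brackets.
negated = r"('[^>=<=<>!=+\-*\/'\d]+')"
# Letters-only alternative for purely alphabetical column names.
letters_only = r"('[A-Za-z]+')"

print(re.sub(negated, r"df[\1]", s))       # df['cars']>2
print(re.sub(letters_only, r"df[\1]", s))  # df['cars']>2
```

Both patterns wrap only the quoted name and leave the comparison symbol and the number untouched.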