
I generate a string, shown below, that is a DataFrame expression.

However, when I add another two variables, I receive a string with a lot of backslash characters in it. It works fine when I use print, but not when I return the string from the function in the second scenario. Could you advise me?

The version below is generated properly.

str =  "{0}{1}.selectExpr('*',{2})".format(df,filter_feature,retrieved_features)

return str

"ptf_overall1.filter(ptf_overall1.measurement_group == 'test').selectExpr('*','case when email_14days > 0 then 1 else 0 end as journey_email_been_sent_flag','case when opened_14days > 0 then 1 else 0 end as journey_opened_flag')"

However, when I tried to append the two strings below to the existing string, the result contains a lot of backslashes:

print(group)

  .groupBy("country")

print(sum)

  .sum("email_14days")



str1 = "{0}{1}.selectExpr('*',{2}){3}{4}".format(df,filter_feature,retrieved_features,group,sum)



return str1
  
'ptf_overall1.filter(ptf_overall1.measurement_group == \'test\').selectExpr(\'*\',\'case when email_14days > 0 then 1 else 0 end as journey_email_been_sent_flag\',\'case when opened_14days > 0 then 1 else 0 end as journey_opened_flag\').groupBy("country").sum("email_14days")'

The expected output is:

"ptf_overall1.filter(ptf_overall1.measurement_group == 'test').selectExpr('*','case when email_14days > 0 then 1 else 0 end as journey_email_been_sent_flag','case when opened_14days > 0 then 1 else 0 end as journey_opened_flag').groupBy("country").sum("email_14days")"

I tried using replace, re, and translate, but I am still not getting the expected output.

sbs
  • 3
    There is nothing wrong with the output. In the first example code, you `print` the string; in the second, you simply check the *representation of* the string named as `str1`. But *it is the same string value*. It's the same as how you have to type something like `example = '\'"'` if you want to make a string that has both a `'` and a `"` in it. – Karl Knechtel Nov 19 '20 at 23:36
  • Try running `print("isn\'t")` – Mercury Nov 19 '20 at 23:39
  • 1
    Dupe of https://stackoverflow.com/questions/24052654/python-string-adding-backslash-before-single-quotes – Wiktor Stribiżew Nov 19 '20 at 23:41
  • Does this answer your question? [python string adding backslash before single quotes](https://stackoverflow.com/questions/24052654/python-string-adding-backslash-before-single-quotes) – Ryszard Czech Nov 19 '20 at 23:42
  • Hi, I didn't go through all the answers. But you are right: if I print the string I get the right result. However, I am using return in the function where this string is generated. Is there a way I can return the string as I get it from print? – sbs Nov 19 '20 at 23:50
  • Unfortunately, none of the comments answered my question. However, I understand the reason for it. But this way of returning the string affects the execution of the DataFrame definition. Can you suggest a way to fix it? – sbs Nov 19 '20 at 23:56
  • I just modified the question. In the actual scenario I am using return in both places. However, it behaves differently in the two cases. – sbs Nov 20 '20 at 00:09
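
The point made in the comments can be checked directly. The sketch below is only an illustration and uses shortened stand-in strings (`only_single` and `both_quotes` are hypothetical, not the original query). When a string contains only single quotes, Python's `repr` wraps it in double quotes and needs no escaping; when it contains both quote types, `repr` wraps it in single quotes and escapes the inner ones. The backslashes exist only in that displayed representation, not in the string value.

```python
# Shortened stand-ins for the two generated expressions (hypothetical values).
only_single = "df.filter(df.g == 'test')"
both_quotes = "df.filter(df.g == 'test').groupBy(\"country\")"

# repr() is what an interactive session or notebook echoes for a returned value.
print(repr(only_single))   # wrapped in double quotes, no backslashes
print(repr(both_quotes))   # wrapped in single quotes, inner ' escaped

# print() writes the actual characters of the string.
print(both_quotes)

# The value itself holds no backslash; only its repr does.
value_has_backslash = "\\" in both_quotes
repr_has_backslash = "\\" in repr(both_quotes)
```

This is also why `replace`, `re`, and `translate` appear to do nothing here: there are no backslash characters in the string to remove.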

1 Answer


A workaround I found is shown below. However, it looks very silly to me. Can anyone suggest a better solution?

# Note: print() writes the string to stdout and returns None, so query is None here.
query = print("{0}{1}.selectExpr('*',{2}){3}{4}".format(df,filter_feature,retrieved_features,group,sum))
return query

ptf_overall1.filter(ptf_overall1.measurement_group == 'test').selectExpr('*','case when email_14days > 0 then 1 else 0 end as journey_email_been_sent_flag','case when opened_14days > 0 then 1 else 0 end as journey_opened_flag').groupBy("country").sum("email_14days")
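
As a possible alternative to the workaround, the string can simply be returned as it is, since the returned value already has the expected characters. The sketch below is a minimal illustration, not the original function: `build_query` and its parameter names are hypothetical, and the argument values are guesses reconstructed from the output shown in the question.

```python
def build_query(df, filter_feature, retrieved_features, group, agg):
    """Return the expression text (hypothetical helper, names are assumptions)."""
    return "{0}{1}.selectExpr('*',{2}){3}{4}".format(
        df, filter_feature, retrieved_features, group, agg
    )


# Argument values below are assumed from the output in the question.
query = build_query(
    "ptf_overall1",
    ".filter(ptf_overall1.measurement_group == 'test')",
    "'case when email_14days > 0 then 1 else 0 end as journey_email_been_sent_flag'",
    '.groupBy("country")',
    '.sum("email_14days")',
)

# The returned string contains no backslashes, whatever the console echo shows.
contains_backslash = "\\" in query

# print() only writes the text; its own return value is None.
printed_result = print(query)
```

Assigning the result of `print(...)` therefore leaves the variable set to `None`, whereas returning the formatted string directly hands the caller the same text that `print` displays.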
sbs