I wrote a function that includes a groupby aggregation, for example `df.groupby([column, 'columnA', 'columnB', 'columnE', ..., 'columnZ']).sum()`, where `column` is the input variable. Since there are many groupby columns, I don't want to rewrite all of them for each aggregation level. For one output, `column` needs to be a single column name, but for another output it needs to be two column names (one more aggregation level).

I am trying to concatenate two strings, for example 'Category_col1' and 'Category_col2'. If I simply add them using 'Category_col1' + ', Category_col2', the result is the single string 'Category_col1, Category_col2'.

My desired output looks like this: 'Category_col1', 'Category_col2'. A single string such as "'Category_col1', 'Category_col2'" would not work alongside the other groupby columns. Is there any way to achieve this?

Jiamei
  • What type of object are you trying to make with `'apple', 'banana'`? If it's a string, that would be `"'apple', 'banana'"`. Is *that* what you want? – Mark May 12 '22 at 03:46
  • I wrote a function that includes a groupby aggregation, for example df.groupby([column, 'columnA', 'columnB', 'columnC', ..., 'columnZ']).sum(). For one output, I need column to be one string, but for another output, I need it to be two column strings. – Jiamei May 12 '22 at 03:49
  • This is a string: `"'Category_col1', 'Category_col2'"`. This is a tuple: `'Category_col1', 'Category_col2'`. Do you want a tuple or a string? Consider what happens here: `print('Category_col1', 'Category_col2')` vs `print("'Category_col1', 'Category_col2'")` – Mark May 12 '22 at 03:58
  • I guess I need a tuple then. I edited my question, hopefully it's clearer. – Jiamei May 12 '22 at 04:04
  • So a tuple is made simply with `some_tuple = 'Category_col1', 'Category_col2'`. Then `some_tuple[0]` is `'Category_col1'` and `len(some_tuple)` is `2`. – Mark May 12 '22 at 04:07
  • But I don't think a tuple works here either. For example, with tuple = ('Category_col1', 'Category_col2') as the column input, the aggregation becomes df.groupby([('Category_col1', 'Category_col2'), 'columnA', 'columnB', 'columnE', ..., 'columnZ']).sum(), which doesn't work. – Jiamei May 12 '22 at 04:26
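
The nesting described in the last comment can be reproduced with plain Python lists, without pandas. This is a minimal sketch: `extra` and `fixed` are hypothetical names, and the fixed list is shortened to two of the placeholder columns from the question. A tuple placed inside list brackets stays a single element, while unpacking it with `*` splices its items in as separate entries.

```python
# Hypothetical names; 'columnA' and 'columnB' stand in for the full column list.
extra = ('Category_col1', 'Category_col2')
fixed = ['columnA', 'columnB']

nested = [extra, *fixed]  # the tuple stays a single element of the list
flat = [*extra, *fixed]   # the tuple's items are spliced in one by one

print(nested)  # [('Category_col1', 'Category_col2'), 'columnA', 'columnB']
print(flat)    # ['Category_col1', 'Category_col2', 'columnA', 'columnB']
```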

1 Answer

Maybe something like this.

def my_concat(stringA, stringB):
    return f"'{stringA}', '{stringB}'"

print(my_concat('apple','banana'))

prints the string "'apple', 'banana'"

EDIT: For building them into a tuple

def make_tuple(stringA, stringB):
    return stringA, stringB
print(make_tuple('apple','banana'))

or just omit the function altogether

print(('apple','banana'))
Eric Breyer
  • Thanks for your answer. I made some clarifications to my questions. Please take a look – Jiamei May 12 '22 at 03:54
  • `'Category_col1', 'Category_col2'` is not a string. The string would have to look like `"'Category_col1', 'Category_col2'"`. Are you looking for a tuple? – Eric Breyer May 12 '22 at 04:04
  • Yes, sorry, should be tuple then. – Jiamei May 12 '22 at 04:05
  • then just make them into a tuple `print(('apple','banana'))` – Eric Breyer May 12 '22 at 04:10
  • But I don't think a tuple works here either. For example, with tuple = ('Category_col1', 'Category_col2') as the column input, the aggregation becomes df.groupby([('Category_col1', 'Category_col2'), 'columnA', 'columnB', 'columnE', ..., 'columnZ']).sum(), which doesn't work. – Jiamei May 12 '22 at 04:12
  • Then I'm still not sure I understand the desired result. What do you want the output to look like in this "df.groupby" context? – Eric Breyer May 12 '22 at 04:14
  • For df1, I want to replace the input variable column with 'Category_col1' so that the script would be written as df.groupby(['Category_col1', 'columnA', 'columnB', 'columnE', ..., 'columnZ']).sum(). For df2, I want to replace the input variable column with 'Category_col1', 'Category_col2' so that the script would be written as df.groupby(['Category_col1', 'Category_col2', 'columnA', 'columnB', 'columnE', ..., 'columnZ']).sum() – Jiamei May 12 '22 at 04:24
  • This is not string concatenation or tuple building. It looks like you want to insert another string into the list which is an entirely different thing. – Eric Breyer May 12 '22 at 04:31
  • Sorry for the confusion. So, do you think this is do-able? – Jiamei May 12 '22 at 17:55
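
One way this could be done, offered as a hedged sketch rather than an answer confirmed in the thread: let the function accept either a single column name or a list of names, and build a flat list of groupby keys from it. `build_group_cols` and `FIXED_COLS` are hypothetical names, and the fixed list is shortened to three of the placeholder columns from the question.

```python
# Placeholder for the shared groupby columns; the real list is longer.
FIXED_COLS = ['columnA', 'columnB', 'columnE']

def build_group_cols(column, fixed_cols=FIXED_COLS):
    """Return a flat list of groupby keys from one name or a sequence of names."""
    # A bare string is wrapped in a list so both cases are handled the same way.
    leading = [column] if isinstance(column, str) else list(column)
    return leading + list(fixed_cols)

one_level = build_group_cols('Category_col1')
two_levels = build_group_cols(['Category_col1', 'Category_col2'])

print(one_level)
print(two_levels)
```

The resulting list can then be passed straight to the aggregation, for example `df.groupby(build_group_cols(column)).sum()`, assuming `df` contains every listed column.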