
I have a dataframe like this:

import pandas as pd

df = pd.DataFrame({'col1': ['abc', 'def', 'tre'],
                   'col2': ['foo', 'bar', 'stuff']})

  col1   col2
0  abc    foo
1  def    bar
2  tre  stuff

and a dictionary like this:

d = {'col1': [0, 2], 'col2': [1]}

The dictionary contains column names and indices of values to be extracted from the dataframe to generate strings like this:

abc (0, col1)

So each string starts with the element itself, followed by the index and column name in parentheses.

I tried the following list comprehension:

l = [f"{df.loc[{indi}, {ci}]} ({indi}, {ci})"
     for ci, vali in d.items()
     for indi in vali]

which yields

['  col1\n0  abc (0, col1)',
 '  col1\n2  tre (2, col1)',
 '  col2\n1  bar (1, col2)']

So it is almost right; only the col1\n0 prefixes need to go.

If I try

f"{df.loc[0, 'col1']} is great"

I get

'abc is great'

as desired. However, with

x = 0
f"{df.loc[{x}, 'col1']} is great"

I get

'0    abc\nName: col1, dtype: object is great'

How could this be fixed?

jpp
Cleb

2 Answers

import pandas as pd

df = pd.DataFrame({'col1': ['abc', 'def', 'tre'],
                   'col2': ['foo', 'bar', 'stuff']})

d = {'col1': [0, 2], 'col2': [1]}
x = 0
[f"{df.loc[x, 'col1']} is great"
     for ci, vali in d.items()
     for indi in vali]

which gives you:

['abc is great', 'abc is great', 'abc is great']

Is this what you're looking for?

You can also loop over a range of index values:

[f"{df.loc[i, 'col1']} is great"
 for ci, vali in d.items()
 for indi in vali
 for i in range(2)]

#output
['abc is great',
 'def is great',
 'abc is great',
 'def is great',
 'abc is great',
 'def is great']
Chandu
  • Thanks, but you do not use the dictionary `d`. But @jpp's answer solves it. Was easier than expected. – Cleb Oct 04 '18 at 12:41

What you are seeing is the string representation, including the ugly newline \n characters, of a pd.Series object returned by the loc accessor.

You should use pd.DataFrame.at to return scalars, and note there's no need here for nested {} for your index labels:

L = [f'{df.at[indi, ci]} ({indi}, {ci})'
     for ci, vali in d.items()
     for indi in vali]

print(L)

['abc (0, col1)', 'tre (2, col1)', 'bar (1, col2)']
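
To see why the extra braces matter, here is a minimal sketch (an illustration added for clarity, not part of the original answer). Inside an f-string replacement field, `{x}` is evaluated as a Python set literal, so `loc` receives a list-like label instead of a scalar. The sketch uses lists in place of sets, on the assumption that recent pandas versions reject set indexers; the names `scalar`, `as_series` and `as_frame` are hypothetical.

```python
import pandas as pd

df = pd.DataFrame({'col1': ['abc', 'def', 'tre'],
                   'col2': ['foo', 'bar', 'stuff']})

x = 0

# Inside an f-string expression, {x} is a set literal, not a placeholder.
assert isinstance({x}, set)

# Scalar row and column labels return the cell value itself.
scalar = df.loc[x, 'col1']

# A list-like row label with a scalar column label returns a Series.
as_series = df.loc[[x], 'col1']

# List-like labels on both axes return a DataFrame.
as_frame = df.loc[[x], ['col1']]

print(type(scalar).__name__)     # str
print(type(as_series).__name__)  # Series
print(type(as_frame).__name__)   # DataFrame
```

Formatting a Series or DataFrame inside an f-string inserts its full printed representation, which is where the header and index text in the question's output comes from.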
jpp
  • Great, just removing the `{}` already solves it, even when `.loc` is used, but `.at` might be faster. – Cleb Oct 04 '18 at 12:39
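
As a quick check of the comment above, the following sketch builds the list with both accessors, without the nested braces, and confirms that they agree. It makes no claim about relative speed; `with_loc` and `with_at` are hypothetical names.

```python
import pandas as pd

df = pd.DataFrame({'col1': ['abc', 'def', 'tre'],
                   'col2': ['foo', 'bar', 'stuff']})

d = {'col1': [0, 2], 'col2': [1]}

# Scalar labels passed to .loc return the cell value directly.
with_loc = [f"{df.loc[indi, ci]} ({indi}, {ci})"
            for ci, vali in d.items()
            for indi in vali]

# .at accepts only scalar labels and also returns the cell value.
with_at = [f"{df.at[indi, ci]} ({indi}, {ci})"
           for ci, vali in d.items()
           for indi in vali]

print(with_loc)             # ['abc (0, col1)', 'tre (2, col1)', 'bar (1, col2)']
print(with_loc == with_at)  # True
```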