I have a dataset like this:

             customer_id                      offer_id                     time 
0   78afa995795e4d85b5d9ceeca43f5fef    9b98b8c7a33c4b65b9aebfe6a799e6d9    0.0 
1   a03223e636434f42ac4c3df47e8bac43    0b1e1539f2cc45b7b9fa7c272da2e1d7    0.0 

I want to combine these three columns into a single string. Their dtypes are:

customer_id         object
offer_id            object
time               float64

When I use the code below it works fine:

check_1 = transcript['customer_id'] + '--' +transcript['offer_id']
check_1.value_counts()

This returns:

6d2db3aad94648259e539920fc2cf2a6--f19421c1d4aa40978ebb69ca19b0e20d    10
2ea50de315514ccaa5079db4c1ecbc0b--fafdcd668e3743c1bb461111dcafc2a4    10
23d67a23296a485781e69c109a10a1cf--5a8bc65990b245e5a138643cd4eb9837    10
........

But when I tried to add the time column as well (because I want to check whether any customer received multiple offers with the same timestamp), it raised TypeError: must be str, not float

check_2 = transcript['customer_id'] + '--' +transcript['offer_id'] + '--' + transcript['time']
check_2.value_counts()

I tried to convert float to str:

check_2 = transcript['customer_id'] + '--' +transcript['offer_id'] + '--' + str(transcript['time'])
check_2.value_counts()

This returns some odd results:

eece6a9a7bdd4ea1b0f812f34fc619d6--5a8bc65990b245e5a138643cd4eb9837--0          0.00\n1          0.00\n2          0.00\n3          0.00\n4          0.00\n          ...  \n306529    29.75\n306530    29.75\n306531    29.75\n306532    29.75\n306533    29.75\nName: time, Length: 306534, dtype: float64    10
6d2db3aad94648259e539920fc2cf2a6--f19421c1d4aa40978ebb69ca19b0e20d--0          0.00\n1          0.00\n2          0.00\n3          0.00\n4          0.00\n          ...  \n306529    29.75\n306530    29.75\n306531    29.75\n306532    29.75\n306533    29.75\nName: time, Length: 306534, dtype: float64    10
.....

What have I done wrong, and is there a better way to do this? Thanks.

wawawa

1 Answer

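str(transcript['time']) turns the whole Series into one string (its printed representation), and that single string is then appended to every row, which is the odd output you are seeing. To convert each value to a string individually, use astype(str) instead: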

transcript['time'].astype(str)
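
Plugged into the original expression, it might look like the sketch below. The small frame and its IDs are made-up placeholders standing in for the real transcript data, and the groupby at the end is only one possible alternative for counting repeated combinations without building strings.

```python
import pandas as pd

# Hypothetical stand-in for the real `transcript` DataFrame (placeholder IDs).
transcript = pd.DataFrame({
    "customer_id": ["c1", "c1", "c2"],
    "offer_id": ["o1", "o1", "o2"],
    "time": [0.0, 0.0, 6.0],
})

# astype(str) converts each value separately, so the result is a Series of
# strings that lines up row by row with the other two columns.
check_2 = (
    transcript["customer_id"]
    + "--" + transcript["offer_id"]
    + "--" + transcript["time"].astype(str)
)
counts = check_2.value_counts()

# Alternative without string building: count rows per key combination.
group_sizes = transcript.groupby(["customer_id", "offer_id", "time"]).size()
```

Any entry in counts (or group_sizes) greater than 1 marks a customer/offer/time combination that appears more than once.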
Ahmad Chaiban
  • Hi thanks, what if I want to combine the two columns based on a condition such as (if transcript['time'] == 7 etc)? – wawawa Jul 28 '20 at 21:45
  • Maybe try filtering instead? The syntax in pandas is transcript[transcript['time']==7].astype(str). Hope it helps. – Ahmad Chaiban Jul 28 '20 at 21:51
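
For the conditional case raised in the comments, a rough sketch of two possible approaches, again on a made-up frame with placeholder IDs: filter with a boolean mask before combining, or combine every row and blank out the ones that fail the condition. The variable names here are illustrative only.

```python
import pandas as pd

# Hypothetical stand-in for the real `transcript` DataFrame (placeholder IDs).
transcript = pd.DataFrame({
    "customer_id": ["c1", "c2", "c3"],
    "offer_id": ["o1", "o2", "o3"],
    "time": [7.0, 0.0, 7.0],
})

# Boolean mask: True for the rows that satisfy the condition.
mask = transcript["time"] == 7

# Option 1: filter first, then combine only the matching rows.
subset = transcript.loc[mask]
combined_subset = subset["customer_id"] + "--" + subset["offer_id"]

# Option 2: keep every row; rows that fail the condition become NaN.
combined_all = (
    transcript["customer_id"] + "--" + transcript["offer_id"]
).where(mask)
```

Option 1 returns a shorter Series that keeps the original index labels of the matching rows; Option 2 keeps the full length, which is convenient if the result is assigned back as a new column.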