How do I select specific values in a column and add characters before them?

Question

How can I select all the values in the 'Funding' column that end with "M", remove the "M" and the "$", and add "0," before the value?

For example, from $535M to 0,535.

That's because I have both billion and million values. I've decided to express the whole column in billions, so values in millions must become 0,...

Here is an example. The DataFrame is the Unicorn Companies dataset from Kaggle.

df.head(10).to_dict()

{'Company': {0: 'Bytedance',
1: 'SpaceX',
2: 'SHEIN',
3: 'Stripe',
4: 'Klarna',
5: 'Canva',
6: 'Checkout.com',
7: 'Instacart',
8: 'JUUL Labs',
9: 'Databricks'},
'Valuation': {0: '$180B',
1: '$100B',
2: '$100B',
3: '$95B',
4: '$46B',
5: '$40B',
6: '$40B',
7: '$39B',
8: '$38B',
9: '$38B'},
'Date Joined': {0: '2017-04-07',
1: '2012-12-01',
2: '2018-07-03',
3: '2014-01-23',
4: '2011-12-12',
5: '2018-01-08',
6: '2019-05-02',
7: '2014-12-30',
8: '2017-12-20',
9: '2019-02-05'},
'Industry': {0: 'Artificial intelligence',
1: 'Other',
2: 'E-commerce & direct-to-consumer',
3: 'Fintech',
4: 'Fintech',
5: 'Internet software & services',
6: 'Fintech',
7: 'Supply chain, logistics, & delivery',
8: 'Consumer & retail',
9: 'Data management & analytics'},
'City': {0: 'Beijing',
1: 'Hawthorne',
2: 'Shenzhen',
3: 'San Francisco',
4: 'Stockholm',
5: 'Surry Hills',
6: 'London',
7: 'San Francisco',
8: 'San Francisco',
9: 'San Francisco'},
'Country': {0: 'China',
1: 'United States',
2: 'China',
3: 'United States',
4: 'Sweden',
5: 'Australia',
6: 'United Kingdom',
7: 'United States',
8: 'United States',
9: 'United States'},
'Continent': {0: 'Asia',
1: 'North America',
2: 'Asia',
3: 'North America',
4: 'Europe',
5: 'Oceania',
6: 'Europe',
7: 'North America',
8: 'North America',
9: 'North America'},
'Year Founded': {0: 2012,
1: 2002,
2: 2008,
3: 2010,
4: 2005,
5: 2012,
6: 2012,
7: 2012,
8: 2015,
9: 2013},
'Funding': {0: '$8B',
1: '$7B',
2: '$2B',
3: '$2B',
4: '$4B',
5: '$572M',
6: '$2B',
7: '$3B',
8: '$14B',
9: '$3B'},
'Select Investors': {0: 'Sequoia Capital China, SIG Asia Investments, Sina Weibo, Softbank Group', 1: 'Founders Fund, Draper Fisher Jurvetson, Rothenberg Ventures', 2: 'Tiger Global Management, Sequoia Capital China, Shunwei Capital Partners', 3: 'Khosla Ventures, LowercaseCapital, capitalG', 4: 'Institutional Venture Partners, Sequoia Capital, General Atlantic', 5: 'Sequoia Capital China, Blackbird Ventures, Matrix Partners', 6: 'Tiger Global Management, Insight Partners, DST Global', 7: 'Khosla Ventures, Kleiner Perkins Caufield & Byers, Collaborative Fund', 8: 'Tiger Global Management', 9: 'Andreessen Horowitz, New Enterprise Associates, Battery Ventures'}}

I did a similar manipulation with Valuation. Here is how I did it; I hope it's right.

df['Valuation'] = df['Valuation'].str.replace(
    "B","").str.replace(
    "$","").astype(int)
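
One thing to watch in the snippet above: in a regular expression `$` is the end-of-string anchor, and the default of the `regex` argument of `str.replace` has differed between pandas versions. A minimal sketch that passes `regex=False` explicitly, so both characters are treated literally, could look like this (the three-row frame is a stand-in built from the sample values, not the full dataset):

```python
import pandas as pd

# Small stand-in for the Kaggle data (values taken from the sample above).
df = pd.DataFrame({"Valuation": ["$180B", "$100B", "$95B"]})

# regex=False treats "$" and "B" as literal characters, not regex patterns.
df["Valuation"] = (
    df["Valuation"]
    .str.replace("$", "", regex=False)
    .str.replace("B", "", regex=False)
    .astype(int)
)
print(df["Valuation"].tolist())
```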

I've tried several approaches, but none of them works. Here are some of them:

df['Funding'] = np.where(df.Funding.str.contain("M"), 
                         df['Funding'] = ('0,'+ df['Funding']),
                        pass)

df['Funding'] = df['Funding'].str.replace(
    "B", "").str.replace(
    "$","").str.replace(
    "M","0,")

if df['Funding'].str.contains("M").any():
    df['Funding'] = df['Funding'].str.replace("M", "")

asd = "M"
if any(("M" in asd) for M in df['Funding']):
    df['Funding'].join((df['Funding'][:0],'0,',df['Funding'][0:])) and replace("M", "")

Thanks to all who want to help me. It's my first time with Python; I'm more familiar with R.

please provide a sample of the DataFrame as **text** – mozway May 23 '22 at 14:45

not_speshal · Answer 1 · 2022-05-23T15:11:12.400


If you want all your column values in billions, you can use:

df["Valuation"] = df["Funding"].str[1:-1].astype(int).where(
    df["Funding"].str.endswith("B"),
    df["Funding"].str[1:-1].astype(int).div(1000),
)

>>> df
  Funding  Valuation
0     $8B      8.000
1     $2B      2.000
2   $535M      0.535

Input df:

df = pd.DataFrame({"Funding": ["$8B", "$2B", "$535M"]})
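
Put together as a self-contained script, the same idea might be sketched as follows. The column name `Funding_B` and the intermediate names `amount` and `is_million` are illustrative choices, not part of the original answer, and the sketch assumes every entry has the form `$<number>B` or `$<number>M`; if the real column contains anything else, `astype` will raise.

```python
import pandas as pd

df = pd.DataFrame({"Funding": ["$8B", "$2B", "$535M"]})

# Numeric part: drop the leading "$" and the trailing unit letter.
amount = df["Funding"].str[1:-1].astype(float)

# Rows ending in "M" are millions; divide them by 1000 to express billions.
is_million = df["Funding"].str.endswith("M")
df["Funding_B"] = amount.where(~is_million, amount / 1000)

print(df)
```

Writing to a separate column keeps the original strings available for checking the conversion before overwriting anything.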

edited May 23 '22 at 15:11

answered May 23 '22 at 14:51

not_speshal


Then provide a better sample of your data using `df.head(10).to_dict()`. This works on dummy input. – not_speshal May 23 '22 at 15:09
I added to_dict() – Patrizio Iezzi May 23 '22 at 18:40
df['Funding'] = np.where(df["Funding"].str.endswith("M"),df["Funding"].str.replace("$","0,"), df["Funding"].str.replace("$","")) I did it, thank you!! – Patrizio Iezzi May 23 '22 at 18:43
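
If the goal is the text form with a decimal comma (for example `0,572`), a variant of the `np.where` approach from the comment above could be sketched like this. It is only an illustration: `Funding_str` and `digits` are hypothetical names, and it assumes million values are whole numbers below 1000, which are zero-padded to three digits so that `$50M` becomes `0,050` rather than `0,50`.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"Funding": ["$8B", "$572M", "$50M"]})

# Strip the leading "$" and the trailing unit letter, keeping the digits as text.
digits = df["Funding"].str[1:-1]

# Millions become "0,<three digits>"; billions keep their digits unchanged.
df["Funding_str"] = np.where(
    df["Funding"].str.endswith("M"),
    "0," + digits.str.zfill(3),
    digits,
)
print(df)
```

The result is still a column of strings, so it is suitable for display but not for arithmetic.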
