
I have a dataframe with several columns: order ID, date of sale, the product that was sold, and so on. I am trying to find the month in which the fewest Motorcycles were sold.

This is the code I wrote, using groupby to calculate the total amount sold in each month:

Motorcycles =sales_data.loc[sales_data['PRODUCTLINE'] == 'Motorcycles']
Motorcycles['ORDERDATE'] = pd.to_datetime(Motorcycles['ORDERDATE'])
Motorcycles.groupby(pd.Grouper(freq='M'))

Warning shown is:

:10: SettingWithCopyWarning: A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  Motorcycles['ORDERDATE'] = pd.to_datetime(Motorcycles['ORDERDATE'])

Error shown is: ERROR: TypeError: Only valid with DatetimeIndex, TimedeltaIndex or PeriodIndex, but got an instance of 'Int64Index'

I also tried the following, and it does not work either:

Motorcycles.set_index('ORDERDATE').resample('1M').sum()

df.head(10) results:

[Screenshot of the df.head(10) output; image not included]

icatalan

2 Answers


Maybe this will get you in the right direction.

import pandas as pd

sample = {'date': ['20200121', '20200121', '20200124', '20200222', '20200224', '20200225'],
          'Amount': [10000, 10000, 5000, 6000, 7000, 8000]}

df = pd.DataFrame(sample)

# Derive a year-month key (e.g. '202001') from the date strings
df['month'] = pd.to_datetime(df['date']).dt.strftime('%Y%m')

# Total amount per month
df.groupby('month')['Amount'].sum().reset_index()
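Applied to the data in the question, the same idea might look like the sketch below. The column names PRODUCTLINE and ORDERDATE come from the question; the SALES column and the sample rows are assumptions made up for illustration. The 'month' column has to be created on the filtered frame itself before grouping by it, otherwise the groupby raises a KeyError.

```python
import pandas as pd

# Hypothetical sample rows; only PRODUCTLINE and ORDERDATE are named in the question.
sales_data = pd.DataFrame({
    'PRODUCTLINE': ['Motorcycles', 'Motorcycles', 'Classic Cars', 'Motorcycles'],
    'ORDERDATE': ['2003-01-10', '2003-01-25', '2003-01-30', '2003-02-14'],
    'SALES': [100.0, 200.0, 999.0, 50.0],  # assumed amount column
})

# .copy() makes an independent frame, so adding a column does not
# trigger SettingWithCopyWarning.
motorcycles = sales_data.loc[sales_data['PRODUCTLINE'] == 'Motorcycles'].copy()

# Create the year-month key on the filtered frame itself.
motorcycles['month'] = pd.to_datetime(motorcycles['ORDERDATE']).dt.strftime('%Y%m')

# Total per month, then the month label with the smallest total.
monthly = motorcycles.groupby('month')['SALES'].sum().reset_index()
lowest = monthly.loc[monthly['SALES'].idxmin(), 'month']
```

Here `lowest` holds the year-month string of the month with the smallest total.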
Lukas Muijs
  • Thanks Lukas. I tried it but get the error: 'DataFrame' object is not callable, when using the code ```Motorcycles('month')['Amount'].sum().reset_index()``` that you suggested. I added a table to show the data I get until ```Motorcycles['ORDERDATE'] = pd.to_datetime(Motorcycles['ORDERDATE'])``` – icatalan Mar 14 '21 at 15:41
  • You forgot the groupby after Motorcycles; it should be Motorcycles.groupby('month')['Amount'].sum().reset_index() – Lukas Muijs Mar 14 '21 at 17:29
  • Thanks @Lukas Muijs, however, the error remains: " KeyError: 'month' " Could this error be, because 'month' isn't a variable? – icatalan Mar 14 '21 at 17:45

Setting the warning aside, the TypeError comes from the grouping step: pd.Grouper(freq='M') was given no key, so it groups on the DataFrame's index, which is an 'Int64Index', instead of on a date.

Why? Because without a key the grouper uses the index rather than the ORDERDATE column. Specify which column (a.k.a. key) you want to use and you should be good to go.

Motorcycles.groupby(pd.Grouper(key='ORDERDATE', freq='M'))
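The line above only builds the groupby object. A minimal sketch that carries it through to the month with the lowest total could look like this. PRODUCTLINE and ORDERDATE are from the question; the SALES column and the sample rows are assumptions for illustration. The sketch passes `pd.offsets.MonthEnd()` as the frequency, which is the offset-object form of the month-end alias and avoids depending on how a given pandas version spells that alias.

```python
import pandas as pd

# Hypothetical sample rows; only PRODUCTLINE and ORDERDATE are named in the question.
sales_data = pd.DataFrame({
    'PRODUCTLINE': ['Motorcycles', 'Motorcycles', 'Classic Cars', 'Motorcycles'],
    'ORDERDATE': ['2003-01-10', '2003-01-25', '2003-01-30', '2003-02-14'],
    'SALES': [100.0, 200.0, 999.0, 50.0],  # assumed amount column
})

# .copy() avoids SettingWithCopyWarning when the column is overwritten below.
motorcycles = sales_data.loc[sales_data['PRODUCTLINE'] == 'Motorcycles'].copy()
motorcycles['ORDERDATE'] = pd.to_datetime(motorcycles['ORDERDATE'])

# key= makes the grouper bin on the ORDERDATE column instead of the index.
monthly_sales = motorcycles.groupby(
    pd.Grouper(key='ORDERDATE', freq=pd.offsets.MonthEnd())
)['SALES'].sum()

# Label (month-end timestamp) of the month with the smallest total.
worst_month = monthly_sales.idxmin()
```

One thing to keep in mind with this approach: months inside the date range that have no rows still appear in the result with a sum of 0, so they can end up being reported as the minimum.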
Florian Fasmeyer
  • You're right @Florian. I've added it. Format seems to be correct, isn't it? – icatalan Mar 14 '21 at 15:36
  • First: Thanks for letting us see the top 10 rows, it helped me understand what was going on. Second: Sorry for the wait, I should have seen this instantly! Nowhere in the groupby do we specify which column to use. Question updated, enjoy! :) – Florian Fasmeyer Mar 14 '21 at 20:43