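import pandas as pd
# Index by creation date, then collect the unique companies seen in each week
companies = users_file2.set_index('Created').Clean_Company2
companies = companies.groupby(pd.Grouper(freq='W')).unique()
# Expand each week's array into rows, keeping only the first occurrence of each company
weekly = companies.apply(pd.Series).stack().drop_duplicates()
# Count the companies that remain in each week
weekly = weekly.groupby(level=0).apply(lambda x: x.tolist())
weekly = weekly.apply(len)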
I then performed drop_duplicates on weekly to get the following:
Created
2015-02-08 3
2015-02-15 1
2015-03-01 1
2015-06-21 8
2015-07-05 1
Now I would like to fill in the missing weeks with 0. I have played around with resample and reindex but get some odd errors. For example, when doing the following:
df.resample('W').fillna(0)
I get the following error:
AttributeError: 'int' object has no attribute 'lower'
If I comment out fillna(0), I get a DatetimeIndexResampler object. I checked it by plotting, and it does not do what I want: it does fill in all the weeks, but all the values come out binary, and I am not sure what it is actually doing.
The index is a DatetimeIndex, and the values are int64.
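To make the goal concrete, here is a minimal sketch of the result I am after. It rebuilds the series above by hand, so the names idx, filled, full_index and filled_reindex are illustrative and not part of my real code. It assumes that resample('W') followed by asfreq(), or a reindex against an explicit weekly date_range, is an acceptable way to materialize the missing weeks:

```python
import pandas as pd

# Rebuild the weekly counts shown above by hand (values copied from the output).
idx = pd.to_datetime(
    ["2015-02-08", "2015-02-15", "2015-03-01", "2015-06-21", "2015-07-05"]
)
weekly = pd.Series([3, 1, 1, 8, 1], index=idx)
weekly.index.name = "Created"

# Option 1: resample to weekly (Sunday-ending) bins and materialize them with
# asfreq(); weeks missing from the original become NaN, which fillna(0) replaces.
filled = weekly.resample("W").asfreq().fillna(0).astype("int64")

# Option 2: build the full weekly index explicitly and reindex against it,
# filling the newly created weeks with 0 directly.
full_index = pd.date_range(weekly.index.min(), weekly.index.max(), freq="W")
filled_reindex = weekly.reindex(full_index, fill_value=0)

print(filled.head())
```

Both options should give one row per week between the first and last date, with 0 for the weeks that had no entry and the original counts left untouched.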