I have the following pandas DataFrame:

Date        Price       pct_change  cumsum      ccdf

1927-12-30  17.660000   NaN         NaN         NaN
1928-01-03  17.760000   0.005663    0.005663    0.994337
1928-01-04  17.719999   -0.002252   0.003410    0.996590
1928-01-05  17.549999   -0.009594   -0.006183   1.006183
1928-01-06  17.660000   0.006268    0.000084    0.999916
1928-01-09  17.500000   -0.009060   -0.008976   1.008976
1928-01-10  17.370001   -0.007429   -0.016404   1.016404
1928-01-11  17.350000   -0.001151   -0.017556   1.017556
1928-01-12  17.469999   0.006916    -0.010639   1.010639
1928-01-13  17.580000   0.006297    -0.004343   1.004343
1928-01-16  17.290001   -0.016496   -0.020839   1.020839
1928-01-17  17.299999   0.000578    -0.020260   1.020260
1928-01-18  17.260000   -0.002312   -0.022572   1.022572
1928-01-19  17.379999   0.006952    -0.015620   1.015620
1928-01-20  17.480000   0.005754    -0.009866   1.009866
1928-01-23  17.639999   0.009153    -0.000713   1.000713
1928-01-24  17.709999   0.003968    0.003255    0.996745
1928-01-25  17.520000   -0.010728   -0.007473   1.007473
1928-01-26  17.629999   0.006278    -0.001195   1.001195
1928-01-27  17.690001   0.003403    0.002209    0.997791
1928-01-30  17.490000   -0.011306   -0.009097   1.009097
1928-01-31  17.570000   0.004574    -0.004523   1.004523
1928-02-01  17.530001   -0.002277   -0.006800   1.006800
1928-02-02  17.629999   0.005704    -0.001095   1.001095
1928-02-03  17.400000   -0.013046   -0.014141   1.014141
1928-02-06  17.450001   0.002874    -0.011267   1.011267
1928-02-07  17.440001   -0.000573   -0.011841   1.011841
1928-02-08  17.490000   0.002867    -0.008974   1.008974
1928-02-09  17.549999   0.003431    -0.005543   1.005543
1928-02-10  17.540001   -0.000570   -0.006113   1.006113

To calculate the ccdf I used 1 - cumsum, but I'm not sure whether that part is correct.

I want to plot the ccdf on the y axis, sorted from 0 to 1, and the pct_change on the x axis, sorted from 0 to negative infinity. I don't want any positive numbers.

I used the following code to remove positive pct_change numbers,

for cols in df.columns.tolist()[1:]:
    data = df.ix[df[cols] < 0]
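
As a side note, `.ix` was removed in pandas 1.0, and the loop above overwrites `data` on every pass, so only the last column's filter survives. A minimal sketch of the filtering step with a boolean mask, assuming only `pct_change` needs to be tested (the small sample frame here is hypothetical, built from the first rows of the table):

```python
import pandas as pd

# Hypothetical sample using the same column names as the table above.
df = pd.DataFrame({
    "Price": [17.66, 17.76, 17.719999, 17.549999],
    "pct_change": [float("nan"), 0.005663, -0.002252, -0.009594],
})

# Keep only rows with a negative return. NaN < 0 evaluates to False,
# so the first row (no previous price) is dropped as well.
negative = df.loc[df["pct_change"] < 0]
```

This leaves `df` untouched and puts the filtered rows in `negative`.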

I tried df.plot('cumsum','ccdf') and a few other commands but the plots don't make sense.

Someone seems to have done this in R in the question "How to plot CCDF graph on a logarithmic scale?", but I can't find anything like that in Python.

Any ideas?

The shape should be something like this if I'm not mistaken.

[Image: expected shape of the plot]

1 Answer

Something like this might work:

import matplotlib.pyplot as plt

plt.scatter(x=df['pct_change'].values, y=df['ccdf'].values)
plt.xlim(right=0)  # Cap the x axis at 0; the left limit is chosen automatically
plt.ylim(bottom=0, top=1)  # Clip the y axis to the range [0, 1]
plt.xlabel('pct_change')
plt.ylabel('ccdf')
plt.show()
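
Note that `1 - cumsum` of the returns is a running cumulative return, not a probability, which is why the `ccdf` column in the question goes above 1. If what you want is the empirical CCDF of the negative returns, one possible approach is to sort the loss magnitudes and compute, for each value, the fraction of observations at least that large. The sketch below assumes that definition; the function names `empirical_ccdf`, `negative_return_tail` and `plot_tail` are made up for illustration.

```python
import numpy as np

def empirical_ccdf(values):
    """Return the sorted values and P(X >= x) for each of them."""
    x = np.sort(np.asarray(values, dtype=float))
    # Number of observations strictly smaller than each x (handles ties).
    n_less = np.searchsorted(x, x, side="left")
    return x, 1.0 - n_less / x.size

def negative_return_tail(pct_change):
    """Return negative returns (closest to 0 first) and their tail probability."""
    r = np.asarray(pct_change, dtype=float)
    r = r[~np.isnan(r)]        # drop the NaN from the first row
    losses = -r[r < 0]         # keep negative returns only, as magnitudes
    loss, prob = empirical_ccdf(losses)
    return -loss, prob         # back to negative values for the x axis

def plot_tail(pct_change):
    import matplotlib.pyplot as plt
    x, y = negative_return_tail(pct_change)
    fig, ax = plt.subplots()
    ax.plot(x, y)
    ax.invert_xaxis()          # x axis runs from 0 towards more negative values
    ax.set_xlabel("pct_change")
    ax.set_ylabel("ccdf")
    plt.show()
```

Calling `plot_tail(df['pct_change'])` should give a curve that starts at 1 near zero and decays as the losses get larger. For a log-scale version like the linked R question, plotting the loss magnitudes (positive values) and calling `ax.set_xscale('log')` and `ax.set_yscale('log')` would be one option, since a log axis cannot show negative numbers.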