
I have 2 tables.

Table A has 105 rows:

           bbgid            dt  weekly_price_per_stock  weekly_pct_change
0   BBG000J9HHN8    2018-12-31               13562.328           0.000000
1   BBG000J9HHN8    2019-01-07               34717.536           1.559851
2   BBG000J9HHN8    2019-01-14               28300.218          -0.184844
3   BBG000J9HHN8    2019-01-21               35370.134           0.249818
4   BBG000J9HHN8    2019-01-28               36104.512           0.020763
... ... ... ... ...
100 BBG000J9HHN8    2020-11-30               62065.827           0.278765
101 BBG000J9HHN8    2020-12-07               62145.445           0.001283
102 BBG000J9HHN8    2020-12-14               63516.146           0.022056
103 BBG000J9HHN8    2020-12-21               51283.187          -0.192596
104 BBG000J9HHN8    2020-12-28               51306.951           0.000463

Table B has 257970 rows:

               bbgid            dt    weekly_price_per_stock    weekly_pct_change
0       BBG000B9WJ55    2018-12-31                 34.612737             0.000000
1       BBG000B9WJ55    2019-01-07                 70.618471             1.040245
2       BBG000B9WJ55    2019-01-14                 89.123337             0.262040
3       BBG000B9WJ55    2019-01-21                 90.377643             0.014074
4       BBG000B9WJ55    2019-01-28                 90.527678             0.001660
... ... ... ... ...
257965  BBG00YFR2NJ6    2020-12-21                 30.825000            -0.251275
257966  BBG00YFR2NJ6    2020-12-28                 40.960000             0.328792
257967  BBG00YM46B38    2020-12-14                  0.155900            -0.996194
257968  BBG00YM46B38    2020-12-21                  0.372860             1.391661
257969  BBG00YM46B38    2020-12-28                  0.535650             0.436598

Table A contains only one group of stocks (CCPM), but Table B contains many different stock groups. I want to run a linear regression of each Table B group's weekly_pct_change against Table A's (CCPM) weekly_pct_change, so I can see how the stocks in Table B move relative to the CCPM stocks over the period covered by the dt column. The problem is that Table A has only 105 rows, and when I group Table B by bbgid I always get more rows, so I get an error saying that X and y must be the same size.

Both tables have already been aggregated by week, and their pct_change has been calculated weekly. I need to compare the pct_change values from Table B with those from Table A, matched on date, taking one group from Table B at a time against the CCPM stocks' pct_change.
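To illustrate the matching on date described above, here is a minimal sketch on made-up data. The names `table_a`, `table_b`, `aligned` and `ccpm_pct_change` are hypothetical, and it assumes Table A has exactly one row per dt. Merging on dt attaches the CCPM value to every Table B row, so X and y end up the same length within each bbgid group.

```python
import pandas as pd

# Toy stand-ins for the two tables (values are invented for illustration).
table_a = pd.DataFrame({
    "bbgid": ["CCPM"] * 3,
    "dt": pd.to_datetime(["2019-01-07", "2019-01-14", "2019-01-21"]),
    "weekly_pct_change": [0.10, -0.05, 0.02],
})
table_b = pd.DataFrame({
    "bbgid": ["X", "X", "X", "Y", "Y", "Y"],
    "dt": pd.to_datetime([
        "2019-01-07", "2019-01-14", "2019-01-21",
        "2019-01-14", "2019-01-21", "2019-01-28",  # last date is missing from table_a
    ]),
    "weekly_pct_change": [0.20, -0.10, 0.04, 0.03, 0.00, 0.01],
})

# Keep only the date and the CCPM change, renamed so it does not clash
# with Table B's own weekly_pct_change column.
ccpm = table_a[["dt", "weekly_pct_change"]].rename(
    columns={"weekly_pct_change": "ccpm_pct_change"}
)

# Inner join on dt: rows of table_b whose date is absent from table_a are dropped.
# validate="many_to_one" raises if table_a has duplicate dates (the stated assumption).
aligned = table_b.merge(ccpm, on="dt", how="inner", validate="many_to_one")

print(aligned.groupby("bbgid").size())
```

After the merge, each row carries both the stock's own weekly_pct_change and the CCPM value for the same week, so any per-group regression sees two equally long columns.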

I would like to extract the slope from each regression and store the slopes in a column of the same table, each associated with its corresponding group.
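The desired result could look roughly like the sketch below. It is only an illustration on invented data, not a tested solution for the real tables: the names `ccpm`, `aligned`, `slope_vs_ccpm` and the `slope` column are hypothetical, and `np.polyfit` with degree 1 is used here as one way to get an ordinary least-squares slope.

```python
import numpy as np
import pandas as pd

# Invented CCPM weekly changes, one row per date (assumption).
dates = pd.to_datetime(["2019-01-07", "2019-01-14", "2019-01-21", "2019-01-28"])
ccpm = pd.DataFrame({"dt": dates, "ccpm_pct_change": [0.10, -0.05, 0.02, 0.04]})

# Invented Table B: X moves as 2 * CCPM, Y as -0.5 * CCPM + 0.01 (and has one week fewer).
table_b = pd.DataFrame({
    "bbgid": ["X"] * 4 + ["Y"] * 3,
    "dt": list(dates) + list(dates[1:]),
    "weekly_pct_change": [0.20, -0.10, 0.04, 0.08, 0.035, 0.0, -0.01],
})

# Attach the CCPM value for the same week to every Table B row.
aligned = table_b.merge(ccpm, on="dt", how="inner")

def slope_vs_ccpm(group):
    """Least-squares slope of the group's pct_change on the CCPM pct_change."""
    x = group["ccpm_pct_change"]
    y = group["weekly_pct_change"]
    # A line needs at least two distinct x values; otherwise report NaN.
    if x.nunique() < 2:
        return np.nan
    return np.polyfit(x, y, 1)[0]

# One slope per bbgid, as a Series indexed by bbgid.
slopes = (
    aligned.groupby("bbgid")[["ccpm_pct_change", "weekly_pct_change"]]
    .apply(slope_vs_ccpm)
)

# Store each group's slope next to its rows in the original table.
table_b["slope"] = table_b["bbgid"].map(slopes)
print(table_b)
```

Mapping the per-group Series back through bbgid repeats the slope on every row of the group, which keeps everything in one table. Rows with missing pct_change values would need to be dropped before fitting.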

I have tried the solutions in this post and this post without success.

Is there a workaround for this, or am I doing something wrong? Please help me fix this.

Thank you very much in advance.

Miguel 2488
  • You have to match these stock groups somehow to be able to do a linear regression. It looks like date is your key, and you have to compare table A to each stock in table B, per group. – Erfan Feb 27 '21 at 12:36
  • Yes. That's exactly what I want. – Miguel 2488 Feb 27 '21 at 12:48
