-1

I am working on a homework problem where I am supposed to construct a portfolio of stocks based on certain constraints. Based on these constraints I will select my stocks in the portfolio and will conduct re-balancing of my portfolio based on these scores on a monthly basis. So I have two dataframes - one is the stock returns and the other is the scores on which I will pcik my stocks in the portfolio. Further, the data frame shape is 12x360, i.e. the dates are the index and the column headers are the stocks. I have a 360 stocks universe and have to pick top 30 and the bottom 30 based on the scores.

Since I am new to python, I am having difficulty in visualizing how I can pick stocks from a dataframe based on values in another dataframe. Secondly, is it possible to select the top 30 and bottom 30 scores across a single row, since my dates are the index of the data frame.

Below is the initial code I have prepared for cleaning and calculating the returns. If anybody can guide me on the next step, I will be very grateful.

import pandas as pd
import numpy as np


def log_return(price):
    return np.log(price) - np.log(price.shift(1)) 

dataframe = pd.read_excel(r'M:\Overlay\09_Projects\Madhav\Trial.xlsx')
df1 = dataframe.iloc[:,0:14]
df1 = df1.transpose()

df1.columns = df1.iloc[0]
df1 = df1.drop(df1.index[[0,1]])


for cols in df1:
    df1[cols] = pd.to_numeric(df1[cols])

df1_ret = log_return(df1)

df2 = dataframe
df2.drop(df2.iloc[:, 1:14], axis = 1, inplace=True)
df2 = df2.transpose()
df2.columns = df2.iloc[0]
df2 = df2.drop(df2.index[0])

for cols in df2:
    df2[cols] = pd.to_numeric(df2[cols])

1 Answers1

0

If I understand correctly, you have two data frames: one with the stocks(more specifically tickers?) and another one with the scores. To pick the thirty highest/lowest just do (across a column called scores):

top_30 = df.nlargest(30,'scores')
bottom_30= df.nsmallest(30,'scores')

I am not sure how you are linking the scores to the stocks (I am guessing you don't know which one is which just by looking at the number) but one option would be to merge the two data frames together so it gets easier to make operations on them. So if you have a data frame like this:

ticker          score 
AAPL              10
XOM               20
PINS              30

then it is easy to make a mapping of your score data frame: (assuming your first data frame is called df_tickers, the second is df_scores and both have the common column called "tickers")

df_scores = df[['ticker','scores']] 
df_scores = df_scores.set_index('ticker').to_dict()['scores']
df_tickers['scores'] = df_tickers['ticker'].replace(df_scores )

  • Thank you for the reply. Both the dataframes have the stock tickers. One dataframe is tickers with the stock returns and the other is tickers with the scores. Now The tricky bit for me is that the tickets are column heads and not rows. So I will have to iterate over every row (as a single row is one data point). I understand that the tickers can be used to select the stocks from the other dataframe – madhav mundra Aug 02 '19 at 13:38
  • you can't pivot them to make the tickers rows? that would make it 10x more intuitive for you –  Aug 02 '19 at 13:50
  • It would be a lot more intuitive I agree, but since my analysis is a time series I would like the time data points to be the rows and the tickers to be the column heads – madhav mundra Aug 02 '19 at 14:11