
I am trying to build a DataFrame of historical daily data for the Nifty 50 index: the number of stocks advancing and declining each day, with their respective volumes. Being new to Python, I am having trouble handling pandas DataFrames and conditions.

Below is the code that I wrote, but it has several issues:

  1. The line `df.index = data.index` raises an error: ValueError: Length mismatch: Expected axis has 0 elements, new values have 248 elements

  2. If I comment out that line, where I set the index of the empty DataFrame, the code runs but gives an empty DataFrame at the end.

    from datetime import date, timedelta
    import nsepy
    import pandas as pd
    
    #setting default dates
    end_date = date.today()
    start_date = end_date - timedelta(365)
    
    #Deriving the names of 50 stocks in Nifty 50 Index
    nifty_50 = pd.read_html('https://en.wikipedia.org/wiki/NIFTY_50')
    
    nifty50_symbols = nifty_50[1][1]
    
    df = pd.DataFrame(columns = {'Advances','Declines','Adv_Volume','Dec_Volume'})
    
    
    for x in nifty50_symbols:
        data = nsepy.get_history(symbol = x, start=start_date, end=end_date)
        sclose = data['Close']
        sopen = data['Open']
        svol = data['Volume']
    
    
     ##    df.index = data.index
    
     ##   for i in df.index: --- since the df.index assignment was commented out, df.index was empty
        for i in data.index:
            if sclose > sopen:
                df['Advances'] = df['Advances'] + 1
                df['Adv_Volume'] = df['Adv_Volume'] + svol
    
            elif sopen > sclose:
                df['Declines'] = df['Declines'] + 1
                df['Dec_Volume'] = df['Dec_Volume'] + svol
    
    
    print(df.tail())
    

Output :

Empty DataFrame
Columns: [Dec_Volume, Declines, Advances, Adv_Volume]
Index: []

EDIT: I found the reason the code was giving an empty DataFrame: df.index was empty, so the loop body never ran and the if statement was never reached. When I changed the loop to iterate over data.index, the if statement was reached. But now I do not know how to write the if statements, since they raise this error:

ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

EDIT2: Updated code with the help of Akshay Nevrekar. I am still getting an empty DataFrame at the end. I also need to set the index of df to the dates in data.index, so that I can later relate the advances/declines to their respective dates.

    from datetime import date, timedelta
    import nsepy as ns
    import pandas as pd

    #setting default dates
    end_date = date.today()
    start_date = end_date - timedelta(365)

    #Deriving the names of 50 stocks in Nifty 50 Index
    nifty_50 = pd.read_html('https://en.wikipedia.org/wiki/NIFTY_50')

    nifty50_symbols = nifty_50[1][1]

    df = pd.DataFrame(columns = {'Advances','Declines','Adv_Volume','Dec_Volume'})


    for x in nifty50_symbols:
        data = ns.get_history(symbol = x, start=start_date, end=end_date)
    ##    sclose = data['Close']
    ##    sopen = data['Open']
    ##    svol = data['Volume']

    ##    df.index = data.index

        for i in data.index:
            sclose=data.loc[i]['Close']
            sopen=data.loc[i]['Open']
            svol = data.loc[i]['Volume']

            if sclose > sopen :
                df['Advances'] = df['Advances'] + 1
                df['Adv_Volume'] = df['Adv_Volume'] + svol

            elif sopen > sclose :
                df['Declines'] = df['Declines'] + 1
                df['Dec_Volume'] = df['Dec_Volume'] + svol


    print(df)

Abinash Tripathy

1 Answer


You are creating an empty DataFrame:

    df = pd.DataFrame(columns = {'Advances','Declines','Adv_Volume','Dec_Volume'})

so df.index will be empty, and in

    for i in df.index:
        if sclose > sopen:
            df['Advances'] = df['Advances'] + 1
            df['Adv_Volume'] = df['Adv_Volume'] + svol

        elif sopen > sclose:
            df['Declines'] = df['Declines'] + 1
            df['Dec_Volume'] = df['Dec_Volume'] + svol

the body of the for loop above will never run. That's why you are getting an empty DataFrame.
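To make that concrete, here is a minimal sketch of the same situation with no stock data at all. It uses a list of column names instead of the set from the question, which is my own choice so that the column order stays fixed; the counter `iterations` is only there for illustration.

```python
import pandas as pd

# A list keeps the column order; the set used in the question is unordered.
df = pd.DataFrame(columns=["Advances", "Declines", "Adv_Volume", "Dec_Volume"])

# The frame has four columns but no rows, so its index has nothing to iterate over.
iterations = 0
for i in df.index:
    iterations += 1

# Adding to a column of an empty frame is allowed, but there are no rows to change.
df["Advances"] = df["Advances"] + 1

print(iterations)  # 0
print(df.empty)    # True
```

So even once the loop runs over data.index instead, `df['Advances'] = df['Advances'] + 1` has no rows to update until df is given an index.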

Edit: After your edit, I would suggest assigning sclose, sopen and svol inside the for loop. At the moment you are assigning an entire column to each variable instead of a single value, so `sclose > sopen` compares two whole Series.

    for i in data.index:
        sclose=data.iloc[i]['Close']
        sopen=data.iloc[i]['Open']
        svol=data.iloc[i]['Volume']

        if sclose > sopen:
            df.loc[i]['Advances'] = df.loc[i]['Advances'] + 1
            df.loc[i]['Adv_Volume'] = df.loc[i]['Adv_Volume'] + svol
        elif sopen > sclose:
            df.loc[i]['Declines'] = df.loc[i]['Declines'] + 1
            df.loc[i]['Dec_Volume'] = df.loc[i]['Dec_Volume'] + svol
Sociopath
  • Thank you, do I need to make a new question for the remaining error(s)? Since this answer satisfies the Title but I am still having issue with getting the desired output. – Abinash Tripathy Feb 05 '18 at 09:30
  • .iloc of pandas gave me this error - TypeError: cannot do positional indexing on with these indexers [2017-02-06] of – Abinash Tripathy Feb 05 '18 at 09:33
  • use `loc` instead of `iloc` then. I thought you are having integers as an index. – Sociopath Feb 05 '18 at 09:35
  • Well the code ran but still getting empty dataframe when I print df at the end, also I still do not know how to set the index of DF same as data.index, so I can relate the Advances/Declines data with Date. – Abinash Tripathy Feb 05 '18 at 09:43
  • use `loc` again where you are assigning the values. `df.loc[i][column]= ` . – Sociopath Feb 05 '18 at 09:48
  • I got the error KeyError: 'the label [2017-02-06] is not in the [index]', because the index of the DataFrame df is empty. I cannot create df inside the loop, because that would reset its data every time. Defining it outside the loop means I do not yet have the data to fill its index. Hence I tried filling the index inside the loop with df.index = data.index, but that also returned an error – Abinash Tripathy Feb 05 '18 at 10:03
  • I think the formula you got here is wrong. `df.loc[i]['Advances'] = df.loc[i]['Advances'] + 1` You are trying to add 1 to the value even before initializing it. – Sociopath Feb 05 '18 at 10:07
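The last two comments can be illustrated with a small sketch. It assumes two hard-coded dates and only two of the four columns, purely for illustration. Reading a date label from a frame with no rows raises KeyError, while a frame created with zeros on the date index can be incremented. It also uses `df.loc[row, column]` in a single call, since the chained form `df.loc[i]['Advances'] = ...` can end up writing to a temporary copy rather than to df.

```python
import pandas as pd

dates = pd.to_datetime(["2017-02-06", "2017-02-07"])

# Reading a label from a frame with no rows fails: the KeyError from the comments.
empty = pd.DataFrame(columns=["Advances", "Declines"])
try:
    empty.loc[dates[0], "Advances"]
    lookup_failed = False
except KeyError:
    lookup_failed = True

# Initialise every cell to 0 on the date index before adding to it.
counts = pd.DataFrame(0, index=dates, columns=["Advances", "Declines"])

# One .loc call with (row, column) reads and writes the same cell.
counts.loc[dates[0], "Advances"] += 1

print(lookup_failed)
print(counts)
```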