1

I'm pulling stock quote data from Yahoo Finance for multiple stocks in 'acoes', but the pd.melt() function is not returning the stock symbols, only the other data columns (result shown in the image).

import pandas as pd
import datetime
import yfinance as yf

acoes = ["PETR4.SA", "BOVA11.SA", "VALE3.SA"]

hoje = datetime.datetime.now()
data_final = hoje - datetime.timedelta(days=30)

dados_acoes = yf.download(acoes, start=data_final, end=hoje)

dados_acoes = dados_acoes.reset_index()

volume_financeiro = dados_acoes['Volume'] * dados_acoes['Close']
volume_financeiro_label = "Volume Financeiro"

volume_financeiro.columns = pd.MultiIndex.from_product([[volume_financeiro_label],volume_financeiro.columns.unique()])

final_df = pd.concat([dados_acoes,volume_financeiro],axis=1)

dados_acoes_melzinho = pd.melt(final_df, id_vars='Date', var_name='Ativo', value_name='VF')

display(dados_acoes_melzinho)

Dataframe Result

How can I display the stock codes in the 'Ativo' column?

  • pd.melt doesn't play well with indexes it removes them. `reset_index` on dataframe to move all the index levels into columns then pd.melt that dataframe retaining all column information. – Scott Boston Apr 19 '23 at 02:12

2 Answers2

1

You are melting the DataFrame final_df which has columns with MultiIndex, and the column names are tuples.

But the column names for your stocks are actually the second element of these tuples.

You can try to extract the stock symbols first, and create a new column with them before you melt the DataFrame.

Change this:

dados_acoes_melzinho = pd.melt(final_df, id_vars='Date', var_name='Ativo', value_name='VF')

to this:

final_df['Ativo'] = final_df['Close'].columns.get_level_values(0)
dados_acoes_melzinho = pd.melt(final_df, id_vars=['Date', 'Ativo'], value_name='VF')
Bryce Chan
  • 1,639
  • 11
  • 26
  • It did not work. I received an error message "ValueError: Length of values (3) does not match length of index (22)"...What could be the problem? – Hugo Carvalho Apr 19 '23 at 22:59
0

Use index operations to reshape your dataframe and keep multi-index information:

>>> (final_df.set_index('Date').rename_axis(columns=[None, 'Symbol'])
             .stack(level='Symbol').reset_index())

         Date     Symbol   Adj Close       Close        High         Low  \
0  2023-03-20   PETR4.SA   22.930000   22.930000   23.650000   22.889999   
1  2023-03-20   VALE3.SA   83.410004   83.410004   83.989998   81.809998   
2  2023-03-21   PETR4.SA   23.400000   23.400000   23.600000   23.080000   
3  2023-03-21   VALE3.SA   82.709999   82.709999   83.440002   81.540001   
4  2023-03-22   PETR4.SA   23.330000   23.330000   23.620001   23.040001   
..        ...        ...         ...         ...         ...         ...   
48 2023-04-17   PETR4.SA   26.709999   26.709999   26.799999   26.209999   
49 2023-04-17   VALE3.SA   77.870003   77.870003   80.440002   77.449997   
50 2023-04-18  BOVA11.SA  102.699997  102.699997  103.050003  101.739998   
51 2023-04-18   PETR4.SA   27.389999   27.389999   27.620001   26.520000   
52 2023-04-18   VALE3.SA   78.540001   78.540001   78.930000   77.660004   

          Open      Volume  Volume Financeiro  
0    23.510000  57575200.0       1.320199e+09  
1    82.720001  19650900.0       1.639082e+09  
2    23.200001  43391200.0       1.015354e+09  
3    83.220001  15692200.0       1.297902e+09  
4    23.389999  46435600.0       1.083343e+09  
..         ...         ...                ...  
48   26.240000  49427100.0       1.320198e+09  
49   79.480003  21479200.0       1.672585e+09  
50  102.989998   7350557.0       7.549022e+08  
51   26.709999  99279800.0       2.719274e+09  
52   78.790001  15115300.0       1.187156e+09  

[53 rows x 9 columns]

Note: you can remove .reset_index() if you want to keep Date and Symbol columns as index.

EDIT

I tried to use resample() to get the monthly average on 'Volume Financeiro'

Try:

>>> final_df.resample('M', on='Date')['Volume Financeiro'].mean()
               BOVA11.SA      PETR4.SA      VALE3.SA
Date                                                
2023-03-31  7.735287e+08  1.167753e+09  1.512644e+09
2023-04-30  7.122048e+08  1.594721e+09  1.747704e+09
Corralien
  • 109,409
  • 8
  • 28
  • 52
  • When I try to manipulate the 'Symbol' column, I receive the error message "KeyError: ['Symbol'] not in index". I am trying to obtain the following result: | | Symbol | Volume Financeiro | |--- |------------ | --------------------- | | 0 |PETR4.SA | 1.320199e+09 | | 1 |VALE3.SA | 1.639082e+09 | | 2 |BOVA11.SA | 1.015354e+09 | I tried to use resample() to get the monthly average on 'Volume Financeiro', but I couldn't, every time I receive the error message "['Symbol'] not in index". – Hugo Carvalho Apr 19 '23 at 22:55
  • The resample() worked, now how to melt() the dataframe? ``` Symbol Volume Financeiro 0 BOVA11.SA 7.122048e+08 1 PETR4.SA 1.594721e+09 2 VALE3.SA 1.747704e+09 ``` – Hugo Carvalho Apr 20 '23 at 14:01
  • After `mean`, you can append `.reset_index().melt('Date', var_name='Symbol', value_name='Volume Financeiro')`. Is it what you expect? – Corralien Apr 20 '23 at 14:07