To do this, I have a list of lists (these are my clusters), for example:

asset_clusts=[[0,1],[3,5],[2,4, 12],...]

and the original dataframe (called 'x' in my code) holds the return time series of S&P 500 companies.

I want to select columns [0, 1] of the original dataframe, compute their row-wise mean, and store it as a column in a new dataframe; then compute the row-wise mean of columns [3, 5] and add it to the new dataframe as another column, and so on.

mu=pd.DataFrame() 
for j in range(get_number_of_elements(asset_clusts)):
    mu=x.iloc[:,asset_clusts[j]].mean(axis=1)

But it gives me only one column, and I checked: that one column is the mean of the last cluster's columns.

In case it is unclear, the function get_number_of_elements is:

def get_number_of_elements(clist):
    count = 0
    for element in clist:
        count += 1
    return count
Farhad

2 Answers

def get_number_of_elements(clust_list):
    count = 0
    for element in clust_list:
        count += 1
    return count
Deven Ramani
  • Thank you, but my problem is not the function. I made the change you suggested, but the "mu" dataframe is still just the mean of the last cluster's columns and does not show the mean columns of the other clusters. @Deven – Farhad Dec 31 '21 at 11:59

I solved it; in case it is helpful for others, here is the final function:

import pandas as pd

def clustered_series(x, org_asset_clust):
    """
    x: return data (DataFrame, one column per asset)
    org_asset_clust: list of clusters (each a list of column positions)
    ----> row-wise mean return of each cluster, one column per cluster
    """
    mu = []
    for cluster in org_asset_clust:
        # mean across this cluster's columns, computed row by row
        mu.append(x.iloc[:, cluster].mean(axis=1))
    # concatenate once, after the loop, so each cluster becomes one column
    cluster_mean = pd.concat(mu, axis=1)

    return cluster_mean
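
As a quick check of the idea, here is a minimal sketch on made-up data. The toy dataframe, its column names and the cluster list are assumptions for illustration only, not the S&P 500 data. It uses a dict comprehension with pd.concat as an alternative to appending in a loop, so the cluster numbers become the column labels:

```python
import pandas as pd

# Toy return data (hypothetical values): 3 dates x 4 assets
x = pd.DataFrame(
    {
        "A": [1.0, 2.0, 3.0],
        "B": [3.0, 4.0, 5.0],
        "C": [10.0, 20.0, 30.0],
        "D": [30.0, 40.0, 50.0],
    }
)
asset_clusts = [[0, 1], [2, 3]]  # hypothetical clusters of column positions

# One Series of row-wise means per cluster, keyed by cluster number;
# pd.concat on a dict uses the keys as column labels when axis=1
cluster_mean = pd.concat(
    {j: x.iloc[:, cols].mean(axis=1) for j, cols in enumerate(asset_clusts)},
    axis=1,
)
print(cluster_mean)
```

The loop in the question kept only the last cluster because `mu = ...` rebinds `mu` on every iteration; collecting one Series per cluster and concatenating once avoids that.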
Farhad