I am trying to convert a formatted string into a pandas data frame.

[['CD_012','JM_022','PT_011','CD_012','JM_022','ST_049','MB_021','MB_021','CB_003'
,'FG_031','PC_004'],['NL_003','AM_006','MB_021'],
['JA_012','MB_021','MB_021','MB_021'],['JU_006'],
['FG_002','FG_002','CK_055','ST_049','NM_004','CD_012','OP_002','FG_002','FG_031',
'TG_005','SP_014'],['FG_002','FG_031'],['MD_010'],
['JA_012','MB_021','NL_003','MZ_020','MB_021'],['MB_021'],['PC_004'],
['MB_021','MB_021'],['AM_006','NM_004','TB_006','MB_021']]

I am trying to use the pandas.DataFrame constructor to do so, but the whole string ends up in a single cell of the DataFrame.

Ch3steR
Sean goodlip
  • what is the expected output? – Rafael Neves Jan 28 '20 at 18:07
  • The expected output should be a DataFrame with all of the itemsets in brackets as elements – Sean goodlip Jan 28 '20 at 18:08
  • that we understood but show us for above sample data what should be your expected output – Sociopath Jan 28 '20 at 18:13
  • `array = [[][]....[][]] df = pd.DataFrame(array)` – SKPS Jan 28 '20 at 18:13
  • @AkshayNevrekar I do not understand the question. I will try to explain this as clearly as possible. Since I have formatted my string as a DataFrame, I would like to insert all elements into a DataFrame. In this case an element is along the lines of `['CD_012','JM_022','PT_011','CD_012','JM_022','ST_049','MB_021','MB_021','CB_003' ,'FG_031','PC_004'],['NL_003','AM_006','MB_021']`. When I try to do this, the data is inserted into only one record in the DataFrame. – Sean goodlip Jan 28 '20 at 18:17
  • @SathishSanjeevi how do you propose that I convert the string into an array please? – Sean goodlip Jan 28 '20 at 18:19
  • @Seangoodlip: Just assign your data to a variable and then use `pd.DataFrame(variable)`. – SKPS Jan 28 '20 at 18:20
  • @SathishSanjeevi tried this already but it does not work. – Sean goodlip Jan 28 '20 at 18:27

2 Answers

Is this what you mean?

import pandas as pd


list_of_lists = [['CD_012','JM_022','PT_011','CD_012','JM_022','ST_049','MB_021','MB_021','CB_003'
                ,'FG_031','PC_004'],['NL_003','AM_006','MB_021'],
                ['JA_012','MB_021','MB_021','MB_021'],['JU_006'],
                ['FG_002','FG_002','CK_055','ST_049','NM_004','CD_012','OP_002','FG_002','FG_031',
                'TG_005','SP_014'],['FG_002','FG_031'],['MD_010'],
                ['JA_012','MB_021','NL_003','MZ_020','MB_021'],['MB_021'],['PC_004'],
                ['MB_021','MB_021'],['AM_006','NM_004','TB_006','MB_021']]


result = pd.DataFrame({'result': list_of_lists})
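
The snippet above assumes the data is already a Python list. If it arrives as a string instead, as the question suggests, one possible way to get `list_of_lists` is `ast.literal_eval` from the standard library. This is only a sketch: `raw` is a shortened, made-up stand-in for the real string, and `wide` is a hypothetical name for an alternative layout with one item per column.

```python
import ast

import pandas as pd

# Shortened stand-in for the real input string (assumption: it is a valid
# Python literal, i.e. a list of lists of quoted strings).
raw = "[['NL_003','AM_006','MB_021'],\n['JU_006'],['FG_002','FG_031']]"

# Safely parse the literal into a real list of lists (no arbitrary code runs).
list_of_lists = ast.literal_eval(raw)

# One row per itemset, each cell holding the whole list.
result = pd.DataFrame({'result': list_of_lists})

# Alternative layout: one item per column, shorter itemsets padded with missing values.
wide = pd.DataFrame(list_of_lists)

print(result)
print(wide)
```

Which layout is preferable depends on what the itemsets are used for afterwards; the single `result` column keeps each itemset intact, while `wide` is easier to inspect item by item.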
Rafael Neves

One approach is to split the string on the '],[' delimiter and then convert the resulting pieces to a DataFrame.


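import re
import pandas as pd

def stringToDF(s):
    # Drop the outer '[[' and ']]' (assumes the string starts and ends with them),
    # then split on '],[' while allowing whitespace or newlines around the comma
    array = re.split(r"\]\s*,\s*\[", s.strip()[2:-2])

    # One row per itemset; each row is still a raw string such as "'NL_003','AM_006'"
    df = pd.DataFrame(data=array)

    print(df)
    return df

# Example input; replace with your own string
s = "[['NL_003','AM_006','MB_021'],\n['JU_006']]"
stringToDF(s)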
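
Splitting on the delimiter leaves every itemset as one raw string. If individual item codes are needed, a rough sketch using regular expressions could look like the following; `split_itemsets` and the sample string are hypothetical, and the pattern assumes items are single-quoted and contain no brackets or quotes themselves.

```python
import re

import pandas as pd


def split_itemsets(s):
    """Return a list of lists of item codes parsed from a bracketed string."""
    # Grab the contents of every innermost [...] group.
    groups = re.findall(r"\[([^\[\]]*)\]", s)
    # Inside each group, pull out the text between single quotes.
    return [re.findall(r"'([^']*)'", group) for group in groups]


# Made-up sample in the same shape as the data in the question.
sample = "[['NL_003','AM_006'],\n['JU_006']]"
rows = split_itemsets(sample)

# One row per itemset, one item per column; shorter rows are padded.
df_items = pd.DataFrame(rows)
print(df_items)
```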

Good luck!