read from txt file and convert into dataframe in python

Question

I have a txt file as follows:

sub_ID: ['sub-01','sub-02']

ses_ID: ['ses-01','ses-01']

mean: [0.3456,0.446]

I want to read this and convert it to a dataframe like the one in the image. Don't mind the values in the mean_e_field column; it's just an example. The values should be the same as in the txt file. (image: desired dataframe)

I tried the following and got the result below, but I can't transform it into my preferred dataframe: (image: dataframe)

data = pd.read_csv(filename, sep=",", header=None)
data

I appreciate your answers in advance.

You said that you've tried using `read_csv`, but the example data you've provided is not in a csv format (in fact, it seems like YAML). Is your data presented in the format above, i.e. one line per column and a list of values? — filpa, Jan 10 '23 at 16:05
Yes, my data is a txt file with each list on a separate line. I want to convert it to a dataframe where the first element is the column name and the others are row values. With read_csv in pandas I could automatically convert my txt file into a dataframe, but the dataframe I want is different from the one I got. — gulo1221, Jan 10 '23 at 16:11

score 1 · Accepted Answer · answered Jan 10 '23 at 16:22

So, several things here.

The reason why your previous data = pd.read_csv(filename, sep=",", header=None) did not work is that sep="," tells pandas to split on commas, and it treats every single line of the file as a row to be split. So, the line sub_ID: ['sub-01','sub-02'] is split into two fields: sub_ID: ['sub-01' and 'sub-02'].
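The split can be reproduced without pandas. This is a minimal sketch using the standard-library csv module, which also splits on commas by default; it only illustrates the behaviour and is not how pandas parses the file internally.

```python
import csv

# One line from the txt file, exactly as in the question.
line = "sub_ID: ['sub-01','sub-02']"

# csv.reader splits on commas; single quotes are not treated specially.
fields = next(csv.reader([line]))

print(fields)
# ["sub_ID: ['sub-01'", "'sub-02']"]
```

The column name, the brackets and the quotes all end up inside the field values, which is why the resulting dataframe is hard to reshape.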

The example data you've provided seems to be in YAML format:

sub_ID: [ 'sub-01','sub-02' ]

ses_ID: [ 'ses-01','ses-01' ]

mean: [ 0.3456,0.446 ]

If it were CSV, the data would look as follows (it does not):

sub_ID,ses_ID,mean
sub-01,ses-01,0.3456
sub-02,ses-01,0.446

To read this data into a dataframe, you will either need to preprocess it into another format (e.g. csv) or read it as YAML into a dict and pass that to pandas.DataFrame.

For example:

import yaml

with open("data.txt", "r") as file:
    try:
        # safe_load parses the YAML text and returns a dict of lists.
        data = yaml.safe_load(file)
    except yaml.YAMLError as exc:
        # Report the parse error and re-raise: data is undefined if parsing failed.
        print(exc)
        raise

print(data)
# {'sub_ID': ['sub-01', 'sub-02'], 'ses_ID': ['ses-01', 'ses-01'], 'mean': [0.3456, 0.446]}

After that, you can create a DataFrame from this dict:

import pandas as pd
df = pd.DataFrame(data)
df.head()


+-----+--------+--------+--------+
|     | sub_ID | ses_ID |  mean  |
+-----+--------+--------+--------+
|   0 | sub-01 | ses-01 | 0.3456 |
|   1 | sub-02 | ses-01 |  0.446 |
+-----+--------+--------+--------+

as desired.

If you have certain entries that are not valid YAML, you will need to preprocess the data before loading it into pandas.
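One possible way to do that preprocessing is sketched below, assuming every non-blank line has the form name: [values] and that the part after the colon is a valid Python literal. The helper name parse_key_list_lines is hypothetical, and ast.literal_eval is swapped in for the YAML parser here; a line that does not fit this shape raises an error rather than being skipped.

```python
import ast


def parse_key_list_lines(text):
    """Parse lines of the form `name: [v1, v2, ...]` into a dict of lists."""
    columns = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line:
            continue  # skip the blank lines between entries
        # Split on the first colon only, so values may contain colons.
        name, _, value = line.partition(":")
        # literal_eval only evaluates Python literals (lists, strings, numbers).
        columns[name.strip()] = ast.literal_eval(value.strip())
    return columns


# Sample text copied from the question.
sample = """sub_ID: ['sub-01','sub-02']

ses_ID: ['ses-01','ses-01']

mean: [0.3456,0.446]
"""

columns = parse_key_list_lines(sample)
print(columns)
```

The resulting dict has the same shape as the one returned by yaml.safe_load above, so it can be passed to pd.DataFrame in the same way.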
