
I have a txt file which I read into a pandas DataFrame. The problem is that the text data inside this file is recorded with the delimiter '\'. I need to split the information in one column into several columns, but it does not work because of this delimiter.

I found this post on Stack Overflow that deals with a single string, but I don't understand how to apply it to a whole dataframe: Split string at delimiter '\' in python

After reading my txt file into df, it looks something like this:

df

column1\tcolumn2\tcolumn3

0.1\t0.2\t0.3
0.4\t0.5\t0.6
0.7\t0.8\t0.9

Basically what I am doing now is the following:

df = pd.read_fwf('my_file.txt', skiprows = 8) #I use skip rows because there is irrelevant text
df['column1\tcolumn2\tcolumn3'] = "r'" + df['column1\tcolumn2\tcolumn3'] +"'" # I try to make it a raw string as the post suggested, but it does not really work
df['column1\tcolumn2\tcolumn3'].str.split('\\',expand=True)

and what I get is just the following (displayed as plain text inside the dataframe):

r'0.1\t0.2\t0.3'

r'0.4\t0.5\t0.6'

r'0.7\t0.8\t0.9'

I am not very good with regular expressions and this seems a bit hard. How can I tackle this problem?

  • \t is most likely the tab character here, and the t is not part of the data. Simply split at \t instead. – Stefan Jul 16 '22 at 10:50
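To illustrate the comment above, here is a minimal sketch of splitting an already-loaded column at the tab character. It assumes the cells contain real tab characters (which pandas merely displays as \t); the small dataframe is built by hand to mimic the one in the question, and the names `raw_name` and `parts` are only illustrative.

```python
import pandas as pd

# Hypothetical one-column frame mimicking the question: the header and
# every cell contain real tab characters (shown as \t when printed).
raw_name = "column1\tcolumn2\tcolumn3"
df = pd.DataFrame({raw_name: ["0.1\t0.2\t0.3", "0.4\t0.5\t0.6", "0.7\t0.8\t0.9"]})

# Split each cell at the tab character; expand=True returns one column per piece.
parts = df[raw_name].str.split("\t", expand=True)

# Reuse the pieces of the original header as the new column names.
parts.columns = raw_name.split("\t")

# The pieces are still strings, so convert them to numbers.
parts = parts.astype(float)
print(parts)
```

Note that no raw string or regular expression is needed here: "\t" in ordinary Python source is already the single tab character.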

1 Answer


It looks like your file is tab-delimited, because of the "\t". This may work:

pd.read_csv('file.txt', sep='\t', skiprows=8)
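As a self-contained way to try this out, the sketch below reads from an in-memory buffer instead of a real file. The eight preamble lines and the sample values are made up to match the question's description; with a real file you would pass its path in place of `fake_file`.

```python
import io

import pandas as pd

# Stand-in for the real file: 8 irrelevant lines, then tab-separated data.
# The contents are invented for illustration only.
preamble = "".join(f"irrelevant header line {i}\n" for i in range(8))
body = (
    "column1\tcolumn2\tcolumn3\n"
    "0.1\t0.2\t0.3\n"
    "0.4\t0.5\t0.6\n"
    "0.7\t0.8\t0.9\n"
)
fake_file = io.StringIO(preamble + body)

# sep="\t" tells pandas the fields are separated by tabs;
# skiprows=8 drops the irrelevant text before the header row.
df = pd.read_csv(fake_file, sep="\t", skiprows=8)
print(df)
print(df.dtypes)
```

Reading the file this way produces three separate numeric columns directly, so no string splitting is needed afterwards.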