I received data extracted from a server as CSV files that use ";" as the delimiter.
I read the folder with the following commands:
import glob
import pandas as pd

files = glob.glob(r"path/*.csv")
dfs = [pd.read_csv(f, sep=";", engine='c') for f in files]
df2 = pd.concat(dfs, ignore_index=True)
The output is:
columnA    columnB  ....  columnT  columnU
2000       A        ....  I wish   NaN
1000       B        ....  that     NaN
this ends  NaN      ....  NaN      NaN
3000       A        ....  I        DUU
...
The text in row 3 belongs in columnT of the second row. So far I have only been able to delete such broken rows, but then the information they contain is lost:
df2.dropna(subset=['columnB'], how='all', inplace=True)
How can I read the files correctly? The problem is that the text field columnT also uses ";" as a normal character within the text.
The original text (in the CSV file) is:
columnA; columnB; .... columnT; columnU:
2000; A; .... I wish; NaN;
1000; B; .... that; this ends; NaN;
3000; A; ..... I; DUU;
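One direction that might work is to repair each line before pandas parses it. The sketch below is untested against the real files and rests on several assumptions: only columnT contains stray ";" characters, every record sits on a single physical line, rows carry no extra trailing ";" beyond the columns named in the header, and the files are UTF-8. The helper names `repair_fields`, `read_repaired` and `read_folder` are made up for illustration. A row with more fields than the header has its surplus fields merged back into columnT, and the repaired rows are re-written with quoting so that `pd.read_csv` can still infer dtypes and NaN values.

```python
import csv
import glob
import io

import pandas as pd


def repair_fields(fields, n_cols, text_idx):
    """Merge surplus fields back into the free-text column at text_idx."""
    extra = len(fields) - n_cols
    if extra <= 0:
        return fields  # row already has the expected width (or fewer fields)
    merged = ";".join(fields[text_idx:text_idx + extra + 1])
    return fields[:text_idx] + [merged] + fields[text_idx + extra + 1:]


def read_repaired(lines, text_col="columnT"):
    """Build a DataFrame from ';'-separated lines where text_col may contain ';'.

    Assumes the first non-empty line is the header and that text_col is the
    only column whose values can contain the delimiter.
    """
    lines = [ln.rstrip("\r\n") for ln in lines if ln.strip()]
    header = [name.strip() for name in lines[0].split(";")]
    text_idx = header.index(text_col)

    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter=";")  # quotes any field containing ';'
    writer.writerow(header)
    for line in lines[1:]:
        fields = repair_fields(line.split(";"), len(header), text_idx)
        writer.writerow([field.strip() for field in fields])

    buffer.seek(0)
    return pd.read_csv(buffer, sep=";")


def read_folder(pattern, text_col="columnT", encoding="utf-8"):
    """Read every file matching pattern and concatenate the repaired frames."""
    frames = []
    for path in glob.glob(pattern):
        with open(path, encoding=encoding) as handle:
            frames.append(read_repaired(handle, text_col=text_col))
    return pd.concat(frames, ignore_index=True)
```

With this in place, `df2 = read_folder(r"path/*.csv")` would replace the list comprehension above. If more than one column can contain ";", the surplus fields can no longer be attributed unambiguously and the export itself would need to quote its text fields.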