I am trying to create a DataFrame for each file in a list of more than 3,000 files. My code works fine with a small number of files, but with larger batches (>300 files) I keep getting the same error:

ParserError: Error tokenizing data. C error: Expected 1 fields in line 4, saw 5

This is the script:

import pandas as pd

all_files_df = [pd.read_table("/data/lab/datasets/Drug_CyTOF_screening/" + x, sep='\t') for x in all_files]

Does anyone know what is causing this issue?

Thank you!

1 Answer

The error means pandas inferred one column from the first lines of a file but then hit a row with five tab-separated fields, so at least one file in the batch is malformed. To find out which ones, wrap each read in a try/except:

import pandas as pd

data = []
for x in all_files:
    try:
        # Read each file and collect the resulting DataFrames.
        df = pd.read_table("/data/lab/datasets/Drug_CyTOF_screening/" + x, sep='\t')
        data.append(df)
    except pd.errors.ParserError as err:  # note: ParserError, not ParseError
        print(f"'{x}' contains errors, skipped")
        print(err)
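
Once the failing files are identified, printing their first few lines usually reveals the problem (extra metadata rows before the header, a different delimiter, and so on). A minimal sketch for that inspection, assuming plain-text files under the same directory as the question (show_head is a hypothetical helper, not part of the original script):

def show_head(filename, n=5):
    # Print the first n raw lines of a file so the structure
    # that confused the parser can be seen directly.
    with open("/data/lab/datasets/Drug_CyTOF_screening/" + filename) as f:
        for _ in range(n):
            print(f.readline().rstrip("\n"))

If the malformed rows can simply be dropped instead, pandas 1.3+ also accepts on_bad_lines="skip" in read_table, e.g. pd.read_table(path, sep='\t', on_bad_lines="skip").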
Corralien