4

I am new to Python and need to load DataFrames from various CSV files. Some business logic depends on the number of rows in each CSV. Is there a way to find the total number of rows in a CSV file beforehand, without calling read_csv?

Matthias Fripp
Avij

2 Answers

5

Yes, you can:

with open('/path/to/file.csv') as f:
    lines = sum(1 for line in f)

Be aware, though, that pandas will read the whole file again when you later call read_csv.
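
A raw line count includes the header row and can miscount quoted fields that contain line breaks. As a minimal sketch, assuming the file has at most one header row, the standard `csv` module can count data records instead. `count_data_rows` is a hypothetical helper name, not a pandas function.

```python
import csv

def count_data_rows(path, has_header=True):
    """Count CSV records, letting csv.reader handle quoted line breaks."""
    # newline='' is what the csv module documentation recommends for file objects
    with open(path, newline='') as f:
        # csv.reader yields [] for a completely blank line; ignore those
        n = sum(1 for row in csv.reader(f) if row)
    # Assumption: at most one header row to subtract
    return max(n - 1, 0) if has_header else n
```

The business logic can then branch on `count_data_rows('/path/to/file.csv')` before any DataFrame is built.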

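If you are sure that the whole file will fit into memory, you can read it once and reuse the contents:

import io
import pandas as pd

with open('/path/to/file.csv') as f:
    data = f.readlines()
lines = len(data)
# read_csv expects a path or file-like object, not a list of lines
df = pd.read_csv(io.StringIO(''.join(data)), ...)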
MaxU - stand with Ukraine
  • Thanks for the answer. The files have only 1000-4000 lines, so I think they can fit in memory. But is there a way to know beforehand whether a file will fit or not? – Avij Oct 05 '17 at 03:22
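
On the follow-up about memory: one rough check, sketched here with only the standard library, is to look at the file's size on disk before loading it. The 100 MB threshold and the name `small_enough` are illustrative assumptions. The in-memory size of a DataFrame can differ considerably from the file size, so treat this as a coarse guard only.

```python
import os

# Hypothetical limit; pick one that suits your machine
MAX_BYTES = 100 * 1024 * 1024

def small_enough(path, max_bytes=MAX_BYTES):
    """Return True if the file on disk is no larger than max_bytes."""
    return os.path.getsize(path) <= max_bytes
```

A file of a few thousand short lines will normally be far below any such limit.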
0

You can read the file without saving the contents. Try:

with open(filename, "r") as filehandle:
    number_of_lines = len(filehandle.readlines())
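
Note that `readlines()` still builds a list of every line before `len` is taken. If that matters, a sketch along these lines counts newline bytes in fixed-size chunks instead. The function name and the 1 MiB chunk size are arbitrary choices, and a final line without a trailing newline is counted as well.

```python
def count_lines_chunked(path, chunk_size=1024 * 1024):
    """Count lines by scanning the file in binary chunks."""
    count = 0
    last = b''
    with open(path, 'rb') as f:
        # iter() with a sentinel stops when read() returns b'' at end of file
        for chunk in iter(lambda: f.read(chunk_size), b''):
            count += chunk.count(b'\n')
            last = chunk
    # Count a final line that lacks a trailing newline
    if last and not last.endswith(b'\n'):
        count += 1
    return count
```

Like the other line counts here, the result includes the header row.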
Stuart Buckingham