I have a .dat file that I had been reading with pd.read_csv, and I always needed to pass encoding="latin" for it to be read properly, without error. When I use pyarrow.csv.read_csv, I don't see a parameter for selecting the file's encoding, but it still works without issue (which is great, but I don't understand why, or whether it only handles certain encodings automatically). The only parameters I'm setting are delimiter="|" (with ParseOptions) and auto_dict_encode=True (with ConvertOptions).
How is pyarrow handling different encoding types?