I wrote a Python program that reads a CSV file with pandas and then does some analysis. Unfortunately, some friends who use my program have computers with European regional settings, so their CSV files use the comma (,) as the decimal separator, while mine use the point (.). The column delimiter is the semicolon in every case, so there are no problems there.

Right now I have a setting at the beginning of the program that goes like this:

european_decimal = True  # False

and then later

if european_decimal:
    df1 = pd.read_csv(directory + filename, delimiter=";", decimal=",")
else:
    df1 = pd.read_csv(directory + filename, delimiter=";")

This, of course, works. But it is really ugly, and it requires my friend, who is not a programmer, to edit the code. Is there any way to find out whether the computer on which a Python program is running uses the comma or the full stop as the decimal separator?

LATE EDIT: in the end I applied @ALollz's solution and simply converted every string containing commas into a string with dots:

for column in list_data_columns:
    df1[column] = df1[column].astype(str).str.replace(",", ".").astype(float)

Pietro Speroni
  • 1
    But this doesn't prevent your european friends from opening csv files where the decimal separator is `'.'` though? Anyway you could detect the locale or the UI locale: https://stackoverflow.com/questions/3425294/how-to-detect-the-os-default-language-in-python to do this depending on what you want – EdChum Jun 04 '19 at 15:40
  • hi Pietro, that's not probably the case, but can't you just read a generic entry of the dataframe which is supposed to contain some numbers and check if a comma or dot is used as decimal separator? – crash Jun 04 '19 at 15:40
  • 1
    That seems so much more complicated than something simple like `decimal = '.'; if european_decimal: decimal=','` followed by a single `pd.read_csv` with `decimal=decimal`. Do you friends really have so little knowledge that wrapping it in a function with a rather verbose argument `european_decimal = True` is too hard to implement? – ALollz Jun 04 '19 at 15:42
  • 1
    You could use `locale`: `import locale; locale.localeconv()['decimal_point']` will output the decimal point character, but the problem remains that it will be dependent on your data. If the format of the data is consistent – EdChum Jun 04 '19 at 15:43
  • at the end I applied @ALollz solution and I simply converted any string with commas in strings with dots: for column in list_data_columns: df1[column]=df1[column].astype(str).str.replace(",",".").astype(float) – Pietro Speroni Jun 07 '19 at 17:22
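
The idea raised in the comments of inspecting the data itself, rather than the computer's settings, could be sketched roughly as follows. This is only an illustration: `guess_decimal_separator` is a hypothetical helper, not a pandas function, and it assumes the numbers carry no thousands separators (a field such as `1,234` would be counted as a comma decimal).

```python
import re

# Fields that look like a number with exactly one decimal separator.
_COMMA_NUMBER = re.compile(r"^[+-]?\d+,\d+$")
_DOT_NUMBER = re.compile(r"^[+-]?\d+\.\d+$")


def guess_decimal_separator(sample: str, column_sep: str = ";") -> str:
    """Guess the decimal separator from a text sample of the CSV file.

    Hypothetical helper: counts fields shaped like '1,5' versus '1.5'
    and returns whichever shape is more frequent ('.' on a tie).
    """
    commas = dots = 0
    for line in sample.splitlines():
        for field in line.split(column_sep):
            field = field.strip().strip('"')
            if _COMMA_NUMBER.match(field):
                commas += 1
            elif _DOT_NUMBER.match(field):
                dots += 1
    return "," if commas > dots else "."
```

One possible use is to read the first few kilobytes of the file as text, pass them to the helper, and hand the result to `pd.read_csv(..., delimiter=";", decimal=...)`.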

1 Answer

You are looking for the locale module, which Doug Hellmann described reasonably fully in this Python Module Of The Week column.

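Hint: You are probably looking for locale.localeconv()['decimal_point'] and locale.localeconv()['mon_thousands_sep']. Note that locale.localeconv() returns the conventions dictionary, whereas locale.getlocale() only returns a (language code, encoding) tuple. Call locale.setlocale(locale.LC_ALL, '') first; otherwise Python stays in the default "C" locale, whose decimal point is always '.'.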
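
A minimal sketch of that lookup, assuming the goal is only to find the decimal separator of the user's default locale. The function name `system_decimal_point` is made up for illustration; `locale.setlocale` and `locale.localeconv` are the standard-library calls.

```python
import locale


def system_decimal_point() -> str:
    """Return the decimal separator of the user's default locale.

    Hypothetical helper: temporarily switches LC_NUMERIC to the user's
    default locale, reads the separator, then restores the old setting.
    """
    saved = locale.setlocale(locale.LC_NUMERIC)  # query the current setting
    try:
        locale.setlocale(locale.LC_NUMERIC, "")  # adopt the user's default
        return locale.localeconv()["decimal_point"] or "."
    except locale.Error:
        # The default locale is not available on this system.
        return "."
    finally:
        locale.setlocale(locale.LC_NUMERIC, saved)
```

The result could then be passed straight to pandas, for example `pd.read_csv(directory + filename, delimiter=";", decimal=system_decimal_point())`. As the comments point out, this reflects the computer's settings, not necessarily the file being opened.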

holdenweb