read_csv2() and the use of the locale() argument to import dot decimals as numbers

Question

Using read_csv2(), I tried to import a semicolon-delimited CSV file and have the column that uses . as its decimal mark detected as numeric automatically.

I have been unsuccessful so far and keep getting the following output (note that the last column is <chr> rather than <dbl>):

# A tibble: 46 x 4
             id     segment_id             value_type             value
          <int>          <int>                  <chr>             <chr>
1             1              1                    min                 0
2             1              1                    max               0.2
3             1              2                    min                 0
4             1              2                    max               0.2
...

What I have tried:

1. A plain call, letting readr guess the column types:

read_csv2("table.csv", col_types = cols())

2. I read the readr documentation and found locale(), which is described as follows:

The locale controls defaults that vary from place to place. The default locale is US-centric (like R), but you can use locale() to create your own locale that controls things like the default time zone, encoding, decimal mark, big mark, and day/month names.

The code below, however, did not solve my problem:

read_csv2("table.csv", col_types = cols(), col_names = TRUE, locale(decimal_mark = "."))

3. After reading "How to make R's read_csv2() recognise the text characters properly", I tried all the encodings listed under File > Save with Encoding... in RStudio, to no avail:

read_csv2("table.csv", col_types = cols(), col_names = TRUE, locale(encoding = "ISO-8859-1"))

The encodings listed are: ISO-8859-1, ASCII, BIG5, GB18030, GB2312, ISO-2022-JP, ISO-2022-KR, ISO-8859-2, ISO-8859-7, SHIFT-JIS, UTF-8, WINDOWS-1252

score 0 · Answer 1 · answered May 24 '19 at 07:19

When I save your 4 lines as a CSV file and run read_csv2() on it like this:

prueba <- read_csv2(file = input_prueba, col_types = cols(), col_names = TRUE, locale(encoding = "ISO-8859-1"))

I get the same output: the last column is imported as a character column, along with a message advising the use of read_delim():

Using ',' as decimal and '.' as grouping mark. Use read_delim() for more control
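Following that hint, here is a minimal sketch with read_delim(), which takes the delimiter explicitly and leaves the locale's decimal mark alone. The temporary file and its contents are an assumption, made up to mirror the four rows in the question; substitute your own path.

```r
library(readr)

# Hypothetical sample file mirroring the four rows shown in the question
path <- tempfile(fileext = ".csv")
writeLines(c(
  "id;segment_id;value_type;value",
  "1;1;min;0",
  "1;1;max;0.2",
  "1;2;min;0",
  "1;2;max;0.2"
), path)

# read_delim() takes ";" as the delimiter and keeps "." as the decimal mark
prueba <- read_delim(
  path,
  delim = ";",
  locale = locale(decimal_mark = "."),
  col_types = cols()  # guess the column types without printing the spec
)

str(prueba$value)
```

Since "." is already the decimal mark of the default locale, the locale argument is spelled out here only to make the intent visible.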

If you still want to use read_csv2(), here is my approach, which converts the column after import:

# data.table allows quick in-place updates of table columns
install.packages("data.table")  # only needed once
library("data.table")

prueba <- as.data.table(prueba)

# Convert the character column to numeric by reference
prueba[, value := as.double(value)]
str(prueba)
Classes ‘data.table’ and 'data.frame':  4 obs. of  4 variables:
 $ id        : int  1 1 1 1
 $ segment_id: int  1 1 2 2
 $ value_type: chr  "min" "max" "min" "max"
 $ value     : num  0 0.2 0 0.2
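If loading data.table just for this step feels heavy, the same conversion can be sketched in base R. The data frame below is a hypothetical stand-in for the imported table, built by hand so the snippet runs on its own.

```r
# Hypothetical stand-in for the table as imported by read_csv2(),
# with `value` still stored as character
prueba <- data.frame(
  id = c(1L, 1L, 1L, 1L),
  segment_id = c(1L, 1L, 2L, 2L),
  value_type = c("min", "max", "min", "max"),
  value = c("0", "0.2", "0", "0.2"),
  stringsAsFactors = FALSE
)

# as.numeric() expects "." as the decimal mark, which matches this data
prueba$value <- as.numeric(prueba$value)

str(prueba)
```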
