How to specify the digits of numeric values when reading data with read.csv, read_csv or read_excel in R

Question

I am trying to read geographic latitude and longitude values into R. These values are usually numeric with more than 6 digits. I tried read_excel() from the "readxl" package, read.csv() from base R, and read_csv() from the "readr" package. However, none of these functions seems to read the data without loss of information: in every case the numeric values appear to be truncated at 4 or 5 digits. I also tried setting options(digits = 8) before reading the data, but that did not help. Here is a reproducible example using read_csv() from the "readr" package:

read_csv("112.8397456,35.50496106\n112.583984,37.8519194\n112.5826569,37.8602818", col_names = FALSE)

The system automatically truncates the data at 5 digits:

# A tibble: 3 × 2
        X1       X2
     <dbl>    <dbl>
1 112.8397 35.50496
2 112.5840 37.85192
3 112.5827 37.86028

I have searched Stack Overflow, and it seems that no similar question has been asked. Could anyone tell me how to read this kind of data without information loss? Thanks. :)

Note that `options(digits = ...)` counts significant digits, including those before the decimal point, so with digits = 7 you get 112.8397, which is exactly 7 significant digits. — talat, Mar 20 '17 at 14:05
@docendodiscimus Thanks. I think I misunderstood the word "digits". — Miao Cai, Mar 21 '17 at 04:43
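To illustrate the point in the comment above, here is a minimal base R sketch. The variable names (x, seven_sig, ten_sig, seven_dec) are made up for this example. It contrasts the digits argument, which counts significant digits, with sprintf(), which can fix the number of decimal places.

```r
x <- 112.8397456

# digits counts significant digits in total, not decimal places:
# 3 digits before the point leave room for only 4 after it
seven_sig <- format(x, digits = 7)

# 10 significant digits are enough for 3 before + 7 after the point
ten_sig <- format(x, digits = 10)

# sprintf() asks for a fixed number of decimal places instead
seven_dec <- sprintf("%.7f", x)

print(seven_sig)  # "112.8397"
print(ten_sig)    # "112.8397456"
print(seven_dec)  # "112.8397456"
```

In all three cases the value stored in x is the same. Only the text produced for display differs.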

score 2 · Accepted Answer · answered Mar 20 '17 at 13:59

This isn't an issue with readr. The full data is still in there—R is just not showing it all. The same thing happens when you use base R's read.csv():

library(tidyverse)
df.readr <- read_csv("112.8397456,35.50496106\n112.583984,37.8519194\n112.5826569,37.8602818", col_names = FALSE)

df.base <- read.csv(textConnection("112.8397456,35.50496106\n112.583984,37.8519194\n112.5826569,37.8602818"), header = FALSE)

# By default R prints 7 significant digits
getOption("digits")
#> [1] 7

# Both data frames are printed with 7 significant digits (the stored values are not truncated)
df.readr
#> # A tibble: 3 × 2
#>         X1       X2
#>      <dbl>    <dbl>
#> 1 112.8397 35.50496
#> 2 112.5840 37.85192
#> 3 112.5827 37.86028
df.base
#>         V1       V2
#> 1 112.8397 35.50496
#> 2 112.5840 37.85192
#> 3 112.5827 37.86028

# Bumping up the digits shows more
options("digits" = 15)

df.readr
#> # A tibble: 3 × 2
#>            X1          X2
#>         <dbl>       <dbl>
#> 1 112.8397456 35.50496106
#> 2 112.5839840 37.85191940
#> 3 112.5826569 37.86028180
df.base
#>            V1          V2
#> 1 112.8397456 35.50496106
#> 2 112.5839840 37.85191940
#> 3 112.5826569 37.86028180
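If you would rather not change the global option, the following sketch shows one way to check that the stored values are complete, using base R only. The names csv_text, df, max_abs_error and formatted are chosen for this example, and the tolerance of 1e-9 is an arbitrary assumption that is comfortably below the precision of the input.

```r
csv_text <- "112.8397456,35.50496106\n112.583984,37.8519194\n112.5826569,37.8602818"
df <- read.csv(text = csv_text, header = FALSE)

# Compare the stored doubles with the values typed in the file
expected_v1 <- c(112.8397456, 112.583984, 112.5826569)
max_abs_error <- max(abs(df$V1 - expected_v1))
print(max_abs_error < 1e-9)

# Show more significant digits for this one print call only
print(df, digits = 12)

# Or format a column with a fixed number of decimal places
formatted <- sprintf("%.7f", df$V1)
print(formatted)
```

The digits argument of print() affects only that call, so getOption("digits") stays at its previous value. For tibbles, recent versions of tibble/pillar may take the number of displayed digits from the pillar.sigfig option rather than digits. I have not checked this against the package versions used in the output above, so treat it as something to verify.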

Thanks. I misunderstood the word "digits" as the length of numbers after the decimal. The problem is fairly straightforward if I understand the word digit correctly. :) — Miao Cai, Mar 21 '17 at 04:44
