I am trying to read geographic latitude and longitude data into R. These values are usually numeric with more than six decimal places. I tried read_excel() from the "readxl" package, read.csv() from base R, and read_csv() from the "readr" package. However, none of these functions appears to read the data without loss of information: the values I see are all truncated at 4 or 5 decimal places. I also tried setting "options(digits = 8)" before reading the data, but that did not help. Here is a reproducible example using read_csv() from the "readr" package:

read_csv("112.8397456,35.50496106\n112.583984,37.8519194\n112.5826569,37.8602818", col_names = FALSE)

The values appear to be truncated at 4 or 5 decimal places:

# A tibble: 3 × 2
        X1       X2
     <dbl>    <dbl>
1 112.8397 35.50496
2 112.5840 37.85192
3 112.5827 37.86028

I have searched Stack Overflow and could not find a similar question. Could anyone suggest how to read this kind of data without information loss? Thanks. :)

Andrew
Miao Cai
  • Note that `options(digits = ...)` also counts the digits before the decimal point, so with digits = 7 you get 112.8397, which has 7 significant digits in total – talat Mar 20 '17 at 14:05
  • @docendodiscimus Thanks. I think I misunderstood the word "digits". – Miao Cai Mar 21 '17 at 04:43
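
The point in the comment above can be checked with a minimal base R sketch. It assumes nothing beyond `format()` and `sprintf()`; the variable name `x` is only for illustration. It shows that `digits` counts significant digits in total, not decimal places.

```r
x <- 112.8397456

# `digits` is the total number of significant digits,
# including the three digits before the decimal point.
format(x, digits = 7)    # "112.8397"
format(x, digits = 10)   # "112.8397456"

# To ask for a fixed number of decimal places instead, use sprintf().
sprintf("%.7f", x)       # "112.8397456"
```

So with the default of 7, a value such as 112.8397456 is displayed with only four decimal places, even though the stored number is unchanged.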

1 Answer

This isn't an issue with readr. The full data is still there; R just isn't showing all of it. The same thing happens when you use base R's read.csv():

library(tidyverse)
df.readr <- read_csv("112.8397456,35.50496106\n112.583984,37.8519194\n112.5826569,37.8602818", col_names = FALSE)

df.base <- read.csv(textConnection("112.8397456,35.50496106\n112.583984,37.8519194\n112.5826569,37.8602818"), header = FALSE)

# By default R shows 7 digits
getOption("digits")
#> [1] 7

# Both data frames are printed with 7 significant digits (the stored values are not truncated)
df.readr
#> # A tibble: 3 × 2
#>         X1       X2
#>      <dbl>    <dbl>
#> 1 112.8397 35.50496
#> 2 112.5840 37.85192
#> 3 112.5827 37.86028
df.base
#>         V1       V2
#> 1 112.8397 35.50496
#> 2 112.5840 37.85192
#> 3 112.5827 37.86028

# Raising the digits option shows more
options(digits = 15)

df.readr
#> # A tibble: 3 × 2
#>            X1          X2
#>         <dbl>       <dbl>
#> 1 112.8397456 35.50496106
#> 2 112.5839840 37.85191940
#> 3 112.5826569 37.86028180
df.base
#>            V1          V2
#> 1 112.8397456 35.50496106
#> 2 112.5839840 37.85191940
#> 3 112.5826569 37.86028180
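
If changing the global option is undesirable, the stored precision can also be checked for a single call. The following is a sketch using base R only, assuming the same sample data as above; `csv_text`, `df`, and `expected` are names introduced here for illustration.

```r
# Parse the sample data with base R.
csv_text <- "112.8397456,35.50496106\n112.583984,37.8519194\n112.5826569,37.8602818"
df <- read.csv(text = csv_text, header = FALSE)

# Compare the parsed first column with the values written in the file.
expected <- c(112.8397456, 112.583984, 112.5826569)
max(abs(df$V1 - expected)) < 1e-9

# Print with more significant digits for this call only,
# leaving getOption("digits") untouched.
print(df, digits = 12)

# Or format a column to a fixed number of decimal places.
sprintf("%.7f", df$V1)
#> [1] "112.8397456" "112.5839840" "112.5826569"
```

Note that newer versions of tibble may format numbers through the pillar package (its `pillar.sigfig` option) rather than `digits`, so the tibble output above may differ depending on the installed version; the base data frame approach here does not depend on that.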
Andrew
  • Thanks. I misunderstood the word "digits" as the length of numbers after the decimal. The problem is fairly straightforward if I understand the word digit correctly. :) – Miao Cai Mar 21 '17 at 04:44