1

Would like to be able to read Google Sheets cell values into R with googlesheets package, but without any cell formatting applied (e.g. comma separators, percentage conversion, etc.).

Have tried gs_read() without specifying a range, which uses gs_read_csv(), which will "request the data from the Sheets API via the exportcsv link". Can't find a way to tell it to provide underlying cell value without formatting applied.

Similarly, tried gs_read() and specifying a range, which uses gs_read_cellfeed(). But can't find a way to indicate that I want un-formatted cell values.

Note: I'm not after the formulas in any cells, just the values without any formatting applied.

Example: (looks like I'm not able to post image images) Here's a screenshot of an example Google Sheet: https://www.dropbox.com/s/qff05u8nn3do33n/Screenshot%202015-07-26%2008.42.58.png?dl=0 First and third columns are numeric with no formatting applied, 2nd column applies comma separators for thousands, 4th column applies percentage formatting.

Reading this sheet with the following code:

library(googlesheets)
gs <- gs_title("GoogleSheets Test")
ws <- gs_read(gs, ws = "Sheet1")

yields:

> str(ws)
Classes ‘tbl_df’, ‘tbl’ and 'data.frame':   3 obs. of  4 variables:
 $ Number         : int  123456 123457 123458
 $ Number_wFormat : chr  "123,456" "123,457" "123,458"
 $ Percent        : num  0.123 0.234 0.346
 $ Percent_wFormat: chr  "12.34%" "23.45%" "34.56%"

Would like to be able to read a worksheet that has formatting applied (ala columns 2 and 4), but read the unformatted values (ala columns 1 and 3).

Rentrop
  • 20,979
  • 10
  • 72
  • 100
  • The results you get back from gs_read() ... in what way are they affected by cell formatting? I.e. what about the return value is problematic? Can you give an example? The Sheets API does not provide access to cell formatting: "The literal value of the cell element is the calculated value of the cell, without formatting applied. If the cell contains a formula, the calculated value is given here. The Sheets API has no concept of formatting, and thus cannot manipulate formatting of cells." (BTW I'm the author of googlesheets). – jennybryan Jul 24 '15 at 23:31
  • @jennybryan, first THANK YOU for writing the googlesheets package - very well done. Next, I've updated my question with a detailed example demonstrating the issue. – Eric Goldsmith Jul 26 '15 at 13:00

1 Answers1

0

At this point, I think your best bet is to fix the imported data like so:

> ws$Number_fixed <- type.convert(gsub(',', '', ws$Number_wFormat))
> ws$Percent_fixed <- type.convert(gsub('%', '', ws$Percent_wFormat)) / 100
> str(ws)
Classes ‘tbl_df’, ‘tbl’ and 'data.frame':   3 obs. of  6 variables:
 $ Number         : int  123456 123457 123458
 $ Number_wFormat : chr  "123,456" "123,457" "123,458"
 $ Percent        : num  0.123 0.234 0.346
 $ Percent_wFormat: chr  "12.34%" "23.45%" "34.56%"
 $ Number_fixed   : int  123456 123457 123458
 $ Percent_fixed  : num  0.123 0.234 0.346

I had some hope that post-processing with functions from readr would be a decent answer, but it looks like percentages and "currency" style numbers are open issues there too.

I have opened an issue to solve this better in googlesheets, one way or another.

jennybryan
  • 2,606
  • 2
  • 18
  • 33
  • The unformatted number you seek is present in the XML, it's just not the piece I'm currently exporting. So that is another fruitful avenue for improvement in googlesheets -- related to an existing request for [formula support](https://github.com/jennybc/googlesheets/issues/18). – jennybryan Jul 28 '15 at 05:25