2

I have a problem with detection of the column types when importing data into R using the read_excel function of readxl package.

There are lot's of Q&As that are similar to my issue like this or this.

However I am still struggling.

Assume, I have an .xls file with just a one column INN that contains digits only; however, it needs to be imported as character type (as the data could start with 0, and the 0 is very important).

Here is the code I used:

data = read_excel('data.xls', col_types = c('text'))

When I import the data, it looks like this:

      INN
1 2123123

As a result INN detected as character however in some rows where INN starts with 0, the 0 is dropped which is unacceptable.

I want the output to look like this:

       INN
1 02123123

Any ideas how I could overcome my issue?

Ian Campbell
  • 23,484
  • 14
  • 36
  • 57
  • What about reading that column as numeric, format the numbers with `sprintf` and then change it to character, would that be an option? – wernor Aug 10 '22 at 08:05
  • zero should not be dropped or the whole first line is dropped. please describe your error. – gaut Aug 10 '22 at 08:07
  • @wernor thank you for comment. Could provide with some code example? – Davit Bidzhoyan Aug 10 '22 at 08:08
  • 1
    @gaut assume I'm trying to import data with 1 value of '02123123' in column `INN`. As a result of my code I got '2123123' (i.e. without zero). – Davit Bidzhoyan Aug 10 '22 at 08:11
  • @DavitBidzhoyan it is hard to do with no actual data, but this `sprintf("%08.f",data)` could be an example of filling your number with initial zeros and it returns an 8 digit character string. – wernor Aug 10 '22 at 08:17
  • The short answer is that you can't with readxl. See [this Github issue](https://github.com/tidyverse/readxl/issues/389). – Ian Campbell Aug 10 '22 at 13:27

1 Answers1

0

Try this:

XLConnect::readWorksheet(..., colType="character")
gaut
  • 5,771
  • 1
  • 14
  • 45