I have a dataset in R which I am importing from Google Drive using the googlesheets4 package:

dat <- read_sheet("url")

which appears to import correctly. Inspecting the class shows that it's a tbl_df

> class(dat)
[1] "tbl_df"     "tbl"        "data.frame"

However, when I try to subset different columns, I get a variety of errors.

First I tried using the normal subset command, which results in the following

> dat1 <- dat[165:300]
Error: Positive column indexes in `[` must match number of columns:
* `.data` has 175 columns
* Position 12 equals 176
* Position 13 equals 177
* Position 14 equals 178
* Position 15 equals 179
* ... and 121 more problems

Next I tried calling the actual column names, which resulted in a different error (I double checked, those are definitely the names of the columns).

> dat1 <- dat[agency1:Like_9]
Error in check_names_df(i, x) : object 'agency1' not found

Then I tried converting the whole dataset to a data frame, which worked

> dat1 <- as.data.frame(dat)
> class(dat1)
[1] "data.frame"

However, subsetting with column names returned the same error

> dat1 <- dat[agency1:Like_9]
Error in `[.data.frame`(dat, agency1:Like_9) : object 'agency1' not found

Subsetting with column numbers returned a different error

> dat1 <- dat[165:300]
Error in `[.data.frame`(dat, 165:300) : undefined columns selected

What is going on with the data frame? All the variables I'm trying to subset are numerical, although there are non-numerical variables in the dataset. I'm not certain if the errors are due to how I've imported the data, or because the dataset contains different types of variables. I'm relatively new to R, so any guidance is appreciated :)

zx8754
becbot
  • What is `ncol(dat)` ? – Ronak Shah May 05 '20 at 01:57
  • It looks like your data only has 175 columns, so it's expected for `dat[165:300]` to throw an error. `[.tbl_df` doesn't provide support for tidyselect, so it's failing because there's no variable called `agency1` in the global environment. `dplyr::select(dat, agency1:Like_9)` is probably what you want – dave-edison May 05 '20 at 02:02
  • @DiceboyT using dplyr select worked, thanks! When I hovered over the columns I wanted it said 'column 300: numeric with range 1 - 6' so I assumed (incorrectly) that was the column number, which explains those errors I guess. I assume there is also a way to identify column numbers, but calling them by their names using select is probably easier :) – becbot May 05 '20 at 04:24
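
As a rough illustration of the two points raised in the comments (checking `ncol()` and selecting by name with `dplyr::select()`), here is a minimal sketch. The small tibble is a hypothetical stand-in for the imported sheet, and apart from `agency1` and `Like_9` its column names and values are made up. It also shows one way to look up column positions with `match()` and `names()`, in case numeric indexing is preferred.

```r
library(dplyr)

# Hypothetical stand-in for the sheet returned by read_sheet():
# 6 columns here instead of the real 175.
dat <- tibble::tibble(
  id      = 1:3,
  agency1 = c(1, 2, 3),
  Like_1  = c(4, 5, 6),
  Like_9  = c(2, 2, 5),
  notes   = c("a", "b", "c"),
  other   = c(TRUE, FALSE, TRUE)
)

# Check how many columns actually exist before indexing by position.
ncol(dat)

# Select a contiguous range of columns by name.
# Inside select(), agency1:Like_9 is resolved against the column names,
# which plain `[` does not do.
dat1 <- select(dat, agency1:Like_9)

# Look up the positions of the two boundary columns by name,
# then subset by position with `[`.
pos  <- match(c("agency1", "Like_9"), names(dat))
dat2 <- dat[pos[1]:pos[2]]

# Both approaches should pick the same columns.
identical(names(dat1), names(dat2))
```

With the real data, `ncol(dat)` and `names(dat)` should make it clear which positions are valid before trying something like `dat[165:300]`.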

0 Answers