I have code like this:

bulk <- read_csv("data/food_bulk_raw.csv") %>% 
  mutate(Treatment = "bulk", Individual = seq_len(Timestamp))

Here seq_len() is meant to create the sequence 1:length(Timestamp). It works because Timestamp is a column of the data frame. But suppose I knew nothing about my data frame, for instance because I am writing a function. How could I refer to the number of rows in the data frame without first saving it as an object, as I do below?

data002 <- read_csv("data/data002.csv")

data002 <- mutate(data002, New_Column = 1:nrow(data002))
CubicInfinity

2 Answers

You could use any of the following

library(tidyverse)

# Option 1
read_csv("data/food_bulk_raw.csv") %>%
  mutate(Treatment = "bulk", Individual = seq_len(nrow(.)))

# Option 2
read_csv("data/food_bulk_raw.csv") %>%
  mutate(Treatment = "bulk", Individual = seq(nrow(.)))

# Option 3
read_csv("data/food_bulk_raw.csv") %>%
  mutate(Treatment = "bulk", Individual = sequence(nrow(.)))

None of these depend on any particular column; each uses nrow() on the piped data frame to build the sequence.
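
One edge case may be worth keeping in mind when choosing between these options: on a data frame with zero rows, seq_len() returns an empty vector, while seq() counts down from 1 to 0. A minimal base R sketch, where the `empty` data frame is made up purely for illustration:

```r
# Hypothetical zero-row data frame, used only to illustrate the edge case
empty <- data.frame(x = numeric(0))

len_result <- seq_len(nrow(empty))  # empty integer vector: nothing to number
seq_result <- seq(nrow(empty))      # seq(0) counts down from 1 to 0

print(len_result)
print(seq_result)
```

For a file that is known to contain rows the options behave the same; the difference only shows up on empty input.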

As @Marius commented, you could also use n(), which returns the number of rows in the current group (every row, for an ungrouped data frame). In all of the options above, nrow(.) can be replaced with n().

Another option is row_number():

read_csv("data/food_bulk_raw.csv") %>%
  mutate(Treatment = "bulk", Individual = row_number())
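
Because n() and row_number() work per group, the numbering restarts within each group if the data frame has been grouped. A small sketch of that behaviour, assuming a toy data frame (`df`, `g` and `v` are hypothetical names, not from the question):

```r
library(dplyr)

# Hypothetical toy data: two rows in group "a", one row in group "b"
df <- data.frame(g = c("a", "a", "b"), v = c(10, 20, 30))

# Ungrouped: both approaches number every row from 1 to nrow(df)
ungrouped <- df %>%
  mutate(id_n = seq_len(n()), id_rn = row_number())

# Grouped: the numbering restarts at 1 inside each group
grouped <- df %>%
  group_by(g) %>%
  mutate(id = row_number()) %>%
  ungroup()

print(ungrouped)
print(grouped)
```

If a single running index is wanted on grouped data, calling ungroup() before mutate() should give the ungrouped behaviour.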

To demonstrate, here is the first option wrapped in a function:

df_sequence_func <- function(df) {
  df %>% mutate(Individual = seq_len(nrow(.)))
}

df_sequence_func(mtcars)

#    mpg cyl  disp  hp drat    wt  qsec vs am gear carb Individual
#1  21.0   6 160.0 110 3.90 2.620 16.46  0  1    4    4          1
#2  21.0   6 160.0 110 3.90 2.875 17.02  0  1    4    4          2
#3  22.8   4 108.0  93 3.85 2.320 18.61  1  1    4    1          3
#4  21.4   6 258.0 110 3.08 3.215 19.44  1  0    3    1          4
#....

df_sequence_func(cars)

#   speed dist Individual
#1      4    2          1
#2      4   10          2
#3      7    4          3
#4      7   22          4
#5      8   16          5
#6      9   10          6
#....

The function adds a sequential row number regardless of which columns the data frame contains or how many rows it has.
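
If the name of the new column should also be chosen by the caller, one possible variant is sketched below. The helper `add_row_id` and its `name` argument are hypothetical and only illustrate the idea; the sketch relies on dplyr's `!!name :=` syntax for setting a column name from a string.

```r
library(dplyr)

# Hypothetical helper: `add_row_id` and `name` are illustrative names
add_row_id <- function(df, name = "Individual") {
  # `!!name :=` lets the new column's name come from a character string
  df %>% mutate(!!name := row_number())
}

result <- add_row_id(cars, "Car_ID")
print(head(result))
```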

Ronak Shah

We can also use data.table methods, where .N is the number of rows:

library(data.table)
setDT(df)[, seq_len(.N)]

The file can be read with fread(), and both columns added by reference in one step:

fread("data/food_bulk_raw.csv")[,
  c("Treatment", "Individual") := .("bulk", seq_len(.N))][]
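
The same assignment can be tried without the CSV file. The sketch below uses a made-up in-memory table (`dt` and its `food` column are hypothetical) and also shows `.I`, data.table's row index, as an alternative to seq_len(.N):

```r
library(data.table)

# Hypothetical in-memory table standing in for the CSV file
dt <- data.table(food = c("rice", "oats", "lentils"))

# Add both columns by reference; "bulk" is recycled to every row
dt[, c("Treatment", "Individual") := .("bulk", seq_len(.N))]

# .I holds the row index, so it yields the same numbering
dt[, Row_Index := .I]

print(dt)
```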

Or in the tidyverse, by turning the row names into a column:

library(tidyverse)
rownames_to_column(data002, 'rn')
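
Note that rownames_to_column() stores the row names as text. If an integer index is preferred, tibble's rowid_to_column() may be a closer fit. A short comparison on a made-up data frame (`df` is hypothetical):

```r
library(tibble)

# Hypothetical toy data frame
df <- data.frame(x = c(5, 6, 7))

by_name <- rownames_to_column(df, "rn")  # rn holds the row names as character
by_id   <- rowid_to_column(df, "rn")     # rn holds an integer row index

str(by_name)
str(by_id)
```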

Or using n() inside mutate():

data002 %>%
  mutate(New_Column = seq_len(n()))

Or in base R

df$newcolumn <- seq_len(nrow(df))  # seq_len() also copes with a zero-row df
akrun