Create an ID (row number) column

Question

I need to create a column with a unique ID, basically adding the row number as its own column. My current data frame looks like this:

How to make it look like this:

?

Many thanks

Jaap · Answer 1 · 2018-06-18T14:41:15.333

65

Two tidyverse alternatives (using sgibb's example data):

tibble::rowid_to_column(d, "ID")

which gives:

Or:

dplyr::mutate(d, ID = row_number())

which gives:

As you can see, `rowid_to_column()` adds the new column in front of the other ones, while the `mutate()` and `row_number()` combination adds the new column after the others.
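A minimal sketch of the two calls on sgibb's example data, for readers who want to check the column order themselves. It assumes the tibble and dplyr packages are installed; the names `front` and `back` are only illustrative.

```r
library(tibble)
library(dplyr)

# Example data from sgibb's answer
d <- data.frame(V1 = c(23, 45, 56), V2 = c(45, 45, 67))

# rowid_to_column() puts the new column first
front <- rowid_to_column(d, "ID")
names(front)  # "ID" "V1" "V2"

# mutate() appends the new column at the end
back <- mutate(d, ID = row_number())
names(back)   # "V1" "V2" "ID"
```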

And another base R alternative:

d$ID <- seq_along(d[,1])

edited Jun 18 '18 at 14:41

answered Jun 18 '18 at 12:29

Jaap


2

Curiously, the `mutate` and `seq_along` solutions do not work for `data.table`. – James Hirschorn Aug 17 '18 at 22:04
1

@JamesHirschorn Besides the method as shown by @altabq (which is the preferred one for `data.table`), you could do `seq_along(d[[1]])` when `d` is a `data.table`. – Jaap Feb 19 '20 at 13:12

sgibb · Accepted Answer · 2013-05-05T15:32:13.613

You could use cbind:

d <- data.frame(V1=c(23, 45, 56), V2=c(45, 45, 67))

## enter id here, you could also use 1:nrow(d) instead of rownames
id <- rownames(d)
d <- cbind(id=id, d)

## set colnames to OP's wishes
colnames(d) <- paste0("V", 1:ncol(d))

EDIT: Here is a comparison with @dickoa's suggestion. d$id <- seq_len(nrow(d)) is slightly faster, but the order of the columns is different (id is the last column; reordering them seems to be slower than using cbind):

library("microbenchmark")

set.seed(1)
d <- data.frame(V1=rnorm(1e6), V2=rnorm(1e6))

cbindSeqLen <- function(x) {
  return(cbind(id=seq_len(nrow(x)), x))
}

dickoa <- function(x) {
  x$id <- seq_len(nrow(x))
  return(x)
}

dickoaReorder <- function(x) {
  x$id <- seq_len(nrow(x))
  nc <- ncol(x)
  x <- x[, c(nc, 1:(nc-1))]
  return(x)
}

microbenchmark(cbindSeqLen(d), dickoa(d), dickoaReorder(d), times=100)

# Unit: milliseconds
#             expr      min       lq   median       uq      max neval
#   cbindSeqLen(d) 23.00683 38.54196 40.24093 42.60020 47.73816   100
#        dickoa(d) 10.70718 36.12495 37.58526 40.22163 72.92796   100
# dickoaReorder(d) 19.25399 68.46162 72.45006 76.51468 88.99620   100

Why not `d$id <- seq_len(nrow(d))` and then `colnames(d) <- paste0("V", 1:ncol(d))` — dickoa, May 05 '13 at 13:30
@dickoa: I just have not thought of it. Please see my edit. Your solution is a bit faster but doesn't preserve the order of the columns (but this isn't important in most cases). — sgibb, May 05 '13 at 15:34

score 31 · Answer 3 · answered Aug 09 '18 at 14:22


Many presented their ideas, but I think this is the shortest and simplest code for this task:

data$ID <- 1:nrow(data)

One line. The one and only.

Eric Lino

2

True, but if your data has 0 rows, then I guess you have no data at all. Therefore, why would you need to create an ID for it? – Eric Lino Aug 17 '18 at 22:51
3

In my case, it was inside a function call where the `dataframe` is passed as an argument and is not known in advance. Could have 10 rows one time, 0 the next. – James Hirschorn Aug 17 '18 at 22:55
1

This worked perfectly for me. Used arrange() first, and then applied 1:nrow() creating a new variable of sequential IDs. Thank you for this simple solution. – amsloa Jul 14 '19 at 11:52
`data <- cbind(data, 1:nrow(data))` and then followed by `names(data)[names(data)=="1:nrow(data)"] <- "ID"` would be the [Wikibooks](https://en.wikibooks.org/wiki/R_Programming) way of doing it. – PolII Aug 10 '21 at 12:34
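The zero-row case raised in the comments above can be sketched as follows. The data frame `empty` is a made-up example; the point is that `1:nrow()` counts down when there are no rows, whereas `seq_len()` returns an empty sequence.

```r
# An empty data frame, as might be passed into a function
empty <- data.frame(V1 = numeric(0), V2 = numeric(0))

# 1:nrow() counts down from 1 to 0 instead of producing an empty sequence
1:nrow(empty)         # 1 0

# seq_len() returns a zero-length vector, so the assignment succeeds
empty$ID <- seq_len(nrow(empty))
nrow(empty)           # 0
```

With `1:nrow(empty)` the same assignment would fail, because two values cannot be fitted into zero rows.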

score 25 · Answer 4 · answered Oct 23 '14 at 20:45


You could also do this using dplyr:

DF <- mutate(DF, id = rownames(DF))

WhiskeyGolf

3

There is a **big** assumption that rownames are numeric `1:n`. – zx8754 Oct 11 '21 at 10:42
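A small sketch of why that assumption matters, using a subset of sgibb's example data (the column names `id_from_names` and `id_from_seq` are only illustrative). After filtering with base R, the row names keep their original values and are character strings.

```r
d <- data.frame(V1 = c(23, 45, 56), V2 = c(45, 45, 67))

# Filtering with base R keeps the original row names
sub <- d[d$V1 > 30, ]
rownames(sub)        # "2" "3"

# Row names are character, and here they no longer start at 1
sub$id_from_names <- rownames(sub)
sub$id_from_seq   <- seq_len(nrow(sub))
```

Here `id_from_names` holds "2" and "3", while `id_from_seq` holds 1 and 2.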

score 11 · Answer 5 · edited Jun 18 '18 at 12:07


data.table solution

Easier syntax and much faster

library(data.table)

dt <- data.table(V1=c(23, 45, 56), V2=c(45, 45, 67))

setnames(dt, c("V2", "V3")) # changing column names
dt[, V1 := .I] # Adding ID column
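Note that `:=` appends the new column at the end. If the ID should come first without renaming the existing columns, one option is `setcolorder()`. This is a sketch assuming a column named `ID`, which is not part of the answer above.

```r
library(data.table)

dt <- data.table(V1 = c(23, 45, 56), V2 = c(45, 45, 67))

dt[, ID := .I]          # add the row number by reference (appended last)
setcolorder(dt, "ID")   # move ID to the front, also by reference
names(dt)               # "ID" "V1" "V2"
```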

edited Jun 18 '18 at 12:07

Jaap


answered Nov 15 '17 at 12:53

altabq


score 6 · Answer 6 · edited Jun 18 '18 at 12:39


Hope this helps. The shortest and best way to create an ID column is:

dataframe$ID <- seq.int(nrow(dataframe))

edited Jun 18 '18 at 12:39

Jaap


answered Nov 07 '17 at 09:18

mehakVT


Andrew McCartney · Answer 7 · 2020-11-03T19:49:08.890

5

If you're starting without named rows in your df, the tidy way is:

df %>% 
  mutate(id = row_number()) %>% 
  select(id, everything())

edited Nov 03 '20 at 19:49

answered Nov 03 '20 at 13:58

Andrew McCartney


How is this different compared to [this answer](https://stackoverflow.com/a/50909550/2204410)? – Jaap Sep 25 '22 at 09:27

score 4 · Answer 8 · answered Apr 13 '20 at 18:49


Here is a solution that keeps the dplyr piping format and places id in the first column, which may be preferred.

d %>% 
  mutate(id = rownames(.)) %>% 
  select(id, everything())

Jope

3

Suggestion: `relocate(id)` instead of the select statement is more concise. – hyman Mar 09 '22 at 08:02
How is this different compared to [this answer](https://stackoverflow.com/a/26537145/2204410)? – Jaap Sep 25 '22 at 09:29
There is now a `.before` in mutate to control where your new column appears (and not default to far right). So you can do within the mutate: `d %>% mutate(id = rownames(.), .before=everything()) ` or if you don't like the rownames use `id = row_number()` as per Andrew McCartney's solution. – micstr Feb 16 '23 at 08:16
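The two suggestions from these comments can be sketched side by side. This assumes dplyr 1.0.0 or later and uses `row_number()` rather than row names; the names `a` and `b` are only illustrative.

```r
library(dplyr)

d <- data.frame(V1 = c(23, 45, 56), V2 = c(45, 45, 67))

# .before places the new column at the given position
a <- d %>% mutate(id = row_number(), .before = 1)

# relocate() moves the named column to the front by default
b <- d %>% mutate(id = row_number()) %>% relocate(id)

identical(names(a), names(b))  # TRUE
```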

score 0 · Answer 9 · answered Feb 23 '21 at 18:29


The function rownames_to_column() moves row names into a column; it is found in the tibble package, which is part of the tidyverse.

rownames_to_column(DF, "my_column_name")

Use column_to_rownames() for the reverse operation.

Tobi Obeck

score 0 · Answer 10 · answered May 04 '22 at 19:05


If your data set is not too large, this will work:

library(tibble)

# Load sample data
Dt1 <- tibble(V1 = c(23, 45, 56), V2 = c(45, 45, 67))
# Create a separate tibble with row numbers
Dt2 <- tibble(id = seq_len(nrow(Dt1)))
# Bind the columns together
Dt3 <- cbind(Dt2, Dt1)

Harold Henson
