I have a column with strings:

name
aldrinas63_rios200_2001
sa_c.fr.1234

I want to count the number of digits in each cell. I have used the following code:

str_count(data$name, '\\d+')

But I am getting this output:

name                    output_I_get
aldrinas63_rios200_2001  3
sa_c.fr.1234             1

But my desired output is as follows:

name                     output
aldrinas63_rios200_2001   9
sa_c.fr.1234              4

Any help in this regard will be highly appreciated!

user3642360
  • See [this post](https://stackoverflow.com/questions/42394489/count-the-number-of-occurrences-of-in-a-string), which uses `str_count` from the `stringr` package: `str_count(data$name, '\\d')` – EJJ Sep 21 '18 at 16:18
  • So you can conclude from @akrun's answer that the plus sign in `'\\d+'` is the cause of your error. Try it without it. – Rui Barradas Sep 21 '18 at 16:52
  • When you are using `str_count(data$name, '\\d+')` you are counting how many [multi-digit] numbers there are. If you do `str_count('200', '\\d+')` the answer is 1 [multi-digit] number. If you do `str_count('200', '\\d')` the answer is 3 single-digits. – Adam Sampson Sep 21 '18 at 17:12

2 Answers

We can remove every character that is not a digit and count what remains:

nchar(gsub("[^0-9]+", "", data$name))
#[1] 9 4

Or, if we are using str_count, drop the +. With +, the pattern matches runs of one or more consecutive digits, so for the first element of 'name' it counts 63 as the first match, 200 as the second, and 2001 as the third. Without it, each digit is matched separately:

library(stringr)
str_count(data$name, "[0-9]")
#[1] 9 4
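
To see what each pattern actually matches, here is a small base R sketch (no packages needed). The vector `name` simply repeats the strings from the question, and `runs` and `digits` are names chosen only for this illustration.

```r
# Same strings as in the question
name <- c("aldrinas63_rios200_2001", "sa_c.fr.1234")

# "\\d+" matches each run of consecutive digits as one match
runs <- regmatches(name, gregexpr("\\d+", name))
runs[[1]]
#[1] "63"   "200"  "2001"
lengths(runs)
#[1] 3 1

# "\\d" matches every digit on its own
digits <- regmatches(name, gregexpr("\\d", name))
lengths(digits)
#[1] 9 4
```

The first count is the number of digit runs (what the question's code returned); the second is the number of individual digits (the desired output).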

data

data <- structure(list(name = c("aldrinas63_rios200_2001", "sa_c.fr.1234"
 )), class = "data.frame", row.names = c(NA, -2L))
akrun

Try this:

nchar(gsub("\\D", "", data$name))

Example

s <- c("aldrinas63_rios200_2001","sa_c.fr.1234")

nchar(gsub("\\D", "", s))
#[1] 9 4
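
If the counts should sit next to the names, as in the question's desired table, one possible sketch is to store them in a new column. The column name `output` is taken from the question; the data frame is rebuilt here only so the snippet runs on its own.

```r
# Rebuild the example data from the question
data <- data.frame(name = c("aldrinas63_rios200_2001", "sa_c.fr.1234"),
                   stringsAsFactors = FALSE)

# Strip every non-digit, then count the characters that are left
data$output <- nchar(gsub("\\D", "", data$name))

data$output
#[1] 9 4
```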
989