
I have data stored in rows as comma-separated strings such as "XXX-XX-0001, YY-YY-0001", and I want a new column that gives the number of entries in each row (2 in this example).

I have managed to mutate a new column; however, the output is a list of character vectors, displayed as chr [2]. I need this to be just 2.

bill <- bill %>%
  mutate(NO_IA = strsplit(as.character(IA_YES), ","))

When I try to use as.numeric, it doesn't like that my input contains ",". If I try to combine as.numeric and as.character in the same line, it rejects that too.

Kampai
HP.

1 Answer


After some clarification, here is a better answer:

data (from the comment)

string <- scan(text = "
AAA-GB-0001 
BBB-ES-0005,ADD-GB-0001 
BSC-ES-0005,HQQ-GB-0001,REE-GB-0001 
BDD-GB-0001,BSC-ES-0005,HQQ-GB-0001,UZZ-DE-0001 
BDD-GB-0001,UEE-DE-0001 
BDD-GB-0001,BRE-EE-0005,CTT-DE-0002,LZZ-DE-0011,UZZ-DE-0001", 
               what = character(), sep = "\n")

library(dplyr)
bill <- tibble(IA_YES = string)

Next time it would make sense to provide some example data, for example by using dput() (in this case, copy the result of dput(bill) into the question).
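As a rough illustration of the dput() suggestion, the sketch below builds a small, hypothetical data frame (the name `example` and its two rows are made up for this purpose, and a base data.frame is used instead of a tibble only to avoid extra dependencies), prints the code that recreates it, and then rebuilds the object from that printed code.

```r
# Hypothetical two-row example; the values are illustrative only.
example <- data.frame(
  IA_YES = c("AAA-GB-0001", "BBB-ES-0005,ADD-GB-0001"),
  stringsAsFactors = FALSE
)

# dput() prints R code that recreates the object.
# That printed code is what you would paste into a question.
dput(example)

# Round trip: capture the printed code and evaluate it to rebuild the object.
code <- capture.output(dput(example))
rebuilt <- eval(parse(text = code))
```

Anyone reading the question can then run the pasted code and work with exactly the same data.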

solution

Note that the strsplit command in your code creates a list. The list is stored in the newly created column and can be used as any other list in R. We can use the purrr package to operate on lists, which provides better versions of R's *apply functions:

library(purrr)  # provides map_int()
bill %>%
  mutate(NO_IA = strsplit(as.character(IA_YES), ",")) %>% 
  mutate(length = map_int(NO_IA, length))
#> # A tibble: 6 x 3
#>   IA_YES                                                    NO_IA    length
#>   <chr>                                                     <list>    <int>
#> 1 "AAA-GB-0001 "                                            <chr [1~      1
#> 2 "BBB-ES-0005,ADD-GB-0001 "                                <chr [2~      2
#> 3 "BSC-ES-0005,HQQ-GB-0001,REE-GB-0001 "                    <chr [3~      3
#> 4 "BDD-GB-0001,BSC-ES-0005,HQQ-GB-0001,UZZ-DE-0001 "        <chr [4~      4
#> 5 "BDD-GB-0001,UEE-DE-0001 "                                <chr [2~      2
#> 6 BDD-GB-0001,BRE-EE-0005,CTT-DE-0002,LZZ-DE-0011,UZZ-DE-0~ <chr [5~      5

A short explanation of map_int(NO_IA, length): all map functions work the same way. You provide a list (or a vector that can be converted to a list) and a function to apply to each element. In this case we measure the length() of each entry in the list. An alternative way to write it would be map_int(NO_IA, function(x) length(x)). The advantage of purrr over the apply functions is that you control the output type: map_int always returns an integer vector, while map_chr, for example, returns a character vector.
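For comparison, here is a minimal base R sketch of the same per-element counting, assuming a small hypothetical vector `ids` rather than the real `bill` data. It uses lengths() and vapply(), which are not part of the answer above; vapply() is the base function that, like map_int(), checks the type of each result.

```r
# Hypothetical example data; the values are illustrative only.
ids <- c("AAA-GB-0001",
         "BBB-ES-0005,ADD-GB-0001",
         "BSC-ES-0005,HQQ-GB-0001,REE-GB-0001")

# strsplit() returns a list with one character vector per input string.
parts <- strsplit(ids, ",", fixed = TRUE)

# lengths() returns the length of each list element as an integer vector.
n_base <- lengths(parts)

# vapply() applies length() to each element and enforces an integer result.
n_vapply <- vapply(parts, length, integer(1))

print(n_base)
#> [1] 1 2 3
```

Both results match what map_int(parts, length) would give, so the choice is mostly a matter of whether purrr is already loaded.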

Old answer

You can just replace the comma with a dot before converting it:

library(dplyr)
df <- tibble(num = c("12,3", "10.7"))
df %>% 
  mutate(num = as.numeric(sub(",", ".", num, fixed = TRUE)))
#> # A tibble: 2 x 1
#>     num
#>   <dbl>
#> 1  12.3
#> 2  10.7

More "tidy" version:

library(tidyverse)
df <- tibble(num = c("12,3", "10.7"))
df %>% 
  mutate(num = str_replace(num, fixed(","), ".") %>%  
           as.numeric())
#> # A tibble: 2 x 1
#>     num
#>   <dbl>
#> 1  12.3
#> 2  10.7
JBGruber
  • How do i direct this as my column for the replace. currently you have strings 12,3 , 10.7. – HP. Aug 19 '19 at 12:44
  • I don't understand this sentence. Maybe you include some data in your question, so it is easier to help you. Currently, what I understood is that you have a character column that you need to convert to numeric. – JBGruber Aug 19 '19 at 13:23
  • IA_YES 1.AAA-GB-0001 2. BBB-ES-0005,ADD-GB-0001 3. BSC-ES-0005,HQQ-GB-0001,REE-GB-0001 4 .BDD-GB-0001,BSC-ES-0005,HQQ-GB-0001,UZZ-DE-0001 5. BDD-GB-0001,UEE-DE-0001 6. BDD-GB-0001,BRE-EE-0005,CTT-DE-0002,LZZ-DE-0011,UZZ-DE-0001 I need to be able to count in each numbered row, the number of strings separated by a comma. apologies for the confusion – HP. Aug 19 '19 at 13:47
  • I think I understood now what you need. Take a look. – JBGruber Aug 19 '19 at 14:19
  • Hi JBG - thanks this works, also i was able to resolve another way. `bill <- bill %>% mutate(NO_IA = str_count(IA_YES, ",")+1)` just needed to add another comma count on the end of each to get the same answer (i.e 1 would not have a comma). thank you for your help! – HP. Aug 19 '19 at 14:22
  • Great! I was thinking about `str_count` as well but I thought you needed the list column as well, at which point it makes sense to count how long the list elements actually are. – JBGruber Aug 19 '19 at 14:35
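
The str_count approach from the comments can be sketched as a small helper. The function name `count_ids` and the example vector are hypothetical, and treating an empty string as zero entries is an assumption that goes beyond the one-liner in the comment (which would report 1 for an empty string).

```r
library(stringr)

# Hypothetical example data, including an empty string and a missing value.
ids <- c("AAA-GB-0001", "BBB-ES-0005,ADD-GB-0001", "", NA)

# count_ids() is an illustrative helper, not part of stringr.
count_ids <- function(x) {
  # Number of entries = number of commas + 1; NA inputs stay NA.
  n <- str_count(x, fixed(",")) + 1L
  # Assumption: an empty string contains no entries, so report 0 instead of 1.
  n[!is.na(x) & x == ""] <- 0L
  n
}

print(count_ids(ids))
#> [1]  1  2  0 NA
```

With the real data this would be used as bill %>% mutate(NO_IA = count_ids(IA_YES)), which avoids creating the intermediate list column when only the count is needed.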