I need help with a column in my data frame where each element is a string of comma-separated numbers. I want to extract the numbers from each element and replace the string with their median. For example:
a <- c("3, 3, 5, 5", "7, 7, 5, 5", "3, 4, 4, 5", "5, 7")
b <- c("Karina", "Eva", "Jake", "Ana")
df <- data.frame(b,a)
Now I need to replace variable a with the median of the numbers contained in each element, so that it looks like this:
       b a
1 Karina 4
2    Eva 6
3   Jake 4
4    Ana 6
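
One possible way to get this result, as a minimal sketch in base R: split each string on the commas with strsplit(), convert the pieces to numbers, and take the median per element. It assumes every element contains only numbers separated by commas (with optional spaces); the as.character() call is there only in case column a was read in as a factor.

```r
a <- c("3, 3, 5, 5", "7, 7, 5, 5", "3, 4, 4, 5", "5, 7")
b <- c("Karina", "Eva", "Jake", "Ana")
df <- data.frame(b, a)

# Split each string on a comma followed by optional whitespace.
# as.character() guards against `a` being stored as a factor.
parts <- strsplit(as.character(df$a), ",\\s*")

# Convert each piece to numeric and take the median per element.
df$a <- sapply(parts, function(x) median(as.numeric(x)))

print(df)
```

Any element that contains something other than numbers would become NA (with a warning) at the as.numeric() step, so it may be worth checking for NA values afterwards.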
A little background: each number is the length of a word that belongs to the corresponding name. I need to find the median word length for each name and figure out whether names that start with a vowel have a longer median length or not. For example, from the data above I would conclude that names that start with a vowel (Eva, Ana) have a longer median length than the others (Karina, Jake). I would then like to use a test to check whether that difference is statistically significant. If someone can guide me in any way, I would really appreciate it!
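
For the comparison itself, here is a hedged sketch of one option rather than the definitive method: flag the names that start with a vowel using grepl(), then compare the two groups with a Wilcoxon rank-sum test (wilcox.test()), which compares medians-like location without assuming normality. The column name vowel is made up for illustration, the medians are typed in directly from the expected output above, and the choice of test is an assumption. With only four names the p-value is not meaningful; this only shows the mechanics.

```r
# Per-name medians, taken from the expected output in the question.
df <- data.frame(
  b = c("Karina", "Eva", "Jake", "Ana"),
  a = c(4, 6, 4, 6)
)

# TRUE when the name starts with a vowel (case-insensitive).
# `vowel` is a hypothetical column name.
df$vowel <- grepl("^[aeiou]", df$b, ignore.case = TRUE)

# Median of the per-name medians within each group.
group_medians <- aggregate(a ~ vowel, data = df, FUN = median)
print(group_medians)

# Wilcoxon rank-sum test comparing the two groups.
# exact = FALSE uses the normal approximation, since the data contain ties.
res <- wilcox.test(a ~ vowel, data = df, exact = FALSE)
print(res$p.value)
```

On the real data, passing alternative = "greater" or "less" to wilcox.test() would give a one-sided test if the hypothesis is directional; which direction applies depends on the order of the group levels (FALSE comes before TRUE here).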