For a side project, I am trying to build a dashboard from this data (a kind of scatterplot) that shows dates and how prices change over time. I am currently at one of the hardest stages: data cleaning.
I am trying to split some variables in order to extract information. For example, I want to split the variable departure_info into departure_info_time and departure_info_day. Furthermore, from the variable price I want to extract the numbers, such as 358, 480, 590, etc.
# Sample data; each departure_info / price cell holds one or more quoted values
df1 <- data.frame(depart = c("OSL", "WAW", "VIE", "MUC", "FRA"),
                  destination = c("KEF", "ARN", "RIX", "VCE", "OSL"),
                  departure_info = c("['12:45 am Sa 19 Feb']", "['07:55 am Sa 19 Feb']", "['09:05 am Sa 19 Feb']", "['21:45 am Sa 19 Feb', '15:30 am Sa 19 Feb']", "['10:25 am Sa 19 Feb', '16:10 am Sa 19 Feb', '21:40 am Sa 19 Feb']"),
                  price = c("['358<U+0080>']", "['480<U+0080>']", "['590<U+0080>']", "['354<U+0080>', '418<U+0080>']", "['249<U+0080>', '249<U+0080>', '249<U+0080>', '419<U+0080>']"))
I would appreciate it if someone could help me. I tried str_extract() and gsub(), but I did not succeed. I would also be grateful for advice on how to handle the case where one row contains several prices while another row contains just one.
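To make the goal concrete, here is a minimal base R sketch of the kind of result I am after. It is not a confirmed solution, only an illustration under some assumptions: the times always look like "HH:MM am/pm", every value sits between single quotes, and the price digits come directly after the opening quote (so the "0080" inside the mis-encoded currency symbol is not picked up). The names df_small, extract_all and long_prices are hypothetical and exist only for this example; df_small is a two-row subset of the data above.

```r
# Two-row subset of the sample data (hypothetical name: df_small)
df_small <- data.frame(
  depart = c("OSL", "MUC"),
  departure_info = c("['12:45 am Sa 19 Feb']",
                     "['21:45 am Sa 19 Feb', '15:30 am Sa 19 Feb']"),
  price = c("['358<U+0080>']", "['354<U+0080>', '418<U+0080>']"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: all matches of a Perl-style regex, one list element per row
extract_all <- function(x, pattern) {
  regmatches(x, gregexpr(pattern, x, perl = TRUE))
}

# Times: "HH:MM am/pm" directly after an opening quote
times <- extract_all(df_small$departure_info, "(?<=')[0-9]{1,2}:[0-9]{2} [ap]m")

# Days: everything after the am/pm marker up to the closing quote
days <- extract_all(df_small$departure_info, "(?<=[ap]m )[^']+")

# Prices: digits directly after an opening quote, converted to numbers
prices <- lapply(extract_all(df_small$price, "(?<=')[0-9]+"), as.numeric)

# Rows with several prices: repeat the row key once per price (long format)
long_prices <- data.frame(
  depart = rep(df_small$depart, lengths(prices)),
  price = unlist(prices)
)
```

Keeping the results as lists (one element per original row) and only then expanding to a long table seems to be one way to cope with rows that hold a different number of values. Since the number of prices and departures in a row does not always match in my data, the sketch expands the prices on their own rather than pairing them with departure times.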
Thank you for your help :)