I have a dataset like this, but much larger:

ds2
  Event                         act      me
  <fct>                       <dbl>   <dbl>
1 Labour Costs YoY             2.33  0.0264
2 Unemployment Change (000's) -5.17 -0.449 
3 Unemployment Rate            8.86  0.0900
4 Jobseekers Net Change       11.3   9.57 

The problem is that the `me` values are on the wrong scale: the first should be 2.64 (multiplied by 100), the second needs to be multiplied by 10, the third by 100, and the fourth should stay the same, so that `act` and `me` have the same order of magnitude.

Is there a way to automatically make R identify and correct this? Thanks in advance.

To replicate the dataset:

ds2 <- structure(list(Event = structure(2:5, .Label = c("Event", "Labour Costs YoY", 
   "Unemployment Change (000's)", "Unemployment Rate", "Jobseekers Net Change"), 
     .Names = c("", "", "", ""), class = "factor"), act = c(2.33230769230769, -5.17018867924528, 
    8.86180371352785, 11.3192307692308), me = c(0.0263725490196078, 
     -0.449056603773585, 0.0899796195652174, 9.56704545454545)), row.names = c(NA, 
     -4L), class = c("tbl_df", "tbl", "data.frame"))
Jurgen

1 Answer

If the corrected value should have about the same order of magnitude as `act`, you can take the base-10 logarithm of the ratio between the two values, round it, and use the result as the power of 10 to multiply `me` by:

# Power of 10 separating act from me, rounded to the nearest whole number
ds2$conversion <- round(log10(ds2$act / ds2$me))
ds2$me.new <- ds2$me * 10^ds2$conversion

This results in the data.frame:

> ds2
                        Event       act          me conversion    me.new
1            Labour Costs YoY  2.332308  0.02637255          2  2.637255
2 Unemployment Change (000's) -5.170189 -0.44905660          1 -4.490566
3           Unemployment Rate  8.861804  0.08997962          2  8.997962
4       Jobseekers Net Change 11.319231  9.56704545          0  9.567045
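
As a hedged aside on why the rounding step behaves this way: `round(log10(ratio))` picks the nearest whole power of 10, so the decision flips where the ratio crosses sqrt(10), roughly 3.16 (or 31.6, 316, and so on). The ratios below are made up to straddle that boundary; they are not taken from the dataset.

```r
# Hypothetical act/me ratios chosen to sit on either side of sqrt(10), 10 * sqrt(10), ...
ratios <- c(1.2, 3.1, 3.2, 9.8, 31, 32, 98)

# Nearest whole power of 10 for each ratio
conversion <- round(log10(ratios))

# Show the ratios next to the power of 10 that would be applied
print(data.frame(ratios, conversion))
```

Ratios just below and just above 3.16 land on different powers of 10, so values that differ by about that factor deserve a manual check.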
Martin Wettstein
  • Thank you @MartinWettstein for your answer. This works for the first four rows but not for all the variables in my full sample. Sometimes the `conversion` is negative (this should be an easy fix), and sometimes the conversion is simply incorrect (I don't know where this comes from). – Jurgen Jun 09 '21 at 14:56
  • A negative conversion should work fine. It just means that `me` has to be divided by a power of 10, which this code already handles. There are two possible sources of error: if the signs of `act` and `me` differ, the logarithm of their ratio is undefined (R returns `NaN`), and if the two values differ by a factor of around 3 (or 30, or 300), the conversion is unreliable because the log of the ratio sits near the rounding boundary. – Martin Wettstein Jun 09 '21 at 15:02
  • Never mind. It works! I made an error when rewriting the code. Thank you! – Jurgen Jun 09 '21 at 15:04
  • However, I just noticed the problem you pointed out about a factor of 3 and the differing signs. Perhaps I can split up the dataset and handle those cases manually? – Jurgen Jun 09 '21 at 15:12
  • You could also compute a second conversion column by taking the log of each of the two columns separately, rounding, and taking the difference. If the two conversion values agree, you are most probably correct. If they do not, you can decide manually. The problem with the signs can be fixed by simply taking absolute values. However, you still have to decide whether, for example, 1.02 and 9.98 are on the same scale or whether a conversion is necessary. – Martin Wettstein Jun 09 '21 at 16:08
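
The cross-check suggested in the comments could be sketched roughly as follows. This is only one reading of that suggestion: the column names `conv_ratio`, `conv_split` and `agree` are made up for illustration, the data values are shortened from the question, and zeros or missing values in `act` or `me` are not handled.

```r
# Data from the question (values shortened for brevity)
ds2 <- data.frame(
  Event = c("Labour Costs YoY", "Unemployment Change (000's)",
            "Unemployment Rate", "Jobseekers Net Change"),
  act = c(2.332308, -5.170189, 8.861804, 11.319231),
  me  = c(0.02637255, -0.44905660, 0.08997962, 9.56704545)
)

# First estimate: rounded log of the ratio; abs() avoids NaN when the signs differ
ds2$conv_ratio <- round(log10(abs(ds2$act) / abs(ds2$me)))

# Second estimate: round the log of each column separately, then take the difference
ds2$conv_split <- round(log10(abs(ds2$act))) - round(log10(abs(ds2$me)))

# Rows where both estimates agree are probably safe; the rest need a manual look
ds2$agree <- ds2$conv_ratio == ds2$conv_split

# Rescale only the rows where the estimates agree; leave the others as NA
ds2$me.new <- ifelse(ds2$agree, ds2$me * 10^ds2$conv_ratio, NA_real_)

print(ds2)
```

For the four example rows both estimates agree. A pair such as `act = 4` and `me = 0.2` would be flagged instead, since the ratio gives a conversion of 1 while the separate logs give 2.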