
Hi all, I have the following data frame, DB:

ID Distance
M1_PRM    54,56
M1_PRM  4147,69
M1_PRM  1723,34

I use the following script to replace "," with "." in Distance, because R does not accept "," as a decimal separator (and it works):

mysub<-function(x)(sub(",",".",x))
DB<-(apply(DB, 2,mysub))
DB<-data.frame(DB)

Then I need to convert DB$Distance with as.numeric, because I want to use tapply together with sum, like:

apply(DB$Distance,ID,sum)

When I run

DB$Distance<-as.numeric(DB$Distance)

ID Distance
M1_PRM 54
M1_PRM 4147
M1_PRM 1723

It seems that R discards the decimals! Does anyone know what is wrong? Thanks in advance!

stefano

3 Answers


Another approach (if you're reading this in from a file):

dat <- read.table(text = "ID Distance
 M1_PRM    54,56
 M1_PRM  4147,69
 M1_PRM  1723,34",header = TRUE,sep = "",dec = ",")
> dat
      ID Distance
1 M1_PRM    54.56
2 M1_PRM  4147.69
3 M1_PRM  1723.34
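
For an actual file on disk, the same idea can be sketched as follows. This is only an illustration: the temporary file stands in for whatever file the data really lives in, and the variable name tmp is made up for the example.

```r
# Write the sample data to a temporary file (a stand-in for the real input file).
tmp <- tempfile(fileext = ".txt")
writeLines(c("ID Distance",
             "M1_PRM    54,56",
             "M1_PRM  4147,69",
             "M1_PRM  1723,34"), tmp)

# dec = "," tells read.table that the comma is the decimal separator,
# so Distance is parsed as numeric at import time.
dat <- read.table(tmp, header = TRUE, dec = ",")

str(dat$Distance)  # a numeric vector, decimals intact
unlink(tmp)        # clean up the temporary file
```

With the column parsed as numeric on the way in, no sub() or as.numeric() step is needed afterwards.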
joran

@joran's answer is the way to go if you're reading DB in with read.table or read.csv. Otherwise, there's type.convert, which takes a dec argument.

type.convert(as.character(DB$Distance), dec = ",")
# [1]   54.56 4147.69 1723.34

Drop the as.character if Distance is already a character vector.
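
A self-contained version of the same call, as a small sketch. The vector x is invented sample data matching the question, and as.is = TRUE is added here because recent versions of R ask for that argument to be given explicitly; it is not part of the original call above.

```r
# Sample values as character strings with comma decimals (made up to match the question).
x <- c("54,56", "4147,69", "1723,34")

# type.convert picks a suitable type for the strings; dec = "," sets the decimal mark.
# as.is = TRUE keeps any non-numeric result as character instead of making a factor.
num <- type.convert(x, dec = ",", as.is = TRUE)

class(num)  # "numeric"
sum(num)
```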

Matthew Plourde
  • Really, really useful suggestions... I am starting to understand that sometimes there is an easy way to get what we want from R! Many thanks to all! – stefano Jan 18 '13 at 17:30
  • @stefano, glad to help. Don't forget to accept one of the answers here before you go, so the open question is resolved. – Matthew Plourde Jan 18 '13 at 17:35

R is discarding the decimals because the apply call is going about the conversion the wrong way. Instead, try:

> DB$Distance <- as.numeric(sub(",",".",DB$Distance))
> sapply(DB, class)
       ID  Distance 
 "factor" "numeric" 
> DB
      ID Distance
1 M1_PRM    54.56
2 M1_PRM  4147.69
3 M1_PRM  1723.34

Then use tapply as in:

with(DB, tapply(Distance, ID, sum))

Your apply(DB$Distance,ID,sum) will not work; use tapply(DB$Distance, DB$ID, sum) instead. The correct function is tapply, and you have to give it a numeric vector and an index. Both are columns of DB, so R won't find ID unless you use with(.) or write DB$ID.

See ?apply and ?tapply.
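
Putting the conversion and the grouped sum together, a minimal end-to-end sketch might look like this. The data frame is rebuilt by hand from the question's three rows, and stringsAsFactors = TRUE is set only to mimic the factor columns shown in the output above; both are assumptions made for the example.

```r
# Rebuild the question's data by hand, with Distance still holding comma decimals.
DB <- data.frame(
  ID = c("M1_PRM", "M1_PRM", "M1_PRM"),
  Distance = c("54,56", "4147,69", "1723,34"),
  stringsAsFactors = TRUE
)

# Swap the decimal comma for a point, then convert the resulting strings to numeric.
DB$Distance <- as.numeric(sub(",", ".", DB$Distance, fixed = TRUE))

# tapply takes the values, the grouping index, and the function to apply per group.
totals <- tapply(DB$Distance, DB$ID, sum)
totals
```

with(DB, tapply(Distance, ID, sum)) gives the same result; it only saves typing DB$ twice.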

I have just tried to answer according to your post. @joran's answer is the more direct way to go if you're importing the data from a file; in that case, the whole problem reduces to using dec = "," in the read.table call.

Jilber Urbina