I am having trouble reading a CSV file whose numbers are in an unusual format: a decimal comma and a trailing minus sign. I would like to read these values into R as numbers.

I read the CSV file into a data frame with read.csv in the usual way.

The problem is that one of the columns is read as a factor.

Example CSV file:

713,78-;713,78;577,41-;577,41;123,82-;123,82

After I read it into a data frame, the result is:

[1] 713,78- 713,78  577,41- 577,41  123,82- 123,82 
6 Levels: 713,78- 713,78  577,41- 577,41  123,82- 123,82  

In the case illustrated above, I would like the following output:

[1] -713.78  713.78 -577.41  577.41 -123.82  123.82

where the column would be of class numeric.

N8TRO

2 Answers

This should work in general:

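# Convert strings such as "713,78-" to numbers such as -713.78.
fixData <- function(x)
{
  x <- gsub(',', '.', x)                   # decimal comma -> decimal point
  trailing <- grepl('-$', x)               # values with a trailing minus sign
  x[trailing] <- paste0('-', x[trailing])  # prepend the sign
  x <- as.numeric(sub('-$', '', x))        # drop the trailing sign, then convert
  return(x)
}
# `file` is the path to the CSV file; read.csv2 expects ';' separators and ',' decimals.
myData <- read.csv2(file, stringsAsFactors = FALSE)
fixedData <- sapply(myData, fixData)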
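
As a quick check of this idea, here is a minimal sketch applied to the sample line from the question. It rests on a few assumptions that are not in the original post: the line is treated as a single row with no header, `text =` stands in for a real file path, and `to_number` is a hypothetical helper that moves the trailing sign with a single backreference instead of two steps.

```r
# Sample line from the question; `text =` stands in for a real file path.
csv_line <- "713,78-;713,78;577,41-;577,41;123,82-;123,82"

# Assumption: no header row. colClasses = "character" keeps the raw strings,
# so nothing is turned into a factor on the way in.
raw <- read.csv2(text = csv_line, header = FALSE, colClasses = "character")

# Hypothetical helper: decimal comma -> point, trailing "-" -> leading "-".
to_number <- function(x) {
  x <- gsub(",", ".", x, fixed = TRUE)
  x <- sub("^(.*)-$", "-\\1", x)
  as.numeric(x)
}

# One value per column; unname() drops the V1..V6 column names.
values <- unname(sapply(raw, to_number))
print(values)
```

Reading the columns as character up front avoids the factor conversion altogether, which may be simpler than repairing the factor afterwards.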
Abderyt

That's an ugly number format.

This should get it to what you want it to be.

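x <- factor(c("713,78-", "713,78", "577,41-", "577,41", "123,82-", "123,82"))

# -1 for values that carry a minus sign, 1 otherwise
scalar <- ifelse(grepl("-", x), -1, 1)
x <- as.character(x)         # factor -> character, so the labels are kept
x <- gsub(",", ".", x)       # decimal comma -> decimal point
x <- gsub("-", "", x)        # drop the minus sign
x <- as.numeric(x) * scalar  # convert, then restore the sign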
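
If the values are already sitting in a data frame as a factor column, the same steps can be applied to that column in place. The sketch below assumes a hypothetical data frame `df` with a column named `amount`; neither name comes from the question. The conversion to character matters because calling as.numeric() directly on a factor returns the internal level codes rather than the printed values.

```r
# Hypothetical data frame; `amount` is an assumed column name.
df <- data.frame(
  amount = factor(c("713,78-", "713,78", "577,41-", "577,41", "123,82-", "123,82"))
)

# Convert to character first: as.numeric() on a factor would return
# the integer level codes, not the values shown on screen.
txt <- as.character(df$amount)

# -1 where the string ends in "-", 1 otherwise.
sign_mult <- ifelse(grepl("-$", txt), -1, 1)

# Drop the trailing sign, switch the decimal comma to a point, then convert.
df$amount <- as.numeric(gsub(",", ".", sub("-$", "", txt), fixed = TRUE)) * sign_mult

str(df$amount)
```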
Benjamin