0

While writing some dataframe to excel, via asCOMArray, most of the data is fine. But due to encoding problem, some characters like " - " gets converted to : " †" ; "é" gets changed to : "é"

Here lies the code example :

xlApp <- COMCreate("Excel.Application")
xlApp[["Visible"]] <- TRUE
xlApp[["DisplayAlerts"]] <- F
xlApp[["EnableEvents"]] <- F

#add new wb
wb <- xlApp$Workbooks()$Add()
#rename first active sheet
sheet <- xlApp$ActiveSheet()
wbActiveNum <- xlApp$ActiveSheet()[['Index']]
wbSheet <- xlApp$Worksheets()[[wbActiveNum]]
wbSheet[['Name']] <- 'Data'
#save file
Filenm <- paste0(Output_path,"XXX File as Of ",format(Sys.Date() , "%B-%d-%Y"), " .xlsb")
Filenm <- suppressWarnings(normalizePath(Filenm))
wb$SaveAs(Filename = Filenm ,FileFormat = 50)
Sys.sleep(2)

sheet <- wb$Worksheets("Data")
I1 <- as.numeric(ncol(DF))%%26                                     
I2 <- round(as.numeric(ncol(DF))/26)                             
LC <- toupper(paste0(letters[as.numeric(I2)], letters[as.numeric(I1)]))
SC <- toupper(letters[1])
range<-sheet$Range(paste0(SC,"14:",LC,nrow(DF)+13))
range[["Value"]] <- asCOMArray(DF)
try( range[["Borders"]][["ColorIndex"]] <- 1 , silent = T )
Font <- range$Font()
Font[["Size"]] <- 9
Font[["Name"]] <- "Calibri"
EntCol <-range$EntireColumn()
EntCol$AutoFit() 

Now "DF" is my dataframe, which gets pasted successfully, but generates special characters at various places, as shown above.

Is there a way to change the encoding to "latin9" or "latin1" to help overcome these special characters generation.

Can RDCOMclient change the encoding type before writing data to excel.

(above is a random code chunk, used to write data to excel sheet)

  • A [mojibake](https://en.wikipedia.org/wiki/Mojibake) case on characters `‐` (U+2010, *Hyphen*) and `é` (U+00E9, *Latin Small Letter E With Acute*). Proof (in Python): `print("‐|é".encode( 'utf-8').decode( 'cp1252', 'ignore'))` gives `â€|é`. In fact, `print("‐|é".encode( 'utf-8').decode( 'cp1252', 'backslashreplace'))` returns `â€\x90|é`: you can see the unprintable character `\x90` there: `` (U+0090, *Device Control String*). – JosefZ Dec 10 '21 at 17:04
  • this doesn't help – sayan chakraborty Dec 11 '21 at 14:10
  • Could you provide the content of the variable DF? I can't replicate your problem – Emmanuel Hamel Sep 27 '22 at 20:27
  • On my end, If I define the variable DF as follows : DF <- matrix(c("-", "é"), nrow = 10, ncol = 1) DF <- as.data.frame(DF), the code works well. There is no encoding problem – Emmanuel Hamel Oct 01 '22 at 11:48

1 Answers1

0

You have to convert the text from UFT-8 to latin1. I have been able to write special caracters properly in the Excel file with the following code :

library(RDCOMClient)
xlApp <- COMCreate("Excel.Application")
xlApp[["Visible"]] <- TRUE
xlApp[["DisplayAlerts"]] <- FALSE
xlApp[["EnableEvents"]] <- FALSE

wb <- xlApp$Workbooks()$Add()
sheet <- xlApp$ActiveSheet()
wbActiveNum <- xlApp$ActiveSheet()[['Index']]
wbSheet <- xlApp$Worksheets()[[wbActiveNum]]
wbSheet[['Name']] <- 'Data'

y <- c("é", "ê")
y <- iconv(y, from = "UTF-8", to = "latin1")
x <- c("ô", "û")
x <- iconv(x, from = "UTF-8", to = "latin1")

DF <- data.frame(y = y, x = x)

sheet <- wb$Worksheets("Data")
range <- sheet$Range("A1:B2")
range[["Value"]] <- asCOMArray(DF)
Emmanuel Hamel
  • 1,769
  • 7
  • 19