2

I am new to R, please have mercy. I imported a table from an Access database via odbc:

df <- select(dbReadTable(accdb_path, name ="accdb_table"),"Col_1","Col_2","Col_3")

For

> typeof(df$Col_3) 

I get

[1] "list"

Using library(dplyr.teradata). I converted blob to string (maybe already on the wrong path here):

df$Hex <- blob_to_string(df$Col_3)

and now end up with a column (typeof = character) full of Hex:

df[1,4] 

[1] 49206765742061206c6f74206f662048657820616e642068617665207468652069737375652077697468207370656369616c2063687261637465727320696e204765726d616e206c616e6775616765206c696b65206e2b4150592d7

My question is, how to convert each value in Col_3 into proper Text (if possible, with respect to German special chracters like ü,ö, ä and ß).

I am aware of this solution How to convert a hex string to text in R?, but can't apply it properly:

df$Text <- rawToChar(as.raw(strtoi(df$Hex, 16L)))

Fehler in rawToChar(as.raw(strtoi(BinData$Hex, 16L))) : 
  Zeichenkette '\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\

Thx!

Sascha
  • 194
  • 7
  • Could you clarify why you would need the `dplyr.teradata` package? What's in that list if you look at it directly e.g. with `df$Col_3[[1]]`? Is it really a blob? – giocomai Dec 13 '21 at 22:41
  • Thank you for your response! `df$Col_3[[1]]` gives me `[1] 49 20 67 65 74 20 61 20 6c 6f 74 20 6f 66 20 48 65 78 20 61 6e 64 20 68 61 76 65 20 74 68 65 20 69 73 73 [49] 49 20 67 65 74 20 61 20 6c 6f 74 20 6f 66 20 48 65 78 20 61 6e 64 20 68 61 76 65 20 74 68 65 20 69 73 73 [97] ...` I am not sure if I need the `dplyr.teradata` package. – Sascha Dec 14 '21 at 08:41
  • `dplyr.teradata` does not seem to do anything useful here... given the little you shared, it just seems it collapses all the elements found there. So... the list inside the first row of `Col_3` has a lot of these two-characters strings, which looks like hexadecimal values. But converting it gives numbers... if that has any meaning, largely depends on what you expect to find there in your database. See e.g. `test <- c("49", "20", "67", "65", "74", "20", "61", "20", "6c", "6f")` `strtoi(test, base = 16L)` – giocomai Dec 14 '21 at 14:26
  • Thanks again for your answer! I expect to find plain text (that is somehow sensible, which is why I can not fully copy it here). `rawToChar(as.raw(strtoi(df$Col_3[[1]],16L)))` gives me that text, but only for the first element. I have no idea how to make it work for the whole column (and to convert it to a new one). `df$Col_3` gives me `[1] blob[547 B]`. – Sascha Dec 14 '21 at 14:38
  • 1
    Oh, now I understand (I think). See tentative answer below. – giocomai Dec 15 '21 at 09:28

1 Answers1

1

If I understand this correctly, what you want to do it to apply a function to each element of a list so that it returns a character vector (that you can add to a data frame, if you so wish).

This can be easily accomplished with the purrr family of functions. The following takes each element df$Col_3 and runs the function (with each element being the x in the given function)

purrr::map_chr(.x = df$Col_3,
               .f = function(x) {rawToChar(as.raw(strtoi(x,16L)))})

You should probably achieve the same with base R functions such as lapply() followed by unlist(), or sapply() but with purrr it's often easier to find inconsistent results.

giocomai
  • 3,043
  • 21
  • 24