I've written a small function that returns the categories of the ICD-10, since I use them frequently. The function works as expected; however, when I try to integrate it into my package, R CMD check gives me the following warning. I replaced the German umlauts 'ö', 'ä', 'ü' with the Unicode notation \uxxxx, but that does not seem to help. Did I miss some other non-ASCII character? I cannot seem to find it.

R CMD Check warning

W  checking R files for non-ASCII characters ... 
   Found the following file with non-ASCII characters:
     ICD_10.R
   Portable packages must use only ASCII characters in their R code,
   except perhaps in comments.
   Use \uxxxx escapes for other characters.

Function

#' Get ICD-10 Codes as Character Vector
#' @description Returns a character vector of length 11 with all ICD-10 categories
#' @param lang Language for the character vector, currently available in English and German (lang = "ger"). Default: "eng"
#' @return Character vector containing the 11 categories for mental disorders (F codes F01-F99)
#'
#' @author Bjoern 
#'
#' @examples
#' get_ICD_10_cats() # returns english ICD-10 Cats
#' get_ICD_10_cats("ger") # returns the german ones
#' @export
get_ICD_10_cats <- function(lang="eng") {
  eng <- c("F01-F09 Mental disorders due to known physiological conditions",
                    "F10-F19 Mental and behavioral disorders due to psychoactive substance use",
                    "F20-F29 Schizophrenia, schizotypal, delusional, and other non-mood psychotic disorders",
                    "F30-F39 Mood \u005Baffective\u005D disorders",
                    "F40-F48 Anxiety, dissociative, stress-related, somatoform and other nonpsychotic mental disorders",
                    "F50-F59 Behavioral syndromes associated with physiological disturbances and physical factors",
                    "F60-F69 Disorders of adult personality and behavior",
                    "F70-F79 Intellectual disabilities",
                    "F80-F89 Pervasive and specific developmental disorders",
                    "F90-F98 Behavioral and emotional disorders with onset usually occurring in childhood and adolescence",
                    "F99-F99 Unspecified mental disorder")

  ger <-  c(
    "F01-F09 Organische, einschließlich symptomatischer psychischer St\u00F6rungen",
    "F10-F19 Psychische und Verhaltensst\u00F6rungen durch psychotrope Substanzen",
    "F20-F29 Schizophrenie, schizotype und wahnhafte St\u00F6rungen",
    "F30-F39 Affektive St\u00F6rungen",
    "F40-F48 Neurotische, Belastungs- und somatoforme St\u00F6rungen",
    "F50-F59 Verhaltensauff\u00E4lligkeiten mit k\u00F6rperlichen St\u00F6rungen und Faktoren",
    "F60-F69 Pers\u00F6nlichkeits- und Verhaltensst\u00F6rungen",
    "F70-F79 Intelligenzst\u00F6rung",
    "F80-F89 Entwicklungsst\u00F6rungen",
    "F90-F98 Verhaltens- und emotionale St\u00F6rungen mit Beginn in der Kindheit und Jugend",
    "F99-F99 Nicht n\u00E4her bezeichnete psychische St\u00F6rungen"
  )
  if(tolower(lang) %in% c("ger", "de")) return(ger) else return(eng)

}

Edit: Solution in a Nutshell

Thanks to Dirk Eddelbuettel and the dang package, there is a convenient way to find non-ASCII characters in your package:

remotes::install_github("eddelbuettel/dang")
dang::checkPackageAsciiCode(dir = ".")

This returns the non-ASCII character one has missed, in my case ß, which can be replaced by "\u00DF".

Björn

1 Answer

Two things:

  • With R 4.2.* and consistent use of UTF-8, this may no longer be an issue if UTF-8 encoding is declared; it may be worth a try.

  • Finding such an offending non-ASCII character can be a pain; at one point in 2020 I needed that and extracted the base R code into a function, checkPackageAsciiCode(), in my dang package, which contains a (somewhat random) collection of functions.
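
For a quick look without installing anything, the same idea can be sketched in base R. The helper name `find_non_ascii_lines` below is hypothetical (it is not part of dang or base R), and this is only an illustration of the approach, not the code that checkPackageAsciiCode() actually runs.

```r
# Hypothetical helper: report the lines of a file that contain non-ASCII bytes.
find_non_ascii_lines <- function(path) {
  lines <- readLines(path, warn = FALSE)
  # Interpret the bytes as latin1 and try to convert them to ASCII:
  # a line containing any byte above 127 cannot be converted and yields NA.
  asc <- iconv(lines, from = "latin1", to = "ASCII")
  # Also flag lines that iconv altered instead of rejecting.
  bad <- which(is.na(asc) | asc != lines)
  data.frame(line = bad, text = lines[bad], stringsAsFactors = FALSE)
}

# Demo on a temporary file: the first line uses an escape (pure ASCII on disk),
# the second line contains a literal sharp s.
tmp <- tempfile(fileext = ".R")
writeLines(
  c('x <- "St\\u00F6rungen"', 'y <- "einschlie\u00DFlich"'),
  tmp,
  useBytes = TRUE
)
hits <- find_non_ascii_lines(tmp)
print(hits$line)
```

Base R also ships `tools::showNonASCIIfile()`, which prints the offending lines of a single file, so something like `tools::showNonASCIIfile("R/ICD_10.R")` may already be enough for a one-off check.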

If you have your package on a public repo (GitHub maybe?) I can take a closer look.

Tschoe mit oe, but in 7bit

Edit: Otherwise, and from just eyeballing, you have a remaining 'ß' in 'einschließlich' that you may want to replace with an \uxxxx sequence.
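
If looking up code points by hand gets tedious, a small base R helper can generate the escapes. This is a minimal sketch; `escape_non_ascii` is a hypothetical name introduced here for illustration, and it only handles a single string at a time.

```r
# Hypothetical helper: replace every non-ASCII character in a string
# with its \uXXXX escape so the result can be pasted into R source.
# Note: code points above U+FFFF would need the \UXXXXXXXX form instead,
# which this sketch does not handle.
escape_non_ascii <- function(x) {
  stopifnot(is.character(x), length(x) == 1L)
  cps <- utf8ToInt(enc2utf8(x))             # Unicode code points
  chars <- intToUtf8(cps, multiple = TRUE)  # one string per code point
  esc <- sprintf("\\u%04X", cps)            # e.g. 223 becomes \u00DF
  out <- ifelse(cps > 127L, esc, chars)     # keep plain ASCII untouched
  paste(out, collapse = "")
}

word <- escape_non_ascii("einschlie\u00DFlich")
cat(word, "\n")
```

The printed text is what goes between the quotes in the source file; ASCII characters pass through unchanged.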

Dirk Eddelbuettel
  • Thank you, that's a really neat function. I used it and it turns out you were completely right. It came back with `dang::checkPackageAsciiCode(".") "F01-F09 Organische, einschließlich symptomatischer psychischer St\u00F6rungen", found ffffffc3 ICD_10.R` That was really helpful, I appreciate it. Are you open to one small piece of feedback? You could add "." as the default for the `dir` argument. Also, thank you for the offer to look into the package; while not necessary now, you would find it [here](https://buedenbender.github.io/datscience/index.html) – Björn Jun 13 '22 at 07:54
  • Credit to R Core -- I just wrapped the function for easier use when I also found it helpful. Glad it confirmed everything. Using `"."` (or maybe `getwd()`?) is not a bad idea, I'll think about it. But I also like requiring the argument so that the choice is explicit. Need to think about that... Maybe you can file an issue ticket and we can discuss it at the package repo. Lastly, I _think_ with current R you do not need the escape sequences _if_ you declare the encoding of the file (or package?) to be UTF-8. – Dirk Eddelbuettel Jun 13 '22 at 11:46