I am trying to export my data table from RStudio to Stata's .dta format. I use the write_dta function from the haven package in R and get the following error:

A provided string value was longer than the available storage size of the specified column.

I am quite new to R and Stata, and I don't understand what this means or what I should do about it.

Nick Cox

2 Answers


It sounds like one of the columns in your data.frame contains a long piece of text. write_dta has known issues handling long strings (https://github.com/tidyverse/haven/issues/437). You can truncate the strings in your data.frame like this:

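# Truncate character columns only; lapply (unlike apply) leaves the other column types untouched
is_chr <- vapply(YOUR_DATA, is.character, logical(1))
df <- YOUR_DATA
df[is_chr] <- lapply(df[is_chr], substr, 1, 128)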

Then try write_dta(df) again. A maximum length of 128 characters should be safe, but newer versions of Stata can handle much longer strings.
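Before truncating anything, it may help to check which columns actually contain long strings. The following is only a minimal base R sketch; the data frame and its column names (id, notes) are made up for illustration and are not from the question.

```r
# Hypothetical example data; `notes` holds one very long string.
df <- data.frame(
  id    = 1:3,
  notes = c("short", strrep("x", 3000), NA),
  stringsAsFactors = FALSE
)

# Longest string (in characters) in each character column.
# nchar() returns NA for missing strings, which na.rm = TRUE drops.
is_chr  <- vapply(df, is.character, logical(1))
max_len <- vapply(
  df[is_chr],
  function(x) max(c(0L, nchar(x)), na.rm = TRUE),
  integer(1)
)
print(max_len)
```

Columns whose maximum is far above the limit you plan to use are the likely cause of the error.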

Thomas Rosa

I noticed that with the data.frame solution any labels get lost. A tibble lets you keep them (e.g. when you import a labelled *.sav file from a survey collection platform).

Here is a tidyverse solution that uses haven to read and write and keeps the labels. Keep in mind that your initial df also needs to be a tibble.

library(tidyverse)

df <- haven::read_sav("YOUR FILE.sav")   # could also be some other file format that you start with as a tibble

df <- df %>%
  mutate(across(where(is.character), ~ substr(., 1, 2045)))

haven::write_dta(df, "NAME OF NEW FILE.dta")

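For me, the maximum string length that worked with write_dta(df) was 2045. This matches the 2045-character limit for fixed-length string variables (str2045) in Stata 13 and later.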
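If you want to see how much text is being cut, the truncation can be wrapped in a small helper that reports each affected column. This is only a sketch in base R: truncate_strings is a hypothetical helper name, the example data is invented, and the default of 2045 is simply the value that worked above. Because only the over-long elements are reassigned, attributes on the column (such as a variable label) are left in place.

```r
# Hypothetical helper: truncate over-long strings and report what was cut.
truncate_strings <- function(data, max_len = 2045L) {
  for (col in names(data)) {
    x <- data[[col]]
    if (is.character(x)) {
      too_long <- !is.na(x) & nchar(x) > max_len
      if (any(too_long)) {
        message(sprintf("%s: truncating %d value(s)", col, sum(too_long)))
        # Replace only the over-long elements so column attributes survive.
        x[too_long] <- substr(x[too_long], 1, max_len)
        data[[col]] <- x
      }
    }
  }
  data
}

# Invented example data with a labelled free-text column.
survey <- data.frame(
  id      = 1:2,
  comment = c("ok", strrep("a", 5000)),
  stringsAsFactors = FALSE
)
attr(survey$comment, "label") <- "Free-text comment"

trimmed <- truncate_strings(survey)
```

The result can then be passed to haven::write_dta in the same way as df above.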

colonus