
I have a dataset that I export to a CSV file with the outsheet command. Some rows break across lines at a certain place. Using a hexadecimal editor, I found the line feed control character "0a" in the record. The value of the variable producing the line break displays (in Stata) as only 5 characters. But if I count the number of characters:

gen xlen = length(x)

I get 6. I could write a Perl program to get rid of this problem, but I would prefer to remove the control characters in Stata before exporting (for example, using regexr()). Does anyone have an idea how to remove the control characters?

Nick Cox
giordano

1 Answer


The char() function returns the character for a given ASCII code; char(10) is the line feed found in the hex dump ("0a"). So you can delete such characters by replacing them with empty strings:

replace x = subinstr(x, char(10), "", .) 
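
As a minimal sketch of how this might be checked before and after the replacement: the variable name x comes from the question, while has_lf is a hypothetical helper variable introduced here only for illustration. Removing char(13) (carriage return) as well is an assumption, since the question only reports a line feed.

```stata
* Flag observations whose value of x contains a line feed (ASCII 10)
gen byte has_lf = strpos(x, char(10)) > 0
list x if has_lf

* Remove all line feeds; carriage returns (ASCII 13) are removed too,
* on the assumption that they may accompany line feeds in some files
replace x = subinstr(x, char(10), "", .)
replace x = subinstr(x, char(13), "", .)

* Stops with an error if any line feed is still present
assert strpos(x, char(10)) == 0
drop has_lf
```

After this, length(x) should match the number of characters displayed, provided the line feed was the only hidden character.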
Nick Cox
  • A Stata command `charlist` to identify which characters occur in strings is downloadable via `ssc inst charlist`. Its main use is to identify problematic characters such as linefeeds. – Nick Cox Jan 25 '13 at 10:44
  • `charlist` does what was intended, but people should note the much more versatile `chartab`, also from SSC. Similarly this post is out of date insofar as Stata now supports Unicode. `help string functions` reveals a toolkit of relevant functions. – Nick Cox Jan 12 '22 at 18:52
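
For Unicode-aware versions of Stata (14 and later), one possible approach, offered as a sketch rather than as the method recommended in the comments above, is to strip every control character from all string variables with ustrregexra(). The pattern \p{Cc} (the Unicode "control" category) is an assumption about what should count as a control character; note that it also matches tabs, so it should be narrowed if tabs are meaningful in the data.

```stata
* Collect the names of all string variables in r(varlist)
ds, has(type string)

* Remove control characters (Unicode category Cc) from each of them;
* the choice of \p{Cc} is an assumption, adjust it to your data
foreach v of varlist `r(varlist)' {
    replace `v' = ustrregexra(`v', "\p{Cc}", "")
}
```

Running this just before exporting keeps embedded line feeds from splitting records in the CSV file.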