8

I'm looking to check whether a string contains only blank space. The string could be

"  "

or

"           "

or

"              " 

etc...

I want to do this so I can change such values in a data frame to NA, because my goal is to clean up messy data.

Thank you

talat
Kasarrah

2 Answers

15

You can try with grepl:

grepl("^\\s*$", your_string)

"^\\s*$" matches 0 or more (*) whitespace characters (\\s) between the beginning (^) and the end ($) of the string.

Examples

grepl("^\\s*$", " ")
#[1] TRUE
grepl("^\\s*$", "")
#[1] TRUE
grepl("^\\s*$", "    ")
#[1] TRUE
grepl("^\\s*$", " ab")
#[1] FALSE

NB: if you only need to match literal spaces, you can use a space instead of \\s in the regex ("^ *$"); \\s also matches other whitespace such as tabs and newlines.
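
To tie this back to the data-frame goal in the question, here is a minimal sketch of how the test could be used to set blank entries to NA. The data frame `df` and its column `name` are made up for illustration; they do not come from the question.

```r
# Hypothetical data frame; `df` and `name` are illustrative names only
df <- data.frame(name = c("Ann", "   ", "", "Bob", "\t"),
                 stringsAsFactors = FALSE)

# TRUE for entries that are empty or consist only of whitespace
# (\\s also matches the tab in the last entry)
is_blank <- grepl("^\\s*$", df$name)

# Replace those entries with NA; entries 2, 3 and 5 become NA
df$name[is_blank] <- NA
df$name
```

Note that grepl() returns FALSE for entries that are already NA, so existing missing values are simply left as they are.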

Cath
  • 1
    a better regex would be `grepl("\\s+", your_string, perl = True)`; the + forces it to match 1 or more spaces. – Seekheart Mar 01 '16 at 15:05
  • 4
    @Seekheart that is actually why I preferred `*`: it also matches the empty string, and I thought the OP might want empty strings to be spotted too (and then changed to `NA`). (btw, yours wouldn't work without specifying the beginning and end of the string, you don't need the perl option for that, and TRUE must be written in uppercase ;-) ) – Cath Mar 01 '16 at 15:06
11

Without regex, you could use

which(nchar(trimws(vec))==0)

The function trimws() removes trailing and leading whitespace characters from a string. Hence, if after the use of trimws the length of the string (determined by nchar()) is not zero, the string contains at least one non-whitespace character.

Example:

vec <- c(" ", "", "   "," a  ", "             ", "b")
which(nchar(trimws(vec))==0)
#[1] 1 2 3 5

The entries 1, 2, 3, and 5 of the vector vec are either empty or contain only whitespace characters.


As suggested by Richard Scriven, the same result can be obtained without calling nchar(): use trimws(vec)=="" to get a logical vector, or which(trimws(vec)=="") to get the index numbers of the blank/empty entries.
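
For the data-frame case mentioned in the question, the logical form could be applied column by column. This is only a sketch under assumptions: the data frame `df` and its columns `id` and `city` are invented for illustration.

```r
# Hypothetical data frame; `df`, `id` and `city` are illustrative names only
df <- data.frame(id = 1:4,
                 city = c("Oslo", "  ", "", " Rome "),
                 stringsAsFactors = FALSE)

# Find the character columns; numeric columns are left untouched
chr_cols <- vapply(df, is.character, logical(1))

# In each character column, set empty or whitespace-only entries to NA
df[chr_cols] <- lapply(df[chr_cols], function(x) {
  x[trimws(x) == ""] <- NA
  x
})
df
```

Only entries that are entirely blank are replaced; a value such as " Rome " keeps its surrounding spaces, because trimws() is used here only for the test and its result is not stored.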

RHertel