How to remove single character in a string (java)?

Question

I have a variable of type String and I want to remove all single characters from it.

example:

String test = "p testing t testing";

I want the output to be like this:

String test = "testing testing";

help me please. thanks.

Create a new String with the contents you want. You may want to use `toCharArray` and process characters individually before you build your new String. — webuster, Sep 24 '14 at 15:06
You aren't removing just the single letters, you are also removing (some of) the whitespace around them. Are you just interested in words of length greater than one? Do you care at all about preserving the original whitespace (multiple spaces, tabs, etc.)? — azurefrog, Sep 24 '14 at 15:08

Thomas · Answer 1 · 2014-09-24T15:22:25.580

You might want to use a regex that matches every single character surrounded by whitespace, or by the start or the end of the input, and replaces it with a single space, e.g.

String test = "p testing t testing".replaceAll("(^|\\s+)[a-zA-Z](\\s+|$)", " ");

This might leave a space at the front and the end of the string though, so you might want to handle those cases separately:

//first replace all characters surrounded by whitespace and the whitespace by a single space
String test = "p testing t testing".replaceAll("\\s+[a-zA-Z]\\s+", " ");

//remove any remaining single character at the start or end of the input, together with the whitespace next to it
test = test.replaceAll("(?>^[a-zA-Z]\\s+|\\s+[a-zA-Z]$)", "");

Another hint: if you want to filter any kind of character (e.g. Unicode characters), you might want to replace [a-zA-Z] with \p{L} for any letter, [\p{L}\p{N}] for any letter or number, or \S for any non-whitespace character (in a Java string literal the backslashes must be doubled, e.g. "\\p{L}"). Of course there are more possible character classes, so please have a look at regular-expressions.info.
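The hint above could be sketched as follows. This is only an illustration, not part of the original answer: the class and method names are made up, and it uses lookarounds instead of consuming the surrounding whitespace, so that consecutive single letters such as "a b c" are all matched. It assumes that collapsing every run of whitespace into one space is acceptable.

```java
import java.util.regex.Pattern;

// Hypothetical helper class, named for illustration only.
class SingleLetterRemover {
    // A letter (\p{L}) with no non-whitespace character directly before or after it.
    private static final Pattern SINGLE_LETTER = Pattern.compile("(?<!\\S)\\p{L}(?!\\S)");
    // One or more whitespace characters in a row.
    private static final Pattern WHITESPACE_RUN = Pattern.compile("\\s+");

    static String removeSingleLetters(String input) {
        // Drop the single letters, then collapse the leftover whitespace.
        String withoutSingles = SINGLE_LETTER.matcher(input).replaceAll("");
        return WHITESPACE_RUN.matcher(withoutSingles).replaceAll(" ").trim();
    }

    public static void main(String[] args) {
        System.out.println(removeSingleLetters("p testing t testing")); // prints "testing testing"
    }
}
```

Note that this does not preserve the original whitespace (tabs and multiple spaces become single spaces); if that matters, the collapsing step would need to change.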

Final note:

Although regular expressions are an "easy" and concise way to solve this, for large inputs they may be considerably slower than splitting and re-concatenating. Whether that matters depends on your needs and the size of the input.

score 1 · Answer 2 · answered Sep 24 '14 at 16:20

You can achieve that with a regex.

Try this one-line replace (note that [a-z] only matches lowercase ASCII letters):

String test = "p testing t testing z".replaceAll("\\b[a-z] \\b|\\b [a-z]\\b", "");

Paresh

score 0 · Answer 3 · answered Sep 24 '14 at 15:07

String[] splitString = null;
String test = "p testing t testing";
splitString = test.split(" ");
String newString = "";
for(int i = 0; i < splitString.length; i++)
{
   if(splitString[i].length() != 1)
   {
      newString += splitString[i] + " ";
   }
}
newString = newString.trim(); // strings are immutable, so the trimmed result must be assigned

This loops through the split strings and drops the ones whose length is 1.

brso05

Here as well, splitting is reasonable and may be even faster than regex (if you need that speed) but I'd make `newString` a `StringBuilder`. – Thomas Sep 24 '14 at 15:19
Yes you could do StringBuilder instead. – brso05 Sep 24 '14 at 15:20

score 0 · Answer 4 · answered Sep 24 '14 at 15:10

String[] chunks = test.split("\\s+");

String newtest = "";

for ( String chunk : chunks)
{
    if (chunk.length() > 1)
    {
        newtest+= chunk + " ";
    }
}
newtest = newtest.trim(); //to remove the last space

I'd use a `StringBuilder` for `newtest` otherwise you'd get lots of intermediate string objects for large inputs, which hurts memory and performance. – Thomas Sep 24 '14 at 15:17
You're right, actually you could even reuse the test string to avoid allocating more memory. – Sep 24 '14 at 15:18
Well, reusing the test string would not help since for every `newtest+= chunk + " "` you'd get a new string object - keep in mind that strings are immutable. – Thomas Sep 24 '14 at 15:23
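The `StringBuilder` variant suggested in these comments might look roughly like the sketch below. The class and method names are invented for illustration. It appends into one buffer instead of creating a new String on every concatenation, and it adds the separating space before each word after the first, so no final trim is needed.

```java
// Hypothetical helper class, named for illustration only.
class SplitAndRebuild {
    static String dropSingleCharWords(String input) {
        StringBuilder result = new StringBuilder();
        // trim() first so that leading whitespace does not produce an empty first chunk
        for (String chunk : input.trim().split("\\s+")) {
            if (chunk.length() > 1) {
                if (result.length() > 0) {
                    result.append(' '); // separator only between kept words
                }
                result.append(chunk);
            }
        }
        return result.toString();
    }

    public static void main(String[] args) {
        System.out.println(dropSingleCharWords("p testing t testing")); // prints "testing testing"
    }
}
```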

score 0 · Answer 5 · answered Nov 03 '22 at 22:50

Are there no real performance tests on this?

I have a similar case: UTF-8 words in which any garbage that is not a letter or number (!@#$%^&*(){}[];':",./<>?) is replaced with spaces. I was accumulating 2-8 letter words for a misspelling-correction cache using LevenshteinDistance, but the replacement leaves all these one-character strings behind, so after splitting (and upper-casing) the string I loop over the pieces and skip the single characters. I'm unsure whether there is a faster way. Since I'm using a regex anyway, I was wondering whether it could kill two birds with one stone somehow.

static Pattern lettersAndNumbersOnly = Pattern.compile("[^\\p{L}\\p{N} ]");
static <T extends Searchable> void associateSubstringsToSearchTargets(Map<String, Collection<T>> lookupMapForSubStringSearchTargets, T searchable) {
    for (String s : StringUtils.split(lettersAndNumbersOnly.matcher(searchable.getSearchableString().toUpperCase()).replaceAll(" "))) {
        if (s.length() > 1) { //<-- skip the small stuff
            String truncated = s.substring(0, Math.min(s.length(), 8));
            for (int x = 2; x < truncated.length() + 1; x++) {
                addToMapOfCollections(lookupMapForSubStringSearchTargets, truncated.substring(0, x), searchable, HashSet::new);
            }
        }
    }
}
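One way the two steps might be folded into a single regex pass is sketched below. This is an untested idea in terms of performance, not a measured improvement, and the class and method names are hypothetical. The pattern extends the one above with a second alternative: a letter or number that has no letter or number directly before or after it, which is exactly what would become a one-character word once the garbage around it turns into spaces.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper class, named for illustration only.
class SinglePassCleaner {
    // Either a character that is not a letter, number, or space,
    // or a letter/number with no letter/number directly next to it.
    private static final Pattern GARBAGE_OR_SINGLE = Pattern.compile(
            "[^\\p{L}\\p{N} ]|(?<![\\p{L}\\p{N}])[\\p{L}\\p{N}](?![\\p{L}\\p{N}])");

    static List<String> wordsLongerThanOne(String input) {
        String cleaned = GARBAGE_OR_SINGLE.matcher(input.toUpperCase()).replaceAll(" ");
        List<String> words = new ArrayList<>();
        for (String s : cleaned.split(" +")) {
            if (!s.isEmpty()) { // a leading run of spaces yields one empty piece
                words.add(s);
            }
        }
        return words;
    }

    public static void main(String[] args) {
        System.out.println(wordsLongerThanOne("a b!cd e-f gh")); // prints "[CD, GH]"
    }
}
```

The lookarounds add work per character, so whether this beats the explicit length check would have to be benchmarked on real input.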

score -1 · Answer 6 · answered Sep 24 '14 at 15:05

1. Split the String by space.

2. In the resulting String array, check the length of each string and keep only those longer than one character.
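The two steps above can be sketched with the Java 8 stream API, assuming that version or later is available. The class and method names are made up for illustration.

```java
import java.util.Arrays;
import java.util.stream.Collectors;

// Hypothetical helper class, named for illustration only.
class SplitFilterJoin {
    static String keepLongerWords(String input) {
        return Arrays.stream(input.split(" "))      // step 1: split by space
                .filter(word -> word.length() > 1)  // step 2: keep strings longer than one character
                .collect(Collectors.joining(" "));  // rebuild the result with single spaces
    }

    public static void main(String[] args) {
        System.out.println(keepLongerWords("p testing t testing")); // prints "testing testing"
    }
}
```

Empty strings produced by consecutive spaces have length 0, so the same filter removes them as well.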

pd30
