I have a variable of type String and I want to remove all single-character words from it.
Example:
String test = "p testing t testing";
I want the output to be like this:
String test = "testing testing";
Please help. Thanks.
You could use a regex that matches every single letter delimited by whitespace or by the start or end of the input, and replace each match (including the surrounding whitespace) with a single space, e.g.
String test = "p testing t testing".replaceAll("(^|\\s+)[a-zA-Z](\\s+|$)", " ");
This may leave a space at the start and the end of the string, though, so you might want to handle those cases separately:
// First, replace each single letter surrounded by whitespace (together with that whitespace) with a single space
String test = "p testing t testing".replaceAll("\\s+[a-zA-Z]\\s+", " ");
// Then remove a single letter at the start or end of the input, together with its adjacent whitespace
test = test.replaceAll("(?>^[a-zA-Z]\\s+|\\s+[a-zA-Z]$)", "");
Another hint: if you want to filter any kind of character (i.e. Unicode characters), you might want to replace [a-zA-Z] with \p{L} for any letter, [\p{L}\p{N}] for any letter or number, or \S for any non-whitespace character. Of course there are more possible character classes, so please have a look at regular-expressions.info.
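As a rough illustration of the \p{L} variant, here is a minimal sketch. The class and method names are made up for this example, and it swaps the whitespace-consuming pattern for lookarounds ((?<!\S) and (?!\S)), which is a different technique from the one above. Lookarounds do not consume the surrounding whitespace, so two single letters in a row are both caught. Note that it also collapses every run of whitespace into one space, which may or may not be what you want.

```java
import java.util.regex.Pattern;

// Hypothetical helper class, named only for this sketch.
final class SingleLetterRemover {
    // One Unicode letter with no non-whitespace character directly before or after it.
    private static final Pattern LONE_LETTER = Pattern.compile("(?<!\\S)\\p{L}(?!\\S)");
    // Any run of whitespace, used to tidy up the gaps left behind.
    private static final Pattern SPACES = Pattern.compile("\\s+");

    static String removeLoneLetters(String input) {
        // Drop the lone letters first; the whitespace around them stays in place.
        String withoutLetters = LONE_LETTER.matcher(input).replaceAll("");
        // Collapse the leftover whitespace and strip it from both ends.
        return SPACES.matcher(withoutLetters).replaceAll(" ").trim();
    }

    public static void main(String[] args) {
        System.out.println(removeLoneLetters("p testing t testing"));
        // \u00e9 and \u00df are non-ASCII letters, written as escapes to avoid encoding issues.
        System.out.println(removeLoneLetters("\u00e9 testing \u00df testing"));
    }
}
```

Both calls in main should print testing testing. Swap \\p{L} for [\\p{L}\\p{N}] if lone digits should go as well.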
Final note:
Although regular expressions are an "easy" and concise way to solve this, for large inputs they may be considerably slower than splitting and re-concatenating the string. Whether that matters depends on your requirements and the size of the input.
You can achieve that with a regex.
Try this one-liner:
String test = "p testing t testing z".replaceAll("\\b[a-z] \\b|\\b [a-z]\\b", "");
String[] splitString = null;
String test = "p testing t testing";
splitString = test.split(" ");
String newString = "";
for(int i = 0; i < splitString.length; i++)
{
if(splitString[i].length() != 1)
{
newString += splitString[i] + " ";
}
}
newString = newString.trim(); // Strings are immutable, so the trimmed result must be assigned back
This loops through the split strings and drops the ones whose length is 1.
String[] chunks = test.split("\\s+");
String newtest = "";
for ( String chunk : chunks)
{
if (chunk.length() > 1)
{
newtest+= chunk + " ";
}
}
newtest = newtest.trim(); //to remove the last space
Are there no real performance tests on this?
I have a similar case (UTF-8 words where non-letter, non-number garbage such as !@#$%^&*(){}[];':",./<>? is replaced with spaces): I accumulate 2-8 letter words for a misspelling-correction cache using LevenshteinDistance, but the replacement leaves all these one-character strings behind. After splitting (and capitalizing) the string, I loop over the pieces and skip the single characters. I'm unsure whether there is a faster way. Since I'm using a regex anyway, I was wondering whether it could kill two birds with one stone somehow.
static Pattern lettersAndNumbersOnly = Pattern.compile("[^\\p{L}\\p{N} ]");
static <T extends Searchable> void associateSubstringsToSearchTargets(Map<String, Collection<T>> lookupMapForSubStringSearchTargets, T searchable) {
for (String s : StringUtils.split(lettersAndNumbersOnly.matcher(searchable.getSearchableString().toUpperCase()).replaceAll(" "))) {
if (s.length() > 1) { //<-- skip the small stuff
String truncated = s.substring(0, Math.min(s.length(), 8));
for (int x = 2; x < truncated.length() + 1; x++) {
addToMapOfCollections(lookupMapForSubStringSearchTargets, truncated.substring(0, x), searchable, HashSet::new);
}
}
}
}
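One way the regex might do both jobs at once is to match the words you want instead of replacing what you don't want. The sketch below is only an assumption about what would fit here: WordExtractor and words are hypothetical names, it uses Locale.ROOT for upper-casing where the code above uses the default locale, and nothing about its speed has been measured. The quantifier {2,} means a run of fewer than two letters or digits never matches, so the garbage characters and the single characters are skipped in the same pass.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, named only for this sketch.
final class WordExtractor {
    // Two or more consecutive letters or digits; a run of length one cannot match.
    private static final Pattern WORD = Pattern.compile("[\\p{L}\\p{N}]{2,}");

    static List<String> words(String text) {
        List<String> result = new ArrayList<>();
        // Upper-case once, then walk over the matches instead of replacing and splitting.
        Matcher m = WORD.matcher(text.toUpperCase(Locale.ROOT));
        while (m.find()) {
            result.add(m.group());
        }
        return result;
    }

    public static void main(String[] args) {
        // Prints [TESTING, TESTING]
        System.out.println(words("p testing, t (testing) x!"));
    }
}
```

The returned list could then feed the same truncate-and-add loop as above, with the s.length() > 1 check no longer needed. Whether this is actually faster than replace-then-split would have to be benchmarked on real input.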
1. Split the String on spaces.
2. Check the length of each string in the resulting array and keep only those longer than one character.
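The two steps above can be sketched with streams as follows. This is just one possible shape, assuming Java 8 or later; SplitAndFilter and dropSingleCharTokens are names invented for the example.

```java
import java.util.Arrays;
import java.util.stream.Collectors;

// Hypothetical helper class, named only for this sketch.
final class SplitAndFilter {
    static String dropSingleCharTokens(String input) {
        // Step 1: split on runs of whitespace.
        return Arrays.stream(input.split("\\s+"))
                // Step 2: keep only tokens longer than one character
                // (this also drops the empty token a leading space produces).
                .filter(token -> token.length() > 1)
                // Re-join with single spaces, so no trailing space needs trimming.
                .collect(Collectors.joining(" "));
    }

    public static void main(String[] args) {
        // Prints: testing testing
        System.out.println(dropSingleCharTokens("p testing t testing"));
    }
}
```

Collectors.joining puts the separator only between elements, which avoids both the trailing space and the repeated string concatenation inside a loop.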