
Is there a way I can get rid of punctuation at the front and the end of a string?

For example,

"hello," -> "hello"
"hello;" -> "hello"

In other words, remove all punctuation after, before, or within a word, except single quotes and single dashes if they're followed by more letters.

More examples,

"lies,", "'This", "all-eating" and "deserv'd."

will become

"lies", "this", "all-eating" and "deserv'd"
Bohemian
paupau
  • `remove all punctuation after, before, or within a word`: please provide an example of punctuation within a word. Also, don't forget to show your attempts. – Avinash Raj Oct 30 '14 at 04:34
  • "lies,", "'This", "all-eating" and "deserv'd", which go to "lies", "this", "all-eating" and "deserv'd" – paupau Oct 30 '14 at 04:38
  • What would be the expected output? Add the above comment in your question. – Avinash Raj Oct 30 '14 at 04:43

1 Answer


Use the POSIX character class \p{Punct} in a regex:

str = str.replaceAll("^\\p{Punct}*|\\p{Punct}+$|\\p{Punct}{2,}", "");

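The first two alternatives strip punctuation at the start and end of the string. Mid-word punctuation is removed by the "two or more" alternative, \p{Punct}{2,}, so a single quote or dash inside a word is kept while a run such as "--" is dropped.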


Some test code:

for (String str : new String[]{"hello,", "hello;", "li--es", "'This", "all-eating", "deserv'd."})
    System.out.println(str.replaceAll("^\\p{Punct}*|\\p{Punct}+$|\\p{Punct}{2,}", ""));

Output:

hello
hello
lies
This
all-eating
deserv'd
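
Note that the question's expected output turns "'This" into "this", while the regex alone leaves the case untouched. If lowercasing is also wanted, a minimal sketch could look like the following. The class name PunctuationStripper and the method normalize are hypothetical names chosen for illustration, and lowercasing with Locale.ROOT is an assumption based on the question's examples:

```java
import java.util.Locale;
import java.util.regex.Pattern;

class PunctuationStripper {
    // Same expression as in the answer, compiled once so it can be reused.
    private static final Pattern EDGE_OR_REPEATED_PUNCT =
            Pattern.compile("^\\p{Punct}*|\\p{Punct}+$|\\p{Punct}{2,}");

    // Hypothetical helper: strips leading, trailing, and repeated punctuation,
    // then lowercases the result (assumption: the asker wants "this", not "This").
    static String normalize(String word) {
        String stripped = EDGE_OR_REPEATED_PUNCT.matcher(word).replaceAll("");
        return stripped.toLowerCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        for (String word : new String[]{"lies,", "'This", "all-eating", "deserv'd."}) {
            System.out.println(normalize(word));
        }
    }
}
```

With the four words from the question this should print lies, this, all-eating and deserv'd, one per line. Precompiling the Pattern avoids recompiling the regex on every call, which String.replaceAll would otherwise do.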
Bohemian