4

Is there a simpler way to compute the character count of a given string than the code below?

String word = "AAABBB";
Map<String, Integer> charCount = new HashMap();
for (String charr : word.split("")) {
    Integer added = charCount.putIfAbsent(charr, 1);
    if (added != null)
        charCount.computeIfPresent(charr, (k, v) -> v + 1);
}

System.out.println(charCount);
OTUser
  • For ANSI characters, you can just have an array of size 256 and compute it. – nice_dev Mar 12 '19 at 19:10
  • @vivek_23 Which [ANSI character set](https://en.wikipedia.org/wiki/ANSI_character_set) would that be? Or did you mean ASCII and 128? – Andreas Mar 12 '19 at 19:39
  • 2
    @vivek_23 that is the windows code page 1252, not ANSI. The Unicode standard matches the iso-latin-1 character set for the first 256 codepoints. Referring to the windows code page 1252 is an unnecessary complication, as that code page does not match in the 128-159 range. – Holger Jun 06 '20 at 12:59
  • @Holger Ahh! Thanks for the correction. Deleted my previous comment to avoid confusion. – nice_dev Jun 06 '20 at 14:41
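
The array idea from the comments above can be sketched as follows. This is a minimal illustration, assuming the input is limited to the first 256 code points (ISO Latin-1, as Holger notes); the class name `Latin1CharCount` and the choice to skip chars outside that range are assumptions of this sketch, not part of the original comments.

```java
public class Latin1CharCount {

    // Counts chars in the range 0..255 using a plain int array indexed by char value.
    // Chars outside that range are skipped (an assumption made for this sketch).
    public static int[] countLatin1(String word) {
        int[] counts = new int[256];
        for (int i = 0; i < word.length(); i++) {
            char c = word.charAt(i);
            if (c < 256) {
                counts[c]++;
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        int[] counts = countLatin1("AAABBB");
        // Print only the characters that actually occurred.
        for (int c = 0; c < counts.length; c++) {
            if (counts[c] > 0) {
                System.out.println((char) c + "=" + counts[c]);
            }
        }
    }
}
```

This avoids boxing and hashing entirely, at the cost of ignoring anything outside the chosen range.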

12 Answers

9

The simplest way to count occurrences of each character in a string, with full Unicode support (Java 11+; see note 1):

String word = "AAABBB";
Map<String, Long> charCount = word.codePoints().mapToObj(Character::toString)
        .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));
System.out.println(charCount);

1) Java 8 version with full Unicode support is at the end of the answer.

Output

{A=3, B=3}

UPDATE: For Java 8+ (doesn't support characters from supplementary planes, e.g. emoji):

Map<String, Long> charCount = IntStream.range(0, word.length())
        .mapToObj(i -> word.substring(i, i + 1))
        .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));

UPDATE 2: Also for Java 8+.

I was mistaken, thinking that codePoints() wasn't added until Java 9. It was added in Java 8 to the CharSequence interface, so it doesn't show in javadoc for String in Java 8, and shows as added in Java 9 for later versions of the javadoc.

However, the Character.toString(int codePoint) method wasn't added until Java 11, so to use the Character.toString(char c) method, we can use chars() in Java 8:

Map<String, Long> charCount = word.chars().mapToObj(c -> Character.toString((char) c))
        .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));

Or, for full Unicode support including supplementary planes, we can use codePoints() and the String(int[] codePoints, int offset, int count) constructor in Java 8:

Map<String, Long> charCount = word.codePoints()
        .mapToObj(cp -> new String(new int[] { cp }, 0, 1))
        .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));
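
To make the difference between the chars() and codePoints() variants concrete, here is a small self-contained sketch. The class and method names are made up for illustration, and the test string (one ASCII letter followed by two copies of U+1F600, written as surrogate-pair escapes) is an assumption chosen only to show the effect.

```java
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

public class CodePointVsChar {

    // Counts UTF-16 code units; a supplementary character is split into two surrogates.
    public static Map<String, Long> countByChar(String word) {
        return word.chars()
                .mapToObj(c -> Character.toString((char) c))
                .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));
    }

    // Counts Unicode code points; a supplementary character stays in one piece (Java 8 compatible).
    public static Map<String, Long> countByCodePoint(String word) {
        return word.codePoints()
                .mapToObj(cp -> new String(new int[] { cp }, 0, 1))
                .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));
    }

    public static void main(String[] args) {
        String word = "a\uD83D\uDE00\uD83D\uDE00"; // "a" plus U+1F600 twice
        System.out.println(countByChar(word).size());      // 3 keys: "a" and two lone surrogates
        System.out.println(countByCodePoint(word).size()); // 2 keys: "a" and U+1F600
    }
}
```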
Andreas
  • Sorry, is there a simple way for Java 8? – OTUser Mar 12 '19 at 19:30
  • 5
    Speaking of “full Unicode support” and Emojis, it’s worth pointing out that even using codepoints is not necessarily providing the intended semantics. E.g. `"ā̧‍"` has 10 chars, 7 codepoints, but only three characters; the first one demonstrates that this is not only an Emoji issue. The only solution, I currently know of, is to process *grapheme clusters*, e.g. with Java 9+: `Pattern.compile("\\X").matcher(example).results() .collect(Collectors.groupingBy(MatchResult::group, Collectors.counting()))`. – Holger May 27 '20 at 08:46
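
The grapheme-cluster approach from the comment above could look roughly like this as a complete program (Java 9+, because of `Matcher.results()`). The class name and the sample input, an "e" followed by a combining acute accent (U+0301), are assumptions picked for this sketch; behaviour on more complex sequences such as emoji joined with zero-width joiners may depend on the Java version.

```java
import java.util.Map;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

public class GraphemeCount {

    // \X matches one extended grapheme cluster, i.e. one user-perceived character.
    private static final Pattern GRAPHEME = Pattern.compile("\\X");

    public static Map<String, Long> countGraphemes(String text) {
        return GRAPHEME.matcher(text).results()
                .collect(Collectors.groupingBy(m -> m.group(), Collectors.counting()));
    }

    public static void main(String[] args) {
        // "e" + combining acute accent, twice, then a plain "a"
        String text = "e\u0301e\u0301a";
        Map<String, Long> counts = countGraphemes(text);
        System.out.println(counts.size());         // 2 distinct graphemes
        System.out.println(counts.get("e\u0301")); // 2
    }
}
```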
3
    String str = "Hello Manash";
    Map<Character, Long> hm = str.chars()
            .mapToObj(c -> (char) c)
            .collect(Collectors.groupingBy(c -> c, Collectors.counting()));
    System.out.println(hm);
2

Try the below approaches:

Approach 1:

    String str = "abcaadcbcb";
    
    Map<Character, Integer> charCount = str.chars()
            .boxed()
            .collect(toMap(
                    k -> (char) k.intValue(),
                    v -> 1,         // 1 occurrence
                    Integer::sum));
    System.out.println("Char Counts:\n" + charCount);

Approach 2:

    String str = "abcaadcbcb";
    Map<Character, Integer> charCount = new HashMap<>();
    for (char c : str.toCharArray()) {
        charCount.merge(c,          // key = char
                1,                  // value to merge
                Integer::sum);      // counting
    }
    System.out.println("Char Counts:\n" + charCount);

Output:

    Char Counts:
    {a=3, b=3, c=3, d=1}
1

Try this one:

    List<Character> chars = Arrays.asList('h', 'e', 'l', 'l', 'o', 'w', 'o', 'r', 'l', 'd');
    Map<Character, Long> map = chars.stream()
            .collect(Collectors.groupingBy(c -> c, Collectors.counting()));
    System.out.println(map);

Output:

{r=1, d=1, e=1, w=1, h=1, l=3, o=2}
ASR
1
word.chars()
        .mapToObj(c -> (char) c)
        .collect(Collectors.groupingBy(Function.identity(), LinkedHashMap::new, Collectors.counting()));

This gives you the character counts in order of each character's first appearance.
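
As a runnable illustration of the ordering, here is a minimal sketch; the class name, method name, and the sample word "banana" are assumptions added for this example.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

public class OrderedCharCount {

    // LinkedHashMap keeps keys in insertion order, so each character
    // appears at the position where it was first seen in the input.
    public static Map<Character, Long> countInOrder(String word) {
        return word.chars()
                .mapToObj(c -> (char) c)
                .collect(Collectors.groupingBy(
                        Function.identity(), LinkedHashMap::new, Collectors.counting()));
    }

    public static void main(String[] args) {
        System.out.println(countInOrder("banana")); // {b=1, a=3, n=2}
    }
}
```

With the default two-argument groupingBy, the result is typically a HashMap, whose iteration order is unspecified.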

Paul Martin
  • Might you format your code snippet as code, to allow for greater readability? Thanks. – user2901351 Nov 09 '21 at 11:02
1
String str = "abcaadcbcb";

Map<String, Long> charCount = Arrays.stream(str.split(""))
        .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));
    
Lijo
1

If you're open to using a third-party library that works with Java 8 or above, Eclipse Collections (EC) can solve this problem using a primitive Bag to count characters. Use a CharBag if char values are required, or an IntBag if code points (int values) are required. A Bag is a simpler data structure for counting things, and a primitive Bag may be backed by a primitive HashMap so that the counts are not boxed as Integer or Long objects. Unlike a HashMap in Java, a Bag does not return null for a missing key; occurrencesOf returns 0 instead.

@Test
public void characterCountJava8()
{
    String word = "AAABBB";
    CharAdapter chars = Strings.asChars(word);
    CharBag charCounts = chars.toBag();

    Assertions.assertEquals(3, charCounts.occurrencesOf('A'));
    Assertions.assertEquals(3, charCounts.occurrencesOf('B'));
    Assertions.assertEquals(0, charCounts.occurrencesOf('C'));

    System.out.println(charCounts.toStringOfItemToCount());
}

Outputs:

{A=3, B=3}

CharAdapter and CharBag are primitive collection types available in EC. A CharBag is useful if you want to count char values. Notice that the charCounts.occurrencesOf('C') returns 0 instead of null as it would if this was a HashMap.

The following example counts code points, using emojis to make the result visually appealing. The code itself will work with Java 8, but I believe emoji literal support wasn't added until Java 11.

@Test
public void codePointCountJava11()
{
    String emojis = "🍎🍎🍌🍌🍎";
    CodePointAdapter codePoints = Strings.asCodePoints(emojis);
    IntBag emojiCounts = codePoints.toBag();

    int appleInt = "🍎".codePointAt(0);
    int bananaInt = "🍌".codePointAt(0);
    int pearInt = "🍐".codePointAt(0);
    Assertions.assertEquals(3, emojiCounts.occurrencesOf(appleInt));
    Assertions.assertEquals(2, emojiCounts.occurrencesOf(bananaInt));
    Assertions.assertEquals(0, emojiCounts.occurrencesOf(pearInt));

    System.out.println(emojiCounts.toStringOfItemToCount());

    Bag<String> emojiStringCounts = emojiCounts.collect(Character::toString);

    System.out.println(emojiStringCounts.toStringOfItemToCount());
}

Outputs:

{127820=2, 127822=3}  // IntBag.toStringOfItemToCount()
{🍌=2, 🍎=3}          // Bag<String>.toStringOfItemToCount()

CodePointAdapter and IntBag are primitive collection types available in EC. An IntBag is useful if you want to count int values. Notice that the emojiCounts.occurrencesOf(pearInt) returns 0 instead of null as it would if this was a HashMap.

I converted the IntBag to a Bag<String> to show the differences when printing int vs. char. You need to convert int codePoints back to String if you want to print anything.

The comment Holger left on the accepted answer about grapheme clusters was insightful and helpful. Thank you! The codepoint solution here suffers from the same issue as all of the other codepoint solutions.

Eclipse Collections 11.1 was compiled and released with Java 8. I wouldn't recommend staying on Java 8 any more, but wanted to point out this is still possible.

Note: I am a committer for Eclipse Collections.

Donald Raab
0

Hope this helps. Java 8 Stream and Collector:

    String word = "AAABBB";
    Map<Character, Integer> charCount = word.chars().boxed().collect(Collectors.toMap(
                    k -> Character.valueOf((char) k.intValue()),
                    v -> 1,
                    Integer::sum));
    System.out.println(charCount);

Output:
    {A=3, B=3}
mm6
  • `chars()` requires Java 9, and better solution using `codePoints()` instead of `chars()` already posted 13 minutes earlier. – Andreas Mar 12 '19 at 19:35
  • 1
    @Andreas I agree with the `codePoints()` solution, but `chars()` was introduced in Java 8: [String.chars()](http://docs.oracle.com/javase/8/docs/api/java/lang/CharSequence.html#chars--) – mm6 Mar 13 '19 at 07:14
  • 1
    That would be `CharSequence.chars()`, not `String.chars()`, but I accept your correction. The [Javadoc for Java 11](https://docs.oracle.com/en/java/javase/11/docs/api/java.base/java/lang/String.html#chars()) shows the method as added to `String` in Java 9, which is what led me astray. – Andreas Mar 14 '19 at 16:06
0

I figured it out; below is another simple way.

Map<String, Integer> charCount = new HashMap();
for (String charr : s.split("")) {
    charCount.put(charr, charCount.getOrDefault(charr, 0) + 1);
}
OTUser
  • 2
    `charCount.put(charr,charCount.getOrDefault(charr,0)+1);` can be simplified to `charCount.merge(charr, 1, Integer::sum);` By the way, you should use `new HashMap<>()`… – Holger May 27 '20 at 07:57
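
The simplification suggested in the comment above can be checked side by side with the original loop. This is a small sketch with made-up class and method names; both methods use `new HashMap<>()` as the comment recommends.

```java
import java.util.HashMap;
import java.util.Map;

public class MergeCount {

    // The getOrDefault form from the answer: read the current count, add one, store it.
    public static Map<String, Integer> countWithGetOrDefault(String word) {
        Map<String, Integer> charCount = new HashMap<>();
        for (String ch : word.split("")) {
            charCount.put(ch, charCount.getOrDefault(ch, 0) + 1);
        }
        return charCount;
    }

    // The merge form: insert 1 if the key is absent, otherwise combine old value and 1 with Integer::sum.
    public static Map<String, Integer> countWithMerge(String word) {
        Map<String, Integer> charCount = new HashMap<>();
        for (String ch : word.split("")) {
            charCount.merge(ch, 1, Integer::sum);
        }
        return charCount;
    }

    public static void main(String[] args) {
        String word = "AAABBB";
        System.out.println(countWithMerge(word).equals(countWithGetOrDefault(word))); // true
    }
}
```

Both variants split the string with `split("")`, so they count UTF-16 chars as one-character strings, just like the code in the question.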
0

A simple Java 8 solution I can think of is:

Map<String, Long> map= Arrays.stream(word.trim().toLowerCase().split(""))
            .collect(Collectors.groupingBy(Function.identity(),Collectors.counting()));
Mayur Gite
0

Here is a Java stream solution; I hope the code is self-explanatory.

String s = "ccacbbaac";
Map<Character, Long> collect = s.chars()
        .mapToObj(y -> (char) y)
        .collect(Collectors.groupingBy(x -> x, Collectors.counting()));
-1
String str = "edcba";

Map<String, Long> counterMap1 = str.codePoints()
                                   .mapToObj(Character::toString)
                                   .collect(Collectors.groupingBy(e -> e, Collectors.counting()));
Abra