
I'm trying to write a test to verify that a smiley face (😀, which is \uD83D followed by \uDE00) is recognized as an emoji.

Is there a way to represent this in Kotlin? If I use a character literal, the compiler tells me 'Too many characters in a character literal'. The isEmoji method expects a Char, so I cannot pass a String, and to the best of my knowledge there is no way to instantiate a Char unless I know its exact numeric value.

What is the closest I can get to this type of statement:

assertTrue(isEmoji('😀'))

for testing

fun isEmoji(currentChar: Char): Boolean
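
For reference, a minimal sketch of how the two halves can be written down in Kotlin, assuming the smiley really is the pair \uD83D \uDE00 quoted above. A `Char` literal accepts a `\u` escape, so each half can be created on its own, and the two together form a two-`Char` String. The names `high`, `low` and `smiley` are only illustrative, and `isEmoji` itself is not called because its implementation is not shown here.

```kotlin
// A Kotlin Char holds one UTF-16 code unit, so each half of the pair gets its own literal.
val high: Char = '\uD83D'
val low: Char = '\uDE00'

// The same two code units written as one String literal.
val smiley: String = "\uD83D\uDE00"

fun main() {
    println(smiley.length)            // 2: the smiley occupies two Chars
    println(smiley == "$high$low")    // true: both spellings give the same String
    println(high.isHighSurrogate())   // true: first half of a surrogate pair
    println(low.isLowSurrogate())     // true: second half of a surrogate pair
}
```

A test could then pass `high` or `low` to a `Char`-based function one at a time, which is presumably what happens when the method is fed characters from a String.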
possum
Nick Cardoso
  • from what library do you get the `isEmoji` method? – AlexT Nov 29 '21 at 10:13
  • @Alex.T not a public one. It's a method that receives one Char at a time from strings. I'll add the signature, but it's probably as you'd guess – Nick Cardoso Nov 29 '21 at 10:16
  • Thanks @matt, that was actually the last thing I tried before opening the question. charAt doesn't seem to exist in Kotlin; it has .get() instead. It didn't recognize .get(0) as an emoji, so now it's even more important to verify whether it is the isEmoji method or the test that is broken – Nick Cardoso Nov 29 '21 at 10:39
  • It looks like you're saying the emoji is two characters \uD83D and \uDE00. You cannot pass it as a single value. So you have to use a string or an array. – matt Nov 29 '21 at 10:53
  • Well (my understanding could be wrong) aren't they two 8-bit characters that should be representable as one 16-bit Char, since that's the default for Kotlin? – Nick Cardoso Nov 29 '21 at 10:56
  • The emoji is two 16-bit characters in Java. If you create it using the String literal or the Unicode escapes, you get the same values. I am pretty sure that corresponds to two 16-bit characters in Kotlin. – matt Nov 29 '21 at 10:58
  • So just to clarify: an isEmoji method taking one character can never return true, because emoji are always 2 chars? – Nick Cardoso Nov 29 '21 at 11:58
  • From what I can tell here, you'll just have to read the docs for the lib you are using. Especially after this comment: "an isEmoji method taking one character can never return true, because emoji are always 2 chars?". We can't tell how that method works if it is private. – AlexT Nov 29 '21 at 14:45
  • This is a [surrogate pair](https://stackoverflow.com/a/22121318/611819). – dnault Nov 29 '21 at 22:33
  • As @dnault indicates, you need to know a little Unicode to ask the right question here. Unicode code points go up to U+10FFFF, but for historical reasons Java and Kotlin store characters as 16-bit values, which can't hold every code point. Instead, they store values encoded as UTF-16: code points up to U+FFFF are stored directly as one character, while the remaining code points are stored as a ‘surrogate pair’ of characters (the first in the range 0xD800–0xDBFF, and the second in the range 0xDC00–0xDFFF, those being code points reserved for surrogate pairs)… – gidds Nov 29 '21 at 23:26
  • …Many developers are unaware of this, because code points U+10000 and up are rarely used; emoji are one of the few common cases. – gidds Nov 29 '21 at 23:27
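
The surrogate-pair arithmetic described in the comments above can be sketched roughly as follows. This assumes Kotlin 1.5 or later (for `Char.code`). Both `codePointOf` and `isEmoticonCodePoint` are hypothetical helpers, not part of the asker's code or of any library, and the range check only covers the Unicode Emoticons block (U+1F600 to U+1F64F), not every emoji.

```kotlin
// Combine a UTF-16 surrogate pair into the single code point it encodes.
// Hypothetical helper for illustration only.
fun codePointOf(high: Char, low: Char): Int {
    require(high.isHighSurrogate() && low.isLowSurrogate()) { "not a surrogate pair" }
    // Each half contributes 10 bits; the result is offset by 0x10000.
    return 0x10000 + ((high.code - 0xD800) shl 10) + (low.code - 0xDC00)
}

// Hypothetical check: true only for the Unicode "Emoticons" block.
fun isEmoticonCodePoint(codePoint: Int): Boolean = codePoint in 0x1F600..0x1F64F

fun main() {
    val cp = codePointOf('\uD83D', '\uDE00')
    println(cp.toString(16))          // 1f600
    println(isEmoticonCodePoint(cp))  // true
}
```

The point of the design is that the check works on an `Int` code point rather than a single `Char`, because one half of a pair cannot identify the smiley on its own. On the JVM, `"\uD83D\uDE00".codePointAt(0)` should give the same number without the manual arithmetic.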

0 Answers