A Java string is a sequence of characters. The language and the standard library treat strings as immutable objects that represent text. At the lowest level, Java represents text with the char type, a 16-bit value that holds a single UTF-16 code unit, which is not necessarily a full Unicode code point.
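A small program makes the distinction concrete. The following is a minimal sketch; the class name UnicodeDemo, the sample string, and the printed labels are illustrative choices rather than anything prescribed here:

```java
public class UnicodeDemo {
    public static void main(String[] args) {
        String s = "Hello, 🌍"; // U+1F30D is outside the Basic Multilingual Plane

        // length() counts 16-bit char units, so the emoji counts twice.
        System.out.println("length(): " + s.length());

        // codePointCount() counts actual Unicode code points.
        System.out.println("codePointCount(): " + s.codePointCount(0, s.length()));

        // codePoints() yields each code point as an int, combining
        // surrogate pairs along the way (Character.toString(int) is Java 11+).
        s.codePoints().forEach(cp ->
                System.out.printf("U+%04X %s%n", cp, Character.toString(cp)));
    }
}
```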
When you run this program, you’ll see output similar to this:
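```
length(): 9
codePointCount(): 8
U+0048 H
U+0045 e
U+004C l
U+004C l
U+004F o
U+002C ,
U+0020  
U+1F30D 🌍
```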
This example demonstrates how Java handles strings and Unicode characters. Unlike some other languages, Java uses UTF-16 encoding for its strings internally, which means that some characters (those outside the Basic Multilingual Plane) are represented by surrogate pairs and take up two char positions in the string.
The codePoints() method provides a way to iterate over the actual Unicode code points in the string, handling surrogate pairs correctly. This is especially important when dealing with characters from scripts that use characters outside the Basic Multilingual Plane, such as some rare Chinese characters or emoji.
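To see that difference in practice, here is a minimal sketch contrasting char-by-char iteration with codePoints(); the class name and the sample character U+2070E are illustrative choices:

```java
public class CodePointsVsChars {
    public static void main(String[] args) {
        String s = "𠜎"; // U+2070E, a rare CJK character outside the BMP

        // Indexing by char sees the two surrogate halves separately.
        for (int i = 0; i < s.length(); i++) {
            System.out.printf("charAt(%d) = U+%04X%n", i, (int) s.charAt(i));
        }
        // charAt(0) = U+D841, charAt(1) = U+DF0E: surrogates, not real characters

        // codePoints() pairs the surrogates back into the actual code point.
        s.codePoints().forEach(cp ->
                System.out.printf("code point = U+%04X%n", cp));
        // code point = U+2070E
    }
}
```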
Remember that in Java, char is a 16-bit type, while Unicode code points can require up to 21 bits. The int type is used to represent full Unicode code points when necessary.
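A short sketch of that int-based representation, using Character.charCount and Character.toChars from the standard library (the code point U+1F600 is an arbitrary example):

```java
public class CodePointWidth {
    public static void main(String[] args) {
        int cp = 0x1F600; // 😀 needs 17 bits, too wide for a 16-bit char

        // A supplementary code point occupies two char units in UTF-16.
        System.out.println(Character.charCount(cp));   // 2

        // toChars() produces the surrogate pair that encodes the code point.
        char[] units = Character.toChars(cp);
        System.out.printf("U+%04X U+%04X%n", (int) units[0], (int) units[1]);
        System.out.println(new String(units));         // 😀
    }
}
```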