A Java string is a sequence of characters. The language and the standard library represent strings as objects of the immutable String class. Individual characters are represented by the char data type, a 16-bit value that holds a single UTF-16 code unit.
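As a concrete example, consider a small program along these lines (a minimal sketch; the class name CodePointDemo and the sample string, "Hello" followed by the emoji U+1F600, are illustrative choices):

public class CodePointDemo {
    public static void main(String[] args) {
        // "Hello " followed by U+1F600 GRINNING FACE, which lies outside the BMP
        String s = "Hello \uD83D\uDE00";

        // length() counts UTF-16 code units; the emoji occupies two of them
        System.out.println("length:      " + s.length());

        // codePointCount() counts actual Unicode code points
        System.out.println("code points: " + s.codePointCount(0, s.length()));

        // codePoints() yields an IntStream of code points, pairing up surrogates
        s.codePoints().forEach(cp ->
                System.out.printf("U+%04X %s%n", cp, Character.getName(cp)));
    }
}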
When you run this program, you’ll get output similar to this:
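length:      8
code points: 7
U+0048 LATIN CAPITAL LETTER H
U+0065 LATIN SMALL LETTER E
U+006C LATIN SMALL LETTER L
U+006C LATIN SMALL LETTER L
U+006F LATIN SMALL LETTER O
U+0020 SPACE
U+1F600 GRINNING FACE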
This example demonstrates how Java handles Unicode strings and characters. It shows the difference between the length of a string (which counts UTF-16 code units) and the number of Unicode code points in the string. It also shows how to iterate over the code points in a string and how to examine individual code points.
Note that Java's String API is defined in terms of UTF-16, which means that some Unicode characters (those outside the Basic Multilingual Plane) are represented by surrogate pairs and occupy two char positions in the string. The codePointCount method and the codePoints() stream account for surrogate pairs, so they count and traverse actual Unicode characters correctly.
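You can see the difference in a sketch like the following (the class name SurrogateDemo and the choice of U+1D11E MUSICAL SYMBOL G CLEF, a non-BMP character, are arbitrary): charAt returns the individual surrogates, while codePointAt reassembles them into the real character.

public class SurrogateDemo {
    public static void main(String[] args) {
        // U+1D11E MUSICAL SYMBOL G CLEF, stored as the surrogate pair D834 DD1E
        String s = "\uD834\uDD1E";

        System.out.println(s.length());                       // 2: two code units
        System.out.println(s.codePointCount(0, s.length()));  // 1: one code point

        // charAt() sees the individual surrogates, not the character itself
        char high = s.charAt(0);
        char low  = s.charAt(1);
        System.out.println(Character.isHighSurrogate(high));  // true
        System.out.println(Character.isLowSurrogate(low));    // true

        // codePointAt() combines the pair into the actual code point
        System.out.printf("U+%X%n", s.codePointAt(0));        // U+1D11E
    }
}

This is why index-based char access can silently split a non-BMP character in two, while the code point methods never do.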