Strings and Runes in Java

This Java program demonstrates working with strings and characters. A Java String is a sequence of char values, and each char is a 16-bit UTF-16 code unit. Unlike Go, Java has no separate “rune” type: a full Unicode code point is represented as an int, and a character outside the Basic Multilingual Plane occupies two char values (a surrogate pair).

import java.nio.charset.StandardCharsets;

public class StringsAndChars {
    public static void main(String[] args) {
        // s is a String assigned a literal value
        // representing the word "hello" in the Thai language.
        // The String API exposes this text as UTF-16 code units.
        final String s = "สวัสดี";

        // length() returns the number of UTF-16 code units (char
        // values), not the number of bytes or code points.
        System.out.println("Len: " + s.length());

        // We can get the raw bytes of the UTF-8 representation
        byte[] utf8Bytes = s.getBytes(StandardCharsets.UTF_8);
        System.out.print("UTF-8 bytes: ");
        for (byte b : utf8Bytes) {
            System.out.printf("%x ", b);
        }
        System.out.println();

        // To count Unicode characters (code points), use
        // codePointCount(). length() counts char values, so a
        // character stored as a surrogate pair would be counted
        // twice.
        System.out.println("Character count: " + s.codePointCount(0, s.length()));

        // We can iterate over each char by index. The index counts
        // char values, not bytes.
        for (int i = 0; i < s.length(); i++) {
            System.out.printf("U+%04X starts at %d\n", (int)s.charAt(i), i);
        }

        // We can also use a for-each loop to iterate over the characters
        System.out.println("\nUsing for-each loop");
        for (char c : s.toCharArray()) {
            System.out.printf("U+%04X\n", (int)c);
        }

        // This demonstrates passing a char value to a function
        for (char c : s.toCharArray()) {
            examineChar(c);
        }
    }

    private static void examineChar(char c) {
        // We can compare a char value to a char literal directly
        if (c == 't') {
            System.out.println("found tee");
        } else if (c == 'ส') {
            System.out.println("found so sua");
        }
    }
}

To run this program, save it as StringsAndChars.java, compile it with javac StringsAndChars.java, and then run it with java StringsAndChars.

The output will be similar to:

Len: 6
UTF-8 bytes: e0 b8 aa e0 b8 a7 e0 b8 b1 e0 b8 aa e0 b8 94 e0 b8 b5 
Character count: 6
U+0E2A starts at 0
U+0E27 starts at 1
U+0E31 starts at 2
U+0E2A starts at 3
U+0E14 starts at 4
U+0E35 starts at 5

Using for-each loop
U+0E2A
U+0E27
U+0E31
U+0E2A
U+0E14
U+0E35
found so sua
found so sua
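
The byte dump above shows three UTF-8 bytes for each Thai character, while the index loop reports positions in char values. As a rough illustration of how the two relate, the following sketch walks the string by code point and prints the UTF-8 byte offset and width of each one, which is approximately what a Go range loop over a string reports. The class name Utf8Widths and the helper utf8Width are hypothetical names chosen for this sketch, not part of the program above.

```java
import java.nio.charset.StandardCharsets;

public class Utf8Widths {
    // Hypothetical helper: number of UTF-8 bytes needed to encode
    // a single valid code point.
    static int utf8Width(int codePoint) {
        String one = new String(Character.toChars(codePoint));
        return one.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        String s = "สวัสดี";
        int byteOffset = 0;
        // Advance by code point, not by char, so surrogate pairs
        // would be handled as one character.
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);
            int width = utf8Width(cp);
            System.out.printf("U+%04X starts at byte %d, width %d%n",
                    cp, byteOffset, width);
            byteOffset += width;
            i += Character.charCount(cp);
        }
    }
}
```

For this string every code point has width 3, so the byte offsets advance 0, 3, 6, 9, 12, 15, matching the 18 bytes printed earlier.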

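The StringsAndChars program mirrors the original Go example, using Java syntax and methods. The main difference is the encoding model: Java’s String API works in UTF-16 code units, whereas Go strings are UTF-8 byte sequences and a Go rune is a full code point. For the Thai text used here the two views agree, because every character fits in a single char. For characters outside the Basic Multilingual Plane, length() and charAt() see two char values per character, so code-point-aware methods such as codePointAt() and codePointCount() are needed.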
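
To make the difference outside the Basic Multilingual Plane concrete, here is a minimal sketch using a string that contains U+1F600, written as the surrogate pair \uD83D\uDE00. The class name CodePointDemo and the helper countCodePoints are hypothetical names used only for illustration; the String and Character methods they call are standard.

```java
public class CodePointDemo {
    // Hypothetical helper: counts Unicode code points rather than
    // UTF-16 code units.
    static int countCodePoints(String s) {
        return s.codePointCount(0, s.length());
    }

    public static void main(String[] args) {
        // "a" followed by U+1F600, which needs a surrogate pair.
        String s = "a\uD83D\uDE00";

        // length() counts char values (UTF-16 code units).
        System.out.println("length(): " + s.length());
        System.out.println("code points: " + countCodePoints(s));

        // codePoints() yields each code point as an int, the
        // closest Java analogue to iterating over Go runes.
        s.codePoints().forEach(cp ->
                System.out.printf("U+%04X uses %d char(s)%n",
                        cp, Character.charCount(cp)));
    }
}
```

Here length() reports 3 while the code point count is 2, because the second character is stored as two char values.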