Strings and Runes in Groovy

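Groovy strings are Java strings with some additional features, such as interpolation in double-quoted strings. Their API exposes a sequence of UTF-16 code units, not UTF-8 bytes. The closest Groovy counterpart to Go’s runes is the char type (java.lang.Character), although a char holds a single UTF-16 code unit rather than a full Unicode code point.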

// The Thai word for "hello"
def s = "สวัสดี"

// Print the length of the string in UTF-8 bytes
// (s.length() would return 6, the number of UTF-16 code units)
println "Len: ${s.getBytes('UTF-8').length}"

// Print the hex values of all UTF-8 bytes in the string
s.getBytes('UTF-8').each { byte b ->
    print String.format("%02x ", b)
}
println()

// Count the Unicode code points (the closest equivalent to Go's rune count)
println "Character count: ${s.codePointCount(0, s.length())}"

// Iterate over the string; each element is a one-character String
s.eachWithIndex { ch, idx ->
    println "${ch.inspect()} starts at $idx"
}

println "\nUsing toCharArray()"
s.toCharArray().eachWithIndex { ch, idx ->
    // ch is a Character here; wrap it in quotes explicitly
    println "'${ch}' starts at $idx"
    examineCharacter(ch)
}

def examineCharacter(char c) {
    if (c == 't' as char) {
        println "found tee"
    } else if (c == 'ส' as char) {
        println "found so sua"
    }
}

When you run this Groovy script, you’ll see output similar to the following:

Len: 18
e0 b8 aa e0 b8 a7 e0 b8 b1 e0 b8 aa e0 b8 94 e0 b8 b5 
Character count: 6
'ส' starts at 0
'ว' starts at 1
'ั' starts at 2
'ส' starts at 3
'ด' starts at 4
'ี' starts at 5

Using toCharArray()
'ส' starts at 0
found so sua
'ว' starts at 1
'ั' starts at 2
'ส' starts at 3
found so sua
'ด' starts at 4
'ี' starts at 5

This Groovy code demonstrates several concepts:

  1. Groovy strings are Unicode strings exposed as UTF-16 code units, so Thai characters can be used directly in string literals.

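  2. The length() method returns the number of UTF-16 code units (6 for this string), not bytes. The byte count comparable to len(s) in Go is the length of the UTF-8 encoded byte array, which is 18 here.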

  3. We can iterate over the encoded bytes of a string and print their hex values. The bytes property uses the JVM's default charset; calling getBytes('UTF-8') makes the encoding explicit.

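  4. The size() method is an alias for length() and counts UTF-16 code units. It matches Go's rune count only when every character lies in the Basic Multilingual Plane, as here; codePointCount(0, s.length()) counts code points in general.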

  5. We can use eachWithIndex to iterate over each character in the string along with its index.

  6. The toCharArray() method converts the string to a character array, which we can then iterate over.

  7. We define an examineCharacter method that checks for specific characters, similar to the examineRune function in the Go example.

  8. Groovy has no character literal syntax: single quotes create a String, so we coerce it with as char (as in 't' as char), whereas Go uses single quotes for rune literals.
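
The last point can be checked with a short sketch. It relies only on the Groovy language itself; the variable names are illustrative and not part of the script above.

```groovy
// Single quotes produce a String, not a char
def str = 't'
assert str instanceof String

// Three ways to obtain a char value from a one-character String
char typed = 't'            // a typed declaration coerces the String
def coerced = 't' as char   // explicit coercion with "as"
def cast = (char) 't'       // explicit cast

assert coerced instanceof Character
assert typed == coerced && coerced == cast

println coerced.getClass().name   // prints java.lang.Character
```

Without one of these coercions, a comparison such as c == 't' would compare a char against a String, so spelling out the coercion keeps the intent clear.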

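Groovy has no direct equivalent to Go’s runes. Its char type is a 16-bit UTF-16 code unit, while a Go rune is a 32-bit integer holding a full Unicode code point. A char can therefore represent only code points in the Basic Multilingual Plane; characters outside it, such as most emoji, occupy two chars (a surrogate pair) and are best handled as int code points through methods like codePointAt() and codePoints().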
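
A minimal sketch of the difference between chars and code points, assuming a string that contains one character outside the Basic Multilingual Plane (U+1F600 is chosen here purely for illustration). It uses only standard java.lang.String and java.lang.Character methods, and the variable names are hypothetical.

```groovy
// Build "a" followed by U+1F600, which needs a surrogate pair in UTF-16
def text = "a" + new String(Character.toChars(0x1F600))

assert text.getBytes('UTF-8').length == 5            // 1 byte for 'a' + 4 for U+1F600
assert text.length() == 3                            // UTF-16 code units: 1 + 2
assert text.codePointCount(0, text.length()) == 2    // code points, like Go's rune count

// Walk the string by code point, tracking the UTF-16 index where each one starts
def starts = []
int i = 0
while (i < text.length()) {
    int cp = text.codePointAt(i)
    starts << String.format("U+%04X starts at %d", cp, i)
    i += Character.charCount(cp)   // advance by 1 or 2 chars
}
starts.each { println it }
// prints:
// U+0061 starts at 0
// U+1F600 starts at 1
```

Iterating with toCharArray() over the same string would yield three chars, two of which are surrogate halves that are not meaningful on their own.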