Strings and Runes in Fortress

This program demonstrates string handling and runes (Unicode code points) in Fortress. Here’s the full source code:

component main
  export Executable
  run() = do
    s = "สวัสดี"
    println("Len: " ++ s.size().toString())

    (* 0 # n yields n indices starting at 0; seq keeps the iteration in order *)
    for i <- seq(0 # s.size()) do
      print(s[i].toHexString() ++ " ")
    end
    println()

    println("Rune count: " ++ s.codePoints().size().toString())

    for (idx, runeValue) <- seq(s.codePoints().zipWithIndex()) do
      println(runeValue.toUnicodeEscape() ++ " starts at " ++ idx.toString())
    end

    println("\nUsing explicit iteration")
    (* declared mutable because it is advanced inside the loop *)
    i: ZZ32 := 0
    while i < s.size() do
      (runeValue, width) = s.codePointAt(i)
      println(runeValue.toUnicodeEscape() ++ " starts at " ++ i.toString())
      examineRune(runeValue)
      i += width
    end
  end

  examineRune(r: CodePoint): () = do
    if r == 't' then
      println("found tee")
    elif r == 'ส' then
      println("found so sua")
    end
  end
end

This program works with strings and Unicode code points, which we call runes here. It treats a string as a sequence of encoded code units, so a single code point may span several of them.

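We start by declaring a string s containing the Thai greeting สวัสดี. The size() method returns the length of the string in code units, which for non-ASCII text usually differs from the number of visible characters. Each of these Thai characters takes three bytes in UTF-8, so the six-character string has a length of 18.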

We then iterate over each code unit in the string, printing its hexadecimal representation. This shows the raw bytes that make up the UTF-8 encoding of the string.
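The byte-level view can be reproduced in a language with a readily available runtime. The following is a minimal Python sketch, assuming the string is stored as UTF-8; hex_dump is a hypothetical helper name, not part of any Fortress library.

```python
# Sketch: view a string as UTF-8 code units (bytes).
# hex_dump is a hypothetical helper introduced for illustration.

def hex_dump(data: bytes) -> str:
    """Return the bytes as space-separated two-digit hex values."""
    return " ".join(f"{b:02x}" for b in data)

s = "สวัสดี"
encoded = s.encode("utf-8")  # explicit UTF-8 encoding of the string

print("Len:", len(encoded))  # length in bytes, not in characters
print(hex_dump(encoded))
```

Each Thai character contributes three bytes, all beginning with the lead byte e0.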

To count the actual number of Unicode characters (code points), we use the codePoints() method, which returns an array of code points.
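As a rough Python analogue of counting code points, the sketch below relies on the fact that a Python str is already a sequence of code points. The unicode_escape helper is a hypothetical stand-in for the toUnicodeEscape() call in the listing.

```python
# Sketch: count code points and format each one as U+XXXX.
# unicode_escape is a hypothetical helper introduced for illustration.

def unicode_escape(ch: str) -> str:
    """Format a single character as U+ followed by at least four hex digits."""
    return f"U+{ord(ch):04X}"

s = "สวัสดี"
escapes = [unicode_escape(ch) for ch in s]  # one entry per code point

print("Rune count:", len(s))  # len() of a str counts code points
for idx, esc in enumerate(escapes):
    print(esc, "is code point number", idx)
```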

We demonstrate two ways of iterating over the code points in the string:

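  1. Using a for loop with zipWithIndex() to get each code point together with its position in the sequence of code points (0 through 5 here, not a byte offset).
  2. Using explicit iteration with codePointAt(), which returns both the code point and its width in code units. Advancing by the width is why the reported offsets in this case go up in steps of 3.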
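The explicit iteration can be sketched in Python over UTF-8 bytes. This is an illustration under assumptions, not the Fortress implementation: code_point_at and offsets are hypothetical names, and the width is derived from the lead byte as UTF-8 defines it.

```python
# Sketch: walk a UTF-8 byte sequence one code point at a time.
# code_point_at and offsets are hypothetical names introduced for illustration.

def code_point_at(data: bytes, i: int):
    """Decode the code point starting at byte i; return (code_point, width)."""
    lead = data[i]
    if lead < 0x80:
        width = 1                      # ASCII
    elif lead >> 5 == 0b110:
        width = 2                      # lead byte 110xxxxx
    elif lead >> 4 == 0b1110:
        width = 3                      # lead byte 1110xxxx
    elif lead >> 3 == 0b11110:
        width = 4                      # lead byte 11110xxx
    else:
        raise ValueError(f"byte {i} is not a valid UTF-8 lead byte")
    ch = data[i:i + width].decode("utf-8")
    return ord(ch), width

def offsets(s: str):
    """Return (byte_offset, code_point) pairs for every code point in s."""
    data = s.encode("utf-8")
    result = []
    i = 0
    while i < len(data):
        cp, width = code_point_at(data, i)
        result.append((i, cp))
        i += width                     # skip to the next lead byte
    return result

for offset, cp in offsets("สวัสดี"):
    print(f"U+{cp:04X} starts at {offset}")
```

Contrast this with enumerating the code points directly, which yields positions 0 through 5 rather than byte offsets.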

The examineRune function demonstrates how to compare a code point with specific Unicode characters.

Here’s an example of what the output might look like:

Len: 18
e0 b8 aa e0 b8 a7 e0 b8 b1 e0 b8 aa e0 b8 94 e0 b8 b5
Rune count: 6
U+0E2A starts at 0
U+0E27 starts at 1
U+0E31 starts at 2
U+0E2A starts at 3
U+0E14 starts at 4
U+0E35 starts at 5

Using explicit iteration
U+0E2A starts at 0
found so sua
U+0E27 starts at 3
U+0E31 starts at 6
U+0E2A starts at 9
found so sua
U+0E14 starts at 12
U+0E35 starts at 15

Note that the method names used above, such as codePoints(), codePointAt(), and toUnicodeEscape(), follow Fortress-like syntax and idioms but are not guaranteed to match the actual Fortress libraries, which may use different conventions for these operations.