Strings and Runes in Standard ML

(* A Standard ML string is an immutable sequence of `char` values.
   In common implementations such as SML/NJ and MLton, `char` is
   8 bits wide (Char.maxOrd = 255), so a string is effectively a
   sequence of bytes. Standard ML has no built-in notion of 'runes'
   or 'code points' as in some other languages: UTF-8 text is stored
   as its raw bytes, and a non-ASCII character spans several `char`s. *)

(* The `explode` function converts a string to a list of characters *)
fun explodeString s = explode s

(* The `size` function gives the number of `char`s in a string (bytes, for UTF-8 text) *)
fun stringLength s = size s

(* The `Char.toString` function converts a character to its printable
   SML source form; non-ASCII chars come out as escapes such as \224 *)
fun charToString c = Char.toString c

(* The `ord` function converts a character to its integer code. With 8-bit
   `char` this is a byte value in 0..255, not a full Unicode code point *)
fun charToCodePoint c = ord c

(* Main function to demonstrate string operations *)
fun main () =
    let
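        (* Define a string with Thai characters. This assumes the file is saved
           as UTF-8 and the compiler accepts non-ASCII bytes in string literals;
           each of the six characters is then stored as three bytes. *)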
        val s = "สวัสดี"
        
        (* Print the length of the string in chars (bytes): 18 here, not 6 *)
        val _ = print ("Len: " ^ Int.toString (stringLength s) ^ "\n")
        
        (* Convert the string to a list of chars and print each byte value in decimal *)
        val _ = List.app (fn c => 
            print (Int.toString (charToCodePoint c) ^ " ")
        ) (explodeString s)
        val _ = print "\n"
        
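        (* Count the Unicode characters, assuming s is valid UTF-8: every code
           point has exactly one byte that is not a continuation byte (128..191) *)
        val _ = print ("Character count: " ^ Int.toString (List.length
            (List.filter (fn c => ord c < 128 orelse ord c >= 192) (explodeString s))) ^ "\n")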
        
        (* Iterate over the chars (bytes) with their positions. The Basis Library
           has no List.appi; CharVector.appi walks the string directly. *)
        val _ = CharVector.appi (fn (idx, c) =>
            print ("0x" ^ StringCvt.padLeft #"0" 2 (Int.fmt StringCvt.HEX (charToCodePoint c)) ^
                   " '" ^ charToString c ^ "' at index " ^ Int.toString idx ^ "\n")
        ) s
        
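        (* Demonstrate character comparison. A char literal holds exactly one
           char, so a multi-byte character cannot be written as a char literal;
           it is compared as a string instead. *)
        fun examineChar c =
            if c = #"t" then
                print "found tee\n"
            else
                ()

        val _ = List.app examineChar (explodeString s)
        val _ = if String.isPrefix "ส" s then print "found so sua\n" else ()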
    in
        ()
    end

(* Run the main function *)
val _ = main ()

This Standard ML code demonstrates string and character handling, which is somewhat different from Go’s approach. Here are some key points:

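  1. Standard ML doesn’t have a separate concept of ‘runes’. Its char type is 8 bits wide in common implementations such as SML/NJ and MLton, so a single char cannot hold an arbitrary Unicode character.

  2. Strings in Standard ML are immutable sequences of char values.

  3. The explode function converts a string to a list of char values. For UTF-8 text this resembles indexing a Go string byte by byte, not ranging over its runes.

  4. The size function gives the length of a string in char values, which for UTF-8 text means bytes, like len in Go. The six-character Thai string above has size 18.

  5. We use ord to get the integer code of a char. For a non-ASCII character this is the value of one byte of its UTF-8 encoding, not the code point of the whole character.

  6. The Basis Library doesn’t have built-in UTF-8 decoding functions like Go’s utf8.DecodeRuneInString, so code points have to be decoded from the bytes by hand or with a third-party library.

  7. Character literals in Standard ML are written with a # prefix, like #"t". A literal holds exactly one char, so a multi-byte character such as ส cannot be written as a char literal and has to be handled as a string.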
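
Because nothing in the Basis Library decodes UTF-8, recovering rune-like values means walking the bytes yourself. The following is only a minimal sketch of how that might look. The names decodeUtf8, byte, cont and go are invented for this example, and the decoder assumes well-formed input: it raises Fail on a bad lead or continuation byte but does not reject overlong encodings or surrogates. The sample string is spelled with decimal escapes so that it does not depend on how the compiler treats non-ASCII source bytes.

```sml
(* Decode a UTF-8 string into (byte offset, code point) pairs.
   Hypothetical helper, not part of the Basis Library. *)
fun decodeUtf8 (s : string) : (int * int) list =
    let
        val n = size s
        fun byte i = Char.ord (String.sub (s, i))

        (* Fold `count` continuation bytes (10xxxxxx) starting at index i
           into the accumulated code point, six payload bits at a time. *)
        fun cont (_, 0, acc) = acc
          | cont (i, count, acc) =
                let
                    val b = if i < n then byte i
                            else raise Fail "truncated sequence"
                in
                    if b >= 0x80 andalso b < 0xC0
                    then cont (i + 1, count - 1, acc * 64 + (b - 0x80))
                    else raise Fail "bad continuation byte"
                end

        fun go i =
            if i >= n then []
            else
                let
                    val b = byte i
                    (* The lead byte gives the sequence length and the
                       first payload bits. *)
                    val (len, init) =
                        if b < 0x80 then (1, b)
                        else if b >= 0xC0 andalso b < 0xE0 then (2, b - 0xC0)
                        else if b >= 0xE0 andalso b < 0xF0 then (3, b - 0xE0)
                        else if b >= 0xF0 andalso b < 0xF8 then (4, b - 0xF0)
                        else raise Fail "bad lead byte"
                in
                    (i, cont (i + 1, len - 1, init)) :: go (i + len)
                end
    in
        go 0
    end

(* The first two characters of the Thai string, written as UTF-8 bytes. *)
val sample = "\224\184\170\224\184\167"

(* Prints "U+0E2A starts at 0" and "U+0E27 starts at 3". *)
val () = List.app (fn (idx, cp) =>
    print ("U+" ^ StringCvt.padLeft #"0" 4 (Int.fmt StringCvt.HEX cp) ^
           " starts at " ^ Int.toString idx ^ "\n")
) (decodeUtf8 sample)
```

With a decoder like this, List.length (decodeUtf8 s) plays the role of Go's utf8.RuneCountInString, and the offsets correspond to the indices that Go's range loop reports for each rune.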

To run this program, you would typically save it to a file (e.g., string_demo.sml) and then use an SML interpreter or compiler. For example, with Standard ML of New Jersey (SML/NJ):

$ sml string_demo.sml

SML/NJ loads the file, runs main, and then leaves you at its interactive prompt (press Ctrl-D to exit). Unlike the Go version, the output is byte-oriented, because Standard ML treats the string as a sequence of 8-bit char values.
