Strings and Runes in Standard ML
(* A Standard ML string is an immutable sequence of `char` values. In the
   Basis Library `char` is normally 8 bits wide, so a `char` is a byte, not
   a Unicode code point, and there is no built-in equivalent of Go's `rune`.
   Non-ASCII text is therefore handled here as UTF-8 bytes and decoded by
   hand. *)

(* `explode` converts a string to a list of chars, i.e. bytes *)
fun explodeString s = explode s

(* `size` gives the length of a string in chars, i.e. bytes *)
fun stringLength s = size s

(* `ord` converts a char to its integer value, 0 to 255 for 8-bit chars *)
fun charToByte c = ord c

(* Decode UTF-8 into (byte offset, encoded text, code point) triples.
   A minimal sketch: it assumes well-formed UTF-8 and does no validation. *)
fun decodeUtf8 s =
  let
    fun byte i = ord (String.sub (s, i))
    (* Fold k continuation bytes starting at index i into acc *)
    fun cont (_, 0, acc) = acc
      | cont (i, k, acc) = cont (i + 1, k - 1, acc * 64 + byte i mod 64)
    fun loop i =
      if i >= size s then []
      else
        let
          val b = byte i
          (* The lead byte gives the sequence length and the first payload bits *)
          val (len, lead) =
            if b < 128 then (1, b)
            else if b < 224 then (2, b mod 32)
            else if b < 240 then (3, b mod 16)
            else (4, b mod 8)
        in
          (i, String.substring (s, i, len), cont (i + 1, len - 1, lead))
          :: loop (i + len)
        end
  in
    loop 0
  end

(* Main function to demonstrate string operations *)
fun main () =
  let
    (* A string with Thai characters. This assumes the source file is UTF-8
       and that the compiler passes non-ASCII bytes in string literals
       through unchanged. *)
    val s = "สวัสดี"
    val runes = decodeUtf8 s

    (* Print the length of the string in bytes *)
    val _ = print ("Len: " ^ Int.toString (stringLength s) ^ "\n")

    (* Print the value of each byte *)
    val _ = List.app (fn c =>
      print (Int.toString (charToByte c) ^ " ")
    ) (explodeString s)
    val _ = print "\n"

    (* Count the code points rather than the bytes *)
    val _ = print ("Code point count: " ^ Int.toString (List.length runes) ^ "\n")

    (* Print each code point with its encoded text and starting byte offset *)
    val _ = List.app (fn (idx, text, cp) =>
      print ("U+" ^ StringCvt.padLeft #"0" 4 (Int.fmt StringCvt.HEX cp) ^
             " '" ^ text ^ "' starts at " ^ Int.toString idx ^ "\n")
    ) runes

    (* Compare code points. A multi-byte character cannot be written as a
       char literal, so so sua is given as its code point, U+0E2A. *)
    fun examineCodePoint cp =
      if cp = ord #"t" then
        print "found tee\n"
      else if cp = 0x0E2A then
        print "found so sua\n"
      else
        ()
    val _ = List.app (fn (_, _, cp) => examineCodePoint cp) runes
  in
    ()
  end

(* Run the main function *)
val _ = main ()
This Standard ML code demonstrates string and character handling, which differs from Go’s approach. Here are some key points:
Standard ML doesn’t have a separate concept of ‘runes’. Its char type is 8 bits wide in the usual implementations, so it holds a byte rather than a Unicode code point.
Strings in Standard ML are immutable sequences of chars.
The explode function converts a string to a list of chars. For UTF-8 text this resembles iterating over the bytes of a Go string, not over its runes.
The size function gives the length of a string in chars, which for UTF-8 text means bytes, not characters.
The ord function returns the integer value of a char (0 to 255 for 8-bit chars). To get something comparable to a Go rune value, the UTF-8 bytes must first be decoded into code points.
The Basis Library doesn’t have a built-in UTF-8 decoding function like Go’s utf8.DecodeRuneInString, so decoding has to be written by hand or taken from a third-party library.
Character literals in Standard ML are written with a # prefix, like #"t". A literal such as #"ส" is not valid, because that character occupies three bytes in UTF-8; compare against its code point, 0x0E2A, instead.
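As a quick check of how size, explode, and ord treat non-ASCII text, here is a minimal sketch. It writes the character ส (U+0E2A) as its three UTF-8 bytes using decimal escapes, so it does not depend on how a compiler reads non-ASCII source. The names soSua, isContinuation, and countCodePoints are illustrative only, and the counting shortcut assumes the input is valid UTF-8.

```sml
(* U+0E2A as its three UTF-8 bytes: 0xE0 0xB8 0xAA *)
val soSua = "\224\184\170"

(* size and explode both work on 8-bit chars, i.e. bytes *)
val byteCount = size soSua                      (* 3 *)
val byteValues = List.map ord (explode soSua)   (* [224, 184, 170] *)

(* A byte is a UTF-8 continuation byte when its top two bits are 10 *)
fun isContinuation c =
  Word.andb (Word.fromInt (ord c), 0wxC0) = 0wx80

(* In valid UTF-8, every code point has exactly one non-continuation byte,
   so counting those bytes counts the code points *)
fun countCodePoints s =
  List.length (List.filter (not o isContinuation) (explode s))

val n = countCodePoints (soSua ^ "abc")         (* 4 *)
```

Counting non-continuation bytes is enough when only the number of code points is needed; recovering the code point values themselves requires decoding each byte sequence.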
To run this program, save it to a file (e.g., string_demo.sml) and load it with an SML interpreter or compiler. For example, with Standard ML of New Jersey (SML/NJ):
$ sml string_demo.sml
SML/NJ compiles and runs the program, then stays at the interactive prompt; press Ctrl-D to exit. The output is similar to that of the Go version, but reflects the Standard ML approach to string and character handling.