Strings and Runes in C#

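Our first example demonstrates strings and runes in C#. In C#, a string is an immutable sequence of UTF-16 code units, each stored as a 16-bit char. A char is not quite the same thing as a Go rune: a rune holds a full Unicode code point, while a char holds a single UTF-16 code unit, so a code point outside the Basic Multilingual Plane takes two char values (a surrogate pair). The closest C# equivalent of a rune is System.Text.Rune, available since .NET Core 3.0. This example sticks to char, which is sufficient for the Thai text used here.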

using System;
using System.Linq;
using System.Text;

class StringsAndRunes
{
    static void Main()
    {
        // s is a string assigned a literal value
        // representing the word "hello" in the Thai language.
        // C# string literals are UTF-16 encoded text.
        const string s = "สวัสดี";

        // Length is the number of char values (UTF-16 code units) in the string.
        Console.WriteLine($"Len: {s.Length}");

        // In C#, we can iterate through the characters of a string directly.
        // This loop generates the hex values of all the characters in s.
        foreach (char c in s)
        {
            Console.Write($"{(int)c:X4} ");
        }
        Console.WriteLine();

        // Length counts UTF-16 code units, not code points. Every character in s
        // lies in the Basic Multilingual Plane, so the two counts are equal here.
        Console.WriteLine($"Character count: {s.Length}");

        // foreach over a string yields one char (UTF-16 code unit) at a time,
        // so the offset of each one advances by exactly 1.
        int index = 0;
        foreach (char c in s)
        {
            Console.WriteLine($"U+{(int)c:X4} '{c}' starts at {index}");
            index++;
        }

        // We can achieve a similar iteration by using the string's indexer explicitly.
        Console.WriteLine("\nUsing string indexer");
        for (int i = 0; i < s.Length; i++)
        {
            char c = s[i];
            Console.WriteLine($"U+{(int)c:X4} '{c}' starts at {i}");
            ExamineChar(c);
        }
    }

    static void ExamineChar(char c)
    {
        // We can compare a char value to a char literal directly.
        if (c == 't')
        {
            Console.WriteLine("found tee");
        }
        else if (c == 'ส')
        {
            Console.WriteLine("found so sua");
        }
    }
}

To run the program, save it as StringsAndRunes.cs, compile it with the C# compiler, and run the resulting executable with Mono:

$ csc StringsAndRunes.cs
$ mono StringsAndRunes.exe
Len: 6
0E2A 0E27 0E31 0E2A 0E14 0E35 
Character count: 6
U+0E2A 'ส' starts at 0
U+0E27 'ว' starts at 1
U+0E31 'ั' starts at 2
U+0E2A 'ส' starts at 3
U+0E14 'ด' starts at 4
U+0E35 'ี' starts at 5

Using string indexer
U+0E2A 'ส' starts at 0
found so sua
U+0E27 'ว' starts at 1
U+0E31 'ั' starts at 2
U+0E2A 'ส' starts at 3
found so sua
U+0E14 'ด' starts at 4
U+0E35 'ี' starts at 5

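This C# example shows how to work with strings and char values. A char is a UTF-16 code unit, not a direct equivalent of a Go rune, which holds a whole code point. C# strings are UTF-16 internally, whereas Go strings are conventionally UTF-8, so the Thai string above occupies 6 char values in C# but 18 bytes in Go, and lengths and offsets differ between the two languages. Characters outside the Basic Multilingual Plane need extra care: each one occupies two char values (a surrogate pair), so Length and char-by-char iteration no longer correspond to code points.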
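
To see what happens outside the Basic Multilingual Plane, here is a minimal sketch that iterates by code point using System.Text.Rune. It assumes .NET Core 3.0 or later, where string.EnumerateRunes() is available, so it may not build with the csc and mono commands shown earlier. The class name RuneSketch, the helper CountRunes, and the emoji appended to the string are illustrative choices, not part of the original example.

```csharp
using System;
using System.Linq;
using System.Text;

class RuneSketch
{
    // Hypothetical helper: counts Unicode scalar values (runes)
    // rather than UTF-16 code units.
    public static int CountRunes(string text) => text.EnumerateRunes().Count();

    static void Main()
    {
        // U+1F600 lies outside the BMP, so UTF-16 stores it as a surrogate pair.
        const string s = "สวัสดี\U0001F600";

        // Length counts code units; CountRunes counts code points.
        Console.WriteLine($"Length (UTF-16 code units): {s.Length}");
        Console.WriteLine($"Rune count: {CountRunes(s)}");

        // Each Rune reports how many char values it occupies,
        // which gives the correct starting offset of the next one.
        int index = 0;
        foreach (Rune r in s.EnumerateRunes())
        {
            Console.WriteLine($"U+{r.Value:X4} starts at {index}");
            index += r.Utf16SequenceLength;
        }
    }
}
```

For this string, Length is 8 while the rune count is 7, and the final rune, U+1F600, starts at offset 6 and spans two char values.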