Strings and Runes in Crystal

A Crystal string is an immutable sequence of Unicode characters, stored in memory as UTF-8 encoded bytes. A single character is represented by the `Char` type, which holds one 32-bit Unicode code point. Because UTF-8 is a variable-width encoding, the number of bytes in a string and the number of characters in it can differ.
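As a small illustration of the difference between a `Char` value and its UTF-8 encoding, here is a minimal sketch. The variable names are chosen only for this example.

```crystal
# A `Char` literal uses single quotes; a `String` literal uses double quotes.
thai_char = 'ส'
latin_char = 't'

# `ord` returns the Unicode code point as an integer.
puts thai_char.ord # => 3626 (0x0e2a)

# `bytesize` on a Char is the number of bytes its UTF-8 encoding needs.
puts thai_char.bytesize  # => 3
puts latin_char.bytesize # => 1

# The Char value itself always occupies 4 bytes (32 bits) in memory.
puts sizeof(Char) # => 4
```

The encoded width varies per character, while the in-memory `Char` is always the same size.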

```crystal
# `s` is a `String` holding the Thai word for "hello".
# Crystal strings are UTF-8 encoded text.
s = "สวัสดี"

# A string is stored as UTF-8 bytes, so `bytesize` returns the number of
# raw bytes, not the number of characters.
puts "Len: #{s.bytesize}"

# This loop prints the hex value of every byte that makes up the code points in `s`.
s.each_byte do |byte|
  print "#{byte.to_s(16)} "
end
puts

# To count the characters (Unicode code points) in a string, use `size`.
# Each Thai character here is encoded as three bytes in UTF-8, and some of
# them are combining marks, so the result of this count may be surprising.
puts "Char count: #{s.size}"

# `each_char_with_index` yields each character together with its character
# index (not its byte offset).
s.each_char_with_index do |char, idx|
  puts "U+#{char.ord.to_s(16).rjust(4, '0')} '#{char}' starts at #{idx}"
end

puts "\nUsing char_at method"
i = 0
while i < s.size
  char = s.char_at(i)
  puts "U+#{char.ord.to_s(16).rjust(4, '0')} '#{char}' starts at #{i}"
  examine_char(char)
  i += 1
end

# This demonstrates passing a `Char` value to a method.

def examine_char(c : Char)
  # We can compare a `Char` value to a character literal directly.
  if c == 't'
    puts "found tee"
  elsif c == 'ส'
    puts "found so sua"
  end
end
```

When you run this program, you’ll see output similar to this:

```
Len: 18
e0 b8 aa e0 b8 a7 e0 b8 b1 e0 b8 aa e0 b8 94 e0 b8 b5
Char count: 6
U+0e2a 'ส' starts at 0
U+0e27 'ว' starts at 1
U+0e31 'ั' starts at 2
U+0e2a 'ส' starts at 3
U+0e14 'ด' starts at 4
U+0e35 'ี' starts at 5

Using char_at method
U+0e2a 'ส' starts at 0
found so sua
U+0e27 'ว' starts at 1
U+0e31 'ั' starts at 2
U+0e2a 'ส' starts at 3
found so sua
U+0e14 'ด' starts at 4
U+0e35 'ี' starts at 5
```

This example demonstrates how Crystal handles Unicode strings and characters. It shows various methods for working with strings and individual characters, including byte-level operations, character counting, and iteration over characters.
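The positions printed by the program are character indexes, not byte offsets. If byte offsets are wanted instead, one possible approach is to accumulate each character's UTF-8 width while iterating. The sketch below assumes a helper named `byte_offsets`, which is hypothetical and not part of the original program or of Crystal's standard library.

```crystal
# Hypothetical helper: pairs each character with the byte offset at which
# its UTF-8 encoding begins inside the string.
def byte_offsets(str : String) : Array(Tuple(Char, Int32))
  result = [] of Tuple(Char, Int32)
  offset = 0
  str.each_char do |char|
    result << {char, offset}
    # Advance by the number of bytes this character occupies in UTF-8.
    offset += char.bytesize
  end
  result
end

s = "สวัสดี"
byte_offsets(s).each do |char, offset|
  puts "U+#{char.ord.to_s(16).rjust(4, '0')} '#{char}' starts at byte #{offset}"
end
```

For this string every character is three bytes wide, so the offsets advance in steps of three (0, 3, 6, 9, 12, 15), matching the 18 bytes reported by `bytesize`.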
