Strings and Runes in COBOL
COBOL handles strings differently from most modern languages. Text is stored in character fields declared with a PICTURE clause, such as PIC X(18); these fields are usually fixed-length, though variable-length fields are also possible. Traditional COBOL has no native notion of Unicode or UTF-8, but some modern COBOL compilers provide extensions for Unicode support.
       IDENTIFICATION DIVISION.
       PROGRAM-ID. STRING-EXAMPLE.
       ENVIRONMENT DIVISION.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      *> 18 bytes: six Thai characters at 3 bytes each in UTF-8
       01 WS-THAI-HELLO PIC X(18) VALUE "สวัสดี".
       01 WS-COUNTER PIC 9(2).
       01 WS-CHAR PIC X.
       PROCEDURE DIVISION.
       MAIN-PROCEDURE.
           DISPLAY "Len: " FUNCTION LENGTH(WS-THAI-HELLO)
      *> Walk the field one byte at a time and print each byte in hex
           PERFORM VARYING WS-COUNTER FROM 1 BY 1
                   UNTIL WS-COUNTER > FUNCTION LENGTH(WS-THAI-HELLO)
               MOVE WS-THAI-HELLO(WS-COUNTER:1) TO WS-CHAR
               DISPLAY FUNCTION HEX-OF(WS-CHAR) WITH NO ADVANCING
               DISPLAY " " WITH NO ADVANCING
           END-PERFORM
           DISPLAY SPACE
      *> FUNCTION LENGTH counts bytes, not Unicode code points
           DISPLAY "Character count: " FUNCTION LENGTH(WS-THAI-HELLO)
           STOP RUN.

In this COBOL program:
We define a string, WS-THAI-HELLO, containing the Thai word for “hello”. COBOL doesn’t have native support for UTF-8, so this might not display correctly in all environments.
We use FUNCTION LENGTH to get the length of the string. This is comparable to the len() function in other languages, but for a PIC X field it returns the size of the field in bytes, not the number of characters.
We iterate through each byte of the string using a PERFORM loop. This is similar to the for loop in the original example.
We display the hexadecimal representation of each byte using FUNCTION HEX-OF. This shows the raw bytes stored in the field, so the values match the UTF-8 bytes of the original example only if the source file is saved as UTF-8.
We count the characters using FUNCTION LENGTH. Note that this counts each byte as a separate character, unlike the RuneCountInString function in the original example, which counts Unicode code points.
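The gap between bytes and code points can be narrowed by hand. The following is a minimal sketch, not a compiler feature: it assumes the source file is saved as UTF-8 and that the field holds well-formed UTF-8 with no trailing space padding. It counts every byte that is not a UTF-8 continuation byte (values 128 through 191), since each code point contributes exactly one such byte. The program name RUNE-COUNT and the fields WS-IDX, WS-BYTE-VAL and WS-RUNE-COUNT are made up for this illustration.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. RUNE-COUNT.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      *> Assumption: this source file is saved as UTF-8 (18 bytes here)
       01 WS-THAI-HELLO PIC X(18) VALUE "สวัสดี".
       01 WS-IDX        PIC 9(2).
       01 WS-BYTE-VAL   PIC 9(3).
       01 WS-RUNE-COUNT PIC 9(2) VALUE 0.
       PROCEDURE DIVISION.
       MAIN-PROCEDURE.
           PERFORM VARYING WS-IDX FROM 1 BY 1
                   UNTIL WS-IDX > FUNCTION LENGTH(WS-THAI-HELLO)
      *> FUNCTION ORD is 1-based, so subtract 1 to get the byte value
               COMPUTE WS-BYTE-VAL =
                   FUNCTION ORD(WS-THAI-HELLO(WS-IDX:1)) - 1
      *> Bytes 128..191 continue a code point; all others start one
               IF WS-BYTE-VAL < 128 OR WS-BYTE-VAL > 191
                   ADD 1 TO WS-RUNE-COUNT
               END-IF
           END-PERFORM
           DISPLAY "Code points: " WS-RUNE-COUNT
           STOP RUN.
```

For the 18-byte Thai greeting this should report 6 code points, one per three-byte sequence in the hex dump. The sketch does not validate the input, so malformed UTF-8 or a field padded with spaces would skew the count.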
COBOL doesn’t have a built-in concept of “runes” or Unicode code points. Handling Unicode in COBOL typically requires using specific compiler extensions or external libraries, which are beyond the scope of this basic example.
To run this COBOL program:
$ cobc -x string-example.cob
$ ./string-example
Len: 18
E0 B8 AA E0 B8 A7 E0 B8 B1 E0 B8 AA E0 B8 94 E0 B8 B5
Character count: 18
Note that the output may vary depending on your COBOL compiler and system encoding. The hexadecimal values shown here assume that the source file is saved as UTF-8 on an ASCII-based system.
COBOL’s string handling is quite different from many modern languages, especially when it comes to Unicode support. For more complex Unicode operations, you might need to use specific COBOL compiler extensions or integrate with external libraries.