Regular Expressions in Haskell

Our first program demonstrates the use of regular expressions in Haskell. Here’s the full source code:

import Data.Char (toUpper)
import Text.Regex.PCRE
import qualified Data.ByteString.Char8 as B

-- regex-pcre does not provide a replacement function
-- (subRegex comes from regex-compat and works on its own
-- POSIX Regex type), so this small helper replaces every
-- match with the result of applying f to the matched text.
replaceAllWith :: Regex -> (String -> String) -> String -> String
replaceAllWith re f s =
    case matchM re s :: Maybe (String, String, String) of
        Just (before, matched, after)
            -- The guard stops the recursion on empty matches.
            | not (null matched) ->
                before ++ f matched ++ replaceAllWith re f after
        _ -> s

main :: IO ()
main = do
    -- This tests whether a pattern matches a string.
    let matched = "peach" =~ "p([a-z]+)ch" :: Bool
    print matched

    -- For other regex tasks, we'll use the compiled Regex.
    let r = makeRegex "p([a-z]+)ch" :: Regex

    -- Here's a match test like we saw earlier. The type
    -- annotation selects what kind of result match returns.
    print (match r "peach" :: Bool)

    -- This finds the text of the first match for the regex.
    print (match r "peach punch" :: String)

    -- This also finds the first match but returns its
    -- start offset and length instead of the matching text.
    print $ "idx:" ++ show (matchM r "peach punch" :: Maybe (MatchOffset, MatchLength))

    -- The AllTextSubmatches result includes both the
    -- whole-pattern match and the submatches within it.
    print $ getAllTextSubmatches (match r "peach punch" :: AllTextSubmatches [] String)

    -- Similarly, this returns the (offset, length) pairs
    -- of the match and its submatches.
    print $ getAllSubmatches (match r "peach punch" :: AllSubmatches [] (MatchOffset, MatchLength))

    -- To find all matches for a regex:
    print $ getAllTextMatches (match r "peach punch pinch" :: AllTextMatches [] String)

    -- The result is an ordinary list, so take limits the
    -- number of matches.
    print $ take 2 $ getAllTextMatches (match r "peach punch pinch" :: AllTextMatches [] String)

    -- We can also provide ByteString arguments.
    print (match r (B.pack "peach") :: Bool)

    -- The makeRegexOpts function gives more control over
    -- compilation, such as case-insensitive matching.
    -- Regex has no Show instance, so we test a match
    -- instead of printing the compiled value.
    let r' = makeRegexOpts compCaseless execBlank "p([a-z]+)ch" :: Regex
    print (match r' "PEACH" :: Bool)

    -- The helper defined above replaces matched parts of
    -- a string with another value.
    print $ replaceAllWith r (const "<fruit>") "a peach"

    -- Its second argument is a function, so the matched
    -- text can also be transformed.
    print $ replaceAllWith r (map toUpper) "a peach"

To run the program, install the regex-pcre package (which needs the PCRE C library), save the code in a file (e.g., RegexExample.hs), and use runghc:

$ runghc RegexExample.hs
True
True
"peach"
"idx:Just (0,5)"
["peach","ea"]
[(0,5),(1,2)]
["peach","punch","pinch"]
["peach","punch"]
True
True
"a <fruit>"
"a PEACH"

This example demonstrates various regex operations in Haskell using the regex-pcre library. Haskell has no regex support in its base library; it comes from packages such as regex-pcre, regex-tdfa, and regex-posix, which share the regex-base interface (=~, match, makeRegex) but differ in pattern syntax and available options. The regex-pcre library is chosen here for its similarity to the Go example’s functionality.
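Much of this API is driven by the result type you ask for: with regex-pcre, the same =~ expression can yield a Bool, the matched text, a match count, or a tuple of pieces. The following is a minimal sketch of that behaviour, assuming regex-pcre is installed; the names sample, pat, and the five result bindings are illustrative only and not part of any library.

```haskell
import Text.Regex.PCRE

-- Illustrative inputs (hypothetical names, not library API).
sample, pat :: String
sample = "peach punch"
pat = "p([a-z]+)ch"

-- Did the pattern match at all?
asBool :: Bool
asBool = sample =~ pat

-- Text of the first match.
firstMatch :: String
firstMatch = sample =~ pat

-- Number of non-overlapping matches.
matchCount :: Int
matchCount = sample =~ pat

-- (text before the first match, the match, text after it).
parts :: (String, String, String)
parts = sample =~ pat

-- As above, plus the list of captured groups.
withGroups :: (String, String, String, [String])
withGroups = sample =~ pat

main :: IO ()
main = do
    print asBool
    print firstMatch
    print matchCount
    print parts
    print withGroups
```

Only the type annotation changes between the five bindings, which is why an expression such as print (sample =~ pat) without an annotation is rejected by the compiler as ambiguous.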

For a complete reference on Haskell regular expressions, check the documentation of the regex library you’re using, such as regex-pcre or regex-tdfa.