A toy regex implementation in Roc.
It is far from feature complete and probably doesn't follow any spec. I am just working on this to learn about regex, Roc, and language theory.
- Regex objects can be parsed from strings as in most programming languages.
- `matches` function which checks if a string matches a regex object (does not return the substring)
- choices (with `|`)
- choice groups (with `[]`)
- grouped regex expressions (with `()`)
- star operator (`*`)
- plus operator (`+`)
- question mark operator (`?`)
- choice group negation (with `^` in beginning of `[]`)
- range expressions (with `-` in `[]`)
- metacharacters (like `.`, `\d`, `\w`, ...)
- position anchors (like `^` and `$`, `\b`, `\A` and `\Z`, ...)
Some things I explicitly don't want to get into at the moment. This doesn't mean they won't ever get implemented, only that I am not interested in them right now.
- Unicode support (ASCII only for now. Unicode characters should work as long as they are not combined with operators)
- State machine (at the moment the implementation uses regex derivatives. A more performant engine would need to convert the regex to a state machine)
- High performance (this is a toy engine so I can learn)
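
For readers unfamiliar with regex derivatives, here is a minimal sketch of the general technique (Brzozowski derivatives). It is written in Python rather than Roc, and it is not the code from this repository: the node names (`Empty`, `Eps`, `Char`, `Alt`, `Seq`, `Star`) and the helpers `plus` and `optional` are hypothetical and chosen only for illustration. It skips parsing and builds the regex tree by hand.

```python
from dataclasses import dataclass

# Hypothetical AST; the Roc project may model regexes differently.
@dataclass(frozen=True)
class Empty:            # matches nothing at all
    pass

@dataclass(frozen=True)
class Eps:              # matches only the empty string
    pass

@dataclass(frozen=True)
class Char:             # matches one literal character
    c: str

@dataclass(frozen=True)
class Alt:              # choice: left | right
    left: object
    right: object

@dataclass(frozen=True)
class Seq:              # concatenation: first, then second
    first: object
    second: object

@dataclass(frozen=True)
class Star:             # zero or more repetitions
    inner: object

def nullable(r) -> bool:
    """True if r matches the empty string."""
    if isinstance(r, (Eps, Star)):
        return True
    if isinstance(r, (Empty, Char)):
        return False
    if isinstance(r, Alt):
        return nullable(r.left) or nullable(r.right)
    if isinstance(r, Seq):
        return nullable(r.first) and nullable(r.second)
    raise TypeError(f"unknown node: {r!r}")

def derive(r, ch: str):
    """Regex matching whatever r may match after consuming ch."""
    if isinstance(r, (Empty, Eps)):
        return Empty()
    if isinstance(r, Char):
        return Eps() if r.c == ch else Empty()
    if isinstance(r, Alt):
        return Alt(derive(r.left, ch), derive(r.right, ch))
    if isinstance(r, Seq):
        head = Seq(derive(r.first, ch), r.second)
        # If first can be empty, ch may also be consumed by second.
        return Alt(head, derive(r.second, ch)) if nullable(r.first) else head
    if isinstance(r, Star):
        return Seq(derive(r.inner, ch), r)
    raise TypeError(f"unknown node: {r!r}")

def matches(r, text: str) -> bool:
    """Whole-string match: derive once per character, then test nullability."""
    for ch in text:
        r = derive(r, ch)
    return nullable(r)

# The + and ? operators expressed through the core nodes.
def plus(r):
    return Seq(r, Star(r))

def optional(r):
    return Alt(r, Eps())

# (ab|c)*d? built by hand
example = Seq(Star(Alt(Seq(Char("a"), Char("b")), Char("c"))), optional(Char("d")))
print(matches(example, "abcab"), matches(example, "ad"))
```

The sketch never simplifies the derived terms, so they grow with every character consumed. That growth is one reason a derivative-based matcher is slower than a precompiled state machine.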