wrapper.js converts globs like **/node_modules/** into regex via picomatch.makeRe(), producing patterns with nested negative lookaheads:
^(?:(?:^|\/|(?:(?:(?!(?:^|\/)\.{1,2}(?:\/|$)).)*?)\/)node_modules(?:\/(?!\.{1,2}(?:\/|$))(?:(?:(?!(?:^|\/)\.{1,2}(?:\/|$)).)*?)|$))$
These are pathological for std::regex (libstdc++ backtracking NFA). On a monorepo with ~650K files, subscribe() with just 4 such patterns takes 120s — 111s of which is CPU time in std::regex_match(). The actual fts traversal + inotify_add_watch() takes ~5s.
The lookaheads prevent ** from matching . and .. path components, but paths from fts_read() never contain those components, since they are already resolved by the kernel.
A simpler regex like (^|.*/)node_modules(/.*|$) is functionally equivalent for real paths and runs 11x faster:
| Config | Time | User CPU |
| --- | --- | --- |
| picomatch regex (current) | 120s | 111s |
| Simple regex, no lookaheads | 11s | 6.6s |
| ignorePaths only, no regex | 9.5s | 2.8s |
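To make the equivalence claim concrete, here is a small sketch that runs both regexes (the picomatch output quoted above and the simple alternative) against a handful of made-up absolute paths. The sample paths are assumptions chosen for illustration, and this uses JavaScript's RegExp rather than std::regex, so it only checks matching behavior, not timing. The simple regex is wrapped in ^...$ to approximate the whole-string semantics of std::regex_match().

```javascript
// Regex quoted in this issue as the picomatch.makeRe() output for **/node_modules/**.
const picomatchRe = /^(?:(?:^|\/|(?:(?:(?!(?:^|\/)\.{1,2}(?:\/|$)).)*?)\/)node_modules(?:\/(?!\.{1,2}(?:\/|$))(?:(?:(?!(?:^|\/)\.{1,2}(?:\/|$)).)*?)|$))$/;

// Proposed simple regex, anchored to mimic a whole-string match.
const simpleRe = /^(?:(^|.*\/)node_modules(\/.*|$))$/;

// Hypothetical resolved paths of the kind a directory traversal would yield.
const realPaths = [
  '/repo/node_modules',
  '/repo/node_modules/foo/index.js',
  '/repo/packages/a/node_modules/lodash/index.js',
  '/repo/src/index.js',
  '/repo/my_node_modules/x.js',
  '/repo/node_modules_old/x.js',
];

// Evaluate both regexes for each path.
const compare = (paths) =>
  paths.map((p) => ({
    path: p,
    picomatch: picomatchRe.test(p),
    simple: simpleRe.test(p),
  }));

const realResults = compare(realPaths);

// A path with a literal ".." component: the only kind of input where the two differ.
const dotDotResults = compare(['/repo/../node_modules/x']);

console.table(realResults.concat(dotDotResults));
```

On the resolved paths the two columns agree. They diverge only for the path containing a literal .. component, which the lookaheads reject and the simple regex accepts.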
Suggestion: It would be worth documenting in the README that glob patterns with leading ** (e.g. **/node_modules/**) generate complex regex with negative lookaheads that can cause significant performance degradation on large directory trees via std::regex. Users can avoid this by using simpler patterns like node_modules/** or by passing absolute paths via ignorePaths instead.
I also looked into using glob-to-regexp as a replacement for picomatch — it generates lookahead-free regex and is a much simpler library (130 lines, zero deps). However, it's not a full drop-in replacement: it doesn't support extglob features like (a|b) alternation, [a-e] character classes as glob syntax, or {a,b} brace expansion. See #245 for a PR that uses glob-to-regexp for simple patterns and falls back to picomatch for patterns containing extglob syntax.
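For illustration only, the fallback idea could be sketched roughly as below. This is not the code from the PR: needsPicomatch, globToRegexSource, and the two converter parameters are hypothetical names, and the character check is a deliberately conservative assumption about what counts as "complex" syntax. The converters are passed in as plain functions so the sketch stays self-contained; in practice they would wrap glob-to-regexp and picomatch.makeRe().

```javascript
// Hypothetical check: any of ( ) [ ] { } | suggests extglob alternation,
// character classes, or brace expansion, so route the glob to picomatch.
const COMPLEX_GLOB_SYNTAX = /[()[\]{}|]/;

function needsPicomatch(glob) {
  return COMPLEX_GLOB_SYNTAX.test(glob);
}

// Pick a converter per pattern. simpleConvert and fullConvert are
// caller-supplied functions (glob string -> regex source string).
function globToRegexSource(glob, simpleConvert, fullConvert) {
  return needsPicomatch(glob) ? fullConvert(glob) : simpleConvert(glob);
}

// Usage with stand-in converters that only report which path was taken.
const chosen = ['**/node_modules/**', '**/*.{js,ts}', 'src/[a-e]*', '+(a|b)/**'].map(
  (glob) => globToRegexSource(glob, () => 'simple', () => 'picomatch')
);
console.log(chosen);
```

Erring on the side of picomatch keeps behavior unchanged for anything unusual, at the cost of leaving those patterns on the slow path.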