Features/HC improvements for zk-regex Noir support #8
Closed
ewynx wants to merge 9 commits into noir-lang:main
…aw setting. The substrings are returned as BoundedVec since we don't know their exact length upfront, but we know they're not longer than N. To support both settings (decomposed and raw) we have to use `substring_ranges` instead of `substring_boundaries`.
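To illustrate why a bounded container fits here, the following is a minimal Rust sketch (not the Noir code from this PR) of a fixed-capacity buffer whose used length is tracked separately, filled from a byte range of the input. The names `Bounded` and `extract` are hypothetical and only stand in for Noir's `BoundedVec` and for whatever the generated code does with `substring_ranges`.

```rust
/// Hypothetical stand-in for Noir's `BoundedVec<Field, N>`:
/// storage of fixed capacity N plus the number of slots actually used.
#[derive(Debug, Clone, PartialEq)]
struct Bounded<const N: usize> {
    storage: [u8; N],
    len: usize,
}

impl<const N: usize> Bounded<N> {
    fn new() -> Self {
        Self { storage: [0; N], len: 0 }
    }

    /// Append one byte; panics if the capacity N would be exceeded.
    fn push(&mut self, byte: u8) {
        assert!(self.len < N, "capacity exceeded");
        self.storage[self.len] = byte;
        self.len += 1;
    }

    /// View only the bytes that were actually written.
    fn as_slice(&self) -> &[u8] {
        &self.storage[..self.len]
    }
}

/// Copy the bytes of `input` inside `range` into a bounded buffer.
/// The range plays the role of one entry of `substring_ranges` (assumption).
fn extract<const N: usize>(input: &[u8], range: std::ops::Range<usize>) -> Bounded<N> {
    let mut out = Bounded::<N>::new();
    for &byte in &input[range] {
        out.push(byte);
    }
    out
}
```

Calling `extract::<8>(b"hello world", 6..11)` yields a buffer with capacity 8 of which 5 bytes are in use, so the caller never needs to know the substring length at compile time, only its upper bound.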
…gex and input. This fix makes sure this is supported. Changes:
- `regex_match` returns a Vec of substrings instead of an array with known length
- for each state where substrings have to be extracted, the byte is added either to a new substring or to an already started one

Note that `substr_count` is used to extract the correct "current" substring from the Vec. This is a workaround: the first implementation used `pop`, but this gave an error.
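The bookkeeping described in this commit message can be sketched in Rust as follows. This is an illustration under assumptions, not the generated Noir code: the function name `collect_substrings` and the `(byte, capturing)` input shape are hypothetical, and the flag stands in for "the current transition belongs to a substring that has to be extracted".

```rust
/// Group bytes flagged as "capturing" into substrings.
/// `substr_count` is the index of the current substring, so no `pop` is needed.
fn collect_substrings(steps: &[(u8, bool)]) -> Vec<Vec<u8>> {
    let mut substrings: Vec<Vec<u8>> = Vec::new();
    let mut substr_count: usize = 0;

    for &(byte, capturing) in steps {
        if capturing {
            if substrings.len() == substr_count {
                // No substring started at this index yet: start a new one.
                substrings.push(vec![byte]);
            } else {
                // A substring is already in progress: extend it.
                substrings[substr_count].push(byte);
            }
        } else if substrings.len() > substr_count {
            // A substring was in progress and just ended: move on to the next index.
            substr_count += 1;
        }
    }
    substrings
}
```

For the input `a=12;b=3` with only the digits flagged as capturing, this returns the two substrings `12` and `3`.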
For the caret anchor: the beginning of the input byte array is marked with 255, which makes the check for the caret anchor (^) work. Note that ^ is only taken into consideration in decomposed mode.
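A minimal Rust sketch of the marker idea, assuming a hand-written transition function for the hypothetical regex `^ab` (the real transitions come from the generated lookup table, which is not shown here). Because the marker byte 255 only appears at position 0, a match can only start at the beginning of the input.

```rust
/// Byte used to mark the start of the input, as described above.
const START_MARKER: u8 = 255;

/// Return a copy of `input` prefixed with the start marker.
fn with_start_marker(input: &[u8]) -> Vec<u8> {
    let mut out = Vec::with_capacity(input.len() + 1);
    out.push(START_MARKER);
    out.extend_from_slice(input);
    out
}

/// Hand-written transitions for the hypothetical regex `^ab`.
/// Any transition not listed falls back to state 0.
fn step(state: u8, byte: u8) -> u8 {
    match (state, byte) {
        (0, START_MARKER) => 1, // `^` is satisfied only by the marker
        (1, b'a') => 2,
        (2, b'b') => 3,
        (3, _) => 3, // accepting state absorbs the rest of the input
        _ => 0,
    }
}

/// Run the sketch DFA over the marked input; state 3 is accepting.
fn matches_caret_ab(input: &[u8]) -> bool {
    with_start_marker(input).iter().fold(0, |s, &b| step(s, b)) == 3
}
```

With this sketch `abc` matches and `cab` does not. It assumes the raw input itself never contains the byte 255.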
…states reachable from state 0. Substrings only get saved when they are part of a path that doesn't reset.
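The PR description illustrates this with the regex `ab` and the input `aab`. The Rust sketch below contrasts a naive reset with a variant that retries the failing byte from state 0. It is a simplified illustration with hypothetical function names, not the generated Noir code, and the retry shown here is not a general substitute for a properly constructed DFA.

```rust
/// Transitions for the regex `ab`; `None` means no transition is defined.
fn step_ab(state: u8, byte: u8) -> Option<u8> {
    match (state, byte) {
        (0, b'a') => Some(1),
        (1, b'b') => Some(2),
        (2, _) => Some(2), // accepting state absorbs the rest of the input
        _ => None,
    }
}

/// Naive behaviour: an undefined transition resets to state 0 and drops the byte.
fn matches_naive(input: &[u8]) -> bool {
    input.iter().fold(0, |s, &b| step_ab(s, b).unwrap_or(0)) == 2
}

/// Fixed behaviour: on an undefined transition, also try the states
/// reachable from state 0 with the same byte before giving up.
fn matches_with_restart(input: &[u8]) -> bool {
    input.iter().fold(0, |s, &b| {
        step_ab(s, b).or_else(|| step_ab(0, b)).unwrap_or(0)
    }) == 2
}
```

On `aab` the naive version ends in state 0 (the second `a` is lost), while the restart version moves the second `a` into state 1 again and then accepts on `b`.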
This was referenced Oct 2, 2024
Author: @TomAFrench opened new PR
Description
This PR implements the features `gen_substrs`, `^` support and `$` support, plus overall bugfixes for the Noir support. This branch has been tested to the same extent as the circom implementation.
All circom tests from the original zk-regex lib have been added to the test suite. All tests pass with the added features and bugfixes.
- `^` support is realized by prefixing the input array with 255. This is the same in circom.
- `$` support is realized by adding an additional accepting state, to which the previous accepting state transitions for any character. This new state is then added to the accepting states. If `$` is at the end of the regex, this extra transition is not added, and inputs that continue after `$` are thus rejected. This solution increases the lookup table size by 255 rows.
- `gen_substrs` lets us extract substrings alongside the regex check. This can be done via the `decomposed` or `raw` setting.
- Each substring is returned as `BoundedVec<Field,N>`, because we don't know the exact length beforehand.
- The return type of the `regex_match` function with substring extraction is `Vec<BoundedVec<Field,N>>`, because the total number of substrings is not always known beforehand.
- … `consecutive` check in circom)
- `$` is also needed to extract the exact correct substring (otherwise it would just keep extracting until the end of the input).
- … `gen_substrs` in `raw` to default (this is a change outside of the Noir code, but seemed to make sense)
- … `ab` and input `aab`. For the first input `a` it moves into state 1. For the second input `a` it moves into state 0, and then it would stay there. Now we're adding the possibility for the 2nd occurrence of `a` to move into state 1 again.

Note: multiple accepting states that would occur directly from the regex are not supported, same as in the circom impl. (See README comment of original lib here).
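The `$` handling described above can be sketched in Rust for the regex `ab` versus `ab$`. This is a hand-written illustration, not the generated lookup table: the state numbering, the `end_anchored` flag and the self-loop on the extra accepting state are assumptions made for the example.

```rust
/// States: 0 = start, 1 = saw `a`, 2 = original accepting state,
/// 3 = extra accepting state that is only reachable without `$`.
fn accepts(input: &[u8], end_anchored: bool) -> bool {
    let mut state: u8 = 0;
    for &byte in input {
        state = match (state, byte) {
            (0, b'a') => 1,
            (1, b'b') => 2,
            // Without `$`: the accepting state moves to the extra accepting
            // state on any byte and (assumed here) stays there.
            (2, _) | (3, _) if !end_anchored => 3,
            // With `$`: no transition out of the accepting state exists,
            // so any further input is rejected.
            (2, _) => return false,
            // Everything else resets to the start state.
            _ => 0,
        };
    }
    state == 2 || state == 3
}
```

Here `accepts(b"abc", false)` is true because the trailing `c` moves into the extra accepting state, while `accepts(b"abc", true)` is false because `$` leaves no transition for it.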
This replaces previously opened PRs: #2 and #1. (Although the steps for manual verification are still valid)
Additional Context
The test suite is built specifically for the Noir zk-regex library. From a database of regex inputs + samples it will generate the required Noir code, create the desired tests and run them. The database has been filled with the equivalents of the tests for circom. Additionally, there are 2 hardcoded test projects for the circom tests that had more complex circuits (combining multiple templates).
PR Checklist*
… `cargo fmt` on default settings.