Release 1.0.0a1 by github-actions[bot] · Pull Request #9 · TigreGotico/palavreado

github-actions · 2026-04-22T18:10:40Z

Human review requested!


…, normalisation (#8)

* feat: replace setup.py with pyproject.toml
* fix: remove_intent dict key, no-match shape, duplicate guard, E741 rename
* docs: type hints and docstrings
* test: comprehensive test suite
* docs: rewrite README
* perf/fix: cache regexes, word-count penalty, fix plural hack, tie-breaking
  - Pre-compile all regexes at add_intent() time; removed per-query re.compile()
  - lru_cache on word_tokenize calls to avoid repeated tokenization
  - Replace character-length remainder penalty with word-count fraction
  - Fix plural candidate detection to use word-boundary regex instead of substring check (prevents "status" being dropped due to "statuses")
  - Fix regex slot confidence: divide by n_required not len(matches)
  - Add deterministic tie-breaking: lower remainder word count wins, then alphabetical intent name
  - Update test expected values to match new (more accurate) confidence scores

  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* feat: context gating, keyword exclusion, normalisation, opm.py, intent_names
  - bracket_expansion.py: add drop_apostrophes, normalize_whitespace, normalize_utterance, normalize_example; training samples and queries are now normalised identically at registration/match time
  - __init__.py: apply normalize_example to training data at add_intent(), apply normalize_utterance to query in calc_intents(); add full context gating API (set/unset/require/unrequire/exclude/unexclude_context); add exclude_keywords() with word-boundary safety; add intent_names property
  - opm.py: new OVOS ConfidenceMatcherPipeline plugin with lru_cache(128), session blacklist support, match_high/medium/low, bus listeners for padatious:register_intent, detach_intent, detach_skill, mycroft.skills.train
  - pyproject.toml: add ovos optional-dependencies group, pipeline entry point
  - test: 19 new tests covering normalisation, context gating, keyword exclusion, intent_names

  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* feat: replace plural hack with lemmatize() helper

  Add lemmatize(word) to bracket_expansion.py: strips apostrophes entirely and removes trailing 's' (not 'ss') for language-agnostic plural matching. Apply in _match() (replaces the old regex-based plural/singular hack) and in get_utterance_remainder() (lemmatized token comparison so plural forms of matched keywords are consumed from the remainder). "lights" now matches training keyword "light", "what s" tokens (from apostrophe normalisation) match "whats" via shared stem "what".

  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* fix: apostrophes → space in lemmatize, not empty string

  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* feat: accuracy engine, benchmark, normalisation, opm rewrite, CI workflows
  - Three-pass keyword matching (contiguous → lemma-normalised → non-contiguous)
  - Non-contiguous match quality 0.8 so direct hits always win
  - Require all required slots to fire; eliminates partial-required FPs
  - _score() helper: remainder penalty, coverage bonus, slot bonus; 4dp rounding
  - lemma_query computed once per calc_intents call; fused required+optional loop
  - lemmatize() exported; apostrophes → space before lemmatization
  - normalize_utterance/normalize_example applied at registration and match time
  - opm.py rewritten for Adapt bus events (register_vocab/register_intent)
  - benchmark/ package: 284-case dataset, accuracy.py, compare.py (vs Adapt)
  - README updated with benchmark table (TN/NM column, honest FP commentary)
  - Standard CI workflows added

  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* ci: update workflows to standard (explicit secrets, lint, Python 3.13/3.14)
  - add lint.yml
  - build-tests.yml: add Python 3.13/3.14, drop secrets: inherit
  - release_workflow.yml: explicit PYPI_TOKEN/MATRIX_TOKEN, add permissions
  - publish_stable.yml: push trigger, explicit secrets, publish_release/sync_dev
  - coverage.yml: add test_path/install_extras/min_coverage, drop secrets: inherit
  - license_check.yml, pip_audit.yml: drop secrets: inherit

  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* Update README.md
* Delete .github/workflows/python-support.yml
* fix: address PR #8 CodeRabbit feedback and align CI workflows with nebulento
  - palavreado/__init__.py: read IntentCreator.name directly in remove_intent instead of calling .build() (avoids wasteful allocation)
  - palavreado/builder.py: add inline Note to all four regex slot methods explaining the intentional empty-bucket design for partial_conf weighting
  - palavreado/bracket_expansion.py: update expand_parentheses docstring to reflect actual str->List[str] signature (was stale list<str>->list<list<str>>)
  - pyproject.toml: switch to SPDX license string, add license-files entry, and add explicit Python 3.9-3.13 classifiers to match requires-python
  - README.md: add Breaking changes section documenting RuntimeError on duplicate add_intent and the remove_intent-first pattern
  - test/test_palavreado.py: add test_remove_intent_via_creator to lock the IntentCreator overload contract
  - .github/workflows: add missing opm-check.yml; remove spurious `secrets: inherit` from release-preview and repo-health (matches nebulento pattern); align changelog_max_issues to 50 in release_workflow

  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* fix: remove deprecated license classifier and clean up builder docstrings

  Drop the old-style `License :: OSI Approved :: Apache Software License` classifier from pyproject.toml; newer setuptools (PEP 639) rejects it when `license` and `license-files` fields are already present, causing all CI jobs (build, coverage, opm_check, license_check, pip_audit) to fail at the build step.

  Remove the "Note:" sections from the four regex slot methods in builder.py (require_regex, optional_regex, require_autoregex, optional_autoregex) that described internal "empty-bucket design" details; the docstrings now only describe what each method does and its args/returns.

  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* fix: count misclassifications as both FN and FP in accuracy benchmark

  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* fix: correct five verified bugs from code review
  - __init__.py: lemma_map keys now per-token lemmatized (fixes phrase misses in Pass 2)
  - __init__.py: required slots check uses full .keys() not 'if s' guard
  - opm.py: remove lru_cache from _match_intent and _calc_palavreado_intent (stale on mutable state)
  - opm.py: _regexes keyed by lang+entity_type; wired into require_regex/optional_regex at intent registration; pruned in handle_detach_skill
  - compare.py: count misclassifications as FP for predicted intent (same fix as accuracy.py)
  - dataset.py: fix mislabeled cases: 'pause' expanded to 'pause the music'; 'put a timer on for lunch' and 'turn off the lights and set a timer' relabeled to set_timer

  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* docs: update benchmark results after bug fixes

  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* fix+test: regex named-group slots, duplicate import, 16 new tests

  Fixes:
  - bracket_expansion.py: remove duplicate 'import re'
  - __init__.py: regex slots with named groups now mark the slot name in matches so the required-check passes and conf credit fires; previously intents using require_regex with named-group patterns always returned None

  New tests (16):
  - TestRegexSlots: named groups fire + populate, slot name in keywords, missing regex = no match, combined regex + keyword slot
  - TestOptionalOnlyIntent: optional-only intent never fires
  - TestKeywordExclusionMultiword: blocks on phrase, passes without phrase, no partial-word false match
  - TestTiebreaking: alphabetical tiebreaker, higher-confidence multi-slot wins
  - TestScore: perfect score, remainder penalty, zero-word guard, clamping to [0,1], 4dp rounding

  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* perf: eliminate redundant work in hot path
  - Pre-compile excluded keyword regexes at exclude_keywords() call instead of compiling them fresh on every query in _filter
  - _filter now iterates pre-compiled (kw_lower, rx|None) pairs with early break per intent; no closure allocation, no dynamic re.search pattern build
  - Tokenize and lemmatize the query once in calc_intents; reuse the list for both the set (query_lemmas) and the string (lemma_query)
  - Cache per-candidate lemma strings inside _match during the initial classification pass; Pass 2 lemma_map reuses that cache instead of re-lemmatizing every token a second time
  - Pre-sort regex patterns by length at add_intent() time; matching loop iterates self._sorted_regex[name][slot] directly with no per-query sort
  - Pre-compute matched-word count (_mw) and remainder word count (_rw) in the yield inside calc_intents; calc_intent reads them directly instead of recomputing via _matched_words() for each comparison

  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
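For readers skimming the commit log, the plural handling it describes can be sketched roughly as follows. This is a minimal illustration, not the code in bracket_expansion.py: the names `lemmatize_sketch` and `has_keyword` are hypothetical, and the guard that leaves a lone "s" token untouched is an assumption. It shows the two ideas named in the commits: apostrophes become spaces and a trailing 's' (but not 'ss') is dropped, and keyword checks use a word-boundary regex instead of a substring test.

```python
import re


def lemmatize_sketch(word: str) -> str:
    """Hypothetical stand-in for the lemmatize() helper named in the commits."""
    # Apostrophes become spaces (per the later fix), then whitespace is collapsed.
    tokens = word.replace("'", " ").split()
    stemmed = []
    for tok in tokens:
        # Drop one trailing 's', but leave 'ss' endings alone.
        # Keeping single-character tokens intact is an assumption of this sketch.
        if len(tok) > 1 and tok.endswith("s") and not tok.endswith("ss"):
            tok = tok[:-1]
        stemmed.append(tok)
    return " ".join(stemmed)


def has_keyword(utterance: str, keyword: str) -> bool:
    """Word-boundary check, so 'status' is not found inside 'statuses'."""
    pattern = re.compile(r"\b" + re.escape(keyword) + r"\b")
    return pattern.search(utterance) is not None


if __name__ == "__main__":
    print(lemmatize_sketch("lights"))
    print(lemmatize_sketch("what's"))
    print(has_keyword("show statuses", "status"))
```

Under these assumptions "lights" and "light" reduce to the same stem, while a substring test such as `"status" in "show statuses"` would wrongly succeed where the word-boundary check does not.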

JarbasAl and others added 8 commits October 25, 2024 23:23

feat:semver

af00aac

feat:semver

000fcf1

Merge pull request #2 from TigreGotico/renovate/configure

d209353

Configure Renovate

Increment Version to 0.2.1a1

c50629e

Update Changelog

82eedb7

Increment Version to 1.0.0a1

8e1177e

Update Changelog

af7f585


github-actions[bot] wants to merge 8 commits into master from release-1.0.0a1
