feat: Add POS information to tag generation API across C, C++, and JS layers
Extend the tag generation API so each tag carries its part-of-speech
alongside the tag text, enabling callers to filter or display tags by POS
without re-analyzing the text.
C++ API changes:
- Introduce TagEntry struct in tag_generator.h with tag (std::string) and
pos (core::PartOfSpeech) fields
- TagGenerator::generate() and generateFromText() now return
std::vector<TagEntry> instead of std::vector<std::string>
- Suzume::generateTags() overloads updated accordingly in suzume.h/cpp
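As a rough illustration of what the new return type lets callers do, the sketch below filters tags by POS without re-analyzing the text. The PartOfSpeech enum here is a hypothetical stand-in for core::PartOfSpeech (its real enumerators are not shown in this commit message), and filterByPos is an illustrative helper, not part of the library.

```cpp
#include <algorithm>
#include <iterator>
#include <string>
#include <vector>

// Hypothetical stand-in for core::PartOfSpeech; the real enum may differ.
enum class PartOfSpeech { Noun, Verb, Adjective, Other };

// Mirrors the TagEntry described above: the tag text plus its POS.
struct TagEntry {
  std::string tag;
  PartOfSpeech pos;
};

// Illustrative helper (not part of the library): keep only the entries
// whose POS equals `wanted`, preserving their original order.
std::vector<TagEntry> filterByPos(const std::vector<TagEntry>& tags,
                                  PartOfSpeech wanted) {
  std::vector<TagEntry> out;
  std::copy_if(tags.begin(), tags.end(), std::back_inserter(out),
               [wanted](const TagEntry& e) { return e.pos == wanted; });
  return out;
}
```

Callers that previously compared against a std::string element would now compare against entry.tag, which matches the test updates listed further down.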
C API changes (suzume_c.h/cpp):
- Add const char** pos field to suzume_tags_t struct alongside existing
char** tags and size_t count fields
- suzume_generate_tags() and suzume_generate_tags_with_options() populate
pos array using posToString() on each TagEntry
- suzume_tags_free() correctly frees the pos array to avoid memory leaks
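The ownership pattern behind these bullets can be sketched as follows. This is not the actual suzume_c.cpp code: demo_tags_t, DemoEntry, demo_tags_create, and demo_tags_free are hypothetical names, and the sketch assumes the POS strings are static (as the const char** type suggests), so only the pos array itself is freed, not the strings it points to.

```cpp
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical mirror of the described layout: tags ptr, pos ptr, count.
struct demo_tags_t {
  char** tags;        // owned array of owned, NUL-terminated copies
  const char** pos;   // owned array of borrowed (static) POS names
  std::size_t count;
};

// Hypothetical input record; pos_name is assumed to be a static string.
struct DemoEntry {
  std::string tag;
  const char* pos_name;
};

// Free the tag strings, then both arrays, then the struct itself.
void demo_tags_free(demo_tags_t* t) {
  if (t == nullptr) return;
  for (std::size_t i = 0; i < t->count; ++i) std::free(t->tags[i]);
  std::free(t->tags);
  std::free(t->pos);  // frees only the array; the POS strings are not owned
  std::free(t);
}

// Build the two parallel arrays; returns nullptr on allocation failure.
demo_tags_t* demo_tags_create(const std::vector<DemoEntry>& entries) {
  auto* out = static_cast<demo_tags_t*>(std::calloc(1, sizeof(demo_tags_t)));
  if (out == nullptr) return nullptr;
  const std::size_t n = entries.size();
  const std::size_t slots = n > 0 ? n : 1;  // avoid zero-sized allocations
  out->tags = static_cast<char**>(std::calloc(slots, sizeof(char*)));
  out->pos = static_cast<const char**>(std::calloc(slots, sizeof(const char*)));
  if (out->tags == nullptr || out->pos == nullptr) {
    demo_tags_free(out);  // count is still 0, so no tag strings are touched
    return nullptr;
  }
  for (std::size_t i = 0; i < n; ++i) {
    const std::string& s = entries[i].tag;
    char* copy = static_cast<char*>(std::malloc(s.size() + 1));
    if (copy == nullptr) {
      demo_tags_free(out);  // frees the i strings copied so far
      return nullptr;
    }
    std::memcpy(copy, s.c_str(), s.size() + 1);
    out->tags[i] = copy;
    out->pos[i] = entries[i].pos_name;
    out->count = i + 1;
  }
  return out;
}
```

Keeping pos as a parallel array of borrowed pointers means existing readers of tags and count only need to account for the extra pointer field in the struct layout.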
JS/WASM API changes (js/index.ts):
- Introduce Tag interface with tag: string and pos: string fields
- generateTags() return type changed from string[] to Tag[]
- parseTags() reads the new pos pointer array from suzume_tags_t layout
(field order: tags ptr, pos ptr, count)
- Fix memory access: use HEAPU32-derived Uint8Array instead of HEAPU8
(not exported by Emscripten) for struct writes in loadBinaryDictionary
and option struct initialization
CLI output:
- suzume-cli and cmd_analyze now print each tag as tag, tab, POS, one per line
- cmd_test.cpp adapted to extract tag.tag for comparison set
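For reference, the tab-separated line format can be sketched like this. formatTagLines is a hypothetical helper, and the POS names used with it are placeholders, since the exact strings produced by posToString() are not listed in this message.

```cpp
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: render (tag, POS name) pairs as "tag<TAB>POS" lines,
// one pair per line, each line terminated by '\n'.
std::string formatTagLines(
    const std::vector<std::pair<std::string, std::string>>& tags) {
  std::ostringstream out;
  for (const auto& entry : tags) {
    out << entry.first << '\t' << entry.second << '\n';
  }
  return out.str();
}
```

A tab separator keeps the output easy to split with standard tools such as cut or awk, even when a tag contains spaces.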
WASM test suite refactored:
- Remove monolithic suzume.test.ts
- Add helpers.ts: shared WASM module loader, allocString, parseMorphemes,
parseTags, getTagCount utilities
- Add c-api-analyze.test.ts: C API analyze tests covering POS fields,
conj fields, mixed POS sentences, and create_with_options
- Add c-api-tags.test.ts: C API tag generation tests covering POS filter,
max_tags, min_length, and excludeBasic options
- Add js-api.test.ts: JS API struct layout compatibility tests
C++ unit tests (tag_generator_test.cpp):
- Update all tag comparisons from tag == "str" to tag.tag == "str"
- Add POS assertions where appropriate (Verb, Adjective, Adverb, Particle,
Auxiliary, Noun, Pronoun)