Dict
dict
This module defines the dict_router of the Mirumoji API.
Attributes:
| Name | Type | Description |
|---|---|---|
| `LOGGER` | `Logger` | Module's logging object |
| `dict_router` | `APIRouter` | The FastAPI router object |
analyze(sentence=Query(...), mode=Query(BundleMode.grammar))
async
Tokenizes a sentence and enriches every stitched word with dictionary data.
Slower than /dict/tokenize because it performs one dictionary lookup per word. Intended for
on-demand analysis rather than bulk rendering.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `sentence` | `str` | The Japanese sentence to analyze | `Query(...)` |
| `mode` | `BundleMode` | How aggressively to group tokens into words | `Query(BundleMode.grammar)` |
Returns:
| Type | Description |
|---|---|
| `list[EnrichedJapaneseWord]` | A list of `EnrichedJapaneseWord` objects |
Raises:
| Type | Description |
|---|---|
| `FugashiError` | If tokenization fails |
| `KotobaseError` | If a dictionary lookup fails |
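
Because this endpoint performs one dictionary lookup per word, a client may prefer to call it only when a user asks for details and to remember the result. The sketch below illustrates that pattern under stated assumptions: `fetch_analysis` is a hypothetical stand-in for an HTTP call to /dict/analyze, and the `"grammar"` mode string and dict-shaped results are placeholders, not the actual Mirumoji client API or response schema.

```python
from typing import Callable

# Hypothetical stand-in for an HTTP call to /dict/analyze. The real
# response is a list[EnrichedJapaneseWord]; plain dicts model it here.
FetchAnalysis = Callable[[str, str], list[dict]]


class OnDemandAnalyzer:
    """Calls the (slow) analysis function lazily and caches each result."""

    def __init__(self, fetch_analysis: FetchAnalysis) -> None:
        self._fetch = fetch_analysis
        self._cache: dict[tuple[str, str], list[dict]] = {}

    def analyze(self, sentence: str, mode: str = "grammar") -> list[dict]:
        key = (sentence, mode)
        if key not in self._cache:
            # Only the first request for a sentence/mode pair hits the API.
            self._cache[key] = self._fetch(sentence, mode)
        return self._cache[key]
```

Repeated requests for the same sentence and mode are then served from memory, so the per-word lookup cost is paid at most once per pair.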
query(word=Query(...), wildcard=Query(False))
async
Looks up dictionary data for a single word or a wildcard pattern.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `word` | `str` | Word or wildcard pattern to look up | `Query(...)` |
| `wildcard` | `bool` | When `True`, `word` is treated as a wildcard pattern | `Query(False)` |
Returns:
| Type | Description |
|---|---|
| `KotobaseData` | The `KotobaseData` result of the lookup |
Raises:
| Type | Description |
|---|---|
| `KotobaseError` | If the lookup fails |
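
Since `word` may contain Japanese text and pattern characters, a client has to percent-encode the query string. The following is a minimal sketch of building such a URL with the standard library. The base URL, the use of `*` as a wildcard character, and the assumption that the route is reached with a plain GET request are illustrative guesses, not details confirmed by this module.

```python
from urllib.parse import urlencode


def build_query_url(base_url: str, word: str, wildcard: bool = False) -> str:
    """Builds a URL for /dict/query with percent-encoded query parameters."""
    params = {
        "word": word,
        # Booleans are sent as lowercase strings, which FastAPI parses as bool.
        "wildcard": "true" if wildcard else "false",
    }
    return f"{base_url.rstrip('/')}/dict/query?{urlencode(params)}"


# Hypothetical local server address and pattern, for illustration only.
url = build_query_url("http://localhost:8000", "食べ*", wildcard=True)
```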
tokenize(sentence=Query(...), mode=Query(BundleMode.grammar))
async
Tokenizes a sentence into useful, stitched words (no dictionary lookups).
This is the fast path for rendering clickable text. Call /dict/analyze
or /dict/query to fetch dictionary data for a word.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `sentence` | `str` | The Japanese sentence to tokenize | `Query(...)` |
| `mode` | `BundleMode` | How aggressively to group tokens into words | `Query(BundleMode.grammar)` |
Returns:
| Type | Description |
|---|---|
| `list[JapaneseWord]` | A list of `JapaneseWord` objects |
Raises:
| Type | Description |
|---|---|
| `FugashiError` | If tokenization fails |
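
To illustrate what "rendering clickable text" could look like on the client side, here is a rough sketch that turns tokenized words into HTML spans. It assumes each word is a dict with a `surface` key holding its display text. That field name, the CSS class, and the `data-index` attribute are hypothetical, since the fields of `JapaneseWord` are not listed in this section.

```python
from html import escape


def render_clickable(words: list[dict]) -> str:
    """Wraps each word's surface text in a span that a UI can make clickable."""
    spans = []
    for index, word in enumerate(words):
        # "surface" is a hypothetical field name for the word's display text.
        text = escape(word["surface"])
        spans.append(f'<span class="word" data-index="{index}">{text}</span>')
    return "".join(spans)
```

A click handler could then use the index to request dictionary data for only the selected word.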
tokenize_batch(req)
async
Tokenizes many sentences in one request (stitched words, no dictionary data).
Lets a client tokenize a whole subtitle file up front in a single call, so playback never has to tokenize per cue.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `req` | `TokenizeBatchRequest` | The sentences to tokenize | required |
Returns:
| Type | Description |
|---|---|
| `list[list[JapaneseWord]]` | One `list[JapaneseWord]` per input sentence |
Raises:
| Type | Description |
|---|---|
| `FugashiError` | If tokenization fails |
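
The up-front tokenization pattern described above might be sketched as follows. `tokenize_batch` is a hypothetical callable standing in for one request to the batch endpoint. Its route and the fields of `TokenizeBatchRequest` are not shown in this section, and the sketch assumes results come back in the same order as the input sentences.

```python
from typing import Callable

# Hypothetical stand-in for a single request to the batch endpoint:
# takes all sentences, returns one list of words per sentence.
TokenizeBatch = Callable[[list[str]], list[list[dict]]]


def pretokenize_cues(
    cues: list[str], tokenize_batch: TokenizeBatch
) -> dict[int, list[dict]]:
    """Tokenizes every cue in one call and indexes the results by cue number."""
    results = tokenize_batch(cues)
    if len(results) != len(cues):
        raise ValueError("expected one token list per cue")
    # Assumes the response preserves input order.
    return dict(enumerate(results))
```

During playback, the player can then look up `tokens[cue_index]` without issuing any further tokenization requests.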