Kotobase¶
A comprehensive, openly-licensed Japanese language database.
Kotobase aggregates several openly-licensed Japanese language data sources
into one SQLite database and exposes simple programmatic and command-line
access to it.
Install¶
Get The Database¶
Database
- The compiled database is a prerequisite and is not bundled with the package due to its size.
- The core database contains all sources except for the Kanji Alive audio clips and is a ~400MB SQLite file.
- The optional audio database adds ~150MB to that size.
- There are two ways to get them.
Both databases are rebuilt weekly with updated sources via a GitHub Action and attached as assets to the latest Kotobase GitHub Release.
- Download the Core + Audio databases
- Download only the Core database
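Because the downloaded asset is an ordinary SQLite file, it can be sanity-checked with the Python standard library alone before pointing Kotobase at it. The following is a minimal sketch and not part of Kotobase: the helper names `looks_like_sqlite` and `list_tables` and the path `kotobase.db` are hypothetical, and the only fact relied on is that every SQLite 3 file begins with the 16-byte header `SQLite format 3\0`.

```python
import contextlib
import sqlite3
from pathlib import Path

# Every SQLite 3 database file starts with this 16-byte header.
SQLITE_MAGIC = b"SQLite format 3\x00"


def looks_like_sqlite(path: Path) -> bool:
    """Return True if the file begins with the SQLite 3 header."""
    with path.open("rb") as fh:
        return fh.read(16) == SQLITE_MAGIC


def list_tables(path: Path) -> list[str]:
    """Open the database read-only and return its table names, sorted."""
    uri = path.resolve().as_uri() + "?mode=ro"  # read-only: never modifies the file
    with contextlib.closing(sqlite3.connect(uri, uri=True)) as conn:
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
    return [name for (name,) in rows]


if __name__ == "__main__":
    db_path = Path("kotobase.db")  # hypothetical location of the downloaded file
    if db_path.is_file() and looks_like_sqlite(db_path):
        print(list_tables(db_path))
    else:
        print(f"{db_path} is missing or is not an SQLite database")
```

A truncated or failed download usually fails the header check, which is cheaper than discovering the problem at query time.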
Use It¶
- Comprehensive Lookup Across Every Source
- A Single Kanji Profile
Features¶
| Comprehensive Lookups | One lookup all Query Aggregates Data From All Sources |
| Organized Data | Every Source Is Fully Extracted Into A Normalized SQLite Schema & Exposed As Typed, Serializable DTOs |
| Example Sentences | Search Tatoeba Example Sentences + Their English Translation By Text |
| Wildcard Search | Match Written / Reading Forms With * & % Wildcard Patterns |
| CLI | A Typer + Rich CLI With Readable, Panelled Output & --json For Scripting |
| Self-Contained | A Single SQLite (~400MB) File + Optional Audio Pack (~150MB) With No Server / Network Access Needed At Query Time |
| Easy Database Management | Pull Pre-Built Databases From GitHub Releases Or Build It Locally + Manage The Cache From The CLI |
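To illustrate the wildcard feature listed above, here is one way `*` and `%` patterns could be mapped onto SQLite's `LIKE` operator. This is a sketch under stated assumptions, not Kotobase's actual implementation: the `entries` table, its `written` and `reading` columns, and the functions `to_like_pattern` and `search` are all hypothetical and exist only for this example.

```python
import sqlite3


def to_like_pattern(pattern: str) -> str:
    """Translate `*` into SQL LIKE `%`; `%` passes through unchanged.

    Literal `_` and backslash are escaped so they are not treated as
    LIKE metacharacters.
    """
    out = []
    for ch in pattern:
        if ch == "*":
            out.append("%")
        elif ch in ("_", "\\"):
            out.append("\\" + ch)
        else:
            out.append(ch)
    return "".join(out)


def search(conn: sqlite3.Connection, pattern: str) -> list[tuple[str, str]]:
    """Match the pattern against either the written or the reading form."""
    like = to_like_pattern(pattern)
    sql = (
        "SELECT written, reading FROM entries "  # hypothetical table
        "WHERE written LIKE ? ESCAPE '\\' OR reading LIKE ? ESCAPE '\\'"
    )
    return conn.execute(sql, (like, like)).fetchall()


def demo_connection() -> sqlite3.Connection:
    """Build a tiny in-memory database with the hypothetical schema."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE entries (written TEXT, reading TEXT)")
    conn.executemany(
        "INSERT INTO entries VALUES (?, ?)",
        [("食べる", "たべる"), ("食べ物", "たべもの"), ("飲む", "のむ")],
    )
    return conn


if __name__ == "__main__":
    conn = demo_connection()
    print(search(conn, "食*"))  # entries whose written form starts with 食
    print(search(conn, "%む"))  # entries whose written or reading form ends with む
    conn.close()
```

Passing the pattern as a bound parameter rather than formatting it into the SQL string keeps user input from altering the query itself.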
More Information¶
- Task-Oriented Python Snippets
- All kotobase Commands
- Technical Documentation Of The Kotobase Wrapper + Data Objects
- What Changed Between Versions
- What The Public API Covers
- Source Attribution