Using a GPU

Why Use A GPU

Mirumoji transcribes audio with faster-whisper and converts media with FFmpeg
For anything longer than a short clip, such as an anime episode or a podcast, that work is much faster on a GPU than on a CPU
Mirumoji was designed for long media, so it relies on a GPU for transcription and media conversion

The Transcription Backend option lets you choose where that work runs, so you don't need to own a GPU to run Mirumoji

Backend	Needs	Best When
`modal`	A Free `Modal` Account	You don't have an `NVIDIA GPU`, or don't want to set one up
`local`	An `NVIDIA GPU` + the `NVIDIA Container Toolkit`	You have a capable `NVIDIA GPU` and want everything on your machine
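
As a rough aid for picking a row from the table above, the sketch below suggests a backend by checking whether `nvidia-smi` (installed with the NVIDIA driver) can list a GPU. `suggest_backend` is a hypothetical helper, not part of Mirumoji, and the check is only a heuristic: the local backend also needs the NVIDIA Container Toolkit.

```shell
#!/usr/bin/env sh
# Suggest a Transcription Backend from what is visible on this machine.
# Heuristic only: "local" additionally requires the NVIDIA Container Toolkit.
suggest_backend() {
    # "nvidia-smi -L" lists detected GPUs and typically exits non-zero
    # when the driver is missing or cannot find a device.
    if command -v nvidia-smi >/dev/null 2>&1 && nvidia-smi -L >/dev/null 2>&1; then
        echo "local"
    else
        echo "modal"
    fi
}

backend="$(suggest_backend)"
echo "Suggested backend: $backend"
echo "Apply with: mirumoji config set MIRUMOJI_TRANSCRIBE_BACKEND $backend"
```

On a machine without a working NVIDIA driver it suggests modal.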

Local NVIDIA GPU

Runs faster-whisper directly on your own GPU inside the backend container

Everything stays on your machine. No cloud account is involved

Requirements

An NVIDIA GPU with up-to-date drivers
The NVIDIA Container Toolkit, which lets Docker Containers use the GPU
A few GB of disk space for the GPU images and the transcription model, which downloads on the first transcription into a persistent cache volume and is reused on every run after that

Verify The Requirements On Your Machine

Run the command below. The Nvidia GPU and Nvidia Container Toolkit checks should both report ok

mirumoji doctor
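
If one of the checks fails, the same two requirements can be spot-checked by hand. The sketch below is an assumption about what such checks might look like, not a description of what mirumoji doctor runs internally. It uses `nvidia-smi -L` for the driver and the sample workload from NVIDIA's Container Toolkit documentation for Docker GPU access. `report` and `run_checks` are hypothetical helper names.

```shell
#!/usr/bin/env sh
# report LABEL COMMAND...: run COMMAND quietly and print "LABEL: ok" or "LABEL: failed".
report() {
    label="$1"
    shift
    if "$@" >/dev/null 2>&1; then
        echo "$label: ok"
    else
        echo "$label: failed"
    fi
}

run_checks() {
    # Driver check: nvidia-smi lists GPUs when the driver is working.
    report "Nvidia GPU" nvidia-smi -L

    # Toolkit check: a container started with --gpus all should see the GPU too.
    # Docker pulls the "ubuntu" image the first time this runs.
    report "Nvidia Container Toolkit" docker run --rm --gpus all ubuntu nvidia-smi
}
```

Call run_checks after loading the script into your shell. A failed second line with an ok first line usually points at the container toolkit rather than the driver.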

Configuration

Set the Transcription Backend option to local on the CLI or the Desktop Launcher

# CLI
mirumoji config set MIRUMOJI_TRANSCRIBE_BACKEND local
mirumoji up

The modal backend is the default because it works on any computer

Modal runs code on cloud GPUs on-demand

The free tier includes a generous amount of monthly compute credits, which is plenty for personal use

How It Works

With the modal backend, Mirumoji runs its lightweight CPU image on your machine and delegates only the heavy transcription / conversion of media to short-lived Modal GPU containers

Step By Step Request Workflow

You request a transcription / conversion on the frontend
The backend asks Modal to spin up an ephemeral GPU container running a fully-configured Mirumoji GPU Docker Image
The backend uploads the media into a short-lived Modal volume that the container reads from
The container runs the transcription / conversion work and returns the result (writing any converted file back to the same volume for the backend to retrieve)
The container shuts down and the temporary volume is discarded
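
The lifecycle of the temporary volume in the steps above can be illustrated with a purely local analogy. The sketch below makes no Modal calls and is not Mirumoji's implementation: a temporary directory stands in for the short-lived volume, `tr` stands in for the transcription / conversion work, and `run_ephemeral_job` is a hypothetical helper name.

```shell
#!/usr/bin/env sh
# run_ephemeral_job INPUT OUTPUT: local analogy of the upload, work, retrieve, discard cycle.
run_ephemeral_job() {
    input="$1"
    output="$2"

    # Create a short-lived "volume" and upload the media into it.
    volume="$(mktemp -d)" || return 1
    cp "$input" "$volume/media" || { rm -rf "$volume"; return 1; }

    # The "container" works inside the volume and writes its result back to it.
    tr '[:lower:]' '[:upper:]' < "$volume/media" > "$volume/result"
    status=$?

    # The backend retrieves the result from the same volume.
    if [ "$status" -eq 0 ]; then
        cp "$volume/result" "$output" || status=1
    fi

    # The temporary volume is discarded, even when the work failed.
    rm -rf "$volume"
    return "$status"
}

demo_dir="$(mktemp -d)"
printf 'konnichiwa\n' > "$demo_dir/input.txt"
run_ephemeral_job "$demo_dir/input.txt" "$demo_dir/output.txt"
cat "$demo_dir/output.txt"   # prints KONNICHIWA
rm -rf "$demo_dir"
```

The point of the pattern is that nothing outlives the request: only the retrieved result remains once the job returns.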

Get Your API Token Pair

Through The Dashboard

Sign Up At modal.com
Click on Settings → API Tokens
Click Create New
Copy The MODAL_TOKEN_ID + MODAL_TOKEN_SECRET

Through The Modal CLI

# If you already have a python environment running and a Modal Account
pip install modal
modal token new

Configure Them In Mirumoji

CLI Setup

mirumoji config set MIRUMOJI_TRANSCRIBE_BACKEND modal
mirumoji config set MODAL_TOKEN_ID <your-token-id>
mirumoji config set MODAL_TOKEN_SECRET <your-token-secret>
mirumoji up

GUI Setup

In The Desktop Launcher, Enter MODAL_TOKEN_ID + MODAL_TOKEN_SECRET Under Settings → Modal
Click Save Configuration
Go Back To The Dashboard
Click Up

Figure: Choosing Modal Backend in the Desktop Launcher's Settings Panel (Backend + Image Source + Provider Keys)
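
For repeatable setups, the CLI commands above can be wrapped so that a missing token is caught before anything is changed. This is a minimal sketch: `configure_modal` is a hypothetical helper, and it assumes the token pair has already been exported as the environment variables MODAL_TOKEN_ID and MODAL_TOKEN_SECRET. It only uses the mirumoji commands shown in the CLI setup.

```shell
#!/usr/bin/env sh
# configure_modal: apply the modal backend settings, failing early if a token is missing.
configure_modal() {
    if [ -z "${MODAL_TOKEN_ID:-}" ] || [ -z "${MODAL_TOKEN_SECRET:-}" ]; then
        echo "Set MODAL_TOKEN_ID and MODAL_TOKEN_SECRET first" >&2
        return 1
    fi

    # Same commands as the CLI setup; stop at the first one that fails.
    mirumoji config set MIRUMOJI_TRANSCRIBE_BACKEND modal &&
    mirumoji config set MODAL_TOKEN_ID "$MODAL_TOKEN_ID" &&
    mirumoji config set MODAL_TOKEN_SECRET "$MODAL_TOKEN_SECRET" &&
    mirumoji up
}
```

After exporting both variables, run configure_modal. If either one is empty, it prints a message and leaves the configuration untouched.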

Variable	Default	What it does
`MIRUMOJI_MODAL_GPU`	`A10G`	Which GPU The Modal Containers Run On (`T4`, `L4`, `A10G`, `A100`, ...). See `Modal's Available GPUs`
`MODAL_FORCE_BUILD`	`0`	Set To `1` To Force Modal To Rebuild Its Cached App Image (On Mirumoji Updates)
`MIRUMOJI_MODAL_IMAGE`	Which Image The Modal Containers Use	Override The Docker Image Used By The Modal Containers. → `Advanced`