Subtitles
subtitles
Utilities for parsing and sanitizing SRT subtitle content.
The LLM `fix_srt` step rewrites the whole SRT, so it can occasionally corrupt a timestamp.
sanitize_srt(srt)
Repairs invalid SRT timestamps.
Accepts both `,` and `.` as the millisecond separator.
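As an illustration of accepting either separator, a timestamp parser could look like the sketch below. The name `parse_timestamp_ms` and the exact pattern are assumptions made for this example; they are not taken from the module itself.

```python
import re
from typing import Optional

# Hypothetical pattern: HH:MM:SS, then "," or ".", then three millisecond digits.
_TIMESTAMP = re.compile(r"(\d+):(\d{2}):(\d{2})[,.](\d{3})")


def parse_timestamp_ms(text: str) -> Optional[int]:
    """Return the timestamp in milliseconds, or None if it does not parse."""
    match = _TIMESTAMP.fullmatch(text.strip())
    if match is None:
        return None
    hours, minutes, seconds, millis = (int(part) for part in match.groups())
    return ((hours * 60 + minutes) * 60 + seconds) * 1000 + millis
```

Both `00:01:02,500` and `00:01:02.500` map to the same value here, and anything that does not match yields `None` so the caller can decide what to drop.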
Repairs Applied
- A cue whose end is at or before its start is given a zero-length duration rather than a negative one
- A cue whose end runs past the next cue's start is clamped to that start (this fixes corrupted timestamps that would otherwise span the gaps between later cues)
- Blocks without a parseable `-->` timing line are dropped
- Cues are renumbered from 1
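The repairs listed above can be sketched roughly as follows. This is not the module's actual implementation: `sanitize_srt_sketch` and its helpers are hypothetical names, and the block splitting, the order in which the two timing rules are applied, and the choice to always write `,` in the output are assumptions made for the example.

```python
import re

# Assumed timing-line shape: "start --> end", with "," or "." before the milliseconds.
_TIMING = re.compile(
    r"(\d+):(\d{2}):(\d{2})[,.](\d{3})\s*-->\s*(\d+):(\d{2}):(\d{2})[,.](\d{3})"
)


def _to_ms(h, m, s, ms):
    return ((int(h) * 60 + int(m)) * 60 + int(s)) * 1000 + int(ms)


def _fmt(total_ms):
    seconds, ms = divmod(total_ms, 1000)
    minutes, seconds = divmod(seconds, 60)
    hours, minutes = divmod(minutes, 60)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d},{ms:03d}"


def sanitize_srt_sketch(srt):
    cues = []  # (start_ms, end_ms, text)
    for block in re.split(r"\n\s*\n", srt.replace("\r\n", "\n").strip()):
        lines = block.split("\n")
        for i, line in enumerate(lines):
            match = _TIMING.search(line)
            if match:
                g = match.groups()
                text = "\n".join(lines[i + 1:])
                cues.append((_to_ms(*g[:4]), _to_ms(*g[4:]), text))
                break
        # A block with no parseable timing line adds nothing, so it is dropped.

    if not cues:
        return srt  # nothing parsed: return the input unchanged

    out = []
    for n, (start, end, text) in enumerate(cues):
        # Clamp an end that runs past the next cue's start.
        if n + 1 < len(cues) and end > cues[n + 1][0]:
            end = cues[n + 1][0]
        # Assumption: apply this rule last so a duration is never negative.
        if end <= start:
            end = start
        # Renumber from 1 regardless of the original index.
        out.append(f"{n + 1}\n{_fmt(start)} --> {_fmt(end)}\n{text}")
    return "\n\n".join(out) + "\n"
```

For example, a first cue running `00:00:01,000 --> 00:00:09.000` followed by a cue starting at `00:00:03,000` would come out ending at `00:00:03,000`, and a cue numbered 7 in the input would be renumbered according to its position.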
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `srt` | `str` | The (possibly malformed) SRT content | required |
Returns:
| Type | Description |
|---|---|
| `str` | The sanitized SRT content. If nothing parses, the input is returned unchanged |