Status: ratified (M1 complete; iterations expected)
Owner: OpenQuackKit/Transcription/
Last updated: 2026-04-26
Transcribe a discrete audio buffer (file or PCM samples) into text behind a stable abstraction so engines can be swapped, compared, and benchmarked.
```swift
import Foundation

public protocol TranscriptionEngine: AnyObject {
    static var engineName: String { get }
    static var suggestedModels: [String] { get }
    var modelID: String { get }
    func transcribe(audioFile url: URL, language: String?) async throws -> EngineTranscription
}

public struct EngineTranscription: Sendable {
    public var text: String
    public var detectedLanguage: String?
    public var audioSeconds: TimeInterval        // duration of the input audio
    public var wallSeconds: TimeInterval         // wall-clock time spent transcribing
    public var timeToFirstToken: TimeInterval?
}

public enum EngineKind: String, CaseIterable, Sendable { case whisperkit, lightning }
```
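As an illustration of how an engine might plug into this abstraction, here is a minimal sketch of a conforming type. `StubEngine` and `speedFactor` are hypothetical names that do not appear in the codebase; the stub returns canned values instead of running a model, and the protocol and struct are repeated only so the snippet compiles on its own.

```swift
import Foundation

// Mirrors the declarations above so the sketch compiles on its own.
public struct EngineTranscription: Sendable {
    public var text: String
    public var detectedLanguage: String?
    public var audioSeconds: TimeInterval
    public var wallSeconds: TimeInterval
    public var timeToFirstToken: TimeInterval?
}

public protocol TranscriptionEngine: AnyObject {
    static var engineName: String { get }
    static var suggestedModels: [String] { get }
    var modelID: String { get }
    func transcribe(audioFile url: URL, language: String?) async throws -> EngineTranscription
}

// Hypothetical stub engine: returns canned text instead of running a model.
final class StubEngine: TranscriptionEngine {
    static let engineName = "stub"
    static let suggestedModels = ["stub-tiny"]
    let modelID: String

    init(modelID: String = "stub-tiny") { self.modelID = modelID }

    func transcribe(audioFile url: URL, language: String?) async throws -> EngineTranscription {
        EngineTranscription(
            text: "hello world",
            detectedLanguage: language ?? "en",
            audioSeconds: 10.0,   // pretend the clip is 10 s long
            wallSeconds: 2.0,     // pretend decoding took 2 s
            timeToFirstToken: nil
        )
    }
}

// Hypothetical helper: seconds of audio processed per wall-clock second.
// Returns nil when wallSeconds is not positive, to avoid dividing by zero.
func speedFactor(_ t: EngineTranscription) -> Double? {
    t.wallSeconds > 0 ? t.audioSeconds / t.wallSeconds : nil
}
```

Because callers hold only `any TranscriptionEngine`, a benchmark harness could run the same audio file through several engines and compare `speedFactor` values without knowing which engine produced each result.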

| Engine | Status | Use case |
|---|---|---|
| WhisperKit (argmaxinc/argmax-oss-swift) | shipped | Primary runtime engine for the app |
| Lightning (lightning-whisper-mlx via subprocess) | shipped, bench-only | Comparison baseline; not for app runtime |
| WhisperCpp | future | Non-MLX reference for cross-platform later |
| MLXAudioEngine (mlx-audio-swift) | future | Voxtral / Qwen3-ASR / Parakeet variants |
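
The split between runtime and bench-only engines in the table could be encoded next to `EngineKind`, for example when parsing an engine name passed to a benchmark tool. The sketch below is an assumption, not existing code: `isBenchOnly`, `parseEngineKind`, and `runtimeEngineKinds` are hypothetical names, and only the two shipped engines are covered because they are the only cases the enum declares.

```swift
// Mirrors the enum declared above so the sketch compiles on its own.
public enum EngineKind: String, CaseIterable, Sendable { case whisperkit, lightning }

extension EngineKind {
    // Hypothetical flag encoding the "bench-only" status from the table.
    var isBenchOnly: Bool {
        switch self {
        case .whisperkit: return false
        case .lightning: return true
        }
    }
}

// Hypothetical parser for a user-supplied engine name (case-insensitive).
// Returns nil for names that are not declared cases, such as future engines.
func parseEngineKind(_ raw: String) -> EngineKind? {
    EngineKind(rawValue: raw.lowercased())
}

// Engine kinds that are allowed in the app at runtime.
func runtimeEngineKinds() -> [EngineKind] {
    EngineKind.allCases.filter { !$0.isBenchOnly }
}
```

Using an exhaustive `switch` means that adding a new case (for instance when a future engine ships) makes the compiler flag `isBenchOnly` until its status is decided.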
bench/corpus/librispeech for the chosen default model.

A model that fails any of these is not the default. BENCHMARKS.md is the source of truth.

large-v3-turbo in argmaxinc/whisperkit-coreml: the current config glob doesn’t match. Use WhisperKit.fetchAvailableModels() to enumerate.

- Sources/OpenQuackKit/Transcription/TranscriptionEngine.swift
- Sources/OpenQuackKit/Transcription/WhisperKitEngine.swift
- Sources/OpenQuackKit/Transcription/LightningEngine.swift
- docs/BENCHMARKS.md