openquack

Roadmap

Atomic tasks: every item cites a SPEC and maps to a PR. Agent contributors should claim a task by opening a draft PR; see AGENTS.md.

Status


| | Task | Spec | Notes |
|---|---|---|---|
| 🔵 | Agent session protocol + PassthroughAgent + conversation panel | SPEC-006 | M |
| 🔵 | ClaudeCodeAgent: long-lived subprocess, streaming events | SPEC-006 | M |
| 🔵 | Approval prompt UX (overlay morph + buttons) | SPEC-006 | S |
| 🔵 | Settings: Privacy + Agent panes | SPEC-006 | S; lands with agent impl |
| 🔵 | TextPolishEngine protocol + OllamaPolishEngine (HTTP) | SPEC-007 | S |
| 🔵 | MLXLMPolishEngine (in-process via mlx-swift-lm) | SPEC-007 | M |
| 🔵 | Settings → Polish pane (engine picker, model picker) | SPEC-007 | S |
| 🔵 | Bench polish WER delta + latency on openquack-bench | SPEC-007 | S |
| 🔵 | Domain-term accuracy bench (e.g. “Claude Code” not “cloud code”) | SPEC-007 | S |
| 🔵 | “Send-confidence” bench: % of utterances clean enough to ship as-is | SPEC-007 | S |
| 🔵 | Custom dictionary auto-learn: diff transcript vs. committed text, surface candidates with ≥3 occurrences as “Add to dictionary” nudge; export correction log as pre-filled GitHub issue template | SPEC-022 | M |
| 🔵 | fn / Globe key as a bindable hotkey: bare fn or fn+key, opt-in alongside existing ⌃⇧Space; fixes onboarding picker silently ignoring fn (#23) | SPEC-003a | S |
| 🟡 | Mandarin auto-detect fix: categorical failure-mode metrics + zh corpus expansion (PR-A); token suppression + script-match retry (PR-B); issue #17 | SPEC-021 | S |
| ⚪ | OllamaAgent (local HTTP) | SPEC-006 ext | S |
| ⚪ | MLXLMAgent (in-process via mlx-swift-lm) | SPEC-006 ext | M |
| ⚪ | Active-app context: feed foreground app + focused field text into Whisper prompt bias and polish/agent prompt | - | M |
| 🔵 | Per-app tone profiles: bundle-ID → preset (technical / formal / casual / neutral) with custom prompt field; auto-switches on hotkey fire; issue #24 | SPEC-024 | M; needs SPEC-007 first |
| 🔵 | Launch at login (SMAppService toggle in Settings → General); issue #29 | SPEC-023 | S |
| ⚪ | Investigate streaming for medium-length (15–30s) audio: bench WER vs. wall-time at lower targetChunkSeconds | SPEC-012 ext | S |
| ⚪ | Live partial transcripts in pill/popover while speaking | - | M |
| ⚪ | System-audio capture (meeting mode) | - | ScreenCaptureKit |
| ⚪ | Multilingual UI strings | - | follow Whisper language menu |
| ⚪ | Action confirmation UI for high-risk agent calls | - | privacy gate |
| ⚪ | Per-agent transcript history pane (opt-in, local-only) | - | - |
| ⚪ | Code signing + notarisation | - | S |
| ⚪ | Sparkle auto-update | - | S |
| ⚪ | Demo gif + landing page (GitHub Pages) | - | S |
| ⚪ | Linux / Windows ports | - | post-2.0 |
| 🟢 | Send-feedback menu item: one click from status item to GitHub issue chooser | SPEC-018 | merged in #5 |
| 🟢 | Usage stats pane: words dictated, time saved, audio processed (local-only) | SPEC-013 | merged in c91da06 |
| 🟢 | Local audio + transcript history (local-only, retention cap) | SPEC-014 | merged in c91da06 |
| 🟢 | Stream transcription for long audio (>~30s): chunk while recording | SPEC-012 | perf; user never sees partials |
| 🟢 | App shell: SwiftPM target, menu bar, About panel | SPEC-010 | - |
| 🟢 | Audio capture: AVAudioEngine → 16 kHz mono WAV | SPEC-001 | - |
| 🟢 | Global hotkey (⌃⇧Space toggle, KeyboardShortcuts pkg) | SPEC-003 | - |
| 🟢 | Record → WhisperKit medium (en) → transcript in popover + clipboard | SPEC-002 | - |
| 🟢 | Floating recording-state pill (top-centre, click-through) | SPEC-004 | - |
| 🟢 | CGEvent ⌘V auto-paste at cursor (Accessibility prompt + clipboard fallback) | SPEC-005 | - |
| 🟢 | Onboarding flow (Welcome → Mic → Paste → Hotkey → Done) | - | - |
| 🟢 | Settings scene MVP (General / Models / Shortcut / About) | - | - |
| 🟢 | Smart text post-processing (capitalise, punct, fillers) | - | - |
| 🟢 | Live level meter + push-to-talk | SPEC-001 ext | - |
| 🟢 | VAD auto-stop + sounds + custom dictionary | - | - |
| 🟢 | App icon (procedural cream-gradient duck) | - | - |
| 🟢 | DMG + Homebrew cask + README polish | - | - |
| 🟢 | WhisperKit engine | SPEC-002 | primary; Apple Silicon Metal |
| 🟢 | Lightning engine (Python subprocess) | SPEC-002 | bench-only baseline |
| 🟢 | Metrics: WER / CER / RTF / RSS / cold-start | SPEC-002 | OpenQuackKit/Metrics/ |
| 🟢 | Corpus: 177 clips (TTS / multilingual / LibriSpeech / noise-aug) | - | bench/corpus/ |
| 🟢 | Bench rerun on enriched corpus → BENCHMARKS.md | - | M4/16GB matrix |
| 🟢 | openquack-cli (single-file transcribe) | SPEC-002 | - |
| 🟢 | SPM scaffolding (Kit + bench + CLI) | - | Package.swift, three targets |
| 🟢 | Vision + roadmap + AGENTS.md + spec scaffold | - | - |
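
To make the custom dictionary auto-learn task (SPEC-022) more concrete, here is one way its candidate-surfacing step might work. This is a minimal Python sketch for illustration, not the project's implementation and not taken from SPEC-022. The function names, the word-level alignment via `difflib`, and the shape of the correction log are all assumptions. Only the threshold of 3 occurrences comes from the roadmap entry.

```python
from collections import Counter
from difflib import SequenceMatcher

# Threshold taken from the roadmap entry (candidates with >= 3 occurrences).
MIN_OCCURRENCES = 3


def corrections(transcript: str, committed: str) -> list[tuple[str, str]]:
    """Return (heard, corrected) phrase pairs where the user replaced words.

    Hypothetical helper: aligns the two texts word by word and keeps only
    the spans that were replaced (pure insertions and deletions are ignored).
    """
    heard_words = transcript.split()
    final_words = committed.split()
    matcher = SequenceMatcher(a=heard_words, b=final_words, autojunk=False)
    pairs = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "replace":
            heard = " ".join(heard_words[i1:i2])
            corrected = " ".join(final_words[j1:j2])
            pairs.append((heard, corrected))
    return pairs


def dictionary_candidates(log, min_occurrences=MIN_OCCURRENCES):
    """Count corrections across a log of (transcript, committed) pairs and
    return those seen at least `min_occurrences` times."""
    counts = Counter()
    for transcript, committed in log:
        counts.update(corrections(transcript, committed))
    return [pair for pair, n in counts.items() if n >= min_occurrences]
```

Under these assumptions, a log in which "cloud code" was corrected to "Claude Code" three times would yield that pair as an "Add to dictionary" candidate, while a one-off typo fix would stay below the threshold.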

How to claim a task

  1. Pick a 🔵.
  2. Open an issue using the Agent Task template; mark yourself as owner.
  3. Read the cited SPEC.
  4. Open a draft PR within ~24h naming the task in the title.
  5. Follow AGENTS.md for PR shape and required tests.