Harvestry transcribes your videos (lecture recordings, online courses, tutorials, conference talks), captures the key screenshots, and exports a polished, interactive study document, entirely on your Mac and entirely in private.
Requires macOS 15 or later · Apple Silicon
You can rewatch a video. You can't search it. You can't highlight the important parts, annotate a timestamp, or find that one slide you half-remember without scrubbing back through everything. Harvestry converts any video — lecture, tutorial, talk, course — into a structured study document. Read it, search it, mark it up. Keep it forever.
Paste a URL from YouTube, Vimeo, or any of a thousand platforms — or open a local file. Harvestry downloads and prepares it automatically.
Every spoken word is transcribed on the Apple Neural Engine while key video frames are captured on the GPU. The two stages run side by side and finish together, usually in just a few minutes.
A self-contained study page that opens in any web browser — on your Mac, iPhone, or any device. Live word highlighting, text annotations, audio sync, and dark mode built in. No internet required.
Harvestry uses WhisperKit — Argmax's optimised framework for running OpenAI's Whisper on Apple Silicon. Every word is processed on the Apple Neural Engine, the dedicated ML accelerator in every M-series chip. No API key. No audio upload. No subscription.
Five model sizes let you dial in the right tradeoff between speed and accuracy for your video. Tiny finishes a 90-minute video in minutes. Large Turbo captures technical vocabulary, proper nouns, and accented speech that smaller models miss.
"All American: The Power of Sports" by U.S. National Archives, licensed under CC BY 3.0
Not every frame deserves a screenshot — and the same slide should never be captured twice. Harvestry runs a multi-stage pipeline — Apple Vision shape detection, accurate on-device OCR, and perceptual fingerprinting — to surface every distinct slide worth keeping and quietly drop the near-duplicates a simpler tool would flood you with. Every stage runs on your Mac.
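For the curious, here is a minimal sketch of how perceptual fingerprinting can drop near-duplicate frames. It is an illustration only: it uses a difference hash (dHash) as a stand-in, since Harvestry's actual fingerprinting method is not described here. The function names and the `max_distance` threshold are assumptions, and each frame is assumed to arrive as a small grayscale grid that has already been downscaled (8 rows by 9 columns in the example).

```python
from typing import List, Sequence

# A frame already downscaled to a small grayscale grid (assumed: 8 rows x 9 columns).
Grid = Sequence[Sequence[int]]


def dhash(grid: Grid) -> int:
    """Difference hash: one bit per pair of horizontally adjacent pixels."""
    bits = 0
    for row in grid:
        for left, right in zip(row, row[1:]):
            bits = (bits << 1) | (1 if left > right else 0)
    return bits


def hamming(a: int, b: int) -> int:
    """Number of bits that differ between two fingerprints."""
    return bin(a ^ b).count("1")


def drop_near_duplicates(grids: List[Grid], max_distance: int = 4) -> List[int]:
    """Return the indices of frames that differ enough from every frame kept so far."""
    kept: List[int] = []
    hashes: List[int] = []
    for index, grid in enumerate(grids):
        fingerprint = dhash(grid)
        # Keep the frame only if no earlier kept frame is within the threshold.
        if all(hamming(fingerprint, seen) > max_distance for seen in hashes):
            kept.append(index)
            hashes.append(fingerprint)
    return kept
```

Two captures of the same slide differ by only a few bits, so the second one is skipped, while a genuinely new slide lands far away and is kept. The threshold of 4 bits is a hypothetical starting point, not a documented Harvestry setting.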
Automated capture is thorough, but you know your video better than any algorithm. Harvestry gives you a full suite of curation tools so you can take the result from good to exactly right — without rerunning any processing.
Every video you process joins your library as a fully annotated study document — your highlights, margin notes, and thinking baked into the HTML, ready whenever you return. Not a folder of files. A curated collection that grows more valuable with every lecture you add.
A raw transcript is accurate. It isn't always elegant. Spoken content is conversational, not written — presenters repeat themselves, revise mid-sentence, and circle back. The optional consolidation stage sends your transcript through an LLM and produces a second, refined version of your study document: same facts, same structure, clearer writing.
Routes your transcript through Anthropic's Claude API. Produces consistently high-quality, well-structured academic prose. Fast, with no local setup required beyond an API key.
⚠ Transcript text is sent to Anthropic over HTTPS. Use local Ollama instead if your video contains sensitive material.
Connects to a locally running Ollama instance. Any compatible model — Llama, Mistral, Gemma, and others. Nothing leaves your Mac. No per-token cost. No internet required.
Requires a capable Mac and a few minutes of Ollama setup. Once running, it is completely private and free of any external dependency.
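If you want to see what talking to a local Ollama instance looks like, the sketch below sends a transcript to Ollama's `/api/generate` endpoint on its default port (11434). It is not Harvestry's own code: the prompt wording, the function names, and the default model name `llama3` are assumptions chosen for illustration.

```python
import json
import urllib.request

# Ollama's default local endpoint; nothing here leaves the machine.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_request(transcript: str, model: str = "llama3") -> urllib.request.Request:
    """Build a non-streaming generate request (the prompt text is hypothetical)."""
    prompt = (
        "Rewrite this lecture transcript as clear prose. "
        "Keep every fact and the original order.\n\n" + transcript
    )
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )


def consolidate(transcript: str, model: str = "llama3") -> str:
    """Send the request to the local Ollama server and return the generated text."""
    with urllib.request.urlopen(build_request(transcript, model)) as reply:
        return json.loads(reply.read())["response"]
```

Calling `consolidate(...)` requires Ollama to be running with the chosen model already pulled. Because the URL points at `localhost`, the transcript stays on the Mac.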
The consolidated document flows into the same HTML export pipeline as the base transcript — same typography, same audio sync, same layout. Both versions are fully editable in Harvestry before export. With Harvestry Pro, you can also export either version — or both — straight to your Obsidian vault or any Markdown editor.
The export isn't a raw text dump. Harvestry generates a polished, fully self-contained HTML page you open directly in Safari, Chrome, or any browser — styled like a high-quality editorial publication, with live audio playback, per-word highlighting, and text markup built in.
As the audio plays, each word lights up in real time. Click any timestamp to jump to that moment.
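Word-level highlighting like this generally comes down to one lookup: given the current playback position, which word started most recently? The sketch below shows that lookup with a binary search over word start times. It is an assumption about the general technique, not a description of Harvestry's export, which would run in the browser rather than in Python. The name `active_word` and the sample timings are hypothetical.

```python
from bisect import bisect_right
from typing import List, Optional


def active_word(starts: List[float], position: float) -> Optional[int]:
    """Index of the word being spoken at `position` seconds.

    `starts` holds each word's start time in ascending order.
    Returns None when playback has not reached the first word yet.
    """
    index = bisect_right(starts, position) - 1
    return index if index >= 0 else None


# Hypothetical timings for three words.
starts = [0.5, 0.9, 1.4]
print(active_word(starts, 1.0))  # word at index 1 is active
```

Jumping from a timestamp to a moment is the reverse operation: set the playback position to the start time of the clicked word.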
Screenshots are classified at export time. Lecture slides with readable content render at full page width — legible without zooming. Presenter footage floats to the margin as a compact thumbnail, keeping the text front and centre.
Four highlight colours and margin notes are created in the app and baked into the HTML — visible to anyone the moment the file opens, no interaction required.
No CDN fonts, no remote scripts, no dependencies. Dark mode and font-size preferences are baked in and persist across reloads. Works in any browser, on any device — today or in ten years.
Obsidian & Markdown export. Harvestry Pro also exports your study document as a Markdown file with YAML frontmatter, per-screenshot assets, and a per-lecture folder structure that drops straight into your Obsidian vault or any Markdown editor.
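As a rough picture of what a Markdown file with YAML frontmatter looks like, here is a small sketch that assembles one. The field names (`title`, `source`) and the function name are hypothetical, and the layout Harvestry Pro actually writes may differ.

```python
import json
from typing import Dict, List


def render_markdown(meta: Dict[str, str], paragraphs: List[str]) -> str:
    """Join a YAML frontmatter block and body paragraphs into one Markdown document."""
    # json.dumps yields a double-quoted, escaped string, which is also a valid YAML scalar.
    front = "\n".join(f"{key}: {json.dumps(value)}" for key, value in meta.items())
    body = "\n\n".join(paragraphs)
    return f"---\n{front}\n---\n\n{body}\n"


print(render_markdown(
    {"title": "Lecture 1", "source": "local file"},
    ["First paragraph of the transcript.", "Second paragraph."],
))
```

Obsidian reads the block between the `---` lines as note properties, so fields like these become searchable metadata in the vault.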
Transcription, screenshot capture, and HTML export all run locally. No audio, no video, and no images are ever transmitted anywhere. The core pipeline has no network dependencies beyond the video download itself.
The one optional exception: if you choose to use the Claude API for LLM consolidation, your transcript text is sent to Anthropic's servers over HTTPS. This step is always opt-in — the base processing pipeline is fully local whether or not you use it. The Pro tier supports local Ollama models instead, keeping even that step entirely on-device.
Every label, prompt, status message, and tooltip is fully localised. Harvestry respects your macOS language preference automatically — no configuration required.
Download Harvestry and turn your first video into a study document — free. No subscription, no account. Drop in a file or paste a URL and press Begin Processing.
Requires macOS 15 or later · Apple Silicon · Free trial included