Harvestry Documentation
Everything you need to turn any video into a polished, interactive study document — on your Mac, in private.
What Harvestry Does
Harvestry turns passive video watching into active studying. You give it a video — a lecture recording, an online course, a conference talk, a tutorial — and it produces a richly formatted HTML study document you can read, search, annotate, and keep forever.
The document Harvestry generates contains four things: the full verbatim transcript, broken into readable passages; the slides that matter, detected, de-duplicated, and captured at their sharpest; an embedded audio player with per-word highlight sync, so you can click any word to hear it spoken; and, optionally, an AI-consolidated summary of the lecture's key ideas. Everything is woven into a single self-contained file you own outright.
Who Should Use Harvestry
Harvestry is for anyone who learns from video and wants something more durable than a playback history:
- Students reviewing lecture recordings, pre-recorded course material, or seminar replays before an exam.
- Professionals processing conference talks, webinars, technical demos, and training videos they need to reference later.
- Researchers working through interview recordings, panel discussions, or academic talks where every word matters.
- Lifelong learners who watch online courses and want a study-ready artifact they can annotate and return to.
If you have ever re-watched a 90-minute video just to find a single quote or diagram, Harvestry is what replaces that workflow.
Entirely On-Device, Entirely Private
Every stage of Harvestry's pipeline runs on your Mac. No data leaves your machine unless you explicitly choose a cloud model for AI consolidation:
- Transcription uses WhisperKit on the Apple Neural Engine. No audio is uploaded anywhere, and your spoken words never leave your Mac.
- Screenshot capture uses AVFoundation and VideoToolbox for hardware-accelerated frame extraction, plus Apple Vision text detection and on-device OCR to pick the slides that matter and drop near-duplicates. No image or recognized text ever leaves your Mac.
- HTML export is generated entirely from files on your disk. The output is a static file with no tracking, no analytics, and no external dependencies.
- LLM consolidation is optional, and it is the only step that can use the network. You can use a locally running Ollama model (fully on-device) or the Claude API (your transcript is sent to Anthropic over HTTPS). The choice is yours.
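To make the on-device option in the list above concrete, here is a minimal sketch of what a summarisation request to a locally running Ollama server could look like. It is not Harvestry's actual code. It assumes Ollama's default local address and its `/api/generate` endpoint; the model name `llama3`, the prompt wording, and the helper names `build_ollama_request` and `stays_on_device` are all hypothetical. The request is built but never sent, so the sketch runs without Ollama installed.

```python
import json
import urllib.request
from urllib.parse import urlparse

# Assumption: Ollama is listening on its default local address.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_ollama_request(transcript: str, model: str = "llama3") -> urllib.request.Request:
    """Build (but do not send) a summarisation request for a local Ollama server."""
    payload = {
        "model": model,  # hypothetical model name; use one you have pulled locally
        "prompt": "Summarise the key ideas of this lecture:\n\n" + transcript,
        "stream": False,  # ask for one JSON reply instead of a token stream
    }
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def stays_on_device(request: urllib.request.Request) -> bool:
    """Return True when the request targets the loopback interface only."""
    host = urlparse(request.full_url).hostname
    return host in {"localhost", "127.0.0.1", "::1"}
```

A request built this way targets the loopback interface, so the transcript stays on the Mac. A request to a hosted API such as Claude would fail the `stays_on_device` check, which is the distinction the privacy list draws.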
What You Get
Each processed video produces a self-contained index.html you can open in any browser, share as a file, or archive. The document includes:
- A full word-accurate transcript, broken into readable passages with timestamps
- Key screenshots placed inline at the moments they appeared in the video
- An embedded audio player with live per-word highlight sync — click any word, hear it immediately
- Your own highlights and margin notes, embedded directly in the exported HTML — visible to anyone who opens the file, in any browser
- An optional AI-generated consolidated notes section summarising key concepts
- A font-size slider, light/dark theme toggle, and print-ready layout
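The per-word highlight sync listed above can be pictured as a lookup from playback time to word index, plus the reverse lookup when a word is clicked. The sketch below is an illustration under assumptions, not the player's real implementation. The `(word, start, end)` timing layout and the function names are hypothetical.

```python
from bisect import bisect_right
from typing import List, Optional, Tuple

# Hypothetical timing data: (word, start_seconds, end_seconds), sorted by start.
Word = Tuple[str, float, float]


def active_word_index(words: List[Word], playback_time: float) -> Optional[int]:
    """Return the index of the word being spoken at playback_time, if any."""
    starts = [start for _, start, _ in words]
    # Last word whose start time is <= playback_time.
    i = bisect_right(starts, playback_time) - 1
    if i < 0:
        return None  # playback is before the first word
    _, _, end = words[i]
    # Between words (or after the last one) nothing is highlighted.
    return i if playback_time < end else None


def seek_time_for_click(words: List[Word], index: int) -> float:
    """Return the audio position to jump to when the word at index is clicked."""
    return words[index][1]
```

With `words = [("Hello", 0.0, 0.4), ("world", 0.5, 0.9)]`, a playback time of 0.2 highlights the first word, 0.45 falls in the gap and highlights nothing, and clicking the second word seeks to 0.5 seconds. Binary search keeps the lookup fast even for a transcript with tens of thousands of words.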
All Documentation Pages
Browse the full reference for every Harvestry feature.