LLM Consolidation
Send your transcript to Claude or a local Ollama model to generate polished, structured study notes alongside the verbatim transcript.
What Consolidation Does
The raw transcript produced by WhisperKit is accurate but verbatim — it captures exactly what was said, including filler words, repeated phrases, and the natural looseness of spoken language. LLM Consolidation sends this transcript to a language model and asks it to produce structured Markdown study notes: headings organized by topic, bullet-point summaries, and key takeaways.
The consolidated notes appear as a separate "Consolidated Notes" tab in the transcript panel and in the exported HTML page. They sit alongside the original verbatim transcript — the raw transcript is never replaced.
You can annotate consolidated notes independently of the transcript; highlights, margin notes, and inline notes all work in both views.
Choosing a Mode
The mode picker on the Pipeline Step 3 row is a segmented control with three options:
- Off — No LLM is called. The pipeline skips Step 3 entirely and proceeds directly to export. This is the default.
- Claude — Uses the Anthropic API. Your transcript is sent to Anthropic's servers over HTTPS. Requires an API key.
- Ollama — Uses a locally running Ollama instance at localhost:11434. Fully on-device. Requires Ollama to be installed and a model pulled.
The mode is stored per lecture. Changing the mode never re-runs consolidation silently: on a pending lecture it takes effect the next time you click Begin Processing, and on a completed lecture Harvestry asks before regenerating (see Re-running Consolidation below).
Claude Setup
To use Claude consolidation, you need an Anthropic API key. Keys are available at console.anthropic.com.
Go to Harvestry → Settings and click the Consolidation tab.
Paste your Anthropic API key into the API Key field. The key is stored securely in the macOS Keychain — it is never stored in plain text.
After the API key is saved, Harvestry fetches your available models from the Anthropic API and shows them in a dropdown. Choose a model. Click the refresh button next to the picker to update the model list if your account gains access to new models.
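Harvestry performs this fetch internally, but if you want to verify your key or inspect the list yourself, the equivalent request against Anthropic's models endpoint looks like this:

```sh
# List the models your API key can access (the same data Harvestry's dropdown shows)
curl -s https://api.anthropic.com/v1/models \
  -H "x-api-key: $ANTHROPIC_API_KEY" \
  -H "anthropic-version: 2023-06-01"
```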
Claude Model Recommendations
The model list is fetched live from the Anthropic API, so it reflects whatever your account has access to. General guidance:
| Model Family | Best For | Cost |
|---|---|---|
| Claude Sonnet | Recommended for most use. Excellent balance of quality and speed. Produces well-structured notes with good comprehension of technical content. | Moderate |
| Claude Haiku | Fastest and cheapest. Good for straightforward lectures with clear structure. May miss nuance in dense academic content. | Low |
| Claude Opus | Highest quality. Best for highly technical, multi-topic, or non-English content where maximum comprehension matters. | High |
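Under the hood, a Claude consolidation call is a standard Anthropic Messages API request: the system prompt carries the formatting instructions and the transcript is the user message. Harvestry builds the real payload internally; this is only a minimal sketch of its shape, with an example model name:

```sh
# Illustrative sketch only; Harvestry constructs the actual request itself
curl -s https://api.anthropic.com/v1/messages \
  -H "x-api-key: $ANTHROPIC_API_KEY" \
  -H "anthropic-version: 2023-06-01" \
  -H "content-type: application/json" \
  -d '{
    "model": "claude-sonnet-4-20250514",
    "max_tokens": 4096,
    "system": "Produce structured Markdown study notes from the transcript.",
    "messages": [{"role": "user", "content": "<verbatim transcript text>"}]
  }'
```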
Ollama Setup
Ollama runs language models entirely on your Mac. No data leaves the device.
Download Ollama from ollama.ai and install it. Ollama runs as a background service on localhost:11434.
Open Terminal and pull a model. The model tags below are examples; substitute any model from the Ollama library:
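```sh
# Example tag; the default llama3 tag pulls the 8B-parameter model
ollama pull llama3
```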
or for a larger model:
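```sh
# Larger models give better notes but run slower and need much more RAM
ollama pull llama3:70b
```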
Any Ollama-compatible model works. Models with at least 7B parameters give reasonable note quality.
In the Pipeline Step 3 picker, select Ollama. Harvestry queries localhost:11434/api/tags and shows all installed models in a dropdown. Click the refresh button to re-query if you pull new models after opening Harvestry.
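You can query the same endpoint yourself to confirm Ollama is running and see exactly what Harvestry sees:

```sh
# Returns JSON listing every locally installed model
curl -s http://localhost:11434/api/tags
```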
The Consolidation Prompt
The system prompt sent to the LLM instructs it how to format the output. The default prompt asks the model to produce structured Markdown with:
- A brief summary paragraph at the top
- Headings for each major topic covered
- Bullet-point summaries under each heading
- A "Key Takeaways" section at the end
You can customize the prompt in Settings → Consolidation → System Prompt. The editor accepts free-form instructions, including directions about Markdown formatting. A Reset to Default button restores the original prompt.
Per-Lecture Control
The consolidation mode (Off / Claude / Ollama) and model selection are stored per lecture. You can have different lectures using different modes:
- Some lectures consolidated with Claude Sonnet for maximum quality
- Others consolidated with Ollama for privacy
- Others with consolidation off for quick reference
Changing the mode on a pending lecture takes effect when you next click Begin Processing. Changing the mode on a completed lecture triggers an alert asking if you want to generate or regenerate consolidated notes now.
Disabling After the Fact
To remove consolidated notes from a completed lecture:
- On the Step 3 row, switch the mode picker from Claude or Ollama to Off.
- A confirmation alert appears: "Remove consolidated notes from this lecture? The HTML export will be regenerated without the notes tab."
- Confirm. The consolidated notes are deleted and the HTML is re-exported.
Re-running Consolidation
If a lecture is already complete and you switch its consolidation mode to Claude or Ollama, Harvestry shows an alert: "Generate consolidated notes now?" Confirming runs only Step 3 and then re-exports the HTML. The transcript and screenshots are unchanged.
Privacy
The privacy implications differ significantly between modes:
- Off — No LLM call. Nothing leaves the device.
- Claude — Your transcript text is sent to Anthropic's API servers over HTTPS. This is the one intentionally off-device step in Harvestry's pipeline. Anthropic's data handling is governed by their Privacy Policy.
- Ollama — Fully on-device. The transcript is sent over localhost to the Ollama daemon running on your Mac. No data leaves the device.
Consider using Ollama mode for lectures containing sensitive, confidential, or personally identifiable content.