LLM Consolidation

Send your transcript to Claude or a local Ollama model to generate polished, structured study notes alongside the verbatim transcript.

What Consolidation Does

The raw transcript produced by WhisperKit is accurate but verbatim — it captures exactly what was said, including filler words, repeated phrases, and the natural looseness of spoken language. LLM Consolidation sends this transcript to a language model and asks it to produce structured Markdown study notes: headings organized by topic, bullet-point summaries, and key takeaways.

The consolidated notes appear as a separate "Consolidated Notes" tab in the transcript panel and in the exported HTML page. They sit alongside the original verbatim transcript — the raw transcript is never replaced.

You can annotate consolidated notes independently of the transcript (highlights, margin notes, inline notes all work in both views).

Choosing a Mode

The mode picker on the Pipeline Step 3 row is a segmented control with three options:

Off — No LLM is called. The pipeline skips Step 3 entirely and proceeds directly to export. This is the default.
Claude — Uses the Anthropic API. Your transcript is sent to Anthropic's servers over HTTPS. Requires an API key.
Ollama — Uses a locally running Ollama instance at localhost:11434. Fully on-device. Requires Ollama to be installed and a model pulled.

The mode is stored per lecture. Changing the mode on a completed lecture does not automatically re-run consolidation — it only affects the next time you click Begin Processing or trigger consolidation manually.

Claude Setup

To use Claude consolidation, you need an Anthropic API key. Keys are available at console.anthropic.com.

Open Settings → Consolidation

Go to Harvestry → Settings and click the Consolidation tab.

Enter your API key

Paste your Anthropic API key into the API Key field. The key is stored securely in the macOS Keychain — it is never stored in plain text.

Select a model

After the API key is saved, Harvestry fetches your available models from the Anthropic API and shows them in a dropdown. Choose a model. Click the refresh button next to the picker to update the model list if your account gains access to new models.

Claude Model Recommendations

The model list is fetched live from the Anthropic API, so it reflects whatever your account has access to. General guidance:

Model Family	Best For	Cost
Claude Sonnet	Recommended for most use. Excellent balance of quality and speed. Produces well-structured notes with good comprehension of technical content.	Moderate
Claude Haiku	Fastest and cheapest. Good for straightforward lectures with clear structure. May miss nuance in dense academic content.	Low
Claude Opus	Highest quality. Best for highly technical, multi-topic, or non-English content where maximum comprehension matters.	High

⚠️

API usage is billed by Anthropic. Consolidating a long lecture can consume a significant number of tokens (input + output). A 90-minute lecture transcript is typically 8,000–15,000 input tokens. Check your Anthropic dashboard to monitor usage.

Ollama Setup

Ollama runs language models entirely on your Mac. No data leaves the device.

Install Ollama

Download Ollama from ollama.ai and install it. Ollama runs as a background service on localhost:11434.

Pull a model

Open Terminal and run a command like:

ollama pull llama3

or for a larger model:

ollama pull llama3:70b

Any Ollama-compatible model works. Models with at least 7B parameters give reasonable note quality.

Harvestry auto-detects your models

In the Pipeline Step 3 picker, select Ollama. Harvestry queries localhost:11434/api/tags and shows all installed models in a dropdown. Click the refresh button to re-query if you pull new models after opening Harvestry.

ℹ️

Ollama endpoint. If you run Ollama on a non-default port or a remote host, you can change the endpoint URL in Settings → Consolidation → Ollama Endpoint. Default: http://localhost:11434.

The Consolidation Prompt

The system prompt sent to the LLM instructs it how to format the output. The default prompt asks the model to produce structured Markdown with:

A brief summary paragraph at the top
Headings for each major topic covered
Bullet-point summaries under each heading
A "Key Takeaways" section at the end

You can customize the prompt in Settings → Consolidation → System Prompt. The text editor in Settings accepts any Markdown-aware instructions. A Reset to Default button restores the original prompt.

Per-Lecture Control

The consolidation mode (Off / Claude / Ollama) and model selection are stored per lecture. You can have different lectures using different modes:

Some lectures consolidated with Claude Sonnet for maximum quality
Others consolidated with Ollama for privacy
Others with consolidation off for quick reference

Changing the mode on a pending lecture takes effect when you next click Begin Processing. Changing the mode on a completed lecture triggers an alert asking if you want to generate or regenerate consolidated notes now.

Disabling After the Fact

To remove consolidated notes from a completed lecture:

On the Step 3 row, switch the mode picker from Claude or Ollama to Off.
A confirmation alert appears: "Remove consolidated notes from this lecture? The HTML export will be regenerated without the notes tab."
Confirm. The consolidated notes are deleted and the HTML is re-exported.

Re-running Consolidation

If a lecture is already complete and you switch its consolidation mode to Claude or Ollama, Harvestry shows an alert: "Generate consolidated notes now?" Confirming runs only Step 3 and then re-exports the HTML. The transcript and screenshots are unchanged.

Privacy

The privacy implications differ significantly between modes:

Off — No LLM call. Nothing leaves the device.
Claude — Your transcript text is sent to Anthropic's API servers over HTTPS. This is the one intentionally off-device step in Harvestry's pipeline. Anthropic's data handling is governed by their Privacy Policy.
Ollama — Fully on-device. The transcript is sent over localhost to the Ollama daemon running on your Mac. No data leaves the device.

Consider using Ollama mode for lectures containing sensitive, confidential, or personally identifiable content.

Previous Screenshot Capture

Next HTML Export