LLM Consolidation

Send your transcript to Claude or a local Ollama model to generate polished, structured study notes alongside the verbatim transcript.

What Consolidation Does

The raw transcript produced by WhisperKit is accurate but verbatim — it captures exactly what was said, including filler words, repeated phrases, and the natural looseness of spoken language. LLM Consolidation sends this transcript to a language model and asks it to produce structured Markdown study notes: headings organized by topic, bullet-point summaries, and key takeaways.

The consolidated notes appear as a separate "Consolidated Notes" tab in the transcript panel and in the exported HTML page. They sit alongside the original verbatim transcript — the raw transcript is never replaced.

You can annotate consolidated notes independently of the transcript — highlights, margin notes, and inline notes all work in both views.

Choosing a Mode

The mode picker on the Pipeline Step 3 row is a segmented control with three options:

  - Off — no consolidation; only the verbatim transcript is produced.
  - Claude — the transcript is sent to the Anthropic API (requires an API key).
  - Ollama — the transcript is processed by a local model running on your Mac.

The mode is stored per lecture. Changing the mode on a completed lecture does not automatically re-run consolidation — it only affects the next time you click Begin Processing or trigger consolidation manually.

Claude Setup

To use Claude consolidation, you need an Anthropic API key. Keys are available at console.anthropic.com.

1. Open Settings → Consolidation

Go to Harvestry → Settings and click the Consolidation tab.

2. Enter your API key

Paste your Anthropic API key into the API Key field. The key is stored securely in the macOS Keychain — it is never stored in plain text.

3. Select a model

After the API key is saved, Harvestry fetches your available models from the Anthropic API and shows them in a dropdown. Choose a model. Click the refresh button next to the picker to update the model list if your account gains access to new models.
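As a sketch of what that fetch involves: Anthropic's models endpoint returns a `data` array of objects carrying an `id` field. The model ids below are illustrative placeholders, not a live listing — only the response shape is assumed here.

```python
import json

# Sample payload in the shape of Anthropic's GET /v1/models response.
# The model ids are illustrative placeholders, not real account data.
sample = json.loads("""
{
  "data": [
    {"type": "model", "id": "claude-sonnet-example", "display_name": "Claude Sonnet"},
    {"type": "model", "id": "claude-haiku-example", "display_name": "Claude Haiku"}
  ]
}
""")

def model_ids(payload):
    """Extract the model ids an app could show in its dropdown."""
    return [m["id"] for m in payload.get("data", []) if m.get("type") == "model"]

print(model_ids(sample))
```

Clicking the refresh button simply repeats this fetch and rebuilds the dropdown from the new list.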

Claude Model Recommendations

The model list is fetched live from the Anthropic API, so it reflects whatever your account has access to. General guidance:

| Model Family | Best For | Cost |
| --- | --- | --- |
| Claude Sonnet | Recommended for most use. Excellent balance of quality and speed. Produces well-structured notes with good comprehension of technical content. | Moderate |
| Claude Haiku | Fastest and cheapest. Good for straightforward lectures with clear structure. May miss nuance in dense academic content. | Low |
| Claude Opus | Highest quality. Best for highly technical, multi-topic, or non-English content where maximum comprehension matters. | High |
⚠️ API usage is billed by Anthropic. Consolidating a long lecture can consume a significant number of tokens (input + output). A 90-minute lecture transcript is typically 8,000–15,000 input tokens. Check your Anthropic dashboard to monitor usage.
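For a back-of-envelope estimate of what one lecture costs, multiply token counts by the per-million-token rates. The rates in this sketch are placeholders (check Anthropic's current pricing page); only the 8,000–15,000 input-token range comes from the text above, and the output size is an assumption.

```python
# Rough cost estimate for consolidating one lecture.
# The per-million-token rates are PLACEHOLDERS, not Anthropic's real prices.
def estimate_cost_usd(input_tokens, output_tokens,
                      input_rate_per_m=3.00, output_rate_per_m=15.00):
    return (input_tokens / 1_000_000) * input_rate_per_m \
         + (output_tokens / 1_000_000) * output_rate_per_m

# A 90-minute lecture: ~15,000 input tokens, assume ~3,000 tokens of notes.
print(round(estimate_cost_usd(15_000, 3_000), 4))  # → 0.09
```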

Ollama Setup

Ollama runs language models entirely on your Mac. No data leaves the device.

1. Install Ollama

Download Ollama from ollama.ai and install it. Ollama runs as a background service on localhost:11434.

2. Pull a model

Open Terminal and run a command like:

ollama pull llama3

or for a larger model:

ollama pull llama3:70b

Any Ollama-compatible model works. Models with at least 7B parameters give reasonable note quality.

3. Harvestry auto-detects your models

In the Pipeline Step 3 picker, select Ollama. Harvestry queries localhost:11434/api/tags and shows all installed models in a dropdown. Click the refresh button to re-query if you pull new models after opening Harvestry.
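Under the hood, Ollama's /api/tags endpoint returns a "models" array whose entries carry a "name" field. A minimal sketch of parsing that response into dropdown entries — the model names in the sample are illustrative:

```python
import json

# Sample payload in the shape of Ollama's GET /api/tags response.
sample = json.loads("""
{
  "models": [
    {"name": "llama3:latest", "size": 4661224676},
    {"name": "llama3:70b", "size": 39969745152}
  ]
}
""")

def installed_models(payload):
    """Names an app could show in its Ollama model dropdown."""
    return [m["name"] for m in payload.get("models", [])]

print(installed_models(sample))
```

You can inspect the live response yourself with `curl http://localhost:11434/api/tags` while Ollama is running.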

ℹ️ Ollama endpoint. If you run Ollama on a non-default port or a remote host, you can change the endpoint URL in Settings → Consolidation → Ollama Endpoint. Default: http://localhost:11434.

The Consolidation Prompt

The system prompt sent to the LLM instructs it how to format the output. The default prompt asks the model to produce structured Markdown with:

  - Headings organized by topic
  - Bullet-point summaries of each section
  - Key takeaways

You can customize the prompt in Settings → Consolidation → System Prompt. The text editor in Settings accepts any Markdown-aware instructions. A Reset to Default button restores the original prompt.
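The pieces above (model, system prompt, transcript) come together in a single chat-style request. A minimal sketch of how such a request body might be assembled; the prompt text and model id here are illustrative stand-ins, not Harvestry's actual defaults:

```python
# Illustrative stand-in for the real default system prompt.
DEFAULT_PROMPT = (
    "You are given a lecture transcript. Produce structured Markdown "
    "study notes: headings by topic, bullet-point summaries, key takeaways."
)

def build_request(transcript, model, system_prompt=DEFAULT_PROMPT):
    """Assemble a chat-style request body: system prompt plus transcript."""
    return {
        "model": model,
        "system": system_prompt,
        "messages": [{"role": "user", "content": transcript}],
        "max_tokens": 4096,
    }

req = build_request("Today we cover sorting algorithms...", "claude-sonnet-example")
print(req["model"])
```

Editing the System Prompt field in Settings changes only the `system` text; the transcript is always sent verbatim as the user message.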

Per-Lecture Control

The consolidation mode (Off / Claude / Ollama) and model selection are stored per lecture. You can have different lectures using different modes: for example, Claude for a dense technical lecture, Ollama for a lecture with sensitive content, and Off for one where the verbatim transcript is enough.

Changing the mode on a pending lecture takes effect when you next click Begin Processing. Changing the mode on a completed lecture triggers an alert asking if you want to generate or regenerate consolidated notes now.

Disabling After the Fact

To remove consolidated notes from a completed lecture:

  1. On the Step 3 row, switch the mode picker from Claude or Ollama to Off.
  2. A confirmation alert appears: "Remove consolidated notes from this lecture? The HTML export will be regenerated without the notes tab."
  3. Confirm. The consolidated notes are deleted and the HTML is re-exported.

Re-running Consolidation

If a lecture is already complete and you switch its consolidation mode to Claude or Ollama, Harvestry shows an alert: "Generate consolidated notes now?" Confirming runs only Step 3 and then re-exports the HTML. The transcript and screenshots are unchanged.

Privacy

The privacy implications differ significantly between modes:

  - Claude — the full transcript is sent over the network to Anthropic's servers and is subject to Anthropic's data-handling policies.
  - Ollama — everything is processed locally; no transcript data leaves your Mac.
  - Off — no consolidation occurs and nothing is sent anywhere.

Consider using Ollama mode for lectures containing sensitive, confidential, or personally identifiable content.