How to Summarize YouTube Videos with Roo Code

Technical work involves a lot of video content: library walkthroughs, conference talks, recorded architecture discussions, product demos. Watching a 40-minute video to extract one relevant section is a poor use of time. What you want is the transcript, summarized and searchable, available right where you’re working.

Roo Code is an open-source AI coding agent that runs as a VS Code extension. It’s good at synthesizing information: give it a block of text and it can extract key points, find specific sections, or rewrite content in whatever format you need. The gap has been getting the source material in the first place.

That’s where a YouTube transcript tool comes in. Fetch the transcript of any public video, hand it to the agent, and let it do what it does well: summarize, extract, and organize.

Setup

The AgentPatch CLI is designed for AI agents to use via shell access. Install it, and your agent can discover and invoke any tool on the marketplace.

Install (zero dependencies, Python 3.10+):

pip install agentpatch

Set your API key:

export AGENTPATCH_API_KEY=your_api_key

Example commands your agent will use:

ap search "web search"
ap run google-search --input '{"query": "test"}'

Get your API key from the AgentPatch dashboard.
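If you script these calls yourself rather than letting the agent type them, the fiddly part is quoting the JSON passed to --input. The sketch below only builds the command string, following the `ap run <tool> --input '<json>'` shape of the example above. The helper name build_ap_run is made up for illustration, and nothing here is part of the AgentPatch CLI itself.

```python
import json
import shlex


def build_ap_run(tool: str, payload: dict) -> str:
    """Return a shell-safe `ap run` command line (hypothetical helper).

    json.dumps takes care of quoting inside the payload, and shlex.join
    takes care of quoting each argument for the shell.
    """
    return shlex.join(["ap", "run", tool, "--input", json.dumps(payload)])


if __name__ == "__main__":
    # Reproduces the example command from the setup section.
    print(build_ap_run("google-search", {"query": "test"}))
```

Building the argument list first and quoting it in one place keeps queries that contain quotes or spaces from breaking the command.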

Example: Summarizing a Technical Talk

You come across a 25-minute conference talk about a library you’re evaluating. Instead of watching the whole thing, you ask Roo Code:

“Summarize this video for me: https://www.youtube.com/watch?v=EKBIni-eB1A”

Roo Code fetches the transcript through AgentPatch and returns a structured summary:

“Lazy Potatoes”: Growing Potatoes With No Hilling or Top Dressing

The grower planted potatoes in a 10ft x 4ft raised bed using a “set it and forget it” approach. No hilling, no mid-season fertilizer. Plants grew aggressively but flopped over without hilling, and leaves turned pale from lack of nutrients during tuber development.

Final harvest: ~60 lbs from 40 sq ft (1.5 lbs per square foot). Estimates suggest hilling and top dressing could push yield to ~80 lbs, a 33% increase.

You can follow up with specific questions:

“Was there anything about soil preparation before planting?”

Roo Code pulls the relevant passage from the transcript, with timestamps so you can jump to that section in the video if needed.
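To make the timestamp idea concrete, here is a minimal sketch of how a keyword lookup over a timestamped transcript could produce jump links. It assumes each transcript segment is a dict with "start" (seconds) and "text" keys. That shape is an assumption for illustration, not the documented output of the transcript tool, and the sample segments are placeholders rather than real transcript text. The t= URL parameter is YouTube's standard way to start playback at an offset.

```python
def format_timestamp(seconds: float) -> str:
    """Render a second offset as M:SS."""
    minutes, secs = divmod(int(seconds), 60)
    return f"{minutes}:{secs:02d}"


def timestamp_link(video_id: str, seconds: float) -> str:
    """Build a YouTube URL that starts playback at the given offset."""
    return f"https://www.youtube.com/watch?v={video_id}&t={int(seconds)}s"


def find_mentions(segments, keyword, video_id):
    """Return (timestamp, link, text) for each segment containing keyword."""
    needle = keyword.lower()
    return [
        (
            format_timestamp(seg["start"]),
            timestamp_link(video_id, seg["start"]),
            seg["text"],
        )
        for seg in segments
        if needle in seg["text"].lower()
    ]


if __name__ == "__main__":
    # Placeholder segments in the assumed shape; not real transcript text.
    sample = [
        {"start": 12.0, "text": "placeholder intro line"},
        {"start": 95.4, "text": "placeholder line about soil"},
    ]
    for stamp, link, text in find_mentions(sample, "soil", "EKBIni-eB1A"):
        print(stamp, link, text)
```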

What Roo Code Does Step by Step

When you ask for a video summary, Roo Code:

  1. Extracts the video ID from the URL you provided.
  2. Calls the YouTube Transcript tool through AgentPatch to fetch the full transcript with timestamps.
  3. Reads through the transcript and produces a summary at whatever level of detail you asked for.
  4. Keeps the full transcript in context, so you can ask follow-up questions about specific sections.
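Step 1 above is the easiest part to illustrate. The sketch below shows one way to pull the video ID out of common YouTube URL forms using only the Python standard library. It is an illustration of the idea, not Roo Code's or AgentPatch's actual implementation, and the function name extract_video_id is made up for this example.

```python
from typing import Optional
from urllib.parse import parse_qs, urlparse


def extract_video_id(url: str) -> Optional[str]:
    """Return the video ID from a YouTube URL, or None if none is found."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()

    # Short links look like https://youtu.be/<id>
    if host == "youtu.be":
        return parsed.path.lstrip("/").split("/")[0] or None

    # Standard links look like https://www.youtube.com/watch?v=<id>
    if host in {"youtube.com", "www.youtube.com", "m.youtube.com"}:
        if parsed.path == "/watch":
            return parse_qs(parsed.query).get("v", [None])[0]

    return None


if __name__ == "__main__":
    print(extract_video_id("https://www.youtube.com/watch?v=EKBIni-eB1A"))
```

Parsing the URL rather than matching it with a regular expression keeps extra query parameters, such as a timestamp, from leaking into the ID.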

The entire interaction stays inside VS Code. No browser tabs, no third-party transcript sites, no copy-pasting.

Wrapping Up

The same setup also works for raw transcript extraction. If you need the full text rather than a summary, just ask for it. Roo Code handles both.

YouTube summarization is one capability. Adding AgentPatch to Roo Code gives you access to the full tool marketplace: web search, email, image generation, maps, and more. Visit agentpatch.ai to get started.