How to Get YouTube Transcripts with Roo Code
Development work frequently involves non-code research: a library’s demo video, a recorded architecture review, a conference talk about an API you’re integrating. That information often exists only as video, and getting it into a usable text format usually means leaving your editor to hunt for a transcript service or browser extension.
Roo Code is an open-source AI coding agent that runs as a VS Code extension. It supports MCP tools, which means it can call external services during a conversation. With a YouTube transcript tool connected, Roo Code can fetch the full text of a public video and bring it into your session. You ask for the transcript, and Roo Code delivers it.
YouTube is full of technical content that never gets written down: API design talks, library deep dives, recorded pair-programming sessions. Being able to pull those transcripts into your editor means you can search them, ask questions about them, or incorporate the relevant parts into documentation.
Setup
The AgentPatch CLI is designed for AI agents to use via shell access. Install it, and your agent can discover and invoke any tool on the marketplace.
Install (zero dependencies, Python 3.10+):
pip install agentpatch
Get your API key from the AgentPatch dashboard, then set it:
export AGENTPATCH_API_KEY=your_api_key
Example commands your agent will use:
ap search "web search"
ap run google-search --input '{"query": "test"}'
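The `ap run` command above takes its input as a JSON string, which is easy to misquote in a shell. If you ever script the CLI yourself rather than letting the agent call it, a minimal Python sketch might look like the following. It assumes only the `ap run <tool> --input <json>` form and the `AGENTPATCH_API_KEY` variable shown in this section; the helper names `build_ap_command` and `run_tool` are hypothetical, and the format of the CLI's output is not specified here, so it is returned as raw text.

```python
import json
import os
import subprocess


def build_ap_command(tool: str, payload: dict) -> list[str]:
    """Build the argument list for `ap run <tool> --input <json>`."""
    # json.dumps serializes the payload; passing arguments as a list
    # (instead of one shell string) sidesteps shell quoting entirely.
    return ["ap", "run", tool, "--input", json.dumps(payload)]


def run_tool(tool: str, payload: dict) -> str:
    """Run a tool through the CLI and return its stdout as raw text."""
    # Fail early with a clear message if the key from the setup step is missing.
    if "AGENTPATCH_API_KEY" not in os.environ:
        raise RuntimeError("AGENTPATCH_API_KEY is not set")
    result = subprocess.run(
        build_ap_command(tool, payload),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return result.stdout
```

Calling `build_ap_command("google-search", {"query": "test"})` reproduces the example command from this section as an argument list, without any manual quoting.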
Example: Pulling Context from a Linked Video
You’re writing documentation for a project and the README links to a recorded talk that explains the design decisions. You tell Roo Code:
“Fetch the transcript for this talk and pull out anything about the architecture decisions: https://www.youtube.com/watch?v=abc123”
Roo Code calls the YouTube Transcript tool through AgentPatch, retrieves the full transcript with timestamps, and surfaces the relevant sections. You didn’t open a browser. You didn’t scrape anything. The agent handled it as part of your session.
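In practice the agent does this filtering for you, but if you save a transcript and want to search it yourself, a rough sketch could look like this. The data shape is an assumption: the tool's actual output format is not documented here, so the `Segment` type (start time in seconds plus text) and the function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    start: float  # seconds from the start of the video (assumed unit)
    text: str     # caption text for this segment


def format_timestamp(seconds: float) -> str:
    """Render seconds as M:SS, or H:MM:SS once the video passes an hour."""
    total = int(seconds)
    hours, rem = divmod(total, 3600)
    minutes, secs = divmod(rem, 60)
    if hours:
        return f"{hours}:{minutes:02d}:{secs:02d}"
    return f"{minutes}:{secs:02d}"


def find_mentions(segments: list[Segment], keywords: list[str]) -> list[str]:
    """Return timestamped lines whose text contains any keyword (case-insensitive)."""
    lowered = [k.lower() for k in keywords]
    hits = []
    for seg in segments:
        text = seg.text.lower()
        if any(k in text for k in lowered):
            hits.append(f"[{format_timestamp(seg.start)}] {seg.text}")
    return hits
```

Plain substring matching is deliberately simple; it will miss paraphrases that the agent itself would catch, so treat it as a quick first pass over a saved transcript.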
Another use case: you’re reading through a library’s repo and find a linked video tutorial.
“Get the transcript from that YouTube link in the README and summarize the setup steps.”
Roo Code fetches the transcript and gives you the summary directly in the VS Code chat panel.
What Roo Code Does Step by Step
When you ask for a transcript, Roo Code:
- Parses the YouTube URL to extract the video ID.
- Calls the transcript tool through AgentPatch.
- Receives the full transcript with timestamps.
- Presents it in the chat, or processes it according to your instructions (summarize, extract a section, write it to a file).
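The first step above, extracting the video ID, can be sketched with Python's standard library. This is an illustration, not Roo Code's or AgentPatch's actual parsing logic: the function name `extract_video_id` is hypothetical, and the sketch covers only a few common URL shapes (`watch?v=`, `youtu.be/`, `/embed/`, `/shorts/`).

```python
from typing import Optional
from urllib.parse import parse_qs, urlparse

YOUTUBE_HOSTS = {"youtube.com", "www.youtube.com", "m.youtube.com"}


def extract_video_id(url: str) -> Optional[str]:
    """Return the video ID from a common YouTube URL shape, or None."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    # Non-empty path segments, e.g. "/embed/abc123" -> ["embed", "abc123"]
    parts = [p for p in parsed.path.split("/") if p]

    # Short links carry the ID as the first path segment.
    if host == "youtu.be":
        return parts[0] if parts else None

    if host in YOUTUBE_HOSTS:
        # Standard watch pages carry the ID in the "v" query parameter.
        if parsed.path == "/watch":
            values = parse_qs(parsed.query).get("v")
            return values[0] if values else None
        # Embed and Shorts URLs carry the ID as the second path segment.
        if len(parts) == 2 and parts[0] in {"embed", "shorts"}:
            return parts[1]

    return None
```

For the URL used in the earlier example, `extract_video_id("https://www.youtube.com/watch?v=abc123")` returns `"abc123"`. Anything that is not a recognized YouTube host returns `None` rather than a guess.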
This turns converting a YouTube video to text into a single request: no hunting for third-party sites, no browser extensions, no copy-pasting between windows. The transcript arrives inside your editor, ready to use.
Wrapping Up
Once the AgentPatch CLI is installed and your API key is set, Roo Code has access to the YouTube Transcript tool and everything else on the marketplace: web search, email, image generation, maps, trends, and more. One setup, all the tools. Visit agentpatch.ai to get started.