
AI Agent to Chat with YouTube: n8n Workflow for Deep Video, Comment, and Thumbnail Analysis

Build a production-ready n8n workflow that behaves as an AI agent for YouTube. Use OpenAI, the YouTube Data API, Apify, and optional Postgres storage to turn videos, comments, and thumbnails into structured, queryable insights.

1. Solution Overview

This documentation describes a complete n8n workflow that exposes a conversational AI layer on top of YouTube data. The workflow accepts natural language chat requests, calls the YouTube Data API to retrieve channel and video information, optionally transcribes video audio, evaluates thumbnails, and aggregates comment sentiment and topics using OpenAI models. Results can be persisted in Postgres to maintain chat memory and session continuity.

The template is suitable for users who already understand n8n basics, REST APIs, and LLM-based agents, and want a repeatable pattern to query YouTube programmatically through a chat interface.

2. Architecture & Data Flow

2.1 High-Level Workflow

The workflow executes the following high-level sequence:

  1. Chat Trigger receives a user query (for example: “Analyze the latest video on channel @example”).
  2. AI Agent (OpenAI functions) interprets the request, determines which tools are needed, and emits a structured command.
  3. Switch node routes that command to the appropriate sub-branch:
    • Channel details
    • Video list or video details
    • Comment retrieval and analysis
    • Video transcription via Apify
    • Thumbnail image analysis
  4. YouTube Data API HTTP Request nodes fetch channel, video, and comment data.
  5. Apify optionally transcribes video audio to text for deeper content analysis.
  6. OpenAI processes transcripts, comments, and thumbnails to generate insights.
  7. Postgres (optional) stores chat memory and session context.
  8. Final OpenAI response node composes a structured answer that can be returned to the user.

2.2 Core Components

  • n8n – Orchestration layer that manages triggers, branching, HTTP calls, and integration with LLM tools.
  • OpenAI – Handles natural language understanding, summarization, comment and transcript analysis, and thumbnail critique via text and image-capable models.
  • Google YouTube Data API – Provides channel, video, and comment metadata. Requires a Google Cloud project and API key (plus OAuth if needed).
  • Apify (optional) – Used as a transcription or scraping backend when you need audio-to-text or when API access to certain data is constrained.
  • Postgres (optional) – Acts as a persistent memory store for multi-turn conversations and user-specific context.

3. Prerequisites & Setup

3.1 Required Accounts & APIs

  1. Google Cloud project
    • Enable the YouTube Data API v3.
    • Generate an API key for server-side requests.
    • Optionally configure OAuth credentials if your environment or quota model requires it.
  2. OpenAI account
    • Create an API key with access to the models you plan to use (chat and image analysis models).
  3. Apify account (optional)
    • Generate an API token if you intend to use Apify transcription actors or scraping utilities.
  4. Postgres database (optional)
    • Provision a Postgres instance if you want long-term chat memory.

3.2 n8n Credential Configuration

In n8n, configure the following credentials:

  • OpenAI – Use the built-in OpenAI credential type and store your API key securely.
  • YouTube Data API – Typically configured on an HTTP Request node using Query Auth, with the API key passed as a query parameter (for example key=<YOUR_API_KEY>); a verification sketch follows this list.
  • Apify – Configure an HTTP or dedicated Apify credential with your API token.
  • Postgres – Use the Postgres credential type and set host, port, database, user, and password for the memory store.
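As a quick sanity check outside n8n, the same Query Auth pattern can be reproduced with a plain HTTP call. This is a minimal sketch, assuming Node 18+ (global fetch, ESM top-level await) and a YT_API_KEY environment variable; the channel handle is only an example:

```typescript
// Minimal sketch: verify a YouTube Data API key using query-parameter auth.
// Assumes Node 18+ ESM; YT_API_KEY is an assumed environment variable.
const url = new URL("https://www.googleapis.com/youtube/v3/channels");
url.searchParams.set("part", "snippet,statistics");
url.searchParams.set("forHandle", "@example"); // example handle
url.searchParams.set("key", process.env.YT_API_KEY!);

const res = await fetch(url);
if (!res.ok) throw new Error(`YouTube API error: ${res.status}`);
console.log(await res.json()); // channel title, description, statistics
```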

3.3 Workflow Import

Either import the provided workflow template into n8n or recreate the logic with the following node types:

  • Chat trigger node and OpenAI agent node (functions-based).
  • Switch node for routing tool commands.
  • HTTP Request nodes for YouTube Data API and Apify.
  • OpenAI nodes for text and image analysis.
  • Postgres nodes for memory read/write (if enabled).

4. Node-by-Node Breakdown

4.1 Chat Trigger & AI Agent

Purpose: Accept user input and convert it into structured tool calls.

  • Chat Trigger Node
    • Acts as the entry point for the workflow.
    • Receives messages such as:
      • “Analyze the latest video on channel @example.”
      • “Summarize the main complaints in comments on this video URL.”
  • AI Agent Node (OpenAI functions agent)
    • Configured to use OpenAI with function calling / tools.
    • Parses the user request and maps it to one or more tool commands such as:
      • get_channel_details
      • videos
      • video_details
      • comments
      • video_transcription
      • analyze_thumbnail
    • Can request clarification from the user if critical parameters are missing, for example:
      • Missing channel handle or channel ID.
      • Missing video URL or video ID.
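The exact fields the agent emits depend on how the agent node is configured. As an illustration only (this shape is an assumption, not the template's literal schema), the structured command could look like:

```typescript
// Illustrative shape of a structured tool command emitted by the agent.
type CommandName =
  | "get_channel_details"
  | "videos"
  | "video_details"
  | "comments"
  | "video_transcription"
  | "analyze_thumbnail";

interface ToolCommand {
  command: CommandName;   // the field the Switch node routes on
  channelHandle?: string; // e.g. "@example"
  channelId?: string;
  videoId?: string;
  maxResults?: number;    // caps video/comment listings
}

const example: ToolCommand = {
  command: "comments",
  videoId: "dQw4w9WgXcQ", // example video ID
  maxResults: 100,
};
console.log(example);
```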

4.2 Switch Node (Command Router)

Purpose: Route the agent’s selected tool command to the correct processing branch.

  • The Switch node examines a field in the agent output (for example command or toolName).
  • Each case corresponds to a sub-workflow:
    • get_channel_details – Channel metadata branch.
    • videos – Channel video listing branch.
    • video_details – Single video metadata and statistics.
    • comments – Comment thread retrieval and later analysis.
    • video_transcription – Transcription via Apify.
    • analyze_thumbnail – Thumbnail image analysis via OpenAI.
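Expressed as plain code, the router is equivalent to a switch over that field; a sketch using the assumed ToolCommand shape from Section 4.1 (branch names are illustrative):

```typescript
// Sketch of the branching the Switch node performs.
function route(command: string): string {
  switch (command) {
    case "get_channel_details": return "channel-metadata-branch";
    case "videos":              return "video-listing-branch";
    case "video_details":       return "video-details-branch";
    case "comments":            return "comment-analysis-branch";
    case "video_transcription": return "apify-transcription-branch";
    case "analyze_thumbnail":   return "thumbnail-analysis-branch";
    default: throw new Error(`Unknown command: ${command}`);
  }
}
```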

4.3 YouTube Data API HTTP Nodes

Purpose: Fetch raw data from YouTube that the agent will later interpret.

Typical HTTP Request configurations include:

  • Channel details
    • Endpoint: https://www.googleapis.com/youtube/v3/channels
    • Parameters:
      • part=snippet,contentDetails,statistics (adjust as needed)
      • forHandle=<channel_handle> or id=<channel_id>
      • key=<YOUR_API_KEY>
    • Used to retrieve channel title, description, thumbnails, and basic statistics.
  • List channel videos
    • Endpoint: https://www.googleapis.com/youtube/v3/search or https://www.googleapis.com/youtube/v3/playlistItems, depending on design. Note that playlistItems reads the channel's uploads playlist, whose ID comes from contentDetails.relatedPlaylists.uploads in the channel response.
    • Common parameters:
      • channelId=<channel_id>
      • order=date or order=viewCount
      • maxResults to limit the number of videos.
    • Returns a list of video IDs and metadata for further processing.
  • Video details and statistics
    • Endpoint: https://www.googleapis.com/youtube/v3/videos
    • Parameters:
      • part=snippet,contentDetails,statistics
      • id=<video_id_list>
    • Provides titles, descriptions, durations, view counts, likes, and other metrics.
  • Comment threads
    • Endpoint: https://www.googleapis.com/youtube/v3/commentThreads
    • Parameters:
      • part=snippet,replies (if replies are needed)
      • videoId=<video_id>
      • maxResults and pageToken for pagination.
    • Supports pagination through nextPageToken when retrieving large comment sets (see the pagination sketch after this list).
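Pagination is the main implementation detail here. A minimal sketch of paging through all top-level comments, again assuming Node 18+ and a YT_API_KEY environment variable:

```typescript
// Sketch: page through all top-level comments on a video via commentThreads.
async function fetchAllComments(videoId: string): Promise<string[]> {
  const comments: string[] = [];
  let pageToken: string | undefined;
  do {
    const url = new URL("https://www.googleapis.com/youtube/v3/commentThreads");
    url.searchParams.set("part", "snippet");
    url.searchParams.set("videoId", videoId);
    url.searchParams.set("maxResults", "100");
    url.searchParams.set("key", process.env.YT_API_KEY!);
    if (pageToken) url.searchParams.set("pageToken", pageToken);

    const res = await fetch(url);
    if (!res.ok) throw new Error(`commentThreads error: ${res.status}`);
    const data = await res.json();
    for (const item of data.items ?? []) {
      comments.push(item.snippet.topLevelComment.snippet.textDisplay);
    }
    pageToken = data.nextPageToken; // undefined on the last page
  } while (pageToken);
  return comments;
}
```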

4.4 Transcription via Apify

Purpose: Convert video audio into text for content analysis.

  • Input to Apify:
    • Video URL, typically constructed from the video ID (for example https://www.youtube.com/watch?v=<video_id>).
  • Apify actor:
    • Use a transcription actor or a custom actor that calls a speech-to-text (STT) service.
    • Returns structured text that represents the spoken content.
  • n8n HTTP Request node:
    • Calls the Apify actor endpoint with your API token.
    • Waits for transcription results or polls until the run is complete, depending on your actor configuration.

The resulting transcript is then passed to an OpenAI node for summarization, topic extraction, or generation of timestamps and caption suggestions.
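Outside n8n, the synchronous variant of this call can be expressed directly against Apify's run-sync API. In the sketch below the actor ID and input fields are placeholders for whichever transcription actor you choose; APIFY_TOKEN is an assumed environment variable:

```typescript
// Sketch: run an Apify actor synchronously and read its dataset items.
// ACTOR_ID and the input shape are hypothetical; adapt to your actor's schema.
const ACTOR_ID = "username~youtube-transcript-actor"; // placeholder actor ID
const endpoint =
  `https://api.apify.com/v2/acts/${ACTOR_ID}/run-sync-get-dataset-items` +
  `?token=${process.env.APIFY_TOKEN}`;

const res = await fetch(endpoint, {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({
    videoUrl: "https://www.youtube.com/watch?v=dQw4w9WgXcQ", // example video
  }),
});
if (!res.ok) throw new Error(`Apify error: ${res.status}`);
const items = await res.json(); // e.g. [{ text: "full transcript ..." }]
```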

4.5 Thumbnail Analysis (OpenAI Image Model)

Purpose: Evaluate thumbnail quality and generate actionable design feedback.

  • Input data:
    • Highest-resolution thumbnail URL from the video metadata (for example the maxres thumbnail, if available).
  • OpenAI node:
    • Configured to use an image-capable model.
    • Prompt includes guidance to evaluate:
      • Overall design and composition.
      • Text legibility and font size.
      • Color contrast and visual hierarchy.
      • Facial expressions and emotional impact, if faces are present.
      • Clarity and prominence of calls to action (CTAs).
    • Output is a structured critique and improvement suggestions.
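In raw API terms, the image critique is a single chat-completion call with mixed text and image content. A sketch, where the model name and prompt wording are illustrative rather than the template's exact values:

```typescript
// Sketch: send a thumbnail URL to an image-capable OpenAI model for critique.
const res = await fetch("https://api.openai.com/v1/chat/completions", {
  method: "POST",
  headers: {
    "Content-Type": "application/json",
    Authorization: `Bearer ${process.env.OPENAI_API_KEY}`,
  },
  body: JSON.stringify({
    model: "gpt-4o", // any image-capable model
    messages: [
      {
        role: "user",
        content: [
          {
            type: "text",
            text:
              "Critique this YouTube thumbnail: composition, text legibility, " +
              "color contrast, emotional impact of any faces, and CTA clarity. " +
              "Return bullet-point improvement suggestions.",
          },
          {
            type: "image_url",
            image_url: { url: "https://i.ytimg.com/vi/<video_id>/maxresdefault.jpg" },
          },
        ],
      },
    ],
  }),
});
const data = await res.json();
console.log(data.choices[0].message.content);
```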

4.6 Comment Analysis & Synthesis

Purpose: Transform raw comment threads into insights about audience sentiment and topics.

  • Input:
    • Aggregated comments retrieved from the YouTube commentThreads endpoint.
    • Optionally flattened into a single text block or batched segments.
  • OpenAI processing:
    • Sentiment analysis across all comments (positive, negative, neutral).
    • Keyword and topic extraction to identify recurring themes.
    • Clustering of common feedback or feature requests.
    • Extraction of frequently asked questions that can drive future content.

The AI agent then returns comment-derived insights in structured form, for example bullet points of user pain points or top-requested topics.
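Because large comment sets can exceed the model's context window, it is common to batch comments before analysis. A sketch, with an arbitrary character budget per batch:

```typescript
// Sketch: split comments into batches that fit a rough character budget.
function batchComments(comments: string[], maxChars = 8000): string[] {
  const batches: string[] = [];
  let current = "";
  for (const c of comments) {
    if (current && current.length + c.length + 1 > maxChars) {
      batches.push(current);
      current = "";
    }
    current += c + "\n";
  }
  if (current) batches.push(current);
  return batches;
}

// Each batch then goes to the model with a prompt such as:
// "Classify overall sentiment, list recurring themes, and extract FAQs."
```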

4.7 Postgres Chat Memory

Purpose: Maintain context across multiple user interactions.

  • What is stored:
    • Session identifiers or user IDs.
    • Recent channels or videos analyzed.
    • Key user preferences or constraints surfaced during conversation.
  • How it is used:
    • Subsequent queries can reference previous results without repeating parameters, for example:
      • “Now analyze the comments on the last video you just summarized.”
    • The agent can read from Postgres at the start of a session and write updates at the end of each turn.
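A minimal memory store can be as simple as one table keyed by session. The sketch below (using the node-postgres pg package) shows an assumed schema and a read/write round-trip; the table and column names are illustrative, not the template's exact schema:

```typescript
// Sketch: minimal chat-memory table plus one write and one read (Node 18+ ESM).
import { Client } from "pg";

const db = new Client({ connectionString: process.env.DATABASE_URL });
await db.connect();

await db.query(`
  CREATE TABLE IF NOT EXISTS chat_memory (
    session_id TEXT NOT NULL,
    role       TEXT NOT NULL,          -- 'user' | 'assistant'
    content    TEXT NOT NULL,
    created_at TIMESTAMPTZ DEFAULT now()
  )`);

// Write the current turn.
await db.query(
  "INSERT INTO chat_memory (session_id, role, content) VALUES ($1, $2, $3)",
  ["session-123", "user", "Analyze the latest video on channel @example"],
);

// Read the most recent context at the start of the next turn.
const { rows } = await db.query(
  `SELECT role, content FROM chat_memory
   WHERE session_id = $1 ORDER BY created_at DESC LIMIT 10`,
  ["session-123"],
);
console.log(rows);
await db.end();
```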

4.8 Final Response Composition

Purpose: Combine all partial outputs into a cohesive answer for the user.

  • Inputs:
    • Channel summaries and statistics.
    • Selected video metadata and performance metrics.
    • Transcript highlights and extracted topics.
    • Comment sentiment and FAQ clusters.
    • Thumbnail critique and optimization tips.
  • OpenAI node:
    • Formats the response into:
      • Bullet-point insights.
      • A brief analytic report.
      • Content plan suggestions or clip ideas.
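One simple way to implement this step is a single templated prompt over the partial outputs; the field names below are assumptions for illustration:

```typescript
// Sketch: merge partial analysis outputs into one final composition prompt.
interface PartialResults {
  channelSummary: string;
  videoStats: string;
  transcriptHighlights: string;
  commentInsights: string;
  thumbnailCritique: string;
}

function buildFinalPrompt(r: PartialResults): string {
  return [
    "Combine the following analyses into a brief report with",
    "bullet-point insights and three content-plan suggestions.",
    `Channel: ${r.channelSummary}`,
    `Video stats: ${r.videoStats}`,
    `Transcript highlights: ${r.transcriptHighlights}`,
    `Comment insights: ${r.commentInsights}`,
    `Thumbnail critique: ${r.thumbnailCritique}`,
  ].join("\n");
}
```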

5. Configuration Notes & Edge Cases

5.1 API Quotas & Limits

  • YouTube Data API:
    • Each endpoint has a quota cost per request.
    • Batch video IDs where possible to minimize calls; the videos endpoint accepts up to 50 comma-separated IDs per request (see the sketch after this list).
  • OpenAI:
    • Token usage can grow rapidly with long transcripts and large comment sets.
    • Consider summarizing or sampling input before sending to the model.
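A batching sketch, again assuming Node 18+ and a YT_API_KEY environment variable:

```typescript
// Sketch: chunk video IDs (up to 50 per videos.list call) to reduce quota cost.
function chunk<T>(arr: T[], size = 50): T[][] {
  const out: T[][] = [];
  for (let i = 0; i < arr.length; i += size) out.push(arr.slice(i, i + size));
  return out;
}

async function fetchVideoStats(videoIds: string[]): Promise<unknown[]> {
  const results: unknown[] = [];
  for (const batch of chunk(videoIds)) {
    const url = new URL("https://www.googleapis.com/youtube/v3/videos");
    url.searchParams.set("part", "snippet,statistics");
    url.searchParams.set("id", batch.join(","));
    url.searchParams.set("key", process.env.YT_API_KEY!);
    const res = await fetch(url);
    if (!res.ok) throw new Error(`videos.list error: ${res.status}`);
    results.push(...(await res.json()).items);
  }
  return results;
}
```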

5.2 Transcription Cost & Duration

  • Long-form content significantly increases STT costs and processing time.
  • Consider:
    • Transcribing only videos under a duration threshold, or only on explicit request.
    • Caching transcripts so repeat questions about the same video reuse stored text.
    • Summarizing or chunking long transcripts before passing them to OpenAI.