Build an AI Newsletter Agent with n8n
This reference guide describes how to implement an AI-powered newsletter workflow in n8n using object storage, LLM nodes, and Slack integration. The goal is to automate story selection, copywriting, and newsletter assembly while preserving editorial control and consistency.
1. Workflow overview
The n8n workflow automates the full lifecycle of an AI-focused newsletter:
- Ingests raw content from markdown files and tweet exports stored in S3-compatible object storage
- Normalizes and aggregates all sources into a single structured payload
- Uses an LLM to select top stories and generate subject lines and pre-header text
- Generates Axios-style segments for each chosen story using dedicated prompts
- Builds the intro and a “Shortlist” section of additional stories
- Exports the final newsletter as markdown and sends it to Slack for review and approvals
The pipeline is designed for operators who already understand n8n concepts such as triggers, credentials, nodes, and data flows, and want a production-ready pattern for AI newsletter automation.
2. Architecture and data flow
2.1 Logical layers
The workflow is organized into clear stages:
- Input ingestion – Trigger the workflow, identify the target newsletter date, and locate relevant source files in object storage.
- Content aggregation – Normalize and merge all inputs into a unified structure for LLM processing.
- Story selection – Use an LLM node to choose a lead story and three additional stories, plus subject line and pre-header.
- Segment authoring – Iterate through selected stories and generate Axios-like segments.
- Intro and “The Shortlist” – Generate the newsletter intro and a curated list of additional notable stories.
- Approvals, export, and delivery – Assemble the final markdown, send to Slack, and optionally extract assets such as images.
2.2 High-level data flow
- The workflow starts from a form trigger that captures metadata such as the newsletter date and optionally the previous issue’s content.
- S3-compatible nodes search and download markdown and tweet objects using a date-based prefix.
- HTTP Request nodes retrieve metadata and external URLs, which are used to filter and enrich content.
- ExtractFromFile nodes convert downloaded files into raw text.
- Set and Aggregate nodes produce a consolidated object that feeds into one or more LLM nodes.
- LLM nodes perform both editorial decision-making (selection) and content generation (segments, intro, subject lines).
- SplitInBatches and SplitOut nodes handle iteration over selected stories and identifiers.
- Slack nodes push intermediate results and final outputs into a channel for human review.
- File nodes assemble and export the final markdown newsletter for distribution or further processing.
3. Node-by-node breakdown
3.1 Trigger and input configuration
Form Trigger
- Purpose: Capture runtime parameters for the newsletter run.
- Typical fields:
  - newsletter_date – Target date used to locate relevant content in the object store.
  - previous_newsletter_content (optional) – Text of the previous issue, used to avoid duplicate coverage.
- Usage: The values from this trigger are referenced downstream, especially for:
- S3 search prefixes
- De-duplication logic in LLM prompts
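For reference, downstream nodes can read these values with n8n expressions. A minimal sketch, assuming the trigger node is literally named “Form Trigger” and the fields use the names above:

```js
// In an S3 node's prefix parameter (expression mode), something like:
// {{ $('Form Trigger').first().json.newsletter_date }}

// In a Code node, the same values can be read and reshaped:
const form = $('Form Trigger').first().json;
return [{
  json: {
    prefix: `newsletters/${form.newsletter_date}/`,          // date-based S3 prefix (illustrative layout)
    previousIssue: form.previous_newsletter_content ?? '',   // optional de-dup context
  },
}];
```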
3.2 Input ingestion from object storage
S3 (search & download) nodes
- Purpose: Find and download content candidates such as markdown articles and tweet dumps.
- Typical configuration:
  - Operation: “List” or “Search” with a prefix based on newsletter_date.
  - Filter: Use key patterns or prefixes to target:
    - Markdown content files
    - Tweet exports
  - Download: For each matching object, use a download operation to fetch the file content.
- Edge cases:
- No matches for a given date prefix. In this case, consider adding a conditional path to fail early or post a Slack notification.
- Non-text files in the same prefix. These are filtered out using metadata and file type checks.
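A Code node placed after the list operation can cover both edge cases at once. This is a sketch, assuming each incoming item carries a `Key` field (as S3 list results typically do) and that the key patterns below match your bucket layout:

```js
// Keep only markdown files and tweet exports; drop everything else.
const wanted = /\.(md|markdown)$|tweets.*\.(json|md)$/i;   // illustrative key patterns

const matches = $input.all().filter(item => wanted.test(item.json.Key ?? ''));

if (matches.length === 0) {
  // No content for this date prefix: fail early so a conditional
  // error path or Slack alert can react to the empty run.
  throw new Error('No source files found for the given newsletter_date prefix');
}

return matches;
```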
HTTP Request nodes (metadata & external URLs)
- Purpose: Retrieve metadata about each object, including:
- File type
- Draft status
- External source URLs
- Usage:
- Exclude newsletter drafts and non-markdown files from downstream processing.
- Attach external URLs to each content item for later scraping or reference.
- Filtering logic:
- Skip any objects flagged as drafts.
- Skip file types that are not markdown or tweet exports.
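One way to express this filter in a Code node, assuming the metadata response exposes `draft` and `file_type` fields (both names are illustrative, not fixed by n8n):

```js
// Drop drafts and unsupported file types before extraction.
const allowedTypes = ['markdown', 'tweet_export'];   // illustrative type labels

return $input.all().filter(item => {
  const meta = item.json;
  if (meta.draft === true) return false;          // skip newsletter drafts
  return allowedTypes.includes(meta.file_type);   // keep only supported types
});
```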
3.3 Content extraction and normalization
ExtractFromFile
- Purpose: Convert downloaded objects into plain text.
- Inputs:
- Binary data from S3 download nodes.
- Output:
- Text content that can be used directly in LLM prompts or further transformed.
Set and Aggregate nodes
- Purpose: Normalize heterogeneous inputs into a consistent structure and then aggregate them.
- Typical fields per item:
  - identifier – A unique ID or filename for the story.
  - friendly_type – Human-readable type, for example “markdown article” or “tweet thread”.
  - authors – Author names when available.
  - external_source_urls – One or more URLs associated with the story.
  - body – Full text content extracted from the file.
- Aggregation:
- Combine all normalized items into a single structure that the LLM node can read in one pass.
- Preserve identifiers and links for later reference in selection and segment writing.
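A Code node can perform the normalization step ahead of the Aggregate node. The sketch below assumes the extracted text landed in a `data` field (the ExtractFromFile default) and that the metadata fields carry the names used earlier in this section:

```js
// Normalize each source into the shared story schema.
return $input.all().map(item => {
  const src = item.json;
  return {
    json: {
      identifier: src.fileName ?? src.Key ?? 'unknown',     // unique ID or filename
      friendly_type: src.file_type === 'tweet_export'
        ? 'tweet thread'
        : 'markdown article',
      authors: src.authors ?? [],
      external_source_urls: src.external_source_urls ?? [],
      body: src.data ?? '',                                 // plain text from ExtractFromFile
    },
  };
});
```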
3.4 Story selection using LLM
LLM node (selection stage)
- Purpose: Act as an editor that selects the top stories and generates subject line and pre-header text.
- Model integration:
- Can be configured through n8n’s LangChain-based LLM nodes or a direct provider integration such as Google Gemini or Anthropic Claude.
- Input:
- Aggregated content bundle (identifiers, text bodies, URLs, types, authors).
- Optionally, previous newsletter content to avoid duplicates.
- Expected behavior:
  - Select exactly four stories:
    - One lead story
    - Three additional stories
  - Return short reasons for selecting each story.
  - Include the original identifiers for each selected story.
  - Produce:
    - A subject line optimized for open rates
    - Pre-header text that complements the subject line
- Editorial constraints:
- Enforce strict de-duplication, both within the current issue and against previous issues when provided.
- Favor substantive, high-signal sources over trivial updates.
- Avoid selecting the same story multiple times under different identifiers.
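To keep the downstream SplitOut stage mechanical, it helps to pin the selection output to a strict JSON shape. One plausible schema, sketched as a JavaScript literal; every field name and identifier here is illustrative, not prescribed by n8n or any provider:

```js
// Example of the strict shape the selection prompt could demand,
// usable as a few-shot example inside the prompt itself.
const exampleSelection = {
  subject_line: 'Example: a major new reasoning model ships',
  preheader: 'Plus three more stories worth your time',
  lead_story: {
    identifier: '2024-06-01-model-release.md',
    reason: 'Highest-impact launch of the week',
  },
  stories: [
    { identifier: 'tweets-eval-thread', reason: 'Strong primary-source analysis' },
    { identifier: '2024-06-01-eu-ai-act.md', reason: 'Regulatory deadline this month' },
    { identifier: '2024-06-01-agents-paper.md', reason: 'Novel benchmark results' },
  ],
};
```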
3.5 Segment generation for selected stories
SplitInBatches & SplitOut
- Purpose: Iterate safely over the list of selected stories.
- Behavior:
- Split the LLM selection output into individual story items.
- Process each story in isolation to avoid token overflows and to maintain clear mapping between inputs and outputs.
Story enrichment and external scraping
- Purpose: For each selected story, gather all associated identifiers and source texts, and optionally fetch external content.
- Steps:
- Collect all relevant text from the normalized items that match the selected identifiers.
- When external_source_urls are present, perform HTTP requests or scraping to retrieve additional context if needed.
- Edge considerations:
- Handle failed URL fetches gracefully and fall back to the existing body text.
- Do not introduce URLs that are not present in the original metadata.
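A Code node can implement the enrichment with a graceful fallback. This sketch assumes `this.helpers.httpRequest` is available in the Code node (it is in recent n8n versions) and that items follow the normalized schema from section 3.3:

```js
// Enrich each story with its external sources; never fail the run on a
// bad URL, and never add URLs that were not in the original metadata.
const out = [];
for (const item of $input.all()) {
  const story = item.json;
  let extra = '';
  for (const url of story.external_source_urls ?? []) {
    try {
      const res = await this.helpers.httpRequest({ url, timeout: 10000 });
      const text = typeof res === 'string' ? res : JSON.stringify(res);
      extra += `\n\n---\nSource (${url}):\n${text.slice(0, 20000)}`;   // naive truncation
    } catch (err) {
      continue;   // fetch failed: keep the existing body text as the fallback
    }
  }
  out.push({ json: { ...story, body: (story.body ?? '') + extra } });
}
return out;
```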
LLM node (segment writing)
- Purpose: Generate Axios-like newsletter segments for each story.
- Output format (enforced via prompt):
- The Recap: A concise summary of the story.
- A bullet list that unpacks the key details.
- A two-sentence “Bottom line” that provides clear takeaways.
- Formatting constraints:
- Consistent bolding for headings such as “The Recap” and “Bottom line”.
- Short, scannable bullets.
- Strict rules for link usage so that only provided URLs are used.
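The prompt can hard-code this structure so every segment comes back in the same shape. A minimal sketch of such a prompt, built in a Code node as a template string; the exact wording is illustrative:

```js
// Build the segment-writing prompt for the current story
// (one story per batch via SplitInBatches).
const story = $input.first().json;

const segmentPrompt = `
Write an Axios-style newsletter segment for the story below.
Use exactly this structure:

**The Recap**: one concise paragraph summarizing the story.

- 3 to 5 short, scannable bullets unpacking the key details.

**Bottom line**: exactly two sentences with the clear takeaway.

Rules:
- Use only facts from the provided text.
- Only link to URLs listed in external_source_urls; never invent links.

Story:
${JSON.stringify(story)}
`;

return [{ json: { prompt: segmentPrompt } }];
```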
3.6 Intro and “The Shortlist” sections
LLM node (intro generation)
- Purpose: Create the newsletter opening section.
- Structure:
- Dynamic greeting tailored to the issue.
- Two short paragraphs providing context or commentary.
- The exact transition phrase “In today’s AI recap:”, included verbatim so downstream parsing stays trivial (see the sketch after this list).
- A short bullet list summarizing the main items in the issue.
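Because the transition phrase is exact, a downstream Code node can validate and split on it. A small sketch, assuming the generated intro arrives in a field named `intro` (an illustrative name):

```js
// Enforce the contract: the intro must contain the exact transition phrase.
const PHRASE = 'In today’s AI recap:';
const intro = $input.first().json.intro ?? '';

if (!intro.includes(PHRASE)) {
  throw new Error(`Intro is missing the required phrase "${PHRASE}"`);
}

// Split into the greeting/commentary and the bullet summary that follows.
const [opening, bullets] = intro.split(PHRASE);
return [{ json: { opening: opening.trim(), recap_bullets: bullets.trim() } }];
```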
LLM node (“The Shortlist”)
- Purpose: Compile a secondary list of notable AI stories that did not make the main segments.
- Input:
- Remaining content items and their associated URLs.
- URL handling policy:
- Only use verbatim URLs from the source metadata.
- Do not invent new links or domains.
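This policy can be enforced after generation as well as in the prompt. A sketch of a Code node that strips any markdown link whose URL is not present verbatim in the source metadata; the node name “Aggregate Content” and the `shortlist` field are illustrative:

```js
// Build a whitelist of verbatim URLs from the normalized source items.
const allowed = new Set(
  $('Aggregate Content').all().flatMap(i => i.json.external_source_urls ?? []),
);

// Replace non-whitelisted markdown links with their plain-text labels.
return $input.all().map(item => {
  const text = item.json.shortlist ?? '';
  const cleaned = text.replace(
    /\[([^\]]+)\]\(([^)]+)\)/g,                    // markdown [text](url) links
    (m, label, url) => (allowed.has(url) ? m : label),
  );
  return { json: { ...item.json, shortlist: cleaned } };
});
```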
3.7 Approvals, export, and delivery
Slack nodes
- Purpose:
- Send selected stories, subject line options, and reasoning to an editorial Slack channel.
- Share the final assembled newsletter for review and sign-off.
- Typical usage:
  - Post a preview that includes:
    - Chosen lead and secondary stories
    - Subject line and pre-header
    - Short rationale for each choice
  - Upload the final markdown file as an attachment or link.
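One way to build that preview text is a Code node that consumes the selection JSON; the sketch below follows the illustrative schema from section 3.4 and uses Slack mrkdwn syntax:

```js
// Format the editorial preview for Slack.
const sel = $input.first().json;   // output of the selection LLM node

const lines = [
  `*Subject:* ${sel.subject_line}`,
  `*Pre-header:* ${sel.preheader}`,
  '',
  `*Lead:* ${sel.lead_story.identifier}: ${sel.lead_story.reason}`,
  ...sel.stories.map(s => `• ${s.identifier}: ${s.reason}`),
];

return [{ json: { slack_text: lines.join('\n') } }];
```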
File export nodes
- Purpose: Assemble the final newsletter content and convert it into a file.
- Steps:
  - Use Aggregate and Set nodes to combine:
    - Intro section
    - Main story segments
    - “The Shortlist” section
  - Output the combined text as a markdown file.
  - Upload this file to Slack or store it back into S3-compatible storage.
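A Code node can also do the final assembly and emit a binary file ready for the Slack upload. This sketch assumes the three sections arrive as fields on a single aggregated item (field names are illustrative) and that `this.helpers.prepareBinaryData` is available, as it is in recent n8n versions:

```js
// Concatenate the sections and convert to a binary markdown file.
const { intro, segments, shortlist } = $input.first().json;

const newsletter = [intro, segments, '## The Shortlist', shortlist].join('\n\n');

return [{
  json: { fileName: 'newsletter.md' },
  binary: {
    data: await this.helpers.prepareBinaryData(
      Buffer.from(newsletter, 'utf8'),
      'newsletter.md',
      'text/markdown',
    ),
  },
}];
```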
Optional image extraction
- Purpose: Extract direct image URLs for use in email builders.
- Behavior:
  - Scan content for supported image formats such as .jpg, .png, and .webp.
  - Expose these URLs as a separate field or list for downstream tooling.
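A minimal extraction pass in a Code node, scanning the assembled body (an illustrative `body` field) for direct image URLs:

```js
// Collect direct image URLs (.jpg, .png, .webp) from the newsletter body.
const body = $input.first().json.body ?? '';
const imageUrls = body.match(/https?:\/\/\S+\.(?:jpe?g|png|webp)/gi) ?? [];

return [{ json: { image_urls: [...new Set(imageUrls)] } }];   // de-duplicated list
```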
4. Core n8n nodes and configuration notes
4.1 Primary nodes used
- Form Trigger – Captures the target date and optional previous newsletter content.
- S3 (search & download) – Lists and retrieves markdown and tweet objects using a date-based prefix.
- HTTP Request – Fetches object metadata and external source URLs for filtering and enrichment.
- ExtractFromFile – Converts downloaded objects into plain text for processing.
- LangChain / LLM nodes – Power selection, segment writing, intro creation, and subject-line generation.
- SplitInBatches & SplitOut – Iterate over stories and identifiers in a controlled manner.
- Aggregate & Set – Merge multiple content fragments into a single newsletter body.
- Slack – Send previews for review and upload final files to a channel.
4.2 Prompt engineering guidelines
- Selection prompt:
  - Make the instructions explicit:
    - Return exactly four stories.
    - Include identifiers for each selected item.
    - Provide reasons for inclusion and, when relevant, exclusion of others.
  - Reference previous newsletter content when available to avoid duplication.
- Story-writing prompts:
- Specify required headings and their formatting, for example bold labels such as “The Recap” and “Bottom line”.
- Define bullet style and length constraints.
- Include exact transition phrases where needed to keep downstream parsing trivial.
- Hallucination control:
- Instruct the model to use only facts that appear in the provided inputs.
- Explicitly forbid inventing links, sources, or metrics.
- Require that all URLs come from specified input fields such as external_source_urls.
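These rules can be encoded once as a reusable prompt fragment and appended to every generation prompt; the wording below is illustrative:

```js
// Grounding rules appended to the selection, segment, intro, and
// Shortlist prompts alike. Adjust wording to taste.
const groundingRules = `
Grounding rules (non-negotiable):
- Use only facts stated in the provided inputs; do not add outside knowledge.
- Never invent links, source names, or metrics.
- Every URL you output must appear verbatim in external_source_urls.
- If a detail is missing from the inputs, omit it rather than guessing.
`;

return [{ json: { grounding_rules: groundingRules } }];
```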
