OpenAI Citations for File Retrieval in n8n
Ever had an AI confidently say something like, “According to the document…” and then absolutely refuse to tell you which document it meant? That is what this workflow template fixes.
With this n8n workflow, you can take the raw, slightly chaotic output from an OpenAI assistant that uses file retrieval, and turn it into clean, human-friendly citations. No more mystery file IDs, no more guessing which PDF your assistant was “definitely sure” about. Just clear filenames, optional links, and nicely formatted content your users can trust.
What this n8n workflow actually does
This template gives you a structured, automated way to:
- Collect the full conversation thread from the OpenAI Threads/Messages API
- Extract file citations and annotations from assistant responses
- Map ugly file_id values to nice, readable filenames
- Swap raw citation text for friendly labels or links
- Optionally convert Markdown output to HTML for your UI
In other words, it turns “assistant output with weird tokens and half-baked citations” into “polished, source-aware responses” without you manually clicking through logs like it is 2004.
Why bother with explicit citations in RAG workflows?
When you build Retrieval-Augmented Generation (RAG) systems with OpenAI assistants and vector stores, the assistant can pull in content from your files and attach internal citations. That is great in theory, but in practice you might see:
- Raw citation tokens that look nothing like a useful reference
- Strange characters or incomplete metadata
- Inconsistent formatting across different messages in a thread
Adding a post-processing step in n8n fixes that. With this workflow you can:
- Replace cryptic tokens with clear filenames and optional links
- Aggregate citations across the entire conversation, not just a single reply
- Render output as Markdown or HTML in a consistent way
- Give end users transparent, trustworthy source references
Users get to see where information came from, and you get fewer “but which file did it use?” support messages. Everyone wins.
What you need before you start
Before you spin this up in n8n, make sure you have:
- An n8n instance (cloud or self-hosted)
- An OpenAI API key with access to assistants and files
- An OpenAI assistant already set up with a vector store, with files uploaded and indexed
- Basic familiarity with n8n nodes, especially the HTTP Request node
Once that is in place, the rest is mostly wiring things together and letting automation do the repetitive work for you.
High-level workflow overview
Here is the overall journey your data takes inside n8n:
- User sends a message in the n8n chat UI
- The OpenAI assistant responds, using your vector store for file retrieval
- You fetch the full thread from the OpenAI Threads/Messages API for complete annotations
- You split the response into messages, content blocks, and annotations
- You resolve each citation’s file_id to a human-readable filename
- You aggregate all citations, then run a final formatting pass
- Optionally, you convert Markdown to HTML before sending it to your frontend
Main n8n nodes involved
The template uses a handful of core nodes to make this magic happen:
- Chat Trigger (n8n chat trigger) – your chat UI entry point.
- OpenAI Assistant (assistant resource) – runs your assistant configured with vector store retrieval.
- HTTP Request (Get ALL Thread Content) – calls the OpenAI Threads/Messages API to fetch the full conversation with annotations.
- SplitOut nodes – iterate over messages, content blocks, and annotations or citations.
- HTTP Request (Retrieve file name from file ID) – calls the OpenAI Files API to turn file_id into a filename.
- Set node (Regularize output) – normalizes each citation into a consistent object with id, filename, and text.
- Aggregate node – combines all citations into a single list for easier processing.
- Code node (Finally format the output) – replaces raw citation text in the assistant reply with formatted citations.
- Optional Markdown node – converts Markdown output to HTML, if your frontend prefers HTML.
Step-by-step: how the template workflow runs
1. User sends a message and the assistant replies
The journey starts with the Chat Trigger node. A user types a message in your n8n chat UI, and that input is forwarded to the OpenAI Assistant node.
Your assistant is configured to use a vector store, so it can fetch relevant file snippets and attach citation annotations. The initial response might include short excerpts plus internal references that point back to your files.
2. Fetch the full thread content from OpenAI
The assistant’s immediate response is not always the full story. Some citation details live in the full thread history instead of the single message you just got.
To get everything, you use an HTTP Request node to call:
GET /v1/threads/{threadId}/messages
and you include this special header:
OpenAI-Beta: assistants=v2
This returns all message iterations and their annotations, so you can reliably extract the metadata you need for each citation.
3. Split messages, content blocks, and annotations
The Threads/Messages API response is nested. To avoid scrolling through JSON for the rest of your life, the workflow uses a series of SplitOut nodes to break it into manageable pieces:
- Split the thread into individual messages
- Split each message into its content blocks
- Split each content block into annotations, typically found under content.text.annotations
By the end of this step, you have one item per annotation or citation, ready to be resolved into something readable.
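If you prefer to see the flattening in one place, the three SplitOut steps can be sketched as plain JavaScript you could run in a single Code node instead. The sample response below is a minimal assumption-based mock of the Threads/Messages API shape, trimmed to the fields this workflow actually reads:

```javascript
// Sketch: the same flattening the SplitOut nodes perform.
// Walks messages -> content blocks -> text.annotations and
// collects one item per annotation.
function extractAnnotations(threadResponse) {
  const annotations = [];
  for (const message of threadResponse.data) {
    for (const block of message.content) {
      // File citations live under content.text.annotations on text blocks
      if (block.type === "text" && Array.isArray(block.text.annotations)) {
        annotations.push(...block.text.annotations);
      }
    }
  }
  return annotations;
}

// Minimal mock response with a single file citation
const sample = {
  data: [
    {
      content: [
        {
          type: "text",
          text: {
            value: "See the plan【4:0†source】.",
            annotations: [
              {
                type: "file_citation",
                text: "【4:0†source】",
                file_citation: { file_id: "file-abc123" },
              },
            ],
          },
        },
      ],
    },
  ],
};

console.log(extractAnnotations(sample).length); // prints 1
```

Everything outside the documented messages/content/annotations path here is illustrative, so adjust the field access if your API response differs.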
4. Turn file IDs into filenames
Each citation usually includes a file_id. That is great for APIs, not so great for humans. To translate, the workflow uses another HTTP Request node to call the Files API:
GET /v1/files/{file_id}
This returns the file metadata, including the filename. With that in hand, you can show something like project-plan.pdf instead of file-abc123xyz. You can also use this metadata to construct links to your file hosting layer if needed.
5. Regularize and aggregate all citations
Once the file metadata is retrieved, a Set node cleans up each citation into a simple, consistent object with fields like:
- id
- filename
- text (the snippet in the assistant output that was annotated)
Then an Aggregate node merges all those citation objects into a single array. That way, the final formatting step can process every citation in one pass instead of juggling them individually.
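The Set and Aggregate steps together amount to a simple mapping, sketched below as one function. The input shape (an annotation paired with its resolved file metadata) is an assumption based on the API fields described above:

```javascript
// Sketch of the Set + Aggregate steps: each raw annotation plus its
// resolved file metadata becomes a flat { id, filename, text } object,
// collected into a single array for the final formatting pass.
function regularizeCitations(items) {
  return items.map((item) => ({
    id: item.annotation.file_citation.file_id,
    filename: item.fileMetadata.filename,
    text: item.annotation.text, // the annotated snippet in the reply
  }));
}

const citations = regularizeCitations([
  {
    annotation: {
      text: "【4:0†source】",
      file_citation: { file_id: "file-abc123" },
    },
    fileMetadata: { filename: "project-plan.pdf" },
  },
]);

console.log(citations[0].filename); // prints project-plan.pdf
```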
6. Replace raw text with formatted citations
Now for the satisfying part. A Code node loops through all citations and replaces the raw annotated text in the assistant’s output with your preferred citation style, such as _(filename)_ or a Markdown link.
Here is the example JavaScript used in the Code node:
// Example Code node JavaScript (n8n)
// Grab the assistant's reply, then swap each annotated snippet
// for an italicized filename.
let saida = $('OpenAI Assistant with Vector Store').item.json.output;

for (const citation of $input.item.json.data) {
  saida = saida.replaceAll(citation.text, " _(" + citation.filename + ")_ ");
}

$input.item.json.output = saida;
return $input.item;
You can customize that replacement string. For instance, if you host files externally, you might generate Markdown links such as:
[filename](https://your-file-hosting.com/files/{file_id})
Adjust the formatting to match your UI design and how prominently you want to display sources.
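As a concrete variant, here is a small formatter that builds such a Markdown link from one of the regularized citation objects. The base URL is a placeholder assumption, so point it at wherever you actually serve files:

```javascript
// Hypothetical link formatter: builds a Markdown link from a citation
// object shaped like { id, filename }. The baseUrl default is a
// placeholder, not a real endpoint.
function formatCitation(citation, baseUrl = "https://your-file-hosting.com/files") {
  return `[${citation.filename}](${baseUrl}/${citation.id})`;
}

console.log(formatCitation({ id: "file-abc123", filename: "project-plan.pdf" }));
// prints [project-plan.pdf](https://your-file-hosting.com/files/file-abc123)
```

In the Code node above, you would swap the `" _(" + i.filename + ")_ "` string for a call like this.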
7. Optional: convert Markdown to HTML
If your chat frontend expects HTML instead of raw Markdown, you can finish with a Markdown node. It takes the Markdown-rich assistant output and converts it into HTML, ready to render in your UI.
If your frontend already handles Markdown, or you prefer to keep responses as Markdown, you can simply deactivate this node.
Tips, best practices, and common “why is this doing that” moments
Rate limits and batching
If you are resolving a lot of file_id values one by one, you may run into OpenAI rate limits. To keep things smooth:
- Batch file metadata requests where possible
- Cache filename lookups in n8n (for example, with a database or in-memory cache)
- Reuse cached metadata for frequently accessed files
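A minimal caching sketch, with the HTTP call injected as a function so the logic is visible on its own. Here `fetchFilename` stands in for whatever actually calls GET /v1/files/{file_id}; the fake fetcher below exists only to show that repeat lookups skip the API:

```javascript
// In-memory cache for filename lookups: each file_id is fetched
// at most once, then served from the Map on repeat calls.
function makeCachedLookup(fetchFilename) {
  const cache = new Map();
  return async function lookup(fileId) {
    if (!cache.has(fileId)) {
      cache.set(fileId, await fetchFilename(fileId));
    }
    return cache.get(fileId);
  };
}

// Usage with a fake fetcher that counts "API" calls
let apiCalls = 0;
const lookup = makeCachedLookup(async (id) => {
  apiCalls++;
  return `${id}.pdf`;
});
```

For lookups that should survive across workflow runs, back the Map with a database or n8n static data instead.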
Security and access control
Some quick security reminders:
- Store your OpenAI API key inside n8n credentials, not directly in nodes
- When exposing filenames or links, make sure your links respect your access controls
- Avoid leaking private file URLs to users who should not see them
Dealing with ambiguous or overlapping text matches
Simple string replacement is convenient, but it can be a bit literal. If two citations share overlapping text, you might get unexpected substitutions.
To reduce this risk:
- Prefer replacing the exact annotated substring from the citation object
- Consider using unique citation tokens in the assistant output that you later map to friendly labels
- Normalize whitespace or punctuation before replacement if your data is slightly inconsistent
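One cheap defense against overlapping snippets is to replace the longest annotated text first, so a citation whose text is a substring of another cannot clobber it. A sketch of that ordering, using the same replaceAll approach as the Code node above:

```javascript
// Overlap-safer replacement: sort citations by snippet length,
// longest first, before substituting filenames into the output.
function applyCitations(output, citations) {
  const sorted = [...citations].sort((a, b) => b.text.length - a.text.length);
  for (const c of sorted) {
    output = output.replaceAll(c.text, ` _(${c.filename})_ `);
  }
  return output;
}

// "plan A" is handled before the shorter, overlapping "plan"
console.log(
  applyCitations("plan A and plan", [
    { text: "plan", filename: "b.pdf" },
    { text: "plan A", filename: "a.pdf" },
  ])
);
```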
Formatting styles that work well in UIs
Depending on your frontend, you can experiment with different citation formats, for example:
- Inline citations like _(filename)_
- A numbered “Sources” list at the end of the message with links
- Hover tooltips that show extra metadata such as page numbers or section IDs
The workflow gives you the raw ingredients. How you present them is completely up to your UX preferences.
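If you want the numbered "Sources" style, the Code node logic changes only slightly: assign each distinct filename a number, substitute [n] markers inline, and append the list. A sketch, assuming the same { text, filename } citation objects as above:

```javascript
// Numbered-sources style: inline [n] markers plus a deduplicated
// "Sources" list appended to the message.
function numberedCitations(output, citations) {
  const files = [...new Set(citations.map((c) => c.filename))];
  for (const c of citations) {
    const n = files.indexOf(c.filename) + 1;
    output = output.replaceAll(c.text, ` [${n}]`);
  }
  const sources = files.map((f, i) => `${i + 1}. ${f}`).join("\n");
  return `${output}\n\nSources:\n${sources}`;
}

console.log(
  numberedCitations("See plan【1】 and spec【2】", [
    { text: "【1】", filename: "plan.pdf" },
    { text: "【2】", filename: "spec.pdf" },
  ])
);
```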
Ideas for extending this workflow
Once the basic pipeline is running, you can take it further:
- Store file metadata in a database to speed up lookups and reduce API calls
- Generate a numbered bibliography and replace inline citations with references like [1], [2], etc.
- Include richer provenance data such as page numbers or section identifiers when available
- Integrate access control logic so users only see citations for files they are allowed to access
Quick troubleshooting checklist
- No annotations from OpenAI? Check that your assistant is configured to return retrieval citations and that you fetch the full thread via the Threads API.
- File metadata calls returning 404? Verify that the
file_idis correct and that the file belongs to your OpenAI account. - Replacements not appearing consistently? Confirm that the excerpt text matches exactly. If needed, normalize whitespace or punctuation before replacement.
Wrapping up
By adding this citation processing pipeline to your n8n setup, you turn a basic RAG system into a much more transparent and reliable experience. The workflow retrieves full thread content, extracts annotations, resolves file IDs to filenames, and replaces raw tokens with readable citations or links.
You can drop the provided JavaScript snippet into your n8n Code node and tweak the formatting to output Markdown links or HTML. From there, it is easy to layer on caching, numbering, or more detailed provenance data as your use case evolves.
Try the template in your own n8n instance
If you are tired of hunting through JSON to figure out which file your assistant used, this workflow template is for you. Spin it up in your n8n instance, connect it to your assistant, and enjoy the relief of automated, clear citations.
If you need a customized version for your dataset, or want help adding caching and numbering, feel free to reach out for a consultation or share your requirements in the comments.
