OpenAI Citations for File Retrieval in n8n
Ever had an AI confidently say something like, “According to the document…” and then absolutely refuse to tell you which document it meant? That is what this workflow template fixes.
With this n8n workflow, you can take the raw, slightly chaotic output from an OpenAI assistant that uses file retrieval, and turn it into clean, human-friendly citations. No more mystery file IDs, no more guessing which PDF your assistant was “definitely sure” about. Just clear filenames, optional links, and nicely formatted content your users can trust.
What this n8n workflow actually does
This template gives you a structured, automated way to:
- Collect the full conversation thread from the OpenAI Threads/Messages API
- Extract file citations and annotations from assistant responses
- Map ugly file_id values to nice, readable filenames
- Swap raw citation text for friendly labels or links
- Optionally convert Markdown output to HTML for your UI
In other words, it turns “assistant output with weird tokens and half-baked citations” into “polished, source-aware responses” without you manually clicking through logs like it is 2004.
Why bother with explicit citations in RAG workflows?
When you build Retrieval-Augmented Generation (RAG) systems with OpenAI assistants and vector stores, the assistant can pull in content from your files and attach internal citations. That is great in theory, but in practice you might see:
- Raw citation tokens that look nothing like a useful reference
- Strange characters or incomplete metadata
- Inconsistent formatting across different messages in a thread
Adding a post-processing step in n8n fixes that. With this workflow you can:
- Replace cryptic tokens with clear filenames and optional links
- Aggregate citations across the entire conversation, not just a single reply
- Render output as Markdown or HTML in a consistent way
- Give end users transparent, trustworthy source references
Users get to see where information came from, and you get fewer “but which file did it use?” support messages. Everyone wins.
What you need before you start
Before you spin this up in n8n, make sure you have:
- An n8n instance (cloud or self-hosted)
- An OpenAI API key with access to assistants and files
- An OpenAI assistant already set up with a vector store, with files uploaded and indexed
- Basic familiarity with n8n nodes, especially the HTTP Request node
Once that is in place, the rest is mostly wiring things together and letting automation do the repetitive work for you.
High-level workflow overview
Here is the overall journey your data takes inside n8n:
- User sends a message in the n8n chat UI
- The OpenAI assistant responds, using your vector store for file retrieval
- You fetch the full thread from the OpenAI Threads/Messages API for complete annotations
- You split the response into messages, content blocks, and annotations
- You resolve each citation’s file_id to a human-readable filename
- You aggregate all citations, then run a final formatting pass
- Optionally, you convert Markdown to HTML before sending it to your frontend
Main n8n nodes involved
The template uses a handful of core nodes to make this magic happen:
- Chat Trigger (n8n chat trigger) – your chat UI entry point.
- OpenAI Assistant (assistant resource) – runs your assistant configured with vector store retrieval.
- HTTP Request (Get ALL Thread Content) – calls the OpenAI Threads/Messages API to fetch the full conversation with annotations.
- SplitOut nodes – iterate over messages, content blocks, and annotations or citations.
- HTTP Request (Retrieve file name from file ID) – calls the OpenAI Files API to turn file_id into a filename.
- Set node (Regularize output) – normalizes each citation into a consistent object with id, filename, and text.
- Aggregate node – combines all citations into a single list for easier processing.
- Code node (Finally format the output) – replaces raw citation text in the assistant reply with formatted citations.
- Optional Markdown node – converts Markdown output to HTML, if your frontend prefers HTML.
Step-by-step: how the template workflow runs
1. User sends a message and the assistant replies
The journey starts with the Chat Trigger node. A user types a message in your n8n chat UI, and that input is forwarded to the OpenAI Assistant node.
Your assistant is configured to use a vector store, so it can fetch relevant file snippets and attach citation annotations. The initial response might include short excerpts plus internal references that point back to your files.
2. Fetch the full thread content from OpenAI
The assistant’s immediate response is not always the full story. Some citation details live in the full thread history instead of the single message you just got.
To get everything, you use an HTTP Request node to call:
GET /v1/threads/{threadId}/messages
and you include this special header:
OpenAI-Beta: assistants=v2
This returns all message iterations and their annotations, so you can reliably extract the metadata you need for each citation.
3. Split messages, content blocks, and annotations
The Threads/Messages API response is nested. To avoid scrolling through JSON for the rest of your life, the workflow uses a series of SplitOut nodes to break it into manageable pieces:
- Split the thread into individual messages
- Split each message into its content blocks
- Split each content block into annotations, typically found under content.text.annotations
By the end of this step, you have one item per annotation or citation, ready to be resolved into something readable.
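If you prefer to see the flattening in one place, the three SplitOut steps can be sketched as plain JavaScript you could run in a single Code node instead. The sample response below is a minimal assumption-based mock of the Threads/Messages API shape, trimmed to the fields this workflow actually reads:

```javascript
// Sketch: the same flattening the SplitOut nodes perform.
// Walks messages -> content blocks -> text.annotations and
// collects one item per annotation.
function extractAnnotations(threadResponse) {
  const annotations = [];
  for (const message of threadResponse.data) {
    for (const block of message.content) {
      // File citations live under content.text.annotations on text blocks
      if (block.type === "text" && Array.isArray(block.text.annotations)) {
        annotations.push(...block.text.annotations);
      }
    }
  }
  return annotations;
}

// Minimal mock response with a single file citation
const sample = {
  data: [
    {
      content: [
        {
          type: "text",
          text: {
            value: "See the plan【4:0†source】.",
            annotations: [
              {
                type: "file_citation",
                text: "【4:0†source】",
                file_citation: { file_id: "file-abc123" },
              },
            ],
          },
        },
      ],
    },
  ],
};

console.log(extractAnnotations(sample).length); // prints 1
```

Everything outside the documented messages/content/annotations path here is illustrative, so adjust the field access if your API response differs.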
4. Turn file IDs into filenames
Each citation usually includes a file_id. That is great for APIs, not so great for humans. To translate, the workflow uses another HTTP Request node to call the Files API:
GET /v1/files/{file_id}
This returns the file metadata, including the filename. With that in hand, you can show something like project-plan.pdf instead of file-abc123xyz. You can also use this metadata to construct links to your file hosting layer if needed.
5. Regularize and aggregate all citations
Once the file metadata is retrieved, a Set node cleans up each citation into a simple, consistent object with fields like:
- id
- filename
- text (the snippet in the assistant output that was annotated)
Then an Aggregate node merges all those citation objects into a single array. That way, the final formatting step can process every citation in one pass instead of juggling them individually.
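The Set and Aggregate steps together amount to a simple mapping, sketched below as one function. The input shape (an annotation paired with its resolved file metadata) is an assumption based on the API fields described above:

```javascript
// Sketch of the Set + Aggregate steps: each raw annotation plus its
// resolved file metadata becomes a flat { id, filename, text } object,
// collected into a single array for the final formatting pass.
function regularizeCitations(items) {
  return items.map((item) => ({
    id: item.annotation.file_citation.file_id,
    filename: item.fileMetadata.filename,
    text: item.annotation.text, // the annotated snippet in the reply
  }));
}

const citations = regularizeCitations([
  {
    annotation: {
      text: "【4:0†source】",
      file_citation: { file_id: "file-abc123" },
    },
    fileMetadata: { filename: "project-plan.pdf" },
  },
]);

console.log(citations[0].filename); // prints project-plan.pdf
```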
6. Replace raw text with formatted citations
Now for the satisfying part. A Code node loops through all citations and replaces the raw annotated text in the assistant’s output with your preferred citation style, such as _(filename)_ or a Markdown link.
Here is the example JavaScript used in the Code node:
// Example Code node JavaScript (n8n)
// Grab the assistant's reply, then swap each annotated snippet
// for an italicized filename.
let saida = $('OpenAI Assistant with Vector Store').item.json.output;

for (const citation of $input.item.json.data) {
  saida = saida.replaceAll(citation.text, " _(" + citation.filename + ")_ ");
}

$input.item.json.output = saida;
return $input.item;
You can customize that replacement string. For instance, if you host files externally, you might generate Markdown links such as:
[filename](https://your-file-hosting.com/files/{file_id})
Adjust the formatting to match your UI design and how prominently you want to display sources.
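As a concrete variant, here is a small formatter that builds such a Markdown link from one of the regularized citation objects. The base URL is a placeholder assumption, so point it at wherever you actually serve files:

```javascript
// Hypothetical link formatter: builds a Markdown link from a citation
// object shaped like { id, filename }. The baseUrl default is a
// placeholder, not a real endpoint.
function formatCitation(citation, baseUrl = "https://your-file-hosting.com/files") {
  return `[${citation.filename}](${baseUrl}/${citation.id})`;
}

console.log(formatCitation({ id: "file-abc123", filename: "project-plan.pdf" }));
// prints [project-plan.pdf](https://your-file-hosting.com/files/file-abc123)
```

In the Code node above, you would swap the `" _(" + i.filename + ")_ "` string for a call like this.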
7. Optional: convert Markdown to HTML
If your chat frontend expects HTML instead of raw Markdown, you can finish with a Markdown node. It takes the Markdown-rich assistant output and converts it into HTML, ready to render in your UI.
If your frontend already handles Markdown, or you prefer to keep responses as Markdown, you can simply deactivate this node.
Tips, best practices, and common “why is this doing that” moments
Rate limits and batching
If you are resolving a lot of file_id values one by one, you may run into OpenAI rate limits. To keep things smooth:
- Batch file metadata requests where possible
- Cache filename lookups in n8n (for example, with a database or in-memory cache)
- Reuse cached metadata for frequently accessed files
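A minimal caching sketch, with the HTTP call injected as a function so the logic is visible on its own. Here `fetchFilename` stands in for whatever actually calls GET /v1/files/{file_id}; the fake fetcher below exists only to show that repeat lookups skip the API:

```javascript
// In-memory cache for filename lookups: each file_id is fetched
// at most once, then served from the Map on repeat calls.
function makeCachedLookup(fetchFilename) {
  const cache = new Map();
  return async function lookup(fileId) {
    if (!cache.has(fileId)) {
      cache.set(fileId, await fetchFilename(fileId));
    }
    return cache.get(fileId);
  };
}

// Usage with a fake fetcher that counts "API" calls
let apiCalls = 0;
const lookup = makeCachedLookup(async (id) => {
  apiCalls++;
  return `${id}.pdf`;
});
```

For lookups that should survive across workflow runs, back the Map with a database or n8n static data instead.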
Security and access control
Some quick security reminders:
- Store your OpenAI API key inside n8n credentials, not directly in nodes
- When exposing filenames or links, make sure your links respect your access controls
- Avoid leaking private file URLs to users who should not see them
Dealing with ambiguous or overlapping text matches
Simple string replacement is convenient, but it can be a bit literal. If two citations share overlapping text, you might get unexpected substitutions.
To reduce this risk:
- Prefer replacing the exact annotated substring from the citation object
- Consider using unique citation tokens in the assistant output that you later map to friendly labels
- Normalize whitespace or punctuation before replacement if your data is slightly inconsistent
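One cheap defense against overlapping snippets is to replace the longest annotated text first, so a citation whose text is a substring of another cannot clobber it. A sketch of that ordering, using the same replaceAll approach as the Code node above:

```javascript
// Overlap-safer replacement: sort citations by snippet length,
// longest first, before substituting filenames into the output.
function applyCitations(output, citations) {
  const sorted = [...citations].sort((a, b) => b.text.length - a.text.length);
  for (const c of sorted) {
    output = output.replaceAll(c.text, ` _(${c.filename})_ `);
  }
  return output;
}

// "plan A" is handled before the shorter, overlapping "plan"
console.log(
  applyCitations("plan A and plan", [
    { text: "plan", filename: "b.pdf" },
    { text: "plan A", filename: "a.pdf" },
  ])
);
```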
Formatting styles that work well in UIs
Depending on your frontend, you can experiment with different citation formats, for example:
- Inline citations like _(filename)_
- A numbered “Sources” list at the end of the message with links
- Hover tooltips that show extra metadata such as page numbers or section IDs
The workflow gives you the raw ingredients. How you present them is completely up to your UX preferences.
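If you want the numbered "Sources" style, the Code node logic changes only slightly: assign each distinct filename a number, substitute [n] markers inline, and append the list. A sketch, assuming the same { text, filename } citation objects as above:

```javascript
// Numbered-sources style: inline [n] markers plus a deduplicated
// "Sources" list appended to the message.
function numberedCitations(output, citations) {
  const files = [...new Set(citations.map((c) => c.filename))];
  for (const c of citations) {
    const n = files.indexOf(c.filename) + 1;
    output = output.replaceAll(c.text, ` [${n}]`);
  }
  const sources = files.map((f, i) => `${i + 1}. ${f}`).join("\n");
  return `${output}\n\nSources:\n${sources}`;
}

console.log(
  numberedCitations("See plan【1】 and spec【2】", [
    { text: "【1】", filename: "plan.pdf" },
    { text: "【2】", filename: "spec.pdf" },
  ])
);
```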
Ideas for extending this workflow
Once the basic pipeline is running, you can take it further:
- Store file metadata in a database to speed up lookups and reduce API calls
- Generate a numbered bibliography and replace inline citations with references like [1], [2], etc.
- Include richer provenance data such as page numbers or section identifiers when available
- Integrate access control logic so users only see citations for files they are allowed to access
Quick troubleshooting checklist
- No annotations from OpenAI? Check that your assistant is configured to return retrieval citations and that you fetch the full thread via the Threads API.
- File metadata calls returning 404? Verify that the
file_idis correct and that the file belongs to your OpenAI account. - Replacements not appearing consistently? Confirm that the excerpt text matches exactly. If needed, normalize whitespace or punctuation before replacement.
Wrapping up
By adding this citation processing pipeline to your n8n setup, you turn a basic RAG system into a much more transparent and reliable experience. The workflow retrieves full thread content, extracts annotations, resolves file IDs to filenames, and replaces raw tokens with readable citations or links.
You can drop the provided JavaScript snippet into your n8n Code node and tweak the formatting to output Markdown links or HTML. From there, it is easy to layer on caching, numbering, or more detailed provenance data as your use case evolves.
Try the template in your own n8n instance
If you are tired of hunting through JSON to figure out which file your assistant used, this workflow template is for you. Spin it up in your n8n instance, connect it to your assistant, and enjoy the relief of automated, clear citations.
If you need a customized version for your dataset, or want help adding caching and numbering, feel free to reach out for a consultation or share your requirements in the comments.
