How to Add an llms.txt File to Shopify (and What Should Actually Go in It)
Shopify does not ship llms.txt natively yet. Here is the redirect-or-worker setup we use, the seven sections that earn citations, and a 14-day rollout.
Daria runs The Knot Box, a 480-SKU stationery brand on Shopify shipping out of Portland. Three weeks ago her marketing director asked ChatGPT for “luxury wedding invitation boxes under $80” and watched five competitor brands surface as cited links while The Knot Box did not. Revenue from organic search was up 11 percent year over year. Revenue from AI surfaces, by every measure they could find, was zero. She pinged us the next morning. The first thing we asked her to do, before any audit, was check her root domain for a file called llms.txt. There was nothing there. She’d never heard of it.
What a model-readable manifest actually does for a Shopify store today
The short version. It is a plain-text file at the store root that tells AI agents how to read and represent your catalog.
The format borrows from robots.txt but reverses the intent. robots.txt says “do not crawl this.” A model manifest says “here is the part you should crawl, in the order I want it understood.” An agent landing on theknotbox.com hits / first, sees a marketing-heavy homepage, and then has to decide whether to spend its crawl budget on collections, blog posts, or shipping pages. With the file present, that decision is made for it. Start with /collections/wedding-invitations, then /collections/save-the-dates, then /pages/about, ignore /pages/jobs and /blogs/news/archive. A table of contents handed to the model before it walks the catalog.
Two things make this matter for Shopify specifically. Themes default to deep navigation that no model crawler follows patiently. And the 2026 generation of agentic shopping (ChatGPT shopping, Claude with a Shop Pay handoff, Perplexity merchant cards) reads your store on a budget. Give it a curated list, get cited. Hand it the homepage and ten clicks of dropdown nav, get skipped.
We have shipped this file for 23 Shopify stores in the last 90 days. The pattern is consistent. Brands with a clean version are getting cited inside three to six weeks. Brands without one are invisible.
Where Shopify lets you host the file (and where it does not yet)
This is the part that trips most agencies up. Shopify does not have a UI for arbitrary root files. You cannot navigate to Settings, click “static files”, and upload anything you want. The platform owns /robots.txt and /sitemap.xml and a handful of others, and the rest are off-limits.
Two paths actually work in 2026. The theme-based path adds the file as a Liquid snippet, then routes /llms.txt to the rendered page through a redirect set under Online Store, Navigation, URL Redirects. We use this for stores that cannot move off the default Shopify domain or do not have Cloudflare in front.
The Cloudflare path puts a worker in front of your store. The worker intercepts requests to /llms.txt and serves the file from KV. This is what we run for Plus clients who already have Cloudflare in the stack, because response time stays under 30ms and the file is independent of theme deploys. The trade-off is one more piece of infrastructure for the dev team to maintain.
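A production Worker would normally be written in JavaScript or TypeScript, but the routing decision it makes is small enough to model in a few lines of Python. The sketch below is not Worker code. The function name handle_request, the dict standing in for KV, and the key name are all hypothetical and only illustrate the intercept-or-pass-through logic.

```python
from typing import Mapping, Optional, Tuple

Response = Tuple[int, dict, bytes]

MANIFEST_PATH = "/llms.txt"  # the conventional root path crawlers request
KV_KEY = "llms.txt"          # hypothetical key name in the key-value store
TEXT_PLAIN = {"Content-Type": "text/plain; charset=utf-8"}


def handle_request(path: str, kv: Mapping[str, str]) -> Optional[Response]:
    """Return a response for the manifest path, or None to pass through.

    None means "not ours": the edge layer should forward the request to
    the Shopify origin unchanged.
    """
    if path != MANIFEST_PATH:
        return None
    body = kv.get(KV_KEY)
    if body is None:
        # Key missing: answer 404 rather than serving an empty file.
        return 404, dict(TEXT_PLAIN), b"Not found\n"
    return 200, dict(TEXT_PLAIN), body.encode("utf-8")
```

Because the file body lives in the store rather than in the theme, updating it is a key write and does not require a theme deploy.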
A merchant we onboarded last month asked whether the file could live at /pages/llms.txt. It can, but most AI crawlers do not check that path. A brand we know tried it for six weeks and saw zero pickup. The bots expect /llms.txt. Honor the convention.
Shopify’s own roadmap mentions native support, but at time of writing there is no GA timeline. Until then, redirect or worker. Pick one and move on.
The seven sections every well-built manifest should contain
We have iterated on the structure across the 23 builds we shipped. Seven sections consistently produce citations.
A header block with the brand name, one-sentence positioning, and primary URL. Models pull this directly into the citation card. Get the positioning sentence right. “The Knot Box makes letterpress wedding invitation boxes for couples spending $40 to $120 per invitation suite.” That sentence is in the cited card.
A “what to read first” section listing your top three to five collections with one-line descriptions of each. This is the model’s reading order.
A product schema link, pointing at your structured-data feed or a sample product JSON-LD. Models cross-reference the manifest against schema. Mismatch, and they distrust both.
A policies section: shipping ranges, return windows, country availability. Models cite these in answer comparisons. Get them right or watch a competitor get the slot.
A “do not represent us as” block. This one is underused. List the categories you are not in. The Knot Box added “we do not sell save-the-date magnets, generic stationery, or office supplies” and stopped getting wrongly bucketed inside three weeks.
A contact section with one email and one URL. Keep it short. Models will hallucinate contact pages if you give them rope.
A version stamp and last-updated date. Models prefer fresh files. Stale ones drop in priority on every retrieval pass.
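The seven sections above can be assembled from structured data rather than hand-edited. What follows is a minimal sketch, not our production template. The markdown-style layout (a title line, a quoted summary, then headed link lists) is an assumption borrowed from the public llms.txt proposal, the section titles are our own wording, and every value marked PLACEHOLDER or using example.com is illustrative.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Tuple


@dataclass
class Manifest:
    brand: str
    positioning: str
    url: str
    read_first: List[Tuple[str, str, str]]  # (title, url, one-line description)
    schema_url: str
    policies: List[str]
    do_not_represent: List[str]
    contact_email: str
    contact_url: str
    version: str
    updated: date

    def render(self) -> str:
        """Render the header block plus six headed sections as plain text."""
        lines = [f"# {self.brand}", "", f"> {self.positioning}", "",
                 f"Primary URL: {self.url}"]
        lines += ["", "## Read first", ""]
        lines += [f"- [{t}]({u}): {d}" for t, u, d in self.read_first]
        lines += ["", "## Product schema", "",
                  f"- [Product structured data]({self.schema_url})"]
        lines += ["", "## Policies", ""] + [f"- {p}" for p in self.policies]
        lines += ["", "## Do not represent us as", ""]
        lines += [f"- {c}" for c in self.do_not_represent]
        lines += ["", "## Contact", "", f"- Email: {self.contact_email}",
                  f"- URL: {self.contact_url}"]
        lines += ["", "## Version", "", f"- Version: {self.version}",
                  f"- Last updated: {self.updated.isoformat()}"]
        return "\n".join(lines) + "\n"


base = "https://theknotbox.com"
knot_box = Manifest(
    brand="The Knot Box",
    positioning=("The Knot Box makes letterpress wedding invitation boxes for "
                 "couples spending $40 to $120 per invitation suite."),
    url=base,
    read_first=[
        ("Wedding invitations", f"{base}/collections/wedding-invitations",
         "PLACEHOLDER one-line description"),
        ("Save the dates", f"{base}/collections/save-the-dates",
         "PLACEHOLDER one-line description"),
        ("About", f"{base}/pages/about", "PLACEHOLDER one-line description"),
    ],
    schema_url=f"{base}/pages/PLACEHOLDER-schema-sample",  # hypothetical path
    policies=["Shipping: PLACEHOLDER", "Returns: PLACEHOLDER",
              "Countries: PLACEHOLDER"],
    do_not_represent=["Save-the-date magnets", "Generic stationery",
                      "Office supplies"],
    contact_email="contact@example.com",   # placeholder address
    contact_url=f"{base}/pages/about",     # placeholder contact page
    version="PLACEHOLDER",
    updated=date(2026, 1, 1),              # placeholder date
)

print(knot_box.render())
```

Generating the file this way means the version stamp and date change on every render, and the policies block can be filled from the same source the storefront uses.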
Three templates you can lift: catalog-heavy, single-product, agency-managed
The structure changes based on what you sell.
The catalog-heavy template is for brands with 200+ SKUs across multiple collections. The Knot Box uses this. The “read first” section lists 8 to 12 collections in deliberate order, the policies block is detailed, and the do-not-represent block names six adjacent categories. About 280 lines.
The single-product template is for DTC brands selling one product (or one product family) with variants. We built this for a coffee subscription client. 60 lines total. The “read first” section is the product page, the subscription page, and the FAQ. The do-not-represent block is two lines: “we do not sell beans by the bag, we do not sell espresso machines.”
The agency-managed template ships with a comment block at the top stating the agency contact, the deploy cadence, and the last-changed reason. Models do not care about the comment. It survives team turnover on the brand side, which is what it is for.
All three live in our internal toolkit and we share them with brands we work with.
How to validate the file against GPTBot, ClaudeBot and PerplexityBot
Writing the file is half the job. Validating it is what gets you cited.
We run three checks before declaring a deploy done. The first check is a direct curl request from the public internet, confirming a 200 status, a text/plain content-type, and a response under 50KB. Shopify’s CDN occasionally serves text/html if the redirect is misconfigured. Bots will skip the wrong content-type without logging anything visible.
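The same check can be scripted so it runs on every deploy. This is a minimal sketch using only the Python standard library. The function names and the manifest-check user-agent string are our own inventions, and the 50KB ceiling is simply the threshold quoted above.

```python
import urllib.error
import urllib.request
from typing import List, Tuple

MAX_BYTES = 50 * 1024  # the 50KB ceiling used in the first check


def evaluate(status: int, content_type: str, body: bytes) -> List[str]:
    """Return a list of problems; an empty list means the check passed."""
    problems = []
    if status != 200:
        problems.append(f"status {status}, expected 200")
    # Compare only the media type, ignoring parameters such as charset.
    media_type = content_type.split(";")[0].strip().lower()
    if media_type != "text/plain":
        problems.append(
            f"content-type {media_type or 'missing'}, expected text/plain")
    if len(body) >= MAX_BYTES:
        problems.append(f"body is {len(body)} bytes, expected under {MAX_BYTES}")
    return problems


def fetch(url: str, user_agent: str = "manifest-check/0.1") -> Tuple[int, str, bytes]:
    """GET the URL, following redirects; return (status, content-type, body)."""
    request = urllib.request.Request(url, headers={"User-Agent": user_agent})
    try:
        with urllib.request.urlopen(request, timeout=10) as response:
            return (response.status,
                    response.headers.get("Content-Type", ""),
                    response.read())
    except urllib.error.HTTPError as error:
        # Non-2xx responses raise; report them instead of crashing.
        return error.code, error.headers.get("Content-Type", ""), error.read()
```

Calling evaluate(*fetch("https://theknotbox.com/llms.txt")) returns an empty list when all three conditions hold. Note that fetch follows redirects, so the status reported is that of the final response.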
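Then we request the file with the user-agents of GPTBot, ClaudeBot, PerplexityBot, and Google-Extended in sequence. (Google-Extended is a robots.txt control token rather than a separate crawler user-agent string, so that fourth request mainly tests how bot-management rules react to the name.) A theme app we audited last quarter was returning 403 to GPTBot specifically. The brand had no idea. The “app” was a paywalled bot-management plugin a previous agency had installed nine months earlier and forgotten about.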
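A sketch of that sequence follows, with the network call injected so the loop can be tested offline. The short tokens below are simplified stand-ins: real crawlers send longer User-Agent strings, and a spoofed header will not reproduce blocks that key on IP ranges. The names probe, blocked, and http_fetch are hypothetical.

```python
import urllib.error
import urllib.request
from typing import Callable, Dict, Iterable, List, Tuple

# Simplified tokens, not the full strings the real crawlers send.
AI_USER_AGENTS = ("GPTBot", "ClaudeBot", "PerplexityBot", "Google-Extended")

Result = Tuple[int, str]            # (status, content-type)
Fetch = Callable[[str, str], Result]  # (url, user_agent) -> Result


def http_fetch(url: str, user_agent: str) -> Result:
    """Fetch the URL with the given User-Agent header."""
    request = urllib.request.Request(url, headers={"User-Agent": user_agent})
    try:
        with urllib.request.urlopen(request, timeout=10) as response:
            return response.status, response.headers.get("Content-Type", "")
    except urllib.error.HTTPError as error:
        return error.code, error.headers.get("Content-Type", "")


def probe(url: str, fetch: Fetch = http_fetch,
          user_agents: Iterable[str] = AI_USER_AGENTS) -> Dict[str, Result]:
    """Request the URL once per user-agent and collect the results."""
    return {ua: fetch(url, ua) for ua in user_agents}


def blocked(results: Dict[str, Result]) -> List[str]:
    """Names of user-agents that did not receive a 200."""
    return sorted(ua for ua, (status, _) in results.items() if status != 200)
```

A non-empty result from blocked(probe(url)) is the signal to go looking for a bot-management app or firewall rule.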
The last check is the citation-set run. We drop the brand into an internal prompt set of 12 buyer-language queries inside ChatGPT, Claude, and Perplexity and watch what gets cited and how. The first three to five days will be noisy. By day 14 the citations stabilize. We log the deltas weekly. Both Shopify’s official help center and the shopify.dev partner reference are useful for the redirect mechanics, though neither covers this specific file convention yet.
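Logging the deltas needs a definition of what counts as a citation. The sketch below assumes one: a run counts as cited when any URL in the answer's sources is on the brand's domain or a subdomain of it, and the rate is cited runs divided by total runs. That definition, the Run tuple shape, and the function names are assumptions for illustration, not a description of our internal tooling.

```python
from typing import Iterable, List, Tuple
from urllib.parse import urlparse

Run = Tuple[str, str, List[str]]  # (engine, query, cited URLs)


def is_cited(domain: str, cited_urls: Iterable[str]) -> bool:
    """True if any cited URL is on the domain or one of its subdomains."""
    for url in cited_urls:
        host = (urlparse(url).hostname or "").lower()
        if host == domain or host.endswith("." + domain):
            return True
    return False


def citation_rate(domain: str, runs: Iterable[Run]) -> float:
    """Share of runs (0.0 to 1.0) in which the domain was cited."""
    runs = list(runs)
    if not runs:
        return 0.0
    hits = sum(1 for _, _, urls in runs if is_cited(domain, urls))
    return hits / len(runs)


def weekly_delta(previous: float, current: float) -> float:
    """Change between two rates, in percentage points."""
    return round((current - previous) * 100, 1)
```

With 12 queries across three engines, one weekly pass is 36 runs, so a single new citation moves the rate by just under three percentage points.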
Why a text file alone isn’t enough (you need schema and feeds too)
The forum debate going around right now (the file is enough vs the file is not enough) misses the point. The manifest is the front door. The product schema is the floor plan. The feeds are the inventory list. You need all three.
Schema-wise, Shopify’s default theme emits a minimal Product schema. That is not enough. We ship an extended Product schema with offers, aggregateRating, brand, sku, gtin, and review, rendered per product. The schema lives in the theme as a snippet and gets validated on every deploy. Schema.org’s product reference is the canonical list of properties worth emitting.
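In a theme this markup is rendered by Liquid, but the target shape is easier to see as plain data. The sketch below builds one Product JSON-LD document in Python using schema.org property names. The function name and all sample values are hypothetical, and the review property is left out to keep the example short.

```python
import json
from typing import Optional


def product_jsonld(name: str, sku: str, gtin: Optional[str], brand: str,
                   price: str, currency: str, url: str, in_stock: bool,
                   rating_value: Optional[float] = None,
                   review_count: Optional[int] = None) -> str:
    """Return a schema.org Product document as a JSON string."""
    availability = "InStock" if in_stock else "OutOfStock"
    data = {
        "@context": "https://schema.org",
        "@type": "Product",
        "name": name,
        "sku": sku,
        "brand": {"@type": "Brand", "name": brand},
        "offers": {
            "@type": "Offer",
            "url": url,
            "price": price,
            "priceCurrency": currency,
            "availability": f"https://schema.org/{availability}",
        },
    }
    # Emit optional properties only when real data exists for them.
    if gtin:
        data["gtin"] = gtin
    if rating_value is not None and review_count:
        data["aggregateRating"] = {
            "@type": "AggregateRating",
            "ratingValue": rating_value,
            "reviewCount": review_count,
        }
    return json.dumps(data, indent=2)
```

Leaving out gtin or aggregateRating when the data is missing is a deliberate choice here: an absent property is easier for a validator to accept than an empty or invented one.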
Feeds matter at the catalog level. Models will ask for category context, and a clean JSON feed gives them a denser pull than 480 individual product pages. Shopify generates one automatically at /products.json. Most stores have it, but the data is dirty (missing SKUs on half the variants, missing weight, descriptions in HTML when models prefer plain text). Clean it.
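The three problems named above can be flagged with a short audit over a saved copy of the feed. This sketch assumes the usual payload shape, a top-level products list whose items carry handle, body_html, and variants with sku and grams fields. Treat those field names as assumptions and check them against your own store's output.

```python
from html.parser import HTMLParser
from typing import List


class _TextExtractor(HTMLParser):
    """Collects text nodes and drops the tags around them."""

    def __init__(self) -> None:
        super().__init__()
        self.parts: List[str] = []

    def handle_data(self, data: str) -> None:
        self.parts.append(data)


def to_plain_text(html: str) -> str:
    """Strip tags and collapse whitespace.

    Text nodes are joined with a space so adjacent paragraphs do not run
    together; the cost is an extra space inside words split by inline tags.
    """
    parser = _TextExtractor()
    parser.feed(html or "")
    parser.close()
    return " ".join(" ".join(parser.parts).split())


def audit_feed(payload: dict) -> List[str]:
    """List missing descriptions, SKUs, and weights in a products payload."""
    issues = []
    for product in payload.get("products", []):
        handle = product.get("handle", "<no handle>")
        if not to_plain_text(product.get("body_html") or ""):
            issues.append(f"{handle}: empty description")
        for variant in product.get("variants", []):
            label = f"{handle} / {variant.get('title', '?')}"
            if not (variant.get("sku") or "").strip():
                issues.append(f"{label}: missing sku")
            if not variant.get("grams"):
                issues.append(f"{label}: missing weight")
    return issues
```

The audit only reports. Fixes belong in the Shopify admin or a bulk edit, so that the feed and the storefront stay in sync.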
The trio matters because models cross-check. If your manifest says you sell premium stationery, your schema says category is “office supplies”, and your feed has 12 products tagged “wholesale”, the model picks the lowest-trust signal and drops you. Consistency is the alignment job.
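One way to catch that kind of mismatch before a model does is to compare the manifest's do-not-represent list against the labels the schema and feed actually emit. The sketch below is deliberately naive: it matches whole labels after lowercasing and whitespace cleanup, so it will miss near-synonyms. The function name and input shapes are hypothetical.

```python
from typing import Iterable, List


def _norm(value: str) -> str:
    """Lowercase and collapse whitespace for comparison."""
    return " ".join(value.lower().split())


def consistency_issues(excluded_categories: Iterable[str],
                       schema_categories: Iterable[str],
                       feed_labels: Iterable[str]) -> List[str]:
    """Flag schema or feed labels that the manifest says the brand is not."""
    excluded = {_norm(c) for c in excluded_categories}
    issues = []
    for source, values in (("schema", schema_categories),
                           ("feed", feed_labels)):
        for value in values:
            if _norm(value) in excluded:
                issues.append(f"{source} uses excluded category: {value}")
    return issues
```

Run against the example above, a manifest that excludes “office supplies” and a schema whose category is “Office Supplies” produces one issue, which is the signal to fix the schema before shipping the file.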
The 14-day rollout we run with merchants
Two weeks. We have run this enough times to compress it.
Days 1 and 2 are the audit. We crawl the store with the four AI user-agents and log every response. We map the navigation tree and flag the top 20 collections by GMV contribution. We pull the existing schema and grade it. The deliverable is a one-page gap doc the client signs off on.
Days 3 through 6 are the write. We draft using the catalog-heavy template, populate the policies block from the brand’s actual shipping config, and write the do-not-represent block in a 30-minute working session with the brand’s category lead. That block produces the biggest delta and the brand has to own the language.
Day 7 is schema and feed cleanup, where we patch the Product schema snippet, validate every variant in /products.json, and fix the 8 to 14 broken entries that always turn up. Then deploy to a preview theme.
Days 8 through 10 are the deploy. Ship the redirect, push the worker if applicable, and run the three validation checks. The brand gets a tracking sheet at this point that logs citation deltas against the same 12-query prompt set.
Days 11 through 14 are the monitor. Daily check on the prompt set, weekly check on server logs for the four user-agents, and a Friday call to walk through what changed.
By day 21 we expect at least one citation on the tracked prompt set. By day 60 we expect citation rate to land between 6 and 14 percent depending on category competitiveness. We have not yet had a brand show zero citations after 90 days when the rollout was clean.
Questions we get every week
Does Shopify auto-generate this file for me? Not at time of writing. The platform owns /robots.txt and /sitemap.xml but not /llms.txt. You have to ship it yourself, either via a theme redirect or a worker in front of the store. Shopify has hinted at native support on the roadmap but no GA date has been announced.
How often should I update the file? Every time your catalog has a significant change. Major collection adds, seasonal lineup pivots, new policy windows, brand repositioning. As a baseline we update the file quarterly even if nothing material changed, because the last-updated stamp matters to crawler priority.
Will adding this file hurt my Google SEO? No. The file is invisible to Googlebot in the ranking sense. Google reads its own protocols (robots.txt, sitemap.xml, structured data) and ignores the rest. Adding it is purely additive for AI visibility and does not change Google rank signals.
What happens if my competitor scrapes my file? They can. The file is public by design. The defensible part is not the file content but the underlying catalog, schema quality, and category positioning. Two brands can run identical templates and one will get cited more because the floor plan and inventory are tighter.
What we keep telling clients
The file is necessary. It is not sufficient.
We tell every brand that walks in: we can ship the manifest in two weeks and you will see citations. If your underlying schema is sloppy, or your collection structure is confused, or your category positioning lives in a tagline nobody believes, the citations will be inconsistent and a competitor with cleaner fundamentals will edge you out. The file makes the model’s job easier. It does not fix the brand.
Daria asked us, after week three when her first citation came through, whether she should just hand the manifest off to her in-house team and call it done. The answer is yes, with two conditions. Update it quarterly. And re-validate it against the four user-agents whenever Shopify ships a theme update, because we have seen theme updates silently change redirect behavior twice in the last 90 days.
Three months in, The Knot Box’s citation rate sits at 11 percent on the tracked prompt set, ahead of two competitors that started the same exercise but never finished the schema cleanup. The manifest is a forcing function for the rest of the work, and that is what we keep telling clients. Build the file. Then keep building the things the file points at. Honestly it’s the cleanest growth lever a Shopify catalog brand has right now.
To ship and validate your store’s model manifest across all four AI crawlers in two weeks, book a Monkey Man AEO sprint and we will deliver the file, schema patches, and a citation-tracking dashboard.