How do you get ChatGPT or Claude to generate valid Mermaid AWS architecture diagrams?

ChatGPT, Claude, and Gemini all produce Mermaid architecture-beta source on demand — and all three hallucinate AWS icon slugs. Here is the prompting pattern that fixes it, the slugs they get wrong most often, and a side-by-side comparison of grounded vs ungrounded output.

Quick answer

Large language models reliably produce Mermaid architecture-beta source, but they invent AWS icon slugs that do not exist in the catalog — aws:lambda, aws:s3, aws:vpc are all common hallucinations. The fix is to ground the model by pasting LatixEngine's public catalog at latixengine.com/icons.json into the prompt, instructing it to use only slugs from that list, and then rendering the output through the LatixEngine editor to catch any remaining errors before publishing.

Ask ChatGPT, Claude, or Gemini to draw an AWS architecture diagram and you will get a plausible-looking block of Mermaid source back within seconds. Paste it into a renderer and roughly half the icons fail to load. The diagram type is right. The structure is right. The slug names are wrong.

This is not a quirk of one model. It happens across every frontier LLM, because the AWS Mermaid icon pack has slugs that are not always intuitive — AWS Lambda is aws:arch-aws-lambda, not aws:lambda. S3 is aws:arch-amazon-simple-storage-service, not aws:s3. When a model has to guess, it guesses what a developer would expect, which is exactly the wrong answer.

The fix is not a smarter model. The fix is grounding — putting the real slug catalog into the model's context window so it copies real names instead of inventing them. This post walks through the failure mode, the grounding pattern that works, and a side-by-side comparison of grounded versus ungrounded output for the same prompt.

Why do LLMs invent Mermaid AWS icon slugs that do not exist?

Large language models predict the next token from a distribution over plausible continuations. When you ask for an AWS service slug, the model's training data contains thousands of references to aws:lambda, aws:s3, and aws:ec2 — because that is how people informally talk about AWS in blog posts, GitHub issues, and Stack Overflow threads. The actual Mermaid slugs are far less frequent in the corpus, so the model averages toward the plausible-but-wrong form.

There are also two equally valid prefixes — arch- (architectural product icons) and res- (resource icons) — and the choice depends on what the icon is being used for. A model that has not been explicitly grounded does not know which prefix to pick, so it omits both and produces aws:lambda instead of aws:arch-aws-lambda.

This is the same class of failure as confidently inventing function names that do not exist in a library. The model is not lying. It is filling in the most probable token, and the most probable token is wrong.

What does a typical Mermaid AWS hallucination look like?

Here is ungrounded output from a single prompt, typical of what frontier models produce when asked for a Mermaid architecture-beta diagram of a VPC with an EC2 instance, a Lambda function, and an S3 bucket. The structure is fine. Every slug name is wrong.

architecture-beta
    group vpc(aws:vpc)[VPC]
        service ec2(aws:ec2)[EC2] in vpc
        service lambda(aws:lambda)[Lambda] in vpc
    service s3(aws:s3)[S3 Bucket]
    ec2:R <--> L:lambda
    lambda:R <--> L:s3

All four icons (the VPC group plus the three services) fail to render. Here is what the same diagram looks like when the model has been grounded with the LatixEngine icon catalog: same structure, real slugs, renders cleanly.

architecture-beta
    group vpc(aws:virtual-private-cloud-vpc)[VPC]
        service ec2(aws:ec2-instance-contents)[EC2] in vpc
        service lambda(aws:arch-aws-lambda)[Lambda] in vpc
    service s3(aws:arch-amazon-simple-storage-service)[S3 Bucket]
    ec2:R <--> L:lambda
    lambda:R <--> L:s3

Which AWS slugs do LLMs get wrong most often?

These are the slugs the model will reach for if you do not ground it. Always rewrite using the "Real slug" column. The authoritative list is at latixengine.com/icons.json.

| LLM hallucinates | Real slug | Service |
| --- | --- | --- |
| aws:lambda | aws:arch-aws-lambda | AWS Lambda |
| aws:s3 | aws:arch-amazon-simple-storage-service | S3 |
| aws:ec2 | aws:ec2-instance-contents | EC2 instance |
| aws:vpc | aws:virtual-private-cloud-vpc | VPC group |
| aws:api-gateway | aws:arch-amazon-api-gateway | API Gateway |
| aws:internet-gateway | aws:res-amazon-vpc-internet-gateway | Internet Gateway |
| aws:nat-gateway | aws:res-amazon-vpc-nat-gateway | NAT Gateway |
| aws:subnet-public | aws:public-subnet | Public subnet group |
| aws:subnet-private | aws:private-subnet | Private subnet group |

The pattern behind the failures

LLMs default to the developer shorthand (aws:lambda, aws:s3). Most real slugs follow either arch-provider-service for top-level services or res-provider-service-subcomponent for plumbing inside a VPC. A few, such as aws:public-subnet and aws:virtual-private-cloud-vpc, carry no prefix at all. When in doubt, the model picks neither prefix and falls back to the bare service name.
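The corrections listed in this section can also be applied mechanically as a fix-up pass over generated source. The following is a minimal sketch, assuming Python; the helper name `rewrite_slugs` is hypothetical, and the mapping contains only the pairs listed in this post, so it is a convenience for the most common mistakes and not a substitute for checking against the full catalog.

```python
import re

# Corrections copied from this post (hallucinated slug -> real slug).
KNOWN_FIXES = {
    "aws:lambda": "aws:arch-aws-lambda",
    "aws:s3": "aws:arch-amazon-simple-storage-service",
    "aws:ec2": "aws:ec2-instance-contents",
    "aws:vpc": "aws:virtual-private-cloud-vpc",
    "aws:api-gateway": "aws:arch-amazon-api-gateway",
    "aws:internet-gateway": "aws:res-amazon-vpc-internet-gateway",
    "aws:nat-gateway": "aws:res-amazon-vpc-nat-gateway",
    "aws:subnet-public": "aws:public-subnet",
    "aws:subnet-private": "aws:private-subnet",
}

# An icon slug sits inside the parentheses of a service or group declaration.
SLUG_IN_PARENS = re.compile(r"\((aws:[a-z0-9-]+)\)")


def rewrite_slugs(mermaid_source: str) -> str:
    """Swap known-bad slugs for their real counterparts; leave all others untouched."""

    def fix(match: re.Match) -> str:
        slug = match.group(1)
        return f"({KNOWN_FIXES.get(slug, slug)})"

    return SLUG_IN_PARENS.sub(fix, mermaid_source)
```

Because the pattern only matches slugs inside parentheses, service ids and labels such as `s3` or `[S3 Bucket]` are left alone.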

How do you ground a model with the LatixEngine icon catalog?

There are three grounding strategies, listed from simplest to most involved. Pick the one that fits your tooling.

  • Inline paste — fetch latixengine.com/icons.json and paste the JSON (or a filtered AWS-only subset) directly into the prompt. Works in any chat UI. Costs context tokens but produces near-perfect slugs in a single shot.
  • Tool call — give the model a search function over the catalog (Claude Code, Cursor agents, OpenAI function calling). The model looks up each service as it builds the diagram. Lower token cost, slightly more latency.
  • Retrieval — index the catalog in a vector store and let the model retrieve the relevant entries per query. Overkill for a 1698-row catalog, but useful if you are generating dozens of diagrams in a batch.
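For the tool-call strategy, the search function can be very small. Here is a rough sketch in Python, assuming the catalog has already been loaded into a list of (slug, name) pairs. The layout of icons.json is not shown in this post, so the loading step is left out, and `search_icons` is a hypothetical name, not part of any LatixEngine API.

```python
from typing import List, Tuple

# Sample entries taken from this post; a real run would load the full catalog.
CATALOG: List[Tuple[str, str]] = [
    ("aws:arch-aws-lambda", "AWS Lambda"),
    ("aws:arch-amazon-simple-storage-service", "S3"),
    ("aws:arch-amazon-api-gateway", "API Gateway"),
    ("aws:res-amazon-vpc-internet-gateway", "Internet Gateway"),
    ("aws:res-amazon-vpc-nat-gateway", "NAT Gateway"),
]


def search_icons(query: str, catalog=CATALOG, limit: int = 5):
    """Return up to `limit` (slug, name) pairs whose slug or name contains every query word."""
    words = query.lower().split()
    hits = [
        (slug, name)
        for slug, name in catalog
        if all(w in slug.lower() or w in name.lower() for w in words)
    ]
    return hits[:limit]
```

Registered as a tool through whatever function-calling mechanism your agent supports, this lets the model look up "nat gateway" or "lambda" and copy the returned slug verbatim instead of guessing.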

What is a prompt template that actually produces valid Mermaid AWS source?

This is the inline-paste template. Drop it into any chat UI, replace the goal at the bottom, and you should get mostly correct slugs on the first response.

You are generating a Mermaid architecture-beta diagram.

You MUST use only icon slugs from the following catalog. Do not invent
slugs. If you cannot find an appropriate icon, fall back to a generic
shape like internet, server, database, or cloud.

Catalog (provider:slug — human name):
- aws:arch-aws-lambda — AWS Lambda
- aws:arch-amazon-simple-storage-service — S3
- aws:arch-amazon-api-gateway — API Gateway
- aws:ec2-instance-contents — EC2 instance
- aws:virtual-private-cloud-vpc — VPC (use as a group)
- aws:public-subnet — Public subnet (use as a group)
- aws:private-subnet — Private subnet (use as a group)
- aws:res-amazon-vpc-internet-gateway — Internet Gateway
- aws:res-amazon-vpc-nat-gateway — NAT Gateway
- aws:region — AWS Region (use as a group)
- (paste the rest from latixengine.com/icons.json)

Syntax rules:
- Diagram starts with: architecture-beta
- Service declaration: service <id>(<slug>)[<label>]
- Group declaration: group <id>(<slug>)[<label>]
- Append "in <group id>" to a service declaration to place it inside that group
- Edges use port-anchored syntax: source:R <--> L:target
  where ports are L, R, T, B

Goal: <describe the AWS architecture you want here>

For an even tighter prompt, filter the catalog down to just the services you care about — a 50-line list of relevant slugs is more reliable than the full 856-row AWS catalog because the model has less to scan.
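That filtering step is easy to script. The sketch below assumes the catalog is already available as a Python dict mapping each slug to a human-readable name. That shape is an assumption for illustration, since the actual structure of icons.json is not described here, and `catalog_lines` is a hypothetical helper.

```python
from typing import Dict, Iterable, List

# Same dash separator the template's catalog lines use (written as an escape).
SEP = " \u2014 "


def catalog_lines(catalog: Dict[str, str], keywords: Iterable[str]) -> List[str]:
    """Keep entries whose slug or name mentions any keyword, formatted for the prompt."""
    wanted = [k.lower() for k in keywords]
    lines = []
    for slug, name in sorted(catalog.items()):
        haystack = f"{slug} {name}".lower()
        if any(k in haystack for k in wanted):
            lines.append(f"- {slug}{SEP}{name}")
    return lines
```

Joining the result with newlines gives a block you can paste under the "Catalog" heading of the template, for example `"\n".join(catalog_lines(catalog, ["lambda", "gateway", "subnet"]))`.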

How do you validate AI-generated Mermaid diagrams before publishing?

Even with a grounded prompt, expect one or two slug errors per diagram. The fastest validation loop is to paste the output into the LatixEngine editor, where missing icons render as a clear error and you can fix them in place against the icon picker.

For batch generation (CI pipelines, automated docs), a simpler check is to extract every match of the regex (aws|azure|gcp):[a-z0-9-]+ from the generated Mermaid source, diff the matched slugs against the keys in icons.json, and fail the build on any mismatch. That is the same check LatixEngine uses to validate its own documentation against the catalog before release.
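The batch check described above might look like the following in Python. This is a sketch under stated assumptions: the set of known slugs is passed in by the caller, because how you extract it from icons.json depends on the file's structure, which this post does not spell out. `unknown_slugs` is a hypothetical name.

```python
import re
from typing import Iterable, Set

# Same pattern as in the text, with a non-capturing group so findall()
# returns the whole "provider:slug" string, plus a leading word boundary.
SLUG_RE = re.compile(r"\b(?:aws|azure|gcp):[a-z0-9-]+")


def unknown_slugs(mermaid_source: str, known: Iterable[str]) -> Set[str]:
    """Return every provider-prefixed slug in the source that is not in `known`."""
    known_set = set(known)
    return {slug for slug in SLUG_RE.findall(mermaid_source) if slug not in known_set}
```

In a CI job, you would read the generated .mmd file, call `unknown_slugs`, print whatever comes back, and exit non-zero if the set is not empty. Edge anchors such as `ec2:R` are not flagged, because the pattern requires a provider prefix and a lowercase slug.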

Which workflow gives the best results — chat, IDE agents, or terminal tools?

All three work, but the failure modes differ. This table compares the realistic accuracy you can expect for a 10-node AWS architecture across the major LLM-driven workflows, with and without grounding.

| Workflow | Ungrounded | Grounded inline | Grounded with tool call |
| --- | --- | --- | --- |
| ChatGPT / Claude / Gemini chat UI | Slugs invented | High accuracy | Not applicable |
| Cursor or Claude Code in repo | Slugs invented | High accuracy | Highest accuracy |
| API with no system prompt | Slugs invented | High accuracy | Highest accuracy |
| Mermaid Chart AI | Provider-bundled icons only | Not applicable | Not applicable |

The pattern: ungrounded chat will fail no matter how powerful the model is. Grounding fixes the problem cheaply. The choice between inline paste, tool call, and retrieval comes down to how much context you want to spend and how many diagrams you are generating.

Does the same approach work for Azure and GCP diagrams?

Yes — the failure mode is identical. LLMs hallucinate azure:cosmos-db (real: azure:service-azure-cosmos-db) and gcp:region (real: gcp:project). Filter icons.json by provider and inline-paste the relevant slice into the prompt, and the same grounding pattern works for all three clouds.

If you want the by-hand syntax tutorial without any LLM involved, see our guide on how to draw AWS architecture diagrams in Mermaid.

Frequently asked questions

Which LLM is most accurate at generating Mermaid diagrams?

Without grounding, all major models fail in the same way — they invent slug names. With grounding (catalog pasted into the prompt), the difference between frontier models becomes small. Pick whichever model you already use; the prompt grounding matters more than the model choice.

Why does the LLM not just know the right slugs?

The Mermaid AWS icon pack is a niche dataset that appears infrequently in training corpora. Developer shorthand like aws:lambda is far more common in the wild than the real slug aws:arch-aws-lambda, so the model averages toward the wrong form. Grounding bypasses this by putting the real names directly into the context window.

Can I automate the grounded workflow with the API?

Yes. Fetch latixengine.com/icons.json once, cache it, and inline-paste either the full catalog or a provider-filtered subset into every API call as a system message. Combine with a simple post-generation regex check to fail on any unknown slug.
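The fetch-once-and-cache step could be sketched like this in Python, using only the standard library. The https scheme, the one-day cache lifetime, and the names `load_catalog` and `fetch_bytes` are assumptions for illustration. The function returns whatever JSON the endpoint serves, so inspect its structure before building the system message from it.

```python
import json
import time
import urllib.request
from pathlib import Path
from typing import Callable

# Scheme is assumed; the post gives only the host and path.
CATALOG_URL = "https://latixengine.com/icons.json"


def fetch_bytes(url: str) -> bytes:
    """Download the raw response body."""
    with urllib.request.urlopen(url, timeout=30) as resp:
        return resp.read()


def load_catalog(
    cache_path: Path,
    max_age_s: float = 86400.0,
    fetch: Callable[[str], bytes] = fetch_bytes,
):
    """Return the parsed catalog, refetching only when the cache is missing or stale."""
    fresh = (
        cache_path.exists()
        and (time.time() - cache_path.stat().st_mtime) < max_age_s
    )
    if not fresh:
        cache_path.write_bytes(fetch(CATALOG_URL))
    return json.loads(cache_path.read_bytes())
```

The `fetch` parameter exists so the network call can be swapped out in tests or behind a proxy; in normal use you would call `load_catalog(Path("icons.json"))` once per run.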

Will GitHub render the AI-generated diagrams in a README?

Not reliably yet. GitHub renders Mermaid flowchart and sequenceDiagram natively, but architecture-beta with custom icon packs is still inconsistent across renderers. The recommended workflow is to generate the Mermaid source, render it in LatixEngine, export to SVG, and commit the SVG alongside the .mmd source.

Does LatixEngine offer an AI composer of its own?

An AI architectural diagram composer that uses LatixEngine's validation rule engine is on the public roadmap at latixengine.com/roadmap. Until then, the inline-paste grounding pattern works with any external LLM.

Will the prompt template work in Cursor or Claude Code?

Yes. Paste the template into a system prompt or a .cursorrules / CLAUDE.md file at the project root, and the editor-integrated agent will use the same grounding for every diagram it generates.

Try it in the editor

Paste any example in this post into the LatixEngine editor and see it render with native AWS icons. No login, no install.
