AFS Interview Prep

Accenture Federal Services · GenAI Applications Engineer · Agents & RAG · Req 6718

Section 1

60-Second Intro

What you say when they say "tell me about yourself."

"So my background is a mix of engineering and consulting. I started in product at a law firm, building compliance systems that tracked regulations across multiple jurisdictions. That's where I learned how to build things that have to be right — where a mistake means legal exposure, not just a bug."

"For the last two years I've been deep in GenAI. I built and deployed a multi-agent RAG platform — it's live right now. Python backend, LangGraph workflow, ChromaDB for vector search. A supervisor agent routes queries to retrieval, search, and synthesis agents. It streams responses back to the user. I can walk you through it if you want."

"What I'm looking for is the kind of work where the reliability bar is high. Where you can't just throw something over the wall. That's what attracted me to this role."

About 140 words. Say it out loud 5 times tonight until it sounds like talking.

Section 2

ToolChainDev

toolchain.vercel.app — live and defensible

This is the system you walk them through when they ask "show me something you built." It has everything the JD asks for.

The architecture in plain English

The user asks a question. A supervisor agent classifies the intent and routes to the right specialist. If it's a tool question, the RAG agent queries ChromaDB for the top 5 matches. If it needs fresh info, the search agent calls Tavily. Then the explain agent synthesizes everything into a structured markdown response that streams back to the user.
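If they ask you to whiteboard the routing, a plain-Python sketch like this is enough to talk over. It is not the ToolChainDev code: the keyword classifier stands in for the LLM-based supervisor decision, and the agent functions are hypothetical stubs that return strings instead of calling ChromaDB or Tavily.

```python
from typing import Callable

# Hypothetical stubs standing in for the real specialist agents.
def rag_agent(query: str) -> str:
    return f"[rag] top matches for: {query}"

def search_agent(query: str) -> str:
    return f"[search] fresh results for: {query}"

def explain_agent(query: str, evidence: list[str]) -> str:
    """Synthesize the evidence into a structured markdown answer."""
    bullets = "\n".join(f"- {item}" for item in evidence)
    return f"## Answer to: {query}\n{bullets}"

def classify_intent(query: str) -> str:
    """Toy keyword classifier (assumption); the real supervisor would ask an LLM."""
    fresh_markers = ("latest", "news", "today", "release")
    if any(marker in query.lower() for marker in fresh_markers):
        return "search"
    return "rag"

SPECIALISTS: dict[str, Callable[[str], str]] = {
    "rag": rag_agent,
    "search": search_agent,
}

def supervisor(query: str) -> str:
    """Classify, route to one specialist, then hand the evidence to the explain agent."""
    route = classify_intent(query)
    evidence = [SPECIALISTS[route](query)]
    return explain_agent(query, evidence)
```

The point to make out loud: the supervisor owns the routing decision, and each specialist only does one job.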

Guardrails you built

• Max 5 iterations then force-finish (prevents infinite loops)

• Pydantic catches malformed LLM output before it propagates

• Embedding fallback with explicit logging

• Structured logs via structlog on every agent

• Prometheus metrics tracked per agent
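The first two guardrails can be sketched in a few lines. This is a stdlib stand-in, assuming the router returns JSON with `next_agent` and `reason` fields (hypothetical names); in the real system Pydantic does the validation that `parse_decision` does by hand here.

```python
import json
from dataclasses import dataclass

MAX_ITERATIONS = 5  # the force-finish cap from the list above
ALLOWED_ROUTES = {"rag", "search", "explain", "finish"}  # assumed route names

@dataclass
class RouterDecision:
    next_agent: str
    reason: str

def parse_decision(raw: str) -> RouterDecision:
    """Reject malformed model output before it propagates."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"not valid JSON: {exc}") from exc
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    next_agent = data.get("next_agent")
    if not isinstance(next_agent, str) or next_agent not in ALLOWED_ROUTES:
        raise ValueError(f"unknown route: {next_agent!r}")
    return RouterDecision(next_agent=next_agent, reason=str(data.get("reason", "")))

def run_loop(llm_outputs: list[str]) -> list[str]:
    """Follow router decisions until 'finish', a bad output, or the iteration cap."""
    visited: list[str] = []
    for iteration, raw in enumerate(llm_outputs):
        if iteration >= MAX_ITERATIONS:
            visited.append("finish (forced)")
            break
        try:
            decision = parse_decision(raw)
        except ValueError:
            visited.append("finish (malformed output)")
            break
        visited.append(decision.next_agent)
        if decision.next_agent == "finish":
            break
    return visited
```

Feeding `run_loop` ten identical "go to rag" decisions stops after five and force-finishes, which is the behavior to describe if they ask how you prevent infinite loops.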

If they ask "what chunking strategy?"

"Each AI tool is a natural semantic unit. Name, provider, category, pricing, pros and cons. So I use document-structure chunking, not fixed tokens. For unstructured docs I'd use recursive splitting, 512 tokens, 50 overlap."
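A minimal sketch of what "one tool, one chunk" could look like. The field names come from the answer above; the `ToolRecord` class, the id scheme, and the text layout are assumptions for illustration, not the actual ToolChainDev schema.

```python
from dataclasses import dataclass, field

@dataclass
class ToolRecord:
    name: str
    provider: str
    category: str
    pricing: str
    pros: list[str] = field(default_factory=list)
    cons: list[str] = field(default_factory=list)

def record_to_chunk(tool: ToolRecord) -> dict:
    """Turn one tool record into exactly one chunk, so nothing is split mid-record."""
    text = "\n".join([
        f"{tool.name} ({tool.provider})",
        f"Category: {tool.category}",
        f"Pricing: {tool.pricing}",
        "Pros: " + "; ".join(tool.pros),
        "Cons: " + "; ".join(tool.cons),
    ])
    return {
        "id": tool.name.lower().replace(" ", "-"),  # hypothetical id scheme
        "text": text,
        # Metadata kept separate so retrieval can filter by category or provider.
        "metadata": {"category": tool.category, "provider": tool.provider},
    }
```

The design point: the record boundary is the chunk boundary, so pros and cons always travel with the tool they describe.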

Section 3

How to Explain Each Project

When they ask "tell me about X," find the matching card.

ToolChainDev · LIVE

One-liner: "A multi-agent RAG platform I built and shipped. LangGraph supervisor routes queries to a RAG agent over a vector DB, a search agent that calls external APIs, and an explain agent that synthesizes the answer."

What it does: "It helps developers discover and compare AI tools. You ask 'best vector databases for RAG,' the supervisor routes to retrieval, the RAG agent pulls the top 5 from my indexed database, and the explain agent writes back a structured comparison."

Why it matters: "The patterns transfer. Supervisor routing, guardrails, structured output, streaming. That's the same shape you'd build for a mission system — just different data and stricter constraints."

SMS Marketing · LIVE

One-liner: "An SMS marketing platform with TCPA compliance baked in. Quiet-hours logic, automated opt-out, consent audit trails."

Why TCPA matters: "TCPA is the closest consumer-side analog to federal compliance. You encode legal rules into automated guardrails. Don't message outside allowed hours. Process opt-outs instantly. Log every consent action."

If they ask about regulated work: "TCPA compliance is the same pattern — encode the legal rules into automated guardrails, log everything, deny by default. I've shipped that pattern in production."
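If they want the guardrail made concrete, this is a sketch of the deny-by-default check. Assumptions to flag: the 8 a.m. to 9 p.m. recipient-local window is the commonly cited federal quiet-hours rule and should be confirmed against current federal and state requirements; the function name, reason codes, and in-memory `audit_log` are hypothetical; a fixed UTC offset stands in for real time-zone lookup.

```python
from datetime import datetime, time, timedelta, timezone

ALLOWED_START = time(8, 0)   # assumed window start, recipient local time
ALLOWED_END = time(21, 0)    # assumed window end, recipient local time

audit_log: list[dict] = []   # stand-in for a durable audit store

def can_send(now_utc: datetime, utc_offset_hours: int,
             has_consent: bool, opted_out: bool) -> tuple[bool, str]:
    """Deny by default; `now_utc` must be timezone-aware."""
    recipient_tz = timezone(timedelta(hours=utc_offset_hours))
    local_time = now_utc.astimezone(recipient_tz).time()
    if opted_out:
        decision = (False, "opted_out")
    elif not has_consent:
        decision = (False, "no_consent")
    elif not (ALLOWED_START <= local_time < ALLOWED_END):
        decision = (False, "quiet_hours")
    else:
        decision = (True, "ok")
    # Log every decision, allowed or not, so the trail is audit-ready.
    audit_log.append({
        "at": now_utc.isoformat(),
        "allowed": decision[0],
        "reason": decision[1],
    })
    return decision
```

Opt-out is checked first so that an opted-out number is never messaged, whatever the hour or consent state.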

Regulated Compliance Platform (2020-2023)

Do not name the client.

One-liner: "I built a compliance platform for a regulated industry client tracking regulations across multiple jurisdictions in real time."

What it tracked: "Per-jurisdiction licensing requirements, expiration dates, audit trails. Voice agent for natural-language queries. Elasticsearch for full-text search across the regulation corpus."

Why it matters: "Same problem federal missions face. Data classified by jurisdiction. Audit-ready logging. Multi-source regulation tracking."

If they ask the client name or industry: "Under NDA." If they push: "The architecture pattern is what matters."

AI Learning Platform

One-liner: "An AI learning platform that uses RAG to ground generated content in source material. The interesting part is the context engineering system. 177 structured skills that guide agent behavior."

What context engineering means: "Treating prompts as a system, not a string. Each skill is a markdown file with trigger conditions, step-by-step instructions, and pitfalls. The agent loads the right skills contextually."

The federal bridge: "Federal needs prompt versioning and policy-as-code. Context engineering is the precursor. Structured, versioned, testable prompt libraries."

Section 4

JD Requirements → What to Say

Every line from the job description, mapped to a sentence you can say.

Agent frameworks & orchestration

"I use LangGraph with a StateGraph and Pydantic-typed router decisions. Supervisor pattern routing to specialist agents."

Vector search

"ChromaDB in ToolChainDev. Vectorize in a second system. For federal I'd pick pgvector or OpenSearch. pgvector runs on existing PostgreSQL and reduces attack surface. OpenSearch adds hybrid search."

AWS Bedrock / Azure OpenAI / Vertex AI

"I've documented these platforms and their patterns. Haven't shipped production on them yet. That's where I'd partner with the team to get up to speed fast."

Strong Python

"FastAPI backend, LangGraph workflow, Pydantic for structured output. That's my primary stack."

Tool-using agents

"Per-agent tool scoping. Search agent only sees web search. RAG agent only sees the vector store. No blanket access."
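One way to show per-agent scoping is an allowlist checked at call time. This is a sketch with hypothetical tool and agent names, and the tools are stubs; it is not how any particular framework enforces scope.

```python
from typing import Callable

# Stub tools (assumptions); the real ones would call a search API and a vector store.
def web_search(query: str) -> str:
    return f"web results for {query}"

def vector_lookup(query: str) -> str:
    return f"vector hits for {query}"

TOOL_REGISTRY: dict[str, Callable[[str], str]] = {
    "web_search": web_search,
    "vector_lookup": vector_lookup,
}

# Each agent sees only its own tools. Anything not listed is denied.
AGENT_TOOL_SCOPE: dict[str, set[str]] = {
    "search": {"web_search"},
    "rag": {"vector_lookup"},
}

def call_tool(agent: str, tool_name: str, query: str) -> str:
    """Run a tool only if it is in the calling agent's allowlist."""
    allowed = AGENT_TOOL_SCOPE.get(agent, set())
    if tool_name not in allowed:
        raise PermissionError(f"agent {agent!r} may not call {tool_name!r}")
    return TOOL_REGISTRY[tool_name](query)
```

An unknown agent gets an empty allowlist, so the default is no access.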

Production rigor

"Prometheus on every agent decision. Structured logs. Sentry for errors. Champion-challenger for prompt A/B testing. Auto-rollback on guardrail breach."

SLIs/SLOs

"Faithfulness above 95%. P95 latency under 3 seconds. Guardrail violation rate under 0.1%. Cost per query under 5 cents."

Restricted / air-gapped

"Open-source model on-prem via vLLM. Local embeddings. pgvector in existing PostgreSQL. Trade-off is model quality versus data sovereignty."

Zero Trust, audit-ready

"Every component authenticates every other. Audit trail on every query. I've shipped audit-first compliance systems in production."

Docker / K8s / Terraform

"Shipped Docker. Reading-level familiarity with Kubernetes and Terraform. I pick up infrastructure stacks fast."

Vertex AI (Google Cloud)

"Vertex is Google's equivalent of Bedrock. Gemini, Claude via Model Garden, open models like Llama. GCP is FedRAMP High authorized. If the agency is on Google, Vertex is the natural pick."

n8n — workflow orchestration

"I use n8n for visual workflow orchestration when I need to wire APIs, webhooks, and data pipelines fast. Different tool, same patterns. LangGraph for complex agents, n8n for quick integrations."

Self-hosted / air-gapped stack

"For air-gapped: Llama or Mistral via vLLM, local embeddings via sentence-transformers (for example, BGE models), pgvector for vector search in existing PostgreSQL. No data egress. Trade-off is model quality versus data sovereignty."

Responsible AI

"PII detection at ingestion. Redaction before embedding. Provenance tracking. Human-in-the-loop on low-confidence outputs."
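"Redaction before embedding" can be illustrated with a regex pass. This is only a sketch: the three patterns are simplified assumptions covering emails, US-style SSNs, and US-style phone numbers, and a production system would likely use a dedicated PII detection service rather than regexes alone.

```python
import re

# Simplified illustrative patterns (assumptions), checked in this order.
PII_PATTERNS = {
    "EMAIL": re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "PHONE": re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b"),
}

def redact(text: str) -> tuple[str, dict[str, int]]:
    """Replace each PII match with a label and count what was removed."""
    counts: dict[str, int] = {}
    for label, pattern in PII_PATTERNS.items():
        text, found = pattern.subn(f"[{label}]", text)
        counts[label] = found
    return text, counts
```

The counts give you the provenance record: what was removed from which document, without storing the values themselves.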

Section 5

The 5 Accenture Topics

Confirmed by Medium, LinkedIn, Dataford, DataCamp, and InterviewBit.

• RAG · 95%

• Hallucinations · 90%

• Agents · 85%

• Chunking · 80%

• Eval · 75%

If you have 1 hour: 30 min on RAG, 15 on multi-agent, 10 on hallucinations, 5 on chunking.

Section 6

Q-Drill: RAG

Say your answer first, then check.

Q1 · Medium
What is RAG and why is it important?

RAG combines information retrieval with generative models. It retrieves relevant documents from a knowledge base using vector search, then uses a generative model to synthesize an answer grounded in that retrieved context.

It grounds outputs in actual data. More factual, domain-specific responses without retraining. For federal use cases where data is sensitive or changes frequently, RAG is the right tool.

Q2 · Medium
How does RAG differ from standard LLM generation?

Standard LLM generation relies on pre-trained knowledge, frozen at training time. RAG retrieves real-time or proprietary information from a database and passes it to the model as context for generation.

This reduces hallucinations, provides domain-specific answers, and adapts to dynamic content without retraining. For federal missions, RAG lets you keep sensitive data in your own environment while using a general-purpose model.

Q3 · Medium
What is multi-hop retrieval and when is it useful?

Multi-hop retrieval sequentially retrieves context across multiple documents or steps. Instead of one search then answer, it's: retrieve document A, extract a clue, search for document B, synthesize.

Useful for complex queries requiring synthesis across sources. "Compare compliance requirements in jurisdictions X and Y" requires retrieving X's rules, then Y's, then comparing.
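The retrieve, extract a clue, retrieve again loop can be sketched with a toy corpus. Everything here is hypothetical: the two documents, the "Rule 12-B" reference format, and the word-overlap scoring that stands in for vector search.

```python
import re

# Toy corpus (made up for illustration).
CORPUS = {
    "licensing-overview": "Renewal timelines for Jurisdiction X are defined in Rule 12-B.",
    "rule-12-b": "Rule 12-B: licenses must be renewed every 12 months.",
}

def tokens(text: str) -> set[str]:
    return set(re.findall(r"[a-z0-9-]+", text.lower()))

def retrieve(query: str, exclude: frozenset = frozenset()) -> tuple[str, str]:
    """Return the (doc_id, text) with the most word overlap, skipping seen docs."""
    candidates = [(doc_id, text) for doc_id, text in CORPUS.items()
                  if doc_id not in exclude]
    return max(candidates, key=lambda item: len(tokens(query) & tokens(item[1])))

def multi_hop(question: str) -> tuple[list[str], str]:
    """Hop 1 finds a pointer; hop 2 follows it to the document with the answer."""
    first_id, first_text = retrieve(question)
    clue = re.search(r"Rule \d+-[A-Z]", first_text)
    if clue is None:
        return [first_id], first_text
    second_id, second_text = retrieve(clue.group(0), exclude=frozenset({first_id}))
    return [first_id, second_id], second_text
```

Excluding already-seen documents on the second hop keeps the retriever from returning the same pointer document again.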

Q5 · Medium
Why are vector databases important in RAG?

Vector databases store high-dimensional embeddings and enable efficient similarity search via Approximate Nearest Neighbor. They allow fast retrieval of semantically similar documents.

Without them, you'd compute similarity against every document on every query. Doesn't scale. In my system I use ChromaDB. For federal scale I'd use OpenSearch or pgvector.
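For reference, this is the brute-force scan a vector database lets you avoid: cosine similarity against every stored vector on every query. The tiny two-dimensional vectors and document ids are made up for illustration; approximate indexes such as HNSW trade a little recall to skip most of these comparisons.

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def top_k(query_vec: list[float], index: dict[str, list[float]], k: int = 5) -> list[str]:
    """Exact nearest neighbors by scanning every vector: O(N) work per query."""
    scored = [(cosine(query_vec, vec), doc_id) for doc_id, vec in index.items()]
    scored.sort(reverse=True)
    return [doc_id for _, doc_id in scored[:k]]
```

With a handful of documents this is fine. With millions, the linear scan per query is the part that doesn't scale.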

Section 7

Q-Drill: Hallucinations + Agents

Q8 · Accenture
How do you reduce hallucinations?

1. Grounding — Answer only from retrieved context. System prompt: "Answer based ONLY on provided context."

2. Low temperature — 0.1 to 0.3 for factual retrieval.

3. Confidence thresholds — Below threshold, return "no results."

4. Citation enforcement — Agent references which sources it's drawing from.

5. Post-generation validation — Verify claims against source chunks.

6. Human-in-the-loop — High-stakes outputs route to review.
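Techniques 1, 3, and 4 fit in one short sketch. The 0.75 threshold, the `[n]` citation format, and the function names are assumptions for illustration; the threshold in particular would be tuned on a golden set.

```python
import re
from typing import Optional

GROUNDED_SYSTEM_PROMPT = "Answer based ONLY on provided context. Cite sources as [n]."
MIN_SCORE = 0.75  # hypothetical similarity threshold

def build_grounded_prompt(question: str, hits: list[tuple[float, str]]) -> Optional[str]:
    """Return a grounded prompt, or None when no hit clears the threshold."""
    kept = [text for score, text in hits if score >= MIN_SCORE]
    if not kept:
        return None  # caller should answer "no results" instead of guessing
    context = "\n".join(f"[{i}] {text}" for i, text in enumerate(kept, start=1))
    return f"{GROUNDED_SYSTEM_PROMPT}\n\nContext:\n{context}\n\nQuestion: {question}"

def citations_valid(answer: str, source_count: int) -> bool:
    """Require at least one citation, and every citation must point at a real source."""
    cited = {int(n) for n in re.findall(r"\[(\d+)\]", answer)}
    return bool(cited) and all(1 <= n <= source_count for n in cited)
```

Returning `None` below the threshold is the important part: the model never sees weak context, so it has nothing to improvise from.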

Q10 · Accenture
Design an agentic system that reads email, queries a DB, drafts a response.

Supervisor pattern. Email intake agent classifies intent and extracts entities. Database agent queries internal systems via structured tool calls. Response drafting agent generates a grounded response. Guardrail agent validates output. Human review for sensitive cases.

Supervisor manages flow: circuit breakers, retries, fallback. Each agent has its own tool scope.

Q14 · Medium
LangGraph vs LangChain?

LangGraph extends LangChain with graph-based orchestration. Instead of linear chains, you define a StateGraph with nodes (agents) and conditional edges (routing).

In my system: supervisor, search, rag, explain nodes. Supervisor routes conditionally. All specialists return to supervisor. This cyclic flow is what linear chains can't do.

Key advantage: typed state object carries context between nodes.

Q16 · Accenture
How do you implement policy-based routing and guardrails?

Multiple layers. Input: prompt injection defense, PII detection. Tool scope: each agent only sees its own tools. Output: structured output enforcement, content filtering.

Loop prevention: max 5 iterations, max tokens, max tool calls. Cost guardrails: token budget. Audit logging: every decision logged. Circuit breakers on failures.
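If "circuit breaker" draws a follow-up, a count-based version is easy to explain. This is a deliberately small sketch: it opens after a hypothetical threshold of three consecutive failures and stays open until `reset()` is called, where a production breaker would usually add a cool-down timer and a half-open state.

```python
from typing import Any, Callable

class CircuitBreaker:
    """Stop calling a failing dependency after repeated consecutive failures."""

    def __init__(self, failure_threshold: int = 3) -> None:
        self.failure_threshold = failure_threshold
        self.consecutive_failures = 0

    @property
    def is_open(self) -> bool:
        return self.consecutive_failures >= self.failure_threshold

    def call(self, fn: Callable[..., Any], *args: Any) -> Any:
        if self.is_open:
            # Fail fast instead of hammering a dependency that is already down.
            raise RuntimeError("circuit open: call skipped")
        try:
            result = fn(*args)
        except Exception:
            self.consecutive_failures += 1
            raise
        self.consecutive_failures = 0  # any success closes the streak
        return result

    def reset(self) -> None:
        self.consecutive_failures = 0
```

The supervisor would catch the `RuntimeError` and take the fallback path rather than retrying.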

Section 8

Q-Drill: Chunking + Eval + Federal

Q19 · LinkedIn
What chunking strategies exist?

Fixed-size — split at N tokens. Quick prototyping.

Recursive — paragraph then sentence then word. General purpose.

Semantic — split where meaning shifts. Long-form.

Document-structure — headings and sections. Regulations, legal.

Late chunking — embed full doc first, then chunk. Preserves context.

For my tool data: document-structure. For unstructured: recursive 512-token with 50 overlap.
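The 512/50 numbers are easiest to defend with the windowing arithmetic in hand. This sketch is the plain fixed-size sliding window, not a recursive splitter: a recursive splitter would first try to cut at paragraph and sentence boundaries and only then fall back to windows like these. Any list of tokens works as input; whitespace-split words are a rough stand-in for model tokens.

```python
def chunk_tokens(tokens: list, size: int = 512, overlap: int = 50) -> list[list]:
    """Split tokens into windows of `size`, each sharing `overlap` tokens with the previous."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    step = size - overlap  # each window starts this far after the previous one
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + size])
        if start + size >= len(tokens):
            break  # this window already reached the end
    return chunks
```

With 1,000 tokens this gives three chunks starting at positions 0, 462, and 924, so a sentence that straddles a boundary appears whole in at least one chunk as long as it is shorter than the overlap.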

Q27 · Medium
How would you evaluate RAG performance?

Retrieval: precision, recall, NDCG@k (most relevant ranked highest).

Generation: faithfulness (grounded in context?), answer relevance (addresses question?).

Golden test set of 50-100 Q&A pairs. LLM-as-a-judge for faithfulness. The RAG triad (context relevance, groundedness, answer relevance) is my north star.
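Be ready to compute NDCG@k by hand. This sketch uses the linear-gain form (relevance divided by log2 of rank + 1); another common form uses 2^rel - 1 as the gain. One simplifying assumption: the ideal ordering is taken from the retrieved list itself, whereas a full evaluation would take it from all judged documents for the query.

```python
import math

def dcg_at_k(relevances: list[float], k: int) -> float:
    """Discounted cumulative gain: relevant results count less the lower they rank."""
    return sum(rel / math.log2(rank + 1)
               for rank, rel in enumerate(relevances[:k], start=1))

def ndcg_at_k(relevances: list[float], k: int) -> float:
    """DCG normalized by the best possible ordering of the same results."""
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal else 0.0
```

A perfectly ordered list scores 1.0, and the same results in reverse order score lower, which is the "rewards relevant results ranked higher" line in practice.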

Q28 · Accenture
What SLIs/SLOs for a GenAI app?

Quality: faithfulness above 95%, relevance above 90%.

Latency: p95 under 3 seconds.

Safety: violation rate under 0.1%.

Reliability: uptime 99.9%.

Cost: under 5 cents per query.
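The targets above can be checked mechanically. In this sketch the thresholds come from the list; the metric key names and the nearest-rank percentile method are assumptions, and a real setup would read these values from a metrics backend such as Prometheus.

```python
# Lower bounds: the measured value must be at or above the target.
SLO_MINIMUMS = {"faithfulness": 0.95, "relevance": 0.90, "uptime": 0.999}
# Upper bounds: the measured value must be at or below the target.
SLO_MAXIMUMS = {"p95_latency_s": 3.0, "violation_rate": 0.001, "cost_per_query_usd": 0.05}

def p95(values: list[float]) -> float:
    """Nearest-rank 95th percentile, using integer math to avoid float rounding."""
    if not values:
        raise ValueError("no samples")
    ordered = sorted(values)
    rank = (95 * len(ordered) + 99) // 100  # ceil(0.95 * n)
    return ordered[rank - 1]

def breached_slos(metrics: dict[str, float]) -> list[str]:
    """Return the names of any SLOs the measured metrics violate."""
    breached = [name for name, floor in SLO_MINIMUMS.items() if metrics[name] < floor]
    breached += [name for name, cap in SLO_MAXIMUMS.items() if metrics[name] > cap]
    return breached
```

A non-empty result from `breached_slos` is the kind of signal that would trigger an alert or a rollback.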

Q33 · JD-aligned
What vector DB for federal deployment?

For prototyping: Chroma or FAISS. For production: pgvector (runs on PostgreSQL, reduces attack surface, helps ATO) or OpenSearch (hybrid search, and available as a FedRAMP-authorized managed service on AWS).

Avoid Pinecone for air-gapped. Key federal constraint: can it run on-prem?

Section 9

RAG Pipeline — 7 Steps

Memorize this order. If they ask "walk me through your pipeline," say these in order.

1. Ingest — Load documents. Clean and normalize text.

2. Chunk — Split into pieces. 512 tokens, 50 overlap. Preserve section boundaries.

3. Embed — Generate vectors. OpenAI 1536d or local sentence-transformers.

4. Store — Vector store. ChromaDB, pgvector, or OpenSearch.

5. Retrieve — Query embedding, top-K similarity, metadata filtering, reranking.

6. Generate — Retrieved context plus grounded prompt. Low temperature.

7. Validate — Citation check, hallucination check, guardrail check. Deliver with citations.

7 words: Ingest, Chunk, Embed, Store, Retrieve, Generate, Validate.
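To rehearse the order, here are the seven steps as one toy script. Everything in it is a stand-in: word counts play the role of embeddings, a Python list plays the vector store, and `generate` just echoes the retrieved context with citation markers instead of calling a model.

```python
import math
import re
from collections import Counter

def ingest(raw: str) -> str:                       # 1. Ingest: normalize whitespace
    return " ".join(raw.split())

def chunk(text: str, size: int = 512, overlap: int = 50) -> list[str]:  # 2. Chunk
    words, step, pieces = text.split(), size - overlap, []
    for start in range(0, len(words), step):
        pieces.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break
    return pieces

def embed(text: str) -> Counter:                   # 3. Embed: toy bag-of-words vector
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[word] * b[word] for word in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def generate(question: str, context: list[str]) -> str:  # 6. Generate: stub, no model call
    return "\n".join(f"{text} [{i}]" for i, text in enumerate(context, start=1))

def validate(answer: str, context: list[str]) -> bool:   # 7. Validate: every source cited
    return bool(context) and all(f"[{i}]" in answer for i in range(1, len(context) + 1))

def run_pipeline(raw_docs: list[str], question: str, k: int = 2) -> str:
    store = []                                     # 4. Store: in-memory list
    for doc in raw_docs:
        for piece in chunk(ingest(doc)):
            store.append((piece, embed(piece)))
    query_vec = embed(question)                    # 5. Retrieve: top-k by similarity
    ranked = sorted(store, key=lambda item: cosine(query_vec, item[1]), reverse=True)[:k]
    context = [text for text, _ in ranked]
    answer = generate(question, context)
    if not validate(answer, context):
        raise ValueError("answer failed citation check")
    return answer
```

Swapping any stand-in for the real component (an embedding model, ChromaDB or pgvector, an LLM call) leaves the order of the steps unchanged.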

Section 10

Red Flags — What Not to Say

Federal mindset is not startup mindset.

DON'T

"I used GPT-4 for everything"

SAY

"I evaluate models across quality, safety, latency, cost. For federal I'd add FedRAMP as a gate."

DON'T

"I move fast and break things"

SAY

"I ship in weeks with guardrails, monitoring, and rollback capability."

DON'T

"I'm an expert in ATO/STIGs"

SAY

"Working familiarity. I understand the constraints and would partner with security teams."

DON'T

"It worked well"

SAY

"I evaluated with NDCG@k and faithfulness scoring, hit X% on the golden set."

DON'T

Deep-dive on a project you can't defend

SAY

"Most production work is under NDA. I can walk through architecture and patterns."

Section 11

Questions to Ask Them

Pick 4. Never say "I don't have questions."

Technical

"What does your evaluation pipeline look like? How do you measure quality and safety in production?"

"Which cloud platforms are you primarily building on? Bedrock, Azure OpenAI, Vertex?"

"How do you handle model inference in air-gapped or restricted environments?"

Cultural

"What does success look like in the first 90 days?"

"How much is individual engineering versus collaborative design with the client?"

"What's the team structure?"

Federal

"How do you balance 'ship in weeks' with ATO and security reviews?"

"Standalone GenAI apps, or AI integrated into existing federal systems?"

Section 12

Closing — Last 60 Seconds

When they ask "any final thoughts?"

"I guess if I had to boil it down — I build things that have to work. My compliance background taught me that. My RAG platform proves I can do it with GenAI. And I've been living inside the tools I use every day long enough to know where the sharp edges are."

"I know I don't have deep federal experience yet. But I understand the constraints — air-gapped inference, audit trails, data classification. And I learn infrastructure fast."

"I'd love to hear more about what the team is actually building."

Section 13

Cram Checklist

Memorize Cold

• 60-second intro (5x aloud)

• ToolChainDev architecture (draw from memory)

• SLI/SLO table

• 6 hallucination-reduction techniques

• NDCG = "rewards relevant results ranked higher"

• Bedrock vs Azure vs Vertex (one line each)

• "Regulated multi-jurisdiction compliance platform"

• 4 questions to ask them

Setup

• Bookmark toolchain.vercel.app

• Walk through all 13 sections once

• Water, phone silenced, notepad

• Quiet room, neutral background

• 15 min early

You've built this. You can defend it.