Architecture
A microservices architecture with a secure gateway as the single entry point for all AI operations.
System Overview
All traffic flows through the Secure AI Gateway for rate limiting, security checks, and observability.
Applied AI Engineering Portfolio

┌──────────────┐
│ Web (Next.js)│
│  Port 3000   │
└──────┬───────┘
       │
       ▼
┌───────────────────────────────────────────────────────────┐
│                 Secure AI Gateway (8000)                  │
│ ┌──────────┐  ┌──────────┐  ┌───────────┐  ┌────────────┐ │
│ │   Rate   │─▶│   PII    │─▶│  Prompt   │─▶│    Cost    │ │
│ │  Limit   │  │  Redact  │  │ Injection │  │ Estimation │ │
│ └──────────┘  └──────────┘  └───────────┘  └────────────┘ │
└─────────────────────────────┬─────────────────────────────┘
              ┌───────────────┼───────────────┐
              ▼               ▼               ▼
       ┌─────────────┐ ┌─────────────┐ ┌─────────────┐
       │     RAG     │ │    Eval     │ │  Incident   │
       │   (8001)    │ │   (8002)    │ │   (8003)    │
       │ + ChromaDB  │ │             │ │ + ChromaDB  │
       └─────────────┘ └─────────────┘ └──────┬──────┘
                                              │
                                       ┌──────▼──────┐
    Cross-service incident lookup ───▶ │   DevOps    │
                                       │   (8004)    │
                                       │ + ChromaDB  │
                                       └──────┬──────┘
              │                               │
              ▼                               ▼
┌───────────────────────────────────────────────────────────┐
│                   OpenAI API (External)                   │
│            Embeddings (text-embedding-3-small)            │
│              Chat Completions (gpt-4o-mini)               │
└───────────────────────────────────────────────────────────┘
Request Flow
How requests are processed through the system
Web Frontend (Next.js)
The user interacts with the portfolio website; all API calls go to the Secure AI Gateway.
Secure AI Gateway
Applies rate limiting, PII redaction, prompt injection detection, and cost estimation before routing to backend services.
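To make the ordering concrete, here is a minimal sketch of that middleware chain in Python. All names, thresholds, regexes, and prices here are illustrative assumptions, not the gateway's actual implementation; the point is the sequence: rate limit, then redact, then injection check, then cost estimate.

```python
import re
import time
from collections import defaultdict

# Hypothetical per-client sliding-window rate limiter state.
_requests = defaultdict(list)
RATE_LIMIT = 30          # requests per window (illustrative)
WINDOW_SECONDS = 60.0

EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")
INJECTION_PATTERNS = [
    re.compile(r"ignore (all|previous) instructions", re.I),
    re.compile(r"reveal your system prompt", re.I),
]
PRICE_PER_1K_INPUT = 0.00015  # placeholder price, not a quoted rate

def check_rate_limit(client_id, now=None):
    """Allow at most RATE_LIMIT requests per client per window."""
    now = time.monotonic() if now is None else now
    window = [t for t in _requests[client_id] if now - t < WINDOW_SECONDS]
    _requests[client_id] = window
    if len(window) >= RATE_LIMIT:
        return False
    window.append(now)
    return True

def redact_pii(text):
    """Replace email addresses with a redaction marker."""
    return EMAIL_RE.sub("[REDACTED_EMAIL]", text)

def looks_like_injection(text):
    """Flag prompts matching known injection phrasings."""
    return any(p.search(text) for p in INJECTION_PATTERNS)

def estimate_cost(text):
    """Rough input-cost estimate using a chars-per-token heuristic."""
    tokens = len(text) / 4
    return round(tokens / 1000 * PRICE_PER_1K_INPUT, 6)

def gateway_pipeline(client_id, prompt):
    """Run the middleware chain in the order the gateway applies it."""
    if not check_rate_limit(client_id):
        return {"status": 429, "error": "rate limit exceeded"}
    prompt = redact_pii(prompt)
    if looks_like_injection(prompt):
        return {"status": 400, "error": "prompt injection detected"}
    return {"status": 200, "prompt": prompt,
            "estimated_cost_usd": estimate_cost(prompt)}
```

Running the checks in this order means a rejected request never pays for downstream work: rate-limited or injected prompts never reach a backend service or the OpenAI API.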
Backend Services
RAG, Eval, Incident, and DevOps services process requests using ChromaDB for vector storage and OpenAI for embeddings/completions.
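A core piece of that RAG flow is splitting documents into overlapping chunks that each carry citation metadata; the chunk text is embedded and stored in ChromaDB alongside its citation so answers can point back to sources. The sketch below is a plain-Python illustration of that idea — the chunk size, overlap, and metadata fields are assumptions, not the service's actual parameters.

```python
def chunk_document(doc_id, text, chunk_size=200, overlap=50):
    """Split a document into overlapping chunks, each tagged with a
    citation (doc id, chunk index, character offset)."""
    chunks = []
    step = chunk_size - overlap
    for i, start in enumerate(range(0, len(text), step)):
        piece = text[start:start + chunk_size]
        if not piece:
            break
        chunks.append({
            "text": piece,
            "citation": {"doc_id": doc_id, "chunk": i, "start": start},
        })
    return chunks
```

The overlap keeps sentences that straddle a chunk boundary retrievable from either side, and the stored offsets let a response cite the exact span of the source document.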
Response with Metadata
Responses are enriched with gateway metadata (request_id, latency, cost estimates) before returning to the client.
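The envelope might look like the following sketch — field names and structure here are assumptions chosen to match the metadata listed above, not the gateway's exact schema.

```python
import time
import uuid

def with_gateway_metadata(payload, estimated_cost_usd, started):
    """Wrap a backend response in a gateway metadata envelope:
    a unique request id, measured latency, and the cost estimate."""
    return {
        "data": payload,
        "meta": {
            "request_id": str(uuid.uuid4()),
            "latency_ms": round((time.perf_counter() - started) * 1000, 2),
            "estimated_cost_usd": estimated_cost_usd,
        },
    }
```

Keeping the metadata in a separate `meta` object means every service response is observable (traceable by `request_id`, measurable by `latency_ms` and cost) without the backend payload schema having to know about the gateway.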
Services
Web Frontend
User interface for all demos and documentation
Secure AI Gateway
Central entry point with security middleware
Knowledge Retrieval
RAG system with document chunking and citations
Evaluation Service
LLM testing and regression detection
Incident Investigation
Root cause analysis with evidence retrieval
DevOps Risk Analysis
Deployment risk scoring and recommendations
Design Decisions
Key architectural choices and tradeoffs
Gateway Pattern
All traffic flows through a single gateway, enabling centralized security, observability, and rate limiting without duplicating logic in each service.
Service Isolation
Each AI capability runs as an independent service with its own data store, allowing independent scaling and deployment.
Evidence-Based AI
All LLM outputs must cite supporting evidence. Strict mode lets a service refuse to answer when confidence is low, reducing the risk of hallucinated answers in production.
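The gating logic can be sketched as a small function — the confidence threshold and field names below are illustrative assumptions, not the project's actual values:

```python
def enforce_evidence(answer, citations, confidence,
                     strict=True, threshold=0.7):
    """Gate an LLM answer on evidence: with no citations, or with low
    confidence in strict mode, return an explicit refusal instead."""
    if not citations:
        return {"refused": True, "reason": "no supporting evidence"}
    if strict and confidence < threshold:
        return {"refused": True,
                "reason": "confidence %.2f below threshold %.2f"
                          % (confidence, threshold)}
    return {"refused": False, "answer": answer, "citations": citations}
```

An explicit `refused` result with a machine-readable reason lets the frontend render an honest "not enough evidence" state rather than a fabricated answer.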
Local-First Development
Docker Compose enables full-stack local development. ChromaDB provides vector storage without external dependencies.
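A Compose file for this topology could look roughly like the fragment below. This is an illustrative sketch only — the service names, build contexts, and environment variables are assumptions, not the project's actual `docker-compose.yml`.

```yaml
# Illustrative fragment: web + gateway + one backend service + ChromaDB.
services:
  web:
    build: ./web
    ports: ["3000:3000"]
    depends_on: [gateway]
  gateway:
    build: ./gateway
    ports: ["8000:8000"]
    environment:
      - OPENAI_API_KEY=${OPENAI_API_KEY}
  rag:
    build: ./services/rag
    ports: ["8001:8001"]
    depends_on: [chromadb]
  chromadb:
    image: chromadb/chroma
    volumes:
      - chroma-data:/chroma/chroma
volumes:
  chroma-data:
```

With a named volume for ChromaDB, vector data survives container restarts, so the full stack runs locally with only the OpenAI API as an external dependency.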
Explore the Systems
See these architectural patterns in action through the live demos.