Agentic AI & Deployment | AI Engineer Portal

Multi-Agent Coordination Patterns

Pattern 1: Hierarchical (Orchestrator + Specialists)

An Orchestrator agent receives the high-level goal, decomposes it, and delegates to specialized sub-agents. Most suitable for business workflows.

flowchart TD
    U[User Goal:\nAnalyze Q3 sales and prepare report] --> O[Orchestrator Agent\nPlans and delegates]
    O --> R[Researcher Agent\nPulls data from Snowflake]
    O --> A[Analyst Agent\nRuns Python analysis]
    O --> W[Writer Agent\nFormats executive summary]
    O --> V[Reviewer Agent\nFact-checks + quality gate]
    R --> O2[Aggregated Results]
    A --> O2
    W --> O2
    V --> O2
    O2 --> F[Final Report\nDelivered to User]
    style O fill:#ccfbf1,stroke:#0d9488
    style V fill:#fef3c7,stroke:#d97706
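
The delegation flow in this pattern can be sketched in plain Python. This is a minimal illustration, not a framework API: the specialist functions, the fixed plan, and every name below are hypothetical stand-ins for LLM-backed agents.

```python
from typing import Callable, Dict, List, Tuple


# Hypothetical specialists: plain functions standing in for LLM-backed agents.
def researcher(task: str) -> str:
    return f"[data for: {task}]"


def analyst(task: str) -> str:
    return f"[analysis of: {task}]"


def writer(task: str) -> str:
    return f"[summary of: {task}]"


SPECIALISTS: Dict[str, Callable[[str], str]] = {
    "research": researcher,
    "analyze": analyst,
    "write": writer,
}


def plan(goal: str) -> List[Tuple[str, str]]:
    """Decompose the goal into (specialist, sub-task) pairs.

    A real orchestrator would ask an LLM for this plan; here it is fixed.
    """
    return [
        ("research", f"pull data for '{goal}'"),
        ("analyze", f"run analysis for '{goal}'"),
        ("write", f"draft report for '{goal}'"),
    ]


def orchestrate(goal: str) -> Dict[str, str]:
    """Delegate each planned sub-task and aggregate results by specialist."""
    results: Dict[str, str] = {}
    for specialist, sub_task in plan(goal):
        results[specialist] = SPECIALISTS[specialist](sub_task)
    return results


report = orchestrate("Q3 sales")
```

Keeping the plan separate from the dispatch loop means the planner can be swapped for an LLM call without touching how specialists are invoked.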

Pattern 2: Pipeline (Sequential Agents)

The output of one agent flows directly into the input of the next. The flow is easy to follow and audit, which makes it a good fit for document processing.

flowchart LR
    A[Ingestion Agent\nLoad raw docs] --> B[Extraction Agent\nParse + structure]
    B --> C[Enrichment Agent\nAdd metadata + classify]
    C --> D[Indexing Agent\nEmbed + store in vector DB]
    D --> E[Validation Agent\nQuality check]
    E --> F[(Knowledge Base\nReady for RAG)]
    style F fill:#f0fdf4,stroke:#16a34a
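
A sequential pipeline reduces to passing one record through an ordered list of stages. The sketch below assumes each agent is a plain function over a dictionary; the stage names and fields are hypothetical, and the trace list is one simple way to get the auditability mentioned above.

```python
from typing import Any, Callable, Dict, List

Doc = Dict[str, Any]
Stage = Callable[[Doc], Doc]


# Hypothetical stages: each returns a new record instead of mutating its input.
def ingest(doc: Doc) -> Doc:
    return {**doc, "text": str(doc["raw"]).strip()}


def extract(doc: Doc) -> Doc:
    return {**doc, "tokens": doc["text"].split()}


def enrich(doc: Doc) -> Doc:
    return {**doc, "length": len(doc["tokens"])}


def validate(doc: Doc) -> Doc:
    if doc["length"] == 0:
        raise ValueError("empty document")
    return doc


STAGES: List[Stage] = [ingest, extract, enrich, validate]


def run_pipeline(doc: Doc, stages: List[Stage]) -> Doc:
    """Feed each stage's output into the next and record the order of execution."""
    trace: List[str] = []
    for stage in stages:
        doc = stage(doc)
        trace.append(stage.__name__)
    return {**doc, "trace": trace}


result = run_pipeline({"raw": "  hello agent world "}, STAGES)
```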

Pattern 3: Peer-to-Peer (Collaborative Debate)

Agents communicate directly, challenge each other's outputs, and converge on a consensus. AutoGen's multi-agent conversations follow this pattern.
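
A debate loop can be sketched as a proposer and a critic exchanging drafts until the critic has no objection or a round limit is hit. Both agents below are hypothetical rule-based stand-ins for LLM calls, and the round limit is an assumed safeguard rather than anything a specific framework prescribes.

```python
from typing import Optional, Tuple


def proposer(draft: str, objection: str) -> str:
    """Hypothetical agent: revises the draft to address the objection."""
    return f"{draft} (addresses: {objection})"


def critic(draft: str) -> Optional[str]:
    """Hypothetical agent: returns an objection, or None when it agrees."""
    if "edge cases" not in draft:
        return "edge cases"
    return None


def debate(initial: str, max_rounds: int = 3) -> Tuple[str, bool, int]:
    """Alternate critique and revision; return (draft, consensus reached, rounds used)."""
    draft = initial
    for round_no in range(1, max_rounds + 1):
        objection = critic(draft)
        if objection is None:
            return draft, True, round_no
        draft = proposer(draft, objection)
    return draft, False, max_rounds


final, agreed, rounds = debate("FizzBuzz solution")
```

Returning the consensus flag separately lets the caller decide what to do when the agents never agree, for example escalating to a human.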

Agent Roles Reference

Role	Responsibility	Tools Typically Used
Planner	Decompose goal into sub-tasks	No tools — pure reasoning
Researcher	Gather information	web_search, retriever, file_reader
Executor	Take actions in systems	code_exec, api_call, db_write
Reviewer	Validate quality of output	code_exec (tests), assertion checks
Critic	Challenge assumptions, find flaws	No tools — adversarial reasoning
Summarizer	Condense + format results	file_writer, email_sender
Router	Classify + redirect requests	Classifier tool or pure LLM
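
As an illustration of the Router role, the sketch below classifies a request with keyword matching and falls back to the Planner. The keyword lists are hypothetical; a production router would more likely use a trained classifier or an LLM call, as the table notes.

```python
from typing import Dict, List

# Hypothetical keyword map from role name to trigger phrases.
ROUTES: Dict[str, List[str]] = {
    "Researcher": ["find", "search", "look up"],
    "Executor": ["run", "deploy", "update"],
    "Summarizer": ["summarize", "condense"],
}


def route(request: str, default: str = "Planner") -> str:
    """Return the first role whose keywords appear in the request, else the default."""
    text = request.lower()
    for role, keywords in ROUTES.items():
        if any(keyword in text for keyword in keywords):
            return role
    return default
```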

Framework Deep Dives

CrewAI

Role-based multi-agent crew

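from crewai import Agent, Crew, Task

# search_tool is assumed to be a CrewAI-compatible tool created elsewhere.
researcher = Agent(
    role="Researcher",
    goal="Find accurate data",
    backstory="Analyst who checks every figure against its source",
    tools=[search_tool],
)
writer = Agent(
    role="Writer",
    goal="Write clear reports",
    backstory="Editor who turns findings into concise summaries",
)
# Illustrative tasks: each needs a description, an expected output, and an agent.
research_task = Task(
    description="Collect the data needed for the report",
    expected_output="Bullet list of findings with sources",
    agent=researcher,
)
write_task = Task(
    description="Write a report from the research findings",
    expected_output="One-page report",
    agent=writer,
)
crew = Crew(
    agents=[researcher, writer],
    tasks=[research_task, write_task],  # executed in order by default
)
result = crew.kickoff()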

Best: Business workflows, content pipelines

LangGraph

State machine agent graphs

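from langgraph.graph import StateGraph, END

# AgentState and the node functions are assumed to be defined elsewhere.
workflow = StateGraph(AgentState)
workflow.add_node("research", research_fn)
workflow.add_node("analyze", analyze_fn)
workflow.add_node("write", write_fn)
workflow.set_entry_point("research")

# should_continue(state) returns "continue" or "end"; the mapping picks the next node.
workflow.add_conditional_edges(
    "research",
    should_continue,
    {"continue": "analyze", "end": END},
)
workflow.add_edge("analyze", "write")
workflow.add_edge("write", END)
app = workflow.compile()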

Best: Conditional flows, HITL, loops
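
The routing function passed to add_conditional_edges is ordinary Python: it reads the state and returns one of the keys in the edge mapping. The sketch below assumes a state with a hypothetical documents field; the real AgentState is not shown in the snippet above, so treat the field name as an illustration only.

```python
from typing import List, TypedDict


class AgentState(TypedDict):
    # Hypothetical state schema; a real graph would carry whatever its nodes need.
    question: str
    documents: List[str]


def should_continue(state: AgentState) -> str:
    """Return "continue" when research found something to analyze, else "end"."""
    if state["documents"]:
        return "continue"
    return "end"
```

Because the function only inspects the state, it can be unit-tested without building or compiling the graph.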

AutoGen

Conversational multi-agent

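from autogen import AssistantAgent, GroupChat, GroupChatManager, UserProxyAgent

llm_config = {"model": "gpt-4o"}  # credentials assumed to come from the environment

coder = AssistantAgent(name="Coder", llm_config=llm_config)
reviewer = AssistantAgent(
    name="Reviewer",
    system_message="Critique code",
    llm_config=llm_config,
)
user = UserProxyAgent(
    name="User",
    code_execution_config={...},  # placeholder: supply your executor settings
)

# A group chat brings the reviewer into the conversation;
# chatting with coder alone would leave it unused. max_round is illustrative.
chat = GroupChat(agents=[user, coder, reviewer], messages=[], max_round=6)
manager = GroupChatManager(groupchat=chat, llm_config=llm_config)
user.initiate_chat(manager, message="Solve FizzBuzz")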

Best: Code generation, peer review

Azure AI Agent Service

Azure AI Agent Service (Azure AI Foundry) provides a fully managed runtime for agents — handling threads, file storage, tool execution, auto-scaling, and observability out of the box.

flowchart LR
    subgraph SDK ["Your Code (Python/C#)"]
        A[Create Agent\nGPT-4o + tools + instructions]
        B[Create Thread\nconversation context]
        C[Add Message\nuser input]
        D[Run Thread\ntrigger execution]
        E[Poll + Stream\nget results]
    end
    subgraph AZURE ["Azure AI Foundry (Managed)"]
        F[Thread Store\nAzure Cosmos DB]
        G[File Store\nAzure Blob]
        H[Tool Execution\nCode Interpreter, Bing, Functions]
        I[LLM Inference\nGPT-4o / o1]
    end
    A --> AZURE
    B --> F
    C --> F
    D --> H
    D --> I
    E --> F
    style AZURE fill:#cffafe,stroke:#0891b2

# Create and run an Azure AI Agent: minimal skeleton.
# Method and argument names follow a preview release of azure-ai-projects
# and vary between SDK versions; check the reference for the version you install.
from azure.ai.projects import AIProjectClient
from azure.identity import DefaultAzureCredential

client = AIProjectClient.from_connection_string("<FOUNDRY_CONNECTION_STRING>", DefaultAzureCredential())

agent = client.agents.create_agent(
    model="gpt-4o", name="MyAgent",
    instructions="You are a helpful data analyst.",
    tools=client.agents.get_file_search_tool()
)
thread = client.agents.create_thread()
client.agents.create_message(thread.id, role="user", content="Summarize Q3 report")
run = client.agents.create_and_process_run(thread.id, agent.id)  # waits for the run to finish
messages = client.agents.list_messages(thread.id)
print(messages.get_last_text_message_by_role("assistant").text.value)

Built-in Tools

✅ Code Interpreter (Python sandbox)
✅ File Search (auto-RAG on uploaded files)
✅ Bing Search grounding
✅ Custom Function Calling
✅ Azure Functions integration

Managed Features

✅ Thread & message persistence (Cosmos)
✅ File upload & chunking (auto-RAG)
✅ Auto-scaling LLM calls
✅ Cost metering per agent/thread
✅ Azure Monitor integration

Deploying Agents on Kubernetes

Production agents on AKS follow a microservices pattern — each capability isolated, scaled independently, secured via Workload Identity.

flowchart TB
    subgraph AKS ["AKS Cluster"]
        subgraph NS_APP ["namespace: ai-app"]
            FE[Frontend Pod\nnginx static]
            API[Backend API\nFastAPI :8000]
        end
        subgraph NS_AGENT ["namespace: ai-agents"]
            ORCH[Orchestrator\nLangGraph :8001]
            SEARCH[Search Agent\nRAG :8002]
            CODE[Code Agent\nPython executor :8003]
        end
        subgraph NS_OBS ["namespace: observability"]
            PROM[Prometheus]
            GRAF[Grafana]
        end
    end
    subgraph AZURE ["Azure Services"]
        ACR[Container Registry]
        KV[Key Vault\nAPI keys, secrets]
        OPENAI[Azure OpenAI\nGPT-4o Endpoint]
        AIS[Azure AI Search\nVector Store]
    end
    FE --> API
    API --> ORCH
    ORCH --> SEARCH
    ORCH --> CODE
    SEARCH --> AIS
    CODE --> OPENAI
    ACR --> AKS
    KV -->|Workload Identity| AKS
    style NS_AGENT fill:#ccfbf1,stroke:#0d9488

Resource Recommendations for Agent Pods

resources:
  requests:
    memory: "512Mi"
    cpu: "250m"
  limits:
    memory: "2Gi"   # agents need headroom
    cpu: "1000m"    # burst capacity during LLM calls

Workload Identity for LLM Keys

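# Agent pod exchanges its OIDC token for a Microsoft Entra ID (Azure AD) token
# No secrets stored in K8s; the pod pulls them from Key Vault
serviceAccountName: ai-agent-sa
# SA annotated with the managed identity's client ID (azure.workload.identity/client-id)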
env:
- name: AZURE_CLIENT_ID
  valueFrom:
    configMapKeyRef:
      name: agent-config
      key: client_id

Python Backend Features You Can Add

Core API Features

▸ Chat API with streaming responses via SSE or WebSocket
▸ Document upload API for PDF, DOCX, TXT, HTML ingestion
▸ RAG query endpoint with citations, confidence, and source chunks
▸ Agent run endpoint with status polling and execution history
▸ Authentication, per-user sessions, tenant isolation

Operational Features

▸ Usage tracking: token counts, latency, model used, per-user quotas
▸ Prompt/version registry for experimentation and rollback
▸ Conversation memory store in Postgres or Redis
▸ Background jobs for indexing, re-embedding, scheduled evaluations
▸ Audit logs and guardrails for prompt injection or unsafe tool calls
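
The prompt/version registry from the list above can be sketched as an append-only store with a movable "active" pointer. This is an in-memory illustration with hypothetical names; a real backend would persist versions in a database such as the Postgres instance mentioned earlier.

```python
from typing import Dict, List


class PromptRegistry:
    """Append-only prompt store; rollback moves the active pointer, never deletes."""

    def __init__(self) -> None:
        self._versions: Dict[str, List[str]] = {}
        self._active: Dict[str, int] = {}

    def register(self, name: str, template: str) -> int:
        """Store a new version, make it active, and return its 1-based number."""
        versions = self._versions.setdefault(name, [])
        versions.append(template)
        self._active[name] = len(versions)
        return self._active[name]

    def rollback(self, name: str, version: int) -> None:
        """Re-activate an earlier version of the named prompt."""
        if not 1 <= version <= len(self._versions.get(name, [])):
            raise ValueError(f"unknown version {version} for prompt '{name}'")
        self._active[name] = version

    def get(self, name: str) -> str:
        """Return the currently active template."""
        return self._versions[name][self._active[name] - 1]


registry = PromptRegistry()
registry.register("summarize", "Summarize: {text}")
registry.register("summarize", "Summarize in three bullets: {text}")
```

Calling registry.rollback("summarize", 1) would make the first template active again while keeping the second available for later experiments.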

Recommended Architecture

flowchart LR
    UI[Static Frontend] --> API[FastAPI Backend]
    API --> LC[LangChain / LangGraph Layer]
    API --> AUTH[Auth + Sessions]
    API --> JOBS[Background Workers]
    LC --> OPENAI[Azure OpenAI]
    LC --> SEARCH[Azure AI Search / Vector DB]
    AUTH --> PG[(Postgres)]
    JOBS --> REDIS[(Redis / Queue)]
    JOBS --> N8N[n8n for notifications\nand business workflows]
    style API fill:#dbeafe,stroke:#2563eb
    style LC fill:#dcfce7,stroke:#16a34a
    style N8N fill:#ffe4e6,stroke:#e11d48

# Example backend endpoints
POST /api/chat
POST /api/documents/upload
POST /api/rag/query
POST /api/agents/run
GET /api/agents/{id}/status
GET /api/conversations
GET /api/metrics/usage
POST /api/workflows/trigger

Governance & Safety Layer

Content Filtering

▸ Azure OpenAI content filters (hate, violence, self-harm, sexual)
▸ Prompt injection detection middleware
▸ Output validation before user delivery
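
A prompt injection check can start as a small pattern filter in front of the model. The patterns below are illustrative assumptions, not an exhaustive or recommended list; heuristics like this are easy to bypass and would normally sit alongside the platform's content filters rather than replace them.

```python
import re
from typing import List

# Hypothetical patterns for common injection phrasings.
SUSPICIOUS_PATTERNS = [
    re.compile(r"ignore (all |any )?(previous|prior|above) instructions", re.IGNORECASE),
    re.compile(r"reveal (your|the) system prompt", re.IGNORECASE),
    re.compile(r"you are now\b", re.IGNORECASE),
]


def flag_injection(user_input: str) -> List[str]:
    """Return the patterns that match the input; an empty list means nothing was flagged."""
    return [p.pattern for p in SUSPICIOUS_PATTERNS if p.search(user_input)]


def guard(user_input: str) -> str:
    """Pass clean input through unchanged; raise when any pattern matches."""
    hits = flag_injection(user_input)
    if hits:
        raise ValueError(f"possible prompt injection ({len(hits)} pattern(s) matched)")
    return user_input
```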

Rate Limiting & Budget

▸ Per-user token budgets (e.g., 100K tokens/day)
▸ Max iterations per agent run (prevent infinite loops)
▸ Cost alarms via Azure Cost Management
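
Per-user budgets and iteration caps can both be enforced with a few lines of bookkeeping. The sketch below keeps counters in memory and uses hypothetical names throughout; the 100K daily limit comes from the example above, while the iteration cap of 10 is an assumed value.

```python
from collections import defaultdict
from datetime import date
from typing import Callable, Dict, Optional, Tuple

DAILY_TOKEN_BUDGET = 100_000
MAX_ITERATIONS = 10  # assumed cap, tune per workload


class BudgetExceeded(Exception):
    """Raised when a charge would push a user past the daily limit."""


class TokenBudget:
    def __init__(self, daily_limit: int = DAILY_TOKEN_BUDGET) -> None:
        self.daily_limit = daily_limit
        self._used: Dict[Tuple[str, date], int] = defaultdict(int)

    def charge(self, user_id: str, tokens: int, today: Optional[date] = None) -> int:
        """Record usage for the day and return the tokens remaining."""
        key = (user_id, today or date.today())
        if self._used[key] + tokens > self.daily_limit:
            raise BudgetExceeded(f"{user_id} would exceed {self.daily_limit} tokens")
        self._used[key] += tokens
        return self.daily_limit - self._used[key]


def run_agent(step: Callable[[], bool], max_iterations: int = MAX_ITERATIONS) -> int:
    """Call step() until it returns True; abort once the iteration cap is reached."""
    for iteration in range(1, max_iterations + 1):
        if step():
            return iteration
    raise RuntimeError(f"agent did not finish within {max_iterations} iterations")
```

Keying usage by (user, date) resets the budget each day without a scheduled job; a shared store such as Redis would be needed once more than one backend replica is running.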

Audit Logging

▸ Every tool call logged with input/output
▸ LangSmith / Azure Monitor tracing
▸ Immutable audit trail for compliance
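
Tool-call logging can be added with a decorator, and chaining each entry to the hash of the previous one makes later edits detectable. This is a sketch under stated assumptions: the class and function names are hypothetical, and a hash chain is tamper-evident rather than truly immutable, so compliance setups usually add write-once storage on top.

```python
import functools
import hashlib
import json
from typing import Any, Callable, Dict, List

GENESIS_HASH = "0" * 64


def _digest(body: Dict[str, Any]) -> str:
    """Hash an entry body deterministically (sorted keys, str() for unknown types)."""
    return hashlib.sha256(json.dumps(body, sort_keys=True, default=str).encode()).hexdigest()


class AuditLog:
    def __init__(self) -> None:
        self.entries: List[Dict[str, Any]] = []

    def record(self, tool: str, tool_input: Any, tool_output: Any) -> None:
        """Append an entry linked to the hash of the previous one."""
        prev_hash = self.entries[-1]["hash"] if self.entries else GENESIS_HASH
        body = {"tool": tool, "input": tool_input, "output": tool_output, "prev_hash": prev_hash}
        self.entries.append({**body, "hash": _digest(body)})

    def verify(self) -> bool:
        """Recompute every hash; any edited or reordered entry breaks the chain."""
        prev_hash = GENESIS_HASH
        for entry in self.entries:
            body = {k: v for k, v in entry.items() if k != "hash"}
            if entry["prev_hash"] != prev_hash or entry["hash"] != _digest(body):
                return False
            prev_hash = entry["hash"]
        return True


def audited(log: AuditLog, tool_name: str) -> Callable:
    """Decorator that records each call's arguments and result in the log."""
    def decorator(fn: Callable) -> Callable:
        @functools.wraps(fn)
        def wrapper(*args: Any, **kwargs: Any) -> Any:
            result = fn(*args, **kwargs)
            log.record(tool_name, {"args": args, "kwargs": kwargs}, result)
            return result
        return wrapper
    return decorator


audit_log = AuditLog()


@audited(audit_log, "word_count")
def word_count(text: str) -> int:
    return len(text.split())
```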

Human-in-the-Loop

▸ Checkpoint before irreversible actions (email, delete, deploy)
▸ LangGraph interrupt() for async approvals
▸ Confidence threshold → escalate to human
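
The checkpoint and escalation rules above can be combined into one gate in front of every action. In this sketch the approve callback stands in for whatever approval mechanism is used (for example a LangGraph interrupt or a ticket queue); the action kinds mirror the list above, and the 0.8 threshold is an assumed value.

```python
from dataclasses import dataclass
from typing import Callable

IRREVERSIBLE_ACTIONS = {"email", "delete", "deploy"}
CONFIDENCE_THRESHOLD = 0.8  # assumed value, tune per use case


@dataclass
class ProposedAction:
    kind: str
    confidence: float


def needs_human(action: ProposedAction, threshold: float = CONFIDENCE_THRESHOLD) -> bool:
    """Escalate irreversible actions and anything the agent is unsure about."""
    return action.kind in IRREVERSIBLE_ACTIONS or action.confidence < threshold


def execute(action: ProposedAction, approve: Callable[[ProposedAction], bool]) -> str:
    """Run the action only if it is safe to automate or a human approves it."""
    if needs_human(action) and not approve(action):
        return "rejected"
    return "executed"
```

For example, execute(ProposedAction("delete", 0.99), approve=ask_reviewer) would always consult the hypothetical ask_reviewer callback, because deletion is irreversible regardless of confidence.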
