PATH 05

Automation & Orchestration

LangGraph state machines, n8n business automation, Temporal durable execution, Airflow ML pipelines, human-in-the-loop (HITL) patterns, and event-driven AI workflows.

01

Orchestration Landscape

Tool | Best For | Complexity | HITL
LangGraph | Agent state machines, branching logic | Medium | ✅ interrupt()
n8n | Low-code business automation, SaaS integrations, webhook flows | Low | ⚠️ manual approval patterns
Temporal.io | Durable long-running workflows | High | ✅ Signals/Queries
Apache Airflow | Batch ML pipelines, DAG scheduling | Medium | ⚠️ sensors only
Azure Durable Functions | Serverless orchestration on Azure | Medium | ✅ External events
Prefect | Python-native MLOps workflow | Low | ⚠️ Limited
Celery + Redis | Task queues, distributed worker pools | Medium | ❌ None
02

n8n for AI Automation

n8n fits the layer between backend services and business process automation. Use it when you need webhook-driven workflows, SaaS integrations, approvals, CRM/email/Slack actions, or scheduled automations without writing every step in Python.

Where n8n is strong

  • Webhook intake: accept form submissions, CRM events, GitHub webhooks, support tickets.
  • SaaS integration hub: Slack, Teams, Gmail, HubSpot, Notion, SharePoint, Google Sheets.
  • Human approval: send approval links/messages before the backend executes a costly or risky action.
  • Low-code ops flows: route incidents, summarize documents, notify teams, fan out work across services.

Where Python backend is still better

  • Complex LangChain/LangGraph orchestration and custom tool logic.
  • RAG indexing, retrieval pipelines, custom ranking and evaluation.
  • Secure API layer, auth, usage tracking, tenant-aware routing, model budgets.
  • Streaming responses, SSE/WebSocket chat, and long-running agent execution.
flowchart LR
  U[User or External Event] --> W[n8n Workflow\nWebhook / Schedule / SaaS Trigger]
  W --> P[Python Backend API\nFastAPI + LangChain]
  P --> A[Azure OpenAI / Agents]
  P --> V[Vector DB / Search]
  P --> DB[(Postgres / Redis)]
  P --> W
  W --> N[Slack / Teams / Email / CRM]
  style W fill:#ffe4e6,stroke:#e11d48
  style P fill:#dbeafe,stroke:#2563eb

Recommended split: n8n for integration and workflow glue, Python for AI reasoning, APIs, data, and secure execution.
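The backend side of this split can be sketched as a small request handler. This is a minimal illustration, assuming the workflow tool calls the Python API over HTTP with a JSON body and a shared-secret header; the header name `X-Workflow-Token`, the `task` field, and `handle_workflow_call` are hypothetical names chosen for the example, not part of n8n or any framework.

```python
import hmac
import json

SHARED_SECRET = "change-me"  # hypothetical; load from a secret store in practice


def handle_workflow_call(headers: dict, raw_body: bytes) -> tuple:
    """Validate a call from the workflow tool and return (status, response)."""
    token = headers.get("X-Workflow-Token", "")
    # Constant-time comparison avoids leaking the secret through timing.
    if not hmac.compare_digest(token.encode(), SHARED_SECRET.encode()):
        return 401, {"error": "unauthorized"}
    try:
        payload = json.loads(raw_body)
    except json.JSONDecodeError:
        return 400, {"error": "invalid JSON"}
    if not isinstance(payload, dict) or "task" not in payload:
        return 422, {"error": "missing 'task' field"}
    # The AI work (LLM call, retrieval, tool use) would run here.
    return 200, {"task": payload["task"], "status": "accepted"}
```

Keeping authentication and validation in Python means the workflow layer only needs to know one URL and one secret, while model access stays behind the API.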

03

LangGraph State Machines

LangGraph models agent workflows as directed graphs. Each node is a function (LLM call, tool use, router). Edges define transitions. State flows through the graph, building up results.
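The node, edge, and state idea can be shown without the library. The sketch below is a plain-Python analogue, not LangGraph's API: each node is a function that mutates a shared state and returns the name of the next node. All node names and the routing rule are illustrative assumptions.

```python
from typing import Callable, TypedDict


class State(TypedDict):
    query: str
    results: list
    answer: str


def route_query(state: State) -> str:
    # Router node: decides which node runs next.
    return "search_web" if "latest" in state["query"] else "query_database"


def search_web(state: State) -> str:
    state["results"].append("web hit")
    return "generate_response"


def query_database(state: State) -> str:
    state["results"].append("db row")
    return "generate_response"


def generate_response(state: State) -> str:
    state["answer"] = f"Answer built from {state['results']}"
    return "END"


NODES: dict = {
    "route_query": route_query,
    "search_web": search_web,
    "query_database": query_database,
    "generate_response": generate_response,
}


def run_graph(state: State, entry: str = "route_query") -> State:
    """Walk the graph until a node returns the END marker."""
    node = entry
    while node != "END":
        node = NODES[node](state)
    return state
```

A real graph library adds what this loop lacks: checkpointing, interrupts, and parallel branches.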

flowchart TD
  START([__start__]) --> INIT[initialize_state]
  INIT --> ROUTER{route_query}
  ROUTER -->|needs_search| SEARCH[search_web]
  ROUTER -->|needs_code| CODE[execute_code]
  ROUTER -->|needs_db| DB[query_database]
  SEARCH --> AGGREGATE[aggregate_results]
  CODE --> AGGREGATE
  DB --> AGGREGATE
  AGGREGATE --> GENERATE[generate_response\nGPT-4o]
  GENERATE --> CRITIQUE{quality_check\nscore >= 0.8?}
  CRITIQUE -->|pass| HITL{human_review\nrequired?}
  CRITIQUE -->|fail - revise| GENERATE
  HITL -->|approved| DELIVER[deliver_answer]
  HITL -->|rejected| INIT
  DELIVER --> END([__end__])
  style ROUTER fill:#fecdd3,stroke:#e11d48
  style CRITIQUE fill:#fef3c7,stroke:#d97706
  style HITL fill:#dbeafe,stroke:#2563eb
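# LangGraph: state definition + human interrupt
from typing import TypedDict

from langgraph.checkpoint.memory import MemorySaver
from langgraph.graph import StateGraph, START, END

class AgentState(TypedDict):
  messages: list[str]
  result: str
  needs_human: bool

def generate_fn(state: AgentState) -> dict:
  # Placeholder: call the LLM here and return a partial state update
  return {"result": "draft answer", "needs_human": True}

def human_review(state: AgentState) -> dict:
  # Runs only after a human resumes the paused graph
  return {"needs_human": False}

builder = StateGraph(AgentState)
builder.add_node("generate", generate_fn)
builder.add_node("human_review", human_review)
builder.add_edge(START, "generate")
builder.add_edge("generate", "human_review")
builder.add_edge("human_review", END)

checkpointer = MemorySaver() # Persist state across HITL pause
graph = builder.compile(checkpointer=checkpointer,
               interrupt_before=["human_review"]) # Pause before this node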
04

Temporal.io — Durable Execution

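Temporal persists workflow state automatically by recording every workflow step in an event history. If a worker crashes mid-execution, even hours into a complex agentic workflow, another worker replays that history to rebuild the state and resumes where the first one stopped. Completed activities are not re-run.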
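The replay idea can be illustrated with a toy model. This is not Temporal's SDK; `ReplayingRunner` and the two stand-in activities are hypothetical names, and the history is an in-memory list rather than a durable store.

```python
from typing import Any, Callable, Optional


class ReplayingRunner:
    """Toy durable execution: activity results are recorded, then replayed."""

    def __init__(self, history: Optional[list] = None) -> None:
        self.history: list = history if history is not None else []
        self._cursor = 0

    def execute(self, activity: Callable[[], Any]) -> Any:
        if self._cursor < len(self.history):
            result = self.history[self._cursor]  # already done: replay the result
        else:
            result = activity()  # first time: run and record
            self.history.append(result)
        self._cursor += 1
        return result


calls: list = []


def search() -> str:
    calls.append("search")
    return "docs"


def summarize() -> str:
    calls.append("summarize")
    return "summary"


def workflow(runner: ReplayingRunner) -> str:
    runner.execute(search)
    return runner.execute(summarize)


# The first worker finishes `search`, then "crashes" before `summarize`.
first = ReplayingRunner()
first.execute(search)

# A second worker replays the saved history, so `search` does not run again.
second = ReplayingRunner(history=list(first.history))
result = workflow(second)
```

The sketch also shows why workflow code must be deterministic: replay only works if the same steps are requested in the same order.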

Why Temporal for Agents?

  • Durability: Workflow state persists across crashes, restarts, deploys.
  • Retries: Configurable retry policies per activity — no custom retry code.
  • Visibility: Web UI shows every workflow, its state, history, and errors.
  • HITL: Workflows can wait indefinitely for external signals (human approval).
  • Scale: Designed to run very large numbers of concurrent long-running workflows, since idle workflows hold no worker resources while they wait.
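# Temporal workflow: agentic research pipeline
from datetime import timedelta

from temporalio import workflow
from temporalio.common import RetryPolicy

# search_web and llm_summarize are @activity.defn functions defined elsewhere

@workflow.defn
class ResearchWorkflow:
  @workflow.run
  async def run(self, topic: str) -> str:
    documents = await workflow.execute_activity(
      search_web, topic,
      start_to_close_timeout=timedelta(minutes=2), # a timeout is required
      retry_policy=RetryPolicy(maximum_attempts=3))
    summary = await workflow.execute_activity(
      llm_summarize, documents,
      start_to_close_timeout=timedelta(minutes=5))
    return summary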
05

Airflow for ML & AI Pipelines

Apache Airflow is a widely used scheduler for batch ML workflows: scheduled DAGs that run data preprocessing, model training, evaluation, and deployment steps.

flowchart LR
  subgraph DAG ["Airflow DAG: rag_pipeline (daily 2am)"]
    T1[extract_documents\nAzure Blob] --> T2[clean_text\nPython]
    T2 --> T3[chunk_text\nRecursive splitter]
    T3 --> T4[generate_embeddings\nAzure OpenAI Ada]
    T4 --> T5[upsert_vectors\nAzure AI Search]
    T5 --> T6[update_index_version\nCosmosDB]
    T6 --> T7[run_eval_suite\nRAGAS benchmarks]
    T7 --> T8{eval_pass?\nscore>0.85}
    T8 -->|yes| T9[notify_success\nTeams webhook]
    T8 -->|no| T10[alert_team\nPagerDuty]
  end
  style T4 fill:#dbeafe,stroke:#2563eb
  style T7 fill:#fff7ed,stroke:#ea580c
# Airflow DAG skeleton for RAG index refresh
from airflow.decorators import dag, task
from pendulum import datetime

# pull_from_blob, embed, and upsert_to_search are project helpers defined elsewhere

@dag(schedule="0 2 * * *", start_date=datetime(2024, 1, 1), catchup=False)
def rag_pipeline():
  @task
  def extract_documents():
    return pull_from_blob()

  @task
  def embed_and_index(docs):
    upsert_to_search(embed(docs))

  embed_and_index(extract_documents())

rag_pipeline() # Instantiate the DAG so the scheduler registers it
06

Event-Driven AI Pipelines

Trigger agent workflows from events (document uploaded, ticket created, metric threshold crossed) rather than polling or schedules.

flowchart LR
  subgraph TRIGGERS ["Event Sources"]
    E1[📁 Blob Storage\nnew doc uploaded]
    E2[📧 Email\nnew customer query]
    E3[📊 Azure Monitor\nalert fired]
    E4[🔗 Webhook\nGitHub PR opened]
  end
  BUS[Azure Service Bus\nor Event Grid]
  subgraph AGENTS ["Agent Workflows"]
    A1[Document Processor\nextract + index]
    A2[Support Agent\nclassify + respond]
    A3[Incident Agent\ndiagnose + remediate]
    A4[Code Review Agent\nreview + comment]
  end
  E1 --> BUS
  E2 --> BUS
  E3 --> BUS
  E4 --> BUS
  BUS --> A1
  BUS --> A2
  BUS --> A3
  BUS --> A4
  style BUS fill:#fdf4ff,stroke:#a855f7
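The fan-out from event sources to agent workflows can be sketched with an in-process bus. This is a simplified stand-in for a managed broker such as Service Bus or Event Grid; the `EventBus` class, the event type strings, and the two handlers are hypothetical.

```python
from typing import Callable

Handler = Callable[[dict], str]


class EventBus:
    """Minimal publish/subscribe bus keyed by event type."""

    def __init__(self) -> None:
        self._handlers: dict = {}

    def subscribe(self, event_type: str, handler: Handler) -> None:
        self._handlers.setdefault(event_type, []).append(handler)

    def publish(self, event_type: str, payload: dict) -> list:
        # Every subscriber for this event type receives the payload.
        return [handler(payload) for handler in self._handlers.get(event_type, [])]


def document_processor(event: dict) -> str:
    return f"indexed {event['name']}"


def code_review_agent(event: dict) -> str:
    return f"reviewed PR #{event['number']}"


bus = EventBus()
bus.subscribe("blob.uploaded", document_processor)
bus.subscribe("github.pr_opened", code_review_agent)
```

Producers only know the event type, so new agents can be added by subscribing without touching the event sources.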

Failure Recovery Strategies

Retry with Exponential Backoff

Wait 1s → 2s → 4s → 8s between retries. Add jitter to avoid thundering herd.
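A minimal sketch of this policy, using the "full jitter" variant in which each wait is drawn uniformly between zero and the exponential cap. The function name, defaults, and the injectable `sleep` parameter are assumptions made for the example.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry_with_backoff(
    fn: Callable[[], T],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    max_delay: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fn, retrying on any exception with capped exponential backoff."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        try:
            return fn()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            cap = min(max_delay, base_delay * 2 ** attempt)  # 1s, 2s, 4s, 8s, ...
            sleep(random.uniform(0, cap))  # jitter spreads out concurrent retries
    raise AssertionError("unreachable")
```

In production code the `except` clause would usually be narrowed to transient errors such as timeouts and rate limits, so that permanent failures are not retried.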

Circuit Breaker

Stop calling a failing service after N failures. Re-probe after cooldown period.
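One way to sketch a breaker, assuming consecutive failures are counted and a single probe call is allowed once the cooldown has elapsed. The class name, thresholds, and injectable `clock` are illustrative choices.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


class CircuitOpenError(RuntimeError):
    """Raised when a call is skipped because the circuit is open."""


class CircuitBreaker:
    def __init__(
        self,
        failure_threshold: int = 3,
        cooldown: float = 30.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.failure_threshold = failure_threshold
        self.cooldown = cooldown
        self.clock = clock
        self.failures = 0
        self.opened_at: Optional[float] = None

    def call(self, fn: Callable[[], T]) -> T:
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.cooldown:
                raise CircuitOpenError("circuit open; skipping call")
            # Cooldown elapsed: fall through and let one probe call run.
        try:
            result = fn()
        except Exception:
            self.failures += 1
            if self.opened_at is not None or self.failures >= self.failure_threshold:
                self.opened_at = self.clock()  # open, or re-open after a failed probe
            raise
        self.failures = 0  # success closes the circuit
        self.opened_at = None
        return result
```

This version is not thread-safe; a shared breaker would need a lock around the state updates.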

Dead Letter Queue

Failed messages land in DLQ for manual inspection. Prevents data loss.
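A small in-memory sketch of the idea, assuming a message is retried a fixed number of times before being parked. Managed brokers provide this natively; the `consume` function and the dead-letter record fields are hypothetical.

```python
from collections import deque
from typing import Callable


def consume(
    messages: list,
    handler: Callable[[dict], None],
    max_attempts: int = 3,
) -> deque:
    """Process messages; park those that keep failing in a dead letter queue."""
    dead_letters: deque = deque()
    for message in messages:
        for attempt in range(1, max_attempts + 1):
            try:
                handler(message)
                break  # handled successfully, move to the next message
            except Exception as exc:
                if attempt == max_attempts:
                    # Keep the payload and the error so a human can inspect it.
                    dead_letters.append(
                        {"message": message, "error": repr(exc), "attempts": attempt}
                    )
    return dead_letters
```

The key property is that one poison message does not block the rest of the queue and is not silently dropped.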

Saga Pattern

Each step has a compensating action. On failure, unwind completed steps in reverse.
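The unwind logic can be sketched in a few lines, assuming each step is supplied as an (action, compensation) pair. `run_saga` is a hypothetical helper, and real compensations would also need their own error handling.

```python
from typing import Callable

Step = tuple  # (action, compensation), both zero-argument callables


def run_saga(steps: list) -> bool:
    """Run steps in order; on failure, undo completed steps in reverse order."""
    completed: list = []
    for action, compensate in steps:
        try:
            action()
        except Exception:
            for undo in reversed(completed):
                undo()
            return False
        completed.append(compensate)
    return True
```

Note that the failing step's own compensation is not run, because its action never completed.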

Cost & Resource Controls

Budget

Set per-workflow token budgets. Fail gracefully when exceeded instead of running up costs.
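A sketch of a budget guard, assuming each step can estimate its token use before it runs. `TokenBudget`, `run_steps`, and the step tuples are hypothetical; a real system would charge actual usage reported by the model API.

```python
class BudgetExceeded(RuntimeError):
    """Raised when a charge would push usage past the limit."""


class TokenBudget:
    def __init__(self, limit: int) -> None:
        self.limit = limit
        self.used = 0

    def charge(self, tokens: int) -> None:
        if self.used + tokens > self.limit:
            raise BudgetExceeded(f"budget of {self.limit} tokens would be exceeded")
        self.used += tokens


def run_steps(steps: list, budget: TokenBudget) -> dict:
    """Run (name, estimated_tokens) steps until the budget runs out."""
    done: list = []
    for name, estimated_tokens in steps:
        try:
            budget.charge(estimated_tokens)
        except BudgetExceeded:
            # Fail gracefully: report partial progress instead of crashing.
            return {"status": "partial", "completed": done, "tokens_used": budget.used}
        done.append(name)
    return {"status": "complete", "completed": done, "tokens_used": budget.used}
```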

Batching

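Batch embedding calls (the OpenAI embeddings API accepts up to 2,048 inputs per request) instead of sending one request per text. Token cost stays the same; the saving is in request count, rate-limit pressure, and total latency.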
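Batching reduces to chunking the input list and calling the embedding endpoint once per chunk. In this sketch `embed_batch` is a hypothetical callable standing in for the provider call, and the default `batch_size` is an assumption to adjust to the provider's documented limit.

```python
from typing import Callable, Iterator


def batched(texts: list, batch_size: int = 2048) -> Iterator[list]:
    """Yield consecutive slices of at most batch_size texts."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(texts), batch_size):
        yield texts[start:start + batch_size]


def embed_all(
    texts: list,
    embed_batch: Callable[[list], list],
    batch_size: int = 2048,
) -> list:
    """Embed all texts with one call per batch, preserving input order."""
    vectors: list = []
    for batch in batched(texts, batch_size):
        vectors.extend(embed_batch(batch))
    return vectors
```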

Caching

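Cache LLM responses: an exact-match cache for identical prompts, and a semantic cache (e.g., GPTCache) for near-duplicate prompts. The quoted 30-70% cost reduction depends entirely on the cache hit rate, so measure it on your own traffic.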
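The simplest form is an exact-match cache keyed by a hash of the prompt. This sketch is not a semantic cache and does not use GPTCache; `ExactMatchCache` and the whitespace normalization are assumptions, and `llm` is any callable that maps a prompt to a response.

```python
import hashlib
from typing import Callable


class ExactMatchCache:
    """Return a stored response when the same prompt is seen again."""

    def __init__(self, llm: Callable[[str], str]) -> None:
        self._llm = llm
        self._store: dict = {}
        self.hits = 0
        self.misses = 0

    @staticmethod
    def _key(prompt: str) -> str:
        # Collapse whitespace so trivially different prompts share a key.
        normalized = " ".join(prompt.split())
        return hashlib.sha256(normalized.encode("utf-8")).hexdigest()

    def complete(self, prompt: str) -> str:
        key = self._key(prompt)
        if key in self._store:
            self.hits += 1
            return self._store[key]
        self.misses += 1
        response = self._llm(prompt)  # only cache misses reach the model
        self._store[key] = response
        return response
```

A semantic cache replaces the hash lookup with a nearest-neighbour search over prompt embeddings, which raises the hit rate at the risk of returning an answer to a subtly different question.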

Model Routing

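Route simple tasks to gpt-4o-mini ($0.15/1M input tokens) and complex ones to gpt-4o ($5/1M input tokens) using a classifier. Both figures are input-token list prices and change over time, so check current pricing.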
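A router can start as a cheap heuristic before graduating to a trained classifier. The keyword list and word-count threshold below are illustrative assumptions, not tuned values; only the two model names come from the text.

```python
COMPLEX_HINTS = ("analyze", "prove", "refactor", "multi-step", "compare")


def choose_model(prompt: str, max_simple_words: int = 60) -> str:
    """Pick a model name from prompt length and a few complexity keywords."""
    text = prompt.lower()
    is_long = len(text.split()) > max_simple_words
    has_hint = any(hint in text for hint in COMPLEX_HINTS)
    return "gpt-4o" if (is_long or has_hint) else "gpt-4o-mini"
```

Logging the router's decision next to an output quality score makes it possible to check later whether the cheaper model was actually good enough.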

07

Observability Stack for AI

🔭
LangSmith
Trace every LangChain/LangGraph run. Step-by-step input/output visibility.
📊
Prometheus + Grafana
Latency, token usage, error rates. Custom dashboards on K8s. Alerts via AlertManager.
🌊
Azure Monitor
App Insights for distributed tracing. Correlated logs, dependency maps, anomaly detection.
📈
RAGAs / RAGAS
Evaluate RAG quality: faithfulness, answer relevancy, context recall. Automated regression suite.