System Capabilities Overview

From client email to branded proposal in minutes

An AI agent built on core agent architecture principles — not a scripted automation. It reads requests, reasons about the best options, uses live tools, remembers outcomes, and improves with every interaction.

2–3 min per proposal · 200+ hotels searched · 5 memory layers per agent · 4 reasoning decision points · 24/7 autonomous operation · $0.05 avg AI cost
Core Architecture

Anatomy of our AI agent

Every genuine AI agent shares the same structural components. Here is how each one maps to what our booking agent actually does.

Anatomy of an AI Agent diagram
Standard AI agent anatomy — all five components are present in our system
LLM Reasoning Engine
Think → decide → act → observe (ReAct loop)

Claude Sonnet 4 and Haiku 4.5 serve as the reasoning core. The agent runs a multi-turn tool-calling loop at four specific decision points — not on every operation. It thinks, calls a tool, reads the result, thinks again.

Claude Sonnet 4 Claude Haiku 4.5 ReAct pattern 4 decision points
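The think → act → observe cycle can be sketched as a small control loop. This is a minimal illustration, not the production interface: the `call_llm` helper, the tool-call message shape, and the turn limit are all assumptions.

```python
# Minimal sketch of the ReAct control loop run at a decision point.
# `call_llm`, the reply shape, and the turn limit are illustrative.

def react_loop(request, tools, call_llm, max_turns=5):
    """Think -> call a tool -> observe the result -> think again."""
    messages = [{"role": "user", "content": request}]
    for _ in range(max_turns):
        reply = call_llm(messages, tools)          # model thinks
        if reply["type"] == "final_answer":        # no more tools needed
            return reply["content"]
        tool = tools[reply["tool_name"]]           # model chose a tool
        observation = tool(**reply["arguments"])   # act
        messages.append({"role": "assistant", "content": str(reply)})
        messages.append({"role": "tool", "content": str(observation)})  # observe
    raise RuntimeError("decision point did not converge")
```

The key property is that the loop runs only at the four decision points, not on every operation, so most of the pipeline stays deterministic.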
Memory
Working · Episodic · Semantic + self-improving dictionary

Five memory layers are loaded per job. Episodic memory tracks what proposals each reseller's clients accepted or rejected. Semantic memory holds destination-level statistics. A self-updating dictionary records every typo or abbreviation the system has successfully corrected — applied forever after.

5 memory layers Per-reseller learning Self-improving
Planning
Task decomposition at DP-1 and DP-3

At DP-1 the agent plans the search strategy: which star ratings, filters, and inventory approach to use, informed by reseller history. At DP-3 it plans the ranking: which hotels best serve this specific client and why.

DP-1 Search strategy DP-3 Hotel ranking
Tools
APIs, live inventory, email, PDF generation

The agent calls IRIX inventory APIs (200+ hotels, real-time pricing), mapping enrichment APIs (photos, addresses, facilities), Gmail (inbound email, threaded replies), and a PDF engine — all orchestrated within the reasoning loop.

IRIX Inventory API Mapping API Gmail IMAP/SMTP PDF (WeasyPrint)
Orchestration + Guardrails
Control loop · 11 validation checks · human-in-the-loop

A state machine governs every job through 10 stages. Eleven validation checks run before any agent decision executes. Five independent circuit breakers protect external services. A watchdog detects stuck jobs. Every AI component has a deterministic fallback — zero single points of failure.

11 guardrails 5 circuit breakers 10-stage state machine Deterministic fallbacks
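The circuit-breaker idea behind those five protectors can be sketched in a few lines. This is a generic illustration under assumed thresholds, not the production implementation:

```python
# Illustrative circuit breaker guarding one external service.
# Threshold and reset behaviour are assumptions, not the real config.

class CircuitBreaker:
    def __init__(self, failure_threshold=5):
        self.failures = 0
        self.threshold = failure_threshold
        self.open = False  # open = stop calling the service, use the fallback

    def call(self, fn, fallback, *args, **kwargs):
        if self.open:
            return fallback(*args, **kwargs)
        try:
            result = fn(*args, **kwargs)
            self.failures = 0  # a success resets the counter
            return result
        except Exception:
            self.failures += 1
            if self.failures >= self.threshold:
                self.open = True  # trip after repeated failures
            return fallback(*args, **kwargs)
```

Because every call site supplies a deterministic fallback, a tripped breaker degrades the service rather than failing the job — the "zero single points of failure" property.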
Intelligence That Grows

How the system learns from every interaction

Unlike a static automation, our agent accumulates knowledge over time — from accepted and rejected proposals, from LLM reasoning outputs, and from its own corrections.

📥

Learning from outcomes

Every accepted or rejected proposal writes an episode to memory — recording the client type, destination, star rating, board basis, and what the reseller decided. Future proposals for similar requests are shaped by this history.

e.g. "This agency's family clients accept 4-star BB proposals 80% of the time → start search there, not at 5-star."
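That "start the search where this reseller's clients actually accept" behaviour can be sketched as a small aggregation over stored episodes. Field names (`client_type`, `stars`, `outcome`) are hypothetical stand-ins for the real schema:

```python
# Hypothetical sketch: pick a starting star rating from episodic history.
# Episode field names are illustrative, not the real schema.

def starting_star_rating(episodes, client_type, default=5):
    """Start the search at the star rating this client type accepts most often."""
    accepted, total = {}, {}
    for ep in episodes:
        if ep["client_type"] != client_type:
            continue
        stars = ep["stars"]
        total[stars] = total.get(stars, 0) + 1
        if ep["outcome"] == "accepted":
            accepted[stars] = accepted.get(stars, 0) + 1
    if not total:
        return default  # no history yet: fall back to a deterministic default
    return max(total, key=lambda s: accepted.get(s, 0) / total[s])
```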
🧠

Learning from LLM reasoning

When Claude ranks hotels or analyses feedback, its reasoning is stored alongside the outcome. If a decision led to a rejected proposal, that reasoning is available as context for future similar situations — the system does not repeat the same mistakes.

e.g. "Hotel ranked #1 because of Eiffel Tower proximity → client rejected — lesson stored: this reseller's clients prefer business district locations."
📝

Learning from corrections

Every time the agent successfully corrects a typo, abbreviation, or ambiguous destination name, it writes the correction to a persistent learning dictionary. From that point forward, the correction is applied instantly — no LLM call needed.

e.g. "Buc" → "Bucharest" learned once from the first email, applied to every future request forever.
📊

Destination knowledge (semantic)

The system maintains running statistics per destination: average nightly price by star rating, acceptance rates, top-performing suppliers, seasonal patterns. These are updated with exponential moving averages after every proposal — not fixed at setup time.

e.g. "Paris 4-star average = €180/night" → agent calibrates budget expectations before searching, avoiding proposals that will always be out of range.
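An exponential moving average update is a one-liner; the blending factor here is illustrative, the production weighting may differ:

```python
# EMA update for a running destination statistic.
# The alpha value is an illustrative assumption.

def ema_update(current_avg, new_value, alpha=0.1):
    """Blend each new observation into the running average."""
    if current_avg is None:
        return new_value  # the first observation seeds the average
    return (1 - alpha) * current_avg + alpha * new_value
```

With alpha = 0.1, a running €180/night average nudged by a €220 proposal becomes €184 — recent data moves the estimate without erasing history.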
🔄

Learning from revision feedback

When a reseller requests a revision ("too expensive", "need a pool", "closer to the conference centre"), the agent analyses the feedback at DP-4, extracts the adjustment, and produces a new version. The delta — what changed and why — is recorded and informs future first-draft proposals.

e.g. Three consecutive 'too expensive' revisions → agent learns to start 15% below stated budget for this reseller.
⚙️

Procedural knowledge

Domain rules — the order in which to relax search constraints, how to interpret client-type keywords, price sanity ranges — are stored as procedural memory. They encode the expertise of experienced travel agents into the agent's decision-making, consistently applied across every proposal.

e.g. Zero results → relax star rating first, then widen dates, then drop board requirement — never drop location first.
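That fixed relaxation order can be sketched as a retry loop over a priority list. The `run_search` callable and constraint field names are illustrative:

```python
# Sketch of the zero-result recovery order from procedural memory:
# relax star rating first, then dates, then board — location is never
# dropped. `run_search` and the field names are assumptions.

RELAXATION_ORDER = ["stars", "dates", "board"]

def search_with_relaxation(constraints, run_search, max_retries=3):
    relaxed = dict(constraints)
    for field in RELAXATION_ORDER[:max_retries]:
        results = run_search(relaxed)
        if results:
            return results, relaxed
        relaxed.pop(field, None)  # drop the next constraint in priority order
    return run_search(relaxed), relaxed
```

Encoding the order as data rather than prompt text makes the rule auditable and guarantees it is applied identically on every retry.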
Important: the model does not learn — the system does

The underlying language model (Claude) has fixed weights and learns nothing from individual interactions. All learning happens in the surrounding system — databases, episode stores, statistical accumulators, the learning dictionary. This means learning is auditable, controllable, and can be reset or corrected at any time.

Human Oversight

Human-in-the-loop at every critical step

The agent operates autonomously end-to-end, but the reseller retains full oversight and control — consistent with the MCP sampling flow principle of keeping humans in charge of consequential decisions.

MCP Sampling Flow diagram
MCP sampling flow — server requests LLM help, human reviews at each checkpoint
1
Agent builds and sends the proposal
The system completes the full pipeline autonomously — extraction, search, ranking, recommendation, PDF generation — then places the proposal in PENDING_REVIEW on the dashboard. No action is taken until the reseller reviews it.
2
Reseller reviews on the dashboard
The reseller sees the full proposal — hotel cards with images, room matrix, AI recommendation text, pricing — and has three choices: approve & send to client, reject, or request a revision with written feedback.
3
Agent asks when information is missing
If the incoming email lacks a destination or critical detail, the agent pauses (NEEDS_INFO state) and asks a targeted question via the dashboard rather than guessing. The pipeline resumes only once the reseller responds.
4
Client delivery is always reseller-initiated
The agent never sends anything directly to the end client. The reseller explicitly approves the proposal and initiates the client send — maintaining the agency relationship and ensuring brand control.
5
Full visibility and audit trail
Every API call, LLM call, state transition, cost, and duration is logged in the event timeline. Resellers can see exactly what the agent did, when, and why — there are no black-box decisions.
End-to-End Pipeline

How a request becomes a proposal

Ten automated steps from inbox to PDF delivery — most are instant; AI is only called where it adds genuine value.

01
Classify
Fuzzy-match subject line — booking or noise?
02
Extract
AI reads dates, rooms, budget, stars from natural language
03
Map
Resolve "Paris" to IRIX city & country IDs
04
Search
Live inventory — 50–200 hotels with real-time pricing
05
Rank
AI selects the best 5 hotels for this specific client
06
Enrich
Add photos, address, facilities, map links
07
Write
AI crafts personalised recommendation text
08–10
Deliver
PDF built, emailed to reseller, dashboard notified

Steps 02 (Extract), 05 (Rank) and 07 (Write) use Claude AI. All others are deterministic code — instant and free.
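Step 01's fuzzy classification can be illustrated with the standard library. The keyword list and threshold here are assumptions, not the real configuration:

```python
# Illustrative step-01 classifier: fuzzy-match the subject line against
# booking keywords. The keyword list and threshold are assumptions.

import difflib

BOOKING_KEYWORDS = ["booking request", "hotel request", "quote", "proposal"]

def is_booking_email(subject, threshold=0.7):
    text = subject.lower()
    return any(
        kw in text
        or difflib.SequenceMatcher(None, text, kw).ratio() >= threshold
        for kw in BOOKING_KEYWORDS
    )
```

Because this runs in deterministic code (Tier 1), noise emails are filtered out before any LLM cost is incurred.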

Technical Stack

What it's built on

Backend (Python 3.14)
Framework: FastAPI + Uvicorn (async + SSE)
AI models: Claude Sonnet 4 + Haiku 4.5
AI pattern: ReAct (multi-turn tool-calling)
Database: MySQL 8 — 12 tables (async)
Hotel inventory: IRIX by DCS Plus (dual OAuth2)
PDF generation: WeasyPrint + Jinja2 templates
Email: Gmail IMAP (inbound) + SMTP
Frontend (Vue 3 + TypeScript)
Framework: Vue 3 Composition API
Build: Vite
State: Pinia (6 stores)
Styling: Tailwind CSS 3
Charts: Chart.js + vue-chartjs
Real-time: SSE with auto-reconnect
Auth (roadmap): JWT / OAuth2 (Phase 2)
Knowledge Architecture

What the system knows and where it comes from

Knowledge in our agent comes from three sources: what the LLM already knows from training, what is stored in memory from past interactions, and what is retrieved live from external tools.

Knowledge type · Source · What it enables · Where it lives

World knowledge (from LLM training) — Source: Claude's pre-trained weights. Enables: understanding natural language in any language, interpreting informal dates and prices, recognising city names and landmarks ("Tour Eiffel" = near Eiffel Tower), writing coherent recommendation text. Lives: inside the model — not stored externally.

Episodic memory (from past proposals) — Source: outcome of every proposal per reseller and client type. Enables: personalising search strategy and ranking for each reseller's specific client patterns — family, corporate, luxury, group, honeymoon. Lives: MySQL, reseller_episodes table.

Semantic memory (from aggregated data) — Source: statistical patterns across all proposals per destination. Enables: calibrating budget expectations, setting realistic star-rating targets, identifying top-performing suppliers before a search starts. Lives: MySQL, destination statistics (EMA-updated).

Procedural memory (from domain expertise) — Source: encoded rules from travel industry best practices. Enables: consistent application of proven strategies — constraint relaxation order, client-type classification, price sanity ranges — across every proposal. Lives: config / code constants.

Learned corrections (self-improving) — Source: successful typo / abbreviation resolutions from past emails. Enables: instant correction of recurring input errors without an LLM call — accuracy improves continuously as more emails are processed. Lives: MySQL, learned_corrections table.

Live inventory data (from tools, real-time) — Source: IRIX API, fetched fresh per request. Enables: real-time availability, current pricing, room types, cancellation policies — data that changes daily and cannot be pre-stored. Lives: retrieved via tool call, cached 2h for mapping.

Hotel enrichment (from tools, real-time) — Source: IRIX Mapping API — photos, descriptions, facilities. Enables: providing the LLM with concrete details to write accurate recommendations — the AI can only describe what it has seen. Lives: retrieved per proposal, injected into LLM context.
Context engineering — curating what the LLM sees

The agent does not dump everything into the LLM's context. Working memory is carefully assembled per job: only the relevant episodes, the destination statistics for this city, the enriched hotel details for the shortlisted candidates, and the current request are included. This keeps reasoning sharp, fast, and cost-efficient.
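A per-job context assembly of that kind might look like the sketch below. The store shapes, field names, and the 10-episode cap are all illustrative assumptions:

```python
# Sketch of per-job working-memory assembly: only the slices relevant to
# this request enter the LLM context. Store shapes, field names, and the
# episode cap are illustrative assumptions.

def build_context(request, episode_store, destination_stats, shortlist):
    return {
        "request": request,
        # only this reseller's episodes for this client type, most recent first
        "episodes": episode_store.get(
            (request["reseller_id"], request["client_type"]), []
        )[-10:],
        # only this city's statistics
        "destination": destination_stats.get(request["city"], {}),
        # only the shortlisted hotels' enriched details
        "hotels": shortlist,
    }
```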

Tiered intelligence — AI only where it earns its cost

70%
Tier 1
Deterministic code
Email classification, date parsing, room splitting, budget calculation, state-machine transitions, circuit breakers, retry logic, caching, idempotency checks
$0.00
Instant
100% reliable
20%
Tier 2
Structured AI calls
Natural language extraction (Claude Sonnet), location disambiguation (Claude Haiku), personalised recommendation text — single-turn, structured JSON output, schema-validated
~$0.02
3–5 seconds
99% reliable
10%
Tier 3
Agent reasoning (ReAct loop)
Search strategy (DP-1), hotel ranking with context (DP-3), feedback analysis (DP-4), self-healing recovery (DP-5) — multi-turn, tool-calling, with 11 guardrail checks per decision
~$0.03–$0.13
3–40 seconds
Fallback ready
Summary

What makes this a real AI agent

Mapped against the standard definition of an AI agent — not a marketing checklist, but concrete operational capabilities.

Autonomy
Processes email requests end-to-end without human instruction — 24/7, including overnight and weekends. 200+ daily proposals with 3 workers.
Tool use
Calls live IRIX inventory, mapping enrichment APIs, reads hotel details, and triggers email and PDF delivery — all within the reasoning loop.
Multi-step reasoning
Four decision points where Claude runs a multi-turn tool-calling loop (ReAct) — searching, filtering, comparing, deciding — as part of a single user request.
Self-healing
DP-5 diagnoses zero-result searches and relaxes constraints in a priority order — star rating → dates → board → location → budget — retrying automatically up to 3 times.
Multi-turn conversation
NEEDS_INFO state pauses the pipeline and asks the reseller a targeted question. Revision cycle (DP-4) incorporates feedback and produces an adjusted proposal — up to 3 rounds.
Memory and learning
Five memory layers. Episodic, semantic, procedural, working, and a self-improving correction dictionary. Every interaction makes future proposals more accurate and personalised.
Guardrails
11 validation checks before every agent decision executes. Daily LLM budget cap. Every AI component has a deterministic fallback. No single point of failure.
Human-in-the-loop
Reseller reviews every proposal before it reaches the client. Full event log, version history, and audit trail. The agent never sends to the end client without explicit reseller approval.