An AI agent built on core agent architecture principles — not a scripted automation. It reads requests, reasons about the best options, uses live tools, remembers outcomes, and improves with every interaction.
Every genuine AI agent shares the same structural components. Here is how each one maps to what our booking agent actually does.
Claude Sonnet 4 and Haiku 4.5 serve as the reasoning core. The agent runs a multi-turn tool-calling loop at four specific decision points — not on every operation. It thinks, calls a tool, reads the result, thinks again.
Claude Sonnet 4 · Claude Haiku 4.5 · ReAct pattern · 4 decision points

Five memory layers are loaded per job. Episodic memory tracks what proposals each reseller's clients accepted or rejected. Semantic memory holds destination-level statistics. A self-updating dictionary records every typo or abbreviation the system has successfully corrected — applied forever after.
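A minimal sketch of how the memory layers might be assembled per job. The `JobMemory` container, the in-memory `STORE`, and all field names are illustrative assumptions, not the production MySQL schema:

```python
from dataclasses import dataclass, field

@dataclass
class JobMemory:
    episodes: list           # episodic: past proposal outcomes for this reseller
    destination_stats: dict  # semantic: per-destination statistics
    corrections: dict        # learned typo/abbreviation dictionary
    rules: dict              # procedural: encoded domain rules
    working: dict = field(default_factory=dict)  # assembled per-job context

# Stand-in for the database tables (illustrative data only).
STORE = {
    "episodes": {"reseller-42": [{"destination": "PAR", "outcome": "accepted"}]},
    "stats": {"PAR": {"avg_nightly_4star": 210.0}},
    "corrections": {"instanbul": "Istanbul"},
    "rules": {"relaxation_order": ["board_basis", "star_rating", "budget"]},
}

def load_job_memory(reseller_id: str, destination: str) -> JobMemory:
    """Assemble the memory layers needed for one booking job."""
    return JobMemory(
        episodes=STORE["episodes"].get(reseller_id, []),
        destination_stats=STORE["stats"].get(destination, {}),
        corrections=STORE["corrections"],
        rules=STORE["rules"],
    )

memory = load_job_memory("reseller-42", "PAR")
```

Loading everything up front, scoped to one reseller and one destination, is what keeps the later LLM calls small and relevant.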
5 memory layers · Per-reseller learning · Self-improving

At DP-1 the agent plans the search strategy: which star ratings, filters, and inventory approach to use, informed by reseller history. At DP-3 it plans the ranking: which hotels best serve this specific client and why.
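The think → act → observe loop that runs at each decision point can be sketched as below. `stub_llm` and `search_inventory` are hypothetical stand-ins for the Claude call and the IRIX tool; the real system invokes this loop only at its decision points, not on every operation:

```python
def run_decision_point(llm, tools, context: dict):
    """Multi-turn tool-calling loop: think, call a tool, read the result,
    think again, until the model returns a final answer."""
    while True:
        step = llm(context)                      # think
        if step["action"] == "final":
            return step["answer"]
        observation = tools[step["action"]](**step.get("args", {}))  # act
        context.setdefault("observations", []).append(observation)   # observe

# Deterministic stand-ins so the loop is runnable without an API key.
def stub_llm(context):
    if not context.get("observations"):
        return {"action": "search_inventory",
                "args": {"city": "Paris", "stars": 4}}
    return {"action": "final",
            "answer": f"{len(context['observations'][0])} hotels found"}

stub_tools = {"search_inventory":
              lambda city, stars: [f"{city} hotel {i}" for i in range(3)]}

answer = run_decision_point(stub_llm, stub_tools, {})
```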
DP-1 Search strategy · DP-3 Hotel ranking

The agent calls IRIX inventory APIs (200+ hotels, real-time pricing), mapping enrichment APIs (photos, addresses, facilities), Gmail (inbound email, threaded replies), and a PDF engine — all orchestrated within the reasoning loop.
IRIX Inventory API · Mapping API · Gmail IMAP/SMTP · PDF (WeasyPrint)

A state machine governs every job through 10 stages. Eleven validation checks run before any agent decision executes. Five independent circuit breakers protect external services. A watchdog detects stuck jobs. Every AI component has a deterministic fallback — zero single points of failure.
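The circuit-breaker pattern protecting each external service can be sketched as follows. This is a minimal illustration of the pattern, with assumed threshold and cooldown values, not the production implementation:

```python
import time

class CircuitBreaker:
    """After `threshold` consecutive failures the circuit opens and calls
    fail fast until `cooldown` seconds pass, shielding the rest of the
    pipeline from a struggling external service."""
    def __init__(self, threshold: int = 3, cooldown: float = 60.0):
        self.threshold = threshold
        self.cooldown = cooldown
        self.failures = 0
        self.opened_at = None

    def call(self, fn, *args, **kwargs):
        if self.opened_at is not None:
            if time.monotonic() - self.opened_at < self.cooldown:
                raise RuntimeError("circuit open: failing fast")
            self.opened_at = None          # cooldown elapsed: half-open retry
        try:
            result = fn(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.failures >= self.threshold:
                self.opened_at = time.monotonic()
            raise
        self.failures = 0                  # success resets the counter
        return result
```

One breaker instance per external service keeps a failure in, say, the mapping API from blocking inventory searches.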
11 guardrails · 5 circuit breakers · 10-stage state machine · Deterministic fallbacks

Unlike a static automation, our agent accumulates knowledge over time — from accepted and rejected proposals, from LLM reasoning outputs, and from its own corrections.
Every accepted or rejected proposal writes an episode to memory — recording the client type, destination, star rating, board basis, and what the reseller decided. Future proposals for similar requests are shaped by this history.
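A sketch of the episode write and the lookup that shapes future proposals. The fields follow the description above, but the exact `reseller_episodes` schema and helper names are assumptions:

```python
from dataclasses import dataclass, asdict

@dataclass
class Episode:
    reseller_id: str
    client_type: str   # e.g. family, corporate, luxury, group, honeymoon
    destination: str
    star_rating: int
    board_basis: str
    decision: str      # "accepted" or "rejected"

def record_outcome(store: list, episode: Episode) -> None:
    """Write one proposal outcome to the episode store."""
    store.append(asdict(episode))

def similar_episodes(store: list, client_type: str, destination: str) -> list:
    """History used to shape future proposals for similar requests."""
    return [e for e in store
            if e["client_type"] == client_type
            and e["destination"] == destination]

store = []
record_outcome(store, Episode("r1", "family", "PMI", 4, "HB", "accepted"))
record_outcome(store, Episode("r1", "corporate", "PAR", 4, "BB", "rejected"))
```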
When Claude ranks hotels or analyses feedback, its reasoning is stored alongside the outcome. If a decision led to a rejected proposal, that reasoning is available as context for future similar situations — the system does not repeat the same mistakes.
Every time the agent successfully corrects a typo, abbreviation, or ambiguous destination name, it writes the correction to a persistent learning dictionary. From that point forward, the correction is applied instantly — no LLM call needed.
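The fast-path/slow-path split can be sketched as below: a dictionary hit skips the model entirely, and every resolved correction is persisted for future emails. `resolve_destination` and the in-memory `learned` dict are illustrative, not the actual `learned_corrections` interface:

```python
learned = {}  # stand-in for the persistent learned_corrections table

def resolve_destination(raw: str, llm_resolve) -> str:
    key = raw.strip().lower()
    if key in learned:                # fast path: known correction, no LLM call
        return learned[key]
    corrected = llm_resolve(raw)      # slow path: ask the model once
    learned[key] = corrected          # persist for all future emails
    return corrected

# Count LLM calls with a deterministic stand-in for Claude.
calls = []
def fake_llm(raw):
    calls.append(raw)
    return "Istanbul"

first = resolve_destination("Instanbul", fake_llm)
second = resolve_destination("instanbul ", fake_llm)  # served from the dictionary
```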
The system maintains running statistics per destination: average nightly price by star rating, acceptance rates, top-performing suppliers, seasonal patterns. These are updated with exponential moving averages after every proposal — not fixed at setup time.
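The EMA update itself is a one-liner; `alpha = 0.2` is an assumed smoothing factor, not the production value:

```python
def ema_update(current: float, observation: float, alpha: float = 0.2) -> float:
    """Exponential moving average: new = alpha * x + (1 - alpha) * old.
    Recent proposals weigh more, so the statistics track seasonal drift."""
    return alpha * observation + (1 - alpha) * current

# Running average nightly price after two new proposals.
avg = 200.0
for price in (220.0, 210.0):
    avg = ema_update(avg, price)
```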
When a reseller requests a revision ("too expensive", "need a pool", "closer to the conference centre"), the agent analyses the feedback at DP-4, extracts the adjustment, and produces a new version. The delta — what changed and why — is recorded and informs future first-draft proposals.
Domain rules — the order in which to relax search constraints, how to interpret client-type keywords, price sanity ranges — are stored as procedural memory. They encode the expertise of experienced travel agents into the agent's decision-making, consistently applied across every proposal.
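Procedural memory of this kind is typically a handful of config constants plus small pure functions. The names and values below are assumptions for illustration, not the actual rule set:

```python
# Assumed encodings of travel-agent expertise.
RELAXATION_ORDER = ["board_basis", "star_rating", "distance", "budget"]
CLIENT_TYPE_KEYWORDS = {
    "honeymoon": ["honeymoon", "anniversary"],
    "corporate": ["conference", "meeting", "business"],
    "family": ["kids", "children", "family"],
}
PRICE_SANITY = {"min_nightly": 30.0, "max_nightly": 5000.0}

def classify_client(text: str) -> str:
    """Map free-text request wording to a client type, deterministically."""
    t = text.lower()
    for client_type, keywords in CLIENT_TYPE_KEYWORDS.items():
        if any(k in t for k in keywords):
            return client_type
    return "standard"
```

Because these rules live in config rather than in a prompt, they are applied identically on every proposal and can be reviewed or changed without touching the model.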
The underlying language model (Claude) has fixed weights and learns nothing from individual interactions. All learning happens in the surrounding system — databases, episode stores, statistical accumulators, the learning dictionary. This means learning is auditable, controllable, and can be reset or corrected at any time.
The agent operates autonomously end-to-end, but the reseller retains full oversight and control — consistent with the MCP sampling flow principle of keeping humans in charge of consequential decisions.
Ten automated steps from inbox to PDF delivery — most are instant; AI is only called where it adds genuine value.
Highlighted steps use Claude AI. All others are deterministic code — instant and free.
Knowledge in our agent comes from three sources: what the LLM already knows from training, what is stored in memory from past interactions, and what is retrieved live from external tools.
| Knowledge Type | Source | What it enables | Where it lives |
|---|---|---|---|
| World knowledge (from LLM training) | Claude's pre-trained weights | Understanding natural language in any language, interpreting informal dates and prices, recognising city names and landmarks ("Tour Eiffel" = near Eiffel Tower), writing coherent recommendation text | Inside the model — not stored externally |
| Episodic memory (from past proposals) | Outcome of every proposal per reseller + client type | Personalising search strategy and ranking for each reseller's specific client patterns — family, corporate, luxury, group, honeymoon | MySQL: reseller_episodes table |
| Semantic memory (from aggregated data) | Statistical patterns across all proposals per destination | Calibrating budget expectations, setting realistic star-rating targets, identifying top-performing suppliers before a search starts | MySQL: destination statistics (EMA-updated) |
| Procedural memory (from domain expertise) | Encoded rules from travel industry best practices | Consistent application of proven strategies — constraint relaxation order, client-type classification, price sanity ranges — across every proposal | Config / code constants |
| Learned corrections (self-improving) | Successful typo / abbreviation resolutions from past emails | Instant correction of recurring input errors without an LLM call — accuracy improves continuously as more emails are processed | MySQL: learned_corrections table |
| Live inventory data (from tools, real-time) | IRIX API — fetched fresh per request | Real-time availability, current pricing, room types, cancellation policies — data that changes daily and cannot be pre-stored | Retrieved via tool call, cached 2h for mapping |
| Hotel enrichment (from tools, real-time) | IRIX Mapping API — photos, descriptions, facilities | Providing the LLM with concrete details to write accurate recommendations — the AI can only describe what it has seen | Retrieved per proposal, injected into LLM context |
The agent does not dump everything into the LLM's context. Working memory is carefully assembled per job: only the relevant episodes, the destination statistics for this city, the enriched hotel details for the shortlisted candidates, and the current request are included. This keeps reasoning sharp, fast, and cost-efficient.
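The assembly step can be sketched as below; the keys and the `max_episodes` cap are assumptions for illustration:

```python
def build_working_memory(request: dict, episodes: list, stats: dict,
                         hotels: list, max_episodes: int = 5) -> dict:
    """Include only what this job needs in the LLM context: the request,
    a handful of relevant episodes, this city's statistics, and enriched
    details for the shortlisted hotels."""
    relevant = [e for e in episodes
                if e["destination"] == request["destination"]][:max_episodes]
    return {
        "request": request,
        "episodes": relevant,
        "destination_stats": stats.get(request["destination"], {}),
        "hotels": hotels,
    }

ctx = build_working_memory(
    {"destination": "PAR", "nights": 3},
    [{"destination": "PAR", "decision": "accepted"},
     {"destination": "ROM", "decision": "rejected"}],   # filtered out
    {"PAR": {"avg_nightly_4star": 210.0}},
    [{"name": "Hotel A", "stars": 4}],
)
```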
Mapped against the standard definition of an AI agent — not a marketing checklist, but concrete operational capabilities.