An AI agent built on core agent architecture principles — not a scripted automation. It reads requests, reasons about the best options, uses live tools, remembers outcomes, and improves with every interaction.
Every genuine AI agent shares the same structural components. Here is how each one maps to what our booking agent actually does.
Claude Sonnet 4 and Haiku 4.5 serve as the reasoning core. The agent runs a multi-turn tool-calling loop at four specific decision points — not on every operation. It thinks, calls a tool, reads the result, thinks again.
Claude Sonnet 4 · Claude Haiku 4.5 · ReAct pattern · 4 decision points

Five memory layers are loaded per job. Episodic memory tracks what proposals each reseller's clients accepted or rejected. Semantic memory holds destination-level statistics. A self-updating dictionary records every typo or abbreviation the system has successfully corrected — applied forever after.
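A minimal sketch of how the memory layers might be assembled per job. The `JobMemory` container, the in-memory `STORE`, and all field names are illustrative assumptions, not the production MySQL schema:

```python
from dataclasses import dataclass, field

@dataclass
class JobMemory:
    episodes: list           # episodic: past proposal outcomes for this reseller
    destination_stats: dict  # semantic: per-destination statistics
    corrections: dict        # learned typo/abbreviation dictionary
    rules: dict              # procedural: encoded domain rules
    working: dict = field(default_factory=dict)  # assembled per-job context

# Stand-in for the database tables (illustrative data only).
STORE = {
    "episodes": {"reseller-42": [{"destination": "PAR", "outcome": "accepted"}]},
    "stats": {"PAR": {"avg_nightly_4star": 210.0}},
    "corrections": {"instanbul": "Istanbul"},
    "rules": {"relaxation_order": ["board_basis", "star_rating", "budget"]},
}

def load_job_memory(reseller_id: str, destination: str) -> JobMemory:
    """Assemble the memory layers needed for one booking job."""
    return JobMemory(
        episodes=STORE["episodes"].get(reseller_id, []),
        destination_stats=STORE["stats"].get(destination, {}),
        corrections=STORE["corrections"],
        rules=STORE["rules"],
    )

memory = load_job_memory("reseller-42", "PAR")
```

Loading everything up front, scoped to one reseller and one destination, is what keeps the later LLM calls small and relevant.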
5 memory layers · Per-reseller learning · Self-improving

At DP-1 the agent plans the search strategy: which star ratings, filters, and inventory approach to use, informed by reseller history. At DP-3 it plans the ranking: which hotels best serve this specific client and why.
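The think → act → observe loop that runs at each decision point can be sketched as below. `stub_llm` and `search_inventory` are hypothetical stand-ins for the Claude call and the IRIX tool; the real system invokes this loop only at its decision points, not on every operation:

```python
def run_decision_point(llm, tools, context: dict):
    """Multi-turn tool-calling loop: think, call a tool, read the result,
    think again, until the model returns a final answer."""
    while True:
        step = llm(context)                      # think
        if step["action"] == "final":
            return step["answer"]
        observation = tools[step["action"]](**step.get("args", {}))  # act
        context.setdefault("observations", []).append(observation)   # observe

# Deterministic stand-ins so the loop is runnable without an API key.
def stub_llm(context):
    if not context.get("observations"):
        return {"action": "search_inventory",
                "args": {"city": "Paris", "stars": 4}}
    return {"action": "final",
            "answer": f"{len(context['observations'][0])} hotels found"}

stub_tools = {"search_inventory":
              lambda city, stars: [f"{city} hotel {i}" for i in range(3)]}

answer = run_decision_point(stub_llm, stub_tools, {})
```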
DP-1 Search strategy · DP-3 Hotel ranking

The agent calls IRIX inventory APIs (200+ hotels, real-time pricing), mapping enrichment APIs (photos, addresses, facilities), Gmail (inbound email, threaded replies), and a PDF engine — all orchestrated within the reasoning loop.
IRIX Inventory API · Mapping API · Gmail IMAP/SMTP · PDF (WeasyPrint)

A state machine governs every job through 10 stages. Eleven validation checks run before any agent decision executes. Five independent circuit breakers protect external services. A watchdog detects stuck jobs. Every AI component has a deterministic fallback — zero single points of failure.
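The circuit-breaker pattern protecting each external service can be sketched as follows. This is a minimal illustration of the pattern, with assumed threshold and cooldown values, not the production implementation:

```python
import time

class CircuitBreaker:
    """After `threshold` consecutive failures the circuit opens and calls
    fail fast until `cooldown` seconds pass, shielding the rest of the
    pipeline from a struggling external service."""
    def __init__(self, threshold: int = 3, cooldown: float = 60.0):
        self.threshold = threshold
        self.cooldown = cooldown
        self.failures = 0
        self.opened_at = None

    def call(self, fn, *args, **kwargs):
        if self.opened_at is not None:
            if time.monotonic() - self.opened_at < self.cooldown:
                raise RuntimeError("circuit open: failing fast")
            self.opened_at = None          # cooldown elapsed: half-open retry
        try:
            result = fn(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.failures >= self.threshold:
                self.opened_at = time.monotonic()
            raise
        self.failures = 0                  # success resets the counter
        return result
```

One breaker instance per external service keeps a failure in, say, the mapping API from blocking inventory searches.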
11 guardrails · 5 circuit breakers · 10-stage state machine · Deterministic fallbacks

Unlike a static automation, our agent accumulates knowledge over time — from accepted and rejected proposals, from LLM reasoning outputs, and from its own corrections.
Every accepted or rejected proposal writes an episode to memory — recording the client type, destination, star rating, board basis, and what the reseller decided. Future proposals for similar requests are shaped by this history.
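A sketch of the episode write and the lookup that shapes future proposals. The fields follow the description above, but the exact `reseller_episodes` schema and helper names are assumptions:

```python
from dataclasses import dataclass, asdict

@dataclass
class Episode:
    reseller_id: str
    client_type: str   # e.g. family, corporate, luxury, group, honeymoon
    destination: str
    star_rating: int
    board_basis: str
    decision: str      # "accepted" or "rejected"

def record_outcome(store: list, episode: Episode) -> None:
    """Write one proposal outcome to the episode store."""
    store.append(asdict(episode))

def similar_episodes(store: list, client_type: str, destination: str) -> list:
    """History used to shape future proposals for similar requests."""
    return [e for e in store
            if e["client_type"] == client_type
            and e["destination"] == destination]

store = []
record_outcome(store, Episode("r1", "family", "PMI", 4, "HB", "accepted"))
record_outcome(store, Episode("r1", "corporate", "PAR", 4, "BB", "rejected"))
```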
When Claude ranks hotels or analyses feedback, its reasoning is stored alongside the outcome. If a decision led to a rejected proposal, that reasoning is available as context for future similar situations — the system does not repeat the same mistakes.
Every time the agent successfully corrects a typo, abbreviation, or ambiguous destination name, it writes the correction to a persistent learning dictionary. From that point forward, the correction is applied instantly — no LLM call needed.
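The fast-path/slow-path split can be sketched as below: a dictionary hit skips the model entirely, and every resolved correction is persisted for future emails. `resolve_destination` and the in-memory `learned` dict are illustrative, not the actual `learned_corrections` interface:

```python
learned = {}  # stand-in for the persistent learned_corrections table

def resolve_destination(raw: str, llm_resolve) -> str:
    key = raw.strip().lower()
    if key in learned:                # fast path: known correction, no LLM call
        return learned[key]
    corrected = llm_resolve(raw)      # slow path: ask the model once
    learned[key] = corrected          # persist for all future emails
    return corrected

# Count LLM calls with a deterministic stand-in for Claude.
calls = []
def fake_llm(raw):
    calls.append(raw)
    return "Istanbul"

first = resolve_destination("Instanbul", fake_llm)
second = resolve_destination("instanbul ", fake_llm)  # served from the dictionary
```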
The system maintains running statistics per destination: average nightly price by star rating, acceptance rates, top-performing suppliers, seasonal patterns. These are updated with exponential moving averages after every proposal — not fixed at setup time.
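The EMA update itself is a one-liner; `alpha = 0.2` is an assumed smoothing factor, not the production value:

```python
def ema_update(current: float, observation: float, alpha: float = 0.2) -> float:
    """Exponential moving average: new = alpha * x + (1 - alpha) * old.
    Recent proposals weigh more, so the statistics track seasonal drift."""
    return alpha * observation + (1 - alpha) * current

# Running average nightly price after two new proposals.
avg = 200.0
for price in (220.0, 210.0):
    avg = ema_update(avg, price)
```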
When a reseller requests a revision ("too expensive", "need a pool", "closer to the conference centre"), the agent analyses the feedback at DP-4, extracts the adjustment, and produces a new version. The delta — what changed and why — is recorded and informs future first-draft proposals.
Domain rules — the order in which to relax search constraints, how to interpret client-type keywords, price sanity ranges — are stored as procedural memory. They encode the expertise of experienced travel agents into the agent's decision-making, consistently applied across every proposal.
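Procedural memory of this kind is typically a handful of config constants plus small pure functions. The names and values below are assumptions for illustration, not the actual rule set:

```python
# Assumed encodings of travel-agent expertise.
RELAXATION_ORDER = ["board_basis", "star_rating", "distance", "budget"]
CLIENT_TYPE_KEYWORDS = {
    "honeymoon": ["honeymoon", "anniversary"],
    "corporate": ["conference", "meeting", "business"],
    "family": ["kids", "children", "family"],
}
PRICE_SANITY = {"min_nightly": 30.0, "max_nightly": 5000.0}

def classify_client(text: str) -> str:
    """Map free-text request wording to a client type, deterministically."""
    t = text.lower()
    for client_type, keywords in CLIENT_TYPE_KEYWORDS.items():
        if any(k in t for k in keywords):
            return client_type
    return "standard"
```

Because these rules live in config rather than in a prompt, they are applied identically on every proposal and can be reviewed or changed without touching the model.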
The underlying language model (Claude) has fixed weights and learns nothing from individual interactions. All learning happens in the surrounding system — databases, episode stores, statistical accumulators, the learning dictionary. This means learning is auditable, controllable, and can be reset or corrected at any time.
The agent operates autonomously end-to-end, but the reseller retains full oversight and control — consistent with the MCP sampling flow principle of keeping humans in charge of consequential decisions.
Ten automated steps from inbox to PDF delivery — most are instant; AI is only called where it adds genuine value.
Highlighted steps use Claude AI. All others are deterministic code — instant and free.
Knowledge in our agent comes from three sources: what the LLM already knows from training, what is stored in memory from past interactions, and what is retrieved live from external tools.
| Knowledge Type | Source | What it enables | Where it lives |
|---|---|---|---|
| World knowledge (from LLM training) | Claude's pre-trained weights | Understanding natural language in any language, interpreting informal dates and prices, recognising city names and landmarks ("Tour Eiffel" = near Eiffel Tower), writing coherent recommendation text | Inside the model — not stored externally |
| Episodic memory (from past proposals) | Outcome of every proposal per reseller + client type | Personalising search strategy and ranking for each reseller's specific client patterns — family, corporate, luxury, group, honeymoon | MySQL: reseller_episodes table |
| Semantic memory (from aggregated data) | Statistical patterns across all proposals per destination | Calibrating budget expectations, setting realistic star-rating targets, identifying top-performing suppliers before a search starts | MySQL: destination statistics (EMA-updated) |
| Procedural memory (from domain expertise) | Encoded rules from travel industry best practices | Consistent application of proven strategies — constraint relaxation order, client-type classification, price sanity ranges — across every proposal | Config / code constants |
| Learned corrections (self-improving) | Successful typo / abbreviation resolutions from past emails | Instant correction of recurring input errors without an LLM call — accuracy improves continuously as more emails are processed | MySQL: learned_corrections table |
| Live inventory data (from tools, real-time) | IRIX API — fetched fresh per request | Real-time availability, current pricing, room types, cancellation policies — data that changes daily and cannot be pre-stored | Retrieved via tool call, cached 2h for mapping |
| Hotel enrichment (from tools, real-time) | IRIX Mapping API — photos, descriptions, facilities | Providing the LLM with concrete details to write accurate recommendations — the AI can only describe what it has seen | Retrieved per proposal, injected into LLM context |
The agent does not dump everything into the LLM's context. Working memory is carefully assembled per job: only the relevant episodes, the destination statistics for this city, the enriched hotel details for the shortlisted candidates, and the current request are included. This keeps reasoning sharp, fast, and cost-efficient.
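The assembly step can be sketched as below; the keys and the `max_episodes` cap are assumptions for illustration:

```python
def build_working_memory(request: dict, episodes: list, stats: dict,
                         hotels: list, max_episodes: int = 5) -> dict:
    """Include only what this job needs in the LLM context: the request,
    a handful of relevant episodes, this city's statistics, and enriched
    details for the shortlisted hotels."""
    relevant = [e for e in episodes
                if e["destination"] == request["destination"]][:max_episodes]
    return {
        "request": request,
        "episodes": relevant,
        "destination_stats": stats.get(request["destination"], {}),
        "hotels": hotels,
    }

ctx = build_working_memory(
    {"destination": "PAR", "nights": 3},
    [{"destination": "PAR", "decision": "accepted"},
     {"destination": "ROM", "decision": "rejected"}],   # filtered out
    {"PAR": {"avg_nightly_4star": 210.0}},
    [{"name": "Hotel A", "stars": 4}],
)
```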
Mapped against the standard definition of an AI agent — not a marketing checklist, but concrete operational capabilities.