Live Pilot

Rental Patagonia SPA

Deploying Cortex to capture operational knowledge and build compounding intelligence for heavy equipment service operations in Chile.

Industry: Heavy Equipment
Region: Chile
Duration: 4 Weeks
Deploy: March 16, 2026

The Problem

Universal, not unique.

Every field-based business makes decisions under the same constraints: incomplete information and limited decision-making capacity. The operator doesn’t know which technician is closest. The dispatcher doesn’t know what parts are on the truck. The manager doesn’t know the real cost of running each asset.

The information exists — in someone’s head, in a WhatsApp thread from weeks ago, in a photo that was never labeled. Experience helps, but it’s trapped in one person. It doesn’t transfer. It doesn’t scale. It doesn’t survive turnover.

These problems are industry-agnostic. Heavy equipment, mining, construction, agriculture, logistics — the pattern is the same: people, assets, tasks, and decisions made under uncertainty. The solutions are universal. Only the specific properties change.

This pilot applies that universal intelligence to one specific case — a heavy equipment rental operation in Chile. The problems we solve here transfer to any business where work happens in the field and knowledge lives in people’s heads.

Approach

A live A/B test.

Before (Control)

The operation as it runs today. WhatsApp groups, verbal handoffs, spreadsheets maintained by one person that are outdated the moment they’re created. Knowledge trapped in heads. Decisions based on partial information.

Equipment tracked in stale CSVs, never verified
<5% of operational data systematically captured
Experience leaves when people leave

After (Treatment)

Same team, same WhatsApp groups, same workflows. But now every interaction is captured, enriched, cross-referenced, and made permanently retrievable. Intelligence compounds with every message.

Every interaction automatically structured
AI-enriched with metadata, context, and inference
Knowledge persists forever, compounds over time

By rolling out in weekly phases, each stage acts as its own A/B test. We measure the delta introduced by each new capability in isolation — capture vs. no capture, enrichment vs. raw data, proactive questions vs. passive collection, summarization vs. raw relay. All code is built and tested on local simulations before each phase deploys live.

Architecture

How it works.

Input
WhatsApp Business API

Messages, photos, voice notes, GPS coordinates, timestamps. Zero friction — the team uses WhatsApp exactly as they already do.

Relay
Cortex Ingestion Pipeline

Serverless event-driven pipeline. Every interaction is captured, deduplicated, and structured in real time — messages, media, metadata — before anything reaches the brain.
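The capture, deduplicate, and structure steps can be sketched roughly as follows. This is a minimal illustration, not the Cortex implementation: the payload keys (`id`, `from`, `timestamp`, `type`, `text`) are a simplified, hypothetical shape rather than the actual WhatsApp Business API schema, and the in-memory set stands in for whatever persistent deduplication store a serverless pipeline would use.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class StructuredMessage:
    message_id: str       # provider-assigned ID, used here as the dedup key
    sender: str
    timestamp: int        # Unix seconds
    kind: str             # "text", "image", "audio", "location", ...
    text: Optional[str]


class Ingestor:
    """Drops repeat deliveries and normalizes raw payloads (hypothetical shape)."""

    def __init__(self) -> None:
        # Assumption: a real deployment would keep seen IDs in durable storage.
        self._seen: set[str] = set()

    def ingest(self, raw: dict) -> Optional[StructuredMessage]:
        message_id = raw["id"]
        if message_id in self._seen:
            return None  # duplicate delivery, already processed
        self._seen.add(message_id)
        return StructuredMessage(
            message_id=message_id,
            sender=raw["from"],
            timestamp=int(raw["timestamp"]),
            kind=raw.get("type", "text"),
            text=raw.get("text"),
        )
```

Keying on the message ID makes the step idempotent: if the same event is delivered twice, the second delivery returns `None` and nothing downstream sees it.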

Brain
Claude by Anthropic

Claude is the intelligence core of Cortex. Rather than simply calling an API, we engineer the context Claude works with. We use Claude Code for rapid development, tool use to give Claude access to our database and external sources, and agentic loops for multi-step reasoning: induction (generate possibilities), deduction (eliminate branches), abduction (pick the best explanation), and reflection (loop or commit).

Claude applies business context to raw data — extracting equipment identity from plaque photos, enriching messages with inferred metadata, identifying data gaps, scoring message importance, and generating operational summaries. The goal: a system that understands the current state of the business and makes informed recommendations as context compounds.

Claude Code · Tool Use · Agentic Loops · Context Engineering

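One way to picture the four-step reasoning loop is the sketch below. It is an illustration under stated assumptions, not the production agent: `generate`, `is_consistent`, `score`, and `gather` are hypothetical callables that in practice would each wrap a model call or a tool call, and `commit_threshold` and `max_rounds` are made-up parameters.

```python
from typing import Callable, Optional

Hypothesis = str


def reason(
    generate: Callable[[list[str]], list[Hypothesis]],
    is_consistent: Callable[[Hypothesis, list[str]], bool],
    score: Callable[[Hypothesis, list[str]], float],
    evidence: list[str],
    commit_threshold: float = 0.8,
    max_rounds: int = 3,
    gather: Optional[Callable[[list[str]], list[str]]] = None,
) -> Optional[Hypothesis]:
    """Return the best explanation once it is strong enough, else None."""
    for _ in range(max_rounds):
        # Induction: generate possibilities from the current evidence.
        candidates = generate(evidence)
        # Deduction: eliminate branches that contradict the evidence.
        survivors = [h for h in candidates if is_consistent(h, evidence)]
        if survivors:
            # Abduction: pick the best remaining explanation.
            best = max(survivors, key=lambda h: score(h, evidence))
            # Reflection: commit if the explanation is strong enough.
            if score(best, evidence) >= commit_threshold:
                return best
        # Reflection: otherwise loop with more evidence, if any can be gathered.
        if gather is None:
            break
        evidence = evidence + gather(evidence)
    return None
```

Returning `None` rather than a weak guess leaves room for a follow-up question to a human, which fits the proactive questioning phase of the pilot.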
Store
Knowledge Graph

Properties over columns — every fact stored as a typed row with provenance, confidence score, and evidence trail. The schema never changes when tracking something new. Nothing is ever hard‑deleted.
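A minimal sketch of the properties-over-columns idea, assuming hypothetical field and method names (`Property`, `PropertyStore`, `assert_fact`, `retract`) that are not taken from the Cortex codebase. Each fact is one row carrying its own type, provenance, confidence, and evidence, so tracking a new kind of fact means adding rows rather than altering a schema, and a retraction only flags a row instead of removing it.

```python
from dataclasses import dataclass, field
from typing import Any, Optional


@dataclass
class Property:
    entity_id: str                  # e.g. an equipment identifier
    name: str                       # e.g. "year"
    value: Any
    value_type: str                 # e.g. "int", "str"
    source: str                     # provenance: where the fact came from
    confidence: float               # 0.0 to 1.0
    evidence: list[str] = field(default_factory=list)
    retracted: bool = False         # soft delete flag; rows are never removed


class PropertyStore:
    def __init__(self) -> None:
        self._rows: list[Property] = []

    def assert_fact(self, prop: Property) -> None:
        self._rows.append(prop)

    def retract(self, entity_id: str, name: str, source: str) -> None:
        """Soft-delete matching rows; the history stays intact."""
        for row in self._rows:
            if (row.entity_id, row.name, row.source) == (entity_id, name, source):
                row.retracted = True

    def history(self, entity_id: str, name: str) -> list[Property]:
        return [r for r in self._rows if (r.entity_id, r.name) == (entity_id, name)]

    def current(self, entity_id: str, name: str) -> Optional[Property]:
        """Highest-confidence fact that has not been retracted."""
        live = [r for r in self.history(entity_id, name) if not r.retracted]
        return max(live, key=lambda r: r.confidence, default=None)
```

Resolving conflicts by highest confidence is only one possible policy, chosen here for brevity; recency or source ranking would be equally plausible.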

Output
Operational Briefs

Context-aware summaries delivered to managers. Priority-scored alerts, daily operational state, and decision-support recommendations — delivered right back through WhatsApp.

Compounding

The network effect.

Cortex doesn’t just learn for one company. First-principles rules learned from one operation benefit every operation. A principle about bearing failure at altitude, learned in Chilean mining, helps a construction company in Bolivia without anyone making a phone call.

Every new tenant makes the system smarter for every existing tenant. Every message processed, every pattern recognized, every principle abstracted — the knowledge compounds across the entire network. This is the defensibility: the system gets better the more businesses use it.

Rental Patagonia is the first node in this network. Everything we learn here — about equipment, repairs, scheduling, customer patterns — becomes foundational knowledge that benefits every company that joins after.

Pilot Roadmap

Four weeks. Four phases.

Each week introduces one new capability on top of the last. Each phase is its own A/B test — we measure the delta of each capability in isolation before stacking the next.

01

Capture & Onboard

Mar 16–22
In Progress

Deploy Cortex relay on active WhatsApp groups. Every message, photo, and voice note is captured and structured in real time. Team onboarding with zero behavior change — they keep using WhatsApp exactly as before.

Full message relay to structured data store
Team onboarding and process alignment
Baseline operational metrics capture
Establish data ingestion pipeline health

02

Enrich & Infer

Mar 23–29
AI Layer

Activate Claude-powered enrichment. Photos are analyzed for equipment metadata: make, model, and year are extracted from the equipment plaque. Messages are tagged with GPS coordinates, timestamps, and contextual labels. Live monitoring of operational state begins.

Photo metadata extraction (make/model/year from plaque)
GPS and temporal tagging on all interactions
Equipment identity graph construction
Live business state dashboard

03

Question & Fill

Mar 30–Apr 5
Active Intelligence

Cortex begins identifying data gaps and proactively asking clarifying questions through natural conversation. Missing fields, ambiguous references, and incomplete records are resolved in real time.

Proactive gap detection in operational records
Natural-language follow-up questions via WhatsApp
Cross-reference validation across data sources
Knowledge graph completeness scoring
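Gap detection and completeness scoring could look something like the sketch below. The required field list, the scoring rule (share of required fields present), and the question wording are all assumptions made for illustration; the pilot's actual rules are not described here.

```python
from typing import Optional

# Hypothetical set of fields an equipment record is expected to have.
REQUIRED_FIELDS = ("make", "model", "year", "location")


def find_gaps(record: dict, required: tuple = REQUIRED_FIELDS) -> list[str]:
    """Required fields that are missing or empty in the record."""
    return [name for name in required if record.get(name) in (None, "")]


def completeness(record: dict, required: tuple = REQUIRED_FIELDS) -> float:
    """Share of required fields that are filled in, from 0.0 to 1.0."""
    return 1 - len(find_gaps(record, required)) / len(required)


def follow_up_question(label: str, gaps: list[str]) -> Optional[str]:
    """Turn the gaps into one natural-language question, or None if complete."""
    if not gaps:
        return None
    return f"Quick question about {label}: can you confirm the {', '.join(gaps)}?"
```

Batching all gaps for one piece of equipment into a single question keeps the number of interruptions in the WhatsApp group low.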

04

Summarize & Decide

Apr 6–12
Decision Layer

Replace the live relay with intelligent summaries. Each message is scored for importance and business context. Managers receive structured operational briefs instead of raw message streams.

Message importance scoring and prioritization
Structured operational summaries for management
Business context understanding and application
Decision-support recommendations
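The scoring and prioritization step can be illustrated with the sketch below. In the pilot the importance score comes from Claude with business context; the keyword weights here are a deliberately crude, hypothetical stand-in so that the filtering and ranking logic is runnable on its own.

```python
import re

# Hypothetical keyword weights standing in for a model-produced score.
URGENT_TERMS = {"accident": 0.9, "broken": 0.5, "leak": 0.4, "delay": 0.3}


def importance(text: str) -> float:
    """Score a message from 0.0 to 1.0 based on which urgent terms appear."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    total = sum(weight for term, weight in URGENT_TERMS.items() if term in words)
    return min(1.0, total)


def build_brief(
    messages: list[str], threshold: float = 0.3, limit: int = 5
) -> list[tuple[float, str]]:
    """Keep messages at or above the threshold, most important first."""
    scored = [(importance(m), m) for m in messages]
    kept = [pair for pair in scored if pair[0] >= threshold]
    kept.sort(key=lambda pair: pair[0], reverse=True)
    return kept[:limit]
```

Swapping `importance` for a model call would leave `build_brief` unchanged, which is the point of keeping scoring and selection separate.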

05

Conclude & Publish

Full results, validated metrics, and performance analysis published here. Pilot conclusions inform production deployment strategy.

Baseline Metrics

Prior estimates.

SME assumptions and industry benchmarks. These are our best estimates before live data collection begins. Each metric will be validated and updated weekly as real-world evidence accumulates.

Metric | Before Cortex | After Cortex
Data capture rate | <5%: Informal, nothing systematically recorded | ~95%: Automatic from existing WhatsApp
Equipment ID accuracy | ~30%: Outdated CSVs maintained by one person, never verified, stale on creation | ~85%: AI extraction from plaque photos, cross-verified
Knowledge retention | ~20%: Lost with employee turnover | 100%: Permanently stored, indexed, cross-referenced
Manager situational awareness | ~40%: Manual check-ins throughout day | ~90%: Real-time state + intelligent briefs
Query response time | Hours/days: Relies on tribal knowledge and phone calls | <30 seconds: Instant retrieval from knowledge store

All estimates are subject-matter-expert priors. Updated weekly with observed data as the pilot progresses.
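One simple way to fold weekly observations into a prior estimate is a Beta-Binomial style update, sketched below. This is an assumed technique chosen for illustration, not the pilot's stated method: `prior_weight` is a hypothetical knob expressing how many observations the expert prior is worth, and it only applies to rate metrics such as capture rate or ID accuracy.

```python
def update_estimate(
    prior_rate: float, prior_weight: float, successes: int, trials: int
) -> float:
    """Blend an expert prior with observed counts (Beta-Binomial posterior mean).

    prior_rate   -- the SME estimate as a fraction, e.g. 0.95
    prior_weight -- pseudo-observations the prior is worth (an assumption)
    successes    -- observed successes this period, e.g. messages captured
    trials       -- observed attempts this period, e.g. messages sent
    """
    alpha = prior_rate * prior_weight + successes
    beta = (1 - prior_rate) * prior_weight + (trials - successes)
    return alpha / (alpha + beta)
```

With no observations the function returns the prior unchanged; as trials accumulate, the observed rate dominates and the expert estimate fades.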

Methodology

How we measure.

Automated Instrumentation

Every message processed, every enrichment applied, every question asked, and every summary generated is logged with full metadata. Pipeline health, latency, and accuracy metrics are computed continuously.

Weekly Structured Reviews

End-of-week review with the operations team. Qualitative feedback on system utility and operational impact. Quantitative comparison against baseline metrics.

Phase Isolation

Each weekly phase introduces exactly one new capability. This isolates the effect of each feature — we can attribute changes in metrics directly to the capability that caused them.
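The attribution logic amounts to differencing consecutive weekly snapshots of a metric, as in the sketch below. The function name and the snapshot format (an ordered list of phase name and metric value, starting with the baseline) are assumptions for illustration.

```python
def phase_deltas(snapshots: list[tuple[str, float]]) -> dict[str, float]:
    """Attribute each week-over-week change to the capability added that week.

    snapshots -- ordered (phase_name, metric_value) pairs, baseline first
    """
    deltas: dict[str, float] = {}
    # Pair each snapshot with the one that follows it.
    for (_, previous), (phase, current) in zip(snapshots, snapshots[1:]):
        deltas[phase] = current - previous
    return deltas
```

Because only one capability changes per week, each delta maps to exactly one phase name, and the deltas sum to the total change from baseline to the final phase.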

Simulation-First Deploys

All code is tested on local simulations of real conversation flows before live deployment. Live results validate simulation accuracy and inform the next phase’s deployment parameters.

Results coming April 2026.

Full pilot analysis, validated metrics, and architecture deep-dive will be published here upon completion.