Protecting Your Data in the AI Era: Separate the Treasure from the Agents

🇫🇷 Lire en français : Protéger ses données face à l'IA : trésor et agents
Everyone has jumped on AI agents, and for good reason: beyond the hype, they produce value. But that value is built partly at the expense of another — the value of your data. In the AI era, the question is no longer just “which model?” but “who holds my treasure?”. Here’s the problem, and the first concrete answer I deployed for a teacher here: separate the treasure — your data — from the agents — the intelligence — and keep control of both.
AI is not magic: it’s plain IT

Important
AI doesn’t create new fundamental questions: it makes the old ones — ownership, confidentiality, data backup — far sharper, all at once.
We talk about AI as a break. On the usage side, it is one. But on the data side, these are the same IT questions I’ve been asking for 25 years, simply asked louder. Who owns the data? Where does it live? Who can access it? How do we restore it when something goes wrong?
What’s new is that AI handles three layers of data at once: your company’s, your clients’, and the privacy of the people who appear in your files — data privacy. Three responsibilities at once, on tools whose server and terms you don’t control.
The real question isn’t intelligence, it’s value

If everyone adopts agents, it’s because they produce value. The problem is that this value is built partly with yours. Many users dump their data — and their clients’ — shamelessly, to save time or produce a nice report. Often those documents even stay stored in the AI’s cloud: not only are they exposed to international behemoths, but you run the risk of losing everything if the provider cuts your access — losing everything on your side, while leaving everything on theirs.
I’ll be told the engines aren’t supposed to train on this data. That’s what they claim, and it’s probably true on professional plans. But we know what was done to major media — that’s the whole point of the New York Times lawsuit against OpenAI — and to YouTube, scraped en masse to train countless models. Without reopening the entire digital-sovereignty file, we’re entitled to doubt the loyalty of providers — American or Chinese in particular. No ostracism here: just the observation of a bipolar dominance.
The most telling precedent isn’t in AI, it’s in retail. In 2000, Toys “R” Us — then the world’s number-one toy retailer — handed its entire online sales operation to Amazon: its site redirected to the platform. A few years later, Amazon opened the toy aisles to other sellers, learned the trade, captured the customer relationship. Lawsuit, split in 2006 — but Toys “R” Us had lost six years and its own e-commerce capability; the debt did the rest, all the way to bankruptcy. It was no longer a partnership, it was predation. The lesson isn’t “they stole its data”: it’s that you don’t hand your channel and your customer relationship to a platform that can become — and will become — your competitor.
Two jobs to break free: the treasure and the agents

Tip
To take back control, separate two things we tend to confuse: the treasure (your data) and the agents (the intelligence that processes it).
For a person or an organisation that wants to protect its data and break free as much as possible, there are therefore two distinct jobs: the intelligence — the processing — and the data — what, in the trade, we call the vault, which means exactly “safe”, “treasure”. The common mistake is to hand everything to the same player. Good practice is to keep the treasure at home and only invite agents to work on it, with measured rights.
A company’s value is in its data

This principle is nothing new. A company’s value lies largely in its data, and a working system is treated like any IT system: you maintain it — maintenance — and you must be able to restore it after a failure — backups.
Warning
The number of companies I’ve seen discover their backups were broken the very day they needed them… A small incident then becomes a catastrophe, sometimes fatal.
For a company in Guadeloupe or the Caribbean, I set up a simple strategy: cloud backup + local backup — useful when the connection drops or the datacenter fails, remember the OVH fire in Strasbourg in 2021 — with local copies spread across several places and media rotation. With that, you’re ready for most situations. Unlike those who sink over a detail, I’ve seen companies absorb major setbacks without flinching, losing only a few hours of production. The difference wasn’t luck: it was preparation. (That’s the whole point of a resilience audit, which I won’t redo here.)
My approach to AI: centralise the treasure, grant rights to the agents
I apply exactly the same logic to AI, keeping the two jobs separate. My approach is therefore twofold: centralise my data on hosting I control, where I grant rights to agents — local ones like Gemma, or commercial ones like ChatGPT and Claude, not forgetting the Chinese models Qwen and DeepSeek; and, in time, host AI models locally.
Concretely, my architecture has three pieces. An Obsidian server — my vault — connected to my agents via MCP, the protocol that opens precise doors onto my data. A local replica of the vault on my workstation. All of it versioned on my git server, itself backed up in the cloud and locally. On the agent side, I notably use Hermes, an open-source agent (MIT licence, by Nous Research) that self-hosts, remembers my projects and builds its own skills as it goes. I run it on a VPS I administer and drive it mostly from Telegram — and it can rely on a local model or a commercial API alike, without my treasure ever leaving my infrastructure.
Sovereign architecture
The treasure and the agents
Tap a block of the diagram to see its role. Then compare the two models.
You keep the treasure. The agents only pass through — via a door you control.
Two jobs, one rule
The treasure (your data) stays with you; the agents (the intelligence) come to work via a door you control. Tap a block to learn more.
Interactive diagram — Kimoun. The treasure never leaves: only scoped requests go to outside models.
What I set up for a teacher here

This kind of architecture is entirely accessible, technically and financially, to most organisations in Guadeloupe: I implemented it for a teacher here. I set him up an online Obsidian vault — all his courses and documents, which he owns — and a Hermes server. His AI agents talk directly with his corpus; he calls it his “second brain”. They help him prepare his lessons and follow the curriculum, but he owns his files, he decides the access, and he saves tokens by submitting to the models only what’s useful. At an individual scale, it’s the same emancipation I wrote about regarding the Fable cut-off: you keep the treasure, you invite the intelligence.
Hosting AI in your own walls: what’s next — and the power bill

Running your own models, at home or at your provider’s, is entirely feasible, and it’s the next step — I’ll devote other articles to it. I’m waiting a few more months for the hardware to settle: new Ryzens, cards announced like the RTX 5070 Ti Super 24 GB (still awaited, not yet released) that would open access to models on the order of 31 billion parameters — knowing that a Gemma 4 is already perfectly honourable for everyday tasks, and that by then we’ll have lighter, more capable models.
A warning, because it’s rarely said: a Nvidia card dedicated to AI draws 550 to 600 W for the card alone. Hosting models locally is therefore neither free nor neutral — for the bill as much as the footprint. One more reason to do it with method: size properly, pool resources, and choose between hosting at your place or in ours. We’ll come back to it.
Taking back control of your data isn’t a sovereignty whim: it’s good management, the same that makes you maintain and back up. The treasure first, the agents second — and both under your control. If you want to know where you really stand, that’s exactly what we look at together.