Sentinel swarms
Horizon scanning, reimagined for the agent economy
Continuously watching the whole horizon for the first faint signs that the agent economy is arriving — at a scale no human team can reach.
The classical method
Horizon scanning is the foundational foresight discipline: the systematic, ongoing search for early indicators of change (emerging technologies, policy shifts, social movements, weak signals) before they are obvious. Government units like the UK's Foresight programme and Singapore's Risk Assessment and Horizon Scanning (RAHS) programme built whole institutions around it, because the signals that matter most are quiet, scattered, and easy to dismiss when you only read the mainstream. The craft is not prediction; it is noticing, and noticing early, while a development is still cheap to act on rather than expensive to react to.
Done well, scanning is disciplined rather than impressionistic. The UK Government Office for Science's Futures Toolkit and the UNDP Foresight Manual both codify it the same way: cast deliberately wide, beyond the sources you already trust; separate genuine signal from noise and hype; and track each signal over time rather than treating it as a one-off headline. The whole point is to look where the institution is not already looking — at the periphery, where structural change first shows up as something small and strange.
Why the human-run version hit a ceiling
The constraint was never the method; it was attention. A national scanning function is typically a handful of analysts, and the volume of relevant signal — preprints, patents, standards drafts, procurement notices, funding rounds, developer chatter — grows every year while the headcount does not. So scanning gets rationed: a team reads the journals it can reach, in the languages it can read, and quietly concedes that most of the horizon goes unwatched. The expensive failure of human scanning is the false negative nobody ever sees — the signal that was out there, in a source no one had time to open.
How it works with agents
Agents remove the attention ceiling rather than the analyst. A swarm of scanning agents ingests the firehose continuously — arXiv and bioRxiv preprints, patent filings, standards-body drafts, grant awards, GitHub release velocity, conference programmes, procurement notices, compute-cluster announcements — across many languages, every day, and does two distinct jobs. The first is triage: cluster, deduplicate, classify and route the flood so a human is never asked to drink from it directly. The second, harder job is weak-signal detection: surfacing the faint, early indication that something new is forming before it is obvious.
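A minimal sketch of the triage job, assuming items arrive as simple records with a source, title and URL. The `RawItem` type, the `ROUTES` keyword table and the title-fingerprint rule are illustrative assumptions, not a description of any particular production system; a real swarm would likely use embeddings or a trained classifier where this uses keywords.

```python
import hashlib
import re
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class RawItem:
    source: str  # illustrative feed label, e.g. "arxiv" or "tenders"
    title: str
    url: str


# Hypothetical routing table: queue name -> keywords that send an item there.
ROUTES = {
    "agents": ("agent", "autonomous"),
    "compute": ("gpu", "cluster", "datacenter"),
}


def fingerprint(title: str) -> str:
    """Hash of the lowercased title with punctuation and spacing normalised."""
    normalized = re.sub(r"[^a-z0-9]+", " ", title.lower()).strip()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def triage(items):
    """Deduplicate by title fingerprint, then route each survivor to one queue."""
    seen = set()
    queues = defaultdict(list)
    for item in items:
        key = fingerprint(item.title)
        if key in seen:
            continue  # the same item already arrived via another feed
        seen.add(key)
        lowered = item.title.lower()
        topic = next(
            (t for t, words in ROUTES.items() if any(w in lowered for w in words)),
            "unclassified",  # kept in its own queue rather than discarded
        )
        queues[topic].append(item)
    return dict(queues)
```

Items that match no route land in an `unclassified` queue instead of disappearing, which keeps the triage step consistent with the principle that nothing is dropped without a trace.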
Crucially, the swarm is tuned to the periphery rather than the headlines. It is rewarded for surfacing the small, strange, early signal — a handful of unrelated labs suddenly citing the same obscure result, a cluster of patents from firms that have never patented together, a procurement line item for a capability that does not yet have a market — not for confirming what everyone already knows. That is the inversion of the human bias toward consensus, and it is the whole reason to build the thing.
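One of the peripheral patterns named above, unrelated labs suddenly citing the same obscure result, can be sketched as a simple filter. Everything here is an assumption made for illustration: the function name, the input shapes, and the thresholds `min_labs` and `max_prior` are hypothetical, and the rule that any prior collaboration disqualifies a cluster is deliberately strict.

```python
from collections import defaultdict
from itertools import combinations


def peripheral_cocitations(citations, known_links, prior_counts,
                           min_labs=3, max_prior=5):
    """Flag obscure works newly cited by several mutually unlinked labs.

    citations:    iterable of (lab, cited_work) pairs from the current window
    known_links:  set of frozenset({lab_a, lab_b}) pairs with prior collaboration
    prior_counts: dict of cited_work -> citation count before the window
    """
    labs_by_work = defaultdict(set)
    for lab, work in citations:
        labs_by_work[work].add(lab)

    flagged = []
    for work, labs in labs_by_work.items():
        if prior_counts.get(work, 0) > max_prior:
            continue  # already well cited: a headline, not the periphery
        if len(labs) < min_labs:
            continue  # too few independent citers to count as a cluster
        linked = any(
            frozenset(pair) in known_links
            for pair in combinations(sorted(labs), 2)
        )
        if not linked:
            flagged.append((work, sorted(labs)))
    return sorted(flagged)
```

The filter scores obscurity and independence rather than volume, so a heavily cited paper never qualifies however many labs cite it in the window.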
What changes most is the economics of coverage. When a busy analyst had to triage ruthlessly, keeping ten weak signals under continuous watch was a stretch; a swarm keeps hundreds under watch for the cost of compute. The institution stops being surprised by developments it could have been tracking — not because the machine is wiser, but because nothing falls off the edge of the horizon for lack of someone to read it.
What the evidence says
This is not aspirational. In 'SciMON: Scientific Inspiration Machines Optimized for Novelty', Wang et al. show that an LLM system can scan the scientific literature and surface genuinely novel directions rather than merely retrieving the familiar. They also show that novelty can be optimised for explicitly, which is exactly the difference between a search engine and a horizon scanner. Companion work on weak-signal detection over the research corpus turns the foresight-theoretic notion of a 'weak signal' into a computable pipeline that flags emerging fronts from sparse, early textual evidence.
The practitioner evidence points the same way. The OECD–WEF survey 'AI in Strategic Foresight', which canvassed 167 foresight professionals, reports that scanning is the single task where practitioners see the most immediate, highest-confidence value from AI, which is precisely why a serious build starts here. The binding constraint they name is human bandwidth to monitor and synthesise; that is the constraint a swarm is built to dissolve.
Applied to the agent economy
For the agent economy, the horizon is moving fast and in unfamiliar places: model releases, agent frameworks, autonomous-commerce pilots, regulatory drafts, new failure modes nobody had a name for last quarter. Our sentinel swarm watches all of it continuously, so the first signs of a structural shift reach decision-makers while there is still time to respond rather than only time to react.
The value compounds because the agent economy is unusually rich in early, machine-readable signal — code repositories, model cards, benchmark leaderboards, framework changelogs — which is exactly the kind of source a human team never has the hours to monitor and a swarm reads natively. The first place an agentic capability becomes real is rarely a headline; it is a commit, a pricing page, a procurement note.
Where humans stay in command
The dominant failure mode is the plausible-but-spurious cluster: the swarm confects a trend from coincidence, or amplifies a hype wave because the corpus is full of hype. So every flagged signal must cite its primary documents — no claim without provenance — and novelty is scored against a versioned baseline, so an agent cannot quietly redefine what counts as 'new'. A periodic recall audit seeds known historical signals back into the stream to measure how many the swarm would have caught, a standing check on the false-negative rate that benchmark scores never reveal.
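The provenance rule and the recall audit described above could be sketched roughly as follows. The `Flag` type and both function names are hypothetical, and the audit assumes that seeded historical signals carry stable ids that can be matched against whatever the swarm flagged.

```python
from dataclasses import dataclass, field


@dataclass
class Flag:
    claim: str
    sources: list = field(default_factory=list)  # primary document URLs or ids


def enforce_provenance(flags):
    """Split flags into those citing at least one primary document and the rest."""
    admitted = [f for f in flags if f.sources]
    returned = [f for f in flags if not f.sources]  # sent back, not deleted
    return admitted, returned


def recall_audit(seeded_ids, caught_ids):
    """Share of deliberately seeded historical signals that the swarm flagged."""
    seeded = set(seeded_ids)
    if not seeded:
        raise ValueError("audit needs at least one seeded signal")
    missed = seeded - set(caught_ids)
    recall = 1 - len(missed) / len(seeded)
    return recall, sorted(missed)
```

For example, `recall_audit(["s1", "s2", "s3", "s4"], ["s1", "s3", "x9"])` returns a recall of 0.5 and the missed ids `["s2", "s4"]`. The missed list is the useful output, because it names the false negatives that would otherwise stay invisible.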
And the swarm proposes; a human disposes. An analyst-curator owns the weekly weak-signal shortlist, accepting a candidate into the registry, rejecting it with a logged reason that becomes training feedback, or escalating it. The agent is never permitted to silently drop a signal it judges unimportant — suppression is itself a logged decision a human can audit — because in scanning, the costliest mistake is the one you never get to see.
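A minimal sketch of how such a decision log might look, assuming an in-memory list stands in for durable storage. The `Decision` values mirror the accept, reject and escalate outcomes named above, plus a `SUPPRESS` value for agent-side drops; the class and method names are illustrative only.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from enum import Enum


class Decision(Enum):
    ACCEPT = "accept"
    REJECT = "reject"
    ESCALATE = "escalate"
    SUPPRESS = "suppress"  # agent-side drop; still logged and auditable


@dataclass(frozen=True)
class LogEntry:
    signal_id: str
    decision: Decision
    actor: str
    reason: str
    at: datetime


class DecisionLog:
    def __init__(self):
        self._entries = []

    def record(self, signal_id, decision, actor, reason=""):
        """Append one decision; rejections and suppressions must give a reason."""
        if decision in (Decision.REJECT, Decision.SUPPRESS) and not reason.strip():
            raise ValueError(f"{decision.value} requires a logged reason")
        entry = LogEntry(signal_id, decision, actor, reason,
                         datetime.now(timezone.utc))
        self._entries.append(entry)
        return entry

    def suppressed(self):
        """Every agent-side suppression, for human review."""
        return [e for e in self._entries if e.decision is Decision.SUPPRESS]

    def unresolved(self, all_signal_ids):
        """Signal ids with no log entry at all: candidates that vanished silently."""
        logged = {e.signal_id for e in self._entries}
        return sorted(set(all_signal_ids) - logged)
```

Comparing the ids the swarm emitted with `unresolved(...)` gives a cheap check that no candidate left the pipeline without a recorded decision.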
How we run it
- Cast wide — agents monitor research, patents, filings, tenders, markets, code and fringe communities across languages, continuously.
- Cluster — raw signals are deduplicated and grouped into emerging themes, then tracked over time rather than treated as one-off headlines.
- Rank — each signal is scored for novelty, impact and credibility against a versioned baseline, with its primary sources attached.
- Surface — the strongest early signals are escalated to an analyst-curator as evidence cards, each with an explicit confidence band.
- Curate — every accept / reject / escalate decision is logged and fed back, so the swarm's aim improves and nothing is silently dropped.
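The Rank and Surface steps above can be sketched as a scoring function that produces an evidence card. This is an illustration under stated assumptions, not the scoring actually used: novelty is approximated as the share of a signal's terms absent from a versioned vocabulary, the weights are arbitrary, and the confidence band is a placeholder that simply widens as credibility falls. All names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Baseline:
    version: str            # pinned so that "new" cannot drift unnoticed
    known_terms: frozenset


@dataclass(frozen=True)
class EvidenceCard:
    signal_id: str
    score: float
    baseline_version: str
    sources: tuple
    confidence: tuple       # (low, high) band around the score


def novelty(terms, baseline):
    """Fraction of the signal's distinct terms that the baseline has not seen."""
    terms = set(terms)
    if not terms:
        return 0.0
    return len(terms - baseline.known_terms) / len(terms)


def make_card(signal_id, terms, impact, credibility, sources, baseline,
              weights=(0.5, 0.3, 0.2)):
    """Score a signal in [0, 1]; impact and credibility are assumed to be in [0, 1]."""
    w_novelty, w_impact, w_cred = weights
    score = (w_novelty * novelty(terms, baseline)
             + w_impact * impact
             + w_cred * credibility)
    half_width = (1.0 - credibility) / 2  # placeholder band: wider when less credible
    band = (max(0.0, score - half_width), min(1.0, score + half_width))
    return EvidenceCard(signal_id, score, baseline.version, tuple(sources), band)


def shortlist(cards, k=10):
    """Top-k cards by score, for the analyst-curator's review."""
    return sorted(cards, key=lambda c: c.score, reverse=True)[:k]
```

Recording `baseline_version` on every card means a score can always be traced to the definition of 'new' that was in force when it was computed.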