Observation Is Not Free: Why Observability Is Now an Attack Surface

26 Apr, 2026
read
Serhan W. Bahar

Read manifesto

A note before you begin.

This manifesto makes one argument, built in seven sections. It is designed to work as a whole. The diagnosis in sections 1–3 depends on the physical principle in section 4. The four principles in section 5 rest on the diagnosis. The applications in section 6 assume both. The argument’s parts support one another, so the reader who can give it the ten thousand words will get more from it than the reader who samples one section. In return, I have tried to make every one of those words earn its place.

Introduction

For twenty years, the security industry has operated on an assumption so deeply embedded that it rarely gets stated. The assumption is that visibility is inherently protective. More telemetry means better defence. More logs mean faster detection. More context means smarter response. The entire industrial apparatus of modern cybersecurity — SIEMs, XDR platforms, data lakes, threat intelligence feeds, behavioural analytics, observability platforms — rests on this assumption. Security budgets scale with telemetry volume. Maturity models measure coverage. Vendor differentiation happens on ingestion rates.

This assumption has inverted.

In adversarial environments operating at machine speed, through AI agents that consume context as input to decisions, the marginal unit of collected observability now creates more risk than it reduces. Every byte of telemetry is both a potential signal and a potential attack surface. Every correlation engine is an adversarial target. Every context store feeding an AI agent is an injection vector. We have built, in the name of visibility, an attack surface that grows faster than the defensive value it was meant to create — and the industry is still adding to it.

A word on the word “observability” is owed here, at the door, because this manifesto uses it more broadly than some readers will expect. In classical SRE usage, observability means defensive telemetry — SIEM logs, EDR traces, network flows. This manifesto uses the word to also cover functional application state that later gets read back: an AI agent’s memory, a RAG retrieval context, the record of tool outputs an agent consumes in its own decision loop. Those are not defensive telemetry in the SRE sense; they are functional application components. What unifies them with classical observability is the structural act — recording state that some later consumer reads — and the cost structure that follows from that act: storage is paid, attack surface is created, and, in the agentic case, the record feeds back into the system’s own decisions. A strict classicist can read the first half of this manifesto as being about observability-as-SIEM-telemetry and the second half as being about agent-context validation. The argument survives that reading. What this manifesto claims is that treating them as manifestations of the same phenomenon is more useful than treating them as two problems that happen to share a surface feature, because the underlying cost structure is shared and the disciplines that address one apply to the other.

This manifesto makes one argument. Observation is not free. It never was, but we could pretend it was when attackers moved at human speed and telemetry was expensive. That pretence is no longer sustainable. Security must shift from visibility-maximisation to visibility-discipline: collecting less, verifying more, treating collected context as adversarially exposed until proven otherwise.

The instinct behind this shift is older than any modern security debate. From Kerckhoffs in 1883 to Saltzer and Schroeder in 1975, serious security thinking has insisted that systems should rest on the smallest trusted surface that does the job, and that every additional mechanism needs to justify its place. The observability argument in this manifesto is the application of that instinct to one domain. The instinct itself applies more broadly — to controls, to architecture, to the whole accumulation of security machinery the industry has spent two decades adding. Visibility is where the inversion is sharpest right now, which is why it is this manifesto’s focus. The underlying question — does this mechanism earn its place? — applies everywhere.

1. The Assumption We’ve Been Operating On

Every security framework written in the last two decades encodes the same unstated premise. The NIST Cybersecurity Framework’s Detect function assumes detection improves with telemetry coverage. ISO 27001’s monitoring controls mandate logging breadth. MITRE ATT&CK’s defensive value is measured in technique visibility. Zero Trust Architecture, for all its reframing of perimeter assumptions, still depends on continuous monitoring of every transaction. The message across frameworks is uniform: see more, know more, respond better.

This is a defensible position in a world where adversaries are slow, telemetry is expensive, and the things you collect are passively observed rather than actively consumed. In that world, every additional log source raises the probability that an attacker leaves a trace, and every additional trace raises the probability of detection. The math of visibility-as-protection works.

The evidence that this math is breaking is not in any single statistic. It is in the gap between what we spend on visibility and what we get from it.

The 2024 Verizon Data Breach Investigations Report, drawn from more than 30,000 incidents and 10,000 breaches across 94 countries, documents a 180% year-on-year increase in vulnerability exploitation as the initial access vector for breaches. The 2025 DBIR shows the trend continuing — a further 34% rise year-on-year, pushing vulnerability exploitation to 20% of all breaches, second only to credential abuse at 22%. Verizon itself attributes the 2025 rise primarily to zero-day exploitation of edge devices and VPNs — which grew eightfold, from 3% to 22% of the vulnerability-exploitation category — and to a doubling of third-party involvement in breaches. That attribution matters. It means the 34% is not direct evidence that internal observability is failing; it is evidence that the perimeter attack surface and the supply chain have widened faster than defenders can respond. What the statistic does establish, and all this argument requires from it, is that attackers are finding initial access through channels that observability-heavy security programmes, as most of them were originally designed, were not primarily built to surface — edge-device vulnerabilities and third-party compromises sit at the boundary of what classical SIEM-centric architectures observe, even as the tooling around them continues to improve. The 2024 IBM Cost of a Data Breach Report, conducted independently by the Ponemon Institute across 604 organisations in 16 countries, found the global average cost of a breach rising to $4.88 million — a 10% increase and the largest single-year jump since the pandemic. That figure fell back in the 2025 edition to $4.44 million, the first decline in five years. The fall is real and deserves to be read honestly.

What the 2025 IBM data actually shows is a bifurcation. Global average cost fell, but the United States average rose 9% to a record $10.22 million. The decline is concentrated in organisations that deployed AI and automation extensively in security operations — those organisations saved around $1.9 million per breach and shortened breach lifecycles by approximately 80 days. The average breach lifecycle fell to 241 days, a nine-year low. On its face, this looks like evidence against this manifesto’s thesis.

Read more carefully, it is not. The 2025 decline tells us that mature security operations leveraging AI-enhanced detection are getting better at containing breaches once they start — which is what one would expect when a well-funded segment of the industry deploys its best tooling against a well-understood class of adversaries. It does not tell us that the visibility-maximisation strategy is succeeding across the industry. The US rise, the 9-year-low lifecycle still sitting above 200 days, the US breach cost exceeding $10 million per incident, and the improvement being concentrated at the top of the maturity distribution while the long tail continues to accumulate telemetry without commensurate defensive benefit — these are the figures that describe the actual state of the industry. More importantly, the 2025 IBM data measures breaches of the kind security programmes have been tuned to catch for twenty years. It does not measure the class of attack that Sections 2 and 4 of this manifesto are primarily about: the coupling of observation and execution in agentic systems, where the telemetry is simultaneously the attack surface. That class is not yet reflected in IBM’s breach-cost data because the attacks are too new to have accumulated enough incidents to move the statistic. They will.

These are not outcomes produced by under-instrumentation. They are outcomes produced by organisations that have deployed SIEMs, XDR platforms, EDR agents, identity providers, cloud security posture tools, and threat intelligence feeds. The mature segment is pulling the global average down through faster containment; the long tail and the agentic frontier are pulling it in the other direction. The assumption that more visibility is uniformly protective does not survive inspection of either.

The experience inside the Security Operations Centre tells the same story. A peer-reviewed 2025 survey in ACM Computing Surveys by Tariq, Chhetri, Nepal and Paris synthesises the state of the alert-fatigue literature, describing how SIEM systems tuned to minimise false negatives produce alert volumes that exceed analyst capacity even when augmented by specialised SOC tooling. The 2022 USENIX Security Symposium study by Alahmadi et al., based on qualitative interviews with working SOC analysts, found that some analysts reported investigating 100 to 200 alerts for every one that represented an actual threat. These findings are not perfectly consistent across studies — the same USENIX paper notes earlier work by Kokulu et al. in which analysts did not view false positives as their primary operational problem — but the direction is clear: analyst time is being consumed at scale by alerts the organisation chose to generate.

The practitioner data, even allowing for vendor interest in the problem, points in the same direction. Graylog’s 2026 analysis reports that enterprise SIEMs now process an average of 24,000 unique log sources per deployment, of which organisations ingest less than 15% due to cost constraints. Forshtec’s 2026 practitioner guidance finds that fewer than 40% of ingested logs provide genuine investigative or security value. These are vendor-adjacent sources with commercial interest in the conclusion, but the figures are consistent with the independent data and with what most senior practitioners report privately about their own environments.

Put plainly: the organisations with the most visibility are not demonstrably safer than those with less. The assumption that underwrote twenty years of security investment is not producing the outcomes the investment was meant to buy.

Individual critiques of SIEM practice, alert fatigue, and compliance theatre exist across the literature. The most developed of these is Kelly Shortridge’s work on security chaos engineering, which names the underlying dysfunction directly: “The fail-safe mindset is a driver of the status quo cybersecurity industry’s lack of systems thinking, its fragmentation, and its futile obsession with prediction.” Her prescription is resilience engineering. Mine, building on hers, is more specific: stop assuming the collection itself is neutral.

Other work has reached adjacent conclusions from different directions, and two lineages in particular deserve explicit credit.

The first is the agent-security community’s work on context as control flow. The insight that in AI agent systems context is indistinguishable from instructions — context is code — has been developed across a community of researchers and practitioners over several years before this manifesto was written. Simon Willison has written consistently since 2022 about prompt injection as control-flow manipulation. Greshake et al.’s 2023 paper “Not What You’ve Signed Up For” formalised indirect prompt injection as an attack class. The “lethal trifecta” framing — that agents combining access to untrusted input, sensitive data, and external action form an inherently exploitable architecture — has been in circulation in the agent-security community since 2024. Jiang et al.’s 2026 paper systematises this line of work in computer-science terms and proposes a zero-trust runtime architecture; the paper synthesises an existing body of insight rather than originating it.

The second is the SIEM-economics community’s work on tiered and disciplined collection. Practitioners and analysts — Anton Chuvakin, Allie Mellen, the broader Gartner SOC research stream, and the composable-SIEM community — have argued for tiered ingestion, hot/cold data-lake separation, and disciplined selection for well over a decade under labels including data-swamp avoidance and intentional assembly. The practical prescriptions that appear later in this manifesto as Principles 1, 3, and 4 draw substantially on what this community has already been arguing in public.

Each of these bodies of work addresses a fragment of the problem this manifesto names as a whole. The contribution here is not to claim the individual insights but to tie them to a single physical principle and to the design tradition that principle already belongs to.

The question asked by existing work is how to do visibility better. The question this manifesto asks is different. It is whether the marginal unit of additional visibility is now net-negative — whether we have crossed an inversion point where more telemetry creates more risk than it reduces.

In three specific domains, I will argue, we have.

2. The Three Inversions

Three domains show the inversion is no longer theoretical. In each, the marginal unit of collected observability generates more risk than it reduces. In each, the industry is still adding to the collection.

Inversion 1: The SIEM and the analyst

The oldest of the three inversions is the one most practitioners have felt personally, even if the industry has not named it as an inversion.

Organisations responded to twenty years of rising threat complexity by collecting more. More log sources, more endpoint telemetry, more cloud audit trails, more network flow data, more identity events, more threat intelligence feeds. The consensus was that the more an organisation could see, the better it could defend. This was not a foolish consensus. In a world where attacker dwell time was measured in months and analysts had time to hunt through data, visibility was genuinely protective.

That world is gone. The evidence cited earlier — breach lifecycles still measured in multiple hundreds of days, alert volumes that exceed analyst capacity by orders of magnitude, AI-driven improvement concentrated in a narrow top tier — is not produced by organisations that cannot see. It is produced by organisations that can see everything and act on nothing. The 2022 USENIX study documented SOC analysts reporting one real threat for every 100 to 200 alerts investigated. The 2025 ACM Computing Surveys synthesis describes SIEM systems deliberately tuned to minimise false negatives, producing alert volumes that are structurally impossible for human analysts to process.

The alert volume itself has become an attack surface. Attackers have learned that the fastest way to hide is inside the noise defenders collect in the name of seeing. Slow, low-severity activity spread across multiple tools over weeks avoids investigation thresholds because overwhelmed analysts triage by severity and recency, not by correlation. The attacker’s dwell time is no longer the gap between their arrival and the defender’s detection. It is the gap between their arrival and the defender’s attention — and attention is the scarcer resource.

The tools that correlate telemetry are themselves adversarial targets. Detection rules are tuned by vendors against known attacker behaviour, which means sophisticated attackers can acquire the same rules and test their techniques against them before deploying. Detection logic can be inferred from alert-suppression behaviour when the same adversary retests repeatedly. Machine-learning-based anomaly detection can be fooled by attacks that fit the training distribution. Every piece of detection sophistication added is a piece of detection sophistication an attacker can study.

The inversion in this domain is not that telemetry is worthless. It is that the marginal log line, once collected, creates more attack surface than signal. Analysts drown. Rules erode. Attackers hide. The spend goes up. The outcomes do not.

Inversion 2: The AI agent and its memory

The second inversion is specific to the agentic era, and it makes the first look benign by comparison.

In classical security architecture, telemetry was passive. Logs existed to be reviewed. Context existed to be queried. The system being protected did not consume its own observability as input. An attacker who poisoned a log might obscure their trail, but they did not thereby change the system’s behaviour.

Agentic AI systems are different. They consume context — memory, retrieval-augmented generation stores, conversation history, tool outputs, documents — as direct input to their reasoning. The same data that observability systems treat as a passive record is, for an AI agent, an active instruction stream. Every piece of context the agent can see is a piece of context that shapes what the agent does next.

This collapses the distinction between observation and execution. Poisoning context is no longer obscuring a trace. It is hijacking the agent.

OWASP’s Top 10 for Agentic Applications 2026 lists this as ASI06: Memory and Context Poisoning, identifying corrupted persistent memory, embeddings, and RAG stores as a top-ten risk for agentic systems. OWASP cites the Gemini Memory Attack as a concrete example of the class.

The research is not theoretical. Across the published literature, memory-poisoning attacks against agentic systems achieve real-world success rates roughly in the 20% to 40% range against agents built on commercial models under production-representative conditions, with substantially higher rates under more permissive laboratory threat models. That is the honest headline figure, and it is the one defenders should plan against.

The most-cited paper in the area is Dong et al.’s 2025 MINJA paper, accepted as a NeurIPS 2025 poster. MINJA demonstrated that attackers can inject malicious records into an AI agent’s long-term memory through nothing more than ordinary user queries — no elevated privileges, no API access, no direct writes. Under MINJA’s laboratory threat model — assuming either a shared-memory setting between attacker and benign users, or an isolated-memory setting in which the attacker adopts a feasible identity disguise to influence later retrievals — the attack achieves injection success rates above 95% and attack success rates above 70% across three types of production-representative agents (a ReAct-based shopping agent, a healthcare agent working over electronic health records, and a general question-answering agent) built on GPT-4 and GPT-4o. Those numbers are real but they are ceilings, not floors. They tell defenders what is possible when the threat model is permissive; they do not tell defenders what they should plan against under realistic conditions.

The shared-memory assumption has been criticised by subsequent work as narrower than real agent deployments will generally be, and follow-up research — including eTAMP (2026) — has demonstrated cross-session memory poisoning through indirect environmental injection without requiring shared memory at all. The eTAMP attack-success rates are lower than MINJA’s: 32.5% on GPT-5-mini, 23.4% on GPT-5.2, and 19.5% on GPT-OSS-120B, versus MINJA’s 70% plus. That drop matters and should be stated directly. Relaxing the threat model toward realistic deployment conditions reduced effectiveness substantially.

The most direct robustness study of MINJA itself is Memory Poisoning Attack and Defense on Memory Based LLM-Agents (2026) — a UMass CS690F course-project preprint, not yet peer-reviewed, but methodologically the most direct test of MINJA under realistic conditions currently in the public literature. It re-evaluates MINJA on EHR agents with pre-existing legitimate memories and varying retrieval parameters and reports MINJA’s attack-success rate dropping to best-case figures around 38% on GPT-4o-mini and 28% on Llama-3.1-8B-Instruct. The preprint status warrants reading the specific numbers with appropriate caution, but the direction is unambiguous and consistent with what the broader literature is showing: relaxation toward production conditions reduces effectiveness. Two things follow. The first is that the 95%/70% MINJA numbers are laboratory-adjacent ceilings, not production floors; the realistic envelope is the 20% to 40% range that the headline figure named. The second is that the manifesto’s central claim about agentic systems does not rest on the ceiling and cannot be falsified by a drop toward the floor, because the claim is structural rather than rate-dependent. A MINJA-class attack that succeeds 28% of the time against a production agentic system is still a systemic risk for any organisation whose agent takes consequential action on retrieved context. The effectiveness curve will move on both sides — attacker technique and defender technique mature together — and what persists regardless is that the attack surface exists wherever persistent memory, RAG, or tool output is fed back into the agent’s decision loop.

The relevant question for defenders is not how the numbers compare across threat models; it is whether real-world attack success rates in that 20% to 40% range against an agentic system in production-representative settings constitute a defensively acceptable baseline. They do not. An e-commerce agent successfully poisoned one time in three against realistic attacker capability is a systemic risk to any organisation deploying it. The surface is present in any organisation deploying agents with persistent memory, and the rate is high enough that defensive architecture has to assume the attack rather than treat it as edge-case.

The attack and its execution are temporally decoupled. The injection happens in one session. The compromise can manifest in a later session, under conditions the MINJA paper models through shared memory or identity disguise and subsequent work extends to cross-session, cross-site settings through environmental injection. Traditional monitoring sees nothing suspicious at any single point in time.

Every piece of context an agent is given to improve its performance is, simultaneously, a piece of context an attacker can corrupt to degrade its integrity. The more context you feed an agent to make it useful, the more surface area you expose. Organisations are building agentic systems with persistent memory, cross-session context, and retrieval-augmented reasoning precisely because these features make agents more capable. They are also, by construction, the attack surface.

The industry’s prescription for AI security risk is, almost universally, more observability. More logging of agent inputs and outputs. More telemetry about tool invocations. More memory of past interactions. Microsoft’s March 2026 guidance on AI observability — which explicitly acknowledges the context-poisoning threat model, including a scenario in which a research agent fetches a page containing hidden instructions and passes them back as trusted input — still prescribes AI-native telemetry collected from design time as the path forward. The implicit assumption is that the same visibility discipline that failed to protect the SIEM world will somehow work here, even though the attack surface being instrumented is the agent’s own decision process.

The inversion in this domain is sharper than the first. It is not that we are collecting too much and missing signals. It is that in AI agent systems, the collection itself is the attack vector.

Inversion 3: Compliance and the defensive interpretation

The third inversion is structural and comes from an unexpected direction. It is produced by regulation — or more precisely, by how regulation gets interpreted inside organisations under pressure.

Modern regulation — DORA in the European Union, NIS2 across member states, the SEC’s cyber incident disclosure rules in the United States, the UK’s operational resilience regime — sits in a complex middle ground. At the level of framing, these regimes are organised around outcomes: DORA requires firms to demonstrate that critical services remain within impact tolerance under severe-but-plausible disruption; NIS2 Article 21 requires appropriate measures proportionate to risk; the UK regime under SYSC 15A is explicitly resilience-outcome based. At the level of text, the picture is more prescriptive than the framing suggests. DORA Article 9 requires continuous monitoring of ICT systems and tools, and the Level 2 RTS on ICT risk management specifies what events must be logged, for how long, and under what protections. NIS2 Article 21 enumerates logging among its baseline measures. The UK’s SYSC 9 mandates orderly record-keeping of business and internal organisation, and MiFID II retention requirements overlay five-to-seven-year storage obligations for specific record classes. What these regimes do not do is specify how much security telemetry a firm must collect, or prescribe that SIEM ingestion must grow without bound. Neither does their text demand the maximalist interpretation most firms have adopted. The regulatory picture is prescriptive on specific record classes and prescriptive on the existence of continuous monitoring, but it is silent on the scale of general-purpose security telemetry — and that silence is what defensive interpretation has filled with over-collection.

In practice, that is not how these regulations are implemented. The defensive interpretation, which dominates in regulated industries, is that more telemetry is always safer from a compliance perspective. If a regulator asks what happened during an incident and you cannot answer because you did not log it, you are exposed. If you logged everything and the logs are messy, you are at least defensible. So organisations log everything.

Compliance-driven logs are collected because they might be needed, not because they serve detection. They are held in hot tiers because regulators expect them to be retrievable. They are never searched because they were never needed for actual security work. They pile up. They cost money. And, increasingly, they create exposure.

The IBM Cost of a Data Breach Report 2024 finds that 35% of breaches involved shadow data — data held in unmanaged data sources, which in IBM’s definition spans uploads to unsanctioned cloud services, forgotten production databases, personal drives, public repositories, and other storage the organisation did not track. Breaches involving shadow data took 26.2% longer to identify and 20.2% longer to contain, and cost 16% more on average than breaches without it. Compliance-driven accumulation is one contributor to the shadow-data surface alongside others; the unifying pattern is that the organisation held data it did not actively manage, and an attacker exploited the gap.

The same compliance logic that demands comprehensive logging also discourages deletion. Retention policies lengthen. Data lakes grow. Every piece of accumulated telemetry is a piece of data an attacker can exfiltrate, a piece of PII that raises breach-notification costs, and a piece of context that must be protected in its own right. Organisations have built what are, in effect, comprehensive records of their own behaviour — records that are, by their nature, difficult to secure and expensive to delete.

The inversion in this domain is that defensive compliance interpretation has turned regulatory pressure — which in its actual text is prescriptive on specific record classes and silent on the scale of general-purpose security telemetry — into a driver of attack surface accumulation. Firms are not over-collecting because regulation strictly requires it. They are over-collecting because the perceived safest interpretation of regulation is to over-collect in areas the text does not prescribe. The solution is not to fight regulation but to argue that defensive over-collection, in the areas regulation leaves to the firm’s judgement, no longer serves the outcomes regulation was written to demand.

The pattern beneath the three inversions

The three inversions are structurally distinct. The SOC inversion is about the economics of attention. The AI agent inversion is about the fusion of observation and execution. The compliance inversion is about institutional incentive structures.

They share a common logic. In each, visibility-maximisation was a defensible strategy under conditions that no longer hold. In each, the strategy has continued past the point of diminishing returns and into the territory of negative returns. In each, the industry’s response has been to recommit to the strategy rather than question it.

Observability has not become useless. It remains essential. The inversion is that the marginal unit now produces negative value — more cost, more noise, more attack surface — than the signal it provides. The discipline the industry needs is not more visibility; it is knowing when visibility is enough.

Why this has happened now is what the next part of the argument addresses: the economic, technological, and physical mechanisms that produced the inversion.

3. Why Now

The three inversions did not arrive by accident. Three mechanisms, operating simultaneously, produced them. One is economic. One is technological. One is structural. Together they explain why visibility-maximisation was a reasonable strategy for twenty years and why it has stopped being one.

The economic mechanism: storage became free, so we stopped choosing

In 1865, the English economist William Stanley Jevons made an observation about coal. As steam engines became more efficient — producing more work per tonne of coal burned — England’s total coal consumption did not fall. It rose sharply. Jevons wrote in The Coal Question: “It is wholly a confusion of ideas to suppose that the economical use of fuel is equivalent to a diminished consumption. The very contrary is the truth.” Cheap coal did not reduce coal use. It expanded the set of things coal was used for, and consumption grew.

The pattern, now called the Jevons paradox, has been observed across domains. Efficient LED lighting did not reduce electricity demand; it enabled ubiquitous illumination. Fuel-efficient engines did not reduce gasoline consumption; they enabled larger vehicles and longer journeys. Efficient data storage did not reduce the volume of stored data; it enabled the capture of everything.

Security telemetry has followed the same path. When logging was expensive — when storage cost real money per gigabyte, when SIEM licences were priced against EPS limits an organisation felt — security teams made deliberate choices about what to collect. The choices were imperfect but they were choices. Someone had to decide that a particular log source was worth its cost. That discipline produced an implicit signal-to-cost filter at the point of collection.

Storage costs then approached zero. Cloud object storage made retention arbitrarily cheap. Pipeline tools made ingestion arbitrarily easy. The constraint that had forced selectivity evaporated. Organisations began collecting everything not because they had decided each source was valuable but because the marginal cost of keeping one more source was too small to justify a decision. The implicit filter at collection became a non-decision: collect by default, filter later, if ever.

SIEM licensing and processing costs did not disappear with storage costs, and practitioners will recognise them as a real constraint — the Graylog data cited earlier on sources ingested versus sources available reflects exactly this bottleneck. But SIEM licensing rewards volume management rather than principled selection. A source that is cheaper to route than to evaluate still gets routed. A source that is free to collect at rest but expensive to process still gets collected — just not indexed. The economic constraint shifted from “what should we keep?” to “what should we route through the expensive tool?”, and that is a different question, one that does not produce the discipline the old question did.

The consequences took time to become visible. For a period, the expanded collection produced real benefit — more investigative material when incidents occurred, more data for detection engineering, more context for threat hunting. But the expansion continued past the point of diminishing returns. The SOC tools could not process what was collected. The analysts could not review what the tools surfaced. The storage bills grew. The data was kept anyway, because keeping it cost less than deciding what to delete.

Jevons’s insight was that efficiency gains in resource use do not produce the expected reduction in consumption. They produce expansion. Modern security learned this lesson in reverse: when collection became efficient, we did not collect less carefully. We collected more, and eventually too much, because the constraint that had forced care was gone.

The technological mechanism: the detection gap is widening again

The second mechanism is a change in how fast things happen relative to how fast defenders can respond.

For most of the history of professional security, the attacker was a human. Human attackers make mistakes. They leave traces. They operate at human speeds, which means investigation has time to catch up. The visibility discipline of the SIEM era was built for this adversary — not as a conscious design choice, but because no other adversary existed at scale.

The industry made real progress against this adversary. Mandiant’s M-Trends reports documented median dwell time — the period between compromise and detection — falling from over 400 days in 2011 to 16 days in 2022 and 10 days in 2023. These were genuine defensive gains, produced in large part by exactly the expanded observability that sections 1 and 2 have criticised. The SIEM-era investment produced results.

That progress has now reversed. M-Trends 2025 reported dwell time rising to 11 days. The most recent report, M-Trends 2026, published in March 2026, showed it rising again to 14 days — the first time in over a decade that the metric had worsened two years in a row. The same 2026 report also shows internal detection of malicious activity improving, from 43% in 2024 to 52% in 2025. Telemetry is doing more defensive work on the detection axis than it used to. That counter-statistic deserves to be read honestly alongside the dwell-time rise. Mandiant itself further attributes part of the dwell-time rise to composition: a higher volume of long-dwell espionage cases and North Korean IT worker operations, both with 122-day median dwell times, pulled the global average up. Those caveats are real. But they do not dissolve the underlying concern. Even stripped of the composition effect, the direction of travel for sophisticated intrusions is away from rapid detection, and the same report documents the median interval between initial access and hand-off to a secondary threat group collapsing from eight hours to 22 seconds between 2022 and 2025 — an index of cybercrime-ecosystem specialisation that moves intrusions forward faster than any human-mediated SOC process can respond to, regardless of whether the initial detection happens internally. Detection improving does not solve the defensive problem if the time between detection and attacker action has compressed to seconds.

M-Trends 2024 observed that Mandiant’s own red teams need only five to seven days on average to achieve their objectives. The defender is now slower than a professional red team by a factor of two or three, in a domain where red teams are widely accepted to be less sophisticated than well-resourced criminal groups. The 2024 Verizon DBIR documented a 180% year-on-year increase in vulnerability exploitation as the initial access vector for breaches, driven largely by rapid exploitation of disclosed CVEs before defenders could patch. In the MOVEit exploitation campaign of 2023, attackers compromised thousands of organisations within days.

The compression is not uniform and the exact causes are contested. Some of it reflects AI-assisted attack tooling; some reflects the industrialisation of ransomware-as-a-service; some reflects the simple fact that as organisations deploy more technology, they expose more vulnerabilities faster than they can secure them. What matters for this argument is not the cause of the compression but its effect on the utility of telemetry.

In the human-adversary era, telemetry was retrospective. It helped analysts reconstruct what happened and respond. Retrospective value was real value, because the window between attack and consequence was long enough for retrospective review to matter. As that window compresses, retrospective telemetry retains forensic utility but loses defensive utility. The telemetry still exists. It is still expensive to maintain. It still represents attack surface. What it no longer does, in many cases, is prevent the attack from completing before anyone reviews it.

The structural mechanism: observation and execution have fused

The third mechanism is the sharpest and the newest, and Section 2’s Inversion 2 made the core case for it at length: in agentic AI systems, context is simultaneously a record of what happened and an instruction set for what happens next. That inversion is not a quirk of particular agent architectures. It is a structural feature of any system that reads its own context to decide what to do — which is what makes AI agents useful and what makes them a new kind of security object.

For the purposes of this section’s argument, what matters is how this fusion relates to the other two mechanisms. Cheap storage drove the accumulation. Compressed attack windows eroded the defensive utility of the accumulation. The observation-execution fusion changes what accumulation is. In classical telemetry, collection created attack surface on the security infrastructure surrounding the system. In agentic systems, collection creates attack surface inside the system’s own decision process. The MINJA and eTAMP research cited in Section 2 demonstrated this in production-representative settings. The attack vector is the telemetry itself.

The consequence is that defenders trained in classical separation cannot reason about this using classical tools. Traditional SIEM-style detection, built for passive logs, cannot distinguish poisoned memory from legitimate memory without a provenance layer the SIEM was not built to provide — and even once a poisoned record is identified, removing it without losing the agent’s usable context requires the trust-scoring and quarantine machinery that Principle 2 will prescribe. A vault protecting memory from external writes does not protect against writes the agent itself performs in response to crafted queries. Traditional visibility controls assume observation and execution are separable. They are not, in the systems the industry is now deploying.

Three mechanisms, one outcome

The three mechanisms operate on different timescales and through different logics — Jevons over decades, the detection-gap mechanism over the last two years, the observation-execution fusion only since agentic AI reached production — but they converge on the same outcome. The industry’s continued investment in expanded observability is a response shaped by the old conditions, applied to new ones where the response has inverted.

What makes this structural rather than accidental is worth treating on its own. There is a physical principle the industry has forgotten. It predicts, with no appeal to the specifics of SIEMs or AI agents or regulatory interpretation, that what the three mechanisms are producing is exactly what should be expected.

4. The Physical Principle We Forgot

Cheap storage, compressed attack windows, the fusion of observation and execution: the three mechanisms just described each tell a modern story. Together they explain what has changed. They do not explain why the outcomes were predictable, or why no amount of better tooling will reverse them.

This is the pivot on which the whole manifesto turns. Without the physical principle that follows, the three mechanisms in section 3 look like a coincidence — three unrelated problems that happened to surface at the same time, each fixable on its own terms. With it, they look like a single phenomenon showing up in three manifestations. The difference matters. If the mechanisms are coincident, better tooling plausibly reverses them. If they are manifestations of something deeper, better tooling deepens the problem. The industry has been betting on the first reading for twenty years. This manifesto argues the second.

What is meant by “physical principle” here matters, because the argument that follows depends on the reader holding the right version of the claim. What follows is an analogy to classical physics, not an identity with it. Physical observation disturbs a system through direct energy exchange between measuring apparatus and measured system. Computational observation disturbs a system through resource consumption, attack-surface creation, and — in agentic systems — decision coupling. These are different mechanisms. What carries across is not the mechanism but the structural fact that measurement is not free, that the cost compounds with scale, and that engineering intuitions built on “observation is approximately costless” break when the approximation stops holding. When this manifesto speaks of a physical principle, what it means is that the compounding cost behaves like a physical cost in the ways that matter to engineers — it is paid whether you account for it or not, it accumulates, and it cannot be reduced to zero by cleverness. The analogy is the spine of the argument. The identity is not claimed.

For that, the industry needs to recover a principle it has forgotten. It is a principle from physics, older than computing, and it has been hiding in plain sight throughout this argument.

Observation is physical

In physics, the observer effect is the disturbance of a system caused by the act of observing it. The principle is classical, not quantum. It applies in ordinary macroscopic systems as readily as in microscopic ones.

The canonical example is a tyre pressure gauge. To measure the pressure in a tyre, some air must enter the gauge. The air that enters the gauge comes from the tyre. The measurement itself reduces the tyre’s pressure, slightly but irreducibly. A mercury thermometer must absorb or release thermal energy to equilibrate with the body it measures, and by doing so it changes that body’s temperature. These are not quantum effects. They are a consequence of the simple fact that measurement requires interaction, and interaction requires exchange of energy, matter, or information.

For most engineering purposes, observer effects are small enough to ignore. The tyre loses a negligible fraction of its pressure to the gauge. The body loses a negligible amount of heat to the thermometer. Engineers treat observation as approximately free and move on. The approximation has held well enough across most of classical physics that it stopped being remarked upon. Observation became synonymous with knowing, and knowing was treated as a costless act.

The approximation breaks down in two places. First, at quantum scales, where the energy exchanged in measurement is comparable to the energy of the system being measured, and the disturbance can no longer be ignored. Second, and more relevantly for this argument, in any system where the act of measurement produces side effects that compound over time and at scale.

Security telemetry is a system of the second kind.

The whole argument reduces to this. Observation requires interaction. Interaction consumes resources and creates consequences. Those consequences are negligible when you observe a little and significant when you observe a lot. Classical security was built in an era where the costs were small enough to ignore, so the industry stopped thinking about them. The costs did not stop existing. They compounded, invisibly, until they exceeded the value the observation was meant to provide.

Every observation produces a trace, and the trace has consequences

When a SIEM ingests a log line, the system being observed is not left unchanged. The log line exists. It must be stored. It must be indexed. It must be transmitted. It must be protected. It must be retained for some period of time. It must eventually be deleted, or it accumulates. Each of these operations consumes resources, and each creates new surfaces on which failures, compromises, or exfiltration can occur.

Take a mundane example. A Windows system logging Event ID 5156 records every allowed network connection. At modern scales this runs into billions of events per day across a large enterprise. Each event is stored on physical media, transmitted over physical networks, indexed by physical compute, retained for a physical duration. The storage is not metaphorical. The bandwidth is not metaphorical. Each event represents real resources consumed by the act of observation, and each one contributes to the surface on which an attacker can operate — the indexing engine that can be poisoned, the retention policy that becomes a discovery vector, the data lake that can be exfiltrated.

A practitioner might reasonably respond: organisations have always budgeted for storage and compute. What is new? The answer is that financial accounting captures the direct costs of observation — the line items on procurement invoices — but it does not capture the downstream costs. The attack surface added by a new log source does not appear on a budget. The analyst attention consumed by the alerts that source produces does not appear on a budget. The decision capacity absorbed by the correlation rules written against that source does not appear on a budget. These costs are real but they are paid in resources that sit outside the accounting. The observer effect is the reminder that they are paid anyway.

Classical security treated telemetry as passive. Once collected, the argument went, the data simply sat there waiting to be useful. But sitting there is not a null operation. Sitting there is a state that must be maintained. Maintaining that state is an ongoing act of observation, and that ongoing act continues to perturb the system.

The observer effect, applied to security, says something simple: there is no such thing as observing a system at zero cost.

In agentic systems, observation and execution are the same operation

The observer effect has always applied to security telemetry. What is new — and what makes the agentic era qualitatively different rather than merely more expensive — is that in AI agent systems, the observation-execution fusion documented in Sections 2 and 3 changes what the cost of observation actually is.

In classical telemetry, the cost was paid in storage, attention, and surface area. Those costs were real but they were paid by the security infrastructure surrounding the system. In agentic systems, the cost is paid inside the decision process itself. Every piece of context collected to make an agent useful is a piece of context that shapes what the agent does. Every piece of context an attacker can corrupt is a piece of context the agent will act upon. The observer and the observed are not separable.

The implication for the observer effect is that in agentic systems, the feedback loop closes in a way it never did before. Classical telemetry had an asymmetry: collection happened in one place, and action happened elsewhere. That asymmetry let defenders reason about telemetry as a bounded cost — expensive, possibly excessive, but contained. The asymmetry does not exist in agentic systems. The collection is the action. The record is the instruction. The observer is the observed.

The consensus defensive response has not yet caught up to this. The standard prescription for AI security risk, almost universally, is more observability — more logging of agent interactions, more telemetry about tool invocations, more memory of past decisions. That prescription assumes the old asymmetry. In a system where observation and execution are coupled, collecting more telemetry about an agent increases the surface on which the agent’s own decisions can be hijacked. The more observable you make the agent, the more of its decision process you expose to manipulation.

Information has a cost, by Shannon

Information theory reinforces the point from a different direction. Shannon’s 1948 theorem established that information is physical — measurable in bits, bounded by channel capacity, paid for in energy, materials, and bandwidth. Shannon alone does not say that more telemetry is harmful. He says that more telemetry is never free. The observer effect says the costs compound. Together they close the gap between “cheap at the margin” and “costly in aggregate” — which is the gap the industry has been walking through for twenty years.

Why this is structural, not accidental

The observer effect explains why the three mechanisms described earlier produced the outcomes they did, rather than being fixable through better engineering.

When storage became nearly free, organisations stopped pricing the observer effect into their collection decisions. The marginal financial cost was negligible, so the decision to collect one more log source no longer required justification. But the observer effect was still operating. Each new log source still consumed resources, still created attack surface, still competed for analyst attention — just invisibly, below the threshold where accounting captured it. The Jevons paradox described earlier explains why consumption expanded. The observer effect explains why that expansion was not costless even when it looked costless.

When attack windows compressed, organisations could not compensate by collecting more. More telemetry increased the observer effect proportionally — more storage, more indexing, more transmission, more attack surface — without proportionally increasing the rate at which the telemetry could be acted on. The detection gap widened precisely because the only lever the industry knew how to pull had diminishing returns in a regime where speed mattered more than completeness.

When observation and execution fused in agentic systems, the observer effect took on a new and qualitatively different character. Collection became not just costly but structurally coupled to the attack surface being created. This is the sharpest of the three cases, and the one most poorly served by classical visibility thinking.

No amount of better tooling reverses this. A more efficient SIEM still obeys the observer effect. A better memory architecture still couples observation and execution. The industry’s response, which is to collect more, correlate harder, and instrument earlier, is rational within a framework that treats observation as free. In a framework that treats observation as physical, it is directly counterproductive.

What this implies for practice

If observation is not free, the discipline the industry needs is to treat collection as a cost-bearing decision rather than a default.

This is not an argument against observability, which remains essential. The argument is that observability must be priced. Every collection decision must be made with explicit awareness of the ongoing physical costs — storage, attention, attack surface, coupled decision risk — that the collection will impose. Collections that survive that accounting are real investments. Collections that do not survive it are liabilities being held as if they were assets.

The industry’s current approach records the benefits of collection without the costs. The observer effect corrects the accounting. When the accounting is corrected, many current collection practices stop making sense.

5. The Replacement Discipline

The industry needs a discipline for deciding what to observe. Not a framework. Not a maturity model. A habit of thinking that makes the cost of observation visible at the point of decision, rather than absorbing it as background noise.

Four principles.

Principle 1: Price observation explicitly

Every collection decision is a cost-bearing decision. The cost is not just financial.

A complete accounting of the cost of collecting a new telemetry source includes, at minimum: the storage footprint it will add, the bandwidth it will consume, the indexing overhead it will impose, the analyst attention it will absorb, the correlation capacity it will use, the attack surface it will expose, and — in agentic systems — the decision surface it will couple. Not all of these costs can be quantified precisely. Some can be estimated. Some can only be acknowledged qualitatively. But every one of them is real, and a collection decision that ignores them is not a decision; it is a default.

The practical discipline is to require each collection to justify itself against its full cost. A log source being added to a SIEM should be able to answer: what detection does this enable, what incident does this help investigate, what question does this help answer — and does that value exceed the footprint, attention, surface, and coupling it adds? Sources that cannot justify themselves against this full cost are liabilities held as if they were assets.

This is not an argument for austerity. It is an argument for accounting. Many collections will still justify themselves easily. Authentication logs from the domain controller are worth their cost many times over. Endpoint process creation on crown-jewel systems is worth its cost. Anomaly signals from outbound traffic on financial systems are worth their cost. These collections are investments. The principle applies to the thousands of collections that sit next to them without being asked the same question — the compliance-only logs in hot tiers, the verbose operational telemetry with no detection rule, the redundant audit trails collected because the vendor default collects them.

Principle 2: Verify context before it reaches decisions

In agentic systems, this principle becomes load-bearing.

Every piece of context an AI agent consumes is a piece of context that influences its behaviour. Every piece of context therefore needs an answer to the question: where did this come from, and do we trust it? Context without provenance is context without a trust label, and context without a trust label cannot be safely consumed by a system that treats all context as equally authoritative.

The mechanisms for this already exist in adjacent fields. The W3C PROV data model defines how to represent provenance for data. Cryptographic attestation can bind context to its source. Trust-aware retrieval — proposed in the MINJA defence literature and elsewhere — can weight context by the confidence the system has in its origin. None of these are new inventions. What would be new is making them a first-order discipline rather than a research topic.

The operational principle is simple. Context entering an agent’s decision process must carry provenance. Context without verifiable provenance should be either refused, quarantined, or consumed only with explicit awareness of the reduced trust. The architectural principle is that memory, retrieval stores, and tool outputs are not trusted by default — they earn trust through the chain that produced them.

This is harder than it sounds. Most agent architectures do not currently track provenance well. Many memory systems write without attribution. RAG pipelines often treat retrieved documents as trusted because they were retrieved. These are defaults inherited from a pre-adversarial era when the worst consequence of bad context was a wrong answer. In an adversarial era, the worst consequence is a hijacked decision.

There is a potential tension here worth resolving. Principle 2 prescribes adding mechanism — cryptographic attestation, provenance tracking, trust-aware retrieval — while the broader discipline in this manifesto argues for subtraction and economy of mechanism in the Kerckhoffs-Saltzer-Schroeder tradition. These are compatible, not contradictory. Economy of mechanism does not forbid addition; it requires that what remains trusted be small. Cryptographic verification is exactly the kind of mechanism that tradition endorses, because it reduces the trusted base rather than enlarging it. Before verification, an agent trusts all context it retrieves. After verification, it trusts the signing authority and the verification logic — and nothing else in the context path. The added mechanism substitutes for a much larger surface of implicit trust that the agent would otherwise carry. Kerckhoffs’s own principle is the precedent: adding the key-based encryption machinery was addition, but what it produced was a system where only the key needed to stay secret, which was a dramatic reduction in what had to be trusted. Principle 2 is doing the same move for context provenance. The discipline is not “fewer lines of code”; it is “a smaller trusted base.” Verification buys the second by spending on the first, which is the trade the tradition was built to make.

Principle 3: Preserve crown-jewel observation at full fidelity; discipline the rest

The argument is not that telemetry is harmful. It is that the marginal unit of additional telemetry, beyond some organisation-specific threshold, generates more cost than value. Where that threshold sits depends on the system.

By “crown jewel” I mean specifically a system whose compromise produces material business consequence that the organisation cannot absorb operationally — typically the systems whose loss would trigger regulatory disclosure, halt revenue-generating operations, or expose data the organisation has fiduciary obligation to protect. The term is deliberately narrow. Most systems in a modern enterprise are not crown jewels by this definition, and applying the term strictly is part of what makes Principle 3 operational.

For crown-jewel assets — domain controllers, authentication systems, financial transaction engines, clinical decision systems, source code repositories, customer data stores — the threshold is high. The value of visibility into these systems is substantial, the cost of missed signals is severe, and the collection should be comprehensive. Full-fidelity logging, aggressive retention, detailed correlation: all justified, all worth their cost.

For everything else, the threshold is lower than organisations typically assume. Kubernetes readiness probes firing every ten seconds are not crown-jewel signal. Windows Event ID 5156 allowed-connection logs across an entire estate are not crown-jewel signal. VPC flow logs between internal load balancers and backends are not crown-jewel signal. Compliance-only audit trails in hot tiers are not crown-jewel signal. These collections populate data lakes and consume SOC attention without commensurate defensive value. They are where the discipline has room to cut.

“Full fidelity” means something different in an agentic crown-jewel system, because Section 4’s argument removes the classical visibility-asymmetry that Principle 3 otherwise assumes. For classical crown jewels — domain controllers, transaction engines, data stores — full fidelity means the traditional thing: comprehensive logs, long retention, aggressive correlation. For an agentic crown-jewel system — a customer-facing agent that takes consequential action, a clinical decision agent, a trading agent — full fidelity does not mean the same thing, because adding more telemetry to the agent’s own decision loop expands the attack surface rather than reducing it. The agentic analogue of full fidelity is the provenance, trust-scoring, and retrieval-discipline machinery Principle 2 prescribes: comprehensive verification of what enters the agent’s decision process, and comprehensive externalised observability that flows outward to systems the agent cannot read back. The crown-jewel instinct — spend on what matters most — applies. What the spending buys in the agentic case is different.

Telemetry is not uniform. Treating it as uniform — ingesting everything at hot-tier rates, correlating everything against everything else, retaining everything for the longest compliance period — is what produces the outcomes this manifesto opened with: drowning analysts, missed signals, enormous spend without commensurate defensive improvement. Tiered collection, with crown-jewel sources at full fidelity and the long tail subject to sampling, summarisation, or deletion, is visibility-discipline in practice.

Mature practitioners have been making this distinction informally for years. The discipline is to make it systematic.

Principle 4: Retrospective and real-time telemetry are not the same economic object

Telemetry collected for retrospective review has different value than telemetry that can trigger action at machine speed.

In the human-adversary era, this distinction was collapsible. Retrospective telemetry had real defensive value because the window between compromise and consequence was long enough for human review to intervene. That window has narrowed. In some classes of attack — ransomware, rapid-exploitation campaigns, agent compromise — the window is effectively closed to human response. Retrospective telemetry retains forensic utility but loses defensive utility.

Retrospective telemetry is not therefore worthless. It remains essential for post-incident investigation, for learning across incidents, for threat hunting, for trend analysis, and for the kind of retrospective studies that informed this manifesto. What it no longer does, in many cases, is prevent the attack that is happening now. That is the distinction the discipline has to be built around.

Real-time telemetry, in contrast, has defensive value in proportion to the speed of the response it can trigger. A signal that arrives in milliseconds and triggers an automated action that takes milliseconds produces defensive value even against machine-speed attack. A signal that arrives in milliseconds and enters a queue for human review tomorrow produces only forensic value.

These two kinds of telemetry should be budgeted and managed separately. Retrospective telemetry should be held cheaply — cold storage, long retention, searchable on demand — but not treated as defensive. Real-time telemetry should be held expensively and aggressively processed, but scoped tightly to the decisions it can actually drive. Conflating them, as the uniform SIEM model does, spends real-time-grade money on retrospective-value data and accepts retrospective-grade latency on real-time-critical signals.

This is where the discipline meets the speed problem the manifesto opened with. Reducing undisciplined telemetry does not, on its own, make defenders faster — and it does not free stream-processing compute in any architecturally meaningful way, because in modern SOC stacks the real-time pipeline (stream processors feeding SOAR) and the retrospective pipeline (data lakes and cold storage) do not share a resource pool. What discipline frees is something subtler and more consequential. It frees analyst confidence. A SOC drowning in retrospective alerts runs its automated playbooks behind human-in-the-loop approval gates, because the false-positive tolerance of an uncurated alert stream is too high to let automation act unsupervised. The 22-second gap between initial access and hand-off to a secondary threat group cannot be closed by a playbook that waits for a human to click approve. It can only be closed by automation that runs without the gate. And the gate only comes off when the signal feeding the playbook has been curated to a fidelity the SOC actually trusts. Disciplined collection is the work that earns that trust. Principle 4 is the bridge between visibility-discipline and defensive speed, but the bridge is built of signal quality and decision authority, not of compute and storage.

Crown-jewel signal is often real-time signal. Long-tail telemetry is often retrospective. The economic distinction between the two reinforces the fidelity distinction in the previous principle.

A lineage: subtraction as a security discipline

The instinct behind these principles is not new. Security has a long tradition — often honoured more in theory than in practice — that rests security on the smallest possible trusted surface rather than on accumulated machinery.

Kerckhoffs articulated it for cryptography in 1883. In La Cryptographie Militaire, Auguste Kerckhoffs laid out six design rules for military ciphers. The second of these, the one that became his principle, insisted — in his own wording — that a cryptosystem “must not require secrecy, and must be able to fall into the hands of the enemy without inconvenience.” Shannon would later tighten this in 1949 to the modern formulation that a cryptosystem’s security should rest on the secrecy of the key alone, on the assumption that “the enemy knows the system.” The insight either way is that every secret a system requires is a potential point of failure; secrecy is brittle; open design, with a small well-protected secret, is robust. Kerckhoffs’s principle in this combined form has been the foundation of serious cryptography ever since.

Saltzer and Schroeder extended the instinct to computer security in 1975. Their paper The Protection of Information in Computer Systems proposed eight design principles that still anchor the field. The first was economy of mechanism: keep the design as simple and small as possible. Their argument was practical — the fewer moving parts a security system has, the fewer places an attacker can find a weakness, and the fewer places a defender needs to audit to trust it.

Visibility-discipline extends the same instinct to observability. The security attack surface an organisation commits to defending should be the smallest one that does the job. Addition requires justification. Subtraction is the default.

This reframes the four principles as applications of a single older idea rather than novel inventions. Price observation explicitly — because every collection is a piece of mechanism that has to earn its place. Verify context before it reaches decisions — because every piece of untrusted context is mechanism the system trusts without having earned the right to. Preserve crown-jewel observation at full fidelity; discipline the rest — because economy of mechanism applies unevenly, and the rarely-valuable should be the first to go. Distinguish retrospective from real-time telemetry — because conflating them adds mechanism that defends nothing and exposes much.

None of this is an argument against observability. It is an argument that observability should be built the way cryptography was built: on a foundation small enough to trust.

The same reasoning extends beyond observability, and this is worth stating plainly. The security industry has spent two decades accreting controls the way it has accreted telemetry — each one justified in isolation, each one adding mechanism the organisation is now obliged to maintain, monitor, and defend. A modern enterprise security programme carries identity platforms, endpoint agents, network sensors, email gateways, DLP systems, CASB layers, SSE frameworks, a dozen or more SaaS security tools, and the integrations between all of them. Each was bought to solve a real problem. The aggregate is a surface so large that no single team understands it, no single audit can validate it, and no attacker needs to defeat all of it to find a way through.

Kerckhoffs and Saltzer-Schroeder would recognise the pattern. The same discipline applies: protect what matters, build on the smallest trusted surface you can, and treat every additional control as a piece of mechanism that has to earn its place against its full cost — operational, architectural, human, and adversarial. A security programme that adds ten controls and maintains them poorly is weaker than one that runs four controls well. This is not an argument for minimalism as an aesthetic. It is an argument that accumulation without discipline produces the same inversion in security architecture generally that it has already produced in observability specifically: surface grows, attention thins, adversaries find the gaps that the accumulation itself created.

This manifesto focuses on observability because that is where the inversion is sharpest, most recent, and most clearly demonstrable. The broader claim — that security programmes should be built the way cryptographic systems are built, on the smallest trusted foundation that does the job — is older than this manifesto and will outlast it. The four principles of visibility-discipline are one domain’s answer to an older question. The question applies everywhere.

What the principles change

The question the industry has been asking is: what else should we collect? These principles propose a different question: what are we already collecting that cannot justify its cost?

The first question has a market of vendors answering it. The second question does not. Organisations will have to develop the answer themselves, source by source, context by context, with the principles above as the test.

This is slow work. It is also the work that makes visibility-discipline real.

6. What Changes in Practice

The four principles are abstract. Their application depends on where an organisation sits — what it is defending, who is asking questions about it, and what adversaries it faces. Three contexts are worth describing in detail because they are where the manifesto’s audience actually works and where the tension between principle and practice is sharpest.

None of what follows is prescription. Each context has its own constraints, regulators, and histories that a document like this cannot account for. What the examples show is the shape of visibility-discipline when it meets real operating conditions.

Applicability matters before the examples, because the prescriptions in this section do not apply equally to every organisation. The sharpest fit is organisations with enough maturity to have accumulated observability debt — mid-to-large enterprises running multi-tier SIEMs, regulated firms with historical compliance-driven collection, and any organisation deploying agentic AI with persistent memory, regardless of size. For organisations still building their first detection capability, the marginal unit of additional telemetry is probably still positive, and the right response is the standard playbook — coverage, correlation, staffing — not this manifesto’s. By asset class, the prescriptions apply most clearly to general-purpose security telemetry (SIEM, XDR, data-lake ingestion) and to AI-agent context. They apply less cleanly to regulated record classes with specific retention obligations — transaction records, MiFID II communications records, HIPAA audit trails — where regulation sets the floor and the manifesto’s discipline applies only to what sits above that floor. By regulatory regime, the prescriptions survive engagement with outcome-oriented regimes (UK operational resilience under SYSC 15A, DORA’s framing layer) more cleanly than with prescriptive regimes (PCI-DSS logging, HIPAA audit trails), though in both cases the manifesto’s narrower claim — that defensive interpretation over-collects beyond what the text strictly requires — holds. A CISO reading this section should calibrate against their own position on these three axes before deciding what to act on.

In regulated financial services

The tension in regulated industries is not between collecting more and collecting less. It is between what regulation actually requires and what defensive compliance interpretation has made habitual.

DORA, NIS2, and comparable regimes are prescriptive on specific record classes — transaction records, incident logs, operational monitoring — and outcome-oriented at the framing level about overall resilience. DORA Article 9 and the RTS on ICT risk management require continuous monitoring and specify what must be logged, for how long, and how protected. These are real obligations. What the regulations do not do is prescribe the scale of general-purpose security telemetry, or equate defensive resilience with unbounded SIEM ingestion. A firm that can demonstrate impact tolerance with carefully-scoped logging — meeting the prescriptive record-class requirements, and meeting the framing-level outcome tests — satisfies the regulation as surely as one that collects everything and searches little.

The defensive interpretation that has come to dominate says the opposite. Collect everything, retain everything, search nothing, because a regulator asking a question you cannot answer is worse than a regulator asking a question you have to sift through a petabyte of logs to answer. This interpretation is understandable — it is the risk-averse choice when ambiguity sits between the firm and the regulator — but it contributes to the shadow-data surface the IBM Cost of a Data Breach Report documents. Compliance-driven telemetry, collected defensively, held in hot tiers indefinitely, and forgotten, joins other sources of unmanaged data — uploads to unsanctioned cloud services, forgotten production databases, personal drives — in producing the 35% of breaches where an attacker exploited data the organisation held but did not actively manage.

Applying visibility-discipline in a regulated context starts with separating two questions. What does the regulation actually require? What has the firm come to collect because “the regulator might ask”? The first question has specific answers, often narrower than the second. The second question has no answer, which is why it expands without limit.

The practical move is to engage with the regulator directly on what outcomes the firm is demonstrating, and what telemetry genuinely supports those outcomes. DORA, in its actual text, is more sympathetic to this conversation than most firms assume. The regulator is not the adversary the defensive interpretation imagines. This is a reading that should be tested with specialist counsel before any firm restructures its telemetry posture against it — supervisory expectations around evidence quality are real and they vary across jurisdictions — but the reading is defensible, and the conversation with the regulator is one that firms under-use.

A complication needs naming. Even when the regulatory text supports a more disciplined telemetry posture, no individual firm benefits from being the test case. The first firm to argue for reduced collection in front of its supervisor risks heightened scrutiny, longer dialogues, and the kind of supervisory attention firms work hard to avoid. This is a real collective-action problem, and it constrains what individual CISOs can plausibly do alone. The leverage to shift defensive interpretation across an industry probably sits with regulator-led clarification — supervisory statements, dear-CEO letters, RTS amendments that explicitly distinguish prescriptive record-class requirements from general telemetry scale — and with industry-coordination bodies that can argue the case collectively. Individual firms can still apply visibility-discipline within the spaces the regulation leaves to firm judgement, and they should. But the broader inversion of defensive over-collection in regulated industries will be unwound through regulator-led and industry-coordinated channels more than through individual firms’ supervisory dialogues. A CISO who recognises this can position the firm to benefit when those channels move, without taking on the disproportionate cost of being first.

This is harder than it sounds. Compliance teams have institutional reasons to prefer collecting more. Internal audit has institutional reasons to prefer collecting more. Legal has institutional reasons to prefer collecting more. A security leader arguing for disciplined reduction is arguing against the gravitational pull of three other functions whose incentives favour accumulation. The argument is winnable but it requires explicit sponsorship and a tolerance for discomfort that most security programs do not currently have.

In AI agent deployments

The tension in AI agent systems is different. There is no regulation yet pushing organisations to collect more agent telemetry. There is, instead, a prevailing vendor and research consensus that more observability is the answer to agent security risk. The pressure comes from the technology’s own trajectory.

The prescription organisations are receiving — instrument the agent comprehensively, log every interaction, build memory of every decision — is exactly the prescription this manifesto argues against. In agentic systems, observation and execution are coupled. Every context surface instrumented for defensive visibility is a context surface an attacker can use to manipulate the agent’s behaviour.

Applying visibility-discipline here starts with rejecting the default architecture. An agent deployment that treats all context as equally trusted, writes everything to memory, and retrieves aggressively is structurally vulnerable in ways that no amount of runtime monitoring will fix. The MINJA research demonstrated that the default architectures are exploitable at high rates with minimal adversary capability.

The alternative discipline has three characteristics. First, provenance is a first-order concern. Context entering the agent’s decision process carries a trust label derived from its source. Unattested context is either refused, quarantined, or consumed with explicit reduction in trust. Second, memory is tiered. Not every interaction earns persistence; the agent’s long-term memory becomes a designed artefact rather than an accumulator. Third, the agent’s observability is distinct from its context. The telemetry used to monitor the agent flows outward to separate systems; it does not re-enter the agent’s own decision loop.

These are not novel inventions. OWASP’s ASI06 guidance gestures at them. The MINJA defence literature proposes related mechanisms. What would be new is making them the default architectural discipline rather than a research afterthought.

The difficulty here is timing. Organisations are deploying agentic systems right now, at speed, with architectures optimised for capability rather than security. The visibility-discipline move — provenance tracking, tiered memory, separated observability — requires either rebuilding or inheriting that rebuild from upstream frameworks. Neither is fast. The organisations that apply the discipline early will carry less technical debt than those that don’t. The organisations that apply it late will carry the weight of compromised memory systems they cannot cleanly remediate.

In Security Operations Centres

The tension in the SOC is the most mature of the three, because the Jevons-paradox problem has been building there for two decades. Analysts, vendors, and CISOs all know the SOC is drowning. The response has been to add more tooling, which has made the problem worse.

Applying visibility-discipline in the SOC starts with honest accounting of what current collection actually produces. Most SOCs have never audited their telemetry sources against the first principle — what detection does this enable, what incident does this help investigate, what question does it help answer, and does its value exceed its full cost? That audit, conducted seriously, will typically find that a substantial fraction of ingested telemetry fails the test. The Graylog 2026 practitioner data suggests less than 15% of available sources are ingested, and even within that 15%, practitioner estimates place the share providing genuine investigative value at under 40%.

The practical work is to tier collection deliberately. Crown-jewel sources at full fidelity, long-tail operational telemetry at reduced fidelity or in cold storage, compliance-only logs routed to cheap retention rather than expensive ingestion. None of this is technologically novel; the pipeline tooling exists. What is novel is treating the tiering as a discipline the organisation applies systematically rather than as an optimisation project the SOC occasionally attempts when the SIEM bill arrives.

There is a second move, and it is harder. Much of what a modern SOC collects is retrospective rather than real-time. That distinction was collapsible in the human-adversary era; it is not collapsible now. The discipline is to separate the two budgets — to scope real-time collection tightly to the signals that can actually drive automated action at machine speed, and to hold retrospective telemetry cheaply in the background without pretending it provides defensive value.

The difficulty in the SOC is cultural. Security teams have been rewarded for coverage metrics, not for disciplined selection. Reducing telemetry volume feels, to a mature security program, like reducing security. The distinction this manifesto insists on is that reducing disciplined telemetry volume is reducing security; reducing undisciplined telemetry volume improves it. Making that distinction real inside the SOC requires leadership willing to be measured on outcomes rather than coverage, and analysts willing to let go of the comfort of seeing everything.

Each context is different, but the pattern inside each is the same. There is a default that has grown up around an older assumption about what observation is for. There is pressure to keep the default in place, whether regulatory, architectural, or cultural. And there is a version of visibility-discipline that applies but requires explicit leadership willingness to apply it against the grain.

The work the examples describe is not easy. It is not quick. It will not be done uniformly, or well, or at all in many organisations. What changes in practice is not a universal rollout; it is the emergence, inside a small number of organisations, of a disciplined account of what is being collected and why. Those organisations will carry less operational debt into the next decade than the ones that continue to accumulate without justification, and the difference will become visible when the next generation of compromises arrives.

What a security leader can do on Monday

The prescriptions above are strategic. Organisational change on telemetry posture, regulatory posture, or SOC automation authority takes quarters, not days — and the manifesto would be dishonest to pretend otherwise. But there is first-week work that sits under leadership authority and does not require cross-functional political lift. Five items:

Inventory, by detection use-case, what is being collected and what it is being used for. Run the exercise for one high-volume source category — Windows event logs, VPC flow logs, or Kubernetes operational telemetry. For each log class, name the specific detection rule, hunt query, or incident-response workflow it supports. Collections without a named consumer are candidates for tier demotion or removal. This is a few days of work with a SOC engineer and produces the evidence base for every later decision.
Separate real-time and retrospective telemetry budgets in internal reporting. Do not yet change contracts or architectures — just make the two line items visible to whoever approves the security budget. The act of separation makes the economic distinction Principle 4 describes visible at the point of decision, and the visibility does work on its own.
Require, for any new agentic system being deployed, a one-page context-provenance description. What sources feed the agent’s decision loop? What validates them? What happens if a source is compromised? This is Principle 2 applied at intake, with no supplier dependency and no architectural change. Systems that cannot answer the questions do not get deployed until they can.
Identify the three SIEM sources with the highest ingestion volume and the lowest linked-detection count. These are the first candidates for tiering, sampling, or archival demotion. The change can be scoped, piloted on a non-production deployment, and measured for false-negative impact before production rollout. Nothing in this item requires regulator engagement, vendor renegotiation, or org-chart change.
For the regulated-industry reader: draft, do not send, a one-page note to your Risk function laying out how carefully-scoped telemetry satisfies your regulatory obligations at least as well as maximal ingestion. The note is not for the regulator. It is the internal document that would need to exist before any supervisory conversation could responsibly begin. The discipline of writing it forces the firm to name its own assumptions. Many firms discover they do not know what their own posture actually rests on.

None of these actions prescribes an outcome. All of them produce information or structure that later decisions can rest on. Leaders who want to apply the manifesto without waiting for cross-functional alignment can start here.

7. What This Manifesto Claims, and What It Does Not

Manifestos are supposed to stake positions. This one does. Before it ends, it is worth making the positions explicit — both the ones I am defending and the ones I am not.

What I am claiming

Visibility-maximisation was a defensible security strategy under historical conditions. Telemetry was expensive enough that collection decisions were forced to justify themselves. Attackers operated at human speed, so retrospective review had real defensive value. Telemetry was passive — it sat in storage waiting to be useful, and it did not re-enter the decision process of the system it was observing. Under these conditions, collecting more was usually better, and the industry’s instincts served it well.

These conditions have changed. Storage is effectively free, so collection no longer forces its own justification. Attack windows have compressed, so retrospective telemetry loses defensive value even as its cost remains. Agentic AI systems consume their own observability as input to decisions, so collection and attack surface are now coupled in ways they never were before. The instincts built under the old conditions continue to produce the old responses, but the old responses now generate more cost than value at the margin.

The observer effect, as a classical physics principle, applies here. Observation requires interaction; interaction has consequences; consequences accumulate. Security telemetry was treated as a null operation — collected and stored and indexed and retained as if none of those acts cost anything — because under historical conditions the costs were small enough to ignore. They are not small enough to ignore any longer.

The four principles in section 5 are the minimum set I believe make visibility-discipline coherent. Price observation explicitly. Verify context before it reaches decisions. Preserve crown-jewel observation at full fidelity while disciplining the rest. Distinguish retrospective telemetry from real-time. These are not the only ways to practise the discipline, but I do not see how to practise it without them.

I am also claiming, more broadly, that the instinct these principles apply to observability is not confined to observability. Security programmes should be built the way cryptographic systems are built — on the smallest trusted surface that does the job, with every additional mechanism required to justify its place against its full operational, architectural, human, and adversarial cost. This claim is older than this manifesto; Kerckhoffs made it for cryptosystems, Saltzer and Schroeder made it for computer systems, and every serious security thinker since has restated it in one vocabulary or another. What is new is that the three conditions which made accumulation-without-discipline tolerable — cheap resources, slow adversaries, passive mechanisms — are all inverting at once, and not only in observability.

Cheap cloud identities made IAM systems accrete roles past anyone’s ability to audit them — every developer can now create a service principal in seconds, and the same economic logic that made log collection expand without bound has made role accumulation expand without bound, such that the trusted boundary of an enterprise’s identity system has become impossible to draw precisely and impossible to defend comprehensively. Compressed patching windows made vulnerability management programmes run faster than their ability to verify the fixes — the window between disclosure and exploitation has closed to days or hours, organisations patch at that speed to stay ahead, and the verification and regression-testing discipline that should accompany a patch is skipped under time pressure, producing a patch pipeline that moves faster than its own defensive guarantees. Passive controls are becoming active ones as AI-assisted attack tooling moves security machinery from “watched by humans” to “probed by machines” — tools designed under the assumption that an attacker would not systematically enumerate their behaviour are now routinely enumerated by automated adversarial tooling, which means the asymmetry on which those tools depended is gone. The same inversion is happening in three or four other domains; observability is simply where it is sharpest right now. This manifesto makes the case in that one domain and leaves the work in the others for the people closest to them, but the pattern is the same and the discipline will be too.

What I am not claiming

I am not claiming that observability should be abandoned. Observability remains essential. The argument is that observation must be priced, not that it must be reduced to zero. Crown-jewel systems deserve comprehensive visibility. Real-time signals that can drive automated response deserve the infrastructure to collect and process them. What I am arguing against is the treatment of observation as a default rather than a decision.

I am not claiming that every organisation has crossed the inversion point, and the strongest hostile reading of this manifesto attacks precisely this point. The reading is worth stating in full before answering. A hostile critic will say: “The author is inferring, from the existence of organisations that collect too much and do little with it, that the marginal collection has become net-negative. That inference is wrong. It is the badly-run SOC that is net-negative, not the additional log line. In organisations with mature detection engineering, proper tiering, real-time analytics, and threat hunting, more telemetry genuinely produces more defensive value. The inversion described here is happening in the bottom half of the industry, not the whole industry. The right response is to help the bottom half catch up, not to tell the top half to collect less.”

This critique is partly right and partly wrong, and the distinction matters. It is right that the inversion is not uniform across organisations. A well-instrumented bank with a mature SOC, disciplined detection engineering, and tight integration between telemetry and automated response is in a genuinely different position than an under-instrumented mid-sized enterprise whose SIEM ingests everything and investigates little. The IBM 2025 data showing that AI-augmented detection saves $1.9 million per breach applies disproportionately to the first kind of organisation, not the second. The manifesto’s four principles apply to both, but the threshold at which the principles bite sits in a different place for each.

It is wrong in its implicit empirical claim that the well-instrumented top tier is exempt from the inversion. The SIEM-era inversion is uneven across organisations. The agentic-era attack surface is not — even if current attack effectiveness varies with model, deployment, and defender capability. The observation-execution fusion in AI agent systems — the claim at the heart of section 2’s Inversion 2 and section 4 — is present structurally in any organisation deploying agentic AI with persistent memory and retrieval-augmented context. A bank with the best SOC in the world is not exempt from MINJA-class attacks on its agentic systems; its classical defences do not reach into the agent’s decision process. A frontier AI lab with the best detection engineering on Earth still has memory systems that can be poisoned through ordinary queries; the attack mechanism does not care how mature the surrounding SOC is. Current attack-success rates in the published research range from 95% under laboratory-adjacent conditions to roughly 20% to 40% under more realistic deployment assumptions, and that range will evolve on both sides as techniques mature. What is universal is that the surface exists and that classical maturity does not remove it. The hostile reading above is a defence of the top tier’s position on the classical axis. It does not defend the top tier’s position on the agentic axis, and that is the axis this manifesto is primarily about.

The practical implication follows. Organisations that have crossed the classical inversion — typically those with poorly-tiered SIEMs, accumulated compliance debt, and unexamined collection practices — should apply the four principles to reduce their classical telemetry surface. Organisations that have not yet crossed it — typically those with disciplined detection engineering and mature tiering — should apply the four principles to keep from drifting across it. All organisations, regardless of classical maturity, should apply the principles to any agentic system they deploy. The inversion line is not at the same place for everyone, but the direction the line is moving is the same for everyone.

The frame is the contribution. The measurement will develop as practitioners apply the frame and report what they find from their own contexts.

I am not claiming that the observer effect in security is identical to the observer effect in physics. Section 4 treats this directly — the argument rests on a structural analogy, not a physical identity, and the analogy carries because the compounding cost of observation behaves like a physical cost in the ways that matter to engineers. The reader who wants the full statement of what is and is not being claimed in that analogy will find it at the opening of section 4.

I am not claiming that the industry’s vendors are acting in bad faith. The vendors building SIEMs, XDR platforms, observability tools, and AI security products are, for the most part, solving the problem their customers are asking them to solve. The problem the manifesto names is upstream of the vendor market — it is the question the industry has been asking, not the answers vendors have been giving. Vendors will follow the question. When the question changes, so will the answers.

I am not claiming that the discipline this manifesto prescribes is costless to apply at the individual level. The political economy of security leadership is that no CISO has ever been fired for collecting too much data, and many have been fired for not having logs when an incident occurred. Visibility-discipline asks individual leaders to absorb private downside — career risk if an incident happens and the missing log was the one they cut — for distributed upside in the form of better industry posture. The compliance-asymmetry argument in Section 6 names this for regulated industries; the broader pattern applies everywhere. The manifesto’s response is not that individuals should ignore the asymmetry, but that the asymmetry is itself a symptom of how the industry has come to think about visibility — as a quasi-insurance product whose cost is paid by the organisation but whose career value accrues to the individual. The discipline this manifesto prescribes will become safer for individuals to apply in proportion to how widely the frame is accepted, which is part of why making the frame visible matters more than any single individual application of it.

I am not claiming that this manifesto is the final word. The argument has weaknesses I have named and others I have not yet identified. Honest critics will push back on the extension of the observer effect, on the empirical grounding of the inversion claim, on the political realism of reversing defensive compliance interpretation, on the timing of applying these principles in fast-moving agentic deployments. Each of these pushbacks deserves engagement. My position is that the frame is correct even where the specifics are debatable. If the frame is wrong, the specifics do not matter. If the frame is right, the specifics will be worked out by people who know more about their own contexts than I do.

What would falsify this manifesto

A position that cannot be falsified is not a position, and every manifesto owes its readers the test that would change its mind. Mine has four.

First, a sustained reversal of the dwell-time and breach-lifecycle trend lines — specifically, two or more consecutive years of declining median dwell time across M-Trends reporting, combined with breach-cost improvement distributed evenly across maturity tiers rather than concentrated in the top quartile — would cut hard against the detection-gap mechanism in section 3. If the industry can show, in the data and not in the marketing, that visibility-maximisation paired with AI-enhanced detection is producing broad-based defensive improvement, the claim that the classical inversion has been crossed becomes much harder to sustain.

Second, large-scale production deployment of agentic AI systems with persistent memory, paired with disclosed monitoring designed to detect memory-poisoning and context-injection compromise, and a sustained absence of disclosed incidents from organisations whose monitoring would have surfaced them, would cut against the observation-execution fusion argument. The current research base is strong on mechanism but thin on real-world incidents, and many incidents in this class will be undisclosed under NDA, classified, or buried inside incident-response retainers. The absence of public reports does not, on its own, falsify the fusion claim. What does falsify it is a year or two of broad agentic deployment with active detection programmes in place — programmes whose existence and scope can be verified — without the predicted compromise pattern emerging in their telemetry. If that condition is met, the fusion claim needs recalibration.

Third, a demonstrated defensive architecture that resolves the observation-execution fusion without applying something recognisable as visibility-discipline — without provenance verification, without trust-aware retrieval, without separating collection from decision input — would falsify Principle 2 specifically. If the problem turns out to be solvable by some other mechanism that this manifesto has not anticipated, the prescription is wrong even if the diagnosis is right.

Fourth, and most decisively: if practitioners who have demonstrably applied the four principles — with documented tiering decisions, provenance verification deployed in their agentic systems, and separate retrospective and real-time telemetry budgets — find that their SOCs perform worse on measurable outcomes than they did before applying them, or worse than published industry baselines such as M-Trends dwell-time medians or IBM Cost of a Data Breach lifecycle figures, then the discipline is wrong. The relevant outcomes are mean time to detect, false-negative rate on red-team exercises, and breach lifecycle duration. The manifesto is a claim about operational outcomes, and it lives or dies by what practitioners find when they try to apply it under conditions where the application itself can be verified — where “you applied them wrong” is foreclosed as a defence by the documentation requirement.

I do not expect any of these falsifications to hold. If I did, I would not have written the manifesto. But naming them is how a position distinguishes itself from a posture, and a manifesto that cannot be wrong is a manifesto that should not be trusted.

Where this leaves us

The security industry has spent twenty years treating observation as free. The conditions that made that treatment defensible have ended. The work ahead is to rebuild the discipline of deciding what to observe — not as a framework to be purchased, but as a habit of thinking that makes the cost of observation visible at the point of decision. The same habit, once built, will be needed in other domains of security architecture too.

Observation is not free. It never was. We could pretend it was when the costs were small; we cannot pretend any longer. What changes first is the question the industry asks itself. What changes after that is everyone’s to work out.

References

Independent research and peer-reviewed sources

Alahmadi, B. A., Axon, L., & Martinovic, I. (2022). “99% False Positives: A Qualitative Study of SOC Analysts’ Perspectives on Security Alarms.” USENIX Security Symposium 2022.
Devarangadi Sunil, B., Sinha, I., Maheshwari, P., Todmal, S., Mallik, S., & Mishra, S. (2026). “Memory Poisoning Attack and Defense on Memory Based LLM-Agents.” arXiv:2601.05504 (UMass CS690F course-project preprint; not peer-reviewed).
Dong, S., Xu, S., He, P., Li, Y., Tang, J., Liu, T., Liu, H., & Xiang, Z. (2025). “Memory Injection Attacks on LLM Agents via Query-Only Interaction” (originally submitted as “A Practical Memory Injection Attack against LLM Agents”). NeurIPS 2025.
Greshake, K., Abdelnabi, S., Mishra, S., Endres, C., Holz, T., & Fritz, M. (2023). “Not What You’ve Signed Up For: Compromising Real-World LLM-Integrated Applications with Indirect Prompt Injection.” AISec ‘23.
IBM / Ponemon Institute. (2024). Cost of a Data Breach Report 2024. 604 organisations across 16 countries and 17 industries.
IBM / Ponemon Institute. (2025). Cost of a Data Breach Report 2025. 600 organisations across 16 countries and 17 industries.
Jiang, X., Yang, S., Yang, W., Liu, Y., & Ji, C. (2026). “Agentic AI as a Cybersecurity Attack Surface: Threats, Exploits, and Defenses in Runtime Supply Chains.” arXiv:2602.19555.
Jevons, W. S. (1865). The Coal Question: An Inquiry Concerning the Progress of the Nation, and the Probable Exhaustion of Our Coal-Mines. London: Macmillan.
Kerckhoffs, A. (1883). “La Cryptographie Militaire.” Journal des sciences militaires, Vol. IX, pp. 5–38 (January); pp. 161–191 (February).
Mandiant. (2024). M-Trends 2024: Our View from the Frontlines.
Mandiant. (2025). M-Trends 2025: Data, Insights, and Recommendations From the Frontlines.
Mandiant. (2026). M-Trends 2026: Data, Insights, and Strategies From the Frontlines.
Saltzer, J. H., & Schroeder, M. D. (1975). “The Protection of Information in Computer Systems.” Proceedings of the IEEE, Vol. 63, No. 9.
Shannon, C. E. (1948). “A Mathematical Theory of Communication.” Bell System Technical Journal, Vol. 27, pp. 379–423, 623–656.
Tariq, S., Chhetri, M. B., Nepal, S., & Paris, C. (2025). “Alert Fatigue in Security Operations Centres: Research Challenges and Opportunities.” ACM Computing Surveys, 57(9), Article 224. DOI: 10.1145/3723158.
Verizon. (2024). 2024 Data Breach Investigations Report. 30,000+ incidents, 10,000+ breaches, 94 countries.
Verizon. (2025). 2025 Data Breach Investigations Report. 22,000+ incidents, 12,000+ breaches.
Zou, W., Dong, M., Romero Calvo, M., Chang, S., Guo, J., Lee, D., Niu, X., Ma, X., Qi, Y., & Jiang, J. (2026). “Poison Once, Exploit Forever: Environment-Injected Memory Poisoning Attacks on Web Agents.” arXiv:2604.02623.

Standards, frameworks, and industry guidance

MITRE ATT&CK.
NIST Cybersecurity Framework 2.0.
Observer effect (physics). Wikipedia.
OWASP Top 10 for Agentic Applications 2026. OWASP GenAI Security Project.
W3C PROV Data Model.

Intellectual predecessor (cited with attribution)

Shortridge, K. (2024). “The Basics of Software Resilience and Security Chaos Engineering.” Sensemaking by Shortridge. Licensed under CC BY-NC-SA 4.0.

Practitioner and industry sources (used for corroborative data)

Darrington, J. (Graylog). (2026). “Why Your SIEM Ingest Costs Are Too High.” The Visibility Layer.
Forshtec. (2026). “How To Reduce SIEM Costs With Ingestion Controls.”
Microsoft Security Blog. (2026). “Observability for AI Systems: Strengthening visibility for proactive risk detection.”
Willison, S. (2023). “Prompt injection explained, with video, slides, and a transcript.” simonwillison.net.