Why Physical AI deployments stall at the second integration — and what architectural decisions determine whether they scale
A practitioner's guide for architects, founders, and investors navigating the transition from proof of technology to scalable platform.
The binding constraint in Physical AI is not technical capability. It is governance architecture. The organisations most at risk are not the ones lacking technology — they are the ones scaling on the wrong foundation before it has been designed to carry the load.
Some technologies grow in parallel. Others begin to compound — their value becomes systemic, each layer increasing the dependency and the potential of the others. CompoundWorks works at three points where this is happening now.
Physical AI is not a sudden category jump. It's the latest stage of a longer journey: first, learning to model reality well enough to simulate it; then learning to instrument and structure it well enough to make it observable, computable and shareable across systems and users; and now, learning to make it operable for AI, robots and autonomous systems. That journey is why this paper focuses on Physical AI specifically — it's where compound innovation is currently most consequential, and where the architecture problem described here is most exposed. But the constraint underneath it isn't unique to Physical AI.
"Coordination and trust are the binding constraint when compound systems scale — at the scale of the company, and at the scale of the multi-agent system."
Inside a venture, that constraint shows up as technology outrunning organisation and trust — the imbalance the Scaling System Maturity Framework is built to diagnose. One scale up, across a system of systems, the same constraint shows up as capability outrunning the shared truth and governance required to coordinate it. This paper is that second view: the architecture of compound systems that have to coordinate in the physical world, not just inside one company.
The Compound Innovation Gap is not visible in a pilot. It becomes structural at the second integration — when the architecture built for demonstration meets the complexity of real multi-vendor, multi-site, multi-institution deployment.
Systems designed to prove capability are rarely designed to absorb change. Early architectural coupling becomes the ceiling for scale — invisible until the second product, the second site, or the second integration exposes it. The redesign that was manageable before commercial validation becomes a strategic recovery programme afterwards.
As technologies converge, no single team owns the full system. Who controls shared interfaces? Who arbitrates trade-offs across coupled subsystems? Who holds authority when autonomous systems from different vendors reach different conclusions about the same physical state? Governance becomes the defining constraint at the second organisation.
The founding team that built the technology is rarely the configuration that can operate it at scale. The governance structures, authority structures, and trust mechanisms required to run a Physical AI platform at scale are different in kind from those required to build it. They do not emerge organically. They have to be designed before the next scaling pressure makes their absence visible.
Most public conversations jump from sensors and robots to AI reasoning — as if intelligence alone bridges the gap. It does not. Between raw physical data and autonomous action sits a layer that is consistently unnamed in commercial discussions and chronically underinvested in practice.
"Most failures happen not inside the model but at the interfaces: state misalignment, compute bottlenecks, ambiguous human-machine authority, assurance gaps. In Physical AI, maturity begins at the architecture level."
The operational twin is Layer 2 — not a digital twin in the marketing sense (a 3D visualisation that updates periodically). It is a computational replica of physical reality: geometry that is structured, versioned and queryable; physics that is predictive rather than decorative; temporal consistency that allows systems to reason about what is happening now and what is likely next. This is the layer where 3D stops being visualisation and becomes infrastructure.
Military command and control has spent decades chasing exactly this with the Common Operational Picture: a shared, authoritative, real-time representation of the battlespace every actor is meant to coordinate through. Even there, with enormous institutional investment, a genuinely shared picture remains more aspiration than achieved fact — laggy, partial, and contested across echelons and coalitions. The operational twin is the industrial heir to that same ambition, not a borrowed solution — which is exactly why Physical AI ventures should expect this to be hard, not a bolt-on.
The clearest way to understand it is through the fighter pilot. A fighter pilot is the original real-time operational twin: perceiving, computing meaning, and acting in one embodied loop, with the governance layer — authority, priority, rules of engagement — already internalised, so there is no arbitration problem. The pilot is the governance layer.
Physical AI must recreate that loop across systems that do not share sensors, models, or assumptions. That is precisely what turns coordination into a governance problem: who acts and when, what happens when systems disagree, and how authority resolves without a single point of control.
A mobile robot reads a corridor as clear; a safety system flags it as restricted; a digital twin shows a technician at a previous location; a wearable shows that technician walking into it; one robot predicts the human will yield; the human assumes the robot will stop. Several systems, one corridor, several incompatible versions of reality. The question is not which model is most intelligent — it is which version governs action.
World models belong to the cognitive layer. They help each agent understand and predict its own local environment. They are necessary — but not sufficient. A better world model inside each agent does not produce coordination between agents.
The operational twin federates local world models, arbitrates their conflicts, and reconciles them into shared spatial truth the whole environment can act on. It is a shared cockpit — not to centralise every decision, but to make distributed autonomy safe, explainable, and governable.
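One way to make "which version governs action" concrete is to write the arbitration rule down explicitly. The sketch below is illustrative only: the `Authority` levels, the staleness window, the shared clock and the tie-breaking order are all assumptions made for this example, not a policy drawn from any standard or product.

```python
from dataclasses import dataclass
from enum import IntEnum


class Authority(IntEnum):
    """Hypothetical precedence levels; a higher value overrides a lower one."""
    ADVISORY = 1      # e.g. a robot's own prediction
    OPERATIONAL = 2   # e.g. a periodically updated digital twin
    SAFETY = 3        # e.g. a safety system or a live wearable position


@dataclass(frozen=True)
class Claim:
    source: str
    zone: str
    passable: bool        # does this source consider the zone safe to enter?
    authority: Authority
    observed_at: float    # seconds, on a clock all sources are assumed to share


def arbitrate(claims, now, max_age_s=2.0):
    """Return the claim that governs action, or None if every claim is stale.

    Assumed policy: drop stale claims, then prefer higher authority, then the
    more restrictive claim, then the most recent observation.
    """
    fresh = [c for c in claims if now - c.observed_at <= max_age_s]
    if not fresh:
        return None  # no trustworthy picture: callers should fail safe
    return max(fresh, key=lambda c: (c.authority, not c.passable, c.observed_at))


claims = [
    Claim("mobile_robot", "corridor_3", True, Authority.ADVISORY, 99.8),
    Claim("safety_system", "corridor_3", False, Authority.SAFETY, 99.5),
    Claim("digital_twin", "corridor_3", True, Authority.OPERATIONAL, 90.0),
    Claim("wearable", "corridor_3", False, Authority.SAFETY, 99.9),
]

governing = arbitrate(claims, now=100.0)
print(governing.source, governing.passable)  # wearable False
```

The point is not this particular ordering. It is that the ordering exists as an inspectable rule that every participant can read, rather than as an assumption buried inside each vendor's implementation.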
The most visible work in Physical AI today is racing to give individual agents better internal models. That work is real and necessary. But it is agent-centric. It does not say what happens when many such agents act in the same space at the same time. The shared operational layer above them is still being built — and may be the harder problem to solve.
These are not isolated events. They are a recurring signal set drawn from direct operational experience across multiple ventures and deployment domains. Founders and investors who can recognise them early have a meaningful advantage over those who encounter them for the first time under commercial pressure.
Early architectural decisions made for speed accumulate coupling — implicit contracts between components that make the system work now and make it expensive to change later. When the system is extended to a second customer, a second site, or a second vendor, those implicit contracts become explicit constraints.
One signal is integration cost that does not fall with repetition, and often grows non-linearly. If each new deployment requires bespoke integration effort roughly equivalent to the first, the architecture is absorbing change through friction rather than through design. This is not a resourcing problem; it is an architectural one. It does not respond to more engineers or faster iteration. It responds to re-architecture.
Analysis of traditional robotics deployments shows roughly 75% of total cost of ownership tied to initial setup and reengineering. Software-defined architectures designed for reconfiguration from the start can reduce those costs by up to 50%. The difference is the cost of getting the architecture wrong in the first place.
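The arithmetic behind those two figures is worth spelling out. The sketch below assumes that "those costs" refers to the setup and reengineering share only, and that "up to 50%" is a best case; the function name and parameters are illustrative.

```python
def tco_after_reduction(setup_share, setup_reduction):
    """Remaining total cost of ownership as a fraction of the original total.

    setup_share:     fraction of TCO tied to setup and reengineering (0..1)
    setup_reduction: fraction by which that share is reduced (0..1)
    """
    if not (0.0 <= setup_share <= 1.0 and 0.0 <= setup_reduction <= 1.0):
        raise ValueError("both arguments must be between 0 and 1")
    return 1.0 - setup_share * setup_reduction


remaining = tco_after_reduction(setup_share=0.75, setup_reduction=0.50)
print(f"remaining TCO: {remaining:.3f}")       # remaining TCO: 0.625
print(f"overall saving: {1 - remaining:.3f}")  # overall saving: 0.375
```

Under that reading, halving setup and reengineering cost removes at most 37.5% of total cost of ownership, because the remaining 25% of the cost is untouched.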
There is a precise distinction between two types of interoperability that are frequently conflated. Syntactic interoperability, the ability to exchange data, is largely solved (REST APIs, DDS, MQTT, OPC UA). Semantic interoperability, the ability to exchange meaning, is not. When System A reports a sensor reading and System B reads it, do they agree on units, coordinate reference frame, temporal context and ontological category?
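A minimal sketch of what that question looks like in practice: two readings that parse without error on both sides, yet do not yet mean the same thing. The `Reading` fields, the unit table, the frame names and the skew threshold are hypothetical, chosen only to mirror the four dimensions named above.

```python
from dataclasses import dataclass

# Hypothetical conversion table to a canonical unit (metres).
TO_METRES = {"m": 1.0, "cm": 0.01, "mm": 0.001, "ft": 0.3048}


@dataclass(frozen=True)
class Reading:
    value: float
    unit: str           # e.g. "mm"
    frame: str          # coordinate reference frame, e.g. "site_A/map"
    timestamp_s: float  # seconds on a clock both systems are assumed to share
    category: str       # ontological category, e.g. "clearance_distance"


def to_metres(reading):
    """Convert a reading's value to metres using the assumed unit table."""
    return reading.value * TO_METRES[reading.unit]


def semantic_mismatches(a, b, max_skew_s=0.1):
    """List the ways two readings disagree on meaning, ignoring their values."""
    problems = []
    for r in (a, b):
        if r.unit not in TO_METRES:
            problems.append(f"unknown unit: {r.unit}")
    if a.frame != b.frame:
        problems.append(f"frame mismatch: {a.frame} vs {b.frame}")
    if abs(a.timestamp_s - b.timestamp_s) > max_skew_s:
        problems.append("timestamps too far apart to compare")
    if a.category != b.category:
        problems.append(f"category mismatch: {a.category} vs {b.category}")
    return problems


a = Reading(1500.0, "mm", "site_A/map", 10.00, "clearance_distance")
b = Reading(1.5, "m", "robot_7/base_link", 10.02, "clearance_distance")

print(to_metres(a), to_metres(b))
print(semantic_mismatches(a, b))
```

Here the unit difference is resolvable by conversion, but the two values still cannot be compared until one is transformed into the other's coordinate frame. The exchange succeeded syntactically; the agreement on meaning has not happened yet.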
The real cost lies in making systems agree on what they are observing. This is not the cost of connecting them technically but the cost of semantic reconciliation: mapping coordinate systems, aligning ontologies and negotiating temporal reference frames. This effort does not decrease with scale. It compounds.
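One way to read "it compounds" is as a counting argument. The sketch below assumes that, without a shared reference model, every pair of systems needs its own reconciliation mapping, whereas with a shared model each system needs only one mapping to it. Both assumptions are simplifications made for illustration.

```python
def pairwise_mappings(n_systems):
    """Mappings needed when every pair of systems is reconciled directly."""
    return n_systems * (n_systems - 1) // 2


def shared_model_mappings(n_systems):
    """Mappings needed when each system maps once to a shared reference model."""
    return n_systems


for n in (2, 5, 10, 20):
    print(f"{n:>2} systems: {pairwise_mappings(n):>3} pairwise, "
          f"{shared_model_mappings(n):>2} via shared model")
```

At two systems the approaches cost about the same, which is why the difference is invisible in a pilot. At twenty systems the pairwise count is 190 against 20.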
A 2024 review of spatial interoperability research concluded that semantic interoperability remains unsolved at scale. The IEEE ratified its first Spatial Web standard (P2874) in 2025 — more than a decade after the problem was clearly identified in multi-stakeholder simulation and IoT environments. The gap between the semantic standard and the operational deployment reality has been persistent across every technology generation.
When no open standard exists to carry meaning reliably between systems, the gap is filled by whoever moves first with a working implementation. This is how platform dependency becomes structural. The missing coordination layer becomes a product, and the product becomes a dependency.
I've spent much of my career inside this exact problem, not reading the history but contributing to it: coordinating heterogeneous systems in live military environments, not only simulators, through standards efforts including DIS, HLA and CBML, each built to close one layer of the interoperability problem while leaving the next layer open. Twenty-five years and several standards generations later, real-time coordination across heterogeneous systems is still genuinely hard, even for the most advanced military forces with the deepest investment in solving it. That's the clearest evidence available that Physical AI's version of this problem won't be closed by an API either.
A platform is not genuinely interoperable because it exposes an API. It is interoperable when independent systems can coordinate in real time through defined, stable, and governed execution-layer contracts. OpenUSD advances the descriptive layer of the Physical AI stack. What remains undefined is the runtime coordination protocol: how autonomous systems exchange spatial state, negotiate authority, and resolve conflicts across vendors and physical environments in real time.
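To illustrate the difference between exposing an API and defining an execution-layer contract, the sketch below shows a versioned message, an explicit wire encoding and a rule for who may occupy a zone. Everything in it is hypothetical: the `ZoneRequest` fields, the JSON encoding, the version string and the first-come-first-served rule are assumptions for this example, not part of OpenUSD or any existing protocol.

```python
import json
from dataclasses import dataclass, asdict

SCHEMA_VERSION = "0.1"  # hypothetical version identifier


@dataclass(frozen=True)
class ZoneRequest:
    sender: str
    zone: str
    intent: str     # "enter" or "release"
    sent_at: float  # seconds on an assumed shared clock
    schema_version: str = SCHEMA_VERSION


def encode(msg):
    """Serialise a request to bytes; the byte layout is part of the contract."""
    return json.dumps(asdict(msg), sort_keys=True).encode("utf-8")


def decode(payload):
    """Parse bytes into a request, rejecting versions this side does not know."""
    data = json.loads(payload.decode("utf-8"))
    if data.get("schema_version") != SCHEMA_VERSION:
        raise ValueError(f"unsupported schema version: {data.get('schema_version')}")
    return ZoneRequest(**data)


class ZoneArbiter:
    """Grants each zone to at most one holder at a time (first come, first served)."""

    def __init__(self):
        self._holders = {}

    def handle(self, msg):
        holder = self._holders.get(msg.zone)
        if msg.intent == "enter":
            if holder in (None, msg.sender):
                self._holders[msg.zone] = msg.sender
                return "granted"
            return "denied"
        if msg.intent == "release":
            if holder == msg.sender:
                del self._holders[msg.zone]
                return "released"
            return "ignored"
        raise ValueError(f"unknown intent: {msg.intent}")


arbiter = ZoneArbiter()
for sender, intent in [("robot_a", "enter"), ("robot_b", "enter"),
                       ("robot_a", "release"), ("robot_b", "enter")]:
    wire = encode(ZoneRequest(sender, "corridor_3", intent, sent_at=0.0))
    print(sender, intent, arbiter.handle(decode(wire)))
```

A single in-process arbiter is a deliberate simplification. In a multi-vendor deployment the harder questions are who operates that role, how it is replicated, and what happens when it is unreachable, which is where the governance problem begins.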
At deployment scale, this becomes lock-in. At ecosystem scale, fragmentation. For governments, defence programmes, and critical infrastructure operators, it becomes a sovereignty issue: dependence on private or foreign platforms for the operational truth of a factory, port, city, or defence environment is not simply a procurement risk. It is a strategic vulnerability.
XR and digital twin initiatives consistently stall not because the interface is wrong, but because the spatial model beneath the interface is fragmented, inconsistent, or not updated in real time. The headset is the visible failure point. The invisible cause is the absence of a shared spatial abstraction layer that is physics-consistent, synchronised across systems, and trustworthy enough for autonomous systems to act on.
The limiting factor in XR deployments has rarely been the quality of the headset or the resolution of the visualisation. It has been the integrity and synchronisation of the underlying spatial model. Trust in the interface is downstream of trust in the spatial layer — and that is ultimately a governance question, not a rendering one.
When the spatial layer is coherent — geometry structured and versioned, physics predictive rather than decorative, temporal consistency maintained across systems — XR scales. When it is fragmented, every deployment becomes a custom integration project, and the fragmentation propagates upward into every decision made above it.
Compound deep-tech systems generate coordination load that grows non-linearly with scale. A founding team that operates with speed and coherence at ten people finds the same operating model becomes the bottleneck at fifty, across multiple customer environments and vendor relationships. Authority paths that worked when the founder was in every decision become invisible ceilings.
Heroic delivery: every major commitment met through exceptional individual effort rather than reliable system execution. The question is not whether the company can deliver once — it probably can. It is whether delivery depends on the system or on a small number of irreplaceable people. At scale, only the former is sustainable.
This is not a people problem. The founding teams in this space are almost always technically brilliant and deeply committed. It is a system design problem: the organisation was built to prove the technology, not to operate the platform. The governance structures required to run a Physical AI platform at scale are different in kind from those required to build it. They do not emerge organically from growth.
Every founder building in Physical AI faces a version of the same strategic choice — even if they have not framed it explicitly. The choice is not between good technology and bad technology. It is between two architectural postures.
Control every layer of the stack to maximise coherence and performance within a managed ecosystem. This path has produced genuinely important infrastructure — NVIDIA Omniverse, OpenUSD, Siemens domain depth. These are real contributions that have accelerated the technology layer substantially.
The structural dynamic is independent of the quality of the technology it produces. Platform vendors are commercially incentivised to maximise adoption of their own infrastructure — which is compatible with excellent technology, but less compatible with the neutral execution layer that would allow any vendor's systems to coordinate without depending on a single provider's runtime.
I spent over a decade contributing to the standards built to solve exactly this — DIS, HLA, CBML and the broader effort to make heterogeneous military systems, not only simulators, interoperate in real time. Each closed one layer and left the next one open; none of them, even combined, fully solved real-time coordination across coalition forces and legacy systems. That's not a historical footnote — militaries with the deepest investment in this problem still find it hard today. Physical AI is approaching the same structural question with far less institutional investment behind it.
Build on shared standards and ecosystem partnerships where each participant amplifies the others' capabilities. This path scales faster in multi-stakeholder environments because it mirrors how intelligence grows — through connection, not isolation.
It requires something pure technical openness does not provide: a governance layer that is neutral, interoperable, and operationally defined. Open scene graph descriptors are necessary but not sufficient. What the open convergence path requires — and what is still absent at the Physical AI layer — is a defined wire protocol for the execution layer: standardised, open, not controlled by any single vendor.
The task is not to choose between these paths. It is to learn to build open systems with the coherence of integrated design, and to develop the governance architecture that makes that combination operable at scale. Founders who engage in pre-competitive standards processes as participants, not observers, have the opportunity to shape the governance architecture of the next decade rather than inherit the one that emerges without them.
"The future will not belong to the best single stack. It will belong to the best-orchestrated ecosystem. And ecosystems require governance architecture, not just open APIs."
Strategic decisions on platform partners, data strategy, and in-house governance capabilities must be made within the next twelve to twenty-four months. After that window, the first generation of Physical AI platform leaders will be established and the cost of redesigning an architecture after commercial validation will rise sharply.
The most expensive mistake in Physical AI is not technical failure. It is commercial success on top of the wrong architecture. A system that works in a pilot can survive for surprisingly long while accumulating the coupling and governance gaps that will later make scale expensive. The danger is often highest when momentum feels strongest.
Governance requires explicit definitions of authority over shared operational state, update logic, and conflict resolution, not assumptions embedded in implementation choices that become impossible to surface under commercial pressure.
A deployment that works under single-vendor conditions says very little about whether the system can coordinate across organisational and technical boundaries. The architectural test is the second integration, not the first.
If the infrastructure on which the system depends does not expose a defined and durable wire protocol, part of the core coordination logic of the platform remains outside architectural control. Open APIs are necessary but not sufficient — the same lesson that has made real-time military coordination hard for decades applies directly.
Governance structures have to be designed, not assumed to emerge from growth. The redesign that is manageable before the next funding round becomes progressively more disruptive once embedded in delivery commitments and customer-specific integrations.
The most common error in this domain is not moving too fast. It is growing around design choices that were never made explicit. Legibility is the prerequisite for the architectural conversation that must happen before scale locks in the wrong foundation.
The central investor error in Physical AI is to confuse proof of capability with proof of system readiness. A company may have impressive deployments and still be structurally unready for scale. The questions that reveal the difference are rarely about the model itself — they are about the system around it.
Was the platform designed for multi-vendor, multi-environment operation, or for a single managed context? The answer is visible in the integration approach, the governance documentation, and whether the company can describe its second-customer deployment architecture with the same confidence as its first.
Does the company rely on cloud compute, simulation orchestration, or spatial coordination infrastructure where the wire protocol is proprietary and undefined? If so, the switching cost at scale is not a licensing fee — it is an architectural redesign that will arrive at the worst moment, when commercial momentum is highest.
Does the company deliver reliably without founding-team involvement, or does every major commitment still require personal heroics? This is the organisational equivalent of the wire protocol question. Implicit coordination through individual knowledge is the organisational analogue of a proprietary execution layer — and it has the same switching cost at scale.
How does the architecture handle a second customer with a different sensor suite? What is the authority model for the shared spatial layer? How are conflicts resolved when two autonomous subsystems reach different conclusions about the same physical environment? Does the execution layer depend on a proprietary wire protocol? What happens to delivery quality when the founding team is not present?
If Physical AI pilots are technically successful but not converting into scaled deployments, the cause is almost certainly not the technology. The technology works in the pilot because the governance constraints of scale are absent: one vendor, one site, one data model, one update cycle. The fracture appears when the deployment is extended.
Without clear ownership of the shared model of physical reality, safety responsibility, compliance accountability, and operational decision rights cannot be assigned with confidence. This is the first conversion blocker — and it is almost always invisible in the pilot phase.
What looked affordable in the pilot becomes prohibitive once multi-vendor coordination is included in the actual cost of operation. The bespoke integration effort per deployment signals that the architecture is absorbing the interoperability gap through manual effort rather than through design.
Procurement committed to a semantic interface without interrogating whether the execution layer has an open, durable transport protocol. An open API is not the same as an open wire protocol. The history of distributed simulation makes the consequence clear: the semantic layer remains nominally open, the transport layer becomes a vendor moat, and the switching cost is ultimately borne by the operator.
Require explicit documentation from vendors covering the execution-layer interaction model and the wire protocol on which interoperability actually depends — before deployment decisions harden dependencies that become structural constraints. This question should be asked at specification, not at post-deployment review.
The Compound Innovation Gap is the architectural lens. The Scaling System Maturity Framework is the companion diagnostic: it assesses whether the company behind the technology is structurally ready to execute the architectural transition this paper describes — across Technology, Organisation and Trust simultaneously. The architectural problems in this paper do not remain confined to the technical stack. They eventually reappear as delivery failures, coordination breakdowns and trust erosion. The SSMF makes that transition visible before pressure forces it.
Posts and short essays as the practice develops — the same arguments worked out in public, before they're settled enough for this page.
Five systems read the same corridor differently — one sees a clear path, another flags it restricted, a digital twin shows a technician who's already moved on. A world model asks what might happen next. An operational twin asks what's happening now, and what's allowed for everyone.
The launch of Samsung Galaxy XR, co-developed with Google and Qualcomm, marks a turning point in spatial computing. Beyond the specs, it reveals two fundamentally different strategies for innovation in the age of convergence: Apple's deep integration against the open coalition's bet on convergence.
Is VR dead? Wrong question: it assumes VR was the product. The headset is not the infrastructure; the real-time spatial model is. Companies that position as content providers will compete in cycles. Companies that position as spatial infrastructure providers will shape the stack.
This paper is the entry point to the Architect door: R&D roadmap, product and platform architecture, where the category is converging, what to build for the convergence that's coming rather than the one that's already here.