Orio 2.0
Research Preview
Teaching machines the physics of dreams.
A physics-prior world model for end-to-end 3D scene synthesis — learning gravity, collision, and spatial coherence from non-semantic data.
Retrieval-based generation fails at physical coherence.
Gravity Violation
Objects float mid-air with no support
Interpenetration
Meshes clip through floors and walls
Orientation Error
Facing vectors misaligned to scene
Material Discontinuity
Incoherent textures on co-located assets
Illumination Mismatch
Lighting ignores scene topology
These systems learn what scenes look like from labelled datasets. They never learn the physical laws governing why objects exist where they do.

From Chaos to Order
Orio 2.0 abandons labelled scene supervision entirely. The model trains on millions of stochastic, non-semantic spatial configurations — randomised object compositions with zero categorical labels.
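The training distribution can be pictured with a minimal sketch. Everything here is illustrative, not Orio's actual data format: `sample_configuration` is a hypothetical name, and each sample is just poses and extents with no categorical labels attached.

```python
import random

def sample_configuration(n_objects=8, extent=10.0, seed=None):
    """Draw one stochastic spatial configuration: each object is only a
    pose and a bounding extent -- no class labels, no semantics."""
    rng = random.Random(seed)
    objects = []
    for _ in range(n_objects):
        objects.append({
            "position": [rng.uniform(-extent, extent) for _ in range(3)],
            "rotation": [rng.uniform(0.0, 360.0) for _ in range(3)],  # Euler angles, degrees
            "half_size": [rng.uniform(0.1, 2.0) for _ in range(3)],
        })
    return objects

# A training corpus is millions of such samples; the model must discover
# on its own which configurations are physically plausible.
corpus = [sample_configuration(seed=i) for i in range(4)]
```

The point of the sketch: nothing in a sample says "chair" or "table", so any physical regularity the model learns must come from the spatial statistics alone.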
From this distribution, the network disentangles physical invariants from visual noise: gravitational grounding, rigid-body non-penetration, support-surface inference, and contact-normal estimation.
These emerge as learned latent constraints — not hand-coded rules. The model generalises across arbitrary scene types without any domain-specific fine-tuning.

Dependency Graphs & View-Centric Inference
Scene layout is modelled as a directed acyclic graph (DAG) where nodes represent entities and edges encode support, containment, adjacency, and orientation coupling — resolved before geometry synthesis.
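A minimal sketch of such a layout DAG, with assumed class and edge names (not Orio's internal representation): edges carry a relation kind, and a topological sort gives the order in which constraints are resolved, so every supporting entity is settled before anything that depends on it.

```python
from collections import defaultdict

EDGE_KINDS = {"support", "containment", "adjacency", "orientation"}

class SceneDAG:
    def __init__(self):
        self.edges = defaultdict(list)   # parent -> [(child, kind)]
        self.indegree = defaultdict(int)
        self.nodes = set()

    def add_edge(self, parent, child, kind):
        assert kind in EDGE_KINDS
        self.edges[parent].append((child, kind))
        self.indegree[child] += 1
        self.nodes.update({parent, child})

    def resolution_order(self):
        """Kahn's algorithm: the order in which node constraints are
        resolved before any geometry is synthesised."""
        indeg = dict(self.indegree)
        ready = [n for n in self.nodes if indeg.get(n, 0) == 0]
        order = []
        while ready:
            node = ready.pop()
            order.append(node)
            for child, _ in self.edges[node]:
                indeg[child] -= 1
                if indeg[child] == 0:
                    ready.append(child)
        if len(order) != len(self.nodes):
            raise ValueError("cycle detected: layout graph must be acyclic")
        return order

g = SceneDAG()
g.add_edge("floor", "table", "support")
g.add_edge("table", "mug", "support")
g.add_edge("table", "chair", "adjacency")
```

Acyclicity is what makes "resolved before geometry synthesis" well-defined: a cycle would mean two objects each waiting on the other's placement.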
Spatial inference operates in view-centric coordinates. Placement is computed relative to the observer's frustum — "gathered around a focal point" is a geometric constraint on facing vectors and radial distance.
This eliminates the pathological failure case where objects are globally "correct" but visually incoherent from the camera's perspective.
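The "gathered around a focal point" constraint can be sketched concretely. The function names and the cone-based visibility test are simplifying assumptions for illustration, not the model's actual frustum handling:

```python
import math

def gather_around(focal, n, radius):
    """Place n objects on a circle of the given radius around the focal
    point, each with a unit facing vector pointing back at the focus."""
    placements = []
    for i in range(n):
        theta = 2.0 * math.pi * i / n
        pos = (focal[0] + radius * math.cos(theta),
               focal[1],
               focal[2] + radius * math.sin(theta))
        d = (focal[0] - pos[0], focal[1] - pos[1], focal[2] - pos[2])
        norm = math.sqrt(sum(c * c for c in d)) or 1.0
        placements.append({"position": pos, "facing": tuple(c / norm for c in d)})
    return placements

def in_frustum(pos, cam_pos, cam_forward, half_angle_deg):
    """Crude visibility check: is the object inside the camera's view cone?"""
    v = tuple(p - c for p, c in zip(pos, cam_pos))
    norm = math.sqrt(sum(c * c for c in v)) or 1.0
    cos_angle = sum(a * b for a, b in zip(v, cam_forward)) / norm
    return cos_angle >= math.cos(math.radians(half_angle_deg))

chairs = gather_around(focal=(0.0, 0.0, 5.0), n=4, radius=2.0)
visible = [c for c in chairs if in_frustum(c["position"], (0, 0, 0), (0, 0, 1), 45)]
```

Because radial distance and facing are constrained relative to the focal point, and visibility is evaluated from the observer, the arrangement cannot be "correct" in world coordinates yet incoherent on screen.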
Key Technical Contributions
Physics-Prior Learning
Gravity, non-penetration, and support emerge as learned latent invariants from non-semantic training data — not as post-hoc penalty terms or hand-authored rules.
Co-Generative Synthesis
Geometry, PBR materials (albedo, roughness, metallic), and lighting are jointly synthesised per-object in a single forward pass. Material coherence by construction.
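The shape of a co-generated record can be sketched as follows. The classes are hypothetical and the "network" is a deterministic stand-in; only the field layout matters here: geometry, material, and lighting leave the same pass as one unit, so they cannot drift apart across stages.

```python
from dataclasses import dataclass

@dataclass
class PBRMaterial:
    albedo: tuple        # linear RGB in [0, 1]
    roughness: float     # 0 = mirror, 1 = fully diffuse
    metallic: float      # 0 = dielectric, 1 = metal

@dataclass
class CoGeneratedObject:
    vertices: list               # geometry
    material: PBRMaterial        # synthesised jointly with the geometry
    local_irradiance: float      # lighting term from the same pass

def synthesize(latent):
    """Stand-in for the single joint forward pass: one latent in, one
    coherent (geometry, material, lighting) record out."""
    r = (latent * 37) % 100 / 100.0  # deterministic stand-in for a network
    return CoGeneratedObject(
        vertices=[(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)],
        material=PBRMaterial(albedo=(r, r, r), roughness=r, metallic=0.0),
        local_irradiance=1.0 - r,
    )
```

The albedo/roughness/metallic fields follow the standard PBR metallic-roughness convention named in the text.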
Topology-Conditioned Placement
Object instantiation is conditioned on terrain height-field gradients, surface normals, and local curvature. Ground contact is guaranteed, not corrected.
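A minimal sketch of terrain-conditioned placement, using a toy analytic height field in place of learned terrain (all names are illustrative): the surface normal comes from finite-difference gradients of the height field, and the object origin is set to the terrain height, so contact holds by construction.

```python
import math

def height(x, z):
    """Toy analytic terrain height field standing in for learned terrain."""
    return 0.5 * math.sin(x) + 0.3 * math.cos(z)

def surface_normal(x, z, eps=1e-4):
    # Gradient of the height field via central differences.
    dhdx = (height(x + eps, z) - height(x - eps, z)) / (2 * eps)
    dhdz = (height(x, z + eps) - height(x, z - eps)) / (2 * eps)
    n = (-dhdx, 1.0, -dhdz)            # normal of the surface y = h(x, z)
    norm = math.sqrt(sum(c * c for c in n))
    return tuple(c / norm for c in n)

def place_on_terrain(x, z):
    """Object origin sits exactly at terrain height, aligned to the local
    normal: ground contact is guaranteed, never corrected after the fact."""
    return {"position": (x, height(x, z), z), "up": surface_normal(x, z)}

p = place_on_terrain(1.0, 2.0)
```

Local curvature (second derivatives of the same field) would extend this to reject placements on ridges or in crevices too sharp for the object's footprint.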
Prompt → Physics → World
Scene Graph Extraction
Prompt parsed into a structured scene graph encoding entity types, spatial predicates, and physical affordances.
Constraint Resolution
DAG solver computes gravity vectors, support surfaces, collision bounds, and orientation coupling per node.
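A sketch of this solver step under simplifying assumptions (uniform gravity, axis-aligned collision bounds, support only along the vertical axis; `resolve` and its inputs are hypothetical names): walking nodes in dependency order, each object's resting height is the top surface of its support, so support and non-penetration along gravity hold before any geometry exists.

```python
GRAVITY = (0.0, -9.81, 0.0)  # per-node gravity vector (uniform here)

def resolve(nodes, supports):
    """nodes: {name: {"half_height": float}}, listed parents-first
    (i.e. already in the DAG's topological order);
    supports: {child: parent} support edges from the scene graph."""
    solved = {}
    for name in nodes:
        parent = supports.get(name)
        base = 0.0 if parent is None else solved[parent]["top"]
        h = nodes[name]["half_height"]
        solved[name] = {
            "center_y": base + h,           # rests on the support surface
            "top": base + 2 * h,            # support surface for children
            "aabb_y": (base, base + 2 * h), # collision bound along gravity
        }
    return solved

layout = resolve(
    {"floor": {"half_height": 0.05}, "table": {"half_height": 0.4},
     "mug": {"half_height": 0.05}},
    {"table": "floor", "mug": "table"},
)
```

Stacked intervals share endpoints rather than overlap, which is the one-dimensional version of the non-penetration guarantee.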
Co-Generative Synthesis
Per-node geometry, PBR BRDF parameters, and local illumination in a single joint forward pass.
Real-Time Instantiation
Synthesised world deployed to a real-time renderer with live physics and interactive object state.
“True AI resilience means having the patience to teach machines the physics of dreams.”
— Orio Research Team