Data-Oriented Design
Web Engine Dev adopts Data-Oriented Design (DOD) as its core architectural principle. This page explains what DOD is, why it matters for a web game engine, and how it manifests throughout the codebase.
ADR Reference
This design is formally documented in ADR-007: Data-Oriented Design.
Why Data-Oriented Design?
Traditional web game engines use object-oriented patterns: class hierarchies for game objects, scattered heap allocations, and per-object update loops. In JavaScript, this approach creates several problems:
| Problem | OOP Impact | DOD Solution |
|---|---|---|
| Cache misses | Objects scattered across heap | Components in contiguous typed arrays |
| Entity capacity | ~1,000-5,000 objects | 50,000+ entities |
| Serialization | JSON.stringify per object (5-20 ms per 1,000 entities) | Typed array bulk copy (0.1-0.5 ms per 1,000 entities) |
| Parallelism | Shared mutable objects | Systems with non-overlapping data access |
| GPU uploads | Per-object uniform writes | Batch typed array uploads |
DOD flips the design: instead of organizing code around objects and their behaviors, you organize it around the data and the transformations applied to it.
Structure of Arrays (SoA) vs Array of Structures (AoS)
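The key layout decision in DOD is whether component data is stored per entity (one object holding every field) or per field (one array holding that field for every entity).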
Array of Structures (AoS) -- Traditional OOP
// Each entity is an object with all its properties
const entities = [
  { x: 1.0, y: 0.0, vx: 0.5, vy: 0.0 },
  { x: 2.0, y: 1.0, vx: -1.0, vy: 0.5 },
  { x: 3.0, y: 0.0, vx: 0.0, vy: 9.8 },
];
const dt = 1 / 60; // frame time in seconds (example value)
// Updating positions loads entire objects into cache
for (const e of entities) {
  e.x += e.vx * dt;
  e.y += e.vy * dt;
}
Problem: A system that reads one or two fields still pulls the rest of each object (health, names, and every other field) into the CPU cache. With 500+ byte objects, most of the cache is wasted on data the current system does not need.
Structure of Arrays (SoA) -- Data-Oriented
// Each field is a separate contiguous array
const positionX = new Float32Array([1.0, 2.0, 3.0]);
const positionY = new Float32Array([0.0, 1.0, 0.0]);
const velocityX = new Float32Array([0.5, -1.0, 0.0]);
const velocityY = new Float32Array([0.0, 0.5, 9.8]);
const count = positionX.length; // number of live entities
const dt = 1 / 60; // frame time in seconds (example value)
// Updating positions accesses only the data it needs
for (let i = 0; i < count; i++) {
  positionX[i] += velocityX[i] * dt;
  positionY[i] += velocityY[i] * dt;
}
Benefit: Sequential access is easy for the CPU prefetcher to predict, and cache lines hold only the fields the loop reads. With typed arrays, the data is ~104 bytes per entity versus ~500+ bytes per object in OOP patterns.
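The loop above only stays cache-friendly if the arrays remain dense as entities come and go. The following is a minimal sketch of one common way to do that: preallocated columns plus swap-remove. The names `createParticleStore`, `spawn`, `despawn`, and `integrate` are hypothetical and are not part of the engine's API.

```javascript
// Hypothetical SoA store: one preallocated Float32Array per field.
function createParticleStore(capacity) {
  return {
    count: 0,
    capacity,
    positionX: new Float32Array(capacity),
    positionY: new Float32Array(capacity),
    velocityX: new Float32Array(capacity),
    velocityY: new Float32Array(capacity),
  };
}

// Append a row at the end of every column; returns the row index.
function spawn(store, x, y, vx, vy) {
  if (store.count === store.capacity) throw new RangeError("store is full");
  const i = store.count++;
  store.positionX[i] = x;
  store.positionY[i] = y;
  store.velocityX[i] = vx;
  store.velocityY[i] = vy;
  return i;
}

// Swap-remove: move the last row into the freed slot so rows stay contiguous.
function despawn(store, i) {
  if (i < 0 || i >= store.count) throw new RangeError("index out of range");
  const last = --store.count;
  if (i !== last) {
    store.positionX[i] = store.positionX[last];
    store.positionY[i] = store.positionY[last];
    store.velocityX[i] = store.velocityX[last];
    store.velocityY[i] = store.velocityY[last];
  }
}

// Iterate only the live rows [0, count).
function integrate(store, dt) {
  for (let i = 0; i < store.count; i++) {
    store.positionX[i] += store.velocityX[i] * dt;
    store.positionY[i] += store.velocityY[i] * dt;
  }
}

const store = createParticleStore(4);
spawn(store, 1, 0, 0.5, 0);
spawn(store, 2, 1, -1, 0.5);
spawn(store, 3, 0, 0, 8);
despawn(store, 0); // the last row (3, 0, 0, 8) moves into slot 0
integrate(store, 0.5);
```

Swap-remove changes the row index of the moved entity, so a real ECS also needs an entity-to-row mapping that is updated on every removal. That bookkeeping is omitted here.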
ECS Archetype Storage
Web Engine Dev uses archetype-based storage (ADR-001) as the concrete implementation of SoA principles. Entities with the same set of components share an archetype, and each archetype stores data in columnar (SoA) layout.
How Archetype Storage Works
- Component signature -- Each unique combination of component types defines an archetype. Entities with [Position, Velocity] are in a different archetype than entities with [Position, Velocity, Health].
- Columnar tables -- Each archetype owns a table where each component field is stored as a separate Column (a typed array). Entities in the same archetype are stored contiguously.
- Archetype transitions -- When a component is added or removed, the entity moves to the appropriate archetype. An edge cache makes repeated transitions O(1).
- Query matching -- Queries use bitmask matching to find all archetypes containing the required components. This is an O(1) operation per archetype.
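To make the query-matching step concrete, here is a small sketch of bitmask matching. It assumes each component type is assigned one bit and each archetype stores the OR of its component bits. The constants and `matchArchetypes` are hypothetical illustrations, not the engine's internals.

```javascript
// Assumption: each component type gets one bit (so at most 32 types in this sketch).
const POSITION_BIT = 1 << 0;
const VELOCITY_BIT = 1 << 1;
const HEALTH_BIT = 1 << 2;

// Hypothetical archetypes, each identified by the OR of its component bits.
const archetypes = [
  { name: "static", mask: POSITION_BIT },
  { name: "moving", mask: POSITION_BIT | VELOCITY_BIT },
  { name: "living", mask: POSITION_BIT | VELOCITY_BIT | HEALTH_BIT },
];

// An archetype matches when it contains every required bit.
// The test is a single AND plus a comparison, i.e. O(1) per archetype.
function matchArchetypes(allArchetypes, requiredMask) {
  return allArchetypes.filter((a) => (a.mask & requiredMask) === requiredMask);
}

const matches = matchArchetypes(archetypes, POSITION_BIT | VELOCITY_BIT).map(
  (a) => a.name
);
console.log(matches); // ["moving", "living"]
```

The overall cost of a query is therefore proportional to the number of archetypes, not the number of entities. An engine with more than 32 component types would need a wider mask, for example several 32-bit words.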
Benefits for Game Engines
| Capability | How SoA/Archetypes Enable It |
|---|---|
| Cache-friendly iteration | Components of matching entities are contiguous in memory |
| Fast queries | Bitmask matching tests each archetype in O(1), then iterates dense arrays |
| Parallel execution | Systems with non-overlapping component access can run concurrently |
| GPU batch uploads | Typed arrays can be uploaded to GPU buffers directly |
| Network delta encoding | Diff component arrays for minimal network traffic |
| Deterministic replay | Replay command stream against typed array state |
Systems as Data Transformations
In DOD, systems are plain functions that read component data and write the results back to the World. There are no class hierarchies, no virtual dispatch, and no hidden this context:
const MovementSystem = defineSystem({
  query: [Position, Velocity],
  run(world, entities) {
    const dt = world.getResource(Time).delta;
    for (const entity of entities) {
      const pos = world.get(entity, Position);
      const vel = world.get(entity, Velocity);
      world.insert(entity, Position, {
        x: pos.x + vel.x * dt,
        y: pos.y + vel.y * dt,
      });
    }
  },
});
This design makes systems:
- Testable -- Pure input/output with no hidden state
- Composable -- Systems combine freely without class coupling
- Parallelizable -- Systems with non-overlapping component access can run on separate threads
Single Source of Truth
All game state lives in the ECS World. There are no parallel state stores and no shadow copies, so there is nothing that can fall out of sync.
This means:
- Network serialization reads component arrays directly for delta encoding
- Save/load serializes the World state in a single pass
- Deterministic replay replays a command stream against the World
- Debugging inspects one data store, not scattered object graphs
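As an illustration of delta encoding over component arrays, the sketch below compares the last-sent copy of a column with its current contents and transmits only the changed elements. `encodeDelta` and `applyDelta` are hypothetical helper names, and the (index, value) wire format is an assumption chosen for brevity.

```javascript
// Encode: collect (index, value) pairs for every element that changed.
function encodeDelta(previous, current) {
  if (previous.length !== current.length) {
    throw new RangeError("length mismatch");
  }
  const indices = [];
  const values = [];
  for (let i = 0; i < current.length; i++) {
    if (previous[i] !== current[i]) {
      indices.push(i);
      values.push(current[i]);
    }
  }
  return {
    indices: Uint32Array.from(indices),
    values: Float32Array.from(values),
  };
}

// Decode: apply the changed values on top of the receiver's copy.
function applyDelta(target, delta) {
  for (let k = 0; k < delta.indices.length; k++) {
    target[delta.indices[k]] = delta.values[k];
  }
}

const lastSent = new Float32Array([1, 2, 3, 4]);
const now = new Float32Array([1, 2.5, 3, 7]);
const delta = encodeDelta(lastSent, now); // indices [1, 3], values [2.5, 7]

// The receiver starts from the same baseline and applies the delta.
const remote = Float32Array.from(lastSent);
applyDelta(remote, delta);
console.log(Array.from(remote)); // [1, 2.5, 3, 7]
```

Because `NaN !== NaN`, an element holding NaN is always reported as changed. A production encoder would likely compare raw bits or quantize values first.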
Performance Implications
Memory Efficiency
| Metric | OOP (AoS) | DOD (SoA) |
|---|---|---|
| Bytes per entity | ~500+ | ~104 |
| Cache utilization | ~20% (loads unused fields) | ~90%+ (only relevant data) |
| Max entity count (browser) | ~5,000 | 50,000+ |
Serialization Speed
| Operation | OOP (JSON.stringify) | DOD (typed array copy) |
|---|---|---|
| 1,000 entities | 5-20ms | 0.1-0.5ms |
| 10,000 entities | 50-200ms | 1-5ms |
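The typed-array column in the table corresponds to copying whole columns rather than walking objects. The following sketch shows what such a snapshot and restore could look like. `snapshot`, `restore`, and the `columns` object are hypothetical, and the block makes no timing claims of its own.

```javascript
// Hypothetical world state: a few named columns.
const columns = {
  positionX: new Float32Array([1, 2, 3]),
  positionY: new Float32Array([0, 1, 0]),
};

// Snapshot: one bulk copy per column.
// TypedArray.prototype.slice() returns a new array with its own buffer.
function snapshot(cols) {
  const out = {};
  for (const [name, col] of Object.entries(cols)) {
    out[name] = col.slice();
  }
  return out;
}

// Restore: one bulk copy per column back into the live arrays.
function restore(cols, snap) {
  for (const [name, col] of Object.entries(cols)) {
    col.set(snap[name]);
  }
}

const saved = snapshot(columns);
columns.positionX[0] = 99; // mutate live state after the snapshot
restore(columns, saved);
console.log(columns.positionX[0]); // 1
```

The cost of each copy depends on the column's byte length, not on how many JavaScript objects would otherwise have to be visited and stringified.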
GPU Integration
SoA layout enables batch GPU uploads. Instead of per-object uniform buffer writes, the renderer reads component arrays and uploads them as contiguous buffer data:
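// SoA transform data is already in the right format for GPU upload
device.queue.writeBuffer(
  instanceBuffer,
  0,                           // destination offset in bytes
  worldMatrixArray.buffer,     // backing ArrayBuffer of the Float32Array from the ECS archetype table
  worldMatrixArray.byteOffset, // source offset in bytes (nonzero if the column is a view into a larger buffer)
  entityCount * 64             // size in bytes: 64 bytes per mat4x4f
);
Tradeoffs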
DOD is not without costs:
| Tradeoff | Impact | Mitigation |
|---|---|---|
| Steeper learning curve | Developers familiar with OOP need to think differently | Comprehensive documentation, progressive disclosure API |
| Component add/remove overhead | Entity moves to a new archetype table | Edge caching makes repeated transitions O(1); transitions are batched when 3+ components change |
| Less intuitive debugging | Data in arrays, not named object properties | Debug bridge provides entity inspection via MCP tools |
| Third-party integration | Libraries expect objects, not typed arrays | Facade/adapter layers (ADR-004) |
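The edge-caching mitigation can be sketched as follows. The sketch assumes archetypes are keyed by a component bitmask and that each archetype remembers where an entity lands after a given component is added. `getArchetype`, `archetypeAfterAdd`, and the `addEdges` map are hypothetical names, not the engine's actual data structures.

```javascript
// Assumption: archetypes are keyed by a bitmask of their component types.
const archetypesByMask = new Map();

function getArchetype(mask) {
  let archetype = archetypesByMask.get(mask);
  if (!archetype) {
    // addEdges maps a component bit to the archetype reached by adding it.
    archetype = { mask, addEdges: new Map() };
    archetypesByMask.set(mask, archetype);
  }
  return archetype;
}

let slowLookups = 0; // counts edge-cache misses, for demonstration only

// Resolve the target archetype after adding one component.
// The result is cached on the source archetype as an edge.
function archetypeAfterAdd(source, componentBit) {
  let target = source.addEdges.get(componentBit);
  if (!target) {
    slowLookups++;
    target = getArchetype(source.mask | componentBit);
    source.addEdges.set(componentBit, target);
  }
  return target;
}

const HEALTH_BIT = 1 << 2;
const moving = getArchetype(0b011); // Position | Velocity in this sketch
const first = archetypeAfterAdd(moving, HEALTH_BIT); // miss: resolves and caches
const second = archetypeAfterAdd(moving, HEALTH_BIT); // hit: served from the edge
console.log(first === second, slowLookups); // true 1
```

The cache only removes the cost of finding the target archetype. Moving the entity's row still requires copying one value per shared column.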
DOD in Practice
The DOD philosophy extends beyond the ECS to influence the entire engine:
- Math library -- Provides BatchMath for SIMD-friendly bulk Float32Array operations alongside the object-oriented Vec3/Mat4 API
- Renderer -- Uses SoA layouts for GPU-driven indirect rendering, instanced draw batching, and frustum culling over contiguous position/bounds arrays
- Physics -- Rapier (Rust/WASM) internally uses SoA storage; the adapter reads ECS component arrays directly
- Netcode -- Delta encoding operates on typed array diffs rather than object comparison
- Scheduler -- Topological system ordering with conflict detection enables parallel execution of systems with non-overlapping data access
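The conflict-detection idea behind the scheduler can be illustrated with a short sketch. It assumes each system declares the components it reads and writes, and it groups systems greedily in declaration order rather than performing a full topological sort. `conflicts`, `buildBatches`, and the system descriptors are hypothetical and do not reflect the engine's scheduler API.

```javascript
// Two systems conflict if either one writes a component the other reads or writes.
function conflicts(a, b) {
  const touches = (writes, other) =>
    writes.some((c) => other.reads.includes(c) || other.writes.includes(c));
  return touches(a.writes, b) || touches(b.writes, a);
}

// Greedy batching: each system goes into the batch right after the last batch
// that contains a conflicting system. Systems within a batch could run in parallel.
function buildBatches(systems) {
  const batches = [];
  for (const system of systems) {
    let target = 0;
    for (let b = batches.length - 1; b >= 0; b--) {
      if (batches[b].some((other) => conflicts(system, other))) {
        target = b + 1;
        break;
      }
    }
    if (target === batches.length) batches.push([]);
    batches[target].push(system);
  }
  return batches;
}

const systems = [
  { name: "movement", reads: ["Velocity"], writes: ["Position"] },
  { name: "animation", reads: ["Time"], writes: ["Sprite"] },
  { name: "render", reads: ["Position", "Sprite"], writes: [] },
];

const batches = buildBatches(systems).map((batch) => batch.map((s) => s.name));
console.log(batches); // [["movement", "animation"], ["render"]]
```

Here movement and animation touch disjoint components and share a batch, while render reads what both of them write and must wait for the next one.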
Further Reading
- ADR-001: Archetype-Based ECS -- Why archetypes over sparse sets
- ADR-007: Data-Oriented Design -- Formal decision record
- ECS Concepts -- Practical guide to entities, components, systems, and queries
- System Architecture -- How DOD fits into the overall engine design