
Rendering Pipeline

The rendering stack is the largest subsystem in Web Engine Dev, spanning multiple packages across Layers 6 and 7. It follows a WebGPU-first strategy (ADR-003) with WebGPU-only runtime execution.

Architecture Overview

Package Responsibilities

| Package | Layer | Responsibility |
| --- | --- | --- |
| render-graph | 6 | DAG scheduling of render passes, resource lifetime management |
| shader-compiler | 6 | WGSL tooling: preprocessor, shader variants, manifests, and optional transpilation utilities |
| renderer | 7 | Forward rendering, materials, lighting, shadows, post-processing, culling, GPU-driven rendering |
| gltf | 7 | glTF 2.0 / GLB loading with PBR materials and skinning |
| particles | 7 | GPU-accelerated particle systems |
| sprites | 7 | 2D sprite batching and animation |
| text | 7 | Text rendering (SDF and bitmap fonts) |
| terrain | 7 | CDLOD terrain with instanced rendering |
| gizmos | 7 | Editor gizmo rendering (translate, rotate, scale handles) |

Device Abstraction

The renderer uses a unified GpuDevice interface with a WebGPU runtime backend. Device creation still uses preferredBackend: 'auto' for API compatibility:

```typescript
import { createDevice } from '@web-engine-dev/renderer';

const { device, backend } = await createDevice({
  canvas: document.querySelector('canvas')!,
  preferredBackend: 'auto',
  powerPreference: 'high-performance',
});

console.log(`Using ${backend} backend`); // always 'webgpu'
```

Key Interfaces

| Interface | Purpose |
| --- | --- |
| GpuDevice | GPU device abstraction (create buffers, textures, pipelines) |
| GpuBuffer | GPU buffer (vertex, index, uniform, storage) |
| GpuTexture / GpuTextureView | Texture resources and views |
| GpuRenderPipeline / GpuComputePipeline | Shader pipelines |
| GpuBindGroup / GpuBindGroupLayout | Resource bindings |

Device Capabilities

device.capabilities exposes validated WebGPU limits and feature flags such as:

  • maxTextureDimension2D, maxBindGroups, maxUniformBufferBindingSize
  • supportsTimestampQuery, supportsRenderBundles, supportsDeferredRendering
  • supportsTextureCompressionBC / supportsTextureCompressionETC2 / supportsTextureCompressionASTC
  • isMobile, isSafari

If WebGPU is unavailable, createDevice() throws an explicit error.

Forward Rendering Pipeline

The core rendering path uses a forward renderer with the following frame structure:

Frame Execution Order

  1. Shadow Pass -- Renders shadow maps for all shadow-casting lights (cascaded shadows for directional lights, cubemap shadows for point lights, perspective shadows for spot lights)
  2. Forward Pass -- Renders all visible objects with full lighting, sorted by render layer:
    • Opaque (front-to-back, minimizes overdraw)
    • Alpha Test (front-to-back with alpha-to-coverage when MSAA is enabled)
    • Transparent (back-to-front for correct blending)
  3. Post-Processing -- HDR effects pipeline (bloom, tonemapping, FXAA/TAA, SSAO, DOF, color grading)
  4. Output -- Final composited frame to the canvas
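The four stages above run in a fixed order each frame. As a minimal sketch (the `Pass` interface and pass names here are illustrative assumptions, not the engine's actual API):

```typescript
// Minimal frame-orchestration sketch: passes execute in a fixed order.
// The Pass shape and names are illustrative, not the engine's real API.
interface Pass {
  name: string;
  execute(frameLog: string[]): void;
}

function makePass(name: string): Pass {
  return { name, execute: (frameLog) => frameLog.push(name) };
}

// Mirrors the frame structure described above.
const frameOrder: Pass[] = [
  makePass('shadow'),
  makePass('forward'),
  makePass('postprocess'),
  makePass('output'),
];

function renderFrame(): string[] {
  const log: string[] = [];
  for (const pass of frameOrder) pass.execute(log);
  return log;
}
```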

Render Queues and Sorting

Objects are sorted into render queues based on their material's blend mode:

| Render Layer | Sort Order | Depth Write | Use Case |
| --- | --- | --- | --- |
| Default (Opaque) | Front-to-back | Yes | Solid geometry |
| AlphaTest | Front-to-back | Yes | Cutout materials (foliage, fences) |
| Transparent | Back-to-front | No | Glass, water, particles |

Within each layer, objects are further sorted by material to minimize GPU state changes (bind group switches).
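The two sort orders can be sketched as comparators over a simplified draw item (the `DrawItem` shape is an assumption for illustration):

```typescript
// Simplified draw item; the real engine's per-draw record is richer.
interface DrawItem {
  depth: number;      // view-space distance from camera
  materialId: number; // groups draws to reduce bind group switches
}

// Opaque / alpha-test layers: front-to-back to minimize overdraw, with
// material id as a tiebreaker. Engines often quantize depth so that the
// material grouping takes effect more often.
function opaqueCompare(a: DrawItem, b: DrawItem): number {
  return a.depth - b.depth || a.materialId - b.materialId;
}

// Transparent layer: strictly back-to-front for correct blending.
function transparentCompare(a: DrawItem, b: DrawItem): number {
  return b.depth - a.depth;
}
```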

Render Graph

The render-graph package provides a DAG-based render pass scheduler. Passes declare their inputs, outputs, and dependencies, and the graph resolves execution order and resource lifetimes automatically:

```typescript
import { RenderGraph } from '@web-engine-dev/render-graph';

const graph = new RenderGraph(adapter);

graph.addPass('shadow', {
  attachments: { depth: shadowTexture },
  execute: (encoder) => shadowPass.render(encoder, scene),
});

graph.addPass('forward', {
  attachments: { color: colorTexture, depth: depthTexture },
  dependencies: ['shadow'],
  execute: (encoder) => forwardRenderer.render(encoder, scene),
});

graph.addPass('postprocess', {
  attachments: { color: outputTexture },
  dependencies: ['forward'],
  execute: (encoder) => postProcess.apply(encoder, colorTexture),
});

graph.execute();
```

Benefits

  • Automatic dependency resolution -- Passes execute in the correct order based on declared dependencies
  • Resource lifetime management -- Transient textures are allocated only when needed and released after use
  • Parallel pass execution -- Independent passes can execute concurrently (when supported by the backend)
  • Backend-agnostic abstraction -- The graph API is backend-independent while the engine runtime uses WebGPU
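The automatic dependency resolution boils down to a topological sort over the declared pass dependencies. A minimal sketch using Kahn's algorithm (the `PassDecl` shape is simplified; the real graph also tracks attachments and resource lifetimes):

```typescript
// Topological sort over declared pass dependencies (Kahn's algorithm).
interface PassDecl {
  name: string;
  dependencies?: string[];
}

function resolveOrder(passes: PassDecl[]): string[] {
  const indegree = new Map<string, number>();
  const dependents = new Map<string, string[]>();
  for (const p of passes) {
    indegree.set(p.name, p.dependencies?.length ?? 0);
    for (const dep of p.dependencies ?? []) {
      dependents.set(dep, [...(dependents.get(dep) ?? []), p.name]);
    }
  }
  // Start with passes that depend on nothing.
  const ready = [...indegree].filter(([, d]) => d === 0).map(([n]) => n);
  const order: string[] = [];
  while (ready.length > 0) {
    const name = ready.shift()!;
    order.push(name);
    for (const next of dependents.get(name) ?? []) {
      const d = indegree.get(next)! - 1;
      indegree.set(next, d);
      if (d === 0) ready.push(next);
    }
  }
  if (order.length !== passes.length) throw new Error('cycle in render graph');
  return order;
}
```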

Material System

Materials define the visual appearance of objects through shaders, properties, and textures.

Built-in Material Types

| Material | Description |
| --- | --- |
| PBR Standard | Metallic-roughness workflow (glTF 2.0 aligned) |
| PBR Advanced | Standard + clearcoat, transmission, sheen, iridescence, specular extensions |
| Unlit | No lighting calculations, solid color or textured |

Shader Variant System

The renderer uses a template-based shader variant system. Base shader templates contain placeholders that are filled based on enabled features and material defines:

```wgsl
{{DEFINES}}

struct VertexInput {
  {{VERTEX_INPUTS}}
}

@vertex fn vs_main(in: VertexInput) -> VertexOutput {
  {{VERTEX_TRANSFORM}}
}

@fragment fn fs_main(in: FragmentInput) -> @location(0) vec4f {
  {{FRAGMENT_PRE_LIGHTING}}
  // Lighting calculations...
  {{FRAGMENT_POST_LIGHTING}}
}
```

Features like skinning, normal mapping, and parallax occlusion are injected at the appropriate markers. The ShaderSystem manages variant compilation and caching.
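The substitution and caching steps can be sketched as follows (a simplified model; `composeVariant`, `getVariant`, and the cache-key scheme are illustrative names, not the ShaderSystem's actual API):

```typescript
// Replace {{PLACEHOLDER}} markers with per-variant snippets.
function composeVariant(template: string, slots: Record<string, string>): string {
  return template.replace(/\{\{(\w+)\}\}/g, (_, name) => slots[name] ?? '');
}

// Cache compiled variant sources by an order-independent define key, so
// ['SKINNING', 'NORMAL_MAP'] and ['NORMAL_MAP', 'SKINNING'] share one entry.
const variantCache = new Map<string, string>();

function getVariant(template: string, defines: string[]): string {
  const key = [...defines].sort().join('|');
  let source = variantCache.get(key);
  if (source === undefined) {
    source = composeVariant(template, {
      DEFINES: defines.map((d) => `// enabled: ${d}`).join('\n'),
    });
    variantCache.set(key, source);
  }
  return source;
}
```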

Bind Group Layout

The renderer uses a 4-group layout to keep update frequency boundaries explicit and predictable:

| Group | Purpose | Update Frequency | Contents |
| --- | --- | --- | --- |
| 0 | Camera | Per-frame | View/projection matrices, camera position |
| 1 | Model | Per-object | Model matrix, normal matrix |
| 2 | Material | Per-material | Material properties, textures, samplers |
| 3 | Lighting | Per-frame | Light array, shadows, IBL environment, fog |

Draw calls are sorted by material (group 2) to minimize bind group switches.
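One common way to implement this grouping is a packed numeric sort key with the material id in the high bits, so a single sort clusters draws by material while preserving front-to-back order within each material. A sketch under that assumption (the bit widths are illustrative, not the engine's actual layout):

```typescript
// Material id in the high bits, quantized depth in the low 16 bits.
// Sorting by this single number groups by material first, then depth.
function makeSortKey(materialId: number, depth: number, maxDepth: number): number {
  const t = Math.min(1, Math.max(0, depth / maxDepth));
  const depthBits = Math.min(0xffff, Math.floor(t * 0xffff));
  // Multiplication instead of << avoids 32-bit overflow for large ids.
  return materialId * 0x10000 + depthBits;
}
```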

Shader Compilation Pipeline

Shaders are authored in WGSL and compiled directly for WebGPU. The shader-compiler package provides preprocessing, variant tooling, manifest generation, and optional transpilation utilities for external workflows:

Compilation & Tooling Details

The shader compiler handles:

  • Preprocessing -- #define, conditionals, and #include expansion
  • Variant composition -- Feature-driven shader permutations
  • Manifest workflows -- Build-time generation and runtime cache integration
  • Optional transpilation utilities -- WGSL transform/transpile tooling for non-renderer workflows

Preprocessor

The shader preprocessor supports:

  • #define / #ifdef / #ifndef / #else / #endif for conditional compilation
  • #include "filename" for shared shader modules
  • Feature defines injected by ShaderSystem based on material capabilities and device limits
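To make the conditional-compilation behavior concrete, here is a deliberately minimal sketch of `#ifdef`/`#else`/`#endif` handling (no nesting and no `#include` resolution; the real preprocessor supports the full directive set listed above):

```typescript
// Minimal single-level #ifdef/#ifndef/#else/#endif pass.
function preprocess(source: string, defines: Set<string>): string {
  const out: string[] = [];
  let emitting = true;
  for (const line of source.split('\n')) {
    const m = line.match(/^\s*#(ifdef|ifndef|else|endif)\s*(\w*)/);
    if (m) {
      if (m[1] === 'ifdef') emitting = defines.has(m[2]);
      else if (m[1] === 'ifndef') emitting = !defines.has(m[2]);
      else if (m[1] === 'else') emitting = !emitting;
      else emitting = true; // endif: resume unconditional emission
      continue;            // directives themselves are not emitted
    }
    if (emitting) out.push(line);
  }
  return out.join('\n');
}
```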

Lighting System

The renderer supports three light types with GPU buffer management:

| Light Type | Shadow Technique | Buffer Size |
| --- | --- | --- |
| Directional | Cascaded Shadow Maps (1-4 cascades) | 48 bytes |
| Point | Cube shadow maps | 48 bytes |
| Spot | Perspective shadow maps | 64 bytes |

Lights are uploaded to a single GPU buffer with a header indicating counts:

```wgsl
// Light buffer layout
// Header: vec4(directionalCount, pointCount, spotCount, 0)
// Followed by: DirectionalLight[], PointLight[], SpotLight[]
```
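CPU-side packing for this layout can be sketched as below. Only the struct sizes (48/48/64 bytes) and the count header come from the documentation; the per-field contents of each light struct are left opaque here:

```typescript
// Floats per light, derived from the documented byte sizes (4 bytes/float).
const DIR_FLOATS = 12;   // 48 bytes
const POINT_FLOATS = 12; // 48 bytes
const SPOT_FLOATS = 16;  // 64 bytes

// Pack a vec4 count header followed by the three light arrays.
function packLights(
  dir: Float32Array[],
  point: Float32Array[],
  spot: Float32Array[],
): Float32Array {
  const total =
    4 + dir.length * DIR_FLOATS + point.length * POINT_FLOATS + spot.length * SPOT_FLOATS;
  const buf = new Float32Array(total);
  buf.set([dir.length, point.length, spot.length, 0]); // header
  let offset = 4;
  for (const light of [...dir, ...point, ...spot]) {
    buf.set(light, offset);
    offset += light.length;
  }
  return buf;
}
```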

Clustered Lighting

For scenes with many lights, the renderer supports clustered forward shading. The view frustum is divided into 3D clusters, and each cluster stores a list of affecting lights. This reduces per-fragment light iteration from O(N) to O(K) where K is the number of lights affecting the cluster.
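The cluster lookup itself is cheap: screen x/y map linearly to tiles and view-space depth maps to a slice, typically with the logarithmic distribution standard in clustered shading. A sketch of that mapping (cluster counts and the exact slicing formula are assumptions, not engine-verified):

```typescript
// Map a fragment to its 3D cluster index. nx/ny/nz are cluster counts.
function clusterIndex(
  sx: number, sy: number, viewZ: number,
  screenW: number, screenH: number,
  nearZ: number, farZ: number,
  nx: number, ny: number, nz: number,
): number {
  const cx = Math.min(nx - 1, Math.floor((sx / screenW) * nx));
  const cy = Math.min(ny - 1, Math.floor((sy / screenH) * ny));
  // Logarithmic depth slicing: equal ratios per slice, not equal distances.
  const slice = Math.floor((Math.log(viewZ / nearZ) / Math.log(farZ / nearZ)) * nz);
  const cz = Math.max(0, Math.min(nz - 1, slice));
  return cx + cy * nx + cz * nx * ny;
}
```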

Image-Based Lighting (IBL)

The IBL pipeline converts HDR environment maps into prefiltered cubemaps for PBR rendering:

Equirect HDR --> Source Cubemap (with mip chain)
                      |
       +--------------+--------------+
       |              |              |
  Irradiance    Prefiltered     BRDF LUT
  (diffuse)     (specular)     (integration)

Anti-firefly techniques follow the Filament reference implementation: K=4 LOD bias and soft HDR compression.
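At shading time, the prefiltered specular cubemap is sampled at a mip level chosen from material roughness. A sketch of the common linear mapping (this remap is an assumption; some engines use a perceptual, squared remap instead):

```typescript
// Roughness 0 samples the sharpest mip; roughness 1 the blurriest.
function specularMipForRoughness(roughness: number, mipCount: number): number {
  const r = Math.min(1, Math.max(0, roughness));
  return r * (mipCount - 1);
}
```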

Post-Processing Effects

The post-processing pipeline processes HDR render targets through a chain of effects:

| Effect | Description |
| --- | --- |
| Bloom | Bright area glow with threshold and intensity control |
| Tonemapping | HDR-to-LDR conversion (Reinhard, ACES, Filmic) |
| FXAA | Fast approximate anti-aliasing |
| TAA | Temporal anti-aliasing with motion vectors |
| SSAO | Screen-space ambient occlusion |
| SSGI | Screen-space global illumination |
| Depth of Field | Bokeh-based depth blur |
| Color Grading | LUT-based color correction |
| Vignette | Screen-edge darkening |
| CAS/FSR | Contrast-adaptive sharpening and AMD FidelityFX upscaling |

Effects are composable and can be enabled/disabled at runtime:

```typescript
import { PostProcessPipeline, BloomEffect, TonemappingEffect } from '@web-engine-dev/renderer';

const postProcess = new PostProcessPipeline(device, {
  hdrEnabled: true,
  effects: [
    new BloomEffect(device),
    new TonemappingEffect(device),
  ],
});
```

GPU-Driven Rendering

For large scenes, the renderer supports GPU-driven indirect rendering:

  1. Instance buffer -- All object transforms stored in a single SoA-layout buffer
  2. GPU culling -- Compute shader performs frustum and occlusion culling, writing visible instances to an indirect draw buffer
  3. Indirect draw -- A single drawIndexedIndirect call renders all visible instances of a given mesh/material combination

This reduces CPU-side draw overhead from one draw call per object (O(N)) to a single indirect draw per mesh/material batch (O(1)).
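A CPU-side sketch of what the culling step produces: a compacted list of visible instance indices plus indirect draw arguments in WebGPU's indexed-indirect layout (indexCount, instanceCount, firstIndex, baseVertex, firstInstance). The visibility test is reduced to a crude depth check for illustration; the real compute shader performs frustum and occlusion tests:

```typescript
// Simplified bounding-sphere instance for culling.
interface Instance {
  center: [number, number, number];
  radius: number;
}

function cullToIndirect(instances: Instance[], maxZ: number, indexCount: number) {
  const visible: number[] = [];
  instances.forEach((inst, i) => {
    // Stand-in for a sphere-vs-frustum test: keep anything near enough.
    if (inst.center[2] - inst.radius <= maxZ) visible.push(i);
  });
  // Five u32s, matching drawIndexedIndirect's argument layout.
  const indirect = new Uint32Array([indexCount, visible.length, 0, 0, 0]);
  return { visible, indirect };
}
```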

ECS Integration

The renderer provides full ECS integration through components, resources, and systems:

Components

```typescript
import {
  Transform3D,                  // Position, rotation, scale
  CameraComponent,              // Camera settings
  DirectionalLightComponent,    // Directional light
  PointLightComponent,          // Point light
  SpotLightComponent,           // Spot light
  MeshComponent,                // Mesh registry reference
  MaterialComponent,            // Material registry reference
  RenderableTag,                // Marks entity for rendering
} from '@web-engine-dev/renderer';
```

Systems

```typescript
import {
  TransformPropagationSystem,   // Parent-child transform hierarchy
  FrustumCullingSystem,         // Visibility culling
  ForwardRenderSystem,          // Main render pass
} from '@web-engine-dev/renderer';
```

Registries

Meshes, materials, and textures are stored in registries (ECS resources) and referenced by ID from components:

```typescript
// Register a mesh in the registry
const meshRegistry = world.getResource(MeshRegistryResource);
const meshId = meshRegistry.add(mesh);

// Reference it from a component
world.spawn(
  Transform3D.create({ position: [0, 1, 0] }),
  MeshComponent.create({ meshId }),
  MaterialComponent.create({ materialId }),
  RenderableTag,
);
```
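The registry pattern itself is small: resources live in a map and components carry only numeric ids. A minimal sketch of the shape (illustrative of the pattern, not the actual `MeshRegistryResource` API):

```typescript
// Generic id-keyed registry: components store the id, not the resource.
class Registry<T> {
  private items = new Map<number, T>();
  private nextId = 1;

  add(item: T): number {
    const id = this.nextId++;
    this.items.set(id, item);
    return id;
  }

  get(id: number): T | undefined {
    return this.items.get(id);
  }

  remove(id: number): boolean {
    return this.items.delete(id);
  }
}
```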

WebGPU Best Practices

Buffer Alignment (std140)

All uniform buffers follow std140 layout rules. The most common pitfall is vec3f alignment:

| Type | Size | Alignment |
| --- | --- | --- |
| f32 | 4 | 4 |
| vec2f | 8 | 8 |
| vec3f | 12 | 16 |
| vec4f | 16 | 16 |
| mat4x4f | 64 | 16 |

```wgsl
// Correct: pack f32 after vec3f to fill padding
struct ModelUniforms {
  position: vec3f,   // 12 bytes at offset 0
  intensity: f32,    // 4 bytes at offset 12 (fills vec3 padding)
  color: vec3f,      // 12 bytes at offset 16
  _pad: f32,         // 4 bytes at offset 28
}
```
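The offset rule is simply "round each member's offset up to its alignment". A small calculator built from the size/alignment table above reproduces the ModelUniforms layout (a sketch for scalar/vector/matrix members only; struct and array members have extra std140 rules not handled here):

```typescript
// Size/alignment per type, matching the table above.
const std140: Record<string, { size: number; align: number }> = {
  f32: { size: 4, align: 4 },
  vec2f: { size: 8, align: 8 },
  vec3f: { size: 12, align: 16 },
  vec4f: { size: 16, align: 16 },
  mat4x4f: { size: 64, align: 16 },
};

// Compute each member's byte offset: round up to alignment, then advance.
function layout(members: [string, string][]): Record<string, number> {
  const offsets: Record<string, number> = {};
  let cursor = 0;
  for (const [name, type] of members) {
    const { size, align } = std140[type];
    cursor = Math.ceil(cursor / align) * align;
    offsets[name] = cursor;
    cursor += size;
  }
  return offsets;
}

const modelUniforms = layout([
  ['position', 'vec3f'],
  ['intensity', 'f32'],
  ['color', 'vec3f'],
  ['_pad', 'f32'],
]);
```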

Resource Lifecycle

GPU resources must be properly disposed to prevent memory leaks. The renderer supports deferred deletion -- old resources are queued for disposal 2-3 frames after replacement, allowing in-flight GPU commands to complete:

```typescript
// Pattern for dynamic resource changes
const newTexture = await loadTexture(url);
deferredDeletionQueue.queue(() => oldTexture.dispose(), 3); // Dispose after 3 frames
```


Proprietary software. All rights reserved.