GPU

Adapter info, frame stats, raycasting against the 3D scene, screen-to-world projection, and storage buffer plumbing for shaders.

local GPU = import("GPU")
print(GPU.Info().Name, GPU.FrameStats().FPS)

Adapter / frame info

GPU.Info() -> GPUInfo

Static adapter info. Cached, so repeated calls are free. Useful for logging, settings menus, and gating high-cost features behind DeviceType == "DiscreteGpu".
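A minimal sketch of gating a quality preset on DeviceType. It relies only on the DeviceType strings listed under GPUInfo; the helper name pickPreset and the preset names "High" / "Medium" / "Low" are hypothetical and not part of the engine.

```lua
-- Hypothetical helper: map a GPUInfo-shaped table to a preset name.
local function pickPreset(info)
    if info.DeviceType == "DiscreteGpu" then
        return "High"
    elseif info.DeviceType == "IntegratedGpu" then
        return "Medium"
    end
    -- "VirtualGpu", "Cpu", "Other", or anything unexpected.
    return "Low"
end

-- In the engine this would be called once at startup:
--   local preset = pickPreset(GPU.Info())
```

Taking the info table as a parameter keeps the helper testable outside the engine.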

GPU.Limits() -> GPULimits

Hard limits the active GPU device reports. wgpu doesn't expose total VRAM portably across DX12 / Vulkan / Metal, so MaxBufferSize is the closest single number for "biggest thing I can allocate".
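A small sketch of checking a planned allocation against MaxBufferSize before calling GPU.NewBuffer. MaxBufferSize is reported in bytes and buffer sizes are counted in f32 slots, so the sketch multiplies by 4. The helper name fitsInOneBuffer is hypothetical.

```lua
local BYTES_PER_F32 = 4

-- Hypothetical helper: true if `floatCount` f32 slots fit in one buffer,
-- given a GPULimits-shaped table.
local function fitsInOneBuffer(limits, floatCount)
    return floatCount * BYTES_PER_F32 <= limits.MaxBufferSize
end

-- In the engine:
--   if fitsInOneBuffer(GPU.Limits(), 1024 * 1024) then ... end
```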

GPU.FrameStats() -> GPUFrameStats

FPS / frame time / scene part count. Updated once per heart tick. Cheap; safe to poll every frame for a debug overlay.
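A sketch of formatting these stats for a debug overlay line. It uses only the FPS, FrameTime (seconds) and PartCount fields described under GPUFrameStats; the helper name formatStats and the text layout are arbitrary choices.

```lua
-- Hypothetical helper: build one overlay line from a GPUFrameStats-shaped table.
local function formatStats(stats)
    return string.format(
        "%.0f FPS | %.2f ms | %d parts",
        stats.FPS,
        stats.FrameTime * 1000, -- seconds to milliseconds
        stats.PartCount
    )
end

-- In the engine, once per frame:
--   local line = formatStats(GPU.FrameStats())
```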

Raycasting

GPU.Raycast(origin, direction, filter?, maxDistance?) -> RaycastHit?

GPU-dispatched ray test against every renderable BasePart. One compute thread per part runs in parallel: cubes test as oriented bounding boxes (CFrame + Size), spheres as ellipsoids (CFrame + Size), and models with mesh data test per-triangle via Möller-Trumbore in part-local space, so corners and concavities are hit precisely rather than approximated by a bounding box. Models without mesh data fall back to OBB. Parts with Render = false or IgnoreInRaycast = true are skipped. Falls back to a CPU loop only if the GPU has not been initialized (i.e. no window opened yet).

Parameters

origin: Vector, world-space ray start.
direction: Vector, world-space ray direction. Length doesn't matter; Ruzit normalizes it. A zero-length direction raises an error.
filter: Optional (part: BasePart) -> boolean. Hits are visited nearest-first; return true to accept, false / nil to skip past. If omitted, the first hit is always accepted.
maxDistance: Optional cap in world units. Hits past this distance are ignored. Defaults to ~1e6.

Example: pick a part under the mouse, ignoring red ones

local Mouse = import("Mouse")
local origin, dir = GPU.ScreenToRay(Mouse.Position)
local hit = GPU.Raycast(origin, dir, function(part)
    return part.Color ~= Color3.new(1, 0, 0)
end)
if hit then
    print("clicked", hit.Part, "at", hit.Position, "distance", hit.Distance)
end
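Another filter sketch: skipping a fixed set of parts (for example the player's own body). It assumes the engine hands the filter the same part value each time, so parts can be used as table keys; that identity behaviour is an assumption, and makeIgnoreFilter and playerBody are hypothetical names.

```lua
-- Hypothetical helper: build a Raycast filter that rejects listed parts.
local function makeIgnoreFilter(ignored)
    local skip = {}
    for _, part in ipairs(ignored) do
        skip[part] = true
    end
    return function(part)
        -- true accepts the hit; false lets the ray continue to the next one.
        return not skip[part]
    end
end

-- In the engine (playerBody is a placeholder for a part you own):
--   local hit = GPU.Raycast(origin, dir, makeIgnoreFilter({ playerBody }), 500)
```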

Projection

GPU.ScreenToRay(point) -> (Vector, Vector)

Convert a screen pixel (Mouse.Position style) to a world-space ray. Returns (origin, direction), which can be fed straight into GPU.Raycast. Uses the active Renderable.Camera as the eye + projection.

GPU.WorldToScreen(world) -> (Dim?, boolean)

Project a world-space point onto the screen. (Dim, false) when the point is in front of the camera; (nil, true) when at or behind the near plane. Useful for billboards, world-anchored UI, off-screen indicators.
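A sketch of handling both return values for world-anchored UI. The GPU module is passed in as a parameter only so the snippet can run without the engine; projectLabel is a hypothetical name.

```lua
-- Hypothetical helper: return the screen position for a label,
-- or nil when the point is at or behind the near plane.
local function projectLabel(gpu, worldPoint)
    local screen, behind = gpu.WorldToScreen(worldPoint)
    if behind or screen == nil then
        return nil -- caller can hide the label or draw an off-screen arrow
    end
    return screen
end

-- In the engine:
--   local pos = projectLabel(GPU, target.Position)
```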

Storage buffers

Read-only data accessible from any 3D fragment shader. Bound at @group(0) @binding(4) as SDATA: array<f32>.

GPU.NewBuffer(size) -> GPUBuffer

Allocate a GPU storage buffer of size floats. Memory lives on the GPU. Freed when the GPUBuffer userdata is garbage-collected.

GPU.SetBuffer(buffer)

Bind a GPUBuffer as the active SDATA storage. Cheap to swap between frames: bind groups are rebuilt only when the binding identity changes.

GPU.ClearBuffer()

Unbind the active storage buffer. SDATA reverts to a 1-float stub so shaders that don't reference it keep working.
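A sketch of the allocate / write / bind sequence using NewBuffer, buf:Write and SetBuffer. The source does not state whether Write offsets start at 0 or 1; this sketch assumes 0 addresses the first float. The module is passed in as a parameter so the snippet runs without the engine, and uploadFloats is a hypothetical name.

```lua
-- Hypothetical helper: upload a flat list of numbers and bind it as SDATA.
local function uploadFloats(gpu, values)
    local buf = gpu.NewBuffer(#values) -- size is counted in floats
    buf:Write(0, values)               -- ASSUMPTION: offset 0 is the first float
    gpu.SetBuffer(buf)
    return buf
end

-- In the engine:
--   local sdata = uploadFloats(GPU, { 1.0, 0.5, 0.25 })
```

Since the buffer is freed when its userdata is garbage-collected, it is probably safest to keep the returned value referenced for as long as shaders read it.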

GPU.SetMaxFrameRate(fps?)

Cap or uncap the heart-loop tick rate. Pass nil (or 0) to remove the cap and run as fast as the system allows; pass a positive number to throttle. The engine ships uncapped: frames run as fast as the renderer + game logic allow, bounded only by the swapchain present mode (see SetVSync). The cap is applied by sleeping at the end of each tick if the frame finished early, so it has no effect on a system that can't keep up. Cheap to call live (atomic store), so wire it to a settings menu freely.
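To illustrate the sleep-based cap described above, here is the arithmetic as a standalone function. This is an illustration of the idea, not the engine's code; sleepSeconds is a hypothetical name.

```lua
-- Illustrative only: how long a tick would sleep under a frame-rate cap.
local function sleepSeconds(capFps, frameSeconds)
    if capFps == nil or capFps <= 0 then
        return 0 -- uncapped
    end
    local budget = 1 / capFps
    -- A frame that overran its budget gets no sleep at all.
    return math.max(0, budget - frameSeconds)
end
```

With a cap of 50 the budget is 20 ms per tick, so a 5 ms frame sleeps about 15 ms, while a 25 ms frame is not delayed further.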

GPU.GetMaxFrameRate() -> number?

Current cap, or nil when uncapped.

GPU.SetVSync(on)

Toggle vertical sync on the swapchain. Default is OFF — the engine picks the highest-FPS tear-free mode the driver offers (Mailbox > Immediate > FifoRelaxed > Fifo). Turning VSync on hard-caps presents to the monitor's refresh rate (Fifo / FifoRelaxed). Use it for laptop battery life or to eliminate tearing on monitors that don't expose a tear-free uncapped mode. Takes effect on the next rendered frame.

GPU.GetVSync() -> boolean

True if the swapchain is currently in a VSync mode.

GPU.SetPowerMode(mode)

Set the GPU power-mode preference. Three options:

  • "Quality" (default). Requests the HighPerformance adapter (discrete GPU on dual-GPU laptops) and biases the driver toward high clocks. Best for shooters / action games where you want the highest sustained FPS the hardware can give you.
  • "Performance" — despite the name, the battery-friendly mode. Requests the LowPower adapter (integrated GPU when present) and lets the driver downclock aggressively. Use for lightweight / stylized games where you'd rather extend laptop battery life than push FPS.
  • "Auto" — whatever the driver picks when no preference is given. On laptops that often means LowPower; on desktops it usually means the only available adapter.
Adapter selection happens once, at startup. Calling SetPowerMode after the window opens stores the preference (so GetPowerMode returns it), but doesn't switch the live adapter. To actually change the adapter, call this before opening the window (or set it from a startup script the launcher runs).

GPU.GetPowerMode() -> "Quality" | "Performance" | "Auto"

Current power-mode preference. Default is "Quality".
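A sketch of applying a settings table through SetVSync, SetMaxFrameRate and SetPowerMode. The settings field names (vsync, maxFps, powerMode) and the helper name are hypothetical; the module is passed in so the snippet runs without the engine.

```lua
local VALID_MODES = { Quality = true, Performance = true, Auto = true }

-- Hypothetical helper: push user settings into the GPU module.
local function applyGraphicsSettings(gpu, settings)
    gpu.SetVSync(settings.vsync == true)
    gpu.SetMaxFrameRate(settings.maxFps) -- nil removes the cap
    if VALID_MODES[settings.powerMode] then
        -- After the window is open this only stores the preference.
        gpu.SetPowerMode(settings.powerMode)
    end
end

-- In the engine:
--   applyGraphicsSettings(GPU, { vsync = true, maxFps = 120, powerMode = "Auto" })
```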

GPU.GetBounds(part, quality?) -> { { Position, Size, Min, Max } }

Return one or more world-space AABBs covering a BasePart. Each entry is { Position, Size, Min, Max } with Position the AABB center and Size its full extents (max − min). Honors deformations, current animation state, and DynMesh-driven movement, so the boxes follow the live mesh.

quality controls how many boxes you get:

  • 1 (default) — single box that encloses the entire part. Cheapest. Always accurate for cubes / spheres.
  • 2+ — split a model's vertices into N bins along the longest axis and return a tight AABB per bin. Useful for long thin meshes where one big box is loose. Cubes and spheres always return one box regardless of quality.
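The binning idea behind quality 2+ can be sketched on plain data. This is an illustration of the technique, not the engine's implementation: vertices are plain { x, y, z } arrays rather than engine Vectors, bins are equal-width slices along the longest axis (the engine's exact split rule is not documented here), and binnedBounds is a hypothetical name.

```lua
-- Illustrative only: split vertices into `bins` slices along the longest
-- axis and return a tight min/max box for each non-empty slice.
local function binnedBounds(vertices, bins)
    local lo = { math.huge, math.huge, math.huge }
    local hi = { -math.huge, -math.huge, -math.huge }
    for _, v in ipairs(vertices) do
        for a = 1, 3 do
            if v[a] < lo[a] then lo[a] = v[a] end
            if v[a] > hi[a] then hi[a] = v[a] end
        end
    end

    -- Pick the axis with the largest extent.
    local axis = 1
    for a = 2, 3 do
        if hi[a] - lo[a] > hi[axis] - lo[axis] then axis = a end
    end
    local span = hi[axis] - lo[axis]

    local boxes = {}
    for _, v in ipairs(vertices) do
        local t = span > 0 and (v[axis] - lo[axis]) / span or 0
        local i = math.min(bins, math.floor(t * bins) + 1)
        local box = boxes[i]
        if not box then
            boxes[i] = { min = { v[1], v[2], v[3] }, max = { v[1], v[2], v[3] } }
        else
            for a = 1, 3 do
                if v[a] < box.min[a] then box.min[a] = v[a] end
                if v[a] > box.max[a] then box.max[a] = v[a] end
            end
        end
    end

    -- Return a dense list, skipping empty slices.
    local out = {}
    for i = 1, bins do
        if boxes[i] then out[#out + 1] = boxes[i] end
    end
    return out
end
```

For a long thin mesh, two or three slices usually hug the shape far more closely than a single box.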

GPU.CheckArea(center, size, quality?) -> { BasePart }

Return every renderable BasePart whose bounds (computed at the given quality) overlap the AABB centered at center with full extents size. Skips parts with Render = false. Honors deformed / animated meshes and DynMesh offsets.

GPU spatial queries

These run as a compute pass on the GPU: one thread per part tests the part's world-space AABB against the query shape, matches are written to a packed atomic-counter buffer, then the index list is read back and resolved to BaseParts. Skips parts with Render = false. Filter functions (when provided) run on the CPU after the GPU returns candidates, so they're only invoked on hits.

GPU.OverlapSphere(center, radius, filter?) -> { BasePart }

Return every part whose AABB intersects the sphere centered at center with the given radius. Filter is optional (BasePart) -> boolean; return true to keep, false / nil to drop.
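For reference, a sphere-vs-AABB intersection test can be written as below. This is the standard closest-point formulation on plain { x, y, z } arrays, offered as a sketch of what "AABB intersects the sphere" means; it is not taken from the engine's compute shader, and the function name is hypothetical.

```lua
-- Illustrative only: does a sphere touch an axis-aligned box?
local function sphereIntersectsAABB(center, radius, boxMin, boxMax)
    local distSq = 0
    for a = 1, 3 do
        -- Closest point on the box to the sphere center, per axis.
        local nearest = math.max(boxMin[a], math.min(center[a], boxMax[a]))
        local d = center[a] - nearest
        distSq = distSq + d * d
    end
    return distSq <= radius * radius
end
```

Because the test uses the part's AABB rather than its exact shape, a filter callback is the place to reject near misses on rotated or irregular parts.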

GPU.OverlapBox(center, size, rotation?, filter?) -> { BasePart }

Return every part whose AABB intersects the oriented box at center with full extents size, rotated by an Euler rotation (radians, defaults to no rotation). The GPU test is conservative (AABB-vs-OBB-AABB), so a heavily rotated thin box may include a few false positives near the corners. Tighten with the filter callback if you need exact OBB-OBB on a small candidate set.

GPU.OverlapFrustum(filter?) -> { BasePart }

Return every part whose AABB is inside or intersects the current camera's view frustum. Uses Renderable.Camera's CFrame, FOV, near and far plus the window's aspect ratio. Useful as a fast first pass for "what would be drawn this frame" or for area-of-interest streaming around the player.

GPU.GetItemsInZone(cframe, size, filter?) -> { BasePart }

Mesh-precise oriented-zone query. cframe places and rotates the zone, size is its full extents along the zone's local X / Y / Z. The compute shader tests each part's eight world-space corners against the zone OBB, and for parts with mesh data (Models) it additionally transforms the actual triangle vertices and tests each one, so a spike sticking out of a body still registers even when the body's bounding box doesn't reach the zone. Honors deformed meshes (uses the live deformed verts when DynMesh or animations are active, otherwise the source mesh). Also catches the inverse case where the zone is fully inside a large part.

Like the other Overlap queries this runs as one compute thread per part with an atomic-counter index list, then the CPU reads back only the matched indices.
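The per-point test at the heart of an oriented-zone query can be sketched as follows. It checks a single point against a zone described by a center, three unit axes and full extents, all as plain arrays; the engine applies this kind of test to part corners and mesh vertices, but its actual shader code is not shown here and pointInZone is a hypothetical name.

```lua
-- Illustrative only: is point `p` inside an oriented box?
-- `axes` holds the zone's local X / Y / Z as unit vectors in world space.
local function pointInZone(p, center, axes, size)
    for a = 1, 3 do
        local ax = axes[a]
        -- Project the offset from the zone center onto this local axis.
        local d = (p[1] - center[1]) * ax[1]
                + (p[2] - center[2]) * ax[2]
                + (p[3] - center[3]) * ax[3]
        if math.abs(d) > size[a] * 0.5 then
            return false
        end
    end
    return true
end
```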

Shadow configuration

Knobs the default 3D shader reads each frame to decide how the sun terminator behaves and how dark receivers get on their unlit side. Real depth-map sampling (per-light render passes, comparison samplers, cascade splits) is scaffolded for a future pass — this API ships now so you can author lighting against it already.

When SetShadowsEnabled(true) the default shader stops using a soft half-Lambert and instead does a smoothstep at the light terminator (sharper, more "shadowy" look). Receivers darken proportional to MapQuality, the PCF knob widens the smoothstep band to simulate softness, and ShadowDistance fades the term back into half-Lambert past that camera distance.

GPU.SetShadowsEnabled(on) new in 1.2.1

Toggle the shadow term in the default 3D shader. Default false. Custom Frag3D shaders can read the same value via F.shadow_enabled (u32, 0 or 1) and choose to use it or ignore it.

GPU.GetShadowsEnabled() -> boolean

GPU.SetShadowMapQuality(size) new in 1.2.1

Texture resolution future shadow maps will be rendered at. Clamped to [64, 8192] and rounded to the next power of two. Today the value also drives the strength of the stylized shadow floor in the default shader (higher = more contrasty receive-shadow darkening). Default 1024.
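The clamp-and-round rule can be sketched as a standalone function, which is handy for showing the effective value in a settings menu. This mirrors the documented behaviour ([64, 8192], next power of two) but is not the engine's code; the function name is hypothetical.

```lua
-- Illustrative only: clamp to [64, 8192], then round up to a power of two.
local function normalizeShadowMapSize(size)
    local clamped = math.max(64, math.min(8192, size))
    local pow = 64
    while pow < clamped do
        pow = pow * 2
    end
    return pow
end
```

Under this reading, a requested size of 1000 becomes 1024, and anything above 4096 lands on 8192.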

GPU.GetShadowMapQuality() -> number

GPU.SetShadowDistance(distance) new in 1.2.1

World-space radius (from the camera) beyond which the shadow term fades back to the regular half-Lambert. Set to 0 to disable distance fade. Default 80. Surfaced as F.shadow_distance in user shaders.

GPU.GetShadowDistance() -> number

GPU.SetShadowBias(bias) new in 1.2.1

Depth comparison bias for future shadow-map sampling. Stored in the GPU module state and forwarded to the Frame uniform so user shaders can read it now. Default 0.0015.

GPU.GetShadowBias() -> number

GPU.SetShadowPCF(taps) new in 1.2.1

Percentage-Closer-Filtering tap count, clamped to [1, 9]. 1 is hard shadows; 3 / 5 / 9 are progressively softer. Today this widens the smoothstep band of the default shader's stylized terminator; once real depth-map sampling lands the same value will drive the actual PCF kernel. Default 1.

GPU.GetShadowPCF() -> number

Reading shadow state from your own shaders

Every GPU.SetShadow* knob is mirrored into the F uniform that every Frag3D shader already sees. You don't bind anything extra — just read these from inside your fs_main:

| Lua call | WGSL field | Type | Notes |
|---|---|---|---|
| GPU.SetShadowsEnabled(on) | F.shadow_enabled | u32 | Branch with if (F.shadow_enabled == 0u) { ... }. |
| GPU.SetShadowMapQuality(size) | F.shadow_quality | u32 | Power-of-two pixel size. Use as the kernel resolution if you ever bind a real shadow texture. |
| GPU.SetShadowDistance(d) | F.shadow_distance | f32 | World units. Fade your shadow term to zero past this. |
| GPU.SetShadowBias(b) | F.shadow_bias | f32 | Offset the shadow-ray origin along the surface normal to avoid self-shadowing. |
| GPU.SetShadowPCF(taps) | F.shadow_pcf | u32 | 1, 3, 5, or 9. Use as your tap count for jittered shadow samples. |
| (derived) | F.shadow_strength | f32 | How dark a receiver gets on its unlit side. Engine-derived (0.3..0.85) from MapQuality. |
| (derived) | F.shadow_softness | f32 | Smoothstep band width at the light terminator. Engine-derived from PCF. |

I.cast_shadow and I.receive_shadow are also available per-part (u32, 0 or 1) so a shader can skip the shadow test on parts that have opted out:

// Skip the work entirely on non-receivers.
var sun_mask = 1.0;
if (I.receive_shadow != 0u && F.shadow_enabled != 0u && ndl > 0.0) {
    sun_mask = compute_shadow(in.world_pos, n, l);
}

A working ray-traced soft-shadow shader for the floor, plus a per-part variant in which parts shadow each other, live in examples/shadow_floor.frag and examples/shadow_part.frag. The short version: read your caster list from SDATA, do a tangent-space jitter using F.shadow_pcf taps and the engine's hash3 helper, offset the ray origin by n * F.shadow_bias, and fade with smoothstep(F.shadow_distance * 0.7, F.shadow_distance, d).

fn compute_shadow(world_pos: vec3<f32>, n: vec3<f32>, l: vec3<f32>) -> f32 {
    if (F.shadow_enabled == 0u) { return 1.0; }

    let tb     = tangent_basis(l);
    let taps   = max(F.shadow_pcf, 1u);
    let spread = 0.015 * f32(taps);
    let ro     = world_pos + n * max(F.shadow_bias, 0.0001);

    var occ_sum = 0.0;
    for (var s: u32 = 0u; s < taps; s = s + 1u) {
        let hv = hash3(world_pos * 137.7 + vec3<f32>(f32(s), f32(s)*1.3, f32(s)*2.7));
        let ox = (hv.x - 0.5) * spread * 2.0;
        let oy = (hv.y - 0.5) * spread * 2.0;
        let rd = normalize(l + tb[0] * ox + tb[1] * oy);
        occ_sum = occ_sum + occluded(ro, rd);
    }
    var occ = occ_sum / f32(taps);

    if (F.shadow_distance > 0.0) {
        let d = distance(world_pos, F.camera_pos);
        let fade = smoothstep(F.shadow_distance * 0.7, F.shadow_distance, d);
        occ = occ * (1.0 - fade);
    }
    return 1.0 - clamp(occ * 0.95, 0.0, 1.0);
}

Caster list. The occluded() helper above walks the SDATA storage buffer for caster geometry. That buffer is the same general-purpose slot GPU.SetBuffer binds, so you populate it from Lua: allocate via GPU.NewBuffer, pack 4 header floats followed by 8 floats per caster (kind then padding then center+half-extents), and rebind on every frame the casters move. The engine does not auto-populate this — the shape of the caster layout is entirely up to you.
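A sketch of packing a caster list into the flat layout described above: 4 header floats, then 8 floats per caster (kind, one padding float, center xyz, half-extents xyz). The header contents are an assumption here (caster count in the first slot, zeros after), as are the field names kind / center / half and the function name; match them to whatever your occluded() implementation reads.

```lua
-- Illustrative only: flatten casters into a list suitable for buf:Write.
-- Each caster is { kind = number, center = { x, y, z }, half = { x, y, z } }.
local function packCasters(casters)
    -- ASSUMPTION: header = { count, 0, 0, 0 }.
    local data = { #casters, 0, 0, 0 }
    for _, c in ipairs(casters) do
        data[#data + 1] = c.kind
        data[#data + 1] = 0 -- padding
        for a = 1, 3 do data[#data + 1] = c.center[a] end
        for a = 1, 3 do data[#data + 1] = c.half[a] end
    end
    return data
end

-- In the engine, size the buffer as 4 + 8 * casterCount floats, then
-- write the packed list with buf:Write and bind it with GPU.SetBuffer.
```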

GPUBuffer

Returned by GPU.NewBuffer.

buf:Size() -> number method

Number of f32 slots. Same value as the size arg passed to GPU.NewBuffer; doesn't change after construction.

buf:Write(offset, values) method

Write a list of f32 values into the buffer at offset (in floats, not bytes). Bounds-checked: writing past the end raises an error instead of corrupting adjacent GPU memory. Cheap; one queue-write under the hood.

buf:Fill(value) method

Fill the entire buffer with one number. Convenience for clearing (:Fill(0)) or initialising to a sentinel.

RaycastHit

Returned by GPU.Raycast. All vectors world-space.

Part: BasePart, the part the ray hit.
Position: Vector, world-space hit point.
Distance: number, world-space length from the ray's origin.
Normal: Vector, surface normal at the hit, world-space.

GPUInfo

Static adapter info, filled once at window-open time. Returned by GPU.Info().

Name: string, driver-reported display name ("NVIDIA GeForce RTX 4070").
Vendor: string, friendly label: "NVIDIA" | "AMD" | "Intel" | "Apple" | "ARM" | "Qualcomm" | "Imagination" | "Microsoft" | "Other" | "Unknown".
VendorID: number, raw PCI vendor id.
DeviceID: number, raw PCI device id.
Backend: string, "Dx12" | "Vulkan" | "Metal" | "Gl" | "BrowserWebGpu".
Driver: string, driver version, free-form.
DriverInfo: string, extra driver detail (build, date). May be empty.
DeviceType: string, "DiscreteGpu" | "IntegratedGpu" | "VirtualGpu" | "Cpu" | "Other". Useful for quality presets at startup.

GPULimits

Hard caps reported by the wgpu device. Returned by GPU.Limits().

MaxTextureSize: number, largest 2D texture dimension (e.g. 16384 on most GPUs).
MaxBufferSize: number, largest single buffer allocation in bytes.
MaxBindGroups: number, max simultaneously bound bind groups in one pipeline.
MaxVertexBuffers: number, max vertex buffers attached to one pipeline.
MaxComputeWorkgroup: { X: number, Y: number, Z: number }, max compute workgroup size per axis.

GPUFrameStats

Live frame stats. Smoothed dt is an EMA so FPS doesn't strobe. Returned by GPU.FrameStats().

FPS: number, smoothed frames-per-second.
FrameTime: number, smoothed frame time in seconds.
FrameCount: number, total frames drawn since process start.
Uptime: number, wall-clock seconds since GPU tracking started.
PartCount: number, live count of renderable parts in the scene.
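For readers unfamiliar with the EMA smoothing mentioned above, one update step looks like this. The smoothing factor the engine uses is not documented; the alpha of 0.1 in the usage comment is an arbitrary illustration, and emaStep is a hypothetical name.

```lua
-- Illustrative only: one exponential-moving-average update of frame time.
-- alpha near 0 smooths heavily; alpha near 1 tracks the raw sample.
local function emaStep(smoothed, sample, alpha)
    return smoothed + alpha * (sample - smoothed)
end

-- Example: a 10 ms frame arriving while the smoothed time is 20 ms.
--   local dt = emaStep(0.020, 0.010, 0.1)
--   local fps = 1 / dt
```

A single fast or slow frame only nudges the smoothed value, which is why the reported FPS stays readable in an overlay.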