Optimizing Large Procedural Tilemaps In Unity

Reducing Draw Calls for Large Procedural Tilemaps

Procedural generation of large tilemaps can easily produce thousands of individual tile meshes, each requiring its own draw call. The cost of preparing and submitting those calls falls mostly on the CPU and can quickly dominate frame time. To optimize, we need to batch tiles together into larger meshes while keeping culling efficient. This section covers static batching, dynamic batching, and manual chunking as ways to reduce draw calls.

Understanding Draw Calls

A draw call is a command the CPU sends to the graphics API to render a mesh with a given material. If each tile is created as its own GameObject with its own mesh, every tile needs a separate draw call. With thousands of tiles, the per-call overhead (state changes and command submission) adds up on the CPU before the GPU does any work. Merging tiles into fewer, larger meshes can drastically cut the total number of draw calls.

Enabling Static Batching

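Static batching in Unity combines non-moving meshes that share a material into larger meshes, so they can be drawn with fewer draw calls. To enable it for tilemaps:

  • Mark tile GameObjects as static (the Batching Static flag)
  • Check the Static Batching checkbox in Player Settings
  • For tiles created at runtime, call StaticBatchingUtility.Combine on their parent object, since the static flag alone only covers objects that already exist in the scene at build time
  • Treat GPU instancing as an alternative rather than an add-on: Unity does not apply instancing to objects it has already static-batched

Static batching has limitations though: batched tiles can no longer move, the combined geometry takes extra memory, only tiles sharing a material are merged, and each batch is capped at 64,000 vertices.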
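For tiles spawned by a generator at runtime, the combine step can be triggered from script. The following is a minimal sketch, assuming the generator parents every tile GameObject under one root object; the RuntimeTileBatcher component and its method name are hypothetical, while StaticBatchingUtility.Combine is Unity's own API.

```csharp
using UnityEngine;

// Hypothetical helper: attach to the parent object that holds the generated tiles.
public class RuntimeTileBatcher : MonoBehaviour {

  // Call once, after the generator has finished spawning tile GameObjects
  // as children of this transform.
  public void BatchGeneratedTiles() {
    // Prepares all child renderers for static batching; children that share
    // a material can then be drawn together. After this call the children
    // can no longer be moved individually.
    StaticBatchingUtility.Combine(gameObject);
  }

}
```

Imported tile meshes generally need Read/Write enabled in their import settings for a runtime combine to work; meshes built in script are readable by default.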

Implementing Dynamic Batching

For non-static tiles like animated water or lava, dynamic batching is an option. It merges small moving meshes on the CPU each frame to reduce draw calls. To enable:

  • Check the Dynamic Batching checkbox in Player Settings
  • Leave the tile GameObjects unmarked as static
  • Keep tile meshes small and their vertex data lean: only meshes with a few hundred vertices at most qualify, and each extra vertex attribute lowers that limit

Dynamic batching adds CPU overhead every frame though, so profile carefully against static batching.

Manual Chunk Batching

Alternatively, manually chunk together tile batches using custom scripts. This offers more control than Unity’s batching methods. Steps include:

  • Group tiles into structured chunks of bounded size
  • Combine each chunk’s tiles into larger meshes
  • Draw the batched chunk meshes
  • Leverage frustum culling on chunks

Later sections cover chunking approaches in more detail.
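The combining step can be sketched by writing one quad per tile into a shared mesh. This is an illustration under stated assumptions, not a definitive implementation: the ChunkMeshBuilder class, the bool grid input and the full-texture UVs are all hypothetical simplifications, and tiles are assumed to lie in the XY plane.

```csharp
using System.Collections.Generic;
using UnityEngine;

// Hypothetical helper that merges a grid of tiles into one mesh.
public static class ChunkMeshBuilder {

  // solid[x, y] marks which cells contain a tile; tileSize is in world units.
  public static Mesh Build(bool[,] solid, float tileSize) {
    var vertices = new List<Vector3>();
    var uvs = new List<Vector2>();
    var triangles = new List<int>();

    for (int x = 0; x < solid.GetLength(0); x++) {
      for (int y = 0; y < solid.GetLength(1); y++) {
        if (!solid[x, y]) continue;

        int start = vertices.Count;
        float x0 = x * tileSize, y0 = y * tileSize;
        float x1 = x0 + tileSize, y1 = y0 + tileSize;

        // Four corners of the tile quad.
        vertices.Add(new Vector3(x0, y0, 0f));
        vertices.Add(new Vector3(x0, y1, 0f));
        vertices.Add(new Vector3(x1, y1, 0f));
        vertices.Add(new Vector3(x1, y0, 0f));

        // Full-texture UVs; an atlas lookup would go here instead.
        uvs.Add(new Vector2(0f, 0f));
        uvs.Add(new Vector2(0f, 1f));
        uvs.Add(new Vector2(1f, 1f));
        uvs.Add(new Vector2(1f, 0f));

        // Two clockwise triangles, front-facing for a camera looking along +z.
        triangles.Add(start); triangles.Add(start + 1); triangles.Add(start + 2);
        triangles.Add(start); triangles.Add(start + 2); triangles.Add(start + 3);
      }
    }

    // The default 16-bit index format limits one mesh to 65,535 vertices.
    var mesh = new Mesh();
    mesh.SetVertices(vertices);
    mesh.SetUVs(0, uvs);
    mesh.SetTriangles(triangles, 0);
    mesh.RecalculateBounds();
    return mesh;
  }

}
```

The returned mesh would typically be assigned to the MeshFilter of a single chunk GameObject, so the whole chunk is drawn through one MeshRenderer instead of one renderer per tile.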

Understanding Frustum Culling for Tilemaps

Frustum culling hides off-screen tiles to skip rendering and reduce draw calls. This section explains how frustum culling works in Unity and best practices for tilemaps.

How Camera Frustums Work

The camera view defines a truncated pyramid-shaped region called the frustum. Objects whose bounds lie entirely outside this volume are culled, meaning they are skipped during rendering. Testing tiles against the frustum avoids issuing draw calls for geometry that cannot be seen.

Unity performs this frustum culling check automatically for every renderer. With tilemaps we can go further by using level of detail (LOD) groups and per-layer cull distances.

Enabling Frustum Culling in Unity

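Unity handles frustum culling out of the box, but several related settings affect what gets rendered:

  • Camera Culling Mask – Selects which layers the camera renders at all
  • Occlusion Culling – Skips renderers hidden behind other objects; this is separate from frustum culling and requires baked occlusion data
  • LOD Groups – Automatically lower detail for distant objects
  • Layer Cull Distances – Camera.layerCullDistances culls layers beyond a set distance, and Camera.layerCullSpherical makes that test spherical

The key is balancing culling granularity against draw call overhead. Culling works per renderer, so very large chunks are rarely culled, while very small chunks bring back many draw calls.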
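Unity culls renderers on its own, but a chunk streaming system may also want to know whether a chunk would be visible before generating it. A minimal sketch using Unity's GeometryUtility follows; the ChunkVisibility class and its method name are hypothetical.

```csharp
using UnityEngine;

// Hypothetical helper for deciding which chunks are worth loading.
public static class ChunkVisibility {

  // Reused buffer so the check does not allocate every frame.
  static readonly Plane[] planes = new Plane[6];

  public static bool IsVisible(Camera camera, Bounds chunkBounds) {
    // Fills the six frustum planes: left, right, bottom, top, near, far.
    GeometryUtility.CalculateFrustumPlanes(camera, planes);
    // True when the box is inside or intersects the frustum.
    return GeometryUtility.TestPlanesAABB(planes, chunkBounds);
  }

}
```

When many chunks are tested in the same frame, it would be cheaper to compute the planes once and reuse them for every chunk.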

Implementing Tilemap Chunking

Chunking maps into structured tile batches optimizes culling, draw calls, and memory usage. This section discusses best practices for chunk size, loading/unloading, and example code.

Chunk Size Considerations

The optimal chunk size balances:

  • Batch efficiency versus draw overhead
  • Detail versus culling efficiency
  • Memory limitations
  • Loading/saving time

Good starting sizes range from 32×32 tiles up to 128×128. Profile to find the best balance for your project.
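One concrete constraint on the upper end of that range is the mesh index format. The arithmetic below is a sketch assuming one quad (four vertices, six indices) per tile with no vertex sharing between tiles; the ChunkBudget class is hypothetical.

```csharp
public static class ChunkBudget {

  // One quad per tile: four vertices and six indices.
  public static int VertexCount(int tilesPerSide) {
    return tilesPerSide * tilesPerSide * 4;
  }

  public static int IndexCount(int tilesPerSide) {
    return tilesPerSide * tilesPerSide * 6;
  }

  // Unity documents 16-bit index buffers as supporting up to 65,535 vertices per mesh.
  public static bool FitsIn16BitIndices(int tilesPerSide) {
    return VertexCount(tilesPerSide) <= 65535;
  }

}
```

Under these assumptions a 32×32 chunk needs 4,096 vertices and a 64×64 chunk needs 16,384, both well within the 16-bit limit. A 128×128 chunk needs 65,536, which is one vertex too many, so it would have to be split or built with mesh.indexFormat set to IndexFormat.UInt32.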

Dynamically Loading/Unloading Chunks

Only active chunks near the camera should stay loaded in memory. As chunks move out of view, reuse that memory for newly visible areas. Strategies include:

  • Encapsulate chunk data in persistent objects to facilitate activation/deactivation
  • Maintain a lookup table of active chunks indexed by spatial partition or ID
  • Spawn/despawn chunks as the camera view changes
  • Use Object Pools to reuse chunk gameobjects without destroy/create overhead

See example code below for implementation details.
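Before the full example, here is a sketch of how a lookup-table key can be derived from an integer tile coordinate. The ChunkCoords class is hypothetical. The point it illustrates is that plain integer division rounds toward zero, which would place tiles at negative coordinates in the wrong chunk.

```csharp
using System;

public static class ChunkCoords {

  // Floor division: unlike C#'s '/', this rounds toward negative infinity,
  // so tiles at negative coordinates map to the correct chunk.
  static int FloorDiv(int a, int b) {
    return (int)Math.Floor((double)a / b);
  }

  // Returns the chunk key plus the tile's position inside that chunk.
  public static (int chunkX, int chunkY, int localX, int localY) Locate(
      int tileX, int tileY, int chunkSize) {
    int cx = FloorDiv(tileX, chunkSize);
    int cy = FloorDiv(tileY, chunkSize);
    return (cx, cy, tileX - cx * chunkSize, tileY - cy * chunkSize);
  }

}
```

With a chunk size of 32, tile (-1, 33) lands in chunk (-1, 1) at local position (31, 1).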

Example Chunking Code

This script defines a TilemapChunk class for batches of tiles with spatial lookup:

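using System.Collections.Generic;
using UnityEngine;
using UnityEngine.Tilemaps;

public class TilemapChunk {

  public Bounds bounds;

  TileBase[,] tiles;

  GameObject meshObject;

  public void CreateMesh() {
    // Placeholder: combine the tiles into a single mesh here and attach it
    // through MeshFilter/MeshRenderer components on meshObject.
    meshObject = new GameObject("Chunk Mesh");
  }

  public void DestroyMesh() {
    // TilemapChunk is not a MonoBehaviour, so Destroy is called via UnityEngine.Object.
    Object.Destroy(meshObject);
  }

}

public class TilemapManager : MonoBehaviour {

  public Camera viewCamera;           // assumed orthographic, looking along +z
  public float chunkWorldSize = 32f;  // chunk edge length in world units

  readonly Dictionary<Vector2Int, TilemapChunk> chunks = new Dictionary<Vector2Int, TilemapChunk>();
  readonly List<Vector2Int> toUnload = new List<Vector2Int>();

  void Update() {
    // Get camera view: an orthographic camera sees a box of height
    // 2 * orthographicSize and width height * aspect. Tiles are assumed at z = 0.
    float height = 2f * viewCamera.orthographicSize;
    Vector3 center = viewCamera.transform.position;
    center.z = 0f;
    Bounds visibleRegion = new Bounds(center, new Vector3(height * viewCamera.aspect, height, 1f));

    // Unload invisible chunks. Keys are collected first because a Dictionary
    // cannot be modified while it is being enumerated.
    toUnload.Clear();
    foreach (KeyValuePair<Vector2Int, TilemapChunk> entry in chunks) {
      if (!visibleRegion.Intersects(entry.Value.bounds)) {
        entry.Value.DestroyMesh();
        toUnload.Add(entry.Key);
      }
    }
    foreach (Vector2Int key in toUnload) {
      chunks.Remove(key);
    }

    // Load newly visible chunks, skipping those that are already loaded
    Vector2Int min = ToChunkCoord(visibleRegion.min);
    Vector2Int max = ToChunkCoord(visibleRegion.max);
    for (int x = min.x; x <= max.x; x++) {
      for (int y = min.y; y <= max.y; y++) {
        Vector2Int coord = new Vector2Int(x, y);
        if (chunks.ContainsKey(coord)) continue;
        TilemapChunk newChunk = GenerateChunkAt(coord);
        newChunk.CreateMesh();
        chunks.Add(coord, newChunk);
      }
    }
  }

  Vector2Int ToChunkCoord(Vector3 worldPos) {
    return new Vector2Int(Mathf.FloorToInt(worldPos.x / chunkWorldSize),
                          Mathf.FloorToInt(worldPos.y / chunkWorldSize));
  }

  // Placeholder generator: a real project would fill in the tile data here.
  TilemapChunk GenerateChunkAt(Vector2Int coord) {
    Vector3 size = new Vector3(chunkWorldSize, chunkWorldSize, 1f);
    Vector3 center = new Vector3((coord.x + 0.5f) * chunkWorldSize, (coord.y + 0.5f) * chunkWorldSize, 0f);
    return new TilemapChunk { bounds = new Bounds(center, size) };
  }

}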

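This demonstrates the basic pattern: a lookup table of loaded chunks, with chunks unloaded and loaded as the view moves. For simplicity it destroys and recreates chunk objects; reusing them through an object pool avoids that overhead. Expand on these concepts to suit project needs.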
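Object pooling for chunk GameObjects can be sketched as follows. The ChunkObjectPool class and its Get and Release methods are hypothetical names; the idea is only that released objects are deactivated and kept for reuse rather than destroyed.

```csharp
using System.Collections.Generic;
using UnityEngine;

// Hypothetical pool of chunk GameObjects.
public class ChunkObjectPool {

  readonly Stack<GameObject> inactive = new Stack<GameObject>();

  // Returns a pooled object if one exists, otherwise creates a new one.
  public GameObject Get() {
    GameObject go = inactive.Count > 0
      ? inactive.Pop()
      : new GameObject("Chunk Mesh", typeof(MeshFilter), typeof(MeshRenderer));
    go.SetActive(true);
    return go;
  }

  // Hides the object and keeps it for reuse instead of destroying it.
  public void Release(GameObject go) {
    go.SetActive(false);
    inactive.Push(go);
  }

}
```

A chunk would then call Get when its mesh is created and Release when it is unloaded, assigning a fresh mesh to the reused MeshFilter each time.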

Optimizing Tilemap Visuals

In addition to culling and batches, visual aspects like textures, rendering layers and shaders also impact performance.

Using Texture Atlases

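Texture atlases combine multiple tile textures into larger sheets. Tiles that sample from the same atlas can share one material, which lets them be batched into bigger combined meshes and leaves fewer unique materials to manage.

Downsides include wasted memory when the atlas is only partly filled, texture bleeding between neighbouring cells unless padding is added, and platform limits on maximum texture size. Use the smallest atlas that still provides the needed resolution.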
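When tiles share an atlas, each quad needs the UV rectangle of its cell. A minimal sketch, assuming a uniform grid atlas with tile 0 in the bottom-left cell; the AtlasUv class, its CellRect method and the inset parameter are hypothetical.

```csharp
public static class AtlasUv {

  // Returns (uMin, vMin, uMax, vMax) for a tile in a uniform grid atlas.
  // Indices increase left to right, then upward, starting at the bottom-left cell.
  // 'inset' shrinks the rectangle slightly (in UV units) to limit bleeding
  // from neighbouring cells when the texture is filtered.
  public static (float uMin, float vMin, float uMax, float vMax) CellRect(
      int tileIndex, int columns, int rows, float inset = 0f) {
    int col = tileIndex % columns;
    int row = tileIndex / columns;
    float w = 1f / columns;
    float h = 1f / rows;
    return (col * w + inset, row * h + inset,
            (col + 1) * w - inset, (row + 1) * h - inset);
  }

}
```

In a 4×4 atlas, tile index 5 sits in column 1, row 1, so its rectangle runs from (0.25, 0.25) to (0.5, 0.5).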

Batching Mesh Renderers

When manually batching chunked tiles together:

  • Use fewer mesh renderer components versus individual tiles
  • Reduce unique materials through reuse and atlases

Every renderer contributes at least one draw call, so fewer renderers mean fewer calls, while shared materials let more geometry be batched together.

Reducing Overdraw

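Overdraw refers to the same pixel being shaded more than once because objects overlap on screen. This taxes fill rate. Strategies to reduce overdraw:

  • Render opaque tiles with depth writes enabled so that hidden pixels can be rejected by the depth test
  • Keep the number of stacked transparent layers low; they are blended back-to-front and every layer is shaded in full
  • Trim transparent borders from tile sprites so neighbouring tiles overlap as little as possible

The Frame Debugger shows the order in which draws are issued, and the Scene view's Overdraw draw mode, where the render pipeline provides it, highlights hotspots. Tune layer order and depth accordingly.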

Performance Testing Large Tilemaps

To establish an optimization baseline and measure improvements, conduct tilemap profiling across standard metrics:

Profile GPU Usage

In the Game view Statistics window or the Profiler, inspect key rendering stats:

  • Batches and SetPass Calls – Verify reduction from batching
  • Saved by Batching – Check batching efficacy
  • Tris and Verts – Geometry load on vertex processing
  • GPU Frame Time – Rises with fill rate pressure such as overdraw

Profile while navigating the tilemap in play mode to test different usage conditions.

Measure Frame Time Variance

Steady frame timing correlates closely with perceived performance. Use the frame time graph in the Profiler to find spikes, then check whether they coincide with chunk loading, mesh rebuilding, or large numbers of visible tiles. Target the spikes first.

Treat 30 FPS (about 33.3 ms per frame) as a minimum. A 60 FPS target leaves a budget of roughly 16.7 ms per frame.
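Frame time variance can also be tracked in code alongside the Profiler. The helper below is a sketch, not a Unity API: the FrameTimeStats class is hypothetical and simply keeps a rolling window of recent frame times in milliseconds.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper that keeps the most recent frame times in milliseconds.
public class FrameTimeStats {

  readonly Queue<float> samples = new Queue<float>();
  readonly int capacity;

  public FrameTimeStats(int capacity) {
    this.capacity = capacity;
  }

  // Frame budget in milliseconds for a target frame rate.
  public static float BudgetMs(float targetFps) {
    return 1000f / targetFps;
  }

  // Records one frame time, dropping the oldest sample once the window is full.
  public void Add(float frameMs) {
    samples.Enqueue(frameMs);
    if (samples.Count > capacity) samples.Dequeue();
  }

  public float Average() {
    if (samples.Count == 0) return 0f;
    float sum = 0f;
    foreach (float s in samples) sum += s;
    return sum / samples.Count;
  }

  public float Worst() {
    float worst = 0f;
    foreach (float s in samples) worst = Math.Max(worst, s);
    return worst;
  }

  // Number of recorded frames that exceeded the given budget.
  public int FramesOver(float budgetMs) {
    int count = 0;
    foreach (float s in samples) if (s > budgetMs) count++;
    return count;
  }

}
```

In Unity, one way to feed it would be calling Add(Time.unscaledDeltaTime * 1000f) from an Update method and logging Worst() and FramesOver(FrameTimeStats.BudgetMs(60f)) at intervals while moving across the map.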

Identify Bottlenecks

During tests, monitor memory usage, device utilization and scene statistics to pinpoint optimization opportunities:

  • Use GPU/CPU profiling tools built into Unity
  • Check batch counts, unique material stats
  • Inspect memory allocations per frame

Slowly narrow down specific trouble spots dragging down broader performance. Focus optimization efforts there.

Achieving 60 FPS Target for Large Maps

By combining the techniques explored so far – culling, batching, visual tuning – we can optimize even massive procedural tilemaps.

Review Tradeoffs

There are always tradeoffs around:

  • Visual quality versus performance
  • Memory usage versus draw call reduction
  • Accuracy versus culling efficiency

Evaluate these factors against project requirements to guide optimization priorities.

Apply Optimizations Incrementally

An iterative approach allows measuring incremental gains at each step:

  1. Establish baseline performance
  2. Implement chunking and assess impact
  3. Tune visual aspects like atlases and overdraw
  4. Profile culling and LOD configurations
  5. Check common bottlenecks like fill rate

Incremental refinement also helps identify the combination of factors that most improve frame rates for a given map.

Verify with Benchmarks

Formal benchmarking provides quantitative guidance. Record stats across standard test runs on target devices to verify optimization efficacy.

Use the same base tilemap for input consistency across runs. Also test with different map sizes and terrain combinations to better generalize optimizations.

With diligent testing and profiling, many procedural tilemaps can reach 60 FPS on their target hardware. Applying the techniques discussed, combined with general Unity performance best practices, makes it feasible to handle very large maps, since only the chunks near the camera are loaded and drawn at any moment.

The key is analyzing bottlenecks, profiling iteratively, and targeting optimizations at the specific spots dragging down overall performance. There is no single cure-all, but combining culling, batching and visual tuning can make even complex procedural tilemaps run smoothly.
