ECS in JS – A Closer Look at Component Storage
A pattern to avoid when designing an ECS for JavaScript
I'm in the middle of adding multithreading support to yet another JavaScript ECS library. This post serves as a way to help me think through a few decisions I've made along the way, as well as share my thoughts with the JavaScript ECS community while they are top-of-mind.
Universal Component Arrays
A simple way to implement storage in an ECS is to keep all components of a particular type in the same data structure. I'll call these universal component arrays (UCA), where every Health component is stored in a common Health[] array, every Squad component is stored in a common Squad[] array, and so on.
I'll make a couple of arguments as to why I feel this is not ideal. But let's look at why universal component arrays appear to be a good idea at first glance.
Pros of UCA
1) Performance: as long as entity ids are unique integers, each entity implicitly reserves its own index in a component array without conflicts. This makes accessing value for a specific component as simple as:
Position[entity]
By using an entity id as an index, reads and writes are O(1).
2) Parallelism: threaded JS programs are best implemented using TypedArrays (specifically ones that are backed by SharedArrayBuffer). TypedArrays are fixed-length, meaning once initialized, they can neither grow nor shrink. When you need to orchestrate events across many threads, the fewer blocks of shared memory you have to manage, the better.
3) Ergonomics: with a single array per-component type, component storage is easy to implement and understand. If you need a Squad component, you know exactly where to look. This makes library features like iterable queries and serialization trivial to implement.
Cons of UCA
Memory Usage
In most ECS libraries, entity ids can be big, like 32 bits big. Using large numbers to index component data in a plain Array is perfectly fine, because JavaScript arrays can be sparse. Take the following example:
let Squad = []
let entity = 0xffffff
Squad[entity] = 123
Squad // [empty × 16777215, 123]
A sparse array behaves similarly to an object: writing to an array at a currently out-of-bounds index doesn't allocate a bunch of unused memory. But as mentioned earlier, the best way to support multithreading is to use TypedArrays, and TypedArrays can't be sparse:
let Squad = new Uint8Array(1e6) // 8Mb
let entity = 0xffffff
Squad[entity] = 123
Squad[entity] // undefined
You could grow* TypedArrays linearly with the number of entities in your ECS:
function addComponent(entity, component) {
let array = getComponentArray(component)
if (entity >= array.length) {
array = growComponentArray(array, ...)
}
array[entity] = component
}
*copy to a new ArrayBuffer, which is costly because you'll need to copy (not transfer) a slice of game state into a new ArrayBuffer and discard the old one
But this isn't a great idea: you'll still end up allocating a lot of memory that will remain unused, because not every entity will have every type of component. And what heuristic do you use to decide how much to grow the array by? If you double the array, it may still not be big enough to hold a value for entity, and if you grow the array just enough so that a component for entity can be stored, you have to perform another expensive resize if entity+1 is set soon after.
A better approach would be to use a memory efficient data structure similar to a sparse array, that would provide same quick O(1) accesses, while at the same time storing component values in a tightly-packed array. This data structure is called a sparse set (or sparse map). Sparse sets can of course be implemented with TypedArrays, but you can still fall into the trap of using entity identifiers as indices.
I won't cover sparse sets in this post. They're useful, but using a single sparse set per-component type won't solve the problem of data locality and cache misses, which we'll look at next.
Data Locality
If you're familiar with high-performance ECS jargon, you've probably heard the term data locality (or cache locality, or locality of reference). I won't explain this concept in depth because I'm not well educated on the subject. Check this FAQ for a better explanation. But the gist is simple: CPUs assume that when a program accesses memory, there's a high likelihood that it will also utilize adjacent memory in the near future. This assumption allows a CPU to load neighboring memory into a cache, which is much faster to access than hitting RAM.
When two parts of memory (e.g. values in a TypedArray) are close together, they are said to have high spatial locality. Take the following loop, where each value in an array is doubled:
for (let i = 0; i < array.length; i++) {
array[i] *= 2
}
This loop block contains no conditionals, and elements are iterated tightly from left to right; therefore, the data access has a high degree of spatial locality, allowing the CPU to speed up reads and writes.
Another way to demonstrate this is random vs. linear iteration. Take the following example setup:
let values = new Uint8Array(Array(1e6).fill(0).map((_, i) => i))
let indices = Array.from(values).map((_, i) => i)
let indicesRandom = indices.slice().sort(() => Math.random() >= 0.5 ? 1 : -1)
A loop that accesses values at each index from indices, performs at roughly two-thousand operations per second on a high-end CPU:
// ~2k ops/s
for (let i = 0; i < values.length; i++) {
values[indices[i]]
}
Perhaps surprisingly, using indicesRandom gives identical results:
// ~2k ops/s
for (let i = 0; i < values.length; i++) {
values[indicesRandom[i]]
}
But when you add in some math, like a simple addition, the effects of data locality become clear:
// ~1.5k ops/s
let sum = 0
for (let i = 0; i < values.length; i++) {
let x = values[indices[i]]
sum += x + x
}
// ~750 ops/s (~50% slower)
let sum = 0
for (let i = 0; i < values.length; i++) {
let x = values[indicesRandom[i]]
sum += x + x
}
The first loop accessed data sequentially and performed about twice as fast. When the array is accessed randomly, the CPU is going to RAM more frequently because it can't predict the next access. When a CPU fails to access a value in the cache and must go to RAM to find it, a cache miss is said to have occurred. We'll see how cache misses have an effect on ECS details in the final section.
Queries
Let's take another look at component storage now that we're familiar with data locality.
One of the fundamental features of all ECS libraries is the ability to query and iterate entities based on their component signature. When we need to retrieve and/or modify only the entities that match certain criteria, we end up iterating a smaller, fragmented subset of indices in the array.
Using entity ids as indices essentially leads to random accesses, similar to values within indicesRandom example above. For example, if a loop iterates a narrowed iterator of entities:
let burning = entitiesWithComponents(Health, Burn)
for (let i = 0; i < burning.length; i++) {
let entity = burning[i]
Health[entity] -= 1
}
This loop will likely access the universal Health component array in an unpredictable manner, leading to fewer accesses with high spatial locality, resulting in more cache misses and reduced performance.
Conclusion
Let's review why using universal component arrays indexed by entity id (i.e. Position[entity]) might not be a good idea:
- If entity ids are big, and they probably are, you'll need to allocate a lot of unused memory because TypedArrays aren't sparse
- When component arrays inevitably need to grow, it may be difficult to decide what amount to grow them by
- Entity ids are essentially random indices and can't be used to iterate arrays sequentially, which results in more cache misses, which may reduce performance
The somewhat obvious conclusion is: use multiple, dense component arrays. If you have an entity with Squad and Health components, put it in a table that contains an array for each component type:
{
length: number
columns: [
Entity[],
Squad[],
Health[],
]
}
These tables are sometimes called archetypes in ECS-speak. However, I've seen the term archetype used loosely in JavaScript ECS libraries. The benefit of archetypes is not that you have an iterable collection of entities that share the same component signature, but that an iterable collection of entities and their components, so that component data can be iterated serially and predictably to squeeze the most out of the CPU.