This is the start of a series of articles on Entity Component System (ECS) architecture. The posts will be a mix of theory and practice. The goal is to build a functional and efficient library by the end.
What is an ECS
An ECS, or Entity Component System, is a programming pattern that separates code, data, and relationships. The three pillars of an ECS are, unsurprisingly:
Entity provides an identity and the relationship between Components. An overly simple yet useful view of an Entity is to think of it as simply a name for a set of Components.
Component is a piece of data representing some aspect of an Entity. An example of a Component might be a position component which stores 3 floats representing the x,y,z position of the Entity in the world.
System is a procedure that does work on a set of Components. You can think of a System as a replacement for an Object Oriented Entity’s Think function, although Systems are generally broken down into much more granular pieces of logic than you would see in a typical OO Entity.
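To make the three pillars concrete, here is a minimal C++ sketch. It is not the design this series will end up with: the names `World`, `Velocity`, and `movement_system` are hypothetical, and it assumes every entity owns every component so that an Entity can simply be an index into the component arrays.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Entity: nothing but an identity. In this sketch it doubles as an index
// into the component arrays (an assumption, not a requirement of the pattern).
using Entity = std::uint32_t;

// Components: plain data describing one aspect of an Entity, no behavior.
struct Position { float x, y, z; };
struct Velocity { float x, y, z; };

// Hypothetical container that stores each component type in its own array.
struct World {
    std::vector<Position> positions;
    std::vector<Velocity> velocities;

    // Creates an entity that owns one Position and one Velocity.
    Entity create(Position p, Velocity v) {
        assert(positions.size() == velocities.size());
        positions.push_back(p);
        velocities.push_back(v);
        return static_cast<Entity>(positions.size() - 1);
    }
};

// System: a procedure that does work on a set of Components.
// This one advances every Position by its Velocity over a time step dt.
void movement_system(World& world, float dt) {
    for (std::size_t i = 0; i < world.positions.size(); ++i) {
        world.positions[i].x += world.velocities[i].x * dt;
        world.positions[i].y += world.velocities[i].y * dt;
        world.positions[i].z += world.velocities[i].z * dt;
    }
}
```

Calling `movement_system(world, dt)` once per frame moves every entity. Note that the Entity itself carries neither data nor logic; it only ties a Position and a Velocity together.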
There exist several public implementations of the ECS pattern, all of which vary widely in their interpretation of these fundamental constructs. Nevertheless, they are immediately recognizable as Entity Component Systems. This nebulous nature can be frustrating; there is no set blueprint to follow to implement an efficient ECS. But it also affords us great freedom to design a system that suits our goals.
Let’s aim for one million entities with a game running at 90 Hz.
If you’re going to set a goal, aim high. Running at 90 Hz is important for VR and seems a reasonable high-end goal.
One million entities, on the other hand, is ambitious. Most games I’ve worked on have a few hundred to single-digit thousand entities at a time. However, those games have mostly used a traditional Object Oriented Entity hierarchy rife with virtual functions and poor locality. Maybe this goal is absurd and unreachable, but pushing things to extremes makes it easier to explore some problems, and it’ll be fun to find out.
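It helps to turn the goal into a budget. The sketch below is back-of-the-envelope arithmetic only: it assumes every entity is touched exactly once per frame on a single core, and the clock rate passed to `per_entity_cycles` is an illustrative figure, not a measurement. All function names are hypothetical.

```cpp
// Time available for one frame, in nanoseconds, at a given refresh rate.
// At 90 Hz this is about 11.1 million ns, i.e. about 11.1 ms.
constexpr double frame_budget_ns(double hz) {
    return 1.0e9 / hz;
}

// Average time available per entity if each one is visited once per frame.
// At 90 Hz with one million entities this is about 11.1 ns.
constexpr double per_entity_budget_ns(double hz, double entity_count) {
    return frame_budget_ns(hz) / entity_count;
}

// The same budget expressed in clock cycles for an assumed clock rate.
// With an illustrative 3 GHz core, about 33 cycles per entity.
constexpr double per_entity_cycles(double hz, double entity_count, double clock_ghz) {
    return per_entity_budget_ns(hz, entity_count) * clock_ghz;
}
```

Roughly eleven nanoseconds per entity, for all the work the game does on it, is why the goal leaves so little room for wasted time.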
Let’s meet our primary weapon in the battle to reach these lofty goals.
Data-Oriented Design (DOD)
Data-oriented design is a coding paradigm that focuses on optimal layout and minimal transformation of data in memory to efficiently solve problems.
A data-oriented design is not a requirement for an ECS. Unity is a popular engine that uses a component-based system without much DOD.
UPDATE: Since I first wrote this back in early 2017, Unity has introduced an ECS which is very much a data-oriented system.
The goal for DOD with respect to an ECS is to maximize the throughput of our Systems. Instruction processing ability has massively outpaced memory access times. On contemporary architectures, main memory access is almost always enemy number one for performance. In the following chart, look at how many clock cycles a processor has to wait before it can get the result of a main memory access.
Note the huge discrepancy in the time it takes to access L1/L2 cache and main memory. In order to avoid expensive main memory access we want to keep our data in cache.
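One way to see what keeping data in cache means for layout is to count cache lines. The sketch below compares a hypothetical fat object-style entity with a tightly packed position component. The 64-byte line size, the 116 bytes of unrelated state, and the names `FatEntity` and `lines_touched` are all assumptions for illustration, and the estimate ignores prefetching and alignment effects.

```cpp
#include <cstddef>

// Assumed cache line size in bytes; 64 is common but not universal.
constexpr std::size_t kAssumedCacheLineBytes = 64;

// Object-style layout: the position a system wants to read is interleaved
// with state it never touches (hypothetical padding standing in for a name,
// health, AI state, and so on).
struct FatEntity {
    float x, y, z;
    char  other_state[116];
};

// Data-oriented layout: positions stored back to back with nothing between.
struct Position { float x, y, z; };

static_assert(sizeof(FatEntity) == 128, "sketch assumes a 128-byte entity");
static_assert(sizeof(Position) == 12, "sketch assumes a 12-byte position");

// Rough number of cache lines pulled in when reading a small field from
// `count` elements stored `stride` bytes apart in one contiguous array.
// If the stride is at least a full line, every element costs its own line;
// otherwise neighbouring elements share lines.
constexpr std::size_t lines_touched(std::size_t count, std::size_t stride) {
    return stride >= kAssumedCacheLineBytes
        ? count
        : (count * stride + kAssumedCacheLineBytes - 1) / kAssumedCacheLineBytes;
}
```

Under these assumptions, reading the positions of one million `FatEntity` objects pulls in about 1,000,000 lines, while one million packed `Position` values fit in 187,500 lines, so the same loop fetches roughly five times less memory.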
A great deal of effort has been spent trying to hide this memory latency. Occasionally we’ll come across a topic, such as this, which can’t be thoroughly explored here without derailing the whole series. In these cases I will try to drop references that expand upon the material presented here.
This talk was an instant classic. Herb Sutter explains the basis for data-oriented design by digging into how hardware has changed over time.
Ulrich Drepper’s seminal paper on memory. This paper demystifies all aspects of the memory hierarchy. It does contain minor bits of out-of-date information, but overall it’s still fantastic and very relevant today.
In the next post we’ll explore cache behavior through simple experiments.