One of the more interesting things I worked on was the 3D grass in FIFA Switch. It was basically an experiment in seeing how much work I could run on the GPU and how far I could push instancing. The grass system instances a single 4-triangle blade of grass and precomputes very little.
The general idea is to tile a number of points over the ground near the camera. The tile gets precomputed and it's just a list of random points within an area, pretty simple. We also precompute a quadtree that covers the area of the ground we want to have grass. In FIFA we can make some pretty good assumptions given the ground is a plane and where we want grass is a very specific part of the ground. The quadtree leaf nodes are just a point on the ground and the dimensions are equal to the size of our precomputed tile.
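To make the precomputed data concrete, here is a minimal Python sketch of the two pieces described above. It is not the shipped code: the names `make_tile` and `make_leaf_origins` are hypothetical, and a flat grid of tile-sized cells stands in for the leaf nodes of the quadtree rather than building the tree itself.

```python
import math
import random

def make_tile(num_blades, tile_size, seed=0):
    """Return num_blades random (x, z) offsets inside one square tile.

    A fixed seed keeps the tile identical from run to run, since the
    same tile is reused at every tile location.
    """
    rng = random.Random(seed)
    return [(rng.uniform(0.0, tile_size), rng.uniform(0.0, tile_size))
            for _ in range(num_blades)]

def make_leaf_origins(min_x, min_z, max_x, max_z, tile_size):
    """Return the corner of every tile-sized cell covering the grass area.

    Assumption: the ground is a flat plane and the grass area is an
    axis-aligned rectangle, so the leaves form a regular grid.
    """
    nx = math.ceil((max_x - min_x) / tile_size)
    nz = math.ceil((max_z - min_z) / tile_size)
    return [(min_x + ix * tile_size, min_z + iz * tile_size)
            for iz in range(nz) for ix in range(nx)]
```

A real quadtree adds the parent nodes on top of these leaves so that whole regions can be rejected in one test.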
The next step is gathering the tile locations, and this is pretty easy! We adjust the camera frustum so the far plane is equal to the far distance of the grass and then just gather up the visible tiles. This is done on the CPU and is very fast, but it could be done on the GPU as well if we wanted to go that route. The visible tile locations are then sorted by distance from the camera and only the nearest N are kept, since we cap the maximum number just to ensure performance. In practice, the number of visible tiles should always be less than N or something is wrong with our tuning.
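The gather step could look roughly like the Python sketch below. The details are assumptions for illustration: the frustum is given as a list of inward-facing planes `(nx, ny, nz, d)`, each tile is tested through a bounding sphere around its center, and the leaves are walked as a flat list instead of descending a quadtree. `gather_tiles` and `sphere_in_frustum` are hypothetical names.

```python
import math

def sphere_in_frustum(center, radius, planes):
    """True if the sphere is inside or touching every plane.

    Each plane is (nx, ny, nz, d) with a unit normal pointing into the
    frustum, so a point p is inside when dot(n, p) + d >= 0.
    """
    return all(nx * center[0] + ny * center[1] + nz * center[2] + d >= -radius
               for nx, ny, nz, d in planes)

def gather_tiles(tile_origins, tile_size, camera_pos, planes, max_tiles):
    """Return the origins of the nearest visible tiles, capped at max_tiles.

    The far plane in `planes` is assumed to already sit at the far
    distance of the grass, not at the camera's normal far plane.
    """
    half = tile_size * 0.5
    radius = half * math.sqrt(2.0)  # bounding circle of a flat square tile
    visible = []
    for x, z in tile_origins:
        center = (x + half, 0.0, z + half)
        if sphere_in_frustum(center, radius, planes):
            visible.append((math.dist(center, camera_pos), (x, z)))
    visible.sort()  # nearest first
    return [origin for _, origin in visible[:max_tiles]]
```

Sorting before truncating means that if the cap is ever hit, the tiles that go missing are the farthest ones, where the loss is least noticeable.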
So now we have a list of tile locations and a tile with a list of positions for individual grass blades. Now comes the fun stuff where we pass everything off to a compute shader. At its simplest level, this shader builds a list of all the grass blade positions based on the tile positions passed in. We also do per-blade visibility testing here (a simple sphere test against the current frustum), which saves quite a lot of unnecessary geometry. The same pass fetches the blade rotation, color and other per-blade parameters, so we're not sampling those textures per-vertex. This is all written to a structured buffer. Then the blade mesh is instanced from that buffer via an indirect draw (the instanced draw parameters come from a buffer constructed on the GPU).
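The logic of that compute pass can be sketched on the CPU in Python. This is an illustration under stated assumptions, not the actual shader: on the GPU each blade would be handled by its own thread and appended through an atomic counter, the per-blade rotation and color lookups are left out, and the five-field draw arguments follow the common indexed indirect layout (index count, instance count, first index, vertex offset, first instance) used by APIs such as Vulkan. The post does not say which layout the Switch API uses, and 12 indices per blade assumes the 4 triangles are drawn as a triangle list.

```python
def sphere_in_frustum(center, radius, planes):
    """Planes are (nx, ny, nz, d) with inward-facing unit normals."""
    return all(nx * center[0] + ny * center[1] + nz * center[2] + d >= -radius
               for nx, ny, nz, d in planes)

def build_instances(tile_origins, tile_offsets, planes, blade_radius,
                    indices_per_blade=12):
    """Expand tiles into visible blade positions plus indirect draw args.

    The nested loops stand in for the compute dispatch: one iteration
    per (tile, blade) pair, which would be one GPU thread each.
    """
    instances = []
    for tx, tz in tile_origins:
        for ox, oz in tile_offsets:
            pos = (tx + ox, 0.0, tz + oz)
            # Bounding sphere resting on the ground around the blade.
            center = (pos[0], blade_radius, pos[2])
            if sphere_in_frustum(center, blade_radius, planes):
                instances.append(pos)  # atomic append on the GPU
    # Assumed layout: (index_count, instance_count, first_index,
    # vertex_offset, first_instance).
    draw_args = (indices_per_blade, len(instances), 0, 0, 0)
    return instances, draw_args
```

Because the instance count is written by the same pass that culls the blades, the CPU never needs to read back how many blades survived.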
In the vertex shader we read those parameters and apply any deformation to the blade like the rotation, curvature, if it's squished from a footprint or not, vertex color and whatnot. Grass was scaled per blade as they approached the far draw distance so the transition to no grass was virtually seamless. The fragment shader does some very simple lighting and applies the lightmap from the ground so everything blends into the scene nicely along with dynamic shadows.
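As an illustration of the distance-based scaling, a fade like the following would do the job. The linear ramp and the `fade_start` parameter are assumptions; the actual curve used in the game isn't stated here.

```python
def blade_scale(distance, fade_start, far_distance):
    """Scale factor for a blade: 1.0 up close, shrinking to 0.0 at the far distance.

    Assumes a linear ramp between fade_start and far_distance.
    """
    if distance <= fade_start:
        return 1.0
    if distance >= far_distance:
        return 0.0
    return (far_distance - distance) / (far_distance - fade_start)
```

Since the blade has already shrunk to nothing by the time it reaches the far distance, culling it there causes no visible pop.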
Controlling the grass on a per-blade level gave us a huge amount of flexibility with it and it allowed us to reduce density in the distance by just dropping instances based on how far away they are and other trickery like that. Or, in handheld mode, we'd just reduce density overall by a fixed percentage so we could really easily account for the reduced GPU clock rate. Things like that aren't easily done when tiling an entire patch of grass.
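One way to drop instances like this, sketched in Python: give every blade a fixed pseudo-random threshold and keep it only while the threshold is below the desired density. The hash constant, the linear falloff and the `density_scale` parameter are all assumptions for illustration, not details taken from the game.

```python
def keep_blade(blade_index, distance, far_distance, density_scale=1.0):
    """Decide whether a blade survives density thinning.

    The threshold depends only on the blade index, so a given blade
    makes the same decision every frame and the grass doesn't flicker.
    """
    # Multiplicative hash of the index, mapped into [0, 1).
    threshold = ((blade_index * 2654435761) & 0xFFFFFFFF) / 2**32
    # Assumed falloff: full density at the camera, none at far_distance.
    keep_fraction = density_scale * max(0.0, 1.0 - distance / far_distance)
    return threshold < keep_fraction
```

A handheld-style reduction is then just a smaller `density_scale`, for example `keep_blade(i, d, far, density_scale=0.7)`, and the blades kept at a lower density are always a subset of those kept at a higher one.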
Ultimately, this meant that the grass you see in game is a single mesh being drawn probably over 10,000 times (I don't remember the exact numbers, it was a long time ago) on the Switch and the whole thing took up around 5ms of GPU time in both docked and handheld modes. Given this was only active in 30fps sequences, where 5ms is roughly 15% of the 33.3ms frame, the cost wasn't too bad and we were able to get full 3D grass on the Switch.
This whole system is actually very similar to how the rain/snow system worked, which I also wrote an eternity ago (in 2014). That worked with an octree that covered the entire stadium and the compute shader moved the positions within a repeating box. Otherwise it was the same general idea, though.
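The repeating-box idea can be sketched as a wrap of each particle position into a box centered on some point, such as the camera. This is a guess at the general technique rather than the actual shader, and `wrap_into_box` is a hypothetical name.

```python
def wrap_into_box(pos, box_center, box_size):
    """Wrap a 3D position into a cube of side box_size around box_center.

    A particle that leaves through one face re-enters through the
    opposite one, so a fixed set of particles fills the box forever.
    """
    half = box_size * 0.5
    # Python's % with a positive divisor never returns a negative value,
    # so the result always lands in [center - half, center + half).
    return tuple(c - half + ((p - (c - half)) % box_size)
                 for p, c in zip(pos, box_center))
```

For example, a snowflake that has fallen to `(0.0, -11.0, 0.0)` in a box of size 20 around the origin wraps back to the top at `(0.0, 9.0, 0.0)`.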
I always wanted to get this system running on the other consoles to see how it compared performance-wise but unfortunately never had the time.
Hope you enjoyed gaining a little insight into how grass on FIFA Switch worked!