This Week In Veloren 81
This week 0.7 was released! However, not everything went well with the release party, and @xMAC94x wrote a section on that. @Sharp's colossal branch is getting merged at long last. The description of all the changes is below.
- AngelOnFira, TWiV Editor
Contributor Work
Thanks to this week's contributors, @xMAC94x, @Rdbaker, @Pfau, @Songtronix, @Slipped, @yusdacra, @imbris, @zesterer, @WelshPixie, @scottc!
We had our largest release party yet, with a peak of 57 players concurrently! Although Veloren has been in feature freeze and locked down quite a lot, there was still a lot of work done! @LunarEclipse updated the AUR packages to 0.7. @Sam is working on a new boss, and focusing on a shockwave attack that it will have. @imbris is helping with collision detection on the shockwave, and @lobster is working on particles for it.
@Ellinia got a good amount of SFX work done. @DoctaRay added borderless and exclusive fullscreen options. @lobster got the particle MR merged right before the 0.7 release. This included fireworks items, which used the bomb item as a template.
0.7 Release Party Statistics
With the 0.7 release this past weekend, we held our largest release party yet! Unfortunately, not everything went smoothly. Be sure to check out the section below by @xMAC94x to read more about this. Here are the stats from the rest of the party though!
0.7 Release Party Kick Disaster by @xMAC94x
Last Saturday we had our 0.7 release party. A release party is usually a 2-3 hour-long event where we celebrate a new release, play around, and have fun. Preparation for such a party usually starts a week before and the last 48 hours we were all focused on preparing the release build, testing, setting up a server to handle ~40 players. The server started at 18:00 GMT+0 within 5 minutes we had 53 players coming online! One minute later I had to kill the server as it froze. (see A) Suddenly the whole event changed. From what was going to be a fun party changed to focus on stability for the next 3 hours.
The problem (part 1)
It took us about 15 minutes of investigation to figure out what was happening. We noticed players suddenly dropping, all at once. But the server didn't restart. This was bothersome, a crash would give us a clear error message, but we had nothing. I had 2 thoughts: networking or new bug introduced in the feature freeze period. Networking hasn't been changed for a month, so I was pretty sure it's unrelated. We saw that try to connect, but their requests weren't picked up by the server. We also noticed that our metrics are still available but didn't update.
Countering the "crash"
In order to find a quick solution, we did multiple things simultaneously. As a team, we discussed and tried to analyse the process as well as attach a debugger to the process. We also started to build a local version of Veloren to apply a potential fix. At this point we were 30 mins in the release party already (see B). 10 minutes later we had a compiled local build and to pure luck, the server was stable for 30 mins (see C). During this time we achieved the most players simultaneously on the server at 57. We attached the gdb debugger to collect backtraces once the bug occurred again (see D). We were still pretty unsure what caused it, as we haven't seen such a bug outside this party.
End of the party without a fix
The party ended after a total of 3 hours (see E) with the remaining 15 players transferring to the official Veloren server hosted by me. Sadly we didn't have a fix for the "crash" yet.
Analysing the Backtraces
Analysis was pretty hard as we didn't have debuginfos enabled, so I collected new backtraces on the official server. I found that 2 threads doing something with the same std::sync::Mutex
which made me think it might be actually related to a Deadlock. That would fit the pattern we saw (main thread didn't update metrics, but we didn't crash). With debuginfos I found the problem:
Thread 1 (Thread 0x7fa0a9b37c40 (LWP 19872)):
...
#5 std::sync::condvar::Condvar::wait () at src/libstd/sync/condvar.rs:200
...
#10 0x000055a1ffe16733 in futures_executor::local_pool::run_executor (f=...) at /home/b-server/.cargo/registry/src/github.com-1ecc6299db9ec823/futures-executor-0.3.5/src/local_pool.rs:83
#11 futures_executor::local_pool::block_on (f=...) at /home/b-server/.cargo/registry/src/github.com-1ecc6299db9ec823/futures-executor-0.3.5/src/local_pool.rs:317
#12 veloren_server::events::player::handle_client_disconnect (server=0x7ffd0ddc6490, entity=...) at server/src/events/player.rs:84
#13 0x000055a1ffdcd7df in veloren_server::events::<impl veloren_server::Server>::handle_events (self=<optimized out>) at server/src/events/mod.rs:105
#14 0x000055a1ffdd8354 in veloren_server::Server::tick (self=0x7ffd0ddc6490, _input=..., dt=...) at server/src/lib.rs:400
#15 0x000055a1ffcb4e9e in veloren_server_cli::main () at server-cli/src/main.rs:56
The problem (part 2)
That backtrace told us a lot about what was going on:
Thread 1
is the main thread, the one that was doing the critical phase of the server. See line #14.wait()
in line #5 tells us that it's waiting for an event to happen.handle_client_disconnect()
in line #12 gives us the exact thing what we are trying to do: it's in lineserver/src/events/player.rs:84
So lets look at the source code:
84: if let Err(e) = block_on(participant.disconnect()) {
85: debug!(
86: ?e,
87: "Error when disconnecting client, maybe the pipe already broke"
88: );
89: };
The error seems to be blocking on the client.disconnect()
function. This function is there to cleanly shutdown a client after they disconnect or are kicked. It should try to send all relevant info to them and once this is done close the connection. For some reason, it took way to long (I haven't analyzed why yet) and blocks the main thread sporadically.
The solution
Resolving the error was quite easy once we isolated it. We just move the teardown to another thread. In a background thread this can take as long as it wants without interrupting any player again. Further investigation will show why it takes so long:
Aug 19 08:15:43.126 WARN veloren_server::events::player: disconecting took quite long elapsed=2.534882113s
Aug 19 08:16:20.984 WARN veloren_server::events::player: disconecting took quite long elapsed=3.543180779s
Aug 19 08:28:19.724 DEBUG veloren_server::events::player: disconecting took elapsed=23.553655ms
Aug 19 08:29:38.784 DEBUG veloren_server::events::player: disconecting took elapsed=21.045099ms
Aug 19 08:35:02.043 DEBUG veloren_server::events::player: disconecting took elapsed=25.217615ms
Aug 19 08:37:39.406 DEBUG veloren_server::events::player: disconecting took elapsed=19.948233ms
Aug 19 09:19:16.508 WARN veloren_server::events::player: disconecting took quite long elapsed=23.326647409s
Aug 19 11:29:26.872 WARN veloren_server::events::player: disconecting took quite long elapsed=18.663718063s
Aug 19 12:00:04.742 WARN veloren_server::events::player: disconecting took quite long elapsed=972.050707757s
As you can see often the disconnect is done within 25ms, but sometimes it requires multiple seconds, or even up to 15 minutes.
@Sharp's Lighting and World Changes Branch
The following is the description of @Sharp's massive branch as it is being merged into master. It covers LOD, shadow maps, greedy meshing, new lighting, world size refactoring, and other performance fixes.
Pretty much a Veloren fork at this point. Here's a high-level overview of the changes. At a high level, this MR incorporates roughly two groups of changes.
The first group consists of new game features: more flexible map sizes, level of detail terrain, shadow maps, and a new lighting engine. This is "feature work" that (mostly) only adds new things to Veloren, and mostly shouldn't affect old stuff.
The second big group of changes are those addressing the fallout from all the new features. These include performance fixes of various sorts: the addition of multiple graphics options and optimization of the cheap ones to avoid work, switching all voxel models to use some variant of greedy meshing, switching over much of our CPU-side vector math to exploit SIMD instructions (coinciding with a fork of vek
), and a rewrite of how the UI handles text rendering (coinciding with updates to our fork of conrod
). Making Veloren's hardcoded colors appear correct under the new lighting engine also required considerable changes.
The second category of changes often heavily touches code owned by other people, including frequently modified code "owned" by a handful of people, so I recommend that this code be reviewed particularly carefully.
Table of Contents
At a high level (each will be described in more detail below):
- The world map has been refactored
- The world size is no longer hardcoded
- The map generation code was made generic to allow using it outside of the
world
crate - On world creation, we now compute horizon maps
- The way we pass the world from the server to the client has been updated
- Artifacts related to image rotation were fixed
- Multiflow rivers were enabled
- In the process of making changes related to the world map, various incidental fixes and optimizations were required
- The new level of detail feature was added
- A new LOD terrain rendering step was added to the pipeline
- The LOD terrain quality was made configurable via a graphics setting
- Horizon maps were used to cast shadows from LOD chunks on both LOD and non-LOD terrain
- A "voxelization" effect was incorporated into rendered LOD terrain to make it blend better into the world
- In the process of making changes related to LOD, various incidental fixes and optimizations were required
- Veloren's lighting has been completely overhauled
- A semi-accurate index of refraction was assigned to our materials
- A new, more realistic, physically-based approach to lighting was used using solar panel light exposure
- We attempt to compute realistic light attenuation in water using its real material properties
- In the process of making changes related to LOD, various incidental fixes and optimizations were required
- Point and directional lights now cast realistic shadows, using shadow mapping
- Point light shadow maps were added to the rendering pipeline, using geometry shaders and seamless cube maps.
- Directional light shadows were added to the rendering pipeline, using LISPSM together with disabling depth clamping.
- "Shadow-only" chunks and NPCs were added to prevent shadows from models behind you from disappearing.
- In the process of making changes related to shadow maps, various incidental fixes and optimizations were required.
The addition of shadow maps, LOD terrain, and the new lighting all led to significant performance degradation, on top of other changes happening in master. Therefore, a large number of performance improvements were also needed:
- The graphics options were made much more flexible and configurable, and shaders were optimized
- New options were provided for how to render lights and shadows
- Graphic setting storage and configuration were overhauled to make adding new features easier
- Shaders were rewritten to utilize GLSL's preprocessor to avoid overhead
- In the process of making changes related to providing additional rendering options, various incidental fixes and optimizations were required.
- Voxel model creation was switched to use greedy meshing
- A new voxel meshing method, greedy meshing, was added
- Uses of the older meshing methods were migrated to use greedy meshing
- New restrictions were added to terrain, figure, and sprites to future proof them for further optimizations
- Most positions are now relative to either chunk or player position for better precision
- In the process of making changes related to greedy meshing, various incidental fixes and optimizations were required
- Animation and terrain math was switched to use SIMD where possible
- Fixes were made to vek to make its SIMD feature usable for us
- The interface and types used in bone animation were changed in various ways
- Redundant code generation for body animation is now partly taken care of by a macro
- Animation code was modified to use vek's SIMD representation where possible
- Terrain meshing code and shadow map math were also modified to use vek's SIMD representation
- SIMD instruction generation was enabled
- In the process of making changes related to greedy meshing, various incidental fixes and optimizations were required
- The way we cache glyphs was completely refactored, fixed, and optimized
- Our fork of
conrod
was optimized in various ways - Our fork of
conrod
now exposes whether a widget was updated during the current frame - Our use of the glyph cache was rewritten for correctness
- A text cache was introduced that lets us skip remeshing glyphs that have not changed
- Various changes were made to reduce pressure on the glyph cache, with more planned
- In the process of making changes related to the glyph cache, various incidental fixes and optimizations were required
- Our fork of
- Colors were changed to keep Veloren's look consistent with master
- Some older tree models were brought back
- All hardcoded colors were extracted and made hot-loadable.
- Hardcoded colors were fixed to conform to Veloren's style.
- Color models were fixed to conform to Veloren's style.
A detailed description of the involved changes follows.
World map information was refactored
The world size is no longer hardcoded. All functions dependent on world size now take a
WorldSizeLg
, which holds the base 2 logarithm of each actual world dimension and is guaranteed to maintain certain properties (outlined incommon/src/terrain/map.rs
). Additionally, many utility functions that utilize the world size were moved intocommon
as well (mostlycommon/src/terrain/mod.rs
). Finally, the world map format was updated in order to store its size explicitly, with a migration path from the old format that should work whenever the old formatted map was a square (practically always). Seeworld/src/sim/mod.rs
for these changes.The map generation code was made generic to allow using it outside of the
world
crate. The parts of the map generating code that do not need to query the world were moved over tocommon/src/terrain/map.rs
, allowing them to be used from the client without creating a dependency onworld
. The rest of it was turned into helper functions inworld/src/sim/map.rs
, which can be passed as closures to the generic map generation code to complete its construction. This also means that colors are now passed in separately to the map generation function. See devblog 78 for more details.On world creation, we now compute horizon maps. See the function in
world/src/sim/util.rs
.Given a height map and a plane intersecting that height map, our horizon maps allow us to encode enough information to reconstruct shadows for each point on the height map using only the horizon angle (the angle at which the sun starts to become visible). As Veloren's sun only covers one plane, this is sufficient for encoding sun shadows for LOD terrain, by encoding two angles per chunk (one for each 90 degrees the sun covers). We can also use this for the moon, if we want, since the moon follows the same path. Additionally, we store the height of the furthest occluder, to try to make the shadows volumetric; so this means 4 bytes in total for each chunk.
Support for horizon maps has been merged into the map functionality in common as well.
- The way we pass the world from the server to the client has been updated. Rather than passing the prerendered map, we instead pass three maps with values for each chunk; one with the color information, a second with altitude information, and a third with horizon map information. We then reconstruct the map on the client, together with some additional information we send from the server (like the sea level and maximum height). See
common/src/msg/server.rs
for a detailed description of the format ofWorldMapMsg
, andserver/src/libr.rs
andclient/src/lib.rs
for details of the map construction and parsing. - Artifacts related to image rotation were fixed. See the commit message for commit SHA
cf74d55f2e3d2ae7d25fd68d5c73b01a6afde86e
for a detailed explanation. This involved changes to shaders, the addition of a new type of graphic (also reflected in the graphic cache) that allows specifying a border color (which automatically makes the associated texture immutable), and some related fixes. I reproduce the first two paragraphs of the MR description as well:
Fix map image artifacts and remove unneeded allocations.
Specifically, we address three concerns (the image stretching during
rotation, artifacts around the image due to clamping to the nearest
border color when the image is drawn to a larger space than the image
itself takes up, and potential artifacts around a rotated image which
accidentally ended up in an atlas and didn't have enough extra space to
guarantee the rotation would work).
- Multiflow rivers were enabled. This does not really need to be part of this MR, and would be easy to revert, but since it seemed to provide a nice improvement it's currently packaged with it. We already computed multiple outflows from each chunk for erosion purposes long before this MR. However, we never modified river rendering to be able to handle this case (just a single downhill river flow is complex enough!) so this was not exposed when deciding which chunks were rivers.
- In the process of making changes related to the world map, various incidental fixes and optimizations were required. Some examples of fixes include making sure terrain is never lowered to below sea level (to make the shadow maps report correct values), fixing map altitudes and colors to understand things like cliffs and "block level" coloring (that doesn't exist on the column level), and fixing a crash bug when rendering images for the UI where source pixels are strongly rectangular. Some examples of related performance fixes include avoiding allocating a fresh vector for all the maps (i.e. copying it over to change the format from
[u32; n]
toDynamicImage
and then copying again to convert toRgbaImage
), and instead using thegfx::memory::slice
function to accomplish the same thing. These sorts of changes are spread all around the code.
A new LOD terrain rendering step was added to the pipeline
This includes the addition of a new scene, voxygen/src/scene/lod.rs
, a new pipeline voxygen/src/render/pipeline/lod_terrain.rs
, and new shaders assets/voxygen/shaders/lod-terrain-vert.glsl
and assets/voxygen/shaders/lod-terrain-frag.glsl
, as well as associated changes to the renderer in voxygen/src/render/renderer.rs
.
The main idea behind our initial approach to LOD was to take the world data we now get from the server (altitude, color, and horizon mapping).
In the process of making changes related to LOD, various incidental fixes and optimizations were required
- Some previously computed values were turned into shader uniforms for better prediction on weak processors.
- Calls to power or trig functions were removed or replaced with multiplications, where possible.
Voxel model creation was switched to use greedy meshing.
Veloren-specific considerations
We explicitly designed the greedy meshing system with figures and sprites in mind. In both cases, we want to be able to efficiently pack many different models into the same texture, especially in cases where we know we will either not be removing any of the grouped-together from the models from the texture, or will remove all of them at once (so they can be packed into some specific subtexture).
For sprites, since we know every model in advance and never intend to deallocate them, we currently pack them all as efficiently as possible into one giant texture atlas. However, in the future, we might opt to pack them slightly less efficiently in exchange for shrinking the sprite vertex size. For figures, we pack all the textures for each model into the same atlas. This is a global texture atlas used for every sprite, and for figures which is why we have the ability to mesh multiple models to the same texture area (using the simple texture atlas allocator) without requiring intermediate vector allocations.
This is accomplished by delaying the time when we actually write the color and light data to the texture until after all the model vertices have been meshed; then, we can just allocate the whole color/light array at once, making the atlas we use an exact fit. In computer science-y terms, we accomplish this delay by not continuing to create the texture data after we perform the initial greedy meshing (without texture information). Instead, we construct a continuation. That is, a function that, when called, will execute the rest of the computation. In Rust terms, this continuation is a FnOnce
closure that takes the ColLightsInfo
that it is supposed to write to as context.
In the process of making changes related to greedy meshing, various incidental fixes and optimizations were required
- Matrix multiplications in the shader were reduced for figure data
- Vertex "waves" for fluid data were removed
- Terrain "bending" near edges was removed
- Scaling was fixed to make sure empty space was not introduced in a space previously occupied by a block. It was also changed to take ownership of its voxel data, rather than sharing it, to let it be used with meshing
- Rust's nightly version was bumped in order to use the
array_map
function, which lets us reuse more code between the simple map andFigureModelCache
References
I tried to cite sources in many cases where I needed features from elsewhere but I am particularly grateful for the following resources, especially where they have accompanying source code. I linked all of them that are accessible to the public (those that are not were obtained through legal means).
Eisemann, Elmar, Michael Schwarz, Ulf Assarsson, Michael Wimmer. Real-Time Shadows. A K Peters/CRC Press (T&F), 20160419.
Lloyd,B. 2007. Logarithmic perspective shadow maps. PhD thesis, University of North Carolina.
Wimmer, M., Scherzer, D., and Purgathofer, W. 2004. Light space perspective shadow maps. In Proceedings of Eurographics Symposium on Rendering 2004, pp. 143– 152.
Pharr, Matt, et al. Physically Based Rendering: From Theory to Implementation. Third edition, Morgan Kaufmann Publishers/Elsevier, 2017.
mikolalysenko. Meshing in a Minecraft Game 0 FPS, 30 June 2012
blackflux. Meshing in Voxel Engines – Part 1 Blackflux.Com, 23 Feb. 2014
I am also especially grateful to Khronos, Wikipedia, and Stack Overflow for answering many of my specific questions while writing the MR.