This Week In Veloren 85

This week, we have a new art blog by @Pfau. @xMAC94x goes into detail on how we
measure networking statistics.

– AngelOnFira, TWiV Editor

Contributor Work

Thanks to this week’s contributors, @xMAC94x, @Sharp, @tylerlowrey, @Imbris,
@Slipped, @TheThirdSpartan, and @mttmartin42!

@Pfau is back at the Veloren art blogs! You can check out No. 7 here. @Capucho
is still trudging through the magic that is WGPU. @Awkor has been doing work on
an inventory autosort proof of concept. @James is working on a new hotbar skill
for the axe and a new m2 for the hammer (moving the current m2 to the hotbar).
@Sam is working on a sword overhaul.

@Songtronix wrote a Discord bot (@Veloren) to control his testing server. It
includes:

  • Switching branches easily
  • Querying the status of the server (updating, compiling) at any time
  • Downloading the database
  • Viewing server settings

Retouch of a repeating crossbow by @AlbinoAxolotl

Improved Server Metrics To Improve Server Performance by @xMAC94x

The first step of optimizing Veloren’s server code is to know what runs the
slowest. If we were to optimize less impactful areas, they wouldn’t impact the
end-user as much. We are using Prometheus metrics to gather data in real-time
from our main server.

With some recent additions, we are now able to get the following metrics:

  • Chunks requested from the server: every time a client asks for a chunk, we
    can see how the server reacts. We can see whether it needs to compute the
    chunk or whether it can just send a chunk that is already loaded because
    another player is already there.
  • Chunk generation: when a chunk needs to be generated, it is queued. This
    metric will help us see whether the queue is growing over time or remains
    at a healthy size.
  • Active players: players online is now split into players connected and
    players disconnected. This makes it easy to detect single events, and even
    lets us detect when one client joins at the same time another leaves. The
    number of online players is calculated from the difference. For
    disconnects, we now also track the cause (e.g. graceful close, timeout,
    network error). Currently, though, under some conditions a network error is
    reported for what is actually a graceful close.
  • State timings: an even deeper look at our engine’s State timings. Before,
    we only had a total time for our entire state. Now we can get the state
    timings for each subsystem. They show that physics is likely the next
    candidate for optimisation.
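
To make the "connected minus disconnected" idea concrete, here is a minimal sketch in Rust. It is not the actual Veloren metrics code: `PlayerCounters`, `DisconnectReason`, and all method names are made up for illustration, and plain atomics stand in for real Prometheus counters.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

/// Hypothetical disconnect causes, mirroring the ones named in the text.
#[derive(Clone, Copy, Debug)]
pub enum DisconnectReason {
    GracefulClose = 0,
    Timeout = 1,
    NetworkError = 2,
}

/// Hypothetical stand-in for two sets of counters that only ever go up.
#[derive(Default)]
pub struct PlayerCounters {
    connected: AtomicU64,
    /// One counter per `DisconnectReason`, indexed by its discriminant.
    disconnected: [AtomicU64; 3],
}

impl PlayerCounters {
    /// Record that a client connected.
    pub fn on_connect(&self) {
        self.connected.fetch_add(1, Ordering::Relaxed);
    }

    /// Record that a client disconnected, together with the cause.
    pub fn on_disconnect(&self, reason: DisconnectReason) {
        self.disconnected[reason as usize].fetch_add(1, Ordering::Relaxed);
    }

    /// Number of disconnects seen so far for one specific cause.
    pub fn disconnects_for(&self, reason: DisconnectReason) -> u64 {
        self.disconnected[reason as usize].load(Ordering::Relaxed)
    }

    /// Players online = total connects minus total disconnects.
    pub fn online(&self) -> u64 {
        let connected = self.connected.load(Ordering::Relaxed);
        let disconnected: u64 = self
            .disconnected
            .iter()
            .map(|counter| counter.load(Ordering::Relaxed))
            .sum();
        connected.saturating_sub(disconnected)
    }
}
```

Because both sides only ever increase, a join and a leave that happen between two polls still show up as one increment on each side, even though the online count itself does not change.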

Theory Of Continuous Prometheus Metrics

Metrics are generated every tick, but are only polled by the Prometheus
server every 2 seconds. This means we only get a metric for every 60th tick.
There are multiple ways to handle this, especially in relation to timing
data. Until now, we have always exported the last tick’s timing. This meant we
got 100% accurate timings on 1 out of 60 ticks, and no timing information on the
remaining 59 ticks. We just had to assume that the other 59 ticks were
probably the same as this one tick. But we have no way of knowing whether the
tick we received was an outlier (say, beyond the 95th percentile), meaning much
faster or slower than the rest. This information is important, as super slow
ticks are the ones that cause lag.

Another way to tackle this problem is by summing up the timings. We can still
poll the data every 2 seconds (60 ticks), so we get the total time spent across
all of these ticks, aggregated together. However, we can then only take the
average and assume each tick took 1/60 of that value. If there was one slow
tick and 59 fast ticks, there is no way of knowing.

A Steppesman’s recurve bow by @AlbinoAxolotl

For this case, Prometheus has histograms. They provide buckets for different
time ranges, let’s say <1ms, 1-5ms, 5-10ms, >10ms. With histograms, each tick
increases a counter in its corresponding bucket. So if we have a super fast
tick, it increases the number in the <1ms bucket. If we have a super slow
tick, it increases the number in the >10ms bucket. In the future we will be
using a combination of BOTH, summing up all ticks AND histograms, to detect the
average time we spend in a certain part of our code as well as to make sure
that we don’t have any spikes causing lag.
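
As a rough illustration of combining both approaches, the sketch below keeps a running sum and a small histogram side by side. It uses only the Rust standard library rather than a Prometheus client, and the `TickTimings` type and its methods are hypothetical names for this example. The buckets follow the simplified ranges from the text; real Prometheus histograms store cumulative "less than or equal" buckets instead.

```rust
use std::time::Duration;

/// Exclusive upper bounds of the first three buckets, in milliseconds,
/// following the example ranges in the text: <1ms, 1-5ms, 5-10ms.
/// Anything at or above the last bound lands in the final (>10ms) bucket.
const BUCKET_BOUNDS_MS: [u64; 3] = [1, 5, 10];

/// Hypothetical per-subsystem timing record: a sum plus a histogram.
#[derive(Debug, Default)]
pub struct TickTimings {
    /// Number of ticks that fell into each bucket.
    pub buckets: [u64; 4],
    /// Sum of all observed tick durations.
    pub total: Duration,
    /// Number of ticks observed.
    pub count: u32,
}

impl TickTimings {
    /// Record one tick: bump its bucket and add it to the running sum.
    pub fn observe(&mut self, tick: Duration) {
        let index = BUCKET_BOUNDS_MS
            .iter()
            .position(|&bound| tick < Duration::from_millis(bound))
            .unwrap_or(BUCKET_BOUNDS_MS.len());
        self.buckets[index] += 1;
        self.total += tick;
        self.count += 1;
    }

    /// Average tick time, or `None` if nothing was observed yet.
    pub fn average(&self) -> Option<Duration> {
        if self.count == 0 {
            None
        } else {
            Some(self.total / self.count)
        }
    }
}
```

Feeding this 59 ticks of 0.5ms and one tick of 100ms gives an average of roughly 2.2ms, which looks harmless on its own. The histogram still shows the single entry in the >10ms bucket, which is exactly the spike the average hides.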

Altogether, these new techniques will allow us to get much better telemetry
about the outliers. Often it’s hard to make networking work well purely from
what seems to be best in theory. Veloren’s application is unique in many ways,
and so we will have to keep striving to analyze how it runs from the data.

Deer by the river. See you next week!

Support our devs!

Veloren Open Collective

This Week In Veloren 81

This week 0.7 was released! However, not everything went well with the release
party, and @xMAC94x wrote a section on that. @Sharp’s colossal branch is getting
merged at long last. The description of all the changes is below.

– AngelOnFira, TWiV Editor

Contributor Work

Thanks to this week’s contributors, @xMAC94x, @Rdbaker, @Pfau, @Songtronix,
@Slipped, @yusdacra, @imbris, @zesterer, @WelshPixie, and @scottc!

We had our largest release party yet, with a peak of 57 players concurrently!
Although Veloren has been in feature freeze and locked down quite a lot, there
was still a lot of work done! @Mckol updated the AUR packages to 0.7. @Sam is
working on a new boss, and focusing on a shockwave attack that it will have.
@imbris is helping with collision detection on the shockwave, and @lobster is
working on particles for it.

It’s been a long road, but the time has finally come to merge @Sharp’s behemoth branch. Read more below for all the details!

@Ellinia got a good amount of SFX work done. @DoctaRay added borderless and
exclusive fullscreen options. @lobster got the particle MR merged right before
the 0.7 release. This included fireworks items, which used the bomb item as a
template.

0.7 Release Party Statistics

With the 0.7 release this past weekend, we held our largest release party yet!
Unfortunately, not everything went smoothly. Be sure to check out the section
below by @xMAC94x to read more about this. Here are the stats from the rest of
the party though!

90% uptime… or was it? Read more to find out

0.7 Release Party Kick Disaster by @xMAC94x

Last Saturday we had our 0.7 release party. A release party is usually a 2-3
hour-long event where we celebrate a new release, play around, and have fun.
Preparation for such a party usually starts a week before, and for the last 48
hours we were all focused on preparing the release build, testing, and setting
up a server to handle ~40 players. The server started at 18:00 GMT+0, and within
5 minutes we had 53 players online! One minute later I had to kill the server as
it froze (see A). Suddenly the whole event changed: what was going to be a fun
party turned into a focus on stability for the next 3 hours.

The problem (part 1)

It took us about 15 minutes of investigation to figure out what was happening.
We noticed players suddenly dropping, all at once. But the server didn’t
restart. This was bothersome: a crash would have given us a clear error message,
but we had nothing. I had 2 thoughts: networking, or a new bug introduced in the
feature freeze period. Networking hadn’t been changed for a month, so I was
pretty sure it was unrelated. We saw that players tried to connect, but their
requests weren’t picked up by the server. We also noticed that our metrics were
still available but didn’t update.

Countering the “crash”

In order to find a quick solution, we did multiple things simultaneously. As a
team, we discussed and tried to analyse the process as well as attach a debugger
to it. We also started to build a local version of Veloren to apply a
potential fix. At this point we were already 30 mins into the release party (see
B). 10 minutes later we had a compiled local build and, by pure luck, the server
was stable for 30 mins (see C). During this time we reached our peak of 57
simultaneous players on the server. We attached the gdb debugger to collect
backtraces once the bug occurred again (see D). We were still pretty unsure what
caused it, as we hadn’t seen such a bug outside this party.

Emergency mitigation meeting

End of the party without a fix

The party ended after a total of 3 hours (see E) with the remaining 15 players
transferring to the official Veloren server hosted by me. Sadly we didn’t have a
fix for the “crash” yet.

Analysing the Backtraces

Analysis was pretty hard as we didn’t have debuginfos enabled, so I collected
new backtraces on the official server. I found 2 threads doing something
with the same std::sync::Mutex, which made me think it might actually be
related to a deadlock. That would fit the pattern we saw (the main thread didn’t
update metrics, but we didn’t crash). With debuginfos I found the problem:

Thread 1 (Thread 0x7fa0a9b37c40 (LWP 19872)):
...
#5  std::sync::condvar::Condvar::wait () at src/libstd/sync/condvar.rs:200
...
#10 0x000055a1ffe16733 in futures_executor::local_pool::run_executor (f=...) at /home/b-server/.cargo/registry/src/github.com-1ecc6299db9ec823/futures-executor-0.3.5/src/local_pool.rs:83
#11 futures_executor::local_pool::block_on (f=...) at /home/b-server/.cargo/registry/src/github.com-1ecc6299db9ec823/futures-executor-0.3.5/src/local_pool.rs:317
#12 veloren_server::events::player::handle_client_disconnect (server=0x7ffd0ddc6490, entity=...) at server/src/events/player.rs:84
#13 0x000055a1ffdcd7df in veloren_server::events::<impl veloren_server::Server>::handle_events (self=<optimized out>) at server/src/events/mod.rs:105
#14 0x000055a1ffdd8354 in veloren_server::Server::tick (self=0x7ffd0ddc6490, _input=..., dt=...) at server/src/lib.rs:400
#15 0x000055a1ffcb4e9e in veloren_server_cli::main () at server-cli/src/main.rs:56

The problem (part 2)

That backtrace told us a lot about what was going on:

  • Thread 1 is the main thread, the one that was doing the critical phase of
    the server. See frame #14.
  • wait() in frame #5 tells us that it’s waiting for an event to happen.
  • handle_client_disconnect() in frame #12 tells us exactly what we were
    trying to do: it’s at server/src/events/player.rs:84

So let’s look at the source code:

84: if let Err(e) = block_on(participant.disconnect()) {
85:   debug!(
86:     ?e,
87:     "Error when disconnecting client, maybe the pipe already broke"
88:   );
89: };

The server seems to be blocking on the participant.disconnect() call. This
function is there to cleanly shut down a client after they disconnect or are
kicked. It should try to send all relevant info to them and, once this is done,
close the connection. For some reason, it took way too long (I haven’t analyzed
why yet) and sporadically blocked the main thread.

New armor set by @Snowram

The solution

Resolving the error was quite easy once we isolated it. We just moved the
teardown to another thread. In a background thread, it can take as long as it
wants without interrupting any player again. Further investigation will show why
it takes so long:

Aug 19 08:15:43.126  WARN veloren_server::events::player: disconecting took quite long elapsed=2.534882113s
Aug 19 08:16:20.984  WARN veloren_server::events::player: disconecting took quite long elapsed=3.543180779s
Aug 19 08:28:19.724 DEBUG veloren_server::events::player: disconecting took elapsed=23.553655ms
Aug 19 08:29:38.784 DEBUG veloren_server::events::player: disconecting took elapsed=21.045099ms
Aug 19 08:35:02.043 DEBUG veloren_server::events::player: disconecting took elapsed=25.217615ms
Aug 19 08:37:39.406 DEBUG veloren_server::events::player: disconecting took elapsed=19.948233ms
Aug 19 09:19:16.508  WARN veloren_server::events::player: disconecting took quite long elapsed=23.326647409s
Aug 19 11:29:26.872  WARN veloren_server::events::player: disconecting took quite long elapsed=18.663718063s
Aug 19 12:00:04.742  WARN veloren_server::events::player: disconecting took quite long elapsed=972.050707757s

As you can see, the disconnect is often done within about 25ms, but sometimes it
takes multiple seconds, or even around 16 minutes (972 seconds).
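
The idea of the fix can be sketched with nothing but the standard library. This is not the actual Veloren patch: `Participant` here is a hypothetical stand-in whose `disconnect()` just sleeps, `disconnect_in_background` is a made-up helper name, and the 500ms warning threshold is an assumption chosen for the example.

```rust
use std::thread;
use std::time::{Duration, Instant};

/// Hypothetical stand-in for a network participant; not the real type.
pub struct Participant {
    /// How long the simulated teardown takes.
    pub teardown: Duration,
}

impl Participant {
    /// Simulates a blocking, possibly very slow, graceful disconnect.
    pub fn disconnect(self) -> Result<(), String> {
        thread::sleep(self.teardown);
        Ok(())
    }
}

/// Hands the teardown to a background thread and returns immediately, so
/// the caller (for example the tick loop) is not blocked by a slow client.
/// The returned handle yields how long the teardown actually took.
pub fn disconnect_in_background(participant: Participant) -> thread::JoinHandle<Duration> {
    thread::spawn(move || {
        let start = Instant::now();
        if let Err(e) = participant.disconnect() {
            eprintln!("Error when disconnecting client: {e}");
        }
        let elapsed = start.elapsed();
        // Assumed threshold, only to separate "normal" from "suspicious".
        if elapsed > Duration::from_millis(500) {
            eprintln!("disconnecting took quite long: {elapsed:?}");
        }
        elapsed
    })
}
```

The tick loop would typically not join the handle at all. It is returned here only so that the elapsed time can be inspected, for example in a test.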

@Sharp’s Lighting and World Changes Branch

The following is the description of @Sharp’s massive branch as it is being
merged into master. It covers LOD, shadow maps, greedy meshing, new lighting,
world size refactoring, and other performance fixes.

Pretty much a Veloren fork at this point. Here’s a high-level overview of the
changes. This MR incorporates roughly two groups of changes.

The first group consists of new game features: more flexible map sizes, level of
detail terrain, shadow maps, and a new lighting engine. This is “feature work”
that (mostly) only adds new things to Veloren, and mostly shouldn’t affect old
stuff.

The second big group of changes are those addressing the fallout from all the
new features. These include performance fixes of various sorts: the addition of
multiple graphics options and optimization of the cheap ones to avoid work,
switching all voxel models to use some variant of greedy meshing, switching over
much of our CPU-side vector math to exploit SIMD instructions (coinciding with a
fork of vek), and a rewrite of how the UI handles text rendering (coinciding
with updates to our fork of conrod). Making Veloren’s hardcoded colors appear
correct under the new lighting engine also required considerable changes.

The second category of changes often heavily touches code owned by other people,
including frequently modified code “owned” by a handful of people, so I
recommend that this code be reviewed particularly carefully.

Table of Contents

At a high level (each will be described in more detail below):

  • The world map has been refactored
    • The world size is no longer hardcoded
    • The map generation code was made generic to allow using it outside of the
      world crate
    • On world creation, we now compute horizon maps
    • The way we pass the world from the server to the client has been updated
    • Artifacts related to image rotation were fixed
    • Multiflow rivers were enabled
    • In the process of making changes related to the world map, various
      incidental fixes and optimizations were required
  • The new level of detail feature was added
    • A new LOD terrain rendering step was added to the pipeline
    • The LOD terrain quality was made configurable via a graphics setting
    • Horizon maps were used to cast shadows from LOD chunks on both LOD and
      non-LOD terrain
    • A “voxelization” effect was incorporated into rendered LOD terrain to make
      it blend better into the world
    • In the process of making changes related to LOD, various incidental fixes
      and optimizations were required
  • Veloren’s lighting has been completely overhauled
    • A semi-accurate index of refraction was assigned to our materials
    • A new, more realistic, physically-based approach to lighting was adopted,
      using solar panel light exposure
    • We attempt to compute realistic light attenuation in water using its real
      material properties
    • In the process of making changes related to lighting, various incidental
      fixes and optimizations were required
  • Point and directional lights now cast realistic shadows, using shadow
    mapping

    • Point light shadow maps were added to the rendering pipeline, using geometry
      shaders and seamless cube maps.
    • Directional light shadows were added to the rendering pipeline, using LISPSM
      together with disabling depth clamping.
    • “Shadow-only” chunks and NPCs were added to prevent shadows from models
      behind you from disappearing.
    • In the process of making changes related to shadow maps, various incidental
      fixes and optimizations were required.

The addition of shadow maps, LOD terrain, and the new lighting all led to
significant performance degradation, on top of other changes happening in
master. Therefore, a large number of performance improvements were also needed:

  • The graphics options were made much more flexible and configurable, and
    shaders were optimized

    • New options were provided for how to render lights and shadows
    • Graphic setting storage and configuration were overhauled to make adding new
      features easier
    • Shaders were rewritten to utilize GLSL’s preprocessor to avoid overhead
    • In the process of making changes related to providing additional rendering
      options, various incidental fixes and optimizations were required.
  • Voxel model creation was switched to use greedy meshing
    • A new voxel meshing method, greedy meshing, was added
    • Uses of the older meshing methods were migrated to use greedy meshing
    • New restrictions were added to terrain, figure, and sprites to future proof
      them for further optimizations
    • Most positions are now relative to either chunk or player position for
      better precision
    • In the process of making changes related to greedy meshing, various
      incidental fixes and optimizations were required
  • Animation and terrain math was switched to use SIMD where possible
    • Fixes were made to vek to make its SIMD feature usable for us
    • The interface and types used in bone animation were changed in various ways
    • Redundant code generation for body animation is now partly taken care of by
      a macro
    • Animation code was modified to use vek’s SIMD representation where possible
    • Terrain meshing code and shadow map math were also modified to use vek’s
      SIMD representation
    • SIMD instruction generation was enabled
    • In the process of making changes related to SIMD, various incidental
      fixes and optimizations were required
  • The way we cache glyphs was completely refactored, fixed, and optimized
    • Our fork of conrod was optimized in various ways
    • Our fork of conrod now exposes whether a widget was updated during the
      current frame
    • Our use of the glyph cache was rewritten for correctness
    • A text cache was introduced that lets us skip remeshing glyphs that have
      not changed
    • Various changes were made to reduce pressure on the glyph cache, with more
      planned
    • In the process of making changes related to the glyph cache, various
      incidental fixes and optimizations were required
  • Colors were changed to keep Veloren’s look consistent with master
    • Some older tree models were brought back
    • All hardcoded colors were extracted and made hotloadable.
    • Hardcoded colors were fixed to conform to Veloren’s style.
    • Color models were fixed to conform to Veloren’s style.

A detailed description of the involved changes follows.

World map information was refactored

Support for horizon maps has been merged into the map functionality in common
as well.

  • The way we pass the world from the server to the client has been updated.
    Rather than passing the prerendered map, we instead pass three maps with
    values for each chunk; one with the color information, a second with altitude
    information, and a third with horizon map information. We then reconstruct the
    map on the client, together with some additional information we send from the
    server (like the sea level and maximum height). See common/src/msg/server.rs
    for a detailed description of the format of WorldMapMsg, and
    server/src/lib.rs and client/src/lib.rs for details of the map
    construction and parsing.
  • Artifacts related to image rotation were fixed. See the commit message for
    commit SHA cf74d55f2e3d2ae7d25fd68d5c73b01a6afde86e for a detailed
    explanation. This involved changes to shaders, the addition of a new type of
    graphic (also reflected in the graphic cache) that allows specifying a border
    color (which automatically makes the associated texture immutable), and some
    related fixes. I reproduce the first two paragraphs of the MR description as
    well:
Fix map image artifacts and remove unneeded allocations.

Specifically, we address three concerns (the image stretching during
rotation, artifacts around the image due to clamping to the nearest
border color when the image is drawn to a larger space than the image
itself takes up, and potential artifacts around a rotated image which
accidentally ended up in an atlas and didn't have enough extra space to
guarantee the rotation would work).
  • Multiflow rivers were enabled. This does not really need to be part of this
    MR, and would be easy to revert, but since it seemed to provide a nice
    improvement it’s currently packaged with it. We already computed multiple
    outflows from each chunk for erosion purposes long before this MR. However, we
    never modified river rendering to be able to handle this case (just a single
    downhill river flow is complex enough!) so this was not exposed when deciding
    which chunks were rivers.
  • In the process of making changes related to the world map, various incidental
    fixes and optimizations were required. Some examples of fixes include making
    sure terrain is never lowered to below sea level (to make the shadow maps
    report correct values), fixing map altitudes and colors to understand things
    like cliffs and “block level” coloring (that doesn’t exist on the column
    level), and fixing a crash bug when rendering images for the UI where source
    pixels are strongly rectangular. Some examples of related performance fixes
    include avoiding allocating a fresh vector for all the maps (i.e. copying it
    over to change the format from [u32; n] to DynamicImage and then copying
    again to convert to RgbaImage), and instead using the gfx::memory::slice
    function to accomplish the same thing. These sorts of changes are spread all
    around the code.

A new LOD terrain rendering step was added to the pipeline

This includes the addition of a new scene, voxygen/src/scene/lod.rs, a new
pipeline voxygen/src/render/pipeline/lod_terrain.rs, and new shaders
assets/voxygen/shaders/lod-terrain-vert.glsl and
assets/voxygen/shaders/lod-terrain-frag.glsl, as well as associated changes to
the renderer in voxygen/src/render/renderer.rs.

The main idea behind our initial approach to LOD was to take the world data we
now get from the server (altitude, color, and horizon mapping).

In the process of making changes related to LOD, various incidental fixes and optimizations were required

  • Some previously computed values were turned into shader uniforms for better
    prediction on weak processors.
  • Calls to power or trig functions were removed or replaced with
    multiplications, where possible.

Voxel model creation was switched to use greedy meshing

Veloren-specific considerations

We explicitly designed the greedy meshing system with figures and sprites in
mind. In both cases, we want to be able to efficiently pack many different
models into the same texture, especially in cases where we know we will either
not be removing any of the grouped-together models from the texture, or
will remove all of them at once (so they can be packed into some specific
subtexture).

For sprites, since we know every model in advance and never intend to deallocate
them, we currently pack them all as efficiently as possible into one giant
texture atlas. However, in the future, we might opt to pack them slightly less
efficiently in exchange for shrinking the sprite vertex size. For figures, we
pack all the textures for each model into the same atlas. This is a global
texture atlas used for every sprite, and for figures which is why we have the
ability to mesh multiple models to the same texture area (using the simple
texture atlas allocator) without requiring intermediate vector allocations.

This is accomplished by delaying the time when we actually write the color and
light data to the texture until after all the model vertices have been meshed;
then, we can just allocate the whole color/light array at once, making the atlas
we use an exact fit. In computer science-y terms, we accomplish this delay by
not continuing to create the texture data after we perform the initial greedy
meshing (without texture information). Instead, we construct a continuation:
a function that, when called, will execute the rest of the computation.
In Rust terms, this continuation is a FnOnce closure that takes the
ColLightsInfo that it is supposed to write to as context.
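
A heavily simplified sketch of this continuation pattern is shown below. None of the names come from the real code: `AtlasData`, `AtlasAllocator`, `mesh_model`, and `build_atlas` are hypothetical, the "atlas" is a flat 1D array rather than a 2D texture, and the meshing step is reduced to reserving space.

```rust
/// Hypothetical stand-in for the color/light texture data.
pub struct AtlasData {
    pub texels: Vec<u32>,
}

/// A deferred texture write: the "rest of the computation" for one model.
pub type Continuation = Box<dyn FnOnce(&mut AtlasData)>;

/// Hypothetical, very simple atlas allocator handing out consecutive ranges.
#[derive(Default)]
pub struct AtlasAllocator {
    next_free: usize,
}

impl AtlasAllocator {
    /// Reserve `len` texels and return the offset of the reserved range.
    fn allocate(&mut self, len: usize) -> usize {
        let offset = self.next_free;
        self.next_free += len;
        offset
    }

    /// Total number of texels reserved so far.
    pub fn total_len(&self) -> usize {
        self.next_free
    }
}

/// "Meshes" one model: space is reserved now, but the colors are only
/// written later, when the returned continuation is called.
pub fn mesh_model(alloc: &mut AtlasAllocator, colors: Vec<u32>) -> Continuation {
    let offset = alloc.allocate(colors.len());
    Box::new(move |atlas: &mut AtlasData| {
        atlas.texels[offset..offset + colors.len()].copy_from_slice(&colors);
    })
}

/// Meshes every model first, then allocates the atlas exactly once with the
/// final size, and only then runs the deferred writes.
pub fn build_atlas(models: Vec<Vec<u32>>) -> AtlasData {
    let mut alloc = AtlasAllocator::default();
    let continuations: Vec<Continuation> = models
        .into_iter()
        .map(|colors| mesh_model(&mut alloc, colors))
        .collect();
    let mut atlas = AtlasData {
        texels: vec![0; alloc.total_len()],
    };
    for write in continuations {
        write(&mut atlas);
    }
    atlas
}
```

Since the total size is only known once every model has been meshed, deferring the writes lets the atlas be allocated a single time as an exact fit, with no intermediate per-model buffers to copy from afterwards.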

In the process of making changes related to greedy meshing, various incidental fixes and optimizations were required

  • Matrix multiplications in the shader were reduced for figure data
  • Vertex “waves” for fluid data were removed
  • Terrain “bending” near edges was removed
  • Scaling was fixed to make sure empty space was not introduced in a space
    previously occupied by a block. It was also changed to take ownership of its
    voxel data, rather than sharing it, to let it be used with meshing
  • Rust’s nightly version was bumped in order to use the array_map function,
    which lets us reuse more code between the simple map and FigureModelCache

References

I tried to cite sources in many cases where I needed features from elsewhere but
I am particularly grateful for the following resources, especially where they
have accompanying source code. I linked all of them that are accessible to the
public (those that are not were obtained through legal means).

Eisemann, Elmar, Michael Schwarz, Ulf Assarsson, Michael Wimmer. Real-Time
Shadows. A K Peters/CRC Press (T&F), 20160419.

Lloyd, B. 2007. Logarithmic perspective shadow maps. PhD thesis, University of
North Carolina.

Wimmer, M., Scherzer, D., and Purgathofer, W. 2004. Light space perspective
shadow maps. In Proceedings of Eurographics Symposium on Rendering 2004,
pp. 143–152.

Pharr, Matt, et al. Physically Based Rendering: From Theory to Implementation.
Third edition, Morgan Kaufmann Publishers/Elsevier, 2017.

mikolalysenko. Meshing in a Minecraft Game. 0 FPS, 30 June 2012.

blackflux. Meshing in Voxel Engines – Part 1. Blackflux.Com, 23 Feb. 2014.

I am also especially grateful to Khronos, Wikipedia, and Stack Overflow for
answering many of my specific questions while writing the MR.

The most interesting bag. See you next week!
