This Week In Veloren 81

This week 0.7 was released! However, not everything went well with the release
party, and @xMAC94x wrote a section on that. @Sharp’s colossal branch is getting
merged at long last. The description of all the changes is below.

– AngelOnFira, TWiV Editor

Contributor Work

Thanks to this week’s contributors, @xMAC94x, @Rdbaker, @Pfau, @Songtronix,
@Slipped, @yusdacra, @imbris, @zesterer, @WelshPixie, @scottc!

We had our largest release party yet, with a peak of 57 players concurrently!
Although Veloren has been in feature freeze and locked down quite a lot, there
was still a lot of work done! @Mckol updated the AUR packages to 0.7. @Sam is
working on a new boss, and focusing on a shockwave attack that it will have.
@imbris is helping with collision detection on the shockwave, and @lobster is
working on particles for it.

It’s been a long road, but the time has finally come to merge @Sharp’s behemoth branch. Read more below for all the details!

@Ellinia got a good amount of SFX work done. @DoctaRay added borderless and
exclusive fullscreen options. @lobster got the particle MR merged right before
the 0.7 release. This included fireworks items, which used the bomb item as a
template.

0.7 Release Party Statistics

With the 0.7 release this past weekend, we held our largest release party yet!
Unfortunately, not everything went smoothly. Be sure to check out the section
below by @xMAC94x to read more about this. Here are the stats from the rest of
the party though!

90% uptime… or was it? Read more to find out

0.7 Release Party Kick Disaster by @xMAC94x

Last Saturday we had our 0.7 release party. A release party is usually a 2-3
hour-long event where we celebrate a new release, play around, and have fun.
Preparation for such a party usually starts a week before and the last 48 hours
we were all focused on preparing the release build, testing, setting up a server
to handle ~40 players. The server started at 18:00 GMT+0 within 5 minutes we had
53 players coming online! One minute later I had to kill the server as it froze.
(see A) Suddenly the whole event changed. From what was going to be a fun party
changed to focus on stability for the next 3 hours.

The problem (part 1)

It took us about 15 minutes of investigation to figure out what was happening.
We noticed players suddenly dropping, all at once. But the server didn’t
restart. This was bothersome, a crash would give us a clear error message, but
we had nothing. I had 2 thoughts: networking or new bug introduced in the
feature freeze period. Networking hasn’t been changed for a month, so I was
pretty sure it’s unrelated. We saw that try to connect, but their requests
weren’t picked up by the server. We also noticed that our metrics are still
available but didn’t update.

Countering the “crash”

In order to find a quick solution, we did multiple things simultaneously. As a
team, we discussed and tried to analyse the process as well as attach a debugger
to the process. We also started to build a local version of Veloren to apply a
potential fix. At this point we were 30 mins in the release party already (see
B). 10 minutes later we had a compiled local build and to pure luck, the server
was stable for 30 mins (see C). During this time we achieved the most players
simultaneously on the server at 57. We attached the gdb debugger to collect
backtraces once the bug occurred again (see D). We were still pretty unsure what
caused it, as we haven’t seen such a bug outside this party.

Emergency mitigation meeting

End of the party without a fix

The party ended after a total of 3 hours (see E) with the remaining 15 players
transferring to the official Veloren server hosted by me. Sadly we didn’t have a
fix for the “crash” yet.

Analysing the Backtraces

Analysis was pretty hard as we didn’t have debuginfos enabled, so I collected
new backtraces on the official server. I found that 2 threads doing something
with the same std::sync::Mutex which made me think it might be actually
related to a Deadlock. That would fit the pattern we saw (main thread didn’t
update metrics, but we didn’t crash). With debuginfos I found the problem:

Thread 1 (Thread 0x7fa0a9b37c40 (LWP 19872)):
...
#5  std::sync::condvar::Condvar::wait () at src/libstd/sync/condvar.rs:200
...
#10 0x000055a1ffe16733 in futures_executor::local_pool::run_executor (f=...) at /home/b-server/.cargo/registry/src/github.com-1ecc6299db9ec823/futures-executor-0.3.5/src/local_pool.rs:83
#11 futures_executor::local_pool::block_on (f=...) at /home/b-server/.cargo/registry/src/github.com-1ecc6299db9ec823/futures-executor-0.3.5/src/local_pool.rs:317
#12 veloren_server::events::player::handle_client_disconnect (server=0x7ffd0ddc6490, entity=...) at server/src/events/player.rs:84
#13 0x000055a1ffdcd7df in veloren_server::events::<impl veloren_server::Server>::handle_events (self=<optimized out>) at server/src/events/mod.rs:105
#14 0x000055a1ffdd8354 in veloren_server::Server::tick (self=0x7ffd0ddc6490, _input=..., dt=...) at server/src/lib.rs:400
#15 0x000055a1ffcb4e9e in veloren_server_cli::main () at server-cli/src/main.rs:56

The problem (part 2)

That backtrace told us a lot about what was going on:

  • Thread 1 is the main thread, the one that was doing the critical phase of
    the server. See line #14.
  • wait() in line #5 tells us that it’s waiting for an event to happen.
  • handle_client_disconnect() in line #12 gives us the exact thing what we are
    trying to do: it’s in line server/src/events/player.rs:84

So lets look at the source code:

84: if let Err(e) = block_on(participant.disconnect()) {
85:   debug!(
86:     ?e,
87:     "Error when disconnecting client, maybe the pipe already broke"
88:   );
89: };

The error seems to be blocking on the client.disconnect() function. This
function is there to cleanly shutdown a client after they disconnect or are
kicked. It should try to send all relevant info to them and once this is done
close the connection. For some reason, it took way to long (I haven’t analyzed
why yet) and blocks the main thread sporadically.

New armor set by @Snowram

The solution

Resolving the error was quite easy once we isolated it. We just move the
teardown to another thread. In a background thread this can take as long as it
wants without interrupting any player again. Further investigation will show why
it takes so long:

Aug 19 08:15:43.126  WARN veloren_server::events::player: disconecting took quite long elapsed=2.534882113s
Aug 19 08:16:20.984  WARN veloren_server::events::player: disconecting took quite long elapsed=3.543180779s
Aug 19 08:28:19.724 DEBUG veloren_server::events::player: disconecting took elapsed=23.553655ms
Aug 19 08:29:38.784 DEBUG veloren_server::events::player: disconecting took elapsed=21.045099ms
Aug 19 08:35:02.043 DEBUG veloren_server::events::player: disconecting took elapsed=25.217615ms
Aug 19 08:37:39.406 DEBUG veloren_server::events::player: disconecting took elapsed=19.948233ms
Aug 19 09:19:16.508  WARN veloren_server::events::player: disconecting took quite long elapsed=23.326647409s
Aug 19 11:29:26.872  WARN veloren_server::events::player: disconecting took quite long elapsed=18.663718063s
Aug 19 12:00:04.742  WARN veloren_server::events::player: disconecting took quite long elapsed=972.050707757s

As you can see often the disconnect is done within 25ms, but sometimes it
requires multiple seconds, or even up to 15 minutes.

@Sharp’s Lighting and World Changes Branch

The following is the description of @Sharp’s massive branch as it is being
merged into master. It covers LOD, shadow maps, greedy meshing, new lighting,
world size refactoring, and other performance fixes.

Pretty much a Veloren fork at this point. Here’s a high-level overview of the
changes. At a high level, this MR incorporates roughly two groups of changes.

The first group consists of new game features: more flexible map sizes, level of
detail terrain, shadow maps, and a new lighting engine. This is “feature work”
that (mostly) only adds new things to Veloren, and mostly shouldn’t affect old
stuff.

The second big group of changes are those addressing the fallout from all the
new features. These include performance fixes of various sorts: the addition of
multiple graphics options and optimization of the cheap ones to avoid work,
switching all voxel models to use some variant of greedy meshing, switching over
much of our CPU-side vector math to exploit SIMD instructions (coinciding with a
fork of vek), and a rewrite of how the UI handles text rendering (coinciding
with updates to our fork of conrod). Making Veloren’s hardcoded colors appear
correct under the new lighting engine also required considerable changes.

The second category of changes often heavily touches code owned by other people,
including frequently modified code “owned” by a handful of people, so I
recommend that this code be reviewed particularly carefully.

Table of Contents

At a high level (each will be described in more detail below):

  • The world map has been refactored
    • The world size is no longer hardcoded
    • The map generation code was made generic to allow using it outside of the
      world crate
    • On world creation, we now compute horizon maps
    • The way we pass the world from the server to the client has been updated
    • Artifacts related to image rotation were fixed
    • Multiflow rivers were enabled
    • In the process of making changes related to the world map, various
      incidental fixes and optimizations were required
  • The new level of detail feature was added
    • A new LOD terrain rendering step was added to the pipeline
    • The LOD terrain quality was made configurable via a graphics setting
    • Horizon maps were used to cast shadows from LOD chunks on both LOD and
      non-LOD terrain
    • A “voxelization” effect was incorporated into rendered LOD terrain to make
      it blend better into the world
    • In the process of making changes related to LOD, various incidental fixes
      and optimizations were required
  • Veloren’s lighting has been completely overhauled
    • A semi-accurate index of refraction was assigned to our materials
    • A new, more realistic, physically-based approach to lighting was used using
      solar panel light exposure
    • We attempt to compute realistic light attenuation in water using its real
      material properties
    • In the process of making changes related to LOD, various incidental fixes
      and optimizations were required
  • Point and directional lights now cast realistic shadows, using shadow
    mapping

    • Point light shadow maps were added to the rendering pipeline, using geometry
      shaders and seamless cube maps.
    • Directional light shadows were added to the rendering pipeline, using LISPSM
      together with disabling depth clamping.
    • “Shadow-only” chunks and NPCs were added to prevent shadows from models
      behind you from disappearing.
    • In the process of making changes related to shadow maps, various incidental
      fixes and optimizations were required.

The addition of shadow maps, LOD terrain, and the new lighting all led to
significant performance degradation, on top of other changes happening in
master. Therefore, a large number of performance improvements were also needed:

  • The graphics options were made much more flexible and configurable, and
    shaders were optimized

    • New options were provided for how to render lights and shadows
    • Graphic setting storage and configuration were overhauled to make adding new
      features easier
    • Shaders were rewritten to utilize GLSL’s preprocessor to avoid overhead
    • In the process of making changes related to providing additional rendering
      options, various incidental fixes and optimizations were required.
  • Voxel model creation was switched to use greedy meshing
    • A new voxel meshing method, greedy meshing, was added
    • Uses of the older meshing methods were migrated to use greedy meshing
    • New restrictions were added to terrain, figure, and sprites to future proof
      them for further optimizations
    • Most positions are now relative to either chunk or player position for
      better precision
    • In the process of making changes related to greedy meshing, various
      incidental fixes and optimizations were required
  • Animation and terrain math was switched to use SIMD where possible
    • Fixes were made to vek to make its SIMD feature usable for us
    • The interface and types used in bone animation were changed in various ways
    • Redundant code generation for body animation is now partly taken care of by
      a macro
    • Animation code was modified to use vek’s SIMD representation where possible
    • Terrain meshing code and shadow map math were also modified to use vek’s
      SIMD representation
    • SIMD instruction generation was enabled
    • In the process of making changes related to greedy meshing, various
      incidental fixes and optimizations were required
  • The way we cache glyphs was completely refactored, fixed, and optimized
    • Our fork of conrod was optimized in various ways
    • Our fork of conrod now exposes whether a widget was updated during the
      current frame
    • Our use of the glyph cache was rewritten for correctness
    • A text cache was introduced that lets us skip remeshing glyphs that have
      not changed
    • Various changes were made to reduce pressure on the glyph cache, with more
      planned
    • In the process of making changes related to the glyph cache, various
      incidental fixes and optimizations were required
  • Colors were changed to keep Veloren’s look consistent with master
    • Some older tree models were brought back
    • All hardcoded colors were extracted and made hotloadable.
    • Hardcoded colors were fixed to conform to Veloren’s style.
    • Color models were fixed to conform to Veloren’s style.

A detailed description of the involved changes follows.

World map information was refactored

Support for horizon maps has been merged into the map functionality in common
as well.

  • The way we pass the world from the server to the client has been updated.
    Rather than passing the prerendered map, we instead pass three maps with
    values for each chunk; one with the color information, a second with altitude
    information, and a third with horizon map information. We then reconstruct the
    map on the client, together with some additional information we send from the
    server (like the sea level and maximum height). See common/src/msg/server.rs
    for a detailed description of the format of WorldMapMsg, and
    server/src/libr.rs and client/src/lib.rs for details of the map
    construction and parsing.
  • Artifacts related to image rotation were fixed. See the commit message for
    commit SHA cf74d55f2e3d2ae7d25fd68d5c73b01a6afde86e for a detailed
    explanation. This involved changes to shaders, the addition of a new type of
    graphic (also reflected in the graphic cache) that allows specifying a border
    color (which automatically makes the associated texture immutable), and some
    related fixes. I reproduce the first two paragraphs of the MR description as
    well:
Fix map image artifacts and remove unneeded allocations.

Specifically, we address three concerns (the image stretching during
rotation, artifacts around the image due to clamping to the nearest
border color when the image is drawn to a larger space than the image
itself takes up, and potential artifacts around a rotated image which
accidentally ended up in an atlas and didn't have enough extra space to
guarantee the rotation would work).
  • Multiflow rivers were enabled. This does not really need to be part of this
    MR, and would be easy to revert, but since it seemed to provide a nice
    improvement it’s currently packaged with it. We already computed multiple
    outflows from each chunk for erosion purposes long before this MR. However, we
    never modified river rendering to be able to handle this case (just a single
    downhill river flow is complex enough!) so this was not exposed when deciding
    which chunks were rivers.
  • In the process of making changes related to the world map, various incidental
    fixes and optimizations were required. Some examples of fixes include making
    sure terrain is never lowered to below sea level (to make the shadow maps
    report correct values), fixing map altitudes and colors to understand things
    like cliffs and “block level” coloring (that doesn’t exist on the column
    level), and fixing a crash bug when rendering images for the UI where source
    pixels are strongly rectangular. Some examples of related performance fixes
    include avoiding allocating a fresh vector for all the maps (i.e. copying it
    over to change the format from [u32; n] to DynamicImage and then copying
    again to convert to RgbaImage), and instead using the gfx::memory::slice
    function to accomplish the same thing. These sorts of changes are spread all
    around the code.

A new LOD terrain rendering step was added to the pipeline

This includes the addition of a new scene, voxygen/src/scene/lod.rs, a new
pipeline voxygen/src/render/pipeline/lod_terrain.rs, and new shaders
assets/voxygen/shaders/lod-terrain-vert.glsl and
assets/voxygen/shaders/lod-terrain-frag.glsl, as well as associated changes to
the renderer in voxygen/src/render/renderer.rs.

The main idea behind our initial approach to LOD was to take the world data we
now get from the server (altitude, color, and horizon mapping).

In the process of making changes related to LOD, various incidental fixes and optimizations were required

  • Some previously computed values were turned into shader uniforms for better
    prediction on weak processors.
  • Calls to power or trig functions were removed or replaced with
    multiplications, where possible.

Voxel model creation was switched to use greedy meshing.

Veloren-specific conisderations

We explicitly designed the greedy meshing system with figures and sprites in
mind. In both cases, we want to be able to efficiently pack many different
models into the same texture, especially in cases where we know we will either
not be removing any of the grouped-together from the models from the texture, or
will remove all of them at once (so they can be packed into some specific
subtexture).

For sprites, since we know every model in advance and never intend to deallocate
them, we currently pack them all as efficiently as possible into one giant
texture atlas. However, in the future, we might opt to pack them slightly less
efficiently in exchange for shrinking the sprite vertex size. For figures, we
pack all the textures for each model into the same atlas. This is a global
texture atlas used for every sprite, and for figures which is why we have the
ability to mesh multiple models to the same texture area (using the simple
texture atlas allocator) without requiring intermediate vector allocations.

This is accomplished by delaying the time when we actually write the color and
light data to the texture until after all the model vertices have been meshed;
then, we can just allocate the whole color/light array at once, making the atlas
we use an exact fit. In computer science-y terms, we accomplish this delay by
not continuing to create the texture data after we perform the initial greedy
meshing (without texture information). Instead, we construct a continuation.
That is, a function that, when called, will execute the rest of the computation.
In Rust terms, this continuation is a FnOnce closure that takes the
ColLightsInfo that it is supposed to write to as context.

In the process of making changes related to greedy meshing, various incidental fixes and optimizations were required

  • Matrix multiplications in the shader were reduced for figure data
  • Vertex “waves” for fluid data were removed
  • Terrain “bending” near edges was removed
  • Scaling was fixed to make sure empty space was not introduced in a space
    previously occupied by a block. It was also changed to take ownership of its
    voxel data, rather than sharing it, to let it be used with meshing
  • Rust’s nightly version was bumped in order to use the array_map function,
    which lets us reuse more code between the simple map and FigureModelCache

References

I tried to cite sources in many cases where I needed features from elsewhere but
I am particularly grateful for the following resources, especially where they
have accompanying source code. I linked all of them that are accessible to the
public (those that are not were obtained through legal means).

Eisemann, Elmar, Michael Schwarz, Ulf Assarsson, Michael Wimmer. Real-Time
Shadows. A K Peters/CRC Press (T&F), 20160419.

Lloyd,B. 2007. Logarithmic perspective shadow
maps
. PhD
thesis, University of North Carolina.

Wimmer, M., Scherzer, D., and Purgathofer, W. 2004. Light space perspective
shadow
maps
. In
Proceedings of Eurographics Symposium on Rendering 2004, pp. 143– 152.

Pharr, Matt, et al. Physically Based Rendering: From Theory to
Implementation
.
Third edition, Morgan Kaufmann Publishers/Elsevier, 2017.

mikolalysenko. Meshing in a Minecraft
Game
0 FPS, 30 June
2012

blackflux. Meshing in Voxel Engines – Part
1

Blackflux.Com, 23 Feb. 2014

I am also especially grateful to Khronos, Wikiepdia, and stackoverflow for
answering many of my specific questions while writing the MR.

The most interesting bag. See you next week!

Support our devs!

Veloren Open Collective

This Week In Veloren 80

This week we prepare for the 0.7 release. With the code freeze, we don’t have
too many new features to talk about, but we get a rundown on particle timing
from @lobster.

– AngelOnFira, TWiV Editor

Contributor Work

Thanks to this week’s contributors, @yusdacra, @Songtronix, @scottc, @Imbris,
@IBotDEU, @Slipped, @nepo, @zesterer, @Pfau, @xMAC94x, @Varpie, @Sharp,
@AngelOnFira, @Sam, @Silentium, @BottledByte, and @JudgeGriesa!

0.7: “The Progression Update” is releasing this weekend! In this version, many
changes have been made to combat, animations, worldgen, and a lot more systems.
The release party will be happening this Saturday at 18:00 GMT+0. Come join us
on the main server to check out all the new changes!

A Solemn Quest by @Eden

@nepo added Dullahan into the game with help from @Slipped and @Snowram. The
model was created by @Gemu. @lobster wrote a basic scheduler for particles, so
we have emitter-like functionality. @PlaneCat has been working on adding a
weapons guide to the book. @Slipped merged swimming improvements including
controlling and animations. @zesterer added @WelshPixie’s furniture models into
the game.

Swimming improvements

Particle Timing by @lobster

Since we got rid of our ECS particle emitter component, we also lost our timing
mechanism to generate particles at the correct time. Which means our particle
spawn rate is now tied to the framerate. So if the frame rate stutters, it can
cause changes in particle density or other weird visual artifacts.

To fix this we need to track the last time a particle was created for any given
effect, so we can have consistent timing between particles. And because we have
a lot of particles, reducing the work done per-particle is important because it’s
multiplied 10,000 times or more, which can have a big performance impact.

New furniture by @WelshPixie

Since many particles are generated using the same or similar timings, we can
optimise this by putting all particles that spawn at the same rate into
frequency buckets. So now instead of tracking last created time for a single
particle, we can do it per frequency.

for _ in 0..self.scheduler.heartbeats(Duration::from_millis(10)) {
    // spawn a smoke particle
}

// ...

/// Accumulates heartbeats to be consumed on the next tick.
struct HeartbeatScheduler {
    /// Duration = Heartbeat Frequency/Intervals
    /// Instant = Last update time
    /// u8 = number of heartbeats since last update
    /// - if it's more frequent then tick rate, it could be 1 or more.
    /// - if it's less frequent then tick rate, it could be 1 or 0.
    /// - if it's equal to the tick rate, it could be between 2 and 0, due to
    /// delta time variance etc.
    timers: HashMap<Duration, (Instant, u8)>,
}

impl HeartbeatScheduler {
    /// updates the last elapsed times and elasped counts
    /// this should be called once, and only once per tick.
    pub fn maintain(&mut self) {
        for (frequency, (last_update, heartbeats)) in self.timers.iter_mut() {
            // the number of frequency cycles that have occurred.
            *heartbeats =
                (last_update.elapsed().as_secs_f32() / frequency.as_secs_f32()).floor() as u8;

            // Note: we want to preserve incomplete heartbeats, and roll them
            // over into the next tick.
            *last_update += frequency.mul_f32(*heartbeats as f32);
            // alternatively expressed as: now - remaining_partial_heartbeat_time
        }
    }

    /// returns the number of times this duration has elasped since the last
    /// tick:
    /// - if it's more frequent then tick rate, it could be 1 or more.
    /// - if it's less frequent then tick rate, it could be 1 or 0.
    /// - if it's equal to the tick rate, it could be between 2 and 0, due to
    /// delta time variance.
    pub fn heartbeats(&mut self, frequency: Duration) -> u8 {
        self.timers
            .entry(frequency)
            .or_insert_with(|| (Instant::now(), 0))
            .1
    }
}

The newly added cultist hammer by @Pfau

Another advantage to this approach is now we now have a larger and more
consistent batch of particles that can further avoid memory fragmentation (dead
particles wasting memory) by grouping even more particles with the same lifespan
together. They can then be more efficiently arranged and uploaded to GPU, by taking
advantage of bulk operations such as replacing an entire slice of memory with
another set of particles.

But the downside to this approach is that all effects with the same frequency
will be in sync and might look weird on the screen. But this is not an issue for
effects that have a fast particle spawn frequency, because it’s imperceivable to
most humans. But for particles that have slow spawn rates, such as 100ms and
above, it starts to become noticeable.

Dullahan standing tall. See you next week!

Support our devs!

Veloren Open Collective