Combine draw commands to improve rendering performance #2421

Open
wants to merge 24 commits into
base: dev

douira
Contributor

@douira douira commented Apr 14, 2024

Prerequisite PR: #2352 (this PR is built on top of that branch; once that PR is merged, this PR will only have one differing commit)

This PR combines draw commands that read from adjacent vertex data. This reduces the number of draw commands by around 30% and improves FPS on my system by 16-30%, depending on the scene and circumstances. I'm on macOS with a 6900 XT. As jellysquid stated on Discord, this performance improvement likely comes from reduced CPU overhead in the driver and better GPU occupancy.
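To illustrate the idea, here is a minimal sketch of combining draw commands whose element ranges are adjacent. It is not the code from this PR: `DrawCommand`, `firstElement`, `elementCount`, and `mergeAdjacent` are hypothetical names, and the sketch assumes the commands are already in submission order and read from the same buffer.

```java
import java.util.ArrayList;
import java.util.List;

final class DrawCommandMerging {
    /** Hypothetical draw command: a contiguous run of elements in one shared buffer. */
    record DrawCommand(int firstElement, int elementCount) {
        int end() {
            return firstElement + elementCount;
        }
    }

    /**
     * Merges each command into the previous one when it starts exactly where
     * the previous one ends. Order is preserved and empty commands are dropped.
     */
    static List<DrawCommand> mergeAdjacent(List<DrawCommand> commands) {
        List<DrawCommand> merged = new ArrayList<>();
        for (DrawCommand cmd : commands) {
            if (cmd.elementCount() == 0) {
                continue; // nothing to draw
            }
            int last = merged.size() - 1;
            if (last >= 0 && merged.get(last).end() == cmd.firstElement()) {
                // Adjacent data: extend the previous command instead of adding a new one.
                DrawCommand prev = merged.get(last);
                merged.set(last, new DrawCommand(prev.firstElement(),
                        prev.elementCount() + cmd.elementCount()));
            } else {
                merged.add(cmd);
            }
        }
        return merged;
    }

    public static void main(String[] args) {
        List<DrawCommand> input = List.of(
                new DrawCommand(0, 6), new DrawCommand(6, 12),   // adjacent pair
                new DrawCommand(24, 6), new DrawCommand(30, 6)); // second adjacent pair after a gap
        List<DrawCommand> output = mergeAdjacent(input);
        System.out.println(input.size() + " commands -> " + output.size() + " commands");
    }
}
```

In this example the four input commands collapse into two, because the gap between element 18 and element 24 prevents merging across it.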

Please test whether this results in a similar improvement or some other effect, as it probably depends on the graphics card, memory bandwidth, and platform (OS, driver, vendor, etc.).

Here's a recording of the number of draw commands per pass:
ts on, before:
Draw total for pass Solid: 15531
Draw total for pass Cutout: 13277
Draw total for pass Translucent: 2298

ts on, after:
Draw total for pass Solid: 9571
Draw total for pass Cutout: 8306
Draw total for pass Translucent: 2298

ts off, before:
Draw total for pass Solid: 15531
Draw total for pass Cutout: 13277
Draw total for pass Translucent: 3812

ts off, after:
Draw total for pass Solid: 9571
Draw total for pass Cutout: 8306
Draw total for pass Translucent: 3645
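
As a quick check on the counts above, the relative reduction per pass can be computed directly. This is only a sketch of the arithmetic; `DrawCountReduction` and `reductionPercent` are hypothetical helper names, and the inputs are the figures from this one recording.

```java
import java.util.Locale;

final class DrawCountReduction {
    /** Percentage of draw commands removed, relative to the count before the patch. */
    static double reductionPercent(int before, int after) {
        return 100.0 * (before - after) / before;
    }

    public static void main(String[] args) {
        // Counts copied from the recording above.
        int solidBefore = 15531, solidAfter = 9571;
        int cutoutBefore = 13277, cutoutAfter = 8306;
        int translucentOnBefore = 2298, translucentOnAfter = 2298;
        int translucentOffBefore = 3812, translucentOffAfter = 3645;

        print("Solid", solidBefore, solidAfter);
        print("Cutout", cutoutBefore, cutoutAfter);
        print("Translucent (ts off)", translucentOffBefore, translucentOffAfter);
        print("All passes (ts on)",
                solidBefore + cutoutBefore + translucentOnBefore,
                solidAfter + cutoutAfter + translucentOnAfter);
        print("All passes (ts off)",
                solidBefore + cutoutBefore + translucentOffBefore,
                solidAfter + cutoutAfter + translucentOffAfter);
    }

    private static void print(String label, int before, int after) {
        System.out.printf(Locale.ROOT, "%s: %d -> %d (%.1f%% fewer)%n",
                label, before, after, reductionPercent(before, after));
    }
}
```

For this recording, the Solid and Cutout passes each lose roughly 37-38% of their draw commands and the total across all passes drops by about 34-35%, while the Translucent pass is unchanged with ts on and only about 4% lower with ts off.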

Here are some screenshots without and with this patch:

Screenshot 2024-04-14 at 04 29 12
Screenshot 2024-04-14 at 04 27 59
Screenshot 2024-04-14 at 04 30 51
Screenshot 2024-04-14 at 04 32 25
Screenshot 2024-04-14 at 04 48 21
Screenshot 2024-04-14 at 04 49 16

…ing distance sorting through

the detection of primary intersectors when geometry is intersecting and then sorting them in a fixed order
…iately instead of keeping them to avoid memory usage

buffer caching would be a better solution but that's complicated and doesn't currently work correctly
also removed the warning message about unpartitionable geometry as it seems to not be a relevant problem
… not recalculated when the normal is quantized.

also fixed aligned quads not receiving the more accurate center based on the average of the unique vertexes.
@douira
Contributor Author

douira commented May 8, 2024

Testing on Discord has shown that these changes can improve performance by around 35%, though the gain is highly variable and depends on the specific combination of many system and scene-related factors. There don't seem to have been any statistically significant regressions.

An even more radical optimization, which attempts to organize sections so that draw commands can also be combined across sections, did not yield useful results, but I suspect the implementation has a bug. It can be found here (link), but isn't included in this PR.

@douira
Contributor Author

douira commented May 8, 2024

I'm marking it as ready for review/merging

@douira douira marked this pull request as ready for review May 8, 2024 03:01
@jellysquid3
Member

Because the changes from #2352 have been squashed into /dev, this pull request needs to be rebased to properly isolate the relevant changes.

# Conflicts:
#	src/main/java/net/caffeinemc/mods/sodium/client/render/chunk/compile/ChunkBuildBuffers.java
@douira
Contributor Author

douira commented May 20, 2024

I think it's good to go now


2 participants