Bad main game performance #505

Open
phobos2077 opened this issue Mar 25, 2017 · 15 comments

Comments

@phobos2077
Contributor

Map: klamall
Compiler: MSVC 2017
OS: Windows 7 x64
Resolution: 640x480 (windowed)

About 200 FPS in Release configuration and 10-15 FPS in Debug (both x64).

@Zervox
Contributor

Zervox commented Mar 28, 2017

I don't understand the "bad main game performance" part. Debug builds are usually slow no matter what when compiling in Visual Studio, since it ensures everything is initialized, among other things. I wouldn't call 200 FPS in Release bad.

That being said, if there is a part of the project that could be built separately as a static library (assuming almost all work on that part is complete, or it is project-agnostic), you could separate that code, build it as Release, and link the test code against that library to speed up those parts of the code.

@phobos2077
Contributor Author

Maybe you are right, but I definitely remember there were issues, at least with how mouse events are handled. If there are simple optimizations or improvements that could be made, they would be useful for improving the debugging experience.

@Zervox
Contributor

Zervox commented Mar 28, 2017

I guess you could try the suggestions in
http://www.bfilipek.com/2015/09/visual-studio-slow-debugging-and.html
and/or
https://blogs.msdn.microsoft.com/visualstudioalm/2015/03/03/make-debugging-faster-with-visual-studio/
Not sure whether the project generated from the makefile (or similar) has the debug heap enabled or disabled.

Not sure how the input is handled, but if low FPS causes it to skip input, I am guessing it is because events are not buffered or are not handled asynchronously from the game loop.

@phobos2077
Contributor Author

phobos2077 commented Apr 1, 2017

The debug heap allocator is disabled by default in Visual Studio. When I tried enabling it, map load time increased from ~20 to ~180 seconds, but FPS stayed at roughly the same level.
Input is handled in the main loop extremely inefficiently at the moment.

@Zervox
Contributor

Zervox commented Apr 2, 2017

So I looked into this a bit and found two settings that can help a lot:
Project settings -> Code Generation -> Runtime Library = Multi-threaded DLL (/MD)
Project settings -> Optimization -> Inline Function Expansion = Any Suitable (/Ob2)
This did not seem to get in the way of debugging.

@phobos2077
Contributor Author

Did you notice any load time or FPS improvements with these settings?

@Zervox
Contributor

Zervox commented Apr 2, 2017

I noticed a huge change in tilemap load time (for the better).
FPS went from 17-28 to 34-43.

@phobos2077
Contributor Author

I couldn't compile with the /MD flag, but enabling inline expansion increases performance dramatically. This may be related to how we abuse property accessors in all of our classes. Not sure if it's relevant, because Debug is supposed to be slow anyway. I'm concerned that 200 FPS in Release at such a tiny resolution is too low for my machine.

Issue #439 addresses how we iterate over all objects inefficiently, but I wonder if there is anything else that can explain such performance.
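
To illustrate why inline expansion can matter so much for accessor-heavy code: under /Od, MSVC defaults to /Ob0, so even trivial getters are compiled as real function calls. The following is only a minimal sketch; the `Hex` class and `sumCoordinates` function are hypothetical and not taken from the Falltergeist code base.

```cpp
#include <vector>

// Hypothetical class with trivial property accessors (illustrative only).
class Hex {
public:
    Hex(int x, int y) : _x(x), _y(y) {}
    int x() const { return _x; }
    int y() const { return _y; }
private:
    int _x;
    int _y;
};

// With inline expansion disabled (/Ob0), each x() and y() call in this loop
// is an actual function call. With /Ob2 the compiler may replace them with
// direct member reads, which is where the Debug speed-up would come from.
long long sumCoordinates(const std::vector<Hex>& hexes) {
    long long sum = 0;
    for (const Hex& hex : hexes) {
        sum += hex.x() + hex.y();
    }
    return sum;
}
```

The behavior of the code is identical either way; only the call overhead in hot loops changes.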

@Zervox
Contributor

Zervox commented Apr 11, 2017

The performance problem is due to rendering. On average 40% of all time is spent inside Sprite rendering; the highest I've seen (with lots of objects on screen) is 64%. It's around 46-48% on average without my sprite change (which is not related to drawing but to dynamic vectors). 30-36% of that is in external GL code, which suggests to me that it is caused by state changes or by rebinding a texture per object. To my knowledge the code checks whether the previous draw used the same bound texture, which saves a bit of performance when there are several of the same in a row, but the scene does not do a secondary sort to try to draw as many walls together as possible (where applicable).

The percentage depends on where you are looking: the rendering cost is a lot higher when you are looking over a lot of objects, because more texture rebinds are submitted to the GPU.

Rendering the floor and roof is fast, but objects and flat objects take a lot of time.

Edit: it would probably need some rework, but it might be worth testing whether using a depth buffer is faster. Before that, though, I think it would be well worth seeing whether something can be done to reduce texture binding, possibly even just merging sprites into a spritesheet/texture atlas so that all wall objects use the same texture for sprite rendering. That way we wouldn't need to rebind a texture for each different wall segment drawn one after another. Likewise, if barrels and boxes are on the same atlas, areas where a lot of crate and barrel sprites are stacked together have a higher chance of being drawn without a texture rebind (something like https://github.com/NicolasPerdu/TexturePacker or https://github.com/scriptum/Cheetah-Texture-Packer).
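
The "secondary sort" idea above can be sketched without any OpenGL calls by counting how many binds a draw list would trigger. This is a simplified model under stated assumptions: `DrawCommand`, `layer`, and `textureId` are hypothetical names, and reordering within a layer is assumed to be safe, which is only true where draw order between those sprites does not matter.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical draw command: `layer` encodes the draw order that must be
// preserved, `textureId` stands in for an OpenGL texture handle.
struct DrawCommand {
    int layer;
    unsigned textureId;
};

// Counts the binds issued by a renderer that skips the bind only when the
// previous draw used the same texture (the check described in the thread).
std::size_t countTextureBinds(const std::vector<DrawCommand>& commands) {
    std::size_t binds = 0;
    bool hasBound = false;
    unsigned bound = 0;
    for (const DrawCommand& cmd : commands) {
        if (!hasBound || cmd.textureId != bound) {
            bound = cmd.textureId;
            hasBound = true;
            ++binds;
        }
    }
    return binds;
}

// Secondary sort: layers stay in order, equal textures are grouped inside
// each layer so consecutive draws can share one bind.
void sortByLayerThenTexture(std::vector<DrawCommand>& commands) {
    std::stable_sort(commands.begin(), commands.end(),
        [](const DrawCommand& a, const DrawCommand& b) {
            if (a.layer != b.layer) return a.layer < b.layer;
            return a.textureId < b.textureId;
        });
}
```

A texture atlas pushes the same idea further: if many sprites share one texture, the bind count drops even when no reordering is possible.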

@phobos2077
Contributor Author

phobos2077 commented Apr 11, 2017

I think the first step in optimizing rendering is to actually render only the objects that are visible on screen. If I'm not mistaken, the whole map is currently drawn every time.
There is also an iteration over every hex (40000-something of them), while there are far fewer objects on the map than hexes. So it might be more efficient to build some kind of rendering queue for objects, while also taking care of moving objects, etc.

(The same is true for mouse event handling: right now it's extremely inefficient.)
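
One possible shape for such a rendering queue is to keep a list of object bounds and filter it against the camera rectangle each frame, instead of walking every hex. This is a rough sketch only; `Rect`, `intersects`, and `collectVisible` are hypothetical names and do not reflect the existing Falltergeist classes.

```cpp
#include <cstddef>
#include <vector>

// Axis-aligned rectangle in screen or world pixels (hypothetical type).
struct Rect {
    int x;
    int y;
    int width;
    int height;
};

// True when the two rectangles overlap; rectangles that merely touch at an
// edge are treated as not overlapping.
bool intersects(const Rect& a, const Rect& b) {
    return a.x < b.x + b.width && b.x < a.x + a.width &&
           a.y < b.y + b.height && b.y < a.y + a.height;
}

// Returns the indices of the objects whose bounds overlap the camera, i.e.
// the only ones that need to be submitted for rendering this frame.
std::vector<std::size_t> collectVisible(const std::vector<Rect>& objectBounds,
                                        const Rect& camera) {
    std::vector<std::size_t> visible;
    for (std::size_t i = 0; i < objectBounds.size(); ++i) {
        if (intersects(objectBounds[i], camera)) {
            visible.push_back(i);
        }
    }
    return visible;
}
```

Moving objects would need their bounds updated in the list whenever they change position, which is the bookkeeping cost mentioned above.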

@Zervox
Contributor

Zervox commented Apr 11, 2017

The slowest part of mouse event handling, I'd say, is that every time the mouse moves or stops to hover, it tells the mouse icon class to recreate the icon it should render. Moving this to an array of icons created when the mouse icon class is constructed makes it a lot faster (you then only need to get the cursor icon when it changes, not create it too).

Hex iteration is actually not consuming that much time.
Object rendering is far slower. I am actually profiling the project: removing floor and roof rendering improves almost nothing, while drawing objects costs a lot. The most likely reason is that all objects are transparent, combined with texture rebinding.

Moving the screen off the map so that you see mostly black shows the same thing: 150+ FPS in Debug (because there is a screen rectangle intersection test before rendering inside the object).

A simple SDL tick check shows exactly the same thing:
the entire Location::handle uses 0-1 SDL ticks per frame (updates based on mouse movement/keyboard input)
the entire Location state think() uses 0-1 SDL ticks per frame (depends on whether the camera and/or an object is moving)
rendering the floor takes 0 ticks per frame
rendering the roof takes 0 ticks per frame
rendering flat objects takes 6-7 ticks per frame // most time is spent inside the Sprite::render function
rendering objects takes 6-7 ticks per frame // most time is spent inside the Sprite::render function
Out of the entire Location render function, basically 100% of the time is spent rendering those objects, not iterating over them.
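
SDL ticks are milliseconds, so readings of 0-1 per frame are at the limit of what that clock can resolve. A finer-grained per-section measurement could look roughly like the sketch below, which swaps in `std::chrono::steady_clock` for the SDL tick counter. `SectionProfiler` and `ScopedSection` are hypothetical helper names, not part of the project.

```cpp
#include <chrono>
#include <map>
#include <string>
#include <utility>

// Accumulates elapsed time per named section (hypothetical helper).
class SectionProfiler {
public:
    using Clock = std::chrono::steady_clock;

    void add(const std::string& name, Clock::duration elapsed) {
        _totals[name] += elapsed;
    }

    bool hasSection(const std::string& name) const {
        return _totals.find(name) != _totals.end();
    }

    // Total time recorded for a section, in microseconds (0 if unknown).
    double microseconds(const std::string& name) const {
        const auto it = _totals.find(name);
        if (it == _totals.end()) return 0.0;
        return std::chrono::duration<double, std::micro>(it->second).count();
    }

private:
    std::map<std::string, Clock::duration> _totals;
};

// RAII timer: records the lifetime of the object under the given name.
class ScopedSection {
public:
    ScopedSection(SectionProfiler& profiler, std::string name)
        : _profiler(profiler), _name(std::move(name)),
          _start(SectionProfiler::Clock::now()) {}
    ~ScopedSection() {
        _profiler.add(_name, SectionProfiler::Clock::now() - _start);
    }
    ScopedSection(const ScopedSection&) = delete;
    ScopedSection& operator=(const ScopedSection&) = delete;

private:
    SectionProfiler& _profiler;
    std::string _name;
    SectionProfiler::Clock::time_point _start;
};
```

In use, each phase of the frame (handle, think, floor, roof, flat objects, objects) would be wrapped in a block containing a `ScopedSection`, and the totals printed every few hundred frames.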

@phobos2077
Contributor Author

Hm.. those are surprising results. But I remember that when I move my mouse, FPS drops by almost 50%, so I figured it was due to the iterations...

@Zervox
Contributor

Zervox commented Apr 11, 2017

This could be because, as I briefly mentioned at the beginning, when you move your mouse it tries to update its icon (and the current code recreates the icon every time its "state" changes).

I should note that you are not wrong that the mouse handling is inefficient, but it is not that inefficient.

I uploaded a branch for Input::Mouse. I know you will probably mention the use of std::map and the ugly if/elses (I was very lazy when I wrote it).
Done this way, the mouse will have all icons available from the point of creation (Falltergeist startup), instead of constantly recreating data for every minor change or update.
Input::Mouse
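
The general idea of creating every icon once and only looking it up afterwards could be sketched as follows. This is not the code from the branch: `CursorState`, `Icon`, and `MouseIcons` are hypothetical stand-ins, and the creation counter exists only to make the effect visible.

```cpp
#include <cstddef>
#include <map>
#include <string>

// Hypothetical cursor states; the real Input::Mouse has its own set.
enum class CursorState { Arrow, Hand, Wait };

// Stand-in for an icon that is expensive to create (e.g. a loaded sprite).
struct Icon {
    std::string name;
};

class MouseIcons {
public:
    // All icons are built once, at construction time.
    MouseIcons() {
        add(CursorState::Arrow, "arrow");
        add(CursorState::Hand, "hand");
        add(CursorState::Wait, "wait");
    }

    // A state change is now just a lookup; nothing is created here.
    const Icon& iconFor(CursorState state) const {
        return _icons.at(state);
    }

    std::size_t iconsCreated() const { return _created; }

private:
    void add(CursorState state, const std::string& name) {
        _icons.emplace(state, Icon{name});
        ++_created;
    }

    std::map<CursorState, Icon> _icons;
    std::size_t _created = 0;
};
```

With a small, dense set of states, a plain array indexed by the enum value would avoid the map lookup as well.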

@FakelsHub

FakelsHub commented Oct 29, 2019

Tried on an old machine, Celeron 2500 MHz.
Release build, speed optimization enabled: 5 FPS)))
On modern computers it runs acceptably, but all the same it guzzles resources as if it were a 3D game.

@AdamFx990
Contributor

Now that we're using delta time for everything, it wouldn't be too much work to add a synthetic benchmark that can be used to measure performance regressions/improvements. Perhaps via a launch parameter "--benchmark".
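
A synthetic benchmark along those lines might be structured roughly like this: detect the launch parameter, then run a fixed number of frames with a constant delta time so every run does the same simulated work. The helper names (`hasFlag`, `runBenchmark`, `BenchmarkResult`) are hypothetical, and the `frame` callback stands in for whatever the game would do per frame.

```cpp
#include <chrono>
#include <cstddef>
#include <cstring>
#include <functional>

// Returns true if `flag` appears among the command-line arguments.
bool hasFlag(int argc, const char* const* argv, const char* flag) {
    for (int i = 1; i < argc; ++i) {
        if (std::strcmp(argv[i], flag) == 0) return true;
    }
    return false;
}

struct BenchmarkResult {
    std::size_t frames;
    double averageFrameMs;
};

// Calls `frame` a fixed number of times with a constant delta time and
// reports the average wall-clock time per frame in milliseconds.
BenchmarkResult runBenchmark(std::size_t frames, float deltaSeconds,
                             const std::function<void(float)>& frame) {
    using Clock = std::chrono::steady_clock;
    const auto start = Clock::now();
    for (std::size_t i = 0; i < frames; ++i) {
        frame(deltaSeconds);
    }
    const std::chrono::duration<double, std::milli> elapsed = Clock::now() - start;
    const double average =
        frames ? elapsed.count() / static_cast<double>(frames) : 0.0;
    return {frames, average};
}
```

In `main`, this would be gated by something like `if (hasFlag(argc, argv, "--benchmark"))`, with the average frame time printed at the end so that runs before and after a change can be compared.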
