Bunch of microoptimizations #1600

olefirenque · 2024-02-01T21:48:20Z

No description provided.


olefirenque · 2024-02-01T21:56:02Z

src/xrGame/cover_manager.cpp


+    // avoiding extra allocations with a static storage for m_covers
+    static xr_vector<std::optional<CCoverPoint>> quadTreeStaticStorage;


I'm not sure this is a good trade-off between the number of allocations and RSS. It would be nice to hear opinions on this.

375gnu · 2024-02-16

Using global or static objects that allocate memory dynamically is not recommended.
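To make the trade-off concrete, here is a minimal sketch of a reused static buffer. The names `scratch_storage` and `fill_scratch` are hypothetical and are not taken from the PR; the point is only that `clear()` keeps the capacity, so later calls avoid reallocation while the peak allocation stays resident until program exit.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical scratch buffer: a function-local static that is allocated
// on first use and released only during static destruction.
std::vector<int>& scratch_storage()
{
    static std::vector<int> storage;
    return storage;
}

// Hypothetical stand-in for the per-call work: refills the scratch buffer.
// Not safe to call from several threads at once.
std::size_t fill_scratch(std::size_t count)
{
    auto& storage = scratch_storage();
    storage.clear();       // destroys the elements, keeps the capacity
    storage.resize(count); // no reallocation while count <= capacity
    return storage.size();
}
```

After a large call followed by a small one, the buffer still holds the large capacity, which is exactly the RSS cost being asked about.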

olefirenque · 2024-02-01T22:01:23Z

src/xrGame/space_restrictor.cpp

    if (!actual())
-        prepare();
+    {
+        static std::mutex prepareMutex;
+        std::lock_guard lock(prepareMutex);
+
+        // Double-checked locking
+        if (!actual())
+            prepare();
+    }


I believe this was the only non-concurrent part for xr_parallel_for in CSpaceRestrictionShape::fill_shape. Since prepare() is called only once for a whole sequence of CSpaceRestrictor::inside calls, it might be better to extract it from inside() to reduce contention.
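For reference, a minimal sketch of double-checked locking in portable C++. `LazyShape` and its members are hypothetical; the thread does not show whether `actual()` reads an atomic, so the atomic flag here is an assumption. Without it, the unlocked first check would be a data race.

```cpp
#include <atomic>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical stand-in for an object that is prepared lazily.
class LazyShape
{
public:
    void ensure_prepared()
    {
        // Fast path: no lock once the object is prepared.
        if (m_actual.load(std::memory_order_acquire))
            return;

        std::lock_guard<std::mutex> lock(m_mutex);
        // Second check: another thread may have prepared it meanwhile.
        if (!m_actual.load(std::memory_order_relaxed))
        {
            ++m_prepare_calls; // stands in for the real prepare() work
            m_actual.store(true, std::memory_order_release);
        }
    }

    int prepare_calls() const { return m_prepare_calls; }

private:
    std::atomic<bool> m_actual{false};
    std::mutex m_mutex;
    int m_prepare_calls = 0;
};

// Calls ensure_prepared() from several threads and reports how many
// times the preparation work actually ran.
int run_concurrent_prepare(unsigned thread_count)
{
    LazyShape shape;
    std::vector<std::thread> workers;
    for (unsigned i = 0; i < thread_count; ++i)
        workers.emplace_back([&shape] { shape.ensure_prepared(); });
    for (auto& worker : workers)
        worker.join();
    return shape.prepare_calls();
}
```

Calling the preparation once before the parallel loop starts would remove the lock from the hot path entirely, which is the alternative suggested above.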

olefirenque · 2024-02-01T22:02:28Z

src/xrGame/cover_manager.h

-    xr_vector<bool> m_temp;
+    // vector<bool> is not applicable for `m_temp`
+    // since it is filled in parallel_for (https://timsong-cpp.github.io/cppwp/container.requirements.dataraces).
+    xr_vector<int> m_temp;


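I suppose this wasn't thread-safe before, since vector<bool> packs several elements into one word and parallel writes to distinct elements can race.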
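A small sketch of the pattern that the switch to `xr_vector<int>` makes safe, written against plain `std::vector` and `std::thread` rather than the engine's `xr_parallel_for` (the function name and the striding scheme are assumptions for illustration). Each thread writes only to its own elements, which is race-free for `int` elements but not for `vector<bool>`.

```cpp
#include <cstddef>
#include <thread>
#include <vector>

// Fills a flag array from several threads. Every index is written by
// exactly one thread, so no synchronization is needed for vector<int>.
std::vector<int> mark_even_indices(std::size_t count, unsigned thread_count)
{
    std::vector<int> flags(count, 0);
    std::vector<std::thread> workers;
    for (unsigned t = 0; t < thread_count; ++t)
    {
        workers.emplace_back([&flags, t, thread_count, count] {
            // Thread t handles indices t, t + thread_count, ...
            for (std::size_t i = t; i < count; i += thread_count)
                flags[i] = (i % 2 == 0) ? 1 : 0;
        });
    }
    for (auto& worker : workers)
        worker.join();
    return flags;
}
```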

olefirenque · 2024-02-01T22:06:44Z

src/xrGame/cover_manager.cpp

-        if (m_temp[i] && critical_cover(i))
-            m_covers->insert(xr_new<CCoverPoint>(ai().level_graph().vertex_position(ai().level_graph().vertex(i)), i));
+    for (auto &p : quadTreeStaticStorage)
+        if (p.has_value())


Accessing m_temp and calling critical_cover seemed to take a noticeable amount of time here:

    for (u32 i = 0; i < levelVertexCount; ++i)
        if (m_temp[i] && critical_cover(i))
            m_covers->insert(xr_new<CCoverPoint>(ai().level_graph().vertex_position(ai().level_graph().vertex(i)), i));
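As a rough illustration of the replacement approach, the sketch below evaluates the expensive predicate in parallel into a vector of `std::optional` slots and leaves only a cheap `has_value()` scan for the serial part. `CoverCandidate`, `is_cover`, and `collect_covers` are hypothetical names, and plain threads stand in for the engine's parallel loop.

```cpp
#include <cstddef>
#include <optional>
#include <thread>
#include <vector>

// Hypothetical stand-in for a cover point.
struct CoverCandidate
{
    std::size_t vertex_id;
};

// Hypothetical stand-in for the expensive per-vertex check.
bool is_cover(std::size_t vertex_id)
{
    return vertex_id % 3 == 0;
}

std::vector<CoverCandidate> collect_covers(std::size_t vertex_count, unsigned thread_count)
{
    // One slot per vertex; each slot is written by exactly one thread.
    std::vector<std::optional<CoverCandidate>> slots(vertex_count);
    std::vector<std::thread> workers;
    for (unsigned t = 0; t < thread_count; ++t)
    {
        workers.emplace_back([&slots, t, thread_count, vertex_count] {
            for (std::size_t i = t; i < vertex_count; i += thread_count)
                if (is_cover(i))
                    slots[i] = CoverCandidate{i};
        });
    }
    for (auto& worker : workers)
        worker.join();

    // Serial pass: only filled slots are inserted into the result.
    std::vector<CoverCandidate> covers;
    for (const auto& slot : slots)
        if (slot.has_value())
            covers.push_back(*slot);
    return covers;
}
```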

src/Layers/xrRender_R2/r2_R_lights.cpp

Xottab-DUTY

A small initial review. I haven't checked the changes in xrGame yet.
Please fix the code style there as per the review comments :)

src/xrAICore/Navigation/level_graph_inline.h

src/Layers/xrRender_R2/r2_R_lights.cpp

src/xrGame/space_restriction_shape.cpp

olefirenque · 2024-02-02T00:26:12Z

src/xrGame/space_restriction_shape.cpp

+        {
+            if (inside(graph.vertex_id(&vertex), true) &&
+                !inside(graph.vertex_id(&vertex), false))
+                m_border_chunk.push_back(graph.vertex_id(&vertex));
+        }
+        std::lock_guard lock(mergeMutex);
+        if (m_border.capacity() < m_border.size() + m_border_chunk.size())
+            m_border.reserve(m_border.size() + m_border_chunk.size());
+        for (auto x : m_border_chunk)
+            m_border.push_back(x);


I looked into this a bit, and it seems to me that the body of CBorderMergePredicate::operator(...) is a bigger problem than merging a number of chunks under a lock.
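The chunk-then-merge pattern from the diff can be sketched in isolation as follows. This is an assumption-laden illustration: `on_border` and `collect_border` are hypothetical, plain threads replace the engine's task system, and a range `insert` stands in for the manual `reserve` plus `push_back` loop.

```cpp
#include <algorithm>
#include <cstddef>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical stand-in for the "inside with border, but not without" test.
bool on_border(std::size_t vertex_id)
{
    return vertex_id % 4 == 0;
}

std::vector<std::size_t> collect_border(std::size_t vertex_count, unsigned thread_count)
{
    std::vector<std::size_t> border;
    std::mutex merge_mutex;
    std::vector<std::thread> workers;
    for (unsigned t = 0; t < thread_count; ++t)
    {
        workers.emplace_back([&, t] {
            // Each thread fills a private chunk without any locking.
            std::vector<std::size_t> chunk;
            for (std::size_t i = t; i < vertex_count; i += thread_count)
                if (on_border(i))
                    chunk.push_back(i);

            // The lock is held only for one bulk append per thread.
            std::lock_guard<std::mutex> lock(merge_mutex);
            border.insert(border.end(), chunk.begin(), chunk.end());
        });
    }
    for (auto& worker : workers)
        worker.join();

    // Merge order depends on scheduling, so sort for a stable result.
    std::sort(border.begin(), border.end());
    return border;
}
```

With one append per thread, the time spent under the lock is usually small compared with the per-vertex checks, which fits the observation that the merge itself is not the main cost.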

OK, this might be irrelevant, since I accidentally used the UI Freeze option.
Sorry for the misinformation. Apparently, this flame graph contains only the samples taken while NPC spawning was lagging.

The UI Freeze event indicates time intervals where the application was unable to respond to user input. More specifically, these are time intervals where window messages were not pumped for more than 200 ms or processing of a particular message took more than 200 ms.

Xottab-DUTY · 2024-06-02

@olefirenque, what profiler did you use?

AMS21 · 2024-06-02

To me, this strongly looks like Tracy, but I might be wrong.

src/Layers/xrRender_R2/r2_R_lights.cpp

src/Layers/xrRenderDX11/dx11SH_Texture.cpp

Xottab-DUTY · 2024-05-23T09:14:39Z

@olefirenque, hi! Sorry for the long delay in reviewing and merging!
I've made big improvements to the task manager: it now supports lambdas with captures.
Could you rebase this PR to accommodate the changes?

olefirenque added 3 commits February 2, 2024 00:51

xrRender_R2/r2_R_lights.cpp: get rid of indexed erasing in render_lights

23e6928

xrGame/cover_manager.cpp: optimize allocations, refactor compute_stat…

ff3fac7

…ic_cover

xrGame/space_restriction_shape.h: parallelize iterate_vertices in fil…

b846569

…l_shape

olefirenque force-pushed the bunch-of-microoptimizations branch from ca140d7 to b846569 Compare February 1, 2024 21:51

olefirenque commented Feb 1, 2024

src/Layers/xrRender_R2/r2_R_lights.cpp (outdated, resolved)

Xottab-DUTY reviewed Feb 1, 2024

Xottab-DUTY added the Enhancement, Renderer, and AI (Artificial Intelligence) labels Feb 1, 2024

fix codestyle issues

39c3f31

Xottab-DUTY reviewed Feb 1, 2024

src/xrGame/space_restriction_shape.cpp (outdated, resolved; three threads)

olefirenque force-pushed the bunch-of-microoptimizations branch from baf96ec to ec99fb2 Compare February 1, 2024 23:49

fix codestyle issues 2

d9da42f

olefirenque force-pushed the bunch-of-microoptimizations branch from ec99fb2 to d9da42f Compare February 2, 2024 00:08

olefirenque commented Feb 2, 2024

olefirenque added 2 commits February 3, 2024 14:52

fix codestyle issues 3

525d793

Layers/xrRenderDX11/dx11SH_Texture.cpp: aggregate CreateShaderResourc…

6f1c24d

…eView calls

Xottab-DUTY reviewed Feb 3, 2024

src/Layers/xrRender_R2/r2_R_lights.cpp (outdated, resolved)

Xottab-DUTY requested a review from vTurbine February 3, 2024 14:56

olefirenque force-pushed the bunch-of-microoptimizations branch from df2b8ae to 6f1c24d Compare February 3, 2024 15:06

olefirenque commented Feb 3, 2024

src/Layers/xrRenderDX11/dx11SH_Texture.cpp (outdated, resolved)

revert dx11SH_Texture.cpp changes, fix spacing

95534b8

Xottab-DUTY force-pushed the dev branch from b0ba60a to 289f78b Compare February 7, 2024 17:13

Xottab-DUTY force-pushed the dev branch 2 times, most recently from 5b2ec76 to 6fffce9 Compare May 4, 2024 03:27


olefirenque commented Feb 1, 2024
olefirenque Feb 1, 2024 (edited)
375gnu Feb 16, 2024
olefirenque Feb 1, 2024 (edited)
olefirenque Feb 1, 2024
olefirenque Feb 1, 2024 (edited)
Xottab-DUTY left a comment
olefirenque Feb 2, 2024 (edited)
olefirenque Feb 2, 2024
Xottab-DUTY Jun 2, 2024
AMS21 Jun 2, 2024
Xottab-DUTY commented May 23, 2024 (edited)

