Don't recompute IR children #9915

hubertp · 2024-05-10T09:43:59Z

Pull Request Description

Profiling benchmarks revealed that we spend non-neglible amount of time building IR preorder which in turns recomputes IR nodes' children. IR is immutable so there is not need to recompute that data for every pass.
Note that the change uses lazy val because benchmarks weren't showing much improvements when allocating children on creation (with val). We could probably improve the situation by using a poor version of lazy val that does not employ synchronization.

Important Notes

Pending benchmark results.

Checklist

Please ensure that the following checklist has been satisfied before submitting the PR:

The documentation has been updated, if necessary.
Screenshots/screencasts have been attached, if there are any visual changes. For interactive or animated visual changes, a screencast is preferred.
All code follows the
Scala,
Java,
TypeScript,
and
Rust
style guides. In case you are using a language not listed above, follow the Rust style guide.
Unit tests have been written where possible.

Profiling benchmarks revealed that we spend non-neglible amount of time building IR preorder which in turns recomputes IR nodes' children. IR is immutable so there is not need to recompute that data for every pass. Note that the change uses `lazy val` because benchmarks weren't showing much improvements when allocating children on creation (with `val`). We could probably improve the situation by using a poor version of `lazy val` that does not employ synchronization.

4e6 · 2024-05-10T13:32:00Z

engine/runtime-parser/src/main/scala/org/enso/compiler/core/ir/CallArgument.scala

@@ -138,7 +138,7 @@ object CallArgument {
         |""".toSingleLine

    /** @inheritdoc */
-    override def children: List[IR] = name.toList :+ value
+    override lazy val children: List[IR] = name.toList :+ value


Not related to the change, but it should've been Vector, instead of a List

Once I heard @4e6 saying we shouldn't use List as it does an eager copy, but a lazy iterator. With List the preorder() implementation:

calls children on each one

the children implementation constructs a List

the preoder then concatenates the list to its own List representing the whole IR tree

Where can the slowdown be? Only in the construction of of children List, right? This PR tries to trade that CPU cost for increased memory consumption.

Wouldn't it be better to change the children to follow more efficient pattern and not allocate its own List?

List is a classic linked list https://github.com/scala/scala/blob/e858de50d51e13d225ac5b35015aac54cd71e23e/src/library/scala/collection/immutable/List.scala#L655-L660
It is suitable for implementing control flows (where you can efficiently pattern-match on head and tail), and maybe in prepending elements https://august.nagro.us/list-vs-vector.html. But it is not how we're using it.

Vector is a radix-balanced finger tree backed by arrays and should be more efficient in mixed use-cases (like ours append+traverse) both CPU and memory wise scala/scala#8534

elements https://august.nagro.us/list-vs-vector.html

Our usecase is "Construction with Append + Traversals" where Vector is faster. Possibly even better with children(addTo : Vector.Builder) : Unit.

Up to you guys, I don't want to be expert on Scala data structures.

I don't want to be expert on Scala data structures.

I continue to believe the best might be to have a lazy iterator like in this case of iterating all children of a directory - e.g. a queue where each nextElement() adds its children to the end of the queue.

JaroslavTulach · 2024-05-13T05:14:40Z

Profiling benchmarks revealed that we spend non-neglible amount of time building IR preorder which in turns recomputes IR nodes' children.

I believe I have also seen preorder to occupy a lot of time when profiling the static compilation.

IR is immutable so there is not need to recompute that data for every pass.

Can't we just cache the result of preorder()? E.g. make it lazy val?

Note that the change uses lazy val because benchmarks weren't showing much improvements when allocating children on creation (with val).

Can that be explained by any other hypothesis than: We do create some IR instances that we don't iterate thru with preorder?

We could probably improve the situation by using a poor version of lazy val that does not employ synchronization.

I don't understand this sentence.

engine/runtime-parser/src/main/scala/org/enso/compiler/core/ir/Name.scala

JaroslavTulach · 2024-05-13T05:24:03Z

What's the effect on benchmarks?

hubertp · 2024-05-13T08:00:08Z

What's the effect on benchmarks?

I ran the benchmarks and locally I was seeing minor improvements but on CI not so much. So I'm wondering if it is worth integrating. Unless I'm doing it wrong.

hubertp · 2024-05-13T08:53:56Z

IR is immutable so there is not need to recompute that data for every pass.

Can't we just cache the result of preorder()? E.g. make it lazy val?

Note that children is used in ChangesetBuilder. We could do both.

Note that the change uses lazy val because benchmarks weren't showing much improvements when allocating children on creation (with val).

Can that be explained by any other hypothesis than: We do create some IR instances that we don't iterate thru with preorder?

Maybe.

We could probably improve the situation by using a poor version of lazy val that does not employ synchronization.

I don't understand this sentence.

I meant that if this is a hotspot and lazy val behind the scenes does some synchronization, it is not really necessary. On the other hand it might be a premature optimization.

hubertp added the CI: No changelog needed Do not require a changelog entry for this PR. label May 10, 2024

hubertp requested review from 4e6, JaroslavTulach and Akirathan as code owners May 10, 2024 09:44

4e6 approved these changes May 10, 2024

View reviewed changes

JaroslavTulach reviewed May 13, 2024

View reviewed changes

engine/runtime-parser/src/main/scala/org/enso/compiler/core/ir/Name.scala Show resolved Hide resolved

This was referenced May 13, 2024

Cannot create a project in cloud #9789

Open

Investigate compilation of Standard.Base library with Native Image #9774

Closed

JaroslavTulach approved these changes May 14, 2024

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Don't recompute IR children #9915

Don't recompute IR children #9915

hubertp commented May 10, 2024

4e6 May 10, 2024

JaroslavTulach May 13, 2024

4e6 May 13, 2024

JaroslavTulach May 14, 2024

JaroslavTulach May 14, 2024

JaroslavTulach commented May 13, 2024

JaroslavTulach commented May 13, 2024

hubertp commented May 13, 2024

hubertp commented May 13, 2024

Don't recompute IR children #9915

Are you sure you want to change the base?

Don't recompute IR children #9915

Conversation

hubertp commented May 10, 2024

Pull Request Description

Important Notes

Checklist

4e6 May 10, 2024

Choose a reason for hiding this comment

JaroslavTulach May 13, 2024

Choose a reason for hiding this comment

4e6 May 13, 2024

Choose a reason for hiding this comment

JaroslavTulach May 14, 2024

Choose a reason for hiding this comment

JaroslavTulach May 14, 2024

Choose a reason for hiding this comment

JaroslavTulach commented May 13, 2024

JaroslavTulach commented May 13, 2024

hubertp commented May 13, 2024

hubertp commented May 13, 2024