Improve parameterization algorithm for SV modules/classes (#4497) #4629

donlon · 2023-10-26T18:05:43Z

Requires: #4581, #4582
Probably fixes: #3875, #4013

Backgrounds

#4497 fails because uvm_sequencer#(uvm_sequence_item) and uvm_sequencer#(uvm_sequence_item, uvm_sequence_item) are treated as different parameterization of class uvm_sequencer, but they're actually the same thing. There're some issues about the previous algorithm, for example

It only processes the overridden part of the parameters, other parameters (may be an expression depends on other parameters) are not evaluated.
So references with different number of parameter pins can't be matched to the same parameterization instance.
It didn't re-evaluated overridden values under the context of the target module, even if the values are already constant.
Sometime values should be truncated/expended after assigned to module, based on the variable type. E.g. Cls#(1024) and Cls#(0) should refer to the same parameterization, where Cls is

class Cls #(bit [7:0] A);
endclass

It didn't propagate parameters/parameter types through parameter list when evaluating the values. The types could be different in different instances and affect constant folding. E.g.

typedef struct packed {int a;} S1;
typedef struct packed {logic [3:0] b;} S2;
class Cls #(type T, T VAL);
endclass
// and
Cls #(logic[3:0], 100) obj1;
Cls #(S1, '{a: 10}) obj2;
Cls #(S2, '{b: 20}) obj3;

The new algorithm

To solve problems mentioned above and handle the general case, we have to substitute the overridden values into a copy of the original module, and then evaluate all of the parameters one by one (using V3Const::constifyParamsEdit). Then we can collect all of the evaluated constants (into a parameter set) and lookup the parameterization instance previously created with the same parameter set, then it's possible to match an exact instance for the reference.

New algorithm searches parameterization instances in a hash map, with the calculated parameter set. Instead of using longer string name that represents different parameterization before, it's expected to have higher efficiency.

However, deep cloning a module is expensive. To accelerate the procedure, it's possible to cache the instance by adding mapping from only the overridden part of the parameter set. If later reference has the same parameter overrides as before, it can find the instance with only the overridden part of the parameter set, and avoid another deep clone (See ModInfo::insertOverriddenParamSet and ModInfo::findNodeWithOverriddenParamSet in V3Param.cpp).

For hierarchical blocks, searching the library wrapper is now using the same routine as normal modules, so it should be more clear.

And I'm not sure whether constant folding can be performed without cloning a clean module (using something like V3Simulate?), if it's possible maybe I can optimize it in the future.

TODOs/Not yet solved issues

It's almost done for the new deparam algorithm! But there's still some problems need to be fixed.

Add more tests covering all cases mentioned above
Fix top-level interface port used in lint-only mode #4582
Testcase t_hier_block_trace* fails. It generates fst/vcd files with different structure than before.
It seem that previously, cells without parameter pins failed to match the correct (the library version of) parameterized hierarchical blocks, instead it match the original one. Probably it's needed to be fixed in current codebase first. Fixed by Fix linking parameterized hierarchical blocks and recursive hierarchical blocks #4654
During constant folding, some expressions result in numbers with wrong signedness. E.g. the value in parameter int A = int'(10.00) is converted into an unsigned number, and sometimes it causes instances mismatch. Currently, only the V3Number field of AstConst is compared when searching an instance (the signedness is ignored when comparing/hashing V3Numbers(?)), as a workaround. (link)
During constant folding, V3Width may move some datatypes in the tree to the global type table, and the nodes may contain reference to a class removed during parameterization, which causes 'Broken link' error later. It's very similar to Internal error Broken link in node #3875. I think the best way to handle such problems is that keep the unused classes but mark them as 'dead', and garbage-collect all unused types and classes/modules in next V3Dead. I tried this (using function here) but it causes more problems (e.g. child nodes of AstBracketArrayDType are removed, and the algorithm is not perfect, likes AstClassRefDTypes inside dead module does contribute to reference count of the class). Probably we have to improve the algorithm (instead of using reference counters, we can perform 'connectivity check' through the tree, and then remove the 'disconnected' nodes). I think I need more advice on this.

Performance and memory usage

It's a little bit better than before. For t_uvm_todo:

New:

Stage, Elapsed time (sec), 008_param         0.048460
Stage, Memory (MB), 008_param               95.632812

Old: (cf6e362)

Stage, Elapsed time (sec), 008_param         0.105867
Stage, Memory (MB), 008_param              140.031250

do not process mod without parameters skip parameterization for some ClassOrPackageRef do not delete nodes with VNDeleter::pushUnlinkDeletep

simple fix for interface add dumper for BaseModInfo and fix the destructor add recent fixes

donlon · 2023-10-26T18:13:50Z

test_regress/t/t_class_extends_aliased_real_bad.out

@@ -1,5 +1,4 @@
 %Error: t/t_class_extends_aliased_real_bad.v:14:10: Attempting to extend using non-class
-                                                  : ... note: In instance 't'


Some instance hints disappeared because the node of pointed type has been moved to the type table during constant folding, but the instance name is from the containing module back-searched. If the type is inside type table, the module 't' can not be found.
I think it can be improved by providing the processing state to the error builder by some other means, so the accurate error context can be determined.

src/V3Param.cpp

donlon · 2023-10-26T18:26:05Z

src/V3Dead.cpp

+void V3Dead::deadifyModuleDTypes(AstNetlist* nodep) {
+    UINFO(2, __FUNCTION__ << ": " << endl);
+    {
+        DeadVisitor{nodep, false, true, false, false, !v3Global.opt.topIfacesSupported()};


Failed to garbage-collect all unnecessary types and modules together :( Maybe can improve the algorithm first? Probably related to #2227

wsnyder · 2023-10-27T12:22:21Z

To accelerate the procedure, it's possible to cache the instance by adding mapping from only the overridden part of the parameter set. If later reference has the same parameter overrides as before, it can find the instance with only the overridden part of the parameter set, and avoid another deep clone

Note it's critical for C-compiler and later performance that we have multiple cells with same parameter set share a cloned module. For example some FPGAs use parameters to customize a LUT and will have 10K parameterized cells with identical final parameters.

wsnyder

Thanks for working on this, I agree this seems the right track for these fixes.

src/V3AstNodes.cpp

src/V3Broken.cpp

src/Verilator.cpp

test_regress/t/t_hier_block_sc_trace_fst.pl

test_regress/t/t_interface_localparam.v

test_regress/t/t_xml_first.out

…go-wip-staging Conflicts: src/V3Param.cpp test_regress/t/t_hier_block_sc_trace_fst.out test_regress/t/t_hier_block_sc_trace_vcd.out test_regress/t/t_hier_block_trace_fst.out test_regress/t/t_hier_block_trace_vcd.out

donlon · 2023-11-08T17:58:49Z

Hi, I just saw that you mentioned V3Dead somewhere else (#4646 (comment)) so I wonder if you have started coding?
I was also digging into it and conceived some immature ideas, perhaps I can open an issue for discussion?

wsnyder · 2023-11-09T00:57:48Z

I have a new V3Dead all coded, just in debug phase which I underestimated!

donlon · 2023-11-09T09:41:51Z

I have a new V3Dead all coded, just in debug phase

Thank you! Looking forward to the improvements.

which I underestimated!

My humble understanding about its complexity is that you need to handle references from every pointer fields, and the dependency relationship between nodes is also complex (nodes can depend on one node or one node and all its children, while all its parents should be kept). Especially if we need to provide options to remove specific types of the nodes (e.g. module, dtype, scope, ...).

em2machine · 2023-12-01T16:02:54Z

This looks like would help #4391 and similar.

donlon · 2023-12-01T17:51:56Z

This looks like would help #4391 and similar.

I think #4659 is more likely to fix it. If we process modules in the order of source code, the xxx_class#(Cfg0) can be parameterized before using cmd_tag_t (in #4391) so the field can be resolved correctly. But #4659 indeed depends on this pull.

Hopping that it can also fix #4391 :)

wsnyder · 2023-12-01T18:06:16Z

If we process modules in the order of source code

Unfortunately source code order is not sufficient, and changing to that will break other cases. We need to implement following IEEE 1800 3.12 and 23.10.4 (and perhaps others) to do better.

donlon · 2023-12-02T16:08:36Z

Unfortunately source code order is not sufficient, and changing to that will break other cases

Are they cases containing hierarchical names?

So I think actually we do need to support performing elaboration out of order, as hierarchical names (and $root) can reference objects after itself (however merely an identifier without any dots can not, so for this source code order is good).

Maybe the description for #4659 is not accurate, as the 'elaboration' process defined in the LRM is split into multiple stages in Verilator, but #4659 actually affects only the order of ParamProcessor::nodeDeparam called in V3Param, and other stages are not changed, so it won't break cases related to hierarchical names.

For hierarchical names and, moreover, upwards referencing, they do add difficulties and it still need further improvements to make the whole stages perfect (especially when used during parameter resolution). But I think #4659 is almost sufficient for #4391.

Here's the new algorithm I implemented (https://github.com/donlon/verilator/commits/wip/elab-order-1), and the key changes are here (donlon@b92d2cc). Luckily I find it doesn't break anything (the CI failures are due to this pull, which it depends on).

For 23.10.4 it looks like a way to support arbitrary defparams? (which we probably tend not to provide good support for?) And it's also difficult to be supported with current algorithm...

donlon and others added 12 commits October 26, 2023 14:19

clean cellInterfaceCleanup in V3Param

53e79ef

fix error on interface sel

795ec6c

simple rewrites

1cec1c2

do not use AstClass::isParameterized which is inaccurate and redundant

d80b7b8

remove this->

0f5020c

always copy original AstClass nodes for parameterization

0a65275

do not process mod without parameters skip parameterization for some ClassOrPackageRef do not delete nodes with VNDeleter::pushUnlinkDeletep

stash the deparam algorithm changes

0f015f7

simple fix for interface add dumper for BaseModInfo and fix the destructor add recent fixes

squash recent changes

3f6e8b7

revert something

69c826a

fix lint-only mode with top-level interfaces (temporary fix)

da9bcd0

Apply 'make format'

91a2616

fix compiler error

164445d

donlon marked this pull request as draft October 26, 2023 18:05

donlon commented Oct 26, 2023

View reviewed changes

wsnyder reviewed Oct 27, 2023

View reviewed changes

donlon added 7 commits October 27, 2023 14:56

update

99550aa

update tests for hierarchical blocks

27068d2

fix using VLT files for recursive modules

da20584

update

033f983

temporarily fix for recursive hier blocks

b7eab0e

merge map for full param sets and overridden param sets

983fd32

use hash-based module name for hierarchical blocks

ee6a3b2

donlon mentioned this pull request Nov 2, 2023

Question about modules level and elaboration order #4659

Closed

donlon added 2 commits November 5, 2023 17:22

some fixes

439f6e1

donlon mentioned this pull request Jan 4, 2024

Fix localparam elaboration (#3858) #4794

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Improve parameterization algorithm for SV modules/classes (#4497) #4629

Improve parameterization algorithm for SV modules/classes (#4497) #4629

donlon commented Oct 26, 2023 •

edited

donlon Oct 26, 2023

donlon Oct 26, 2023

wsnyder commented Oct 27, 2023

wsnyder left a comment

donlon commented Nov 8, 2023

wsnyder commented Nov 9, 2023

donlon commented Nov 9, 2023

em2machine commented Dec 1, 2023

donlon commented Dec 1, 2023

wsnyder commented Dec 1, 2023 •

edited

donlon commented Dec 2, 2023 •

edited

		@@ -1,5 +1,4 @@
		%Error: t/t_class_extends_aliased_real_bad.v:14:10: Attempting to extend using non-class
		: ... note: In instance 't'

Improve parameterization algorithm for SV modules/classes (#4497) #4629

Are you sure you want to change the base?

Improve parameterization algorithm for SV modules/classes (#4497) #4629

Conversation

donlon commented Oct 26, 2023 • edited

Backgrounds

The new algorithm

TODOs/Not yet solved issues

Performance and memory usage

donlon Oct 26, 2023

Choose a reason for hiding this comment

donlon Oct 26, 2023

Choose a reason for hiding this comment

wsnyder commented Oct 27, 2023

wsnyder left a comment

Choose a reason for hiding this comment

donlon commented Nov 8, 2023

wsnyder commented Nov 9, 2023

donlon commented Nov 9, 2023

em2machine commented Dec 1, 2023

donlon commented Dec 1, 2023

wsnyder commented Dec 1, 2023 • edited

donlon commented Dec 2, 2023 • edited

donlon commented Oct 26, 2023 •

edited

wsnyder commented Dec 1, 2023 •

edited

donlon commented Dec 2, 2023 •

edited