-
Notifications
You must be signed in to change notification settings - Fork 544
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improve parameterization algorithm for SV modules/classes (#4497) #4629
base: master
Are you sure you want to change the base?
Conversation
do not process mod without parameters skip parameterization for some ClassOrPackageRef do not delete nodes with VNDeleter::pushUnlinkDeletep
simple fix for interface add dumper for BaseModInfo and fix the destructor add recent fixes
@@ -1,5 +1,4 @@ | |||
%Error: t/t_class_extends_aliased_real_bad.v:14:10: Attempting to extend using non-class | |||
: ... note: In instance 't' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some instance hints disappeared because the node of pointed type has been moved to the type table during constant folding, but the instance name is from the containing module back-searched. If the type is inside type table, the module 't' can not be found.
I think it can be improved by providing the processing state to the error builder by some other means, so the accurate error context can be determined.
void V3Dead::deadifyModuleDTypes(AstNetlist* nodep) { | ||
UINFO(2, __FUNCTION__ << ": " << endl); | ||
{ | ||
DeadVisitor{nodep, false, true, false, false, !v3Global.opt.topIfacesSupported()}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Failed to garbage-collect all unnecessary types and modules together :( Maybe can improve the algorithm first? Probably related to #2227
Note it's critical for C-compiler and later performance that we have multiple cells with same parameter set share a cloned module. For example some FPGAs use parameters to customize a LUT and will have 10K parameterized cells with identical final parameters. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for working on this, I agree this seems the right track for these fixes.
…go-wip-staging Conflicts: src/V3Param.cpp test_regress/t/t_hier_block_sc_trace_fst.out test_regress/t/t_hier_block_sc_trace_vcd.out test_regress/t/t_hier_block_trace_fst.out test_regress/t/t_hier_block_trace_vcd.out
Hi, I just saw that you mentioned V3Dead somewhere else (#4646 (comment)) so I wonder if you have started coding? |
I have a new V3Dead all coded, just in debug phase which I underestimated! |
Thank you! Looking forward to the improvements.
My humble understanding about its complexity is that you need to handle references from every pointer fields, and the dependency relationship between nodes is also complex (nodes can depend on one node or one node and all its children, while all its parents should be kept). Especially if we need to provide options to remove specific types of the nodes (e.g. module, dtype, scope, ...). |
This looks like would help #4391 and similar. |
I think #4659 is more likely to fix it. If we process modules in the order of source code, the xxx_class#(Cfg0) can be parameterized before using cmd_tag_t (in #4391) so the field can be resolved correctly. But #4659 indeed depends on this pull. Hopping that it can also fix #4391 :) |
Unfortunately source code order is not sufficient, and changing to that will break other cases. We need to implement following IEEE 1800 3.12 and 23.10.4 (and perhaps others) to do better. |
Are they cases containing hierarchical names? So I think actually we do need to support performing elaboration out of order, as hierarchical names (and Maybe the description for #4659 is not accurate, as the 'elaboration' process defined in the LRM is split into multiple stages in Verilator, but #4659 actually affects only the order of For hierarchical names and, moreover, upwards referencing, they do add difficulties and it still need further improvements to make the whole stages perfect (especially when used during parameter resolution). But I think #4659 is almost sufficient for #4391. Here's the new algorithm I implemented (https://github.com/donlon/verilator/commits/wip/elab-order-1), and the key changes are here (donlon@b92d2cc). Luckily I find it doesn't break anything (the CI failures are due to this pull, which it depends on). For 23.10.4 it looks like a way to support arbitrary defparams? (which we probably tend not to provide good support for?) And it's also difficult to be supported with current algorithm... |
Fixes #4497
Fixes #3858
Requires: #4581, #4582
Probably fixes: #3875, #4013
Backgrounds
#4497 fails because
uvm_sequencer#(uvm_sequence_item)
anduvm_sequencer#(uvm_sequence_item, uvm_sequence_item)
are treated as different parameterization of class uvm_sequencer, but they're actually the same thing. There're some issues about the previous algorithm, for exampleIt only processes the overridden part of the parameters, other parameters (may be an expression depends on other parameters) are not evaluated.
So references with different number of parameter pins can't be matched to the same parameterization instance.
It didn't re-evaluated overridden values under the context of the target module, even if the values are already constant.
Sometime values should be truncated/expended after assigned to module, based on the variable type. E.g.
Cls#(1024)
andCls#(0)
should refer to the same parameterization, where Cls isThe new algorithm
To solve problems mentioned above and handle the general case, we have to substitute the overridden values into a copy of the original module, and then evaluate all of the parameters one by one (using
V3Const::constifyParamsEdit
). Then we can collect all of the evaluated constants (into a parameter set) and lookup the parameterization instance previously created with the same parameter set, then it's possible to match an exact instance for the reference.New algorithm searches parameterization instances in a hash map, with the calculated parameter set. Instead of using longer string name that represents different parameterization before, it's expected to have higher efficiency.
However, deep cloning a module is expensive. To accelerate the procedure, it's possible to cache the instance by adding mapping from only the overridden part of the parameter set. If later reference has the same parameter overrides as before, it can find the instance with only the overridden part of the parameter set, and avoid another deep clone (See
ModInfo::insertOverriddenParamSet
andModInfo::findNodeWithOverriddenParamSet
in V3Param.cpp).For hierarchical blocks, searching the library wrapper is now using the same routine as normal modules, so it should be more clear.
And I'm not sure whether constant folding can be performed without cloning a clean module (using something like V3Simulate?), if it's possible maybe I can optimize it in the future.
TODOs/Not yet solved issues
It's almost done for the new deparam algorithm! But there's still some problems need to be fixed.
Add more tests covering all cases mentioned above
Fix top-level interface port used in lint-only mode #4582
Testcase t_hier_block_trace* fails. It generates fst/vcd files with different structure than before.Fixed by Fix linking parameterized hierarchical blocks and recursive hierarchical blocks #4654It seem that previously, cells without parameter pins failed to match the correct (the library version of) parameterized hierarchical blocks, instead it match the original one. Probably it's needed to be fixed in current codebase first.
During constant folding, some expressions result in numbers with wrong signedness. E.g. the value in
parameter int A = int'(10.00)
is converted into an unsigned number, and sometimes it causes instances mismatch. Currently, only the V3Number field of AstConst is compared when searching an instance (the signedness is ignored when comparing/hashing V3Numbers(?)), as a workaround. (link)During constant folding, V3Width may move some datatypes in the tree to the global type table, and the nodes may contain reference to a class removed during parameterization, which causes 'Broken link' error later. It's very similar to Internal error
Broken link in node
#3875. I think the best way to handle such problems is that keep the unused classes but mark them as 'dead', and garbage-collect all unused types and classes/modules in next V3Dead. I tried this (using function here) but it causes more problems (e.g. child nodes of AstBracketArrayDType are removed, and the algorithm is not perfect, likes AstClassRefDTypes inside dead module does contribute to reference count of the class). Probably we have to improve the algorithm (instead of using reference counters, we can perform 'connectivity check' through the tree, and then remove the 'disconnected' nodes). I think I need more advice on this.Performance and memory usage
It's a little bit better than before. For t_uvm_todo:
New:
Old: (cf6e362)