DRT not converging after 64 iterations #1767

Open
gkamendje opened this issue Jan 18, 2024 · 21 comments
Labels
drt Detailed Routing

Comments

@gkamendje

Subject

[Stage]: Detail Router.

Describe the bug

After approximately 5 iterations, the number of violations is down to one or two (spacing violations on M2). Although these violations can easily be fixed manually, DRT usually runs until the 64th iteration and still fails to fix the single remaining violation. The behavior has been reproduced on Ubuntu 22 as well as on CentOS 7 machines. Occasionally, however, DRT fixes the violation around the 50th iteration.

Expected Behavior

DRT should be able to fix those violations (especially when they can be easily fixed manually).

Environment

Git commit: 50feb216d19d6922e2de10daea82437244ed5fa0
kernel: Linux 5.15.133.1-microsoft-standard-WSL2
os: Ubuntu 22.04.3 LTS (Jammy Jellyfish)
cmake version 3.28.0-rc2
CMake Deprecation Warning at third-party/abc/CMakeLists.txt:1 (cmake_minimum_required):
  Compatibility with CMake < 3.5 will be removed from a future version of
  CMake.

  Update the VERSION argument <min> value or use a ...<max> suffix to tell
  CMake that the project does not need compatibility with older versions.


-- The CXX compiler identification is GNU 11.4.0
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: /usr/bin/c++ - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- OpenROAD version: v2.0-11900-g50feb216d
-- System name: Linux
-- Compiler: GNU 11.4.0
-- Build type: RELEASE
-- Install prefix: /usr/local
-- C++ Standard: 17
-- C++ Standard Required: ON
-- C++ Extensions: OFF
-- The C compiler identification is GNU 11.4.0
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working C compiler: /usr/bin/cc - skipped
-- Detecting C compile features
-- Detecting C compile features - done
-- Found Python: /usr/bin/python3.10 (found version "3.10.12") found components: Interpreter
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Success
-- Found Threads: TRUE
-- Performing Test C_COMPILER_SUPPORTS__-Wall
-- Performing Test C_COMPILER_SUPPORTS__-Wall - Success
-- Performing Test CXX_COMPILER_SUPPORTS__-Wall
-- Performing Test CXX_COMPILER_SUPPORTS__-Wall - Success
-- Performing Test C_COMPILER_SUPPORTS__-Wno-array-bounds
-- Performing Test C_COMPILER_SUPPORTS__-Wno-array-bounds - Success
-- Performing Test CXX_COMPILER_SUPPORTS__-Wno-array-bounds
-- Performing Test CXX_COMPILER_SUPPORTS__-Wno-array-bounds - Success
-- Performing Test C_COMPILER_SUPPORTS__-Wno-nonnull
-- Performing Test C_COMPILER_SUPPORTS__-Wno-nonnull - Success
-- Performing Test CXX_COMPILER_SUPPORTS__-Wno-nonnull
-- Performing Test CXX_COMPILER_SUPPORTS__-Wno-nonnull - Success
-- Performing Test C_COMPILER_SUPPORTS__-Wno-maybe-uninitialized
-- Performing Test C_COMPILER_SUPPORTS__-Wno-maybe-uninitialized - Success
-- Performing Test CXX_COMPILER_SUPPORTS__-Wno-maybe-uninitialized
-- Performing Test CXX_COMPILER_SUPPORTS__-Wno-maybe-uninitialized - Success
-- Performing Test C_COMPILER_SUPPORTS__-Wno-format-overflow
-- Performing Test C_COMPILER_SUPPORTS__-Wno-format-overflow - Success
-- Performing Test CXX_COMPILER_SUPPORTS__-Wno-format-overflow
-- Performing Test CXX_COMPILER_SUPPORTS__-Wno-format-overflow - Success
-- Performing Test C_COMPILER_SUPPORTS__-Wno-unused-variable
-- Performing Test C_COMPILER_SUPPORTS__-Wno-unused-variable - Success
-- Performing Test CXX_COMPILER_SUPPORTS__-Wno-unused-variable
-- Performing Test CXX_COMPILER_SUPPORTS__-Wno-unused-variable - Success
-- Performing Test C_COMPILER_SUPPORTS__-Wno-unused-function
-- Performing Test C_COMPILER_SUPPORTS__-Wno-unused-function - Success
-- Performing Test CXX_COMPILER_SUPPORTS__-Wno-unused-function
-- Performing Test CXX_COMPILER_SUPPORTS__-Wno-unused-function - Success
-- Performing Test C_COMPILER_SUPPORTS__-Wno-write-strings
-- Performing Test C_COMPILER_SUPPORTS__-Wno-write-strings - Success
-- Performing Test CXX_COMPILER_SUPPORTS__-Wno-write-strings
-- Performing Test CXX_COMPILER_SUPPORTS__-Wno-write-strings - Success
-- Performing Test C_COMPILER_SUPPORTS__-Wno-sign-compare
-- Performing Test C_COMPILER_SUPPORTS__-Wno-sign-compare - Success
-- Performing Test CXX_COMPILER_SUPPORTS__-Wno-sign-compare
-- Performing Test CXX_COMPILER_SUPPORTS__-Wno-sign-compare - Success
-- Performing Test C_COMPILER_SUPPORTS__-Wno-deprecated
-- Performing Test C_COMPILER_SUPPORTS__-Wno-deprecated - Success
-- Performing Test CXX_COMPILER_SUPPORTS__-Wno-deprecated
-- Performing Test CXX_COMPILER_SUPPORTS__-Wno-deprecated - Success
-- Performing Test C_COMPILER_SUPPORTS__-Wno-c++11-narrowing
-- Performing Test C_COMPILER_SUPPORTS__-Wno-c++11-narrowing - Failed
-- Performing Test CXX_COMPILER_SUPPORTS__-Wno-c++11-narrowing
-- Performing Test CXX_COMPILER_SUPPORTS__-Wno-c++11-narrowing - Failed
-- Performing Test C_COMPILER_SUPPORTS__-Wno-register
-- Performing Test C_COMPILER_SUPPORTS__-Wno-register - Failed
-- Performing Test CXX_COMPILER_SUPPORTS__-Wno-register
-- Performing Test CXX_COMPILER_SUPPORTS__-Wno-register - Success
-- Performing Test C_COMPILER_SUPPORTS__-Wno-format
-- Performing Test C_COMPILER_SUPPORTS__-Wno-format -CMake Error at src/gpl/CMakeLists.txt:44 (find_package):
  By not providing "Findortools.cmake" in CMAKE_MODULE_PATH this project has
  asked CMake to find a package configuration file provided by "ortools", but
  CMake did not find one.

  Could not find a package configuration file provided by "ortools" with any
  of the following names:

    ortoolsConfig.cmake
    ortools-config.cmake

  Add the installation prefix of "ortools" to CMAKE_PREFIX_PATH or set
  "ortools_DIR" to a directory containing one of the above files.  If
  "ortools" provides a separate development package or SDK, be sure it has
  been installed.


 Success
-- Performing Test CXX_COMPILER_SUPPORTS__-Wno-format
-- Performing Test CXX_COMPILER_SUPPORTS__-Wno-format - Success
-- Performing Test C_COMPILER_SUPPORTS__-Wno-reserved-user-defined-literal
-- Performing Test C_COMPILER_SUPPORTS__-Wno-reserved-user-defined-literal - Failed
-- Performing Test CXX_COMPILER_SUPPORTS__-Wno-reserved-user-defined-literal
-- Performing Test CXX_COMPILER_SUPPORTS__-Wno-reserved-user-defined-literal - Failed
-- Performing Test C_COMPILER_SUPPORTS__-fpermissive
-- Performing Test C_COMPILER_SUPPORTS__-fpermissive - Failed
-- Performing Test CXX_COMPILER_SUPPORTS__-fpermissive
-- Performing Test CXX_COMPILER_SUPPORTS__-fpermissive - Success
-- Performing Test C_COMPILER_SUPPORTS__-x
-- Performing Test C_COMPILER_SUPPORTS__-x - Failed
-- Performing Test CXX_COMPILER_SUPPORTS__-x
-- Performing Test CXX_COMPILER_SUPPORTS__-x - Failed
-- Performing Test C_COMPILER_SUPPORTS__c++
-- Performing Test C_COMPILER_SUPPORTS__c++ - Failed
-- Performing Test CXX_COMPILER_SUPPORTS__c++
-- Performing Test CXX_COMPILER_SUPPORTS__c++ - Failed
-- Performing Test C_COMPILER_SUPPORTS__-Wno-unused-but-set-variable
-- Performing Test C_COMPILER_SUPPORTS__-Wno-unused-but-set-variable - Success
-- Performing Test CXX_COMPILER_SUPPORTS__-Wno-unused-but-set-variable
-- Performing Test CXX_COMPILER_SUPPORTS__-Wno-unused-but-set-variable - Success
-- TCL library: /usr/lib/x86_64-linux-gnu/libtcl.so
-- TCL header: /usr/include/tcl/tcl.h
-- TCL readline library: /usr/lib/x86_64-linux-gnu/libtclreadline.so
-- TCL readline header: /usr/include/x86_64-linux-gnu
-- Found SWIG: /usr/bin/swig4.0 (found suitable version "4.0.2", minimum required is "3.0")
-- Found Boost: /usr/local/lib/cmake/Boost-1.78.0/BoostConfig.cmake (found version "1.78.0")
-- boost: 1.78.0
-- Found Python3: /usr/include/python3.10 (found version "3.10.12") found components: Development Development.Module Development.Embed
-- Found ZLIB: /usr/lib/x86_64-linux-gnu/libz.so (found version "1.2.11")
-- spdlog: 1.9.2
-- Found BISON: /usr/bin/bison (found version "3.8.2")
-- Found Doxygen: /usr/bin/doxygen (found version "1.9.1") found components: doxygen dot
-- STA version: 2.4.0
-- STA git sha: 42b994d429aef6d553baae6aac8c23477b6e0654
-- System name: Linux
-- Compiler: GNU 11.4.0
-- Build type: RELEASE
-- Build CXX_FLAGS: -O3 -DNDEBUG
-- Install prefix: /usr/local
-- Found FLEX: /usr/bin/flex (found version "2.6.4")
-- TCL library: /usr/lib/x86_64-linux-gnu/libtcl.so
-- TCL header: /usr/include/tcl/tcl.h
-- SSTA: 0
-- STA executable: /home/gkamendje/tmp_compile/OpenROAD-flow-scripts/tools/OpenROAD/src/sta/app/sta
-- Configuring incomplete, errors occurred!

To Reproduce

Download and untar the following file https://drive.google.com/file/d/125sPK0yOeynKy30qLSbE2zCuWgajuA7S/view?usp=drive_link and run make clean_all finish

Relevant log output

[INFO DRT-0195] Start 64th optimization iteration.
    Completing 10% with 1 violations.
    elapsed time = 00:00:00, memory = 960.48 (MB).
    Completing 20% with 1 violations.
    elapsed time = 00:00:00, memory = 960.48 (MB).
    Completing 30% with 1 violations.
    elapsed time = 00:00:01, memory = 960.48 (MB).
    Completing 40% with 1 violations.
    elapsed time = 00:00:01, memory = 960.48 (MB).
    Completing 50% with 1 violations.
    elapsed time = 00:00:01, memory = 960.48 (MB).
    Completing 60% with 1 violations.
    elapsed time = 00:00:01, memory = 960.48 (MB).
    Completing 70% with 1 violations.
    elapsed time = 00:00:01, memory = 960.48 (MB).
    Completing 80% with 1 violations.
    elapsed time = 00:00:01, memory = 960.48 (MB).
    Completing 90% with 1 violations.
    elapsed time = 00:00:01, memory = 960.48 (MB).
    Completing 100% with 1 violations.
    elapsed time = 00:00:01, memory = 960.48 (MB).
[INFO DRT-0199]   Number of violations = 1.
Viol/Layer      METAL2
SpacingTableTw       1
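
For anyone triaging a similar run, a stalled tail like the one above can be spotted by pulling the per-iteration violation counts out of the log. The following is a minimal Python sketch, not part of OpenROAD. It assumes only the two message formats visible in the excerpt above (DRT-0195 and DRT-0199). The function names `violations_per_iteration` and `stalled_since` are made up for illustration, and `sample` is a shortened, made-up excerpt in the same format.

```python
import re

# Patterns assume the message formats shown in the log excerpt above.
# The opening bracket is optional because pasted logs sometimes lose it.
ITER_RE = re.compile(
    r"\[?INFO DRT-0195\] Start (\d+)(?:st|nd|rd|th) optimization iteration"
)
VIOL_RE = re.compile(r"\[?INFO DRT-0199\]\s+Number of violations = (\d+)")


def violations_per_iteration(log_text):
    """Return {iteration: violation count reported at the end of that iteration}."""
    counts = {}
    current = None
    for line in log_text.splitlines():
        m = ITER_RE.search(line)
        if m:
            current = int(m.group(1))
            continue
        m = VIOL_RE.search(line)
        if m and current is not None:
            counts[current] = int(m.group(1))
    return counts


def stalled_since(counts):
    """Return the first iteration of the final run of unchanged, nonzero counts.

    Returns None if there is no data or the last iteration ended with 0 violations.
    """
    if not counts:
        return None
    iterations = sorted(counts)
    last = counts[iterations[-1]]
    if last == 0:
        return None
    start = iterations[-1]
    for it in reversed(iterations):
        if counts[it] != last:
            break
        start = it
    return start


# Made-up excerpt for illustration only (hypothetical 63rd iteration added).
sample = """\
[INFO DRT-0195] Start 63rd optimization iteration.
    Completing 100% with 1 violations.
[INFO DRT-0199]   Number of violations = 1.
INFO DRT-0195] Start 64th optimization iteration.
    Completing 100% with 1 violations.
[INFO DRT-0199]   Number of violations = 1.
"""
counts = violations_per_iteration(sample)
print(counts, stalled_since(counts))
```

A long run of identical nonzero counts suggests the router is retrying the same spot, which is usually a hint to look at the geometry around the reported violation rather than to wait for more iterations.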


@vijayank88
Contributor

vijayank88 commented Jan 18, 2024

@gkamendje
Is access to the drive restricted?

@gkamendje
Author

gkamendje commented Jan 18, 2024 via email

@osamahammad21
Member

The config file is missing.

@gkamendje
Author

Strangely enough, the issue script is not putting everything in the tar archive. I have added a file that contains the config.mk, the DEF file used, and some other files: https://drive.google.com/file/d/129Iw7Lt5ZcEny8rx2vbRk3wjI9IrZAwK/view?usp=sharing

@osamahammad21
Member

Also, the config file for the platform is missing.

@gkamendje
Author

@osamahammad21
Member

The placement of the pins is making it hard for the detailed router to reach them. The pins are of min width and their centers are not aligned with the tracks. Is there a reason you're placing them that way?

@gkamendje
Author

There is no particular reason why they are like that. The position and size of the pins were given by the DEF file provided by the analog team that owns the other macros the block communicates with. This can easily be changed if it improves convergence.
Could you please point me to the warnings/errors that highlight this issue?
Is there a way to visualize the tracks in question?

@osamahammad21
Member

The warnings look like this:
[WARNING DRT-0422] No routing tracks pass through the center of Term o_eeprom_read
This pin is of size 280x280 which is the min width of metal 4:
image

As you can see, the pin's center is not aligned with the routing track. This results in a Non sufficient metal violation when the net tries to access the pin vertically.

I took the liberty of changing all Metal4 pins sizes to 350x350 and it converged after 4 iterations.

PA finds a horizontal access point for that pin, but DRT does not yet take the access point direction into account: it sees no issue with accessing the pin vertically, which causes the violation. I am working on a fix for that issue, but that shouldn't stop you from modifying the pin placement.
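
The alignment condition described here can be checked outside the tool before handing a DEF to the router. Below is a minimal Python sketch, assuming tracks sit at `offset + k * pitch` and that all coordinates are integers in database units. The helper names and the offset and pitch values are placeholders for illustration, not values taken from this platform.

```python
def is_on_track(center, offset, pitch):
    """True if a pin center coordinate lies exactly on a track at offset + k * pitch."""
    return (center - offset) % pitch == 0


def nearest_track(center, offset, pitch):
    """Return the track coordinate closest to center (ties go to the higher track)."""
    k = (center - offset + pitch // 2) // pitch
    return offset + k * pitch


# Hypothetical numbers: a 280-wide pin whose left edge is at x = 1000,
# with tracks starting at 280 and repeating every 560.
pin_low, pin_size = 1000, 280
track_offset, track_pitch = 280, 560

center = pin_low + pin_size // 2
target = nearest_track(center, track_offset, track_pitch)
print(is_on_track(center, track_offset, track_pitch), target - center)
```

The printed shift is the displacement that would bring the pin center onto the closest track. Whether moving or enlarging the pin is the better remedy depends on the layer rules and on what the owner of the neighbouring macro can accept.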

@rovinski
Member

rovinski commented Jan 22, 2024

FWIW, this is a problem commercial routers struggle with too. I had a tapeout last year where this was an issue (analog team gave us a block with many off-grid pins, router couldn't create clean connections). We asked the analog team to align the pins to the routing grid and it did significantly better, but still had errors due to congestion. We then asked them to spread out some of the pins (we had tens of thousands of pins) so they weren't at 100% density, and then the violations finally went away.

@gkamendje
Author

gkamendje commented Jan 23, 2024

@osamahammad21 Indeed, changing the pin size for M4 pins does lead to very quick convergence. However, changing the size of all pins from M1 to M4 to 350x350 (which removes all the [WARNING DRT-0422] warnings) brings the issue back: after 31 iterations only one M2 spacing violation remains, but it takes up to 64 iterations to fix that single violation.
Moreover, I do see some strange pin access patterns such as this one:
image
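
One way to confirm which terminals still trigger the warning after a pin change is to collect the names from the log. This is a small Python sketch that assumes the DRT-0422 message format quoted earlier in this thread. The function name `off_track_terms` and the second terminal name in the sample are made up for illustration.

```python
import re

# Assumes the warning text quoted earlier in this thread.
WARN_RE = re.compile(
    r"\[WARNING DRT-0422\] No routing tracks pass through the center of Term (\S+)"
)


def off_track_terms(log_text):
    """Return the sorted, de-duplicated terminal names found in DRT-0422 warnings."""
    # Strip a trailing period in case the log line ends with one.
    return sorted({m.group(1).rstrip(".") for m in WARN_RE.finditer(log_text)})


# The first line is the warning quoted in the thread; i_example is hypothetical.
sample_log = """\
[WARNING DRT-0422] No routing tracks pass through the center of Term o_eeprom_read
[WARNING DRT-0422] No routing tracks pass through the center of Term i_example
[WARNING DRT-0422] No routing tracks pass through the center of Term o_eeprom_read
"""
print(off_track_terms(sample_log))
```

An empty list after a pin change means the warning no longer fires, although, as noted above, that alone does not guarantee convergence.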

@rovinski any idea why make detail_route_issue does not include the platform and design configuration files in the resulting tar archive anymore?

@rovinski
Member

> @rovinski any idea why make detail_route_issue does not include the platform and design configuration files in the resulting tar archive anymore?

@vvbandeira

@osamahammad21
Member

The M2 access is not strange: the center of the M2 pin is not aligned with the M2 track. If the net tries to access it directly vertically, it will get a Non sufficient metal violation, so it accesses the pin horizontally with this odd hook shape.

I am working on a fix that makes such pin access a bit smoother.

As for the remaining violation, it slipped past me. One of the cells is placed very close to a PDN via, which makes it impossible for the router to access this cell pin without creating a spacing violation with the M2 PDN via enclosure. There is an option in the detailed_route command to ignore such violations and remove the violating PDN vias at the end of routing. Do you think this could be a good solution, or would you rather not lose any of the PDN vias?
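
The geometric conflict described here can be illustrated with a simple gap check between two rectangles. The Python sketch below is a deliberate simplification: the reported rule, SpacingTableTw, is a table lookup that depends on the shapes involved, whereas this code compares against a single fixed minimum spacing. All coordinates, the spacing value, and the function names are hypothetical.

```python
import math


def rect_spacing(a, b):
    """Euclidean gap between two axis-aligned rectangles (xlo, ylo, xhi, yhi).

    Returns 0.0 if the rectangles touch or overlap.
    """
    dx = max(b[0] - a[2], a[0] - b[2], 0)
    dy = max(b[1] - a[3], a[1] - b[3], 0)
    return math.hypot(dx, dy)


def violates_spacing(a, b, min_spacing):
    """True if the gap between the two shapes is smaller than min_spacing."""
    return rect_spacing(a, b) < min_spacing


# Hypothetical shapes in database units.
pin_access_shape = (0, 0, 280, 280)
pdn_via_enclosure = (400, 0, 800, 280)
MIN_SPACING = 280  # placeholder, not a value from this platform

print(rect_spacing(pin_access_shape, pdn_via_enclosure))
print(violates_spacing(pin_access_shape, pdn_via_enclosure, MIN_SPACING))
```

If every legal way of reaching the pin leaves a gap below the required spacing to the fixed PDN shape, no amount of rerouting can clear the violation, and only moving the cell or changing the PDN resolves it.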

@vvbandeira
Member

> @rovinski any idea why make detail_route_issue does not include the platform and design configuration files in the resulting tar archive anymore?

@vvbandeira

#1771 should fix it, just waiting to see the impact on file size before merging

@gkamendje
Author

> The M2 access is not strange: the center of the M2 pin is not aligned with the M2 track. If the net tries to access it directly vertically, it will get a Non sufficient metal violation, so it accesses the pin horizontally with this odd hook shape.
>
> I am working on a fix that makes such pin access a bit smoother.
>
> As for the remaining violation, it slipped past me. One of the cells is placed very close to a PDN via, which makes it impossible for the router to access this cell pin without creating a spacing violation with the M2 PDN via enclosure. There is an option in the detailed_route command to ignore such violations and remove the violating PDN vias at the end of routing. Do you think this could be a good solution, or would you rather not lose any of the PDN vias?

Getting rid of the violating PDN via could certainly be an option to explore. Could you please remind me how to activate this option?
It is still not quite clear to me why merely moving away from min-size M4 pins successfully masks the issue.

@osamahammad21
Member

This is purely placement luck. Matt pointed out that this should be considered a placer issue: using a site so close to a PDN via makes it hard for the router to reach the placed cell. But again, this is a design with 100% utilization, so I am not sure if we're asking too much from the placer.
As for the option to repair PDN vias, I found out that it only removes vias if the PDN grid is on 2 consecutive layers. I am finalizing a PR that should repair a range of via layers between the 2 PDN layers. It should be ready by the end of the day.

@osamahammad21
Member

After discussion, we found that removing PDN vias is a bit risky and is going to lead to IR drop. You said when you opened the issue that the DRC violation can be easily fixed. Could you specify how you would fix it manually?
Screenshot from 2024-01-23 19-33-46

@gkamendje
Author

gkamendje commented Feb 12, 2024 via email

@maliberty
Member

I'm interested in how these "easily fixed manually" violations are resolved in Virtuoso. What do you change to make these legal? Are you moving the cells? Changing the pdn? Neither seems trivial and I'm wondering if there is something easy we aren't seeing.

@gkamendje
Author

@maliberty Feedback from the analog team is that they are essentially moving cells around. Agreed that this might not be "trivial" at all! I had certainly misunderstood the analog team when they said that they had been able to "easily" fix the issue.

@maliberty
Member

Better handling of the interaction between placement and pdn vias is an area for improvement. It isn't 'easy' but we have it on the list. The placement engine needs some 'mini-drc/pin-access' check to look for pins that are being blocked.

@maliberty maliberty added the drt Detailed Routing label Mar 13, 2024
6 participants