AXI4-Lite support #185
Hello! Currently, only the standard AXI4 master interface is available; AXI4-Lite is not there yet. Just as a note, the feature/AXI branch is there to fix some issues with the AXI caches, but it is not related to AXI protocol support itself.
Thanks. So the AXI4 master interface is available in the standard release, and I don't need to do a special build?
Exactly.
Ok, thanks. I'm thinking about importing synthesized accelerators into an SoC that has AXI4 master for DMAs but needs either AXI4-Lite or APB for configuration of the accelerator. I can't yet tell if Bambu provides this.
With the latest dev/panda version, it is possible to generate a memory-mapped top-level interface by passing the --memory-mapped-top option. In this case, the top module will expose a slave memory bus which you may use to initialize the accelerator and start the computation. The available protocols for this interface are the Wishbone B4 protocol and the internal memory bus protocol used by Bambu. The latter is a straightforward protocol that you may adapt to AXI4-Lite if this suits your needs.
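As a sketch, such an invocation could look like the following. The source file name `kernel.c` and top function name `forward_kernel` are placeholders, not taken from this thread, and the exact option spelling should be verified against `bambu --help`:

```shell
# Hypothetical invocation sketch: generate a top module that exposes a
# slave memory bus for initializing the accelerator and starting it.
bambu kernel.c --top-fname=forward_kernel --memory-mapped-top
```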
Ok, thanks very much. I see the build issue with
Hi, I tried to add
Any idea where to start looking?
Hi, can you please provide the input files and the full command line you used to call Bambu?
The source code is from the pytorch tutorial in soda-opt. The only change to the
I'm attaching the final output of soda 05_llvm_baseline.ll. Thanks! |
Is there any documentation on this interface?
I tried to perform the synthesis with the provided command line and input description, but I had no issues with that. Which version of Bambu are you using? I tried that with this AppImage.
I built the dev/panda branch. I had to apply the following patch to get the code to compile. Hopefully this isn't the cause of the error.
I do not think that the patch is causing any issues.
Thanks for the reply!
I didn't mention previously that I have to edit the IR that comes out of soda-opt. Without editing it, Bambu fails to ingest it with this error.
I remove the
I am going to check the .gimplePSSA as soon as I can. In the meantime, just a note on the soda-generated IR: it may have been produced with a newer version of the LLVM toolchain than the Clang 12 frontend, which may cause issues with the parser. Bambu also supports Clang 16 as a frontend, so maybe you can avoid the .ll editing if you use that as the frontend compiler.
Ok, thanks. I am working from the latest soda provided docker image, but I can go ahead and build soda and bambu with 16. |
The dev-panda AppImage is shipped with Clang 16 too, if you want to avoid the build.
Ok, thanks. Since I have to rebuild soda it's not much more trouble to build everything. |
I managed to use the appimage bambu with clang16 specified so that I don't need to edit the IR. It fails differently than it previously did. Below is the whole output.
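For reference, selecting Clang 16 as the frontend could look roughly like this. The `--compiler` value follows PandA's usual naming convention but is an assumption here, so verify it against `bambu --help`; the input file name is taken from the attachment mentioned earlier and the top function name is a placeholder:

```shell
# Hypothetical sketch: run bambu on the soda-opt output with Clang 16 as
# the frontend, avoiding the manual .ll edits needed with Clang 12.
bambu 05_llvm_baseline.ll --compiler=I386_CLANG16 --top-fname=forward_kernel
```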
Can you point me to some Verilog that shows an interface coming from use of the
Hi, I had the chance to debug the issue, and I can tell that it is only related to the testbench generation.
Ok, thanks, I'll try this. I notice that when I try to compile the testbenches I get errors of the following form. I expect it is because I am including the AppImage in the soda-opt container, which is Ubuntu 20.04, whereas I see some paths in the AppImage that indicate 18.04, so I think there are some include path inconsistencies. However, I was able to use the
Thanks again!
Using the
I tried on my local version with the latest dev/panda and I'm able to synthesize the code. Since the hash of the branch is the same, I would like to understand which version of Clang you are using. I'm using the binaries downloaded from the LLVM GitHub release page: https://github.com/llvm/llvm-project/releases/download/llvmorg-16.0.0/clang+llvm-16.0.0-x86_64-linux-gnu-ubuntu-18.04.tar.xz
Bambu can manage IR generated by Clang for different Intel target architectures. The default one is the 32-bit architecture, where -m32 is passed to Clang. We also added support for -mx32 and -m64, but in the last case the address space is 64 bits. So, in order to control the size of the address bus, we added a Bambu option that controls the number of bits used by the minimal interface bus. You asked for a multi-channel minimal interface: the minimal interface may have one or two channels, and two channels means that you may have two in-flight memory transactions.
With that option you will ask for a single-channel minimal interface bus, and so you will have this top-level interface, which should be what you actually expected, isn't it?
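The target-architecture flags mentioned above are passed at the Clang IR-generation step; a sketch, with placeholder file names:

```shell
# Emit LLVM IR for the default 32-bit Intel target; -m32 can be swapped
# for -mx32 or -m64, but with -m64 the address space becomes 64 bits.
clang -m32 -S -emit-llvm kernel.c -o kernel.ll
```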
I am using the clang 16 release on Ubuntu 20.04. Here are some steps I use to install it in my docker image.
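The exact steps were not captured above; a sketch of installing the release binaries from the URL mentioned earlier (the /opt prefix is an arbitrary choice, not from the thread):

```shell
# Fetch and unpack the official Clang 16.0.0 Ubuntu 18.04 binaries,
# then put them on PATH.
wget https://github.com/llvm/llvm-project/releases/download/llvmorg-16.0.0/clang+llvm-16.0.0-x86_64-linux-gnu-ubuntu-18.04.tar.xz
tar -xJf clang+llvm-16.0.0-x86_64-linux-gnu-ubuntu-18.04.tar.xz -C /opt
export PATH=/opt/clang+llvm-16.0.0-x86_64-linux-gnu-ubuntu-18.04/bin:$PATH
```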
Configuring with --enable-release implies passing -DNDEBUG to the compiler. This hides some errors, leaving error catching only to the THROW_ASSERTs. One way to improve the tracking of the bug could be to configure with --disable-release instead of --enable-release and see where the issues pop up.
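Concretely, the suggestion amounts to reconfiguring and rebuilding along these lines; this is a sketch, and any other configure flags from the original build should be kept:

```shell
# Debug build: -DNDEBUG is no longer passed, so internal assertions are
# compiled in and can surface the real error location.
./configure --disable-release
make
```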
I rebuilt with
The error you see seems to be due to a non-detected buffer overflow during the minimum bit computation. I cannot reproduce the issue, so please let me know if #206 fixed the problem.
That works using the
It fails when using
I was able to run the simulation but to be sure, I need forward_kernel_test.xml. From what I understood, the top function signature is different from the one used by the original tutorial.
Here is a reproducer. I get different errors when running on linux vs Intel Mac. Please let me know if anything is broken in the reproducer. Thanks!
Hi, did you get a chance to try to reproduce my errors? Thanks!
I’m working on it; the setup is just taking longer than expected.
Hi, need to be changed in
The latest option fixes the issue, but you can get more from this example. Instead of the minimal interface, you may use the option --generate-interface=INFER, which follows the same assumptions adopted by Vitis HLS (see these pragmas). In this case, Bambu can infer the interface to connect the three parameters to three different BRAMs. Since the bus is no longer a constraint, you are going to halve the number of cycles. Since the array protocol requires knowing exactly how large the attached BRAM is, and since from .ll files it is impossible to specify the size of the array and the size of the base elements (at least with opaque pointers), I've recently extended the forward_kernel_interface.xml file by adding a new attribute to the parameters.
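Putting the pieces together, the INFER flow described above might be invoked roughly as follows; the input file and top function name are placeholders based on this thread's example, and the exact option set should be checked against `bambu --help`:

```shell
# Hypothetical sketch: infer per-parameter BRAM interfaces instead of the
# minimal interface, supplying array sizes via the extended interface XML.
bambu forward_kernel.ll --top-fname=forward_kernel \
  --generate-interface=INFER \
  --interface-xml-filename=forward_kernel_interface.xml
```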
Hi,
Great, I'll give it a try! |
I still get an error with the reproducer, which is not using
That error means the kernel is trying to access a memory area not allocated on the accelerator. The error lines are intended to be similar to Valgrind output, if you are familiar with that. The first error line,
Also, it may be useful to write a C/C++ testbench to check the kernel functionality before the synthesis. As a starting point, you may use the generated testbench, which you can find in
You need to pass --interface-xml-filename=../../forward_kernel_interface.xml with forward_kernel_interface.xml having this content:
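For clarity, a sketch of the full command with the XML option in place; the rest of the command line is assumed from earlier in the thread, and only the relative path is quoted verbatim:

```shell
# Hypothetical sketch: point bambu at the interface XML two directories up.
bambu forward_kernel.ll --generate-interface=INFER \
  --interface-xml-filename=../../forward_kernel_interface.xml
```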
It works! Thanks!
I changed to another neural network and ran into new issues with the
This is the error.
Hi,
Is AXI4-Lite support in a state where it can be used? I tried to enable it with
but the build fails with
Maybe I need to be on the feature/AXI branch? Or maybe I shouldn't be trying to test it. Thanks.