MPFS Icicle Kit knsh config failure #12259
Looks like a PMP/memory protection issue. The knsh config does not work out of the box; you need OpenSBI to run this config.
I took a look and OpenSBI is running (at least the logo displays, so I assume it is running as part of the HSS). Is there any more information I can get regarding what needs to be done to get this to work?
OpenSBI domains control the PMP configuration. Set the memory areas and access rights accordingly, or open up everything. Disabling PMP is not enough, as supervisor/user modes need to be given explicit access to memory. I don't use the HSS, so I can't help you with that.
@pussuw to summarize what you are saying, we have to set up the HSS OpenSBI memory areas and access rights in a way that matches/supports the NuttX knsh config. Did I get that right? @MainframeReboot I'm pretty sure the PMP configuration is done through Microchip's MSS Configurator tool, which generates an .xml file. The HSS makefile runs this .xml file through a Python script to generate a bunch of header/source files for the HSS build. This config is what the HSS applies to the MSS during boot. You can see that the sample .xml for the Icicle kit has PMP config at ICICLE_MSS_mss_cfg.xml#L797-L800:
Here are some resources on OpenSBI.
The easiest way is to just open up everything and not worry about PMP. PMP is used for inter-hart isolation; if you don't use AMP, I would not worry about it. Our own stripped-down/limited SBI does it during boot:
Don't know how this is done for the HSS bootloader, but there must be some way. Adding one rule in the first pmpcfg register is enough: the CPU goes through the PMP list in order, and if a rule matching the memory area is found, access is either immediately granted or revoked, so it does not matter what the rest of the pmpcfg registers contain. Why not just leave PMP unconfigured? Like I said, access must be explicitly GRANTED for U-/S-modes; otherwise access is given only to M-mode. This is why the PMP configuration step is mandatory.
I have spent another day debugging, and although I have made some small progress in moving the crash point further along, I am still stuck on getting the build to run. So far I have done the following:
It appears that in the default knsh config, CONFIG_RAM_VSTART is set to 0x0 and CONFIG_RAM_START is set to 0x80200000 (this matches the linker script). The code enters the following function with the passed-in paddr value set to 0x8020E000:
The returned value is 0xE000. This value is then passed to mmu_ln_getentry as lnvaddr and is used as the memory address for lntable. The DEBUGASSERT call succeeds, but the subsequent indexing of lntable immediately causes a crash:
Out of curiosity I changed CONFIG_RAM_VSTART to 0x80200000 so that lnvaddr is no longer 0xE000, as this does not seem right, but that just causes issues further down the line (in the riscv_stack_color function, to be exact), so I am assuming this is not correct. Any ideas on what I could be doing wrong here, given all of the things I have tried?
I have not tried the knsh config in many months; maybe there is some regression upstream. The mapping should be vaddr=paddr, so CONFIG_RAM_START==CONFIG_RAM_VSTART==0x80200000. I think this has been the default behavior before; maybe something has changed that breaks this? If setting CONFIG_RAM_VSTART explicitly in the defconfig fixes this, then it is a correct fix. I'm just wondering, where does this query for address 0x8020E000 come from? As far as I can remember, riscv_pgvaddr is used only in the MMU / address environment logic, and those addresses should be in page pool memory. Problems in riscv_stack_color could indicate a problem with the user heap, as the process's stack is allocated from there. I run a downstream mpfs target with kernel mode all the time, so that should work just fine. Thus, it would make sense that the icicle:knsh defconfig has some issues that cause your crashes. Or a regression upstream we have not detected yet.
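In defconfig terms, the identity mapping suggested here would look like the following (the path to the Icicle knsh defconfig is an assumption; the two values come directly from this thread):

```
CONFIG_RAM_START=0x80200000
CONFIG_RAM_VSTART=0x80200000
```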
I can provide some more information regarding the queries in question. I started debugging from the function mpfs_start which contains a function call to mpfs_mm_init. This function itself performs two things:
After step 2, nx_start is called, which runs until it hits the function kmm_map_initialize. Within this function is a call to up_addrenv_kmap_init and this is where the problems begin. In here, the value of paddr (0x8020E000) is obtained from g_kernel_pgt_pbase and the value of vaddr is obtained from CONFIG_ARCH_KMAP_VBASE (which is set to 0xBF000000 in the default config). paddr is then passed_ on to the function riscv_pgvaddr. Since this value is not >= CONFIG_ARCH_PGPOOL_PBASE (0x80400000) but >= to CONFIG_RAM_START (0x8020000), the value of 0xE000 is returned (until I explicitly set CONFIG_RAM_VSTART to 0x80200000). Hopefully this provides more clarity into what is occurring in my system at the moment. I will attempt a few more things tomorrow, including going over the older versions of the knsh config to see if I can spot anything, as well as looking into my user heap to ensure it's being configured correctly. Thanks again for your support through this debug. |
Yes that makes sense, forgot I enabled kmap for icicle. Adding CONFIG_RAM_VSTART to the defconfig should fix this issue. What's causing the stack coloring issue depends on which stack is in question. |
I did some more debugging and have managed to boot NuttX in kernel mode, sort of. After updating CONFIG_RAM_VSTART to 0x80200000, the initial crash went away but led to another crash that I tracked down to the function Within this function, the The value returned by What isn't good is that
So we can see a value for the idle stack is being created and set but then it's being accessed incorrectly within the
Is the idea that the NuttX kernel must be used alongside SMP? Even so, in the scenario where the HSS is used to boot, say, a non-NuttX application on cores 1&2 and NuttX on cores 3&4, this will mean For the purposes of my debugging, I have indexed the array directly at 0 just to get NuttX to boot in kernel mode. It now does so, albeit I get an error when it attempts to load the ELF file
Hi,
I am new to NuttX and have hit a wall when it comes to running NuttX as a kernel build. The provided flat build (nsh) config worked and I was able to run NuttX without issue. Building NuttX using the provided kernel build config (knsh) causes errors when running on the Icicle kit:
I have attempted to debug this issue on my own, and while comparing the Icicle kit knsh config to other RISC-V platform knsh configs, I've noticed that a number of parameters differ. For instance, in the Icicle kit knsh config, the numbers of pages for DATA, HEAP and TEXT are all set to 0. As are the addresses for TEXT and DATA.
Is this potentially the problem or am I completely off track here? Any direction here would be greatly appreciated!
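For comparison, other RISC-V knsh defconfigs give these symbols nonzero values. The values below are purely illustrative (the correct ones depend on the board's memory map and linker script), but they show the symbols in question:

```
CONFIG_ARCH_TEXT_VBASE=0xC0000000
CONFIG_ARCH_DATA_VBASE=0xC0100000
CONFIG_ARCH_HEAP_VBASE=0xC0200000
CONFIG_ARCH_TEXT_NPAGES=128
CONFIG_ARCH_DATA_NPAGES=128
CONFIG_ARCH_HEAP_NPAGES=128
```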