What polynomial should be given for Broadcom controllers? #1

Open
prakash56755 opened this issue Apr 1, 2020 · 23 comments

@prakash56755

No description provided.

@StrayLightning

The README lists the polynomial used for T=4, N=13: 0x201b. It's not clear whether that is used by all controllers, or indeed in all situations.
For newer controllers, you could try T=4, N=14, and poly 0x5803. But note that the nibble-shift is not required for N=14 in the situation I've observed. Again, there is no guarantee that this configuration will work for what you are trying to achieve.
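
A quick sanity check on the two values above: a generator polynomial for field order N has to have degree exactly N. A minimal sketch, assuming the polynomial is stored as a bit mask where bit i is the coefficient of x^i; the helper names `poly_degree` and `poly_matches_order` are made up for illustration and are not part of the repo.

```c
#include <stdint.h>

/* Degree of a GF(2) polynomial stored as a bit mask
 * (bit i = coefficient of x^i). Returns -1 for the zero polynomial. */
static int poly_degree(uint32_t poly)
{
    int deg = -1;

    while (poly) {
        poly >>= 1;
        deg++;
    }
    return deg;
}

/* Non-zero if poly has exactly the degree required for field order n. */
static int poly_matches_order(uint32_t poly, int n)
{
    return poly_degree(poly) == n;
}
```

With this, 0x201b comes out as degree 13 and 0x5803 as degree 14, which matches the N values quoted for each.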

@prakash56755
Author

prakash56755 commented Apr 11, 2020 via email

@StrayLightning

You mean the right shift is not needed for field order M = 14. Is that correct?

I'm no expert in BCH or ECC so I can't say for sure, but it worked for me. The 4-bit shift looks like it is necessary with T=4 and an odd M to ensure proper alignment.

It's relatively easy to script something that searches the polynomial space and compares the regenerated ECC with that which is present. If you've got a huge flash you could probably just truncate it to get a good idea of which poly is being used. For a 1Gb flash it took, say, half an hour to search on modest hardware. Be aware that if you've dumped the raw contents of your NAND it may include bit errors. But even with a significant number of errors you should still see a spike where the calculated ECC correlates significantly with the stored ECC for a specific poly.
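
To give an idea of how small the search space is, here is a rough sketch that enumerates only the candidates worth trying, assuming the BCH library accepts only primitive polynomials of degree M. The functions `is_primitive` and `next_primitive` are illustrative names, not taken from the repo, and the test is a plain brute-force order check rather than anything clever.

```c
#include <stdint.h>

/* Returns 1 if poly (bit i = coefficient of x^i) is a primitive polynomial
 * of degree m over GF(2), i.e. x generates all 2^m - 1 non-zero elements. */
static int is_primitive(uint32_t poly, int m)
{
    const uint32_t top = 1u << m;
    const uint32_t order = top - 1;
    uint32_t x = 1;
    uint32_t i;

    if ((poly & ~(top | order)) || !(poly & top))
        return 0;               /* degree is not exactly m */

    for (i = 0; i < order; i++) {
        if (i && x == 1)
            return 0;           /* x^i == 1 too early: not primitive */
        x <<= 1;                /* multiply by x ... */
        if (x & top)
            x ^= poly;          /* ... and reduce modulo poly */
    }
    return x == 1;              /* x^(2^m - 1) must come back to 1 */
}

/* Next primitive polynomial of degree m that is greater than 'after',
 * or 0 when the space is exhausted. Pass after = 0 to start the search. */
static uint32_t next_primitive(uint32_t after, int m)
{
    const uint32_t top = 1u << m;
    uint32_t p = (after < top) ? top : after + 1;

    for (; p < 2 * top; p++)
        if (is_primitive(p, m))
            return p;
    return 0;
}
```

A search loop would then be `for (p = next_primitive(0, 14); p; p = next_primitive(p, 14))`, running the re-encode and compare step once per candidate.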

Then the problem is whether the raw readout has introduced more errors than can be corrected (4 bits per sector for T=4.) I had significantly more success after dumping the NAND again after I'd found the poly -- possibly EM interference, dodgy wiring, maybe something magical happening in the NAND cells...
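
One rough way to judge a readout is to count how many bits differ between two buffers, for example two reads of the same sector, or the stored ECC versus the regenerated one. This is only a sketch: the names `bit_errors` and `is_correctable` are made up, and a difference between two reads is only an indicator, not the true error count against the original data.

```c
#include <stddef.h>
#include <stdint.h>

/* Number of bit positions in which two buffers of equal length differ. */
static unsigned bit_errors(const uint8_t *a, const uint8_t *b, size_t len)
{
    unsigned errors = 0;

    for (size_t i = 0; i < len; i++) {
        uint8_t diff = a[i] ^ b[i];

        while (diff) {
            diff &= diff - 1;   /* clear the lowest set bit */
            errors++;
        }
    }
    return errors;
}

/* A BCH code of strength t corrects at most t bit errors per sector. */
static int is_correctable(unsigned errors, unsigned t)
{
    return errors <= t;
}
```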

@merbanan

@prakash56755 and I have access to all kinds of Broadcom hardware and flash chips, but what looked like an easy task of adding support for them has not been successful.

The README mentions a linearity check. Over what range should that be done and how?

Can you elaborate some more regarding calculating the polynomial?

We have access to good ECC data, but some hints would be very welcome. And why do you suggest poly 0x5803?

@StrayLightning

Again, I'm not an expert, I can only advise on what I observed in the one case of SOC and NAND that I looked at. I created a modified version of brcm-nand-bch.c which:

  • used memcpy instead of shift_half_byte -- note that the first two args need to be swapped, and take care with the lengths of the copies (the +1s)
  • compared the computed ECC with the ECC present in the raw dump and counted sectors that were different
  • reported the polynomial used and the count of different sectors at the end

Then create a shell script (or even a 1-liner) to search the polynomial space, passing the polynomial as an argument. All the cases where the poly is invalid will be rejected at the init_bch() call. For most of the remainder of the polys, I would expect it to indicate that every sector is incorrect. In my case I had one poly (0x5803) which reported significantly fewer differences. I did reread the NAND after that point, which resulted in calculated ECC that differed in only a single sector.
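
The compare-and-count step described above could look roughly like the sketch below. It is not the code from the modified brcm-nand-bch.c. The geometry (2048-byte pages, 4 sectors, 16 OOB bytes per sector, ECC at offset 9 with length 7) is assumed from values mentioned in this thread, the `ecc_fn` callback is hypothetical, and `toy_ecc` is a stand-in XOR checksum so the example runs without the BCH library. A real run would call the library's BCH encoder from the callback, and which bytes the ECC actually covers is left to that callback.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Geometry assumed from this thread, not read from the device. */
#define SECTOR_SIZE    512
#define SECTS_PER_PAGE 4
#define OOB_PER_SECT   16
#define PAGE_DATA      (SECTOR_SIZE * SECTS_PER_PAGE)
#define PAGE_RAW       (PAGE_DATA + OOB_PER_SECT * SECTS_PER_PAGE)
#define OOB_ECC_OFS    9
#define OOB_ECC_LEN    7

/* Hypothetical callback: regenerate the ECC bytes for one sector. */
typedef void (*ecc_fn)(const uint8_t *data, const uint8_t *oob,
                       uint8_t *ecc_out);

/* Stand-in encoder (NOT BCH): XOR-folds the sector into OOB_ECC_LEN bytes. */
static void toy_ecc(const uint8_t *data, const uint8_t *oob, uint8_t *ecc_out)
{
    (void)oob;
    memset(ecc_out, 0, OOB_ECC_LEN);
    for (size_t i = 0; i < SECTOR_SIZE; i++)
        ecc_out[i % OOB_ECC_LEN] ^= data[i];
}

/* Counts sectors whose regenerated ECC differs from the ECC stored in the
 * raw dump. The dump is assumed to be pages of data followed by their OOB. */
static size_t count_mismatched_sectors(const uint8_t *dump, size_t dump_len,
                                       ecc_fn compute_ecc)
{
    size_t mismatches = 0;

    for (size_t page = 0; page + PAGE_RAW <= dump_len; page += PAGE_RAW) {
        const uint8_t *page_oob = dump + page + PAGE_DATA;

        for (int s = 0; s < SECTS_PER_PAGE; s++) {
            const uint8_t *data = dump + page + (size_t)s * SECTOR_SIZE;
            const uint8_t *oob = page_oob + (size_t)s * OOB_PER_SECT;
            uint8_t ecc[OOB_ECC_LEN];

            compute_ecc(data, oob, ecc);
            if (memcmp(ecc, oob + OOB_ECC_OFS, OOB_ECC_LEN) != 0)
                mismatches++;
        }
    }
    return mismatches;
}
```

Running this once per candidate polynomial and printing the polynomial next to the count gives the kind of output where the right one stands out.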

Crude, I know, but it worked for me. I do plan to upload this code at some point, but I'll be too busy over the next fortnight.

@merbanan

@StrayLightning OK, it all makes sense now. I just didn't assume that the hardware ECC/BCH implementation and the BCH lib in this repo would create the same ECC values. Is it possible to get a copy of the code you wrote?

@StrayLightning

I was hoping to clean it up and refactor before sharing it, but it's forked to https://github.com/StrayLightning/brcm-nand-bch -- no idea if it'll work for you, but feel free to tidy up, correct it, or point out the flaws.

@merbanan

cat .out | ./reencode 0x5803 >verify.out
poly 22531 different OOB sectors: 0

So we are all good now and have a strategy that should work on other geometries also.

@Mat-Alm

Mat-Alm commented Jun 22, 2020

Hi! Sorry to hijack this ticket. I have one module with a Broadcom BCM9583xx that has the NAND device below. According to the SDK, it seems to use this BCH scheme (>= v5.0: ECC_REQ = ceil(BCH_T * 14/8)).

NAND device: Manufacturer ID: 0x98, Chip ID: 0xdc (Toshiba NAND 512MiB 3,3V 8-bit), 512MiB, page size: 2048, OOB size: 64
iproc_nand: timing mode 4
iproc_nand: following bootloader settings
NAND 8-bit 5-addr-cycles
512MiB total, 128KiB blocks, 2KiB pages
8bit/512B BCH-ECC 16B/512B OOB
iproc_nand: ECC correction status threshold set to 5 bit
iproc_nand: user oob per page: 6 bytes (4 steps)

How can I find the OOB area layout and fix the bit errors?

@merbanan

If the NAND controller is >= v5.0, then use the code found here: https://github.com/StrayLightning/brcm-nand-bch. Otherwise you need to modify it so it uses the 6.5-byte ECC layout that is found in this repository.

@Mat-Alm

Mat-Alm commented Jun 22, 2020

My concern is about the OOB area layout.

My OOB area looks like this:

FF FF 9A E2 B3 08 5E 02 7F F6 00 2D 85 09 81 08
FF FF 5A 43 25 17 C7 D9 DB D8 01 B7 5E 76 92 33
FF FF A7 CF BC 4D 74 04 86 7F 55 00 E9 F7 81 92
FF FF 97 E6 46 3B EE C7 6E D3 08 21 3B B9 2C 2C

For a page full of 0 it's like this:

FF FF AD 71 A9 1F C9 C9 27 A5 91 F5 EB 49 F8 C1
FF FF AD 71 A9 1F C9 C9 27 A5 91 F5 EB 49 F8 C1
FF FF AD 71 A9 1F C9 C9 27 A5 91 F5 EB 49 F8 C1
FF FF AD 71 A9 1F C9 C9 27 A5 91 F5 EB 49 F8 C1

That is quite different from this repository. What is the size of the user OOB area used in this repository?

@merbanan

The total OOB per page is 64 bytes, so each sector/subpage has 16 bytes of OOB. In your case 2 bytes are allocated for bad block handling, which leaves 14*8=112 bits for ECC data. Assuming ECC_REQ = ceil(BCH_T * 14/8) is still valid, that gives 14/(14/8)=8, so BCH_T can be 8. This repo mainly uses BCH4, but Broadcom NAND controllers can handle BCH4, 8 and 12 (and 1-bit Hamming as well).
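
The arithmetic above can be sketched in a few lines. This assumes the ceil(BCH_T * M / 8) formula quoted from the SDK holds; `ecc_bytes` and `max_t` are illustrative names only.

```c
/* ECC bytes needed per sector: ceil(t * m / 8), where t is the correction
 * strength and m the field order (bits of parity per corrected bit). */
static unsigned ecc_bytes(unsigned t, unsigned m)
{
    return (t * m + 7) / 8;
}

/* Largest correction strength that fits in avail_bytes of OOB. */
static unsigned max_t(unsigned avail_bytes, unsigned m)
{
    return (avail_bytes * 8) / m;
}
```

For 14 free OOB bytes and m = 14 this gives a strength of 8, and BCH4 needs 7 bytes with either m = 13 or m = 14, which fits the 7-byte ECC length used elsewhere in this thread.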

@merbanan

We are not sure that the polynomial found at https://github.com/StrayLightning/brcm-nand-bch is the same for BCH8. But you have a good zero test vector that you can use to brute-force the ECC parameters. Use https://github.com/StrayLightning/brcm-nand-bch/blob/master/reencode.c with some changes.

@merbanan

#define BCH_T 4 -> 8
#define OOB_ECC_OFS 9 -> 2
#define OOB_ECC_LEN 7 -> 14

and something that loops over the polynomial.

@merbanan

Is BCM9583xx arm ?

@Mat-Alm

Mat-Alm commented Jun 22, 2020

Hum... Thanks for the tips! I will work on this!

@Mat-Alm

Mat-Alm commented Jun 22, 2020

Is BCM9583xx arm ?

Yes cortex A9

@merbanan

The poly is very likely 0x5803. And then the following should work.

cat .out | ./reencode 0x5803 >verify.out

@merbanan

We have this tool:
https://dev.iopsys.eu/broadcom/nand-image-builder
If I follow the code logic, it looks like BCH8 might be supported.

@StrayLightning

For a page full of 0 it's like this:

FF FF AD 71 A9 1F C9 C9 27 A5 91 F5 EB 49 F8 C1
FF FF AD 71 A9 1F C9 C9 27 A5 91 F5 EB 49 F8 C1
FF FF AD 71 A9 1F C9 C9 27 A5 91 F5 EB 49 F8 C1
FF FF AD 71 A9 1F C9 C9 27 A5 91 F5 EB 49 F8 C1

I created a file of 2048 zero bytes with the above tacked on the end. I built reencode.c with the following changes:

#define BCH_T 8
#define BCH_N 14
#define OOB_ECC_OFS 2
#define OOB_ECC_LEN 14

I used poly 0x5803, and it regenerated the ECC with zero differences. So it does look like those are the parameters for your NAND, @Mat-Alm.
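
For anyone wanting to reproduce the test vector, here is a minimal sketch that builds the same 2048 zero bytes followed by the four 16-byte OOB rows quoted above. The function names `build_zero_page` and `write_zero_page` are made up, and the data-then-OOB ordering is assumed from the description of the test file.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define PAGE_DATA    2048
#define SECTS        4
#define OOB_PER_SECT 16
#define PAGE_RAW     (PAGE_DATA + SECTS * OOB_PER_SECT)

/* OOB row reported in this thread for an all-zero 512-byte sector. */
static const uint8_t zero_sector_oob[OOB_PER_SECT] = {
    0xFF, 0xFF, 0xAD, 0x71, 0xA9, 0x1F, 0xC9, 0xC9,
    0x27, 0xA5, 0x91, 0xF5, 0xEB, 0x49, 0xF8, 0xC1
};

/* Fills out[0..PAGE_RAW-1] with one raw page: zero data, then the OOB rows.
 * Returns the number of bytes written. */
static size_t build_zero_page(uint8_t *out)
{
    memset(out, 0, PAGE_DATA);
    for (int s = 0; s < SECTS; s++)
        memcpy(out + PAGE_DATA + s * OOB_PER_SECT,
               zero_sector_oob, OOB_PER_SECT);
    return PAGE_RAW;
}

/* Writes the raw page to an already opened stream; returns 0 on success. */
static int write_zero_page(FILE *fp)
{
    uint8_t page[PAGE_RAW];
    size_t len = build_zero_page(page);

    return fwrite(page, 1, len, fp) == len ? 0 : -1;
}
```

Calling `write_zero_page(stdout)` from a small main and piping the result into the re-encode tool should be equivalent to the hand-made file, assuming the tool reads a raw dump on stdin as in the command shown earlier in the thread.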

@Mat-Alm

Mat-Alm commented Jun 22, 2020

For a page full of 0 it's like this:
FF FF AD 71 A9 1F C9 C9 27 A5 91 F5 EB 49 F8 C1
FF FF AD 71 A9 1F C9 C9 27 A5 91 F5 EB 49 F8 C1
FF FF AD 71 A9 1F C9 C9 27 A5 91 F5 EB 49 F8 C1
FF FF AD 71 A9 1F C9 C9 27 A5 91 F5 EB 49 F8 C1

I created a file of 2048 zero bytes with the above tacked on the end. I built reencode.c with the following changes:

#define BCH_T 8
#define BCH_N 14
#define OOB_ECC_OFS 2
#define OOB_ECC_LEN 14

I used poly 0x5803, and it regenerated the ECC with zero differences. So it does look like those are the parameters for your NAND, @Mat-Alm.

Yes! I also tested and it works fine! Thanks for the help!
Also, this repo https://github.com/SySS-Research/nand-dump-tools has some work for other CPUs, in case anyone needs it!

@hwti

hwti commented Jun 16, 2023

Poly 0x5803 is used on BCM58360 (Sercomm FG1000B.11, i.e. DT Glasfaser-modem 2), thanks @StrayLightning.
I was able to fix the bit errors on a NAND dump, which prevented the full rootfs extraction (now the device is rooted 😄 ).
I created a PR on your repo to fix a mistake in the decode tool.

For two sectors full of 00, the first one being at the start of the block, I have:

ff 85 19 03 20 08 00 00  00 c2 b8 22 c9 78 ff 97
ff ff ff ff ff ff ff ff  ff ee 94 23 4b a3 78 19

Note that the "user-defined" values are the same as in this repo, but in little endian (so they should be interpreted as 0x1985, 0x2003 and 0x00000008).
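
The little-endian reading can be checked with a short sketch. The offsets used here (1, 3 and 5 within the first sector's OOB) are inferred from the bytes shown above rather than from any documentation, and `le16`/`le32` are illustrative helpers.

```c
#include <stdint.h>

/* First-sector OOB bytes as quoted above. */
static const uint8_t first_sector_oob[16] = {
    0xff, 0x85, 0x19, 0x03, 0x20, 0x08, 0x00, 0x00,
    0x00, 0xc2, 0xb8, 0x22, 0xc9, 0x78, 0xff, 0x97
};

/* Read a 16-bit little-endian value from an unaligned byte pointer. */
static uint16_t le16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

/* Read a 32-bit little-endian value from an unaligned byte pointer. */
static uint32_t le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}
```

With those assumed offsets, `le16(first_sector_oob + 1)`, `le16(first_sector_oob + 3)` and `le32(first_sector_oob + 5)` give 0x1985, 0x2003 and 0x00000008, and the remaining 7 bytes from offset 9 are the ECC.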

@StrayLightning

@hwti Thanks for that. I'm somewhat surprised that it produced even vaguely sensible results. It would be nice to finally tidy up these gash programs; I'll need to dig through my notes.
