
FEDEVEL
Platform forum

bad block NAND boot problem

ypkdani , 09-02-2020, 01:21 AM
Hi,

we have an i.MX6 board, similar to the TinyRex, with a problem booting from NAND.
We are experiencing reliability problems on an i.MX6ULL system with a Micron MLC MT29F32G08CBADAWP NAND, and we have some questions about best practices to follow when using NAND flash.

We have a few systems that sometimes cannot boot due to a CRC error on the initramfs partition. We suppose the error can sometimes also hit the kernel and dtb MTD partitions. These systems are only a few months old, so the NAND is not very old.
The error we see:

U-Boot 2016.03-nxp/imx_v2016.03_4.1.15_2.0.0_ga+ga57b13b (Aug 04 2020 - 09:37:05 +0000)

CPU: Freescale i.MX6ULL rev1.1 528 MHz (running at 396 MHz)
CPU: Industrial temperature grade (-40C to 105C) at 60C
Reset cause: WDOG
Board: MX6ULL 14x14 EVK
I2C: ready
DRAM: 512 MiB
NAND: 4096 MiB
MMC: FSL_SDHC: 0, FSL_SDHC: 1
*** Warning - bad CRC, using default environment

Display: TFT43AB (480x272)
Video: 480x272x24
In: serial
Out: serial
Err: serial
Net: FEC0
Normal Boot
Autoboot in 3 seconds

NAND read: device 0 offset 0x4000000, size 0x800000
8388608 bytes read: OK
NAND read: device 0 offset 0x5000000, size 0x100000
1048576 bytes read: OK
NAND read: device 0 offset 0x6000000, size 0x1200000
Skipping bad block 0x06600000
18874368 bytes read: OK

Kernel image @ 0x80800000 [ 0x000000 - 0x6f9708 ]
## Loading init Ramdisk from Legacy Image at 83800000 ...
Image Name: Init Ram Disk
Image Type: ARM Linux RAMDisk Image (gzip compressed)
Data Size: 8897996 Bytes = 8.5 MiB
Load Address: 00000000
Entry Point: 00000000
Verifying Checksum ... Bad Data CRC

Ramdisk image is corrupt or invalid
Our NAND has an erase block size of 2 MB and is split as follows: u-boot (64 MB), kernel (16 MB), dtb (16 MB), system (the rest of the 4 GB NAND), and there are plenty of spare blocks in each partition.
When the system fails, the bad block table shown by U-Boot looks like this:

=> nand bad
Device 0 bad blocks: 06600000 0b400000 0b600000

The first one is, in fact, inside the initramfs partition (we see the other two on every system, so we suspect they were already marked by the NAND chip vendor).
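To double-check where the reported blocks fall, here is a small sketch (the partition boundaries are our assumption, taken from the sizes quoted elsewhere in this thread: 64 MB u-boot, 16 MB kernel, 16 MB dtb, 64 MB initramfs, rest system):

```shell
#!/bin/bash
# Sketch: map the offsets reported by "nand bad" onto the assumed MTD layout
# (64 MiB u-boot, 16 MiB kernel, 16 MiB dtb, 64 MiB initramfs, rest system).
MiB=$((1024 * 1024))

nand_part() {                        # nand_part <offset, e.g. 0x06600000>
  local off=$(($1))
  if   [ "$off" -lt $((64  * MiB)) ]; then echo u-boot
  elif [ "$off" -lt $((80  * MiB)) ]; then echo kernel
  elif [ "$off" -lt $((96  * MiB)) ]; then echo dtb
  elif [ "$off" -lt $((160 * MiB)) ]; then echo initramfs
  else                                     echo system
  fi
}

for bb in 0x06600000 0x0b400000 0x0b600000; do
  echo "$bb is in the $(nand_part "$bb") partition"
done
# 0x06600000 falls in initramfs; the other two fall in the system partition.
```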

Considering that the u-boot, kernel, dtb and initramfs are only written once in production, and the system may go many months without a reboot, we wonder how those partitions may get corrupted.

We have a few hypotheses:
  1. over time an excessive number of bitflips happens on a block, making it uncorrectable
  2. a single bitflip happens on the first byte on the OOB part of a block, changing it from 0xff and as such marking it as bad

Do these ideas make sense?

How can it be that, sometimes, the same system is able to boot again (without any intervention) after being powered off and on a few times?

Another strange thing we noticed is that re-flashing only the u-boot (and nothing else) made the bad block in the initramfs partition disappear.
To flash partitions we erase them with flash_erase /dev/mtdX and then write them using nandwrite (or kobs-ng for u-boot).
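For completeness, the commands look roughly like this (the mtd number and image name are placeholders, not our real layout):

```shell
# Placeholder device/file names; mtd-utils commands as used in our procedure.
flash_erase /dev/mtd1 0 0          # erase the whole partition (offset 0, count 0 = all blocks)
nandwrite -p /dev/mtd1 kernel.img  # write back, -p pads writes to the page size
# u-boot itself goes through kobs-ng instead, which also writes the
# FCB/DBBT structures the i.MX boot ROM expects.
```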

We also have some questions about how we can make the system more reliable: is it a good idea to duplicate the kernel, dtb and initramfs partitions and detect a failure in u-boot somehow, so that a second copy can be loaded if the first fails? Or are there better solutions?

Now, a very wild guess: is it possible that periodically reading these partitions from the live system (using nanddump, for example) could improve their health? The reasoning is as follows: reading them with nanddump would detect (and correct?) bitflips, preventing them from accumulating over time. Does that make sense?
We ask because we read in Micron documentation that bad blocks are marked only during erase and write operations, but we assume they are also marked bad on read when there are too many bitflips for the ECC to correct. Is that right? In our case we never erase or write these partitions again, we only read them at boot, so we do not understand how they can become marked as bad.



Best regards.
robertferanec , 09-04-2020, 11:52 PM
I asked a software engineer to have a look at your question; he will answer if he knows.
ypkdani , 09-05-2020, 06:15 AM
Hello Robert,

Thanks! We have studied this problem further and found that kobs-ng creates two copies of u-boot. In our case it places the images at addresses 0x1000000 and 0x2800000. If a sector in the first u-boot image becomes damaged, the second copy is used, so there is already a simple redundancy for u-boot.
marek , 09-05-2020, 12:14 PM
This reply is just from what I can still remember. I have not done any recent research on this topic, so some information might no longer be accurate, but as an overview it should be fine.

About NAND bad blocks
==================
Every NAND memory will sooner or later develop bad blocks. I think this is caused by writing to the memory, not by reading it. There is a silicon barrier that traps the charge state (the data), and frequent writing degrades this barrier until the read circuitry no longer reads correct data from that cell.
NAND chips come from the vendor pre-checked: they already contain the information that certain blocks are bad. If I remember correctly, this was a byte stored at the beginning of each block (somewhere in the OOB area). Another approach is the bad block table: the 'bad block table' and its backup are Linux kernel specific. It is a block containing information about all bad blocks, created when the Linux kernel first boots so that it does not have to scan the whole NAND for bad blocks on every boot.

=> nand bad
Device 0 bad blocks: 06600000 0b400000 0b600000

"0b400000 0b600000 (we see the other two on every system, so we suspect they were already marked by the NAND chip vendor)."
These two might be the Linux kernel BBT blocks.

About quality of NAND chips
======================
NAND memory vendors usually sell only memories whose first few megabytes are guaranteed good. This was probably for the tiny ROM bootloaders inside SoCs that are not aware of bad blocks and would otherwise fail to load u-boot. Most modern SoCs are capable of skipping bad blocks nowadays.

  1. over time an excessive number of bitflips happens on a block, making it uncorrectable
  2. a single bitflip happens on the first byte on the OOB part of a block, changing it from 0xff and as such marking it as bad

Do these ideas make sense?
1. If the bitflips come from "erase whole block, write whole block" cycles, then yes. Every data change requires an erase, and the erase block is the smallest erasable unit, I think. A single unrecoverable failure inside a block means the whole block should be marked as bad.
2. Yes. I don't know which byte it is. This is not done automatically; someone must write this information into the memory (u-boot? the kernel? an external NAND tool?).

How can it be that, sometimes, the same system is able to boot again (without any intervention) after being powered off and on a few times?
I don't know; it does not make sense to me, unless the read circuitry inside the NAND can sometimes return correct data and sometimes not.

We also have some questions about how we can make the system more reliable: is it a good idea to duplicate the kernel, dtb and initramfs partitions and detect a failure in u-boot somehow, so that a second copy can be loaded if the first fails? Or are there better solutions?
I don't think there is a better solution. I usually put u-boot at the beginning of the flash in a small separate block; in your case I would make it 2 MB. The u-boot environment also goes in a separate block, again 2 MB in your case. U-Boot has some scripting support, I think, so data validity can be checked before booting the Linux kernel, and there can be multiple copies. Everything that can be read-only should be read-only. There are also filesystems designed specifically for flash memory.

If your hardware supports two u-boot copies as well, that is even better.
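As a sketch of the scripting idea (the offsets and sizes here are hypothetical, not taken from any layout in this thread): the environment can try a primary kernel copy and fall back to a second one, since bootm fails on a bad image CRC and lets the next command run.

```
setenv loadk_primary 'nand read ${loadaddr} 0x4000000 0x800000'
setenv loadk_backup  'nand read ${loadaddr} 0xA000000 0x800000'
setenv bootcmd 'run loadk_primary && bootm ${loadaddr}; run loadk_backup && bootm ${loadaddr}'
```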

The solutions that I saw:
a) primary and backup data are the same (i.e. simple devices with plenty of memory space, usually done to simplify firmware updates)
b) primary data is unique and the backup data is only for booting the device into factory recovery (e.g. Android phones, Western Digital multimedia boxes, Sony PS3)
c) primary data is unique and the backup data is used to replace corrupted primary data, not for booting.

Now, a very wild guess: is it possible that periodically reading these partitions from the live system (using nanddump, for example) could improve their health?
No. The live system should do this without nanddump, most likely at the Linux kernel driver level. NAND chips come from the vendor already properly marked.
I would use nanddump to find and mark bad blocks only after some low-level NAND erase that wiped everything, including the bad block information.





robertferanec , 09-07-2020, 02:11 AM
I don't know; it does not make sense to me, unless the read circuitry inside the NAND can sometimes return correct data and sometimes not.
- this is an interesting point from @marek

@ypkdani are you sure there is not a hardware problem (e.g. crosstalk on the bus, something with the NAND power, etc.)? I would maybe run some extensive NAND read testing (and run it in an environmental chamber at different temperatures), just to be sure reading is consistent.
ypkdani , 09-07-2020, 03:29 AM
Originally posted by marek
I don't think there is a better solution. I usually put u-boot at the beginning of the flash in a small separate block; in your case I would make it 2 MB. The u-boot environment also goes in a separate block, again 2 MB in your case. U-Boot has some scripting support, I think, so data validity can be checked before booting the Linux kernel, and there can be multiple copies. Everything that can be read-only should be read-only. There are also filesystems designed specifically for flash memory.
Thanks marek , that's a great answer.

From what we understand now, as long as we keep the kernel, dtb and initramfs on plain MTD partitions, we will have three single points of failure.

The solution could be one of the following:
  1. move the kernel, dtb and initramfs inside the system UBIFS (the MTD partition that occupies the rest of the NAND), which should be more resilient to bad blocks
  2. also duplicate each of these MTD partitions (kernel, dtb and initramfs) and introduce a way in u-boot to read the second copy if the first one is damaged.

Originally posted by marek
No. The live system should do this without nanddump, most likely at the Linux kernel driver level. NAND chips come from the vendor already properly marked.
I would use nanddump to find and mark bad blocks only after some low-level NAND erase that wiped everything, including the bad block information.
I see.
Just to explain better: we did not plan to do this "nanddump read" on the whole system (which resides in a UBIFS and so far seems unaffected by these problems), but only on the kernel, dtb and initramfs MTD partitions, in the hope that reading them periodically would "heal" them. But as said, we are not sure it would really work.

ypkdani , 09-07-2020, 03:37 AM
Originally posted by robertferanec
- this is an interesting point from @marek

@ypkdani are you sure there is not a hardware problem (e.g. crosstalk on the bus, something with the NAND power, etc.)? I would maybe run some extensive NAND read testing (and run it in an environmental chamber at different temperatures), just to be sure reading is consistent.

Hi Robert,
yes, we have hundreds of boards produced and this problem occurred on only two or three of them. We have already run many stress tests (write/read/erase) without problems...
Unfortunately Micron never replied to our past questions... their support does not respond to us.
marek , 09-07-2020, 11:22 AM
Originally posted by ypkdani
The solution could be one of the following:
  1. move the kernel, dtb and initramfs inside the system UBIFS (the MTD partition that occupies the rest of the NAND), which should be more resilient to bad blocks
  2. also duplicate each of these MTD partitions (kernel, dtb and initramfs) and introduce a way in u-boot to read the second copy if the first one is damaged.
I would personally go for duplicated MTD partitions. Putting the kernel, dtb and initramfs inside the UBIFS is also an option that is used in some products, but it requires u-boot with UBIFS support, and it must match the UBIFS support in the Linux kernel. There is a small compatibility risk that I avoid by placing the kernel and dtb directly into NAND flash.

But I have never seen the unreliability you are facing.
alberanid , 09-09-2020, 01:10 AM
Originally posted by marek

I would personally go for duplicated MTD partitions. Putting the kernel, dtb and initramfs inside the UBIFS is also an option that is used in some products, but it requires u-boot with UBIFS support, and it must match the UBIFS support in the Linux kernel. There is a small compatibility risk that I avoid by placing the kernel and dtb directly into NAND flash.

But I have never seen the unreliability you are facing.
You are right, that makes sense.

Regarding the unreliability we are experiencing, we wonder whether using somewhat large partitions made things worse.

Our partitions (64 MB u-boot, 16 MB kernel, 16 MB dtb, 64 MB initramfs) were based on an example for the EVK. We thought it made sense to have a lot of free space so that any bad block could easily be relocated.
Now we wonder whether this could instead create problems, since a single bad block anywhere in the partition could make the "nand read" step fail (we must double-check this). Do you have any experience with this?

Thanks for your advice.
marek , 09-09-2020, 12:07 PM
I was active in this kind of development 5 years ago, so my hints might be a little outdated.

There are three operations:

nand read <address> <nand offset> <size>
- U-Boot reads <size> bytes starting at <nand offset> and puts them at <address>. If a bad block is detected, the whole block is skipped and the operation resumes at the next block.

nand erase <offset> <size>
- U-Boot erases <size> bytes starting at <offset>. Here I'm not sure; I assume the operation is not performed on bad blocks, with the same skipping as above.

nand write <memory address> <offset> <size>
- U-Boot writes data from <memory address> into the NAND flash at <offset> with length <size>. If a bad block is detected, the whole block is skipped and the operation resumes at the next block.

"nand read" does not fail on a properly marked bad block; it just skips it automatically. A large partition size should not be a problem: U-Boot does not care about it, it simply reads the data and auto-skips every block marked as bad.
Reusing the EVK examples is fine, and I would do the same. A block should not become bad just from reading it. It may happen, but it is never mentioned in datasheets, so the probability should be low. U-Boot marks a block as bad by itself during erase or write, since it knows whether the operation succeeded.
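The skip-on-read behaviour described above can be pictured with a toy model (plain shell, not U-Boot code; the block numbers are made up):

```shell
#!/bin/bash
# Toy model of U-Boot "nand read" bad-block skipping: the caller asks for a
# number of good blocks starting at some block; blocks marked bad are skipped
# and reading resumes at the next block, so one bad block does not fail the read.
BAD_BLOCKS="3"                   # pretend erase block 3 is marked bad

is_bad() {
  for b in $BAD_BLOCKS; do
    [ "$1" -eq "$b" ] && return 0
  done
  return 1
}

nand_read_blocks() {             # nand_read_blocks <start block> <block count>
  local blk=$1 want=$2 got=""
  while [ "$want" -gt 0 ]; do
    if is_bad "$blk"; then
      echo "Skipping bad block $blk" >&2
    else
      got="${got:+$got }$blk"
      want=$((want - 1))
    fi
    blk=$((blk + 1))
  done
  echo "$got"
}

nand_read_blocks 2 4             # reads blocks 2 4 5 6, skipping block 3
```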