Hello fellas,
I'm working on a project built around an ATSAMA5D2 processor with two Micron DDR3L chips and an eMMC chip. The first version of this design was done before my time and is believed to work properly in the field. I was mainly involved in the next generation of the project, which brought some improvements and added chips for new features (key generation, etc.).
I haven't made any changes to the DDR3L interface, which was imported (in the first revision) from the manufacturer's reference design.
The only change in these critical parts was replacing the eMMC, as the original part is becoming obsolete. I went for the Panasonic RP-SEMC08DA1 eMMC. Given that the previous version worked, I didn't change the stack-up and kept everything on the critical path (DDR3L) the same.
However, when we got our first prototype boards, I noticed boards failing specifically during the eMMC test. DDR tests usually pass, even when left running for weeks on my desk. We did another spin that added some parts (again far away from the critical path) and unfortunately again ended up with 20-30% of boards failing the eMMC test (a bit-fade test: write all memory cells, wait some time, then read everything back). To confirm that the good boards really are good, I ran a temperature stress test in a chamber. The cycling profile was well within the limits of the most restrictive components, which were the batteries (5-55C). After 48 hours of temperature cycling, most boards that were "good" at ambient failed in a similar way at some point after being put in the chamber. I ran the same experiment on V1 boards for comparison and found that the previous revision survived the chamber test (although the sample was limited to only 2 boards).
When I turn off the data cache in the processor, things improve and boards start to pass the eMMC test. However, I don't know whether the problem lies in the eMMC part, because the read-mismatch errors show up when comparing the read and write buffers, both of which are stored in DDR. This made me think there might be an issue with the DDR3L interface, although, as I said earlier, nothing has changed there since V1.
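One way I could try to separate "bad data stored in the eMMC" from "data corrupted on the way into DDR" is to read the same block twice and compare the two reads with each other before comparing against the expected data. Below is a minimal sketch of that idea, not my actual test code. The `block_read_fn` callback, the verdict names and `demo_flaky_read` are all hypothetical stand-ins; the real eMMC driver call would replace the demo reader.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical driver hook: reads one block into dst, returns 0 on success. */
typedef int (*block_read_fn)(uint32_t block, uint8_t *dst, size_t len);

typedef enum {
    READ_OK,         /* both reads match the expected data */
    READ_STABLE_BAD, /* reads agree with each other, but not with expected */
    READ_UNSTABLE,   /* the two reads disagree with each other */
    READ_IO_ERROR    /* the driver reported a failure */
} readback_verdict;

/* Read the same block twice and classify the result. */
readback_verdict double_read_check(block_read_fn read_block, uint32_t block,
                                   const uint8_t *expected,
                                   uint8_t *buf_a, uint8_t *buf_b, size_t len)
{
    assert(read_block && expected && buf_a && buf_b);

    if (read_block(block, buf_a, len) != 0 ||
        read_block(block, buf_b, len) != 0)
        return READ_IO_ERROR;

    /* Two reads of unchanged storage should be identical. */
    if (memcmp(buf_a, buf_b, len) != 0)
        return READ_UNSTABLE;

    return memcmp(buf_a, expected, len) == 0 ? READ_OK : READ_STABLE_BAD;
}

/* Demo-only stand-in for the eMMC driver: fills the buffer with the low
 * byte of the block number and flips one bit on every second call to
 * imitate an unstable read path. */
unsigned demo_calls;

int demo_flaky_read(uint32_t block, uint8_t *dst, size_t len)
{
    memset(dst, (int)(block & 0xFFu), len);
    if ((++demo_calls % 2u) == 0 && len > 0)
        dst[0] ^= 0x01u;
    return 0;
}
```

My assumption is that `READ_UNSTABLE` would point more towards the read path (bus, DMA, cache or DDR), while `READ_STABLE_BAD` would point more towards the stored data or the write path. With the data cache enabled, the second read may be served differently from the first, so the result is only a hint.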
I compared VDDIODDR, DDR_VREF and VDDCORE between V1 and V2 boards and found that the peak-to-peak ripple values are closely matched, if not improved, in V2, yet these problems seem to happen only on the V2 design.
The errors I'm seeing are read mismatches when reading back from the eMMC (sometimes only a few bits differ, other times it's the whole byte). After a mismatch the test continues until the code hits a translation/data fault, an undefined instruction or a software interrupt.
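To put numbers on "a few bits" versus "the whole byte", the two buffers can be XORed byte by byte and the differences counted. The following is a minimal sketch under my own assumptions, not the code running on the boards. `mismatch_stats` and `classify_mismatches` are hypothetical names, and the per-bit "lane" counters only refer to bit positions within a byte, not to specific DDR or eMMC data lines.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical summary of one buffer comparison. */
typedef struct {
    size_t bad_bytes;        /* bytes that differ at all */
    size_t single_bit_bytes; /* bytes differing in exactly one bit */
    size_t full_byte_bytes;  /* bytes where all 8 bits differ */
    size_t lane_errors[8];   /* flips per bit position within a byte */
    size_t first_bad_offset; /* offset of first mismatch, or len if none */
} mismatch_stats;

/* Count the set bits in one byte. */
static unsigned popcount8(uint8_t v)
{
    unsigned n = 0;
    while (v) {
        n += v & 1u;
        v >>= 1;
    }
    return n;
}

/* Compare the written (expected) and read-back (actual) buffers. */
mismatch_stats classify_mismatches(const uint8_t *expected,
                                   const uint8_t *actual, size_t len)
{
    mismatch_stats s = {0};

    assert(expected != NULL && actual != NULL);
    s.first_bad_offset = len;

    for (size_t i = 0; i < len; i++) {
        uint8_t diff = expected[i] ^ actual[i]; /* set bits mark flips */
        if (!diff)
            continue;

        if (s.first_bad_offset == len)
            s.first_bad_offset = i;
        s.bad_bytes++;

        unsigned bits = popcount8(diff);
        if (bits == 1)
            s.single_bit_bytes++;
        if (bits == 8)
            s.full_byte_bytes++;

        for (unsigned b = 0; b < 8; b++)
            if (diff & (1u << b))
                s.lane_errors[b]++;
    }
    return s;
}
```

Logging these counters per failing run should show whether the flips concentrate on particular bit positions or whether the failing offsets cluster at some alignment, which is easier to reason about than raw hex dumps.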
I suspected that the software might be the cause, but the same software working quite well on V1 pointed me back to fundamental HW issues.
I'd really appreciate your thoughts on this issue, as it's getting really frustrating. I can provide snippets of my schematic/layout if needed.