FEDEVEL
Platform forum

Unexpected behavior of custom hardware design

Kulunu , 08-07-2018, 10:00 AM
Hi All,

I have done DDR memory calibration for my custom i.MX6Q hardware design using the NXP stress tester version 2.9 at ambient temperature, and got proper and reliable read, write, and DQS gating values for the DDR memory. I then updated the U-Boot configs according to those values, as shown below.

Design passed all memory tests.

Memory AS4C256M16D3A-12BCN



============================================
DDR Stress Test (2.6.0)
Build: Aug 1 2017, 17:33:25
NXP Semiconductors.
============================================

============================================
Chip ID
CHIP ID = i.MX6 Dual/Quad (0x63)
Internal Revision = TO1.2
============================================

============================================
Boot Configuration
SRC_SBMR1(0x020d8004) = 0x18000030
SRC_SBMR2(0x020d801c) = 0x21000011
============================================

ARM Clock set to 1GHz

============================================
DDR configuration
BOOT_CFG3[5-4]: 0x00, Single DDR channel.
DDR type is DDR3
Data width: 64, bank num: 8
Row size: 15, col size: 10
Chip select CSD0 is used
Density per chip select: 2048MB
============================================

Current Temperature: 26
============================================

DDR Freq: 396 MHz

ddr_mr1=0x00000004
Start write leveling calibration...
running Write level HW calibration
Write leveling calibration completed, update the following registers in your initialization script
MMDC_MPWLDECTRL0 ch0 (0x021b080c) = 0x00190022
MMDC_MPWLDECTRL1 ch0 (0x021b0810) = 0x00250017
MMDC_MPWLDECTRL0 ch1 (0x021b480c) = 0x0011001C
MMDC_MPWLDECTRL1 ch1 (0x021b4810) = 0x0010001E
Write DQS delay result:
Write DQS0 delay: 34/256 CK
Write DQS1 delay: 25/256 CK
Write DQS2 delay: 23/256 CK
Write DQS3 delay: 37/256 CK
Write DQS4 delay: 28/256 CK
Write DQS5 delay: 17/256 CK
Write DQS6 delay: 30/256 CK
Write DQS7 delay: 16/256 CK
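
The per-lane figures above are the low 7 bits of each 16-bit half of the MPWLDECTRL registers, in units of 1/256 of a clock cycle. A small Python sketch of that unpacking, with a conversion to picoseconds at the 396 MHz reported in this log. The helper names are illustrative, and the half-cycle and cycle delay fields in the upper bits are ignored here (they are zero in these values):

```python
def wl_delays(reg):
    """Split one MPWLDECTRLn value into its two write-leveling delays (1/256 CK units)."""
    return reg & 0x7F, (reg >> 16) & 0x7F


def to_ps(delay_256ths, ddr_mhz=396):
    """Convert a delay given in 1/256 clock cycles to picoseconds."""
    tck_ps = 1_000_000 / ddr_mhz  # clock period in ps
    return delay_256ths / 256 * tck_ps


# MPWLDECTRL0/1 for channel 0, then channel 1, as printed in the log.
regs = [0x00190022, 0x00250017, 0x0011001C, 0x0010001E]
delays = [d for reg in regs for d in wl_delays(reg)]

for lane, d in enumerate(delays):
    print(f"DQS{lane}: {d}/256 CK = {to_ps(d):.0f} ps")
```

The unpacked values reproduce the eight "Write DQSn delay" lines, which is a quick way to confirm that the register words and the per-lane report agree.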

Starting DQS gating calibration
. HC_DEL=0x00000000 result[00]=0x11111111
. HC_DEL=0x00000001 result[01]=0x11011111
. HC_DEL=0x00000002 result[02]=0x00000000
. HC_DEL=0x00000003 result[03]=0x00000000
. HC_DEL=0x00000004 result[04]=0x11111111
. HC_DEL=0x00000005 result[05]=0x11111111
. HC_DEL=0x00000006 result[06]=0x11111111
. HC_DEL=0x00000007 result[07]=0x11111111
. HC_DEL=0x00000008 result[08]=0x11111111
. HC_DEL=0x00000009 result[09]=0x11111111
. HC_DEL=0x0000000A result[0A]=0x11111111
. HC_DEL=0x0000000B result[0B]=0x11111111
. HC_DEL=0x0000000C result[0C]=0x11111111
. HC_DEL=0x0000000D result[0D]=0x11111111
DQS HC delay value low1 = 0x02020202, high1=0x03030303
DQS HC delay value low2 = 0x02020102, high2=0x03030303

loop ABS offset to get HW_DG_LOW
. ABS_OFFSET=0x00000000 result[00]=0x11111111
. ABS_OFFSET=0x00000004 result[01]=0x11011111
. ABS_OFFSET=0x00000008 result[02]=0x10011111
. ABS_OFFSET=0x0000000C result[03]=0x10011111
. ABS_OFFSET=0x00000010 result[04]=0x10011111
. ABS_OFFSET=0x00000014 result[05]=0x10111111
. ABS_OFFSET=0x00000018 result[06]=0x10011111
. ABS_OFFSET=0x0000001C result[07]=0x10111111
. ABS_OFFSET=0x00000020 result[08]=0x00010111
. ABS_OFFSET=0x00000024 result[09]=0x00010111
. ABS_OFFSET=0x00000028 result[0A]=0x00010111
. ABS_OFFSET=0x0000002C result[0B]=0x00010001
. ABS_OFFSET=0x00000030 result[0C]=0x00000000
. ABS_OFFSET=0x00000034 result[0D]=0x00000000
. ABS_OFFSET=0x00000038 result[0E]=0x00000000
. ABS_OFFSET=0x0000003C result[0F]=0x00000000
. ABS_OFFSET=0x00000040 result[10]=0x00000000
. ABS_OFFSET=0x00000044 result[11]=0x00000000
. ABS_OFFSET=0x00000048 result[12]=0x00000000
. ABS_OFFSET=0x0000004C result[13]=0x00000000
. ABS_OFFSET=0x00000050 result[14]=0x00000000
. ABS_OFFSET=0x00000054 result[15]=0x00000000
. ABS_OFFSET=0x00000058 result[16]=0x00000000
. ABS_OFFSET=0x0000005C result[17]=0x00000000
. ABS_OFFSET=0x00000060 result[18]=0x00000000
. ABS_OFFSET=0x00000064 result[19]=0x00000000
. ABS_OFFSET=0x00000068 result[1A]=0x00000000
. ABS_OFFSET=0x0000006C result[1B]=0x00000000
. ABS_OFFSET=0x00000070 result[1C]=0x00000000
. ABS_OFFSET=0x00000074 result[1D]=0x00000000
. ABS_OFFSET=0x00000078 result[1E]=0x00000000
. ABS_OFFSET=0x0000007C result[1F]=0x00000000

loop ABS offset to get HW_DG_HIGH
. ABS_OFFSET=0x00000000 result[00]=0x00000000
. ABS_OFFSET=0x00000004 result[01]=0x00000000
. ABS_OFFSET=0x00000008 result[02]=0x00000000
. ABS_OFFSET=0x0000000C result[03]=0x00000000
. ABS_OFFSET=0x00000010 result[04]=0x00000000
. ABS_OFFSET=0x00000014 result[05]=0x00000000
. ABS_OFFSET=0x00000018 result[06]=0x00000000
. ABS_OFFSET=0x0000001C result[07]=0x00000000
. ABS_OFFSET=0x00000020 result[08]=0x00000000
. ABS_OFFSET=0x00000024 result[09]=0x00000000
. ABS_OFFSET=0x00000028 result[0A]=0x00000000
. ABS_OFFSET=0x0000002C result[0B]=0x00000000
. ABS_OFFSET=0x00000030 result[0C]=0x00000000
. ABS_OFFSET=0x00000034 result[0D]=0x00000000
. ABS_OFFSET=0x00000038 result[0E]=0x00000000
. ABS_OFFSET=0x0000003C result[0F]=0x00000000
. ABS_OFFSET=0x00000040 result[10]=0x01000000
. ABS_OFFSET=0x00000044 result[11]=0x01000000
. ABS_OFFSET=0x00000048 result[12]=0x01000000
. ABS_OFFSET=0x0000004C result[13]=0x01001000
. ABS_OFFSET=0x00000050 result[14]=0x01001010
. ABS_OFFSET=0x00000054 result[15]=0x01001110
. ABS_OFFSET=0x00000058 result[16]=0x01001111
. ABS_OFFSET=0x0000005C result[17]=0x01001111
. ABS_OFFSET=0x00000060 result[18]=0x11101111
. ABS_OFFSET=0x00000064 result[19]=0x11101111
. ABS_OFFSET=0x00000068 result[1A]=0x11101111
. ABS_OFFSET=0x0000006C result[1B]=0x11111111
. ABS_OFFSET=0x00000070 result[1C]=0x11111111
. ABS_OFFSET=0x00000074 result[1D]=0x11111111
. ABS_OFFSET=0x00000078 result[1E]=0x11111111
. ABS_OFFSET=0x0000007C result[1F]=0x11111111


BYTE 0:
Start: HC=0x01 ABS=0x30
End: HC=0x03 ABS=0x54
Mean: HC=0x02 ABS=0x42
End-0.5*tCK: HC=0x02 ABS=0x54
Final: HC=0x02 ABS=0x54
BYTE 1:
Start: HC=0x01 ABS=0x2C
End: HC=0x03 ABS=0x4C
Mean: HC=0x02 ABS=0x3C
End-0.5*tCK: HC=0x02 ABS=0x4C
Final: HC=0x02 ABS=0x4C
BYTE 2:
Start: HC=0x01 ABS=0x2C
End: HC=0x03 ABS=0x50
Mean: HC=0x02 ABS=0x3E
End-0.5*tCK: HC=0x02 ABS=0x50
Final: HC=0x02 ABS=0x50
BYTE 3:
Start: HC=0x01 ABS=0x20
End: HC=0x03 ABS=0x48
Mean: HC=0x02 ABS=0x34
End-0.5*tCK: HC=0x02 ABS=0x48
Final: HC=0x02 ABS=0x48
BYTE 4:
Start: HC=0x01 ABS=0x30
End: HC=0x03 ABS=0x68
Mean: HC=0x02 ABS=0x4C
End-0.5*tCK: HC=0x02 ABS=0x68
Final: HC=0x02 ABS=0x68
BYTE 5:
Start: HC=0x00 ABS=0x20
End: HC=0x03 ABS=0x5C
Mean: HC=0x01 ABS=0x7D
End-0.5*tCK: HC=0x02 ABS=0x5C
Final: HC=0x02 ABS=0x5C
BYTE 6:
Start: HC=0x01 ABS=0x08
End: HC=0x03 ABS=0x3C
Mean: HC=0x02 ABS=0x22
End-0.5*tCK: HC=0x02 ABS=0x3C
Final: HC=0x02 ABS=0x3C
BYTE 7:
Start: HC=0x01 ABS=0x20
End: HC=0x03 ABS=0x5C
Mean: HC=0x02 ABS=0x3E
End-0.5*tCK: HC=0x02 ABS=0x5C
Final: HC=0x02 ABS=0x5C

DQS calibration MMDC0 MPDGCTRL0 = 0x024C0254, MPDGCTRL1 = 0x02480250

DQS calibration MMDC1 MPDGCTRL0 = 0x025C0268, MPDGCTRL1 = 0x025C023C

Note: Array result[] holds the DRAM test result of each byte.
0: test pass. 1: test fail
4 bits respresent the result of 1 byte.
result 00000001:byte 0 fail.
result 00000011:byte 0, 1 fail.

Starting Read calibration...

ABS_OFFSET=0x00000000 result[00]=0x11111111
ABS_OFFSET=0x04040404 result[01]=0x11111111
ABS_OFFSET=0x08080808 result[02]=0x11111111
ABS_OFFSET=0x0C0C0C0C result[03]=0x11111111
ABS_OFFSET=0x10101010 result[04]=0x11111011
ABS_OFFSET=0x14141414 result[05]=0x01111001
ABS_OFFSET=0x18181818 result[06]=0x00011000
ABS_OFFSET=0x1C1C1C1C result[07]=0x00010000
ABS_OFFSET=0x20202020 result[08]=0x00000000
ABS_OFFSET=0x24242424 result[09]=0x00000000
ABS_OFFSET=0x28282828 result[0A]=0x00000000
ABS_OFFSET=0x2C2C2C2C result[0B]=0x00000000
ABS_OFFSET=0x30303030 result[0C]=0x00000000
ABS_OFFSET=0x34343434 result[0D]=0x00000000
ABS_OFFSET=0x38383838 result[0E]=0x00000000
ABS_OFFSET=0x3C3C3C3C result[0F]=0x00000000
ABS_OFFSET=0x40404040 result[10]=0x00000000
ABS_OFFSET=0x44444444 result[11]=0x00000000
ABS_OFFSET=0x48484848 result[12]=0x00000000
ABS_OFFSET=0x4C4C4C4C result[13]=0x00000000
ABS_OFFSET=0x50505050 result[14]=0x00000000
ABS_OFFSET=0x54545454 result[15]=0x00000000
ABS_OFFSET=0x58585858 result[16]=0x00000000
ABS_OFFSET=0x5C5C5C5C result[17]=0x00000000
ABS_OFFSET=0x60606060 result[18]=0x00000000
ABS_OFFSET=0x64646464 result[19]=0x00000110
ABS_OFFSET=0x68686868 result[1A]=0x00001111
ABS_OFFSET=0x6C6C6C6C result[1B]=0x11001111
ABS_OFFSET=0x70707070 result[1C]=0x11101111
ABS_OFFSET=0x74747474 result[1D]=0x11101111
ABS_OFFSET=0x78787878 result[1E]=0x11111111
ABS_OFFSET=0x7C7C7C7C result[1F]=0x11111111

Byte 0: (0x18 - 0x64), middle value:0x3e
Byte 1: (0x14 - 0x60), middle value:0x3a
Byte 2: (0x10 - 0x60), middle value:0x38
Byte 3: (0x1c - 0x64), middle value:0x40
Byte 4: (0x20 - 0x74), middle value:0x4a
Byte 5: (0x18 - 0x6c), middle value:0x42
Byte 6: (0x18 - 0x68), middle value:0x40
Byte 7: (0x14 - 0x68), middle value:0x3e

MMDC0 MPRDDLCTL = 0x40383A3E, MMDC1 MPRDDLCTL = 0x3E40424A

Starting Write calibration...

ABS_OFFSET=0x00000000 result[00]=0x11111111
ABS_OFFSET=0x04040404 result[01]=0x11111111
ABS_OFFSET=0x08080808 result[02]=0x11111110
ABS_OFFSET=0x0C0C0C0C result[03]=0x10110010
ABS_OFFSET=0x10101010 result[04]=0x10110010
ABS_OFFSET=0x14141414 result[05]=0x10100000
ABS_OFFSET=0x18181818 result[06]=0x00000000
ABS_OFFSET=0x1C1C1C1C result[07]=0x00000000
ABS_OFFSET=0x20202020 result[08]=0x00000000
ABS_OFFSET=0x24242424 result[09]=0x00000000
ABS_OFFSET=0x28282828 result[0A]=0x00000000
ABS_OFFSET=0x2C2C2C2C result[0B]=0x00000000
ABS_OFFSET=0x30303030 result[0C]=0x00000000
ABS_OFFSET=0x34343434 result[0D]=0x00000000
ABS_OFFSET=0x38383838 result[0E]=0x00000000
ABS_OFFSET=0x3C3C3C3C result[0F]=0x00000000
ABS_OFFSET=0x40404040 result[10]=0x00000000
ABS_OFFSET=0x44444444 result[11]=0x00000000
ABS_OFFSET=0x48484848 result[12]=0x00000000
ABS_OFFSET=0x4C4C4C4C result[13]=0x00000000
ABS_OFFSET=0x50505050 result[14]=0x00000000
ABS_OFFSET=0x54545454 result[15]=0x00000000
ABS_OFFSET=0x58585858 result[16]=0x00000000
ABS_OFFSET=0x5C5C5C5C result[17]=0x00000000
ABS_OFFSET=0x60606060 result[18]=0x00000000
ABS_OFFSET=0x64646464 result[19]=0x00001000
ABS_OFFSET=0x68686868 result[1A]=0x00001000
ABS_OFFSET=0x6C6C6C6C result[1B]=0x00001111
ABS_OFFSET=0x70707070 result[1C]=0x00001111
ABS_OFFSET=0x74747474 result[1D]=0x01001111
ABS_OFFSET=0x78787878 result[1E]=0x01001111
ABS_OFFSET=0x7C7C7C7C result[1F]=0x11011111

Byte 0: (0x08 - 0x68), middle value:0x38
Byte 1: (0x14 - 0x68), middle value:0x3e
Byte 2: (0x0c - 0x68), middle value:0x3a


Byte 5: (0x18 - 0x7c), middle value:0x4a
Byte 6: (0x0c - 0x70), middle value:0x3e
Byte 7: (0x18 - 0x78), middle value:0x48

MMDC0 MPWRDLCTL = 0x363A3E38,MMDC1 MPWRDLCTL = 0x483E4A46


MMDC registers updated from calibration

Write leveling calibration
MMDC_MPWLDECTRL0 ch0 (0x021b080c) = 0x00190022
MMDC_MPWLDECTRL1 ch0 (0x021b0810) = 0x00250017
MMDC_MPWLDECTRL0 ch1 (0x021b480c) = 0x0011001C
MMDC_MPWLDECTRL1 ch1 (0x021b4810) = 0x0010001E

Read DQS Gating calibration
MPDGCTRL0 PHY0 (0x021b083c) = 0x024C0254
MPDGCTRL1 PHY0 (0x021b0840) = 0x02480250
MPDGCTRL0 PHY1 (0x021b483c) = 0x025C0268
MPDGCTRL1 PHY1 (0x021b4840) = 0x025C023C

Read calibration
MPRDDLCTL PHY0 (0x021b0848) = 0x40383A3E
MPRDDLCTL PHY1 (0x021b4848) = 0x3E40424A

Write calibration
MPWRDLCTL PHY0 (0x021b0850) = 0x363A3E38
MPWRDLCTL PHY1 (0x021b4850) = 0x483E4A46


Success: DDR calibration completed!!!
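
The pass windows reported for each byte lane can be re-derived from the result[] tables using the encoding given in the tool's note (one hex digit per byte, 0 = pass, 1 = fail). A minimal Python sketch, assuming the sweep is entered as (offset, result) pairs and that the window is the longest run of passing offsets. The function name `pass_window` is illustrative and is not part of the NXP tool:

```python
def pass_window(sweep, byte):
    """Return (start, end, middle) of the longest passing run for one byte lane.

    sweep: list of (abs_offset, result) pairs in sweep order.
    byte:  lane index 0..7; lane n is hex digit n counted from the right.
    """
    best = None
    run_start = None
    for offset, result in sweep:
        failed = (result >> (4 * byte)) & 0xF
        if failed:
            run_start = None
            continue
        if run_start is None:
            run_start = offset
        if best is None or offset - run_start > best[1] - best[0]:
            best = (run_start, offset)
    if best is None:
        return None
    start, end = best
    return start, end, (start + end) // 2


# Write calibration sweep from the log: offsets 0x00..0x7C in steps of 4.
results = (
    [0x11111111, 0x11111111, 0x11111110, 0x10110010, 0x10110010, 0x10100000]
    + [0x00000000] * 19
    + [0x00001000, 0x00001000, 0x00001111, 0x00001111,
       0x01001111, 0x01001111, 0x11011111]
)
sweep = [(4 * i, r) for i, r in enumerate(results)]

for lane in range(8):
    start, end, mid = pass_window(sweep, lane)
    print(f"Byte {lane}: (0x{start:02x} - 0x{end:02x}), middle value:0x{mid:02x}")
```

Run against the write sweep, the midpoints for lanes 0 to 3 come out as 0x38, 0x3e, 0x3a, 0x36 and for lanes 4 to 7 as 0x46, 0x4a, 0x3e, 0x48, which matches the MPWRDLCTL values 0x363A3E38 and 0x483E4A46 in the summary.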




--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

DDR configuration file

/*
* Copyright (C) 2013 Boundary Devices
*
* SPDX-License-Identifier: GPL-2.0+
*
* Refer doc/README.imximage for more details about how-to configure
* and create imximage boot image
*
* The syntax is taken as close as possible with the kwbimage
*/

/* image version */
IMAGE_VERSION 2

/* 2GB configuration for Tengri Stack 2 Rev 2
* Boot Device : one of
* spi, sd (the board has no nand neither onenand)
*/
BOOT_FROM spi

#define __ASSEMBLY__
#include <config.h>
#ifdef CONFIG_SECURE_BOOT
CSF CONFIG_CSF_SIZE
#endif
#include "asm/arch/mx6-ddr.h" // DDR Script I/O settings
#include "asm/arch/iomux.h"
#include "asm/arch/crm_regs.h"

/* Kulunu VTOS Change */
DATA 4 0x020C4018 0x00260324
/* IOMUX configuration: DDR DQS signals */
DATA 4 0x020E05A8 0x00000028
DATA 4 0x020E05B0 0x00000028
DATA 4 0x020E0524 0x00000028
DATA 4 0x020E051C 0x00000028

DATA 4 0x020E0518 0x00000028
DATA 4 0x020E050C 0x00000028
DATA 4 0x020E05B8 0x00000028
DATA 4 0x020E05C0 0x00000028

/* IOMUX configuration: DDR DQ signals */
DATA 4 0x020E05AC 0x00000028
DATA 4 0x020E05B4 0x00000028
DATA 4 0x020E0528 0x00000028
DATA 4 0x020E0520 0x00000028

DATA 4 0x020E0514 0x00000028
DATA 4 0x020E0510 0x00000028
DATA 4 0x020E05BC 0x00000028
DATA 4 0x020E05C4 0x00000028

/* IOMUX configuration: DDR control signals */
DATA 4 0x020E056C 0x00000028
DATA 4 0x020E0578 0x00000028
DATA 4 0x020E0588 0x00000028
DATA 4 0x020E0594 0x00000028

DATA 4 0x020E057C 0x00000028
DATA 4 0x020E0590 0x00003000
DATA 4 0x020E0598 0x00003000
DATA 4 0x020E058C 0x00000000

DATA 4 0x020E059C 0x00000028
DATA 4 0x020E05A0 0x00000028

/* IOMUX configuration: DDR group control */
DATA 4 0x020E0784 0x00000028
DATA 4 0x020E0788 0x00000028
DATA 4 0x020E0794 0x00000028
DATA 4 0x020E079C 0x00000028

DATA 4 0x020E07A0 0x00000028
DATA 4 0x020E07A4 0x00000028
DATA 4 0x020E07A8 0x00000028
DATA 4 0x020E0748 0x00000028

DATA 4 0x020E074C 0x00000028
DATA 4 0x020E0750 0x00020000
DATA 4 0x020E0758 0x00000000
DATA 4 0x020E0774 0x00020000

DATA 4 0x020E078C 0x00000028
DATA 4 0x020E0798 0x000C0000

/* MMDC: PHY 1 Read Delay Registers */
DATA 4 0x021B081C 0x33333333
DATA 4 0x021B0820 0x33333333
DATA 4 0x021B0824 0x33333333
DATA 4 0x021B0828 0x33333333

/* MMDC: PHY 2 Read Delay Registers */
DATA 4 0x021B481C 0x33333333
DATA 4 0x021B4820 0x33333333
DATA 4 0x021B4824 0x33333333
DATA 4 0x021B4828 0x33333333

DATA 4 0x021B0018 0x00011740

/* MMDC: initialization sequence */
DATA 4 0x021B001C 0x00008000
DATA 4 0x021B000C 0x676B5333 //
DATA 4 0x021B0010 0xB66D8B63
DATA 4 0x021B0014 0x01FF00DB
DATA 4 0x021B002C 0x000026d2

DATA 4 0x021B0030 0x006B1023
DATA 4 0x021B0008 0x00333040
DATA 4 0x021B0004 0x0002002D //new
DATA 4 0x021B0040 0x00000047
DATA 4 0x021B0000 0x841A0000

/* MMDC: write mode registers: 2,3,1,0 */
DATA 4 0x021B001C 0x02808032
DATA 4 0x021B001C 0x00008033
DATA 4 0x021B001C 0x00048031
DATA 4 0x021B001C 0x15208030
/* MMDC: Enable ZQ calibration */
DATA 4 0x021B001C 0x04008040
DATA 4 0x021B0800 0xa1390003
DATA 4 0x021B4800 0xa1390003
DATA 4 0x021B0020 0x00007800
DATA 4 0x021B0818 0x00022227
DATA 4 0x021B4818 0x00022227


/*Read DQS Gating calibration*/
DATA 4 0x021B083C 0x024C0254
DATA 4 0x021B0840 0x02480250
DATA 4 0x021B483C 0x025C0268
DATA 4 0x021B4840 0x025C023C

/*Read calibration*/
DATA 4 0x021B0848 0x40383A3E
DATA 4 0x021B4848 0x3E40424A

/*Write calibration*/
DATA 4 0x021B0850 0x363A3E38
DATA 4 0x021B4850 0x483E4A46


/*Write leveling calibration*/
DATA 4 0x021B080C 0x00190022
DATA 4 0x021B0810 0x00250017
DATA 4 0x021B480C 0x0011001C
DATA 4 0x021B4810 0x0010001E

DATA 4 0x021B08B8 0x00000800
DATA 4 0x021B48B8 0x00000800

DATA 4 0x021B001C 0x00000000
DATA 4 0x021B0404 0x00011006

DATA 4 0x021b0004 0x0002556D

/* set the default clock gate to save power */
DATA 4, CCM_CCGR0, 0x00C03F3F
DATA 4, CCM_CCGR1, 0x0030FC03
DATA 4, CCM_CCGR2, 0x0FFFC000
DATA 4, CCM_CCGR3, 0x3FF00000
DATA 4, CCM_CCGR4, 0x00FFF300
DATA 4, CCM_CCGR5, 0x0F0000C3
DATA 4, CCM_CCGR6, 0x000003FF

/* enable AXI cache for VDOA/VPU/IPU */
DATA 4, MX6_IOMUXC_GPR4, 0xF00000CF
#ifdef CONFIG_MX6QP
/* set IPU AXI-id1 Qos=0x1 AXI-id0/2/3 Qos=0x7 */
DATA 4, MX6_IOMUXC_GPR6, 0x77177717
DATA 4, MX6_IOMUXC_GPR7, 0x77177717
#else
/* set IPU AXI-id0 Qos=0xf(bypass) AXI-id1 Qos=0x7 */
DATA 4, MX6_IOMUXC_GPR6, 0x007F007F
DATA 4, MX6_IOMUXC_GPR7, 0x007F007F
#endif

/*
* Setup CCM_CCOSR register as follows:
*
* cko1_en = 1 --> CKO1 enabled
* cko1_div = 111 --> divide by 8
* cko1_sel = 1011 --> ahb_clk_root
*
* This sets CKO1 at ahb_clk_root/8 = 132/8 = 16.5 MHz
*/
DATA 4, CCM_CCOSR, 0x000000fb
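
One way to rule out a transcription error is to compare the calibration registers in the configuration file against the summary printed at the end of the stress-test log. A minimal Python sketch, assuming the file uses the `DATA 4 <address> <value>` form shown above; the function names are illustrative, and lines that use symbolic register names are skipped rather than resolved:

```python
import re

# Matches "DATA 4 0xADDR 0xVALUE", with or without commas. Lines that use
# symbolic register names (e.g. CCM_CCGR0) do not match and are skipped.
DATA_RE = re.compile(r"^\s*DATA\s+4,?\s+(0x[0-9A-Fa-f]+),?\s+(0x[0-9A-Fa-f]+)")


def parse_data_lines(cfg_text):
    """Map each numeric address to the last value written to it."""
    writes = {}
    for line in cfg_text.splitlines():
        m = DATA_RE.match(line)
        if m:
            writes[int(m.group(1), 16)] = int(m.group(2), 16)
    return writes


def mismatches(writes, expected):
    """Return {addr: (value_in_cfg, value_from_log)} for every disagreement."""
    return {a: (writes.get(a), v) for a, v in expected.items() if writes.get(a) != v}


# Values from the "MMDC registers updated from calibration" summary in the log.
CALIBRATION = {
    0x021B080C: 0x00190022, 0x021B0810: 0x00250017,
    0x021B480C: 0x0011001C, 0x021B4810: 0x0010001E,
    0x021B083C: 0x024C0254, 0x021B0840: 0x02480250,
    0x021B483C: 0x025C0268, 0x021B4840: 0x025C023C,
    0x021B0848: 0x40383A3E, 0x021B4848: 0x3E40424A,
    0x021B0850: 0x363A3E38, 0x021B4850: 0x483E4A46,
}

# Calibration lines copied from the configuration file above.
CFG = """
DATA 4 0x021B083C 0x024C0254
DATA 4 0x021B0840 0x02480250
DATA 4 0x021B483C 0x025C0268
DATA 4 0x021B4840 0x025C023C
DATA 4 0x021B0848 0x40383A3E
DATA 4 0x021B4848 0x3E40424A
DATA 4 0x021B0850 0x363A3E38
DATA 4 0x021B4850 0x483E4A46
DATA 4 0x021B080C 0x00190022
DATA 4 0x021B0810 0x00250017
DATA 4 0x021B480C 0x0011001C
DATA 4 0x021B4810 0x0010001E
"""

bad = mismatches(parse_data_lines(CFG), CALIBRATION)
print("all calibration values match" if not bad else bad)
```

For the file as posted, all twelve calibration values agree with the log, so the calibration results themselves were copied correctly. The check says nothing about the timing, mode-register, or IOMUX drive-strength settings, which would need to be reviewed separately.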






The system was running properly on the day I calibrated it, but after a few days I started getting kernel panics.

Also, I have 50 of these units: some work properly and some get kernel panics.
  1. Does memory calibration depend on humidity or moisture?
  2. Can you see any fault in my configuration file?

Regards,
Kulunu.
robertferanec , 08-08-2018, 08:41 AM
Use stressapptest to test your hardware: http://www.imx6rex.com/software/stre...stressapptest/

I tried a couple of different memory test programs, and stressapptest was the best one. Try it and let me know the results.

PS: Be sure to use a heatsink during stressapptest, otherwise heat can influence the reliability. Also, I recommend testing your boards while running the OS from a SATA HDD. SD cards are not reliable and can easily get corrupted.
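
For running the same test on many boards, it can help to build the stressapptest command line in one place so every unit gets identical parameters. A minimal Python sketch, assuming the commonly documented flags `-s` (seconds), `-M` (megabytes to test), `-m` (memory copy threads) and `-W` (more CPU-stressful copy); the duration, size, and thread count below are placeholders, and the flags should be checked against `stressapptest -h` on your build:

```python
def stressapptest_cmd(seconds, mbytes, copy_threads):
    """Build the argument list for one stressapptest run."""
    return [
        "stressapptest",
        "-s", str(seconds),       # run time in seconds
        "-M", str(mbytes),        # amount of RAM to test, in MB
        "-m", str(copy_threads),  # number of memory copy threads
        "-W",                     # use the more CPU-stressful copy
    ]


# Placeholder values: one hour, 1024 MB, four copy threads.
cmd = stressapptest_cmd(3600, 1024, 4)
print(" ".join(cmd))
```

The list can be passed directly to `subprocess.run(cmd)` on the target board.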
Kulunu , 08-15-2018, 12:09 AM
Dear Robert,

Thank you for your feedback. Here I have attached the stressapptest results of 4 boards. I ran it for 12 hours and didn't get any errors, only the following messages. All the boards showed a pass message at the end.

Log: Seconds remaining: 1210
Log: Seconds remaining: 1200
Log: Pausing worker threads in preparation for power spike (1200 seconds remaining)
Log: Seconds remaining: 1190
Log: Resuming worker threads to cause a power spike (1185 seconds remaining)
Log: Seconds remaining: 1179
Log: Seconds remaining: 1170

I have also attached the NXP calibration log.

1) In your opinion, why are these boards failing with segmentation faults and kernel panics while running the user application, when other boards in the same batch work really well without any issue?

2) Can we say there is no issue with the memory layout or hardware design, since some boards work well?

3) Could this be a manufacturing or component issue? If so, how can some boards in the same batch work really well without any issue?

4) In your experience, how do you investigate issues like this to narrow down the failure rate?

Thanks and Best regards,
Kulunu
robertferanec , 08-15-2018, 01:57 AM
If stressapptest works OK, the problem could easily be the SD card. Are you using SD cards or SATA? If SD cards, which manufacturer are you using? There are huge differences in reliability between good SD cards and cheap SD cards.
Comments:
Kulunu, 08-15-2018, 03:26 AM
Hi Robert, I'm using eMMC and an SD card, but the problem is the same. If the SD card were the problem, how could the same SD card work on the good boards? I get this problem even with the eMMC. I'm using 16GB Kingston SD cards. What do you think about the design, given that some of the boards work really well? Have you ever had manufacturing problems while making more prototypes (where some work really well and some do not)? Waiting for your quick response. Thank you, Kulunu.
robertferanec , 08-15-2018, 04:42 AM
"What do you think about the design, given that some of the boards work really well? Have you ever had manufacturing problems while making more prototypes (where some work really well and some do not)?"
- This means there is something wrong. If everything were OK, all the boards would work.

Another way to find out what is wrong is to test your boards in an environmental chamber, e.g. run them at -20 to -40 °C (depending on the components you used). Often, if there is something wrong with the hardware, even boards that ran OK on your bench start failing in the environmental chamber.

PS: We use SanDisk SD cards a lot (we have had a lot of failures with Kingston). Maybe try a different SD card manufacturer, just to check whether the boards that were failing before work with a different SD card. I guess you do not have SATA, as you have not mentioned it (a SATA HDD is the most reliable).
Comments:
Kulunu, 08-15-2018, 10:38 AM
Dear Robert, could you please suggest a low-cost environmental chamber that can do this kind of testing, and where I can buy one online? In your experience, what kinds of issues have you found by running stress tests in an environmental chamber? Did you find any fault in the OpenRex memory layout or hardware design? Regards, Kulunu.
robertferanec , 08-16-2018, 01:14 AM
- We rent an environmental chamber; we do not have our own.
- I have seen memory failing on different boards, not OpenRex. Here are some examples of what I have seen failing in the chamber: memory, crystals (board will not boot up), Ethernet, current leaking through connected connectors (board will not boot up), ...

PS: Do not forget, you may need to place a good heatsink on the processors when doing all the tests.

OpenRex passed all the tests. Here you can find more info: http://www.imx6rex.com/open-rex/soft...ental-chamber/


Comments:
Kulunu, 09-26-2018, 06:16 AM
Dear Robert, could you please suggest a company that can do hardware debugging using an environmental chamber? I'm in Sri Lanka (Asia), and it is quite difficult for me to find a company in my country that can do memory stress tests in an environmental chamber. Regards, Kulunu.
robertferanec , 10-01-2018, 12:30 AM
@Kulunu, I do not really know. You would need to search locally (you would want to be able to travel there). Search for companies that do certifications (e.g. EMC/EMI); they usually also have these machines.