Micron SDRAM bit decay (Refresh issue)

Question

I am using a Micron SDRAM "MT48LC8M16A2P" with a Cirrus Logic EP9307 microprocessor, and the system runs an RTOS. The SDRAM refresh interval is set to 5 us in the processor register, against the 15.625 us specified by the datasheet. I do not use a low-power mode, so no self-refresh commands are sent to the SDRAM.
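For reference, the 15.625 us figure is the 64 ms refresh period spread over 4096 rows, and a programmed interval can be sanity-checked against the bus clock. The following is a minimal sketch, assuming the refresh timer counts bus clock cycles; `refresh_cycles` is a hypothetical helper, not a name from the EP9307 documentation.

```c
#include <stdint.h>

/* Hypothetical helper: number of bus clock cycles that fit in one refresh
 * interval. The interval is given in nanoseconds to stay in integer math.
 * Assumption: the controller's refresh timer counts bus clock cycles. */
static uint32_t refresh_cycles(uint32_t bus_clk_hz, uint32_t interval_ns)
{
    /* cycles = frequency * time, rounded down so a refresh is never late */
    return (uint32_t)(((uint64_t)bus_clk_hz * interval_ns) / 1000000000ULL);
}

/* Per-row refresh interval in nanoseconds for a given refresh period
 * (in nanoseconds) and row count, e.g. 64 ms over 4096 rows. */
static uint32_t row_interval_ns(uint32_t period_ns, uint32_t rows)
{
    return period_ns / rows;
}
```

With a 92 MHz bus clock, `row_interval_ns(64000000u, 4096u)` gives 15625 ns, which corresponds to 1437 cycles, while the 5 us setting corresponds to 460 cycles. A 5 us interval refreshes more often than required, so on paper the setting errs on the safe side.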

Observations:

- I see bit rot in random sections of the SDRAM once multi-tasking starts. After about 10 minutes of runtime, the system takes a data abort out of nowhere.
- Known sections of the data memory get changed.
- I was able to avoid the issue by adding a cyclic refresh task that touches every SDRAM row, so that an explicit refresh is generated.
- However, I still see bit rot in the memory cells as soon as I connect the emulator to debug the code.
- No issue is seen with normal read and write operations to the SDRAM.
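The row-touching workaround can be sketched roughly as below. This is an illustration only: `touch_rows` and `row_stride_bytes` are hypothetical, and the right stride depends on how the memory controller maps CPU addresses to SDRAM rows, banks, and columns.

```c
#include <stddef.h>
#include <stdint.h>

/* Read one word from every row-sized stride in [base, base + size_bytes).
 * Assumption: row_stride_bytes matches the controller's address mapping,
 * so that consecutive strides land in different SDRAM rows. */
static uint32_t touch_rows(const volatile uint32_t *base,
                           size_t size_bytes, size_t row_stride_bytes)
{
    uint32_t sink = 0;
    size_t stride_words = row_stride_bytes / sizeof(uint32_t);
    size_t total_words = size_bytes / sizeof(uint32_t);

    if (stride_words == 0)
        return 0; /* stride too small to step; nothing is read */

    for (size_t i = 0; i < total_words; i += stride_words)
        sink ^= base[i]; /* volatile read forces an actual load */

    return sink; /* returned so the reads have an observable result */
}
```

Activating a row for a read restores its cell contents, which is why this has the same effect as a refresh. The reads only reach the SDRAM if the region is uncached or the lines are not already in the data cache.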

Questions:

- Could this be a refresh issue, and has anyone faced a similar situation?
- I have only done a one-time configuration of the internal SDRAM controller of the EP9307 microprocessor. Is there any configuration that needs to be updated at runtime?

Thanks in advance.

-Gaurav

Typically the host controller issues refreshes. You could verify this by sitting in a tight loop with interrupts locked and observing a few SDRAM control lines. If you see cycles, that is a refresh. It is hard to know what issue you have. Typical SDRAM will be much better than the spec, which is the worst case. SDRAM can last minutes in real life. Also, your system has a cache, so a bit flip in physical RAM may not be noticed for some time. The best way to verify is with a scope on a control line. Also, sometimes these lines are very noisy! It is possible that the hw routing is bad. — artless noise, Mar 22 '19 at 16:41
Crosstalk between lines and a bad row/column could also result in this crash. At SDRAM speeds the signal wavelength (300 MHz+ has a wavelength of ~100 cm or less) can be comparable to the size of a PCB. Control lines must be routed very closely with the data lines. Vias and plane differences (different layers) can cause very different signal arrival times. The signals need to arrive DDR->CPU or CPU->DDR at the same time as a group. They also need good conditioning (trace width versus ground/power plane) so that the impedance doesn't cause ringing. Another great reason to hook up a scope. — artless noise, Mar 22 '19 at 16:45
Related: [DDRs and DLL](https://electronics.stackexchange.com/questions/125942/do-i-need-to-reset-a-ddrs-dll-when-i-change-clock-frequencies), [ARM memory test](https://stackoverflow.com/questions/11640062/how-to-do-memory-test-on-arm-architecture-hardware-something-like-memtest86). Related to the hw traces, a temperature change (as components heat a case) can cause this issue to develop over time. Specific hw accesses can also trigger it; for instance, we only had crashes when doing a multi-megabyte SSH transfer over Ethernet. Ethernet DMA combined with intense CPU DDR access for SSH encryption triggered the issue. — artless noise, Mar 22 '19 at 17:30
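A simple retention check in the spirit of the memory-test link above could look like the following. It is a minimal sketch, not a full memory test: the function names and the multiplicative pattern are illustrative choices, and the delay between fill and verify (as well as any cache handling) is left to the caller.

```c
#include <stddef.h>
#include <stdint.h>

/* Address-derived pattern so every word has a distinct expected value. */
static uint32_t pattern_for(size_t index, uint32_t seed)
{
    return ((uint32_t)index * 2654435761u) ^ seed;
}

/* Fill the region under test with the pattern. */
static void fill_pattern(volatile uint32_t *mem, size_t words, uint32_t seed)
{
    for (size_t i = 0; i < words; i++)
        mem[i] = pattern_for(i, seed);
}

/* Re-read the region and count bits that differ from the pattern.
 * If first_bad is non-NULL, it receives the index of the first bad word. */
static size_t count_flipped_bits(const volatile uint32_t *mem, size_t words,
                                 uint32_t seed, size_t *first_bad)
{
    size_t flipped = 0;
    int found = 0;

    for (size_t i = 0; i < words; i++) {
        uint32_t diff = mem[i] ^ pattern_for(i, seed);
        if (diff != 0 && !found) {
            if (first_bad != NULL)
                *first_bad = i;
            found = 1;
        }
        while (diff != 0) {   /* clear the lowest set bit each pass */
            diff &= diff - 1;
            flipped++;
        }
    }
    return flipped;
}
```

Running the fill, then the workload suspected of triggering the problem, then the verify pass gives a flipped-bit count and a first failing offset to compare across runs. Because of the cache, the verify pass only reflects physical RAM if the region is uncached or the cache is invalidated first.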
Thank you for the inputs. I have validated the timings with a static simulation tool by adding all possible constraints, including PCB routing delays. Everything holds for Burst Read/Write and Auto Refresh. However, there is an erratum for the "EP9307" microprocessor stating that Auto-Precharge leads to instability above 50 MHz operation. I am operating at 92 MHz and have therefore disabled the "Auto-Precharge" bit in the MCU registers. So, do we have to issue a manual precharge after every read and write at application runtime? Could this be the issue? I am using only one bank of the SDRAM, i.e. 4MB out of 32MB. — Gaurav Pratim Talukdar, Apr 01 '19 at 13:53
