Channel: Raspberry Pi Forums

HATs and other add-ons • Re: NVMe overheating issues with GeekWorm dual NVMe hat

I haven't measured the total power drawn by the Pi myself. I only checked some forums where people measured the Pi's power consumption at maximum load. Nothing is connected to the Pi except the HAT and the two SSDs.
I have an active cooler, but the HAT sits below the Pi, so the SSDs don't get good cooling. That's why I will add some passive heatsinks, although that is still not ideal because they will also sit below the board.

Despite this, with nvme_core.default_ps_max_latency_us=3601 (which should keep the drives out of the lowest power state, see [2]), the devices stay at around 50 degrees when idle and 55 degrees under some load.
After copying 100 GB from one SSD to the other, I reached 85 degrees, even though I had set the maximum temperature for host controlled thermal management to 71. Then this error appeared: one of the drives could not be reset and was switched off [1].
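
For reference, the host controlled thermal management (HCTM) feature mentioned above takes two thresholds in kelvins, packed into a single 32-bit value. The sketch below shows that packing as the NVMe specification describes it (feature 0x10, TMT1 in the upper 16 bits, TMT2 in the lower 16 bits). Treating 71 as TMT2 and using 65 for TMT1 are assumptions made only for illustration, since the post does not say which threshold was set or how. The function name is hypothetical.

```python
def hctm_dword11(tmt1_celsius: int, tmt2_celsius: int) -> int:
    """Pack two HCTM thresholds into the 32-bit feature value.

    ASSUMPTION: layout per the NVMe spec for feature 0x10:
    bits 31:16 = TMT1 (light throttling), bits 15:0 = TMT2 (heavy
    throttling), both in kelvins, with TMT1 below TMT2.
    """
    tmt1_k = tmt1_celsius + 273  # integer kelvin conversion
    tmt2_k = tmt2_celsius + 273
    if not (0 < tmt1_k < tmt2_k <= 0xFFFF):
        raise ValueError("expected 0 < TMT1 < TMT2 and both to fit in 16 bits")
    return (tmt1_k << 16) | tmt2_k


# Hypothetical thresholds: 65 C for TMT1, 71 C for TMT2.
value = hctm_dword11(65, 71)
print(hex(value))  # 0x1520158
```

If nvme-cli is installed, a value like this could be passed along the lines of `sudo nvme set-feature /dev/nvme0 -f 0x10 -v 0x01520158`. Whether the drive accepts it depends on the minimum and maximum thresholds the drive itself reports.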


I will keep experimenting with these settings until the coolers arrive.

[1]
[ 420.936029] nvme nvme0: controller is down; will reset: CSTS=0xffffffff, PCI_STATUS=0x11
[ 420.936037] nvme nvme0: Does your device have a faulty power saving mode enabled?
[ 420.936040] nvme nvme0: Try "nvme_core.default_ps_max_latency_us=0 pcie_aspm=off pcie_port_pm=off" and report a bug
[ 421.028039] nvme 0000:03:00.0: enabling device (0000 -> 0002)
[ 421.028064] nvme nvme0: Disabling device after reset failure: -19
[ 421.056077] Buffer I/O error on dev nvme0n1, logical block 61047336, lost async page write


[2]
pi@pi:~ $ sudo smartctl -c /dev/nvme1
smartctl 7.3 2022-02-28 r5338 [aarch64-linux-6.6.58-v8-16k+] (local build)
Copyright (C) 2002-22, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Firmware Updates (0x16): 3 Slots, no Reset required
Optional Admin Commands (0x0017): Security Format Frmw_DL Self_Test
Optional NVM Commands (0x00df): Comp Wr_Unc DS_Mngmt Wr_Zero Sav/Sel_Feat Timestmp Verify
Log Page Attributes (0x2f): S/H_per_NS Cmd_Eff_Lg Ext_Get_Lg Telmtry_Lg *Other*
Maximum Data Transfer Size: 128 Pages
Warning Comp. Temp. Threshold: 85 Celsius
Critical Comp. Temp. Threshold: 85 Celsius

Supported Power States
St Op     Max   Active     Idle   RL RT WL WT  Ent_Lat  Ex_Lat
 0 +     7.47W       -        -    0  0  0  0        0       0
 1 +     7.47W       -        -    1  1  1  1      500     500
 2 +     7.47W       -        -    2  2  2  2     1100    3600
 3 -   0.0800W       -        -    3  3  3  3     3700    2400
 4 -   0.0070W       -        -    4  4  4  4     3700   45000

Supported LBA Sizes (NSID 0x1)
Id Fmt  Data  Metadt  Rel_Perf
 0 +     512       0         0
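
As a rough check of what the 3601 setting does with the power-state table in [2], the sketch below lists which idle (non-operational) states stay eligible for a given limit. It assumes the kernel compares each state's exit latency (Ex_Lat) against nvme_core.default_ps_max_latency_us. That is a simplified reading of the driver logic and may not match every kernel version. The class and function names are made up for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PowerState:
    index: int
    operational: bool  # "+" in the Op column of the smartctl table
    entry_lat_us: int  # Ent_Lat
    exit_lat_us: int   # Ex_Lat


# Values copied from the smartctl output in [2].
STATES = [
    PowerState(0, True, 0, 0),
    PowerState(1, True, 500, 500),
    PowerState(2, True, 1100, 3600),
    PowerState(3, False, 3700, 2400),
    PowerState(4, False, 3700, 45000),
]


def idle_states_allowed(states, max_latency_us):
    """Return indices of non-operational states still eligible.

    ASSUMPTION: a state is eligible when its exit latency does not
    exceed max_latency_us; a limit of 0 leaves none eligible.
    """
    return [
        s.index
        for s in states
        if not s.operational and s.exit_lat_us <= max_latency_us
    ]


print(idle_states_allowed(STATES, 3601))    # [3]
print(idle_states_allowed(STATES, 100000))  # [3, 4]
print(idle_states_allowed(STATES, 0))       # []
```

Under that assumption, 3601 rules out only state 4 (45000 us exit latency), while state 3 remains available. The 0 value suggested in the kernel message in [1] would leave no idle state eligible.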

Statistics: Posted by jshan — Sat Nov 02, 2024 7:35 am


