Engineer Timo Birnschein is looking to protect his investment in a high-end NVIDIA RTX 5000-series graphics card by monitoring its 12VHPWR input for signs that it is overheating.
“Instead of whining about the power of my GPU and NVIDIA’s design issues,” Birnschein explains, “I decided [I’d] rather implement a thermal watchdog on the connectors. It’s a very quick one: [an] Arduino with a few 4.7k resistors and 100k thermistors. On the Windows side, a Python script running with administrator rights checks the temperature of the three thermocouples and logs them. Once at least one passes 100 degrees Celsius, the PC automatically shuts down, regardless of data loss. Way better than destroying $2000 in hardware or more.”
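The host-side logic Birnschein describes can be sketched in a few lines of Python. This is a minimal illustration, not his script: the comma-separated line format, the function names, the baud rate, and the use of the third-party pyserial package are all assumptions, while the 100°C trip point comes from his description.

```python
import subprocess

TRIP_CELSIUS = 100.0  # trip point quoted by Birnschein


def parse_temperatures(line):
    """Parse one serial line such as '45.0,50.2,48.1' (assumed format).

    Returns a list of floats, or an empty list if any field is malformed.
    """
    temps = []
    for field in line.strip().split(","):
        field = field.strip()
        if not field:
            continue
        try:
            temps.append(float(field))
        except ValueError:
            return []  # discard corrupted lines rather than guess
    return temps


def should_shut_down(temps, limit=TRIP_CELSIUS):
    """True once at least one sensor has passed the limit."""
    return any(t > limit for t in temps)


def watch(port, baud=115200):
    """Hypothetical watchdog loop: read lines and power off on a trip."""
    import serial  # pyserial, a third-party package (assumed dependency)

    with serial.Serial(port, baud, timeout=2) as link:
        while True:
            line = link.readline().decode("ascii", errors="replace")
            if should_shut_down(parse_temperatures(line)):
                # Windows: forced, immediate shutdown; needs suitable rights.
                subprocess.run(["shutdown", "/s", "/f", "/t", "0"], check=False)
                return
```

Calling `watch("COM3")`, with the port name adjusted to match the board, would start the loop. Keeping the parsing and the trip decision in separate functions means both can be checked without any hardware attached.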
Concerned about reports of melting graphics cards, Timo Birnschein is protecting his investment in an NVIDIA RTX 5090 with a thermal watchdog. (📷: Timo Birnschein)
The problem Birnschein is attempting to solve: the extremely high power draw of NVIDIA’s top-end graphics cards, with the flagship GeForce RTX 5090 specified at a “thermal design profile” of 575W. That figure is just for the GPU, not counting the rest of the system into which it’s installed. These kinds of power envelopes, which come with the recommendation to have a system power supply of at least 950W, are far more than the PCI Express slot can handle, and far more than standard PCIe power cables can carry. NVIDIA’s solution: 12VHPWR, a high-amperage 12V power connector on the edge of the card.
Unfortunately, top-end GPUs from both the new GeForce RTX 5000-series and the older 4000-series have been known to scorch, melt, and even flame out. Analysis has suggested the cause is a lack of balancing across the power input’s multiple pins. If the cable isn’t quite clicked home, or if some of the pins are less than perfect in either surface finish or oxidation, the card will draw more current through the better pins, resulting in overheating to the point of melting and scorching.
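The imbalance can be illustrated with a simple parallel-resistance model. The sketch below assumes the connector’s six 12V pins behave as parallel resistive paths and that the full 575W flows through the connector. The resistance values are invented purely for illustration and are not measurements.

```python
def pin_currents(total_amps, path_ohms):
    """Split a total current across parallel paths.

    Each path carries a share proportional to its conductance (1/R),
    so lower-resistance pins carry more of the load.
    """
    conductances = [1.0 / r for r in path_ohms]
    total_conductance = sum(conductances)
    return [total_amps * g / total_conductance for g in conductances]


TOTAL_AMPS = 575.0 / 12.0  # about 47.9 A if all power used the connector

# Hypothetical resistances in ohms (illustrative values only).
balanced = pin_currents(TOTAL_AMPS, [0.010] * 6)
degraded = pin_currents(TOTAL_AMPS, [0.010, 0.010, 0.050, 0.050, 0.050, 0.050])

for label, amps in (("balanced", balanced), ("degraded", degraded)):
    print(label, [round(a, 1) for a in amps])
```

With these illustrative values, the two good pins in the degraded case each carry roughly twice the current they would in the balanced case. Since resistive heating scales with the square of the current, that is the kind of local hot spot the watchdog is meant to catch.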
The problem is so pronounced on the company’s latest generation that third-party manufacturers have stepped in with cooled cable connectors, adding either heatsinks or active fans to try to tame the heat. Birnschein’s approach is more reactive than proactive: monitoring the connector’s temperature and pulling the plug if things start to get toasty.
The temperature sensors feed into an Arduino Nano-compatible development board, which transfers the data over serial to the host PC. (📷: Timo Birnschein)
“The project is ongoing,” Birnschein says, pointing to some bugs which still need to be ironed out, including stalls in the Python script that graphs the temperatures. “I used an Arduino Nano [compatible development board]. A0-A7 [are] read by the firmware. A 4.7k resistor goes from each pin to +5V. The thermistor is connected directly to the pin and GND as a voltage divider. That’s it. I use the Steinert [Steinhart-Hart] algorithm to calculate the temperature and dump it onto the serial port. Currently, only A0 to A3 are read and communicated.”
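The conversion from an analog reading to a temperature can be sketched as follows. Birnschein’s firmware runs on the Arduino and is not reproduced here. This Python version uses the simplified Beta form of the Steinhart-Hart equation, and the Beta value of 3950 and the 10-bit ADC range are assumptions. Only the 4.7k pull-up and the 100k thermistor come from his description.

```python
import math

SERIES_OHMS = 4700.0       # pull-up from the pin to +5V (from the article)
NOMINAL_OHMS = 100_000.0   # thermistor resistance at 25 degrees C
NOMINAL_KELVIN = 298.15    # 25 degrees C in kelvin
BETA = 3950.0              # assumed Beta coefficient; check the datasheet
ADC_MAX = 1023             # assumed 10-bit ADC full scale


def adc_to_resistance(adc):
    """Thermistor resistance from a raw ADC count.

    The thermistor sits between the pin and GND, so
    adc / ADC_MAX = Rt / (Rt + SERIES_OHMS).
    """
    if not 0 < adc < ADC_MAX:
        raise ValueError("reading at a rail: sensor open or shorted")
    return SERIES_OHMS * adc / (ADC_MAX - adc)


def resistance_to_celsius(ohms):
    """Beta-form Steinhart-Hart: 1/T = 1/T0 + ln(R/R0) / B."""
    inv_kelvin = 1.0 / NOMINAL_KELVIN + math.log(ohms / NOMINAL_OHMS) / BETA
    return 1.0 / inv_kelvin - 273.15


def adc_to_celsius(adc):
    """Convenience wrapper: raw ADC count straight to degrees Celsius."""
    return resistance_to_celsius(adc_to_resistance(adc))
```

Under these assumptions, a 100k thermistor falls to a few kilohms near 100°C, which is the same order of magnitude as the 4.7k pull-up. That puts the divider close to mid-scale, where the ADC resolves temperature changes best, around the shutdown threshold.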
Interested parties can follow the project’s progress on Birnschein’s Hackaday.io page.