Server Upgrade

Ever since I first had the idea of setting up my own reliable server, I have been thinking “I need ECC”…

So, over a period of some 6 months, I finally bit the bullet (not to mention the credit card) and purchased an ASRock C236WSI, an Intel Xeon E3-1225v6 and 2x Kingston ValueRAM ECC DDR4 2400MHz 16GB UDIMMs… culminating in final receipt of all the items in July 2017 (I purchased the two sticks of RAM within a day of each other, but Amazon shipped the wrong item for the second stick… twice).

So yes, this post is long overdue… It also basically explains the long gap between the last post in mid-2017 and now: I have been fighting with this new server hardware, and the original draft was first drawn up in September 2017 (refer to the date in the URL/permalink).

Swapping out the existing Gigabyte H87N-WiFi v1.0 board and Intel Core i3-4340 proved not to be too much of a hassle, and ECC appeared to be working:

There is the dubious statistic of “128 bits” as the “Total Width” though… Attempts to use edac-utils appear to fail, as my Ubuntu LTS version (16.04.3 at time of writing) does not appear to have the required Kaby Lake support (as patched here or here), unlike its predecessor Skylake, which has skx_edac.ko.
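To make the “Total Width” oddity easier to eyeball, here is a rough sketch that pulls the ECC-relevant fields out of the text printed by dmidecode --type memory. The helper name ecc_summary is my own invention for illustration, and it only assumes the usual “Error Correction Type”, “Total Width” and “Data Width” fields. For an ECC DIMM I would normally expect the total width to be 8 bits wider than the data width (e.g. 72 vs 64), so a much larger gap is worth a second look.

```shell
#!/bin/sh
# Sketch only: ecc_summary is a hypothetical helper, not part of dmidecode.
# It reads `dmidecode --type memory` text on stdin and prints the fields
# that hint at whether ECC is in play.
ecc_summary() {
    awk -F': *' '
        # Reported by the Physical Memory Array section.
        /Error Correction Type:/ { print "correction=" $2 }
        # Each Memory Device lists Total Width first, then Data Width.
        /^[[:space:]]*Total Width:/ { total = $2 + 0 }
        /^[[:space:]]*Data Width:/ {
            data = $2 + 0
            # Bits beyond the data width are the extra check bits.
            print "total=" total " data=" data " check=" (total - data)
        }
    '
}

# Typical use (dmidecode needs root):
#   sudo dmidecode --type memory | ecc_summary
```

This only reflects what the firmware tables claim; it does not prove that errors are actually being detected or corrected, which is what a working EDAC driver would be for.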

Trouble In Paradise

However, several weeks went by and I noticed that my server kept rebooting (after gracefully shutting down – I happened to be logged in once on X on session 0 and the pop-up appeared asking me if I wanted to “shutdown or reboot”, which then quickly flashed off as the server began a reboot).

Checking for MCEs (Machine Check Exceptions) et al. showed no issues; checks with the UPS for power events also turned up empty…

I eventually checked my Intel microcode and noted that it was from 2015! So I rolled up my sleeves and upgraded it:

apt-get update
apt-get install intel-microcode
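To confirm which microcode revision is actually loaded before and after the upgrade, something along these lines should do. It is a small sketch: microcode_rev is a made-up helper name, and it assumes an x86 Linux box where /proc/cpuinfo carries a “microcode” line per core. As far as I understand, the new microcode is only applied at the next boot, so compare the value after a reboot.

```shell
#!/bin/sh
# Sketch only: microcode_rev is a hypothetical helper name.
# Prints the microcode revision of the first CPU listed in a
# cpuinfo-style file (defaults to /proc/cpuinfo).
microcode_rev() {
    awk -F': *' '/^microcode/ { print $2; exit }' "${1:-/proc/cpuinfo}"
}

# Typical use:
#   microcode_rev                  # revision currently loaded
#   dmesg | grep -i microcode      # kernel messages about microcode loading
```

If the revision printed is unchanged after the reboot, the package probably did not contain a newer revision for this CPU.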

Fingers crossed, currently monitoring it…

2017/10/05 Update: Nope, server is still rebooting, and sometimes the server fails to boot… I took everything apart and ran the CPU, M/B and RAM on their own, and I think it is the RAM… For some odd reason, what used to work no longer does… The only way I can get the server to boot most times with both Kingston RAM sticks is by using a lower clock speed (i.e. 2133MHz instead of the supposedly rated 2400MHz)… Funnily enough, the two sticks of RAM use chips from different manufacturers (one is by SK Hynix, the other by Micron)…

2017/11/17 Update: In desperation, I bought another 2 sticks of V-Color 16GB Unregistered ECC 2400MHz RAM and it works beautifully. I currently do not have another M/B + CPU to re-test the Kingston RAM and file for an RMA…

2017/12/25 Update: ASRock support in Taiwan is working on Christmas Day… In reaching out to them, they actually got themselves some of the same Kingston RAM and tested it successfully… So it is just my RAM…

2018/01/05 Update: Nope, despite the new RAM, server is still rebooting, sometimes not even with a graceful shutdown… And it still fails to boot sometimes. The saga continues…
