Maintenance Log...

This is a log of the various issues I've encountered and the maintenance I've undertaken on my computers. Click here for the 2021-2 year's log

[Feb-Mar '21] It was my intention to keep 'crunching' throughout March until I climbed into the top 100 in the UK standings, but as the weather warmed and the extra heat from my computers was no longer needed, saving electricity became the greater priority. I climbed to 106th but have now shut down most of my computers. I expect to resume participating properly in November.

[13-20 Jan '21] Since I replaced the PSU in iccleBeast last month (although I don't think the PSU is the cause), the computer has been scrapping the work units processed by the GPU. This has happened under both Windows 10 and Ubuntu. At first I thought it was an Ubuntu issue, and then, when testing under Windows 10, a driver issue. I've now swapped the graphics card with the one in 2600K, although that doesn't appear to have moved the problem, or resolved it. I have now moved the graphics card to the second PCI-Express slot. I did the same thing for P5NE, which was having issues POSTing, until I realised that the last time I used it it had two graphics cards installed, and that there is a jumper card that needed reversing to switch the motherboard to single graphics card mode.
Now that the problematic graphics card is in 2600K I can monitor it more closely. I have used MSI Afterburner to lower the core and memory clock frequencies and the maximum power %, and have manually increased the fan speed. By doing this and keeping the temperature no higher than 60°C I have seemingly improved stability, and Einstein@Home is no longer scrapping work units partway through. It seems apparent that the graphics card is faulty, but the Einstein@Home issue is the only symptom of this; stress tests run without fault. The graphics card is still under warranty but I don't know how to proceed with a return without a clear demonstration of its fault. If I can get my three computers with graphics cards running well I might just get into the Top 10 (average credit) in the UK, and potentially into the Top 100 (total credit) by the end of March when I switch everything back off again. It's hard work participating for only half the year!

[27 Dec '20] A routine check of G43T today revealed that it had frozen some 5 hours earlier. I thought a simple press of the reset button would get it back up and running, but no: Ubuntu isn't loading. I chose a recovery option and that just left me with a blue desktop with missing icons. A reinstall is required...

[20 Dec '20] I did another check of iccleBeast and again found it was frozen and hadn't reported to the servers for a few days. As on 9th December I tried restarting it, but this time to no avail; Ubuntu would get stuck loading. I suspected the hard drive was faulty so I attempted to boot into Windows, but the result was the same: it too froze. I then suspected the PSU, since it's an 'unbranded' one of some age (probably 10 years old). Not having a spare, I took the virtually brand new Corsair 430W one from BIOSTAR and tried it in iccleBeast, and it started up into Windows 10.
I took a closer look at the suspect PSU and indeed a couple of the capacitors were showing signs of failing.
I have put the suspect PSU in BIOSTAR since that is only using a CPU to run Einstein@Home, whereas iccleBeast has a graphics card to power. While it might seem on paper that the Corsair 430W PSU would provide less power than the 600W one it replaces, the key is in the details. The labels reveal that the 12V rails on each are rated at around 30A (where the amps are needed), whereas the cheap 600W unit makes up its wattage by providing more amps on the 5V rail (where they aren't needed). Other power supplies might have two 12V rails with the power divided between them, which can make supplying enough current to one power-hungry graphics card problematic.

I used HD Tune to inspect the condition of the hard drive containing Ubuntu from iccleBeast and it reported a "Reallocated Sector Count".
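As a rough sketch of the 12V comparison in this entry: power on a rail is simply volts times amps. The 30A figure is the approximate label value mentioned above; the function and variable names are only for illustration, and real PSU labels also carry combined-rail limits that this ignores.

```python
# Rough 12V-rail comparison for the two PSUs in this log entry.
# Assumption: both labels show roughly 30A on the 12V rail (approximate).

def rail_watts(volts: float, amps: float) -> float:
    """Power available on one rail: P = V * I."""
    return volts * amps


def share_of_rating(rail_w: float, rated_w: float) -> float:
    """Fraction of the headline wattage that the given rail accounts for."""
    return rail_w / rated_w


TWELVE_V_AMPS = 30.0  # approximate figure read off the labels
psus = {"Corsair 430W": 430.0, "unbranded 600W": 600.0}

twelve_v = rail_watts(12.0, TWELVE_V_AMPS)
for name, rated in psus.items():
    share = share_of_rating(twelve_v, rated)
    print(f"{name}: {twelve_v:.0f} W on 12V, {share:.0%} of its headline rating")
```

Both units come out at about 360 W on the 12V rail, so for a machine whose load is mostly the CPU and graphics card the smaller PSU gives up little or nothing.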
I'm now running a full scan on it to determine whether there are any damaged sectors. For now I'll keep running iccleBeast on Windows 10 (which is on 1904 and needs updating, but at least it appears stable).

[09 Dec '20] I set about getting my second Phenom machine (Phenom2) up and running. It already had BOINC installed on Windows XP with 2GB of RAM, but seeing these two things, particularly the latter, as a limitation for Einstein@Home, I decided to install Ubuntu. I wanted to add a partition to the existing hard drive to make room for Ubuntu but I failed in this regard:
A quick check via the Einstein@Home website showed that iccleBeast had not reported to the servers for a few days. I found it frozen, so I restarted it. BOINC then failed to recognise the graphics card/"coprocessor", so, assuming the freeze had corrupted the OpenCL installation from last time, I repeated that procedure, restarted, and all was well (perhaps only a restart was needed).

[08 Dec '20] I added BIOSTAR to the fleet. It's running Windows 7 and crunching on its AMD A4-5300 APU; installation went smoothly. I next set up G43T on Ubuntu, crunching on its quad-core processor.

[05 Dec '20] I went to use Core2Quad and found it frozen with graphical glitches on the screen. This doesn't bode well and I suspect the graphics card is faulty. I switched it off and back on and it worked OK while I used it for a couple of hours; now I will leave it running BOINC again to see how it gets on. Fortunately my graphics cards are under warranty.
[04 Dec '20] I have successfully got iccleBeast and Core2Quad (both running Ubuntu) to run Einstein@Home on their graphics cards; they needed OpenCL to be manually installed.

[30 Nov '20B] I've just fired up iccleBeast and gone through a similar process to the previous progress update, hoping that system, which is in some ways similar, will run Einstein@Home on the GPU.

[30 Nov '20A] Core2Quad still refuses to download work units for the graphics card. I've updated the Nvidia drivers in Ubuntu from 390 to 455 and checked various settings, but still no joy.

With WO running stable on Windows 7 (it completed one round of work units), I have today fired up Core2Quad to see what happens...

[26 Nov '20] I went to do a routine check of WO this morning but found the computer to be... well... sleeping. The screen wouldn't come on and the CPU was cold. It wouldn't come back to life with the press of a key. I switched it off and back on and booted into Windows 7 (instead of Ubuntu) to do a hard drive scan with HD Tune. This came back clear, although there are signs that it is an ageing drive (a power-on time of close to 4 years!). The Einstein server indicated that this computer hadn't made a call for a couple of days. The last time I checked, it was 25% through a couple of work units that would each take around 30 hours. I'm now going to run BOINC in Windows 7 to see if it remains stable.

[24 Nov '20] I powered up WO for the first time since March (after installing some RAM). The CMOS data needed resetting (it probably needs a new CMOS battery). I'd found 2GB and hoped that would do, but Ubuntu got only so far with loading and left me with a purple screen, so I switched it off and tried again. This then seemed to leave a corrupt file system (it's using an old IDE hard drive so I will need to test that at some point). I ran 'fsck /dev/sda5' (Linux's equivalent of Windows' chkdsk, which I'd never manually run before) as the error message implied was required, said 'yes' to everything, and then all was well.
[November 2020] Since starting crunching again I have discovered that the CPU fan in 2600K doesn't spin without first being given a gentle nudge. This caused the heatsink to become scalding hot (literally, I burnt the back of my fingers on it!), and with the warm air staying within the case the whole system became quite toasty while my room didn't benefit from the heat generated.

I also discovered that 2600K wasn't downloading work for the GPU. It turned out I had settings in place giving priority to SETI@home, so once I changed Project Preferences at the Einstein website all was well. (Network settings in BOINC were also limiting initial downloads; I derestricted these temporarily.)

Phenom had stopped switching on; the only issue I found was that one of the 12cm case fans had seized. Now that it has been replaced, all seems well.