Recover.IT – HP EX490 MediaSmart Server – Part 2

This is part 2 of the recovery of the HP EX490 MediaSmart Server – which is a Windows Home Server machine. The second drive on this server was seen to be offline, so I had shut down this server to investigate the problem.

Last Saturday, I ran some tests on the drive as previously mentioned. On Wednesday night, I decided to copy the disk to a new 3TB disk that I had just bought – a Toshiba 3TB drive with 3 years warranty for $127 each from a local computer shop. I thought that this was a good price.

Anyway, as you might have guessed – I connected this disk on a Linux machine. The Linux in this case was Ubuntu. I used the dd command (that I have previously mentioned) to copy raw data from the disk directly to the new disk.

dd if=/dev/sdb of=/dev/sdc conv=noerror,sync 2>&1 | tee -a ./logfile.txt

Now, of course – I did a smartctl -a /dev/sdb first to check the source disk, and then another one – smartctl -a /dev/sdc to confirm the destination disk. The source disk is a Seagate – correct, and the destination disk is a Toshiba – also correct, so I was good to go. It is good to check. Don’t assume that because the Seagate is connected to SATA0 and the Toshiba is connected to SATA1, the disk designations will be in the right order.

Ok, so on Wednesday, the copy was started. When I went back to the machine some time later to check on its progress, I saw these errors on the display.

dd: error reading ‘/dev/sdb’: Input/output error
6364552+0 records in
6364552+0 records out
3258650624 bytes (3.3 GB) copied, 346.982 s, 9.4 MB/s
dd: error reading ‘/dev/sdb’: Input/output error
6364552+1 records in
6364553+0 records out

6,364,552 sectors were read and copied before an error occurred. The noerror parameter means that dd will carry on after a read error, and sync means that the unreadable block is padded with zeros on the destination, so the data that follows stays at the correct offset. I stopped the copy at that time, since it is not a good idea to keep trying to read bad sectors in case the drive decides to quit permanently.
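The byte total in dd's summary follows directly from the record count. With no bs= given, dd works in 512-byte blocks, so each record here is one sector. A quick sketch of that arithmetic (the function name is only for illustration):

```shell
# Bytes copied = complete records x block size.
# dd uses 512-byte blocks when bs= is not given.
bytes_copied() {
    echo $(( $1 * $2 ))
}

bytes_copied 6364552 512   # prints 3258650624, the total dd reported
```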

Then last night, I decided to copy from a point after this sector. This time I used this command line and let it run overnight after it seemed to start without throwing up any errors.

dd if=/dev/sdb of=/dev/sdc conv=noerror,sync bs=1M skip=4000 seek=4000 2>&1 | tee -a ./logfile.txt
1903728+1 records in
1903729+0 records out
1996204539904 bytes (2.0 TB) copied, 24148.3 s, 82.7 MB/s

For that command, I set a block size (bs) of 1MiB, then used the skip and seek parameters to begin at a point 4000MiB (about 4.2GB) into the drive, on both the source and the destination. I checked this morning when I woke up, and found that it had completed successfully. The time taken for the copy works out to about 6.7 hours.
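To see where that starting point sits relative to the failed read, the offset can be converted into bytes and 512-byte sectors. This is only a sketch of the arithmetic, assuming 512-byte sectors; the helper names are made up for the example.

```shell
# Convert a dd offset given in 1 MiB blocks into bytes and 512-byte sectors.
mib_to_bytes()   { echo $(( $1 * 1024 * 1024 )); }
mib_to_sectors() { echo $(( $1 * 1024 * 1024 / 512 )); }

mib_to_bytes 4000     # 4194304000 bytes, roughly 4.2 GB
mib_to_sectors 4000   # sector 8192000

# Elapsed time from dd's summary, converted from seconds to hours
awk 'BEGIN { printf "%.1f\n", 24148.3 / 3600 }'   # 6.7
```

So the overnight copy began at sector 8,192,000, comfortably past the point where the first copy failed after 6,364,552 records.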

This evening, I also bought a Toshiba 2TB disk drive on my way home, which I will talk about later on. Ok, so I had copied about 3.3GB on Wednesday before it hit the bad sectors. Last night, I started the copy at 4GB or thereabouts onwards and it copied to the end. That left a gap in between, so I ran a few more copy commands. I won’t bore you with all of the details, but the result was to copy the remaining good sectors, using the count parameter to specify how many blocks to copy.

Eventually, I had copied every sector that could be copied. It turns out that sectors 6,364,553 to 6,364,568, 16 of them, could not be read, which is not too bad. I also copied a couple of blocks before and after the bad sectors and had a look at the data. It seems to be file information, most likely part of the Master File Table, which means that a few files are potentially lost.
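The exact follow-up commands are not listed in this post. As a rough sketch, assuming 512-byte sectors and the sector numbers quoted above, a helper like the hypothetical gap_copy_cmd below would build a dd command for the readable stretch between the last bad sector and the 4000MiB mark (sector 8,192,000) where the overnight copy began. It only prints the command rather than running it, so it can be checked before anything touches a disk.

```shell
# Print (not run) a dd command that copies sectors [first, end) using the
# same offset on source and destination. Device names follow the post;
# gap_copy_cmd is an illustrative helper, not the command actually used.
gap_copy_cmd() {
    first=$1
    end=$2
    count=$(( end - first ))
    echo "dd if=/dev/sdb of=/dev/sdc bs=512 skip=$first seek=$first count=$count conv=noerror,sync"
}

# First sector after the bad range, up to the start of the overnight copy
gap_copy_cmd 6364569 8192000
```

Using bs=512 keeps skip, seek and count in sector units, which makes it easier to line them up with the sector numbers in the error messages.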

Ok, this is where my new 2TB drive comes in. I put the faulty Seagate drive back into the EX490, and then added the new Toshiba drive into the top-most bay. After powering up the MediaSmart Server, and waiting – I was eventually shown two solid green lights – which means that the Seagate drive is now online together with the main WD drive, and one blinking green light which was the new Toshiba drive. I logged onto my Windows Home Server console and went into Server Storage and proceeded to add the new drive.

Screenshot 2016-08-05 19.45.01

The idea is to add the new Toshiba drive, so that WHS knows that it is available for storage, and then tell WHS that I want to remove the Seagate drive.

Screenshot 2016-08-05 19.45.51

You might ask, why am I doing this? The drive has bad sectors – it isn’t a good idea to keep using it. Also WHS allows me to remove this disk – by moving and redistributing the files on the disk to other available disks, like the new one that I just added.

Screenshot 2016-08-05 19.46.11

Great, it says that I have sufficient storage space to have this drive removed.

Screenshot 2016-08-05 21.42.40

Ok, I am not actually going to sit here and wait for it, but eventually it will (hopefully) tell me that the drive is ready to be removed. Depending on how full the disk drive was, it can definitely take many hours. Windows Home Server is actually really good, because most storage systems don’t allow you to remove disk drives once they have been used for storing data.

What about the 3TB drive, you are thinking? That is for insurance – in case the disk stops working during the removal, then I have a copy of it that I can use to copy files from. If this removal works successfully, then my 3TB drive can be retasked. By the way, Windows Home Server cannot use disk drives larger than 2TB without major surgery. The reason for this is that WHS uses partitioning based on the Master Boot Record. In order to use drives larger than 2TB, it is necessary to use GPT partitioning – but that is another story.
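The 2TB ceiling comes from the Master Boot Record storing sector addresses and counts in 32-bit fields. Assuming the usual 512-byte logical sectors, the arithmetic looks like this:

```shell
# Largest size an MBR partition table can describe:
# 2^32 addressable sectors x 512 bytes per sector.
max_sectors=4294967296    # 2^32
sector_size=512           # assumed logical sector size
mbr_limit=$(( max_sectors * sector_size ))
echo "$mbr_limit"         # 2199023255552 bytes, i.e. 2 TiB (about 2.2 TB)
```

A 3TB drive is past that limit, which is why it needs GPT to be used in full.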

What about the 16 bad sectors on this Seagate drive? Once I take it out, I plan to do a factory erase on the Seagate drive. This should rewrite every sector on the disk, including the bad ones, which gives the drive the chance to remap any sector that it cannot write reliably, and I should end up with a disk drive without bad sectors. I can then use it either for temporary storage of non-critical data or run lots of diagnostics on it to see if it is continuing to fail. If it holds up to the diagnostics, maybe it gets a second chance at life.

In the meantime, I am off to bed!

Recover.IT – HP EX490 MediaSmart Server

Yesterday, I noticed that the Windows Home Server icon in my taskbar was red. I opened it up and saw some file conflicts, which was strange. I could access the files on the server, so what was going on? Then the penny dropped: it said that a disk drive was missing. I went out to the computer area and could see that only one disk was lit up. The second one was not lit, meaning that it was offline. I went back to the console and shut down the server, which it eventually did, albeit slowly, because the console had stopped responding for a long time before I could hit the Shutdown button.

DSC_0098

Some of you may have heard about Windows Home Server, many probably haven’t. WHS was a great product for its time – a semi-redundant network storage device that could be packaged like a NAS. I bought this HP EX490 MediaSmart Server back when it was available in 2009. That is the box on the right in the photo above. Ok, it is a little dusty, even though it sits on a shelf 2m above the floor. It came with a single Seagate 1TB disk drive, and over the next few years went to 4x1TB drives, then eventually to 2x2TB drives. The files can be stored in folders that are shared out, and each folder/share can be configured to be redundant or not.

Ok – back to the problem at hand. One of the two drives, the Seagate 2TB, had apparently stopped working. After the server had shut down, I pulled out the second drive and connected it to my test/recovery machine. This second drive was able to spin up, and I ran a few commands on it to determine what the issue with the drive was, and then shut it down. I didn’t want to keep the drive running until I had a way to copy its contents, having temporarily run out of disk storage space recently.

One of the commands that I run is “smartctl -a /dev/sdb”, which on Linux will display the SMART data from the disk drive that is connected as /dev/sdb. The interesting things I am looking for are the Reallocated Sector Count and whether any of the SMART attributes show that the drive has failed. None of them did, and the Reallocated Sector Count was 14760, which is a little high – but this can be normal for the drive. The Power On Hours was 34,235, which equates to nearly 4 years of running, while the drive itself is 5 years old. That gap would make sense if I hadn’t put the drive to use straight away.
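The hours-to-years conversion is simple enough to check on the command line. A minimal sketch, assuming a 365-day year and using a helper name invented for this example:

```shell
# Convert SMART Power On Hours into years of continuous running.
hours_to_years() {
    awk -v h="$1" 'BEGIN { printf "%.1f\n", h / (24 * 365) }'
}

hours_to_years 34235   # 3.9
```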

Of course, there were other values to be considered. Attribute 187 – Reported Uncorrectable was 0, 188 Command Timeout was 1, 197 Current Pending Sector Count was 216 and 198 Offline Uncorrectable Sector Count was also 216. Now, these last two are concerning. Generally, a non-zero number on these can indicate that the drive is having issues, and we should plan to replace it.
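One way to pull out just those last two attributes is to filter the attribute table with awk. This is a sketch that assumes the usual smartctl -A table layout, where the attribute ID is the first column and the raw value is the tenth; pending_sectors is a name made up for this example.

```shell
# Read `smartctl -A` output on stdin and report attributes 197 and 198
# when their raw value (assumed to be column 10) is non-zero.
pending_sectors() {
    awk '($1 == 197 || $1 == 198) && $10 + 0 > 0 { print $1, $2, $10 }'
}

# Usage on a live system (needs root):
#   smartctl -A /dev/sdb | pending_sectors
```

If nothing is printed, both counts are zero. Anything printed is a reason to start planning a replacement.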

Smartctl also reports SMART errors that the drive has recorded. The main one occurred at 34,227 hours, about 8 hours before I noticed the problem and shut it down. This was error 8170 – WP at LBA = 0x00611d8f = 6364559 – which probably means that it couldn’t access this particular sector, and that is a concern. What I need to do now is to get a spare disk of at least 2TB and make a disk to disk copy, in order to ensure that my data is copied. I have a few 3TB disks lying around, so maybe I can free one up for a little while. I think I will do that during the week.
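The hexadecimal LBA in that error can be converted by hand to confirm the decimal value and to see how far into the disk the sector sits. A small sketch, assuming 512-byte sectors:

```shell
# Convert the LBA from the SMART error log to decimal, then to a byte
# offset (assuming 512-byte sectors).
lba=$(( 0x00611d8f ))
echo "$lba"               # 6364559
echo $(( lba * 512 ))     # 3258654208
```

That places the problem sector roughly 3.26GB from the start of the drive.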

Remember that I mentioned that we can specify some folders or shares to be redundant, meaning that the contents of those folders have copies that reside on the other disk? Well, not all folders were marked to be redundant, so any of those folders that reside on this particular disk might well be inaccessible. Fortunately, Windows Home Server creates an NTFS file system on each drive, so these drives can be connected to any Windows machine and be accessible, unlike some RAID levels where the data is striped across the disks.

The other thing I want to think about is what I would replace this WHS with. I currently run a virtual FreeNAS on an ESXi server, but I was thinking about building a new standalone network storage appliance. FreeNAS is great if we can get the right hardware, such as ECC memory and a CPU and motherboard that support it, and run ZFS. But then I was reading about issues with ZFS, which caused me to look at what other people are using.

I could stay with Linux and run something like MergerFS and SnapRAID, or I could go the Windows way with Storage Spaces, which is looking very tempting, except I don’t have a spare Windows 10 machine to play with, since the free upgrade from Windows 7/8.1 ended a couple of days ago. Decisions, decisions…