Repairing A Failed P400 Raid Cache Battery

This was done on my HP P400 but the same method should work on a range of other cards with a similar setup such as the P212, P410 and probably many others from this era. As ever your mileage may vary!

Recently I was trying to copy one particularly large file across my network and realised the write speed on my server was dramatically reduced from what I would expect and the reported queue length in resource manager was regularly getting to 10 which seemed excessive. I was also seeing peak throughput to the 12 disk array of about 30 MB/s total.

I was copying the file off my desktop and this is where the weirdness started. I was sending from the desktop at 50MB/s which was slow but on investigation it checked out that that was correct for the drive it was coming from (though it did highlight I was still using one drive that was about 13 years old I should probably replace!). It would start fine and after a while I’d see the following:

Error 0x8007003B: An unexpected network error occurred

Error 0x8007003B: An unexpected network error occurred

I’m still not entirely sure what process is going on in the background here but what I saw was increasing ram usage (presumably at the 20MB/s difference between the sending and writing processes but I didn’t work it out). I assume at some point this buffer hits a limit and it fails. Whatever the exact process this is just a symptom of the fault elsewhere as indicated by having such an awful write speed on the array.

I started investigating the array by actually installing the HP array configuration utility (I probably should have done this ages ago but I actually configured the array through the BIOS utility originally) and on selecting the controller sure enough :

P400 Failed battery warning

Well that’s fairly definitive. So I decided to open it up and have a look at the damage. for the P400 the battery pack is on a long cable and is usually mounted to a bracket on the chassis next to the card. On the DL180 I have it installed into you have to remove three mounting screws and lift out the entire mounting frame and riser card section in one piece.

P400 card installed

Here you can see the HP P400 card (made by LSI but as far as I’m aware with no LSI branded equivalent). Piggybacked on the card is the 512MB cache card and coming out of the right end of it is the multi-coloured cable to the battery. This can be gently pulled out so the battery can be removed.

P400 position in DL180G6

The battery module as it was installed in my server, apparently someone had previously put it back in 180 degrees wrong but nevermind. Right away you can just about see the distortion in the casing caused by the battery failure.

P400 Cache Battery

A closer look at the battery pack shows how badly distorted the casing actually is indicating a total failure of the cells inside – not terribly surprising for a 13 year old battery! If you were doing this properly you’d just buy another one of this whole battery unit and replace it but I decided that for a raid card this old most second hand ones would likely be well used because I doubt they have produced new ones for a while so it’d be hit and miss anyway. Add to that the battery unit just uses four standard 1.2V NiMH cells (albeit uncommon shaped ones) I decided I could probably fix it with basic parts. The issue here was that I found this issue late on December 23rd so the challenge was to fix this with parts I could find locally before the Xmas break. Standby for the hackery…

I started trying to get into the pack on the basis it was ruined anyway. First off flip the pack over and unplug the cable from the battery pack. Technically this isn’t required nor is the board removal below but it makes it easier and takes the risk out when you’re working on the rest of it.

P400 Battery Cable connector
P400 Battery Cable connector removal

Removing the cable shows us the management board that handles charging status and health reporting to the card so we need to keep this. This is retained by a single clip in the recessed section on the left hand edge. Push the clip to the left and lift it, you may need to press it with a tool of some sort as it’s quite small and firm.

P400 Battery management PCB

Once unclipped as above the board can just be lifted out because the battery contacts are just sprung against the underside.

P400 Battery management PCB removed

Next up we need to try to get at the cells themselves. It’s a little hard to see but it turns out the underside is the ‘removable’ section of the case (as in the rest of it is a single moulded part) but mine was thin and brittle enough it just broke up as I tried to take it out but being the underside it’s not visible when reinstalled anyway.

P400 Battery enclosure cover removal

I used one of these iPod cover pry tools I had handy but since the cover broke up anyway a fine screwdriver would also do. Gently work round the cover prying it up but the cover is silicone-ed to the centre of the cells as well so just work on it – it comes of relatively easily.

P400 Battery cells exposed

Now we can see the cells for the first time. You can probably see the how the battery contacts for the PCB form part of the pack. Gently work round the pack cutting/levering the silicone apart then lift the cells out from the curved (non PCB) end of the housing to prevent damage to the contacts. Hopefully If should look like this:

P400 Battery cells removed

For anyone who wants to rebuilt their pack with the correct new cells they’re four Varta V500HT 1.2V/500mAh cells wired in series but you would have to find a good way to replace the cell links because they’re welded on. Also I could only find the cells on eBay where they were £7.50 each

So this is where my plan goes a bit more creative. I decided to just replace the the pack with some basic off the shelf NiMH cells so initially I tried to find a prebuilt pack of the right voltage. These are commonly available on eBay and several other places with a pair of wires coming out however as mentioned earlier this was now Xmas eve and so anything I ordered wouldn’t arrive for ages due to the break. I realised I had some AA NiMH cells which were also 1.2V so four of these would also be correct but I had no way of mounting them. After a brief search I found a local model shop who had suitable 4xAA holders in stock and I was the owner of this.

4xAA Battery  Case

Well worth a couple quid. Next we have to try to make this connect to the management PCB. Now if we were going totally hacky I’d have just soldered the two wires off the battery pack directly onto the contact pads on the back of the PCB and put a big bit of heat shrink over the board but that seemed a step too dodgy…however tempting it was! What I actually did was cut the board contact pads from the pack with a section of the metal from the cell side to allow a wire to be soldered on. The contact pad can them be slotted back into the original housing.

P400 Battery cell contacts reinstalled

In the picture above I slotted the PCB back in to check it made contact and to help hold the pads in place during soldering. Next I drilled a small hole into the rear of the housing to run the wires through from the other pack then soldered the wires on. The + and – are clearly marked on the housing and on the PCB so check this before soldering.

P400 Battery new pack connected

And I stuck a bit of hot glue on the contact points to stop them moving and a blob on the wire to keep it from straining the joints.

P400 Battery 3xAA complete

Now just put the AA’s in the case. If you have a battery case like this one, remember to flick the switch to ON!

Now you just plug the cable back into the PCB and install back into your computer. If you have the proper mount for it in your chassis it will just clip back in and not even be obvious the back has been removed. If you used a longer bit of black 2 core cable between the new pack and the old one you could make it even neater. Alternatively if you cut out the top of the housing you could probably fit a 2×2 AA housing sticking out through with the aid of some silicone but it’d still mount onto the standard bracket even if it would be pretty ugly. Take your pick!

P400 Battery with new 4xAA pack in the server

All done, now put the server back together and fire it up…

P400 Battery charging warning

Now on checking the HP Array configuration tool the message has changed to the above. I left it do its own devices and after a while this warning cleared. Once caching was enabled again the speed increase was dramatic. The write speed on the array for large files (>1Gb so larger than the cache) measured at 234 MB/s on the server Vs the 30 MB/s it was before. Smaller file writes which store to the cache are dramatically faster. On reattempting the large copy that kicked all this off it worked flawlessly first time. Monitoring resource manager on the server I could see that rather than a continuous write at 30 MB/s and high queue depths the write speed was periodically jumping much higher and waiting for more data in between, the queue depth was constantly at less than 1. I also didn’t see the increasing RAM usage I’d seen previously. It is also evident looking at the drive lights are mostly solid now with the occasional blink where they were constantly blinking.

2 thoughts on “Repairing A Failed P400 Raid Cache Battery”

  1. Nice write up, I had to do similar a year or two back on my DL380G5.

    Now, this might be a totally silly question and despite many years in IT, I can’t fathom an answer.

    What is the point of Battery Backed Cache RAM?

    The obvious answer is that it protects the contents of the cache in case the controller loses power so it can flush the contents to disk.

    But if there’s no power, the disks are not spinning, therefore you can’t write to them.

    What am I missing?

    1. Hi, thanks! There’s no question that’s a silly question if you learn something new from it!

      What happens is when the power is restored any queued operations which were in the raid cache memory but not yet written to disk are written out to the disk array automatically by the raid controller. So if you don’t power it back up fast enough after a power failure you will still lose the contents of the cache though generally with a good battery this period can be in the order of days so you have a decent window. Obviously during normal planned shutdown the card is told to flush the cache to disk before the system is powered down so you don’t have this time limit.

      Jon

Leave a Reply

Your email address will not be published. Required fields are marked *