Probably the biggest part of my job, and really it should be the biggest part of any competent administrator’s job, is automation. Most system administrators start out in smaller places, usually small businesses or their own home network, where the number of machines under their control rarely exceeds single digits. At this point it’s pretty easy to get by with completely manual processes, and indeed it’s usually much more efficient to do so. However, things change rapidly as you come into environments with hundreds if not thousands of endpoints that require some kind of configuration, and at that point it’s just not feasible to do it manually any more. Thus most of my time is spent finding ways to automate things, and sometimes this leads me down some pretty deep rabbit holes.
Take for instance the simple task of updating firmware.
You’d probably be surprised to find out that despite all the advances in technology over the decades, firmware updates are still done through good old fashioned DOS, especially if you’re running some kind of hypervisor like VMware’s ESXi. For the most part this isn’t necessarily a bad thing: DOS is so incredibly well known that nearly all the problems you come across have a solid solution. It does, however, impose a lot of limitations on what you can do. For me the task was simple: the server needed to boot up, update the required firmware and then shut down at the end so my script would know that the firmware update had completed successfully. There were other ways of doing this, like constantly querying the firmware version until it showed the updated status, but shutting down at the end would be far quicker and much more reliable (the firmware versions returned aren’t always 100% accurate). Not a problem, I thought: the DOS CD I had must contain some kind of shutdown command that I can put in AUTOEXEC.BAT and we’ll be done in under an hour.
I was utterly, utterly wrong.
You see, DOS comes from the days when power supplies were much more physical things than they are today. When you went to turn your PC on back then you’d flip a large mechanical switch, one that was directly wired to the power supply, that’d turn on with an audible clack. Today the button you press isn’t connected to the power supply directly; it’s connected to the motherboard, and when the connection is closed it sends a signal (well, it shorts 2 pins) to turn the machine on. What this means is that DOS never had any idea about shutting down a system, since you’d just yank the power out from underneath it. This is the same reason that earlier versions of Windows gave you that “It’s now safe to turn off your computer” message: the OS simply wasn’t able to communicate with the power supply.
There are of course a whole host of third party solutions out there, like the shutdown.com application, FDAPM from the FreeDOS guys and some ingenious abuse of the DOS DEBUG command, but unfortunately they all seemed to fail when presented with Dell hardware. As far as I can tell this is because the BIOS on the Dell M910 isn’t APM aware, which means the usual way these applications talk to the power supply just won’t work (FDAPM reports as much). That leaves precious few options for shutting down. Frustrated, I decided that DOS might not be the best platform for updating the firmware and turned towards WinPE.
WinPE is kind of like a cut down version of Windows (available for free, by the way) that you can boot into, usually used to deploy the operating system in large server and desktop fleets. By cut down I mean really cut down: the base ISO it creates is on the order of 140MB, meaning if you need anything in there you basically have to add it in yourself. After adding in the scripting framework, drivers for the 10Gb Ethernet cards and the QAUCLI tool I found in the Windows version of the firmware update, I thought it would be a quick step of executing a command line and we’d be done.
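For anyone repeating the driver step, here is a minimal sketch of how the injection could be scripted from the machine where the image is built. It assumes DISM is on the PATH there and uses the standard mount, add-driver, unmount sequence; the function name and every path in the usage note are made-up placeholders, not the ones from this build.

```python
from typing import List


def dism_driver_injection_commands(
    wim_path: str, mount_dir: str, driver_dir: str, index: int = 1
) -> List[List[str]]:
    """Build the DISM command lines to inject drivers into a WinPE image.

    All three paths are supplied by the caller; nothing here is specific
    to any particular hardware or driver package.
    """
    return [
        # 1. Mount the WinPE boot image so it can be modified offline.
        ["dism", "/Mount-Image", f"/ImageFile:{wim_path}",
         f"/Index:{index}", f"/MountDir:{mount_dir}"],
        # 2. Add every driver (.inf) found under driver_dir, recursively.
        ["dism", f"/Image:{mount_dir}", "/Add-Driver",
         f"/Driver:{driver_dir}", "/Recurse"],
        # 3. Unmount and commit the changes back into the .wim file.
        ["dism", "/Unmount-Image", f"/MountDir:{mount_dir}", "/Commit"],
    ]
```

To actually run it you would loop over the result with something like `subprocess.run(cmd, check=True)` from an elevated prompt, passing hypothetical paths such as `C:\winpe\media\sources\boot.wim`, `C:\winpe\mount` and `C:\drivers\nic`.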
Turns out QAUCLI is probably closer to an engineering tool still in development than a production-level application. Whilst it may have some kind of debug log somewhere (I can’t for the life of me find it and the user guide doesn’t list anything), I couldn’t find any way to get it to give me meaningful information on what it was doing, whether it was encountering errors or if I had executed the command incorrectly. The interactive portion of it is quite good, in fact it’s almost a different tool when used interactively, but the scripted section of it just doesn’t seem to work as advertised.
Here’s a list of the quirks I came across (for reference the base command I was trying to use was qaucli -pr nic -svmtool mode=update fwup=p3p11047.bin):
- Adding output=stdout as an option will make the tool fail regardless of any other option.
- There is no validation on whether the firmware file you give it exists or not, nor if the firmware file itself is valid.
- Upgrading/downgrading between certain firmware versions will fail. I was working with some beta firmware that was supposed to fix a client issue, which could well have been the cause, but doing the same action interactively worked.
- There is no feedback as to whether the command worked or failed other than the execution time. If it fails to update it takes about a minute to finish; if it works it’s closer to 3 to 5 minutes.
- Windows seems to be able to talk to some QLogic cards natively (the QME2572 fibre channel cards specifically) but not the 10Gb cards. This is pretty typical, as ESXi needs a driver to talk to these cards as well, so it’s not much of a quirk of QAUCLI per se. It’s more that you need to be aware that if you want to flash the firmware on them in a WinPE environment you need to inject the drivers into the image.
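
Two of those quirks (no check on the firmware file, and run time being the only success signal) can be papered over with a small wrapper. What follows is only a sketch under stated assumptions: the thresholds are guesses based on the rough timings above, the function names are invented for illustration, and Python isn’t part of a stock WinPE image, so the same logic would need porting to whatever scripting framework you’ve added.

```python
import subprocess
import time
from pathlib import Path

# Base command from the post; the fwup= argument is appended per run.
BASE_ARGS = ["qaucli", "-pr", "nic", "-svmtool", "mode=update"]

# Assumed thresholds: failures finished in about a minute, successes
# took roughly 3 to 5 minutes. Tune these for your own hardware.
FAIL_BELOW_S = 120.0
OK_ABOVE_S = 180.0


def classify_run(elapsed_s: float) -> str:
    """Guess the outcome from how long the tool ran (a heuristic only)."""
    if elapsed_s < FAIL_BELOW_S:
        return "probably failed"
    if elapsed_s >= OK_ABOVE_S:
        return "probably succeeded"
    return "unclear"


def flash(firmware: str, runner=subprocess.run, clock=time.monotonic) -> str:
    """Check the firmware file ourselves, run the tool, then time it."""
    fw = Path(firmware)
    # The tool does not validate the file, so do the basic checks here.
    if not fw.is_file() or fw.stat().st_size == 0:
        raise FileNotFoundError(f"firmware file missing or empty: {fw}")
    start = clock()
    # The exit status is not checked; the duration is the only signal used.
    runner(BASE_ARGS + [f"fwup={fw}"], check=False)
    return classify_run(clock() - start)
```

Calling `flash("p3p11047.bin")` would then return one of the three strings. An "unclear" result is the cue to go and check the firmware version by hand rather than trust the timing.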
Honestly though it could very well be my fault for tinkering with an executable that I probably shouldn’t be. Try as I might to find a legitimate download for QAUCLI I can’t really find one and the only place you’ll be able to get it is by extracting the Windows installer package and pulling it out of there. Still it’s a valuable tool and one that I think could be a lot better than it currently is but if you find yourself in a situation like I did hopefully these little tips will save you some frustration.
I know I would’ve appreciated them 3 days ago 😉