What you need to know about the new HPE Hybrid IT Master ASE Certification exam

As I am sure those of you who are heavily involved in architecting Hewlett Packard Enterprise’s infrastructure solutions (servers, storage, and networking) already know, a new HPE Master level certification was announced earlier this year.  This new certification is the HPE Hybrid IT Master ASE, and it is going to be the pinnacle of all HPE certifications.  Many of us who hold Master ASEs in Servers, Storage, and Networking will naturally be looking to obtain this Master ASE certification as well.  In some cases, Partner Ready requirements will drive your need to obtain this certification, but I also know that for many of my peers, it’s a matter of pride and desire to achieve this certification.  Whatever reason drives you to pursue it, I am writing this article to tell you that achieving this new certification isn’t going to be a walk in the park.  HPE opted to take a different path to certification, and the traditional testing methods we all know, have tested with before, and are comfortable with have been changed up for this certification.

By now you are asking yourself, how does Dean know about this?  I, along with several of my peers from around the globe (many of whom you would likely know too), was honored to be invited to join the design team for this certification (and some of the related electives for the certification).  When this certification goes live, it will have been a 15+ month journey for some of us, beginning in August 2018.  That journey took us from the initial blueprint of how we wanted to test, to the content of the beta courseware (which was just finished last month), to the certification launch on November 1, 2019.  Hundreds and hundreds of hours have gone into the design of this certification, the courseware, and of course the creation of the certification exam itself.  Along the way, there were many phone calls, Skype meetings, face to face meetings at various HPE facilities, and countless hours of reading (and then revising) the alpha and beta courseware material that makes up both the Hybrid IT ASE and HPE Hybrid IT Master ASE courses and exams.  In mid-July (2019) many of us from around the globe gathered in a meeting room at HPE’s campus in Roseville, California to work on the exam creation.

The first thing you’ll notice different is the exam number.  Today, we normally all take proctored HPE0-### exams for our certifications.  The HPE Hybrid IT Master ASE certification will be an HPE1-### series exam, and will not be delivered by Pearson VUE but rather it will be delivered by PSI.  While PSI does have some testing centers, the HPE Hybrid IT Master ASE exam will be an online proctored exam that you will be expected to take at home or at your office – similar to the online proctored HPE0-### exams that are already offered by Pearson VUE.

The second difference you will notice is the length of the exam – you will be given 4 hours to complete it, not the typical 90 or 120 minutes you are used to with the HPE0-### exams (yes – washroom breaks will be allowed).

The third thing you will notice different is both the exam price and the retake policy.  The price of the exam will be between $695 and $895 USD depending on your country of residence, which is more than double the price of today’s HPE0-### exams.  The retake policy is also different.   With HPE0-### exams, you can immediately retake the exam once if you fail it (as long as you have not failed twice in 14 days).  With the new HPE1 exam, there will be an automatic 14-day waiting period after each failure before you can rebook for another attempt.

The fourth thing you will notice is the composition of the HPE Hybrid IT Master ASE exam – it will be broken into 3 distinct sections: questions and answers (similar to today’s exams), a research portion, and a hands on portion (more details on all three of these sections are below).  However, for every single item, once you click submit on the answer to the item, there is no going backwards to review or change your answer.

Part one of the exam will consist of a series of Discrete Option Multiple Choice (DOMC) questions.  For those of you that have not seen a DOMC exam before, basically you get asked a question, and are presented with a single answer on the screen at a time – to which you either select YES or NO if the answer is correct for the question.  Each question may have one or more answers that get presented to the test taker (but still only one answer at a time will appear on the screen).  I’ll admit I was very skeptical and concerned when the decision was made to utilize DOMC, but having worked with it for a while now as part of this process, I’m very comfortable with it and I am no longer concerned it will affect your chances of passing or failing.

Part two of the exam will probably start to take some of you out of your comfort zone.  You’ll be given a series of scenarios that you will need to answer questions about.  Some scenarios may build on previous scenarios you were given as well.  You’ll RDP into a remote environment, and be required to observe many items in that environment to answer questions about accurately building a solution that properly integrates with that existing environment.  Nothing is off the table here, from Synergy frames to storage systems and network switches.  Almost all of the Hybrid IT portfolio and its respective management GUIs and CLIs are present here – you’ll need to know where to look to determine if the answer presented to you (via DOMC) is correct.  This is no different from what you’d need to do if you were designing an upgrade for one of your customers.  A simple example is “Your customer wants to do this with their existing environment, do you need to add this particular item to your solution to accomplish this? YES or NO”.

If part two got you out of your comfort zone, then part three is going to really take you far out of your comfort zone…  In part two, you are simply reviewing the exam’s hardware infrastructure and environment, but in part three, you are actually modifying the environment – with very real hardware that you are connected to.  Think of it as having to perform a demo of a feature or something to one of your customers using their existing equipment.

You know all those hands on labs offered at various HPE conferences that you may have attended in the past, but you’ve skipped to spend extra time at the bar in the evenings?  Well those HOL experiences will be very handy here, as it’s very much hands on with the management tools (both GUI and CLI).  Everything from configuring, upgrading, or fixing connectivity issues with Synergy, 3Par, Windows, vCenter, and switches (of all types) is covered here – and you may need to use multiple tools from across the portfolio to accomplish your tasks.  You may use either the GUI or CLI to accomplish your task (or maybe both), but the task must be 100% correct and completed when you hit the submit button.

You will be provided all the appropriate manuals, CLI guides, and documentation you require to complete the tasks – they will be available on the server you will be RDPing into.  So it’s open book, so to speak – you’ll have these resources, but only these resources (you won’t be able to search the internet for walkthroughs!).  However, if you have to utilize the provided material to look up how to complete every single little step, you’ll quickly run out of time – the documentation is there to provide you a guide, not tell you how to perform (i.e. for the first time in your life) whatever action it is you need to do.

A word of warning though – as this is real hardware, running in a real datacenter, it is possible for you to completely break the testing environment, which will prevent you from completing your assigned task, possibly resulting in a score of zero for the task.  In the real world, if you mess up and accidentally destroy or delete something in your customer’s running environment, you’ll have failed in the customer’s eyes.  This is no different – if you break the testing environment here (e.g. maybe you accidentally deleted a volume instead of extending a volume) and are unable to complete the assigned task because of it, then you’ll fail the question.

HPE says this is the first time anyone in the IT certification industry has used real hardware and an automated scoring system in real-time to verify that what you have done is correct.  Spelling counts.  Exactly correct numbers count (i.e. 100MB vs. 1000MB).  If you are asked in a scenario to name something “bigwheel” and you name it “big wheel” with a space (or you typo it as “bigwhel”), then that answer will be marked wrong (although we are told the scoring won’t look at the case sensitivity of the answer, just the spelling, spacing, etc.).  So just like in real life – spelling errors and wrong numbers will result in broken configs, or in this case a wrong answer.  This is completely automated scoring (don’t worry – it’s been fully vetted by your peers already) – so when you hit that final submit button (and I do believe if memory serves me correctly that you’ll be warned that your answer / task is about to be scored if you hit submit), the testing software instantly runs a series of scripts that interrogates everything that makes up the exam’s hardware environment and looks at the relevant output to determine if you’ve correctly accomplished your assigned tasks.  So you’ll know in just a few seconds after hitting that very final submit button if you are the world’s newest HPE Hybrid IT Master ASE or not!
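To make the naming rule concrete, here is a minimal sketch of a comparison that ignores case but nothing else.  This is purely my own illustration of the rule as described above, not the actual scoring scripts, and the function name is hypothetical.

```python
def name_matches(expected: str, actual: str) -> bool:
    """Compare two names ignoring case only; spelling and spacing must match."""
    return expected.casefold() == actual.casefold()


# "BigWheel" differs only by case; the other two differ in spacing or spelling.
for candidate in ["BigWheel", "big wheel", "bigwhel"]:
    print(candidate, "->", name_matches("bigwheel", candidate))
```

Only the first candidate is treated as a match, which lines up with the “big wheel” and “bigwhel” examples above.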

The HPE Hybrid IT Master ASE certification exam is not going to be for the faint of heart.  This certification is going to require you to have several years of real world experience and knowledge in HPE compute, storage, and networking.  And if you think you are going to be able to rely on a brain dump to pass, think again – DOMC, the scenarios on real hardware, the exam cost, and the retake policy (along with some other things I can’t discuss) are going to put a serious crimp on both the quality and quantity of brain dumps that will be available.

So what are my tips to you for achieving this certification?

  • Do take the course.  Yes it is expensive and time consuming, but it will cover (including hands on labs) the concepts and knowledge you must have (aside from the real world experience you should already have) to pass the certification exam.
  • Do not wait to take the exam once you have taken the course – take the exam while the course and hands on labs are fresh in your mind.
  • Be prepared to wait for an exam slot. I think initially it will be hard to schedule an exam due to demand and the limited number of testing slots available per day (given that the exam requires a complete set of real hardware that must be flattened and reset after each exam).
  • Do not wake up one morning and decide to take this exam in the afternoon “cold” without properly preparing.  Many of us do this today at various events we attend (e.g. Aspire, TSS, Discover), and it’s not going to result in an exam pass here.  I know of maybe a handful of my peers in the world that maybe could do that without any preparation and have a reasonable chance of passing.
  • Do read, re-read, and then re-read every single word of every single question on the exam – some of the questions and scenarios are very long with lots of information, and it’s easy to skip over key details, words, or numbers that you will need to accurately answer the question or complete the scenario assignments.
  • Do not be intimidated by the DOMC format – it’s really not as bad as you may initially fear.
  • Do take the practice DOMC exam so you have an idea of what to expect on the real exam. You can find a HPE DOMC practice exam (with examples of ASE level server/storage/networking items) at the following link:  https://sei.caveon.com/launchpad?exam=try-domc-for-hpe

For those of you planning to try to obtain this certification, before you register for the course, I’d suggest you chat with your regional Partner Enablement Manager to see if there are any promotions running for the course and exam (wink, wink, you may find a pleasant surprise).

I would like to wrap up by offering you the best of luck in obtaining the HPE Hybrid IT Master ASE certification and to remind you:

You will truly need to be a Master of HPE Hybrid IT to become a HPE Hybrid IT Master ASE!

 

HOWTO: HTTP boot the HPE Proliant Service Pack ISO DVD using RESTful API to update firmware without messing with WDS or PXE

Most of my customer sites consist of one to four HPE Proliant DL3xx servers running VMware ESXi and an additional HPE Proliant DL3xx running Windows 2012 R2 / 2016. HPE offers some great tools for managing their servers, but unfortunately for smaller organizations, most of HPE’s management tools (and I’m looking squarely at you Insight Control and OneView) take more time to set up and get running correctly than the time you’ll save by installing / updating a small handful of servers manually.  Therefore, I usually don’t deploy these tools to help install OSes or update firmware at my smaller client sites.  I generally just rely on booting the HPE Service Pack for Proliant (SPP) to update firmware, use a USB key with a scripted ESXi install on it for installing ESXi, and utilize WDS to install Windows directly on my Proliants when required.

Prior to HPE Proliant Gen 9 servers, I would PXE boot the Proliant Service Pack using PXELINUX and mount the ISO via NFS.  Then along came Gen 9 with UEFI.  Unfortunately, PXELINUX suffers from a complete lack of support for UEFI.  A couple of times I pestered some of the HPE SPP developers and managers in person while at HPE’s campus in Houston, but they never really showed much interest in explaining or documenting how to get network booting working with the SPP when the server utilized UEFI, so I had pretty much given up on ever getting it to work.

The other day I was playing with the HPE RESTful Interface Tool and decided to try configuring HTTP boot on a DL380 Gen10 with the current SPP ISO image (P11740_001_spp-2018.11.0-SPP2018110.2018_1114.38.iso).  Much to my surprise, after modifying only a single configuration file on the ISO image, I was able to successfully boot the current SPP ISO image via HTTP and run a full firmware update on the Gen10 I was playing with.

The nice thing about this method is that because it is all done via HTTP, you don’t have to mess with or disable your WDS (Windows Deployment Services) server to add Linux support (which is what the SPP ISO is based on).  So this is great news for pure Windows shops!  And as a bonus, these steps work with Gen 9 servers too.

So how did I do it?  Before I share that, as always:

Use any tips, tricks, or scripts I post at your own risk.

First, you need to slightly modify the SPP ISO image.  Copy the original SPP ISO image to your web server (e.g. c:\inetpub\wwwroot).

Open the ISO image with your favorite ISO editor and extract \efi\boot\grub.cfg, then open the grub.cfg with a decent text editor (e.g. Notepad++, but definitely not the built-in Windows Notepad).  Scroll down to the first menuentry, which will be “Automatic Firmware Update”.  Then copy and paste the following just above that menuentry:

menuentry "HTTP Firmware Update Version 2018.11.0" {
set gfxpayload=keep
echo "Loading kernel..."
linux /pxe/spp2018110/vmlinuz media=net root=/dev/ram0 ramdisk_size=10485760 init=/bin/init  iso1=http://xxx.xxx.xxx.xxx/spp.iso iso1mnt=/mnt/bootdevice hp_fibre cdcache TYPE=MANUAL AUTOPOWEROFFONSUCCESS=no modprobe.blacklist=aacraid,mpt3sas  ${linuxconsole}
echo "Loading initial ramdisk..."
initrd /pxe/spp2018110/initrd.img
}

So your grub.cfg will look like this when you are done:

[Screenshot: grub.cfg with the new menuentry inserted above “Automatic Firmware Update”]

Adjust the http address (xxx.xxx.xxx.xxx), path, and ISO image name as required for your network, then save the updated grub.cfg and inject it back into the ISO image, overwriting the existing \efi\boot\grub.cfg, and then save the updated ISO image.

Be sure to add the .ISO mime type to your web server so that the ISO file type can be handled correctly.  The command below will work with IIS 8.5 and above to add a new mime type to IIS for .ISO.

C:\Windows\System32\inetsrv\appcmd.exe set config -section:system.webServer/staticContent /+"[fileExtension='.iso',mimeType='application/iso']"

Now, you need to install the HPE RESTful Interface Tool on your machine.  The current version at the time of this writing is 2.3.4.0.  Go to the Hewlett Packard Enterprise Support Center and search for “RESTful Interface Tool for Windows”, then download and install the .msi (a Linux version is available there as well).

Once the HPE RESTful Interface Tool is installed, run it as an Administrator.  Next, you need to connect to your server’s ILO, select the Bios object, set the UrlBootFile entry, and commit the changes.

*** NOTE: Make sure the UrlBootFile entry matches the URL of the ISO image that you put on your webserver and specified as the iso1 switch in the grub.cfg entry.

ilorest
login ilo_ip_address -u admin -p password
select Bios.v1_0.0
set UrlBootFile=http://xxx.xxx.xxx.xxx/spp.iso
commit


This takes care of the changes you must make to your Proliant server (keep in mind each server that you want to HTTP boot needs to have this done).
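Since every server needs the same four commands, you may find it handy to generate them per ILO rather than retyping them.  Below is a minimal sketch that only prints the command text from this post for a list of ILO addresses; the addresses, credentials, and function name are placeholders I made up for illustration, and it does not talk to any ILO itself.

```python
def ilorest_commands(ilo_address, username, password, iso_url):
    """Return the ilorest commands from this post for a single ILO."""
    return [
        "login " + ilo_address + " -u " + username + " -p " + password,
        "select Bios.v1_0.0",
        "set UrlBootFile=" + iso_url,
        "commit",
    ]


# Hypothetical ILO addresses (documentation range); replace with your own.
ilo_addresses = ["192.0.2.11", "192.0.2.12"]

for address in ilo_addresses:
    print("# " + address)
    for command in ilorest_commands(address, "admin", "password",
                                    "http://xxx.xxx.xxx.xxx/spp.iso"):
        print(command)
```

You would still paste each group of commands into an ilorest session yourself, so review the output before using it.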

The next time your server boots, the UrlBootFile change will be applied at the end of POST, then the server will automatically reboot and start to POST again.


That’s it – your configuration is all done.  Now when you reboot your server, if you hit F11 for the Boot Menu, you’ll have an entry for HTTP there – select it.


After maybe 30 to 45 seconds (depending on your network speed – I’m using 10GbE), you’ll see the familiar SPP boot menu, but with an extra entry which is set as the default entry.


Select it, and after about a minute (again – I’m using 10GbE) you’ll see the ISO image get mounted.


If the image fails to mount, verify you are able to download the image you specified as the UrlBootFile from your PC.  If that works, then verify that the grub.cfg is correctly updated, with no typos.  Also – verify your server has 16GB+ of RAM in it, as the grub entry creates a 10GB RAM disk.  You may also need to upgrade the ILO firmware and drivers to current builds (such as 2.61 for ILO4 or 1.39 for ILO5) before using the iLOrest tool.
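The first check above (can the ISO actually be fetched over HTTP?) can also be scripted.  Here is a minimal sketch using only the Python standard library; it sends a HEAD request so the multi-gigabyte ISO isn’t actually downloaded.  The function name is hypothetical, and the opener parameter exists only so the check can be exercised without a real web server.

```python
import urllib.request


def iso_is_downloadable(url, opener=urllib.request.urlopen, timeout=10):
    """Return True if the web server answers a HEAD request for url with 200."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with opener(request, timeout=timeout) as response:
            return response.status == 200
    except OSError:
        # URLError and HTTPError are subclasses of OSError, so DNS failures,
        # refused connections, and 404 responses all end up here.
        return False
```

Call it with the same URL you set as the UrlBootFile, for example iso_is_downloadable("http://xxx.xxx.xxx.xxx/spp.iso").  A False result usually points at the web server (path, MIME type, firewall) rather than at the grub.cfg.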

If you so desire, you could also set the new grub entry to be totally automatic by grabbing the proper switches out of the “Automatic Firmware Update” entry.  I suspect it may also be possible to split the ISO and boot one ISO without the packages folder (so it boots quicker) and mount a second ISO with the packages folder still there to run the upgrades from.  Just to be clear, I haven’t tested that yet – it’s just a theory at this point.

I have tested this by HTTP booting over a branch office VPN tunnel which tops out at 100Mbps – it took a while for the image to load (I didn’t time it as I was working on other things at the time), but it did eventually load and it successfully updated the remote server.

When the next Service Pack for Proliant is released, all you need to do is update the grub.cfg with the correct paths and copy the updated ISO to your webserver with the same file name you used here.  You shouldn’t need to adjust the UrlBootFile on your servers.
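If you would rather not hand-edit the menuentry for each release, it can be generated.  The sketch below rebuilds the entry shown earlier from a version string and ISO URL.  It assumes the /pxe directory on future ISOs keeps following the pattern seen here (“spp” plus the version with the dots removed, so 2018.11.0 becomes spp2018110); I have only seen that on this one ISO, so check the path on the new image before trusting it.  The function name is my own.

```python
def spp_menuentry(version, iso_url):
    """Build the HTTP boot menuentry text for a given SPP version and ISO URL."""
    # Assumption: the kernel and initrd live in /pxe/spp<version without dots>.
    pxe_dir = "/pxe/spp" + version.replace(".", "")
    kernel_args = [
        "media=net", "root=/dev/ram0", "ramdisk_size=10485760", "init=/bin/init",
        "iso1=" + iso_url, "iso1mnt=/mnt/bootdevice", "hp_fibre", "cdcache",
        "TYPE=MANUAL", "AUTOPOWEROFFONSUCCESS=no",
        "modprobe.blacklist=aacraid,mpt3sas", "${linuxconsole}",
    ]
    lines = [
        'menuentry "HTTP Firmware Update Version ' + version + '" {',
        "set gfxpayload=keep",
        'echo "Loading kernel..."',
        "linux " + pxe_dir + "/vmlinuz " + " ".join(kernel_args),
        'echo "Loading initial ramdisk..."',
        "initrd " + pxe_dir + "/initrd.img",
        "}",
    ]
    return "\n".join(lines)


print(spp_menuentry("2018.11.0", "http://xxx.xxx.xxx.xxx/spp.iso"))
```

The printed text still has to be pasted into grub.cfg above the “Automatic Firmware Update” entry and injected back into the ISO as described above.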

Happy updating!

 

 

HOWTO: Replace a failed 3Par drive

HPE 3Pars are great arrays, but just like any other storage system, they do occasionally end up suffering a failed hard drive.  Replacing a failed 3Par drive isn’t quite the same as replacing a failed Proliant Smart Array controller drive – there are a few manual steps that need to be done to facilitate the replacement process, which I am going to detail below (note – I’m using a StoreServ 7200, based on OS 3.2.1 MU2 as my reference in this post).

First, SSH (via Putty) to the 3PAR’s management IP and log in as 3paradm (remember the username and password are case-sensitive).

At the 3PAR_SN# cli% prompt, type:    showpd -failed -degraded

This should show you the failed drive and its ID (in the example below, the drive hasn’t totally failed, but rather is just degraded due to an internal loop error in the drive, so it needs to be replaced).

[Screenshot: showpd -failed -degraded output listing the degraded drive]

Next, see if servicemag has been issued or is running with:   servicemag status

If servicemag is not running, you will see:   No servicemag operations logged.

Now we want to see if the data has been evacuated off the drive already by running this command:   showpd -space 15   (where 15 is the ID of the drive that needs to be replaced).   Using the output shown below, double check there is no data left on the drive. You need to check that all columns other than size and failed are zero.  As you can see from the example, this drive still has data on it (again because the drive in this example is only degraded, not failed – my experience is that typically failed drives have 0, 0, 0, 0 for volume, spare, free, and unavailable, while failed is usually equal to the size).

[Screenshot: showpd -space 15 output]

To evacuate the data, run this command:    servicemag start -pdid 15     and answer yes when prompted if you are sure you want to run it.


To check the status / progress of the servicemag command, run:    servicemag status

[Screenshot: servicemag status output while the evacuation is in progress]

As you can see above, 4 chunklets (1GB blocks of disk space) have been moved off the drive so far, with another 107 chunklets (107 GB) to evacuate.  Below is what you will see once the servicemag process has finished.

[Screenshot: servicemag status output after the evacuation has finished]
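If you want a rough idea of how far along the evacuation is, the arithmetic is simple because each chunklet is 1GB.  Here is a small sketch using the numbers from this example (4 chunklets moved, 107 still to go); the function name is my own, and you have to read the two counts off the servicemag status output yourself.

```python
CHUNKLET_GB = 1  # each chunklet is a 1GB block of disk space


def evacuation_progress(moved, remaining):
    """Return (total GB to evacuate, percent complete) from chunklet counts."""
    total = moved + remaining
    percent = 100.0 * moved / total if total else 100.0
    return total * CHUNKLET_GB, percent


total_gb, percent = evacuation_progress(moved=4, remaining=107)
print(f"{total_gb} GB to evacuate in total, {percent:.1f}% done")
```

For this example that works out to 111 GB in total with only a few percent done, so don’t expect the drive to be ready to pull right away.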

Before continuing, verify there is no data left on the drive by running:  showpd -space 15


When the HPE field engineer arrives onsite with the replacement disk, you may need to turn on the locate light on the failed drive for them.  To do this, run:      locatecage -t XX cageY ZZ    where XX is the time in seconds (e.g. 300), Y in cageY is the cage number shown above, and ZZ is the magazine number to locate (e.g.  locatecage -t 300 cage0 15 enables the flashing locate light for 5 minutes for the failed drive that is being referenced in this HOWTO).
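To spell out how the three placeholders fit together, here is a tiny sketch that assembles the locatecage command from a duration in minutes, a cage number, and a magazine number.  The function is hypothetical and only builds the text of the command shown above; you still run the result in your SSH session.

```python
def locatecage_command(minutes, cage, magazine):
    """Build the locatecage command line; the -t value is given in seconds."""
    seconds = int(minutes * 60)
    return f"locatecage -t {seconds} cage{cage} {magazine}"


# 5 minutes on cage 0, magazine 15, as in the example from this HOWTO.
print(locatecage_command(minutes=5, cage=0, magazine=15))
```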

Once the drive has been replaced, the 3Par should automatically run an admitpd in the background for you.  To verify this, run:   showpd -p -mg ZZ -c Y     to see if the new drive is listed (note it will most likely have a different drive ID than the dead drive).

When you have verified the new drive has been seen and admitted, you can check the rebuild status with servicemag status.  You can see below the rebuild process, followed by the status message once servicemag has successfully finished.

[Screenshot: servicemag status output during and after the rebuild]

If you go back to the HP 3PAR Management Console and refresh the console, you should find the failed drive no longer appears (it will stay there appearing as failed even after it has been removed from the cage until the rebuild process is completed, at which point it will go away).

If the HP 3PAR Management Console indicates a firmware update needs to be performed on the replacement drive, run:   upgradepd ZZ    and answer yes when prompted.  Refresh the HP 3PAR Management Console when the upgrade is complete to check for any other errors.

If no further errors appear, the drive replacement process is completed.  If there are errors, then escalate back to HPE with your original case number.

As always – Use any tips, tricks, or scripts I post at your own risk.

HOWTO: Turn on a HDD UID on a HPE Proliant in VMware with HPSSACLI

This morning we needed to replace a hard drive that had a PFA on it in a HPE Proliant running VMware ESXi at a remote site.  Unfortunately, while ILO is great at identifying the defective drive, it has no ability to enable the UID on the drive, and given that this unit is at a remote site, we had no way of knowing in advance if the fault light was actually turned on for this drive before the HPE field engineer arrived to swap the drive.  So after digging through the help documentation, I found the necessary HPSSACLI command to enable the drive’s UID.

First, to get a list of all the physical drives in an ESXi host, SSH the host and run this command:

/opt/hp/hpssacli/bin/hpssacli ctrl slot=0 physicaldrive all show

This should output a list of all the drives in the system as shown below.

[Screenshot: hpssacli physical drive listing]

Next, to enable the blue UID LED for 1 hour on port 2I, box 1, bay 8, run this command:

/opt/hp/hpssacli/bin/hpssacli ctrl slot=0 physicaldrive 2I:1:8 modify led=on duration=3600

The blue UID should now come on for 1 hour and then shut off on its own.  If you want to manually shut it off before the 1 hour is up, run the same command again, but change the “led=on” to “led=off”.
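If you do this often, a small helper that builds the command for a given slot, drive address, and LED state may save some typing.  This is just a sketch that assembles the same command text used above; the function name is my own, and following the note above, the “off” variant is the same command with only the led value changed.

```python
HPSSACLI = "/opt/hp/hpssacli/bin/hpssacli"


def drive_led_command(slot, drive, state="on", seconds=3600):
    """Build the hpssacli command that switches a drive's UID LED on or off."""
    if state not in ("on", "off"):
        raise ValueError("state must be 'on' or 'off'")
    return (f"{HPSSACLI} ctrl slot={slot} physicaldrive {drive} "
            f"modify led={state} duration={seconds}")


print(drive_led_command(0, "2I:1:8"))               # light the UID for 1 hour
print(drive_led_command(0, "2I:1:8", state="off"))  # switch it off early
```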

As always – Use any tips, tricks, or scripts I post at your own risk.

Upgrade a stuck ILO firmware via SSH

We have had a rash of issues whereby upgrading ILO firmware via the WebUI has been failing.  It looks like it finishes, but when you log back in, it is still the original firmware from when you started the upgrade.  And no matter what you do via the WebUI, it just will not upgrade.  So to upgrade the stubborn firmware, the simplest thing to do is SSH the ILO directly and upload the firmware via the console interface.  Below are the steps to do this.

First, you need a running web server to pull the firmware from.  IIS is usually the handiest, so it is simply a matter of adding a mime-type for the binary firmware file.  Open an administrative command prompt and run:

c:\windows\system32\inetsrv\appcmd.exe set config /section:staticContent /+"[fileExtension='.bin',mimeType='application/x-bin']"
iisreset /restart

Extract the ILO firmware bin with 7-Zip and put the bin somewhere within IIS that you can download it.   Next – to save myself extra grief, I also make sure I can actually download the firmware to a regular PC with a browser before continuing.  So open the browser of your choice and make sure you can download the bin to your PC before continuing.

Putty to the ILO interface, accepting the SSH key (if prompted), and log in.  Once logged in, check the current firmware version, then download the new firmware with the following commands.

*** Note – the ILO will automatically reboot once it successfully downloads the firmware and does not give any indication of the reboot.  As a result, you might want to start a continuous ping to the ILO to see once it has rebooted and is back up ***

show /map1/firmware1
cd /map1/firmware1
load -source http://http_server_ip/ilox_xxx.bin

Once the ILO reboots, you should have a working ILO with the firmware version you want / need.
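As an alternative to watching a continuous ping, you could poll the ILO’s SSH port until it answers again.  Note this is a swap on my part: the sketch below checks TCP port 22 rather than sending ICMP pings, and it only tells you when the port is reachable, so start it once the ILO has actually dropped off the network.  The function name and defaults are hypothetical; the connect and sleep parameters exist only so it can be tried without a real ILO.

```python
import socket
import time


def wait_until_reachable(host, port=22, attempts=60, delay=5.0,
                         connect=socket.create_connection, sleep=time.sleep):
    """Try to open a TCP connection until it succeeds.

    Returns the attempt number that succeeded, or None if every attempt failed.
    """
    for attempt in range(1, attempts + 1):
        try:
            with connect((host, port), timeout=3):
                return attempt
        except OSError:
            # Refused, timed out, or unreachable: wait and try again.
            sleep(delay)
    return None
```

For example, wait_until_reachable("ilo_ip_address") would retry every 5 seconds for up to 60 attempts with the defaults shown.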

As always – Use any tips, tricks, or scripts I post at your own risk.


Factory Reset a HPE FlexFabric 5700 to defaults

Not too long ago, we received a new HPE FlexFabric 5700 switch and we proceeded to muck around with the configuration settings trying a few things that we normally would never do to a production switch.  When we were done having fun and learning, we needed to reset the unit back to defaults so we could really deploy it into production.  Of course, resetting a switch to factory defaults is not something you do very often, so we had to actually RTFM.  I’ll save you the time of that here…

From the serial console, execute these commands:

restore factory-default
yes
save
yes
{hit enter}
reboot

When the switch reboots, it will be at defaults.

Below is a screen snapshot of what you’ll see during this process.

[Screenshot: console output of the factory reset]

HOWTO: Monitor the rebuild status of a HPE SmartArray in ESXi 5.5

To monitor the rebuild status of a HP SmartArray controller in VMware ESXi 5.5, you need to have the HP VMware tools bundle installed (which is installed if the server was installed from the HP VMware media / ISO).  Once the tools bundle has been installed, simply SSH the server (or go right on the console, either physically or via ILO), login and run:

/opt/hp/hpssacli/bin/hpssacli ctrl all show status

This will provide you a list of all the SmartArray controllers in the server.  From this list, find the slot number of the controller that contains the logical drive you need to check the status on and run the following command (substitute slot=XX for the slot value you determined with the previous command):

/opt/hp/hpssacli/bin/hpssacli ctrl slot=XX ld all show


If you happen to be running an older version of ESXi 5.x, or your HP VMware Tools bundle is not somewhat recent, then the commands are somewhat different.  In this case the correct commands are:

/opt/hp/hpacucli/bin/hpacucli
ctrl all show
ctrl slot=0 ld all show
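If you capture the output of the “ld all show” command (for example by running it over SSH from a script), the per-drive status can be pulled out with a short parser.  The sketch below is an illustration only: it assumes each logical drive is reported on a line shaped like “logicaldrive 1 (279.4 GB, RAID 1, Recovering, 45% complete)”, which is my recollection of the format and not something shown in this post, so compare it against your own output first.  The function name and the sample text are made up.

```python
import re

# Assumed line shape: logicaldrive <n> (<size>, <RAID level>, <status>[, <progress>])
LD_LINE = re.compile(r"logicaldrive\s+(\d+)\s+\((.+)\)")


def logical_drive_states(output):
    """Map each logical drive number to its status fields (status, progress)."""
    states = {}
    for line in output.splitlines():
        match = LD_LINE.search(line)
        if match:
            fields = [field.strip() for field in match.group(2).split(",")]
            # Assumption: the first two fields are size and RAID level.
            states[int(match.group(1))] = fields[2:]
    return states


# Made-up sample text in the assumed format.
sample = """
   array A
      logicaldrive 1 (279.4 GB, RAID 1, Recovering, 45% complete)
      logicaldrive 2 (558.7 GB, RAID 1, OK)
"""
print(logical_drive_states(sample))
```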

HPE Insight Remote Support 7.6 auto-upgrade fails

As some of you may have noticed, HPE released Insight Remote Support (IRS) version 7.6 this week.  Among other things, the interface is now rebranded with the new HPE logo and icon, it has better security logging, and it adds support for a bunch of new HPE Networking and HPE StoreEasy products.

If you have already set the “Automatic Update Level” in IRS to “Automatically Download and Install”, you may already have 7.6 successfully deployed to your server.  It’ll be pretty obvious to tell too – if you see the new HPE logo on the login page or as the desktop shortcut icon, you are already at version 7.6.

For some reason however, a couple of my IRS 7.5 servers have failed to auto-update to 7.6.  Trying to install the 7.6 update from the Software Tab in IRS by clicking the Start Update also fails.  Normally at this point, I’d simply go to the Software Depot, download 7.6 and manually run the setup – except that 7.6 isn’t available in the Software Depot as the Software Depot download page generates an error message as of this writing (2016.04.02).

So – after some troubleshooting and poking around the log files, I determined you can download the 7.6 package update from the same spot that IRS downloads it:

https://services.isee.hp.com/SWM/packages/ProdUpgPkg/2016-03-31T154720/ProdUpgPkg+7.6.0.27.zip

Unzip this archive to C:\TEMP and then from a command prompt run:

msiexec /i "C:\TEMP\ProdUpgPkg+7.6.0.27\lib\hprs7kit.msi" /lv "%HP_RS_LOG%\hprs_7.6.0_install.log"

Now – if your servers were like those same servers I have, this will fail too.  Taking a look at “%HP_RS_LOG%\hprs_7.6.0_install.log”, you’ll find that pg_dumpall.exe couldn’t connect to the database as the connection was refused.  This results in database.sql being missing, which causes the install to puke with an error code of 1603.  database.sql is the Postgres database dump of your production IRS database that the installer attempts to make.  Now just above the 1603 error in “%HP_RS_LOG%\hprs_7.6.0_install.log”, you’ll find the actual command line for pg_dumpall.exe, which should be (depending on the vintage of your original IRS install) either:

"C:\Program Files\HP\RS\postgresql_9_win32\bin\pg_dumpall.exe" --host=localhost --port=7950 --username=postgres --file="C:\ProgramData\HP\RS\DATA\PERSISTENCE\UPGRADE\database.sql"
-- or --
"C:\Program Files (x86)\HP\RS\postgresql_9_win32\bin\pg_dumpall.exe" --host=localhost --port=7950 --username=postgres --file="C:\ProgramData\HP\RS\DATA\PERSISTENCE\UPGRADE\database.sql"

Manually running the appropriate version of the command line from above will result in you being prompted for the postgres user password 6 times.  Unfortunately, this password is undocumented, but by doing some detective work (I won’t be sharing how I found what it was), I’ve determined it to be “edit – removed 2016.04.05 as per a request from HPE”.  So enter this password when prompted each of those 6 times, and you’ll find C:\ProgramData\HP\RS\DATA\PERSISTENCE\UPGRADE\database.sql is created.  Now you can go back and run the installer again from the command prompt:

msiexec /i "C:\TEMP\ProdUpgPkg+7.6.0.27\lib\hprs7kit.msi" /lv "%HP_RS_LOG%\hprs_7.6.0_install.log"

Your upgrade should now complete successfully, and all that is left is to log into IRS, go to the Software Tab and check for updates, and install any remaining updates.

As always – Use any tips, tricks, or scripts I post at your own risk.

 

Setup hourly HPE Insight Remote Support Service checking

In a previous post, I mentioned that we utilize HPE Insight Remote Support (IRS) at all our client sites, and that we discovered the lovely undocumented “feature” IRS has: a tendency not to start after the first Windows server reboot following an IRS update. This great undocumented feature defeats the entire purpose of IRS, which is monitoring your HPE hardware and alerting you to faults. After getting burned by this feature three or four times in a month, with customers noticing hardware faults (via amber alert lights on the equipment) before we did because IRS was not running to alert us, I decided it was time to write a script to check IRS hourly and alert us if it wasn’t running.

To configure Windows to send an alert if the HP IRS Service is stopped, create the following two files (file contents are at the end of this post) on the IRS server:

  • check_irs_service_status.cmd – which is the wrapper that will call PowerShell from Task Scheduler
  • check_irs_service_status.ps1 – which is the actual PowerShell script that executes the service status check

Lastly, we need to schedule check_irs_service_status.cmd to run hourly. I’ve set 2 minutes after the hour in the example shown below, but you can adjust as required.

schtasks /create /tn "Hourly IRS Service Check" /tr c:\Windows\check_irs_service_status.cmd /sc minute /mo 60 /st 00:02 /rp "*" /ru "%userdomain%\%username%"

By default, the SMTP from address will be the NetBIOS computer name of the IRS server @ the user’s DNS domain FQDN (e.g. IRS-SERVER@JBGEEK.NET).  The SMTP to address will be support @ the user’s DNS domain FQDN (e.g. SUPPORT@JBGEEK.NET), and the SMTP server will be mail. followed by the user’s DNS domain FQDN (e.g. MAIL.JBGEEK.NET).  These values come from the COMPUTERNAME and USERDNSDOMAIN environment variables of the account that runs the task, which you can check with SET from a command prompt.  You can customize these settings in the “Send-MailMessage” command if necessary.
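To preview those defaults without sending any mail, something along these lines can be run in a PowerShell session under the same account that will run the scheduled task.  This is a sketch only: the function name Get-IrsAlertMailSettings is hypothetical, and it just rebuilds the same three strings the alert script builds from the environment variables.

```powershell
# Hypothetical helper: builds the same three values the alert script derives
# from environment variables, so they can be reviewed before any mail is sent.
function Get-IrsAlertMailSettings {
    param(
        [string]$ComputerName = $env:COMPUTERNAME,
        [string]$DnsDomain    = $env:USERDNSDOMAIN
    )
    New-Object PSObject -Property @{
        From       = "${ComputerName}@${DnsDomain}"
        To         = "support@${DnsDomain}"
        SmtpServer = "mail.${DnsDomain}"
    }
}

# Show the defaults for the account running this session.
Get-IrsAlertMailSettings | Format-List
```

If the domain part comes back empty, USERDNSDOMAIN is not set for that account, and the addresses in the Send-MailMessage command will need to be hard-coded instead.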

All that is left to do is to stop the service and test run check_irs_service_status.cmd to verify the Send-MailMessage works properly in your environment.

 

check_irs_service_status.cmd

rem --- begin cut and paste of notepad c:\windows\check_irs_service_status.cmd
@echo off
C:\Windows\System32\WindowsPowerShell\v1.0\powershell.exe -ExecutionPolicy RemoteSigned -noprofile -File C:\Windows\check_irs_service_status.ps1
exit /b
rem --- end cut and paste of c:\windows\check_irs_service_status.cmd ---

 

check_irs_service_status.ps1

###--- begin cut and paste of notepad c:\windows\check_irs_service_status.ps1
### Check_irs_service_status.ps1
### @deancolpitts – http://blog.jbgeek.net
### 2016.01.27
### This script will check the status of the service HPRSMAIN and alert via email if the service is stopped.

$Service = Get-Service -name HPRSMAIN
$Service.Status
if ($Service.Status -eq "Stopped") {
 $CurrentTime = Get-Date
 Send-MailMessage -From "$env:computername@$env:userdnsdomain" -To "support@$env:userdnsdomain" -Subject "$env:computername - HP IRS Service is stopped!!!" -Body "The HP IRS Service is stopped on $env:computername.$env:userdnsdomain at approximately $CurrentTime." -Priority High -DNO onSuccess, onFailure -SmtpServer "mail.$env:userdnsdomain"
}

###--- end cut and paste of notepad c:\windows\check_irs_service_status.ps1

 

HPE Insight Remote Support fails to start after reboot

We utilize HPE Insight Remote Support (IRS) at all our client sites, and typically have it running on either Windows Server 2008 R2 or Windows Server 2012 R2.  To simplify administration, we typically enable auto-update of IRS, which means IRS will download updates from HPE as they become available and self-update.  One of the lovely “features” that we discovered is that upon the next Windows server reboot after an IRS update (typically at 3am on the first Wednesday after the 2nd Tuesday of every month, thanks Microsoft), the HPRSMain service fails to start.  No amount of poking, prodding or swearing will convince the service to start either.

The solution is to run a repair – except the HPE team doesn’t make that easy either as the only option in Add/Remove programs is to uninstall.  Fortunately, you should find the .msi for IRS in C:\ProgramData\HP\RS\DATA\SWM\LANDINGZONE\ProdUpgPkg\unzipped\lib.

So the quickest way to fix IRS at this point is to open an Administrative Command Prompt and run:

msiexec /f "C:\ProgramData\HP\RS\DATA\SWM\LANDINGZONE\ProdUpgPkg\unzipped\lib\hprs7kit.msi" /lv "%HP_RS_LOG%\hprs_recovery.log"

After a few minutes, the HPRSMain service should start, and you should be good until at least the next IRS update.
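Rather than refreshing the Services console while the repair runs, you can have PowerShell wait for the service to come up.  The following is a rough sketch, not something from the IRS documentation: the function name Wait-ServiceRunning and the timeout value are my own choices for illustration, and HPRSMAIN is the service name used in the check script above.

```powershell
# Hypothetical helper: wait up to $TimeoutSeconds for a service to reach Running.
# Returns $true if it got there, $false if it timed out or could not be queried.
function Wait-ServiceRunning {
    param(
        [Parameter(Mandatory=$true)][string]$Name,
        [int]$TimeoutSeconds = 600
    )
    try {
        $svc = Get-Service -Name $Name -ErrorAction Stop
        # WaitForStatus throws a timeout exception if the status is not reached in time.
        $svc.WaitForStatus('Running', [TimeSpan]::FromSeconds($TimeoutSeconds))
        return $true
    }
    catch {
        Write-Warning "Service '$Name' is not running: $($_.Exception.Message)"
        return $false
    }
}
```

Calling Wait-ServiceRunning -Name HPRSMAIN -TimeoutSeconds 600 after kicking off the repair returns True once the service is running, or prints a warning and returns False if it has not started within ten minutes.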