Perfectly Normal IBM i Server Suddenly Goes Offline: A Break Fix Tale

It was a Friday afternoon when we got the call that the system had gone down.

The Briteskies team had been working with the client, a furniture manufacturing and distribution company, for over a year, providing RPG development and support.

At about 1:30pm, the Briteskies RPG developer received a text from the client saying that the business system had gone down and asking whether he could log on to help diagnose and fix the problem.

“I logged on and it looked like a hardware issue, but I wasn’t sure so I called for backup,” – Dale, Briteskies RPG Developer

As the company is likely migrating off the AS400 system in the coming years, they’ve been thinning their related staff and don’t currently have a dedicated System Administrator for it. Most of the consulting work they receive from Briteskies is related to RPG needs, but because of their managed service contract, the Briteskies developer was able to easily pull in a System Admin to help resolve the issue.

Set up clear communication channels

“The first thing I did was get everyone talking on a video conference call,” says the Briteskies System Admin. In cases where time and information are of the essence and multiple people are involved, real-time, face-to-face virtual conversations with screen sharing cut down the possibility of confusion.

Gain access to the system 

In the next ten minutes, the Briteskies team had created a virtual PC, set up a VPN, and was walking an on-site employee through finding the Hardware Management Console (HMC) and identifying its IP address so that we could gain access.

The HMC was older and hadn’t been updated in a few years, so modern web browsers had a hard time connecting to it. To get around this problem, the Briteskies team uninstalled the browser and installed an older version of Firefox. This allowed our team to access the HMC web administration display and share it during the video teleconference.

Once in, it was determined that the server was running, but there was no way to get a console connection. 

To access the console, Briteskies started a “normal” shutdown of the server. While IBM i consoles have immediate shutdown capability, Briteskies took the time (over 30 minutes in this instance) to shut it down normally so that any open databases could flush their cache and write the contents of memory out to disk, avoiding potential damage to the databases.

Once the shutdown was complete, our team administered a manual startup and within 15 minutes had access to the server console and was entering credentials on the sign-on screen.

“I continued to share my screen and explained what I was doing and why I was doing it during each step of the process.”

With the shutdown and reboot complete, the system returned to normal function. But everyone wanted to make sure this problem wouldn’t happen again in the future.

Investigate, Diagnose and Fix

The Briteskies System Admin checked the history log and determined that the client’s system had run out of temporary disk storage, causing the server to pause. The reboot had triggered the server to release all temporary storage, allowing it to return to normal function.
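To keep an outage like this from sneaking up again, temporary storage can be watched against a threshold. The following is only a minimal sketch of that idea in Python, not the check Briteskies put in place: the `StorageSample` fields, the `storage_alerts` function, and the default limits are all hypothetical, and the readings would have to come from whatever system status reporting is available on the server.

```python
from dataclasses import dataclass


@dataclass
class StorageSample:
    """One reading of disk usage. Field names are hypothetical."""
    temporary_mb: float  # temporary storage currently in use
    used_mb: float       # all storage in use, temporary included
    total_mb: float      # total capacity of the disk pool


def storage_alerts(sample, temp_limit_pct=20.0, used_limit_pct=85.0):
    """Return warnings when usage crosses the (assumed) thresholds."""
    if sample.total_mb <= 0:
        raise ValueError("total_mb must be positive")

    temp_pct = 100.0 * sample.temporary_mb / sample.total_mb
    used_pct = 100.0 * sample.used_mb / sample.total_mb

    alerts = []
    if temp_pct >= temp_limit_pct:
        alerts.append(f"temporary storage at {temp_pct:.1f}% of capacity")
    if used_pct >= used_limit_pct:
        alerts.append(f"total disk usage at {used_pct:.1f}%")
    return alerts
```

Run on a schedule, a check like this would give the team a warning while there is still room to find the job consuming temporary storage, rather than learning about it when the server pauses. The right limits depend on the system and would need tuning.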

Thanks to Briteskies’ immediate attention, the system was back up and running in only two hours. What could have kept the business halted for days was found, diagnosed, and fixed before it caused any wider disruption to the business.

With Managed Service contracts, clients can access the expertise of the full Briteskies team depending on their needs.

“Just want to thank [Briteskies] for their help this afternoon. We had an outage and were completely down. [Briteskies] jumped on the issue with us and provided great expertise. Meaningful explanation was provided to our team and we learned from that experience. Thanks again for your help, truly appreciated”

Whether you need technical or functional help, Briteskies can help keep your IBM i system up and running. 

IBM Webex – Register Meeting

Thursday, Nov 17 2022 11:00 AM – 12:30 PM

(UTC-05:00) Eastern Time (US & Canada)

If you want to attend, register now. When your registration is approved, you’ll receive an invitation to join.

Agenda

Hello everyone and welcome to the November VUG. This month’s session is cloud focused – both private on-premises and the IBM Public Cloud. Ross Coniglio and Brian McDonald will review our Power Private Cloud with Dynamic Capacity offering and let you know about the new November updates. We’ll also include a recent customer use case example. Then, Jeff Boleman, Product Manager for PowerVS will give us an overview including new features, roadmaps, common questions asked, and more. This session is not OS-specific as these offerings support RedHat, IBM i, and AIX. Please join us for this Power cloud update!

IBM Rochester – CAAC meets CEAC

IBM Rochester is the home of #IBMi and for the last three days it has been home to both the #CAAC and #CEAC

The two advisory councils meet with IBM to help shape the future of the best operating system in the world.

CAAC – Common Americas Advisory Council

CEAC – Common Europe Advisory Council

#ibmchampion #iUG
