I'm working with our operations teams to proactively monitor HOPEX so we can catch and handle issues prior to our users experiencing them.
The first case I'm looking at is having our Windows support team reboot the server when someone can't log in due to DB connectivity issues. When that happens, the error "OraBinding Error" appears in the ssperr log file, and a reboot fixes it. It's a bit heavy-handed but effective, and it will be a good test to see what other kinds of monitoring we can put in place.
Since the ssperr log file has a date in its name and rolls over every day (e.g., ssperr20190911.txt), the tools they use can't monitor it -- they need a consistent, stable log file name to monitor.
I can't find a way to set the log file names or the rolling methodology (maybe I missed it in the docs).
Has anyone done anything like this before? I tried to find a way to do this with the included monitoring tools, but can't find a way to send notifications from them.
What version of HOPEX is it?
What errors do users get on screen when they cannot log in?
Is it a standalone setup with one server hosting both HOPEX and SQL Server?
If it is not a standalone setup, which server do you restart: HOPEX or SQL?
We are on HOPEX V2R1 Update 3 CP 1 -- looking to patch to Update 3 CP 4 in October, and V3 next year.
I haven't looked at Nagios or any other monitoring software, as we have a corporate standard that is integrated into our ticketing and reporting systems, so I can automatically engage the Windows support teams to reboot the server if need be. If running something local as an intermediary will help, I'm willing to take a look, but ultimately I need something that the corporate monitoring tool can integrate with.
Could I use Nagios to monitor the HOPEX logs and then generate a log file in a format that our corporate monitoring tools can in turn monitor (i.e., a static file name)? Is Nagios a MEGA partner or a recommended solution?
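For illustration, the "local intermediary" idea could be as small as a script that follows the current dated ssperr file and copies matching lines into a file with a fixed name. This is only a rough sketch under assumptions, not something HOPEX or MEGA provides: the function names, the UTF-8 decoding, and the polling interval are all hypothetical, and the only details taken from this thread are the ssperrYYYYMMDD.txt naming and the "OraBinding Error" text.

```python
import datetime as dt
import time
from pathlib import Path

MARKER = "OraBinding Error"  # error text quoted earlier in this thread


def dated_log_path(log_dir, day):
    """Build the daily file name, e.g. ssperr20190911.txt."""
    return Path(log_dir) / f"ssperr{day:%Y%m%d}.txt"


def relay_new_lines(source, stable, offset, marker=MARKER):
    """Append complete lines containing `marker`, read from `source`
    starting at byte `offset`, to `stable`. Returns the new offset."""
    source = Path(source)
    if not source.exists():
        return 0  # today's file has not been created yet
    with open(source, "rb") as src:
        src.seek(offset)
        chunk = src.read()
    # Only consume up to the last newline so a half-written line is
    # picked up in full on the next pass.
    end = chunk.rfind(b"\n") + 1
    # Assumption: the log decodes as UTF-8; undecodable bytes are replaced.
    lines = chunk[:end].decode("utf-8", errors="replace").splitlines()
    hits = [line for line in lines if marker in line]
    if hits:
        with open(stable, "a", encoding="utf-8") as out:
            for line in hits:
                out.write(line + "\n")
    return offset + end


def run(log_dir, stable, poll_seconds=30):
    """Poll forever, switching to the new dated file when the day changes."""
    current_day, offset = dt.date.today(), 0
    while True:
        today = dt.date.today()
        if today != current_day:
            # Final pass over yesterday's file before moving on.
            relay_new_lines(dated_log_path(log_dir, current_day), stable, offset)
            current_day, offset = today, 0
        offset = relay_new_lines(dated_log_path(log_dir, current_day), stable, offset)
        time.sleep(poll_seconds)
```

Calling run() with the HOPEX log folder and a fixed output path (both placeholders to be filled in for the actual server) would give the corporate tool one stable file to watch. Note that this sketch starts from the beginning of today's file each time it is launched, so a restart would relay earlier matches again; persisting the offset would avoid that.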