Monitor and restart stopped services
Today I noticed on one of my servers that the named service had stopped for unknown reasons. I started the service from VESTA CP with no problems. But what if I hadn't noticed...
Is there a VESTA script or cron job monitoring important services?
Do you think it's a good idea to have a way to restart a failed service automatically?
Re: Monitor and restart stopped services
I spent this evening writing a script for this. It's one of my first bash scripts, so any guidance, corrections or remarks will be highly appreciated.
The script was tested on Ubuntu 14.04 and CentOS 7. OS detection code was borrowed from VESTA installation script.
To start services you need admin rights. I run everything as root but I suppose it will work with sudoers as well.
If a service can't be started, an email is sent to the admin.
Configurable variables: LOGFILE, MAILTO, SUBJECT, SRVNAMES (different service names in each OS)
Take care! If you specify an existing file as the logfile, it will be emptied!
To test it, create a new file, copy/paste the contents and save. Don't forget to set it as executable with:
I'm not a professional coder, so there might be mistakes or errors in my code. Please review it carefully before running it on production servers.
Code:
#!/bin/bash
# Service checker
# This script checks whether services are running.
# If a service exists and is not running, it starts it.

# Log file (path and name). An existing file is deleted on every run.
LOGFILE="$HOME/servicechecker.log"
[ -e "$LOGFILE" ] && rm -f "$LOGFILE"

# Email address that receives the failure report
MAILTO='YOUR EMAIL HERE'

# Subject of the email
SUBJECT="SERVICES: $(hostname)"

# We need the mailx program. Exit if it is not installed.
# Braces instead of parentheses, so that exit ends the script and not just a subshell.
which mailx > /dev/null 2>&1 || { echo "The mailx program is missing. Try installing it first."; exit 1; }

# Detect OS and start loop
case $(head -n1 /etc/issue | cut -f 1 -d ' ') in
    Debian) # LOOP START - DEBIAN #
        type="debian"
        ;; # LOOP END - DEBIAN #
    Ubuntu) # LOOP START - UBUNTU #
        type="ubuntu"
        SRVNAMES="apache2 bind9 exim4 fail2ban mysql nginx vesta"
        # Loop through the services
        for p in $SRVNAMES
        do
            # If the init script exists and reports "not running", start the service
            [ -e "/etc/init.d/$p" ] && "/etc/init.d/$p" status | grep "not running" && "/etc/init.d/$p" start
            # Check again; if it is still not running, append it to the log
            [ -e "/etc/init.d/$p" ] && "/etc/init.d/$p" status | grep "not running" && echo "$p CAN NOT BE STARTED ON $(hostname)" >> "$LOGFILE"
        done
        ;; # LOOP END - UBUNTU #
    *) # LOOP START - RED HAT #
        type="rhel"
        # Check the following services (space separated)
        SRVNAMES="httpd nginx named exim dovecot mariadb crond iptables fail2ban"
        # Loop through the services
        for p in $SRVNAMES
        do
            # If the service is enabled but not active, restart it
            if [ "$(systemctl is-enabled "$p")" = "enabled" ] && [ "$(systemctl is-active "$p")" != "active" ]
            then
                echo "$p IS NOT RUNNING. RESTARTING..."
                systemctl restart "$p"
                # Check again; if it is still not active, append it to the log
                systemctl is-active --quiet "$p" || echo "$p CAN NOT BE STARTED ON $(hostname)" >> "$LOGFILE"
            else
                echo "$p - Nothing to do! Either running or not enabled."
            fi
        done
        ;; # LOOP END - RED HAT #
esac

# If $LOGFILE is not empty, send it to the admin
[ -s "$LOGFILE" ] && mailx -r root -s "$SUBJECT" "$MAILTO" < "$LOGFILE"
Code:
chmod +x [FILE]
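A minimal sketch of two helpers the script above could be built around, assuming bash: one that appends to the log instead of emptying it, and a small predicate for the "enabled but not active" decision used in the Red Hat branch. The names log_failure and needs_restart and the default log path are made up for illustration; they are not part of the original script.

```shell
#!/bin/bash
# Illustrative helpers; the names and the default path are assumptions.
LOGFILE="${LOGFILE:-$HOME/servicechecker.log}"

# log_failure SERVICE
# Append a timestamped failure line, leaving earlier log content intact.
log_failure() {
    printf '%s %s CAN NOT BE STARTED ON %s\n' \
        "$(date '+%Y-%m-%d %H:%M:%S')" "$1" "$(uname -n)" >> "$LOGFILE"
}

# needs_restart ENABLED_STATE ACTIVE_STATE
# Succeeds only for a unit that is enabled but not active.
needs_restart() {
    [ "$1" = "enabled" ] && [ "$2" != "active" ]
}
```

In the systemd loop this could be called as needs_restart "$(systemctl is-enabled "$p")" "$(systemctl is-active "$p")" && systemctl restart "$p". To run the checker periodically, one option is a root crontab entry such as */5 * * * * /root/servicechecker.sh (the path is hypothetical).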
Re: Monitor and restart stopped services
I didn't know that webmin can check for running services on remote servers. Thank you!
Re: Monitor and restart stopped services
You could also take a look at Supervisor
http://supervisord.org/
Re: Monitor and restart stopped services
Maybe you can use Monit!
Re: Monitor and restart stopped services
Or monit - pretty nice application
https://mmonit.com/monit/
Re: Monitor and restart stopped services
Monit is really nice! Thanks.
I installed it with apt-get install monit (I'm on Ubuntu 14.04) and then configured it according to my needs. One thing to note is that the version of Monit in the Ubuntu repository is rather outdated (v5.6, compared with v5.19, which is the latest).
I'm posting my configuration here in case someone is interested.
Are any of you running monit? Any special configurations to share with us?
Code:
check process named with pidfile /var/run/named/named.pid
    start program = "/usr/bin/service bind9 start"
    stop program = "/usr/bin/service bind9 stop"
    restart program = "/usr/bin/service bind9 restart"
    if failed port 53 use type udp protocol dns then restart

check process nginx with pidfile /var/run/nginx.pid
    start program = "/usr/bin/service nginx start"
    stop program = "/usr/bin/service nginx stop"
    restart program = "/usr/bin/service nginx restart"

check process apache2 with pidfile /var/run/apache2/apache2.pid
    start program = "/usr/bin/service apache2 start" with timeout 60 seconds
    stop program = "/usr/bin/service apache2 stop"
    restart program = "/usr/bin/service apache2 restart"

check process mysqld with pidfile /run/mysqld/mysqld.pid
    group database
    start program = "/etc/init.d/mysql start" with timeout 60 seconds
    stop program = "/etc/init.d/mysql stop"
    if failed host 127.0.0.1 port 3306 then restart
    if 2 restarts within 2 cycles then alert

check process exim4 with pidfile /var/run/exim4/exim.pid
    start program = "/usr/bin/service exim4 start" with timeout 60 seconds
    stop program = "/usr/bin/service exim4 stop"
    restart program = "/usr/bin/service exim4 restart"
    if failed
        port 25
        protocol smtp
    then restart
    if 2 restarts within 2 cycles then alert

check process dovecot with pidfile /var/run/dovecot/master.pid
    start program = "/usr/bin/service dovecot start" with timeout 60 seconds
    stop program = "/usr/bin/service dovecot stop"
    restart program = "/usr/bin/service dovecot restart"
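After editing a configuration like the one above, it helps to let Monit check the syntax before reloading. A small sketch in bash, assuming the monit binary is on the PATH; monit -t and monit reload are standard Monit commands, while the wrapper name apply_monit_config is made up for illustration.

```shell
#!/bin/bash
# apply_monit_config [MONIT_COMMAND]
# Runs Monit's syntax check first and reloads only if the check passes.
# MONIT_COMMAND defaults to "monit"; the wrapper name is hypothetical.
apply_monit_config() {
    local monit_cmd="${1:-monit}"
    if "$monit_cmd" -t; then
        "$monit_cmd" reload
    else
        echo "Monit reported a syntax error; not reloading." >&2
        return 1
    fi
}
```

Call it as apply_monit_config (as root or via sudo). Afterwards, monit summary shows the state of each check.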
Re: Monitor and restart stopped services
Felix wrote: Are any of you running monit? Any special configurations to share with us?
Why not? Here we go:
Code:
set daemon 60
set logfile /var/log/monit.log
set logfile syslog facility log_daemon
set mailserver yourdomain.ru
set alert [email protected] with mail-format {
    from: [email protected]
    subject: $SERVICE $EVENT at $DATE
    message: Monit $ACTION $SERVICE at $DATE on $HOST: $DESCRIPTION.
    Solve this or this will be bad,
    Your RobotAssistant
}

check filesystem hdddrive with path /
    if space usage > 85% then alert
    if inode usage > 80% then alert

check system yourdomain.ru
    if loadavg (1min) > 15 then alert
    if loadavg (5min) > 8 then alert
    if memory usage > 85% then alert
    if cpu usage (user) > 90% then alert
    if cpu usage (system) > 90% then alert
    if cpu usage (wait) > 80% then alert

check process vsftpd with pidfile /var/run/vsftpd/vsftpd.pid
    start program = "/etc/init.d/vsftpd start"
    stop program = "/etc/init.d/vsftpd stop"
    if failed port 21 protocol ftp for 32 cycles then alert
    if failed port 21 protocol ftp for 64 cycles then restart
    if 15 restarts within 15 cycles then timeout

check process sshd with pidfile /var/run/sshd.pid
    start program "/etc/init.d/ssh start"
    stop program "/etc/init.d/ssh stop"
    if failed port 22 protocol ssh for 15 cycles then alert
    if failed port 22 protocol ssh for 15 cycles then restart
    if 15 restarts within 15 cycles then timeout

check process mysql with pidfile /var/run/mysqld/mysqld.pid
    start program = "/etc/init.d/mysql start"
    stop program = "/etc/init.d/mysql stop"
    if cpu > 80% for 10 cycles then restart
    if failed host 127.0.0.1 port 3306 for 10 cycles then alert
    if failed host 127.0.0.1 port 3306 for 10 cycles then restart
    if 15 restarts within 15 cycles then timeout

check process nginx with pidfile /var/run/nginx.pid
    start program "/etc/init.d/nginx start"
    stop program "/etc/init.d/nginx stop"
    if failed host yourdomain.ru port 80 protocol http for 6 cycles then alert
    if failed host yourdomain.ru port 80 protocol http for 6 cycles then restart
    if 15 restarts within 15 cycles then timeout

check process apache with pidfile /var/run/apache2.pid
    start program = "/etc/init.d/apache2 start"
    stop program = "/etc/init.d/apache2 stop"
    if failed host yourdomain.ru port 8080 protocol http for 5 cycles then alert
    if failed host yourdomain.ru port 8080 protocol http for 15 cycles then restart
    if loadavg(5min) greater than 50 for 15 cycles then restart
    if 15 restarts within 15 cycles then timeout
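With set daemon 60, Monit polls once every 60 seconds, so each "for N cycles" condition above means roughly N minutes of continuous failure before the action fires. A quick sketch of that arithmetic in bash (the function name cycles_to_seconds is made up for illustration):

```shell
#!/bin/bash
# cycles_to_seconds POLL_INTERVAL_SECONDS CYCLES
# Time that must pass before a "for CYCLES cycles" condition can trigger.
cycles_to_seconds() {
    echo $(( $1 * $2 ))
}

cycles_to_seconds 60 32   # vsftpd alert:   prints 1920 (32 minutes)
cycles_to_seconds 60 64   # vsftpd restart: prints 3840 (64 minutes)
cycles_to_seconds 60 15   # sshd restart:   prints 900  (15 minutes)
```

So with these settings vsftpd would be restarted only after about 64 minutes of failed FTP checks, which may be longer than you want; lowering the cycle counts shortens the delay.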
Re: Monitor and restart stopped services
mehargags wrote: Maybe you can use Monit!
I absolutely love Monit, but the M/Monit license fees put me off.
Does Supervisor have the same functionality as M/Monit?
Re: Monitor and restart stopped services
durjoy wrote: I absolutely love Monit, but the M/Monit license fees put me off. Does Supervisor have the same functionality as M/Monit?
M/Monit is one Monit for several servers: a single interface for managing rules and monitoring multiple servers.
You can use a single Monit on its own for free.