Friday 28 October 2016

Postfix MTA service not working! CentOS 6

I was having trouble keeping the Postfix MTA configured and running with Puppet. Each time Puppet ran, it detected that Postfix wasn't running and tried to start it, to no avail. Checking the service status showed this error:
# service postfix status
master dead but pid file exists
But removing the pid file did not help:
# locate postfix|grep pid
/var/spool/postfix/pid
/var/spool/postfix/pid/master.pid
[root@webtest ~]# rm /var/spool/postfix/pid/master.pid
rm: remove regular file `/var/spool/postfix/pid/master.pid'? y
[root@webtest ~]# service postfix status
master dead but subsys locked
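The two status messages above come from the init script's status check, which looks at the pid file first and then at a lock file under /var/lock/subsys. Here is a rough sketch of that decision order, assuming the usual CentOS 6 behaviour; the function name classify_dead_service is made up for illustration, and it skips the "is the process actually alive" check that the real script does first:

```shell
# Approximate the order in which a dead service is classified.
# $1 = pid file path, $2 = subsys lock file path (both passed in so the
# function can be tried against scratch files instead of real ones).
classify_dead_service() {
    pidfile="$1"
    lockfile="$2"
    if [ -f "$pidfile" ]; then
        # A stale pid file is reported first
        echo "dead but pid file exists"
    elif [ -f "$lockfile" ]; then
        # No pid file, but the subsys lock is still there
        echo "dead but subsys locked"
    else
        echo "stopped"
    fi
}

# Hypothetical use on this host (lock file path is an assumption):
#   classify_dead_service /var/spool/postfix/pid/master.pid /var/lock/subsys/postfix
```

This is why deleting master.pid only changed the message: the next check in line found the lock file, and the underlying start failure was still there.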
Looking at the mail log showed the real cause:
# tail  /var/log/maillog
Oct 30 19:44:06 webtest postfix/master[8005]: fatal: bind 127.0.0.1 port 25: Address already in use
Oct 30 20:09:49 webtest postfix/postfix-script[10053]: starting the Postfix mail system
Oct 30 20:09:49 webtest postfix/master[10054]: fatal: bind 127.0.0.1 port 25: Address already in use
Oct 30 20:10:04 webtest postfix/postfix-script[10602]: starting the Postfix mail system
Oct 30 20:10:04 webtest postfix/master[10603]: fatal: bind 127.0.0.1 port 25: Address already in use
Oct 30 20:10:53 webtest postfix/postfix-script[11037]: starting the Postfix mail system
It looked like another MTA was running and holding port 25. A quick ps for sendmail revealed nothing, but another mail agent that can be installed on CentOS 6, Exim, was running:
[root@webtest ~]# ps -ef|grep send
root     12448  9780  0 20:16 pts/0    00:00:00 grep send
[root@webtest ~]# ps -ef|grep exim
root     12109  9780  0 20:22 pts/0    00:00:00 grep exim
exim     57456     1  0 Jul07 ?        00:00:00 /usr/sbin/exim -bd -q1h
[root@webtest ~]# service exim stop
Shutting down exim:                                        [  OK  ]
[root@webtest ~]# chkconfig exim off
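Instead of guessing MTA names with ps, the listener on port 25 can be looked up directly. A minimal sketch, assuming net-tools is installed and the command is run as root so that netstat shows the PID/program column; the helper name port_owner is made up for illustration:

```shell
# Print the "PID/program" column for TCP listeners on a given port,
# reading `netstat -tlnp`-style output from stdin.
# $1 = port number to look for.
port_owner() {
    port="$1"
    awk -v port="$port" '
        # $4 is the local address (e.g. 127.0.0.1:25), $6 the socket state
        $1 ~ /^tcp/ && $6 == "LISTEN" && $4 ~ (":" port "$") { print $7 }
    '
}

# Typical use:
#   netstat -tlnp | port_owner 25
```

On this host that would presumably have pointed at exim straight away, although that output was not captured in the original session.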
Now a Puppet run should start Postfix without a problem, and a second run should report no further changes:
# puppet agent -t
Notice: Local environment: 'production' doesn't match server specified node environment 'websites', switching agent to 'websites'.
Info: Retrieving pluginfacts
Info: Retrieving plugin
Info: Loading facts
Info: Caching catalog for website.domain.com
Info: Applying configuration version '1477858442'
Notice: /Stage[main]/postfixmta/Service[postfix]/ensure: ensure changed 'stopped' to 'running'
Info: /Stage[main]/postfixmta/Service[postfix]: Unscheduling refresh on Service[postfix]
Notice: Applied catalog in 1.67 seconds
# puppet agent -t
Notice: Local environment: 'production' doesn't match server specified node environment 'websites', switching agent to 'websites'.
Info: Retrieving pluginfacts
Info: Retrieving plugin
Info: Loading facts
Info: Caching catalog for website.domain.com
Info: Applying configuration version '1477858442'
Notice: Applied catalog in 1.46 seconds

Monday 3 October 2016

Fix Analytics not displaying in OpsCenter for LDOM - Solaris 11

root@solaris-ldom:~# svcs scn-agent
STATE          STIME    FMRI
maintenance    Sep_23   svc:/application/management/common-agent-container-1:scn-agent
root@solaris-ldom:~# svcs -xv
svc:/application/management/common-agent-container-1:scn-agent (Cacao, a common Java container for JDMK/JMX based management solution)
State: maintenance since Fri Sep 23 22:36:14 2016
Reason: Restarting too quickly.
  See: http://support.oracle.com/msg/SMF-8000-L5
  See: man -M /usr/share/man -s 1M cacaoadm
  See: man -M /usr/share/man -s 5 cacao
  See: /var/svc/log/application-management-common-agent-container-1:scn-agent.log
Impact: This service is not running.
root@solaris-ldom:~# cat /var/svc/log/application-management-common-agent-container-1:scn-agent.log
[ Mar 24 09:57:57 Disabled. ]
[ Mar 24 09:57:57 Rereading configuration. ]
[ Mar 24 09:58:01 Enabled. ]

-cut-

[ Sep 23 22:36:12 Stopping because all processes in service exited. ]
[ Sep 23 22:36:13 Executing stop method ("/usr/lib/cacao/lib/tools/scripts/cacao_smf stop scn-agent"). ]
[ Sep 23 22:36:14 Method "stop" exited with status 0. ]
[ Sep 23 22:36:14 Restarting too quickly, changing state to maintenance. ]
root@solaris-ldom:~# svcadm disable svc:/application/management/common-agent-container-1:scn-agent
root@solaris-ldom:~# svcs scn-agent
STATE          STIME    FMRI
disabled       11:55:48 svc:/application/management/common-agent-container-1:scn-agent
root@solaris-ldom:~# svcs -xv
root@solaris-ldom:~# svcadm enable svc:/application/management/common-agent-container-1:scn-agent
root@solaris-ldom:~# svcs -xv
svc:/application/management/common-agent-container-1:scn-agent (Cacao, a common Java container for JDMK/JMX based management solution)
State: offline* transitioning to online since Mon Sep 26 11:56:09 2016
Reason: Start method is running.
  See: http://support.oracle.com/msg/SMF-8000-C4
  See: man -M /usr/share/man -s 1M cacaoadm
  See: man -M /usr/share/man -s 5 cacao
  See: /var/svc/log/application-management-common-agent-container-1:scn-agent.log
Impact: This service is not running.
root@solaris-ldom:~# tail /var/svc/log/application-management-common-agent-container-1:scn-agent.log
[ Sep 23 22:31:50 Executing start method ("/usr/lib/cacao/lib/tools/scripts/cacao_smf start scn-agent"). ]
[ Sep 23 22:33:13 Method "start" exited with status 0. ]
[ Sep 23 22:36:12 Stopping because all processes in service exited. ]
[ Sep 23 22:36:13 Executing stop method ("/usr/lib/cacao/lib/tools/scripts/cacao_smf stop scn-agent"). ]
[ Sep 23 22:36:14 Method "stop" exited with status 0. ]
[ Sep 23 22:36:14 Restarting too quickly, changing state to maintenance. ]
[ Sep 26 11:55:48 Leaving maintenance because disable requested. ]
[ Sep 26 11:55:48 Disabled. ]
[ Sep 26 11:56:09 Enabled. ]
[ Sep 26 11:56:09 Executing start method ("/usr/lib/cacao/lib/tools/scripts/cacao_smf start scn-agent"). ]
root@solaris-ldom:/var/adm# svcs scn-agent
STATE          STIME    FMRI
online         11:57:12 svc:/application/management/common-agent-container-1:scn-agent
root@solaris-ldom:/var/adm#
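
The session above gets the service out of maintenance by disabling and re-enabling it. As a small sketch that was not part of the original session, the helper below lists every service that svcs reports as being in maintenance, so they can be reviewed in one go. It reads `svcs -H -o state,fmri` output from stdin; the name in_maintenance is made up for illustration. `svcadm clear` is the standard SMF subcommand for leaving maintenance, but whether it alone would have been enough here is an assumption.

```shell
# Print the FMRI of each service whose state is "maintenance",
# reading "STATE FMRI" pairs (one per line, no header) from stdin.
in_maintenance() {
    awk '$1 == "maintenance" { print $2 }'
}

# Typical use on the Solaris host:
#   svcs -H -o state,fmri | in_maintenance
# Then, per service (alternative to disable/enable, untested on this host):
#   svcadm clear svc:/application/management/common-agent-container-1:scn-agent
```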