For some reason Postgres decided it doesn’t wanna live and is constantly killed every 2 minutes. Any ideas on how to fix this?
[root@web1 ~]# tail -f /var/lib/pgsql/12/data/log/postgresql-Mon.log
2025-03-17 10:20:43.805 CET [1359187] FATAL: terminating connection due to administrator command
2025-03-17 10:20:43.805 CET [1359189] LOG: terminating TimescaleDB job scheduler due to administrator command
2025-03-17 10:20:43.807 CET [1359179] LOG: background worker "logical replication launcher" (PID 1359188) exited with exit code 1
2025-03-17 10:20:43.807 CET [1359179] LOG: background worker "TimescaleDB Background Worker Launcher" (PID 1359187) exited with exit code 1
2025-03-17 10:20:43.807 CET [1359179] LOG: background worker "TimescaleDB Background Worker Scheduler" (PID 1359189) exited with exit code 1
2025-03-17 10:20:43.808 CET [1359182] LOG: shutting down
2025-03-17 10:20:43.822 CET [1359179] LOG: database system is shut down
2025-03-17 10:20:48.778 CET [1509943] LOG: database system was shut down at 2025-03-17 10:20:43 CET
2025-03-17 10:20:48.785 CET [1509941] LOG: database system is ready to accept connections
2025-03-17 10:20:48.787 CET [1509950] LOG: TimescaleDB background worker launcher connected to shared catalogs
2025-03-17 10:22:17.050 CET [1538457] FATAL: terminating connection due to administrator command
2025-03-17 10:22:17.050 CET [1538457] STATEMENT: SELECT pg_terminate_backend(pg_backend_pid())
2025-03-17 10:23:13.438 CET [1509941] LOG: received fast shutdown request
2025-03-17 10:23:13.439 CET [1509941] LOG: aborting any active transactions
2025-03-17 10:23:13.440 CET [1538373] FATAL: terminating connection due to administrator command
2025-03-17 10:23:13.440 CET [1509952] LOG: terminating TimescaleDB job scheduler due to administrator command
2025-03-17 10:23:13.440 CET [1509952] FATAL: terminating connection due to administrator command
2025-03-17 10:23:13.441 CET [1509941] LOG: background worker "logical replication launcher" (PID 1509951) exited with exit code 1
2025-03-17 10:23:13.441 CET [1509950] FATAL: terminating connection due to administrator command
2025-03-17 10:23:13.441 CET [1557950] FATAL: terminating connection due to administrator command
2025-03-17 10:23:13.442 CET [1509941] LOG: background worker "TimescaleDB Background Worker Scheduler" (PID 1509952) exited with exit code 1
2025-03-17 10:23:13.443 CET [1509941] LOG: background worker "TimescaleDB Background Worker Launcher" (PID 1509950) exited with exit code 1
2025-03-17 10:23:13.443 CET [1558177] FATAL: terminating connection due to administrator command
2025-03-17 10:23:13.445 CET [1558006] FATAL: terminating connection due to administrator command
2025-03-17 10:23:13.446 CET [1557911] FATAL: terminating connection due to administrator command
2025-03-17 10:23:13.452 CET [1509945] LOG: shutting down
2025-03-17 10:23:13.471 CET [1509941] LOG: database system is shut down
2025-03-17 10:23:17.553 CET [1562995] LOG: database system was shut down at 2025-03-17 10:23:13 CET
2025-03-17 10:23:17.559 CET [1562966] LOG: database system is ready to accept connections
2025-03-17 10:23:17.561 CET [1563004] LOG: TimescaleDB background worker launcher connected to shared catalogs
2025-03-17 10:23:18.539 CET [1563738] FATAL: terminating connection due to administrator command
2025-03-17 10:23:18.539 CET [1563738] STATEMENT: SELECT pg_terminate_backend(pg_backend_pid())
2025-03-17 10:24:20.075 CET [1644824] FATAL: terminating connection due to administrator command
2025-03-17 10:24:20.075 CET [1644824] STATEMENT: SELECT pg_terminate_backend(pg_backend_pid())
What version of apnscp are you running?
[root@web1 ~]# cpcmd misc_cp_version
WARNING: PingablePDO::__construct(): Connection failed: could not connect to server: Connection refused
Is the server running on host "localhost" (::1) and accepting
TCP/IP connections on port 5432?
could not connect to server: Connection refused
Is the server running on host "localhost" (127.0.0.1) and accepting
TCP/IP connections on port 5432?
(Exception) INTERNAL REPORT: NULL Constructor failed
0. Error_Reporter::report("NULL Constructor failed")
[/usr/local/apnscp/lib/PingablePDO.php:26]
Which version of RHEL or CentOS are you using?
[root@web1 ~]# cat /etc/redhat-release
Red Hat Enterprise Linux release 8.10 (Ootpa)
Can you reproduce this bug reliably? Yes, every 2 minutes. Just let it sit there.
psql appldb works on my machine and the service shows as running, but the Dashboard reports it can’t connect and Monit is pissed.
Seems I’m having the same issue on all servers.
Monit Alert:
p114: [p114] postgres
restart - failed protocol test [PGSQL] at /tmp/.s.PGSQL.5432 -- PGSQL: response message is too large: 134217724 bytes received (maximum 1024)
The latest Monit 5.34.5 prerelease, which improves recovery when firewalld enters panic mode, also slipped in a buffer read problem with Postgres.
Postgres isn’t the problem. Monit misreads the size of the server's reply, treats it as larger than its buffer, fails the protocol test, and issues a restart. Looking through tildeslash’s commit log to see if I can spot the relevant change.
For me, the downgrade worked where the unmonitor did not.
I didn’t dig in to find out why, since the 5.34.5 version works without issues.
5.35.0 is what fails with the PostgreSQL monitoring.
[root@p100 ~]# monit status postgres
Monit 5.34.5 uptime: 0m
Process 'postgres'
status OK
monitoring status Monitored
[root@p100 ~]# monit status postgres
Monit 5.35.0 uptime: 0m
Process 'postgres'
status Connection failed
monitoring status Monitored
Same for me with PostgreSQL version 16 as of this morning.
[root@apiscp ~]# tail -f /var/lib/pgsql/16/data/log/postgresql-Mon.log
2025-03-17 19:11:59.175 UTC [2005889] LOG: listening on Unix socket "/tmp/.s.PGSQL.5432"
2025-03-17 19:11:59.177 UTC [2005889] LOG: listening on Unix socket "/.socket/pgsql/.s.PGSQL.5432"
2025-03-17 19:11:59.177 UTC [2005889] LOG: listening on Unix socket "/var/run/postgresql/.s.PGSQL.5432"
2025-03-17 19:11:59.183 UTC [2005893] LOG: database system was shut down at 2025-03-17 19:11:58 UTC
2025-03-17 19:11:59.190 UTC [2005889] LOG: database system is ready to accept connections
2025-03-17 19:11:59.192 UTC [2005896] LOG: TimescaleDB background worker launcher connected to shared catalogs
2025-03-17 19:12:01.010 UTC [2005937] FATAL: terminating connection due to administrator command
2025-03-17 19:12:01.010 UTC [2005937] STATEMENT: SELECT pg_terminate_backend(pg_backend_pid())
2025-03-17 19:13:01.611 UTC [2006489] FATAL: terminating connection due to administrator command
2025-03-17 19:13:01.611 UTC [2006489] STATEMENT: SELECT pg_terminate_backend(pg_backend_pid())
2025-03-17 19:13:59.206 UTC [2005889] LOG: received fast shutdown request
2025-03-17 19:13:59.229 UTC [2005889] LOG: aborting any active transactions
2025-03-17 19:13:59.229 UTC [2007032] FATAL: terminating connection due to administrator command
2025-03-17 19:13:59.229 UTC [2007066] FATAL: terminating connection due to administrator command
2025-03-17 19:13:59.229 UTC [2007066] STATEMENT: SELECT username, cryptpw, clearpw, uid, gid, home, maildir, quota, fullname, options FROM (SELECT uids.user||'@'||siteinfo.domain as username, NULL as cryptpw, NULL as clearpw, uid as uid, gid as gid, '/home/'||uids.user||'/' as home, 'Mail/'||COALESCE('.'||fs_destination,'') as maildir, NULL as quota, NULL as fullname, NULL as options, enabled, CASE WHEN (email_lookup.user = '') THEN 2 ELSE 1 END as pri FROM email_lookup JOIN uids USING(uid) JOIN gids USING(site_id) JOIN domain_lookup USING(domain) JOIN siteinfo ON(siteinfo.site_id = domain_lookup.site_id) WHERE ( email_lookup."user" = 'root' AND email_lookup.domain = '' OR email_lookup."user" = '' AND email_lookup.domain = '' ) AND email_lookup.type = 'v' ORDER BY PRI LIMIT 1) AS master WHERE enabled = 1::bit;
2025-03-17 19:13:59.232 UTC [2005898] FATAL: terminating background worker "TimescaleDB Background Worker Scheduler" due to administrator command
2025-03-17 19:13:59.232 UTC [2007030] FATAL: terminating connection due to administrator command
2025-03-17 19:13:59.234 UTC [2005896] FATAL: terminating background worker "TimescaleDB Background Worker Launcher" due to administrator command
2025-03-17 19:13:59.234 UTC [2006153] FATAL: terminating connection due to administrator command
2025-03-17 19:13:59.236 UTC [2005916] FATAL: terminating connection due to administrator command
2025-03-17 19:13:59.241 UTC [2007028] FATAL: terminating connection due to administrator command
2025-03-17 19:13:59.243 UTC [2005889] LOG: background worker "TimescaleDB Background Worker Launcher" (PID 2005896) exited with exit code 1
2025-03-17 19:13:59.243 UTC [2005889] LOG: background worker "logical replication launcher" (PID 2005897) exited with exit code 1
2025-03-17 19:13:59.243 UTC [2005889] LOG: background worker "TimescaleDB Background Worker Scheduler" (PID 2005898) exited with exit code 1
2025-03-17 19:13:59.244 UTC [2005891] LOG: shutting down
2025-03-17 19:13:59.262 UTC [2005891] LOG: checkpoint starting: shutdown immediate
2025-03-17 19:13:59.392 UTC [2005891] LOG: checkpoint complete: wrote 89 buffers (4.3%); 0 WAL file(s) added, 0 removed, 0 recycled; write=0.051 s, sync=0.040 s, total=0.148 s; sync files=23, longest=0.007 s, average=0.002 s; distance=655 kB, estimate=655 kB; lsn=2/616BF850, redo lsn=2/616BF850
2025-03-17 19:13:59.397 UTC [2005889] LOG: database system is shut down
2025-03-17 19:13:59.598 UTC [2007073] LOG: starting PostgreSQL 16.8 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 8.5.0 20210514 (Red Hat 8.5.0-23), 64-bit
2025-03-17 19:13:59.598 UTC [2007073] LOG: listening on IPv4 address "127.0.0.1", port 5432
2025-03-17 19:13:59.637 UTC [2007073] LOG: listening on IPv6 address "::1", port 5432
2025-03-17 19:13:59.637 UTC [2007073] LOG: listening on Unix socket "/tmp/.s.PGSQL.5432"
2025-03-17 19:13:59.684 UTC [2007073] LOG: listening on Unix socket "/.socket/pgsql/.s.PGSQL.5432"
2025-03-17 19:13:59.684 UTC [2007073] LOG: listening on Unix socket "/var/run/postgresql/.s.PGSQL.5432"
2025-03-17 19:13:59.706 UTC [2007077] LOG: database system was shut down at 2025-03-17 19:13:59 UTC
2025-03-17 19:13:59.717 UTC [2007078] FATAL: the database system is starting up
2025-03-17 19:13:59.759 UTC [2007073] LOG: database system is ready to accept connections
2025-03-17 19:13:59.760 UTC [2007081] LOG: TimescaleDB background worker launcher connected to shared catalogs
2025-03-17 19:14:02.413 UTC [2007646] FATAL: terminating connection due to administrator command
2025-03-17 19:14:02.413 UTC [2007646] STATEMENT: SELECT pg_terminate_backend(pg_backend_pid())
Opened an issue with tildeslash; it won’t be visible until it's authorized on their end. In the meantime, the problem has been traced back to commit #decd56e, the removal of packed structs.
dnf clean all
dnf update -y monit
Fixed in a custom release, 5.34.4-1.20250316, that rejects this commit.
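For anyone wondering where the 134217724 bytes in the Monit alert comes from, here is a rough sketch of how dropping struct packing could produce exactly that number. The PostgreSQL AuthenticationOk message layout is from the wire protocol docs; the padded header layout and the "length minus 4" step are my assumptions about what Monit does internally, not something confirmed from its source.

```python
import struct

# AuthenticationOk per the PostgreSQL v3 wire protocol:
# type byte 'R', Int32 length (8, counts itself but not the type byte), Int32 0.
AUTH_OK = b"R" + struct.pack("!II", 8, 0)


def payload_size(message: bytes, fmt: str) -> int:
    """Read (type, length) with the given header layout and return length - 4.

    Subtracting 4 (the length field itself) is an assumption about how the
    remaining payload size is computed.
    """
    _msg_type, length = struct.unpack_from(fmt, message)
    return length - 4


# Packed layout: the length field starts right after the type byte (offset 1).
packed = payload_size(AUTH_OK, "!cI")

# Padded layout (hypothetical): three pad bytes push the length to offset 4,
# which is what a C compiler typically does for { char; uint32_t; } when the
# struct is not packed.
padded = payload_size(AUTH_OK, "!c3xI")

print(packed)  # 4
print(padded)  # 134217724
```

With padding, the reader picks up the bytes 08 00 00 00 instead of 00 00 00 08, so a length of 8 turns into 134217728, and minus 4 that is the figure in the alert. It fits the packed struct commit, but treat it as a plausible explanation rather than a confirmed diagnosis.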
I’m not seeing it on either server running 5.34.4-1.20250316, but these are checking the Unix socket rather than TCP. Edit /etc/monit.d/postgres.conf, replace the if failed directive with the line below, then run systemctl restart monit:
if failed unixsocket /tmp/.s.PGSQL.5432 protocol pgsql database template1
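If you want to check the socket independently of Monit, the following is a minimal sketch that sends a PostgreSQL v3 startup packet and reads back only the 5-byte reply header. It is not Monit's implementation; the function names are made up for illustration, and the user name you pass in is whatever role exists on your server.

```python
import socket
import struct

PROTOCOL_V3 = 196608  # 3 << 16, i.e. protocol version 3.0


def build_startup(user: str, database: str) -> bytes:
    """Build a StartupMessage: Int32 length, Int32 version, key/value pairs."""
    body = struct.pack("!I", PROTOCOL_V3)
    for key, value in (("user", user), ("database", database)):
        body += key.encode() + b"\x00" + value.encode() + b"\x00"
    body += b"\x00"  # terminator after the last pair
    # The length field counts itself plus everything that follows.
    return struct.pack("!I", len(body) + 4) + body


def parse_header(data: bytes) -> tuple[str, int]:
    """Return (message type, declared length) from the first 5 bytes."""
    msg_type, length = struct.unpack("!cI", data[:5])
    return msg_type.decode(), length


def probe(path: str, user: str, database: str) -> tuple[str, int]:
    """Send a startup packet over a Unix socket and parse the reply header."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as sock:
        sock.settimeout(5)
        sock.connect(path)
        sock.sendall(build_startup(user, database))
        data = b""
        while len(data) < 5:
            chunk = sock.recv(5 - len(data))
            if not chunk:
                raise ConnectionError("server closed the connection early")
            data += chunk
        return parse_header(data)
```

Calling `probe("/tmp/.s.PGSQL.5432", "postgres", "template1")` on the affected host should come back with a message type of `R` (authentication request) or `E` (error) and a small length. If it does, Postgres is answering sanely and the oversized figure is on Monit's side. The `postgres` role here is an assumption; substitute your own.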