Still getting OOM errors where server is unresponsive

Is there a recommended/required minimum amount of memory for ApisCP? I’ve been trying for months now to get a handle on these OOM errors, but seemingly no amount of memory I throw at the VM is enough…

Whenever things go sideways, I get HTTP 503 errors, can’t access nexus, can’t access SSH, and can’t even log into the VM at the console because everything hangs as I try to log in. It’s been really frustrating, but even worse is the cost in labor of trying to troubleshoot this for months now.

Any solutions out there for the memory issue? I love ApisCP and its potential, but I don’t have the extra time/energy to troubleshoot software right now, just need it to work. Any thoughts/suggestions would be greatly appreciated!

Error Logs:

[15811576.990971] php invoked oom-killer: gfp_mask=0x6200ca(GFP_HIGHUSER_MOVABLE), order=0, oom_score_adj=0
[15811576.993673] CPU: 1 PID: 1889787 Comm: php Not tainted 4.18.0-425.10.1.el8_7.x86_64 #1
[15811576.995551] Hardware name: OpenStack Foundation OpenStack Nova, BIOS 1.10.2-1ubuntu1 04/01/2014
[15811576.997709] Call Trace:
[15811576.998351]  dump_stack+0x41/0x60
[15811576.999276]  dump_header+0x4a/0x1df
[15811577.000091]  oom_kill_process.cold.33+0xb/0x10
[15811577.001441]  out_of_memory+0x1bd/0x4e0
[15811577.002710]  __alloc_pages_slowpath+0xc24/0xd10
[15811577.004150]  ? blk_flush_plug_list+0xd7/0x100
[15811577.005514]  __alloc_pages_nodemask+0x2e2/0x320
[15811577.006996]  pagecache_get_page+0xce/0x310
[15811577.008012]  filemap_fault+0x78b/0xa10
[15811577.008938]  ? __mod_lruvec_page_state+0x5e/0x80
[15811577.010124]  ? page_add_file_rmap+0x99/0x130
[15811577.011047]  ? alloc_set_pte+0xb8/0x3f0
[15811577.011961]  ? xas_load+0x8/0x80
[15811577.012758]  ? _cond_resched+0x15/0x30
[15811577.013715]  __xfs_filemap_fault+0x6d/0x200 [xfs]
[15811577.014769]  __do_fault+0x38/0xc0
[15811577.015754]  handle_pte_fault+0x55d/0x880
[15811577.016761]  __handle_mm_fault+0x453/0x6c0
[15811577.018189]  handle_mm_fault+0xc1/0x1e0
[15811577.019433]  do_user_addr_fault+0x1b9/0x450
[15811577.020355]  do_page_fault+0x37/0x130
[15811577.021221]  ? page_fault+0x8/0x30
[15811577.021969]  page_fault+0x1e/0x30
[15811577.022751] RIP: 0033:0x84302e
[15811577.023531] Code: Unable to access opcode bytes at RIP 0x843004.
[15811577.024975] RSP: 002b:00007ffe073461c0 EFLAGS: 00010207
[15811577.026320] RAX: 000000000000001d RBX: 00007f9154db5ae0 RCX: 0000000000000001
[15811577.027928] RDX: 0000000000000047 RSI: 00007f9154db5ac0 RDI: 00000000000005f6
[15811577.029251] RBP: 00007f9154da1ee8 R08: 00007f9154db5ac0 R09: 00007f9154db5a60
[15811577.030426] R10: 0000000000000000 R11: 00007f9154db5be0 R12: 0000000001081d10
[15811577.031440] R13: 00007f9154db5b80 R14: 000000000162dc40 R15: 00007f9154cd8c78
[15811577.032749] Mem-Info:
[15811577.033267] active_anon:604610 inactive_anon:263044 isolated_anon:0
[15811577.033267]  active_file:7 inactive_file:3472 isolated_file:84
[15811577.033267]  unevictable:384 dirty:0 writeback:0
[15811577.033267]  slab_reclaimable:13097 slab_unreclaimable:26607
[15811577.033267]  mapped:4211 shmem:16820 pagetables:10015 bounce:0
[15811577.033267]  free:21348 free_pcp:2 free_cma:0
[15811577.040853] Node 0 active_anon:2418440kB inactive_anon:1052176kB active_file:28kB inactive_file:13788kB unevictable:1536kB isolated(anon):0kB isolated(file):336kB mapped:16844kB dirty:0kB writeback:0kB shmem:67280kB shmem_thp: 0kB shmem_pmdmapped: 0kB anon_thp: 0kB writeback_tmp:0kB kernel_stack:5040kB pagetables:40060kB all_unreclaimable? no
[15811577.048086] Node 0 DMA free:14848kB min:276kB low:344kB high:412kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB writepending:0kB present:15992kB managed:15360kB mlocked:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
[15811577.053853] lowmem_reserve[]: 0 2687 3646 3646 3646
[15811577.055601] Node 0 DMA32 free:53124kB min:49608kB low:62008kB high:74408kB active_anon:1834160kB inactive_anon:802812kB active_file:624kB inactive_file:12048kB unevictable:0kB writepending:0kB present:3129200kB managed:2824300kB mlocked:0kB bounce:0kB free_pcp:856kB local_pcp:8kB free_cma:0kB
[15811577.059555] lowmem_reserve[]: 0 0 958 958 958
[15811577.060212] Node 0 Normal free:17420kB min:17692kB low:22112kB high:26532kB active_anon:584280kB inactive_anon:249364kB active_file:584kB inactive_file:1304kB unevictable:1536kB writepending:0kB present:1048576kB managed:981584kB mlocked:0kB bounce:0kB free_pcp:92kB local_pcp:0kB free_cma:0kB
[15811577.068399] lowmem_reserve[]: 0 0 0 0 0
[15811577.069913] Node 0 DMA: 0*4kB 0*8kB 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 1*512kB (U) 0*1024kB 1*2048kB (M) 3*4096kB (M) = 14848kB
[15811577.073517] Node 0 DMA32: 2094*4kB (UME) 786*8kB (UME) 953*16kB (UME) 603*32kB (UME) 166*64kB (UME) 18*128kB (UME) 1*256kB (M) 0*512kB 0*1024kB 0*2048kB 0*4096kB = 62392kB
[15811577.078535] Node 0 Normal: 431*4kB (UME) 120*8kB (UME) 452*16kB (UME) 203*32kB (UME) 33*64kB (UM) 3*128kB (UM) 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 18908kB
[15811577.083029] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[15811577.086264] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[15811577.088232] 23382 total pagecache pages
[15811577.089226] 5382 pages in swap cache
[15811577.090080] Swap cache stats: add 217560310, delete 217548091, find 977849224/1050720734
[15811577.091963] Free swap  = 0kB
[15811577.092780] Total swap = 1048572kB
[15811577.093609] 1048442 pages RAM
[15811577.094350] 0 pages HighMem/MovableOnly
[15811577.095414] 93131 pages reserved
[15811577.096095] 0 pages hwpoisoned
[15811577.096852] Tasks state (memory values in pages):
[15811577.098027] [  pid  ]   uid  tgid total_vm      rss pgtables_bytes swapents oom_score_adj name
[15811577.100358] [    725]     0   725    21778       18   208896      549         -1000 systemd-udevd
[15811577.102995] [    769]    32   769    16810        6   184320      176             0 rpcbind
[15811577.105862] [    770]     0   770    38973       97   192512      175         -1000 auditd
[15811577.108668] [    772]     0   772    12261       40   139264      149             0 sedispatch
[15811577.111710] [    825] 65534   825     2153        7    61440       21             0 postsrsd
[15811577.114427] [    837]     0   837    24922       58   233472     4172             0 systemd-logind
[15811577.117334] [    840]    81   840    14335      327   151552      119          -900 dbus-daemon
[15811577.120547] [    847]    70   847    14562       55   147456       79             0 avahi-daemon
[15811577.122742] [    853]   993   853    36786       48   192512      171             0 chronyd
[15811577.125148] [    879]    70   879    14531        1   143360      109             0 avahi-daemon
[15811577.127389] [    927]     0   927   165245     1027   503808     5849             0 firewalld
[15811577.129433] [   1671]     0  1671   148179      307   372736      403             0 NetworkManager
[15811577.131580] [   1731]     0  1731    38400      756   200704      254             0 monit
[15811577.133786] [   2746]     0  2746    54335        0    61440       29             0 agetty
[15811577.135591] [   2748]     0  2748     7109        0    98304       57             0 atd
[15811577.137611] [   2783]     0  2783    12751        1   139264      145             0 xinetd
[15811577.139536] [   2946]   985  2946    27437      284   225280     3393           150 redis-server
[15811577.141708] [   3429]     0  3429     3949        0    77824      163           150 shellinaboxd
[15811577.143870] [   3433]     0  3433     3892        2    77824       72           150 shellinaboxd
[15811577.146319] [  17054]     0 17054     5733       54    90112      156             0 crond
[15811577.148705] [  98670]     0 98670    22549       85   212992      446             0 systemd
[15811577.151316] [  98679]     0 98679    56714        1   299008     1183             0 (sd-pam)
[15811577.153877] [  99035]   985 99035    22545      103   212992      432             0 systemd
[15811577.156210] [  99036]   985 99036    56714        1   299008     1180             0 (sd-pam)
[15811577.157672] [ 128747]   982 128747    22552       93   212992      438             0 systemd
[15811577.158932] [ 128750]   982 128750    73100        0   307200     1191             0 (sd-pam)
[15811577.160121] [1677966]     0 1677966    60102     4561   520192    23696             0 systemd-journal
[15811577.161601] [1812672]   193 1812672    60259    29047   507904     5902             0 systemd-resolve
[15811577.163037] [2887685]     0 2887685    35401       15   176128      550             0 haproxy
[15811577.164322] [3266236]     0 3266236    59430       61   106496      155             0 crond
[15811577.166381] [3273267]     0 3273267    77900        0   180224      175             0 gssproxy
[15811577.168597] [3286734]     0 3286734   173900      927   450560     2965             0 tuned
[15811577.170615] [3289511]     0 3289511   581300     5613  1576960     1481             0 fail2ban-server
[15811577.172988] [3289685]     0 3289685   130450     1493   458752      279             0 rsyslogd
[15811577.175839] [3289701]   998 3289701   402114      132   339968     1896             0 polkitd
[15811577.178325] [3289725]     2 3289725    58852        0   212992      272             0 rngd
[15811577.181017] [3289734]    28 3289734   254182      133   253952      133             0 nscd
[15811577.183544] [3289926]     0 3289926    31227       32   143360      128             0 irqbalance
[15811577.186168] [3296185]     0 3296185    19738       39   200704      197         -1000 sshd
[15811577.188786] [3305645]   991 3305645   453199    23030  1200128    71938          -500 mysqld
[15811577.190793] [3352435]     0 3352435     6812       13    90112      139             0 vsftpd
[15811577.193011] [3733039]   983 3733039    38356      638   196608      806             0 haproxy
[15811577.195392] [3733311]     0 3733311    12292       41   131072      111             0 dovecot
[15811577.197690] [3733314]    97 3733314     2080       36    57344       22             0 anvil
[15811577.199666] [3733315]     0 3733315     2330       81    57344      205             0 log
[15811577.201540] [3733316]     0 3733316     4336      234    73728       83             0 config
[15811577.203477] [3733453]    97 3733453     3048       59    57344       55             0 stats
[15811577.204650] [3831456]    26 3831456    94788      184   278528      341         -1000 postmaster
[15811577.205969] [3831458]    26 3831458    87878       20   229376      364             0 postmaster
[15811577.207202] [ 930052]     0 930052    33812        0   155648      176             0 qemu-ga
[15811577.208440] [1774029]    26 1774029    94863      605   286720      340             0 postmaster
[15811577.211146] [1774030]    26 1774030    94852      593   286720      333             0 postmaster
[15811577.213243] [1774031]    26 1774031    94788      103   245760      355             0 postmaster
[15811577.215430] [1774032]    26 1774032    94922      245   282624      385             0 postmaster
[15811577.217459] [1774033]    26 1774033    89186      453   241664      337             0 postmaster
[15811577.219637] [1774034]    26 1774034    94894      195   278528      396             0 postmaster
[15811577.222222] [1774035]    26 1774035    94894      206   258048      393             0 postmaster
[15811577.224694] [1774038]    26 1774038    96356      358   303104      534             0 postmaster
[15811577.227166] [1774039]   979 1774039   230885      951   483328     9688             0 pdns_server
[15811577.229150] [1774044]    26 1774044    95018      107   262144      559             0 postmaster
[15811577.232304] [1774049]    26 1774049    95051      400   294912      472             0 postmaster
[15811577.235371] [1774051]    26 1774051    95051      404   294912      476             0 postmaster
[15811577.237623] [1774054]    26 1774054    95051      401   294912      472             0 postmaster
[15811577.239739] [1774088]     0 1774088     1094        0    49152       27             0 courierlogger
[15811577.241970] [1774089]     0 1774089    15524       44   147456      140             0 authdaemond
[15811577.244295] [1774090]     0 1774090    21254      181   200704      138             0 authdaemond
[15811577.246537] [1774091]     0 1774091    21254      183   200704      137             0 authdaemond
[15811577.248734] [1800115]   985 1800115   105918      321   356352     1038           150 apnscp_php-fpm
[15811577.250817] [1801540]     0 1801540   111113     6772   520192    20064             0 spamd
[15811577.252949] [1801649]     0 1801649    26387       35   180224      224             0 master
[15811577.255204] [1801651]    89 1801651    26439       43   188416      224             0 qmgr
[15811577.257434] [1803034]    89 1803034    26412       36   192512      220             0 tlsmgr
[15811577.258689] [1856841]     0 1856841     9466       70   106496       88             0 auth
[15811577.260207] [1863770]     0 1863770   115102    18519   528384     9285             0 spamd child
[15811577.261662] [1888041]   977 1888041   460524   315789  3088384    34288             0 clamd
[15811577.262791] [1888213]     0 1888213   111153     7083   479232    19754             0 spamd child
[15811577.264014] [1888430]    89 1888430    26404      246   188416        6             0 pickup
[15811577.267127] [1889364]     0 1889364   113017     6213   450560      958             0 apnscp_php
[15811577.270202] [1889373]    26 1889373    96645     1235   311296      233             0 postmaster
[15811577.272836] [1889386]    26 1889386    97440      515   311296      245             0 postmaster
[15811577.275438] [1889389]    26 1889389   489179   381714  3448832    10117             0 postmaster
[15811577.277894] [1889396]     0 1889396    77884      499   241664      150             0 crond
[15811577.280442] [1889403]     0 1889403    55632       57    81920        2             0 sh
[15811577.282784] [1889563]     0 1889563    56752       79    94208       41             0 upcp.sh
[15811577.286383] [1889581]     0 1889581    56686       17    77824       63             0 upcp.sh
[15811577.288775] [1889582]     0 1889582    54272        0    69632       24             0 tee
[15811577.290920] [1889590]   985 1889590     6865       90    90112       40             0 ssh-agent
[15811577.293215] [1889768]     0 1889768    77884      525   241664      136             0 crond
[15811577.295495] [1889769]     0 1889769    19487      398   196608      133             0 crond
[15811577.297689] [1889782]   985 1889782   127136     2026   536576      207             0 horde-alarms
[15811577.300025] [1889786]  9997 1889786     2451       41    69632        0             0 sh
[15811577.301445] [1889787]  9997 1889787    74466     1954   524288      113             0 php
[15811577.302609] [1889797]     0 1889797   134957     4196   638976      269             0 dnf
[15811577.303699] [1889814]   985 1889814   111517     3364   430080     1141           150 apnscp_php-fpm
[15811577.305685] [1890048]     0 1890048    59483       67    94208      149             0 crond
[15811577.307891] [1890049]     0 1890049    55857      295    73728        0             0 bash
[15811577.309892] [1890056]     0 1890056   109037     6183   397312        0             0 apnscp_php
[15811577.311937] [1890087]   978 1890087    53396      696   323584        1             0 freshclam
[15811577.313954] [1890088]     0 1890088    19487      398   196608      133             0 crond
[15811577.317189] [1890089]     0 1890089    19487      397   196608      133             0 crond
[15811577.319841] [1890092]     0 1890092    77884      527   241664      134             0 crond
[15811577.322007] [1890115]    89 1890115    30051      352   217088        8             0 smtpd
[15811577.323994] [1890126]  9997 1890126     2451       40    61440        0             0 sh
[15811577.325946] [1890127]  9997 1890127     2451       56    73728        0             0 sh
[15811577.327606] [1890138]  9997 1890138    65370      715   450560        0             0 php
[15811577.329402] [1890139]  9997 1890139    65370      714   450560        0             0 php
[15811577.331205] [1890143]   985 1890143   125045      735   512000        0             0 horde-alarms
[15811577.333267] [1890148]    26 1890148    96645     1377   311296      226             0 postmaster
[15811577.335309] [1890150]    89 1890150    31069      381   237568        0             0 proxymap
[15811577.337235] [1890174]    89 1890174    31765      378   221184        0             0 smtpd
[15811577.339135] [1890187]    89 1890187    26407      268   192512        0             0 trivial-rewrite
[15811577.341170] [1890225]    89 1890225    26401      248   192512        0             0 anvil
[15811577.343169] [1890241]    89 1890241    33486      380   253952        0             0 proxymap
[15811577.345307] [1890278] 65534 1890278     2153       17    57344       12             0 postsrsd
[15811577.347839] [1890341]   978 1890341    81511    28106   528384        9             0 freshclam
[15811577.350423] [1890364]    26 1890364    96768     1020   311296      228             0 postmaster
[15811577.352644] [1890386]     0 1890386    15764      153   159744      137             0 crond
[15811577.355056] [1890387]     0 1890387    15764      107   159744      137             0 crond
[15811577.357271] [1890388]    89 1890388    31077      353   229376        0             0 cleanup
[15811577.358485] [1890396]     0 1890396    18231      193   122880        0             0 imap
[15811577.359749] [1890408] 65534 1890408     2153        7    57344       21             0 postsrsd
[15811577.361932] [1890411]    26 1890411    95643      960   294912      228             0 postmaster
[15811577.363859] [1890438]    26 1890438    95018      898   286720      261             0 postmaster
[15811577.366105] [1890464]     0 1890464    19738      233   200704        0             0 sshd
[15811577.367828] [1890475]   985 1890475   105918      352   344064     1009           150 apnscp_php-fpm
[15811577.370080] [1890476]     0 1890476    18578      141   172032        0             0 sshd
[15811577.372224] [1890480]     0 1890480    18578      167   172032        0             0 sshd
[15811577.374501] [1890483]   985 1890483   105918      347   344064     1013           150 apnscp_php-fpm
[15811577.376812] [1890486]    26 1890486    94985      708   266240      291             0 postmaster
[15811577.378889] [1890490]    48 1890490    53501      169   319488        0           500 php-fpm
[15811577.381299] [1890491]    48 1890491    53501      169   327680        1           500 php-fpm
[15811577.383914] [1890492]    48 1890492    49214      143   299008        0           500 php-fpm
[15811577.385992] [1890494]     0 1890494   114215      771   253952      237           500 (php-fpm)
[15811577.388258] [1890498]    48 1890498    46441      133   286720        0           500 php-fpm
[15811577.390715] [1890499]    26 1890499    94920      512   258048      306             0 postmaster
[15811577.394174] [1890508]     0 1890508    18026       70   167936        0             0 sshd
[15811577.396372] [1890513]     0 1890513    19746      159   172032        0             0 systemd-cgroups
[15811577.398557] [1890515]     0 1890515    55485       28    77824        0             0 grep
[15811577.400963] [1890516]     0 1890516    54425       26    57344        0             0 agetty
[15811577.403079] [1890517]   985 1890517   105918      348   344064     1013           150 apnscp_php-fpm
[15811577.405441] [1890519]     0 1890519   114175      748   253952      260             0 (php-fpm)
[15811577.407951] [1890520]    26 1890520    94788       81   237568      319             0 postmaster
[15811577.410136] oom-kill:constraint=CONSTRAINT_NONE,nodemask=(null),cpuset=/,mems_allowed=0,global_oom,task_memcg=/system.slice/php-fpm-MAIN.service,task=(php-fpm),pid=1890494,uid=0
[15811577.414214] Out of memory: Killed process 1890494 ((php-fpm)) total-vm:456860kB, anon-rss:3084kB, file-rss:0kB, shmem-rss:0kB, UID:0 pgtables:248kB oom_score_adj:500
[15811595.515102]

In the dump above, the VM has about 4 GB of RAM (1048442 pages x 4 KB/page), and the 1 GB of swap is completely used (Free swap = 0kB). The rss column is the resident memory of each process, counted in 4 KB pages. Walk backward through the task list to find the processes with abnormally high rss values.
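Scanning the task table by eye is error-prone, so here is a minimal sketch that ranks the entries by rss. It assumes 4 KB pages and the column order shown in the dump above; `oom_top_rss` is a hypothetical helper name, not an ApisCP or kernel tool.

```shell
# Rank processes in a kernel OOM report by resident memory.
# Assumptions: 4 KB pages, and the "Tasks state" column order from the log:
#   uid tgid total_vm rss pgtables_bytes swapents oom_score_adj name
# oom_top_rss is a hypothetical helper, not part of ApisCP.
oom_top_rss() {
  # $1: number of rows to print (default 5); reads dmesg-style text on stdin.
  # Keep only task rows: "[timestamp] [pid] ...". The header row is skipped
  # because "[  pid  ]" is not numeric.
  grep -E '^\[ *[0-9.]+\] \[ *[0-9]+\] ' |
    # Drop the timestamp and bracketed pid so the fields line up for awk.
    sed -E 's/^\[ *[0-9.]+\] \[ *[0-9]+\] +//' |
    # Print rss in MiB, swapents in MiB, and the first word of the name.
    awk '{ printf "%d %d %s\n", $4 * 4 / 1024, $6 * 4 / 1024, $8 }' |
    sort -k1,1nr |
    head -n "${1:-5}"
}

# Usage:
#   dmesg | oom_top_rss 5
```

Each output row is resident MiB, swapped MiB, then the process name. Names containing spaces (for example "spamd child") are cut to their first word.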

Both clamd (315789 pages, about 1.2 GB) and one PostgreSQL backend (pid 1889389, 381714 pages, about 1.5 GB) are resident above 1 GB. For clamd, this is normal usage. If you have multiple servers, it may be better to centralize scanning; the scanner can run on a free DNS-only license available in my.apiscp.com.

ConcurrentDatabaseReload, which spawns a separate copy of ClamAV after signatures update, can also be turned off.

cpcmd scope:set cp.bootstrapper clamav_clamd_config__custom "ConcurrentDatabaseReload no"
# Update ClamAV
upcp -sb clamav/setup

The downside is that any activity during a database reload won’t be scanned.

As for PostgreSQL, which version are you running (psql -V)? If pid 1889389 is still active, this will show what the connection is doing:

 echo "SELECT * FROM pg_stat_activity WHERE pid = 1889389" | psql -x appldb
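The pid will be different in any later OOM report, so a small wrapper around the same lookup may help. This is a sketch only: `pg_backend_sql` is a hypothetical helper name, and it assumes the standard pg_stat_activity columns (pid, usename, datname, state, query_start, query) found in current PostgreSQL releases.

```shell
# Print a pg_stat_activity lookup for one backend pid taken from an OOM report.
# pg_backend_sql is a hypothetical helper, not part of ApisCP or PostgreSQL.
pg_backend_sql() {
  # Accept digits only, so nothing unexpected ends up in the SQL text.
  case "$1" in
    ''|*[!0-9]*) echo "usage: pg_backend_sql <pid>" >&2; return 1 ;;
  esac
  printf 'SELECT pid, usename, datname, state, now() - query_start AS runtime, query FROM pg_stat_activity WHERE pid = %s;\n' "$1"
}

# Usage (appldb is the database named in the command above):
#   pg_backend_sql 1889389 | psql -x appldb
```

The runtime column shows how long the current statement has been running, which helps separate a stuck query from a short-lived one.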