Problem:
The ora_lms process consumes too much memory in a 19c RAC database, leading to memory starvation and heavy swapping, which ends with nodes being evicted from the cluster.
Analysis:
This issue is tracked as Bug 31969719, which Oracle’s Development team is currently working on.
Facts:
– Don’t try to kill the ora_lms process from the OS (thinking it will restart itself), as this will terminate the whole instance.
– The number of LMS processes is determined by the number of CPUs used by the server.
For example:
n = “number of CPUs reported by the OS and used for CPU_COUNT by default”
n < 4 => 1 LMS process will be started
4 <= n < 16 => 2 LMS processes will be started
n >= 16 => 2 LMS processes, plus 1 more LMS process for every 32 CPUs, will be started.
Reference: Doc ID 1392248.1
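To see how the rule above plays out on your own cluster, a minimal sketch like the following can help. It is not from the MOS note: it assumes the LMS background processes show up in gv$process with names starting with LMS, and it also reads GCS_SERVER_PROCESSES, the parameter that sets the number of LMS processes.

```sql
-- CPU_COUNT drives the default; GCS_SERVER_PROCESSES holds the LMS count in effect.
SELECT inst_id, name, value
FROM   gv$parameter
WHERE  name IN ('cpu_count', 'gcs_server_processes')
ORDER  BY inst_id, name;

-- LMS processes actually running on each instance
-- (assumption: their process names start with 'LMS').
SELECT inst_id, COUNT(*) AS lms_processes
FROM   gv$process
WHERE  pname LIKE 'LMS%'
GROUP  BY inst_id
ORDER  BY inst_id;
```

If the two results disagree, the naming assumption in the second query is the first thing to check.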
Workarounds:
As of the date of publishing this post, there is no official workaround for this bug, but the following actions helped me slow down the pace of ora_lms memory consumption:
– Reduce the SGA size (although the LMS process takes its memory from the PGA, reducing the SGA size reduces the impact of ora_lms on memory).
– Reduce the db_cache_size.
– Reduce the frequency of AWR snapshots [it was every 20 minutes; I changed it to 1 hour]. Both parameters are given in minutes:
SQL> EXEC dbms_workload_repository.modify_snapshot_settings (interval => 60, retention => 43200);
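To confirm that the AWR change took effect, the current settings can be read back from DBA_HIST_WR_CONTROL. This is just a quick check I would run, not part of the workaround itself. Note that the view reports both values as intervals (days, hours, minutes), while modify_snapshot_settings takes minutes, so a retention of 43200 minutes shows up as 30 days.

```sql
-- Current AWR snapshot interval and retention for this database.
SELECT snap_interval, retention
FROM   dba_hist_wr_control;
```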
I understand that there is no science behind the above workarounds, but they worked for me 😀
The only “backed up with science” workaround here is to restart the RAC DB instances one by one periodically, before the system starts swapping (hoping that all connected applications are RAC aware 😇).
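To decide when a rolling restart is due, it helps to watch how much PGA the LMS processes are holding on each instance. The query below is a rough sketch under the same assumption as before, that the LMS processes appear in gv$process with names starting with LMS. The threshold at which you restart is something you have to pick for your own servers.

```sql
-- PGA memory per LMS process, in MB, across all RAC instances.
-- spid is the OS process ID, handy for matching against top/ps on the node.
SELECT inst_id,
       pname,
       spid,
       ROUND(pga_used_mem  / 1024 / 1024) AS pga_used_mb,
       ROUND(pga_alloc_mem / 1024 / 1024) AS pga_alloc_mb,
       ROUND(pga_max_mem   / 1024 / 1024) AS pga_max_mb
FROM   gv$process
WHERE  pname LIKE 'LMS%'
ORDER  BY inst_id, pname;
```

Running this on a schedule and keeping the output gives you a growth trend, which is more useful than a single reading.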