Updated on 24.5.2023

How to troubleshoot Linux server memory issues

Linux memory

Some unexpected behaviour on the server side may at times be caused by system resource limitations. Linux is designed to use all of the available physical memory as efficiently as possible; in practice, the kernel follows the basic rule that a page of free RAM is wasted RAM. The system holds a lot more in RAM than just application data, most importantly data mirrored from storage drives for faster access. This debugging guide explains how to identify how much of the resources are actually in use, and how to recognise real resource outage issues.

Process stopped unexpectedly

Suddenly killed tasks are often the result of the system running out of memory, which is when the so-called Out-of-memory (OOM) killer steps in. If a task is killed to save memory, the event is logged in the various log files stored under /var/log/.

You can search the logs for messages of out-of-memory alerts.

sudo grep -i -r 'out of memory' /var/log/

Grep goes through all logs under the directory, so the search will match at least the command you just ran, as recorded in /var/log/auth.log. Actual log entries for OOM-killed processes look something like the following.

kernel: Out of memory: Kill process 9163 (mysqld) score 511 or sacrifice child

The log entry here shows that the killed process was mysqld, with PID 9163 and an OOM score of 511 at the time it was killed. Your log messages may vary depending on the Linux distribution and system configuration.
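
The same kernel messages can also be checked without grepping the log files, either from the kernel ring buffer or, on systemd-based systems, from the journal. For example:

sudo dmesg -T | grep -i 'out of memory'
sudo journalctl -k | grep -i 'out of memory'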

If, for example, a process crucial to your web application was killed as a result of an out-of-memory situation, you have a few options: reduce the amount of memory requested by the process, disallow processes from overcommitting memory, or simply add more memory to your server configuration.

Current resource usage

Linux comes with a few handy tools for tracking processes that can help with identifying possible resource outages. You can track memory usage for example with the command below.

free -h

The command prints out the current memory statistics; for example, on a system with 1 GB of RAM the output looks something like the example below.

                   total   used    free    shared  buffers cached
Mem:               993M    738M    255M    5.7M    64M     439M
-/+ buffers/cache: 234M    759M
Swap:              0B      0B      0B

Here it is important to distinguish between memory used by applications and memory used for buffers and caches. The Mem line of the output would suggest that nearly 75% of our RAM is in use, but over half of that used memory is occupied by cached data.

The difference is that while applications reserve memory for their own use, the cache is simply commonly used hard drive data that the kernel keeps temporarily in RAM for faster access, and on the application level this is considered free memory.

Keeping that in mind, it’s easier to understand why used and free memory are listed twice: the second line conveniently calculates the actual memory usage once the memory occupied by buffers and cache is taken into account.

In this example, the system is using merely 234 MB of the 993 MB total, and no process is in immediate danger of being killed to save resources.
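
Note that newer versions of free omit the -/+ buffers/cache line and instead show buff/cache and available columns, where the available figure is the best estimate of how much memory applications can still claim. Recent kernels expose the same estimate directly in /proc/meminfo, which you can check with, for example, the following.

grep MemAvailable /proc/meminfo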

Another useful tool for memory monitoring is ‘top’, which displays continuously updated information about processes’ memory and CPU usage, runtime and other statistics. It is particularly helpful for identifying resource-intensive tasks.

top

You can scroll the list using the Page Up and Page Down keys on your keyboard. The program runs in the foreground until you quit by pressing ‘q’. Resource usage is shown in percentages and gives an easy overview of your system’s workload.

top - 17:33:10 up 6 days,  1:22,  2 users,  load average: 0.00, 0.01, 0.05
Tasks:  72 total,   2 running,  70 sleeping,   0 stopped,   0 zombie
%Cpu(s):  0.3 us,  0.0 sy,  0.0 ni, 99.7 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
KiB Mem:   1017800 total,   722776 used,   295024 free,    66264 buffers
KiB Swap:        0 total,        0 used,        0 free.   484748 cached Mem

  PID USER      PR  NI    VIRT    RES    SHR S %CPU %MEM     TIME+ COMMAND
    1 root      20   0   33448   2784   1448 S  0.0  0.3   0:02.91 init
    2 root      20   0       0      0      0 S  0.0  0.0   0:00.00 kthreadd
    3 root      20   0       0      0      0 S  0.0  0.0   0:00.02 ksoftirqd/0
    5 root       0 -20       0      0      0 S  0.0  0.0   0:00.00 kworker/0:0H
    6 root      20   0       0      0      0 S  0.0  0.0   0:01.92 kworker/u2:0
    7 root      20   0       0      0      0 S  0.0  0.0   0:05.48 rcu_sched

In the example output shown above, the system is idle and the memory usage is nominal.
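
When looking for memory hogs specifically, top can also be started pre-sorted by memory usage, or you can press Shift+M while it is running to sort the process list by the %MEM column.

top -o %MEM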

Check if your process is at risk

If your server’s memory gets used up to the extent that it can threaten system stability, the Out-of-memory killer will choose which process to eliminate based on many variables, such as the amount of work that would be lost and the total memory freed. Linux keeps a score for each running process, which represents the likelihood of that process being killed in an OOM situation.

This score is stored in the file /proc/<pid>/oom_score, where <pid> is the identification number of the process you are looking into. The PID can easily be found using the following command.

ps aux | grep <process name>

The output of the command when searching for mysql, for example, would be similar to the example below.

mysql     5872  0.0  5.0 623912 51236 ?        Ssl  Jul16   2:42 /usr/sbin/mysqld

Here the process ID is the first number on the row, 5872 in this case, which then can be used to get further information on this particular task.

cat /proc/5872/oom_score

The readout of this gives us a single numerical value for the chance of the process getting axed by the OOM killer. The higher the number the more likely the task is to be chosen if an out-of-memory situation should arise.
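
If you want a system-wide overview instead of checking processes one by one, a small shell loop over /proc can list the highest-scoring processes. The snippet below is only a quick sketch; processes may exit while the loop runs, which is why errors are suppressed.

# Print the ten processes with the highest OOM scores: score, PID and command name
for f in /proc/[0-9]*/oom_score; do
  pid=${f%/oom_score}; pid=${pid#/proc/}
  printf '%s %s %s\n' "$(cat "$f" 2>/dev/null)" "$pid" "$(cat /proc/$pid/comm 2>/dev/null)"
done | sort -rn | head -n 10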

If your important process has a very high OOM score, it is possible the process is wasting memory and should be looked into. However, a high OOM score alone, if memory usage otherwise remains nominal, is no reason for concern. The OOM killer can be disabled, but this is not recommended, as it might cause unhandled failures in out-of-memory situations, possibly leading to a kernel panic or even a system halt.
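
A gentler alternative to disabling the OOM killer outright is to adjust the score of a single process. Writing a negative value to /proc/<pid>/oom_score_adj, where the accepted range is -1000 to 1000 and -1000 effectively exempts the process, makes it a less likely target. For example, using the mysqld PID from above; note that the adjustment only lasts for the lifetime of that process.

echo -500 | sudo tee /proc/5872/oom_score_adj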

Disable overcommit

In major Linux distributions, the kernel by default allows processes to request more memory than is currently free in the system in order to improve memory utilisation. This is based on the heuristic that processes never truly use all of the memory they request. However, if your system is at risk of running out of memory and you wish to avoid losing tasks to the OOM killer, it is possible to disallow memory overcommit.

To change how the system handles overcommitting, Linux provides a utility called ‘sysctl’ that is used to modify kernel parameters at runtime. You can list all sysctl-controlled parameters using the following command.

sudo sysctl -a

The particular parameters that control memory are very imaginatively named vm.overcommit_memory and vm.overcommit_ratio. To change the overcommit mode, use the below command.

sudo sysctl -w vm.overcommit_memory=2

This parameter has 3 different values:

  • 0 means “estimate if we have enough RAM”
  • 1 means “always allow”
  • 2, which is used here, tells the kernel to “say no if the system doesn’t have the memory”
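
Before and after changing the mode, you can check the current values of both parameters with the following.

sudo sysctl vm.overcommit_memory vm.overcommit_ratio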

When changing the overcommit mode, it is important to remember to also change the overcommit_ratio. When overcommit_memory is set to 2, the committed address space is not permitted to exceed swap space plus this percentage of physical RAM. To be able to use all of the system’s memory, use the next command.

sudo sysctl -w vm.overcommit_ratio=100
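
You can verify the resulting limit in /proc/meminfo, where CommitLimit shows the maximum amount of memory the kernel will now hand out and Committed_AS shows how much has already been promised to processes.

grep -E 'CommitLimit|Committed_AS' /proc/meminfo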

These changes are applied immediately but will only persist until the next system reboot. To make the changes permanent, the same parameter values need to be added to the sysctl.conf file. Open the configuration file for editing.

sudo nano /etc/sysctl.conf

Add the same lines to the end of the file.

vm.overcommit_memory=2
vm.overcommit_ratio=100

Save the changes (Ctrl+O) and exit (Ctrl+X) the editor. Your server will read the configuration at every boot and prevent applications from overcommitting memory.
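
If you edited the file without running the sysctl -w commands above, you can also load the new values immediately without rebooting.

sudo sysctl -p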

Add more memory to your server

The safest and most future-proof option for solving out-of-memory issues is adding more memory to your system. In a traditional server environment, you would need to order new memory modules, wait for them to arrive, and install them into your system, but with cloud servers, all you have to do is increase the amount of RAM you wish to have available at your UpCloud control panel.

Log in to your UpCloud control panel, browse to the Server Listing and open your server’s details by clicking on its description. In the Server General Settings tab, there is a section on the right named CPU and Memory Settings. While your server is running, you will notice that these options are greyed out; this is because they can only be safely changed while the server is shut down.

Proceed by turning off your server with the Shutdown request option on the left of the same page, and click OK in the confirmation dialogue. It will usually take a moment for the server to shut down completely, but once it has, the CPU and Memory Settings will become available without you having to refresh the page.

Now you will have two options to increase the amount of RAM in your system:

  1. Select a larger preconfigured instance from the Configuration drop-down menu.
  2. Select the Custom configuration from the same box and then use the slider underneath.

The slider allows you to select a value in increments of 1 GB to change the RAM to the desired configuration. Changing your server’s configuration also affects its pricing. To see the prices corresponding to each preconfigured option or custom configuration, check the server configuration options on the new server deployment page.

Once you’ve selected the new server configuration, simply press the Update button on the right and the changes will be made immediately. Then you can start your server again with the increased RAM.

If you selected a larger preconfigured option, refer to our resizing storage guide on how to allocate the newly added disk space.

Janne Ruostemaa

Editor-in-Chief

  1. Well, now I have seen HIGH cached memory usage in top and free outputs. How can I take a dump to check what is in the cached memory?

  2. Janne Ruostemaa

    Hi there, thanks for the question. High memory or cache usage on Linux by itself is nothing to worry about as the system tries to use the available memory as efficiently as possible. Cached memory, for one, can be freed as needed, but you can use e.g. the fincore utility to get a summary of the cached data.

  3. Hi,

    I have a memory issue on one of my servers after installing the Solemon agent. We had high memory utilisation before as well but we didn’t get any alerts; now, after installing the agent, we do. The strange thing is that I can’t find any process using a lot of memory: top, nmon, sar and vmstat all show memory utilisation at less than 10%.

    Any help on this, please?

  4. Janne Ruostemaa

    Hi there, thanks for the comment. High memory utilisation is normal on Linux as it’s designed to make use of the available resources. Unless you are having problems with processes getting killed due to out-of-memory issues, there’s probably nothing to worry about.

  5. Thanks for the reply Janne. Regarding the above issue: the server has 32 GB RAM, and when I run the free command it shows 31 GB used and the cache is also large. I can see a lot of disk I/O happening as the server has only one disk, and the cache fills up very quickly. Could you please help me find the exact process using the most memory (other than with top, sar, nmon, vmstat etc., as I tried all of these and none shows high memory utilisation)? Please help.

  6. Similar situation here… out of 32 GB installed RAM, free -m shows 30G used, 929M free, cached 18 G.
    -/+ buffer /cache used 11G, free 19 G
    Swap total 7.8 G, used 54 M, free 7.8 G

    Do I need to be concerned here?

  7. Janne Ruostemaa

    Hi Zattara, thanks for the question. With your usage, there’s nothing to worry about. Linux intentionally caches data from the disk to increase system responsiveness. While this makes it look like you are low on free memory, from the perspective of the application the cached memory is free. Check out this site to find out more about the way Linux uses memory.

  8. Janne Ruostemaa

    Indeed, you are not seeing high memory usage on these tools as the majority of it is being “used” by caching. This is normal and won’t cause your server to run out of memory. The number you should look for when checking for memory utilisation is the amount available instead of what’s listed as free. For example:

                  total        used        free      shared  buff/cache   available
    Mem:           991M        329M         98M         22M        564M        473M
  9. Hi sir, my VPS server has 16 GB of RAM. When I check monitoring it shows 20-30% CPU and memory usage at one point, which is quite good, but suddenly at some point it spikes to 100% memory used and my server doesn’t respond: I can’t open WHM or any websites hosted on my server, I can ping my server but cannot do anything, not even SSH. In the end I have to restart my server to make things normal again. This happens frequently, every 7-10 days. How do I run any command like top or free when I don’t even have access to the SSH service? In /var/log there is no log for today’s date. Is there a way to log processes to a file whenever a process eats more than 80% memory or something like that? How can I troubleshoot this issue?

  10. Janne Ruostemaa

    Hi Surya, thanks for the question. Occasional 100% memory usage could indicate a problem with some software running into a bug or leaking memory. I’d suggest setting up logging for memory usage into a file so that even if the server becomes unresponsive, you will have something to investigate after a reboot.
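
    A minimal sketch of such logging, assuming a one-minute interval and a log file at /var/log/memory.log (both the interval and the path are just examples), could be a line in /etc/crontab:

    * * * * * root /bin/sh -c 'date >> /var/log/memory.log; free -m >> /var/log/memory.log; ps aux --sort=-rss | head -n 10 >> /var/log/memory.log'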

  11. Hi,
    I need to display all processes, with the command line of each process, along with their CPU and memory usage, in the top command or ps.
    I tried multiple options with the ps command but had no luck.
    Is there an option to get this in top or ps? Can you please advise?

    Thanks

  12. Janne Ruostemaa

    Hi Ravi, thanks for the comment. Either top or ps can list processes and threads according to CPU or memory usage. For example, ps auxH --sort -rss lists all processes and their threads sorted by memory usage.

  13. Hi Janne, thanks for the nice writeup.
    On one of my RHEL installations, vmstat shows that the cache value eventually goes down to 0. The same application running on my other server instance runs fine, with the cache value staying constant at 23 GB.
    Both of them have 32 GB of RAM.

  14. Janne Ruostemaa

    Hi Bhavesh, thanks for the comment. The buffer/cache part of the system memory usage is counted as part of the available memory. If both cache and free portions of the system memory are approaching 0, there’s something eating away your RAM. In that case, you might want to look into what is consuming the system memory at an increasing amount.

  15. Am I now at risk or not? Please give me advice. I want to know why the numbers look like this and what is consuming memory on my Linux system.

                       total  used  free  shared  buffers  cached
    Mem:               404    339   65    0       0        193
    -/+ buffers/cache: 145    259
    Swap:              9      0     9

  16. Janne Ruostemaa

    Hi Ahmed, thanks for the question. Given that you have a decent amount of data cached, your Linux is running just fine memory-wise.

  17. I really thank you for your reply,
    but sorry for the question: my memory is being consumed in a very bad way.
    If you need any screenshots, tell me, and investigate this problem with me.
    I am very thankful to you again.

  18. Janne Ruostemaa

    You could try monitoring the memory usage on your server with top, for example:

    top -o %MEM
  19. Thank you very much again for your support

    I found the top processes consuming the RAM:

    41.6 6.8 210121560 9523 ora_dbw3_
    41.5 6.8 210175192 9521 ora_dbw2_
    41.5 6.8 210117144 9517 ora_dbw0_
    41.5 6.7 210143064 9519 ora_dbw1_
    Those refer to the DB writer processes, which I think is abnormal.
    Please give advice.

  20. Janne Ruostemaa

    Databases can take up a fair bit of memory but it’s likely nothing to worry about unless other processes are getting killed because of it. If that’s the case, I’d recommend contacting the database software developer directly for further assistance.

  21. So the server is good and there isn’t any problem with it?
    Sure, it’s my pleasure, you can contact me at my mail [email protected]

  22. Ramachandran

    Thank you very much for your answer. It helped me. I would like to add this to apply the changes made in /etc/sysctl.conf.

    sudo sysctl -p
    (from serverfault.com)

    Now, without restarting the machine, your changes take effect.

    Thanks!

  23. Would you please help me with an embedded system? I found that no app has its memory usage increase sharply, but the available memory decreases ridiculously (leading to OOM). I can’t find where the RAM goes. Do you have any ideas on how to determine whether it is an app or the system causing the problem, and how? Thank you very much.

  24. Janne Ruostemaa

    Hi there, thanks for the question. While embedded systems aren’t our forte, the issue sounds like garbage collection is failing to free unused memory if none of the applications is reporting higher than usual memory usage. Hunting down the actual cause is likely to be down to trial and error depending on your particular embedded system.

  25. If the memory shown under available is low, what should I do?

           total  used  free  shared  buff/cache  available
    Mem:   31     2     0     26      28          2
    Swap:  62     0     62

  26. Janne Ruostemaa

    Hi Prabhat, thanks for the question. By your output, the majority of your memory is utilised by buffer or cache meaning it can be freed when applications need it. This type of memory usage shouldn’t cause any issues.

  27. Hi sir,

    I am new to Linux systems. One of my Linux servers has been consuming more memory for the last 4 days and I’m not sure why. Can you please help me troubleshoot it?

  28. Hi,
    free -m shows that about 1/4 of the RAM is shared memory, and I need to remove it. I don’t know how to find what processes are using it. Can you tell me a way to locate those processes, please?

  29. Janne Ruostemaa

    Hi Joey, thanks for the comment. While the shared memory usage shouldn’t be a concern, you can identify the processes that are using it with the following command:

    ipcs -mp
    ------ Shared Memory Creator/Last-op PIDs --------
    shmid      owner      cpid       lpid      
    98335      username   1683       181905

    The output should show something like above where the “cpid” is the process ID using the shared memory.

  30. Janne Ruostemaa

    Hi there, thanks for the comment. On Linux, high memory utilisation is normal, even desired. The idea is to make use of all available memory by caching commonly used files in RAM.

  31. Hi,
    After 4-5 hours all memory shows as used, including swap, and the server hangs. I am using the Zimbra mail server on CentOS 6.10, and the error shows “kernel: Out of memory: Kill process 20402 (apachad) score 131 or sacrifice child” / “kernel: Out of memory: Kill process 30329 (apachad) score 131 or sacrifice child”, with many different process IDs. I am new to the Linux platform, please guide me on what to do.

    Arindam

  32. Janne Ruostemaa

    Hi Arindam, thanks for the comment. If you are able to determine roughly how long it takes for the server to run out of memory, try keeping an eye on your memory usage for example with htop. Once you figure out the poorly behaving process and to what application it belongs, you’ll need to figure out how to stop the memory leak.

  33. Hi Janne,

    Very nice article you shared on memory issues on Linux servers.

    May I get your suggestion regarding my VPS’s memory, which is as follows:
    – Total: 1987
    – Used: 863
    – Free: 214
    – Shared: 0
    – Buff/Cache: 909
    – Available : 942

    I’ve been using the server to operate my apps, and at one moment the apps stalled while processing and even stopped executing their tasks.

    What do you think of my case, and what should I do to recover and fix it soon? Should I clear the RAM cache/buffer, or is it a better idea to add more memory to my VPS?

    Thank you very much for kind advice.

  34. Janne Ruostemaa

    Hi Angung, thanks for the question. If the memory numbers you listed are from a time when your application had issues, the problem probably isn’t low memory. You still have roughly half of the server’s total memory available if your applications were to need more.
