How to Fix OOMKilled Kubernetes Error (Exit Code 137)

Nir Shtein, Software Engineer

5 min read | August 14th, 2024

Kubernetes Troubleshooting

What is OOMKilled (exit code 137)

The OOMKilled status in Kubernetes, flagged by exit code 137, signifies that the Linux Kernel has halted a container because it has surpassed its allocated memory limit. In Kubernetes, each container within a pod can define two key memory-related parameters: a memory limit and a memory request. The memory limit is the ceiling of RAM usage that a container can reach before it is forcefully terminated, whereas the memory request is the baseline amount of memory that Kubernetes will guarantee for the container’s operation.
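For example, here is a minimal pod specification that sets both values for a single container; the pod name, image, and sizes are illustrative only:

apiVersion: v1
kind: Pod
metadata:
  name: memory-demo                # hypothetical pod name
spec:
  containers:
  - name: app
    image: nginx:1.25              # illustrative image
    resources:
      requests:
        memory: "256Mi"            # baseline memory Kubernetes guarantees when scheduling
      limits:
        memory: "512Mi"            # ceiling; exceeding it triggers an OOM kill (exit code 137)

If the container's working set grows past the 512Mi limit, the kernel kills its process and the container is restarted with the reason OOMKilled.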

When a container attempts to consume more memory than its set limit, the Linux OOM Killer kills the container’s process and Kubernetes reports the container status as ‘OOMKilled’. This mechanism prevents a single container from exhausting the node’s memory, which could compromise other containers running on the same node. In scenarios where the combined memory consumption of all pods on a node exceeds available memory, Kubernetes may terminate one or more pods to stabilize the node’s memory pressure.

To detect an OOMKilled event, use the kubectl get pods command, which will display the pod’s status as OOMKilled. For instance:

NAME       READY   STATUS      RESTARTS   AGE
my-pod-1   0/1     OOMKilled   0          3m12s
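To confirm the reason and exit code for a specific pod, you can also query the container’s last termination state directly. A quick check, assuming a single-container pod named my-pod-1:

kubectl get pod my-pod-1 -o jsonpath='{.status.containerStatuses[0].lastState.terminated.reason}'
# OOMKilled
kubectl get pod my-pod-1 -o jsonpath='{.status.containerStatuses[0].lastState.terminated.exitCode}'
# 137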

Resolving OOMKilled issues often starts with evaluating and adjusting the memory requests and limits of your containers and may also involve debugging memory spikes or leaks within the application. For in-depth cases that involve persistent or complex OOMKilled events, further detailed investigation and troubleshooting will be necessary.

OOMKilled is actually not native to Kubernetes; it is a feature of the Linux kernel, known as the OOM Killer, which Kubernetes uses to manage container lifecycles. The OOM Killer monitors node memory and selects processes that are using too much memory and should be killed. It is important to realize that the OOM Killer may kill a process even if there is free memory on the node.

The Linux kernel maintains an oom_score for each process running on the host. The higher this score, the greater the chance that the process will be killed. Another value, called oom_score_adj, allows users to customize the OOM process and define when processes should be terminated.

Kubernetes uses the oom_score_adj value when defining a Quality of Service (QoS) class for a pod. There are three QoS classes that may be assigned to a pod:

  • Guaranteed
  • Burstable
  • BestEffort

Each QoS class has a matching value for oom_score_adj:

Quality of Service   oom_score_adj
Guaranteed           -997
BestEffort           1000
Burstable            min(max(2, 1000 - (1000 * memoryRequestBytes) / machineMemoryCapacityBytes), 999)

Because “Guaranteed” pods have a lower value, they are the last to be killed on a node that is running out of memory. “BestEffort” pods are the first to be killed.
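As a rough illustration of how these classes are assigned: a pod is Guaranteed when every container’s CPU and memory requests equal its limits, Burstable when at least one container sets a request or limit without meeting the Guaranteed criteria, and BestEffort when no requests or limits are set at all. A hypothetical Guaranteed pod might look like this (names and sizes are placeholders):

apiVersion: v1
kind: Pod
metadata:
  name: guaranteed-demo            # hypothetical pod name
spec:
  containers:
  - name: app
    image: nginx:1.25              # illustrative image
    resources:
      requests:
        memory: "512Mi"
        cpu: "500m"
      limits:
        memory: "512Mi"            # requests equal limits for CPU and memory -> Guaranteed QoS
        cpu: "500m"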

A pod that is killed due to a memory issue is not necessarily evicted from its node: if the pod’s restart policy is set to “Always”, the kubelet will try to restart it.

To see the QoS class of a pod, run the following command:

kubectl get pod [name] -o jsonpath='{.status.qosClass}'

To see the oom_score of a pod:

  1. Run kubectl exec -it [pod-name] -- /bin/bash
  2. To see the oom_score, run cat /proc/[pid]/oom_score
  3. To see the oom_score_adj, run cat /proc/[pid]/oom_score_adj

The process with the highest oom_score is the first to be killed when the node runs out of memory.
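Putting the steps above together, a minimal sketch, assuming a pod named my-pod-1 whose main process is PID 1 inside the container and an image that includes a shell:

kubectl exec -it my-pod-1 -- /bin/bash
# inside the container:
cat /proc/1/oom_score        # current score of the main process
cat /proc/1/oom_score_adj    # adjustment applied by Kubernetes based on the pod's QoS class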

This is part of a series of articles about Exit Codes.

OOMKilled: Common Causes

The following table shows the common causes of this error and how to resolve it. However, note there are many more causes of OOMKilled errors, and many cases are difficult to diagnose and troubleshoot.

Cause: The container memory limit was reached, and the application is experiencing higher load than normal.
Resolution: Increase the memory limit in the pod specification.

Cause: The container memory limit was reached, and the application has a memory leak.
Resolution: Debug the application and resolve the memory leak.

Cause: The node is overcommitted, meaning the total memory used by pods is greater than the node's memory.
Resolution: Adjust memory requests (minimal threshold) and memory limits (maximal threshold) in your containers.

Tips from the expert

Itiel Shwartz

Co-Founder & CTO

Itiel is the CTO and co-founder of Komodor. He’s a big believer in dev empowerment and moving fast, and has worked at eBay, Forter, and Rookout (as the founding engineer). A backend and infra developer turned “DevOps”, Itiel is an avid public speaker who loves talking about cloud infrastructure, Kubernetes, Python, observability, and R&D culture.

In my experience, here are tips that can help you better manage and resolve OOMKilled (Exit Code 137) errors in Kubernetes:

Analyze memory usage patterns

Use tools like Prometheus and Grafana to monitor and analyze memory usage trends over time.
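If Prometheus is not in place yet, a quick point-in-time view of per-container memory usage is available through the Metrics API; this assumes metrics-server is installed in the cluster, and the namespace name is a placeholder:

kubectl top pod --containers -n my-namespace    # per-container memory and CPU usage
kubectl top node                                # node-level usage to compare against capacity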

Right-size your containers

Regularly adjust memory requests and limits based on historical usage data to prevent over- or under-allocation.

Use memory-efficient libraries

Optimize your application to use libraries and algorithms that consume less memory.

Enable vertical pod autoscaling

Automatically adjust memory limits and requests based on real-time usage to handle load changes effectively.
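As a rough sketch, a VerticalPodAutoscaler object looks like the following; it assumes the VPA components are installed in the cluster (they do not ship with Kubernetes by default) and the target Deployment name is hypothetical:

apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: my-app-vpa                 # hypothetical name
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app                   # hypothetical workload
  updatePolicy:
    updateMode: "Auto"             # VPA evicts pods and recreates them with updated requests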

Debug memory leaks

Use profiling tools, such as the JVM’s built-in tooling for Java applications or language-specific memory profilers, to detect and fix memory leaks in your application.
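For a Java workload, for example, a sketch of an in-container heap inspection might look like the following; it assumes the application runs as PID 1 in a pod named my-pod-1 and that JDK tools (jcmd, jmap) are available in the image:

kubectl exec my-pod-1 -- jcmd 1 GC.class_histogram | head -n 20             # largest classes by bytes used
kubectl exec my-pod-1 -- jmap -dump:live,format=b,file=/tmp/heap.hprof 1    # heap dump for offline analysis
kubectl cp my-pod-1:/tmp/heap.hprof ./heap.hprof                            # copy the dump locally (e.g., for Eclipse MAT)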

OOMKilled: Diagnosis and Resolution

Step 1: Gather Information

Run kubectl describe pod [name] and save the content to a text file for future reference:

kubectl describe pod [name] > /tmp/troubleshooting_describe_pod.txt

Step 2: Check Pod Events Output for Exit Code 137

Check the Containers and Events sections of the describe pod text file, and look for the following message:

State:          Running
  Started:      Thu, 10 Oct 2019 11:14:13 +0200
Last State:     Terminated
  Reason:       OOMKilled
  Exit Code:    137
...

Exit code 137 indicates that the container was terminated due to an out of memory issue. Now look through the events in the pod’s recent history, and try to determine what caused the OOMKilled error:

  • The pod was terminated because a container limit was reached.
  • The pod was terminated because the node was “overcommitted”: the pods scheduled on that node, taken together, request more memory than is available on the node.

Step 3: Troubleshooting

If the pod was terminated because a container limit was reached:

  • Determine if your application really needs more memory. For example, if the application is a website that is experiencing additional load, it may need more memory than originally specified. In this case, to resolve the error, increase the memory limit for the container in the pod specification (see the example after this list).
  • If memory use suddenly increases, and does not seem to be related to application loads, the application may be experiencing a memory leak. Debug the application and resolve the memory leak. In this case you should not increase the memory limit, because this will cause the application to use up too many resources on the nodes.
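For the first case, where the application genuinely needs more memory, one way to raise the values is kubectl set resources; the deployment name and sizes below are placeholders:

kubectl set resources deployment my-app --requests=memory=512Mi --limits=memory=1Gi

This updates the pod template and triggers a rolling restart of the deployment's pods.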

If the pod was terminated because of overcommit on the node:

  • Overcommit on a node can occur because pods are allowed to schedule on a node as long as their memory request value, the minimal memory value, is less than the memory available on the node; memory limits are not taken into account during scheduling.
  • For example, Kubernetes may run 10 containers with a memory request value of 1 GB on a node with 10 GB memory. However, if these containers have a memory limit of 1.5 GB, some of the pods may use more than the minimum memory, and then the node will run out of memory and need to kill some of the pods.
  • You need to determine why Kubernetes decided to terminate the pod with the OOMKilled error, and adjust memory requests and limit values to ensure that the node is not overcommitted.
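To check whether a node is overcommitted, compare the memory requests and limits already allocated on it against its allocatable memory; the node name below is an example:

kubectl describe node my-node-1 | grep -A 8 "Allocated resources"    # requests and limits committed to the node
kubectl top node my-node-1                                           # actual usage (requires metrics-server)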

When adjusting memory requests and limits, keep in mind that when a node is overcommitted, Kubernetes kills pods according to the following priority order:

  1. Pods that do not have requests or limits
  2. Pods that have requests, but not limits
  3. Pods that are using more than their memory request value—minimal memory specified—but under their memory limit
  4. Pods that are using more than their memory limit

To fully diagnose and resolve Kubernetes memory issues, you’ll need to monitor your environment, understand the memory behavior of pods and containers compared to the limits, and fine tune your settings. This can be a complex, unwieldy process without the right tooling.

Solving Kubernetes Errors Once and for All with Komodor

The troubleshooting process in Kubernetes is complex and, without the right tools, can be stressful, ineffective and time-consuming. Some best practices can help minimize the chances of things breaking down, but eventually something will go wrong—simply because it can.

This is the reason why we created Komodor, a tool that helps dev and ops teams stop wasting their precious time looking for needles in (hay)stacks every time things go wrong.

Acting as a single source of truth (SSOT) for all of your k8s troubleshooting needs, Komodor offers:

  • Change intelligence: Every issue is a result of a change. Within seconds we can help you understand exactly who did what and when.
  • In-depth visibility: A complete activity timeline, showing all code and config changes, deployments, alerts, code diffs, pod logs, and more. All within one pane of glass with easy drill-down options.
  • Insights into service dependencies: An easy way to understand cross-service changes and visualize their ripple effects across your entire system.
  • Seamless notifications: Direct integration with your existing communication channels (e.g., Slack) so you’ll have all the information you need, when you need it.