Something doesn't work as expected? Check out our advice to help you troubleshoot.

CrashLoopBackoff when restarting all pods

Question from a user: "When I restart all Kubernetes pods at once, they get stuck in a CrashLoopBackoff for a number of minutes before eventually resolving - Why does it happen?"

You are likely hitting the startup issue of a Java application causing failure of liveness probes. The startup can consume a lot of resources, since Java will load many classes at startup. Setting the CPU limit to 0.5 can improve the startup time of the server and should resolve the failing healthchecks.

Unprocessable execution

Sometimes, things can go wrong in an unattended manner; in such situations, you can skip an execution that Kestra is not able to process.

To do so, you can start the executor server (or the standalone server if not using a deployment with separate components) with a list of execution identifiers to skip.

sh
kestra server executor --skip-executions 6FSPERUe1JwbYmMmdwRlgV,5iLGjTLOHAVGUGlsesFaMb

Docker in Docker (DinD)

If you face some issues using Docker in Docker e.g. with Script tasks using DOCKER runner, start troubleshooting by attaching the terminal: docker run -it --privileged docker:dind sh. Then, use docker logs container_ID to get the container logs.

Also, try docker inspect container_ID to get more information about your Docker container. The output from this command displays details about the container, its environments, network settings, etc. This information can help you identify what might be wrong.

Docker in Docker using Helm charts

On some Kubernetes deployments, using DinD with our default Helm charts can lead to:

bash
Device "ip_tables" does not exist.
ip_tables              24576  4 iptable_raw,iptable_mangle,iptable_nat,iptable_filter
modprobe: can't change directory to '/lib/modules': No such file or directory
error: attempting to run rootless dockerd but need 'kernel.unprivileged_userns_clone' (/proc/sys/kernel/unprivileged_userns_clone) set to 1

To fix this, use root to launch the DinD container by setting the following values:

yaml
dind:
  image:
    tag: dind
  args:
    - --log-level=fatal
  securityContext:
    runAsUser: 0
    runAsGroup: 0

securityContext:
  runAsUser: 0
  runAsGroup: 0

Docker in Docker (DinD) on a Mac with ARM-based Apple silicon chip

If you are getting an error similar to: java.io.IOException: com.sun.jna.LastErrorException: [111] Connection refused, it might be related to a Docker in Docker (DinD) issue.

Try using an embedded Docker server as shown below:

Example docker-compose.yml

tmp directory "No such file or directory"

If you're encountering errors associated to the tmp directory such as "No such file or directory", there's a good chance your tmp directory isn't mounted correctly in your Kestra configuration.

When configuring your docker-compose.yml, it's important to make sure that the tmp-dir has the same path as the volume otherwise Kestra won't know what directory to mount for the tmp directory.

yaml
kestra:
  tasks:
    tmp-dir:
      path: /home/kestra/tmp

In this example, /home/kestra:/home/kestra matches the tasks tmp-dir field.

yaml
volumes:
  - kestra-data:/app/storage
  - /var/run/docker.sock:/var/run/docker.sock
  - /home/kestra:/home/kestra

Was this page helpful?