Convalesco

Current revision: 0.8

Last update: 2022-09-01 15:59:29 +0000 UTC

Read not to contradict and confute; nor to believe and take for granted; nor to find talk and discourse; but to weigh and consider.

F. Bacon, The Essays - 1597 A.D.


Uploading kubernetes pod coredumps to s3

Date: 12/08/2022, 19:04

Category: technology

Revision: 1



the task

A C++ application emits core dumps every now and then. The files need to be uploaded to AWS S3 for later inspection.

The solution the team came up with is simple yet elegant. Linux allows the admin to set up a path for core dumps. That directory is shared between the application generating core dumps and a shell script that uploads them to S3. The script monitors the directory for filesystem changes via inotify. When the application writes a new core dump, the script receives the notification and performs the upload.

The script will be deployed as a kubernetes daemonset to the DigitalOcean managed kubernetes distribution called DOKS.

There is no ReadWriteMany access mode support for DOKS volumes yet, so hostPath it is!

handling signal propagation

A shell script does not forward the signals it receives to its child processes. Run carelessly as the entrypoint of a docker container, a shell script can leave zombie processes behind and starve kubernetes nodes of resources [1]. So here’s the script we came up with:

#!/bin/sh
set -e

# SIGTERM propagation to child process
_t() {
  echo "[$(date '+%F %T')] Caught SIGTERM signal!"
  kill -TERM "$child" 2>/dev/null
}

# POSIX sh expects the signal name without the SIG prefix
trap _t TERM

# set kubernetes related variables
n_name="${NODE_NAME:-unknown}"
c_name="${CLUSTER_NAME:-unknown}"

echo "[$(date '+%F %T')] Monitoring ${LOCAL_PATH} on ${n_name}"

# launch inotifywait in the background
inotifywait -q -m "${LOCAL_PATH}" -e close_write | while read -r path action file
do
  n="${S3_BUCKET}/${c_name}/${n_name}.${file}"
  aws s3 cp "${path}/${file}" "s3://${n}" --only-show-errors
  echo "[$(date '+%F %T')] [coredump] '${file}' has been uploaded to 's3://${n}'"
done &

# remember the PID of the background pipeline so the trap can signal it, then wait for it
child=$!
wait "$child"

The NODE_NAME and CLUSTER_NAME variables help us identify the cluster and the node a core dump came from. This version is the result of three iterations:

  1. Handled signal propagation manually
  2. Added date and time to the logs
  3. Silenced unnecessary inotifywait and awscli logs
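
The first iteration, manual signal propagation, can be tried in isolation. The following is a minimal sketch rather than the team's script: `sleep 30` stands in for the inotifywait pipeline, and `forward_term_demo` is a made-up name. It starts a wrapper subshell, sends it SIGTERM after one second, and shows the wrapper passing the signal on to its child.

```shell
#!/bin/sh
# Sketch: forward SIGTERM from a wrapper shell to a background child.
# `sleep 30` is a hypothetical stand-in for the inotifywait pipeline.

forward_term_demo() {
  (
    sleep 30 &
    child=$!
    # On TERM, pass the signal on to the child instead of leaving it behind
    trap 'echo "forwarding TERM"; kill -TERM "$child" 2>/dev/null' TERM
    # The first wait returns early when the trapped signal arrives;
    # the second wait reaps the child once it has been terminated
    wait "$child"
    wait "$child" 2>/dev/null
    echo "child exited"
  ) &
  wrapper=$!
  sleep 1                  # give the wrapper time to install its trap
  kill -TERM "$wrapper"    # plays the role of the container runtime stopping the pod
  wait "$wrapper"
}

forward_term_demo
```

Without the trap, the wrapper would die on SIGTERM and the `sleep` would keep running until it finished on its own.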

That’s all, the script is ready for testing!
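
Before involving inotifywait and the AWS CLI, the loop body can be dry-run. This is a sketch under assumptions: `dry_run_uploads` is a hypothetical helper, the sample event line and the names `staging` and `node-1` are invented, and the command is printed instead of executed. It shows how one inotifywait line is split into fields and how the S3 key is assembled from bucket, cluster, node and file name.

```shell
#!/bin/sh
# Dry run of the upload loop: reads inotifywait-style lines from stdin
# and prints the aws command instead of running it.
# dry_run_uploads is a hypothetical helper name.

dry_run_uploads() {
  bucket="$1"
  c_name="${2:-unknown}"
  n_name="${3:-unknown}"
  # inotifywait -m prints: <watched dir>/ <EVENTS> <file name>
  while read -r path action file
  do
    key="${bucket}/${c_name}/${n_name}.${file}"
    # the watched directory is printed with a trailing slash,
    # so no extra separator is added between path and file here
    echo "aws s3 cp ${path}${file} s3://${key} --only-show-errors"
  done
}

# Invented sample event for a core dump written to /var/coredump
printf '%s\n' '/var/coredump/ CLOSE_WRITE,CLOSE core.app.sig11.42.1660331040' |
  dry_run_uploads myorg-coredump staging node-1
```

Prefixing the file name with the node name keeps dumps from different nodes apart even when two applications produce the same file name.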

the dockerfile

Following best practices, the dockerfile is based on alpine linux and runs as an unprivileged user:

FROM alpine:3.15.5

# --no-cache fetches a fresh package index itself, so a separate "apk update" is not needed
RUN apk add --no-cache inotify-tools=3.20.11.0-r0 \
                       aws-cli=1.19.105-r0 \
                       ca-certificates

RUN adduser -u 1000 --gecos '' --disabled-password --no-create-home myuser
COPY watch.sh /usr/bin/
USER myuser
ENTRYPOINT ["/usr/bin/watch.sh"]

The script runs as user ID 1000, the same ID as the application generating the core dumps.

kubernetes deployment

The interesting part of the deployment definition is the initContainers section. Here is the relevant part:

initContainers:
- name: set-coredump-path
  image: busybox:1.28
  command: ['sh','-c','echo "/var/coredump/core.%e.sig%s.%p.%t" > /proc/sys/kernel/core_pattern']
  securityContext:
    privileged: true
- name: set-coredump-size-limit
  image: busybox:1.28
  command: ['sh', '-c', 'ulimit -c 1024000000']
  securityContext:
    privileged: true
- name: set-coredump-permissions
  image: busybox:1.28
  command: ['sh','-c','chown -R 1000:1000 /var/coredump']
  securityContext:
    privileged: true
  volumeMounts:
  - name: coredump
    mountPath: /var/coredump
containers:
- name: upload-to-s3
  image: gathertown/upload-to-s3:latest
  env:
  - name: NODE_NAME
    valueFrom:
      fieldRef:
        apiVersion: v1
        fieldPath: spec.nodeName
  - name: LOCAL_PATH
    value: "/var/coredump"
  - name: S3_BUCKET
    value: "myorg-coredump"
  - name: CLUSTER_NAME
    value: __CLUSTER_NAME__
  resources:
    limits:
      memory: 100Mi
    requests:
      cpu: 100m
  securityContext:
    runAsUser: 1000
    runAsNonRoot: true
    allowPrivilegeEscalation: false
    capabilities:
      drop:
        - ALL
  volumeMounts:
  - name: coredump
    mountPath: /var/coredump
volumes:
- name: coredump
  hostPath:
    path: /var/coredump
    type: DirectoryOrCreate

The initContainers modify the host’s behaviour. They run in privileged mode because they write to host-level settings such as /proc/sys/kernel/core_pattern. The actions they perform are:

  1. set the host core dump pattern to drive core dumps to the predefined directory
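  2. set the core dump size limit to 1GB (another commonly used number is 100MB). Note that ulimit only applies to the shell that runs it and to that shell's children, so the limit in effect for a crashing process is the one its own container runs with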
  3. set the appropriate directory permissions
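
For the first action, the specifiers in the pattern decide the file names that later show up in the S3 keys. The sketch below only imitates the kernel's substitution for the four specifiers used here: %e (executable name), %s (signal number), %p (PID) and %t (time of the dump in seconds since the epoch). `expand_core_pattern` and the sample values are hypothetical.

```shell
#!/bin/sh
# Sketch: show the file name the kernel would produce for the core_pattern above.
# expand_core_pattern is a hypothetical helper; it handles only the four
# specifiers used in this post: %e, %s, %p and %t.

expand_core_pattern() {
  pattern="$1" exe="$2" sig="$3" pid="$4" ts="$5"
  printf '%s\n' "$pattern" |
    sed -e "s|%e|${exe}|g" \
        -e "s|%s|${sig}|g" \
        -e "s|%p|${pid}|g" \
        -e "s|%t|${ts}|g"
}

# Invented example: "myapp" killed by signal 11 (SIGSEGV) with PID 4242
expand_core_pattern '/var/coredump/core.%e.sig%s.%p.%t' myapp 11 4242 1660331040
```

With these sample values the result is /var/coredump/core.myapp.sig11.4242.1660331040, so the signal and the time of the crash can be read straight from the object name in the bucket.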

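The DirectoryOrCreate type makes sure the directory is created if it doesn’t exist on the node. The securityContext of the upload container hardens it further: it runs as a non-root user, cannot escalate privileges and drops all capabilities.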

monitoring and alerting

Luckily, core dumps are rare, so monitoring and alerting are done through logs. When the keyword [coredump] appears in our logging system, developers receive an alert.
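
The keyword match itself is easy to reproduce on the command line. The sketch below is not the alerting setup, which depends on the logging system in use. It only assumes the log format emitted by the script, and `list_uploaded_coredumps` and the sample log lines are made up for illustration.

```shell
#!/bin/sh
# Sketch: pick uploaded core dump names out of the watcher's log lines.
# list_uploaded_coredumps is a hypothetical helper; it relies only on the
# log format emitted by the script:
#   [date time] [coredump] '<file>' has been uploaded to 's3://...'

list_uploaded_coredumps() {
  # grep -F matches the keyword literally, so the brackets are not a character class
  grep -F '[coredump]' | sed -e "s/.*\[coredump\] '\([^']*\)'.*/\1/"
}

# Invented sample log: one startup line and one upload line
printf '%s\n' \
  "[2022-08-12 19:04:00] Monitoring /var/coredump on node-1" \
  "[2022-08-12 19:05:10] [coredump] 'core.myapp.sig11.4242.1660331110' has been uploaded to 's3://myorg-coredump/staging/node-1.core.myapp.sig11.4242.1660331110'" |
  list_uploaded_coredumps
```

Only the upload line carries the keyword, so the startup message is ignored and the single file name is printed.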

example code

The application code is publicly available on GitHub.