#+OPTIONS: html-postamble:nil
#+title: Graph Docker Container Memory Usage with Gnuplot
#+date: <2025-02-08 Sat>
#+author: Tom Alexander
#+email:
#+language: en
#+select_tags: export
#+exclude_tags: noexport

Sometimes it can be useful to build a graph of docker memory usage over time. For example, I was recently working on reducing the maximum memory of a long-running script. There certainly are heavy and complex options out there, like setting up Prometheus and [[https://docs.docker.com/engine/daemon/prometheus/][configuring docker to export metrics to it]], but I threw together a small python script, using only the python standard library, that outputs gnuplot code to render a graph.

* Usage
Invoke the python script before starting any docker containers. Then, once a docker container is started, the script will start recording memory usage. Any additional docker containers that are started while the script is running will also get recorded. When no docker containers are left, the script will export gnuplot code over stdout that can then be rendered into a graph.

Each container will get its own line on the graph. All containers will have their start time aligned with the left-hand side of the graph as if they had started at the same time (so the X-axis is the number of seconds the docker container has been running, as opposed to the wall time).

If you'd like, you can insert a horizontal line at whatever memory quantity you'd like by uncommenting the src_python[:exports code]{horizontal_lines} array below. This can be useful for showing a maximum limit like the paltry 32GiB offered by Cloud Run.

* Example Invocation
#+begin_src bash
$ ./graph_docker_memory.py | gnuplot > graph.svg
INFO:root:Waiting for a docker container to exist to start recording.
INFO:root:Recorded stat jovial_chandrasekhar: 528384 bytes
INFO:root:Recorded stat jovial_chandrasekhar: 528384 bytes
INFO:root:Recorded stat exciting_bohr: 512000 bytes
INFO:root:Recorded stat jovial_chandrasekhar: 516096 bytes
INFO:root:Recorded stat exciting_bohr: 512000 bytes
INFO:root:Recorded stat jovial_chandrasekhar: 561152 bytes
INFO:root:Recorded stat exciting_bohr: 512000 bytes
INFO:root:Recorded stat jovial_chandrasekhar: 561152 bytes
INFO:root:Recorded stat exciting_bohr: 4866441 bytes
INFO:root:Recorded stat jovial_chandrasekhar: 561152 bytes
INFO:root:Recorded stat exciting_bohr: 3166699 bytes
INFO:root:Recorded stat jovial_chandrasekhar: 561152 bytes
INFO:root:Recorded stat exciting_bohr: 3128950 bytes
INFO:root:Recorded stat jovial_chandrasekhar: 8568963 bytes
INFO:root:Recorded stat exciting_bohr: 3128950 bytes
INFO:root:Recorded stat jovial_chandrasekhar: 8528068 bytes
INFO:root:Recorded stat exciting_bohr: 3128950 bytes
INFO:root:Recorded stat jovial_chandrasekhar: 8528068 bytes
INFO:root:Recorded stat exciting_bohr: 32547799 bytes
INFO:root:Recorded stat jovial_chandrasekhar: 8528068 bytes
INFO:root:Recorded stat exciting_bohr: 4329570 bytes
INFO:root:Recorded stat jovial_chandrasekhar: 8528068 bytes
#+end_src

You can also throw src_bash[:exports code]{tee} in there to save the gnuplot file to make manual adjustments or to render in some other fashion:

#+begin_src bash
./graph_docker_memory.py | tee graph.gnuplot | gnuplot > graph.svg
#+end_src

* Output
The output from the above run would be:

[[./files/graph.svg]]

And the gnuplot source:

#+begin_src gnuplot
set terminal svg background '#FFFFFF'
set title 'Docker Memory Usage'
set xdata time
set timefmt '%s'
set format x '%tH:%tM:%tS'
# Please note this is in SI units (base 10), not IEC (base 2). So, for example, this would show a Gigabyte, not a Gibibyte.
set format y '%.0s%cB'
set datafile separator "|"
plot "-" using 1:2 title 'exciting\_bohr' with lines, "-" using 1:2 title 'jovial\_chandrasekhar' with lines
0|512000
4|512000
9|512000
13|4866441
18|3166699
23|3128950
27|3128950
32|3128950
35|32547799
40|4329570
e
0|528384
5|516096
9|561152
14|561152
18|561152
23|561152
28|8568963
32|8528068
37|8528068
40|8528068
45|8528068
e
#+end_src

* The script
#+begin_src python
#!/usr/bin/env python
from __future__ import annotations

import json
import logging
import re
import subprocess
from dataclasses import dataclass
from datetime import datetime
from time import sleep
from typing import Collection, Final, NewType, Tuple

ContainerId = NewType("ContainerId", str)
ContainerName = NewType("ContainerName", str)

SAMPLE_INTERVAL_SECONDS: Final[int] = 2


@dataclass
class Sample:
    instant: datetime
    stats: dict[ContainerId, Stats]


@dataclass
class Stats:
    memory_usage_bytes: int


def main():
    logging.basicConfig(level=logging.INFO)
    samples: list[Sample] = []
    labels: dict[ContainerId, ContainerName] = {}
    first_pass = True
    # First wait for any docker container to exist.
    while True:
        sample, labels_in_sample = take_sample()
        if labels_in_sample:
            break
        if first_pass:
            first_pass = False
            logging.info("Waiting for a docker container to exist to start recording.")
        sleep(1)
    # And then record memory until no containers exist.
    while True:
        sample, labels_in_sample = take_sample()
        if not labels_in_sample:
            break
        samples.append(sample)
        labels = {**labels, **labels_in_sample}
        sleep(SAMPLE_INTERVAL_SECONDS)
    if labels:
        # Draws a red horizontal line at 32 GiB since that is the memory limit for cloud run.
        write_plot(
            samples,
            labels,
            # horizontal_lines=[(32 * 1024**3, "red", "Cloud Run Max Memory")],
        )


def write_plot(
    samples: Collection[Sample],
    labels: dict[ContainerId, ContainerName],
    *,
    horizontal_lines: Collection[Tuple[int, str, str | None]] = [],
):
    starting_time_per_container = {
        container_id: min(
            (sample.instant for sample in samples if container_id in sample.stats)
        )
        for container_id in labels.keys()
    }
    print(
        """set terminal svg background '#FFFFFF'
set title 'Docker Memory Usage'
set xdata time
set timefmt '%s'
set format x '%tH:%tM:%tS'
# Please note this is in SI units (base 10), not IEC (base 2). So, for example, this would show a Gigabyte, not a Gibibyte.
set format y '%.0s%cB'
set datafile separator "|"
"""
    )
    for y_value, color, label in horizontal_lines:
        print(
            f'''set arrow from graph 0, first {y_value} to graph 1, first {y_value} nohead linewidth 2 linecolor rgb "{color}"'''
        )
        if label is not None:
            print(f"""set label "{label}" at graph 0, first {y_value} offset 1,-0.5""")
    # Include the horizontal lines in the range
    if len(horizontal_lines) > 0:
        print(f"""set yrange [*:{max(x[0] for x in horizontal_lines)}<*]""")
    line_definitions = ", ".join(
        [
            f""""-" using 1:2 title '{gnuplot_escape(name)}' with lines"""
            for container_id, name in sorted(labels.items())
        ]
    )
    print("plot", line_definitions)
    for container_id in sorted(labels.keys()):
        start_time = int(starting_time_per_container[container_id].timestamp())
        for sample in sorted(samples, key=lambda x: x.instant):
            if container_id in sample.stats:
                print(
                    "|".join(
                        [
                            str(int((sample.instant).timestamp()) - start_time),
                            str(sample.stats[container_id].memory_usage_bytes),
                        ]
                    )
                )
        print("e")


def gnuplot_escape(inp: str) -> str:
    out = ""
    for c in inp:
        if c == "_":
            out += "\\"
        out += c
    return out


def take_sample() -> Tuple[Sample, dict[ContainerId, ContainerName]]:
    labels: dict[ContainerId, ContainerName] = {}
    stats: dict[ContainerId, Stats] = {}
    docker_inspect = subprocess.run(
        ["docker", "stats",
"--no-stream", "--no-trunc", "--format", "json"], stdout=subprocess.PIPE, ) for container_stat in ( json.loads(l) for l in docker_inspect.stdout.decode("utf8").splitlines() ): if not container_stat["ID"]: # When containers are starting up, they sometimes have no ID and "--" as the name. continue labels[ContainerId(container_stat["ID"])] = ContainerName( container_stat["Name"] ) memory_usage = parse_mem_usage(container_stat["MemUsage"]) stats[ContainerId(container_stat["ID"])] = Stats( memory_usage_bytes=memory_usage ) for container_id, container_stat in stats.items(): logging.info( f"Recorded stat {labels[container_id]}: {container_stat.memory_usage_bytes} bytes" ) return Sample(instant=datetime.now(), stats=stats), labels def parse_mem_usage(mem_usage: str) -> int: parsed_mem_usage = re.match( r"(?P[0-9]+\.?[0-9]*)(?P[^\s]+)", mem_usage ) if parsed_mem_usage is None: raise Exception(f"Invalid Mem Usage: {mem_usage}") number = float(parsed_mem_usage.group("number")) unit = parsed_mem_usage.group("unit") for multiplier, identifier in enumerate(["B", "KiB", "MiB", "GiB", "TiB"]): if unit == identifier: return int(number * (1024**multiplier)) raise Exception(f"Unrecognized unit: {unit}") if __name__ == "__main__": main() #+end_src