airflow balancer

Utilities for tracking hosts and ports and load balancing DAGs

Overview

airflow-balancer is a utility library for Apache Airflow that tracks host and port usage via YAML files. It is tightly integrated with airflow-laminar/airflow-config.

With airflow-balancer, you can register host and port usage in configuration:

_target_: airflow_balancer.BalancerConfiguration
default_username: timkpaine
hosts:
  - name: host1
    size: 16
    os: ubuntu
    queues: [primary]

  - name: host2
    os: ubuntu
    size: 16
    queues: [workers]

  - name: host3
    os: macos
    size: 8
    queues: [workers]

ports:
  - host: host1
    port: 8080

  - host_name: host2
    port: 8793
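
A configuration like this is easy to sanity-check before handing it to Airflow. The following is a minimal sketch, not part of airflow-balancer: it mirrors the YAML above as a plain Python dict and uses a hypothetical helper, `find_config_problems`, to flag port entries that point at undeclared hosts or are registered twice. It assumes a port entry may name its host under either `host` or `host_name`, since both spellings appear in the example.

```python
from __future__ import annotations

from typing import Any

# Plain-dict mirror of the YAML example (not loaded through airflow-balancer).
config: dict[str, Any] = {
    "default_username": "timkpaine",
    "hosts": [
        {"name": "host1", "size": 16, "os": "ubuntu", "queues": ["primary"]},
        {"name": "host2", "size": 16, "os": "ubuntu", "queues": ["workers"]},
        {"name": "host3", "size": 8, "os": "macos", "queues": ["workers"]},
    ],
    "ports": [
        {"host": "host1", "port": 8080},
        {"host_name": "host2", "port": 8793},
    ],
}


def port_host_name(entry: dict[str, Any]) -> str | None:
    """Return the host a port entry refers to, accepting either key (assumption)."""
    return entry.get("host") or entry.get("host_name")


def find_config_problems(cfg: dict[str, Any]) -> list[str]:
    """Hypothetical helper: report dangling or duplicated port entries."""
    known = {h["name"] for h in cfg.get("hosts", [])}
    seen: set[tuple[str | None, int]] = set()
    problems: list[str] = []
    for entry in cfg.get("ports", []):
        key = (port_host_name(entry), entry["port"])
        if key[0] not in known:
            problems.append(f"port {key[1]} refers to unknown host {key[0]!r}")
        if key in seen:
            problems.append(f"port {key[1]} registered twice on {key[0]!r}")
        seen.add(key)
    return problems


# An empty list means every port entry points at a declared host exactly once.
problems = find_config_problems(config)
```

For the example configuration the helper finds nothing to report; a port entry naming an undeclared host would show up as a message in the returned list.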

Either via airflow-config or directly, you can then select amongst available hosts for use in your DAGs.

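from airflow.providers.ssh.operators.ssh import SSHOperator
from airflow_balancer import BalancerConfiguration, load

# Load the host and port registry from YAML
balancer_config: BalancerConfiguration = load("balancer.yaml")

# Pick a host serving the "workers" queue, then a free port on it
host = balancer_config.select_host(queue="workers")
port = balancer_config.free_port(host=host)

...

# Run the task on the selected host over SSH
operator = SSHOperator(ssh_hook=host.hook(), ...)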
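
To make the selection step concrete, here is a rough, standalone sketch of what filtering hosts by queue and finding an unused port could look like. It does not reproduce airflow-balancer's implementation: the `Host` dataclass, the `select_host` and `free_port` functions, the uniform random choice among matching hosts, and the port range are all assumptions made for illustration.

```python
from __future__ import annotations

import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Host:
    """Illustrative stand-in for a configured host (not the library's class)."""
    name: str
    size: int
    os: str
    queues: tuple[str, ...] = ()


def select_host(hosts, queue=None, os=None, rng=None) -> Host:
    """Pick one host matching the optional queue and os filters.

    Assumption: ties are broken uniformly at random.
    """
    candidates = [
        h for h in hosts
        if (queue is None or queue in h.queues) and (os is None or h.os == os)
    ]
    if not candidates:
        raise ValueError(f"no host matches queue={queue!r}, os={os!r}")
    return (rng or random).choice(candidates)


def free_port(used: set[int], start: int = 1024, stop: int = 65535) -> int:
    """Return the lowest port in [start, stop] that is not already registered."""
    for port in range(start, stop + 1):
        if port not in used:
            return port
    raise ValueError("no free port in the requested range")


hosts = [
    Host("host1", 16, "ubuntu", ("primary",)),
    Host("host2", 16, "ubuntu", ("workers",)),
    Host("host3", 8, "macos", ("workers",)),
]

worker = select_host(hosts, queue="workers")          # host2 or host3
mac_worker = select_host(hosts, queue="workers", os="macos")  # only host3 matches
```

Keeping the filter and the tie-break separate makes it straightforward to swap the random choice for another policy, such as preferring the host with the largest `size`.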

Visualization

Configuration, host, and port listings are built into the extension. They are available either from the top bar in Airflow or as a standalone viewer (via the airflow-balancer-viewer CLI).

https://raw.githubusercontent.com/airflow-laminar/airflow-balancer/refs/heads/main/docs/img/toolbar.png

https://raw.githubusercontent.com/airflow-laminar/airflow-balancer/refs/heads/main/docs/img/home.png

https://raw.githubusercontent.com/airflow-laminar/airflow-balancer/refs/heads/main/docs/img/hosts.png

Installation

You can install from pip:

pip install airflow-balancer

Or via conda:

conda install airflow-balancer -c conda-forge

License

This software is licensed under the Apache 2.0 license. See the LICENSE file for details.