Configuring Docker Compose

This page describes how to use the provided compose.yaml file to start AnyVar alongside its dependencies, and highlights some configuration options you can customize for your environment.

Overview

The compose file defines four main services:

  • seqrepo – provides a SeqRepo instance on a Docker volume

  • uta – UTA database in a PostgreSQL container

  • anyvar_db – PostgreSQL database for AnyVar

  • anyvar – the AnyVar web API, configured to use SeqRepo, UTA, and the AnyVar DB

It also defines three Docker volumes:

  • seqrepo_vol – storage for SeqRepo data

  • uta_vol – storage for the UTA PostgreSQL data directory

  • anyvar_vol – storage for the AnyVar PostgreSQL data directory

All three volumes are declared as external: true, so they must exist before you run docker compose up. For example:

docker volume create seqrepo_vol
docker volume create uta_vol
docker volume create anyvar_vol

SeqRepo Options

The compose file supports several patterns for providing a SeqRepo database to AnyVar, depending on whether you want Docker to manage the data or reuse an existing SeqRepo on your local filesystem.

Using the bundled SeqRepo container (default)

By default, the compose file runs a seqrepo container that mounts the seqrepo_vol volume at /usr/local/share/seqrepo:

seqrepo:
  image: biocommons/seqrepo:2024-12-20
  volumes:
    - seqrepo_vol:/usr/local/share/seqrepo

The anyvar service then mounts the same volume and points its configuration at the 2024-12-20 snapshot:

anyvar:
  volumes:
    - seqrepo_vol:/usr/local/share/seqrepo
  environment:
    - SEQREPO_INSTANCE_DIR=/usr/local/share/seqrepo/2024-12-20
    - SEQREPO_DATAPROXY_URI=seqrepo+file:///usr/local/share/seqrepo/2024-12-20

This is the simplest option if you are happy to let Docker manage the SeqRepo storage in an external volume.

Using a host SeqRepo directory directly

If you already have a SeqRepo database on your local filesystem and want AnyVar to use it directly (without a dedicated seqrepo container), you can bind mount your existing directory into the anyvar container.

  1. Comment out the seqrepo service in the compose file.

  2. Replace the volume mapping under anyvar:

    anyvar:
      volumes:
        - $SEQREPO_ROOT_DIR:/usr/local/share/seqrepo
    
  3. Export SEQREPO_ROOT_DIR before running compose, pointing it at the root of your local SeqRepo installation:

    export SEQREPO_ROOT_DIR=/path/to/my/seqrepo
    docker compose up
    
  4. Keep the environment variables as they are, or adjust if your SeqRepo snapshot directory name differs:

    environment:
      - SEQREPO_INSTANCE_DIR=/usr/local/share/seqrepo/2024-12-20
      - SEQREPO_DATAPROXY_URI=seqrepo+file:///usr/local/share/seqrepo/2024-12-20
    

Make sure the snapshot path (e.g. 2024-12-20) exists under $SEQREPO_ROOT_DIR and matches SEQREPO_INSTANCE_DIR.

Copying a host SeqRepo into a Docker volume

If you have a local SeqRepo and prefer to copy it into a Docker-managed volume (seqrepo_vol) rather than bind mounting your host directory each time, you can use the optional seqrepo_local_populator service.

This service is commented out by default. To enable this pattern:

  1. Comment out the seqrepo service.

  2. Uncomment seqrepo_local_populator.

  3. Adjust the rsync command if your snapshot directory name is different from 2024-12-20.

  4. Change the depends_on for anyvar to depend on seqrepo_local_populator instead of seqrepo:

    anyvar:
      depends_on:
        anyvar_db:
          required: true
          condition: service_started
        seqrepo_local_populator:
          required: true
          condition: service_completed_successfully
    
  5. Export SEQREPO_ROOT_DIR before running compose:

    export SEQREPO_ROOT_DIR=/path/to/my/seqrepo
    docker compose up
    

On startup, the seqrepo_local_populator container will copy your local SeqRepo snapshot into seqrepo_vol. The anyvar container will then use the volume just as in the default configuration.

UTA Service Configuration

The uta service runs a UTA PostgreSQL instance with its data directory on the uta_vol volume.

You must download uta_20241220.pgd.gz from https://dl.biocommons.org/uta/ using a web browser and move it to the root of the repository.

You can test that UTA is running and populated by using the example psql command in the compose file comments.

AnyVar Database Configuration

The anyvar_db service runs PostgreSQL and mounts anyvar_vol. If you expose the AnyVar API publicly, you must change these defaults to secure values.

Running the stack

After creating the external volumes and configuring any optional environment variables (such as SEQREPO_ROOT_DIR), you can start the stack with:

docker compose up

The AnyVar API will be available on http://127.0.0.1:8010 by default, under the ports mapping.