As part of a book project with O'Reilly, I wanted to trigger and monitor the build process that converts AsciiDoc files into PDF, HTML, etc. O'Reilly has a production system called Atlas that allows users to trigger builds through a UI, but I wanted to do this via their JSON API instead.

The bash script I ended up with was built on top of an elegant blog post by Kees C. Bakker:

#! /bin/bash

# load atlas credentials
. .env

job_id=$(curl -X POST \
        -F "project=oreillymedia/secret-project" \
        -F "branch=main" \
        -F "formats=pdf" \
        -F "auth_token=$auth_token" \
        https://atlas.oreilly.com/api/builds -s | jq ".id");

printf "\nSent job to Atlas with ID $job_id\n"

build_url="https://atlas.oreilly.com/api/builds/$job_id\?auth_token=$auth_token"
atlas_url="https://atlas.oreilly.com/oreillymedia/project-name"
interval_in_seconds=5
status_path=".status[0].status"
download_path=".status[0].download_url"

printf "\nPolling '${build_url%\?*}' every $interval_in_seconds seconds, until 'complete'\n"

while true;
do
    status=$(curl $build_url -s | jq $status_path);
    printf "\r$(date +%H:%M:%S): $status";
    if [[ "$status" == "\"complete\"" || "$status" == "\"failed\"" ]]; then
        if [[ "$status" == "\"failed\"" ]]; then
            printf "\n Build failed! View logs at $atlas_url"
        else
            download_url=$(curl $build_url -s | jq $download_path);
            printf "\nBuild complete! Download URL: $download_url";
        fi
        break;
    fi;
    sleep $interval_in_seconds;
done

This script performs two tasks:

  • Trigger a PDF build
  • Poll the endpoint until the build is "complete" or "failed", then return URL

We use jq to pick out various fields of interest from the JSON responses, e.g. the build request

curl -X POST \      
        -F "project=oreillymedia/secret-project" \
        -F "branch=main" \
        -F "formats=pdf" \
        -F "auth_token=$auth_token" \
        https://atlas.oreilly.com/api/builds

returns something like

{
    "id": 308588,
    "branch": "main",
    "created_at": "2021-01-07T21:40:15.655Z",
    "project": "oreillymedia/secret-project",
    "clone_url": null,
    "build_url": "/api/builds/308588",
    "status": [
        {
            "id": 363930,
            "format": "pdf",
            "download_url": null,
            "status": "queued",
            "message": null
        }
    ],
    "user": {
        "id": 1234,
        "nickname": "foobar",
        "avatar": "https://secure.gravatar.com/avatar/123456"
    }
}

so we can get the ID by just using jq ".id". Ditto for the response to the /api/builds endpoint which returns JSON of the form:

{
    "id": 308595,
    "branch": "main",
    "created_at": "2021-01-07T21:55:04.381Z",
    "project": "oreillymedia/secret-project",
    "clone_url": null,
    "build_url": "/api/builds/308595",
    "status": [
        {
            "id": 363938,
            "format": "pdf",
            "download_url": "...",
            "status": "complete",
            "message": {
                "debug": [],
                "info": [
                    "Building web PDF",
                    ...
                ],
                "warn": [],
                "error": []
            }
        }
    ],
    "user": {
        "id": 1234,
        "nickname": "foobar",
        "avatar": "https://secure.gravatar.com/avatar/123456"
    }
}

In this case the parent status field is an array, hence the need to pick out the first element with status[0].