tomasfarias.dev

Sphinx Documentation on GitHub Pages using Poetry

Sphinx is the most widespread documentation tool I’ve seen used for Python projects. It can output to multiple formats, including HTML and PDF, and handle code and cross-references, and plenty of extensions are available on PyPI for more specific use cases.

But this post is not about the wonders of Sphinx, or the nuances of writing reStructuredText, as there is already plenty of documentation out there[1]. Instead, I’ll be focusing on my efforts over the last couple of weeks to automate a Sphinx documentation deployment pipeline, hosted on GitHub Pages, for a project that uses poetry as its dependency management tool.

The project that I’ll be using as an example is airflow-dbt-python, an Airflow operator I’ve written to work with dbt. Don’t worry, you don’t need to know what any of these tools are for to follow this post; it’s just the project I’ve been documenting. In the next sections I’ll dissect each part of the documentation pipeline, and you can check out the repo if you want to see the final product up and running.

Specifying the dependencies in poetry

Let’s start by adding the necessary dependencies to our pyproject.toml file (non-documentation specific dependencies have been omitted for clarity):

[tool.poetry.dependencies]
Sphinx = { version = "4.2.0", optional = true }
sphinx-rtd-theme = { version = "1.0.0", optional = true }
sphinxcontrib-napoleon = { version = "0.7", optional = true }

[tool.poetry.extras]
docs = ["Sphinx", "sphinx-rtd-theme", "sphinxcontrib-napoleon"]

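Two things are going on here. First, each documentation dependency is marked as optional, so a plain poetry install will not pull it in. Second, the three of them are grouped under an extra named docs.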

This way, developers working on the project may install the documentation dependencies by running:

poetry install --extras docs
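If you want to confirm that the docs extra really ended up in your environment, a small check along these lines can help. This is only a sketch: the import names in DOCS_MODULES are my assumption of what the three packages expose, and the helper name missing_modules is made up for this example.

```python
import importlib.util

# Import names assumed for the three packages in the docs extra.
DOCS_MODULES = ("sphinx", "sphinx_rtd_theme", "sphinxcontrib.napoleon")


def missing_modules(names=DOCS_MODULES):
    """Return the module names from `names` that cannot be found."""
    missing = []
    for name in names:
        try:
            spec = importlib.util.find_spec(name)
        except ModuleNotFoundError:
            # Raised when a parent package (e.g. sphinxcontrib) is absent.
            spec = None
        if spec is None:
            missing.append(name)
    return missing


if __name__ == "__main__":
    missing = missing_modules()
    if missing:
        print("Missing docs dependencies:", ", ".join(missing))
    else:
        print("All docs dependencies are importable.")
```

Saved to a file of your choosing, it should be run through poetry run python so that it inspects the environment poetry manages rather than your system Python.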

Writing our documentation with Sphinx

We will not be saying too much here, as there are plenty of resources on how to write good documentation and, in particular, on how to use Sphinx to do it[2]. I encourage you to go over the Sphinx quickstart guide if you want to get up and running to continue with this blog post. If you have used poetry to install Sphinx, as detailed in the previous section, remember you’ll need to run the Sphinx commands with poetry:

poetry run sphinx-quickstart
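The quickstart leaves you with a conf.py, which is plain Python. To actually use the theme and the napoleon extension we listed as dependencies, that file needs to mention them. The following is a minimal sketch and not the project’s real configuration: the project value and the choice to enable autodoc are assumptions on my part.

```python
# docs/conf.py: a minimal sketch, not the project's actual configuration.

# Project name shown in the generated pages (assumed for illustration).
project = "airflow-dbt-python"

extensions = [
    "sphinx.ext.autodoc",      # pull documentation from docstrings
    "sphinxcontrib.napoleon",  # parse Google/NumPy style docstrings
]

# Use the Read the Docs theme installed by the docs extra.
html_theme = "sphinx_rtd_theme"
```

Anything else the quickstart generated can stay as it is.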

Setting up GitHub Actions

To automate the deployment of our documentation, we’ll be using GitHub Actions. This is not strictly required, and other CI/CD vendors may work just as well or even better, but since my project is hosted on GitHub, I’m taking advantage of those free credits.

Here’s the full YAML, which should be dropped in .github/workflows/docs_pages.yaml:

name: Docs2Pages
on:
  push:
    tags: '*'
  pull_request:
    branches:
      - master

jobs:
  build-docs:
    runs-on: ubuntu-latest
    steps:
    - name: Checkout
      uses: actions/checkout@master
      with:
        fetch-depth: 0
    - uses: actions/setup-python@v2
      with:
        python-version: 3.9
    - uses: abatilo/[email protected]
    - name: install
      run: poetry install -E amazon -E docs
    - name: Build documentation
      run: |
        mkdir gh-pages
        touch gh-pages/.nojekyll
        cd docs/
        poetry run sphinx-build -b html . _build
        cp -r _build/* ../gh-pages/
    - name: Deploy documentation
      if: ${{ github.event_name == 'push' }}
      uses: JamesIves/[email protected]
      with:
        branch: gh-pages
        folder: gh-pages

Let’s now go over each relevant section:

on:
  push:
    tags: '*'
  pull_request:
    branches:
      - master

This is just configuring the action to run only when a tag is pushed, or when a pull request is made against the master branch of the repo. We want our documentation to be built and deployed whenever a new tag is pushed, but we also want to build the documentation when a pull request is opened to ensure the build succeeds.

jobs:
  build-docs:
    runs-on: ubuntu-latest
    steps:
    - name: Checkout
      uses: actions/checkout@master
      with:
        fetch-depth: 0
    - uses: actions/setup-python@v2
      with:
        python-version: 3.9
    - uses: abatilo/[email protected]
    - name: install
      run: poetry install -E amazon -E docs

The first three steps handle checking out the repo and setting up Python and poetry, and the fourth installs the package with its dependencies. You can check out the specific action we use to install poetry in its own repository. This action assumes we have already set up Python, which is why the steps run in this order.

In the last step, notice that we are running the poetry install command using the -E (same as --extras) flag to install our docs dependencies. This will install Sphinx and its dependencies as we specified before.

    - name: Build documentation
      run: |
        mkdir gh-pages
        touch gh-pages/.nojekyll
        cd docs/
        poetry run sphinx-build -b html . _build
        cp -r _build/* ../gh-pages/

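We finally get to the actual build step. There are several commands running here: mkdir gh-pages creates the folder we will later deploy; touch gh-pages/.nojekyll adds an empty file that tells GitHub Pages to skip its Jekyll processing, which would otherwise ignore the underscore-prefixed folders (such as _static) that Sphinx generates; cd docs/ moves into the folder holding our Sphinx sources; poetry run sphinx-build -b html . _build builds the HTML documentation into docs/_build; and cp -r _build/* ../gh-pages/ copies the result into the deployment folder.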
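If you’d like to reproduce this step on your own machine before pushing, the same sequence can be sketched in Python. The function names build_html and stage_site are made up for this example, and unlike the workflow it passes paths from the repo root instead of changing into docs/ first.

```python
import shutil
import subprocess
from pathlib import Path


def build_html(docs_dir: Path, build_dir: Path) -> None:
    """Run the same sphinx-build invocation the workflow uses."""
    subprocess.run(
        ["poetry", "run", "sphinx-build", "-b", "html",
         str(docs_dir), str(build_dir)],
        check=True,  # fail loudly if the build fails
    )


def stage_site(build_dir: Path, site_dir: Path) -> None:
    """Copy the built HTML into the folder that will be deployed."""
    site_dir.mkdir(parents=True, exist_ok=True)
    # Empty marker file: tells GitHub Pages not to run Jekyll.
    (site_dir / ".nojekyll").touch()
    # Equivalent of `cp -r _build/* ../gh-pages/`.
    shutil.copytree(build_dir, site_dir, dirs_exist_ok=True)
```

Calling build_html(Path("docs"), Path("docs/_build")) followed by stage_site(Path("docs/_build"), Path("gh-pages")) from the repo root should leave you with the same gh-pages folder the workflow produces. Note that dirs_exist_ok needs Python 3.8 or later.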

    - name: Deploy documentation
      if: ${{ github.event_name == 'push' }}
      uses: JamesIves/[email protected]
      with:
        branch: gh-pages
        folder: gh-pages

The last step in our pipeline deploys the documentation to GitHub Pages. The actual deployment involves committing the contents of the gh-pages folder to a branch that we have also named gh-pages. The action used for this step is documented in its own repository. If you check out the gh-pages branch of the repo you will see it looks nothing like the other branches: it contains only the documentation we built in the previous step.

One last thing to mention: we have an if-conditional to ensure the deployment step happens only on push events. Since our entire action only runs on tag pushes or pull requests, this means deployment will only run when we push tags, as we don’t support multiple documentation environments. The pipeline could be extended here to support test and/or development environments to deploy the documentation, but that’s beyond the scope of this post.
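To make the interplay between the triggers and the if-conditional concrete, here is a toy model of the decision in Python. It is only an illustration of the logic described above, not how GitHub evaluates workflows, and the function names are invented. It does reflect one detail of the '*' pattern as I understand it: it does not match tag names containing a slash.

```python
def workflow_runs(event_name: str, ref: str = "", base_branch: str = "") -> bool:
    """Model the `on:` section: tag pushes and pull requests against master."""
    if event_name == "push":
        prefix = "refs/tags/"
        if not ref.startswith(prefix):
            return False  # branch pushes do not trigger the workflow
        tag = ref[len(prefix):]
        return "/" not in tag  # '*' does not match a slash
    if event_name == "pull_request":
        return base_branch == "master"
    return False


def deploy_runs(event_name: str, ref: str = "", base_branch: str = "") -> bool:
    """Model the `if:` on the deploy step: only push events deploy."""
    return workflow_runs(event_name, ref, base_branch) and event_name == "push"


# A tag push builds and deploys; a pull request only builds.
print(workflow_runs("push", ref="refs/tags/v1.0.0"), deploy_runs("push", ref="refs/tags/v1.0.0"))
print(workflow_runs("pull_request", base_branch="master"), deploy_runs("pull_request", base_branch="master"))
```

The first line printed covers the tag push and the second the pull request, which matches the behaviour we wanted: every pull request checks the build, and only tags publish.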

Finishing by configuring our GitHub repo

If you have followed all the steps until now, and have pushed a tag, you’ll find yourself with a new gh-pages branch which contains all the documentation files built by Sphinx. However, nothing has happened yet as we have not told GitHub that we want to use GitHub Pages to host our documentation. This is a very simple task that we can accomplish by going into the Settings of our repo, under the Pages section and creating a Source pointing to our new gh-pages branch:

[Image: Setting up GitHub Pages in our repo]

And that’s it! After a few minutes our documentation should be up and running at a URL with the format https://<account>.github.io/<repo>/. GitHub will let us know when the site is available in the same Settings page:

[Image: Our GitHub Pages is ready]
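If you’d rather not keep refreshing the Settings page, a short script can build the expected URL and check whether it responds. This is a rough sketch: pages_url and is_live are names I made up, the account and repo you pass in are your own, and it assumes the default github.io address rather than a custom domain.

```python
from urllib.request import urlopen


def pages_url(account: str, repo: str) -> str:
    """Build the default GitHub Pages URL for a project site."""
    return f"https://{account}.github.io/{repo}/"


def is_live(url: str, timeout: float = 10.0) -> bool:
    """Return True if the URL answers with HTTP 200."""
    try:
        with urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except OSError:
        # Covers HTTP errors (such as 404), DNS failures and timeouts.
        return False
```

Calling is_live(pages_url("<account>", "<repo>")) with your own values should start returning True once GitHub has published the site.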

Final thoughts

The idea for this post came to me as all of the guides and other posts I could find regarding deploying Sphinx documentation to GitHub Pages were not using poetry as their package management tool. Since adapting those guides gave me a few headaches, I wanted to help other people smooth out the process with my own guide. I’m hoping at least to have clearly explained the poetry-specific parts of the setup and would like to leave you with some ideas on what to do next:


  1. Like the Sphinx docs themselves: https://www.sphinx-doc.org/en/master/contents.html. Check them out!

  2. Here’s a couple of said resources:


#python   #workflow   #poetry   #sphinx