Auto-documenting Python modules as Confluence pages

Today I want to explore the documentation domain and, specifically, how to minimize the effort developers spend on maintaining module documentation. It’s not uncommon to create a README.md file for your repository that showcases the module usage. It is, however, a bit of a challenge to keep it relevant while the module evolves. And it is an even bigger challenge to ensure that other contributors also take that into consideration and submit documentation updates alongside their contributions.

As an automation developer, my first look at any task always involves assessing if this particular task can be automated and how much effort it’s going to take. And when approached with the idea of maintaining module documentation, I discovered that this part of the module lifecycle is, in fact, easily automated. Everyone, meet AutoDoc!

Let’s talk docstrings

Python already gives us an option to store documentation inside the code by using docstrings – strings that describe what the code does and how to call it. Since they are right there in the code, they are visible to both contributors and maintainers of said code.
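
To show that docstrings really do travel with the code, here is a minimal sketch that reads one back at runtime. The greet function is made up for illustration; inspect.getdoc is part of the standard library.

```python
import inspect


def greet(name):
    """Return a greeting for the given name.

    Args:
        name (str): who to greet
    """
    return f"Hello, {name}!"


# inspect.getdoc() returns the docstring with its indentation cleaned up
doc = inspect.getdoc(greet)
print(doc.splitlines()[0])
```

Documentation tools rely on this same kind of introspection: once the module can be imported, its docstrings can be read without any extra files.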

For our solution to recognize docstrings, they should follow one of the Python documentation standards: Google docstrings, reStructuredText, NumPy docstrings, or Epytext. For the purpose of this guide, we’ll be using the Google docstring format:

class MyClass(object):
    """Short description here.

    Args:
        input (str): first argument here
        num (int): some other arg

    Attributes:
        attribute (str): Attribute description 
        num (int): some other attribute
    """

If you want to learn more about docstrings, I’d recommend reading this complete Python documentation guide: Documenting Python code.

Sphinx as a service

“Sphinx is a tool that makes it easy to create intelligent and beautiful documentation”, says the Sphinx project home page, and I want to underline the intelligent part: not only does Sphinx do all the heavy lifting for you, it also adds some minor but very convenient improvements to the documentation, such as cross-references to module classes and table of contents trees.

Sphinx is a Python module that, among other things, reads Python docstrings and converts their content into formatted documentation pages. Sphinx supports several output formats (such as plain text and HTML) out of the box, and extensions can add support for other formats. In our case, we’ll be using a few extra modules to achieve our goal:

  • sphinxcontrib.napoleon – Google docstring support
  • sphinxcontrib.confluencebuilder – Confluence exporter

Let’s begin by installing the Sphinx packages and creating a configuration for your Python project:

pip install sphinx sphinxcontrib.napoleon sphinxcontrib.confluencebuilder
cd myPythonProject
mkdir docs && cd docs
sphinx-quickstart

Quickstart lets you quickly create a structure that supports the auto-documenting features. Before creating anything, it asks a few questions about the project to generate the configuration metadata.

As you can see in the screenshot above, Sphinx created a few files for us:

  • conf.py – the main Sphinx configuration file
  • index.rst – the index page of your future documentation solution
  • Makefile – a helper that simplifies the build commands

Sphinx configuration

For Sphinx to do what we need, we need to add some extra items to our conf.py configuration file. Firstly, we need to ensure that our module is imported in the configuration so that Sphinx can work with the objects from the module:

import os
import sys

sys.path.insert(0, os.path.abspath(".."))

import mymodulename
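
One caveat with the snippet above: os.path.abspath("..") is resolved against the current working directory, so it only points at the project root when the build is started from the docs folder. Here is a minimal sketch of an alternative that anchors the path to the location of conf.py instead. The project_root helper is a name I made up, and the sketch assumes conf.py sits one level below the project root.

```python
import sys
from pathlib import Path


def project_root(conf_file):
    """Return the directory one level above the folder that holds conf.py."""
    return Path(conf_file).resolve().parent.parent


# in conf.py this would be called as project_root(__file__)
root = project_root("docs/conf.py")
if str(root) not in sys.path:
    sys.path.insert(0, str(root))
```

This makes no difference when you run the build from docs, but it can save some head-scratching once a pipeline starts the build from another directory.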

Next, let’s add general project information:

project = "mymodulename"
copyright = "<license>"
author = "<Your name here>"
release = "<version string if you need it>"

Finally, we need to configure our ConfluenceBuilder extension to work with the Confluence instance of our choice:

# activating extensions; sphinx.ext.autodoc provides the automodule/autoclass directives used later
extensions = ["sphinx.ext.autodoc", "sphinxcontrib.napoleon", "sphinxcontrib.confluencebuilder"]

# supported source file types
source_suffix = [".rst"]

# Confluence general configuration
confluence_publish = True
confluence_server_url = "https://<server name here>.atlassian.net/wiki/"
# Confluence credentials are retrieved from environment variables in this example
confluence_server_user = os.environ.get("CONFLUENCE_USER")
confluence_server_pass = os.environ.get("CONFLUENCE_PASSWORD")

# publishing options:

# target Confluence Space: A short Space name (SERV) or a user id (~123456) if a personal Space is used
confluence_space_name = "ADOCS"
# target parent page: pages would be created as child pages of this parent page.
confluence_parent_page = "Python documentation"
# set parent page to the Space home page. Overrides confluence_parent_page
confluence_master_homepage = False
# purge all the child pages of the parent page that are not a part of this solution
confluence_purge = False
# purge starting from the solution's index page instead of parent page. Set to False if multiple root pages are defined.
confluence_purge_from_master = True
# keep page hierarchy as defined in toctree references
confluence_page_hierarchy = True
# maximum hierarchy depth
confluence_max_doc_depth = 2
# apply the following labels to all the published pages
confluence_global_labels = ["my-module-docs"]
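
Note that os.environ.get() quietly returns None when a variable is not set, so a missing credential would only surface later in the build. Here is a minimal sketch of a stricter lookup you could place in conf.py. The require_env helper is my own invention, not part of Sphinx or the Confluence builder.

```python
import os


def require_env(name, environ=os.environ):
    """Return the value of an environment variable or stop with a clear message."""
    value = environ.get(name)
    if not value:
        raise RuntimeError(f"{name} is not set; refusing to publish to Confluence")
    return value


# in conf.py (assumed usage):
# confluence_server_user = require_env("CONFLUENCE_USER")
# confluence_server_pass = require_env("CONFLUENCE_PASSWORD")
```

The environ parameter exists only to make the helper easy to try out with a plain dictionary.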

A few notes about the publishing configuration:

  • Enabling purge is the only way to go if you’re planning on removing or renaming documents (which is almost always the case). It is disabled in the example above so that you can make an educated decision on whether to enable it.
  • It’s a good idea to use a separate Confluence Space for your automated pages to protect your manual pages from an accidental purge due to misconfiguration.
  • You can always practice on your personal Space first.
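
If you would rather keep purge off for local experiments and switch it on only in the pipeline, one option is to drive the setting from an environment variable. This is a sketch under my own assumptions: the env_flag helper and the DOCS_PURGE variable name are both made up for illustration.

```python
import os


def env_flag(name, default=False, environ=os.environ):
    """Interpret an environment variable as an on/off switch."""
    raw = environ.get(name)
    if raw is None:
        return default
    return raw.strip().lower() in {"1", "true", "yes", "on"}


# in conf.py (assumed usage, DOCS_PURGE is a made-up variable name):
# confluence_purge = env_flag("DOCS_PURGE")
```

With this in place, a build without the variable never purges anything, which is the safer default.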

Defining page layout

We start with the index page: our index.rst file. Sphinx looks for that file first and builds the page hierarchy from it. The index page is where you put a general description of your module and a toctree of child pages. We can start with a single page describing module objects:

General
~~~~~~~
.. automodule:: mymodule.general
   :members:

Client
~~~~~~
.. autoclass:: mymodule.client.Client
   :members:

This definition extracts all docstrings from the module mymodule.general and the class mymodule.client.Client (and their members) and displays them on the index page. Aside from those two, autodoc offers several other directives, such as autofunction and automethod.

You can use a combination of toctree and autodoc directives to build your own document structure. For example, this is how you would add all rst files from the modules subfolder to the index:

Welcome to mymodule documentation!
==================================

Modules
----------

.. toctree::
    :maxdepth: 1
    :glob:

    modules/*

To create a customized page layout, you need to familiarize yourself with reStructuredText syntax. Thankfully, the autodoc directives abstract away most of the complexity, leaving you mostly with organizing the layout of your documentation.
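
If the modules subfolder is going to hold one page per module, the pages themselves can be generated as well. Here is a minimal sketch that writes one .rst stub per submodule using only the standard library. The render_stub and write_stubs names are hypothetical, and Sphinx ships its own sphinx-apidoc tool for the same purpose, so treat this as an illustration of the idea.

```python
import pkgutil
from pathlib import Path


def render_stub(module_name):
    """Build the reST source for one page documenting one module."""
    underline = "=" * len(module_name)
    return (
        f"{module_name}\n{underline}\n\n"
        f".. automodule:: {module_name}\n"
        f"   :members:\n"
    )


def write_stubs(package_name, package_path, out_dir):
    """Write one .rst stub per submodule found directly inside package_path."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = []
    for info in pkgutil.iter_modules([str(package_path)]):
        full_name = f"{package_name}.{info.name}"
        target = out / f"{info.name}.rst"
        target.write_text(render_stub(full_name), encoding="utf-8")
        written.append(target)
    return written
```

Run from the docs folder, a call such as write_stubs("mymodule", "../mymodule", "modules") would fill the modules subfolder that the toctree glob picks up.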

Building the solution

Once the RST pages are completed, you can build the solution. Since we have the Confluence builder extension activated, we can use confluence as the builder name. Run the following in the docs folder:

make confluence

This command renders our *.rst files into Confluence pages and publishes them according to the configuration parameters we set earlier.
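
If make is not available (on a Windows build agent, for example), the same build can be started through Python, since python -m sphinx is equivalent to the sphinx-build command. The sketch below is one way to wrap it. The build_command and build_docs helpers and the _build output folder layout are my assumptions.

```python
import subprocess
import sys


def build_command(source_dir=".", builder="confluence"):
    """Assemble the sphinx-build call as an argument list."""
    output_dir = f"{source_dir}/_build/{builder}"
    return [sys.executable, "-m", "sphinx", "-b", builder, source_dir, output_dir]


def build_docs(source_dir="."):
    """Run the build and raise CalledProcessError if Sphinx exits with an error."""
    subprocess.run(build_command(source_dir), check=True)
```

Calling build_docs() from the docs folder, or build_docs("docs") from the project root, would then run the confluence builder.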

Final steps

The last order of business would be to integrate the documentation build step into the module lifecycle pipeline – in my organization it would be a Jenkins job that publishes the module to our internal repository. By adding an extra step to the pipeline you’re ensuring that the documentation on Confluence is always up-to-date. Now you can simply add a link to your Confluence documentation to README.md and call it a day.
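
Since the published pages are only as good as the docstrings behind them, the pipeline could also check that new code is documented before it publishes anything. This is an optional extra, sketched with the standard ast module. The missing_docstrings helper is hypothetical and only looks at functions and classes whose names do not start with an underscore.

```python
import ast


def missing_docstrings(source):
    """Return names of public functions and classes that have no docstring."""
    missing = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            if not node.name.startswith("_") and ast.get_docstring(node) is None:
                missing.append(node.name)
    return missing


sample = '''
def documented():
    """Has a docstring."""

def undocumented():
    pass
'''
print(missing_docstrings(sample))
```

A pipeline step could run this over the module sources and fail the job when the returned list is not empty.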

Have fun playing with autodocs!