Integrating Confluence Documents into Sphinx

I use Sphinx to generate documentation for my Python code. I like the idea of having documentation right alongside source code as it encourages me to document as I go. And there’s something satisfying about seeing all of your work outlined in an easy-to-read format.

Today at work I was writing documentation for a customer and ran into an interesting challenge. All of my documentation on the code and how it worked was in RST documents for Sphinx, but all of my documentation on how the code was supposed to work (e.g. user stories, design specifications, etc.) was in Confluence. I wanted to pull them all together into a single, comprehensive document which I could hand to the customer and say “here’s everything about how the tool is supposed to work, how it actually works, and how to use it”.

My initial thought was to use the sphinxcontrib-confluencebuilder package, a Sphinx extension that translates Sphinx documents into the Confluence wiki format. I had previously used it to push some JavaScript documentation to our Confluence server and it worked fairly well. However, the package does not support the new Confluence .adf wiki format, and because all my user stories and design specs were in this new format, I wasn’t thrilled about pushing a bunch of older-looking technical docs alongside them. Not only that, but I found the PDFs from Sphinx significantly easier to read than the ones generated from Confluence.

Solution

I remembered that Confluence has a RESTful API which you can use to query page content. Not only can you query it, but you can get the content itself in its pure HTML glory. This is especially nice because converting HTML to RST is easy (and reliable) with Pandoc.

So I figured I’d try:

Query the pages I needed using the Confluence REST API
Convert the HTML of those pages into RST using Pandoc
Dump the files into my Sphinx source/ directory
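
Before pointing anything at a real server, the extraction and conversion steps can be tried offline. The snippet below is only a sketch: the JSON is a made-up, trimmed-down stand-in for an API response (a real one carries many more fields), and html_to_rst is a name I picked for illustration. It assumes jq and pandoc are installed.

```shell
#!/bin/bash
# Offline dry run of the extract-and-convert steps: no Confluence server
# or credentials needed.

# Hypothetical, trimmed-down response; a real one has many more fields.
sample_response='{"results":[{"title":"User Story 1","body":{"view":{"value":"<h2>Overview</h2><p>The tool <strong>must</strong> export CSV.</p>"}}}]}'

# Read an API response on stdin, pull out the rendered HTML of the first
# result, and write the RST conversion to stdout.
html_to_rst() {
    jq -r '.results[0].body.view.value' | pandoc --wrap=none -f html -t rst
}

if command -v jq > /dev/null && command -v pandoc > /dev/null; then
    printf '%s' "$sample_response" | html_to_rst
else
    echo "jq and pandoc are required for this dry run" >&2
fi
```

If the RST printed here looks reasonable, the same pipeline should behave the same way on the HTML that Confluence returns.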

Luckily this was all easy to do in a quick Bash script:

prepare.sh:

#!/bin/bash

# Bail out early if a required tool is missing
for cmd in curl jq pandoc; do
    if ! command -v "$cmd" &> /dev/null
    then
        echo "$cmd not installed; install and run again" >&2
        exit 1
    fi
done

function pull_confluence_page {
    local space="$1"
    local title="$2"
    local outfile="$3"

    # URL-encode the space key and title for the query string
    local space_url title_url
    space_url=$(printf %s "$space" | jq -sRr @uri)
    title_url=$(printf %s "$title" | jq -sRr @uri)

    # Fetch the page, extract its rendered HTML, and convert it to RST
    curl -u "$CONFLUENCE_USERNAME:$CONFLUENCE_API_KEY" \
        -X GET \
        "https://kgoettler.atlassian.net/wiki/rest/api/content?spaceKey=${space_url}&title=${title_url}&expand=body.view" \
        --header 'Accept: application/json' \
        --header 'Content-Type: application/json' \
        | jq -r '.results[0].body.view.value' \
        | pandoc --wrap=none -f html -t rst -o "$outfile"

    # Prepend the title of the page as a chapter header to the RST file.
    # ${#title} is the title length without the trailing newline that
    # `echo | wc -c` would count.
    local title_underline
    title_underline=$(printf "%${#title}s" | sed 's/ /*/g')
    printf '%s\n%s\n\n' "$title" "$title_underline" | cat - "$outfile" > temp && mv temp "$outfile"
}

pull_confluence_page "Space 1" "User Story 1" source/user_story_1.rst
pull_confluence_page "Space 1" "User Story 2" source/user_story_2.rst
...
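
The calls at the end of the script repeat a pattern: each output file is named after its page title. As a sketch only, a small helper could derive the path from the title. The name title_to_outfile is hypothetical and not part of the script above, and it assumes lowercase names joined by underscores under source/ are what you want.

```shell
#!/bin/bash

# Hypothetical helper: map a page title to an RST path under source/.
# Assumes the title contains at least one letter or digit.
title_to_outfile() {
    local slug
    # Lowercase the title, then squeeze every run of non-alphanumeric
    # characters into a single underscore
    slug=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]' | tr -cs '[:alnum:]' '_')
    # Trim a leading or trailing underscore left behind by punctuation
    slug="${slug#_}"
    slug="${slug%_}"
    printf 'source/%s.rst\n' "$slug"
}

title_to_outfile "User Story 1"   # prints source/user_story_1.rst
```

Inside prepare.sh, the third argument could then be written as "$(title_to_outfile "User Story 1")" instead of being typed out by hand.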

With this script written, all I had to do was call it from my Sphinx Makefile (I added a condition where the Confluence documents are only fetched if I’m actually building the docs):

%: Makefile
	@if [ "$@" != "clean" ]; then\
		bash ./prepare.sh;\
	fi
	@$(SPHINXBUILD) -M $@ "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)

and make sure another file in my source/ folder linked to them:

designs.rst:

Design Documents
================

User stories and design documents for all tools may be found at the links below

.. toctree::
   :maxdepth: 2
   
   user_story_1
   user_story_2
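
I maintained designs.rst by hand, but if the list of pulled pages grows, the index could be generated as well. The following is a rough sketch under assumptions of my own: write_toctree is a hypothetical function, and it assumes every pulled page is saved as user_story_*.rst in the directory passed to it. Entries come out in the shell's glob order, so user_story_10 would sort before user_story_2.

```shell
#!/bin/bash

# Hypothetical helper: print an index page whose toctree lists every
# user_story_*.rst file found in the given directory.
write_toctree() {
    local source_dir="$1"
    local f
    printf 'Design Documents\n================\n\n'
    printf 'User stories and design documents for all tools may be found at the links below\n\n'
    printf '.. toctree::\n   :maxdepth: 2\n\n'
    for f in "$source_dir"/user_story_*.rst; do
        # Skip the unexpanded pattern when nothing matches
        [ -e "$f" ] || continue
        f="${f##*/}"                 # strip the directory
        printf '   %s\n' "${f%.rst}" # strip the extension
    done
}

write_toctree source
```

Redirecting the output to source/designs.rst at the end of prepare.sh would keep the index in step with whatever pages were fetched.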

Results

Obviously I can’t share the documents I produced for the customer, but I can say the results were surprisingly good. Pandoc did a fantastic job converting the Confluence HTML to RST. After building everything into a PDF document with Sphinx, the Confluence-derived pages are indistinguishable from those derived from my handwritten RST.

If you have Confluence docs that you want to integrate with Sphinx documents, give the above a try.