Integrating Confluence Documents into Sphinx
I use Sphinx to generate documentation for my Python code. I like the idea of having documentation right alongside source code as it encourages me to document as I go. And there’s something satisfying about seeing all of your work outlined in an easy-to-read format.
Today at work I was working on some documentation for a customer and faced an interesting challenge. All of my documentation on the code and how it worked was in RST documents for Sphinx, but all of my documentation on how the code was supposed to work (e.g. user stories, design specifications, etc.) was in Confluence. I wanted to pull them all together into a single, comprehensive document which I could hand to the customer and say “here’s everything about how the tool is supposed to work, how it actually works, and how to use it”.
My initial thought was to use the sphinxcontrib-confluencebuilder package,
which is a Sphinx extension that translates Sphinx documents into the Confluence
wiki format. I had previously used it to push some JavaScript documentation to
our Confluence server and it worked fairly well. However, the package does not
support the new Confluence .adf wiki format, and because all my user stories and
design specs were in this new format I wasn’t thrilled about pushing a bunch of
older looking technical docs alongside them. Not only that, but I found the PDFs
from Sphinx significantly easier to read than the ones generated from
Confluence.
Solution
I remembered Confluence has a RESTful API which you can use to query page content. And, not only can you query it, but you can get the content itself in its pure HTML glory. This is especially nice because converting HTML to RST is easy (and reliable) with Pandoc.
So I figured I’d try:
- Querying the pages I needed using the Confluence REST API
- Converting the HTML of those pages into RST using Pandoc
- Dumping the files into my Sphinx source/ directory
Luckily this was all easy to do in a quick Bash script:
prepare.sh:
#!/bin/bash

# Abort early if a required tool is missing
for cmd in curl jq pandoc; do
    if ! command -v "$cmd" &> /dev/null; then
        echo "$cmd not installed; install and run again" >&2
        exit 1
    fi
done

function pull_confluence_page {
    local space="$1"
    local title="$2"
    local outfile="$3"

    # Encode title for URL
    local title_url
    title_url=$(printf %s "$title" | jq -sRr @uri)

    # Fetch the rendered HTML of the page and convert it to RST
    curl -u "$CONFLUENCE_USERNAME:$CONFLUENCE_API_KEY" \
        -X GET \
        "https://kgoettler.atlassian.net/wiki/rest/api/content?spaceKey=${space}&title=${title_url}&expand=body.view" \
        --header 'Accept: application/json' \
        --header 'Content-Type: application/json' \
        | jq -r '.results[0].body.view.value' \
        | pandoc --wrap=none -f html -o "$outfile"

    # Prepend the title of the page as a chapter header to the RST file
    # (the underline must be at least as long as the title)
    local title_underline
    title_underline=$(printf "%${#title}s" | sed 's/ /*/g')
    printf '%s\n%s\n\n' "$title" "$title_underline" | cat - "$outfile" > temp && mv temp "$outfile"
}

pull_confluence_page "Space 1" "User Story 1" source/user_story_1.rst
pull_confluence_page "Space 1" "User Story 2" source/user_story_2.rst
...
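The script reads the credentials for curl from the CONFLUENCE_USERNAME and CONFLUENCE_API_KEY environment variables, so a build with either one unset would only fail once the request is made. Below is a minimal sketch of a guard that could run before the first fetch. The function name check_confluence_credentials is hypothetical and not part of the original script.

```shell
#!/bin/bash
# Hypothetical guard (not in the original script): report which of the
# credentials used by curl are unset or empty.
# Returns 0 when both are present, 1 otherwise.
check_confluence_credentials() {
    local status=0
    if [ -z "${CONFLUENCE_USERNAME:-}" ]; then
        echo "CONFLUENCE_USERNAME is not set" >&2
        status=1
    fi
    if [ -z "${CONFLUENCE_API_KEY:-}" ]; then
        echo "CONFLUENCE_API_KEY is not set" >&2
        status=1
    fi
    return "$status"
}
```

Calling `check_confluence_credentials || exit 1` just above the first pull_confluence_page line would stop the build with a readable message instead of an authentication error from the API.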
With this script written, all I had to do was call it from my Sphinx Makefile (I added a condition where the Confluence documents are only fetched if I’m actually building the docs):
%: Makefile
	@if [ "$@" != "clean" ]; then \
		bash ./prepare.sh; \
	fi
	@$(SPHINXBUILD) -M $@ "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)
and make sure another file in my source/ folder linked to them:
designs.rst:
Design Documents
================

User stories and design documents for all tools may be found at the links below

.. toctree::
   :maxdepth: 2

   user_story_1
   user_story_2
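With more than a handful of pages, keeping the toctree in sync by hand gets tedious. As a rough sketch, the index could be generated from whatever files were fetched. This assumes the fetched files follow the user_story_*.rst naming used above. The function name write_designs_index is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper (not in the original post): write designs.rst into the
# given directory, with a toctree entry for each user_story_*.rst found there.
write_designs_index() {
    local srcdir="$1"
    local f
    {
        printf 'Design Documents\n================\n\n'
        printf 'User stories and design documents for all tools may be found at the links below\n\n'
        printf '.. toctree::\n   :maxdepth: 2\n\n'
        for f in "$srcdir"/user_story_*.rst; do
            # Skip the literal pattern when no files match
            [ -e "$f" ] || continue
            # toctree entries are document names without the .rst suffix
            printf '   %s\n' "$(basename "$f" .rst)"
        done
    } > "$srcdir/designs.rst"
}
```

Running `write_designs_index source` at the end of prepare.sh would regenerate the index on every build, so this only makes sense if designs.rst has no hand-written content worth keeping.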
Results
Obviously I can’t share the documents I produced for the customer, but I can say the results were surprisingly good. Pandoc did a fantastic job converting the Confluence HTML to RST. After building everything into a PDF document with Sphinx, the Confluence-derived pages are indistinguishable from those derived from my handwritten RST.
If you have Confluence docs that you want to integrate with Sphinx documents, give the above a try.