Turning a list of dicts into a reStructuredText table

I recently found myself having to prepare a report of some mortgage calculations so that non-technical domain experts could read it, evaluate it, and tell me whether my math and the way I was using certain APIs were correct.

Since I’m using Python, I decided to go as native as possible and make my little script generate a reStructuredText file that I would then convert into HTML, PDF, or whatever else was needed. The results of certain calculations ended up looking like a data table expressed as a list of dicts, all with the same keys. I wrote a function that turns that list of dicts into an appropriately formatted reStructuredText grid table.

For example, given this data:

creators = [{"name": "Guido van Rossum", "language": "Python"}, 
            {"name": "Alan Kay", "language": "Smalltalk"},
            {"name": "John McCarthy", "language": "Lisp"}]

when you call it with:

dict_to_rst_table(creators)

it produces:

+------------------+-----------+
| name             | language  |
+==================+===========+
| Guido van Rossum | Python    |
+------------------+-----------+
| Alan Kay         | Smalltalk |
+------------------+-----------+
| John McCarthy    | Lisp      |
+------------------+-----------+

The full code for this is:

from collections import defaultdict
from io import StringIO


def dict_to_rst_table(data):
    """Render a list of dicts that share the same keys as an RST grid table."""
    field_names, column_widths = _get_fields(data)
    with StringIO() as output:
        output.write(_generate_header(field_names, column_widths))
        for row in data:
            output.write(_generate_row(row, field_names, column_widths))
        return output.getvalue()


def _generate_header(field_names, column_widths):
    with StringIO() as output:
        for field_name in field_names:
            output.write(f"+-{'-' * column_widths[field_name]}-")
        output.write("+\n")
        for field_name in field_names:
            output.write(f"| {field_name} {' ' * (column_widths[field_name] - len(field_name))}")
        output.write("|\n")
        for field_name in field_names:
            output.write(f"+={'=' * column_widths[field_name]}=")
        output.write("+\n")
        return output.getvalue()


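def _generate_row(row, field_names, column_widths):
    """Render one data row plus the border below it.

    Every row must contain every field; a missing key raises KeyError.
    """
    with StringIO() as output: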
        for field_name in field_names:
            output.write(f"| {row[field_name]}{' ' * (column_widths[field_name] - len(str(row[field_name])))} ")
        output.write("|\n")
        for field_name in field_names:
            output.write(f"+-{'-' * column_widths[field_name]}-")
        output.write("+\n")
        return output.getvalue()


def _get_fields(data):
    """Return field names in first-seen order and the width of each column."""
    field_names = []
    column_widths = defaultdict(int)
    for row in data:
        for field_name in row:
            if field_name not in field_names:
                field_names.append(field_name)
            # A column must fit its header and the longest value seen so far.
            column_widths[field_name] = max(
                column_widths[field_name],
                len(field_name),
                len(str(row[field_name])),
            )
    return field_names, column_widths
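
The part that does the real work is the width calculation in `_get_fields`: each column is as wide as its longest entry, header included, and the borders add one space of padding on either side. Here is a standalone sketch of that idea for the example data (the `widths` and `border` names are only for illustration and are not part of the code above):

```python
creators = [{"name": "Guido van Rossum", "language": "Python"},
            {"name": "Alan Kay", "language": "Smalltalk"},
            {"name": "John McCarthy", "language": "Lisp"}]

# Width of a column = longest of the header and every value in that column.
widths = {
    key: max(len(key), *(len(str(row[key])) for row in creators))
    for key in creators[0]
}
print(widths)  # {'name': 16, 'language': 9}

# Each border segment is the column width plus two padding characters.
border = "+" + "+".join("-" * (w + 2) for w in widths.values()) + "+"
print(border)
```

The printed border is the same line that opens and closes the table in the example output.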

Feel free to use it as you see fit, and if you’d like this to be a nicely tested, reusable pip package, let me know and I’ll turn it into one. Before that, I would need to make it more robust to malformed data (a row that is missing a key currently raises a KeyError) and have it handle data that is shaped differently.
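
As a rough idea of what a more forgiving version could look like, here is a minimal sketch that tolerates rows with different keys and empty input. The function name `dicts_to_rst_table` and the `missing` parameter are hypothetical, not part of the code above, and cell values containing newlines would still break the table.

```python
def dicts_to_rst_table(rows, missing=""):
    """Render a list of dicts as an RST grid table.

    Hypothetical variant: rows may have different keys, absent cells
    are filled with `missing`, and empty input yields an empty string.
    """
    # Collect field names in first-seen order across all rows.
    field_names = list(dict.fromkeys(key for row in rows for key in row))
    if not field_names:
        return ""

    def cell(row, name):
        # Fall back to the placeholder instead of raising KeyError.
        return str(row.get(name, missing))

    widths = {
        name: max(len(str(name)), *(len(cell(row, name)) for row in rows))
        for name in field_names
    }

    def border(char):
        return "+" + "+".join(char * (widths[n] + 2) for n in field_names) + "+"

    def line(values):
        padded = (v.ljust(widths[n]) for n, v in zip(field_names, values))
        return "| " + " | ".join(padded) + " |"

    out = [border("-"), line([str(n) for n in field_names]), border("=")]
    for row in rows:
        out.append(line([cell(row, n) for n in field_names]))
        out.append(border("-"))
    return "\n".join(out) + "\n"


# The second row has no "language" key; its cell is left blank.
print(dicts_to_rst_table([{"name": "Alan Kay", "language": "Smalltalk"},
                          {"name": "John McCarthy"}]))
```

Building the lines in a list and joining them at the end is just one option; the `StringIO` approach used in the original works equally well.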

If I turn it into a pip package, it would be released by Eligible, as I wrote this code while working there and we are happy to contribute to open source.
