Danny Brown

A Blog on Code and Occasionally Other Things

Collecting Merge Field Names from Multiple Word Documents Using Python

Danny Brown, December 21, 2018

I often need to generate documents that contain a lot of canned language. Rather than create them manually, I’ve been using Python to automate as much of the work as possible.

Creating Merge Fields in a Word Document

The first thing you need is a document with merge fields. To add a merge field, go to the Insert tab in your Word document, then select the Quick Parts drop-down menu. Here, select the Field... option.

In this menu, scroll down to the field name called MergeField. Select it. (Pro tip: two clicks below the scrollbar will take you down far enough.)

You don’t need to worry about the format or field options. Just give your merge field a name and click OK.

For this example, I’ve created two separate documents, doc1.docx and doc2.docx, each containing three merge fields.

Selecting Merge Fields from Word Documents Using Python

A lot of the time, I need to populate several documents at once and I don’t know which merge fields are in which document. Python can solve that really quickly. Start by making sure you’ve installed the docx-mailmerge package, which is going to do most of the work for us.

pip install docx-mailmerge

Then use this function:

import glob
import mailmerge

def get_merge_fields_from_all_documents():
    results = []
    for file_name in glob.glob("*.docx"):
        with mailmerge.MailMerge(file_name) as document:
            merge_fields = list(document.get_merge_fields())
            results.append(merge_fields)
    return results

list_of_lists = get_merge_fields_from_all_documents()
# [['CityStateZip', 'Name', 'Address'], ['PurchasePrice', 'ShipDate', 'InvoiceNumber']]          

Let’s break down what’s going on here.

import glob
import mailmerge

def get_merge_fields_from_all_documents():
    results = []

First, we’re importing the modules we’re using, declaring our function, and creating an empty list to hold our results.

for file_name in glob.glob("*.docx"):

glob is a handy module in the Python Standard Library that finds path names matching a pattern. Because "*.docx" is a relative pattern, it is matched against the current working directory (i.e., the directory where we ran our script). Here we’re using it with a for loop to iterate through each .docx file in that directory.
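To see the matching in isolation, here is a small sketch that runs against a throwaway directory instead of your real files. The file names are made up for illustration. Two details are worth knowing: glob.glob() makes no promise about the order of its results, and Word leaves temporary lock files whose names start with ~$ next to any open document, which also match "*.docx". The sketch sorts the matches and skips the lock files.

```python
import glob
import os
import tempfile

# Build a throwaway directory so the example does not depend on your files.
with tempfile.TemporaryDirectory() as folder:
    for name in ("doc2.docx", "doc1.docx", "~$doc1.docx", "notes.txt"):
        open(os.path.join(folder, name), "w").close()

    # An absolute pattern lets us search a directory other than the current one.
    pattern = os.path.join(folder, "*.docx")

    # Sort for a stable order, then drop Word's "~$" lock files.
    matches = sorted(glob.glob(pattern))
    file_names = [
        os.path.basename(path)
        for path in matches
        if not os.path.basename(path).startswith("~$")
    ]

print(file_names)  # ['doc1.docx', 'doc2.docx']
```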

with mailmerge.MailMerge(file_name) as document:

Next, we’re using the mailmerge package we installed earlier to open each document the for loop selects. The with statement closes the document for us when the block ends.

merge_fields = list(document.get_merge_fields())

Here, we’re using mailmerge’s get_merge_fields() method, which returns the name of each merge field in the document. It normally returns the data in a set, but here we’re passing that set to the built-in list() function to get a list instead.
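One thing to keep in mind: a set has no defined order, so list() can hand the names back in a different order from one run to the next. If you want repeatable output, sorted() is a drop-in alternative. The snippet below is only a sketch. It uses a hand-written set as a stand-in for what get_merge_fields() would return, so it runs without any Word documents.

```python
# Stand-in for the set that get_merge_fields() returns for one document.
merge_fields = {"Name", "Address", "CityStateZip"}

as_list = list(merge_fields)      # same names, but the order is not guaranteed
as_sorted = sorted(merge_fields)  # same names, always in alphabetical order

print(as_sorted)  # ['Address', 'CityStateZip', 'Name']
```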

results.append(merge_fields)

Now we just need to append this list to the master list. After that, we return results and have a handy list of lists containing all the merge field names in our documents.

list_of_lists = get_merge_fields_from_all_documents()
# [['CityStateZip', 'Name', 'Address'], ['PurchasePrice', 'ShipDate', 'InvoiceNumber']]
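A list of lists doesn’t tell you which fields came from which file. If that matters, one possible variation is to return a dictionary keyed by file name. The sketch below is my own variation, not part of docx-mailmerge: the names read_merge_fields, get_merge_fields_by_document, and the reader parameter are hypothetical. The demo at the bottom swaps in a fake reader and empty placeholder files, and it assumes which fields live in which document, so it runs without the package installed.

```python
import glob
import os
import tempfile


def read_merge_fields(file_name):
    """Return the merge field names in one document, sorted alphabetically."""
    # Imported here so the rest of the sketch runs without docx-mailmerge.
    from mailmerge import MailMerge

    with MailMerge(file_name) as document:
        return sorted(document.get_merge_fields())


def get_merge_fields_by_document(pattern="*.docx", reader=read_merge_fields):
    """Map each matching file name to its list of merge field names."""
    return {
        os.path.basename(path): reader(path)
        for path in sorted(glob.glob(pattern))
    }


# Demo data: an assumed split of the six field names across the two files.
fake_fields = {
    "doc1.docx": ["Address", "CityStateZip", "Name"],
    "doc2.docx": ["InvoiceNumber", "PurchasePrice", "ShipDate"],
}


def fake_reader(path):
    """Hypothetical stand-in for read_merge_fields(), used only in this demo."""
    return fake_fields[os.path.basename(path)]


with tempfile.TemporaryDirectory() as folder:
    # Empty placeholder files are enough, since the fake reader never opens them.
    for name in fake_fields:
        open(os.path.join(folder, name), "w").close()
    fields_by_document = get_merge_fields_by_document(
        os.path.join(folder, "*.docx"), reader=fake_reader
    )

print(fields_by_document)
```

With real documents you would call get_merge_fields_by_document() with no arguments from the directory that holds your .docx files.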

And if you prefer just one list with all your merge field names, a simple list comprehension will get you there:

one_list = [item for sublist in list_of_lists for item in sublist]
# ['CityStateZip', 'Name', 'Address', 'PurchasePrice', 'ShipDate', 'InvoiceNumber']
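If the same field name shows up in more than one document, the flattened list will contain duplicates. Here is a sketch of one way to handle that, using itertools.chain.from_iterable to flatten and a set to de-duplicate. The extra 'Name' entry in the second sublist is a made-up addition to show the effect.

```python
from itertools import chain

# Same shape as list_of_lists above, with a hypothetical repeated 'Name' field.
list_of_lists = [
    ["CityStateZip", "Name", "Address"],
    ["PurchasePrice", "ShipDate", "InvoiceNumber", "Name"],
]

# Flatten without a nested comprehension; duplicates are kept at this point.
one_list = list(chain.from_iterable(list_of_lists))

# Drop duplicates and sort so the result is the same on every run.
unique_fields = sorted(set(one_list))

print(unique_fields)
# ['Address', 'CityStateZip', 'InvoiceNumber', 'Name', 'PurchasePrice', 'ShipDate']
```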

In my next post, I’ll be covering what we can actually do with this information. Spoiler: Python makes it really easy to populate these fields programmatically.

