I often need to generate documents that contain a lot of canned language. Rather than creating them manually, I’ve been using Python to automate as much of the work as possible.
Creating Merge Fields in a Word Document
The first thing you need is a document with merge fields. To add a merge field, go to the Insert tab in your Word document, then select the Quick Parts drop-down menu. Here, select the Field... option.
In this menu, scroll down to the field name MergeField and select it. (Pro tip: two clicks below the scrollbar will take you down far enough.)
You don’t need to worry about the format or field options. Just give your merge field a name and click OK.
For this example, I’ve created two separate documents with three merge fields each, which appear like this:
doc1.docx
doc2.docx
Selecting Merge Fields from Word Documents Using Python
A lot of the time, I need to populate several documents at once and I don’t know which merge fields are in which document. Python can solve that really quickly. Start by making sure you’ve installed the docx-mailmerge package, which is going to do most of the work for us.
pip install docx-mailmerge
Then use this function:
import glob

import mailmerge


def get_merge_fields_from_all_documents():
    results = []
    for file_name in glob.glob("*.docx"):
        with mailmerge.MailMerge(file_name) as document:
            merge_fields = list(document.get_merge_fields())
            results.append(merge_fields)
    return results


list_of_lists = get_merge_fields_from_all_documents()
# [['CityStateZip', 'Name', 'Address'], ['PurchasePrice', 'ShipDate', 'InvoiceNumber']]
Let’s break down what’s going on here.
import glob

import mailmerge


def get_merge_fields_from_all_documents():
    results = []
First, we’re importing the modules we’re using, declaring our function, and creating an empty list to hold our results.
for file_name in glob.glob("*.docx"):
glob is a cool little module in the Python standard library that finds path names matching a pattern. Here we’re using it with a for loop to iterate through each .docx file in the current directory (i.e., the directory where we ran our script).
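To see the pattern matching on its own, here is a minimal sketch that runs glob against a throwaway directory. The helper name list_docx_files and the sample file names are made up for illustration; only glob.glob() comes from the code above.

```python
import glob
import os
import tempfile


def list_docx_files(directory):
    """Return the sorted names of the .docx files directly inside `directory`."""
    pattern = os.path.join(directory, "*.docx")
    # glob.glob() makes no promise about ordering, so sort for a stable result.
    return sorted(os.path.basename(path) for path in glob.glob(pattern))


# Build a temporary directory with two Word-style file names and one other file.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("doc1.docx", "doc2.docx", "notes.txt"):
        open(os.path.join(tmp, name), "w").close()
    matched = list_docx_files(tmp)

print(matched)  # ['doc1.docx', 'doc2.docx']
```

Only the two .docx names come back; notes.txt does not match the pattern.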
with mailmerge.MailMerge(file_name) as document:
Next, we’re opening each document our for loop selected using the mailmerge package we installed earlier.
merge_fields = list(document.get_merge_fields())
Here, we’re using mailmerge’s get_merge_fields() method, which returns the name of each merge field in the document. It normally returns the data in a set, but here we’re using list() to convert it to a list.
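One thing to keep in mind with this conversion: a set has no guaranteed order, so the list you get back may not list the fields in the order they appear in the document. Here is a small sketch of the difference, using a hand-written set as a stand-in for what get_merge_fields() returns (the field names are sample data, not read from a real file).

```python
# Stand-in for the set returned by get_merge_fields(); sample data only.
fields = {"Name", "Address", "CityStateZip"}

# list() keeps whatever order the set happens to iterate in.
as_list = list(fields)

# sorted() also returns a list, but in a predictable alphabetical order.
as_sorted = sorted(fields)

print(as_sorted)  # ['Address', 'CityStateZip', 'Name']
```

If a stable order matters to you (for example, when comparing output between runs), sorted() is probably the safer choice.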
results.append(merge_fields)
Now we just need to append this list to the master list. After that, we use return results and have a handy list of lists including all the merge field names in our documents.
list_of_lists = get_merge_fields_from_all_documents()
# [['CityStateZip', 'Name', 'Address'], ['PurchasePrice', 'ShipDate', 'InvoiceNumber']]
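A list of lists doesn’t record which file each group of fields came from. If you need that, one possible variation is to build a dictionary keyed by file name instead. The sketch below is not part of the function above: map_fields_by_document and its read_fields parameter are hypothetical names, and the sample data (including which fields belong to which file) is assumed so the example runs without any Word documents.

```python
def map_fields_by_document(file_names, read_fields):
    """Return {file_name: sorted field names} for each file name given.

    `read_fields` is a hypothetical callable that takes a file name and
    returns an iterable of merge field names.
    """
    return {name: sorted(read_fields(name)) for name in file_names}


# Assumed sample data standing in for real documents.
fake_documents = {
    "doc1.docx": {"CityStateZip", "Name", "Address"},
    "doc2.docx": {"PurchasePrice", "ShipDate", "InvoiceNumber"},
}

fields_by_doc = map_fields_by_document(sorted(fake_documents), fake_documents.__getitem__)
print(fields_by_doc["doc1.docx"])  # ['Address', 'CityStateZip', 'Name']
```

With real files, read_fields could be a small function that opens the document with mailmerge.MailMerge and returns get_merge_fields(), and file_names could come from glob.glob("*.docx").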
And if you prefer just one list with all your merge field names, a simple list comprehension will get you there:
one_list = [item for sublist in list_of_lists for item in sublist]
# ['CityStateZip', 'Name', 'Address', 'PurchasePrice', 'ShipDate', 'InvoiceNumber']
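If the nested comprehension is hard to read, itertools.chain.from_iterable from the standard library does the same flattening. The sketch below uses the sample output from above as its input and also shows one way to drop duplicates, which could matter if the same field name shows up in more than one document (an assumption; the two example documents don’t share any fields).

```python
from itertools import chain

# Sample data copied from the example output above.
list_of_lists = [
    ["CityStateZip", "Name", "Address"],
    ["PurchasePrice", "ShipDate", "InvoiceNumber"],
]

# The list comprehension from the post.
one_list = [item for sublist in list_of_lists for item in sublist]

# Same result using itertools.
chained = list(chain.from_iterable(list_of_lists))

# Unique field names in alphabetical order.
unique_fields = sorted(set(chained))

print(chained == one_list)  # True
print(unique_fields)
# ['Address', 'CityStateZip', 'InvoiceNumber', 'Name', 'PurchasePrice', 'ShipDate']
```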
In my next post, I’ll be covering what we can actually do with this information. Spoiler: Python makes it really easy to populate these fields programmatically.