Sometimes one’s work environment leaves one doing things in weird ways, like storing data in an Excel spreadsheet rather than a database. I have found myself in this situation before, and while it’s not exactly ideal, libraries like openpyxl make things a little bit more possible.
In my situation, I knew that in searching through the spreadsheet in question, I was more likely to want the more recent data rather than the older data. This meant that it would be ideal to insert new data in the top row of my sheet. It’s as simple as a right-click and inserting a row when just using the Excel GUI, but it is surprisingly difficult to do programmatically! I’m hopeful that the function I’m sharing today will make someone’s life easier.
Start by using pip install openpyxl
in your command line. Once the package is installed, we can get started.
Let’s start by including our imports, defining a function that takes a sheet_name argument, and performing our initial setup so we can start working with our sheet.
import openpyxl from openpyxl import load_workbook from openpyxl.styles import Font, Alignment def insert_line_in_sheet(sheet_name): workbook = load_workbook(filename="saved_data.xlsx")
First we open the Excel file itself with the load_workbook function. Once this is done, we can select the sheet that we passed into the function.
old_sheet = workbook[sheet_name] old_sheet.title = 'Old' + sheet_name row_count = old_sheet.max_row col_count = old_sheet.max_column
We’re going to be creating a new sheet based on the sheet we passed into the function. Therefore, we’ll change the name of the original sheet to avoid confusion. We’re also extracting the number of rows and columns in this sheet using the max_row and max_column attributes. We’ll need this information soon.
if sheet_name == "SheetName1": workbook.create_sheet(index=0, title=sheet_name) elif sheet_name == "SheetName2": workbook.create_sheet(index=1, title=sheet_name) new_sheet = workbook[sheet_name]
Finally, we create the new worksheet. If you need your worksheets to be in a particular order, you can specify that with the index
argument of the create_sheet()
method. We currently have the new sheet, which has the original name of the passed sheet, and the old sheet which has “Old” prefixing its name.
Now we’re set up. On the most basic level, we just need to do the following to copy all data down one row:
for row_num in range(1, row_count+1): for col_num in range(1, col_count+1): new_sheet.cell(row=(row_num + 1), column=col_num).value \ = old_sheet.cell(row=row_num, column=col_num).value
The openpyxl library can be kind of confusing because it uses one-based numbering rather than zero-based. What we’re doing here is using nested for loops to step through each row, then through each column, based on the numbers of each we saved above. Then we’re copying data from the old sheet to the new sheet, but one row down.
Those are the basics, but we can do a lot better than this. First of all, every spreadsheet of substance I’ve ever used has had a header row with labels. How about freezing that in place?
cell = new_sheet['A2'] new_sheet.freeze_panes = cell
This will freeze everything above and to the left of cell A2 – in other words, the entire top row and no columns.
When you create a new sheet, all custom column widths from the old sheet are lost. Here are a couple different ways you might set columns to the width you want (assuming that information is known in advance):
if sheet_name == "SheetName1": column_names = ["A", "B", "C", "D", "E", "F", "G", "H", "I", "J", "K"] column_widths = [11, 39, 30.14, 24, 24, 21.14, 23, 16, 23.71, 21, 20] for name, width in zip(column_names, column_widths): new_sheet.column_dimensions[name].width = width elif sheet_name == "SheetName2": column_names = ["A", "B", "C", "D", "E", "F", "G", "H", "I"] column_width = 21.29 for name in column_names: new_sheet.column_dimensions[name].width = column_width
With SheetName1, a different width is used for each column and assigned individually. SheetName2 shows a simpler method where all columns are the same width.
An improvement to this function would be to have openpyxl get the column widths from the old sheet and automatically match them to the new sheet. I’ll be honest with you: I’m using SQL databases now and no longer wrestling with Excel spreadsheets for data, so I’m not super inclined to figure this out. I’d be happy to update the post if someone wants to send me some working code, though!
Now we can get to actually copying the values from our header row:
for col_num in range(1, col_count+1): new_sheet.cell(row=1, column=col_num).value \ = old_sheet.cell(row=1, column=col_num).value
And then we can set some styles. For example, if you want to maintain a certain font size, that needs to be explicitly set, as a new worksheet will default to font size 11. Here’s an example of setting the header row font size to 10, the font style to bold, and the cell aligntment to center.
header_font = Font(size=10, bold=True) header_alignment = Alignment(horizontal='center') for cell in new_sheet["1:1"]: cell.font = header_font cell.alignment = header_alignment
Now let’s add a little bit to the portion of code that actually copies our spreadsheet data:
cell_font = Font(size=10) for row_num in range(2, row_count+1): for col_num in range(1, col_count+1): new_sheet.cell(row=(row_num + 1), column=col_num).value \ = old_sheet.cell(row=row_num, column=col_num).value new_sheet.cell(row=(row_num + 1), column=col_num).font = cell_font
Here we’ve added a font size style and also adjusted the outer for
loop to start placing data on the third line rather than the second (since now we have added a fixed header row).
The only thing left at this point is cleanup:
# Remove the sheet with the old data workbook.remove(old_sheet) # Save or it's all for naught workbook.save(filename="saved_data.xlsx")
Here’s the full function in it’s all glory. It’s an awful lot of code for something so simple to do in the Excel GUI, but it gets the job done.
import openpyxl # for .xlsx files from openpyxl import load_workbook from openpyxl.styles import Font, Alignment def insert_line_in_sheet(sheet_name): # Setup workbook = load_workbook(filename="saved_data.xlsx") old_sheet = workbook[sheet_name] old_sheet.title = 'Old' + sheet_name row_count = old_sheet.max_row col_count = old_sheet.max_column if sheet_name == "SheetName1": workbook.create_sheet(index=0, title=sheet_name) elif sheet_name == "SheetName2": workbook.create_sheet(index=1, title=sheet_name) new_sheet = workbook[sheet_name] # Freeze the header row cell = new_sheet['A2'] new_sheet.freeze_panes = cell # Set the column width for the new worksheet. if sheet_name == "SheetName1": column_names = ["A", "B", "C", "D", "E", "F", "G", "H", "I", "J", "K"] column_widths = [11, 39, 30.14, 24, 24, 21.14, 23, 16, 23.71, 21, 20] for name, width in zip(column_names, column_widths): new_sheet.column_dimensions[name].width = width elif sheet_name == "SheetName2": column_names = ["A", "B", "C", "D", "E", "F", "G", "H", "I"] column_width = 21.29 for name in column_names: new_sheet.column_dimensions[name].width = column_width # Copy the header and set the styles. for col_num in range(1, col_count+1): new_sheet.cell(row=1, column=col_num).value \ = old_sheet.cell(row=1, column=col_num).value header_font = Font(size=10, bold=True) header_alignment = Alignment(horizontal='center') for cell in new_sheet["1:1"]: cell.font = header_font cell.alignment = header_alignment # Copy the rest of the original sheet, but down a row. cell_font = Font(size=10) for row_num in range(2, row_count+1): for col_num in range(1, col_count+1): new_sheet.cell(row=(row_num + 1), column=col_num).value \ = old_sheet.cell(row=row_num, column=col_num).value new_sheet.cell(row=(row_num + 1), column=col_num).font = cell_font # Remove the sheet with the old data workbook.remove(old_sheet) # Save or it's all for naught workbook.save(filename="saved_case_data.xlsx")