Openpyxl: Python Module to Read/Write Excel Files

Filed Under: Python
Python Openpyxl Module

Python openpyxl module is a native library to work with excel files. We can read excel files as well as write excel files.

1) Installing Openpyxl Module

We can install openpyxl module using the PIP command.


$ pip install openpyxl
Pip Install Openpyxl

Pip Install Openpyxl

2) Reading Excel File using openpyxl

I have created a sample excel file (records.xlsx) with three sheets. The data is present in the first two sheets.

Openpyxl Example Excel Sheet

Openpyxl Example Excel Sheet

We will use this excel file to look at some examples to read data from the excel sheet.

1. Getting Sheet Names from the Excel File


import openpyxl

excel_file = openpyxl.load_workbook('records.xlsx')

# sheet names
print(excel_file.sheetnames)

Output:


['Employees', 'Cars', 'Numbers']

The sheetnames variable returns the list of the names of worksheets in the workbook. The names are returned in the order of the worksheets in the excel file.

2. Getting Specific Sheet from the Excel File

We can access a specific worksheet using the index variable with the workbook object.


employees_sheet = excel_file['Employees']

print(type(excel_file))
print(type(employees_sheet))

currently_active_sheet = excel_file.active

Output:


<class 'openpyxl.workbook.workbook.Workbook'>
<class 'openpyxl.worksheet.worksheet.Worksheet'>

If you want to access the currently active sheet, use the active property of the workbook.

3. Reading a Cell Value from the Excel Sheet

There are two ways to get a cell value from the excel sheet. We can get the Cell object using the cell() function or we can get it using the index of the cell.


cell_obj = employees_sheet.cell(row=1, column=1)
print(type(cell_obj))
print(f'Employees[A1]={cell_obj.value}')

# second way
print(f'Employees[A1]={employees_sheet["A1"].value}')

Output:


<class 'openpyxl.cell.cell.Cell'>
Employees[A1]=EmpID
Employees[A1]=EmpID

4. Total Number of Rows and Columns in the Excel Sheet

We can get the total number of rows and columns using the max_row and max_column properties of the worksheet.


print(f'Total Rows = {employees_sheet.max_row} and Total Columns = {employees_sheet.max_column}')

Output:


Total Rows = 4 and Total Columns = 3

5. Printing Header Row of the Excel Sheet


header_cells_generator = employees_sheet.iter_rows(max_row=1)

for header_cells_tuple in header_cells_generator:
    for i in range(len(header_cells_tuple)):
        print(header_cells_tuple[i].value)

Output:


EmpID
EmpName
EmpRole

The iter_rows() function generates cells from the worksheet, by row. We can use it to get the cells from a specific row.

6. Printing all the values from a column


for x in range(1, employees_sheet.max_row+1):
    print(employees_sheet.cell(row=x, column=1).value)

Output:


EmpID
1
2
3

7. Printing all the values from a row


for x in range(1, employees_sheet.max_column+1):
    print(employees_sheet.cell(row=2, column=x).value)

Output:


1
Pankaj
CEO

8. Reading Range of Cells from the Excel Sheet

We can pass the range of cells to read multiple cells at a time.


cells = employees_sheet['A2':'C3']

for id, name, role in cells:
    print(f'Employee[{id.value}, {name.value}, {role.value}]')

Output:


Employee[1, Pankaj, CEO]
Employee[2, David Lee, Editor]

9. Iterating Cells by Rows


for row in employees_sheet.iter_rows(min_row=2, min_col=1, max_row=4, max_col=3):
    for cell in row:
        print(cell.value, end="|")
    print("")

Output:


1|Pankaj|CEO|
2|David Lee|Editor|
3|Lisa Ray|Author|

The arguments passed to the iter_rows() function creates the two-dimensional table from which the values are read, by row. In this example, the values are read between A2 and C4.

10. Iterating Cells by Columns


for col in employees_sheet.iter_cols(min_row=2, min_col=1, max_row=4, max_col=3):
    for cell in col:
        print(cell.value, end="|")
    print("")

Output:


1|2|3|
Pankaj|David Lee|Lisa Ray|
CEO|Editor|Author|

The iter_cols() function is same as iter_rows() except that the values are read column-wise.

3) Writing Excel File using openpyxl

In this section, we will look into some examples of writing excel files and cell data.

1. Writing Excel File using openpyxl


from openpyxl import Workbook
import datetime

excel_file = Workbook()
excel_sheet = excel_file.create_sheet(title='Holidays 2019', index=0)

# creating header row
excel_sheet['A1'] = 'Holiday Name'
excel_sheet['B1'] = 'Holiday Description'
excel_sheet['C1'] = 'Holiday Date'

# adding data
excel_sheet['A2'] = 'Diwali'
excel_sheet['B2'] = 'Biggest Indian Festival'
excel_sheet['C2'] = datetime.date(year=2019, month=10, day=27).strftime("%m/%d/%y")

excel_sheet['A3'] = 'Christmas'
excel_sheet['B3'] = 'Birth of Jesus Christ'
excel_sheet['C3'] = datetime.date(year=2019, month=12, day=25).strftime("%m/%d/%y")

# save the file
excel_file.save(filename="Holidays.xlsx")

Output:

Openpyxl Write Excel File

Openpyxl Write Excel File

2. Updating a Cell value

We can either use the index of the cell or use the cell object to set the value. Let’s change some values in the excel file created in the last example.


import openpyxl

excel_file = openpyxl.load_workbook('Holidays.xlsx')
excel_sheet = excel_file['Holidays 2019']

# using index
excel_sheet['A2'] = 'Deepawali'

# using cell object
excel_sheet.cell(row=2, column=2).value = 'Biggest Indian Festival for Hindus'

excel_file.save('Holidays.xlsx')

Output:

Openpyxl Update Cell Value

Openpyxl Update Cell Value

3. Appending Multiple Values to the Excel Sheet

We can use the append() function to add a sequence of values to the bottom of the worksheet.


holiday_rows = (
    ('Black Friday', 'Fourth Thursday of November, Shopping Day', '11/29/19'),
    ('Holi', 'Festival of Colors', '3/20/19')
)

for row in holiday_rows:
    excel_sheet.append(row)

excel_file.save('Holidays.xlsx')

Output:

Openpyxl Append Multiple Rows To Excel File

Openpyxl Append Multiple Rows To Excel File

4) Deleting Rows and Columns from the Excel Sheet

We can use the delete_cols() and delete_rows() functions to delete columns and rows from the excel sheet.


import openpyxl

excel_file = openpyxl.load_workbook('Holidays.xlsx')
excel_sheet = excel_file['Holidays 2019']

# delete column
excel_sheet.delete_cols(idx=2)  # B=2

# delete row
excel_sheet.delete_rows(idx=2, amount=2)  # rows 2,3 are deleted

excel_file.save('Holidays.xlsx')

The idx parameter provides the index of the rows and columns to delete. If we want to delete multiple adjacent rows and columns, we can provide the amount argument.

5) Conclusion

Python openpyxl module is a perfect choice to work with excel sheets. We can also add images to the excel sheet by using the pillow library with it. But, it doesn’t guard us against quadratic blowup or billion laughs XML attacks. So, if you are getting values from the user and saving it, then try to validate and sanitize it.

6) Further Readings

7) References

Leave a Reply

Your email address will not be published. Required fields are marked *

close
Generic selectors
Exact matches only
Search in title
Search in content
Search in posts
Search in pages