Processing Excel files using Python

    At work, I had to deal with the task of processing xls files using python. A little googling, I came across several libraries with which you can work with excel files.

    - xlrd - makes it possible to read Excel files
    - xlwt - creates and populates Excel files
    - xlutils - a set of utilities for expanding the capabilities of the previous two libraries
    - pyExcelerator - also makes it possible to work with Excel files, but has not been updated for a long time.

    For my task, I used the first three libraries.
    The task was divided into several parts: reading a file with the xls extension; creating a new one and filling it out; creating a copy of the file based on the input file; deleting the necessary lines in the output file.

    Reading input file

    This task is not very difficult. The documentation and examples that came with xlrd helped solve it quickly.
    Code example:
    import xlrd
    rb = xlrd.open_workbook('d:/final.xls',formatting_info=True)
    sheet = rb.sheet_by_index(0)
    for rownum in range(sheet.nrows):
    row = sheet.row_values(rownum)
    for c_el in row:
    print c_el

    Create a new file and fill it

    This task was no more difficult than the previous one. Documentation and examples helped.
    Code example:
    import xlwt
    from datetime import datetime

    font0 = xlwt.Font() = 'Times New Roman'
    font0.colour_index = 2
    font0.bold = True

    style0 = xlwt.XFStyle()
    style0.font = font0

    style1 = xlwt.XFStyle()
    style1.num_format_str = 'D-MMM-YY'

    wb = xlwt.Workbook()
    ws = wb.add_sheet('A Test Sheet')

    ws.write(0, 0, 'Test', style0)
    ws.write(1, 0,, style1)
    ws.write(2, 0, 1)
    ws.write(2, 1, 1)
    ws.write(2, 2, xlwt.Formula("A3+B3"))'example.xls')

    Creating a copy of the file based on the input file

    This problem can be solved in two ways. Option one: open the input file for reading, create a new file and overwrite all the data from one file to another in a loop. Such a solution is not difficult to implement, so it makes no sense to lay out sample code. Option two: use the xlutils library. This library has a lot of interesting and useful things, but xlutils.copy will be interesting for our task .
    And so, an example of code to create a file based on the input using xlutils.copy:
    import xlrd
    import xlwt
    from xlutils.copy import copy

    rb = open_workbook('final.xls',on_demand=True,formatting_info=True)
    wb = copy(rb)"final_complete.xls")

    Here is such a small code. In order for it to work, the on_demand = True flag must be on . By using the formatting_info flag, the output file is obtained with the same design styles as the input. For my task, this turned out to be the right option.

    Deleting rows by a specified condition

    To solve this problem, it was decided to use a filter. One option is to rewrite from one file to another, excluding those options that do not fulfill a given condition. But there is one catch, if you need to preserve the design style of the document, then this approach will not work (Unless of course you know the design style in advance and can set it programmatically). The solution to this problem was achieved through the use of xlutils.filter. Task: to leave in the output Excel file only those entries that are contained in the transferred list.
    Code that solves this problem:
    from xlutils.filter import GlobReader,BaseFilter,DirectoryWriter,process


    class MyFilter(BaseFilter): 

        goodlist = None
        def __init__(self,elist): 
            self.goodlist = goodlist
            self.wtw = 0
   = 0

        def workbook(self, rdbook, wtbook_name): 
  , 'filtered_'+wtbook_name) 

        def row(self, rdrowx, wtrowx):

        def cell(self, rdrowx, rdcolx, wtrowx, wtcolx): 
            value = self.rdsheet.cell(rdrowx,rdcolx).value
            if value in self.goodlist:
    data = """somedata1

    goodlist = data.split("\n")



    Using a set of three libraries, the tasks were solved. The following was noticed: if there are graphic elements in the input Excel file (pictures, etc.), they are not transferred to the output file. Perhaps by studying these libraries it will be possible to solve this part of the problem.

    References - the first three libraries. is a group that discusses the use of the xlrd, xlwt, and xlutils libraries.

    PS I think it would be nice to transfer this post to a thematic blog.

    Also popular now: