So I thought today for a change of pace, I will shift to a different topic than Tensorflow, object detection and the sorts. In this article I am goin to go over the steps for developing a python script which reads in an excel file and converts it to a csv(comma seperated values) file. The task is simple but in the course of this article I will cover some important points like, argument passing using Pythons Options parser, reading excel files using the library xlrd , writing a csv file.
The dependencies needed to be imported for this program to work are:
import os
import xlrd
import csv
from sys import exit
from optparse import OptionParser
I use exit module to return codes back to the environment in case the scripts are to be run in an automation pipeline. You will see the usage in the later sections.
First lets go over the function that shall read in the excel file, iterate it row by row and save the data in a python list.
workbook=xlrd.open_workbook(excelFile,'r')
workSheet = workbook.sheet_by_index(sheetIndex)
sheetData = []
rowData = []
for rowIndex in range(1,workSheet.nrows):
rowData = []
for colIndex in range(1,workSheet.ncols):
rowData.append(workSheet.cell_value(rowIndex,colIndex))
sheetData.append(rowData)
The above code snippet, opens a new workbook as read-only from the variable excelFile(which is the path string to the input excelFile). Then the target sheet from the workbook is opened, as you can see, i an open the sheet using the index value, you can also use sheet_by_namey function if you want to go that way. Then we iterate each column in each row in that sheet, reading the columns one-by-one and saving the cell value in a python list. As easy as pie right?
Now that we have a python list populated with the contents of the excel file, next step would be to write it to a csv file..
# outFile: Path string to output.csv file
with open(outFile, "w", newline="") as f:
writer = csv.writer(f)
writer.writerows(data) # data: python list with excel-file's contents
We have already gone over the core elements required for this program. Now for some command-line-interface. OptionParser provides a convenient way for you to setup a cli for your program all the way from arguments to options. And it will also auto generate a help menu for your program which can be accessed from the terminal with -h or --help option.
usage = "usage: %prog arg1 {input xlsx file path} arg2{sheet to be converted(index)} [options]"
parser = OptionParser(usage=usage)
parser.add_option("-o", "--output",dest="outFile",
help="Full path for output csv file", metavar="FilePath",
default=outFile)
(options, args) = parser.parse_args()
outFile = options.outFile # File path
if len(args) != 2:
print("Incorrect number of arguments provided!")
exit(1)
else:
inputExcelFile = args[0]
sheetIndex = args[1]
For this program I chose to have the path to the input file and the index of the target sheet as arguments and the path to the output file as an option. So in case no path parameter is provided the output file will be generated in the scipts directory. All that remains now is to bring all these parts together to make up the program. You can go to the repository for the program and get the source code from there, explore it and build on it further if you want. So this is it for an easier tutorial. I am planning on making a GUI for this tool using PyQt. Then we will get to explore GUI development as well, so this might be in one of the next articles, until then happy coding and until next time.
* ------ * ------ * ------ *