The topic for this episode of my LibreOffice Calc & Python Programming tutorial series is to get the data stored in a range of cells in LibreOffice. And we are going to do that using some very useful dedicated Pythons Functions delivered with the PyUno library.
How are we going to do it?
We are going to get the coordinates of a range of selected cells in an open Calc document and store all the data that the range of cells contains into an Tuple.
But first, what is an Tuple?
A Tuple in Python is like an Array of data but that can store data with different types.
But what is an Array?
An array in computer programming is a container that stores data in consecutive memory addresses. It is widely used in many programming languages.
In Python an Array is used to store many variables with the same type of data (string, integer, float, etc…) which is not convenient for storing data coming from a Spreadsheet’s cells.
That is where the Tuple is convenient as it can store data of different types (see the 4 possible types in my previous episode of this series).
Note that a Tuple in Python is ‘immutable’, it means that we cannot change its content once it has been defined. It is important to know and I’ll show you how to work with tuples easily.
So let’s get Started.
Here is the plan for this tutorial:
- Set the context for global variables
- Get the Range of Addresses
- Get the data from a range of Cells by Position
- Get the data from a range of Cells by Name
- Store Cells content in an Tuple of Data
Set the context for global variables
Note: you can start from the same script as the previous tutorial (part 6 - Cells Types ).
Starting with our usual context() function that stores desktop , model and active_sheet at a global scope:
def context(): # set global variables for context global desktop global model global active_sheet # get the doc from the scripting context # which is made available to all scripts desktop = XSCRIPTCONTEXT.getDesktop() model = desktop.getCurrentComponent() # access the active sheet active_sheet = model.CurrentController.ActiveSheet
Get the Range of Addresses
In our main function we start by calling context(), then we need to get the Addresses of the cells .
def main(): # call function context() context() # get the range of addresses from selection oSelection = model.getCurrentSelection() oArea = oSelection.getRangeAddress()
The method getRangeAddress() returns a CellRangeAddress.
A CellRangeAddress is a Struct (short for Structure). It has the following Attributes:
Sheet, StartColumn, StartRow, EndRow, EndColumn.
We don’t really need the Sheet index yet as we are only working on the current sheet. But we need the 4 other attributes. They correspond to the top-left and bottom-right cells of what we have selected with the mouse.
I strongly advise you to check XCellRangeAddress and explore the LibreOffice API. It is actually a really good way to discover the different methods and attributes that you can use.
The next step is to get all necessary cell references to retrieve data from a Range of Cells. We can do that now that we have our oArea object variable.
# store the attribute of CellRandeAddress nLeft = oArea.StartColumn nTop = oArea.StartRow nRight = oArea.EndColumn nBottom = oArea.EndRow #(note: could the attribute directly instead of using intermediary variable)
Note that although we could use the oArea attributes, storing them in variables with relevant names makes the program/script more readable.
Get the data from a range of Cells by Position
Now we can call function getCellRangeByPosition to returns a ‘sub-range’ of the active sheet. Understand that the active sheet itself is a range of cells that is much bigger.
That ‘sub-range’ is the range of Selected Cells from where we are going to extract data. I am calling that range oRangeSource.
# get data from the source oRangeSource = active_sheet.getCellRangeByPosition(nLeft, nTop, nRight, nBottom)
Here is an example of a selected range of cells in LibreOffice:
Get the data from a range of Cells by Name
In this tutorial we are using variables to get the Range of Cells. We could also use arbitrary values corresponding to the Top-left and Bottom-Right cells of the range like:
oRangeSource = active_sheet.getCellRangeByPosition(0, 0, 2, 9)
That set of parameters (0, 0, 2, 9) means that we are getting the range from cell(0,0) to cell(2,9).
In some cases it is easier to refer the cells by name. With names instead of positions the range (0,0,2,9) would be from cell “A1” to cell “C10”.
If you work with a range of cells by name it is then much easier to use the dedicated function getCellRangeByName:
# example by name: oRangeSource = active_sheet.getCellRangeByName('A1:C10')
You would define the range of cells the same way you do it in formulas: ‘A1:C10‘.
Using range by position is much easier in programs where the ranges are not always the same and can be dynamically changed. It makes it much more flexible to work with, especially if you need to run loops on the cells.
Store Cells content in an Tuple of Data
This is a key operation.
The function getDataArray extracts the content of the cells within a range and stores that content in a tuple. And from there we will be able to use that content easily with Python. Here is the simple line of code needed to do that:
# get data from the Range of cells and store in a tuple oDataSource = oRangeSource.getDataArray()
Note: A print statement here would be helpful to check how the tuple of data looks like. To get the APSO console check part 4 of this series of tutorials).
# print to console print(oRangeSource)
Here is an example of what you would get:
The Tuple produced is actually a tuple of tuples:
((1.0, ‘C’), (2.0, ‘D’), (‘A’, 3.0), (‘B’, 4.0))
As you can see each tuple inside the big tuple stores a row of data.
If you look closer, for example if you look at the tuple for the first row:
You will see that the 1.0 has no quote where the ‘C’ has quotes. This means that the types of the data within the cells have been preserved.
A number in LibreOffice remains a number in Python, and string remains a string. In fact the function getDataArray() returns a tuple of tuples with data of a type Double or String. Double is a type for any decimal numbers.
Now here is the full code for this tutorial:
def context(): # set global variables for context global desktop global model global active_sheet # get the doc from the scripting context # which is made available to all scripts desktop = XSCRIPTCONTEXT.getDesktop() model = desktop.getCurrentComponent() # access the active sheet active_sheet = model.CurrentController.ActiveSheet def main(): # call function context() context() # get the range of addresses from selection oSelection = model.getCurrentSelection() oArea = oSelection.getRangeAddress() # store the attribute of CellRangeAddress nLeft = oArea.StartColumn nTop = oArea.StartRow nRight = oArea.EndColumn nBottom = oArea.EndRow #(note: could the attribute directly instead of using intermediary variable) # get the Cell Range oRangeSource = active_sheet.getCellRangeByPosition(nLeft, nTop, nRight, nBottom) # example by name: # ~ oRangeSource = active_sheet.getCellRangeByName('A1:C10') # get data from the Range of cells and store in a tuple oDataSource = oRangeSource.getDataArray() # print to console print(oDataSource)
To make it easier to test my scripts I usually assign them to a combination of keys like CTRL + 0. Learn how to do it on this post.
In the next tutorials we will actually do something with the tuple created here and then play a little bit the data inside.
Please leave a comment to let me know what you think about this tutorial or if you had any issues going through it. Getting your comments helps me a lot to make the next tutorials better! Thank you!