Programming Gnumeric using Python

A powerful way to access and manipulate data in Gnumeric involves using the Python programming language. As Gnumeric develops from version 1.2, the scripting methods will become increasingly powerful. Since Gnumeric is free software, you could extend it directly using the source code and adding C language functions to the code. Python offers a higher level abstraction through which to interact with the spreadsheet.

Python and Gnumeric can be used in several ways. This section describes how to obtain Gnumeric, install it, and configure it correctly for access with Python. If you already have the pieces in place, you can skip Section 18.3.1, Installing and Building Gnumeric for Python.

This section was written by Charles Twardy. It owes a great deal to the nice guide Travis Whitton wrote: Python/Gnumeric guide for the old API in Gnumeric 1.0. Jon Käre Hellan contributed most of the code to enable Python in Gnumeric and wrote the file python-gnumeric.txt in the source tree. Nathan Hurst provided the idea and support.

The Python API, that is the list of methods available in Python, is still experimental and may change!

For further information, the web page maintained by Jon Käre Hellan has some Python plugins and other useful information. That page can be found through this link. The main Gnumeric page may also have useful information.

If you need help online, you may want to check out:

  • The Gnumeric Function-Writer's Guide. Until I write one for Python, you'll have to settle for doc/developer/writing-functions.sgml in the Gnumeric source tree.
  • The files that actually define the Python interface. In particular, plugins/python-loader/py-gnumeric.c has good comments at the beginning.
  • The instructions on how to use GNOME Git can be found here.
  • The gnumeric discussion list: <gnumeric-list@gnome.org>
  • The IRC channel #gnumeric on the GIMPnet server. Right now, the project leader is Jody Goldberg (jody) and the Debianizer is: J.H.M. Dassen (jhm). Jody, Jon K. Hellan, and Zbigniew Chyla appear prominently in the Python ChangeLog.

18.3.1. Installing and Building Gnumeric for Python

This section describes how to obtain the Gnumeric source code, configure it for Python and build it. This section will eventually be removed as Python becomes supported by default.

18.3.1.1. Preliminaries

I'm going to define some variables here so that you can insert the appropriate command or item for your system when they occur. I'll prefix them all with '$'.

  • $root: Do whatever you do to become root. The usual options are:
    • su - and hit Enter
    • sudo
    • fakeroot (works in some situations, but not all)
  • $version: Whatever your current Gnumeric version is. Some examples:
    • 1.4.20
    • 1.6.20
    • 1.7.90

18.3.1.2. In the Beginning (Installing and Building)

You need to get Python and Gnumeric, and the Python plugin for Gnumeric. You can get the binaries, the packaged source, or the developing edge sources from Git.

18.3.1.2.1. Getting the binaries (Debian)

I've only tested this on sid (unstable). The version you get from stable (woody) may not act quite the same.

  • $root apt-get install gnumeric gnumeric-python python

18.3.1.2.2. Getting and building the current Debianized source

If you have Debian, and don't need the bleeding edge, this is by far the easiest way to get and build the source.

  1. Change to a directory where you want to hang the source directory.

  2. $root apt-get build-dep gnumeric

  3. apt-get source gnumeric

  4. cd gnumeric-$version

  5. debian/rules build

  6. To make the .deb packages: $root ./debian/rules binary

  7. To install those .deb packages:

    1. cd .. to change to that directory.

    2. $root dpkg -i gnum*deb (presuming you don't have other .deb packages beginning with "gnum" lying around here).

  8. You may or may not want to remove those .deb files now: $root rm gnum*deb

18.3.1.2.3. Getting and building the source from Git

Remember that this is the developing edge. Things may not work. Generally don't do this unless you are subscribed to the mail list and possibly also on the IRC channel.

You will need a few things for this to work at all:

  1. gnome-common

  2. libgsf (see below)

  3. pygtk2 (On Debian, make sure to get python-gtk2 and python-gtk2-dev)

  4. gnumeric (see below, obviously)

And although the following will build in the main build space, it's probably better to build in a temporary space. But I can't be bothered to learn how to fiddle the build pathways.

  1. Change to a directory where you want to hang the source directory for Gnumeric and a few other GNOME things.

  2. Getting and building libgsf:

    1. git clone git://git.gnome.org/libgsf

    2. cd libgsf

    3. Red Hat: ./autogen.sh

    4. Debian: ./autogen.sh --prefix=/usr --with-gconf-schema-file-dir=/etc/gconf/schemas

    5. make

    6. $root make install

    7. If you find that this didn't work, try make clean and then repeat from the autogen step.

  3. Getting and building libgal: No longer necessary! (13 June 2003)

  4. Getting and building goffice:

    1. git clone git://git.gnome.org/goffice

    2. cd goffice

    3. Red Hat: ./autogen.sh

    4. Debian: ./autogen.sh --prefix=/usr --with-gconf-schema-file-dir=/etc/gconf/schemas

    5. make

    6. $root make install

    7. If you find that this didn't work, try make clean and then repeat from the autogen step.

  5. Getting and building gnumeric:

    1. git clone git://git.gnome.org/gnumeric gnumeric-head

    2. cd gnumeric-head

    3. Red Hat: ./autogen.sh and wait while it compiles

    4. Debian: ./autogen.sh --prefix=/usr --with-gconf-schema-file-dir=/etc/gconf/schemas

    5. make

    6. Optional: $root make install

    7. If you find that this didn't work, try make clean and then repeat from the autogen step. For example, sometimes I've had it not create the python-loader.

OK, you should now have gnumeric! Test it! If you installed the Debianized version via apt-get, or did "make install", it should be installed to /usr/bin (or /usr/local/bin on Red Hat?) and you can just type gnumeric. Otherwise you will find it in gnumeric-head/src/ and you will have to run it from there.

18.3.2. The Python Console

There is an interactive Python console available from inside Gnumeric. This is a good place to explore things, and if the console is expanded, it will be a nice place for scripting. In the meantime, what I have called "Spellbooks" below are much more useful, but are fixed plugins as of Gnumeric startup. So right now I putter in the console as I develop plugin code in the form of spellbooks. After 1.2.0, Gnumeric will be working on its scripting API, so the two approaches may merge. Or not.

18.3.2.1. Enabling the Python Console

You can run a Python interpreter from inside Gnumeric, but you have to turn it on. To do this you simply uncomment a few lines in python-loader/plugin.xml. Normally, that file lives in /usr/lib/gnumeric/$version/plugins/python-loader/, or perhaps /usr/local/lib... on Red Hat. I used to suggest making a local copy, but that was pain for little gain. So:

  1. gnumeric --version to make sure you get the right version name for the following. (You'll have to do this for every new version of Gnumeric!)

  2. cd ~/.gnumeric/$version/plugins/

  3. Edit python-loader/plugin.xml.

  4. Uncomment the five lines starting with the ui-console-menu service near the bottom (remove the "<!--" and "-->" markers around the <service...> and </service> tags).

  5. Save the file.

  6. Start gnumeric (same version).

  7. From the Tools menu, select the Python console.

  8. Enjoy!

18.3.2.2. Playing with the Python console

At the top there is a drop-down menu Execute in. Right now your only choice will be Default. After you evaluate functions from other plugins, those environments will become available too (JK says this is called lazy loading). But I'll assume you are using Default. (The only real difference is that you have to import Gnumeric first, and you can't see your plugin functions.)

(Note: older releases required you to type print dir() instead of just dir(). This was fixed in CVS on 16 June 2003, and certainly in 1.1.20 and higher.)

Let's start by taking a look at the environment.

>>> import Gnumeric
>>> dir()
['Gnumeric', '__builtins__', '__doc__', '__name__']
>>> dir(Gnumeric)
['Boolean', 'CellPos', 'FALSE', 'GnumericError', 'GnumericErrorDIV0',
'GnumericErrorNA', 'GnumericErrorNAME', 'GnumericErrorNULL',
'GnumericErrorNUM', 'GnumericErrorRECALC', 'GnumericErrorREF',
'GnumericErrorVALUE', 'MStyle', 'Range', 'TRUE', '__doc__',
'__name__', 'functions', 'plugin_info', 'workbook_new', 'workbooks']
      

'Gnumeric' is a module that exists only within Gnumeric, and which defines the Gnumeric Python API.

Gnumeric.functions is the list of all the Gnumeric functions you would see in the function browser. You cannot yet do dir(Gnumeric.functions) but maybe someone will bind that soon.

RangeRef is not listed. That seems to limit us, though later in the tutorial we'll see how to use regular functions to get inside RangeRefs.

So do some exploring. First, let's poke around to figure out how to use CellPos.

# I wonder how to use CellPos (I've glanced at the source, but...)

>>> dir(Gnumeric.CellPos)                 # shows nothing useful

>>> Gnumeric.CellPos()                    
TypeError: CellPos() takes exactly 2 arguments (0 given)  

>>> Gnumeric.CellPos("a1","a2") 
TypeError: an integer is required.        # Right. 

>>> a=Gnumeric.CellPos(1,2)               # It worked!
>>> a
<CellPos object at 0x106b6eb8>      # Yeah, I know...

>>> dir(a)
['get_tuple']

>>> a.get_tuple()
(1,2)                                     # Cool. That's (col,row)

>>> str(a)                                # Super cool.
'B3'                                      # JK hasn't implemented this for tuples yet
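
The (col, row) tuple and the 'B3' string above are two views of the same position: both indices count from zero, so column 1 is B and row 2 is displayed as row 3. As a rough illustration of that mapping, here is a plain-Python sketch that runs outside Gnumeric. The helper name cellpos_to_a1 is made up for this example and is not part of the Gnumeric API.

```python
def cellpos_to_a1(col, row):
    """Convert zero-based (col, row) indices to an A1-style name.

    Hypothetical helper for illustration only; inside Gnumeric,
    str(Gnumeric.CellPos(col, row)) does this for you.
    """
    if col < 0 or row < 0:
        raise ValueError("col and row must be non-negative")
    letters = ""
    n = col + 1                       # switch to 1-based for the letter math
    while n > 0:
        n, rem = divmod(n - 1, 26)    # bijective base 26: A..Z, AA..AZ, ...
        letters = chr(ord("A") + rem) + letters
    return letters + str(row + 1)     # rows are displayed 1-based


print(cellpos_to_a1(1, 2))   # same position as CellPos(1,2) in the session above
```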
       

Similarly, we can explore Gnumeric.Range:

>>> r = Gnumeric.Range((1,2),(3,4))
TypeError: Range() argument 1 must be CellPos, not tuple

>>> r = Gnumeric.Range(a,a)
>>> r
<Range object at 0x1071d888>

>>> dir(r)
['get_tuple']

>>> r.get_tuple()
(3, 7, 3, 7)
       

If you evaluate in the context of a plugin (rather than in Default), then dir(Gnumeric.plugin_info) will reveal some simple informational functions you can call for the local plugin(s).

Note: obviously I don't really know what I'm doing, or I wouldn't be poking around like this.

18.3.2.3. More fun with the Python console

Jon K. Hellan writes, "Here are some more things you can do from the console:"

# Get a workbook
>>> wb=Gnumeric.workbooks()[0]
>>> wb
<Workbook object at 0x862a490>
>>> dir(wb)
['gui_add', 'sheet_add', 'sheets']

# Get a sheet
>>> s=wb.sheets()[0]
>>> s
<Sheet object at 0x863e8d0>
>>> dir(s)
['cell_fetch', 'get_extent', 'get_name_unquoted', 'rename',
'style_apply_range', 'style_get', 'style_set_pos', 'style_set_range']

# Get a cell. s.cell_fetch(0,0) and s[0,0] are synonyms. First
# coordinate is columns, i.e. s[1,0] is B1.
>>> c=s[0,0]
>>> c
<Cell object at 0x863d810>
>>> dir(c)
['get_entered_text', 'get_mstyle', 'get_rendered_text', 'get_value',
'get_value_as_string', 'set_text']

# Change the cell - it changes in the grid
>>> c.set_text('foo')

# Retrieve the cell - both ways.
>>> c.get_value()
foo
>>> s.cell_fetch(0,0).get_value()
foo

Very, very nice. Note, after setting a value, it won't show up until that cell is redrawn. That will happen automatically with plugin functions, but in the console, you may have to click on the cell.
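
The calls shown above (cell_fetch, set_text, get_value) are enough to write small loops over cells. The sketch below fills a column from a list. So that it can be tried outside Gnumeric, it uses two stand-in classes, FakeSheet and FakeCell, which are invented here and only mimic the three methods used; in the console you would pass a real sheet such as wb.sheets()[0] instead. That a real sheet behaves the same way inside a loop is an assumption based on the session above.

```python
class FakeCell:
    """Stand-in for a Gnumeric cell (hypothetical, for trying this outside Gnumeric)."""
    def __init__(self):
        self._text = ""

    def set_text(self, text):
        self._text = text

    def get_value(self):
        return self._text


class FakeSheet:
    """Stand-in for a Gnumeric sheet (hypothetical, for trying this outside Gnumeric)."""
    def __init__(self):
        self._cells = {}

    def cell_fetch(self, col, row):
        # Create the cell on first access, then keep returning the same object.
        return self._cells.setdefault((col, row), FakeCell())


def fill_column(sheet, col, values, start_row=0):
    """Write each value as text into successive rows of one column."""
    for offset, value in enumerate(values):
        sheet.cell_fetch(col, start_row + offset).set_text(str(value))


s = FakeSheet()                     # in Gnumeric: s = wb.sheets()[0]
fill_column(s, 0, ["foo", "bar"])   # column 0 is A, so this writes A1 and A2
print(s.cell_fetch(0, 1).get_value())
```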

18.3.3. Using the built-in Python functions

To enable the Python-loader and Python plugins:

  1. Select the Tools menu and the Plugins menuitem.

  2. Select "Python plugin loader" and "Python functions". Restart Gnumeric.

The quickest way to test whether you now have Python functions is to type =py_capwords("fred flintstone") in the first cell. After you hit <Enter>, you should see "Fred Flintstone".
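
For comparison, the same capitalization is available in plain Python through string.capwords from the standard library. That PY_CAPWORDS is a thin wrapper around this call is an assumption here, not something this guide states; the snippet only shows that the expected result matches.

```python
import string

# string.capwords splits on whitespace, capitalizes each word, and rejoins.
result = string.capwords("fred flintstone")
print(result)
```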

You can also click on the functions button, and scroll down to the "Python" category. Select that. You should see at least two functions defined: PY_CAPWORDS and PY_PRINTF. They're not very useful, but they prove you've got the plugins. Test them either via the GUI or by typing into the cell.

I'll presume they worked.

18.3.4. Writing your own Python functions

To scribe new magic you must write your spells in places where Gnumeric will find them. That place is in folders under ~/.gnumeric/<version>/plugins/. Each folder under here is one "spellbook" of new plugin functions. You may put all your spells in one spellbook, or group them neatly depending on your tastes. Each spellbook must have two files. We'll create a spellbook called "myfuncs". A pedestrian name for pedestrian spells. When I have more skill, perhaps I'll make some with better names. Several suggest themselves:

  • Transformations: of obvious value for a spreadsheet
  • Illusions: statistical functions, clearly
  • Charms: presentation graphics
  • Necromancy: file repair and missing values?
  • Divination: data mining!

18.3.4.1. Prepare the spellbook

In many ways it would be easier to start by copying the py-func spellbook to your local .gnumeric folder, and just adding a function to that. But in general it will be more useful to be able to write your own separate spellbooks, so here we go.

  1. Make the folder: First we make the folders and get into the right one. As noted above, we'll call our folder (spellbook) myfuncs.

    1. If they don't already exist:

      1. mkdir ~/.gnumeric

      2. mkdir ~/.gnumeric/<version>

    2. mkdir -p ~/.gnumeric/<version>/plugins/myfuncs/

    3. cd ~/.gnumeric/<version>/plugins/myfuncs/

  2. Make the files: A spellbook has two files. The first is the python file with the functions. The second is the XML file "plugin.xml". The XML file holds the master spells that tell Gnumeric what functions we've defined, what the name of the python file is, and one other important item. We'll create these as blank files.

    1. touch my-func.py

    2. touch plugin.xml

  3. Write the master spells: The good news is that you only need to do this once per spellbook. After that you just add spells to it.

    Your XML file must tell Gnumeric about your plugin. Here is a simple template. (If you want to learn about internationalization, see the example in the system's py-func spellbook.) Open up plugin.xml and insert the following lines:

    <?xml version="1.0"?>
    <plugin id="Gnumeric_MyFuncPlugin">
    	<information>
    		<name>Other Python functions from HOWTO</name>
    		<description>A few extra python functions demonstrating the API.</description>
    	</information>
    	<loader type="Gnumeric_PythonLoader:python">
    		<attribute name="module_name" value="my-func"/>
    	</loader>
    	<services>
    		<service type="function_group" id="example">
    			<category>Local Python</category>
    			<functions>
    			</functions>
    		</service>
    	</services>
    </plugin>
    		  

    The value attribute of the module_name entry (value="my-func") determines the name of your python script (file). In this case, it must be "my-func.py".

    The id of the function_group service determines the name of the function dictionary in your python script. In this case, it must be "example_functions" because here the id is "example".

  4. Prepare to write the spells: Next we'll create a minimal python file. As noted above, we must name the file my-func.py and it must have a dictionary called example_functions. So open up my-func.py and insert the following lines.

    # my-func.py
    #
    
    from Gnumeric import GnumericError, GnumericErrorVALUE
    import Gnumeric
    import string
    	
    example_functions = {
    }
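
The two naming rules described for plugin.xml are easy to get wrong, so here is a small sketch that reads a plugin.xml and reports which Python file and which dictionary name it implies. It uses only xml.etree.ElementTree from the standard library and runs outside Gnumeric. The function name expected_names is invented for this example, and the rules it encodes (the module file is the module_name value plus ".py"; the dictionary is the service id plus "_functions") are taken from the description above rather than from the loader source.

```python
import xml.etree.ElementTree as ET

PLUGIN_XML = """<?xml version="1.0"?>
<plugin id="Gnumeric_MyFuncPlugin">
  <loader type="Gnumeric_PythonLoader:python">
    <attribute name="module_name" value="my-func"/>
  </loader>
  <services>
    <service type="function_group" id="example">
      <category>Local Python</category>
      <functions>
      </functions>
    </service>
  </services>
</plugin>
"""


def expected_names(xml_text):
    """Return (python_file, [dictionary names]) implied by a plugin.xml."""
    root = ET.fromstring(xml_text)
    module = root.find("./loader/attribute[@name='module_name']").get("value")
    dicts = [
        svc.get("id") + "_functions"
        for svc in root.findall("./services/service[@type='function_group']")
    ]
    return module + ".py", dicts


print(expected_names(PLUGIN_XML))
```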
    		  

18.3.4.2. Writing new spells

To add new functions to Python, you now must do five things (three sir!):

  1. Write the python function in your copy of my-func.py.

  2. Insert the function info into the example_functions dictionary at the end of my-func.py.

  3. Insert the function name into the functions list at the end of plugin.xml.

Writing a simple script: Let's do something very simple: add two numbers together. First, edit my-func.py.

    # Add two numbers together
    def func_add(num1, num2):
        return num1 + num2

    # Translate the func_add python function to a gnumeric function and register it
    example_functions = {
        'py_add': func_add
    }
	  

Then let the plugin-loader(?) know about your function. Add the following line near the end of plugin.xml (between <functions> and </functions>).

                 <function name="py_add"/>
	

Now start Gnumeric and type =py_add(2,3) into a cell. You should get "5". You can also use cell references. If that was in cell A1, go to cell A2 and type =py_add(A1,3) and you will get "8". But your function won't show up in the GUI list yet.

Tell the GUI: To make your function show up in the GUI, you have to tell Gnumeric some things about it via a standard header, like this:

    # Add two numbers together
    def func_add(num1, num2):
        '@FUNCTION=PY_ADD\n'\
        '@SYNTAX=py_add(num1, num2)\n'\
        '@DESCRIPTION=Adds two numbers together.\n'\
        'Look, the description can go onto other lines.\n\n'\
        '@EXAMPLES=To add two constants, just type them in: py_add(2,3)\n'\
        'To add two cells, use the cell addresses: py_add(A1,A2)\n\n'\
        '@SEEALSO='

        return num1 + num2
	  

The text after '@DESCRIPTION=' is the description that shows up in the function GUI. You can make it as simple or detailed as you want. I'm not sure how many other fields get used right now, as I haven't seen the EXAMPLES show up anywhere.
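
Because the header is the first string in the function body, Python stores it as the function's docstring, and the pieces after each '@' are ordinary key=value fields. The sketch below splits such a header into a dictionary, which can be handy for checking a header before loading it into Gnumeric. The name parse_header is made up for this example; how Gnumeric itself parses the header is not shown here.

```python
def func_add(num1, num2):
    '@FUNCTION=PY_ADD\n'\
    '@SYNTAX=py_add(num1, num2)\n'\
    '@DESCRIPTION=Adds two numbers together.\n'\
    'Look, the description can go onto other lines.\n\n'\
    '@EXAMPLES=To add two constants, just type them in: py_add(2,3)\n'\
    'To add two cells, use the cell addresses: py_add(A1,A2)\n\n'\
    '@SEEALSO='

    return num1 + num2


def parse_header(doc):
    """Split an '@KEY=value' style header into a dict (hypothetical helper)."""
    fields = {}
    key = None
    for line in doc.split("\n"):
        if line.startswith("@") and "=" in line:
            # A new field starts; everything after the first '=' is its value.
            key, _, value = line[1:].partition("=")
            fields[key] = value
        elif key is not None and line:
            # A non-empty continuation line belongs to the current field.
            fields[key] += "\n" + line
    return fields


fields = parse_header(func_add.__doc__)
print(sorted(fields))
```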

But this still isn't quite right. Gnumeric doesn't know how many arguments the function can handle, nor of what type. So the function GUI will prompt for the two values it knows about (as type "Any") and then keep prompting for more. But py_add cannot accept all types, nor can it handle more than two arguments, so unless you give it precisely 2 numbers, you will get an error when you click "OK".

Know your limits... We got away last time just because Gnumeric was forgiving. Now we need to say that we can accept only 2 values, of type floating-point (which will also handle ints).

Where before we had 'py_add': func_add, we should now put 'py_add': ('ff','num1,num2',func_add). This says that Gnumeric should expect two floating-point numbers ('ff') with names 'num1' and 'num2', and pass them to func_add.
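
The two entry forms seen so far, a bare function and a (types, names, function) tuple, can be checked mechanically before starting Gnumeric. The sketch below does that for a dictionary like example_functions. The helper check_entry is invented for this example, and it knows only the two type letters used in this guide ('f' for floating-point and 'r' for a range); it is not a description of what the Python loader itself verifies.

```python
KNOWN_TYPES = {"f", "r"}   # only the type letters used in this guide


def check_entry(name, entry):
    """Return a list of problems found in one function-dictionary entry."""
    if callable(entry):                 # short form: 'py_add': func_add
        return []
    if not (isinstance(entry, tuple) and len(entry) == 3):
        return ["%s: expected a function or a 3-tuple" % name]
    problems = []
    types, names, func = entry
    arg_names = [n.strip() for n in names.split(",")] if names else []
    if len(types) != len(arg_names):
        problems.append("%s: %d type letters but %d argument names"
                        % (name, len(types), len(arg_names)))
    unknown = sorted(set(types) - KNOWN_TYPES)
    if unknown:
        problems.append("%s: type letters not covered here: %s"
                        % (name, ", ".join(unknown)))
    if not callable(func):
        problems.append("%s: third item is not callable" % name)
    return problems


def func_add(num1, num2):
    return num1 + num2


example_functions = {"py_add": ("ff", "num1,num2", func_add)}
for fn_name, fn_entry in example_functions.items():
    print(fn_name, check_entry(fn_name, fn_entry) or "looks fine")
```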

...and surpass them. Of course, there is no reason an add function shouldn't be able to handle a range. The simplest way to do that is to accept a range, and then call Gnumeric's own SUM function on it! All of Gnumeric's functions are available to you in the dictionary Gnumeric.functions, keyed by name. So here is how to write py_sum.

  1. First, define func_sum (in my-func.py):

    def func_sum(gRange):
    	'@FUNCTION=PY_SUM\n'\
    	'@SYNTAX=PY_SUM(range)\n'\
    	'@DESCRIPTION=Adds a range of numbers together. '\
    	'Just like built-in SUM.\n\n'\
    	'@EXAMPLES=To add values in A1 to A5, just type them in:\n'\
    	'    py_sum(a1:a5)\n'\
    	'@SEEALSO='
    	try:
    		# Look up Gnumeric's own SUM by name and apply it to the range.
    		gnm_sum = Gnumeric.functions['sum']
    		val = gnm_sum(gRange)
    		#  val = reduce(lambda a,b: a+b, vals)
    	except TypeError:
    		# The call form of raise works in both Python 2 and Python 3.
    		raise GnumericError(GnumericErrorVALUE)
    	else:
    		return val
    		  
  2. Then insert it into your functions dictionary. That dictionary now looks like this (with 'r' denoting a range type):

    example_functions = {
    	'py_add': ('ff','num1,num2',func_add),
    	'py_sum': ('r', 'values', func_sum)
    }
    		  
  3. Finally, make an entry in the XML list, so that it now looks like:

    			<functions>
    				<function name="py_add"/>
    				<function name="py_sum"/>
    			</functions>
    		  

I told you this was the easy way to do it. Obviously it's not very useful to just duplicate Gnumeric functions. But that's as far as I've made it. From what I can tell, range objects are packaged as opaque pointers of type RangeRefObject. There seems to be no way to work with them from within Python, so we must rely on the Gnumeric functions.

18.3.4.3. Do it yourself (mostly)

All is not lost, despite the opaque pointers. For in Gnumeric we can read about all the functions that have been defined. Some of those take references (including RangeRefs) and return useful information. For example, under "Lookup" we find "Column" and "Row" which return arrays of all the column (or row) indices in the range. So we can redo the sum function.

Presume we can convert our RangeRef to a start tuple and end tuple. Then we can write sum2:

def func_sum2(gRange):
	'@FUNCTION=PY_SUM2\n'\
	'@SYNTAX=PY_SUM2(range)\n'\
	'@DESCRIPTION=Adds a range of numbers together, '\
	'without calling built-in SUM.\n\n'\
	'@EXAMPLES=To add values in A1 to A5, just type them in:\n'\
	'    py_sum2(a1:a5)\n'\
	'@SEEALSO='
	try:
		[r_begin, r_end] = range_ref_to_tuples(gRange)
		wb=Gnumeric.workbooks()[0]   # Careful! This is WRONG! It doesn't
		s=wb.sheets()[0]             # use the ACTUAL workbook or sheet.

		val = 0
		for col in range(r_begin[0], r_end[0]):
			for row in range(r_begin[1], r_end[1]):
				cell = s[col, row]
				val = val + cell.get_value()
				# Note: this doesn't skip blank cells etc.

	except TypeError:
		raise GnumericError(GnumericErrorVALUE)
	else:
		return val
		

That's fine as far as it goes, but we need to define the helper function "range_ref_to_tuples". Although I'm rather ashamed to show this ugly code, here's how I did it (someone suggest a better way, please!):

def range_ref_to_tuples(range_ref):
	'''I need a function to find the bounds of a RangeRef. This one
	extracts them from the Gnumeric "column" and "row" commands, and
	returns them as a pair of tuples. Surely there is a better way?
	For example, return a list of cells??'''

	col  = Gnumeric.functions['column']   
	row  = Gnumeric.functions['row']

	# "column" and "row" take references and return an array of col or row
	# nums for each cell in the reference. For example, [[1, 1, 1], [2, 2, 2]]
	# for columns and [[2, 3, 4], [2, 3, 4]] for rows.

	try:
		columns = col(range_ref)
		rows    = row(range_ref)

		begin_col = columns[0][0] - 1  
		begin_row = rows[0][0] - 1     

		end_col = columns[-1][-1]
		end_row = rows[-1][-1]

		# We subtracted 1 from the begin values because in the API,
		# indexing begins at 0, while "column" and "row" begin at 1.
		# We did NOT subtract 1 from the end values, in order to make
		# them suitable for Python's range(begin, end) paradigm.
		
	except TypeError:
		raise GnumericError(GnumericErrorVALUE)
	except NameError:                     # right name?
		raise GnumericError(Gnumeric.GnumericErrorNAME)
	# Python has no built-in RefError or NumError, so naming them in an
	# except clause would itself fail with a NameError. Catch them only if
	# the Gnumeric module turns out to define such exception classes.


	return [ (begin_col, begin_row), (end_col, end_row) ]
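
To see the index arithmetic on its own, the sketch below applies the same steps to plain nested lists shaped like the example in the comment above, with no Gnumeric involved. The function name bounds_from_arrays is made up for this illustration.

```python
def bounds_from_arrays(columns, rows):
    """Turn 1-based 'column'/'row' style arrays into 0-based bounds.

    Returns [(begin_col, begin_row), (end_col, end_row)], where the end
    values are exclusive so they can be passed straight to range().
    """
    begin_col = columns[0][0] - 1     # 1-based -> 0-based
    begin_row = rows[0][0] - 1
    end_col = columns[-1][-1]         # left 1-based: acts as an exclusive end
    end_row = rows[-1][-1]
    return [(begin_col, begin_row), (end_col, end_row)]


# Arrays from the comment above: columns 1-2 and rows 2-4, i.e. A2:B4.
columns = [[1, 1, 1], [2, 2, 2]]
rows = [[2, 3, 4], [2, 3, 4]]
begin, end = bounds_from_arrays(columns, rows)

# Visit every (col, row) pair the way func_sum2 does.
cells = [(c, r) for c in range(begin[0], end[0])
                for r in range(begin[1], end[1])]
print(begin, end, len(cells))
```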
		

From there, insert the function into the dictionary, and insert its name into plugin.xml. I leave these as exercises to the reader (answers in the sample files -- no peeking!). Restart Gnumeric and you should be able to use py_sum2!

There are a couple of glitches:

  • It fails the first time with "could not import gobject". Just run it again; I don't know what that's about.
  • It will only work for Workbook 1 and Sheet 1. JK thinks that there may be no way to get the current Workbook/Sheet in the Python API. Hrm....
  • As noted, it should do some simple trapping to skip blank or text-filled cells. That can be done! I just didn't. It's late.
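
The last glitch can be handled with a small filter. The sketch below sums only numeric values and skips everything else. It runs on a plain list; what cell.get_value() actually returns for a blank cell is not stated in this guide, so treating blanks as None or an empty string is an assumption. The name sum_numeric is invented here.

```python
def sum_numeric(values):
    """Add up ints and floats, ignoring blanks, text, and booleans."""
    total = 0
    for value in values:
        # bool is a subclass of int in Python, so exclude it explicitly.
        if isinstance(value, bool):
            continue
        if isinstance(value, (int, float)):
            total += value
    return total


print(sum_numeric([1, 2.5, None, "foo", "", True, 4]))
```

In func_sum2, one way to use this would be to collect the cell values into a list inside the loops and pass that list to sum_numeric at the end.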

18.3.4.4. More help

Relative to the source tree:

  • The Python interface is defined in: plugins/python-loader/py-gnumeric.c That file also has good notes at the beginning.
  • There are interesting things about the way it used to be in: doc/developer/python-gnumeric.txt.

18.3.4.5. Program Listings

You can see my examples in full, with more comments:

18.3.5. Upgrading

To upgrade, first choose any method from the installation section above. But note: when you upgrade your Gnumeric version, it will look for your Python scripts in the corresponding version-named subdirectories. For example, if your scripts are in "~/.gnumeric/1.1.17/plugins", but you just upgraded to 1.1.18, you may need to rename that to "~/.gnumeric/1.1.18/plugins". If you want to keep and run several versions of Gnumeric, you'll have to copy or symlink them.
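
The copy step can be scripted. The sketch below copies the plugins folder from one version directory to another, leaving the old one in place so that both versions keep working. The function name copy_plugins and its arguments are invented for this example, and the demonstration runs in a temporary directory rather than touching ~/.gnumeric.

```python
import shutil
import tempfile
from pathlib import Path


def copy_plugins(base, old_version, new_version):
    """Copy base/old_version/plugins to base/new_version/plugins.

    Refuses to overwrite an existing destination. Returns the new path.
    """
    src = Path(base) / old_version / "plugins"
    dst = Path(base) / new_version / "plugins"
    if not src.is_dir():
        raise FileNotFoundError("no plugins folder at %s" % src)
    if dst.exists():
        raise FileExistsError("%s already exists" % dst)
    dst.parent.mkdir(parents=True, exist_ok=True)
    shutil.copytree(src, dst)
    return dst


# Demonstration in a scratch directory standing in for ~/.gnumeric.
with tempfile.TemporaryDirectory() as scratch:
    spellbook = Path(scratch) / "1.1.17" / "plugins" / "myfuncs"
    spellbook.mkdir(parents=True)
    (spellbook / "plugin.xml").write_text("<plugin/>")
    new_dir = copy_plugins(scratch, "1.1.17", "1.1.18")
    copied_ok = (new_dir / "myfuncs" / "plugin.xml").is_file()
    print(copied_ok)
```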

If you want the Python console, you'll also have to re-enable it, following the directions above. If you had made a local copy of the old one, make sure you don't copy or link that to the new directory. It won't work.

Find the new version with gnumeric --version, making sure to invoke the proper gnumeric.

18.3.6. Fancy tricks

To be written....

  • Swapping ranges (not a normal cell function, but I wrote one that did this). But now I can rewrite it using the GUI, which will make a lot more sense.
  • JK's python-only transpose function
  • A Gnumeric interface to the Snob clustering algorithm. Coming soon to a spreadsheet near you!

18.3.7. Features wanted, and Questions

  • Is it really impossible to determine the current workbook/sheet from Python? That's a bummer. [JK writes: "Not yet fixed, but now fixable."]
  • Several previous items are no longer on this list, due to the diligence of the Gnumeric hackers.