LINUX GAZETTE

"Linux Gazette...making Linux just a little more fun!"


Pl/Python and Cursors in Pl/Pgsql for PostgreSQL

By Mark Nielsen


  1. Purpose
  2. Compiling PostgreSQL
  3. Executing Pl/Python and Pl/PgSQL with Cursors
  4. Suggestions for Future
  5. Conclusion
  6. References

Purpose

The purpose of this article is to get people more familiar with some new programming techniques with procedural languages in PostgreSQL 7.2. I am really a big fan of procedural languages, and overall, PostgreSQL rocks when it comes to stored procedures. You can create many different types of functions like : SQL, Pl/PgSQL, Pl/Perl, and Pl/Tcl. Recently in 7.2, you also have Pl/Perlu and Pl/Python --- unrestricted Perl and Python. The ability to use other programming languages inside a database server makes life a lot easier for a programmer (even if there is some inefficiency with Perl and Python). For a few years, I was irritated by the fact procedural languages couldn't return more than one item. That has sort of been taken care of, but not perfectly. One of the last areas to make my life ten times easier is to have procedural languages return more than item. We will see we can sort of do this, but I have suggestions to make it better.

NOTE: I am using Red Hat 7.2 as a base for this article. Things will be different with your version of Linux.

Compiling PostgreSQL

There are three things you need to know when compiling PostgreSQL:
  1. Edit src/pl/plperl/Makefile.PL and comment out lines 14 through 34.
  2. Edit src/include/pg_config.h.in and change
    #define INDEX_MAX_KEYS 16
    to
    #define INDEX_MAX_KEYS 512
  3. Use this command to compile PostgreSQL:
    adduser postgres
    mkdir /usr/local/pg72
    ./configure --prefix=/usr/local/pg72 --with-libraries=/usr/lib/perl5/5.6.0/i386-linux/CORE/libperl.a --with-perl --with-python
    make
    make install
    chown -R postgres /usr/local/pg72
    mkdir /home/postgres
    chown postgres /home/postgres
    su -l postgres
You can also add odbc, tcl, and other languages, but I am just using Perl and Python as examples for now. After you execute "make" and "make install", follow these commands as the user postgres.

  1. PATH=/usr/local/pg72/bin:$PATH
    export PATH
    LD_LIBRARY_PATH=/usr/local/pg72/lib
    export LD_LIBRARY_PATH
    export PGDATA=/usr/local/pg72/data
    export PGLIB=/usr/local/pg72/lib
  2. initdb
  3. /usr/local/pg72/bin/pg_ctl -D /usr/local/pg72/data -l /tmp/pg_logfile start
  4. createlang plperl template1
  5. createlang plpython template1
  6. createlang plpgsql template1
  7. createdb postgres

Executing Pl/Python and Pl/PgSQL with Cursors

Here are a list of commands you can execute using the command "psql template1". This assumes the database "postgres" has been created.

Execute these commands:


\c postgres

drop table contact;
create table contact (
first text, last text, phone text, address text,
city text,state text, zip text
);

drop function replace_e_to_a(text);
CREATE FUNCTION replace_e_to_a(text) RETURNS text AS
'
import re
Text1 = re.sub(''e'', ''a'',args[0])
return Text1
'
LANGUAGE 'plpython';

drop function replace_numbers_to_z(text);
CREATE FUNCTION replace_numbers_to_z(text) RETURNS text AS
'
import re
Text1 = re.sub(''[0-9]'', ''z'',args[0])
return Text1
'
LANGUAGE 'plpython';

insert into contact values ('Mark','Nielsen','123-123-1234',
  '1234 Somewhere St.', 'Some City 123', 'TX','12345-1234');
insert into contact values ('Mark','Nielsen2','123-123-1234',
  '3456 Somewhere St.', 'Some City 444', 'MI','12345-1234');
insert into contact values ('Mark','Nielsen3','123-123-1234',
  '5678 Somewhere St.', 'Some City 654', 'AX','12345-1234');

select first, last, address  from contact;

drop function select_contact();
CREATE FUNCTION select_contact () RETURNS refcursor AS '
DECLARE
    cursor1 CURSOR FOR select replace_e_to_a(first)as first, 
                        replace_e_to_a(last) as last,
                        replace_numbers_to_z(address) as address
            from contact;
BEGIN
 open cursor1;
 return (cursor1);
END;
' LANGUAGE 'plpgsql';

begin;
select select_contact();
FETCH cursor1; FETCH cursor1;FETCH cursor1;
end;
and the output should look like:
DROP
CREATE
DROP
CREATE
DROP
CREATE
INSERT 255188 1
INSERT 255189 1
INSERT 255190 1
 first |   last   |      address       
-------+----------+--------------------
 Mark  | Nielsen  | 1234 Somewhere St.
 Mark  | Nielsen2 | 3456 Somewhere St.
 Mark  | Nielsen3 | 5678 Somewhere St.
(3 rows)

DROP
CREATE
BEGIN
 select_contact 
----------------
 cursor1
(1 row)

 first |  last   |      address       
-------+---------+--------------------
 Mark  | Nialsan | zzzz Somewhere St.
(1 row)

 first |   last   |      address       
-------+----------+--------------------
 Mark  | Nialsan2 | zzzz Somewhere St.
(1 row)

 first |   last   |      address       
-------+----------+--------------------
 Mark  | Nialsan3 | zzzz Somewhere St.
(1 row)

COMMIT

From this example, you can see how the Pl/pgSQL executes the Python procedures (the Pl/Python procedures). You don't need Pl/pgSQL to execute Pl/Python procedures, I just did it that way. You can only use Perl and Python to manipulate data, not change data in the tables. Perl and Python just input and output data, they don't do anything to the database directly. Also, pl/perlu gets installed when you compile Perl into PostgreSQL, which is nice.

Suggestions for Future

I still have three big complaints/suggestions:
  1. Procedures seem to be able to only return one "item'. It can't return two items. For me, I would like to be able to return at least two items -- a cursor and a flag saying if the cursor is good or not. Cursors DO NOT make procedures return more than one value, they return the cursor which you can consider to be a reference to many items, not the items themselves. The way I compile my procedures is that they can take upto 512 variables of input, but they can only return one "item". I am dumbstruck as to why that is. I tried to read of the discussions on the mailing lists and docs, but I got lost.
  2. There doesn't seem to be very many things you can do to cursors, like see how many entries there are, if there are any entries left, if there are any entries at all, if it executed correctly, etc. Cursors need to be advanced some more. Perhaps I just don't have enough experience with cursors to comment on them.
  3. Procedures can't return rows of data. Besides the fact it can only return one item, it can't return rows of one items or rows which contain multiple items. Being able to return multiple rows with multiple fields would be cool. Even if we only want to return one row of data, it would be nice to return a cursor as one item, and then its status (good, bad, how many, etc) as well. If you can return more than one item, you might as well make it unlimited or limited by a compile-time number (like inputs are).
I like the progression of procedural languages in PostgreSQL. It seems to be way ahead of a lot of the other free database servers and even most of the commercial ones. Taking it to the next step would really make it more of a kick-butt database server. I try to help out by writing up articles, perhaps you can try to help out by adding features?

This isn't really a complaint of cursors, but of DBD:Pg for Perl, and possibly other interfaces to PostgreSQL -- cursors really aren't supported, as far as I can tell. Thus, if Pl/PgSQL could return multiple rows of multiple items, it would take care of this problem.

The only other way I know to store data from a procedure is into a temporary table which can be accessed after the procedure has finished. To me, that is a bit extreme for 99% of the data I want to get. It is extreme because usually I just want one row of data and creating a table just for one row of data isn't worth the effort.

Conclusions.

Pl/Python will finally let me let go of Perl once and for all (I have converted myself to Python). Pl/PgSQL is getting closer to something that makes it easy for me to program and create complicated procedures -- I just wish it could actually return multiple items and not just a reference to a cursor or other single item.

The sad thing is, my version of DBD::Pg for Perl and my Python interface don't support cursors, and so, it is useless for me to the most part, but at least it is getting better. I found some things at http://developer.postgresql.org/todo.php which look promising. Since cursors really aren't supported in the programming languages I use, if I truly need to store lots of data, I will probably have to use temporary tables. I still don't understand why a procedural language can't return data like you can in a normal sql command. Whatever the limitation is, it would be nice to overcome.

References

  1. Standard Database Setup with Perl and PostgreSQL: Part 3
  2. Part 2: PostgreSQL: Perl procedures with PL/pgSQL
  3. Part 1: PostgreSQL: Perl procedures with PL/pgSQL.
  4. An older article setting of Web and Database Servers.
  5. Branden Williams articles on PostgreSQL.
  6. http://techdocs.postgresql.org/oresources.php
  7. http://techdocs.postgresql.org/
  8. Some links which have nothing to do with this article, but I am considering for future articles.
  9. If this article changes, it will be available at http://www.gnujobs.com/Articles/26/nielsen.html

Mark Nielsen

Mark works at AudioBoomerang.com which creates, delivers, and tracks personalized multimedia email, web, and newsletter campaigns. He works as a consultant delivering end products to AudioBoomerang.com clients, such as advanced customized statistical reports used for demographic or pyschological profiles for future campaigns. In his spare time, he writes articles relating to Free Software (GPL) or Free Literature (FDL) and is involved with the non-profit learning center eastmont.net.


Copyright © 2002, Mark Nielsen.
Copying license http://www.linuxgazette.net/copying.html
Published in Issue 80 of Linux Gazette, July 2002