"Linux Gazette...making Linux just a little more fun!"
Evaluating postgreSQL for a Production Environment
Introduction
With the advent of relatively powerful, free and/or cheap software, I wanted
to see if I was able to create a production environment around Linux and
a DBMS. In the past I have worked with several DBMS products in several
environments. My goal was to evaluate the Linux/postgreSQL against the
several environments I am familiar with. Out of these environments I will
pick the aspects which I think are important in a production environment.
Past production environments
I have worked in three stages of production environment. These were the
unconnected PC environment, the connected PC environment (file/print server
networking), and multi-user/multi-tasking environment (minicomputer). I
will introduce the several tools in order of this complexity.
Some terms will need to be explicitly defined, because the xBase terminology
is sometimes confusing. The term 'database' here means the collection of
several related tables which are needed to store organised data. The term
'table' is used to define one collection of identical data, a set. This
is because in the original xBase languages, 'database' was used as to mean
'table'.
FoxBase
Effectively being a much faster clone of dBase III, FoxBase contained the
minimum necessary to define the tables of a database and a programming
language which contained all necessary to write applications very quickly.
A database consisted of several tables and their indexes. The association
of tables and their indexes must explicitly be done using commands.
The programming language used is completely procedural. It contains
statements to create menu's, open and close tables, filter tables (querying),
insert, update and delete records, view records through screens and a browse
statement. Defining all these things in a program is quite straightforward.
Records are manipulated as program variables. All data is stored in ASCII
format.
One special feature which originated in dBase, are 'macro's'. These
macro's are text strings, which could be compiled and interpreted at run-time.
This was a necessary feature, because most statements took string arguments
without quotes, e.g. OPEN MY_TABLE. If you wanted to define a statement
with a parameter, you could not directly refer to a variable. Trying to
execute OPEN ARG_TABLE, the program would search for the file 'ARG_TABL'.
To circumvent this problem you need to code the following :
ARG_TABLE = "MY_TABLE"
OPEN &ARG_TABLE
Clipper
Clipper originated as a dBase compiler, but added soon after powerful extensions
to the language. I have also worked with the Summer '87 and the 5.0 version.
At the database level, nothing changed very much from FoxBase, but at the
user interface level and the programming level, several advanced features
offered faster development turn-around times and advanced user interfacing.
The macro feature was still available, but Clipper expanded it through
code blocks. In the original implementation, a macro needed to be evaluated
every time it was used. In cases where macro's where used to filter data,
this amounted to waste of computing time. The introduction of the code
block made it possible to compile a macro just once and then use the compiled
version.
Other features where the introduction of some object-oriented classes
for user interfacing, a powerful multi-dimensional array type, declarations
for static and local variables and a plethora of functions to manipulate
arrays and tables. The flipside of all this was that learning to effectively
use the language took some more time. I have two books about Clipper 5.0
and they are quite large.
FoxPro 2.0
FoxPro was the successor of FoxBase. It added GUI-features to the text-interface,
making it possible to work with overlapping windows. FoxPro 2.0 also added
embedded SQL statements. It was only a subset with SELECT, INSERT, UPDATE
and DELETE, but this offered already a substantial advantage over the standard
query statements. It also offered a better integration between tables and
their indexes, and one of the most powerful query optimizers ever developed.
They also provided some development tools, of which the most important
where the screen development and the source documentation tools.
Clipper and FoxPro made it also possible to program for networks and
thus enable multi-user database systems.
WANG PACE
WANG PACE is a fully integrated DBMS development system which runs on the
WANG VS minicomputers. It offers an extended data dictionary with field-
and record-level validation, HL-language triggers and view-definitions.
All defined objects contain a version count. When an object is modified
and subsequent programs are not, then a runtime error is generated when
compiled versions don't match DD versions. It also offers a powerful screen
editor, a report editor and a query-by-example system. Access through COBOL,
COBOL85 or RPGII is available with a pre-processor which compiles embedded
statements into API-calls.
Summary of important features
If I look in retrospect to these systems, what were the important features
which made programming more productive ? This reference must be made against
postgreSQL and the available libraries for interfacing to the back-end.
It must also be made from the point of the production programmer, who must
be able to write applications without being bothered by irrelevant details.
-
Field names translate to native variable names
Defining a field name for a table makes it available under the same
name to the program which can then use it as an ordinary, native variable.
-
Uniform memory allocation system
The xBase systems have a dynamic memory allocation scheme, which is
completely handled by the runtime library. COBOL is fully statically allocated.
In none of these languages the programmer needs to be concerned with tracking
allocated memory.
-
Direct update through the current record
The manipulated record is available to update the table through one
or another UPDATE statement.
-
Database statements have the same format as the application language
When programming in xBase, the statements to extract and manipulate
data from the database tables formed an integral part of the procedure
language.
In COBOL, the statements where embedded and processed by a preprocessor.
The syntax of the available statements was made to resemble COBOL syntax,
with its strong and weak points.
-
Simple definition and usage of screens
In xBase, there are simple yet powerful statements available for defining
screens. Screens are called through only one or two statements.
In WANG PACE, screens can only be defined through the screen editor.
There are three statements available : one to use menu's, one to process
one record in a cursor and an iterative version to process all records
in a cursor. Most screen processing is handled through the run-time libraries.
Features available when installing postgreSQL
The four first features can be installed using the ecpg preprocessor. This
makes it possible to use native program variables, you don't have to worry
about memory allocation, because the run-time library takes care of it,
and updates can also take place using the selected program-variables.
What is missing, is a special form of include statement. Now you need
to know which fields are in a table if you want to use a 'exec sql declare'
statement. It would be better if there was something like 'exec sql copy
fields from <tablename>'. If the tabledefinition then changes, recompiling
the program will adjust to the new definitions.
Using the pgaccess program (under X-windows) provides access to the
data dictionary in a more elegant manner.
Summary
I started out to write a critique on postgreSQL because of the summary
documentation which is delivered in the package. This made it rather hard
to find and use all the components which provide additional functions.
I started describing my experiences on other platforms to get an insight
in what a production environment should deliver to the programmer. Then
I started to look closely at the delivered documentation and to my surprise
all components that I needed where in fact in the package.
The following critique still remains however. The documentation of the
package is too much fragmented, and most parts of the documentation are
centered around technical aspects which do not bother the production programmer.
This is understandable however. The documentation is written by the same
people that implement them. I know of my own experience that writing a
user manual is very hard and it is easy to get lost in the technical details
of the implementation that you know about.
The following parts of postgreSQL are important for the production programmer,
and their documentation should be better integrated.
-
The psql processor
This is a nice tool to define all necessary objects in a database,
to get acquainted with SQL and to test ideas and verify joins and queries.
-
The ecpg preprocessor
This is the main production tool to write applications which use
database manipulation statements. This capacity should probably be extended
to other languages too. Since all bindings from the selected cursor are
made to program variables, records can be processed without the hassle
of converting them from and to ASCII, and updates can be made through the
'exec sql update' statement.
-
The pgaccess package
The pgaccess package provides access to all parts of the database
and offers the ability to design screens and reports. It is still in a
development phase. I hope it will be extended in the future, because the
basic idea is excellent and the first implementations are worthwile.
The libpq library is of no real value to a production programmer. It should
be mainly a tool to be used in implementing integrated environments and
database access languages. It could e.g. be used to create an xBase like
environment (for those who wish to use this).
Further research
In the following weeks (months) I hope to setup a complete database system
over a network, with one server and several types of clients (workstation,
terminal, remote computer) through several interfaces (Ethernet, serial
connections). I will investigate the several platforms for application
development. I intend to have a closer look at the provided tools in the
postgreSQL package (developing a simple database for my strip book collection),
but I will also look at the possibilities that Java offers a development
platform with JDBC, screen design and application programming.
One last note : for the moment I concentrate on using tools that I don't
need to pay for, because I need the money for my hardware platforms and
for my house. This does not mean that I am a die-hard 'software should
be gratis' advocate. A production environment favors to pay for software,
because then it knows that it has a complete tool with support and warranty
(horror stories about bad support not withstanding).
Copyright © 1999, Jurgen Defurne
Published in Issue 36 of Linux Gazette, January 1999