pg-python architecture
======================

This README provides a high-level overview of the implementation.

Native Typing
-------------

pg-python was designed with the goal of providing Python interfaces to
PostgreSQL objects--often conceptual rather than literal. This marks a sharp
contrast with traditional PL implementations as parameters are *not* sent
through conversion functions to become common language objects. Rather,
instances of "interface types" are used to represent values coming from the
database. Conversion mechanisms are, however, provided through standard language
interfaces as Postgres objects are not often directly usable by Python libraries.

Native typing provides Python interfaces to Postgres types using Python types.
This is achieved by extending the PyTypeObject structure with Postgres type
information(PyPgTypeInfo) extracted from a type's corresponding pg_type row.
Those types are 'Postgres.Type' objects. Postgres.Type's are conceptual, that is,
they are *not* instances of the 'pg_type' relation type.

Many Postgres.Type's have custom implementations providing access to an object's
data. Notably, time types provide properties that allow the user to leverage
much of date_part()'s functionality directly in Python. These convenience
interfaces for standard Postgres types are common and contribute to most of the
size of types/*.c files.

Support for Postgres operators are provided. Normally, this is a direct mapping
of the Python operator's textual representation, so "pg_ob1 * pg_ob2" will
resolve the "*" operator based on the types of pg_ob1 and pg_ob2. When only one
Postgres object is present, the other operand will be converted to that type.


Error Management
----------------

pg-python handles database errors by raising a Python exception representing
the error. By doing this, pg-python must record when this has been done so that
future database actions can be prohibited. The `pl_state` global is used to
track the error state of the PL.

When a database error(ereport, elog) occurs, pl_state is altered to indicate
that subsequent use of database functionality should be restricted by an
exception being raised. The "DB_IS_NOT_READY()" macro should be used at the
beginning of every Python interface that requires that the database is in a
non-error state.

Before the language handler exits, the PL must check the state to validate that
a database error has not been suffocated. Exiting the handler indicating success
in a failed transaction is likely to cause trouble for the caller,
so this check is employed to guarantee that the error state is properly
indicated.

The following error codes have been added:

 ERRCODE_PYTHON_ERROR
  This is used to describe common PL or language errors. These may have been
  indirectly caused by the user. This code is used to indicate that the Python
  error came from the edges of the PL, not directly from the user's code.

 ERRCODE_PYTHON_EXCEPTION
  A Python exception was raised by the function's code.

 ERRCODE_PYTHON_PROTOCOL_VIOLATION
  The user failed to meet the PL's expectations.
  Currently, this is used by check_state and triggers.

 ERRCODE_PYTHON_RELAY
  Used internally by the PL to designate that the failure is a Python exception
  and it should be raised (Python) or thrown (Postgres) accordingly.

Transaction Management (Internal Subtransactions)
-------------------------------------------------

Subtransactions are managed using a simple counter. Each Postgres.Transaction
instance will get a local transaction identifier that is used to validate that
transactions are exited in the appropriate order.


Prepared Statements and Cursors (SPI)
-------------------------------------

Database access is implemented using two types: the Postgres.Statement and the
Postgres.Cursor.

The statement objects provide an interface to a parameterized plan. When
invoked, a Postgres.Cursor object is created to manage the result set of the
Portal.

Scrollable cursors are supported using the statement's "declare" method. When a
scrollable cursor is created, the seek and read methods are available for use.


Postgres Interface Module
-------------------------

All Postgres interfaces are provided using the "Postgres" module. This is a
built-in module that is extended with the contents of the "module.py" file.
Specifically, once the C portions of module are initialized, the Python portion
is executed in the module's context. This provides an easy way to do further,
high-level initialization(@pytypes is implemented in pure-Python).


First Call Initialization
-------------------------

Much of the PL is initialized in _PG_init. However, it is possible for _PG_init
to be executed outside of a transaction(preloading), which means that we cannot
fully initialize interface module and the Python environment as it depends on
the system cache being available--specifically, regprocedure is used to
implement "proc(...)", and regtype is used to provide the "Postgres.types"
pseudo-module.


Inline Execution
----------------

DO statements are handled using a fake PyPgFunction object--that is, there is no
corresponding pg_proc entry. The function's module is set to the InlineExecutor
class defined in the pure-Python portion of the Postgres
module(See src/module.py). This class provides the "main" entry point which
takes a single argument, the source to execute.


Interrupt Support
-----------------

Interrupts are supported by overriding the default signal handlers for SIGTERM
and SIGINT. While scary at first, these overrides are *not* installed unless the
PL has been invoked--they are installed at first call, not on _PG_init.
Additionally, when they have been installed, the new functionality is only invoked
in transactions where there has been PL activity. Specifically, the
`handler_count` must be greater than zero *and* there must be a Python stack
frame.

Managing interrupts during PL execution is fairly tricky. When an interrupt
occurs, the KeyboardInterrupt exception is set via Py_AddPendingCall iff CFI
doesn't raise. If the Python interpreter never actually calls it, it's
important the Py_MakePendingCalls is performed between transactions in order to
avoid a spurious exception upon re-entry.
