Code Architecture
(n.b. all descriptions below are intended to provide a high-level overview of how pycsw is implemented. For full details, please refer to the codebase.)
pycsw is a CGI-based application written in Python, which accepts HTTP GET and POST requests as per OGC:CSW 2.0.2. The basic flow of events is:
client request --> pycsw (handle request, produce response) --> server response
pycsw is always called from csw.py, and always instantiates a server.Csw object and then uses its dispatch() method to handle the request and generate a response.
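The call pattern described above can be sketched roughly as follows. This is a self-contained toy, not pycsw's code: the class name MiniCsw, its constructor arguments, and the handle() helper are hypothetical stand-ins for server.Csw and csw.py. Only the shape (instantiate, then call dispatch()) comes from the description.

```python
import xml.etree.ElementTree as ET


class MiniCsw:
    """Hypothetical stand-in for server.Csw (not pycsw's actual class)."""

    def __init__(self, config_path, environ):
        # A real server would load configuration and connect to the
        # repository here; this sketch only remembers its inputs.
        self.config_path = config_path
        self.environ = environ

    def dispatch(self):
        """Handle the request and return the response body as XML text."""
        root = ET.Element("Response")
        root.set("query", self.environ.get("QUERY_STRING", ""))
        return ET.tostring(root, encoding="unicode")


def handle(environ, config_path="default.cfg"):
    """Hypothetical equivalent of the csw.py entry point."""
    server = MiniCsw(config_path, environ)  # always instantiate...
    return server.dispatch()                # ...then always dispatch


if __name__ == "__main__":
    print(handle({"QUERY_STRING": "service=CSW"}))
```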
The server.Csw class sets up the server to be able to handle OGC:CSW requests accordingly:
- set up configuration (default.cfg)
- initialize the underlying repository (database) connection and queryables model
- set default HTTP properties (gzip compression)
- generate GetDomain model
- load any profile code (e.g. apiso)
- set up transactions (if specified)
- set up distributed search (if specified)
- set up logging (if specified)
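As a rough illustration of the configuration step, the sketch below reads settings with Python's standard configparser and treats optional features as enabled only "if specified". The section names, option names, and the load_settings() helper are illustrative assumptions, not a copy of pycsw's default.cfg or its loader.

```python
import configparser

# Hypothetical configuration text; section and option names are
# illustrative assumptions, not a copy of pycsw's default.cfg.
SAMPLE_CFG = """
[server]
gzip_compresslevel = 6
profiles = apiso
loglevel = DEBUG

[repository]
database = sqlite:///records.db

[manager]
transactions = false
"""


def load_settings(text):
    """Parse configuration text into a plain dict of startup settings."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    profiles = parser.get("server", "profiles", fallback="")
    return {
        "database": parser.get("repository", "database"),
        "gzip_level": parser.getint("server", "gzip_compresslevel",
                                    fallback=0),
        # Optional features default to "off" when not specified.
        "profiles": [p.strip() for p in profiles.split(",") if p.strip()],
        "transactions": parser.getboolean("manager", "transactions",
                                          fallback=False),
        "loglevel": parser.get("server", "loglevel", fallback=None),
    }


settings = load_settings(SAMPLE_CFG)
print(settings)
```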
At this point, pycsw is ready to handle the request, using server.Csw.dispatch(), which does the following:
- parse request (GET or POST or SOAP)
- do basic parameter checking (service, version, request)
- process the request accordingly
(server.Csw.exceptionreport() is always used when pycsw encounters an error and returns an OGC ExceptionReport)
All server.Csw methods return lxml.etree.Element objects, which are then processed by server.Csw.write_response() and returned to the client as XML.
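The dispatch steps and the final serialization can be sketched as below, for GET requests only. The function names echo the description, but their signatures and bodies are assumptions, and the standard library's xml.etree.ElementTree is used as a stand-in for lxml.etree so the example runs without extra packages.

```python
import gzip
import xml.etree.ElementTree as ET  # stdlib stand-in for lxml.etree
from urllib.parse import parse_qs

OWS_NS = "http://www.opengis.net/ows"  # OWS namespace used with CSW 2.0.2


def exception_report(code, locator, text):
    """Build a minimal ows:ExceptionReport element (attributes trimmed)."""
    report = ET.Element(f"{{{OWS_NS}}}ExceptionReport")
    exc = ET.SubElement(report, f"{{{OWS_NS}}}Exception",
                        exceptionCode=code, locator=locator)
    ET.SubElement(exc, f"{{{OWS_NS}}}ExceptionText").text = text
    return report


def dispatch(query_string):
    """Parse a GET query string, check basic parameters, return an Element."""
    # KVP parameter names are case-insensitive, so lower-case them.
    kvp = {k.lower(): v[0] for k, v in parse_qs(query_string).items()}
    for name in ("service", "version", "request"):
        if name not in kvp:
            return exception_report("MissingParameterValue", name,
                                    f"Missing keyword: {name}")
    if kvp["service"] != "CSW":
        return exception_report("InvalidParameterValue", "service",
                                "Invalid value for service")
    # A real server would now call the handler for kvp["request"].
    return ET.Element("Placeholder", request=kvp["request"])


def write_response(element, gzip_level=0):
    """Serialize an Element to XML bytes, optionally gzip-compressed."""
    body = ET.tostring(element, encoding="utf-8", xml_declaration=True)
    if gzip_level:
        return gzip.compress(body, compresslevel=gzip_level)
    return body
```

Returning an element on both the success and the error path keeps serialization in one place, which matches the description of every method handing an Element to a single response writer.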
GetCapabilities:
- handle SECTIONS parameter if specified
- handle extra profile parameters if specified
- set / process updatesequence
- return response XML as lxml.etree.Element
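A minimal sketch of SECTIONS handling, assuming the parameter is a comma-separated list of section names and that omitting it (or passing "All") means every section. The build_capabilities() helper and the un-namespaced element names are simplifications for illustration; xml.etree.ElementTree stands in for lxml.etree.

```python
import xml.etree.ElementTree as ET  # stdlib stand-in for lxml.etree

# Section names used for Capabilities documents in CSW 2.0.2.
ALL_SECTIONS = ("ServiceIdentification", "ServiceProvider",
                "OperationsMetadata", "Filter_Capabilities")


def build_capabilities(sections_param=None):
    """Return a Capabilities element holding only the requested sections."""
    if sections_param:
        wanted = {s.strip() for s in sections_param.split(",")}
        if "All" in wanted:
            wanted = set(ALL_SECTIONS)
    else:
        wanted = set(ALL_SECTIONS)

    root = ET.Element("Capabilities")
    for name in ALL_SECTIONS:  # iterate the tuple to keep a stable order
        if name in wanted:
            ET.SubElement(root, name)
    return root
```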
DescribeRecord:
- perform GET validation
- process the output of schemas as csw:SchemaComponent elements
- return response XML as lxml.etree.Element
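Wrapping schemas in csw:SchemaComponent elements might look roughly like this. The describe_record() helper and its input mapping are hypothetical, the schemaLanguage value is an assumed default, and xml.etree.ElementTree stands in for lxml.etree.

```python
import xml.etree.ElementTree as ET  # stdlib stand-in for lxml.etree

CSW_NS = "http://www.opengis.net/cat/csw/2.0.2"


def describe_record(schemas):
    """Wrap each schema element in a csw:SchemaComponent.

    `schemas` maps a target namespace to an already-parsed schema Element.
    """
    response = ET.Element(f"{{{CSW_NS}}}DescribeRecordResponse")
    for namespace, schema_root in schemas.items():
        component = ET.SubElement(
            response, f"{{{CSW_NS}}}SchemaComponent",
            schemaLanguage="XMLSCHEMA",  # assumed default value
            targetNamespace=namespace)
        component.append(schema_root)
    return response
```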
GetDomain:
- perform GET validation
- process parameter name
  - validate against internal domain model
- process property name
  - validate existence of property against self.repository.queryables['all']
  - query repository (SQL distinct query against XPath of queryable in records.xml)
- return response XML as lxml.etree.Element
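The property-name branch (validate against the queryables, then run a distinct query) can be sketched with sqlite3 as follows. For brevity the queryable maps to a plain column rather than an XPath into stored XML, and the QUERYABLES mapping, table layout, and get_domain_values() helper are all assumptions.

```python
import sqlite3

# Hypothetical queryables model: public property name -> column name.
QUERYABLES = {"dc:title": "title", "dc:type": "type"}


def get_domain_values(conn, property_name):
    """Return the distinct stored values for a queryable property."""
    if property_name not in QUERYABLES:
        # Unknown property: no query is run at all.
        return None
    column = QUERYABLES[property_name]  # trusted name, safe to interpolate
    rows = conn.execute(
        f"SELECT DISTINCT {column} FROM records ORDER BY {column}")
    return [row[0] for row in rows]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE records (identifier TEXT, title TEXT, type TEXT)")
conn.executemany("INSERT INTO records VALUES (?, ?, ?)", [
    ("1", "Roads", "dataset"),
    ("2", "Rivers", "dataset"),
    ("3", "Geocoder", "service"),
])
print(get_domain_values(conn, "dc:type"))  # ['dataset', 'service']
```

Checking the property against the queryables first means only known column names ever reach the SQL string.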
GetRecords:
- perform GET validation
- query repository. SQL query, one of:
  - spatial (util.query_spatial())
  - aspatial (util.query_xpath())
  - spatial + aspatial
- sorting (if specified)
- do distributed searching (if specified)
- write out results (based on outputschema)
  - distributed search results are returned verbatim
- return response XML as lxml.etree.Element
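The query step (spatial, aspatial, or both, followed by optional sorting) can be sketched with sqlite3. Note the substitutions: a pure-Python bounding-box overlap test replaces the Shapely-based spatial logic, a LIKE match on a column replaces the XPath query, and the table layout and get_records() helper are hypothetical.

```python
import sqlite3


def bbox_intersects(record_bbox, query_bbox):
    """Return 1 if two 'minx,miny,maxx,maxy' strings overlap, else 0."""
    if record_bbox is None:
        return 0
    ax1, ay1, ax2, ay2 = map(float, record_bbox.split(","))
    bx1, by1, bx2, by2 = map(float, query_bbox.split(","))
    return int(ax1 <= bx2 and bx1 <= ax2 and ay1 <= by2 and by1 <= ay2)


conn = sqlite3.connect(":memory:")
# Bind the Python function so SQL can call it with two arguments.
conn.create_function("query_spatial", 2, bbox_intersects)
conn.execute("CREATE TABLE records (identifier TEXT, title TEXT, bbox TEXT)")
conn.executemany("INSERT INTO records VALUES (?, ?, ?)", [
    ("a", "Toronto roads", "-80,43,-79,44"),
    ("b", "Ottawa roads", "-76,45,-75,46"),
    ("c", "Toronto parks", "-79.6,43.5,-79.1,43.9"),
    ("d", "Undated notes", None),
])


def get_records(conn, bbox=None, title_like=None, sort_desc=False):
    """Combine optional spatial and aspatial predicates, then sort."""
    where, params = [], []
    if bbox is not None:
        where.append("query_spatial(bbox, ?) = 1")
        params.append(bbox)
    if title_like is not None:
        where.append("title LIKE ?")
        params.append(title_like)
    sql = "SELECT identifier FROM records"
    if where:
        sql += " WHERE " + " AND ".join(where)
    sql += " ORDER BY title " + ("DESC" if sort_desc else "ASC")
    return [row[0] for row in conn.execute(sql, params)]
```

For example, get_records(conn, bbox="-80,43,-79,44", title_like="%roads") applies both predicates in a single SQL statement.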
GetRecordById:
- perform GET validation
- query repository. SQL query by id (against records.identifier)
- write out results (based on outputschema)
- return response XML as lxml.etree.Element
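A lookup by identifier could be sketched like this. The two-column records table and the get_record_by_id() helper are assumptions, the stored XML is appended unchanged instead of being transformed per outputschema, and xml.etree.ElementTree stands in for lxml.etree.

```python
import sqlite3
import xml.etree.ElementTree as ET  # stdlib stand-in for lxml.etree

CSW_NS = "http://www.opengis.net/cat/csw/2.0.2"


def get_record_by_id(conn, identifier):
    """Look up one record by identifier and wrap it in a response element."""
    response = ET.Element(f"{{{CSW_NS}}}GetRecordByIdResponse")
    row = conn.execute(
        "SELECT xml FROM records WHERE identifier = ?", (identifier,)
    ).fetchone()
    if row is not None:
        # A real server would write the record out per outputschema.
        response.append(ET.fromstring(row[0]))
    return response


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE records (identifier TEXT PRIMARY KEY, xml TEXT)")
conn.execute("INSERT INTO records VALUES (?, ?)",
             ("abc", "<Record><identifier>abc</identifier></Record>"))
```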
GetRepositoryItem:
- wrapper around server.Csw.getrecordbyid()
- gets raw XML record
- return response XML as lxml.etree.Element
Transaction:
- validate XML document
- insert mode
- update mode
- delete mode
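The three modes can be sketched against a toy sqlite3 table. "Validation" here is only a well-formedness check plus a required identifier child, which is much weaker than schema validation. The transaction() helper, its arguments, and the table layout are hypothetical.

```python
import sqlite3
import xml.etree.ElementTree as ET


def transaction(conn, mode, xml_text=None, identifier=None):
    """Apply one insert, update, or delete; return the affected row count."""
    if mode in ("insert", "update"):
        doc = ET.fromstring(xml_text)  # raises ParseError if malformed
        identifier = doc.findtext("identifier")
        if not identifier:
            raise ValueError("record has no identifier")
    with conn:  # commit on success, roll back on error
        if mode == "insert":
            cur = conn.execute("INSERT INTO records VALUES (?, ?)",
                               (identifier, xml_text))
        elif mode == "update":
            cur = conn.execute(
                "UPDATE records SET xml = ? WHERE identifier = ?",
                (xml_text, identifier))
        elif mode == "delete":
            cur = conn.execute("DELETE FROM records WHERE identifier = ?",
                               (identifier,))
        else:
            raise ValueError(f"unknown mode: {mode}")
    return cur.rowcount


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE records (identifier TEXT PRIMARY KEY, xml TEXT)")
```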
Harvest:
- fetch XML from URL
- insert into repository, or update if identifier exists
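The fetch-then-upsert behaviour can be sketched as follows. The harvest() and fetch_url() helpers, the records table, and the un-namespaced identifier element are assumptions. The fetch function is passed in as a parameter so the logic can be exercised without network access.

```python
import sqlite3
import urllib.request
import xml.etree.ElementTree as ET


def fetch_url(url):
    """Fetch a document over HTTP and return it as text."""
    with urllib.request.urlopen(url, timeout=30) as resp:
        return resp.read().decode("utf-8")


def harvest(conn, url, fetch=fetch_url):
    """Fetch XML from a URL, then insert it or update the existing record."""
    xml_text = fetch(url)
    identifier = ET.fromstring(xml_text).findtext("identifier")
    if not identifier:
        raise ValueError("harvested record has no identifier")
    with conn:  # commit on success, roll back on error
        exists = conn.execute(
            "SELECT 1 FROM records WHERE identifier = ?", (identifier,)
        ).fetchone()
        if exists:
            conn.execute("UPDATE records SET xml = ? WHERE identifier = ?",
                         (xml_text, identifier))
            return "updated"
        conn.execute("INSERT INTO records VALUES (?, ?)",
                     (identifier, xml_text))
        return "inserted"


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE records (identifier TEXT PRIMARY KEY, xml TEXT)")
```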
- server.Csw._gen_soap_wrapper() is the generic SOAP wrapper private method
- server/config.py sets the server's operation model in config.MODEL. Any modifications are then made by calling code (e.g. to add more queryables, typenames, etc.)
- spatial query magic is via Shapely in server/util.py:query_spatial(), called via SQL function bound back to this method
- full text (e.g. '*:AnyText') style queries are via server/util.py:query_anytext(), called via SQL function bound back to this method
- XPath style queries are via server/util.py:query_xpath(), called via SQL function bound back to this method
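The "SQL function bound back to this method" idea can be illustrated with sqlite3's create_function(), which registers a Python callable under a name that SQL statements can then call. The two functions below reuse names from the text, but their signatures and bodies are assumptions, and ElementTree's limited path syntax stands in for full XPath.

```python
import sqlite3
import xml.etree.ElementTree as ET  # stdlib stand-in for lxml.etree


def query_xpath(xml_text, path):
    """Return the text of the first node matching `path`, or None."""
    # ElementTree supports only a limited XPath subset.
    return ET.fromstring(xml_text).findtext(path)


def query_anytext(xml_text, term):
    """Return 1 if `term` occurs anywhere in the record's text content."""
    text = " ".join(ET.fromstring(xml_text).itertext())
    return int(term.lower() in text.lower())


conn = sqlite3.connect(":memory:")
# Register the Python callables as SQL functions (name, arg count, callable).
conn.create_function("query_xpath", 2, query_xpath)
conn.create_function("query_anytext", 2, query_anytext)
conn.execute("CREATE TABLE records (identifier TEXT, xml TEXT)")
conn.executemany("INSERT INTO records VALUES (?, ?)", [
    ("1", "<Record><title>Roads</title><subject>transport</subject></Record>"),
    ("2", "<Record><title>Rivers</title><subject>hydrology</subject></Record>"),
])

titles = conn.execute(
    "SELECT identifier FROM records WHERE query_xpath(xml, 'title') = ?",
    ("Rivers",)).fetchall()
matches = conn.execute(
    "SELECT identifier FROM records WHERE query_anytext(xml, ?) = 1",
    ("TRANSPORT",)).fetchall()
```

Binding functions this way lets the database engine do the row iteration while the per-record matching stays in Python. The cost is that each stored XML document is parsed again on every call.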