LexDbInternals
The LexDB uses a PostgreSQL database to provide a source of lexical items for client applications such as the LKB.
The fld table stores the user-defined field definitions used in constructing the rev table below. Its contents are set by the script install-lexdb. Field definitions cannot be altered once the LexDB has been created.
The rev table stores revisions of lexical items. Its first four fields, which are hard-coded into every LexDB, have the following definitions:
name TEXT NOT NULL,
userid TEXT DEFAULT user NOT NULL,
modstamp TIMESTAMP WITH TIME ZONE DEFAULT CURRENT_TIMESTAMP NOT NULL,
dead BOOLEAN DEFAULT 'f' NOT NULL
Following these are the user-defined fields specific to a particular LexDB. These are obtained from the .fld file provided to the script install-lexdb. They are also stored in the public.fld table of the LexDB for later reference. The following are the user-defined fields used by the ERG LexDB:
type TEXT
orthography TEXT
keyrel TEXT
altkey TEXT
alt2key TEXT
keytag TEXT
altkeytag TEXT
compkey TEXT
ocompkey TEXT
pronunciation TEXT
complete TEXT
semclasses TEXT
preferences TEXT
classifier TEXT
selectrest TEXT
jlink TEXT
comments TEXT
exemplars TEXT
usages TEXT
lang TEXT
country TEXT
dialect TEXT
domains TEXT
genres TEXT
register TEXT
confidence real DEFAULT 1
source TEXT
The fields (name, userid, modstamp) provide the primary key. The field dead is used to mark dead revisions.
The rev_key table provides keys for the lookup of lexical items by component words. For example, a revision with orthography 'a few' will be keyed on both 'a' and 'few'. Keys are in normalized (lower-case) form as provided by the client application. (We do not use the PostgreSQL lower() function, as it may differ from the equivalent function used in the client application.)
name TEXT NOT NULL,
userid TEXT DEFAULT user NOT NULL,
modstamp TIMESTAMP WITH TIME ZONE DEFAULT CURRENT_TIMESTAMP NOT NULL,
key text NOT NULL
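As a rough illustration of how one revision might expand into several rev_key rows, the Python sketch below splits an orthography into component words and emits one row per word. The function name rev_keys, the entry name a_few_det, the whitespace splitting, the lower-casing and the removal of duplicate words are all assumptions made for illustration. In the LexDB itself the normalized keys are supplied by the client application.

```python
from datetime import datetime, timezone


def rev_keys(name, userid, modstamp, orthography):
    """Return one (name, userid, modstamp, key) row per distinct component word."""
    # Assumption: normalization is whitespace splitting plus lower-casing.
    # The real client application (e.g. the LKB) may normalize differently.
    keys = []
    for word in orthography.split():
        key = word.lower()
        if key not in keys:  # assumption: repeated words yield a single row
            keys.append(key)
    return [(name, userid, modstamp, key) for key in keys]


# Hypothetical revision with orthography 'a few'.
stamp = datetime(2005, 5, 17, 9, 48, 59, tzinfo=timezone.utc)
rows = rev_keys("a_few_det", "bmw20", stamp, "a few")
for row in rows:
    print(row[0], row[3])
```

Each row repeats the (name, userid, modstamp) triple of the revision it belongs to, so a lookup on either 'a' or 'few' leads back to the same revision.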
The dfn table stores the mapping used to construct TDL lexical entries from fields of a revision in rev (the entry can then be processed by the client application in the same manner as entries obtained from a textual TDL lexicon file). See LexDbFieldMappings.
mode TEXT NOT NULL,
slot TEXT NOT NULL,
field TEXT NOT NULL,
path TEXT,
type TEXT
The meta table stores assorted configuration settings and other data.
A sample public.meta is shown below:
var | val
-----------------------+-------------------------------------------
lexdb-version | 4.00
supported-psql-server | 7.4
supported-psql-server | 8.0
filter | TRUE
pub-fn | check_psql_server_version
...
pub-fn | dump_public_rev_rev_key_to_tmp_tmp_key
user | bmw20
mod_time | 2005-05-17 09:48:59.415022+01
A sample private meta is shown below:
var | val
------------+-------------------------------
filter | TRUE
mod_time | 2005-05-17 09:48:59.422795+01
build_time | 2005-05-17 10:04:37.91322+01
- user is set for each user for whom a private schema has been initialized;
- filter is an SQL WHERE-clause which determines which rev entries are visible to a user's lex table;
- mod_time stores the time at which data in the schema was last modified;
- build_time stores the time at which the (private schema) lex table was last rebuilt.
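Note that a var may occur more than once in meta (for example supported-psql-server and pub-fn in the public sample), so the table behaves like a multi-valued map, not a simple key/value store. The Python sketch below groups the sample rows accordingly. The helper name meta_values is hypothetical, and the rows elided by "..." in the sample are simply left out.

```python
from collections import defaultdict

# Rows copied from the sample public.meta; the rows elided there are omitted.
META_ROWS = [
    ("lexdb-version", "4.00"),
    ("supported-psql-server", "7.4"),
    ("supported-psql-server", "8.0"),
    ("filter", "TRUE"),
    ("pub-fn", "check_psql_server_version"),
    ("pub-fn", "dump_public_rev_rev_key_to_tmp_tmp_key"),
    ("user", "bmw20"),
    ("mod_time", "2005-05-17 09:48:59.415022+01"),
]


def meta_values(rows):
    """Group (var, val) rows into a dict mapping each var to a list of vals."""
    grouped = defaultdict(list)
    for var, val in rows:
        grouped[var].append(val)  # preserve row order within each var
    return dict(grouped)


meta = meta_values(META_ROWS)
print(meta["supported-psql-server"])
print(meta["filter"])
```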
The PostgreSQL database is created and owned by the database user lexdb. This user is the owner of the schema public. In order to make use of the LexDB, a client must log on as a separate user. The first time such a user connects to the LexDB, a private database schema will be initialized.
A private schema contains private rev, rev_key and meta tables. When lexical items are modified by the user (or new items created), the changes are stored in the private schema. When the user is happy with the changes, the new rev (and associated rev_key) entries are transferred to the public schema. (This requires lexdb login authorization.)
A private schema also contains lex and lex_key tables. These have the same structure as the rev and rev_key tables and are used to provide a cache of the user's current lexicon: that is, we take the union of public.rev with the user's private rev, pass the entries through the user's filter, and take the head (most recent) revisions. (The views filt and head correspond to stages in this process, but should not generally be accessed directly.)
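The union, filter and head stages can be sketched in Python as follows. This is only an illustration under stated assumptions, not the LexDB's actual SQL: the function name build_lex, the entry names, grouping revisions by name, and dropping entries whose head revision is dead are all assumptions, and the filter is modelled as a Python predicate standing in for the SQL WHERE-clause.

```python
from datetime import datetime, timezone


def build_lex(public_rev, private_rev, keep):
    """Union both rev tables, apply the filter, keep the head revision per name."""
    head = {}
    for rev in public_rev + private_rev:   # union of public.rev and private rev
        if not keep(rev):                  # stand-in for the user's filter
            continue
        current = head.get(rev["name"])
        if current is None or rev["modstamp"] > current["modstamp"]:
            head[rev["name"]] = rev        # most recent revision wins
    # Assumption: an entry whose head revision is dead is left out of the cache.
    return {name: rev for name, rev in head.items() if not rev["dead"]}


def ts(hour):
    """Hypothetical helper: a timestamp on 2005-05-17 at the given hour (UTC)."""
    return datetime(2005, 5, 17, hour, tzinfo=timezone.utc)


public_rev = [
    {"name": "cat_n1", "userid": "lexdb", "modstamp": ts(9), "dead": False},
    {"name": "dog_n1", "userid": "lexdb", "modstamp": ts(9), "dead": False},
]
private_rev = [
    {"name": "cat_n1", "userid": "bmw20", "modstamp": ts(10), "dead": False},
    {"name": "dog_n1", "userid": "bmw20", "modstamp": ts(11), "dead": True},
]

# A filter of TRUE, as in the sample meta tables, keeps every revision.
lex = build_lex(public_rev, private_rev, keep=lambda rev: True)
print(sorted(lex))
```

In this example the user's newer private revision of cat_n1 shadows the public one, while dog_n1 disappears from the cache because its most recent revision is marked dead.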
... TO DO