Schema

Section: Herrin Software Development, Inc. (5)
Updated: 21 Oct 2000

NAME

Schema - Schema description file for Qddb relations.

DESCRIPTION

The Schema file in a Qddb relation describes each attribute in the relation and whether the attribute is expandable and/or structured. Expandable attributes may have more than one instance within a single tuple. Structured attributes are a way of keeping secondary relational information with the primary tuple. The Schema file also specifies various options associated with each attribute, and some options that are associated with the entire relation.

FORMAT

Each Qddb Schema may have any of the following options that apply to the entire relation:

    HashSize = <integer>
    HashType = <integer>
    CacheSize = <integer>
    MaxMem = <integer>
    DateFormat = <string>
    RegexType = V8|POSIX|PCRE
    Use Cached Secondary Search
    Use Cached Hashing
    Use Reduced Attribute Identifiers
    Use ExcludeWords
    Use Condensed Indexing
    Index Full Attribute Values

These options should go at the top of the Schema file.

HashType = <integer> specifies the hash function. A value of 1 selects a hash function designed for data in which prefixes and suffixes recur frequently. A value of 0 selects a hash function that uses the first four characters of the string to compute the hash value; 0 is the default for backward compatibility. You can experiment with your own hash functions by inserting them into the function Qddb_HashValue (in qddb-<version>/Lib/LibQddb/Hash.c).
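As a rough illustration of the HashType = 0 behavior described above, the following Python sketch hashes only the first four characters of a word and reduces the result to a bucket number. The function name first_four_hash, the mixing arithmetic, and the bucket count are assumptions made for this example; the real function is Qddb_HashValue in Hash.c and may work differently.

```python
def first_four_hash(word: str, hash_size: int) -> int:
    """Hypothetical hash: combine the first four characters, then pick a bucket."""
    value = 0
    for ch in word[:4]:              # characters past the fourth are ignored
        value = value * 256 + ord(ch)
    return value % hash_size


# Words that share a four-character prefix land in the same bucket, which is
# why prefix-heavy data may be better served by HashType = 1.
buckets = {w: first_four_hash(w, 1000) for w in ("database", "datagram", "schema")}
```

In this sketch "database" and "datagram" collide because only "data" contributes to the hash value.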

CacheSize = <integer> specifies the number of hash entries that are cached when using cached hashing.

DateFormat = <string> specifies the default date format used when an attribute of type date does not specify its own format.

MaxMem = <integer> specifies the maximum amount of memory, in KB, that Qddb will attempt to use during stabilization. This number is only a hint.

RegexType = V8|POSIX|PCRE specifies the default type of regular expression search. V8 is the default and is consistent with the original V8 regular expressions. POSIX uses the POSIX regular expression implementation in the GNU C library. PCRE uses the Perl Compatible Regular Expressions library. PCRE is only available if the libraries from pcre.org are installed when Qddb is compiled, for example, on Ubuntu when the libpcre3-dev package is installed.

Use Cached Secondary Search informs Qddb that your database will be heavily modified and all modifications should be indexed when saved. The index is stored both on disk (in the file SecondaryCache) and in memory.

Use Cached Hashing tells Qddb to build a HashTable of fixed width and height so that entries may be read on demand. We recommend this option for almost all relations. HashSize = <integer> specifies the number of hash buckets to use when stabilizing the relation.

Use Reduced Attribute Identifiers reduces the size of the Database and Index files by converting the full attribute identifiers to a base 36 number indexing into a new file called RedAttrIndex.

Use ExcludeWords specifies that the ExcludeWords file, if present in the database directory, contains a list of newline-separated words to exclude from indexing. If the file does not exist, this option has no effect.

Use Condensed Indexing instructs the stabilization process to use base-62 integers for indexing instead of base-10. This option can greatly reduce the size of all the index-related files. The database must be immediately restabilized after adding or removing this option.
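To see why a larger number base shrinks the files, the following Python sketch encodes the same integer in base 10, base 36 (as used by Use Reduced Attribute Identifiers), and base 62 (as used by Use Condensed Indexing). The digit alphabet and its ordering are assumptions for this example; the encoding Qddb actually writes to disk is not specified here.

```python
import string

# Digit alphabet: 0-9, a-z, A-Z. The ordering is an assumption for this sketch.
ALPHABET = string.digits + string.ascii_lowercase + string.ascii_uppercase


def to_base(n: int, base: int) -> str:
    """Encode a non-negative integer using the first `base` symbols of ALPHABET."""
    if not 2 <= base <= len(ALPHABET):
        raise ValueError("base must be between 2 and 62")
    if n < 0:
        raise ValueError("n must be non-negative")
    if n == 0:
        return ALPHABET[0]
    digits = []
    while n:
        n, rem = divmod(n, base)
        digits.append(ALPHABET[rem])
    return "".join(reversed(digits))


offset = 10_000_000               # an arbitrary example value
as_base10 = to_base(offset, 10)   # 8 characters
as_base36 = to_base(offset, 36)   # 5 characters
as_base62 = to_base(offset, 62)   # 4 characters
```

The saving per number is small, but index files hold a great many numbers, so the total can add up.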

Index Full Attribute Values instructs the stabilization process to index the full attribute value of every attribute in each tuple, in addition to any words separated by characters in the separators attribute option. Specifying this option is the same as supplying the fullindex option to each attribute in the Schema. Index Full Attribute Values may be overridden in individual attributes with the nofullindex option.

Comments may be included in the Schema file by preceding the comment with a

Attributes are described in either a leaf or a structured form. The leaf form is:

    AttributeName 
        ?verbosename "attribute desc."? 
        ?type integer|real|string|date?
        ?alias aliasname?
        ?format "format string"?
        ?separators "separator string"?
        ?defaultvalue "value string"?
        ?exclude?
        ?fullindex|nofullindex?
        ?autoincrement([0-9]+?,[0-9]+?)?
        ?*?

The default type is string. The default formats are "%f" for attributes of type real, "%ld" for attributes of type integer, and either "%m/%d/%y" or "%d/%m/%y" for attributes of type date (depending on the --with-default-date-format option to configure). Dates may have any format as described by strftime(3). Strings may not have a format specified.
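The default formats above follow printf(3) and strftime(3) conventions. Python's % operator and strftime accept the same conversions, so the following sketch can be used to preview what a given format string produces. The sample values are made up, and the sketch assumes Qddb hands these strings to the C library unchanged.

```python
from datetime import date

# printf-style formats for numeric attributes
real_default = "%f" % 3.5        # "3.500000"
int_default = "%ld" % 42         # "42" (Python ignores the l length modifier)
area_code = "%3d" % 7            # "  7" (padded to width 3)
points = "%.2f" % 12.5           # "12.50"

# strftime-style formats for date attributes
entered = date(2000, 10, 21)
month_first = entered.strftime("%m/%d/%y")   # "10/21/00"
day_first = entered.strftime("%d/%m/%y")     # "21/10/00"
long_form = entered.strftime("%d %B %Y")     # "21 October 2000" in the default C locale
```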

Separators affect Qddb's definition of a word. By default, any alphanumeric sequence of characters (separated by non-alphanumeric characters) is a word.

Separators for attributes of type string default to:

    "	 r!@#$%^&*()_+-={}[];':

Attributes of type real default to the above string minus ".+-". Attributes of type integer or date default to the above string minus "-+". You should take care when specifying separators for any attribute with a type other than string.
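The effect of the separator string on word boundaries can be sketched in Python as follows. The function split_words and the separator strings below are illustrative assumptions: STRING_SEPS is only a subset chosen for the example, not Qddb's full default list.

```python
def split_words(value: str, separators: str) -> list:
    """Split value into words at any character found in separators."""
    words, current = [], []
    for ch in value:
        if ch in separators:
            if current:                  # close the word in progress
                words.append("".join(current))
                current = []
        else:
            current.append(ch)
    if current:
        words.append("".join(current))
    return words


# Illustrative subset of separators (an assumption, not the real default).
STRING_SEPS = " \t!@#$%^&*()_+-={}[];':,."
# For type real, the text says ".+-" are removed from the separator set.
REAL_SEPS = "".join(c for c in STRING_SEPS if c not in ".+-")

as_string = split_words("-3.14", STRING_SEPS)   # ['3', '14']
as_real = split_words("-3.14", REAL_SEPS)       # ['-3.14']
```

With the string separators the number falls apart into two words; with the reduced set it is indexed as a single word, which is why separators for non-string types need care.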

The exclude option excludes the corresponding attribute from indexing. Any search on that attribute will return a NULL keylist.

The attribute option fullindex affects Qddb's indexing by adding the entire attribute value, including any separators, as an extra word for the explicit purpose of regular expression or word searching across the entire value. This extra word is in addition to words parsed using the attribute's separators. The attribute option nofullindex turns off the fullindex option that becomes default when using the Index Full Attribute Values Schema option.
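A minimal Python sketch of the difference fullindex makes, assuming the default word rule (runs of alphanumeric characters). The function index_words is hypothetical and only models which words would be available to a search, not how Qddb stores them.

```python
import re


def index_words(value: str, fullindex: bool = False) -> list:
    """Return the words that would be indexed for one attribute value."""
    # Default word rule from the text: runs of alphanumeric characters.
    words = re.findall(r"[A-Za-z0-9]+", value)
    if fullindex:
        words.append(value)      # the whole value, separators included
    return words


plain = index_words("555-1212")                   # ['555', '1212']
full = index_words("555-1212", fullindex=True)    # ['555', '1212', '555-1212']
```

In this model a regular expression such as 555-12 can only match when the full value has been indexed, because neither of the separate words contains the hyphen.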

The structured form is:

    AttributeName ?verbosename "attribute desc."? (
        SubAttributeName <options>
        SubAttributeName <options>
    ) ?*?

Optional features are enclosed by "?" above and may be omitted. Schemas are free-format, so arrange things to suit you.

EXAMPLES

Suppose that you want to build a small address book with all your friends' names. Each friend has a family (with multiple first and last names) and possibly multiple addresses and multiple phones. A schema for this might be:

    Use Cached Hashing
    HashSize = 1000
    CacheSize = 100

    FamilyMembers verbosename "Family Members" (
        Name (
            First verbosename "First Name"
            Middle verbosename "Middle Name"
            Last verbosename "Last Name"
        )
    ) *
    Address (
        Street City State ZipCode verbosename "Zip Code"
        Phones (
            Desc verbosename "Description"
            Area verbosename "Area Code" type integer format "%3d"
            Prefix type integer format "%3d"
            Suffix type integer format "%4d"
        ) *
    ) *
    DateEntered verbosename "Date Entered" type date format "%d %B %Y"
    Brownies verbosename "Brownie Pts." type real format "%.2f"

Each record in this relation may contain multiple FamilyMembers, each with its own Name. Each record also has a set of addresses, and each address has its own set of phone numbers.
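The shape of one such record can be pictured as nested Python data, with a list wherever the schema marks an attribute as expandable (*). This is only a mental model with made-up values; it does not represent how Qddb stores tuples on disk.

```python
# One hypothetical tuple for the address book schema.
record = {
    "FamilyMembers": [                       # expandable: FamilyMembers ( ... ) *
        {"Name": {"First": "Ann", "Middle": "B", "Last": "Example"}},
        {"Name": {"First": "Bob", "Middle": "C", "Last": "Example"}},
    ],
    "Address": [                             # expandable: Address ( ... ) *
        {
            "Street": "1 Sample Street",
            "City": "Sampletown",
            "State": "KY",
            "ZipCode": "00000",
            "Phones": [                      # expandable within each address
                {"Desc": "Home", "Area": 555, "Prefix": 555, "Suffix": 1212},
                {"Desc": "Work", "Area": 555, "Prefix": 555, "Suffix": 3434},
            ],
        },
    ],
    "DateEntered": "21 October 2000",        # format "%d %B %Y"
    "Brownies": "12.50",                     # format "%.2f"
}

# Count every phone number across all addresses in the record.
phone_count = sum(len(addr["Phones"]) for addr in record["Address"])
```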

REFERENCES

A Guide to QDDB
Eric H. Herrin II and Raphael A. Finkel

Qddb User's Guide

An ASCII Database for Fast Queries of Relatively Stable Data
Eric H. Herrin II and Raphael A. Finkel
Computing Systems, Volume 4 Number 2
University of California Press, Berkeley CA

Schema and Tuple Trees: An Intuitive Structure for Representing Relational Data
Eric H. Herrin II and Raphael A. Finkel
Computing Systems, Volume 9, Number 2
MIT Press, Cambridge MA
