Pegasus Enhancement Proposal (PEP)

PEP #: 314

Title: SQLite Repository

Status: Approved

Version History:

Version Date Author Change Description
1.0 4 February 2008 Bart Whiteley Initial submittal/placeholder
2.0 17 September 2008 Roger Kumpf Reworked for Pegasus 2.9. Use SQLite as a database engine.
2.1 22 October 2008 Roger Kumpf Include specific SQL schema. Address review comments.
2.2 24 October 2008 Roger Kumpf Correct the ClassTable key definition. Detail the derivation of database file names. Add discussion items.
2.3 28 October 2008 Roger Kumpf Introduce a SQLITE_HOME build variable. Do not commit SQLite source code to the OpenPegasus source repository.
Approved version (Ballot 159).

 


Abstract: An alternate repository implementation using the SQLite database engine is introduced to improve reliability.


Definition of the Problem

The Pegasus CIMRepository implementation persists its data on disk using many files. Some of the files are related, such that their contents must stay in sync. For example, the storage of instances of a particular class uses an index file and an instances file. Association data is stored in yet another file. Transaction semantics exist for the purpose of keeping an index file in sync with its instances file, but they are known to fail in some cases. No logic exists to keep the association data in sync with the class and instance data. If the file contents get out of sync, the repository becomes corrupt and unusable, with no straightforward remedy.

Proposed Solution

Completing the transaction semantics to be robust in all cases would be a significant undertaking, made even more complicated by the need for backward compatibility with the existing file layout and transaction algorithm. The proposed solution is to use an existing database engine to leverage proven transaction reliability rather than attempting to solve the problem with the existing persistent store.

A new persistent store type is added to the repository as an option, based on the SQLite database engine. SQLite is chosen for its transactional capabilities, track record of quality, serverless operation, lack of need for setup or configuration, and friendly license. In fact, SQLite has no license because the source is in the public domain.

With Bug 7902, a PEGASUS_REPOSITORY_STORE_COMPLETE_CLASSES build option was introduced. When set to false (the default), only local elements are included in the class definitions that are stored on disk. This option is important because it also removes the restriction on modification of non-leaf classes. It also reduces the repository footprint, often by over 50%. Performance may be impacted by the need to retrieve and propagate elements from superclasses when loading a class definition, but this may be mitigated somewhat by caching. Regardless of this build setting, a repository of either type continues to operate correctly with the CIMRepository implementation.

The SQLite repository implementation only stores local class definitions. That is, the PEGASUS_REPOSITORY_STORE_COMPLETE_CLASSES build option may not be set to true in this case.

Class Design

A pure virtual PersistentStore class is introduced which abstracts the interface to the FileBasedStore class. A new SQLiteStore class is added which implements the PersistentStore class. The CIMRepository uses polymorphism to interact with the FileBasedStore and the SQLiteStore in the same way, allowing seamless support for both repository types. A static isExistingRepository() method is added to the FileBasedStore and the SQLiteStore to allow the CIMRepository to select the appropriate back-end store for an existing repository directory.

Database Schema

The SQLite database uses schema that parallels the data in the file-based store. This simplifies the implementation of the SQLiteStore interface, which matches the FileBasedStore interface. A separate database file is used for each namespace. (The database file name is derived from the namespace name by converting it to lower case, changing the '/' characters to '#', encoding the extended range characters using the CommonUTF escapeStringEncoder, and adding a '.db' suffix.)

The following tables are created in each database:

        CREATE TABLE NamespaceTable(
            nsname TEXT PRIMARY KEY NOT NULL,
            shareable INTEGER NOT NULL,
            updatesallowed INTEGER NOT NULL,
            parentnsname TEXT NOT NULL,
            remoteinfo TEXT NOT NULL);

        CREATE TABLE QualifierTable(
            qualname TEXT NOT NULL,
            normqualname TEXT PRIMARY KEY NOT NULL,
            rep BLOB NOT NULL);

        CREATE TABLE ClassTable(
            classname TEXT NOT NULL,
            normclassname TEXT PRIMARY KEY NOT NULL,
            superclassname TEXT NOT NULL,
            rep BLOB NOT NULL);

        CREATE TABLE ClassAssocTable(
            assocclassname TEXT NOT NULL,
            normassocclassname TEXT NOT NULL,
            normfromclassname TEXT NOT NULL,
            normfrompropname TEXT NOT NULL,
            toclassname TEXT NOT NULL,
            normtoclassname TEXT NOT NULL,
            normtopropname TEXT NOT NULL,
            PRIMARY KEY(normassocclassname, normfrompropname, normtopropname));

        CREATE TABLE InstanceTable(
            normclassname TEXT NOT NULL,
            instname TEXT NOT NULL,
            norminstname TEXT PRIMARY KEY NOT NULL,
            rep BLOB NOT NULL);

        CREATE TABLE InstanceAssocTable(
            associnstname TEXT NOT NULL,
            normassocinstname TEXT NOT NULL,
            normassocclassname TEXT NOT NULL,
            normfrominstname TEXT NOT NULL,
            normfrompropname TEXT NOT NULL,
            toinstname TEXT NOT NULL,
            normtoclassname TEXT NOT NULL,
            normtopropname TEXT NOT NULL,
            PRIMARY KEY(normassocinstname, normfrompropname, normtopropname));

The "normalized" fields are modified to allow matching to be done using straightforward data comparisons. This essentially means converting CIM names to lower case and using the canonical form for instance names.

Note that the NamespaceTable exists in each database, but it only contains a single row in each database. This design keeps the namespace attributes within the same table as the other data in the namespace and prevents the need to synchronize data across databases.

While the SQLiteStore implementation may change from release to release, the database schema will need to remain compatible. Therefore, it is important to give careful consideration to the database schema.

Build Option

New build options are proposed as follows:

PEGASUS_USE_SQLITE_REPOSITORY
SQLITE_HOME

Discussion


Copyright (c) 2008 Hewlett-Packard Development Company, L.P.; IBM Corp.; EMC Corporation; Symantec Corporation; The Open Group.

Permission is hereby granted, free of charge, to any person obtaining a copy  of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

THE ABOVE COPYRIGHT NOTICE AND THIS PERMISSION NOTICE SHALL BE INCLUDED IN ALL COPIES OR SUBSTANTIAL PORTIONS OF THE SOFTWARE. THE SOFTWARE IS PROVIDED  "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.