Pegasus Enhancement Proposal (PEP)
PEP #: 314
Title: SQLite Repository
Status: Approved
Version History:
Version | Date | Author | Change Description |
---|---|---|---|
1.0 | 4 February 2008 | Bart Whiteley | Initial submittal/placeholder |
2.0 | 17 September 2008 | Roger Kumpf | Reworked for Pegasus 2.9. Use SQLite as a database engine. |
2.1 | 22 October 2008 | Roger Kumpf | Include specific SQL schema. Address review comments. |
2.2 | 24 October 2008 | Roger Kumpf | Correct the ClassTable key definition. Detail the derivation of database file names. Add discussion items. |
2.3 | 28 October 2008 | Roger Kumpf | Introduce a SQLITE_HOME build variable. Do not commit SQLite source
code to the OpenPegasus source repository. Approved version (Ballot 159). |
Abstract: An alternate repository implementation using the SQLite database engine is introduced to improve reliability.
The Pegasus CIMRepository implementation persists its data on disk using many files. Some of the files are related, such that their contents must stay in sync. For example, the storage of instances of a particular class uses an index file and an instances file. Association data is stored in yet another file. Transaction semantics exist for the purpose of keeping an index file in sync with its instances file, but they are known to fail in some cases. No logic exists to keep the association data in sync with the class and instance data. If the file contents get out of sync, the repository becomes corrupt and unusable, with no straightforward remedy.
Completing the transaction semantics to be robust in all cases would be a significant undertaking, made even more complicated by the need for backward compatibility with the existing file layout and transaction algorithm. The proposed solution is to use an existing database engine to leverage proven transaction reliability rather than attempting to solve the problem with the existing persistent store.
A new persistent store type is added to the repository as an option, based on the SQLite database engine. SQLite is chosen for its transactional capabilities, track record of quality, serverless operation, lack of need for setup or configuration, and friendly license. In fact, SQLite has no license because the source is in the public domain.
With Bug 7902, a PEGASUS_REPOSITORY_STORE_COMPLETE_CLASSES build option was introduced. When set to false (the default), only local elements are included in the class definitions that are stored on disk. This option is important because it also removes the restriction on modification of non-leaf classes. It also reduces the repository footprint, often by over 50%. Performance may be impacted by the need to retrieve and propagate elements from superclasses when loading a class definition, but this may be mitigated somewhat by caching. Regardless of this build setting, a repository of either type continues to operate correctly with the CIMRepository implementation.
The SQLite repository implementation only stores local class definitions. That is, the PEGASUS_REPOSITORY_STORE_COMPLETE_CLASSES build option may not be set to true in this case.
A pure virtual PersistentStore class is introduced which abstracts the interface to the FileBasedStore class. A new SQLiteStore class is added which implements the PersistentStore class. The CIMRepository uses polymorphism to interact with the FileBasedStore and the SQLiteStore in the same way, allowing seamless support for both repository types. A static isExistingRepository() method is added to the FileBasedStore and the SQLiteStore to allow the CIMRepository to select the appropriate back-end store for an existing repository directory.
The SQLite database uses schema that parallels the data in the file-based store. This simplifies the implementation of the SQLiteStore interface, which matches the FileBasedStore interface. A separate database file is used for each namespace. (The database file name is derived from the namespace name by converting it to lower case, changing the '/' characters to '#', encoding the extended range characters using the CommonUTF escapeStringEncoder, and adding a '.db' suffix.)
The following tables are created in each database:
CREATE TABLE NamespaceTable( nsname TEXT PRIMARY KEY NOT NULL, shareable INTEGER NOT NULL, updatesallowed INTEGER NOT NULL, parentnsname TEXT NOT NULL, remoteinfo TEXT NOT NULL); CREATE TABLE QualifierTable( qualname TEXT NOT NULL, normqualname TEXT PRIMARY KEY NOT NULL, rep BLOB NOT NULL); CREATE TABLE ClassTable( classname TEXT NOT NULL, normclassname TEXT PRIMARY KEY NOT NULL, superclassname TEXT NOT NULL, rep BLOB NOT NULL); CREATE TABLE ClassAssocTable( assocclassname TEXT NOT NULL, normassocclassname TEXT NOT NULL, normfromclassname TEXT NOT NULL, normfrompropname TEXT NOT NULL, toclassname TEXT NOT NULL, normtoclassname TEXT NOT NULL, normtopropname TEXT NOT NULL, PRIMARY KEY(normassocclassname, normfrompropname, normtopropname)); CREATE TABLE InstanceTable( normclassname TEXT NOT NULL, instname TEXT NOT NULL, norminstname TEXT PRIMARY KEY NOT NULL, rep BLOB NOT NULL); CREATE TABLE InstanceAssocTable( associnstname TEXT NOT NULL, normassocinstname TEXT NOT NULL, normassocclassname TEXT NOT NULL, normfrominstname TEXT NOT NULL, normfrompropname TEXT NOT NULL, toinstname TEXT NOT NULL, normtoclassname TEXT NOT NULL, normtopropname TEXT NOT NULL, PRIMARY KEY(normassocinstname, normfrompropname, normtopropname));
The "normalized" fields are modified to allow matching to be done using straightforward data comparisons. This essentially means converting CIM names to lower case and using the canonical form for instance names.
Note that the NamespaceTable exists in each database, but it only contains a single row in each database. This design keeps the namespace attributes within the same table as the other data in the namespace and prevents the need to synchronize data across databases.
While the SQLiteStore implementation may change from release to release, the database schema will need to remain compatible. Therefore, it is important to give careful consideration to the database schema.
New build options are proposed as follows:
Thilo Boehm asks: 1) What will be the migration path from a file based repository to a data base based repository? 2) What will be the migration path to a next schema?
Since the PersistentStore implementations can be changed seamlessly underneath the CIMRepository, the CIMRepository itself does not really care what type of store is used. So, users of CIMRepository do not care either. Therefore, repupgrade will continue to work as before. Going forward, upgrading from a repository that stores only local class definitions, it may be possible to skip repupgrade altogether and just load the new schema on top of the existing repository.
Venkat Puvvada asks: Why are the Namespace properties remoteInfo and parentNamespace not defined to have NULL values when they are hardly used in current implementation (only when Remote CMPI is enabled)?
The parentnsname is used whenever shared schema support is enabled, not just for Remote CMPI. Null values could indeed be used there instead of empty strings. No significant advantage to either approach is apparent. An empty string value is used internally for the remoteinfo field, so using null for it in the database would require extra logic to handle it.
Venkat Puvvada asks: Why are normalized key strings required when we can make case insensitive primary keys using COLLATE NOCASE?
COLLATE NOCASE only works on ASCII characters, not the whole Unicode range. The SQLIte documentation describes it as an 'extra built-in collation sequence intended for testing purposes'. Since Unicode case-insensitive comparisons are relatively expensive, it is faster to do the case folding once up front.
Copyright (c) 2008 Hewlett-Packard Development Company,
L.P.; IBM Corp.; EMC Corporation; Symantec Corporation; The Open Group.
Permission is hereby granted, free of charge, to any person
obtaining a copy of this software and associated documentation files
(the "Software"), to deal in the Software without restriction,
including without limitation the rights to use, copy, modify, merge, publish,
distribute, sublicense, and/or sell copies of the Software, and to permit
persons to whom the Software is furnished to do so, subject to the following
conditions:
THE ABOVE COPYRIGHT NOTICE AND THIS PERMISSION NOTICE SHALL BE
INCLUDED IN ALL COPIES OR SUBSTANTIAL PORTIONS OF THE SOFTWARE. THE SOFTWARE IS
PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.