Pegasus Enhancement Proposal (PEP)
PEP#: 360
PEP Type: Functional
Title: Provider life cycle indications.
Status: Approved.
Version History:
Version | Date | Author | Change Description
0.1 | 25 Apr 2011 | Ajay Rao, Ashok Pathak, Devchandra Leishangthem | Initial Submission
1.0 | 12 May 2011 | Venkat Puvvada | Added implementation details
1.1 | 20 May 2011 | Venkat Puvvada | Added lifecycle indications during CIMServer start/stop, ballot version.
1.2 | 26 Sep 2011 | Venkat Puvvada | Added implementation details for the guaranteed indication delivery during the CIMServer shutdown, ballot version.
Abstract: Add Support for Provider (Management
Instrumentation) life cycle
indications to OpenPegasus.
Definition of the Problem
Whenever an out-of-process provider crashes while processing a request, the client(s) get a CIMException with the error message "Lost connection with cimprovagt <provider-module-name>". Non-indication providers are automatically started/loaded for the next client request. If an indication provider crashes, it is automatically restarted with subscriptions enabled until the number of restarts exceeds the maxFailedProviderModuleRestarts value. Once automatic restarts of the indication provider exceed maxFailedProviderModuleRestarts, the indications it provides are no longer available, and the consumer is under the false impression that the provider is running properly. This may result in the loss of critical events for the client/listener. A better solution is needed to identify faulty providers and automatically inform clients so that they can take corrective action.
Proposed Solution
This PEP proposes adding provider life cycle indications to OpenPegasus by reporting modifications to the PG_ProviderModule instance. The following life cycle indications are proposed for the provider module. Some life cycle indications can only be provided when the provider is running out-of-process (OOP).
- ProviderModule creation
  - The provider module instance (PG_ProviderModule) is created.
- ProviderModule modification
  - Change in Group Name. OOP only.
  - Change in OperationalStatus to Disabled (Stopped)
  - Change in OperationalStatus to Enabled (Started)
  - Change in OperationalStatus to Degraded state. OOP only.
- ProviderModule deletion
  - The provider module instance (PG_ProviderModule) is deleted.
- ProviderModule failure. OOP only.
  - When a non-indication provider or an inactive indication provider has crashed.
- ProviderModule restarted automatically. OOP only.
  - Provider module with indication provider(s) restarted automatically as indicated by the maxFailedProviderModuleRestarts config option.
- Provider is added to the ProviderModule
  - Provider instance (PG_Provider) is added/registered to the provider module.
- Provider is removed from the ProviderModule
  - Provider instance (PG_Provider) is removed/deleted/unregistered from the provider module.
- ProviderModule enabled due to CIMServer start/restart.
- ProviderModule disabled due to CIMServer shutdown.
To enable support for provider life cycle indications, the following class, PG_ProviderModulesInstAlert, is added to the root/PG_Interop namespace. Clients wishing to receive provider life cycle indications can subscribe by using the PG_ProviderModulesInstAlert class in the filter's query. Clients should set the CIM_IndicationFilter.SourceNamespace property to "root/PG_Interop". Clients are free to create the subscription in any namespace. ProviderRegistrationProvider acts as the indication provider for the PG_ProviderModulesInstAlert class. Note that this is the first time OpenPegasus supports indications itself without loading user-registered providers.
[Version("2.12.0"),
 Description ( "PG_ProviderModulesInstAlert "
     "notifies creation or deletion or modification of "
     "PG_ProviderModule instances.")]
class PG_ProviderModulesInstAlert : CIM_AlertInstIndication {
    [Required, Description(
        "An enumerated value that describes the probable cause of "
        "the situation which resulted in the PG_ProviderModulesInstAlert." ),
     ValueMap { "1", "2", "3", "4", "5", "6", "7", "8", "9",
         "10", "11", "12", "13", "14", "15..255" },
     Values { "Unknown", "Other", "Provider module created",
         "Provider module deleted", "Provider module enabled",
         "Provider module disabled", "Provider module degraded",
         "Provider module with no active indication subscriptions"
         " failed/crashed",
         "Provider module with active indication subscriptions"
         " restarted automatically after failure/crash",
         "Provider module group changed",
         "Provider is added to the provider module",
         "Provider is removed from the provider module",
         "Provider module enabled due to CIMServer start/restart",
         "Provider module disabled due to CIMServer shutdown",
         "Pegasus Reserved"}]
    uint16 AlertCause;
    [Required,
     Description (
         "Provider module instances for the corresponding alert type. "
         "There can be more than one provider module instance if the "
         "AlertCause cause value is either \"13\" or \"14\"."),
     EmbeddedObject]
    string ProviderModules[];
    [Description ("Name of the provider if the AlertCause cause value is "
         "either \"11\" or \"12\"."),
     ModelCorrespondence { "PG_Provider.Name"} ]
    string ProviderName;
};
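As a rough illustration of how a listener might interpret the alert, the sketch below mirrors the AlertCause ValueMap/Values in self-contained C++ and builds a filter query of the form "SELECT * FROM PG_ProviderModulesInstAlert WHERE AlertCause = n". The type and function names (AlertCause, alertCauseText, mayCarryMultipleModules, filterQueryFor) and the "Unmapped" fallback are assumptions made for illustration and are not part of the OpenPegasus API; only the numeric values, value texts and class name are taken from the MOF above.

```cpp
#include <cstdint>
#include <string>

// Hypothetical C++ mirror of PG_ProviderModulesInstAlert.AlertCause.
// Numeric values follow the ValueMap qualifier; enumerator names are
// illustrative only.
enum class AlertCause : std::uint16_t
{
    Unknown = 1,
    Other = 2,
    ModuleCreated = 3,
    ModuleDeleted = 4,
    ModuleEnabled = 5,
    ModuleDisabled = 6,
    ModuleDegraded = 7,
    ModuleFailed = 8,
    ModuleRestarted = 9,
    ModuleGroupChanged = 10,
    ProviderAdded = 11,
    ProviderRemoved = 12,
    ModuleEnabledCimServerStart = 13,
    ModuleDisabledCimServerShutdown = 14
};

// Returns the Values text for a raw AlertCause value. 15..255 maps to
// "Pegasus Reserved"; anything else is reported as "Unmapped" (an
// assumption of this sketch, not a value defined by the MOF).
std::string alertCauseText(std::uint16_t raw)
{
    static const char* const texts[] = {
        "Unknown", "Other", "Provider module created",
        "Provider module deleted", "Provider module enabled",
        "Provider module disabled", "Provider module degraded",
        "Provider module with no active indication subscriptions"
        " failed/crashed",
        "Provider module with active indication subscriptions"
        " restarted automatically after failure/crash",
        "Provider module group changed",
        "Provider is added to the provider module",
        "Provider is removed from the provider module",
        "Provider module enabled due to CIMServer start/restart",
        "Provider module disabled due to CIMServer shutdown"
    };
    if (raw >= 1 && raw <= 14)
        return texts[raw - 1];
    if (raw >= 15 && raw <= 255)
        return "Pegasus Reserved";
    return "Unmapped";
}

// Per the ProviderModules description, only causes 13 and 14 may carry
// more than one provider module instance.
bool mayCarryMultipleModules(AlertCause cause)
{
    return cause == AlertCause::ModuleEnabledCimServerStart ||
           cause == AlertCause::ModuleDisabledCimServerShutdown;
}

// Builds a filter query string that selects alerts with one cause.
std::string filterQueryFor(AlertCause cause)
{
    return "SELECT * FROM PG_ProviderModulesInstAlert WHERE AlertCause = " +
           std::to_string(static_cast<unsigned>(cause));
}
```

A client would place the string returned by filterQueryFor in the filter's query and set SourceNamespace to "root/PG_Interop" as described above; creating the filter, handler and subscription instances is left out of this sketch.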
Implementation details.
ProviderRegistrationProvider, ProviderManagerService, ProviderRegistrationManager and IndicationService facilitate the generation of provider life cycle indications. The PG_ProviderModulesInstAlert.ProviderModules property of the indication instance is populated with the changed PG_ProviderModule instances.
IndicationService modification
IndicationService handles subscriptions differently when the filter's query names a class serviced by an internal control provider. Since the repository does not have any information about the internal control providers, IndicationService creates the provider module and provider instances for control providers (those that are enabled for indications) dynamically during CIMServer startup and uses them to dispatch create/delete/modify subscription requests to the control providers. These instances are not stored in the repository. Currently only one registration is created, for the ProviderRegistrationProvider and the class PG_ProviderModulesInstAlert. Generic infrastructure is provided so that any control provider can become an indication provider by extending the CIMIndicationProvider interface. For control providers, indication-related requests/responses are routed through ModuleController instead of ProviderManagerService.
ProviderRegistrationManager modification
Since both ProviderRegistrationProvider and ProviderManagerService use ProviderRegistrationManager for provider-related updates, the event alert logic is added to the ProviderRegistrationManager. ProviderRegistrationManager acts as the instrumentation layer for ProviderRegistrationProvider.
ProviderRegistrationProvider & ProviderManagerService modification
ProviderRegistrationProvider is modified to generate indications by implementing the CIMIndicationProvider interface. ProviderRegistrationManager notifies ProviderRegistrationProvider about changes to a provider or provider module using a callback method, and ProviderRegistrationProvider generates indications if there are any active subscriptions. ProviderManagerService is modified to report changes to a provider or provider module through ProviderRegistrationManager whenever a provider has crashed or an indication provider has been restarted automatically.
Only one indication, containing all provider module instances, is generated during CIMServer start and shutdown. During shutdown, CIMServer waits until the 'provider modules disabled due to CIMServer shutdown' indication is delivered.
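The callback relationship described in this section can be pictured with the following self-contained sketch. It is not the OpenPegasus implementation: the class names (RegistrationManagerSketch, RegistrationProviderSketch), the ModuleAlert struct and all method names are hypothetical stand-ins, and provider module instances are reduced to plain name strings. It only illustrates three points from the text: the manager reports changes through a callback, the provider generates an indication only when there is at least one active subscription, and a CIMServer start or shutdown produces a single alert that carries all modules.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the data carried by one alert.
struct ModuleAlert
{
    unsigned alertCause;                       // AlertCause value from the MOF
    std::vector<std::string> providerModules;  // names stand in for instances
};

// Stand-in for ProviderRegistrationManager: it knows only a callback.
class RegistrationManagerSketch
{
public:
    using AlertCallback = std::function<void(const ModuleAlert&)>;

    void setAlertCallback(AlertCallback callback)
    {
        _callback = std::move(callback);
    }

    // Reports a change that affects a single provider module.
    void notifyModuleChange(unsigned cause, const std::string& module) const
    {
        if (_callback)
            _callback(ModuleAlert{cause, {module}});
    }

    // Reports a CIMServer start/shutdown: one alert for all modules.
    void notifyAllModules(
        unsigned cause, const std::vector<std::string>& modules) const
    {
        if (_callback)
            _callback(ModuleAlert{cause, modules});
    }

private:
    AlertCallback _callback;
};

// Stand-in for ProviderRegistrationProvider. It must outlive the manager's
// use of the callback because the callback captures `this`.
class RegistrationProviderSketch
{
public:
    explicit RegistrationProviderSketch(RegistrationManagerSketch& manager)
    {
        manager.setAlertCallback(
            [this](const ModuleAlert& alert) { onAlert(alert); });
    }

    RegistrationProviderSketch(const RegistrationProviderSketch&) = delete;
    RegistrationProviderSketch& operator=(
        const RegistrationProviderSketch&) = delete;

    void setActiveSubscriptions(std::size_t count) { _subscriptions = count; }

    // Indications that would have been handed to the response handler.
    const std::vector<ModuleAlert>& generated() const { return _generated; }

private:
    void onAlert(const ModuleAlert& alert)
    {
        // Generate an indication only when someone is subscribed.
        if (_subscriptions > 0)
            _generated.push_back(alert);
    }

    std::size_t _subscriptions = 0;
    std::vector<ModuleAlert> _generated;
};
```

In this sketch a change reported while there are no subscriptions is simply dropped, and a shutdown alert reported through notifyAllModules with cause 14 yields exactly one entry that lists every module.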
Indication delivery during the CIMServer shutdown
The current cimserver infrastructure does not guarantee indication delivery during cimserver shutdown. Indications in the HandlerService queues are discarded and logged during shutdown. The indication delivery API that accepts an OperationContext is enhanced to wait for the indication delivery status. OperationContext supports the TimeoutContainer, which is used to specify the timeout for a particular operation. The ProviderRegistration control provider uses the TimeoutContainer to deliver the indication with the specified timeout and waits for the delivery status. This behavior is undocumented for external providers. It could be applied to external providers as well without breaking them, but this PEP does not propose exposing it to them.
Note that the proposed enhancement provides no explicit way of knowing whether the indication delivery was successful. If the API returns before the specified timeout, it can be safely assumed that the indication delivery was successful. This proposal does not guarantee that the indication is always delivered during the CIMServer shutdown. The timeout proposed here is the default CIMServer shutdown timeout (20 seconds). CIMServer may get killed if the shutdown timeout expires while waiting for the indication to be delivered (e.g., while waiting for a delivery retry). During a normal shutdown this should always work. For example, the following code can be used in a C++ default provider to deliver the indication and wait for its delivery status.
OperationContext context;
// Default 20 seconds timeout to deliver the indication.
context.insert(TimeoutContainer(20 * 1000));
_indicationResponseHandler->deliver(context, indication);
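Because the enhanced API reports no explicit status, a caller that wants to apply the "returned before the timeout" rule of thumb has to measure the elapsed time itself. The following self-contained sketch shows one way to do that. The helper name returnedBeforeTimeout is hypothetical, and the callable passed to it merely stands in for the blocking deliver call; nothing here is part of the OpenPegasus API.

```cpp
#include <chrono>
#include <functional>

// Hypothetical helper: runs a blocking delivery call and reports whether it
// returned before the given timeout elapsed. The callable stands in for a
// call such as _indicationResponseHandler->deliver(context, indication).
bool returnedBeforeTimeout(
    const std::function<void()>& blockingDeliver,
    std::chrono::milliseconds timeout)
{
    const auto start = std::chrono::steady_clock::now();
    blockingDeliver();
    const auto elapsed = std::chrono::steady_clock::now() - start;
    // Returning early is taken as a hint of success; it is not a
    // delivery confirmation.
    return elapsed < timeout;
}
```

A result of true is only the heuristic described above; a result of false means the call used up the whole timeout, in which case the delivery outcome is unknown.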
Test cases have been added to test the proposed functionality.
Future work
- Enhance the indication delivery API to report the delivery status, either synchronously or asynchronously, with a proper error message if the delivery fails or the indication is discarded for some other reason.
Schedule
Available in Pegasus 2.12.0
Discussion
(r_kieninger) I suggest adding two more indications: 'Provider Module disabled due to CIM Server shutdown' and 'Provider Module restarted after CIM Server restart'. This would allow the listeners to detect situations where they did not receive indications because the CIM Server was unavailable, even when it had died unexpectedly.
(venkat_puvvada) Suggestion incorporated.
(thilo_boehm) Could you please provide some subscription and filter example(s)?
(venkat_puvvada) Filter query example: SELECT * FROM PG_ProviderModulesInstAlert WHERE AlertCause = 4. The filter's SourceNamespace property should have the value root/PG_Interop. The subscription can be created using the filter and a handler.
Copyright (c) 2011 IBM Corp;
Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
Template last modified: February 17th 2009 by Martin Kirk
Template version: 1.15