[
¤1] Pegasus Enhancement Proposal (PEP)
[
¤2] PEP #: 299
[
¤3] Title: Support for the indication delivery retry.
[
¤4]
[
¤5] Version: 1.0
[
¤6] Created: 26th March 2007
[
¤7] PEP Type: Concept
[
¤8]
[
¤9]
Status: draft
[
¤10] Version History:
[
¤53]
[
¤54] Abstract: Add support to OpenPegasus for the indication
delivery retry as defined in DSP1054.
[
¤55]
[
¤56] Definition of the Problem
[
¤57] If a management application uses
indications to keep track of the status of a managed resource, it is
essential that all indications it has subscribed to are actually
delivered. Currently, OpenPegasus makes a single delivery attempt per
indication and does not persist indications, which is considered
unreliable. To support indication delivery retry, the following
questions need to be considered.
[
¤58]
- [
¤59] How to determine whether the indication delivery is successful or
not?
- [
¤60] Which types of listeners/handlers are considered for delivery
retry, and what counts as successful delivery for each?
[
¤61]
- [
¤62] How are the indications stored in CIMServer during delivery retry?
[
¤63]
- [
¤64] Which parameters defined in DSP1054 are used by the
retry functionality?
- [
¤65] What happens to the indications on the queue store when delivery
retry parameters are exceeded?
- [
¤66] Do those indications need to be logged?
- [
¤67] How are subscriptions handled when delivery retry parameters are
exceeded?
- [
¤68] How will all functions associated with indication delivery retry
be controlled at build time?
- [
¤69] Is there a minimum or maximum delivery retry time requirement for
a failed indication?
- [
¤70] What is the relative priority of the delivery retry function with
respect to other CIM server functions?
- [
¤71] Is the order of delivery for indications handled via retry
function
guaranteed to be identical to the order in which they were generated?
- [
¤72] When do newly generated indications begin to get delivered if
there are undelivered indications to the same destination that are
stored in the queue?
- [
¤73] Shall CIMListener be able to handle duplicate indications?
- [
¤74] What happens to indications already queued for delivery retry
when the corresponding subscription has been deleted or removed?
[
¤75]
[
¤76] Proposed Solution
[
¤77] This PEP proposes to define the
parameters involved in reliable indication delivery. Here, reliable
indication delivery means delivering indications within a reasonable
time under the given constraints (as defined by the standards). Delivery
is attempted within limits guided
by the CIM_IndicationService
class properties DeliveryRetryAttempts
and DeliveryRetryInterval
from DSP1054.
[
¤78]
[
¤79] Solutions proposed for the above
problems.
[
¤80]
- [
¤81] Protocol specifications shall define the way to determine the
successful delivery of indications.
[
¤82]
- [
¤83] The following listeners/handlers shall be considered for
indication
delivery retry.
- [
¤84] CIM-XML Handler - For delivery to count as successful, the
CIMServer MUST receive a CIMExportIndicationResponseMessage from the
CIMListener without any exception.
[
¤85]
- [
¤86] Indications are stored in memory per
listener destination, based on configurable queue length.
[
¤87]
- [
¤88] Delivery retry function uses CIM_IndicationService
class properties DeliveryRetryAttempts
and DeliveryRetryInterval
from DSP1054.
- [
¤89] Indications are deleted from the
queue and are logged.
- [
¤90] Subscriptions are managed using the CIM_IndicationService.SubscriptionRemovalTimeInterval
and CIM_IndicationService.SubscriptionRemovalAction
properties, based on failed delivery attempts. Each subscription also has an
'OnFatalErrorPolicy' property
that
can be used to manage the individual subscription. If the
OnFatalErrorPolicy property value is 4 (Remove), the subscription abides by
the CIM_IndicationService.SubscriptionRemovalAction
[
¤91]
setting and behavior. Subscriptions can also be deleted
when indication
delivery to a transient handler fails
or when the subscriptions expire.
[
¤92]
- [
¤93] Delivery retry is available when indication profile support is
enabled by building with PEGASUS_ENABLE_DMTF_INDICATION_PROFILE_SUPPORT=true.
A separate configuration option specifies the size of the delivery retry
queue per destination. If the queue length is zero, no retry attempt is made.
- [
¤94] DSP1054 defines the property CIM_IndicationService.DeliveryRetryInterval, which is the minimum time
in seconds that the indication service waits before retrying delivery of an
indication to a listener destination for which delivery previously failed.
No maximum time is defined, and a retry can take longer because of other
processing in the CIMOM. Note that delivery retry runs at very low priority.
- [
¤95] The priority of the
delivery retry function will be kept low enough so as not to adversely
affect the CIM server's ability to respond to client requests.
- [
¤96] Yes.
- [
(k_schopmeyer) All existing indications for the target destination, not all existing indications.
¤97] New indications are queued until all existing indications
for the same destination are either delivered or deleted (see
question 5 for more detail). This
guarantees the order of delivery.
- [
¤98] Yes. When the CIMServer makes a delivery retry attempt,
the value of the IndicationIdentifier property of the CIM_Indication class
for the indication being delivered is added as a new array element
to the CorrelatedIndications property. This marks the delivery as a
transmission retry. The CIMListener should check the CorrelatedIndications
array (this idea is already accepted in the DMTF) to determine whether
the indication is being retried and to identify duplicates.
[
¤99]
- [
¤100] Indications are discarded and deleted from the queue.
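The queueing behavior described in the answers above can be sketched as follows. This is a minimal illustration, not OpenPegasus code: `DestinationRetryQueue`, `QueuedIndication` and their members are hypothetical names, the retry limit is assumed to be a simple count of failed attempts, and timing by DeliveryRetryInterval is left out. The head-of-queue discard on overflow follows the discussion later in this PEP.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <string>
#include <vector>

// Hypothetical types for illustration only; these are not OpenPegasus classes.
struct QueuedIndication {
    std::string id;     // stands in for the queued indication
    unsigned attempts;  // failed delivery attempts recorded so far
};

// One instance per listener destination.
class DestinationRetryQueue {
public:
    // maxLength models the configurable queue length; maxAttempts models
    // CIM_IndicationService.DeliveryRetryAttempts (assumed interpretation).
    DestinationRetryQueue(std::size_t maxLength, unsigned maxAttempts)
        : _maxLength(maxLength), _maxAttempts(maxAttempts) {}

    // Append behind pending indications so delivery order is preserved.
    // Returns false when the queue length is zero (retry disabled).
    bool enqueue(const std::string& id) {
        if (_maxLength == 0) return false;
        if (_queue.size() == _maxLength) {
            discardHead();  // queue full: the head entry is discarded
        }
        QueuedIndication entry = {id, 0};
        _queue.push_back(entry);
        return true;
    }

    // Report the outcome of one delivery attempt for the head entry.
    void onAttemptResult(bool delivered) {
        if (_queue.empty()) return;
        if (delivered) {
            _queue.pop_front();
            return;
        }
        if (++_queue.front().attempts >= _maxAttempts) {
            discardHead();  // retry limit reached
        }
    }

    std::size_t size() const { return _queue.size(); }
    std::string headId() const {
        return _queue.empty() ? std::string() : _queue.front().id;
    }
    const std::vector<std::string>& discarded() const { return _discarded; }

private:
    void discardHead() {
        assert(!_queue.empty());
        _discarded.push_back(_queue.front().id);  // stands in for logging
        _queue.pop_front();
    }

    std::size_t _maxLength;
    unsigned _maxAttempts;
    std::deque<QueuedIndication> _queue;
    std::vector<std::string> _discarded;
};
```

Only the head entry is ever retried, so a new indication for the same destination cannot overtake an older one that is still pending.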
[
¤101]
Design of IndicationIdentifier:
[
¤102]
[
¤103]
It is recommended that the provider set a unique IndicationIdentifier
and the IndicationTime. If the provider does not set
IndicationIdentifier, or its value is
NULL, it is difficult to identify duplicate indications; this is
a current limitation. The design of
IndicationIdentifier is under discussion in the DMTF at present. Delivery retry support is
controlled by the build option
PEGASUS_ENABLE_DMTF_INDICATION_PROFILE_SUPPORT, and an implementation may
choose to turn this feature off if its providers are not capable of
setting IndicationIdentifier. Clients may also choose to
use other mechanisms to compare indications; for example, a client may
compare all properties present in the indication instance if
IndicationIdentifier is not available.
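A listener-side duplicate check along these lines might look like the sketch below. It is an assumption-laden illustration: `ReceivedIndication`, `isRetry` and `DuplicateFilter` are hypothetical names, properties are flattened to strings, an empty identifier stands in for NULL, and the set of seen entries is never pruned.

```cpp
#include <algorithm>
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical, simplified view of a received indication instance.
struct ReceivedIndication {
    std::string indicationIdentifier;                // empty models NULL
    std::vector<std::string> correlatedIndications;  // CorrelatedIndications
    std::map<std::string, std::string> properties;   // name -> value as text
};

// A delivery is marked as a retry when the indication's own identifier
// appears in its CorrelatedIndications array.
bool isRetry(const ReceivedIndication& ind) {
    if (ind.indicationIdentifier.empty()) return false;
    return std::find(ind.correlatedIndications.begin(),
                     ind.correlatedIndications.end(),
                     ind.indicationIdentifier) !=
           ind.correlatedIndications.end();
}

class DuplicateFilter {
public:
    // Returns true if an equivalent indication was seen before.
    bool isDuplicate(const ReceivedIndication& ind) {
        if (!ind.indicationIdentifier.empty()) {
            // Preferred path: compare by IndicationIdentifier.
            return !_seenIds.insert(ind.indicationIdentifier).second;
        }
        // Fallback: compare all properties present in the instance.
        return !_seenProperties.insert(ind.properties).second;
    }

private:
    std::set<std::string> _seenIds;
    std::set<std::map<std::string, std::string> > _seenProperties;
};
```

The fallback cannot tell a retried indication from a genuinely new one that carries identical property values, which is why setting IndicationIdentifier in the provider is recommended.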
[
¤104]
[
¤105]
References
[
¤106]
- [
¤107] Refer to DSP0107 CIM Indications (Events) White Paper
[
¤108]
- [
¤109] Refer to draft DSP1054 Indications Profile for more information.
[
¤110]
[
¤111] Future work/ideas
- [
¤112] Pegasus needs to be enhanced according to the design of
IndicationIdentifier once it is finalized in the DMTF.
[
¤113]
- [
¤114] Consider the following listeners/handlers for future work.
- [
¤115] SNMP Handler.
[
¤116]
- [
¤117] EMAIL Handler.
- [
¤118] Syslog destination Handler.
- [
¤119] Consumer providers residing in
the CIMServer.
- [
¤120] How are the indications persisted in CIMServer?
- [
¤121] If the secondary store is used for the persistence of
indications, what is the organization of the secondary store? 1
file
per subscription, 1 file per OS instance, or something else?
- [
¤122] How to control the overall size of the secondary store?
Build-time or run time or both?
- [
¤123] When should an indication whose delivery failed be written to
the secondary store?
- [
¤124] Do indications persist across CIMServer restarts?
- [
¤125] Do we need delivery retry parameters (e.g., delivery retry
attempts/interval, persistent storage size) on a per-subscription
basis?
- [
¤126] Do we need to consider the reliability of the CIM Listener side
in delivering the indications (received from the CIMOM) to some receiving
application?
- [
¤127] Should the CIM server generate an indication every time it starts?
- [
¤128] Should the CIM server generate heartbeat indications for the
client applications to know that indication delivery has been
interrupted?
- [
¤129] Does the provider need some kind of reliable indicator when an
indication
was accepted for delivery? Like, first persist the indication before
returning success to the provider?
[
¤130]
Consider various levels of reliability as follows
- [
¤131] 0 = single (exactly 1) delivery attempt, no retry, no persistence
- [
¤132] 1 = delivery retry per Indication Profile params, no persistence
across server starts, log the delivery failed indications, delete on
successful delivery or when delivery params exceeded.
[
¤133]
- [
¤134] 2 = delivery retry per Indication Profile params, persistence
across server starts, log the delivery failed indications, delete on
successful delivery or when delivery params
exceeded.
- [
¤135] 3 = delivery retry per Indication Profile params, persistence
across server starts, persist elsewhere on successful delivery.
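If these levels were ever exposed as a setting, they could be modeled as an enumeration with a few predicates. The sketch below only restates the list above; the enumerator and function names are hypothetical and nothing like this exists in OpenPegasus today.

```cpp
#include <cassert>

// Hypothetical names; the numeric values follow the list above.
enum ReliabilityLevel {
    SingleAttempt = 0,          // exactly one attempt, no retry, no persistence
    RetryInMemory = 1,          // retry, nothing survives a server restart
    RetryPersistent = 2,        // retry, queue survives a server restart
    RetryPersistentArchive = 3  // as 2, but delivered indications are kept elsewhere
};

// Levels 1 to 3 retry delivery per the Indication Profile parameters.
inline bool retriesDelivery(ReliabilityLevel level) {
    return level >= RetryInMemory;
}

// Levels 2 and 3 persist queued indications across server starts.
inline bool persistsAcrossRestart(ReliabilityLevel level) {
    return level >= RetryPersistent;
}

// Levels 1 and 2 delete an indication once it has been delivered;
// level 3 persists it elsewhere instead.
inline bool deletesOnSuccess(ReliabilityLevel level) {
    return level == RetryInMemory || level == RetryPersistent;
}
```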
[
¤136] Discussion
(r_kumpf) How important is
it to prevent the same indication from being delivered to the same
listener multiple times?
[
¤137]
(venkat_puvvada) DSP1054 says that the
IndicationIdentifier property of the CIM_Indication class shall provide uniqueness
in order to identify possible duplicate indications that occur during
CIMServer delivery retry attempts.
[
¤138]
[
¤139]
(r_kumpf) This definition is
specific to the CIM-XML protocol, so it is insufficient. It also raises
questions about multiple delivery of the same indication and
possibly significant extra overhead for delivery retry if it can be
determined that delivery will never be successful.
[
¤140]
(venkat_puvvada) CIMListener shall
be able to distinguish duplicate indications. We can consider the
following classification for successful indication delivery.
[
¤141]
a. CIM-XML handler: We consider
delivery successful when the listener sends back a
CIMExportIndicationResponseMessage without any exception. This leaves
open how to predict a delivery that will never succeed, for example a
permanent failure such as an incorrect hostname.
[
¤142]
b. Email handler: Unknown at the
moment; needs to be discussed and elaborated with current users.
[
¤143]
c. SNMP handler: Unknown at the
moment; needs to be discussed and elaborated with current users.
[
¤144]
[
¤145]
Discussion on
future items
[
¤146]
[
¤147]
(r_kumpf) What is the
expected/desired result when the CIM Server is not running? A provider
will not generate indications during the time it is not running. Can
the stated requirement, 'it is essential that all indications it (a
management application) has subscribed for are actually delivered,' be
met in that case? I'm struggling to understand why it is interesting to
persist indications across a CIM Server restart, when indications can
be lost during that time anyway. A CIM Server crash would also
presumably lose the indications since the persistence code would not
be invoked. What, specifically, is the value of trying to persist
indications across CIM Server restarts?
[
¤148]
(venkat_puvvada) The assumption is
that the CIMOM is a service that runs forever, so we are not trying to
solve the case for when it is not running. However, the CIMOM is taken
off-line during reboots, sometimes when applications are
adding/removing providers, or when an unexpected failure (crash)
occurs. Indication delivery is a more complex scenario when compared to
normal get and enum operations where the client can simply retry. In
this case, each layer of the stack may work normally, but due to some
external issue (network, crash, reboot, etc.), the indication is lost.
And, since indications usually communicate an event of interest, the
CIMOM should take extra precautions to detect failures and reduce
indication loss within the stack.
[
¤149]
[
¤150]
Discussion on
0.3 version
[
¤151]
[
¤152]
(marek_szermutzky) I guess queue
length means the number of possible indication entries. How does a system
administrator know what implications a change to
this configuration has on memory and CPU usage?
[
¤153]
(venkat_puvvada) It depends on
the number of destinations for which the CIMServer is attempting
delivery retry. It is implementation specific, and how to
provide delivery retry statistics can be discussed.
[
¤154]
[
¤155]
(r_kumpf) What happens when an
indication is generated and the delivery retry queue is full? Is the
indication at the head of the queue discarded or the newly generated
one?
[
¤156]
(venkat_puvvada) Indication at the
head of the queue is discarded.
[
¤157]
(k_schopmeyer) Is there a log
entry for this?
[
¤158]
(venkat_puvvada) Yes, discarded indications
are logged.
[
¤159]
[
¤160]
(marek_szermutzky) Who
(server, provider, client) will generate the identifier? How
(algorithm used for uniqueness) will the
identifier be generated? What grade of uniqueness is required? I
think unique on a specific CIM server should suffice, i.e. a guaranteed
unique number (atomic count) that does not overflow within
'SubscriptionRemovalTimeInterval'.
[
¤161]
(venkat_puvvada) The provider should
keep the IndicationIdentifier unique. The construction algorithm for
IndicationIdentifier is defined in the CIM_Indication class definition.
Yes, the provider should not reuse an IndicationIdentifier within
SubscriptionRemovalTimeInterval.
[
¤162]
There has been a lot of discussion in the DMTF on the design of
IndicationIdentifier and on the time and context within which the
IndicationIdentifier must remain unique. For the initial implementation we can consider
the above factors.
[
¤163]
(k_schopmeyer) Sadly, it is
probably more complex than simply a counter. We have to account for
provider restart somewhere, and the provider must be capable of knowing
the identifier (the case of correlating indications). At this point we
don't have to account for server restart, because I am assuming that
there are no delivery retries through a server restart; they are all
dropped. I would
assume that we are going to have to do something like a two-part ID,
where the provider gets some initial part when starting or on request
and then adds further uniqueness for its indications with an
additional component (e.g., an incrementing integer). The DMTF is trying to
come up with a definition for version 1.1.0 of the
indication profile now.
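The two-part identifier idea from the comment above could be sketched roughly as follows. This is only an illustration of the suggestion, not the DMTF definition: `IndicationIdGenerator`, the ":" separator, and the way the start prefix is obtained are all assumptions.

```cpp
#include <atomic>
#include <cassert>
#include <string>

// Hypothetical generator: a prefix handed to the provider at start (or on
// request) plus a per-provider incrementing integer.
class IndicationIdGenerator {
public:
    // startPrefix is assumed to differ for every provider start, so that
    // identifiers do not repeat after a provider restart.
    explicit IndicationIdGenerator(const std::string& startPrefix)
        : _prefix(startPrefix), _counter(0) {}

    // Returns the next identifier, e.g. "<prefix>:1", "<prefix>:2", ...
    std::string next() {
        unsigned long long n = ++_counter;  // atomic increment, thread-safe
        return _prefix + ":" + std::to_string(n);
    }

private:
    std::string _prefix;
    std::atomic<unsigned long long> _counter;
};
```

Because the provider builds the identifier itself, it knows the value it assigned and can place it in CorrelatedIndications of later indications when correlating them.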
[
¤164]
[
¤165]
[
¤166] Copyright (c) 2008 Hewlett-Packard Development
Company,
L.P.; IBM Corp.; EMC Corporation; Symantec Corporation; The Open Group.
[
¤167]
[
¤168]
Permission is hereby granted, free of charge, to any
person
obtaining a copy of this software and associated documentation
files
(the "Software"), to deal in the Software without restriction,
including without limitation the rights to use, copy, modify, merge,
publish,
distribute, sublicense, and/or sell copies of the Software, and to
permit
persons to whom the Software is furnished to do so, subject to the
following
conditions:
[
¤169]
[
¤170]
THE ABOVE COPYRIGHT NOTICE AND THIS PERMISSION NOTICE
SHALL BE
INCLUDED IN ALL COPIES OR SUBSTANTIAL PORTIONS OF THE SOFTWARE. THE
SOFTWARE IS
PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
IN THE
SOFTWARE.