[ ¤1] Pegasus Enhancement Proposal (PEP)

[ ¤2] PEP #: 324

[ ¤3] PEP Type: Functional
[ ¤4]

[ ¤5] Title: DMTF Indications Profile (DSP1054) Implementation, stage 2.
[ ¤6]

[ ¤7] Version: 0.8
[ ¤8]

[ ¤9] Created: 11th April 2008

[ ¤10] Authors: Venkateswara Rao Puvvada
[ ¤11]

[ ¤12] Status:  draft

[ ¤13] Version History:

[ ¤14] Version [ ¤15] Date [ ¤16] Author [ ¤17] Change Description
[ ¤18] 0.1 [ ¤19] 11th April 2008
[ ¤20]
[ ¤21] Venkat Puvvada
[ ¤22]
[ ¤23] Initial Submission
[ ¤24] 0.2
[ ¤25]
[ ¤26] 21st April 2008
[ ¤27]
[ ¤28] Venkat Puvvada
[ ¤29]
[ ¤30] Added more design details, removed
[ ¤31] CIM_IndicationServiceSettingData class, removed modification to CIM_IndicationService
[ ¤32] and CIM_IndicationServiceCapability classes.
[ ¤33]
[ ¤34] 0.3
[ ¤35]
[ ¤36] 1st May 2008
[ ¤37]
[ ¤38] Venkat Puvvada
[ ¤39]
[ ¤40] Full rewrite using implementation experience, Removed indication persistence
[ ¤41]
[ ¤42] 0.4
[ ¤43]
[ ¤44] 5th May 2008
[ ¤45]
[ ¤46] Venkat Puvvada
[ ¤47]
[ ¤48] Decided to move retry logic from IndicationService to HandlerService. Disabling/removing
[ ¤49] subscriptions does not affect indications in the retry queue.
[ ¤50]
[ ¤51] 0.5
[ ¤52]
[ ¤53] 8th May 2008
[ ¤54]
[ ¤55] Venkat Puvvada
[ ¤56]
[ ¤57] Added RetryThread Algorithm
[ ¤58]
[ ¤59] 0.6
[ ¤60]
[ ¤61] 15th May 2008
[ ¤62]
[ ¤63] Venkat Puvvada
[ ¤64]
[ ¤65] Modified RetryThread Algorithm; decided to retry the indication DeliveryRetryAttempts + 1 times when
[ ¤66] the indication was not attempted for initial delivery because the retry queue already exists.
[ ¤67]
[ ¤68] 0.7
[ ¤69]
[ ¤70] 19th May 2008
[ ¤71]
[ ¤72] Venkat Puvvada
[ ¤73]
[ ¤74] Added flowchart/ design picture
[ ¤75]
[ ¤76] 0.8
[ ¤77]
[ ¤78] 14th July 2009
[ ¤79]
[ ¤80] Venkat Puvvada
[ ¤81]
[ ¤82] Full  rewrite using approved concept PEP 299
[ ¤83]

[ ¤84]

[ ¤85] Abstract: This PEP implements indication delivery retry, using the CIM_IndicationService DeliveryRetryAttempts and DeliveryRetryInterval properties, for cases where indication delivery has failed because of 'temporary' errors in the protocol.


[ ¤86] Definition of the Problem

Today, an indication that fails to be delivered on the first attempt is never delivered, even if the failure was only temporary.
[ ¤87]
[ ¤88] Currently, when indication delivery fails, we write a trace message and drop the indication. No attempt is made to retry the failed delivery. We need a mechanism to avoid losing indications during temporary network problems. The DSP1054 central class CIM_IndicationService has the properties DeliveryRetryAttempts and DeliveryRetryInterval, which define how many retry attempts should be made, and at what interval, before the indication is discarded.
[ ¤89]
[ ¤90] PEP 323 implements all the mandatory classes from DSP1054. CIM_IndicationService is implemented as part of PEP 323, but its DeliveryRetryAttempts and DeliveryRetryInterval properties are not yet used to implement delivery retry.
[ ¤91]
[ (k_schopmeyer) I am not sure what you mean by this comment. If you mean that we will want to save to disk, keep undelivered queue on disk, etc. That may be a bit extreme and at least MUST be considered an option. Would like to discuss this one at 16 July call.
(venkat_puvvada) I mean its a queue in memory in this PEP. Storing to disk can be considered later.
¤92]
When indications are delivered frequently, we need a mechanism to store undelivered indications (an in-memory queue in this PEP) before they are discarded.
[ ¤93]
[ ¤94]
Proposed Solution
[ ¤95]

[ ¤96]
This PEP proposes to retry failed indication deliveries a fixed number of times, so that an indication is still delivered when only a temporary error interrupted the delivery attempt. Delivery is retried within reasonable limits, which are governed by the CIM_IndicationService class properties DeliveryRetryAttempts and DeliveryRetryInterval. An indication is discarded and logged once the number of retries for it exceeds DeliveryRetryAttempts.
[ ¤97]
[ ¤98] Note: This PEP proposes a solution for CIMXMLIndication listener destinations only. Solutions for other types of listener destinations, such as server-based consumer providers, SNMP handlers, and email handlers, are not implemented as part of this PEP.
[ ¤99]
[ ¤100] Terms used:
[ ¤101]
[ ¤102] DestinationQueue :
Queue maintained for each ListenerDestination to store the indications to be delivered/retried.
[ ¤103] DeliveryRetry : A further attempt to deliver an indication, made after CIM_IndicationService.DeliveryRetryInterval has elapsed.
[ ¤104] DestinationQueueTable : A hash table containing all DestinationQueues.
[ ¤105] DispatcherThread : Thread that monitors all DestinationQueues in the DestinationQueueTable and attempts DeliveryRetry using a dedicated thread pool.
[ ¤106]
[ ¤107] Brief summary of the proposed solution.
[ ¤108]
  1. [ ¤109] Use CIM_IndicationService class properties DeliveryRetryAttempts and DeliveryRetryInterval to implement delivery retry in case of temporary failures for CIM-XML Handlers.
  2. [ ¤110] Indication delivery will be considered successful  only when CIMExportResponseMessage is received from CIMListener without any exception.
  3. [ (k_schopmeyer) Note that this is still at the proposal stage in the DMTF and the way things are going may not even make our 2.10 cutoff. Thus we have to be careful with this setting property in correlation array function since we probably should find a way to make it optional code until the proposal becomes a DMTF spec.
    (venkat_puvvada) Sure, it can be made optional. I will add new build option like PEGASUS_USE_CORRELATEDINDICATIONS_FOR_DELIVERYRETRY
    ¤111]
    When a delivery retry attempt is made, the value of the IndicationIdentifier property of the CIM_Indication instance being delivered is added as a new array element in its CorrelatedIndications property. This marks the delivery as a transmission retry. The CIMListener should check the CorrelatedIndications array (this idea is already accepted in the DMTF) to determine whether the indication is being retried and to identify duplicates. If IndicationIdentifier is null or not present, a null value is added to the CorrelatedIndications array.
  4. [ ¤112] IndicationService sends a message to HandlerService when a subscription is no longer active. HandlerService deletes the indications in the DestinationQueue that matched the subscription and logs them.
  5. [ ¤113] Modify the CIMHandler interface to return error codes instead of throwing exceptions, which helps to identify temporary failures
    [ ¤114] and to attempt delivery retry for them.
    [ ¤115]
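The retry marking described in item 3 can be illustrated with a minimal sketch. It assumes a simplified stand-in for CIM_Indication (the `IndicationSketch` struct and both helper functions are hypothetical names, not Pegasus APIs) and models a NULL property value with `std::nullopt`. It also assumes that one element is appended per retry, which the text above does not spell out.

```cpp
#include <optional>
#include <string>
#include <vector>

// Hypothetical stand-in for the two CIM_Indication properties involved.
// std::nullopt models a NULL (or absent) property value.
struct IndicationSketch
{
    std::optional<std::string> indicationIdentifier;
    std::vector<std::optional<std::string>> correlatedIndications;
};

// Server side: mark the indication as a transmission retry before it is
// re-sent. A NULL IndicationIdentifier is appended as a NULL array element.
void markAsRetry(IndicationSketch& indication)
{
    indication.correlatedIndications.push_back(
        indication.indicationIdentifier);
}

// Listener side: treat the indication as a retry if its own identifier
// already appears in the CorrelatedIndications array.
bool isRetry(const IndicationSketch& indication)
{
    for (const std::optional<std::string>& id :
         indication.correlatedIndications)
    {
        if (id == indication.indicationIdentifier)
        {
            return true;
        }
    }
    return false;
}
```

With a NULL identifier the listener can still see that a retry happened, but it cannot tell which earlier indication is the duplicate, which is the limitation noted for providers that do not set IndicationIdentifier.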
Functionality not implemented as part of this PEP.
[ ¤116]
[ ¤117] 1. This PEP does not propose a solution for other types of handlers, such as SNMP handlers and email handlers.
[ ¤118] 2. The properties CIM_IndicationService.SubscriptionRemovalTimeInterval and CIM_IndicationService.SubscriptionRemovalAction are not used as part
[ ¤119]     of this PEP. The default value of CIM_IndicationService.SubscriptionRemovalAction is 'Ignore'. Subscriptions will not be removed or disabled when
[ ¤120]     CIM_IndicationService.SubscriptionRemovalTimeInterval expires. There is no change to the current behavior.
[ ¤121]
[ ¤122] Current behavior
[ ¤123]
[ ¤124] The following sequence of operations happens during the indication delivery at present.
[ ¤125]
  1. [ ¤126] Indications arrive at HandlerService from IndicationService in the form of CIMHandleIndicationRequestMessage. IndicationService sends them using the SendAsync() method with a callback.
    [ ¤127]
  2. [ ¤128] HandlerService loads the appropriate Handler and passes the indication to it for delivery.
  3. [ ¤129] The Handler (CIM-XML, SNMP, etc.) tries to send the indication and reports the delivery status to HandlerService through a CIMException.
  4. [ ¤130] HandlerService constructs a CIMHandleIndicationResponseMessage and adds the exception if delivery has failed. The response is sent to IndicationService.
    [ ¤131]
  5. [ ¤132] IndicationService receives the CIMHandleIndicationResponseMessage through the callback method. It takes no action other than adding a trace message if indication delivery has failed, and then destroys the response.
    [ ¤133]
Proposed implementation
[ ¤134]

[ ¤135]
Thread and Queue model
[ ¤136]
[ ¤137] The major goal of the thread and queue model is that there should be no impact on indication delivery to destinations that are up and running and able to receive indications without any problems.
[ ¤138]
[ ¤139]
[ ¤140]
[ ¤141] The following sequence of operations happens during indication delivery once the proposed solution is implemented.
[ ¤142]
  1. [ ¤143] Indications arrive at HandlerService from IndicationService in the form of CIMHandleIndicationRequestMessage. IndicationService sends them using SendForget().
  2. [ ¤144] HandlerService checks whether a DestinationQueue exists for the destination of the indication. If it does, the indication is enqueued onto that DestinationQueue; otherwise processing continues with step 3.
  3. [ ¤145] HandlerService loads the appropriate Handler and passes the indication to it for delivery.
  4. [ ¤146] The Handler (currently only CIM-XML) tries to send the indication and returns the delivery status to HandlerService.
  5. [ ¤147] If indication delivery is not successful, HandlerService creates a new DestinationQueue and enqueues the indication onto it.
    [ ¤148]
Note 1: Indications that are enqueued onto a DestinationQueue at step 2 or step 5 are delivered later by the DispatcherThread, according to the CIM_IndicationService properties DeliveryRetryAttempts and DeliveryRetryInterval. The DispatcherThread monitors all the queues in the DestinationQueueTable, takes eligible indications from them, puts these on the DeliveryQueue, and starts worker threads from the DeliveryThreadPool. The worker threads take indications from the DeliveryQueue and deliver them, and they update the delivery status of the DestinationQueues directly. Details are explained later in this PEP.
[ ¤149] Note 2:
Indications are sent from IndicationService to HandlerService using SendForget(). At present, IndicationService takes no action when the response that arrives through the callback method from HandlerService reports a failed delivery. In a future enhancement, HandlerService will send a message to IndicationService to reconcile SubscriptionRemovalAction, which implements the subscription's OnFatalErrorPolicy. Using SendForget() greatly reduces the burden on the metadispatcher, which no longer has to route responses back to IndicationService, and so improves performance.
[ ¤150]
[ ¤151]
DeliveryRetry
[ ¤152]

[ ¤153]
When delivery of an indication to a particular ListenerDestination fails, the unsent indications are queued per ListenerDestination. Each such ListenerDestination has its own DestinationQueue, which is created the first time indication delivery to that destination fails. DestinationQueue is a list (a C++ class with methods to insert indications into and delete them from the queue) that grows dynamically up to the size given by the indicationDeliveryQueueSize config property. This is a new static config property introduced with this PEP. The DestinationQueueTable is a hash table that contains all DestinationQueues. The handler name (String) is used as the key to look up a queue in the table.
[ ¤154]
[ (k_schopmeyer) The trace is primarily a development tool. Should we not be logging something when we throw indications away. The original use of the discarded data was for 'abnormal' discards, those things that were probably due to pegasus problems. This is a normal event, queue-too-big, discard.
(venkat_puvvada) Yes, i agree. Discarded indications will be just logged.
¤155]
The first delivery failure of any indication starts a DispatcherThread, which monitors all the DestinationQueues in the table and attempts DeliveryRetry according to the DeliveryRetryAttempts and DeliveryRetryInterval properties of the CIM_IndicationService instance. When the number of indications in a DestinationQueue exceeds indicationDeliveryQueueSize, the oldest indications are removed from the front of the DestinationQueue and the new indication is added to the back. Discarded indications are logged in the trace under the DiscardedData component.
[ ¤156]
[ ¤157] If DeliveryRetry fails, the indication is inserted at the front of the queue if the queue is not full; otherwise the indication is discarded. When a new indication is added to a queue that is full, the indication at the front of the queue is removed and the new indication is added at the back.

[ ¤158]
[ (k_schopmeyer) Since we are now going to have a mechanism that uses memory to store data for possibly long periods of time, can cause log entries when indications are discarded, and also is going to ask the administrator to set config variables, I think we are going to have to have some tools so that the admin can figure out what is happening. Are there indications in retry, how many, how long, possibly which destinations, what are the high-water marks, etc. Without this type of information, the admin will not really understand when his server develops memory issues because of large numbers of retries in queue and will not have any real clue how to set the config variables.
(venkat_puvvada) Yes , i will add class like PG_IndicationDeliveryQueue, which will have properties like, name, size, creation time, last delivery time, number of indications discarded, number of indications successfully delivered. User can enumerate instances of PG_IndicationDeliveryQueue and check for number of delivery queues and their status.
¤159]
If there are 'n' DestinationQueues and DeliveryRetry succeeds for a particular DestinationQueue, we do not go on to deliver all the remaining indications from that DestinationQueue. Instead, we continue iterating over all DestinationQueues, trying to deliver their indications, and come back to the successful DestinationQueue on the next pass. This iterative approach gives all DestinationQueues the same priority when attempting DeliveryRetry. It also helps when a consumer is too slow to receive indications continuously without any delay, and it avoids a spike of activity when there are many DestinationQueues and the network suddenly comes back up. When the number of indications in a DestinationQueue drops to zero, the DestinationQueue is removed from the DestinationQueueTable. If the DestinationQueueTable is empty, the DispatcherThread exits.
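The one-indication-per-queue iteration can be sketched as follows. This is a simplified illustration only: `QueueTable` and `takeOnePerQueue` are hypothetical names, a `std::string` stands in for an indication, and an ordered `std::map` is used instead of a hash table purely so that the iteration order is predictable. Locking, retry intervals and delivery itself are left out.

```cpp
#include <deque>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for the DestinationQueueTable: handler name -> queue.
using QueueTable = std::map<std::string, std::deque<std::string>>;

// One dispatcher pass: take at most one indication from every queue, so that
// no single destination can monopolise delivery. Queues that become empty
// are removed from the table.
std::vector<std::string> takeOnePerQueue(QueueTable& table)
{
    std::vector<std::string> batch;
    for (QueueTable::iterator it = table.begin(); it != table.end();)
    {
        std::deque<std::string>& queue = it->second;
        if (!queue.empty())
        {
            batch.push_back(queue.front());
            queue.pop_front();
        }
        if (queue.empty())
        {
            it = table.erase(it);  // erase returns the next valid iterator
        }
        else
        {
            ++it;
        }
    }
    return batch;
}
```

Calling the function repeatedly drains a long queue one element per pass while shorter queues are served in the same passes; once the table is empty the caller would stop, which corresponds to the DispatcherThread exiting.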
[ ¤160]
[ ¤161]
Overview of how delivery retries are attempted.
[ ¤162]
[ (k_schopmeyer) At this point, we are getting to where we will have a number of different 'scheduled' thread mechanisms between a) provider unload, pull operations timers, etc. and I wonder if it is not time to define a simple scheduler instead of everybody doing their own thread/wait mechanism. This should not be too difficult, one thread to run the scheduler and an api to enter new timed events in the scheduler.
(venkat_puvvada) With the proposed solution DispatcherThread is only created when there are delivery failures and thread automatically terminates when there are no indications to be delivered. Having a scheduler is nice idea, we can definitely have it proposed and discussed in a separate PEP.
¤163]
Once started, the DispatcherThread sleeps for RETRY_THREAD_WAIT_TIME before monitoring the queues. The default value is 100 milliseconds, and this value is configurable at build time. Each queue has a lastDeliveryTime, which is used to determine whether the DeliveryRetryInterval has expired for the queue. The DispatcherThread acquires the read lock on the DestinationQueueTable, removes one indication from each DestinationQueue whose DeliveryRetryInterval has expired or whose last delivery result was successful, enqueues these indications onto the DeliveryQueue, and starts worker threads in the DeliveryThreadPool. The maximum number of delivery worker threads is 5, which is configurable at build time. During this pass it also collects the idle queues (those whose size is zero and that have no pending operations), and then it releases the lock on the DestinationQueueTable. The delivery worker threads deliver the indications from the DeliveryQueue and update each queue with the status of its delivery. If any queues are idle, the DispatcherThread obtains the write lock on the DestinationQueueTable and deletes them.
[ ¤164]
[ ¤165] Server behavior from a client/listener perspective
[ ¤166]

[ ¤167] From a client perspective, nothing changes with this PEP implemented, except that on temporary errors (caused by the network, a crash of the listener, etc.) the CIM Server tries to redeliver an indication a fixed number of times, at intervals of a given minimum time. Thus the client can trust the CIM Server to retry sending the seemingly lost indication for a time of at least DeliveryRetryAttempts multiplied by DeliveryRetryInterval. When a delivery retry attempt is made by the CIM Server, the value of the IndicationIdentifier property of the CIM_Indication instance being delivered is added as a new array element in its CorrelatedIndications property. This marks the delivery as a transmission retry. The CIMListener should check the CorrelatedIndications array (this idea is already accepted in the DMTF) to determine whether the indication is being retried and to identify duplicates. It is recommended that the provider sets a unique IndicationIdentifier and the IndicationTime. If the provider does not set IndicationIdentifier, or its value is NULL, it is difficult to identify duplicate indications; this is a current limitation. The design of IndicationIdentifier is under discussion in the DMTF at present. Delivery retry support is controlled by the build option PEGASUS_ENABLE_DMTF_INDICATION_PROFILE_SUPPORT, and an implementation may choose to turn this feature off if its providers are not capable of setting IndicationIdentifier. Clients may also choose to employ other mechanisms to compare indications; for example, a client may compare all the properties present in the indication instance if IndicationIdentifier is not available.
[ ¤168]
[ ¤169]
Assuming that indicationDeliveryQueueSize is large enough, the CIM Server can deliver indications whose delivery has failed to the client/listener without losing them, provided the listener becomes reachable within the minimum time interval of (DeliveryRetryAttempts * DeliveryRetryInterval).
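As a worked example of this lower bound, the sketch below multiplies the two property values. The function name is hypothetical, and it assumes that DeliveryRetryInterval is expressed in seconds; the 30-second interval used in the usage note is an illustrative value, not one taken from this PEP.

```cpp
#include <cstdint>

// Lower bound, in seconds, on how long the server keeps retrying a single
// indication: DeliveryRetryAttempts * DeliveryRetryInterval.
// Assumption: the interval is given in seconds.
std::uint64_t minimumRetryWindowSeconds(
    std::uint16_t deliveryRetryAttempts,
    std::uint32_t deliveryRetryIntervalSeconds)
{
    // Widen before multiplying so the product cannot overflow.
    return static_cast<std::uint64_t>(deliveryRetryAttempts) *
        deliveryRetryIntervalSeconds;
}
```

With the default of 3 retry attempts and an assumed interval of 30 seconds, `minimumRetryWindowSeconds(3, 30)` gives 90, so a listener that recovers within 90 seconds would still receive the indication as long as it has not been pushed out of a full queue.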
[ ¤170]
[ ¤171]
CIM_IndicationService.DeliveryRetryAttempts
[ ¤172]
[ ¤173] The DeliveryRetryAttempts property defines the number of times that the indication service will try to deliver an indication to a particular listener destination. This value does not include the original delivery attempt.
[ ¤174]
[ ¤175] Default value : 3
[ ¤176] Modifiable : No.
[ ¤177]
[ ¤178] Note: PEP 323 proposed the default values for the properties of the CIM_IndicationService and CIM_IndicationServiceCapabilities instances. Those remain unchanged, except for the CIM_IndicationService.DeliveryRetryAttempts property mentioned above.
[ ¤179]
[ ¤180] Note 1: CIMServer memory consumption may go up if indicationDeliveryQueueSize is large and there are many DestinationQueues. The maximum allowed queue size is 200 (this value can be discussed); the default value is 50.
[ ¤181]
Note 2:
  If indicationDeliveryQueueSize is zero, no retry attempts are made.
[ ¤182]
[ ¤183] Indication Information
[ ¤184]
[ ¤185] Indications are stored in the DestinationQueue in the form of IndicationInfo class. IndicationInfo class will have the following properties. (See more information in "New Classes added" section below)
[ ¤186]
[ ¤187]     CIMInstance _indication; - Indication to be delivered
[ ¤188]     CIMInstance _subscription; - Subscription to which indication matched
[ ¤189]     OperationContext _context; - OperationContext for the delivery of indication.
[ ¤190]     String _nameSpace; - Namespace from where indication originated
[ ¤191]     DestinationQueue *_queue; - Pointer to the DestinationQueue to which this indication belongs; it is used to route the indication to the appropriate queue.
[ ¤192]     Uint16 _deliveryAttemptsMade; - Number of delivery attempts made.
[ ¤193]
[ ¤194]
Algorithms
[ ¤195]
[ ¤196]
Note: See "New Classes Added " section below for more details on DestinationQueue and IndicationInfo classes and "HandlerService changes" section for changes made to HandlerService
[ ¤197]
[ ¤198]
Algorithm 1:  Lookup, creation of DestinationQueue and  adding the indication to the DestinationQueue.
[ ¤199]
[ ¤200] (1) Indication arrives to HandlerService from IndicationService
[ ¤201] (2) Acquire  ReadLock on  DestinationQueueTable
[ ¤202] (3) IF (DestinationQueue exists for this ListenerDestination)
[ ¤203]          (a) Put the indication at the back of the queue using the method DestinationQueue::insertBack() (See the algorithm 2 below for this method).
[ ¤204]     ELSE
[ ¤205]          (a) Try to deliver the indication by loading the appropriate Handler (CIM-XML, SNMP etc...)
[ ¤206]          (b)  IF (Indication delivery is successful)
[ ¤207]                     (aa) Delete the indication.
[ ¤208]               ELSE
[ ¤209]                     IF (Destination is CIM-XML Handler type)
[ ¤210]                          (aa) Create DestinationQueue if it does not exist.
[ ¤211]                          (bb) Set queue last delivery status to FAIL.
[ ¤212]                          (cc) Put the indication at the back of the queue using the method DestinationQueue::insertBack().
[ ¤213]                          (dd) Start DispatcherThread if it is not running.
[ ¤214]                     ELSE
[ ¤215]                          (aa) Delete and log the indication.
[ ¤216]                     ENDIF
[ ¤217]               ENDIF
[ ¤218]     ENDIF
[ ¤219] (4) Release  ReadLock on  DestinationQueueTable.
[ ¤220]
[ ¤221] Note: When a new indication (one not yet attempted for DeliveryRetry) arrives at HandlerService and a DestinationQueue already exists for its destination, we simply put the indication at the back of the queue without attempting delivery. This preserves the ordering of the indications. Such indications are attempted for delivery DeliveryRetryAttempts + 1 times.
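The decision flow of Algorithm 1 can be sketched in a few lines. This is a simplified illustration, not the HandlerService code: all names are hypothetical, a `std::string` stands in for an indication, the handler is passed in as a callable, and locking, queue size limits and starting the DispatcherThread are omitted.

```cpp
#include <deque>
#include <functional>
#include <string>
#include <unordered_map>

// Hypothetical stand-in for the DestinationQueueTable, keyed by handler name.
using DestinationQueueTable =
    std::unordered_map<std::string, std::deque<std::string>>;

enum class Outcome { Delivered, Queued, Discarded };

Outcome handleIndication(
    DestinationQueueTable& table,
    const std::string& handlerName,
    const std::string& indication,
    bool isCimXmlHandler,
    const std::function<bool(const std::string&)>& tryDeliver)
{
    DestinationQueueTable::iterator it = table.find(handlerName);
    if (it != table.end())
    {
        // A queue already exists: enqueue without attempting delivery,
        // so that the indication order is preserved.
        it->second.push_back(indication);
        return Outcome::Queued;
    }
    if (tryDeliver(indication))
    {
        return Outcome::Delivered;
    }
    if (!isCimXmlHandler)
    {
        // Retry is proposed for CIM-XML destinations only.
        return Outcome::Discarded;
    }
    // First failure for this destination: create the queue and enqueue.
    table[handlerName].push_back(indication);
    return Outcome::Queued;
}
```

Note that once a queue exists, later indications for the same destination are queued even if the destination has recovered; they are then delivered by the dispatcher in arrival order.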
[ ¤222]
[ ¤223] Algorithm 2: DestinationQueue::insertBack() method
[ ¤224]
[ ¤225]
1. Acquire Mutex lock on the queue.
[ ¤226] 2. IF (queue is full) THEN
[ ¤227]         (a) Delete the indication at the front of the queue and log the indication
[ ¤228]     ENDIF
[ ¤229] 3. Insert the indication at the back of the queue.
[ ¤230] 4. Release Mutex lock on the queue.
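A minimal sketch of this insertBack() behavior, assuming a simplified queue class: `DestinationQueueSketch` is a hypothetical name, a `std::string` stands in for an indication, and a `_discarded` list stands in for logging. The sketch also assumes a maximum size of at least one, since a size of zero means that no retries are made at all.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <mutex>
#include <string>
#include <vector>

class DestinationQueueSketch
{
public:
    explicit DestinationQueueSketch(std::size_t maxSize) : _maxSize(maxSize)
    {
        assert(maxSize > 0);  // a zero-size queue disables retry entirely
    }

    // Algorithm 2: if the queue is full, drop the oldest entry, then append.
    void insertBack(const std::string& indication)
    {
        std::lock_guard<std::mutex> lock(_mutex);   // step 1
        if (_queue.size() >= _maxSize)              // step 2
        {
            _discarded.push_back(_queue.front());   // stands in for logging
            _queue.pop_front();
        }
        _queue.push_back(indication);               // step 3
    }                                               // step 4: lock released

    std::vector<std::string> contents() const
    {
        std::lock_guard<std::mutex> lock(_mutex);
        return std::vector<std::string>(_queue.begin(), _queue.end());
    }

    std::vector<std::string> discarded() const
    {
        std::lock_guard<std::mutex> lock(_mutex);
        return _discarded;
    }

private:
    mutable std::mutex _mutex;
    std::deque<std::string> _queue;
    std::vector<std::string> _discarded;
    std::size_t _maxSize;
};
```

For a queue of size 2, inserting "a", "b" and "c" in that order leaves "b" and "c" queued and records "a" as discarded, so the newest indications are the ones that survive.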
[ ¤231]

[ ¤232]
Algorithm 3: DispatcherThread algorithm
[ ¤233]
[ ¤234] RETRY_THREAD_WAIT_TIME= 100 milli seconds (default)
[ ¤235] MAX_DELIVERY_THREADS=5
[ ¤236] idleQueues Array - Holds the idle queues(whose size is zero and there are no pending operations) at the end of each iteration.
[ ¤237]
[ ¤238] (1) WHILE forever DO
[ ¤239]       (a) Sleep for RETRY_THREAD_WAIT_TIME
[ ¤240]       (b) Acquire ReadLock on the DestinationQueueTable.
[ ¤241]       (c) FOR each DestinationQueue in the DestinationQueueTable DO
[ ¤242]                (aa)  Get the next indication to be delivered from the DestinationQueue using DestinationQueue::getNextIndicationForDelivery() method.(See algorithm 4
[ ¤243]                        below)
[ ¤244]                (bb) IF (There is an eligible indication to be delivered from this queue) THEN
[ ¤245]                           (aaa)  Put the indication onto the DeliveryQueue.
[ ¤246]                           (bbb) IF (delivery worker threads running count <  MAX_DELIVERY_THREADS)
[ ¤247]                                          (aaaa) Increment delivery worker threads running count
[ ¤248]                                          (bbbb) Start delivery worker thread.
[ ¤249]                                    ENDIF
[ ¤250]                        ENDIF
[ ¤251]                (cc)  IF  (queue is idle) THEN
[ ¤252]                            (aaa) Add the queue to the idleQueues array.
[ ¤253]                       ENDIF
[ ¤254]            ENDFOR
[ ¤255]      (d) Release ReadLock on the DestinationQueueTable.
[ ¤256]      (e) IF (There are queues in idleQueues array) THEN
[ ¤257]             (aa) Acquire WriteLock on DestinationQueueTable.
[ ¤258]             (bb) FOR each queue in idleQueues Array DO
[ ¤259]                        (aaa)  IF (queue status is still idle) THEN
[ ¤260]                                         (aaaa) Delete the queue and remove from the DestinationQueueTable.
[ ¤261]                                 ENDIF
[ ¤262]                    ENDFOR
[ ¤263]             (cc) IF (There are no queues left in the DestinationQueueTable) THEN
[ ¤264]                        (aaa) Release the WriteLock.
[ ¤265]                        (bbb) Exit the thread.
[ ¤266]                     ENDIF
[ ¤267]             (dd) Release WriteLock on DestinationQueueTable.
[ ¤268]             (ee) Clear idleQueues array.
[ ¤269]           ENDIF
[ ¤270]      (f)  Clean up idle threads in DeliveryThreadPool every 5 minutes.
[ ¤271] ENDWHILE
[ ¤272]
[ ¤273] Algorithm 4: DestinationQueue::getNextIndicationForDelivery() method
[ ¤274]
[ ¤275] (1)  Acquire Mutex on the queue.
[ ¤276] (2) IF ( (Queue is not empty) AND (Queue status is not PENDING) AND (queue last delivery result is success OR (current time - last delivery time of the queue)
[ ¤277]       > DeliveryRetryInterval time)) THEN
[ ¤278]           (a) Set the queue last delivery status to PENDING.
[ ¤279]           (b) RETURN first indication from the queue after releasing the Mutex on the queue.
[ ¤280]      ENDIF
[ ¤281] (3) RETURN NULL after releasing the Mutex on the queue.
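The eligibility test of Algorithm 4 can be sketched as below. The class and its helper methods are hypothetical simplifications: a `std::string` stands in for an indication, the current time is passed in so that the behavior is easy to follow, and the returned indication is removed from the queue (the failure path puts it back with insertFront()).

```cpp
#include <chrono>
#include <deque>
#include <mutex>
#include <optional>
#include <string>
#include <utility>

enum class QueueStatus { Success, Fail, Pending };

class DestinationQueueSketch
{
public:
    using Clock = std::chrono::steady_clock;

    // A queue is created after a failed delivery, so it starts as Fail.
    DestinationQueueSketch(
        std::chrono::seconds retryInterval, Clock::time_point createdAt)
        : _retryInterval(retryInterval),
          _status(QueueStatus::Fail),
          _lastDeliveryTime(createdAt)
    {
    }

    void push(const std::string& indication)
    {
        std::lock_guard<std::mutex> lock(_mutex);
        _queue.push_back(indication);
    }

    // Called by a worker thread once a delivery attempt has finished.
    void setLastDelivery(QueueStatus status, Clock::time_point when)
    {
        std::lock_guard<std::mutex> lock(_mutex);
        _status = status;
        _lastDeliveryTime = when;
    }

    // Returns the next indication if the queue is eligible, else nothing.
    std::optional<std::string> getNextIndicationForDelivery(
        Clock::time_point now)
    {
        std::lock_guard<std::mutex> lock(_mutex);
        const bool intervalExpired =
            (now - _lastDeliveryTime) > _retryInterval;
        if (!_queue.empty() && _status != QueueStatus::Pending &&
            (_status == QueueStatus::Success || intervalExpired))
        {
            _status = QueueStatus::Pending;  // one delivery in flight
            std::string next = std::move(_queue.front());
            _queue.pop_front();
            return next;
        }
        return std::nullopt;
    }

private:
    std::mutex _mutex;
    std::deque<std::string> _queue;
    std::chrono::seconds _retryInterval;
    QueueStatus _status;
    Clock::time_point _lastDeliveryTime;
};
```

The PENDING state is what keeps deliveries to one destination sequential: until a worker reports a result through `setLastDelivery`, the queue hands out nothing further, however much time has passed.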
[ ¤282]
[ ¤283] Algorithm 5:  Delivery worker thread algorithm
[ ¤284]
[ ¤285] (1) WHILE (Indications exists on DeliveryQueue) DO
[ ¤286]          (a)  Remove the first indication from the queue.
[ (venkat_puvvada) Here, IndicationIdentifier of the Indication is added to CorrelatedIndications array of indication to mark it as transmission retry if this indication was already attempted for delivery.
¤287]
         (b) Try to deliver the indication by loading CIM-XML Handler (currently only CIM-XML Handler supported)
[ ¤288]          (c) IF (Indication delivery is successful) THEN
[ ¤289]                  (aa) Acquire Mutex on the queue.
[ ¤290]                  (bb) Update queue last delivery status to SUCCESS.
[ ¤291]                  (cc) Update queue lastDeliveryTime to current time.
[ ¤292]                  (dd) Release Mutex on the Queue.
[ ¤293]              ELSE
[ ¤294]                  (aa) Acquire Mutex on the queue.
[ ¤295]                  (bb) Update queue last delivery status to FAIL.
[ (venkat_puvvada) here, Indication is deleted and logged if delivery attempts made for indication exceed CIM_IndicationService.DeliveryRetryAttempts else DestinationQueue::insertFront() method is called in the next step.
¤296]
                 (cc) Call DestinationQueue::insertFront() method to put indication back onto the queue. (see algorithm 6 below).
[ ¤297]                  (dd) Update queue lastDeliveryTime to current time.
[ ¤298]                  (ee) Release Mutex on the Queue.               
[ ¤299]             ENDIF
[ ¤300]     ENDWHILE
[ ¤301] (2) Decrement delivery worker threads running count.
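The requeue-or-discard decision that a worker makes after each attempt can be sketched as a small function. The names are hypothetical, and the sketch assumes that `attemptsMade` counts every delivery attempt for the indication, including the one that has just finished, so that an indication is discarded once the original attempt plus DeliveryRetryAttempts retries have all failed.

```cpp
enum class NextStep { Done, Requeue, Discard };

// Decide what to do with an indication after a delivery attempt.
// attemptsMade includes the attempt that has just finished (assumption).
NextStep afterDeliveryAttempt(
    bool delivered, unsigned attemptsMade, unsigned deliveryRetryAttempts)
{
    if (delivered)
    {
        return NextStep::Done;
    }
    // Discard once the attempts made exceed DeliveryRetryAttempts;
    // otherwise put the indication back at the front of its queue.
    return attemptsMade > deliveryRetryAttempts ? NextStep::Discard
                                                : NextStep::Requeue;
}
```

With DeliveryRetryAttempts set to 3, failures on attempts 1 to 3 lead to a requeue and a failure on attempt 4 leads to a discard, which matches the DeliveryRetryAttempts + 1 total attempts mentioned in the note to Algorithm 1.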
[ ¤302]
[ ¤303] Algorithm 6: DestinationQueue::insertFront() method
[ ¤304]
[ ¤305]
1. Acquire Mutex lock on the queue.
[ ¤306] 2. IF (queue is full) THEN
[ ¤307]       (a) Delete and log the indication
[ ¤308]     ELSE
[ ¤309]       (a) Insert the indication at the front of the queue.
[ ¤310]     ENDIF
[ ¤311] 3. Release Mutex lock on the queue.
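For comparison with insertBack(), a minimal sketch of insertFront() follows. As before, `DestinationQueueSketch` is a hypothetical simplification in which a `std::string` stands in for an indication and a `_discarded` list stands in for logging; the boolean return value is an addition made only for illustration.

```cpp
#include <cstddef>
#include <deque>
#include <mutex>
#include <string>
#include <vector>

class DestinationQueueSketch
{
public:
    explicit DestinationQueueSketch(std::size_t maxSize) : _maxSize(maxSize)
    {
    }

    // Algorithm 6: put a failed indication back at the head of the queue.
    // If the queue is full, the indication itself is dropped; returns false
    // in that case.
    bool insertFront(const std::string& indication)
    {
        std::lock_guard<std::mutex> lock(_mutex);
        if (_queue.size() >= _maxSize)
        {
            _discarded.push_back(indication);  // stands in for logging
            return false;
        }
        _queue.push_front(indication);
        return true;
    }

    std::vector<std::string> contents() const
    {
        std::lock_guard<std::mutex> lock(_mutex);
        return std::vector<std::string>(_queue.begin(), _queue.end());
    }

    std::vector<std::string> discarded() const
    {
        std::lock_guard<std::mutex> lock(_mutex);
        return _discarded;
    }

private:
    mutable std::mutex _mutex;
    std::deque<std::string> _queue;
    std::vector<std::string> _discarded;
    std::size_t _maxSize;
};
```

The asymmetry with insertBack() is deliberate in the design above: a full queue sheds its oldest entry to admit a new indication, but an indication that has already failed delivery is not allowed to displace anything when it comes back.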
[ ¤312]

[ ¤313]
Changes to Handlers
[ ¤314]
[ ¤315] The CIMHandler interface will be changed to return an enum of status codes instead of throwing exceptions. Handlers need to return one of the following codes from the CIMHandler::handleIndication() method. It is the Handler's responsibility to decide whether a failure is permanent or temporary.
[ ¤316]
[ ¤317] enum DeliveryResult {SUCCESS, ERROR, FATALERROR, POSTDELIVERYERROR }
[ ¤318]
[ ¤319] SUCCESS - Delivery succeeded.
[ ¤320] FATALERROR - Permanent failure, no retry.
[ ¤321] ERROR - Error, can be retried later.
[ ¤322] POSTDELIVERYERROR - Error occurred after delivery while waiting for the response, can be retried later.
[ ¤323]
[ ¤324] CIMHandler::handleIndication() method prototype will be modified as follows.
[ ¤325]
[ ¤326]      virtual DeliveryResult handleIndication(
[ ¤327]          const OperationContext& context,
[ ¤328]          const String nameSpace,
[ ¤329]          CIMInstance& indicationInstance,
[ ¤330]          CIMInstance& indicationHandlerInstance,
[ ¤331]          CIMInstance& indicationSubscriptionInstance,
[ ¤332]          CIMException& cimException, // New parameter; contains the exception when ERROR, FATALERROR or POSTDELIVERYERROR is returned.
[ ¤333]          ContentLanguageList& contentLanguages) = 0;
[ ¤334]
[ ¤335] The following exceptions have been identified as ones that can be thrown from CIMXML handlers.
[ ¤336]
[ ¤337] (1) CIMException - CIM error code : CIM_ERR_FAILED
[ ¤338] Reasons:
[ ¤339]  (a) Destination property is missing from handler.
[ ¤340]  (b) Destination property type mismatch in handler, must be string.
[ ¤341]  (c) https is specified in URL when SSL is not enabled.
[ ¤342]
[ ¤343] (2) CIMException - CIM error code : CIM_ERR_NOT_SUPPORTED
[ ¤344] Reasons:
[ ¤345]    (a) URL format is not understood or protocol is not supported.
[ ¤346]
[ ¤347] (3) AlreadyConnectedException
[ ¤348] (4) CannotCreateSocketException
[ ¤349] (5) CannotConnectException
[ ¤350] (6) InvalidLocatorException
[ ¤351] (7) NotConnectedException
[ ¤352] (8) ConnectionTimedOutException
[ ¤353] (9) SSLException
[ ¤354]
[ ¤355] (10) CIMClientMalformedHTTPException - The HTTP response from the CIM Server was improperly formed, for example because the connection was closed by the remote end or invalid HTTP headers were received.
[ ¤356] (11) CIMClientHTTPErrorException - An HTTP error response was sent by the CIM Server, for example because authentication failed.
[ ¤357] (12) CIMClientResponseException - The export response contains unexpected data.
[ ¤358] (13) CIMClientXmlException - The XML in the export response cannot be validated.
[ ¤359] (14) bad_alloc
[ ¤360]
[ ¤361] For CIMXML Handlers, any exception other than a permanent failure, whether it occurs before or after the request is sent (including a failed write to the HTTPConnection queue), will be considered for DeliveryRetry. The first two CIMExceptions mentioned above will be considered permanent failures.
[ ¤362]
[ ¤363] For example, the CIMXML Handler returns FATALERROR if it catches either of the exceptions listed as (1) and (2) above.
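The classification rule can be sketched as a mapping onto DeliveryResult. This is an illustration under stated assumptions rather than the handler code: `FailureKind`, `classify` and `shouldRetry` are hypothetical names, the listed exceptions are collapsed into coarse categories, and the use of a `requestAlreadySent` flag to choose between ERROR and POSTDELIVERYERROR is an assumption based on the description of POSTDELIVERYERROR above.

```cpp
enum class DeliveryResult { SUCCESS, ERROR, FATALERROR, POSTDELIVERYERROR };

// Hypothetical coarse categories for what the CIMXML handler observed.
enum class FailureKind
{
    None,                // no exception, export response received
    CimErrFailed,        // exception (1) in the list above
    CimErrNotSupported,  // exception (2) in the list above
    Other                // any of the remaining exceptions
};

DeliveryResult classify(FailureKind kind, bool requestAlreadySent)
{
    switch (kind)
    {
        case FailureKind::None:
            return DeliveryResult::SUCCESS;
        case FailureKind::CimErrFailed:
        case FailureKind::CimErrNotSupported:
            return DeliveryResult::FATALERROR;  // permanent, never retried
        case FailureKind::Other:
            break;
    }
    // Temporary failure: distinguish errors raised after the request went
    // out (assumption, see lead-in) from errors raised before sending.
    return requestAlreadySent ? DeliveryResult::POSTDELIVERYERROR
                              : DeliveryResult::ERROR;
}

// Only the two temporary results are candidates for DeliveryRetry.
bool shouldRetry(DeliveryResult result)
{
    return result == DeliveryResult::ERROR ||
        result == DeliveryResult::POSTDELIVERYERROR;
}
```

Keeping the retry decision in a single predicate means that HandlerService does not need to know which exception was caught, only which DeliveryResult the handler returned.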
[ ¤364]
[ ¤365] Impact on other Handlers
[ ¤366]
[ ¤367]
There is no impact on other handlers, and their external behavior does not change. Instead of throwing an exception, the other handlers (SNMP, email, etc.) return it as an output argument of the CIMHandler::handleIndication() method. See the interface changes section below for more information.
[ ¤368]
[ ¤369] Except for the CIMXML Handler, all handlers will be changed to return either SUCCESS or FATALERROR. Exceptions are handled within the Handler itself; they are not propagated to HandlerService.
[ ¤370]
[ ¤371] A solution for handler types other than the CIMXML Handler is not proposed as part of this PEP. For example, the SNMP Handler returns either SUCCESS or FATALERROR. Because we are not proposing any solution for detecting temporary failures in SNMP Handlers, any error that occurs is treated as FATALERROR, and no retry happens. This also allows us to use a single interface and makes future modifications easier. For example, if we implement a solution for SNMP handlers in the future, all we need to do is identify the exceptions that indicate a temporary failure and return the ERROR code for them, so that delivery retry can take place.

[ ¤372]
[ ¤373] Changes to Common library
[ ¤374]
[ ¤375] A new CIMMessage, CIMNotifySubscriptionNotActiveRequestMessage, is introduced with this PEP. IndicationService sends this message to HandlerService using SendForget() when a subscription is deleted or disabled. HandlerService then deletes the indications matching the subscription from the DestinationQueue and logs them.
[ ¤376]
[ ¤377] class PEGASUS_COMMON_LINKAGE CIMNotifySubscriptionNotActiveRequestMessage
[ ¤378]     : public CIMRequestMessage
[ ¤379] {
[ ¤380] public:
[ ¤381]     CIMNotifySubscriptionNotActiveRequestMessage(
[ ¤382]         const String & messageId_,
[ ¤383]         const CIMInstance &subscription_,
[ ¤384]         const QueueIdStack& queueIds_)
[ ¤385]     : CIMRequestMessage(
[ ¤386]         CIM_NOTIFY_SUBSCRIPTION_NOT_ACTIVE_REQUEST_MESSAGE,
[ ¤387]         messageId_, queueIds_),
[ ¤388]         subscription(subscription_)
[ ¤389]     {
[ ¤390]     }
[ ¤391]
[ ¤392]     virtual CIMResponseMessage* buildResponse() const;
[ ¤393]
[ ¤394]     CIMInstance subscription;
[ ¤395] };
[ ¤396]
[ ¤397] Changes to IndicationService
[ ¤398]
[ ¤399]
1. Indications are sent to HandlerService using SendForget() instead of SendAsync().
[ ¤400] 2. CIMNotifySubscriptionNotActiveRequestMessage is sent to HandlerService when a subscription is deleted or disabled.
[ ¤401]
[ ¤402]
Changes to HandlerService
[ ¤403]
[ ¤404]
HandlerService is changed to accommodate the delivery retry functionality. The following methods and variables are added.
[ ¤405]
[ ¤406]
#ifdef PEGASUS_ENABLE_DMTF_INDICATION_PROFILE_SUPPORT
[ ¤407]     /**
[ ¤408]         This method is called when HandlerService receives the
[ ¤409]         CIMNotifySubscriptionNotActiveRequestMessage. Indications matching the
[ ¤410]         subscription will be discarded from the queue and logged.
[ ¤411]     */
[ ¤412]     void _handleSubscriptionNotActiveRequest(Message *message);
[ ¤413]     /**
[ ¤414]         This method is called to stop dispatcher and delivery worker threads
[ ¤415]         when HandlerService receives the CimServiceStop request.
[ ¤416]     */
[ ¤417]     void _stopDispatcherAndCleanupDestinationQueues();
[ ¤418]     /**
[ ¤419]         Tries to deliver the indication. Returns true if delivery is
[ ¤420]         successful, else false.
[ ¤421]     */
[ ¤422]     Boolean _deliverIndication(IndicationInfo *info);
[ ¤423]     /**
[ ¤424]         This method is called when an indication, in the form of a
[ ¤425]         CIMHandleIndicationRequestMessage, arrives at HandlerService from
[ ¤426]         IndicationService. If a DestinationQueue exists, the indication is
[ ¤427]         converted to IndicationInfo and enqueued onto it. Returns true if
[ ¤428]         the indication is enqueued onto the destination queue successfully,
[ ¤429]         else false.
[ ¤430]    */
[ ¤431]     Boolean _lookupDestinationQueueAndEnqueue(
[ ¤432]         CIMHandleIndicationRequestMessage *message);
[ ¤433]     /**
[ ¤434]         This method is called when indication delivery has failed and
[ ¤435]         the destination queue for this indication was not found previously.
[ ¤436]         This method creates the destinationQueue and enqueues the indication.
[ ¤437]    */
[ ¤438]     void _createDestinationQueueAndEnqueue(
[ ¤439]         CIMHandleIndicationRequestMessage *message);
[ ¤440]
[ ¤441]     String _getHandlerKey(
[ ¤442]         const CIMInstance &handler,
[ ¤443]         const CIMInstance &subscription);
[ ¤444]
[ ¤445]     String _getHandlerKey(
[ ¤446]         const CIMInstance &subscription);
[ ¤447]
[ ¤448]     typedef HashTable<
[ ¤449]                 String,
[ ¤450]                 DestinationQueue*,
[ ¤451]                 EqualFunc<String>,
[ ¤452]                 HashFunc<String> > DestinationQueueTable;
[ ¤453]
[ ¤454]     DestinationQueueTable _destinationQueueTable;
[ ¤455]     ReadWriteSem _destinationQueueTableLock;
[ ¤456]
[ ¤457]     AtomicInt _deliveryThreadsCount;
[ ¤458]     AtomicInt _dispatcherThreadRunning;
[ ¤459]     AtomicInt _dieNow;
[ ¤460]     List<IndicationInfo, Mutex> _deliveryQueue;
[ ¤461]     ThreadPool _threadPool;
[ ¤462]     Thread _dispatcherThread;
[ ¤463]     const Uint32 _maxDeliveryThreads;
[ ¤464]     static ThreadReturnType PEGASUS_THREAD_CDECL
[ ¤465]         _dispatcherRoutine(void *param);
[ ¤466]     static ThreadReturnType PEGASUS_THREAD_CDECL _deliveryRoutine(void *param);
[ ¤467] #endif
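
The relationship between `_lookupDestinationQueueAndEnqueue()` and `_createDestinationQueueAndEnqueue()` can be sketched as below. This is an illustration only: `SketchQueue`, `SketchQueueTable` and their members are hypothetical names, indications are reduced to strings, and a single `std::mutex` stands in for the `ReadWriteSem` that guards the real table.

```cpp
#include <cstddef>
#include <deque>
#include <memory>
#include <mutex>
#include <string>
#include <unordered_map>

// Simplified stand-in for DestinationQueue: it only stores indication text.
struct SketchQueue
{
    std::deque<std::string> pending;
};

class SketchQueueTable
{
public:
    // Mirrors the intent of _lookupDestinationQueueAndEnqueue(): enqueue
    // only if a queue already exists for this handler key.
    bool lookupAndEnqueue(const std::string& key, const std::string& indication)
    {
        std::lock_guard<std::mutex> guard(_lock);
        auto it = _table.find(key);
        if (it == _table.end())
        {
            return false;   // no queue exists for this destination
        }
        it->second->pending.push_back(indication);
        return true;
    }

    // Mirrors the intent of _createDestinationQueueAndEnqueue(): used after
    // a failed delivery when no queue was found earlier.
    void createAndEnqueue(const std::string& key, const std::string& indication)
    {
        std::lock_guard<std::mutex> guard(_lock);
        std::unique_ptr<SketchQueue>& slot = _table[key];
        if (!slot)
        {
            slot.reset(new SketchQueue);
        }
        slot->pending.push_back(indication);
    }

    // Number of indications waiting for the given key (0 if no queue).
    std::size_t pendingCount(const std::string& key)
    {
        std::lock_guard<std::mutex> guard(_lock);
        auto it = _table.find(key);
        return it == _table.end() ? 0 : it->second->pending.size();
    }

private:
    std::unordered_map<std::string, std::unique_ptr<SketchQueue>> _table;
    std::mutex _lock;
};
```

In this sketch a destination without a queue is the normal case; a queue comes into existence only after a delivery to that destination has failed.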
[ ¤468]
[ ¤469]
Changes to Config library
[ ¤470]

[ ¤471] The Config library is modified to handle the new configuration property indicationDeliveryQueueSize.

[ ¤472] New classes added

/**
[ ¤473]     This class is used to store the indication related information in the DestinationQueue
[ ¤474] */
[ ¤475] class PEGASUS_HANDLER_SERVICE_LINKAGE IndicationInfo : public Linkable

[ ¤476] {
[ ¤477] public:
[ ¤478]     IndicationInfo(
[ ¤479]         const CIMInstance &indication,
[ ¤480]         const CIMInstance &subscription,
[ ¤481]         const OperationContext &context,
[ ¤482]         const String &nameSpace,
[ ¤483]         DestinationQueue *queue,
[ ¤484]         Uint16 deliveryAttemptsMade = 0) :
[ ¤485]             _indication(indication),
[ ¤486]             _subscription(subscription),
[ ¤487]             _context(context),
[ ¤488]             _nameSpace(nameSpace),
[ ¤489]             _queue(queue),
[ ¤490]             _deliveryAttemptsMade(deliveryAttemptsMade)
[ ¤491]     {
[ ¤492]     }
[ ¤493]
[ ¤494]     CIMInstance& getIndication()
[ ¤495]     {
[ ¤496]         return _indication;
[ ¤497]     }
[ ¤498]
[ ¤499]     CIMInstance& getSubscription()
[ ¤500]     {
[ ¤501]         return _subscription;
[ ¤502]     }
[ ¤503]
[ ¤504]     String& getNamespace()
[ ¤505]     {
[ ¤506]         return _nameSpace;
[ ¤507]     }
[ ¤508]
[ ¤509]     DestinationQueue *getDestinationQueue()
[ ¤510]     {
[ ¤511]         return _queue;
[ ¤512]     }
[ ¤513]
[ ¤514]     OperationContext& getOperationContext()
[ ¤515]     {
[ ¤516]         return _context;
[ ¤517]     }
[ ¤518]
[ ¤519]     void incDeliveryAttemptsMade()
[ ¤520]     {
[ ¤521]         _deliveryAttemptsMade++;
[ ¤522]     }
[ ¤523]
[ ¤524]     Uint16 getDeliveryAttemptsMade()
[ ¤525]     {
[ ¤526]         return _deliveryAttemptsMade;
[ ¤527]     }
[ ¤528]
[ ¤529] private:
[ ¤530]     CIMInstance _indication;
[ ¤531]     CIMInstance _subscription;
[ ¤532]     OperationContext _context;
[ ¤533]     String _nameSpace;
[ ¤534]     DestinationQueue *_queue;
[ ¤535]     Uint16 _deliveryAttemptsMade;
[ ¤536] };
[ ¤537]
[ ¤538]
/**
[ ¤539]     The DestinationQueue class holds the indications to be delivered to the
[ ¤540]     destination in the form of IndicationInfo.
[ ¤541] */
[ ¤542]
[ ¤543] class PEGASUS_HANDLER_SERVICE_LINKAGE DestinationQueue
[ ¤544] {
[ ¤545] public:
[ ¤546]
[ ¤547]     enum DeliveryStatus
[ ¤548]     {
[ ¤549]         PENDING,
[ ¤550]         FAIL,
[ ¤551]         SUCCESS,
[ ¤552]     };
[ ¤553]
[ ¤554]     DestinationQueue(const CIMInstance &handler, const String &queueName);
[ ¤555]     ~DestinationQueue();
[ ¤556]
[ ¤557]     Boolean isIdle()
[ ¤558]     {
[ ¤559]         AutoMutex mtx(_queueMutex);
[ ¤560]         return _queue.size() == 0 && _lastDeliveryStatus != PENDING;
[ ¤561]     }
[ ¤562]
[ ¤563]     String getQueueName()
[ ¤564]     {
[ ¤565]         return _queueName;
[ ¤566]     }
[ ¤567]  
[ ¤568]     CIMInstance& getHandler()
[ ¤569]     {
[ ¤570]         return _handler;
[ ¤571]     }
[ ¤572]
[ ¤573]     void insertBack(IndicationInfo* message);
[ ¤574]     void updateSuccess(IndicationInfo *message);
[ ¤575]     void updateFailure(IndicationInfo *message);
[ ¤576]     void deleteMatchedIndications(const CIMInstance &subscription);
[ ¤577]
[ ¤578]     IndicationInfo* getNextIndicationForDelivery(
[ ¤579]         const struct timeval *timeNow,
[ ¤580]         Boolean &isIdle);
[ ¤581]
[ ¤582] private:
[ ¤583]     void _insertFront(IndicationInfo* message);
[ ¤584]
[ ¤585]     CIMInstance _handler;
[ ¤586]     List<IndicationInfo,NullLock> _queue;
[ ¤587]     Mutex _queueMutex;
[ ¤588]     struct timeval _lastDeliveryTime;
[ ¤589]     DeliveryStatus _lastDeliveryStatus;
[ ¤590]     String _queueName;
[ ¤591]
[ ¤592]     static Uint32 _maxQueueLength;
[ ¤593]     static Uint32 _maxDeliveryAttempts;
[ ¤594]     static Uint32 _minDeliveryRetryIntervalSeconds;
[ ¤595]     static AtomicInt _initialized;
[ ¤596]     static Mutex _intializeMutex;
[ ¤597] };
[ ¤598]
[ ¤599]

[ ¤600] Update for pegasus/doc/BuildAndReleaseOptions.html

indicationDeliveryQueueSize
[ ¤601]
[ ¤602] Description: Defines the number of indications that failed delivery and are queued for a particular listener destination. If the queue is full, older indications are discarded from the queue to accommodate newer ones.
[ ¤603] Default Value: 50
[ ¤604] Recommended Default Value (Development Build): 50
[ ¤605] Recommended Default Value (Release Build): 50
[ ¤606] Recommend To Be Fixed/Hidden (Development Build): No/No
[ ¤607] Recommend To Be Fixed/Hidden (Release Build): No/No
[ ¤608] Dynamic?: No
[ ¤609] Considerations: If not specified, the default queue length of 50 is used. This option is available only when Pegasus is built with PEGASUS_ENABLE_DMTF_INDICATION_PROFILE_SUPPORT=true. Specifying a value of zero means that no delivery retry is attempted.
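
The overflow behavior described for indicationDeliveryQueueSize can be sketched with a small bounded queue. This is an assumption-laden illustration, not the DestinationQueue code: `BoundedIndicationQueue` is a hypothetical class, indications are reduced to strings, and `maxLength` plays the role of the configured queue size.

```cpp
#include <cstddef>
#include <deque>
#include <string>

// Hypothetical bounded queue; maxLength plays the role of
// indicationDeliveryQueueSize.
class BoundedIndicationQueue
{
public:
    explicit BoundedIndicationQueue(std::size_t maxLength)
        : _maxLength(maxLength)
    {
    }

    // Adds a new indication at the back. When the queue is full, the oldest
    // entry (at the front) is discarded first. Returns false when nothing
    // can be queued at all (maxLength == 0, meaning no delivery retry).
    bool insertBack(const std::string& indication)
    {
        if (_maxLength == 0)
        {
            return false;
        }
        if (_items.size() == _maxLength)
        {
            _items.pop_front();
        }
        _items.push_back(indication);
        return true;
    }

    std::size_t size() const { return _items.size(); }
    const std::string& front() const { return _items.front(); }
    const std::string& back() const { return _items.back(); }

private:
    std::size_t _maxLength;
    std::deque<std::string> _items;
};
```

With a limit of 2, inserting "a", "b" and "c" in that order leaves "b" and "c" queued; with a limit of 0, nothing is ever queued.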
[ ¤610]
[ ¤611] Testcases
[ ¤612]

[ ¤613] Testcases will be added to test the functionality proposed in this PEP. For example:
Create the subscription but do not start the listener. The provider generates 'n' indications. Now start the listener; it should receive the 'n' indications generated by the provider before DeliveryRetryAttempts expires for the first indication. See the RetryAlgorithm above. Tests are also added to cover discarded indications.

[ ¤614] Schedule

Available in 2.10
[ ¤615]

[ ¤616] Discussion
[ ¤617]

  1. [ ¤618] Default value for indicationDeliveryQueueSize.
    [ ¤619]
  2. [ ¤620] Karl will ask the DMTF about the usage of the CIM_IndicationServiceCapabilities and CIM_IndicationServiceSettingData classes: how should these classes be modeled, and can instances of these classes be modified dynamically?
  3. [ ¤621] Karl suggested limiting the maximum value of indicationDeliveryQueueSize.
  4. [ ¤622] Is the specification clear about the meaning of the DeliveryRetryAttempts value? It seems like it should be the number of delivery retry attempts made AFTER an initial failed delivery attempt. Karl volunteered to follow up with the DMTF on this item. The PEP proposes DeliveryRetryAttempts+1 attempts for indications that are enqueued onto the DestinationQueue without a delivery attempt having been made.
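
One reading of the DeliveryRetryAttempts+1 rule in item 4 is sketched below. The helper name `allowedQueueAttempts()` and its parameters are hypothetical, and the interpretation (that an indication which already had its initial attempt gets DeliveryRetryAttempts further attempts from the queue) is an assumption pending the DMTF follow-up.

```cpp
// Hypothetical helper: number of delivery attempts an indication may
// receive from the DestinationQueue under the reading described above.
unsigned int allowedQueueAttempts(
    unsigned int deliveryRetryAttempts,
    bool initialDeliveryAttempted)
{
    // An indication enqueued without an initial attempt gets one extra
    // attempt, so both paths total deliveryRetryAttempts + 1 attempts.
    return initialDeliveryAttempted
        ? deliveryRetryAttempts
        : deliveryRetryAttempts + 1;
}
```

For DeliveryRetryAttempts = 3, this gives 3 attempts from the queue after a failed initial delivery, and 4 when the indication was enqueued directly.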
Comments on version 0.1
[ ¤623]
[ ¤624] (r_kumpf) What about this PEP is specific to CIM-XML? Why would the IndicationService care about the type of the listener destination?
[ ¤625] (venkat_puvvada) There is no reason not to support the other listener destination types. I am not sure how best to map these parameters to the Email and SNMP handlers, so I decided not to include support for them at this stage.
[ ¤626]
[ ¤627] (r_kumpf) Do indications continue to be retried for delivery after the associated CIM_IndicationSubscription and CIM_ListenerDestination instances are deleted?
[ ¤628] (venkat_puvvada) No, indications will be discarded.
[ ¤629]
[ ¤630] (r_kumpf) What is the rationale for persisting indications across cimserver restarts? When the cimserver is stopped, indications will cease to be generated. When it is restarted, the listener may receive stale indications and not receive more current ones for events that occurred while the cimserver was stopped. This could result in an administrator getting paged about a critical problem that was fixed months earlier.
[ ¤631] (venkat_puvvada) The reason for persisting indications is that the client may not want to lose any indications. The client must be intelligent enough to discard out-of-date indications by looking at the timestamp of the delivered indication.
[ ¤632]
[ ¤633] (r_kumpf) The traceFilePath is a poor choice for a directory to persist data needed for CIM Server operation. This directory is generally world writable.
[ ¤634] (venkat_puvvada) Yes, I agree; this needs to be discussed.
[ ¤635]
[ ¤636] (r_kumpf) What are the contents and format of this file? How is compatibility protected on CIM Server upgrade?
[ ¤637] (venkat_puvvada) The file will contain the Handler, subscription and Indication instances (with the content language list added to the indication instance) in XML form. Indications are saved per subscription under each listener destination. The format will remain compatible across CIM Server upgrades.
[ ¤638]
[ ¤639] Comments on version 0.2
[ ¤640]
[ ¤641] (r_kumpf) Doesn't the CIMHandleIndicationRequestMessage already contain all the information that is needed to deliver the indication? The only extra data the IndicationService should need to track is related to the retry algorithm. What am I missing?
[ ¤642] (venkat_puvvada) CIMHandleIndicationRequestMessage does not have the following information.
[ ¤643] subscriptionInstanceNames
[ ¤644] providerName
[ ¤645] pendingRetryCount
[ ¤646] These are required to construct CIMProcessIndicationRequestMessage request again.
[ ¤647]
[ ¤648] (r_kumpf)  How is it determined which exceptions indicate an indication delivery failure? For example, why does CannotCreateSocketException cause a retry but not bad_alloc?
[ ¤649] (venkat_puvvada) Though it is difficult to examine the CannotCreateSocketException, we can retry when socket() fails with errno ENOBUFS or ENOMEM, which means that resources at the TCP/IP layer or memory are exhausted and the operation can be retried later.
[ ¤650]
[ ¤651] (r_kumpf) Is there a strict requirement that a given indication is delivered at most once for a given subscription?
[ ¤652] (Venkat_puvvada) yes
[ ¤653]
[ ¤654] (k_schopmeyer) I think that there are many parts of this algorithm that are not part of that description (ex. what happens when subscription is deleted).
[ ¤655] (venkat_puvvada) Indications will be discarded
[ ¤656]
[ ¤657] (r_kumpf) It is not obvious to me that the desired behavior would be to stop trying to deliver an indication after a subscription is deleted. This kind of information is important to document in the proposal, along with the rationale for the decision.
[ ¤658] (venkat_puvvada) Sure, I will add it in the next version of the PEP.
[ ¤659]

[ ¤660] Comments on version 0.3
[ ¤661]
[ ¤662] (k_schopmeyer) Nit. This is only one component in moving from 'sort of best effort' to reliable delivery. I suggest that this is simply improving the protocol so that deliveries can be accomplished in case of 'temporary' errors in the protocol and not really reliable delivery.
[ ¤663] (venkat_puvvada) ok
[ ¤664]
[ ¤665] (r_kumpf) Why is the HandlerRetryQueue logic in the IndicationService? Retrying delivery seems like it should be the IndicationHandlerService's job. I think it is more of a protocol-level thing than an indication processing thing.
[ ¤666] (venkat_puvvada) Yes, I agree. Actually, we have decided to discard the indications on the RetryQueue when the matching subscription is removed or disabled. If we implement this in IndicationService, we can directly access the ActiveSubscriptionTable to see whether the subscription is active, which has a performance benefit. Keeping this implementation in the HandlerService requires either checking the repository for subscription validity or having IndicationService send a message to HandlerService when a subscription is removed or disabled.
[ ¤667]
[ ¤668] (r_kumpf) What happens when a delivery retry fails? Is the indication put back on the queue? At the beginning or the end? What if the queue is full?
[ ¤669] (venkat_puvvada) If DeliveryRetry fails, the indication is inserted at the front of the queue. When the queue is full, the indication at the front of the queue is removed and the new indication is added at the back of the queue.
[ ¤670]
[ ¤671] (r_kumpf) Is a new exception class the best way for a handler to communicate the delivery status? It might make sense to change the CIMHandler::handleIndication return type from void to a status value. Possible values could be Success, Error, and FatalError, for example. An interesting question here is what is the behavior when the handler throws an exception which is not DeliveryFailedException? Is it assumed that the delivery was successful or permanently failed?
[ ¤672] (venkat_puvvada) This is a good idea. The possible values can be Success, Error, and FatalError.
[ ¤673] Success - Delivery success
[ ¤674] Error - Error, can be retried later
[ ¤675] FatalError - Permanent failure, no retry.
[ ¤676]
[ ¤677] If the handler throws anything other than DeliveryFailedException, that is either a permanent failure or a post-delivery failure; we don't retry in those cases.
[ ¤678]

[ ¤679] Comments on version 0.4
[ ¤680]

[ ¤681] (r_kumpf) It may make sense to make this a class instead of a struct. It could have insertFront() and insertBack() methods which know how to behave when the queue is full. (I.e., insertFront() drops the indication when the queue is full, and insertBack() drops the first entry when the queue is full.)
[ ¤682] (venkat_puvvada) ok
[ ¤683]
[ ¤684] (k_schopmeyer) Should we consider some maximum limit on the number of retry queues? This is just another possible memory protector.
[ ¤685] (venkat_puvvada) I am ok with that, need to discuss this.
[ ¤686]
[ ¤687] (r_kumpf) I presume these test cases will get pretty interesting. Do you have thoughts about how they will work?
[ ¤688] (venkat_puvvada) Create the subscription but do not start the listener. The provider generates 'n' indications. Now start the listener; it should receive the 'n' indications generated by the provider once DeliveryRetryInterval has expired.
[ ¤689]

[ ¤690] Comments on version 0.5
[ ¤691]
[ ¤692]
(r_kumpf) Can you characterize the threading implications here? If each of the retries is done by the RetryThread, that would mean the DestinationQueueTable would potentially be locked for a long time. If the delivery retry fails, the IndicationHandlerService will need to put the indication back into the DestinationQueueTable. Will deadlock occur?
[ ¤693] If a new thread is started for each delivery retry, that would cause a spike of activity on each interval, affecting the delivery of indications to listeners that have not experienced failures.
[ ¤694] (venkat_puvvada) No deadlock will occur. It works in the following way.
[ ¤695] 1. Take the lock on the queue table.
[ ¤696] 2. Iterate through the queue table, get one indication from each queue, and store them in an array.
[ ¤697] 3. Release the lock on the table.
[ ¤698] 4. Send each indication in the array to HandlerService using the SendAsync() method.
[ ¤699] 5. If DeliveryRetry fails, HandlerService puts the indication back onto the queue.
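
The locking pattern in this answer (collect under the lock, deliver outside it) can be sketched as follows. The names `QueueTable`, `Pending`, `collectHeads()` and `retryOnce()` are hypothetical, indications are reduced to strings, and the delivery callback runs synchronously here, whereas the answer above hands the indications over with SendAsync(). Putting a failed indication back at the front follows the answer given in the comments on version 0.3.

```cpp
#include <deque>
#include <map>
#include <mutex>
#include <string>
#include <utility>
#include <vector>

typedef std::map<std::string, std::deque<std::string> > QueueTable;
typedef std::pair<std::string, std::string> Pending;   // (queue key, indication)

// Collect phase: under the table lock, remove the head indication of every
// non-empty queue. The lock is released when the function returns.
std::vector<Pending> collectHeads(QueueTable& table, std::mutex& tableLock)
{
    std::vector<Pending> batch;
    std::lock_guard<std::mutex> guard(tableLock);
    for (auto& entry : table)
    {
        if (!entry.second.empty())
        {
            batch.push_back(Pending(entry.first, entry.second.front()));
            entry.second.pop_front();
        }
    }
    return batch;
}

// Deliver phase: the table lock is not held while delivering. A failed
// indication is put back at the front of its queue, so the lock is taken
// again only briefly and the original order is preserved.
template <typename DeliverFn>
void retryOnce(QueueTable& table, std::mutex& tableLock, DeliverFn deliver)
{
    std::vector<Pending> batch = collectHeads(table, tableLock);
    for (const Pending& p : batch)
    {
        if (!deliver(p.second))
        {
            std::lock_guard<std::mutex> guard(tableLock);
            table[p.first].push_front(p.second);
        }
    }
}
```

Because the table lock is never held during a delivery attempt, a slow or failing destination does not block enqueue operations on other queues.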
[ ¤700]
[ ¤701] (r_kumpf) Note that this behavior is inconsistent with the definition of the CIM_IndicationService.DeliveryRetryInterval property: 'The DeliveryRetryInterval property defines the minimal time interval in seconds for the indication service to wait before delivering an indication to a particular listener destination that previously failed. The implementation may take longer due to QoS or other processing. Note that implementations may preset this setting and not allow this value to be modified.'
[ ¤702] (venkat_puvvada) Yes. If a new queue is added to the queue table while the retry thread is waiting on the semaphore, the new queue may be monitored before the DeliveryRetryInterval has elapsed. The other option is to wait until the next time interval.
[ ¤703]
[ ¤704] (r_kumpf) Is the specification clear about the meaning of the DeliveryRetryAttempts value? It seems like it should be the number of delivery retry attempts made AFTER an initial failed delivery attempt. Karl volunteered to follow up with the DMTF on this item.
[ ¤705]
[ ¤706] Comments on version 0.7
[ ¤707]
[ ¤708] (r_kumpf) Shouldn't the lastRetryTime be tracked per indication rather than per queue?
[ ¤709] (dmitry_mikulin) If lastRetryTime is per queue, how are you going to tell which indications are ready to be re-tried?
[ ¤710] (venkat_puvvada) If we maintain the lastRetryTime for each indication, we cannot deliver the indications in sequence. For example, if there are many indications in the retry queue and we deliver them according to each indication's lastRetryTime, it is possible that newer indications in the queue are delivered before older ones.
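
The ordering argument in this answer can be sketched with a queue that keeps a single, queue-level retry time. `OrderedRetryQueue` and its members are hypothetical, time is a plain count of seconds, and the sketch does not claim to match the actual behavior of `DestinationQueue::getNextIndicationForDelivery()`.

```cpp
#include <deque>
#include <string>

// Hypothetical per-destination queue with one queue-level retry clock.
class OrderedRetryQueue
{
public:
    explicit OrderedRetryQueue(long minRetryIntervalSeconds)
        : _interval(minRetryIntervalSeconds),
          _lastFailureTime(0),
          _failed(false)
    {
    }

    void insertBack(const std::string& indication)
    {
        _items.push_back(indication);
    }

    // Returns the head of the queue if the destination may be tried now,
    // or nullptr otherwise. Only the head is ever eligible, so a newer
    // indication cannot overtake an older one.
    const std::string* nextForDelivery(long now) const
    {
        if (_items.empty())
        {
            return nullptr;
        }
        if (_failed && now - _lastFailureTime < _interval)
        {
            return nullptr;   // still inside the retry interval
        }
        return &_items.front();
    }

    // Call after the head indication was delivered successfully.
    void updateSuccess()
    {
        _items.pop_front();
        _failed = false;
    }

    // Call after a delivery attempt for the head indication failed.
    void updateFailure(long now)
    {
        _lastFailureTime = now;
        _failed = true;
    }

private:
    long _interval;
    long _lastFailureTime;
    bool _failed;
    std::deque<std::string> _items;
};
```

With per-indication timestamps, an indication that has never failed would be eligible immediately and could be delivered ahead of an older one that is still waiting out its interval; the single queue-level clock rules that out.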
[ ¤711]
[ ¤712] (r_kumpf) This step seems like it would unnecessarily delay the delivery of queued indications once the intermittent problem (network error, for example) is resolved.
[ ¤713] (venkat_puvvada) The RETRY_THREAD_WAIT_TIME value is configurable. This also prevents a spike of activity when all clients/listeners suddenly come up, and it solves the problem of consumers that are too slow to receive the indications.
[ ¤714]
[ ¤715] (b_whiteley) I would prefer to see a solution where all handler types are supported. In addition to extending this functionality to the other handler types, I suspect this implementation would be cleaner.
[ ¤716] (venkat_puvvada) Yes, this can be tried in the next stage of implementation.
[ ¤717]
[ ¤718] (b_whiteley) I'm not very familiar with the current Indication Handler Service, so I apologize for the lack of specifics. As I read through this PEP, my gut feeling is that the approach proposed in this PEP will introduce a lot of problems and instability.
[ ¤719] It doesn't seem right to have the Handler Service hand indications to other components that will ultimately hand the indications back to the Handler Service.
[ ¤720] I would prefer a design that incorporates the following:
[ ¤721] * Refactor the HandlerService itself to handle all of the delivery retry logic, rather than having a separate component reinsert indications into the HandlerService.
[ ¤722] * Enhance the Handler interface so that delivery retry is applicable to all types of Handlers, not just CIM-XML.
[ ¤723] * Design it in a way that is consistent with turning Handlers into Handler Providers at a later date, so that new handlers can be added just as instance providers are added today.
[ ¤724]
[ ¤725]

[ ¤726]
[ ¤727] Copyright (c) 2006 Hewlett-Packard Development Company, L.P.; IBM Corp.;
[ ¤728] EMC Corporation; Symantec Corporation; The Open Group.
[ ¤729]
[ ¤730] Permission is hereby granted, free of charge, to any person obtaining a copy
[ ¤731] of this software and associated documentation files (the "Software"), to
[ ¤732] deal in the Software without restriction, including without limitation the
[ ¤733] rights to use, copy, modify, merge, publish, distribute, sublicense, and/or
[ ¤734] sell copies of the Software, and to permit persons to whom the Software is
[ ¤735] furnished to do so, subject to the following conditions:
[ ¤736]
[ ¤737] THE ABOVE COPYRIGHT NOTICE AND THIS PERMISSION NOTICE SHALL BE INCLUDED IN
[ ¤738] ALL COPIES OR SUBSTANTIAL PORTIONS OF THE SOFTWARE. THE SOFTWARE IS PROVIDED
[ ¤739] "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT
[ ¤740] LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR
[ ¤741] PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT
[ ¤742] HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
[ ¤743] ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
[ ¤744] WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.


[ ¤745]  

[ ¤746] Template last modified: March 26th 2006 by Martin Kirk
[ ¤747]
Template version: 1.11