Developing a QoS Component to guarantee Operator Reliability in Distributed Event-Based Systems

Student: Cosmin Bercea
Title: Developing a QoS Component to guarantee Operator Reliability in Distributed Event-Based Systems
Type: bachelor thesis
Advisors: Mutschler, C.; Philippsen, M.
Status: completed on March 31, 2015
Prerequisites:
  • C/C++ Programming Skills
  • Basic knowledge of event-based systems
Topic:

The locating and communication systems department at the Fraunhofer Institute for Integrated Circuits IIS in Nuremberg works on wireless locating systems that enable the continuous, high-precision tracking of transmitters in real time. A distributed event-based system analyzes the generated position data streams and can handle several million events per second. Because the data load of real-time locating systems is often very high, event processing systems face hard requirements on efficiency and scalability. In order to derive (high-level) events of interest, the Programming Systems Group (Informatik 2) and the Fraunhofer IIS developed a distributed event processing system that allows complex detection algorithms to be split into several event detectors that are fully scalable.
This distributed event processing system has been deployed in other applications besides the analysis of position data. These include, for instance, smart grid applications, financial market analysis, and gamification services. However, in addition to the requirements of real-time locating systems, these kinds of applications also demand high reliability. Since many physical processes depend on the output of event processing systems, their correctness and performance characteristics are of critical importance. In order to assure the consistency of the event processing system, the provided event streams need to be indistinguishable from those of a failure-free execution, even when the hosts of some operators fail or event streams are unavailable during a temporary partitioning of the network. If computing nodes are wirelessly connected (e.g. by using smartphones as nodes), the probability of operator failure is even higher (e.g. bad Wi-Fi connection, battery issues, leaving connection areas).
To assure QoS requirements in distributed (and especially mobile) event processing systems, computation resources need to be actively scheduled, and in order to achieve consistency, a recovery algorithm needs to recognize failed hosts and quickly re-compute the failed detectors from event stream logs.
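The recovery idea described above can be illustrated with a minimal C++17 sketch. It is not the component developed in the thesis. It assumes a simple timeout-based heartbeat check to recognize failed hosts and a full replay of a logged input stream to rebuild a detector's state. All names (`Event`, `ThresholdDetector`, `FailureDetector`, `recoverFromLog`) are hypothetical and chosen only for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Hypothetical event record; the real system's event type is not described here.
struct Event {
    std::uint64_t timestamp;  // time at which the event occurred
    int value;                // payload, simplified to a single integer
};

// Hypothetical event detector: counts events whose value exceeds a threshold.
struct ThresholdDetector {
    int threshold = 0;
    std::uint64_t matches = 0;  // the detector's only state
    void process(const Event& e) {
        if (e.value > threshold) ++matches;
    }
};

// Assumption: hosts send periodic heartbeats; a host is suspected to have
// failed once no heartbeat has arrived for longer than `timeout` time units.
class FailureDetector {
public:
    explicit FailureDetector(std::uint64_t timeout) : timeout_(timeout) {
        assert(timeout > 0);
    }

    // Records that `host` was alive at time `now`.
    void heartbeat(const std::string& host, std::uint64_t now) {
        last_seen_[host] = now;
    }

    // Returns all hosts whose last heartbeat is older than the timeout.
    std::vector<std::string> failedHosts(std::uint64_t now) const {
        std::vector<std::string> failed;
        for (const auto& [host, seen] : last_seen_) {
            if (now > seen && now - seen > timeout_) failed.push_back(host);
        }
        return failed;
    }

private:
    std::uint64_t timeout_;
    std::map<std::string, std::uint64_t> last_seen_;
};

// Rebuilds the state of a failed detector on another node by replaying
// the logged input stream from the beginning.
ThresholdDetector recoverFromLog(int threshold, const std::vector<Event>& log) {
    ThresholdDetector detector;
    detector.threshold = threshold;
    for (const Event& e : log) detector.process(e);
    return detector;
}
```

In this sketch, a coordinator would call `failedHosts` periodically and, for each detector that ran on a suspected host, call `recoverFromLog` on a healthy node. Replaying the whole log is the simplest choice; a practical system would bound the replay, for example by logging only the events a detector still depends on.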
The objectives of this thesis are to become familiar with the existing event processing system and to implement a software component that ensures QoS requirements on a system with adaptive computation resources. An evaluation on a mobile event processing cluster consisting of several smartphones as processing nodes illustrates the efficiency of the developed algorithms.

Milestones:

  • Research on existing QoS event processing systems in the distributed computing area.
  • Get comfortable with the existing event processing system at Fraunhofer IIS.
  • Implementation of the needed software components and algorithms.
  • Implementation of a QoS-aware event processing system with adaptive computation resources (running on smartphones).
  • Evaluation of the efficiency and consistency of the system.
  • Elaboration of the results.

Literature:

  • B. Koldehofe, R. Mayer, U. Ramachandran, K. Rothermel, M. Völz: Rollback-Recovery without Checkpoints in Distributed Event Processing Systems. Proceedings of the 7th ACM International Conference on Distributed Event-Based Systems (Arlington, TX, USA). pp. 27-38, 2013.
  • Z. Wang, Y. Zhang, X. Chang, X. Mi, Y. Wang, K. Wang, H. Yang: Pub/Sub on Stream: A Multi-Core Based Message Broker with QoS Support. Proceedings of the 6th ACM International Conference on Distributed Event-Based Systems (Berlin, Germany). pp. 127-138, 2012.
  • T. Repantis, V. Kalogeraki: Replica Placement for High Availability in Distributed Stream Processing Systems. Proceedings of the 2nd ACM International Conference on Distributed Event-Based Systems (Rome, Italy). pp. 181-192, 2008.
  • G. Lakshmanan, Y. Li, R. Strom: Placement of Replicated Tasks for Distributed Stream Processing Systems. Proceedings of the 4th ACM International Conference on Distributed Event-Based Systems (New York, NY, USA). pp. 128-139, 2012.
  • G. Silva, B. Gedik, H. Andrade, K. Wu, R. Iyer: Fault Injection-based Assessment of Partial Fault Tolerance in Stream Processing Applications. Proceedings of the 5th ACM International Conference on Distributed Event-Based Systems (New York, NY, USA). pp. 231-242, 2011.