

**INTEGRATED TIMING ANALYSIS AND VERIFICATION OF COMPONENT-BASED  
DISTRIBUTED REAL-TIME SYSTEMS**

By  
**Pranav Srinivas Kumar**

Dissertation  
Submitted to the Faculty of the  
Graduate School of Vanderbilt University  
in partial fulfillment of the requirements

for the degree of

**DOCTOR OF PHILOSOPHY**  
in  
Computer Science  
**AUGUST, 2016**  
Nashville, Tennessee

Approved:

Date:

---

Dr. Gabor Karsai

---

Dr. Xenofon D. Koutsoukos

---

Dr. Gautam Biswas

---

Dr. Akos Ledeczi

---

Dr. Bharat Bhuva

## TABLE OF CONTENTS

|                                                                 | Page |
|-----------------------------------------------------------------|------|
| LIST OF TABLES . . . . .                                        | v    |
| LIST OF FIGURES . . . . .                                       | vi   |
| Chapter                                                         |      |
| I. Introduction . . . . .                                       | 1    |
| II. Fundamentals . . . . .                                      | 7    |
| III. Related Research . . . . .                                 | 10   |
| 3.1. General Analysis Methodologies . . . . .                   | 10   |
| 3.1.1. Testing and Evaluation . . . . .                         | 10   |
| 3.1.2. Simulation . . . . .                                     | 12   |
| 3.1.3. Formal Analysis Methods . . . . .                        | 15   |
| 3.1.4. Static Code Analysis . . . . .                           | 20   |
| 3.2. Fundamental Timing Analysis Tools . . . . .                | 22   |
| 3.2.1. Cheddar Real-Time Scheduling Framework . . . . .         | 22   |
| 3.2.2. UPPAAL . . . . .                                         | 23   |
| 3.3. System-level Timing Analysis Methodologies . . . . .       | 23   |
| 3.3.1. Petri net-based Timing Analysis for Concurrent Systems   | 23   |
| 3.3.2. Analyzing AADL models with Petri nets . . . . .          | 27   |
| 3.3.3. Analyzing AADL Models with Timed Petri nets . . . . .    | 29   |
| 3.3.4. MAST: Modeling and Analysis Suite . . . . .              | 30   |
| 3.3.5. Verification in AutoFocus 3 . . . . .                    | 32   |
| IV. Design Model: Distributed Managed Systems (DREMS) . . . . . | 34   |
| 4.1. DREMS Component Model . . . . .                            | 34   |
| 4.2. Component Execution Semantics . . . . .                    | 36   |
| 4.3. Temporal Partition Scheduler . . . . .                     | 38   |
| 4.4. Motivation to use DREMS . . . . .                          | 39   |
| V. Colored Petri net-based Modeling Methodology . . . . .       | 40   |
| 5.1. Problem Statement . . . . .                                | 40   |
| 5.2. Colored Petri Net-based Analysis Model . . . . .           | 42   |
| 5.2.1. Model of Time . . . . .                                  | 45   |
| 5.2.2. Modeling Temporal Partitioning . . . . .                 | 46   |
| 5.2.3. Modeling Component Thread Behavior . . . . .             | 47   |
| 5.2.4. Modeling Component Operations . . . . .                  | 49   |

|            |                                                               |     |
|------------|---------------------------------------------------------------|-----|
| 5.2.5.     | Modeling Component Interactions . . . . .                     | 53  |
| 5.2.6.     | Modeling Timers . . . . .                                     | 54  |
| 5.3.       | Modeling Component Operation Business Logic . . . . .         | 56  |
| 5.3.1.     | Problem Statement . . . . .                                   | 56  |
| 5.3.2.     | Challenges . . . . .                                          | 56  |
| 5.3.3.     | Outline of Solution . . . . .                                 | 57  |
| VI.        | State Space Analysis and Verification . . . . .               | 62  |
| 6.0.1.     | Searching the State Space . . . . .                           | 65  |
| 6.0.2.     | Deadline Violations . . . . .                                 | 66  |
| 6.0.3.     | System-wide Deadlocks . . . . .                               | 67  |
| 6.0.4.     | Response-time Analysis . . . . .                              | 68  |
| 6.0.5.     | Incomplete Designs . . . . .                                  | 69  |
| 6.1.       | Modeling and Analysis Improvements . . . . .                  | 70  |
| 6.1.1.     | Problem Statement . . . . .                                   | 70  |
| 6.1.2.     | Outline of Solution . . . . .                                 | 70  |
| 6.1.3.     | Handling Time . . . . .                                       | 72  |
| 6.1.4.     | Distributed Deployment . . . . .                              | 75  |
| 6.2.       | Investigating Advanced State Space Analysis Methods . . . . . | 78  |
| 6.2.1.     | Problem Statement . . . . .                                   | 78  |
| 6.2.2.     | Outline of Solution . . . . .                                 | 78  |
| 6.2.3.     | Evaluation of Solution . . . . .                              | 79  |
| 6.2.4.     | Contributions . . . . .                                       | 80  |
| VII.       | Experimental Evaluation . . . . .                             | 84  |
| 7.0.1.     | Challenges . . . . .                                          | 84  |
| 7.1.       | Resilient Cyber-Physical Systems (RCPS) Testbed . . . . .     | 86  |
| 7.1.1.     | Architecture . . . . .                                        | 86  |
| 7.2.       | ROSMOD Software Infrastructure . . . . .                      | 87  |
| 7.2.1.     | Deployment Infrastructure . . . . .                           | 94  |
| 7.3.       | Evaluation of Timing Analysis Results . . . . .               | 97  |
| 7.3.1.     | Understanding the CPN Analysis Plots . . . . .                | 98  |
| 7.3.2.     | Client-Server Interactions . . . . .                          | 100 |
| 7.3.3.     | Publish-Subscribe Interactions . . . . .                      | 103 |
| 7.3.4.     | Trajectory Planner . . . . .                                  | 106 |
| 7.3.5.     | Time-triggered Operations . . . . .                           | 109 |
| 7.3.6.     | Long-Running Operations . . . . .                             | 111 |
| 7.4.       | Analysis Limitations . . . . .                                | 116 |
| VIII.      | Summary and Future Work . . . . .                             | 118 |
| Appendices | . . . . .                                                     | 119 |
| A.         | Publications . . . . .                                        | 120 |

|                                   |     |
|-----------------------------------|-----|
| 1.1. Workshop Papers . . . . .    | 120 |
| 1.2. Conference Papers . . . . .  | 120 |
| 1.3. Journal Papers . . . . .     | 121 |
| 1.3.1. Awaiting Reviews . . . . . | 121 |
| REFERENCES . . . . .              | 123 |

## LIST OF TABLES

| Table                                                        | Page |
|--------------------------------------------------------------|------|
| 1. Client Server Example - Summary of Results . . . . .      | 101  |
| 2. Publish Subscribe Example – Summary of Results . . . . .  | 104  |
| 3. Trajectory Planner Example – Summary of Results . . . . . | 108  |
| 4. Periodic Timers – Summary of Results . . . . .            | 111  |

## LIST OF FIGURES

| Figure                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    | Page |
|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|------|
| 1. Embedded Software Development Lifecycle Comparison . . . . .                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                           | 4    |
| 2. Sample Petri Net, reprinted from [90] . . . . .                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        | 24   |
| 3. DREMS Component . . . . .                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              | 35   |
| 4. Component Operation Execution Semantics: This figure shows the DREMS operation scheduling on an incoming operation request. At some time $t_i$ , the component executor thread is busy executing an operation – component operations can be triggered into execution by the (1) expiry of a timer, (2) the arrival of a subscription message, or (3) the arrival of a service request. $t_{req}$ represents the arrival time of a remote request. $t_{wait}$ is the wait time of this request in the message queue while the current operation is still executing. $t_{req\_schld}$ is the time stamp at which the current operation completes executing. At this time, the remote request is finally scheduled for execution. $t_{req\_cmpl}$ is the time stamp at which the remote request completes. The overall time taken by the component to respond to this request is calculated as: $t_{wait} + t_{exec} = t_{req\_cmpl} - t_{req}$ . . . . . | 37   |
| 5. Sample Temporal Partition Schedule with Hyperperiod = 300 ms . . . . .                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                 | 39   |
| 6. Colored Petri Net Analysis Model . . . . .                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                             | 44   |
| 7. Analysis Model - Structural Aspects . . . . .                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          | 45   |
| 8. Temporal Partition Schedule Data Structure . . . . .                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                   | 47   |
| 9. Component Thread Execution Cycle . . . . .                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                             | 48   |
| 10. Component Operation Scheduling Cycle . . . . .                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        | 50   |
| 11. RMI Application . . . . .                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                             | 51   |
| 12. RMI Application - Client Timer Operation . . . . .                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    | 52   |
| 13. RMI Application - Server Operation . . . . .                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          | 53   |
| 14. Operation Induction . . . . .                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                         | 54   |

|     |                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          |    |
|-----|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|----|
| 15. | Operation Induction Token . . . . .                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      | 54 |
| 16. | Timer Operations . . . . .                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                               | 55 |
| 17. | Modeling the Business Logic of Component Operations . . . . .                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            | 58 |
| 18. | Sample Business Logic Model . . . . .                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    | 60 |
| 19. | CPN Business Logic Representation . . . . .                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              | 61 |
| 20. | Bounded State Space for a Multi-component Timer example – The component threads have the same real-time priority and are executed in the same device. Each component is triggered with a 100 Hz periodic timer and all three timers are synchronized to illustrate non-determinism. The round robin scheduling quantum is set to 4 milliseconds. 'Mark' is a state space query function that provides the <i>marking</i> of a place in a particular state space node. . . . .                                                                                                                                                                                            | 64 |
| 21. | SearchNodes function provided by CPNTools . . . . .                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      | 65 |
| 22. | Deadline Violation Observer place . . . . .                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              | 67 |
| 23. | A Clock Token with Temporal Partitioning . . . . .                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                       | 73 |
| 24. | Dynamic Time Progression . . . . .                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                       | 74 |
| 25. | Structural Reductions in CPN . . . . .                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                   | 76 |
| 26. | Generated CPN model for a Distributed Application Deployment . . . . .                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                   | 81 |
| 27. | Sweep-Line Method . . . . .                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              | 82 |
| 28. | Dead States Checking in a Component-based application . . . . .                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          | 83 |
| 29. | Testbed Architecture . . . . .                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                           | 86 |
| 30. | ROSMOD Metamodel. Containment is specified from <i>src</i> to <i>dst</i> where the source has a containment attribute <i>quantity</i> , meaning that <i>quantity</i> objects of type <i>src</i> can be contained in an object of type <i>dst</i> . Pointers are specified as a one to one mapping from source to destination, using the name of the pointer. Sets allow for pointer containment. All objects contain a <i>name</i> attribute of type <i>string</i> , not shown for clarity. Note: the metamodel is used to create the ROSMOD Modeling Language, but users do not see or interact with it; it is used to enforce proper model creation semantics. . . . . | 89 |

|     |                                                                       |     |
|-----|-----------------------------------------------------------------------|-----|
| 31. | Software Deployment Workflow . . . . .                                | 95  |
| 32. | Interpreting Execution Time Plots . . . . .                           | 99  |
| 33. | Experimental Observation: Client-Server Interactions . . . . .        | 100 |
| 34. | CPN Analysis Results: Client-Server Interactions . . . . .            | 101 |
| 35. | CPN Analysis Results: Client-Server Response Times in Bad Designs . . | 102 |
| 36. | Experimental Observation: Publish-Subscribe Interactions . . . . .    | 103 |
| 37. | CPN Analysis Results: Publish-Subscribe Interactions . . . . .        | 104 |
| 38. | CPN Analysis Results: Time-triggered Publisher – Periodicity Issues . | 105 |
| 39. | Trajectory Planner Test . . . . .                                     | 106 |
| 40. | Experimental Observation: Trajectory Planner . . . . .                | 107 |
| 41. | CPN Analysis Results: Trajectory Planner . . . . .                    | 108 |
| 42. | CPN Analysis - Sensor firing too frequently . . . . .                 | 109 |
| 43. | Experimental Observation: Periodic Timers . . . . .                   | 110 |
| 44. | CPN Analysis Results: Periodic Timers . . . . .                       | 111 |
| 45. | Long Running Operations - Timing Diagram . . . . .                    | 113 |
| 46. | Long Running Operation - Software Model . . . . .                     | 114 |
| 47. | Experimental Observation: Composed Component Assembly . . . . .       | 115 |
| 48. | CPN Analysis Results: Composed Component Assembly . . . . .           | 116 |

## CHAPTER I

### INTRODUCTION

The decisive role of optimized and robust software in safety and mission-critical distributed real-time embedded (DRE) systems is becoming increasingly recognized. Embedded software is pertinent in a variety of heterogeneous domains e.g. avionics [17], automotive systems [84], locomotives [114], and industrial control systems [115]. The volume and complexity of such software grows everyday depending on an assortment of factors, including challenging system requirements e.g. resilience to hardware and software faults, remote deployment and repair. Deployment, the procedure for launching or reconfiguring software processes on embedded hardware, becomes extremely difficult if obtaining access to such devices is limited. Large scale deployment of embedded software, for this reason, has become considerably more arduous – periodic peer reviews, numerous verification and certification methods are applied to maintain industry standards for safety, precision and reliability of embedded real-time software. Even still, software errors manifest in deployed systems; errors that can be extremely difficult to reproduce in a laboratory test environment.

There exists a long list of real-world scenarios where errors in embedded software implementations has cost millions of dollars and human life. Between 1999 and 2010, at least 2,200 Toyota vehicles sold in the United States experienced unintended cases of rapid acceleration, causing nearly 900 accidents and over 100 deaths [23]. In 2010, Toyota recalled some 10 million vehicles, an extraordinary number given that the company sold only about seven million vehicles during that period. Toyota engineers described the problem as a disconnect in the vehicle’s complex anti-lock brake system (ABS) that causes less than a one-second lag in its operation. With this delay, a vehicle going 60 mph will have traveled nearly another 90 feet before the brakes begin to take hold. Brakes in Toyota hybrids such as the Prius operate differently from brakes in most cars. In addition to the standard brakes,

which use friction from pads pressed against rotors, the embedded software driving the electric motors help slow the vehicle. This process also generates electricity to recharge the batteries. This is a prime example of how timing errors in consumer-focused embedded software, spanning millions of lines of code, can have disastrous effects to everyday life. The Prius is Toyota's third best-selling model in the United States. The automaker recalled 2.3 million vehicles on January, 2011 because of problems with sticking gas pedals and later halted the sale of the eight models involved in the recall. Toyota's U.S. sales plunged 16 percent in January as a result, even as sales of other automakers rose.

To mitigate such software complexity, model-driven component-based software engineering (CBSE) and development [8, 13, 21, 41, 102] has become an accepted practice. CBSE tackles escalated demands with respect to requirements engineering, high-level design, error detection, tool integration, verification and maintenance. The widespread use of component technologies in the market has made CBSE a focused field of research in the academic sectors. Applications are built by assembling together small, tested component building blocks that implement a set of services. These building blocks are typically built from UML [11] class diagrams, or imported from other projects/vendors and *connected* together via exposed interfaces, providing a "black box" approach to software construction. This approach also treats software verification in a more modular fashion; the various software components can be verified individually and then composed together to derive a functional system.

Remote embedded devices e.g. fractionated spacecraft<sup>1</sup> following mission timetables and hosting distributed software applications expose several concerns including strict timing requirements, complexity in deployment, repair and integration; and resilience to faults, including mechanical failures like surface fractures, electrical failures such as single-event upsets, and manufacturing defects, and lastly software failures such as design defects and

---

<sup>1</sup>A fractionated spacecraft is a satellite architecture where the functional capabilities of a conventional monolithic spacecraft are distributed across multiple modules which interact through wireless network links.

run-time faults. High-security and time-critical software applications hosted on such platforms run concurrently with all of the system-level mission management and fault recovery tasks that are periodically undertaken on the distributed nodes. Once deployed, it is often difficult to obtain a reliable period of low-level access to such remote systems for runtime debugging and evaluation. These types of DRE systems, therefore demand comprehensive design-time modeling and analysis methods to detect possible anomalies in system behavior, like the unacceptable response times in the advanced braking systems in vehicles.

With the DARPA System F6 Project, our team has designed and prototyped a full information architecture called *Distributed REal-time Managed Systems* (DREMS) [27, 31] that addresses requirements for rapid component-based application development and deployment for fractionated spacecraft. The stack of developed software includes a design-time model-driven development tool suite [28], and a component model [86] with precise execution semantics enabling robust and analyzable software designs. The minutiae of the DREMS architecture are described in Chapter IV. The formal modeling and analysis methodology presented in this dissertation focuses on applications that rely on this foundational architecture.

The principle behind design-time analysis here is to map the structural and behavioral specifications of the system under analysis into a formal domain for which analysis tools exist. The key is to use an appropriate model-based abstraction such that the mapping from one domain to another remains valid under successive refinements in system development such as code generation. The analysis must ensure that as long as the assumptions made about the system hold, the behavior of the system lies within the safe regions of operation. The results of this analysis will enable system refinement and re-design if required, before actual code development.

Figure 1a shows a *spiral model* [15] of a typical industrial software/system development life cycle (SDLC). The five stages in this cycle include requirements analysis, software design, implementation, integration testing, and design evolution. Although the intricacies



**Figure 1: Embedded Software Development Lifecycle Comparison**

of each stage is hidden, the large majority of industrial software development follows this lifecycle. Embedded software development, especially for safety critical systems, does not lend itself well to this life cycle, mainly because the deliverable in such projects is usually not just a software package or a hardware platform but an amalgamation of both. Software development in fields like robotics, is tightly coupled with the hardware; assessment of software performance is sometimes dependent on and blocked by the hardware availability. Such blocking delays lead to inefficiencies in software evaluation and longer development times. It is also possible that design oversights could lead to poor timing performance e.g. long response times to critical events, that could damage the hardware in the process. Thus, the analysis presented in this work, supports and argues for a verification-driven workflow, as shown in Figure 1b. The software evaluation is performed at design-time and as often as possible until the assembly is refined and optimized. Application developers use domain-specific modeling languages to structure large-scale component assemblies and modular code generation features to speed up software development efforts. Moreover, domain-specific properties such as interaction patterns, component execution code, and associated

temporal properties such as worst-case execution times, deadlines etc. can be easily injected into such models. Using such application parameters in the *design* model, a Colored Petri net-based (CPN) [51] *formal analysis model* is generated. The system behavior is both simulated and analyzed using a CPN execution engine, CPN Tools [94], and useful properties of the system are verified. By generating a bounded *state space* of the system, the execution traces exhibited by the system are searched for property violations. Such system properties include the lack of deadlocks, deadline violations and worst-case trigger-to-response times. The goal of this analysis is to ensure that a component-based system: an assembly of tested component building blocks, meets the temporal specifications and requirements of the system.

The results of this analysis will help improve the structure of the application, enabling safe deployment of dependable components that are known to operate within system specifications. Using CBSE also enables this restructuring process as the components are not tightly coupled software entities. So, when designing the integrated system, the analysis can be performed by assigning *time budgets* to the discrete tasks in the execution. This enables timing analysis before implementation and also uses the time budgets as requirements for efficient code implementation. These budgets are often derived from some high-level requirements and appropriately distributed between the different components in the system. The analyzed system may not necessarily be complete, but instead be in a process of evolution. As the design progresses, the system requirements become concrete and the design is re-verified at each stage to ensure the consistency of all timing guarantees.

The remainder of this dissertation is organized as follows. Chapter II describes some fundamental concepts about distributed real-time systems, component-based software and some challenges in timing analysis. Chapter III briefly describes general software testing and analysis methodologies, and summarizing related research in timing analysis and verification for distributed real-time embedded applications. Chapter IV introduces the DREMS infrastructure and the Component Model used to experiment with and validate the timing

analysis results. Chapter **V** discusses the Colored Petri net-based timing analysis model devised for component-based DRE systems. Chapter **VI** describes the scope and efficiency of the analysis methods implemented with this CPN model. Chapter **VII** evaluates this model with published results on analysis design, scalability and experimental validation. Finally, Chapter **VIII** concludes the dissertation, providing a summary of the detailed work and describing potential future work.

## CHAPTER II

### FUNDAMENTALS

A real-time system [71] is one where the correctness of the system behavior is dependent not only on the logical results of the computation but also on the physical time when these results are produced. Here, the system *behavior* refers to the sequence of outputs over time of the system. The flow of time is modeled as a directed line that extends from the past into the future. A slice of time in this line is called an *instant*. Any ideal occurrence at a time instant is called an *event*. An interval on this time line is called the *duration*, defined by two events, the start event and the end or terminating event. This timeline is digital when the time line is partitioned into a sequence of equally spaced durations, called clock *ticks*. A real-time system typically changes as a function of physical time i.e. a non-spatial continuum in which events occur in apparently irreversible succession from the past through the present to the future. If the real-time system is *distributed*, then it consists of a set of computers, *nodes*, interconnected by a real-time communication network.

Real-time systems are subject to strict operational deadlines. These deadlines constrain the amount of time permitted to elapse between a stimulus provided to the system and a response generated by the system. Consequently, the correctness of such systems depends heavily on its temporal and functional behavior. Real-time programs that are logically correct i.e. implement the intended functions, may not operate correctly if the required timing properties are not met. Typically, such systems are classified as either soft or hard real-time systems. In soft real-time systems, missing deadlines does not completely degrade the overall system performance e.g. delays in opening a web browser does not render the web browser useless; the browser, as an entity is still functional. Hard real-time systems, however, are systems where missed deadlines could be critical e.g. delays in pacemaker timing cycles leads to irregular heart beats with potentially fatal consequences. It is important

that any error within the system e.g. data loss or corruption, node failure etc. be detected within a short time with a very high probability. The required error-detection latency must often be in the same order of magnitude as the sampling period of the fastest critical control loop. Then, it is possible to perform corrective action, or bring the system to a safe state. This makes the design of hard real-time systems different from soft real-time systems. The demanding response time requirements, often in milliseconds or less, preclude human intervention during normal operation. A hard real-time system must be highly autonomous to maintain safe operation. In contrast, the response time requirements of soft real-time systems is often in the order of seconds.

Timing and schedulability analysis in real-time systems usually assumes an ideally functioning software program where every step of computation performs as expected and characterizes these steps with timing properties such as worst-case execution times (WCET) [110] or response times [56]. Once a *timing model* of the system is realized, the behavior can be analyzed by using either a discrete event simulator, prototypical testing or formal analysis methods. This thesis concentrates on a formal analysis approach to analyzing the temporal behavior of a class of distributed real-time embedded systems.

*Verification* establishes a consistency between the formal system specification and the system requirements, and also between the specification and the implementation, while *validation* is concerned with the consistency between the model of the user's intentions and the system requirements. The missing link between verification and validation is the relationship between the user's intentions i.e. informal specification and the formal system specification. Discrepancies between these notations are called *system specification errors*. Verification is usually reduced to a mathematical analysis process, while validation must examine the system's behavior in the real-world. If properties of a system have been formally verified, it still has not been established whether the existing formal specification captures all the aspects of the intended behavior in the user's real-world environment. To be free of specification errors, validation and specification testing are required for quality

assurance. The primary verification method is *formal analysis* and the primary validation method is *testing*. The following chapter reviews various general analysis methodologies, including software testing and formal verifications methods, and also selective system-level analysis methodologies for concurrent real-time systems that are related to this dissertation.

## **CHAPTER III**

### **RELATED RESEARCH**

#### **3.1 General Analysis Methodologies**

There are several methods and techniques to analyze the design of a DRE system, studying various structural and behavioral properties for correctness. Methods include prototype testing, system simulation, and formal verification.

##### **3.1.1 Testing and Evaluation**

Testing methodologies are an important part of a software development life cycle. The goal of software testing a large suites of applications is to find and fix errors prior to delivery to the end user. However, testing real-time systems is not trivial since an execution that is deemed "correct" is dependent on both logical correctness and timeliness. Rigorous testing methods are required to cover various levels of real-time system development, including both software and hardware.

In general, testing can be classified as one of two types: Black box testing and white box testing. In black box testing [59], the tested piece of software is treated like a black box; inputs are fed to this box and the output values are recorded. Black box testing ignores *how* the inputs are transformed into outputs. By providing an exhaustive combination of inputs, the function of the software piece is evaluated. A disadvantage to this method is that unreachable pieces of code are often bypassed. Conversely, white box testing exercises all paths in a piece of software. White box testing is driven by code inspections or static code analysis where software is tested line-by-line e.g. code inspection in flight engine controllers to confirm appropriate control logic for each mode of operation. Using complete knowledge of the tested code, unit tests can also be fully be generated that exercise all control paths

There are various levels of real-time system testing performed in the industry. Unit testing focuses on the smallest units of software such as functional blocks and modules. Unit tests are developed or generated by the software developer and usually oriented for white-box testing. In integration testing, software evaluation is performed when sets of unit-tested software modules are *integrated* into a larger program structure. Integration testing focuses on the interfaces between pretested software components and the testing of the system as a whole. Integration testing follows the black box testing methodology where the tester is unaware or uninterested in how each software module functions. Integration tests often extend to the system level where software is incorporated with other system entities such as hardware devices. Groups of software and hardware pieces are tested for compliance and performance.

Most testing methods for software concentrate on logical correctness and are not specialized in evaluating temporal correctness. This is an essential requirement for real-time systems. Existing testing procedures need to be improved with new methods that concentrate on determining whether a system violates specific timing constraints. Such violations mean that the computation done by a piece of software required too little or too much time to complete.

The goal for real-time systems testing frameworks should therefore be to find testing inputs with the shortest and longest execution times, in order to check for temporal errors. There are two main ways to find test cases - manual and automatic. In manual testing, a tester uses static timing analysis to predict the test inputs which result in the best/worst case execution times for a piece of code. When performing such analysis, the tester takes a piece of code, analyzes the set of possible control flow paths, combined with an abstract model of the hardware architecture to obtain timing bounds [91]. This is a very labor intensive technique and the behavior of the hardware is difficult to predict e.g. in modern processors that use speculative execution [42], execution time of a piece of software becomes non-deterministic [95]. Speculative execution is an optimization technique where a computer

system executes tasks that may not be needed; the main idea is to do work before it is known whether the work is needed or not, especially if the resources are available to do so. If it is realized later that the work was unnecessary, then the results are discarded and any changes made by it are reverted.

Prototype testing involves using a prototype approximation of the real system that is similar in computational power and connectivity. This can be considered as an adequate model of a real deployment. Unlike simulations, prototyped testing provides the analysis with actual hardware on which software can be executed at real-world speeds. The effectiveness of such testing is dependent on the closeness of the chosen computing nodes to the real-world scenario. The software execution behavior realized via prototype testing provides insights about the consistency and validity of system-level specifications e.g. average trigger-to-response times. A disadvantage to prototype testing is that the software and the overall system design must be complete, or at least mostly complete. For large systems, the input space i.e. the possible combinations of inputs with all possible timing of inputs, is too large. For this reason, testing methods can never be exhaustive i.e. enforcing every possible combination of code execution traces. Testing also encounters various challenges in concurrent systems since the results may depend on the interleaving of concurrent activities e.g. threads, which are impossible to control. Here, *results* include worst-case execution times, end-to-end response times etc. Testing is not easily repeatable, especially for large concurrent systems and does not produce concrete, certifiable results.

### 3.1.2 Simulation

System simulation is a commonly used, often automated technique in the industry for testing control systems and algorithms [44, 57]. The part of the runtime system i.e. the implemented code is executed with a simulated environment e.g. the plant dynamics of propulsion system. The TRUETIME simulation toolbox [43], based on MATLAB/Simulink

[102], simulates the temporal behavior of a multi-tasking real-time kernel containing controller tasks. The controller tasks manage processes modeled as Simulink blocks. Different scheduling policies such as Priority First-in First-Out (PFIFO) or Earliest Deadline First (EDF) can be used. The execution time of these tasks are modeled as either constants, time-dependent variables or probability distributions. Accounting for low-level details such as context switching and interrupt handling effects TRUETIME is used for simulating the execution of real-time task sets. This toolbox has also been used to simulate the behavior of communication networks. In general TRUETIME can be used to simulate and analyze the effects of timing nondeterminism on control algorithms and performance. This helps develop compensation schemes that dynamically adjust control based on measured timing variations in the real-time system.

Large projects aimed at developing real-time systems employ a variety of simulation-based tools. These tools can be classified along two dimensions: scope and abstraction level. The abstraction level can be viewed from two perspectives: functional and timing models of abstraction. The scope of the simulation can be independent sub-systems or the full system as a whole. In the context of full system simulation, the abstraction level must be functionally low enough to boot and run commercial operating systems or industrial benchmarks, and temporally low enough to support modern hardware. However, going to such detailed levels of abstraction must not result in an overall simulation performance that inhibits the study of realistic workload scenarios, in terms of execution lengths and size of data. Full system simulation tools like Simics [73] support the design and testing of computer hardware and software from within a simulation framework that attempts to approximate the final application context. Simics simulates processors at the instruction-set level, supporting various models such as x86, x86-64, ARM and PowerPC architectures. Any Simics session can be stopped to a single step where the state can be inspected. The simulation is low-level enough to access memory traffic, set breakpoints and modify the systems e.g. adding new instructions or caches. The performance results presented in [73]

show the system boot simulation for a variety of operating systems e.g. x86-64 Linux boot, simulating nearly 1.3 billion instructions in 285 seconds at 4.5 MIPS. As for scalability, Simics is able to simulate upto 30 processors, with nearly 3.3 billion instructions per processor taking 40 minutes to simulate. These results are certainly impressive given the scope of the simulation and the refinement of the simulated models. In contrast to similar tools, Simics can run actual firmware and completely unmodified kernel and driver code.

Unlike formal analysis methods, simulations do not rely on rigid mathematical reasoning methods. Simulations are also not exhaustive i.e. positive results obtained from a simulation do not certify the system's performance. Simulations like Simics, although useful in exposing erratic behaviors, do no generally provide a definitive answer to system-level verification queries. This means that a lack of deadlocks in a simulation session does not mean that the system will never reach a state of deadlock. Although random variations to the expected behavior can be enforced on a simulation, such variations to state variables will never be exhaustive. vUnlike simulations, the requirements for system *verification* are strict. If a verification query for deadlock-freedom answers a "NO", then the system is guaranteed to be devoid of deadlocks. Such a guarantee cannot be provided by simulations. When simulating a system, all of the parameters governing the simulation are made explicit. This way, from the initial state of the system, the simulation is a discrete event-progressed sequence of steps following a specific trajectory. So, for small systems with a relatively minimal set of variable characteristics, multiple simulation models are typically generated and executed in parallel and the results are interpreted. System models can be refined to great levels of detail while simulating scenarios, covering various low-level details such as scheduling algorithms and communication protocols. For a small model or a set of models, the simulation methods are automated and the results are interpreted visually. The error-detection capabilities rely on the effectiveness of the results evaluation. An advantage to using simulation methods is that the system design need not necessarily be

complete i.e. sub-models of the system can be analyzed individually for correctness with some meaningful assumptions.

### 3.1.3 Formal Analysis Methods

The aim of formal analysis methods is to establish system correctness with mathematical techniques, mainly logical reasoning. Their potential has led to an increasing use by engineers of formal methods for verification of complex software and hardware systems. Formal methods are one of the highly recommended verification techniques for software development of safety critical systems according to e.g., the best practices standard of the IEC (International Electrotechnical Commission) and standards of the ESA (European Space Agency). During the last few decades, research in formal methods has led to the development of many promising verification techniques that facilitate the early detection of errors. Investigations have shown that formal verification procedures would have revealed the exposed defects in e.g. the Ariane-5 missile [69] and the Mars Pathfinder [55, 82].

Model-based verification techniques are based on models describing the possible system behavior in a mathematically precise and unambiguous manner. Even prior to verification, accurate modeling leads to the discovery of incompleteness, ambiguities and inconsistencies in system specifications. System models are typically accompanied by algorithms that systematically explore all states of the system model. This provides the basis for a whole range of verification techniques including exhaustive exploration of the model's behavior (model checking), theorem proving, static code analysis and experiments with restrictive set of scenarios in the model (simulation), in reality (testing).

Both simulation and testing methods, though common, are non-exhaustive techniques since every possible system behavior or reachable state is not analyzed and checked. For exhaustive analysis, formal verification methods have become an applied standard, especially when analyzing hardware architectures and electronic circuits [6, 18]. Formal verification aims to cover all of the potential behaviors of the system, traversing a complete state

space tree for each design. In general, formal methods enable reasoning mathematically about the correctness of a system design. This work is widely used in certifying large-scale critical hardware designs and becoming increasingly common in software. However, it must be noted that formal methods have several challenging problems that limit its use in the industry. State space explosion limit the applicability of certain formal analysis techniques to classes of systems that have highly variable behaviors. State space explosion refers to a scenario where the *state space*, the tree of possible executions from a specific initial/current state, grows exponentially as a function of the number of state variables. The larger the number of system components, and the larger the number of potential internal states, the larger the state space. This means, for complex composed systems, the number of state space *nodes* that must be checked against formally specified requirements can be very large. Many verification methods are affected by state space explosion, leading to long analysis times and hindered practical use. Also, formal methods are usually hard and mathematically intense. Any user of formal analysis methods needs to both understand the methodologies and the internals of the tool. Industrial strength formal methods are also uncommon, protected or too ad hoc leading to a general lack of design and analysis tools that are generic and easily applicable to large domains of systems.

There are two main types of formal and verification analysis methods studied in literature: Model checking and Theorem Proving.

### 3.1.3.1 Model Checking

The main principle behind *Model Checking* is to analyze a system by constructing an appropriate *model* of the system. Here, the model refers to some data structure that accurately represents the structure and behavior of the system. Using this model, a model checker explores all possible system states from a known initial state and *checks* for property violations. Similar to a computer chess program that checks possible moves, a model checker examines all possible system scenarios in a systematic manner. In this way, it can

be shown that a given system model truly satisfies a certain property. Any verification using model-based techniques is only as good as the model of the system. State-of-the-art model checkers can handle state spaces of  $10^8$  to  $10^9$  states with explicit state space enumeration. Using clever algorithms and tailored data structures, larger state spaces ( $10^{120}$  states) can be handled for specific problems. Even subtle errors that remain undiscovered using simulation, and prototype testing can be potentially revealed using model checking. In academia, model checkers are commonly used with new tree temporal logic methods, traversal algorithms, and state space reduction techniques. Academic model checkers used for real-time systems include Spin [47], UPPAAL [66] and Kronos [112].

Model checkers provide a boolean result i.e. true if the analyzed property holds, and false if there property is not true e.g. Will the system reach a deadlock states when this circular dependency is in effect? Model checking requires a precise statement of the properties to be examined. The system model is sometimes automatically generated from a model description that is specified in some programming language like C or some hardware description language like Verilog or VHDL, though quite often such models are hand-crafted from knowledge about the system. The system model address how the system behaves and the property specification prescribes what the system should do, and what it should not do. If a state is encountered that violates the property under consideration, the model checker provides a counterexample that indicates how the model could reach the undesired state. The counterexample describes an execution path that leads from an initial system state to a state that violates the system property. By simulating the violating scenario, useful information can be derived in order to adapt the model or the property accordingly.

Model checking has many strengths. It is a general verification approach that is applicable to a wide range of applications such as embedded systems, software and hardware design. It supports partial verification i.e. properties can be checked individually, thus allowing focus on the essential properties first. It provides diagnostic information in case a property is violated; this is useful for debugging purposes. Model checking does not require

a high degree of expertise or user interaction i.e. model checkers are simply started when required. Model checking is also gaining in popularity in the industry - several hardware companies have started in-house verification labs. Model checkers can be easily integrated into existing development cycles and the learning curve is not steep. Lastly, model checking is mathematically sound and based on graph theory, data structures and logic.

Model checking also suffers from several weaknesses: Model checking is mainly appropriate for control-intensive applications and less suited for data-intensive applications since data typically ranges over infinite domains. The applicability of a model checking method is dependent on decidability - model checking is not effective on infinite-state systems. Model checkers verify a system model and not the actual system itself; any result obtained from model checking is as good as the system model. Complementary methods such as real system testing is required. Model checking suffers from state space explosion i.e the number of states needed to model the system may exceed the amount of available computer memory. Despite the development of several very effective methods to combat this problem, models of realistic systems may still be too large to fit in memory. Model checkers require some expertise in finding appropriate abstractions to obtain system models and to state system properties in logic formalisms. Lastly, model checkers are not guaranteed to yield correct results because the model checking engine may itself contain software defects.

### **3.1.3.2 Theorem Proving**

Formal verification is accomplished by constructing a mathematical model of the computer program, hardware or software, and then using calculations to determine whether the model satisfies desired properties. This is similar to how mathematical models are used to validate the structural design of bridges, or the aerodynamic properties of a rocket. However, the appropriate mathematics for modeling computer systems is formal logic, and the

calculation is accomplished by formal deduction, which has far higher computational complexity than solvers for partial differential equations used to model physical phenomena. Consequently, formal calculation often requires recourse to human guidance or to simplified models.

Formal deduction by human-guided theorem proving and interactive proof checking can, in principle, verify any design, but doing so may require excessive amounts of time, effort and skill. On the other hand, models analyzed with automated methods, such as model checking may be so simplified that their verification does not necessarily guarantee that the property concerned will hold on the actual systems. Mechanization of formal deduction in support of verification must therefore strike a balance between the extent and the form of human guidance required, the heuristic automation provided and the convenience of the modeling supported.

Deductive reasoning techniques like theorem provers formalize the system behavioral requirements into mathematical theorems. By proving mathematical theorems, system-level requirements are verified to hold. So, while a model checker verifies the system using an abstract model, a theorem prover makes inferences about the system using strict logical reasoning based on a logic-based representation of the system e.g. algorithms. Theorem proving tools assist in the construction of such proofs. Such tools are not powerful enough to automate the entire process of theorem proving and so many of the involved steps are invented by the user and the theorem prover fills in the mathematical gaps. Deductive and algorithmic verification tools like the Stanford Temporal Prover, STeP [14], have shown to be useful in real-time system analysis. Real-time systems are expressed as clocked transition systems and the specifications are provided in Linear-time Temporal Logic (LTL) [35]. STeP implements verification rules and diagrams, along with decision procedures that couple with propositional and first-order reasoning to simplify verification conditions and prove them mathematically. Similar theorem provers include HOL [40], and the Prototype Verification System (PVS) [87]. There are also *proof checkers*, which unlike theorem

provers, do not generate proofs but instead check already generated proofs for validity. In general, deductive methods are not fully automated and require human intervention and expertise.

Verification using theorem proving has two advantages over algorithmic methods such as model checking - theorem provers can deal with unbounded or infinite systems and can support highly expressive, yet abstract, specifications of the system and its properties. However, theorem provers currently require guidance to be presented in their terms e.g. an interactive theorem prover such as PVS [87] requires the user to suggest case splits, lemmas, and so on during the course of the proof. A non-interactive prover such as ACL2 [101] receives human guidance through the selection and ordering of the lemmas it is invited to prove. Neither of these forms of guidance seem acceptable to non-specialists. Any system designer who is not an expert with theorem proving would rather provide guidance in terms of the system design (its properties and structure), and not in terms of its proof. This is the primary argument for choosing model checking methods over theorem proving techniques. In model checking, non-specialists are required to provide a simplistic model of the system that the model checker can accept; this is the form of human guidance in model checking. But for an average system integrator, this is more acceptable and often times easier than aiding with tough mathematical proofs.

### 3.1.4 Static Code Analysis

In the 1970s, Stephan Johnson, then at Bell Labs., wrote Lint [54], a tool to examine C source code that had compiled without errors in order to find bugs that had escaped the compilation step. There are many ways to detect and reduce the number of bugs in a program e.g. JUnit [76] in Java is a useful tool for writing tests. Static code analysis [72] is defined as a method of detecting errors and defects located in the program binary. A static code analyzer can thoroughly analyze code and make suggestions where the code should be changed based on rules defined by the user. The range of errors that a static analyzer

can detect is diverse, ranging from coding defects to meeting coding standards like CERT C/C++ [99].

These tools are entirely automated and analyze 100% of the source code without execution, or the use of test cases. Execution paths are analyzed by the tool, and variable ranges and concurrent data access points are known. After analysis, static code analyzers give a detailed list of errors encountered, each with a description and place of discovery. Static code analyzers also issue warnings/errors about concurrency violations, implementation defects, boundary conditions, security weaknesses, logical errors and other general defects.

There are several different techniques used in static code analysis. Control flow analysis can be conducted using tools or done manually at various levels of abstraction. This is done to ensure that code is executed in the right sequence. This helps locate syntactically unreachable code blocks and highlights parts of the code e.g. loops where termination is needed. With data flow analysis, the goal is to show that no execution paths in the software exist that would access uninitialized variables. Tools use the results of control flow analysis in conjunction with read/write access to variables. Thirdly, Information flow analysis identifies how execution of a unit piece of code creates dependencies between the inputs to and output from that code. These dependencies can then be reviewed. This analysis is often appropriate for a critical output that can be traced all the way back to the inputs of the hardware-software interface. Lastly, formal verification, also called compliance analysis, is a process of automatically proving that the program code is correct with respect to the formal specification of its requirements, expressed using a goal-oriented requirements language [37]. All possible program executions are explored, which is not feasible by dynamic testing alone. Verification conditions can enhance compliance analysis. They consist of conditions that should be valid during and at the start and end of a block of code e.g invariant, pre- and post-conditions. Compliance analysis essentially performs a proof of code against a low-level mathematical specification. In this respect, it is by far the most

rigorous of the static analysis techniques. However, this rigor is at the expense of cost - productivity is at around five lines of code per man-day [36].

## 3.2 Fundamental Timing Analysis Tools

### 3.2.1 Cheddar Real-Time Scheduling Framework

Cheddar [103] is an Ada framework based on real-time scheduling theory that provides tools to study temporal behavior of real time applications. These applications are often associated with timing constraints such as response times, deadlines, execution rates etc. Real-time scheduling theory helps system designers to predict the timing behavior of a set of real-time tasks with scheduling simulation and feasibility tests. Scheduling simulation involves calculating the schedule for the task set within an interval and checking timing properties. Feasibility tests allow designers to study real-time tasks without computing scheduling. The development of Cheddar aimed to provide a framework which implemented many of the classical real-time scheduling theories, with feasibility tests for tasks running on single processor and distributed systems with different scheduling policies and task activation patterns.

In Cheddar, the characteristics of real-time applications are specified by a set of processors, buffers, shared resources, messages and tasks. The simplest of task models in Cheddar is the periodic task model [70]: Each task periodically takes up the CPU for a certain run-time during which it performs some computation, aiming to complete execution before its deadline. Scheduling simulation consists of predicting for unit of time, the task to which the processor should be allocated. As the simulation progresses, the engine is capable of checking if any of the tasks in the application have missed their deadlines. Cheddar provides most of usual real-time schedulers such as Earliest Deadline First, Deadline Monotonic, Least Laxity First and POSIX schedulers. Information such as worst/best/average case response-time, blocking time, number of pre-emptions, context switches etc. can

be extracted from the simulation. If the scheduling simulation takes very long to compute for a given task set, then feasibility tests can be used.

### 3.2.2 UPPAAL

UPPAAL [12, 24, 66] was developed for the design, simulation and verification of real-time systems that can be modeled as a network of Timed Automata [5], extended with integer variables and rich user-defined data types. Here, a timed automaton is a finite state machine extended with clock variables. Clock variables evaluate to real numbers and all clocks progress synchronously. Uppaal consists of a suite of tools for verifying safety properties of real-time systems. The UPPAAL model checker is able to check for reachability properties i.e. whether a specific combination of control-nodes and constraints on clocks and data variables are reachable from some initial configuration. UPPAAL is capable of analyzing various types of classical schedulability problems such as Fischer' protocol, and the Train-Gate Controller [12].

## 3.3 System-level Timing Analysis Methodologies

### 3.3.1 Petri net-based Timing Analysis for Concurrent Systems

A Petri net [90] is an abstract, formal paradigm to model concurrent systems. The properties, concepts, and techniques of Petri nets are developed in search for natural, simple and powerful methods for describing and analyzing the flow of information and control in systems, particularly systems that exhibit asynchronous and concurrent activities. The major use of Petri nets has been the modeling of systems of events in which it is possible for some events to occur concurrently but there are constraints on the concurrence, precedence, or frequency of these occurrences.

Figure 2 shows a sample Petri net. The pictorial representation of a Petri net as a graph used in this illustration is common practice in Petri net research. The Petri net graph models

the static properties of a system, much as a flowchart represents the static properties of a computer program.



**Figure 2: Sample Petri Net, reprinted from [90]**

This Petri net contains two types of nodes: circles (called *places*) and bars (called *transitions*). These graph nodes are connected using directed arcs from places to transitions and from transitions to places. If an arc is directed from node x to node y (either from a place to a transition or a transition to a place), then x is an input to y, and y is an input of x. In Figure 2, place  $P_2$  is an input to transition  $t_1$ .

In addition to static properties, a Petri net has dynamic properties that result from its execution. The execution of a Petri net is controlled by the position and movement of

markers (called tokens) in the Petri net. Tokens, indicated by black dots, reside in the circles representing the places of the net. A Petri net with tokens is a marked Petri net.

Tokens are moved by the *firing* of transitions of the net. A transition must be enabled in order to fire; a transition is enabled when all of its input places have a token in them. A transition fires by removing the enabling tokens from their input places and generating new tokens which are deposited in the output places of the transition. In the marked Petri net in Figure 2, the transition  $t_2$  is enabled since it has a token in its input place  $P_1$ . If  $t_2$  fires, the token in  $P_1$  is removed and a token is placed in places  $P_2$ , and  $P_3$ . The distribution of tokens in a marked Petri net defines the state of the net and is called a *marking*. This marking may change as a result of firing transitions.

Formally, a Petri net is a five tuple  $(P, T, A, W, M_0)$ , where  $P$  is a finite set of places,  $T$  is a finite set of transitions,  $A$  is a finite set of arcs between places and transitions,  $W$  is a function assigning weights to arcs and  $M_0$  is some initial marking of the net. Places hold a discrete number of markings called tokens. These tokens often represent resources in the modeled system.

Petri nets enable the modeling and visualization of dynamic system behaviors that include concurrency, synchronization and resource sharing. Theoretical results and applications concerning Petri nets are plentiful [25, 45], especially in the modeling and analysis of discrete event-driven systems. Models of such systems can be either *untimed* or *timed* models. Untimed models are those approximations where the order of the observed events are relevant to the design but the exact time instances when a state transitions is not considered. Timed models, however, study systems where its proper functioning relies on the time intervals between observed events. Petri nets and extensions have been effectively used for modeling both untimed [45] and timed systems [116]. For a detailed study of Petri nets and its applications, the reader is referred to standard textbooks [90, 96] and survey papers [83, 113, 117].

Petri nets have evolved through several generations from low-level Petri nets for control

systems [96] to high-level Petri nets for modeling dynamic systems [53] to hierarchical and object-oriented Petri net structures [26] that support class hierarchies and subnet reuse. Several extensions to Petri nets exist, depending on the system model and the relevant properties being studied e.g. Timed Petri nets [107], Stochastic Petri nets [10, 75] etc. High-level Petri nets are a powerful modeling formalism for concurrent systems and have been widely accepted and integrated into many modeling tool suites for system design, analysis and verification.

### 3.3.1.1 Example High-level Petri net Tool: CPN Tools

CPN Tools [94] is an open source tool for editing, simulating and analysis of Colored Petri nets (CPN) [51]. A Colored Petri Net is a tuple  $(\Sigma, P, T, A, N, C, G, E, IN)$  where:  $\Sigma$  is a finite set of non-empty types called **color sets**. Color sets determine the types of tokens, operations, and functions that can be used in the net inscriptions (arc expressions, guards, etc.). **Places**, **transitions** and **arcs** are defined by three finite sets  $P$ ,  $T$ , and  $A$  respectively.  $N$  is a **node function** that maps each arc into a pair where the first element is the source node and the second a destination node. The two nodes must be of different types i.e. arcs can exist only between a place and a transition. The **color function**  $C$  maps each place  $p$  into a set of possible token colors  $C(p)$  i.e. each token on  $p$  must belong to the type  $C(p)$ . The **guard function**  $G$  maps each transition  $t$  into an expression of type boolean i.e. a predicate. The **arc expression** function  $E$  maps each arc  $a$  into an expression which must be of type  $C(p(a))$  i.e. evaluation of the arc expression must yield a resultant token that can be attached to the corresponding place. Lastly, the **initialization function**  $IN$  maps each place  $p$  into an expression which must be of type  $C(p)$  i.e. this function initializes each place with a token value of the appropriate color set type.

CPN Tools currently supports two types of analysis for CPNs: simulation and state space analysis. Like many simulation software packages, CPN Tools supports single-step

simulation where the Petri net is executed one transition step at a time: on an enabled transition it causes that particular transition to occur, while on a page it will cause a randomly selected, enabled transition on that particular page to occur. CPN Tools also supports executing a user-defined number of steps, where the simulation graphics will be updated after each step. A fast-forward tool will also execute a user-defined number of simulation steps, but without any graphics updates till after the last step.

CPN Tools also contains facilities for generating and analyzing full and partial state spaces for CPN. To facilitate the implementation of the state space facilities, CPN Tools uses syntactical constraints which are important for state space generation but not for simulation e.g. a state space cannot be generated unless all places and transitions have unique names, and all arcs have inscriptions i.e. expressions that evaluate to some token value. Standard state space reports can be generated automatically and saved. Such reports contain statistical information about the net e.g. boundedness properties, home properties, liveness properties and fairness properties. Querying facilities enable searching a generated state space for the presence or absence of interesting system properties.

### 3.3.2 Analyzing AADL models with Petri nets

Teams of researchers have, in the past, identified the need for in-depth timing analysis tools that integrate with complex system design challenges, especially in model-driven architectures [98]. Modeling languages like MARTE [85] and AADL [33] provide a high-level formalism to describe a DRE system, at both the functional and non-functional level. MARTE (Modeling and Analysis of Real-time Embedded Systems) defines the foundations for model-based description of real-time and embedded systems. MARTE supports the annotation of models with information required to perform specific types of analysis such as performance and schedulability analysis.

In general MARTE provides a generic canvas to describe and analyze systems. The user is required to add domain-specific and system-specific properties and artifacts on top

of the generic platform. Compared to MARTE, AADL (Architecture Analysis and Design Language) comes with a stand-alone and complete semantics that is enforced by the standard. In [98], the authors propose a bridge that translates AADL specifications of real-time systems to Petri nets for timing analysis. This formal notation is deemed to be well-suited to describe and analyze concurrent systems and provides a strong foundation for formal analysis [38] methods such as structural analysis and model checking. The high-level goal is to check and verify AADL models for properties like deadlock-freedom and boundedness. The workflow presented here is similar to the proposed work in this thesis in the sense that a system design model along with user-specified properties are translated into a high-level Petri net-based analysis model.

The execution behavior of the software in AADL is represented by AADL components called *Threads*. Interactions are modeled by communication places in the Petri net to trigger associated actions when AADL threads receive new data (new Petri net tokens). The thread execution is represented by an automata that has three parts: (1) thread life cycle that handles dispatching, initialization and completion; (2) thread execution that executes thread-specific code; and (3) error management that handles potential errors. Symmetric nets are high-level Petri nets commonly used for analysis of causal properties in distributed systems, where tokens can carry data. Using this Petri net, the analysis uses model checking to verify (1) lack of deadlocks in the system and (2) correct causality e.g. a message sent by a producer is always received and processed by a consumer.

However, there are some potential improvements to this work. Not only is the generated Petri net structure hard to follow, it is seemingly composed of sub-Petri nets, one for each thread (and its lifecycle) in each process. It is clear that although the transformation is sound, the generated Petri net models are going to be intractably large for complicated scenarios. The state space of the Petri net is dependent on the number of places in the net and the corresponding internal states. The generated net would not scale well for large process sets or distributed scenarios without using state space reduction techniques that rely

on symmetry [104]. Such troubles can be alleviated by using a high-level Petri net such as Colored Petri nets where much more information can be packed in a Petri net token. Complex token data structures reduce the number of places required to describe a system model e.g. a list of C-style struct data structures can abstractly model a set of processors. This reduces the number of places that would be required to represent a full system. Such modeling constructs are essential in component-based systems where the full system is typically a large assembly of tested black box components. Lastly, the modeling constructs used are strictly bound to AADL and cannot be easily modified for systems not modeled using AADL, especially strictly-defined component models with precise execution semantics.

### 3.3.3 Analyzing AADL Models with Timed Petri nets

The authors in [98], have also investigated AADL model analysis using Petri net extensions such as Timed Petri nets [97]. Using the modeling concepts and analysis capabilities of Petri net extensions means that developers can analyze for a larger set of system-level properties such as schedulability dimensioning, and deadlock detection. This work allows for efficient model-driven development and prototyping of real-time systems.

Petri nets have proven to be useful mathematical means to analyze both the structure and behavior of a real-time system. Structural analysis involves analyzing a model structure to obtain knowledge about properties like circular dependencies, and causal flaws. Behavioral analysis is performed by generating and searching a bounded state space of the system, typically deducing safety properties e.g. deadline violations. By using Timed Petri nets, the authors insert time into any property that needs to be verified. By tagging these properties, state space queries reveal the temporal nature of system-level events that enable timing analysis results e.g. worst-case response times.

The approach presented in this work uses a *Timed Petri net pattern* to model the thread life-cycle, derived from the corresponding AADL model. The state of the AADL threads are modeled using places and the life cycle is handled by the transitions. The periodicity of

the thread execution is managed external to the thread pattern, by using timed notations that represent the system clock. Multi-threaded execution is managed by the *Processor* place. The presence of a token in this place indicates an idle processor, enabling potential thread state changes.

Analysis techniques using Petri nets need to record/detects errors in Petri net places. To detect missed deadlines, a deadline-detection subnet is simply added to the TPN pattern. Similar to missed deadlines, missed activations can also be detected. When the thread must be dispatched but misses its activation deadline, a detector transition fires, marking a missed activation. When model checking this system, if there is no token in some *Missed Activation* place any where in the state space of the system, then no thread activations were missed.

Similar to this work, our Colored Petri net-based analysis work uses bounded observer places [4] that observe the system behavior for property violations and prompt completion of operations. However, this work [97] only considers periodic threads in systems that are not preemptive. The non-preemptable thread execution is evident in the need to check for missed activations. Our analysis aims to improve on this work by (1) generating a more scalable and efficient pattern-based analysis model and (2) supporting various types of hierarchical scheduling algorithms, both preemptable and non-preemptable with (3) complex periodic and aperiodic interaction patterns.

### 3.3.4 MAST: Modeling and Analysis Suite

MAST [39] is a modeling and analysis suite for real-time applications. MAST, still in development, aims to provide a set of tools that enable engineers and system integrators to developing real-time applications to check the timing behavior of their system, including schedulability analysis. The techniques implemented by this tool focus on fixed-priority scheduled systems, such as the ones in commercial operating systems. The tools aims

to address the timing analysis results developed for both single processor [58, 70] and distributed real-time systems [89, 105].

A model describing real-time applications should not only represent the structure of the system but also the hard real-time requirements imposed on it. Most of the existing schedulability analysis methods are based on a *linear* timing and interaction model where each task is activated by the arrival of a single event or message and each message is sent by a single task. This linear model does not allow for complex interactions and event sequences, and so in such cases the analysis methods are not applicable. The MAST model of real-time system is a rich representation. It is an event-driven model where complex dependence patterns among tasks are established e.g. tasks may be activated by the arrival of several events at their output, making it suitable for analysis of real-time systems designed with object-oriented methodologies. This MAST model description is derived from a standard UML and MARTE description [80].

For analysis, the MAST suite includes schedulability analysis tools that use newly published research techniques such as the offset-based methods [88] that enhance analysis results, providing less pessimistic estimates than previous results [105]. The system description is specified through an ASCII description language that serves as the input to the analysis tools.

Using UML, the real-time *view* of the DRE is described [100] by adding appropriate behavior specifying classes. The application design is linked with the real-time view to get a full description of the modeled system, along with its timing behavior and requirements. Some of the tools in Figure ?? like the graphical editor and results display interfaces are not available or incomplete.

### 3.3.5 Verification in AutoFocus 3

FOCUS is a general theory providing a model of computation based on the notion of streams and stream processing functions [16]. It is suitable to describe models of distributed, reactive systems. Based on this mathematical foundation, a tool called AutoFocus 3 [48] allows for a graphical description of systems according to this model of computation.

In AutoFocus 3, the system model is described as a set of communicating components. Each component has a defined interface (black box view) and an implementation. The interface consists of a set of communication ports. A port is either an input port or an output port, each identified by its name and its type. Components can exchange data by sending messages through output ports and receiving messages via input ports. Communication paths are called channels. A channel connects an output port to some input port, thus establishing a relationship. From the logical point of view, channels transmit messages instantaneously.

AutoFocus 3 networks are executed synchronously based on a discrete notion of time and a global clock. In this setting, a component can be either strongly causal or weakly causal. A strongly causal component has a reaction delay of at least one logical time tick which means that the current output cannot be influenced by the current input values. A weakly causal component may produce an output which depends on the current input i.e. the reaction is instantaneous. A network of strongly causal components are always well defined e.g. unique fixed-points for recursive equations induced by channel connections always exist. Networks of weakly causal components are also well defined under the constraint that no cycles exist i.e. no weak causal component may send a signal that feeds back to itself in the same time tick.

Input/Output automaton models are used to define stateful component behavior. The automaton consists of a set of control states, a set of data variables and a state transition function. One of the control states is defined to be the initial state of the component, while each data state variable has a defined initial value. The state transition function is

defined as a mapping from the current state, the current input values, and the state variable values to output values. Using this model, AutoFocus 3 supports techniques to verify the logical architecture early in the development process e.g. automatic test case generation and model checking. Methodologies such as in [34] present the application of model checking techniques to verify the logical architecture of AutoFocus 3 models.

The formal verification process with AutoFocus 3 comprises of: (1) selecting system parts to be verified, (2) selecting requirements for the selected parts to be verified, (3) formally specifying the selected requirements, and (4) formally verifying using model checking. If the model checking succeeds, then the verification is finished, but if the model checking fails, then analysis is required to identify the reason e.g. implementation error, requirement formalization error etc. This analysis uses Linear Temporal Logic [35] to convert informal textual specification e.g. "The Adaptive Cruise Control (ACC) starts by driver interaction only" to a formal temporal logic specification using boolean and temporal operators. The system model is exported into the modeling notation of the model checking tool SMV [78, 79] used by the AutoFocus 3 project.

The workflow on the above verification methodology is fairly tenuous. For any system part, a state-transition model of the part has to be constructed; this model represents the logic working of the part. Then, all of the informal textual requirements of the system part need to be translated into LTL specifications by the user and then fed into the integrated model checker. One of the reasons the model checking could fail is improper formalization of the requirements. Secondly, the approach does not seem to model the actual execution of the software i.e. code execution on top of layers of management software running on appropriate hardware. Thus, the method is best utilized for identifying logical errors in the system design or inconsistent property specification. The approach does not necessarily model or analyze the complete timing behavior of the component processes and is applicable mainly to early designs where the component assembly is being prepared for integration.

## CHAPTER IV

### DESIGN MODEL: DISTRIBUTED MANAGED SYSTEMS (DREMS)

#### 4.1 DREMS Component Model

Timing analysis of component-based software, as presented in this dissertation, is facilitated by the DREMS component model ( **D**istributed **R**Ealtime **M**anaged **S**ystem) [31] [86]. DREMS was designed and implemented for the class of distributed real-time embedded systems that are remotely deployed and characterized by strict timing requirements e.g. a cluster of satellites, UAV swarms, disaster relief robots etc. DREMS is a software infrastructure for the design, implementation, deployment and management of component-based distributed real-time embedded systems. The infrastructure includes design-time modeling tools [28] that integrate with a well-defined and fully implemented component model [65, 86] used to build component-based applications. Rapid prototyping and code generation features coupled with a modular runtime platform automate the tedious aspects of the software development and enable robust deployment and operation of mixed-criticality distributed applications. This chapter elaborates on the DREMS component model, the operation execution semantics, and the process scheduling aspects i.e. properties of the DREMS software stack that are relevant for generating a timing analysis model.

Figure 3 presents a typical DREMS-style component. Component-based software engineering relies on the principle of assembly – large and complicated systems can be iteratively constructed by composing small reusable component building blocks. Each *component* contains a set of communication ports and interfaces, a message queue, time-triggered event handling and state variables. Using ports, components communicate with the external world. Using interfaces and message passing schemes, components process requests from other components. This interaction mechanism lies at the heart of component-based software. Each DREMS component supports four basic types of ports for interaction with



**Figure 3: DREMS Component**

other collaborating components: Facets, Receptacles, Publishers and Subscribers. A component's **facet** is a unique interface that it exposes to its clients. This interface can be invoked either synchronously via remote method invocation (RMI) or asynchronously via asynchronous method invocation (AMI) [93, 106]. A component's **receptacle** specifies an interface required by the component in order to function correctly. Using its receptacle, a component can invoke operations on other components using either RMI or AMI. A **publisher** port is a single point of data emission. This port emits data produced by a component operation. A **subscriber** port is a single point of data consumption, feeding received data to the associated component. Communication between publishers and subscribers is contingent on the compatibility of their associated topics. Publishers and Subscribers enable the OMG DDS anonymous publish/subscribe [32] style of messaging. More details on this component model can be found in [86].

## 4.2 Component Execution Semantics

An *operation* is an abstraction for the different tasks undertaken by a component. These tasks are implemented by the component’s source code written by the developer. Application developers provide the functional, *business-logic* code that implements operations on local state variables and inputs received on component ports. For example, a proportional integral differential control operation could receive the current state of dynamic variables from a *sensor* component, and using the relevant gains, calculate a new state to which an *actuator* component should progress the system. In order to service interactions with the underlying framework and with other components, each component has a *message queue*. This queue holds operation requests received from the external environment i.e. messages, service requests or responses, and timer activations. Each request is characterized by a priority and a deadline. Priority refers to the relative importance of one operation over another within the scope of the component. Deadline refers to the worst-case duration of time that an operation can take to complete. If the execution of the operation takes beyond its deadline to complete, then a *deadline violation* is said to have occurred.

DREMS component execution is handled by a single executor thread. This executor thread picks the next available request from the message queue and executes the operation to completion i.e. the operation scheduling is non-preemptive. So, all operations in the queue, regardless of priority, need to wait for the currently executing operation to complete. Allowing only a single executor thread per component and enforcing a single-threaded non-preemptive scheduling scheme on the operations helps avoid synchronization primitives for internal state variables and establishes a more easily analyzable system. It is true that multi-threaded solutions to operation scheduling would avoid starvation i.e. operations in the queue will not have to wait forever if the currently executing operation is blocked on a resource. We address such cases in our experimental evaluation (Section 7.3.6). However, the DREMS execution semantics is still the more predictable design as it is more easily analyzable. The non-determinism in multi-threaded executions causes a tree

of possible behaviors, leading to a common analysis challenge called state space explosion [68]. Keeping the operation execution to a single thread per component bounds the overall number of threads in the system leading to a more tractable analysis.



**Figure 4: Component Operation Execution Semantics:** This figure shows the DREMS operation scheduling on an incoming operation request. At some time  $t_i$ , the component executor thread is busy executing an operation – component operations can be triggered into execution by the (1) expiry of a timer, (2) the arrival of a subscription message, or (3) the arrival of a service request.  $t_{req}$  represents the arrival time of a remote request.  $t_{wait}$  is the wait time of this request in the message queue while the current operation is still executing.  $t_{req\_schld}$  is the time stamp at which the current operation completes executing. At this time, the remote request is finally scheduled for execution.  $t_{req\_cmpl}$  is the time stamp at which the remote request completes. The overall time taken by the component to respond to this request is calculated as:  $t_{wait} + t_{exec} = t_{req\_cmpl} - t_{req}$ .

Figure 4 shows the execution semantics of a component operation executed on the component's executor thread. Simplifying assumptions include that this component is the only component thread executing on the CPU and the CPU is a single core one. This is a simple scenario showing how a single operation on a single component is affected by the operation scheduling semantics. The wait times of the remote request are also worsened by OS scheduling non-determinism – when multiple components are scheduled concurrently, fixed-priority scheduling is enforced. If the executor threads of various components are of different priorities, then the highest priority ready executor thread is always chosen. If

these threads are of equal priority, then round-robin scheduling is carried out. Round-robin scheduling assigns a fixed time slice to each thread in equal portions and in circular order, handling all threads without priority.

### 4.3 Temporal Partition Scheduler

DREMS components are grouped into processes that are assigned to temporal partitions, implemented by the DREMS OS scheduler. This scheduler was implemented by modifying the behavior of the standard Linux scheduler, introducing an ARINC-653 [7] style temporal and spatial partitioning scheme. The primary goal of this scheme is to enable isolation or partitioning of processes so as to prevent one process from adversely affecting any other process in a different partition. This goal extends to any contended resources, including I/O bandwidth, CPU caching and CPU execution time. In fields like aviation, isolation is important because it allows applications at different levels of certification e.g. autopilot, in-flight entertainment etc. to be run in different partitions on the same platform. Beyond aviation, an ARINC 653-style scheduler can be used where ever temporal isolation of "domains" is a top priority, or in security environments with in-distinguishability requirements, since a malicious domain should be unable to extract information through a timing side-channel.

Temporal partitions are periodic fixed intervals of the CPU's time. Threads associated with a partition are scheduled only when the partition is active. This enforces a temporal isolation between processes assigned to different partitions. The repeating partition windows are called *minor frames*. The aggregate of repeating minor frames is called a *major frame*. The duration of a major frame is called the *hyperperiod*, which is typically the lowest common multiple of the partition periods. Each minor frame is characterized by a period and a duration. The period specifies how often this partition becomes active and the duration defines how much of the CPU time is available for scheduling the runnable threads associated with that partition. Figure 5 shows a sample temporal partition schedule.



**Figure 5: Sample Temporal Partition Schedule with Hyperperiod = 300 ms**

By confining applications to partitions i.e. its own memory space and temporal window of possession of computing resources, this scheduling scheme aims to guarantee safety and timeliness in mission-critical systems; safety by isolating processes in different applications and security levels from interacting with each other, and timeliness by providing processes with a guaranteed slice of the CPU time.

#### 4.4 Motivation to use DREMS

DREMS supports a wide variety of interaction patterns: synchronous and asynchronous service-oriented request-response style communication, and non-blocking anonymous publish-subscribe style communication. The wide variety of interactions and communication mechanisms are inspired by other common industrial component models such as CIAO [108] and ACM [29], and the execution semantics are precisely defined and implemented. A qualitative evaluation of its capabilities [86] show that although the model was designed for fractionated spacecraft, DREMS is suitable for a variety of distributed and embedded environments. All of these properties make this component model very generic and a suitable target for the timing analysis work presented in this thesis.

## CHAPTER V

### COLORED PETRI NET-BASED MODELING METHODOLOGY

#### 5.1 Problem Statement

Consider a set of mixed-criticality component-based applications that are distributed and deployed across a cluster of embedded computing nodes. Each component has a set of interfaces that it exposes to other components and to the underlying framework. Once deployed, each component functions by executing operations observed on its component message queue. Each component is associated with a single executor thread that handles these operation requests. The nature of mixed-criticality means that these executor threads are scheduled in conjunction with a known set of highly critical system threads and other low priority *best-effort* threads. All scheduled threads are also subject to a temporally partitioned scheduling scheme.

#### System Assumptions:

1. Knowledge about the component definition i.e. the properties of all ports and timers in each component.
2. Knowledge about the mapping between ports/timers and component operations i.e. how each component functionality is exposed to the environment around it.
3. Knowledge about the sequence of computation *steps* of finite duration that are executed inside each component operation. This is dependent on the operation business-logic code written by the application developer.
4. Knowledge of worst-case estimated time taken by the computational steps. There are some exceptions to this assumption e.g. blocking times on RMI calls cannot be accurately judged as these times are dependent on too many external factors e.g. the

nature of the process scheduling on both the client and the server, the business logic of the server operation that could include RMI calls to other remote servers etc.

5. Knowledge about the component assembly i.e. the connections between instantiated components that forms communication patterns.
6. Knowledge about the mapping between application processes and hosts i.e. the actual embedded computer on which the process will execute.
7. Knowledge about the temporal partitioning schedule enforced by the operating system on each host.
8. Knowledge about the mapping between application processes and temporal partitions on each host.

Using these assumptions, the problem here is to ensure that the temporal behavior of the composed system meets its end-to-end timing requirements e.g. trigger-to-response times between distant sensors and actuators. Providing this guarantee implicitly requires that communicating components in a component assembly meet individual timing deadlines. Following the DREMS component model execution, a blocking I/O operation blocks a component from attending to any other requests till the operation is completed. Such blocking interaction patterns can propagate large delays to other components, especially in a highly connected system. A useful analysis result here would not only be in identifying end-to-end timing violations but also tracing delays within individual components. Tracking timing violations enables the analysis in identifying the causes for the anomalies e.g. nontrivial circular dependencies or scheduling delays. If an abstract model of the business logic of component operations is also encoded in the analysis model, then inefficient coding practices such as wasteful loops can also be marked as probable causes for deadline violations.

Individual components need to be analyzed to identify the *pure* execution times of the various computational steps in the component operations. When a set of tested components

are composed together, each component’s execution is affected by various factors including scheduling delays, network communication delays, blocking delays and other interaction-specific variabilities. Any timing analysis model for component-based software should account for such factors. As described in Section 3.1.2, there are two important challenges to modeling and analyzing DRE systems: scope and abstraction level. The scope of the analysis here should be the full system of composed components. The abstraction level of the analysis must include enough detail to account for the various timing delays mentioned above while also not capturing all aspects of low-level code. A highly detailed and dynamic low-level model is necessary for simulation but not ideal for model checking and verification-based analysis due to issues like state space explosion. Also, highly composable system designs provide recombinant components that can be selected and assembled in various combinations to satisfy user requirements. In such cases, the analysis model must be efficiently capable of tackling changes in component assembly e.g. moving components to separate processes or devices, adding or removing additional instances of the same component etc. This is a challenge when building and non-trivially generating an analysis model from a system design. Thus, efficiency, scalability and extensibility are also modeling requirements for our timing analysis.

## 5.2 Colored Petri Net-based Analysis Model

Petri nets have been introduced in Section 3.3.1. Ordinary Petri nets have no types and are not modular. Tokens are abstract dots that represent the presence or absence of some entity. Tokens in a place could represent the presence of a message, or availability of resource i.e. the state of execution of the place that is holding the token. The number of tokens in a place could therefore be used to represent the quantity of some available resource. Also, ordinary Petri nets are flat structures; no hierarchy can be established to make the model more readable or concise. With Colored Petri nets (CPN) [51], it is possible to use data types and complex data manipulations – each token has attached a

typed data value, called the token *color*. This color represents the data type of the token. Colored tokens can be evaluated and modified by the occurring transitions. With CPN, it is also possible to make hierarchical descriptions i.e. a large model can be obtained by combining a set of submodels with well-defined interfaces between submodels and well-defined semantics of the combined model. Furthermore, each submodel can be reused.

One of the primary reasons for choosing Colored Petri Nets over other high-level Petri Nets such as Timed Petri Nets or other modeling paradigms like Timed Automata is because of the powerful modeling concepts made available by token colors. Each *colored token* can be a heterogeneous data structure such as a *record* that can contain an arbitrary number of fields. This enables modeling within a single *color-set* (C-style *struct*) system properties such as temporal partitioning, component interaction patterns, and even distributed deployment. The token colors can be inspected, modified, and manipulated by the occurring transitions and the arc bindings. Component properties such as thread priority, port connections and real-time requirements can be easily encoded into a single colored token, making the model considerably concise.

The CPN analysis model, as modeled in CPN Tools [94], is shown in Figure 6. *Places*, shown as ovals, in this model contain colored (typed) tokens that represent the state of interest for analysis e.g. *Clocks* place holds tokens of type *clocks* maintaining information regarding the state of the clock values and temporal partition schedule on all computing nodes. *Transitions*, shown as rectangular boxes, are responsible for executing this model, progressing the state of the modeled system and transferring tokens between places. *Arcs*, between transitions and places dictate the token flow and data structure manipulations. All arcs contain *inscriptions*, which are essentially function calls, written in Standard ML [81], that manipulate token structures e.g. arc inscriptions in the arc from the transition *Timer\_Expiry* to the place *Timers*, manipulate all timer tokens by updating the timer expiry offsets. When a transition fires, every input arc to this transition is evaluated first. Using the input tokens and the arc inscriptions on all output arcs, tokens are released to all output

places i.e. all input arcs are associated with token values before the transition fires and output arc tokens are calculated after the transition fires. This process of associating a token value to an arc is called an *arc binding*. So, every transition firing leads to the production of strict arc bindings used to (1) evaluate guard conditions on the transitions and (2) produce output tokens.



**Figure 6: Colored Petri Net Analysis Model**

From the design model of the system, we generate the initial CPN tokens that are injected into places in this analysis model. The modeling concepts in Figure 6 can be divided

and categorized based on system-level concepts being analyzed. Figure 7 shows the organizational structure of this CPN. Below, we describe each of these structural divisions in detail.



Figure 7: Analysis Model - Structural Aspects

### 5.2.1 Model of Time

Appropriate choice for temporal resolution is a necessary first step in order to model and analyze threads running on a processor. The OS scheduler enforces temporal partitioning and uses a priority-based scheme for threads active within a temporal partition. If there is no active OS-level temporal partitioning, the analysis model assumes the presence of a

single infinitely long temporal partition in which all the application processes can execute. If multiple threads have the same priority, a round-robin (RR) scheduling is used. Each thread has a *quantum* which is effectively a duration of time where the thread is allowed to keep hold of the CPU if the thread remains runnable and the scheduler determines that no other thread needs to run on that CPU instead. Thread quanta are generally defined in terms of some number of clock ticks. If it doesn't otherwise cease to be runnable, the scheduler decides whether to preempt the currently running thread every clock tick. In order to observe and analyze this behavior, we have chosen the temporal resolution to be 1 us; a fraction of 1 scheduler clock tick. In Section 6.1.3, we describe the disadvantages of managing time as a fixed-step increasing variable and describe our solutions that significantly improve the generated state space and the efficiency of the analysis.

### 5.2.2 Modeling Temporal Partitioning

The place *Clocks* in Figure 6 holds the state of the node-specific global clocks. The temporal partition schedule modeled by these clocks enforces a constraint: component operations can be scheduled and component threads can be run only when their parent partition is active. Each clock token  $NC$  is modeled as a 3-tuple:

$$NC = \langle Node_{NC}, Value_{NC}, TPS_{Node_{NC}} \rangle \quad (1)$$

where,  $Node_{NC}$  is the name of the computing node,  $Value_{NC}$  is an integer representing the value of the global clock and  $TPS_{Node_{NC}}$  is the temporal partition schedule on  $Node_{NC}$ . Each  $TPS$  is an ordered list of temporal partitions.

$$TP = \langle Name_{TP}, Prd_{TP}, Dur_{TP}, Off_{TP}, Exec_{TP} \rangle \quad (2)$$

Each partition  $TP$  (Eq. 2) is modeled as a record color-set consisting of a name  $Name_{TP}$ , a period  $Prd_{TP}$ , a duration  $Dur_{TP}$ , an offset  $Off_{TP}$  and the state variable  $Exec_{TP}$ . Aggregate of such partitions can fully describe a partition schedule. Complete partition

schedules are maintained per computing node. Figure 8 shows a C++-style struct representation of the temporal partition schedule. Each clock structure is modeled similarly in Standard ML as a color and these colored tokens are placed in the *Clocks* place (Figure 6).

```

struct Partition {
    string name;           /* Name of the Partition */
    unsigned long duration; /* Duration of the Partition */
    unsigned long period;   /* Period of the Schedule */
    unsigned long offset;   /* Offset of the Partition */
};

struct Clock {
    string name;           /* Name of the Embedded Node */
    int value;              /* Current value of the Clock */
    int next_tick;          /* Next clock tick */
    vector<Partition> schedule; /* Temporal Partition Schedule*/
};

```

**Figure 8: Temporal Partition Schedule Data Structure**

### 5.2.3 Modeling Component Thread Behavior

Figure 9 presents a snippet of our CPN, modeling the thread execution cycle. The place *Components* holds tokens that keep track of all the ready threads in each computing node. Each component thread *CT* is a record characterized by:

$$CT = < ID_{CT}, Prio_{CT}, O_{CT} > \quad (3)$$

where  $ID_{CT}$  constitutes the concatenation of strings required to identify a component thread in CPN (i.e. component name, node name and partition). Every thread is characterized by a priority ( $Prio_{CT}$ ) which is used by the OS scheduler to schedule the thread.

If the highest priority (OS schedulable) thread is not already servicing an operation request, the next ready operation from the message queue is dequeued and scheduled for execution (represented by  $O_{CT}$ ). Depending on the component scheduler, this operation



Figure 9: Component Thread Execution Cycle

may be the highest priority, or may have the earliest deadline or may be the oldest request. The scheduled thread token is placed in *Running Threads*.

When a thread token is marked as running, the model checks to see if the thread execution has any effect on itself or on other threads. These state changes are updated using the transition *Execute\_Thread* which also handles time progression. Keeping track of *Value<sub>NC</sub>*, the thread is preempted at each clock tick. This transition loop i.e. *Schedule\_Thread* -> *Execute\_Thread* -> *Preempt/Unblock\_Thread* -> *Schedule\_Thread* ... cycle repeats forever, as long as there are no system-wide deadlocks and some upper limit on the clock values isn't reached.

### 5.2.4 Modeling Component Operations

Every operation request  $O$  made on a component  $C_x$  is modeled as a Standard ML *record* of the 4-tuple:

$$O(C_x) = \langle ID_O, Prio_O, Dl_O, Steps_O \rangle \quad (4)$$

where,  $ID_O$  is a unique concatenation of strings that help identify and locate this operation in the system (consisting of the name of the operation, the component, the computing node, and the temporal partition). Assuming a PFIFO operation scheduling scheme, the operation's priority ( $Prio_O$ ) is used by the analysis engine to enqueue this operation request on the message queue of  $C_x$ . The completion of this enqueue implies that this operation has essentially been *scheduled* for execution. Figure 10 shows the operation scheduling cycle. The message queue tokens are placed in  $CMQ$  (i.e. Component Message Queues). The tokens in this place are a constraint on the *Schedule\_Thread* transition and are used to calculate the subset of threads that are *runnable*. When threads execute, operation requests may be enqueued in component message queues as shown by this cycle. The transition *Enqueue\_Operation* enqueues any operation requests that are sent by the currently executing threads. These requests are enqueued onto the appropriate message queue in the chosen scheduling scheme.

Once enqueued, if this operation does not execute and complete before its fixed deadline ( $Dl_O$ ), its real-time requirements are violated. The completion of each operation request is saved in the place *Completed\_Operations*. This place is an *observer* that is simply notified by the *Execute\_Thread* transition every time the thread execution also completes the executing operation. The tokens in *Completed\_Operations* saves a snapshot of the operation's properties including its operation's enqueue timestamp, dequeue timestamp, completion timestamp and deadline. By analyzing the tokens in this place, deadline violations in an interval of time can be detected.

Once an operation request is dequeued, the execution of the operation is modeled as a



Figure 10: Component Operation Scheduling Cycle

transition system that runs through a sequence of *steps* dictating its behavior. Any of these underlying steps can have a state-changing effect on the thread executing this operation e.g. interactions with I/O devices on the component-level could block the executing thread (for a non-deterministic amount of time) on the OS-level. Therefore, every component operation has a unique list of steps ( $Steps_O$ ) that represent the sequence of local or remote interactions undertaken by the operation. Each of the  $m$  steps in  $Steps_O$  is a 4-tuple:

$$s_i = \langle Port, Unblk_{s_i}, Dur_t, Exec_t \rangle \quad (5)$$

where  $1 \leq i \leq m$ . *Port* is a *record* representing the exact communication port used by the operation during  $s_i$ .  $Unblk_{s_i}$  is a list of component threads that are unblocked when  $s_i$  completes. This list is used, e.g., when the completion of a synchronous remote method

invocation on the server side is expected to unblock the client thread that made the invocation. Finally, temporal behavior of  $s_i$  is captured using the last two integer fields:  $Dur_t$  is the worst-case estimate of the time taken for  $s_i$  to complete and  $Exec_t$  is the relative time of the execution of  $s_i$ , with  $0 \leq Exec_t \leq Dur_t$ .

Consider the simple RMI application show in Figure 11. The application has two components, a client and a server. The client component is associated with a periodic timer that triggers a sequence of interactions between the two components. When the client timer expires, a timer operation is enqueued on the client's operation queue. When scheduled, the client executor thread executes this operation, which makes an RMI call to the server component. Once the query is made, the client thread is effectively blocked till a response is received. The server thread that produces this response may not be scheduled immediately due to the constraints of temporal partition scheduling and other thread scheduling delays. Once the RMI operation is completed on the server, the client thread is unblocked.



Figure 11: RMI Application

In the above example, the duration of time for which the client is blocked, is dependent

on, among several factors, what happens inside the remote method on the server. This remote method could either simply take up CPU time, interact with the underlying framework or interact with other components in the application. To capture such interaction patterns, the *step* color-set is defined in CPN. In this example, two *operation* tokens are required to describe the operations handled by the components: a client side timer operation and a server side RMI operation. A sample client timer operation is shown in Figure 12.

```
1`[{\{node="Beaglebone_111", period=100000, offset=0,
    operation={node="Beaglebone_111", component="Client_Component", operation="TIMER_OP",
        priority=50, deadline=80000, enqueue_time=0,
        steps=[{\{kind="LOCAL", port="LOCAL", unblk=[], exec_time=0, duration=4000},
            {\{kind="CLIENT_RMI", port="client_port", unblk=[], exec_time=0, duration=0},
            {\{kind="LOCAL", port="LOCAL", unblk=[], exec_time=0, duration=4000}]}]}]
```

**Figure 12: RMI Application - Client Timer Operation**

This timer operation runs on the client component with a priority of 50, and a deadline of 80 ms. The business logic of this operation consists of a single RMI call that takes 4 ms to send out the query after which it blocks the executing client thread. After the client thread runs for time  $t = q_t$ , the client thread is moved to a blocked state and an RMI operation is induced on the server side. The client side thread remains blocked until the server thread completes executing the remote method. Once the server thread completes execution, it sends the response of the RMI back to the client. The model takes note of how long the client has been blocked by using the time stamp at which it receives a response. The client thread runs for an additional 4 ms to process this response before it marks completion. The token for the server RMI operation is shown in Figure 13. Note that all time measurements in this token are in micro-seconds i.e. a step duration of 4000 implies 4 ms of activity. The requested RMI operation is run on the server component with a priority of 50 and a deadline of 15 ms. The deadline of this operation cannot be worse than the deadline of the client side operation that initiated the interaction. If this operation delays past 80 ms,

a client side deadline violation is realized as the client thread is blocking for longer than expected.

```
1`{node="Beaglebone_112", component="Server_Component", operation="RMI_OP",
  priority=50, deadline=15000, enqueue_time=0,
  steps=[{kind="LOCAL", port="LOCAL",
  unblk=[{node="Beaglebone_111", component="Client_Component", port="client_port"}]],
  exec_time=0, duration=12000}}}
```

**Figure 13: RMI Application - Server Operation**

### 5.2.5 Modeling Component Interactions

In our earlier RMI example, the client is periodically triggered by a timer to make a remote method call to the server. When the client executes an instance of this timer-triggered operation, a related operation request is enqueued on the server's message queue. In reality, this is handled by the underlying middleware. Since the details of this framework are not modeled, the server-side request is captured as an *induced operation* that manifests as a consequence of the client-side activity. Tokens that represent such design-specific interactions are maintained in the place *Component Interactions* (Figures 6,14) and modeled as shown in equation 6. The interaction  $Int$  observed when a component  $C_x$  queries another component  $C_y$  is modeled as the 3-tuple:

$$Int(C_x, C_y) = \langle Node_{C_x}, Port_{C_x}, O(C_y) \rangle \quad (6)$$

When an operational *step* in component  $C_x$  uses port  $Port_{C_x}$  to invoke an operation on component  $C_y$ , the request  $O_{C_y}$  is enqueued on the message queue of  $C_y$ .

Every *interaction* token contains an interaction port and an operation. The transition *Execute\_Thread* observes the activity on the currently running thread. When executing the model, if a particular step executed by a component thread would, on completion, request the services of another component, a token is placed on the *Waiting to Enqueue* place. So,



**Figure 14: Operation Induction**

once the client thread pushes out an RMI query, an operation needs to be induced on the server queue. So an *interaction* token for this communication is constructed. The model waits for the RMI call on the client side to complete, at which point it places the operation  $i\_op$  on the server message queue. This induction is represented in Figure 15.

```
1`{node="Beaglebone_111", port="client_port",
  operation={node="Beaglebone_112", component="Server_Component", operation="RMI_OP",
  priority=50, deadline=15000, enqueue_time=0,
  steps=[{kind="LOCAL", port="LOCAL",
  unblk=[{node="Beaglebone_111", component="Client_Component", port="client_port"}],
  exec_time=0, duration=12000}]}
```

**Figure 15: Operation Induction Token**

### 5.2.6 Modeling Timers

DREMS components are inactive initially; once deployed, a component executor thread is not eligible to run until there is a related operation request in the component's message queue. To start a sequence of component interactions, periodic or sporadic timers can be used to trigger a component operation. In CPN, each timer  $TMR$  is held in the place  $Timers$  and represented as shown in Eq. 7. Timers are characterized by a period ( $Prd_{TMR}$ ) and an offset ( $Off_{TMR}$ ). Every timer triggers a component using the operation request  $O_{TMR}$ .

$$TMR = \langle Prd_{TMR}, Off_{TMR}, O_{TMR} \rangle \quad (7)$$

When the component's timer expires, a timer callback operation is placed on the component message queue. When the component executor thread is picked by the OS scheduler, this operation is dequeued and the timer callback is executed. In CPN, timer operations are modeled as shown in Figure 16.



Figure 16: Timer Operations

All component timers are expressed as separate tokens and initialized in the *Timers* place. It is important to note that the enqueue operation does not happen until the appropriate partition is active. This is because the component-specific thread responsible for enqueueing (or dequeuing) incoming operations is also affected by temporal partitioning.

## 5.3 Modeling Component Operation Business Logic

### 5.3.1 Problem Statement

Consider a set of component-based applications deployed on distributed hardware. Each application consists of groups of components that interact with each other and also with the external environment e.g. I/O devices, other applications, underlying middleware etc. Each component exposes a set of interfaces through which external entities can request *operations*. As mentioned earlier, an operation is an abstraction for the different tasks undertaken by a component. These operations are exposed through ports and can be requested by other components. When an operation is requested, the request is placed in the component's message queue and eventually serviced. When ready, the business logic of the operation i.e. a local callback is executed. This piece of code represents the brains of the operation. The goal here is to be able to model this business logic, for every component operation, effectively as part of the design model, including temporal estimates such as worst-case execution times for individual code blocks, so that the model can be translated into appropriate data structures in our CPN analysis model.

### 5.3.2 Challenges

The execution of component operations service the various periodic or aperiodic interaction requests coming from either a timer or other connected (possibly distributed) components. Each operation is written by an application developer as a sequence of execution steps. Each step could execute a unique set of activities, e.g. perform a local calculation or a library call, initiate an interaction with another component, process a response from external entities, and it can have data-dependent, possibly looping control flow, etc. The behavior derived by the combination of these steps contribute to the worst-case execution of the component operation. The behavior may include non-deterministic delays due to component interactions while being constrained by the temporally partitioned scheduling scheme and hardware resources. The challenge here is to identify a grammar that would

represent the potentially dynamic behavior realized in a component operation. The modeling aspects emerging from this challenge will have to propagate to any timing analysis model that studies the system. This is true because any non-deterministic delays such as blocking times need to be accounted for when analyzing the temporal behavior.

### 5.3.3 Outline of Solution

The business-logic model of a component operation requires to be completely integrated into our CPN modeling methodology. This means that the model, however complex, needs to be translated into some token data structure in CPN. This is our primary constraint. The CPN analysis model needs to know how an operation is structured i.e. what are the sequential steps in the code, along with WCET on each step. Lastly, since the CPN model does not model or simulate component data management, data-dependent conditional statements in the business-logic model were avoided or abstracted away. Following these rules, we designed a language for describing the component operation business logic. Each component operation model is then attached to a component port or timer in the main design model and enriches the model with refined details about the workings of the operation. In summary, this model is capable of representing several types of code blocks including local function calls, remote procedure calls, outgoing port-to-port interactions, incoming port-response processing, and bounded loops.

The execution of component operations service the various periodic or aperiodic interaction requests coming from either a timer or other connected (possibly distributed) components. Each operation is written by an application developer as a sequence of execution *steps*. Each step could execute a unique set of activities, e.g. perform a local calculation or a library call, initiate an interaction with another component, process a response from external entities, and it can have data-dependent, possibly looping control flow, etc. The behavior derived by the combination of these steps contribute to the worst-case execution of the component operation. The behavior may include non-deterministic delays due to

component interactions while being constrained by the temporally partitioned scheduling scheme and hardware resources. This section briefly describes the various aspects of this behavior specification that are general enough to be applicable to a range of component-based systems.

```
(* Business Logic syntax in Extended Backus-Naur Form *)
business_logic      =  'do', ws, operation_name, ws
                      '{', {functional_step}, '}';
operation_name       =  ID ;
functional_step      =  {sequential_code_block | rmi_call | ami_call | publish| loop} ;
sequential_code_block = 'LOCAL' INT, ',';
rmi call             =  'RMI', ws, receptacle_port, ',', remote_operation, ('T' query_time, processing_time 'T')?;
ami call              =  'AMI', ws, receptacle_port, ',', remote_operation, ('T' query_time, processing_time 'T')?;
publish              =  'PUBLISH', ws, publisher_port, ',', topic, ('T', publish_time, 'T')? ;
loop                 =  'LOOP', ws, 'T', count, 'T', ws,
                      '{', {functional_step}, '}';
receptacle_port      =  ID ;
remote_operation     =  ID ;
publisher_port        =  ID ;
topic                =  ID;
query_time            =  INT ;
processing_time        =  INT ;
publish_time           =  INT ;
count                  =  INT;
ws                     =  ? white space characters ?
identifier             =  alphabetic character, { alphabetic character | digit } ;
Integer                =  digit, {digit} ;
digit                  =  "0" | "1" | "2" | "3" | "4" | "5" | "6" | "7" | "8" | "9" ;
```

**Figure 17: Modeling the Business Logic of Component Operations**

Figure 17 shows the Extended Backus-Naur form representation of the grammar [63] used for modeling the business logic of component operations. The symbol *ID* represents identifiers, a unique grouping of alphanumeric characters, and the symbol *INT* represents positive integers. Each operation is characterized by a unique name, a priority, and a deadline. The priority is an integer used to resolve scheduling conflicts between operations *provided* by the same component when multiple messages from other entities are received. The arbitration is handled by the component-level scheduler. The deadline of the operation is the worst-case time that can elapse after the operation is marked as *ready* and the

completion of the operation. The business logic of every component operation is modeled as a sequence of steps, each with an assigned worst-case execution time. We broadly classify these steps into (1) blocks of sequential code, (2) peer-to-peer synchronous and asynchronous remote calls, (3) anonymous publish/subscribe distribution service calls, (4) blocking and non-blocking I/O interactions and (5) bounded control loops.

Model conforming to this grammar represent the sequence of steps that can be executed in a component operation. These component operations are executed when the component is triggered which can happen in three ways: (1) the expiry of a timer executes a callback, (2) the reception of message on the subscriber port, and (3) the reception of a method request from a remote client on a server port. When one of these events occurs, the corresponding operation is enqueued on the message queue and eventually handled. When the operation is executed, the component can use any and all outgoing ports at its disposal to publish messages or query other remote servers, as shown in Figure 17. Additionally, the operation can also execute local non-blocking function calls that perform any required computations. For each such step, the grammar enables the integration of timing properties. With RMI calls, *query\_time* is an optional annotation that represents the worst-case estimated duration of time taken by a client port to send out a request to the serving component. This time can be used to include network buffering delays or any other pre-processing steps enforced by the infrastructure before the request actually leaves the client component. Similarly, *processing\_time* represents the duration of time after the client component receives the response from the server when any post-processing steps are executed by the infrastructure. These delays are optional and can be encapsulated within *LOCAL* blocks on either side of the call. If these expected delays are set to zero, the analysis will execute these interactions in a single synchronous step taking no time. However, in reality these steps still take a non-zero amount of time to execute. Therefore, if such metrics are not known then these values can be set to zero and an overall worst-case execution time can be set per

operation. This is the maximum amount of time that can elapse after the component operation has begun to execute. This time will include all component interactions and network delays that affect the operation's execution.



**Figure 18: Sample Business Logic Model**

Figure 18 shows a sample business logic model conforming to this grammar. The Data Acquisition Module is a periodically triggered I/O component i.e. this component receives a stream of sensor information from various sensor devices e.g. inertial measurement units (IMU) and GPS modules. This component packages this information and publishes sensor state as a message to all subscribers e.g. controller components. Figure 18 shows the translation from the conceptual understanding of the workings of this component operation

to the abstract business logic model that is then translated into CPN tokens (Figure 19), as described in Section 5.2.4.

```
1`[{node="Beaglebone_111", period=500000, offset=0,
    operation={node="Beaglebone_111", component="Data_Acquisition_Module", operation="sensor_read_timer",
               priority=50, deadline=20000, enqueue_time=0,
               steps=[{kind="LOCAL", port="LOCAL", unblk=[], exec_time=0, duration=15000},
                      {kind="PUBLISH", port="sensor_state_publisher", unblk=[], exec_time=0, duration=0},
                      {kind="LOCAL", port="LOCAL", unblk=[], exec_time=0, duration=2000}]}]}
```

**Figure 19: CPN Business Logic Representation**

Everytime the above timer expires, this *sensor\_read\_timer* operation is enqueued onto the component message queue (Figure 10). When the *Data\_Acquisition\_Module* component is scheduled, this operation is dequeued from the message queue and is marked for execution. The *Execute\_Thread* transition fires, this component thread executes i.e. each operation step in the component operation proceeds to sequentially consume CPU. So, first the LOCAL code block executes for 15 milliseconds. Every 4 milliseconds (the default clock tick), the scheduler checks to see if this thread needs to be preempted in favor of any other higher priority ready thread. If no other higher priority threads are ready to execute, this thread continues to consume CPU and the *exec\_time* attribute of this step is incremented. When *exec\_time = duration*, this step is removed from the operation and the next step begins execution. Here, the component uses its publisher port to send out a message on the *sensor\_state* topic. Since no other delays are detailed for this step, the analysis model executes this step in zero time. When this step completes, the *Execute\_Thread* transition enqueues an operation on all subscriber components' message queues following the interaction rules described in Section 5.2.5. Lastly, a LOCAL code block consumes 2 milliseconds of CPU and the operation is marked as complete. The clock value on this node when the operation completes is saved as a *completion\_time* of this operation.

## CHAPTER VI

### STATE SPACE ANALYSIS AND VERIFICATION

The state of a dynamic system refers to a minimal set of variables, called state variables, that fully describe the system and its response to any given set of inputs. This minimum set of variables,  $s_i(t), i = 0, 1, 2, \dots, n$  along with knowledge of those variables at an initial time  $t_0$  and the system inputs for time  $t > t_0$ , are sufficient to predict the future system state and outputs for all time  $t > t_0$ . This asserts that the dynamic behavior of a state is completely characterized by the set of state variables  $s_i(t)$ .

CPN Tools uses a built-in *state space* analysis tool to generate a bounded state space from an initialized CPN model. Here, the state space is a directed graph structure where the vertices, called states, each represent a unique system state. An edge between two states represents the transition from one state to another. If a state  $S$  can non-deterministically transition into  $K$  possible future states, then  $S$  becomes the root of a K-ary tree. The state of the system in our case is a record of all the places in the CPN i.e. the token values in every place of the CPN model. This record is therefore a snapshot of the token configuration of the net and represents its execution state. State space generation is a process of generating this directed graph, from some initial state. State spaces of dynamic systems can be potentially infinite if there is always a potential unique state transition. For pragmatic reasons, we generate a bounded state space i.e. a graph structure bounded by some rule e.g.  $t_i < t_{bound}$ , where  $t_i$  is some global time variable. In this case, the state space will contain only nodes where the state variable  $t_i$  is less than some upper bound  $t_{bound}$ . Alternately, the rule can be to generate a state space as long as the component message queue size is under 50 waiting requests.

To illustrate the state space analysis of DREMS using our CPN, we consider a simple example – consider three equal-priority components, grouped into a process and executed

on a single device. Each component has a periodic timer that fires every 10 ms and triggers the respective components into executing a block of code. Each component maps to an executor thread and these three threads are scheduled concurrently with all other threads in the system. In this example, there are no other component threads or system-level threads considered. Based on the OS scheduling scheme, these components, with equal priority, are scheduled using round-robin conflict resolution i.e. one of these threads is chosen at random and then a cyclic scheduling order is maintained. Since all three timers have the same period, the three timers fire concurrently and the respective component threads are marked as 'ready' at the exact same time. Since all three component threads are ready to execute the same times, there are  $3!$  possible thread execution orders in the worst case when following round-robin scheduling.

Figure 20 shows a bounded state space generated in CPN Tools for this component assembly. There are 6 branches from the initial state of the system as the model realizes the 6 possible behaviors. Each node in this state space is annotated with a state space ID and also a pair of integers in the format " $p:c$ ", where  $p$  refers to the number of parent nodes and  $c$  refers to the number of child nodes. This figure also shows results of a *state space query*. Specifically, the query finds the *marking* on the *Completed\_Operations* place in two different state space nodes, 35 and 37. In node 37, *Timer\_3\_operation* is the first operation to complete as *Component\_3* is the first ready thread chosen by the OS. In node 35, *Timer\_1\_operation* is the first operation to complete as *Component\_1* is the first ready thread chosen by the OS. This illustrates the tree of possible behaviors that is encoded in the state space starting from an initial state. The goal of state space analysis is to search this tree of possibilities to identify a single execution trace i.e. a single branch in this tree that either satisfies or negates a system property e.g. that the deadline of one of these timer operations is violated.



**Figure 20: Bounded State Space for a Multi-component Timer example – The component threads have the same real-time priority and are executed in the same device. Each component is triggered with a 100 Hz periodic timer and all three timers are synchronized to illustrate non-determinism. The round robin scheduling quantum is set to 4 milliseconds. 'Mark' is a state space query function that provides the *marking* of a place in a particular state space node.**

### 6.0.1 Searching the State Space

CPNTools' inbuilt state space analysis tool comes with a programming interface – a set of function that can be used by a user to query a generated state space. One of the many available functions is the *SearchNodes* function, as shown in Figure 21. This function traverses the nodes of the state space and at each node, evaluates a predicate and accumulates a list of nodes that satisfy this predicate.

```
SearchNodes(Area, Pred, Limit, Eval, Start, Comb)
begin
    Result := Start; Found := 0
    for all n ∈ Area do
        if Pred(n) then
            begin
                Result := Comb(Eval(n),Result)
                Found := Found + 1
                if Found = Limit then
                    stop for-loop
            end
        end
    end
```

Figure 21: SearchNodes function provided by CPNTools

There are six parameters provided to this search function. *Area* refers to the search area i.e. the part of the state space that needs to be searched. Often, the search area is the entire graph but it is possible to provide a subset of the graph e.g. a list of strongly connected components<sup>1</sup>. The second argument, *Pred*, specifies a predicate function that evaluates each node and produces a boolean result. All nodes that evaluate to false are ignored and all nodes that evaluate to true are retained for further analysis. The third argument, *Limit* is an integer referring to how many times a predicate should evaluate to true before the search should terminate. If this limit is infinite, the entire state space is always searched. *Eval* is an evaluation function that is executed on all state space nodes

---

<sup>1</sup>A directed graph  $G$  is strongly connected if every vertex is reachable from every other vertex in the graph. A strongly connected component is a maximal strongly connected subgraph of  $G$  i.e. no additional edges or vertices from  $G$  can be included in the subgraph without breaking its property of being strongly connected

that satisfy the predicate function e.g. an evaluation function to find the execution time of an operation when its predicate function detects a deadline violation. Lastly, *Start* is the initial value of the result and *Comb* is a combination function that accumulates each new result from the evaluation function with prior results. The combination function is a constant time operation as it simply appends a new node to the end of the accumulated results list. The search algorithm itself has a time complexity of  $O(n)$  where  $n$  is the number of nodes in the state space.

The rest of this chapter details how a bounded state space can be used to analyze DREMS applications for deadline violations, response times predictions etc.

### 6.0.2 Deadline Violations

A *deadline violation* refers to a system state where the execution time of a component operation has exceeded its deadline. The operation may violate its deadline without beginning execution since wait times in the message queue count towards the total delay from the arrival of the message to the completion of the corresponding operation. The SearchNodes function in CPNTools is quite generic and can be easily applied to our analysis model to identify such violations in the state space. For all operations, either completed or waiting for execution, it is sufficient to execute the predicate  $current\_time - operation.enqueue\_time > operation.deadline$ . All nodes that satisfy this predicate are nodes that represent deadline violation states.

Alternatively, by adding observer places to our timing analysis model i.e. places that passively observe the system and accumulate tokens when certain conditional transitions execute, deadline violations can be recorded as the model is executing. A *Deadline\_Violation* (Figure 22) transition fires at any point in time when the guard  $dl\_guard$  is satisfied and arc bindings are realized with its input places. The transition observes the states of the

currently running threads and the component message queues to identify deadline violations on operations that are either executing or waiting to execute. The *dlViolation* tokens in *Late Operations* (*LO*) is of the form:

$$LO = \langle Node_{name}, O_{name}, O_{ST}, O_{DLT} \rangle \quad (8)$$

where operation  $O_{name}$  executing on computing node  $Node_{name}$  started at time  $O_{ST}$  and violated its deadline at time  $O_{DLT}$ . Since the component-level scheduler uses a non-preemptive scheme, this operation is still run to completion after the violated deadline. Delays like these propagate to the waiting operations in the message queue.



**Figure 22: Deadline Violation Observer place**

### 6.0.3 System-wide Deadlocks

System-wide deadlocks are caused by the inability of the OS schedulers (on all nodes) to schedule any component thread. This can be caused by situations where a set of executing threads are indefinitely blocked on each other because of cyclic dependencies in the interactions. Deadlocks can be identified by checking the leaf nodes of the bounded state space for *dead transitions* that are unable to fire. Alternatively, the tokens in *Component Interactions* are analyzed to identify cyclic dependencies and provide warnings to possible

deadlocks. Such queries are useful in large component assemblies where mutually blocking dependencies are not immediately perceivable.

#### 6.0.4 Response-time Analysis

Response time analysis identifies the worst-case time taken for the system to generate a desired output signal after an input trigger has been provided e.g. time taken for the emergency braking system on an automated train controller to respond to sensory input. A component-based system can have a variety of triggers but in this context, a trigger is considered as any operation request received by a component. The response to this trigger is the completion of some other operation at a future time instant after the occurrence of the trigger. With state space analysis, it is possible to identify the worst-case response times for a  $(\text{trigger\_operation}, \text{response\_operation})$  pair by first obtaining response times for all trigger-to-response cycles and finding the maximum.

Similar to deadline violation detection, using the `SearchNodes` function, this can be accomplished as follows – The predicate function for the search is the completion of the `response_operation`. The evaluation function scans the list of completed operations in all state space nodes where a response operation was the last operation marked as completed. In this list, by identifying the trigger operation and response operation, the response time is calculated as the difference  $\text{Response\_Operation}_{\text{cmpl\_time}} - \text{Trigger\_Operation}_{\text{enq\_time}}$ . This result is accumulated by the combination function and the maximum response time is calculated from the resultant list. The time complexity of this search is again linear with respect to the size of the state space. The size of the accumulated results list is at most half the size of the state space since the trigger and the response are always in separate state space nodes. Finding the maximum trigger-to-response time within the resultant list is also done in linear time.

This result, as with other state space analysis results here, assumes ideal functional behavior for all operations i.e. the completion of the operation always provides the desired

results. Since the business logic model for operations does not encode data-dependent behavior and conditional execution, the completion of a response operation is no indication about the correctness of the response, merely its timeliness. It is therefore also not possible to differentiate between response "types" as all responses are seen as equivalent.

### 6.0.5 Incomplete Designs

As shown in Figure 20, the state space generated by the timing analysis model contains all possible branches of execution from an initial system state. By encoding system requirements as predicates, it is possible to obtain useful analysis results that can aid system integrators in better designing components. When designing component-based systems, it is often the case where priorities need to be assigned to component threads executing on the same device, especially in real-time systems. These priorities are typically estimated based on relative importance of operations, order of component interactions, and frequency of associated triggers. An ideal priority assignment is one that leads to the most efficient schedule, one that avoids resource starvation while ensuring that the assignment does not violate any timing requirements. In large-sized component assembly, it is not always clear what each component's priorities should be. State space analysis can be useful here in providing a partial thread execution ordering based on the timing requirements of component operations.

In Figure 20, if the deadline i.e. timing requirement, of Timer\_1\_operation was 1 ms, then the thread ordering analysis would always suggest executing Component\_1 first before other components - essentially suggesting a higher priority for Component\_1 if this timing requirement is paramount. Furthermore, timing requirements can be ranked based on importance. A global timing requirement spanning distributed applications may be more important than a local timing requirement that is within the scope of a single component. To achieve this, the state space analysis identifies state space nodes where an operational timing requirement is satisfied and presents the order of thread execution on that device. By

comparing such orderings with other ordering where such timing requirements fail, a system integrator obtains a partial execution order for threads. Using this result as feedback, the integrator can change the relevant component priorities and repeat the analysis.

## 6.1 Modeling and Analysis Improvements

### 6.1.1 Problem Statement

The CPN analysis work presented in [65] has some limitations. The clock values in the distributed set of computing nodes progress by a fixed amount of time regardless of the pace of execution. This is one of the primary causes of state space explosion since many of the intermediate states between *interesting* events, though uneventful, are still recorded by the state space generation. For instance, in a temporal partition spanning 100 ms, even if a thread executes for 5 ms and the rest of the partition is empty, then if the clock progresses at a 1 ms rate, 100 states are recorded in the state space when there are at most 5-7 interesting events in this interval. For a larger set of distributed interacting components, this can become a problem. Also, for distributed scenarios where multiple instances of a set of applications are executed in parallel, in independent computers, our CPN modeling methodology isn't efficient, leading to a tree of parallel executions even when the distributed computers are independent i.e. the computers can be synchronously progressed. The goal of this work is to mitigate such analysis issues and arrive at a more efficient and scalable analysis model.

### 6.1.2 Outline of Solution

Improving the performance of our CPN analysis method required the evaluation of our existing results to identify how the state space generation worked. The state space of CPN is a tree of CPN *markings*, where each marking is a data structure representing the tokens in all its places. So, our goal is to reduce the number of markings accumulated in the CPN i.e. the number of distinct states of interest. This required us to evaluate our representation

of time. Using time as a fixed-step monotonically increasing entity means that the CPN place managing time would always contain a new *clock token*, therefore forcing the CPN marking to become a part of the state space.

To alleviate this issue, we modeled time as a dynamically changing variable, where the changes are strategically forced *time jumps* instead of a statically increasing clock value. This is similar to a discrete event scheduling model but the next (closest) interesting event needs to be calculated based on the current state of the system. The states of all timers are used to calculate the next timer expiry. The states of all executing threads is evaluated to calculate the next closest preempt timestamp. Similarly, the state of all executing operations is analyzed to identify the next closest enqueue onto the component message queue. When temporal partitioning is enabled, the next partition switching timestamp is also considered in effectively calculating the minimum amount of time by which the analysis clock on each node needs to be progressed to ensure that time is efficiently managed while ensuring that no interesting event is *skipped* because of the progression.

Similarly, our data structure representation for distributed deployments i.e. using unordered token sets instead of ordered lists, enabled our earlier CPN models to nondeterministically choose one of the various distributed nodes to execute, generating a exponentially increasing tree of execution orders. Once we moved to representing our distributed hardware nodes as a list, the execution engine iteratively executing the analysis on each node in the list, leading to one execution order instead of a tree.

Such issues are resolved with our analysis improvements, reported in [63]. We modified the timing analysis model to allow dynamic time progression i.e. the clocks (one for each computing node) in the CPN model do not progress at a constant rate but instead experience *time jumps* to the next interesting time step e.g. next timer expiry, end of partition, next scheduling preemption point, or next remote interaction. This makes the system execution progress at a much higher rate and reduces the overall number of states being

recorded in the state space. We also adjusted our modeling concepts when describing distributed deployments. We experienced needless state space explosions as a consequence of using CPN semantics when modeling distributed computers. If the computers are modeled as an aggregate of independent CPN tokens, then the CPN transition that progresses the execution in each computer is independent, leading to a potential  $C!$  different orders for  $C$  computers. For instance, 4 distributed computers leads to 24 possible execution orders displayed by the transition responsible for *picking* the next computer to evaluate and progress. We alleviate this issue by assuming that all computers in a distributed scenarios have synchronized clocks and execute simultaneously leading to a synchronous progress. In practice this can be achieved by using the Precision Time Protocol (PTP) [22] to synchronize clocks throughout the computer network. In CPN, this is done by maintaining the state of each computer in a *list* instead of an unordered aggregate.

### 6.1.3 Handling Time

The CPN-based analysis consists of executing a simulation of the model and constructing a state space data structure for the system (for a finite horizon), and then performing queries on this data structure. This is automated by CPN Tools. The first improvement over the basic CPN approach is in how we handle time. Although it is true that CPN and similar extensions to Petri Nets such as Timed Petri Nets inherently have modeling concepts for simulation time, we explicitly model time as an integer-valued *clock* color token in CPN. There are several reasons for this choice.

Firstly, this is an extension to our previous arguments about choosing Colored Petri Nets. Modeling the OS scheduler clock as a colored token allows for extensions to its data structure such as (1) intermediate time stamps and internal state variables, and (2) adding temporal partitioning schemes like the (time-partitioned) ARINC-653 [7] scheduling model (Figure 23).

These extended data structure fields can be more easily manipulated and used by the

```

1`[{clock_node="Sat1", clock_value=0,
    schedule = [{part_name="Part1", exec_t=0, dur=20, pr=40, off=0},
                {part_name="Part2", exec_t=0, dur=20, pr=40, off=20},
                {part_name="Part3", exec_t=0, dur=20, pr=40, off=40}]}]

```

**Figure 23: A Clock Token with Temporal Partitioning**

model transitions during state changes, allowing for richer modeling concepts that would not be easily attainable using token representations provided by Timed Petri Nets. The ability to pack colored tokens with rich data structures also reduces the total number of colors required by the complete model. This quantitative measure directly influences the reduced size of the resultant state space. The downside of this approach to modeling is that we have to choose a time quantum. But in practical systems this is usually not a problem, as the low-level scheduling decisions are taken by an OS scheduler based on a time scale with a finite resolution.

Secondly, modeling time as a token allows for smarter time progression schemes that can be applied to control the pace of simulation. If we did not have such control over time, the number of states recorded for this color token would eventually explode and itself contribute to a large state space. In order to manage this complexity, we have devised some appropriate *time jumps* in specific simulation scenarios.

If the rate at which time progresses does not change, then for a 1 msec time resolution,  $S$  seconds of activity will generate a state space of size:  $SS_{size} = \sum_{i=1}^{S*1000} TF_{t_i}$  where  $TF_{t_i}$  is the number of state-changing CPN transition firings between  $t_i$  and  $t_{i+1}$ . This large state space includes intervals of time where there is no thread activity to analyze either due to lack of operation requests, lack of ready threads for scheduling, or due to temporal partitioning. During such idle periods, it is prudent to allow the analysis engine to *fast-forward* time either to (1) the next node-specific clock tick, (2) the next global timer expiry event, or (3) the next activation of the node-specific temporal partition (whichever is earliest and most

relevant). This ensures that the generated state space tree is devoid of nodes where there is no thread activity.



**Figure 24: Dynamic Time Progression**

Figure 24 illustrates these time jumps using 4 scenarios. Assuming the scheduler clock ticks every 4 msec, Case 1 shows how time progression is handled when an operation completes 2 msec into its thread execution. At time  $t$ , the model identifies the duration of time left for an operation to complete. If this duration is earlier than the next preemption point, then there is no need to progress time in 1 msec increments as no thread can preempt this currently running thread till time  $t + 4$  msec. Therefore, the `clock_value` in Figure 23 progresses to time  $t + 2$  msec, where the model handles the implications of the completed operation. This includes possibly new interactions and operation requests triggered in other components. Then, time is forced to progress to the next preemption point where a new candidate thread is scheduled. This same scenario is illustrated in Case 2 when the time resolution is increased to 100 usec instead of 1 msec. Notice that the number of steps taken

to reach the preemption point are the same, showing how the state space doesn't have to explode simply because the time resolution is increased. Case 3 illustrates the scenario where at time  $t$ , the scheduler has no ready threads to schedule since there are no pending operation requests but at time  $t + 3$  msec, a component timer expires, triggering an operation into execution. Since timers are maintained in a global list, each time the *Progress\_Time* transition checks its firing conditions, it checks all possible timers that can expiry before the next preemption point. So, at time  $t$  when no threads are scheduled, the model immediately jumps to time  $t + 3$ . This scenario also shows that if the triggered operation does not complete before the preemption point *and* there are no other ready threads or timer expiries that can be scheduled, the clock value jumps to the operation completion. It must be noted here that this case is valid only because the DREMS architecture we have considered uses a non-preemptive operation scheduling scheme. Lastly, Case 4 shows time jumps working with temporal partitioning. At some time  $t + x$ , the model realizes the absence of ready threads and does not foresee any interaction requests from other components, then it safely jumps to the end of the partition without stepping forward in 1 msec increments. This time progression directly shows how the state space of the system execution reduces while still preserving the expected execution order, justifying our choice of modeling time as a colored token using CPN.

#### 6.1.4 Distributed Deployment

The second structural change to the analysis model is in how distributed deployments are modeled and simulated. Early designs on modeling and analysis of distributed application deployments [65] included a unique token per CPN place for each hardware node in the scenario. Since the individual *node* tokens are independent and unordered, there is non-determinism in the transition bindings when choosing a hardware node to schedule threads in. For instance, if there are 2 hardware nodes in the deployment with ready threads on both nodes, then either node can be chosen first for scheduling threads leading to two possible

variations of the model execution trace. Therefore the generated state space would exponentially grow for each new hardware node. In order to reduce this state space and improve the search efficiency, we have merged hardware node tokens into a single *list* of tokens instead of a unassociated grouping of individual node tokens. This approach is inspired by the symmetry method for state space reduction [60].



Figure 25: Structural Reductions in CPN

Figure 25 illustrates this structural reduction. Consider a distributed deployment scenario with an instance of a DREMS application deployed on each hardware node, Sat1 through Sat6. Components *Comp1* and *Comp2* are triggered by timers, eventually leading to the execution of component operations (modeled as shown in Figure 17). If all the

timer tokens in the system were modeled individually, the transition *Timer\_Expiry* would non-deterministically choose one of the two timer tokens that are ready to expire at  $t=0$ . However, if the timers are maintained as a single list, then this transition (1) consumes the entire list, (2) identifies all timers that are ready to expire, (3) evaluates the timer expiration function on all ready timers, (4) propagates the output *operation* tokens to the relevant component message queues in a single firing. This greatly reduces the tree of possible transition firings and therefore the resultant state space. Also, if there is no non-determinism in the entire system, i.e., there is a distinct ordering of thread execution, then this model can be scaled up with instantiating the application on new hardware nodes with no increase in state space size. This is because all of the relevant tokens on all nodes are maintained as a single list that is completely handled by a single transition firing.

An important implication of the above structural reduction is that the simulation of the entire system now progresses in synchronous steps. This means that at time 0, all the timers in all hardware nodes that are ready to expire will expire in a single step. Following this, all operations in all component message queues of all these nodes are evaluated together and appropriate component executor threads are scheduled together. When these threads execute, time progresses as described in Section 6.1.3, moving forward by the minimum amount of time that can be fast-forwarded.

## 6.2 Investigating Advanced State Space Analysis Methods

### 6.2.1 Problem Statement

State space analysis techniques have been successfully applied with Colored Petri Nets in a variety of practical scenarios and industrial use cases [50], [52]. The basic idea here is to compute all reachable states of the modeled concurrent system and derive a directed graph called the state space. The graph represents the tree of possible executions that the system can take from an initial state. It is possible from this directed graph to verify behavioral properties such as queue overflows, deadline violations, system-wide deadlocks and even derive counterexamples when arriving at undesired states.

Advanced state space analysis techniques arise from the need for efficient space space searching algorithms. State space analysis is challenged by time, memory, and computational power. Large state spaces require large CPU RAM and efficient search methodologies to quickly arrive at a useful result. With increasingly complexity in system designs, the number of state variables to store in memory also increases. Our problem here is to identify and apply advanced state space analysis techniques, applicable in the context of our CPN model and available as tested analysis tools that mitigate such complexities in state space analysis. This will help improve the scalability of our model and also reduce the memory footprint of the analysis.

### 6.2.2 Outline of Solution

The variety of CPN-specific state space reduction techniques [19], [49] developed in recent times has significantly broadened the class of systems that can be verified. In order to easily apply such techniques to our analysis model, we use the ASAP [109] analysis tool. The tool provides for several search algorithms and state space reduction techniques such as the *sweep-line method* [20] which deletes already visited state space nodes from memory, forcing on-the-fly verification of temporal properties. The main advantage of such

techniques is the amount of memory required by the analysis to verify useful properties for large models.

The sweep line method for state space reduction is used to check for important safety properties such as lack of deadlocks, timing violations etc. using user-defined model-specific queries. Practical results enumerated in [20] show improvements in time and memory requirements for generating and verifying bounded state spaces. The method relies of discarding generated states on-the-fly by performing verification checks during state space generation time. Any state that does not violate system properties can be safely deleted. Another advantage of this method compared to similar reduction methods such as bit-state hashing [46] is that a complete state space search is guaranteed.

### 6.2.3 Evaluation of Solution

We have evaluated these advanced state space analysis methods by comparing the obtained results with our basic state space analysis in CPN Tools. Using several criteria such as state space generation time, state space query generation, query processing time, memory usage etc., we can generate a comparison table to show the overall improvements in the analysis workflow. Advanced analysis techniques are usually accompanied by some expertise requirements that can be masked by nicely designed analysis tools. Our goal with ASAP is to use a model-driven approach to enable advanced analysis methods that does not require much expertise. With ASAP, we are able to generate verification project templates i.e. building blocks much like Simulink that are wired up together and provide an interface to the low-level analysis engine. Therefore, evaluation of this work requires both the evaluation of the advanced methods applied, and the tool used. As for the analysis methods applied, it is important to ensure that the state space tree, with all of the applied analysis heuristics, is still sufficiently probed when searching for system properties. Heuristics that enable state space analysis but only by partially checking the tree can lead the case where the state space analysis does not identify timing errors because of the incomplete search.

This will be checked with negative test cases where the design model is known to be flawed; if the results from ASAP identify injected timing errors for all test cases, then the analysis is sound.

#### 6.2.4 Contributions

These methods were evaluated on our CPN analysis method and the results were presented in [63]. We used a large and diverse 100 component-based application for our testing. Using the CPN Tools’ built-in state space analysis tool, a bounded state space of thread activity was generated. The state space generation took 36 minutes on a typical x86 laptop. We imported the same CPN analysis model onto ASAP and performed on-the-fly verification checks for lack of dead states in the analysis model for the same bounded state space. The on-the-fly verification, without any graphical interface overheads, took less than 10 minutes to compute a lack of system-wide deadlocks. It must be noted here that this improved result is due to not only because of the efficient state space search but also because of symmetry-based structural reduction discussed in the previous section.

In order to illustrate the utility of such state space reduction techniques, we consider a large-scale deployment. Figure 26 shows the generated CPN model for a domain-specific DREMS application. This is a scaled-up variant of several satellite cluster examples we have used in previous publications [31, 65]. The example consists of a group of communicating satellites hosting DREMS applications. The component assembly for this application consists of 100 interacting components distributed across 10 computing nodes, many of which are triggered by infrastructural timers. Notice in Figure 26 how there is only one token in each of the main CPN places, as described in Section 6.1.4. All of the component timers are appended to the list maintained in *Timers* place. Similarly, all node-specific clock tokens are maintained in place *Clocks*.

At time  $t=0$ , before the simulation is kicked off, the transition *Establish\_Order* generates the powerset of thread execution orders that are possible given the configuration of



**Figure 26: Generated CPN model for a Distributed Application Deployment**

the clock token. This may be a potentially large set depending on the number of threads of equal priority in each partition. Once this tree of possible orders is established, the complete set of timers that are ready to expire are evaluated. Each timer expiry manifests as an operation request and each callback operation modeled using the grammar shown in Figure ???. Once the operations are ready to execute, the highest priority component thread with a pending operation request is chosen for execution. This thread scheduling happens on all hardware nodes. When each thread executes, new interactions may occur as a consequence of the execution. For instance, if a component thread executes a timer operation in which the component publishes on a global topic, the consequence of this action would include a set of callback operation requests on all components that contain subscribers to that global topic. Lastly, all running threads are evaluated to identify the minimum amount of time that can be safely fast-forwarded in each node. If the running component threads are independent or symmetrical, then the maximum possible time progression is up to the end of the temporal partition. Note here that temporal partition in the deployment can be set to an empty list which simply removes the partitioning constraint and treats all component threads on a node as candidate threads for execution. The above sequence of transitions

repeat for as long as there is a timer expiry, a pending operation request or an unfinished component interaction.



**Figure 27: Sweep-Line Method**

Using the CPN Tools’ built-in state space analysis tool, a bounded state space was generated reaching up-to 20 hyperperiods of component thread activity. This bounded generation took 36 minutes on a typical laptop. Our goal with such an example is to evaluate the effectiveness and utility of state space reduction techniques with respect to speed and memory usage. Figure 27 shows a simple block diagram of the sweep-line method as configured in ASAP. Performing on-the-fly verification checks for lack of dead states in the analysis model, results indicate lack of system-wide deadlocks due to blocking behaviors triggered by RMI-style synchronous peer-to-peer interaction patterns. Figure 28 shows analysis results obtained from a *Verification Job* executed in the tool. Notice the on-the-fly verification taking less than 10 minutes to perform deadlock checks on this sample deployment. Using the *Palette* in ASAP, several standard ML (SML) user queries can be created to check for domain-specific properties.

| Statistics     |         | Results                      |  | Configuration |  |
|----------------|---------|------------------------------|--|---------------|--|
| Execution time | 495.147 |                              |  |               |  |
| Nodes          |         |                              |  | 144861        |  |
| Arcs           |         |                              |  | 659092        |  |
| Statistics     |         | Results                      |  | Configuration |  |
| No Dead States |         | true                         |  |               |  |
| Statistics     |         | Results                      |  | Configuration |  |
| Time           |         | Thu Jan 01 16:05:05 CST 1970 |  |               |  |
| Model          |         | 100_threads                  |  |               |  |

**Figure 28: Dead States Checking in a Component-based application**

It must be noted here that this improved result is due to not only because of the efficient state space search but also because of symmetry-based structural reduction discussed in the previous section. If not for this reduction, the state space search requirements would exponentially grow for each new hardware node added to the deployment.

## CHAPTER VII

### EXPERIMENTAL EVALUATION

Experimentally validating our timing analysis results is an important and necessary requirement. In order to obtain any level of confidence in our CPN-based work, the system design model needs to be completed implemented, and deployed on the target hardware platform. We have constructed a testbed [61] to simulate and analyze resilient cyber-physical systems consisting of 32 Beaglebone Black development boards [1]. We have chosen the light-weight ROS [92] middleware layer and implemented our ROSMOD Component model [62] on top of it. This component model provides the same execution semantics and interaction patterns as our DREMS component model [86]. Our goal with this work is to (1) establish a set of distributed component-based applications, (2) translate this design model to our CPN analysis model, (3) deploy these applications on our testbed and accurately measure operation execution times, and finally (4) perform state space analysis on the generated CPN model to check for conservative results, compared against the real system execution.

#### 7.0.1 Challenges

Experimental validation requires that online measurements of the real-time system match with the design-time timing analysis results in a way that the timing analysis results are always close but conservative. If the timing analysis results predict a deadline violation, this does not necessarily mean that the real system will violate deadlines but if the timing analysis and verification guarantees a lack of deadline violations, then the real system should follow this prediction. The design-time timing analysis primarily uses bounded state space analysis for such predictions. The analysis is bounded by necessity i.e. to tackle the state space explosion problem. Obtaining confidence from the timing analysis

results depends entirely on the behavior of the system and the applied bounds. Recall that DREMS components are dormant by nature and need to be triggered either by timers or through interaction ports. Component-based applications using DREMS typically exhibit some periodic behavior i.e. sequences of interactions between components that repeats periodically. Each interaction sequence could involve a set of distributed components or assemblies. In such scenarios, design-time analysis bounds the state space generation to some constant multiple of the overall period of the application. The overall period here is some duration of time after which all of the sequences of interactions in the application have repetition. By analyzing some reasonably large multiple of the application period, the state space analysis both generates and searches a sufficiently large set of states of the system i.e. execution behaviors. Assuming the execution correctness of the timing analysis model and sufficiently conservative worst-case estimates of execution times, a complete absence of timing anomalies in this bounded state space is typically a good indication of a safely executing system. There are various ways to obtain the WCET values for individual operational steps but the easiest approach is to execute the design on our testbed and make accurate measurements.

WCET of component operational steps needs to be measured by having the component operation execute at highest priority with no other component threads intervening this process. Secondly, this experiment must be repeated multiple times with different inputs (if any), including the worst-case input. The worst-case measurement across multiple runs of the experiment gives us a *pure execution time* of the code block.

Obtaining the WCET values by this method is not only more realistic but also an accurate representation of the target system. Once these individual numbers are obtained, the values are plugged into the CPN through our business-logic models. Ideally, the CPN model, consisting of a composition of component operation models, when analyzed, produces results that closely resemble a real-system deployment of the component assembly. Such results would validate the modeling accuracy and the analysis results.

## 7.1 Resilient Cyber-Physical Systems (RCPS) Testbed

### 7.1.1 Architecture

The testbed consists of 32 *RCPS nodes*, each of which is a Beaglebone Black (BBB) [1] development board. We execute a full software stack including a ROS-based middleware, called ROSMOD [62] and the DREMS component model. For the subset of CPS we are interested in, the behavior of the CPS can be much more precisely emulated with these boards compared to running the applications inside of a standalone simulation. For example, NASA’s CubeSat Launch Initiative (CSLI) [2] provides opportunities for nanosatellites to be deployed into space for research. CubeSats are small (4-inch long) satellites running low-power embedded boards and being prepared for interplanetary missions [3] to Mars. A distributed set of CubeSats can be easily tested with this architecture if it can be integrated with a high-fidelity space flight simulator.



Figure 29: Testbed Architecture

The Gigabit Ethernet port of each BBB is connected to a *Communication Network*

switch. This is a programmable OpenFlow [77] switch, allowing users to program the flowtable of the switch to control the routes that packets follow and completely configure the full network and subnets required for their emulated deployment. Furthermore, the configurability of the communications network enables per-link or per-flow bandwidth throttling, enabling precise network emulation. The primary *Development and Control* machine, running our software development tools, communicates with the BBBs using this network. After software applications are deployed on this testbed, the characteristics of the real CPS network can be enforced on the application network traffic. Therefore, this network emulates the physical network which a distributed CPS would experience on deployment.

Each RCPS node is also connected to a *Physics Network* using a 10/100 USB-to-Ethernet adapter, since the BBBs only have one gigabit ethernet port. This network is connected to a *Physics Simulation Machine* running Cyber-Physical Systems simulations. This network provides the infrastructure necessary to emulate CPS sensing and actuation in the loop, allowing application software to periodically receive sensor data and open interfaces to output actuator commands to the simulation.

The Physics Simulation Machine closes the interaction loop for the testbed nodes, allowing the physical dynamics of the RCPS nodes to be simulated in the environment in which it would be deployed, *e.g.* satellites' orbital mechanics and interactions can be simulated for a satellite cluster in low Earth orbit (LEO).

## 7.2 ROSMOD Software Infrastructure

The software infrastructure includes our model-driven toolsuite and DREMS-style component model called ROSMOD [62], the Robot Operating System middleware [92], and component-based software applications developed for ROSMOD. The applications are cross-compiled for Beaglebone Black and the relevant processes are started at real-time

priority with *SCHED\_RR* linux real-time process scheduling using our ROSMOD deployment framework (Figure 31).

### 7.2.0.1 ROSMOD Modeling Language

To enable the design, development, and testing of software on distributed CPS, we have developed a modeling language specific to the domain of distributed CPS which utilize ROS, the ROSMOD Modeling Language (RML). Figure 30 shows the metamodel for this language using GME [67] notation; the GME-based metamodel figure is very similar to a traditional UML class diagram with some minor differences in notation. RML captures all the relevant aspects of the software, the system (hardware and network), and the deployment which specifies how the software will be executed on the selected system. Using ROSMOD, developers can create models which contain instances of the objects defined in RML. This approach of using a domain specific modeling language to define the semantics of the models allows us to check and enforce the models for correctness. Furthermore, this approach allows us to develop generic utilities or extensions, called *plugins* [74] which can act on any models created using ROSMOD, for instance generating and compiling the software automatically or automatically deploying and managing the software on the defined system. The rest of this section goes into the specific parts of the modeling language, called the metamodel, and how they define the entities in a ROSMOD Model.



**Figure 30: ROSMOD Metamodel.** Containment is specified from *src* to *dst* where the source has a containment attribute *quantity*, meaning that *quantity* objects of type *src* can be contained in an object of type *dst*. Pointers are specified as a one to one mapping from source to destination, using the name of the pointer. Sets allow for pointer containment. All objects contain a *name* attribute of type *string*, not shown for clarity. Note: the metamodel is used to create the ROSMOD Modeling Language, but users do not see or interact with it; it is used to enforce proper model creation semantics.

The top-level entity of RML is a *Project*, which is shown in the upper left of Figure 30. The language supports a variety of modeling concepts that address structural and behavioral aspects for distributed embedded platforms. ROSMOD users can create models of software workspaces, required software libraries, embedded devices, network topologies, component constraints and hardware capabilities. The language also supports code development, particularly with regards to port interface implementations i.e. the execution code for operations owned and triggered by communication ports or local timers. Below, we describe in detail the various aspects of this metamodel and how these concepts are integral to developing distributed CPS and rapid prototyping needs.

#### 7.2.0.2 Motivation for ROSMOD Software Model

The goal of the ROSMOD software model is to provide a language to precisely model the application software. When using a DREMS-style component model, the software is primarily a collection of components, where each component is defined by its ports and timers. Building a precise model of the software has various benefits. Firstly, applying model-driven development techniques enables reuse of previously defined components i.e. a single component can be instantiated or copied or modified as required and executed on the runtime system. Secondly, the development time of the application can be reduced significantly as a large part of the runtime code can be fully generated based on templates. Lastly, a clear model of the software provides a canvas for design-time analysis. If the structural aspects of the software are captured in the model of the component assembly, and the behavioral properties of the components are encoded in the attributes of its ports and timers, then using translation rules, a design-time analysis model can be fully generated.

#### 7.2.0.3 Software Model

The *Software* class in Figure 30 models a software workspace. A workspace, following ROS terminology, is a collection of applications that are developed and compiled together

into binaries. Thus, each Software class can contain ROS applications, called *Packages*, and *Libraries* required for the applications. Packages consist of *Messages*, *Services* and *Components*. Components contain a set of pointers to Libraries to establish dependence e.g. an *ImageProcessor* component *requires* OpenCV, an open-source computer vision library. Libraries are of two types: Source libraries and System libraries. Source libraries are standalone archives that can be retrieved, extracted and integrated into the software build system with no additional changes. System libraries are assumptions made by a software developer regarding the libraries pre-installed in the system. Here, system refers to the embedded device on which the component is intended to execute.

*Messages* represent ROS message types used by publisher and subscriber ports for topic-based communication. Similarly, *Services* describe the ROS peer-to-peer request-reply interaction pattern. Each service is characterized by a pair of messages, *request* and *response*. A client entity can call a service by sending a request message and awaiting a response. This interaction is presented to the user as a remote procedure call. Each ROSMOD component contains a finite set of communication ports. These ports refer to messages and services to concretize the communication interface. Components can also contain *Timers* for time-triggered operation e.g. periodically sampling inertial measurement sensors while operating an unmanned aerial vehicle (UAV).

#### 7.2.0.4 Motivation for ROSMOD System Model

The goal of the ROSMOD system model is to provide a language to precisely model the network of computers capable of executing applications defined in the software model. The system model is necessary for both compilation and deployment. The software defined in the software model, and therefore the generated source code must be compiled down to a binary for runtime execution. An accurate model of the available runtime system provides necessary information for cross-compilation requirements and any integrated runtime load balancing features. The deployment framework can use the system model to find a suitable

candidate device onto which the application processes are deployed. To automate this process, the system model must capture fine-grained details about each available device, including information such as the IP address, the user permissions, and means to access the device e.g. Secure Shell Protocol (SSH) [111]. As with the software model, this system model can be reused in all ROSMOD projects for a given hardware assembly, such as the RCPS tested.

#### 7.2.0.5 System Model

A *System Model* completely describes the hardware architecture of a system onto which the software can be deployed. A ROSMOD Project contains one or more *Systems*. Each System contains one or more *Hosts*, one or more *Users*, one or more *Networks*, and one or more *Links*. A host can contain one or more *Network Interfaces*, which connect through a link to a network. On this link the host's interface is assigned an IP address, which matches the subnet and netmask specification of the network. Additionally, a host has a set of references to users, which define the user-name, home directory, and ssh-key location for that user. The host itself has attributes which determine what kind of processor architecture it has, e.g. *armv7l*, what operating system it is running, and lastly a combination of Device ID and Device ID Command which provide an additional means for specifying the type of host (and a way to determine it), for instance specifying the difference between a BeagleBone Black and an NVIDIA Jetson TK1 which both have *armv7l* architecture but can be separated by looking at the model name in the device tree. Finally, a host may contain zero or more *Capabilities* to which the component constraints (described in the previous section) are mapped. The final relevant attribute is the *Network Profile* attribute of a link. Using the network profile, which is specified as a time-series of bandwidth and latency values, we can configure the links of the network using the Linux TC to enforce time-varying bandwidth and latency. This network configuration is useful when running experiments on

laboratory hardware for which the network is not representative of the deployed system's network.

#### 7.2.0.6 Deployment and Experimentation

*Deployment* refers to the act of starting application processes on candidate hosts, where each host can provide for and satisfy all of the constraints of the processes e.g. general purpose input/output (GPIO) ports, CPU speed etc. Therefore, a deployment is a mapping between application processes and system hosts on which the application processes run. The ROSMOD Deployment Model makes this map a loose coupling to enable rapid testing and experimentation. Each Deployment consists of a set of *Containers*. Each container, conceptually, is a set of processes that need to be colocated. Containers also ensure that no process outside a container is deployed along with the container once it has been mapped to a host. Each container, therefore, contains a set of *Nodes* (ROS terminology for processes). In each node, application developers instantiate one or more components previously defined in the Software model. Following the ROSMOD component model, each such instance maps to an executor thread that executes the operations in the component's operation queue. Note that the same component can be instantiated multiple times even within the same node. Also note the container is not mapped to a specific host within the deployment model, but rather is automatically mapped to a host by the deployment infrastructure within an *Experiment*.

As shown in Figure 30, each project supports a set of *Experiments*. Each experiment has pointers to one System model and one Deployment model. The system model provides the set of available devices and the deployment model provides the set of containers, where each container contains a set of component constraints that need satisfying (the union of all the container's nodes' component constraints). ROSMOD uses these sets to find a suitable mapping and then deploys the containers on the chosen host devices, if a mapping can be

found which satisfies all the constraints of all the containers. When the deployment infrastructure selects hosts from the system model for mapping, it first checks to see which of the system model’s hosts are 1) reachable, 2) have valid login credentials, 3) have the correct architecture, operating system, and device ID, and 3) are not currently running any other compilation or experiment processes from any other user. In the case that there are no available hosts in the system or the deployment’s constraints cannot be satisfied, the infrastructure informs the user. Upon successful deployment of the experiment, the infrastructure automatically stores the specific mapping relevant to the deployed experiment for later management and destruction. When such experiments are stopped, ROSMOD retrieves the component logs from the hosts and displays results.

Such a loose coupling between the deployment model and the system model, along with the infrastructure automatically mapping the containers to valid, unused hosts enables the execution of the same deployment onto subsets of a large system, for instance running many instances of the same deployment as separate experiments in parallel on a large cluster of embedded devices. Additionally such a loose coupling enables redeployment onto a completely different system by simply changing the experiment’s system model reference; the infrastructure will automatically verify that the new system meets the constraints of the deployment and has available hosts.

### 7.2.1 Deployment Infrastructure

The workflow for software deployment is as shown Figure 31. After the user has generated and compiled the software model into binary executables, they can run an experiment that has valid deployment model and system model references. Every ROSMOD workspace is generated with an additional *node* package. This builds a generic node executable that can dynamically load libraries. When the software infrastructure generates and compiles the source code for the software model, the components are compiled into dynamically loadable libraries, one for each component definition along with a single executable

corresponding to the generic node package. The first step the deployment infrastructure performs when running an experiment is generating the XML files which contain metadata about each ROS node modeled in the ROSMOD Deployment Model. This metadata includes the component instances in each node and the appropriate component libraries to be loaded. Based on the XML file supplied to the node executable, the node will behave as one of the ROS nodes in the deployment model. This allows for a reusable framework where a generic executable (1) loads an XML file, (2) identifies the component instances in the node, (3) finds the necessary component libraries to load and (4) spawns the executor threads bound to each component.



Figure 31: Software Deployment Workflow

The reason for having the components be libraries that are loaded at run-time by a node executable is 1) to help enforce separation between each component’s code, 2) to shorten the compilation time (which may be quite long for embedded processors and complex models), and most importantly 3) to enable the ability for users to change the mapping of component instances to processes without the need to recompile any of the code. Because ROSMOD is focused on the rapid development and deployment of reusable software components, we designed the infrastructure to allow users to experiment with which components are collocated within processes dynamically and rapidly iterate through several scenarios and experiments without having to wait for code compilations between experiments. Finally, such a component-based deployment framework enables unit testing and integration testing in a very well-defined manner for the software components. If an error occurs during the execution of an application on a system, the user can easily and quickly break the deployment down into sub-deployments for unit-testing of only the relevant components to determine the source of the error.

In the above architecture, the deployment needs three primary ingredients: (1) the generic node executable, (2) dynamically loadable component libraries, and (3) an XML file for each ROS node in the deployment model. For each new node added to the deployment model, by merely regenerating the XML files, we can establish a new deployment. The ROS workspace is rebuilt only if new component definitions are added to the Software Model. This architecture not only accelerates the development process but also ensures a separation between the Software Model (i.e. the application structure) and deployment-specific concerns e.g. component instantiation inside ROS nodes.

When the user has selected an experiment to run, the deployment infrastructure first determines whether the selected deployment can execute on the selected system. Like the software infrastructure described above, the deployment infrastructure queries the selected system to validate that the system is reachable, conforms to the model, and has available hosts for deployment (i.e. they are not currently running any compilation or deployment

processes). Once those available hosts have been determined, the infrastructure attempts to map the deployment’s containers to the available hosts based on the constraints and capabilities of the two sets. If the constraints cannot be satisfied by the capabilities of the available hosts, the user is informed and the deployment of the experiment is halted. If the capabilities of the hosts do satisfy the constraints of the containers and their associated components, the deployment infrastructure generates the required XML files for the deployment and copies the XML files and the binaries over to the selected hosts, before starting the relevant processes. Finally, if all of those steps are successful, the infrastructure saves the mapping into the model for user inspection and for later teardown of the experiment.

When the user chooses to end a currently running experiment, the deployment infrastructure verifies that the experiment is still running before stopping the associated processes, copying the relevant log files back, cleaning up the deployment artifacts from the hosts, and removing the saved experiment mapping from the model.

### 7.3 Evaluation of Timing Analysis Results

Experimental validation should demonstrate that online measurements of the real-time system match with the timing analysis results in a way that the timing analysis results are always close but conservative. The goal of the analysis is to obtain a fairly accurate estimation of the runtime behavior i.e. estimates of component timing behavior that isn’t so conservative that the results are useless. If the real execution of a specific operation takes 100 milliseconds and the timing analysis is 135 milliseconds, then this is close but conservative. On the other hand, if the design-time analysis predicts the execution time to be in the order of seconds or tens of seconds, then the analysis is conservative but grossly overestimate. One of the biggest assumptions in our CPN work is the knowledge of worst-case execution times of the individual steps in the component operations. We have previously designed [63] a business-logic modeling grammar that captures the temporal behavior of

component operations, especially WCET metrics for the different code blocks inside an operation. For example, consider a simple client-server example as shown in Figure 11. The client component is periodically triggered by an internal timer and executes a synchronous remote method invocation to a remote server component. The interaction here demands that the client component be blocked for the duration of time it takes the server to receive the operation, process its message queue, execute the relevant callback, and respond with output.

Note that in Figure 11, we only annotate isolated code blocks that take a fixed amount of execution time on a specific hardware architecture. These are the only measurements that we can reliably make with repeated testing and instrumentation. The client-side blocking delay is not measured because the number of factors responsible for this delay are numerous e.g. server's message queue state, scheduling non-determinism, network delays etc. In order to be able to predict this delay, we need to use state space analysis and search through the tree of possible executions to identify the worst-case blocking delay. This also means that our CPN model must capture and account for such delay-causing factors.

The remainder of this section presents various primitive interaction patterns and assemblies that have been evaluated. The results are restricted to simple cases, though we have tested on medium-to-large scale examples spanning 25-30 computing nodes, and with up to a 100 components. The scalability of our model, however, is not within the scope of this paper as we have previously evaluated this metric [63]. As mentioned earlier, in all of our tests, we use the ROS [92] middleware and our ROSMOD [62] component model.

### 7.3.1 Understanding the CPN Analysis Plots

By performing state space analysis, we are analyzing a bounded tree of possible component behaviors. By identifying the worst-case execution trace in this tree, we're able to obtain a suitable conservative candidate execution that represents a possible behavior. Once this trace is identified, we plot the execution time behavior of all components in this

trace. This pattern is followed in all of the following plots. Figure 32 describes the analysis plots presented later in this chapter. Each subplot in this figure represents the execution of a component operation. The x axis of this plot represents the analysis time, and the y axis represents the execution time of the operation. Each execution is shown as a rectangular pulse, the amplitude of which is the execution time of the operation. The rising edge of the pulse represents the enqueue time stamp of the operation i.e. the time instant when a request for this operation was enqueued onto the component message queue. The falling edge of the pulse represents the completion time stamp i.e. the time instant when the component executor thread has completed execution of the operation and is ready to service the next request waiting in the queue. Since the execution time of the operation is calculated from the enqueue time instant, the plot can have intersecting pulses, as shown in the second subplot. Here a new operation request is enqueued onto the message queue while an existing instance of the operation request is being executed by the component executor thread.



**Figure 32: Interpreting Execution Time Plots**

### 7.3.2 Client-Server Interactions

As shown in Figure 11, a simple client server example involves a periodically triggered client component that fetches data from a remote server. Figure 33 shows our experimental trace of a simple distributed client-server sample. The client (client\_timer\_operation) is triggered every 500 ms, and performs floating-point calculations in a loop requiring the services of a remote operation. The server (Power\_operation) periodically receives this operation request and responds to it, taking about 1.2s to complete each operation instance. In this experiment, these component threads are running at high uninterrupted real-time priorities.



**Figure 33: Experimental Observation: Client-Server Interactions**

Figure 34 shows the execution time plot derived from our CPN. As expected, since there are no other interruptions on the server side, the server is able to promptly respond to the client.



**Figure 34: CPN Analysis Results: Client-Server Interactions**

Table 1 shows a summary of these results. Similar tables are included as part of all other experiments in this section.

### 7.3.2.1 Bad Designs

The goal of our CPN timing analysis is to identify bad component designs, unacceptable execution times, response times etc. There are various ways in which we can accidentally design a poorly performing client-server interaction. In the above case, the server operation takes 852 ms in its worst-case before responding to the client and unblocking the client

**Table 1: Client Server Example - Summary of Results**

| Operation Name         | Component Name | Experimental WCET (ms) | Timing Analysis WCET (ms) | Deadline (ms) |
|------------------------|----------------|------------------------|---------------------------|---------------|
| Power Operation        | Server         | 829.155041             | 852.0                     | 2000          |
| Client Timer Operation | Client         | 1121.306376            | 1152.0                    | 3000          |

executor thread. If instead, the server operation took 8.5 seconds, the client component will stay blocked for 10 times longer and the client timer expiries will not be serviced faster than the timer periods. This shows a simple use-case where the currently blocked client timer operation is starving subsequent timer expiries from being handled promptly.

Figure 35 shows our CPN predictions after simply changing this server execution time.



**Figure 35: CPN Analysis Results: Client-Server Response Times in Bad Designs**

The execution time of each new client-side timer operation is worse than the previous since the operation is spending much longer waiting in the queue. Recall that the execution time of a component operation includes the waiting time in the message queue. Even with a bounded state space that spans just 1 minute, it is clear that the client component message queue size is monotonically increasing. This is a use case where a client component execution is affected by delays caused on a remote server. Each client-server interaction delay will only be worsened when the server component has other operations to tend to aside from the client requests.

### 7.3.3 Publish-Subscribe Interactions

Similar to the earlier example, consider the ROSMOD publish-subscribe interaction. A publisher is periodically triggered by a timer when this component broadcasts a message on a topic. A subscribing component receives this message and performs some computation. In this case, the timer period is set to 2 seconds i.e. every 2 seconds, these two component interact via publish-subscribe messaging passing.



**Figure 36: Experimental Observation: Publish-Subscribe Interactions**

Figure 36 shows our testbed observations and Figure 37 shows our CPN analysis results. As evident, the CPN results closely match and validate this sample. Table 2 summarizes the results.



**Figure 37: CPN Analysis Results: Publish-Subscribe Interactions**

### 7.3.3.1 Bad Designs

Similar to the client-server example, we can explore another accidental bad design that may become hard to track. In a publish-subscribe interaction, the publisher and the subscriber are completely detached from each other i.e. delays in the subscriber operation do not affect the publisher. So, the only way the publisher component can affect its own behavior is how it is triggered. Periodically triggered data dissemination is the most commonly used streaming pattern aside from event-driven messaging. Here, if the period of the timer

**Table 2: Publish Subscribe Example – Summary of Results**

| Operation Name            | Component Name | Experimental WCET (ms) | Timing Analysis WCET (ms) | Deadline (ms) |
|---------------------------|----------------|------------------------|---------------------------|---------------|
| Subscriber Port Operation | Subscriber     | 194.718958             | 200.0                     | 1000          |
| Publisher Timer Operation | Publisher      | 249.2155               | 252.0                     | 1000          |

is decreased from 2 s to 10 ms, the publisher gets triggered too frequently and is seriously affected by a local design flaw.



**Figure 38: CPN Analysis Results: Time-triggered Publisher – Periodicity Issues**

Figure 38 shows the CPN results for this design. The subscriber still performs as expected taking 200 ms to process each incoming message. But the publishing component is affected by the periodicity of its trigger. The publish\_timer fires every 10 ms, and at each expiry enqueues a timer operation request onto the publisher component's message queue. Each timer operation itself takes about 250 ms i.e. 25 times of the periodicity of its trigger. The inevitable result of this design is the monotonically rising execution times of each subsequent timer operation due to progressively delayed response. Here, the  $t_{response\_time} = t_{dequeue} - t_{enqueue}$  becomes progressively worse and eventually the publisher's message queue overflows. In the real experiment, we were unable to access the publisher component's device via remote shell as the device CPU was saturated.

### 7.3.4 Trajectory Planner

In the past [65], we have used a *Trajectory Planner* deployment to illustrate the utility of our state space analysis. As shown in Figure 39, a Sensor component is periodically triggered every second by the *sensor\_timer* at which point it publishes a notification to the Trajectory Planner, alerting the planner of new sensor state. The planner component receives this notification on its *state\_subscriber*. On receiving this message, the planner component executes a remote method invocation to the *compute* server located in the Sensor, blocked and waiting for a response. At this point, the *compute\_operation* is executed on the Sensor which returns the updated sensor state. This unblocks the planner component which uses the new sensor state to perform trajectory planning tasks.



Figure 39: Trajectory Planner Test

Figure 40 shows the execution time plot of this sample where the sensor updates happen once a second. Figure 41 presents the CPN analysis plot and Table 3 summarizes the results.



**Figure 40: Experimental Observation: Trajectory Planner**

This is a common interaction pattern in Cyber-Physical systems since embedded sensors are updated at a much higher frequency than a path planning entity. Thus, the planner can query the sensor at a lower rate to sample the sensor state. In this example, the planner is matching the frequency of the sensor since the execution cost is low. However, when more components are added to this deployment, the planner would have to fetch sensor state less frequently so as to not affect other system-level deadlines.



**Figure 41: CPN Analysis Results: Trajectory Planner**

If the sensor update frequency is increased to 100 ms, the sensor begins to notify the planner at a much higher frequency than expected. If there is no down-sampling on the planner's side, every single update will be handled by the planner, leading to dangerous queue size growth on the planner. Figure 42 shows this deployment, as observed in the CPN analysis.

**Table 3: Trajectory Planner Example – Summary of Results**

| Operation Name             | Component Name     | Experimental WCET (ms) | Timing Analysis WCET (ms) | Deadline (ms) |
|----------------------------|--------------------|------------------------|---------------------------|---------------|
| Sensor Timer Operation     | Sensor             | 199.00875              | 199.05                    | 400           |
| Compute Operation          | Sensor             | 445.373417             | 448.95                    | 600           |
| State Subscriber Operation | Trajectory Planner | 446.874334             | 450.95                    | 1000          |



**Figure 42: CPN Analysis - Sensor firing too frequently**

### 7.3.5 Time-triggered Operations

Time-triggered operations are an integral part of our component model. DREMS components are dormant by default. A timer has to trigger a inactive component for all subsequent interactions to happen. Since the DREMS component model supports various scheduling schemes on a single component message queue, this following test evaluates a priority first-in first-out (PFIFO) scheme. Multiple timers are created in a single component, each with a unique priority and period. A timer with a high frequency is assigned a high priority. Figure 43 shows our experimental observations on a 5-timer example.

Since ROSMOD components are associated with a single executor thread and component operations are also non-preemptive, a low-priority operation could theoretically run forever, starving a higher priority operation from ever executing, leading to deadline violations e.g. *Timer\_1\_operation* can affect all other higher priority timers. Figure 44 shows our CPN prediction where such a scenario is evident. It can be seen that *Timer\_5\_operation*, the timer with the highest priority is periodically seeing spikes in execution time,

courtesy of other lower priority operations consuming CPU without preemption. Table 4 summarizes the results.



**Figure 43: Experimental Observation: Periodic Timers**



**Figure 44: CPN Analysis Results: Periodic Timers**

### 7.3.6 Long-Running Operations

Our ROSMOD component model implements a non-preemptive component operation scheduling scheme. A component operation that is in the queue, regardless of its priority, must wait for the currently executing operation to run to completion. This is a strict rule for operation scheduling and does not work best in all system designs e.g. in a long-running computation-intensive application, rejuvenating the executing operation periodically and

**Table 4: Periodic Timers – Summary of Results**

| Operation Name | Component Name | Experimental WCET (ms) | Timing Analysis WCET (ms) | Deadline (ms) |
|----------------|----------------|------------------------|---------------------------|---------------|
| Timer_1        | Component_1    | 243.160041             | 253.0                     | 300           |
| Timer_2        | Component_1    | 122.4165               | 138.0                     | 150           |
| Timer_3        | Component_1    | 62.789916              | 69.5                      | 75            |
| Timer_4        | Component_1    | 28.962792              | 29.54                     | 37.5          |
| Timer_5        | Component_1    | 5.15275                | 9.89                      | 10            |

restarting it at a previous checkpoint increases the likelihood of successfully completing the application execution. In applications executing long-running artificial intelligence (AI) search algorithms e.g. flight path planning algorithms, the computation should not hinder the prompt response requirements of highly critical operation requests such as sudden maneuver changes. Our ROSMOD component model does not support the *cancellation* of long-running component operations to service other highly critical operations waiting in the queue. With a few minor modifications to our scheduling schemes, long running operations can, however, be suspended if a higher priority waiting operation requires service. With these additions, we are able to model and analyze component-based systems that support long-running operations, with checkpoints, enabling the novel integration of AI-type algorithms into our design and analysis framework.

#### 7.3.6.1 Challenges

One of the primary challenges here is to identify the semantics of a long-running component operation i.e. the scenarios under which the component operations scheduler suspends a cooperating long-running operation in favor of some other operation waiting in the queue. If a long-running computation is modeled as a sequence of execution steps with bounded checkpoints, then the operation would execute one step at a time and suspend at such checkpoints if necessary. An important challenge here is accurately identifying the priority difference between the long-running operation and the waiting operation. If the long-running operation is one checkpoint away from completion e.g. 100-200 ms of execution time, then strictly following our suspension rules would not be the most prudent choice since this operation is almost complete. However, if the waiting operation is a critical one, then regardless of the state of the long-running operation, the executing operation must be suspended. Secondly, the modeled long-running computation semantics must be incorporated into our component model so that any analysis results obtained can be suitably validated.



**Figure 45: Long Running Operations - Timing Diagram**

### 7.3.6.2 Implementation and Results

In each long-running operation, we, therefore, include a synchronous *checkpoint step*, as shown in Figure 45. The only assumption we make about this long-running operation is the periodicity of these checkpoint steps i.e. we know how frequently a new checkpoint is reached and we assume that the search algorithm used by the long-running operation is capable of reaching a safe state (the checkpoint) before suspending itself if required. If a higher priority operation is ready and waiting in the queue, the long-running operation runs till the next checkpoint is reached, then suspends. The higher priority operation is then processed. Figure 46 shows the *Software Model* for a component assembly with long running operations.

## Package: three\_component\_example



**Figure 46: Long Running Operation - Software Model**

The assembly consists of three components. Components *Component\_1* and *Component\_2* periodically publish on the *ComponentName* message. *Component\_3* periodically queries the server in *Component\_2*. During these interactions, *Component\_1* is performing a long running operation, the duration of which, is magnitudes larger than the average execution time of all other operations. Figure 47 shows the execution time plot of this scenario, as measured on our testbed.



**Figure 47: Experimental Observation: Composed Component Assembly**

For the CPN analysis, in order to obtain pure execution times of all these operations, each operation on each component is executed as a stand-alone function on the hardware. This way, we know the average and worst-case execution times of all operational steps with minimal interruptions. These numbers are injected into our generated CPN and state space analysis is performed. Figure 48 shows our CPN analysis results for the same assembly.



**Figure 48: CPN Analysis Results: Composed Component Assembly**

## 7.4 Analysis Limitations

The business logic abstraction, as presented in Section 5.3, is quite simple. This model represents abstractly how long the different blocks of code in an operation take, but does not fully model its behavior e.g. each non-blocking local code block is represented by a single WCET and no other properties. Since variables local to these functional code blocks are not modeled, this grammar also does not model conditional behavior. So, a high-level controller component that executes an event-driven state machine cannot be sufficiently modeled with the current model. Moreover, depending on the local state of such code blocks, different interaction patterns may be invoked e.g. if this high-level controller was responsible for performing periodic image processing to identify an object, the image processing (and all the associated image processing) would stop once the object was identified. Since this conditional interaction is not modeled in the CPN business logic grammar, the

abstraction would assume that the image processing interactions always happen, instead of conditionally. Such abstractions can lead to gross over-estimation of the worst-case execution times of component operations. The primary reason for keeping the business logic grammar to this level of simplicity is to reduce the level of state space explosion i.e. modeling all local variables of an operation, and consequently the state changes for each of these variables would exponentially increase the size of the state space to be analyzed. Also, modeling the local variable state would also require (1) modeling the semantics of language with which the operation code was written, (2) evaluating all expressions using these local variables, and (3) calculating the result of all conditionals used in the business logic at all possible times. Such an analysis would be too refined, hard to implement and susceptible to semantic errors.

## CHAPTER VIII

### SUMMARY AND FUTURE WORK

*Summarize contributions first*

All of the results presented in this dissertation make an important assumption about the network – the network resources available to each component is much larger than the requirements of the application i.e. there are no buffering delays on the network queues when components periodically produce data. The current analysis model, in this respect, is quite lacking. When a component publishes a message on a topic, the analysis immediately generates a reception message that waits to enqueue on the receiver’s message queue. In reality, this interaction could be a lot more involved – the published message is sent to the kernel network queue on the sender’s side and removed from this queue following a data production *profile* i.e. available bandwidth as a function of time. When dequeued, the packets take a finite worst-case transmission time before being noticed on the receiver’s side. The buffering delays on the sender’s side and the transmission time on the network are completely ignored by the timing analysis model. In order to improve on this design, we have attempted to integrate existing Network Calculus-based analysis methods [30] into our CPN. Specifically, a place is added to model the *Network Queue* and a *Dequeue* transition fires when the network is ready to transport more packets from the sender. The dequeuing follows a strict network profile and ceases transmission when the data production rate is larger than the available bandwidth.

# **Appendices**

## APPENDIX A

### PUBLICATIONS

The full text in each of the following papers was reviewed by at least 3 reviewers.

#### 1.1 Workshop Papers

- P. S. Kumar, A. Dubey, and G. Karsai. Colored petri net-based modeling and formal analysis of component-based applications. In *11th Workshop on Model Driven Engineering, Verification and Validation MoDeVVA 2014*, page 79, 2014
- P. Kumar and G. Karsai. Integrated analysis of temporal behavior of component-based distributed real-time embedded systems. In *Object/Component/Service-Oriented Real-Time Distributed Computing Workshops (ISORCW), 2015 IEEE International Symposium on Real-time Computing (ISORC)*, pages 50–57, April 2015
- P. Kumar and G. Karsai. Experimental validation of timing analysis for component-based distributed real-time embedded systems. In *Object/Component/Service-Oriented Real-Time Distributed Computing Workshops (ISORCW), 2016 IEEE International Symposium on Real-time Computing (ISORC)*, May 2016

#### 1.2 Conference Papers

- P. Kumar, W. Emfinger, A. Kulkarni, G. Karsai, D. Watkins, B. Gasser, C. Ridgewell, and A. Anilkumar. ROSMOD: A Toolsuite for Modeling, Generating, Deploying, and Managing Distributed Real-time Component-based Software using ROS. In *Proceedings of the IEEE Rapid System Prototyping*, RSP 2015, Amsterdam, Netherlands, 2015. IEEE
- P. Kumar, W. Emfinger, and G. Karsai. A Testbed to Simulate and Analyze Resilient

Cyber-Physical Systems. In *Proceedings of the IEEE Rapid System Prototyping*, RSP 2015, Amsterdam, Netherlands, 2015. IEEE

- W. Emfinger, P. Kumar, A. Dubey, W. Otte, A. Gokhale, G. Karsai. DREMS: A Toolchain for the Rapid Application Development, Integration, and Deployment of Managed Distributed Real-time Embedded Systems. In *Proceedings of the IEEE Real-Time Systems Symposium*, RTSS@Work 2013, Vancouver, Canada, 2013. IEEE
- Balasubramanian, D., W. Emfinger, P. S. Kumar, W. Otte, A. Dubey, and G. Karsai. An application development and deployment platform for satellite clusters. In *Proceedings of the Workshop on Spacecraft Flight Software*, 2013
- Balasubramanian, D., A. Dubey, W. R. Otte, W. Emfinger, P. Kumar, and G. Karsai. A Rapid Testing Framework for a Mobile Cloud Infrastructure. In *Proceedings of the IEEE International Symposium on Rapid System Prototyping*, RSP, 2014. IEEE

### 1.3 Journal Papers

- D. Balasubramanian, A. Dubey, W. Otte, T. Levendovszky, A. Gokhale, P. Kumar, W. Emfinger, and G. Karsai. Drems ml: A wide spectrum architecture design language for distributed computing platforms. *Science of Computer Programming*, 2015
- Levendovszky, T., A. Dubey, W. R. Otte, D. Balasubramanian, A. Coglio, S. Nyako, W. Emfinger, P. Kumar, A. Gokhale, and G. Karsai. DREMS: A Model-Driven Distributed Secure Information Architecture Platform for Managed Embedded Systems. In *IEEE Software*, vol. 99: IEEE Computer Society, 2014. IEEE

#### 1.3.1 Awaiting Reviews

- P. Kumar, W. Emfinger, G. Karsai, D. Watkins, B. Gasser, and A. Anilkumar. ROS-MOD: A Toolsuite for Modeling, Generating, Deploying, and Managing Distributed

Real-time Component-based Software using ROS. In special issue of *Journal of Electronics* on Rapid System Design with Dedicated Architectures and Specific Software Tools, 2016.

## REFERENCES

- [1] Beaglebone Black. <http://beagleboard.org/BLACK/>.
- [2] NASA CubeSat Launch initiative. [https://www.nasa.gov/directories/heo/home/CubeSats\\_initiative.html](https://www.nasa.gov/directories/heo/home/CubeSats_initiative.html).
- [3] NASA CubeSats Mission to Mars. <http://www.nasa.gov/press-release/nasa-prepares-for-first-interplanetary-cubesats-on-a>
- [4] B. Alpern and F. B. Schneider. Verifying temporal properties without temporal logic. *ACM Trans. Program. Lang. Syst.*, 11(1):147–167, Jan. 1989.
- [5] R. Alur and D. L. Dill. A theory of timed automata. *Theoretical Computer Science*, 126:183–235, 1994.
- [6] D. P. Appenzeller and A. Kuehlmann. Formal verification of a powerpc microprocessor. In *Computer Design: VLSI in Computers and Processors, 1995. ICCD'95. Proceedings., 1995 IEEE International Conference on*, pages 79–84. IEEE, 1995.
- [7] ARINC Incorporated, Annapolis, Maryland, USA. *Document No. 653: Avionics Application Software Standard Interface (Draft 15)*, Jan. 1997.
- [8] Autosar GbR. AUTomotive Open System ARchitecture. <http://www.autosar.org/>.
- [9] D. Balasubramanian, A. Dubey, W. Otte, T. Levendovszky, A. Gokhale, P. Kumar, W. Emfinger, and G. Karsai. Drems ml: A wide spectrum architecture design language for distributed computing platforms. *Science of Computer Programming*, 2015.
- [10] F. Bause and P. S. Kritzinger. *Stochastic Petri Nets*. Springer, 1996.
- [11] D. Bell. Uml basics: An introduction to the unified modeling language. 2003.
- [12] J. Bengtsson, K. Larsen, F. Larsson, P. Pettersson, and W. Yi. *UPPAAL-a tool suite for automatic verification of real-time systems*. Springer, 1996.
- [13] S. Beydeda, M. Book, V. Gruhn, et al. *Model-driven software development*, volume 15. Springer, 2005.
- [14] N. S. Bjørner, Z. Manna, H. B. Sipma, and T. E. Uribe. Deductive verification of real-time systems using step. *Theoretical Computer Science*, 253(1):27–60, 2001.
- [15] B. W. Boehm. A spiral model of software development and enhancement. *Computer*, 21(5):61–72, 1988.

- [16] M. Broy and K. Stølen. *Specification and development of interactive systems: focus on streams, interfaces, and refinement*. Springer Science & Business Media, 2012.
- [17] M. Burke and N. Audsley. Distributed fault-tolerant avionic systems-a real-time perspective. *arXiv preprint arXiv:1004.1324*, 2010.
- [18] Y.-A. Chen, E. Clarke, P.-H. Ho, Y. Hoskote, T. Kam, M. Khaira, J. O’Leary, and X. Zhao. Verification of all circuits in a floating-point unit using word-level model checking. In *Formal Methods in Computer-Aided Design*, pages 19–33. Springer, 1996.
- [19] S. Christensen, L. M. Kristensen, and T. Mailund. *A sweep-line method for state space exploration*. Springer, 2001.
- [20] S. Christensen, L. M. Kristensen, and T. Mailund. A sweep-line method for state space exploration. In *Proceedings of the 7th International Conference on Tools and Algorithms for the Construction and Analysis of Systems*, TACAS 2001, pages 450–464, London, UK, UK, 2001. Springer-Verlag.
- [21] S. Clemens, G. Dominik, and M. Stephan. Component software: beyond object-oriented programming, 1998.
- [22] K. Correll, N. Barendt, and M. Branicky. Design considerations for software only implementations of the ieee 1588 precision time protocol. In *Conference on IEEE*, volume 1588, pages 11–15, 2005.
- [23] M. A. Cusumano. Reflections on the toyota debacle. *Commun. ACM*, 54(1):33–35, Jan. 2011.
- [24] A. David, J. Illum, K. G. Larsen, and A. Skou. Model-based framework for schedulability analysis using uppaal 4.1. *Model-based design for embedded systems*, 1(1):93–119, 2009.
- [25] R. David and H. Alla. Petri nets for modeling of dynamic systems: A survey. *Automatica*, 30(2):175–202, 1994.
- [26] G. A. A. F. De Cindio and G. Rozenberg. Object-oriented programming and petri nets. 2001.
- [27] A. Dubey, W. Emfinger, A. Gokhale, G. Karsai, W. Otte, J. Parsons, C. Szabo, A. Coglio, E. Smith, and P. Bose. A Software Platform for Fractionated Spacecraft. In *Proceedings of the IEEE Aerospace Conference, 2012*, pages 1–20, Big Sky, MT, USA, Mar. 2012. IEEE.
- [28] A. Dubey, A. Gokhale, G. Karsai, W. Otte, and J. Willemse. A Model-Driven Software Component Framework for Fractionated Spacecraft. In *Proceedings of the 5th*

- [29] A. Dubey, G. Karsai, and N. Mahadevan. A Component Model for Hard Real-time Systems: CCM with ARINC-653. *Software: Practice and Experience*, 41(12):1517–1550, 2011.
- [30] W. Emfinger, G. Karsai, A. Dubey, and A. Gokhale. Analysis, verification, and management toolsuite for cyber-physical applications on time-varying networks. In *Proceedings of the 4th ACM SIGBED International Workshop on Design, Modeling, and Evaluation of Cyber-Physical Systems*, CyPhy ’14, pages 44–47, New York, NY, USA, 2014. ACM.
- [31] T. L. et al. Distributed real-time managed systems: A model-driven distributed secure information architecture platform for managed embedded systems. *IEEE Software*, 31(2):62–69, 2014.
- [32] P. T. Eugster, P. A. Felber, R. Guerraoui, and A.-M. Kermarrec. The many faces of publish/subscribe. *ACM Computing Surveys (CSUR)*, 35(2):114–131, 2003.
- [33] P. H. Feiler, D. P. Gluch, and J. J. Hudak. The Architecture Analysis & Design Language (AADL): An Introduction. Technical Report ADA455842, DTIC Document, 2006.
- [34] M. Feilkas, A. Fleischmann, C. Pfaller, M. Spichkova, D. Trachtenherz, et al. A top-down methodology for the development of automotive software. 2009.
- [35] D. M. Gabbay, I. Hodkinson, M. Reynolds, and M. Finger. *Temporal logic: mathematical foundations and computational aspects*, volume 1. Clarendon Press Oxford, 1994.
- [36] A. German. Software static code analysis lessons learned. *Crosstalk*, 16(11), 2003.
- [37] S. Ghanavati, D. Amyot, and L. Peyton. Compliance analysis based on a goal-oriented requirement language evaluation methodology. In *2009 17th IEEE International Requirements Engineering Conference*, pages 133–142, Aug 2009.
- [38] C. Girault and R. Valk. *Petri nets for systems engineering: a guide to modeling, verification, and applications*. Springer Science & Business Media, 2013.
- [39] M. Gonzalez Harbour, J. Gutierrez Garcia, J. Palencia Gutierrez, and J. Drake Moyano. Mast: Modeling and analysis suite for real time applications. In *Real-Time Systems, 13th Euromicro Conference on*, 2001., pages 125–134, 2001.
- [40] M. J. Gordon and T. F. Melham. Introduction to hol a theorem proving environment for higher order logic. 1993.

- [41] G. T. Heineman and W. T. Councill. Component-based software engineering. *Putting the Pieces Together*, Addison-Westley, 2001.
- [42] J. L. Hennessy and D. A. Patterson. *Computer architecture: a quantitative approach*. Elsevier, 2011.
- [43] D. Henriksson, A. Cervin, and K.-E. Årzén. Truetime: Simulation of control loops under shared computer resources. In *Proceedings of the 15th IFAC World Congress on Automatic Control. Barcelona, Spain*, 2002.
- [44] D. Henriksson, A. Cervin, and K.-E. Årzén. Truetime: Real-time control system simulation with matlab/simulink. In *Proceedings of the Nordic Matlab conference*, 2003.
- [45] L. E. Holloway, B. H. Krogh, and A. Giua. A survey of petri net methods for controlled discrete event systems. *Discrete Event Dynamic Systems*, 7(2):151–190, 1997.
- [46] G. J. Holzmann. An analysis of bitstate hashing. *Formal methods in system design*, 13(3):289–307, 1998.
- [47] G. J. Holzmann. *The SPIN model checker: Primer and reference manual*, volume 1003. Addison-Wesley Reading, 2004.
- [48] F. Huber, B. Schätz, A. Schmidt, and K. Spies. Autofocusa tool for distributed systems specification. In *Formal Techniques in Real-Time and Fault-Tolerant Systems*, pages 467–470. Springer, 1996.
- [49] K. Jensen. Condensed state spaces for symmetrical coloured petri nets. *Formal Methods in System Design*, 9(1-2):7–40, 1996.
- [50] K. Jensen. An introduction to the practical use of coloured petri nets. In *Lectures on Petri Nets II: Applications*, pages 237–292. Springer, 1998.
- [51] K. Jensen and L. M. Kristensen. *Coloured Petri Nets - Modelling and Validation of Concurrent Systems*. Springer, 2009.
- [52] K. Jensen, L. M. Kristensen, and L. Wells. Coloured petri nets and cpn tools for modelling and validation of concurrent systems. *International Journal on Software Tools for Technology Transfer*, 9(3-4):213–254, 2007.
- [53] K. Jensen and G. Rozenberg. *High-level Petri nets: theory and application*. Springer Science & Business Media, 2012.
- [54] S. C. Johnson. *Lint, a C program checker*. Citeseer, 1977.
- [55] M. B. Jones. What really happened on mars, 1997.

- [56] M. Joseph and P. Pandya. Finding response times in a real-time system. *The Computer Journal*, 29(5):390–395, 1986.
- [57] M. H. Kim and Y.-D. Kim. Simulation-based real-time scheduling in a flexible manufacturing system. *Journal of manufacturing Systems*, 13(2):85–93, 1994.
- [58] M. Klein, T. Ralya, B. Pollak, R. Obenza, and M. G. Harbour. *A practitioners handbook for real-time analysis: guide to rate monotonic analysis for real-time systems*. Springer Science & Business Media, 2012.
- [59] M. Krichen and S. Tripakis. Black-box conformance testing for real-time systems. In *Model Checking Software*, pages 109–126. Springer, 2004.
- [60] L. M. Kristensen. State space methods for coloured petri nets. *DAIMI Report Series*, 29(546), 2000.
- [61] P. Kumar, W. Emfinger, and G. Karsai. Testbed to simulate and analyze resilient cyber-physical systems. In *Rapid System Prototyping, 2015. RSP '15.*, October 2015.
- [62] P. Kumar, W. Emfinger, A. Kulkarni, G. Karsai, D. Watkins, B. Gasser, C. Ridgewell, and A. Anilkumar. Rosmod: A toolsuite for modeling, generating, deploying, and managing distributed real-time component-based software using ros. In *Rapid System Prototyping, 2015. RSP '15.*, October 2015.
- [63] P. Kumar and G. Karsai. Integrated analysis of temporal behavior of component-based distributed real-time embedded systems. In *Object/Component/Service-Oriented Real-Time Distributed Computing Workshops (ISORCW), 2015 IEEE International Symposium on Real-time Computing (ISORC)*, pages 50–57, April 2015.
- [64] P. Kumar and G. Karsai. Experimental validation of timing analysis for component-based distributed real-time embedded systems. In *Object/Component/Service-Oriented Real-Time Distributed Computing Workshops (ISORCW), 2016 IEEE International Symposium on Real-time Computing (ISORC)*, May 2016.
- [65] P. S. Kumar, A. Dubey, and G. Karsai. Colored petri net-based modeling and formal analysis of component-based applications. In *11th Workshop on Model Driven Engineering, Verification and Validation MoDeVVA 2014*, page 79, 2014.
- [66] K. G. Larsen, P. Pettersson, and W. Yi. Uppaal in a nutshell. *International Journal on Software Tools for Technology Transfer (STTT)*, 1(1):134–152, 1997.
- [67] A. Ledeczi, M. Maroti, A. Bakay, G. Karsai, J. Garrett, C. Thomason, G. Nordstrom, J. Sprinkle, and P. Volgyesi. The generic modeling environment. In *Workshop on Intelligent Signal Processing*, 2001.
- [68] F. J. Lin, P. Chu, and M. T. Liu. Protocol verification using reachability analysis: the state space explosion problem and relief strategies. In *ACM SIGCOMM Computer*

- Communication Review*, volume 17, pages 126–135. ACM, 1987.
- [69] J.-L. Lions et al. Ariane 5 flight 501 failure, 1996.
  - [70] C. L. Liu and J. W. Layland. Scheduling algorithms for multiprogramming in a hard-real-time environment. *Journal of the ACM (JACM)*, 20(1):46–61, 1973.
  - [71] F. Liu, A. Narayanan, and Q. Bai. Real-time systems. 2000.
  - [72] P. Louridas. Static code analysis. *Software, IEEE*, 23(4):58–61, July 2006.
  - [73] P. S. Magnusson, M. Christensson, J. Eskilson, D. Forsgren, G. Hallberg, J. Hogberg, F. Larsson, A. Moestedt, and B. Werner. Simics: A full system simulation platform. *Computer*, 35(2):50–58, 2002.
  - [74] M. Maróti, T. Kecskés, R. Kereskényi, B. Broll, P. Völgyesi, L. Jurácz, T. Levente-dovszky, and Á. Lédeczi. Next generation (meta) modeling: Web-and cloud-based collaborative tool infrastructure. In *MPM@ MoDELS*, pages 41–60, 2014.
  - [75] M. A. Marsan, G. Balbo, G. Conte, S. Donatelli, and G. Franceschinis. *Modelling with generalized stochastic Petri nets*. John Wiley & Sons, Inc., 1994.
  - [76] V. Massol and T. Husted. *Junit in action*. Manning Publications Co., 2003.
  - [77] N. McKeown, T. Anderson, H. Balakrishnan, G. Parulkar, L. Peterson, J. Rexford, S. Shenker, and J. Turner. Openflow: enabling innovation in campus networks. *ACM SIGCOMM Computer Communication Review*, 38(2):69–74, 2008.
  - [78] K. L. McMillan. *Symbolic model checking*. Springer, 1993.
  - [79] K. L. McMillan. Getting started with smv. *Cadence Berkeley Laboratories*, 1999.
  - [80] J. L. Medina and A. G. Cuesta. From composable design models to schedulability analysis with uml and the uml profile for marte. *SIGBED Rev.*, 8(1):64–68, Mar. 2011.
  - [81] R. Milner. *The definition of standard ML: revised*. MIT press, 1997.
  - [82] J. Morrison and T. Nguyen. On-board software for the mars pathfinder microrover. In *Proceedings of the Second IAA International Conference on Low-Cost Planetary Missions*, 1996.
  - [83] T. Murata. Petri nets: Properties, analysis and applications. *Proceedings of the IEEE*, 77(4):541–580, 1989.
  - [84] N. Navet and F. Simonot-Lion. *Automotive embedded systems handbook*. CRC press, 2008.

- [85] Object Management Group. *UML Profile for Modeling and Analysis of Real-Time and Embedded systems (MARTE)*, OMG Document realtime/05-02-06 edition, May 2005.
- [86] W. R. Otte, A. Dubey, S. Pradhan, P. Patil, A. Gokhale, G. Karsai, and J. Willemse. F6COM: A Component Model for Resource-Constrained and Dynamic Space-Based Computing Environment. In *Proceedings of the 16th IEEE International Symposium on Object-oriented Real-time Distributed Computing (ISORC '13)*, Paderborn, Germany, June 2013.
- [87] S. Owre, J. M. Rushby, and N. Shankar. Pvs: A prototype verification system. In *Automated DeductionCADE-11*, pages 748–752. Springer, 1992.
- [88] J. C. Palencia and M. G. Harbour. Schedulability analysis for tasks with static and dynamic offsets. In *Real-Time Systems Symposium, 1998. Proceedings., The 19th IEEE*, pages 26–37. IEEE, 1998.
- [89] J. C. Palencia and M. G. Harbour. Exploiting precedence relations in the schedulability analysis of distributed real-time systems. In *Real-Time Systems Symposium, 1999. Proceedings. The 20th IEEE*, pages 328–339. IEEE, 1999.
- [90] J. L. Peterson. Petri nets. *ACM Computing Surveys (CSUR)*, 9(3):223–252, 1977.
- [91] P. Puschner and A. Burns. Guest editorial: A review of worst-case execution-time analysis. *Real-Time Systems*, 18(2-3):115–128, 2000.
- [92] M. Quigley, K. Conley, B. P. Gerkey, J. Faust, T. Foote, J. Leibs, R. Wheeler, and A. Y. Ng. Ros: an open-source robot operating system. In *ICRA Workshop on Open Source Software*, 2009.
- [93] R. R. Raje, J. I. Williams, and M. Boyles. Asynchronous remote method invocation (armi) mechanism for java. *Concurrency - Practice and Experience*, 9(11):1207–1211, 1997.
- [94] A. V. Ratzer, L. Wells, H. M. Lassen, M. Laursen, J. F. Qvortrup, M. S. Stissing, M. Westergaard, S. Christensen, and K. Jensen. Cpn tools for editing, simulating, and analysing coloured petri nets. In *Proceedings of the 24th International Conference on Applications and Theory of Petri Nets*, ICATPN’03, pages 450–462, Berlin, Heidelberg, 2003. Springer-Verlag.
- [95] J. Reineke, B. Wachter, S. Thesing, R. Wilhelm, I. Polian, J. Eisinger, and B. Becker. A definition and classification of timing anomalies.
- [96] W. Reisig. *Petri nets: an introduction*, volume 4. Springer Science & Business Media, 2012.
- [97] X. Renault, F. Kordon, and J. Hugues. Adapting models to model checkers, a case

- study : Analysing aadl using time or colored petri nets. In *Rapid System Prototyping, 2009. RSP '09. IEEE/IFIP International Symposium on*, pages 26–33, June 2009.
- [98] X. Renault, F. Kordon, and J. Hugues. From aadl architectural models to petri nets: Checking model viability. In *Object/Component/Service-Oriented Real-Time Distributed Computing, 2009. ISORC '09. IEEE International Symposium on*, pages 313–320, March 2009.
  - [99] R. C. Seacord. *Secure Coding in C and C++*. Pearson Education, 2005.
  - [100] B. Selic. A generic framework for modeling resources with uml. *Computer*, 33(6):64–69, 2000.
  - [101] G. Shelley and S. Forrest. Acl2 theorem prover.
  - [102] M. Simulink and M. Natick. The mathworks, 1993.
  - [103] F. Singhoff, J. Legrand, L. Nana, and L. Marcé. Cheddar: A flexible real time scheduling framework. In *Proceedings of the 2004 Annual ACM SIGAda International Conference on Ada: The Engineering of Correct and Reliable Software for Real-time & Distributed Systems Using Ada and Related Technologies, SIGAda '04*, pages 1–8, New York, NY, USA, 2004. ACM.
  - [104] A. P. Sistla and P. Godefroid. Symmetry and reduced symmetry in model checking. *ACM Transactions on Programming Languages and Systems (TOPLAS)*, 26(4):702–734, 2004.
  - [105] K. Tindell and J. Clark. Holistic schedulability analysis for distributed hard real-time systems. *Microprocessing and microprogramming*, 40(2):117–134, 1994.
  - [106] J. Waldo. Remote procedure calls and java remote method invocation. *Concurrency, IEEE*, 6(3):5–7, 1998.
  - [107] J. Wang. *Timed Petri nets: Theory and application*, volume 9. Springer Science & Business Media, 2012.
  - [108] N. Wang, D. C. Schmidt, A. Gokhale, C. Rodrigues, B. Natarajan, J. P. Loyall, R. E. Schantz, and C. D. Gill. QoS-enabled Middleware. In Q. Mahmoud, editor, *Middleware for Communications*, pages 131–162. Wiley and Sons, New York, 2004.
  - [109] M. Westergaard, S. Evangelista, and L. M. Kristensen. Asap: an extensible platform for state space analysis. In *Applications and Theory of Petri Nets*, pages 303–312. Springer, 2009.
  - [110] R. Wilhelm, J. Engblom, A. Ermedahl, N. Holsti, S. Thesing, D. Whalley, G. Bernat,

- C. Ferdinand, R. Heckmann, T. Mitra, et al. The worst-case execution-time problem overview of methods and survey of tools. *ACM Transactions on Embedded Computing Systems (TECS)*, 7(3):36, 2008.
- [111] T. Ylonen and C. Lonvick. The secure shell (ssh) protocol architecture. 2006.
  - [112] S. Yovine. Kronos: A verification tool for real-time systems. *International Journal on Software Tools for Technology Transfer (STTT)*, 1(1):123–133, 1997.
  - [113] M. Zhou and K. Venkatesh. *Modeling, simulation, and control of flexible manufacturing systems: a Petri net approach*, volume 6. World Scientific, 1999.
  - [114] A. Zimmermann and G. Hommel. A train control system case study in model-based real time system design. In *Parallel and Distributed Processing Symposium, 2003. Proceedings. International*, pages 8–pp. IEEE, 2003.
  - [115] A. Zoitl. *Real-time Execution for IEC 61499*. ISA, 2008.
  - [116] W. Zuberek. Timed petri nets definitions, properties, and applications. *Microelectronics Reliability*, 31(4):627–644, 1991.
  - [117] R. Zurawski and M. Zhou. Petri nets and industrial applications: A tutorial. *Industrial Electronics, IEEE Transactions on*, 41(6):567–583, 1994.