| ... | @@ -20,7 +20,6 @@ Dependability of a computing system is the ability to deliver service that can j |
... | @@ -20,7 +20,6 @@ Dependability of a computing system is the ability to deliver service that can j |
|
|
The function of a system is what the system is intended to do, as described by the functional specification. A system failure occurs when the service delivered does not comply with the specification. The system state is the set of the component states.
|
|
The function of a system is what the system is intended to do, as described by the functional specification. A system failure occurs when the service delivered does not comply with the specification. The system state is the set of the component states.
|
|
|
|
|
|
|
|
*An error is a system state that may lead to failure. An error is detected if an error message or signal is produced within the system; otherwise it is latent. A fault is the cause of an error; it is active when it results in an error and dormant otherwise.
|
|
*An error is a system state that may lead to failure. An error is detected if an error message or signal is produced within the system; otherwise it is latent. A fault is the cause of an error; it is active when it results in an error and dormant otherwise.
|
|
|
|
|
|
|
|
*Fault tolerance is the ability of a system to deliver correct service in the presence of faults <ref>A. Avizienis. Fault Tolerant Systems, IEEE Trans. Computers Vol C-25 No 12. 1976</ref>. This is achieved by error processing (removing the system error state) and by treating the source of the fault. The ability to detect and process error states and to assess their consequences is a critical requirement of fault-tolerant design.
|
|
*Fault tolerance is the ability of a system to deliver correct service in the presence of faults <ref>A. Avizienis. Fault Tolerant Systems, IEEE Trans. Computers Vol C-25 No 12. 1976</ref>. This is achieved by error processing (removing the system error state) and by treating the source of the fault. The ability to detect and process error states and to assess their consequences is a critical requirement of fault-tolerant design.
|
|
|
Fault tolerance, in both hardware and software, is achieved through some kind of redundancy. Hardware redundancy techniques often use multiple identical units together with a means of arbitrating their outputs. ECC memory, for example, uses a few extra bits to detect and correct errors resulting from faults in individual storage bits. Replicating identical units does not help against software design faults, however: running the same input data through a faulty software module multiple times yields the same erroneous result each time. Software fault tolerance is therefore built by applying algorithmic diversity, that is, by computing results through independent paths and then judging the results. This adds complexity to the system as a whole. Adding software fault tolerance improves system reliability only if the gains from the added redundancy are not offset by new faults introduced by the additional, parallel code.
|
|
Fault tolerance, in both hardware and software, is achieved through some kind of redundancy. Hardware redundancy techniques often use multiple identical units together with a means of arbitrating their outputs. ECC memory, for example, uses a few extra bits to detect and correct errors resulting from faults in individual storage bits. Replicating identical units does not help against software design faults, however: running the same input data through a faulty software module multiple times yields the same erroneous result each time. Software fault tolerance is therefore built by applying algorithmic diversity, that is, by computing results through independent paths and then judging the results. This adds complexity to the system as a whole. Adding software fault tolerance improves system reliability only if the gains from the added redundancy are not offset by new faults introduced by the additional, parallel code.
|
|
|
|
|
|
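The idea of algorithmic diversity with a judging step can be illustrated with a minimal sketch. It is not taken from universAAL: the three variants, the deliberately injected fault, and the `vote` function are all hypothetical names chosen for illustration, and the judge shown here is a plain strict-majority voter, which is only one of several possible adjudication schemes.

```python
from collections import Counter
from typing import Callable, Optional, Sequence

Variant = Callable[[Sequence[int]], int]


def mean_v1(values: Sequence[int]) -> int:
    """Variant 1 (hypothetical): integer mean via the built-in sum."""
    return sum(values) // len(values)


def mean_v2(values: Sequence[int]) -> int:
    """Variant 2 (hypothetical): integer mean via an explicit running total."""
    total = 0
    for v in values:
        total += v
    return total // len(values)


def mean_v3_faulty(values: Sequence[int]) -> int:
    """Variant 3 (hypothetical): wrong divisor, simulating a design fault."""
    return sum(values) // (len(values) + 1)


def vote(variants: Sequence[Variant], values: Sequence[int]) -> Optional[int]:
    """Run every variant and return the strict-majority result, or None."""
    results = []
    for variant in variants:
        try:
            results.append(variant(values))
        except Exception:
            # A variant that raises simply casts no vote.
            continue
    if not results:
        return None
    winner, count = Counter(results).most_common(1)[0]
    # The majority is measured against all variants, not only the survivors.
    return winner if count > len(variants) / 2 else None


if __name__ == "__main__":
    data = [10, 20, 30]
    # Diverse variants: the two correct ones outvote the faulty one.
    print(vote([mean_v1, mean_v2, mean_v3_faulty], data))
    # Plain replication of the faulty variant: the fault is not masked.
    print(vote([mean_v3_faulty] * 3, data))
```

Replicating the faulty variant three times produces a unanimous but wrong answer, which is why diversity rather than repetition is needed for software. The voter itself is additional code and can contain faults of its own, which is the trade-off noted above.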
| ... | @@ -82,7 +81,7 @@ The main components for the diagnosis infrastructure for universAAL are as follo |
... | @@ -82,7 +81,7 @@ The main components for the diagnosis infrastructure for universAAL are as follo |
|
|
! align="left" bgcolor="#DDDDDD" colspan="2" | Artifact: ''Failure Diagnosis Module in universAAL''
|
|
! align="left" bgcolor="#DDDDDD" colspan="2" | Artifact: ''Failure Diagnosis Module in universAAL''
|
|
|
|-
|
|
|-
|
|
|
| GIT Address
|
|
| GIT Address
|
|
|
| http://forge.universaal.org/svn/uaal_context/trunk/ctxt.reliability.reasoner
|
|
| http://github.com/universAAL/context/tree/master/ctxt.reliability.reasoner
|
|
|
|-
|
|
|-
|
|
|
| Javadoc
|
|
| Javadoc
|
|
|
| http://depot.universaal.org/hudson/job/context/javadoc/
|
|
| http://depot.universaal.org/hudson/job/context/javadoc/
|
| ... | @@ -460,7 +459,7 @@ The integrated diagnosis framework uses the power of the Context bus in universA |
... | @@ -460,7 +459,7 @@ The integrated diagnosis framework uses the power of the Context bus in universA |
|
|
|
|
|
|
|
[[https://raw.githubusercontent.com/wiki/universAAL/middleware/DiagnosisFramework.png|600px|center]]
|
|
[[https://raw.githubusercontent.com/wiki/universAAL/middleware/DiagnosisFramework.png|600px|center]]
|
|
|
|
|
|
|
|
From the context bus, the context events related to faults are taken as symptoms of a failure. These symptoms are analyzed using a priori knowledge of the FCR and the related static knowledge of the associated failure mode. The Reliability Reasoner then queries these symptoms with the help of the KB (Knowledge Base) and the [http://forge.universaal.org/wiki/ontologies:Dependability# Dependability Ontology]. The symptoms can be analyzed using either a rule-based approach or a simple SPARQL query. The rules for the failure analysis reside inside the Reliability Reasoner. The reasoner then publishes a context event carrying the diagnosis information to the context bus. This diagnosis information includes the actions to be taken for the specific failure modes of that specific FCR.
|
|
From the context bus, the context events related to faults are taken as symptoms of a failure. These symptoms are analyzed using a priori knowledge of the FCR and the related static knowledge of the associated failure mode. The Reliability Reasoner then queries these symptoms with the help of the KB (Knowledge Base) and the [https://github.com/universAAL/ontology/wiki/Dependability Dependability Ontology]. The symptoms can be analyzed using either a rule-based approach or a simple SPARQL query. The rules for the failure analysis reside inside the Reliability Reasoner. The reasoner then publishes a context event carrying the diagnosis information to the context bus. This diagnosis information includes the actions to be taken for the specific failure modes of that specific FCR.
|
|
|
|
|
|
|
|
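The rule-based path described above (symptoms in, diagnosis with recommended action out) can be sketched roughly as follows. This is an illustration only and does not use the universAAL context bus, the Dependability Ontology, or the actual Reliability Reasoner API: the `Symptom`, `Rule`, and `Diagnosis` types, the symptom kinds, the failure modes, and the actions are all assumptions made up for the example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Symptom:
    fcr: str    # identifier of the FCR the symptom was observed in
    kind: str   # hypothetical symptom type, e.g. "heartbeat_missed"
    count: int  # number of observations in the current window


@dataclass(frozen=True)
class Diagnosis:
    fcr: str
    failure_mode: str
    action: str


@dataclass(frozen=True)
class Rule:
    failure_mode: str
    matches: Callable[[List[Symptom]], bool]


# Static knowledge (hypothetical): failure mode -> action to be taken.
ACTIONS: Dict[str, str] = {
    "crash": "restart component",
    "babbling": "isolate component",
}

# Rules (hypothetical): each maps a pattern of symptoms to a failure mode.
RULES: List[Rule] = [
    Rule("crash", lambda group: any(
        s.kind == "heartbeat_missed" and s.count >= 3 for s in group)),
    Rule("babbling", lambda group: any(
        s.kind == "event_flood" for s in group)),
]


def diagnose(symptoms: List[Symptom]) -> List[Diagnosis]:
    """Group symptoms by FCR and apply the first matching rule per FCR."""
    by_fcr: Dict[str, List[Symptom]] = {}
    for s in symptoms:
        by_fcr.setdefault(s.fcr, []).append(s)

    diagnoses = []
    for fcr, group in by_fcr.items():
        for rule in RULES:
            if rule.matches(group):
                diagnoses.append(
                    Diagnosis(fcr, rule.failure_mode, ACTIONS[rule.failure_mode]))
                break  # first matching rule wins for this FCR
    return diagnoses


if __name__ == "__main__":
    observed = [
        Symptom("sensor-1", "heartbeat_missed", 3),
        Symptom("sensor-2", "event_flood", 1),
        Symptom("sensor-3", "heartbeat_missed", 1),  # below threshold
    ]
    for d in diagnose(observed):
        print(d)
```

In the framework described here, the returned diagnoses would correspond to the diagnosis information that the reasoner publishes back to the context bus. A SPARQL-based variant would replace the Python predicates with queries against the knowledge base.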
==Artefact #2 : Error Detection Unit ==
|
|
==Artefact #2 : Error Detection Unit ==
|
|
|
|
|
|
| ... | @@ -477,13 +476,13 @@ Because of its importance in fault tolerance operation, an Error detection frame |
... | @@ -477,13 +476,13 @@ Because of its importance in fault tolerance operation, an Error detection frame |
|
|
! align="left" bgcolor="#DDDDDD" colspan="2" | Artifact: ''Error Detection Unit''
|
|
! align="left" bgcolor="#DDDDDD" colspan="2" | Artifact: ''Error Detection Unit''
|
|
|
|-
|
|
|-
|
|
|
| GIT Address
|
|
| GIT Address
|
|
|
| http://forge.universaal.org/svn/uaal_context/trunk/ctxt.error.detection.unit
|
|
| http://github.com/universAAL/context/tree/master/ctxt.error.detection.unit
|
|
|
|-
|
|
|-
|
|
|
| Javadoc
|
|
| Javadoc
|
|
|
|
|
|
|
|
|
|
|-
|
|
|-
|
|
|
| Design Diagrams
|
|
| Design Diagrams
|
|
|
| [http://forge.universaal.org/wiki/https://raw.githubusercontent.com/wiki/universAAL/middleware/Physical_distribution_of_EDU.png Physical Distribution of EDU], [http://forge.universaal.org/wiki/https://raw.githubusercontent.com/wiki/universAAL/middleware/Conceptual_model_of_EDU.png Conceptual Model of EDU], [http://forge.universaal.org/wiki/https://raw.githubusercontent.com/wiki/universAAL/middleware/Data_structure_in_EDU.png Data Structure in EDU], [http://forge.universaal.org/wiki/https://raw.githubusercontent.com/wiki/universAAL/middleware/Event_list_calendar.png Event List Calendar]
|
|
| [https://raw.githubusercontent.com/wiki/universAAL/middleware/Physical_distribution_of_EDU.png Physical Distribution of EDU], [https://raw.githubusercontent.com/wiki/universAAL/middleware/Conceptual_model_of_EDU.png Conceptual Model of EDU], [https://raw.githubusercontent.com/wiki/universAAL/middleware/Data_structure_in_EDU.png Data Structure in EDU], [https://raw.githubusercontent.com/wiki/universAAL/middleware/Event_list_calendar.png Event List Calendar]
|
|
|
|-
|
|
|-
|
|
|
| Reference Documentation
|
|
| Reference Documentation
|
|
|
|
|
|
|
|