Fault Tolerant Systems

Learn how to make your systems work under unexpected and faulty conditions.

The goal of the course is to introduce the basics of fault tolerance as a means of improving the dependability of systems in the presence of faults, and to relate that dependability to system safety and functional safety. Basic concepts of the design and implementation of fault tolerance mechanisms in general systems are introduced. Students are familiarized with the quantitative and qualitative methods used to evaluate specific fault tolerance principles. Specific classes of fault tolerance are addressed, such as physical fault tolerance (adding redundant physical components), information fault tolerance (error detecting and correcting codes), and temporal fault tolerance (retry mechanisms). After the course, students will be able to understand and perform key alterations to system functions, components, or mechanisms, adding an appropriate level of redundancy to achieve the expected system reliability despite residual faults in the system design.

Learning Outcomes:

By the end of this course, students will be able to:

  • Understand the main system dependability concepts and correlate system dependability with system safety and functional safety;

  • Understand system integrity and safety integrity concepts;

  • Understand threats to system dependability (such as faults, errors, or failures), stemming from residual design faults or security issues;

  • Perform basic analysis of high-level composite system diagrams and identify key areas for dependability improvement, through fault prevention/avoidance, fault removal, fault tolerance, or fault forecasting;

  • Design fault tolerance mechanisms in appropriate stages, starting from error detection, through damage assessment and error recovery, to fault treatment;

  • Determine the level of redundancy required for the target reliability of a system, and alter its high-level design by means of physical fault tolerance – performing static, dynamic, and hybrid redundancy calculations;

  • Design fault tolerant information exchange, by understanding and applying appropriate information fault tolerance mechanisms, such as channel and error models, detection/correction codes, and retry mechanisms;

  • Understand the relation between fault tolerance and system repair.
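
As a rough illustration of the static redundancy calculations mentioned above, the following minimal sketch compares a single component with a triple modular redundant (2-of-3) arrangement. It assumes a constant failure rate, independent component failures, and a perfect voter; the function names and the numeric values are hypothetical and chosen only for illustration, and they are not taken from the course material.

```python
import math


def component_reliability(failure_rate_per_hour: float, hours: float) -> float:
    """Reliability R(t) = exp(-lambda * t), assuming a constant failure rate."""
    return math.exp(-failure_rate_per_hour * hours)


def k_of_n_reliability(k: int, n: int, r: float) -> float:
    """Probability that at least k of n identical, independent components work.

    Assumes a perfect (never failing) voter or switch, which is a simplification.
    """
    return sum(
        math.comb(n, i) * r**i * (1 - r) ** (n - i)
        for i in range(k, n + 1)
    )


def tmr_reliability(r: float) -> float:
    """Closed form for 2-of-3 majority voting: 3r^2 - 2r^3."""
    return 3 * r**2 - 2 * r**3


if __name__ == "__main__":
    # Illustrative values only (assumed, not from the course).
    failure_rate = 1e-4   # failures per hour
    mission_time = 1000   # hours
    r = component_reliability(failure_rate, mission_time)
    print(f"single component: {r:.4f}")
    print(f"TMR (2-of-3):     {tmr_reliability(r):.4f}")
```

Under these assumptions, redundancy helps only while the component reliability stays above 0.5; below that point the 2-of-3 arrangement is less reliable than a single component, which is one reason the required level of redundancy has to be calculated rather than guessed.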

Hardware (required): Computer with Internet connection, working speakers and microphone.

Software: Chrome browser.

Course Typically Offered: Live Online in Fall quarter (September - mid December) and Winter quarter (mid January - March).

Prerequisites: Students should have previous knowledge of selected mathematics topics, such as basic differential equations, matrix operations, Galois fields, vector spaces, and Boolean algebra. Students should also have basic knowledge of system and safety engineering and of system reliability concepts, such as failure probability, reliability, failure rate, constant failure rate, MTTF, and FIT. Ideally, students should have completed the course “NIT-FSBA-01: Systems, Functions and Safety”.

Next Step: To gain full insight into system and functional safety basics, proceed to the course NIT-FSBA-03: Safety Analysis Methods. To expand your knowledge of functional safety in automotive engineering, consider taking the courses NIT-FSBA-04: Managing Quality, Processes and Projects in Automotive and NIT-FSBA-05: Functional Safety Standards in Automotive.

Course Number: NIT-FSBA-03

Duration: 3.00 units (~30 live teaching hours, ~60 hours of individual practice and preparation work)

Offered next: Contact us!

Class type: Live Online Intensive (according to the schedule published at the beginning of the course, approximately 3x2 live classes per week)

Instructor: To be announced

How to join: Google Meet (link will be available upon enrollment), NIT Canvas

How to apply: Please apply by filling out the form here and we will get in touch with you as soon as possible.

Customized schedule for your company or team (call for price)

Class type: Live Online (Regular or Intensive), Live Bootcamp (Company premises)

Instructor: To be announced

For groups and organizations: please contact us directly via the contact form here to arrange this course around your schedule, needs, and participant list.