When you can’t use Ctrl-Alt-Del:
OS for critical systems
Autumn 2008

By Chris Hills

 

Chris Hills of Phaedrus Systems reviews your options when choosing an operating system for critical systems.

 

The wider use of embedded systems in just about everything, together with the relative fall in the cost of 32-bit MCUs, means we are getting more complex embedded and critical systems, most of which run under an operating system.

 

Operating systems are taken for granted by many programmers: they simply assume that there will be one. This produces a shock for many computer science graduates on their first embedded project… Linux is just not going to run in the 64K bytes available!

 

Equally, we all take operating systems for granted on the desktop. However, most of the systems we use are not real-time or safety-critical and usually get rebooted at least once in 24 hours. Most desktop operating systems have a lot of additional features and functions that are often not needed in a critical system, such as control for CD drives and hard disks, file viewers and indeed file systems. For a critical system, the RTOS or OS needs to do just the minimum required and no more. But it does need to do the task reliably and repeatedly.

 

Just as everyone has written their own compiler or graphics library, so many people have written their own OS; search the internet and there are hundreds of them. So what should you look for to sort out the sheep from the goats? And remember, when choosing your RTOS, you may live with it for some time: research suggests that once the investment is made in an RTOS, it will often dictate the choice of processor for subsequent projects.

 

I don’t propose to go through the Tanenbaum/Torvalds debate on monolithic vs micro kernels (whether to use a single monolithic operating system or a small kernel with functionality added through external processes) but for a critical system you only want the minimum of software and functionality. The less there is, the easier it is to test, and therefore to certify, if necessary, for critical use. (We will look at certification in a moment.) So start with a kernel where you can add functionality, but add only the functionality needed.

 

The principal object of an OS is to run several tasks, safely, and to ensure there are no memory conflicts or resource deadlocks. (I once stopped an entire system whilst waiting for a “Clear To Send” from a printer. Fortunately it was while training on device drivers.)
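
The printer anecdote is a wait with no upper bound. As a minimal sketch of the alternative, assuming a polled "ready" flag (the names `wait_ready_bounded`, `ready_fn` and the two test doubles are hypothetical, not taken from any particular RTOS), a bounded wait might look like this:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical status codes for this sketch. */
typedef enum { WAIT_OK = 0, WAIT_TIMEOUT = 1 } wait_status_t;

/* Callback that reports whether the device is ready (e.g. CTS asserted). */
typedef bool (*ready_fn)(void *ctx);

/*
 * Poll is_ready at most max_polls times. Returns WAIT_OK as soon as the
 * device reports ready, or WAIT_TIMEOUT if it never does, so the caller
 * can recover instead of stalling the whole system.
 */
wait_status_t wait_ready_bounded(ready_fn is_ready, void *ctx, uint32_t max_polls)
{
    assert(is_ready != NULL);
    for (uint32_t i = 0; i < max_polls; ++i) {
        if (is_ready(ctx)) {
            return WAIT_OK;
        }
        /* In a real RTOS a yield or a timed block would go here. */
    }
    return WAIT_TIMEOUT;
}

/* Test double: a device that never becomes ready. */
bool never_ready(void *ctx)
{
    (void)ctx;
    return false;
}

/* Test double: becomes ready once the counter in ctx has reached zero. */
bool ready_after_countdown(void *ctx)
{
    uint32_t *remaining = (uint32_t *)ctx;
    if (*remaining == 0u) {
        return true;
    }
    --*remaining;
    return false;
}
```

With `never_ready` the call returns `WAIT_TIMEOUT` rather than hanging. Most RTOS kernels offer a timed version of their blocking calls that serves the same purpose without busy polling.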

 

Some critical systems work by having multiple MCUs, with critical tasks running on one and non-critical tasks on others. Where this is not the case, you need to ensure that the tasks are kept completely separate. This is where the Memory Management Unit (MMU) comes in. The MMU is hardware in the MCU that translates the virtual addresses used by each process into physical memory addresses, ensuring that one process can never overwrite memory belonging to another. But this will only work if you have MMU support in the RTOS. Some do, some don’t and some have it as an option. Where it is an option, you should ensure that it can be added to the OS, or the OS swapped for the MMU version, with no other changes. This way parts of the software running on the MCU(s) can be certified, while other software, for non-critical aspects, need not be.
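
To make the translation step concrete, here is a small software model of what the MMU does in hardware. It is a sketch under stated assumptions: 4 KiB pages, a four-page virtual address space per process, and hypothetical names (`page_table_t`, `mmu_translate`). Real MMUs use multi-level tables and also enforce access permissions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SHIFT 12u                 /* 4 KiB pages (assumed) */
#define PAGE_SIZE  (1u << PAGE_SHIFT)
#define NUM_PAGES  4u                  /* tiny virtual space per process */

/* One page-table entry: is the page mapped, and to which physical frame? */
typedef struct {
    bool     valid;
    uint32_t frame;
} pte_t;

/* Each process gets its own table, set up by the OS. */
typedef struct {
    pte_t pages[NUM_PAGES];
} page_table_t;

/*
 * Translate a virtual address using the given process's page table.
 * Returns false (a "fault") if the page is outside the process's address
 * space or not mapped; the process cannot reach memory it was not given.
 */
bool mmu_translate(const page_table_t *pt, uint32_t vaddr, uint32_t *paddr)
{
    assert(pt != NULL && paddr != NULL);
    uint32_t vpn    = vaddr >> PAGE_SHIFT;        /* virtual page number */
    uint32_t offset = vaddr & (PAGE_SIZE - 1u);   /* offset within page  */

    if (vpn >= NUM_PAGES || !pt->pages[vpn].valid) {
        return false;
    }
    *paddr = (pt->pages[vpn].frame << PAGE_SHIFT) | offset;
    return true;
}
```

If the critical process maps its page 0 to frame 8 and the non-critical process maps its page 0 to frame 9, the same virtual address 0x0010 resolves to 0x8010 for one and 0x9010 for the other, and any access to an unmapped page faults.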

The other thing you need to look for is an RTOS design that has as little shared memory as possible, ensuring that tasks and data remain separate and cannot be overwritten or corrupted. Message-based systems are good for this, since only the pointer to the message has to pass between processes.
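
As an illustration of passing only a pointer, the following is a minimal single-threaded mailbox sketch. The names (`message_t`, `mailbox_t`, `mailbox_send`, `mailbox_receive`) and the sizes are hypothetical; in a real message-based RTOS the kernel owns the queue, protects it against concurrent access and transfers ownership of the message buffer from sender to receiver.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAILBOX_SLOTS 4u

/* A message buffer; only its address travels through the mailbox. */
typedef struct {
    uint16_t id;
    uint16_t length;
    uint8_t  payload[16];
} message_t;

/* Fixed-size ring buffer of message pointers. */
typedef struct {
    message_t *slots[MAILBOX_SLOTS];
    uint32_t   head;    /* index of the next message to receive */
    uint32_t   count;   /* number of messages currently queued  */
} mailbox_t;

void mailbox_init(mailbox_t *mb)
{
    assert(mb != NULL);
    mb->head = 0u;
    mb->count = 0u;
}

/* Queue a pointer to msg; returns false if the mailbox is full. */
bool mailbox_send(mailbox_t *mb, message_t *msg)
{
    assert(mb != NULL && msg != NULL);
    if (mb->count == MAILBOX_SLOTS) {
        return false;
    }
    uint32_t tail = (mb->head + mb->count) % MAILBOX_SLOTS;
    mb->slots[tail] = msg;
    mb->count++;
    return true;
}

/* Return the oldest queued message pointer, or NULL if the mailbox is empty. */
message_t *mailbox_receive(mailbox_t *mb)
{
    assert(mb != NULL);
    if (mb->count == 0u) {
        return NULL;
    }
    message_t *msg = mb->slots[mb->head];
    mb->head = (mb->head + 1u) % MAILBOX_SLOTS;
    mb->count--;
    return msg;
}
```

The sender should not touch a message after `mailbox_send` succeeds. That convention, rather than a shared global buffer, is what keeps the two tasks' data separate.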

 

In aerospace, defence, automotive or other safety-critical or high-reliability applications it is usually necessary that an embedded system conforms to certain standards, such as IEC 61508. For the system as a whole to be certified as conforming to the standard, the RTOS must also be certified, by an appropriate body, as reaching the required Safety Integrity Level (SIL). (A SIL is a measure of safety performance, on a scale of 1 to 4, with 4 being the most stringent.)

 

There are advantages to creating software to run in more than one path. All the safety-critical aspects can be gathered into one path for certification while the remaining software can remain uncertified. For example, a serious problem for certification is communications protocols like TCP/IP and USB: while, in theory, these are both possible to certify, in practice the very dynamic nature of these protocols, creating and killing objects frequently, produces a virtually unlimited number of states, making it close to impossible to examine every state, and thus to certify. However, these elements can be moved on to the non-critical software path, and the MMU ensures that the two paths do not interact except under precisely defined and certifiable circumstances.
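
A rough calculation shows how quickly the state count grows. The sketch below assumes, purely for illustration, that each dynamic object is independent and has a fixed number of states, so the combined count is that number raised to the power of the object count. The function name `combined_states` is hypothetical. A real protocol stack also carries buffers, timers and sequence numbers, so this understates the problem.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Number of combined states for `objects` independent objects that each
 * have `states_per_object` states: states_per_object ^ objects.
 * Saturates at UINT64_MAX instead of overflowing.
 */
uint64_t combined_states(uint64_t states_per_object, unsigned objects)
{
    assert(states_per_object > 0u);
    uint64_t total = 1u;
    for (unsigned i = 0; i < objects; ++i) {
        if (total > UINT64_MAX / states_per_object) {
            return UINT64_MAX;   /* more states than a 64-bit count can hold */
        }
        total *= states_per_object;
    }
    return total;
}
```

Taking the 11 connection states of the TCP state machine defined in RFC 793, `combined_states(11, 8)` gives 214,358,881 combinations for just eight simultaneous connections, before any other state is considered.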

 

If certification of the RTOS is necessary, then the developer can either buy from a supplier with certification, or carry out a certification exercise in-house. Non-certified code requires a full test of software and there may be a need to re-engineer the RTOS for documentation, to meet design and requirements specifications and to meet modelling specifications.

 

Buying in saves a LOT of time but you still need your own procedures and processes in place for the rest of the development. Some people actually believe that if they buy a certified RTOS that is all they need and the project is somehow certified. All the RTOS certification does is to say that the RTOS code has been developed to the relevant standard with the correct procedures and tests. It means you don’t have to test it again nor produce your own design specs.

 

The company you buy your RTOS from should also be certified in general for having the correct procedures, methods and processes in place and the RTOS itself should be certified as a specific project. This will be for a specific compiler and target MCU combination, and the system will need retesting for any other compiler and MCU combinations.

 

Taking an RTOS designed for applications that are not system critical and then converting it can be problematic, so you should normally look for an RTOS that was designed from the ground up to be a critical system RTOS.

 

Even if you do not need a certified RTOS, there can be benefits in buying your RTOS from a supplier of certified RTOSs, since they will have all the processes for software development certified and audited to the appropriate SIL. In many ways the difference between what you are buying and what the buyer of the certified RTOS is paying a lot more for is a bundle of very expensive documents. Much of the software will be the same in both the certified and non-certified RTOS.

 

So, in summary, for safety-critical projects, looking carefully at the RTOS can be a way to bring the project, and later projects, to market faster and with higher quality.

 

 

Author Details and contact

 

Eur Ing Chris Hills BSc CEng MIET MBCS MIEEE FRGS FRSA is a Technical Specialist and can be reached at This Contact

 

Copyright Chris A Hills 2003-2008
The right of Chris A Hills to be identified as the author of this work has been asserted by him in accordance with the Copyright, Designs and Patents Act 1988.