CoOS Real Time Kernel (CooCox.org) – Part 3

Lesson learned: the presence of an RTOS doesn’t mean you will get better real-time performance for your application! I know it sounds counterintuitive, but it is true. A hard real-time system, where interrupts must be serviced as quickly as they arrive, has to rely on small, fast interrupt service routines only, and not on operating system services that may be served with longer latencies. Most embedded programmers know this, but in the rush of writing applications and meeting tight deadlines they forget basic things like “keep the darn ISRs short”.

After using CoOS for a few months in a relatively complex project, I must say that this kernel does a very good job on its intended target: the ARM Cortex M3 processor. As I mentioned before, the CoOS kernel isn’t perfect, in the sense that heavily loaded systems may experience issues like lost interrupts. After additional testing, I was surprised to find that more established kernels (µC/OS II, FreeRTOS) exhibit worse latency-related problems. As usual, these problems can be solved by designing the application differently and by reducing the rate of the interrupts that use kernel services, since those are the most likely causes of system misbehavior.

For example, if you have a system with serial interfaces working at 115200 bps in interrupt mode, the time between interrupts coming from one UART is only ~87 µs. If the ISR sets a semaphore through a system call, about 1.8 µs will be added to the ISR execution time. On top of that, the UART ISR time will occasionally add to the system timer ISR time, delaying system services processing and task switches. In some circumstances this may work fine and no abnormal behavior will be observed, for instance when the system uses one high-speed UART as the main source of rapid interrupts. Adding another similar UART (or more) can push the system over the limit, and problems like missed system timer interrupts will arise. In such cases the embedded developer can spend days looking for the source of the errors.

One way to solve this problem is to avoid using any system calls (including those designated as safe for ISR use) in the interrupt service routines that are triggered by fast peripherals (e.g. SPI, I2C or UARTs at high data rates). This approach may work very well if the ISR code is short and executes quickly.
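
A minimal sketch of what such a system-call-free ISR could look like: the ISR only stores the received byte in a ring buffer, and a task drains the buffer later. All names here (rx_push_from_isr, rx_pop, RX_BUF_SIZE) are hypothetical and not part of CoOS. The sketch assumes a single core, exactly one producer (the ISR) and one consumer (the task).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define RX_BUF_SIZE 64u   /* hypothetical size; must be a power of two */
static_assert((RX_BUF_SIZE & (RX_BUF_SIZE - 1u)) == 0u,
              "RX_BUF_SIZE must be a power of two");

static volatile uint8_t  rx_buf[RX_BUF_SIZE];
static volatile uint32_t rx_head;   /* written only by the ISR  */
static volatile uint32_t rx_tail;   /* written only by the task */

/* Called from the (hypothetical) UART receive ISR with the byte just read
 * from the data register. No kernel call and no interrupt masking. */
static inline bool rx_push_from_isr(uint8_t byte)
{
    uint32_t head = rx_head;
    if ((head - rx_tail) == RX_BUF_SIZE) {
        return false;                       /* buffer full: byte is dropped */
    }
    rx_buf[head & (RX_BUF_SIZE - 1u)] = byte;
    rx_head = head + 1u;                    /* publish only after the store */
    return true;
}

/* Called from task context, for example once per pass of a polling loop. */
static inline bool rx_pop(uint8_t *out)
{
    uint32_t tail = rx_tail;
    if (tail == rx_head) {
        return false;                       /* nothing received yet */
    }
    *out = rx_buf[tail & (RX_BUF_SIZE - 1u)];
    rx_tail = tail + 1u;
    return true;
}
```

Because each index has a single writer, neither side needs a lock. The price is that the task has to poll the buffer (or be woken by some slower, less frequent event) instead of being signaled by the kernel on every byte.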

Another way is to reduce the interrupt rate by using Direct Memory Access (DMA) or hardware FIFOs if available. Most modern microcontrollers, especially in the Cortex M3 class, will give you one option or the other. For example, a UART with a 16-byte FIFO operating at 115200 bps can stretch the interval between interrupts from ~87 µs to ~1.39 ms! Higher data rates (e.g. >1 Mbps) will only be possible with the help of hardware. Remember, no RTOS kernel will enable better responsiveness for such fast events; only the hardware can help.
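
The two figures quoted above can be reproduced with a few lines of C. This is only a back-of-the-envelope helper: it assumes 8N1 framing (10 bits on the wire per byte) and a FIFO that raises one interrupt per 16 received bytes. The function name is made up for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Bits on the wire per byte for 8N1 framing: 1 start + 8 data + 1 stop.
 * The framing is an assumption; only the bit rate is given in the text. */
#define BITS_PER_FRAME 10u

/* Time between receive interrupts in nanoseconds when the UART raises one
 * interrupt every `bytes_per_irq` received bytes (1 means no FIFO). */
static uint32_t uart_irq_interval_ns(uint32_t baud, uint32_t bytes_per_irq)
{
    assert(baud != 0u);
    uint64_t bits = (uint64_t)BITS_PER_FRAME * bytes_per_irq;
    return (uint32_t)((bits * 1000000000ull) / baud);
}
```

With these assumptions, uart_irq_interval_ns(115200, 1) gives 86805 ns (about 87 µs) and uart_irq_interval_ns(115200, 16) gives 1388888 ns (about 1.39 ms). At 1 Mbps without a FIFO the interval shrinks to 10 µs.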

As a rule of thumb, whenever the interrupt rate is higher than ~100 kHz, do not rely on RTOS services for that interrupt at all.

While investigating the responsiveness of CoOS 1.12, I uncovered a problem in the way the kernel handles service requests that come from user ISRs through isr_Function() calls. Let’s first examine how service request handling is implemented.

The service requests are placed synchronously in a queue that is processed either at the end of the ISR, if there is no conflict with the OS (no task scheduling in progress), or asynchronously in the SysTick handler when scheduling is taking place. Because in more than 90% of cases there is no conflict with the OS, the requests are served promptly, which explains why even a heavily loaded system works fine most of the time. Once a request cannot be served immediately and stays in the queue, the latency increases to as much as one SysTick period. Assuming a SysTick every 1 ms (the minimum supported by CoOS), the response time may be too slow for the interrupt rate of a given peripheral: the service request queue fills up quickly and the system may start missing events. One way to compensate is to increase the size of the service request queue, which is 4 items by default. With enough RAM one can try 10 to 20 items with some hope for improvement.
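
The mechanism described above can be modeled on a PC in a few lines of C. This is a simplified model for illustration, not the CoOS source: the names (srv_request_from_isr, srv_drain, scheduler_busy, count_event) are hypothetical, and the model is not interrupt-safe because it only runs as ordinary host code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SRV_QUEUE_LEN 4u   /* default queue length quoted in the post */
static_assert(SRV_QUEUE_LEN > 0u, "queue needs at least one slot");

typedef void (*srv_fn)(void *arg);              /* hypothetical request type */
typedef struct { srv_fn fn; void *arg; } srv_req;

static srv_req  srv_queue[SRV_QUEUE_LEN];
static uint32_t srv_count;          /* requests currently waiting            */
static bool     scheduler_busy;     /* stand-in for "scheduling in progress" */

/* Serve all waiting requests in arrival order. In the model this runs either
 * at ISR exit or from the tick handler. Returns the number served. */
static uint32_t srv_drain(void)
{
    uint32_t handled = srv_count;
    for (uint32_t i = 0; i < handled; i++) {
        srv_queue[i].fn(srv_queue[i].arg);
    }
    srv_count = 0;
    return handled;
}

/* Model of an ISR-level service call: queue the request, then serve it right
 * away unless the scheduler is busy. Returns false if the request is lost. */
static bool srv_request_from_isr(srv_fn fn, void *arg)
{
    if (srv_count == SRV_QUEUE_LEN) {
        return false;               /* queue full: the event is missed */
    }
    srv_queue[srv_count].fn  = fn;
    srv_queue[srv_count].arg = arg;
    srv_count++;
    if (!scheduler_busy) {
        (void)srv_drain();          /* common case: served at ISR exit */
    }
    return true;
}

/* Example handler: counts how many requests were actually served. */
static void count_event(void *arg)
{
    (*(uint32_t *)arg)++;
}
```

If, in the worst case, every request arriving during one 1 ms tick has to wait, a single UART interrupting every ~87 µs produces about 11 requests per tick. That is well beyond the 4 default slots, which is consistent with trying 10 to 20 items.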

Unfortunately, version 1.12 of CoOS contained a mistake in the way the service request queue was implemented. When I reported the problem to the authors, they fixed it in version 1.13, which seems to function properly now. I’m sure that my solution to this problem isn’t optimal, and it probably violates what the authors intended when they claimed that the interrupt latency in CoOS is 0! Such a claim assumes that interrupts are kept enabled 100% of the time. My solution uses a short critical section that disables interrupts briefly while the service queue is accessed. If no system services are invoked in ISRs, the original claim stands; it no longer holds when isr_Function() calls are used in user ISRs. However, the latency penalty isn’t significant in most practical situations. I am convinced that a better implementation of the service queue management is possible using the mutual exclusion primitives that the Cortex M3 provides at the machine level.
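
To illustrate that last idea, here is a sketch of how a queue slot could be claimed without disabling interrupts, written with portable C11 atomics. On a Cortex M3, a compiler would typically turn the compare-and-swap loop into the exclusive-access instructions LDREX/STREX. This is neither my fix nor the CoOS 1.13 code; the names srv_pending and srv_claim_slot are invented for the example.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define SRV_QUEUE_LEN 4u    /* same default length as quoted in the post */
static_assert(SRV_QUEUE_LEN > 0u, "queue needs at least one slot");

static atomic_uint srv_pending;     /* number of slots already claimed */

/* Try to claim one queue slot without masking interrupts.
 * On success *slot receives the claimed index and the function returns true.
 * If all slots are taken it returns false and *slot is left untouched. */
static bool srv_claim_slot(uint32_t *slot)
{
    unsigned int n = atomic_load(&srv_pending);
    do {
        if (n >= SRV_QUEUE_LEN) {
            return false;           /* queue full */
        }
        /* If another context changed srv_pending in the meantime, the
         * exchange fails, n is reloaded with the current value, and the
         * loop retries with that value. */
    } while (!atomic_compare_exchange_weak(&srv_pending, &n, n + 1u));

    *slot = n;
    return true;
}
```

This covers only the reservation step. A complete queue would still need a way for the consumer to tell when a claimed slot has actually been filled (a per-slot ready flag, for instance) and a matching way to release slots after they are served.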

In conclusion, the current version of CoOS (1.13) seems to be stable enough for complex projects and I encourage other embedded developers to give it a try.

Please post any issues you find here and I will do my part in continuing the investigations and report my findings in the future.

2 comments to CoOS Real Time Kernel (CooCox.org) – Part 3

  • Alex

    Just small correction.
    With 115200 bps UART time between interrupts will be 87 us, not 8.7 us. And with FIFO it will be 1.4 ms. I’d say, very light load for Cortex.

    Personally, I don’t think that servicing system requests in the SysTick handler is good idea.
    With high-speed peripherals like Ethernet/CAN/SPI you can’t be sure that queue will not become full.

    Regards.

  • admin

    You are right (fixed). It is 87 us and not 8.7 us. Unfortunately, the processor I’ve been using in my projects does not have a UART with a hardware FIFO. So, the interrupt interval was the same as the byte time (87 us). In fact, in the stress test I activated three UARTs simultaneously – this was the moment I noticed missing events from time to time (at intervals varying from a few minutes to tens of minutes).

    The system request servicing implemented in CoOS is a bit more complex: most of the time the requests are serviced immediately at the exit of the ISR. However, if the kernel is in the middle of scheduling tasks, the request servicing is delayed until the next SysTick. This delay may vary between almost 0 and the system timer interval. It can be a problem in situations like the ones you suggested, with very fast peripherals. This is one of the reasons I suggested reducing the interrupt rate significantly by using what the hardware provides: FIFO or DMA. Or avoid using system services completely in your ISRs.

    Otherwise, CoOS is responsive and reliable. Other RTOS kernels are subject to similar issues despite their different implementations and internal architectures.