One of the most delicate problems in embedded programming is ensuring that asynchronous access to a shared resource (a variable, an object, a peripheral) does not lead to unpredictable behavior. On paper, the concept of mutual exclusion seems logical, and most people perceive it as common sense. But when it comes to implementation, many programmers struggle with the side effects of our “single threaded” mindset.
In the following picture two independent processes, A and B, compete for an object that both can access between the moments t1 and t2. The two processes can be either an interrupt and the main thread, or two tasks in a multitasking environment. The conflict comes from the fact that accessing the object takes a finite, non-zero amount of time and the moments of preemption are completely unpredictable. Some may argue that most of the time the two processes will not access the resource at the same time, so the probability of a conflict is almost 0. Well, “most of the time” and “is almost 0” aren’t good enough. They have to become “all of the time” and “is always 0” in order to have a reliable and deterministic system.
Another category of people may think that accessing the object is a basic CPU feature that translates into some indivisible micro-operation sequence that cannot be disturbed (i.e. interrupted). Nothing could be further from the truth. In reality, even simple operations like incrementing a counter variable may translate into more than one machine instruction. For example, process B executing:
    pcom_buf->rx_count++;
    if (pcom_buf->rx_count >= MAX_COUNT) {
        . . .
translates (for an ARM Cortex-M3 processor) into something like:
    LDRH    R3,[R0, #+28]   ; load "pcom_buf->rx_count"
    ADDS    R3,R3,#+1       ; increment its value
    STRH    R3,[R0, #+28]   ; put the result back
    CMP     R3,#+10         ; compare with MAX_COUNT
    BCC.N   . . .
Now assume that process A (e.g. an interrupt) preempts process B (e.g. main thread) somewhere between the instructions above, and executes:
pcom_buf->rx_count--;
What will be the value of pcom_buf->rx_count seen by process B in the if test?
What does the programmer expect the value of pcom_buf->rx_count to be in the if test?
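One way to answer these two questions is to replay the interleaving on a desktop compiler. The sketch below is only a simulation, not target code: DemoBuf, demo_isr() and demo_increment() are hypothetical names invented for this illustration, and the "interrupt" is an ordinary function call placed at a chosen point between the load, add and store steps listed above.

```c
#include <stdint.h>

/* Hypothetical stand-in for COM_RX_BUFFER; only the counter matters here. */
typedef struct {
    uint16_t rx_count;
} DemoBuf;

/* Process A (the simulated interrupt): pcom_buf->rx_count--; */
static void demo_isr(DemoBuf *pcom_buf)
{
    pcom_buf->rx_count--;
}

/*
 * Process B, written out as the load / add / store steps shown above.
 * preempt_after selects where the simulated interrupt fires:
 *   0 = never, 1 = after the load, 2 = after the add, 3 = after the store.
 * Returns the register copy that process B compares with MAX_COUNT.
 */
static uint16_t demo_increment(DemoBuf *pcom_buf, int preempt_after)
{
    uint16_t r3;

    r3 = pcom_buf->rx_count;                      /* LDRH */
    if (preempt_after == 1) demo_isr(pcom_buf);
    r3 = (uint16_t)(r3 + 1);                      /* ADDS */
    if (preempt_after == 2) demo_isr(pcom_buf);
    pcom_buf->rx_count = r3;                      /* STRH */
    if (preempt_after == 3) demo_isr(pcom_buf);
    return r3;                                    /* CMP tests R3, not memory */
}
```

Starting from rx_count = 5, a preemption after the load (or after the add) leaves 6 in memory: the decrement made by process A is overwritten and lost, although one increment plus one decrement should give 5. A preemption after the store leaves the correct 5 in memory, but the if test still compares the stale register value 6.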
Imagine that in place of a simple counter a more complex object, like our Rx FIFO, has to be shared. The number of machine instructions generated for the access will increase and the probability of conflict will increase as well.
This simple example shows why mutual exclusion is such an important aspect of embedded programming.
The most logical way to solve this problem is to make sure that only one process can access the shared object at a time, and that this access is complete or atomic (indivisible). The section of code we need to protect from concurrent access is known as a critical section.
As for the implementation, most programmers will disable the interrupts before the access and re-enable them when the access is done. The method works well and without significant performance penalties as long as the critical section is short and the delay introduced by temporarily disabling the interrupts is reasonable. The following picture shows the result of disabling interrupts while accessing the shared object:
The result is obvious: the two processes access the object sequentially, and process A is delayed slightly because process B keeps the CPU locked while the critical section is executed.
Most modern compilers provide support for manipulating the interrupt system, in one of the following forms: special macros, library functions, inline assembly code or intrinsic functions. Any of them is fine as long as it is used wisely. For example, with GCC for ARM the CMSIS headers provide __disable_irq() and __enable_irq(), while IAR EWARM offers the intrinsic functions __disable_interrupt() and __enable_interrupt().
Let’s examine a critical section that is protected by using this method:
    U8 Com_ReadRxByte (COM_RX_BUFFER *pcom_buf)
    {////////////////////////////////////////////////////////////////////////////////
        U8 b_val;                                       // byte holder

        __disable_interrupt ();                         // enter the critical section
        b_val = pcom_buf->rx_buf[pcom_buf->rx_head];    // extract one byte and
        pcom_buf->rx_head = (pcom_buf->rx_head + 1) %   // advance the head index and wrap
                            pcom_buf->rx_buf_size;      // around if past end of buffer
        pcom_buf->rx_count--;                           // decrement the byte counter
        __enable_interrupt ();                          // exit the critical section
        return b_val;                                   // return the extracted byte
    } ///////////////////////////////////////////////////////////////////////////////
The statements between the functions __disable_interrupt() and __enable_interrupt() will have exclusive control of the CPU while the critical section is executed: their result is always predictable.
Is it good enough now? It is definitely better than before but still not perfect.
What happens if the caller disabled the interrupts before invoking Com_ReadRxByte()? By executing __enable_interrupt() before the return statement, Com_ReadRxByte() re-enables the interrupts unconditionally, while the caller still believes they are disabled. Imagine a situation where the caller enters its own critical section, then invokes Com_ReadRxByte(), executes a few more statements after that and, finally, exits the critical section: the last part of the caller's critical section remains unprotected.
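The scenario can be reproduced on a desktop compiler with a simulated interrupt flag. This is only a sketch under that assumption: bug_irq_enabled, bug_disable_interrupt(), bug_enable_interrupt(), bug_read_rx_byte() and bug_caller_tail_protected() are hypothetical names standing in for the real intrinsics, for Com_ReadRxByte() and for its caller.

```c
#include <stdbool.h>

/* Simulated global interrupt state; a stand-in for the real CPU flag. */
static bool bug_irq_enabled = true;

static void bug_disable_interrupt(void) { bug_irq_enabled = false; }
static void bug_enable_interrupt(void)  { bug_irq_enabled = true;  }

/* Callee with an unconditional disable/enable pair. */
static void bug_read_rx_byte(void)
{
    bug_disable_interrupt();
    /* ... access the shared buffer ... */
    bug_enable_interrupt();     /* re-enables even if the caller had disabled */
}

/* Caller with its own critical section.
 * Returns true if the statements after the nested call were still protected. */
static bool bug_caller_tail_protected(void)
{
    bool tail_protected;

    bug_disable_interrupt();             /* caller enters its critical section */
    bug_read_rx_byte();                  /* nested call */
    tail_protected = !bug_irq_enabled;   /* state seen by the remaining statements */
    bug_enable_interrupt();              /* caller exits its critical section */
    return tail_protected;
}
```

In this simulation bug_caller_tail_protected() returns false: after the nested call returns, the interrupts are already enabled again, so the rest of the caller's critical section runs unprotected.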
There are several ways to fix this, but all have something in common: we need to know the status of the interrupt system before calling __disable_interrupt(). Saving the interrupt state and later restoring it solves this final problem. I will define a few macros to deal with this (unfortunately, the definitions are processor and compiler dependent):
    #define INTR_STAT_STOR          INTR_STAT __intr_stat__         // this is storage declaration!
    #define IRQ_DISABLE_SAVE()      __intr_stat__ = __get_PRIMASK (); \
                                    __disable_interrupt ()
    #define IRQ_ENABLE_RESTORE()    __set_PRIMASK (__intr_stat__)
These macros, written for IAR EWARM 5.xx, will work fine on any Cortex-M3 microcontroller. The first macro (INTR_STAT_STOR) declares a local variable of type INTR_STAT (actually U32) that is used to save the contents of the PRIMASK special register, whose bit 0 masks all interrupts with configurable priority when it is set. The IRQ_DISABLE_SAVE() macro reads the PRIMASK register and stores its contents in the __intr_stat__ variable declared before, then disables the interrupts. The IRQ_ENABLE_RESTORE() macro reloads the PRIMASK register with its saved value, restoring the previous interrupt status.
Now we can put everything together:
    U8 Com_ReadRxByte (COM_RX_BUFFER *pcom_buf)
    {////////////////////////////////////////////////////////////////////////////////
        U8 b_val;                                       // byte holder
        INTR_STAT_STOR;                                 // interrupt status storage

        IRQ_DISABLE_SAVE ();                            // enter the critical section
        b_val = pcom_buf->rx_buf[pcom_buf->rx_head];    // extract one byte and
        pcom_buf->rx_head = (pcom_buf->rx_head + 1) %   // advance the head index and wrap
                            pcom_buf->rx_buf_size;      // around if past end of buffer
        pcom_buf->rx_count--;                           // decrement the byte counter
        IRQ_ENABLE_RESTORE ();                          // exit the critical section
        return b_val;                                   // return the extracted byte
    } ///////////////////////////////////////////////////////////////////////////////
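To check that the save/restore pattern nests correctly without target hardware, it can be exercised against a simulated PRIMASK. Everything prefixed with sim_ or SIM_ below is a hypothetical stand-in invented for this sketch; it mirrors the macros above but is not the IAR code. The do { } while (0) wrapper is an addition of mine that lets the two-statement macro behave as a single statement, for instance inside an unbraced if.

```c
#include <stdint.h>

/* Simulated PRIMASK: bit 0 set means interrupts are masked (disabled). */
static uint32_t sim_primask = 0;

static uint32_t sim_get_primask(void)         { return sim_primask; }
static void     sim_set_primask(uint32_t val) { sim_primask = val & 1u; }
static void     sim_disable_interrupt(void)   { sim_primask = 1u; }

/* Host-side equivalents of INTR_STAT_STOR, IRQ_DISABLE_SAVE, IRQ_ENABLE_RESTORE. */
#define SIM_INTR_STAT_STOR        uint32_t sim_intr_stat
#define SIM_IRQ_DISABLE_SAVE()    do { sim_intr_stat = sim_get_primask(); \
                                       sim_disable_interrupt(); } while (0)
#define SIM_IRQ_ENABLE_RESTORE()  sim_set_primask(sim_intr_stat)

/* Callee protected with save/restore, in the style of Com_ReadRxByte(). */
static void sim_read_rx_byte(void)
{
    SIM_INTR_STAT_STOR;

    SIM_IRQ_DISABLE_SAVE();
    /* ... access the shared buffer ... */
    SIM_IRQ_ENABLE_RESTORE();   /* restores whatever state the caller had */
}

/* Caller with its own critical section.
 * Returns the PRIMASK value seen by the statements after the nested call. */
static uint32_t sim_caller_tail_primask(void)
{
    uint32_t tail;
    SIM_INTR_STAT_STOR;

    SIM_IRQ_DISABLE_SAVE();     /* caller enters its critical section */
    sim_read_rx_byte();         /* nested critical section */
    tail = sim_get_primask();   /* still 1: interrupts remain masked */
    SIM_IRQ_ENABLE_RESTORE();   /* back to the state before the caller's entry */
    return tail;
}
```

Here sim_caller_tail_primask() returns 1, meaning the caller's remaining statements still run with interrupts masked, and the simulated PRIMASK is back to 0 once the caller leaves its own critical section.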
We will return to the problem of mutual exclusion later, in the context of a real-time multitasking environment. In the meantime, try to put everything to work in your projects and post your comments or questions here. Good luck!