Introduction

Way back in 2009 I took an elective subject at uni called Embedded Software. It was a great subject that gave you enough to get started but mostly required long hours in the lab to figure out how to get things "working". Now, I'm not saying that it was the hardest subject in the world (in fact I think that most universities in the world probably offer more difficult versions of this subject) but it was very interesting and very practical.

I intended to write this blog post just after finishing the subject; however, a rather busy second half of 2009 (thanks, thesis) and my "last summer" before working life conspired to prevent me from getting around to it. But now I am.

There are three main topics I want to write about and all three pertain to the final project, which was to program the Freescale MC9S12 microcontroller to be an Arbitrary Waveform Generator (AWG). First, I want to briefly outline the method I used for constructing the sine wave. Second, I want to discuss some of the nuances that I found many of my fellow students didn't really understand about the hard real-time requirements of the system. Third, I want to cover the method used to fulfill the project requirement to generate additive gaussian white noise in a fixed point microcontroller.

Background

The lecturer running the subject, Dr Peter McLean, has structured the subject excellently to allow gradual introduction to embedded programming. The subject focuses on incrementally developing the functionality for the "Modular Controller" boards (a bespoke board with the MC9S12 at its heart). Those of you who have taken or are taking the subject will be aware of these boards; those of you who aren't can always look here if you are interested.

Each week in the labs you implement more of the functionality, usually building on the previous week. I can't remember exactly but you progress through:

Getting serial comms to work / Getting watchdog timer to work
Getting clocks and timers to work
Implementing interrupt service routines (ISR)
Getting flash storage to work
Getting Serial Peripheral Interface (SPI) to work
Talking to the Analogue / Digital Converter (ADC)
Building a PC interface to the board

The Project

The final part of the subject is a major project that builds on the previous lab work but includes more demanding requirements (and significantly less help).

As I already mentioned, our project was to create an Arbitrary Waveform Generator. There were many requirements in our project specification but in summary a fully complete project would:

Support the generation of multiple waves: Sine, Square, Sawtooth, Triangle, Noise and an "Arbitrary Wave", that is changeable and loaded in by the user.
Support output of different waves on the 2 DAC output channels separately
Support frequencies from 100Hz to 0.1Hz in 0.1Hz steps
Support Amplitudes from X to Y in Z increments
Support DC Offset from X to Y in Z increments
Support user changing frequency, amplitude, DC offset, wave type and On/Off for each channel
Use the supplied bespoke Real-Time Operating System to implement the functionality using "threads" to ensure the hard real-time requirements were met.

The Sine Wave

This is going to be quite a brief coverage of this topic, as most of my work stemmed from this Application Note document by Freescale. Also, before we start looking at how the sine wave worked in my project I will make one comment that will no doubt inflame some people. In my opinion, one of the things that makes an engineer effective is the ability to reuse when possible. Many students (especially those in IT and Computer Science) like solving algorithm problems from scratch. While I also like the intellectual challenge these problems represent, when you are required to solve a problem with limited resources (esp. time) and you think it is likely that this problem has been solved in some form or another, it is well worth you spending a few hours trying to find a shortcut. The reasons for this approach abound and include:

It saves time and reduces risk if you find an exact solution
A solution of any sort has probably been tested and had some of its problems/bugs/gotchas ironed out or found
A partial solution or a solution in a different context probably lends you a hand in making (crucial early) decisions in developing your own solution
Builders of software notoriously underestimate the amount of time needed to build brand new stuff, especially if it is complex (you always estimate in your head on the non-exception straightforward case, forget the debugging and testing required for all those edge cases you didn't think of, ...)
Generally, the people who have solved these problems are way smarter than you and/or have devoted their lives to these problems (see people like Knuth)

Ok, now that's out of the way we can move on to the fundamental problem at hand - designing our software to deliver a sine wave with frequencies that vary from 0.1Hz to 100Hz in 0.1Hz steps.

You will notice that my definition of the problem pretty much ignores the issues of amplitude, DC offset and wave type. My justification for this is that the hardest part of this problem is getting the frequency part correct and the other two components are easy:

the amplitude and DC offset are just fancy arithmetic once the frequency issue is solved
the wave type problem (if solved for the most difficult case) is also just some fancy arithmetic once the frequency issue is solved (or is actually solvable in a very lazy fashion, as I did, but more on that later)

To address this problem we need to hold a few requirements and facts in our head at once:

The specification calls for 1000 different frequencies ranging from 0.1 through to 100 Hz. It should be evident that one 8-bit (0-255) byte is insufficient to store the range of frequencies required, however 16 bits is more than enough (0-65535). It is worth noting that one 8-bit byte is sufficient to store the integer range of frequencies (1,2,..99,100), as this is key to our choice of representation later on.
The maximum frequency is 100Hz. By the Nyquist rate this would require reconstruction at a rate of at least 2f, or 200Hz. The period of 200Hz = (1/200) = 5ms. This implies that we need to output a sample at least every 5ms if we want to reconstruct a 100Hz signal. However, as you probably know, the Nyquist rate is a "limit" and the resulting waveform probably wouldn't be a very good representation. To solve this issue, reconstructing with samples every 1ms would theoretically allow us to reconstruct signals up to (1/0.001)/2 = 500Hz and would provide good fidelity across the 0.1-100Hz range.
The solution to the sine wave problem (and most likely the arbitrary wave) seems to call for a lookup table that holds a reference copy of the wave (and is indexed in a looping fashion).
Frequency of a wave is essentially just a constantly changing phase (or a constantly changing period offset) : f(t) = sin(wt+a), usually t changes, but a could also be varied, right? ;-p
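To put some numbers on the sample rate point above, here is a minimal sketch (not project code; the function name and the tenths-of-a-hertz units are assumptions made for illustration) that counts how many output samples land in one cycle of the wave:

```c
#include <stdint.h>

/* Number of output samples per cycle of the generated wave.
 * sample_rate_hz: reconstruction rate (1000 for one sample every 1ms)
 * freq_tenths_hz: wave frequency in units of 0.1Hz (1000 means 100Hz),
 *                 must be non-zero
 * Both parameter names are illustrative assumptions. */
uint32_t samples_per_cycle(uint32_t sample_rate_hz, uint32_t freq_tenths_hz)
{
    return (sample_rate_hz * 10u) / freq_tenths_hz;
}
```

With these inputs, samples_per_cycle(200, 1000) gives 2 (the bare Nyquist minimum at 100Hz), samples_per_cycle(1000, 1000) gives 10, and samples_per_cycle(1000, 1) gives 10000 for the 0.1Hz case.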

The Sine Wave reference table / lookup

The application note from Freescale explains this far more thoroughly than I will and I really suggest you read it if you are tackling a similar problem.

My solution (and theirs) uses three main components/ideas:

A reference sine wave in a table that is 256 bytes in length.
A 16 bit (2 byte) accumulator style variable for storing the current phase of the output wave (and it is essentially used as the index into the reference table).
A constant sample/reconstruction rate (one sample every 1ms as mentioned above).

Sine Wave Reference Table

Calculating values for a sine wave is actually relatively difficult, especially in a fixed point microcontroller with only floating point emulation and a specification that has demanding requirements for calculation time and accuracy. As mentioned in the application note and other research, and as was apparent from my own analysis, storing a sine wave with enough values to be representative is one of the simplest solutions to this problem.

My choice of a table 256 values in length was also to help simplify the design for the following reasons:

256 values is enough to get a pretty good representation of a sine wave (not too large, not too "chunky"/quantised)
256 is the perfect length for using a single 8-bit byte as the index to the table (0-255), as the byte will loop/modulo count itself by overflowing
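The overflow trick in the second point can be sketched in a few lines. This is only an illustration (the function name is hypothetical and uint8_t stands in for the UINT8 type used in the project):

```c
#include <stdint.h>

/* Advance an 8-bit table index by `step` entries. uint8_t arithmetic is
 * modulo 256, which matches the table length, so the index wraps back
 * to the start of the table without an explicit bounds check. */
uint8_t next_index(uint8_t index, uint8_t step)
{
    return (uint8_t)(index + step);
}
```

For example, next_index(255, 1) is 0 and next_index(250, 10) is 4, so no "if past the end" test is ever needed.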

It should be noted that this is not the only solution to this problem. One of my classmates, Deon Poncini, wrote his own Taylor Series approximation of a Sine Wave and built a custom length (max 1000) table for his waves.
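For the curious, a Taylor series table builder might look something like the sketch below. To be clear, this is not Deon's code and not how my table was produced; the names, the 256 entry length and the "unsigned byte centred on 128" encoding are all assumptions made purely for illustration, and it is the sort of thing you would run once rather than every millisecond.

```c
#include <stdint.h>

#define TABLE_LENGTH 256
#define HALF_PI 1.5707963267948966

/* Taylor series for sin(x) up to the x^11 term; accurate to better
 * than 1e-7 for 0 <= x <= pi/2, which is the only range used here. */
static double taylor_sin(double x)
{
    double x2 = x * x;
    double term = x;    /* current term of the series */
    double sum = x;
    int n;

    for (n = 1; n <= 5; n++)
    {
        /* next term = -previous * x^2 / ((2n) * (2n + 1)) */
        term = -term * x2 / (double)((2 * n) * (2 * n + 1));
        sum += term;
    }
    return sum;
}

/* Fill table[] with one cycle of a sine wave as unsigned 8-bit values
 * centred on 128 (an assumed encoding, chosen for illustration). */
void build_sine_table(uint8_t table[TABLE_LENGTH])
{
    int i;

    for (i = 0; i < TABLE_LENGTH; i++)
    {
        int quadrant = i / 64;                  /* 0..3 */
        double x = HALF_PI * (i % 64) / 64.0;   /* angle within the quadrant */
        double s;

        /* Fold every quadrant back onto 0..pi/2 using symmetry */
        s = (quadrant % 2 == 0) ? taylor_sin(x) : taylor_sin(HALF_PI - x);
        if (quadrant >= 2)
        {
            s = -s;
        }
        table[i] = (uint8_t)(128.0 + 127.0 * s + 0.5);
    }
}
```

Folding everything back onto the first quarter of the wave keeps the series short, because the Taylor approximation gets worse the further x is from zero.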

The 2 byte accumulator

As mentioned above, the table length I chose was perfect for indexing with a single 8-bit byte. However, as also mentioned, one 8-bit byte is insufficient to store the frequency with enough fidelity to represent the 1000 different frequencies required. What to do, what to do?

The answer is actually to store a "floating point equivalent representation" in a fixed point holder. Once again, this is discussed in the application note (and elsewhere in the interwebs).

It even comes with its own notation (that is actually pretty flexible). I chose to store my values in a 16-bit/2byte holder as a 16Q8 value (which means 16 bits in total, with 8 bits "after the decimal point").
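As a rough sketch of what 16Q8 means in practice, the conversion in each direction is just a multiply or divide by 256. The function names are hypothetical, and I am assuming this sort of floating point conversion would live on the PC side rather than on the microcontroller:

```c
#include <stdint.h>

/* Convert a frequency in Hz to 16Q8: 8 integer bits, 8 fractional bits.
 * The fractional remainder is truncated, not rounded. */
uint16_t hz_to_16q8(double hz)
{
    return (uint16_t)(hz * 256.0);
}

/* Convert a 16Q8 value back to Hz. */
double hz_from_16q8(uint16_t q)
{
    return (double)q / 256.0;
}
```

So 38.8Hz becomes 9932, and converting 9932 back gives 38.796875Hz: the smallest step this format can represent is 1/256 of a hertz.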

To simplify handling this representation I defined a union in C that allows direct access to either byte and use of the entire value as one.

typedef union
{
    UINT16 l;
    struct
    {
        UINT8 Hi;
        UINT8 Lo;
    } s;
} TUINT16;

The HIGH byte is then used as the index into the table (as it represents the units) and the LOW byte stores the fraction.
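One caveat worth knowing: which byte of the union lands in "Hi" depends on the byte order of the CPU. Declaring Hi first matches a big-endian part such as the S12 family; on a little-endian machine the two fields would be swapped. If you want the same split on any machine, shifts and masks do the job. A minimal sketch, with hypothetical helper names:

```c
#include <stdint.h>

/* Integer part of a 16Q8 value (the table index). */
uint8_t q8_hi(uint16_t value)
{
    return (uint8_t)(value >> 8);
}

/* Fractional part of a 16Q8 value, in 1/256ths. */
uint8_t q8_lo(uint16_t value)
{
    return (uint8_t)(value & 0xFFu);
}

/* Rebuild the 16-bit value from its two bytes. */
uint16_t q8_join(uint8_t hi, uint8_t lo)
{
    return (uint16_t)(((uint16_t)hi << 8) | lo);
}
```

The union is still the cheaper option on the target itself, since reading a byte of the value costs nothing, which is presumably why it is such a common idiom in embedded code.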

The other tricky part with this is that the current position in the wave is calculated by updating an accumulator register with the phase-shift/delta that would be expected from the set-point frequency and an assumed constant reconstruction rate. The following piece of code illustrates how the "Delta" value is calculated and stored from the parameters provided from the PC (over the RS-232 interface in this case).

TUINT16 newFrequency;
newFrequency.s.Hi = Packet_Parameter3;
newFrequency.s.Lo = Packet_Parameter2;

Analog_Output[activeChannel].Frequency = newFrequency.l; // comes as freq * 256

Analog_Output[activeChannel].Delta = (UINT16) ((256 * (UINT32) Analog_Output[activeChannel].Frequency) / 1000); // 1ms = 1000/sec
Analog_Output[activeChannel].PeriodOffset.l = 0;
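Pulled out on its own, the Delta calculation looks like the sketch below (the function and macro names are made up for this illustration, and uint16_t/uint32_t stand in for UINT16/UINT32). The reasoning is: a wave of f Hz has to cover f * 256 table entries per second, which is (f * 256) / 1000 entries per 1ms sample, and because the incoming frequency is already multiplied by 256 the result comes out as a 16Q8 table position.

```c
#include <stdint.h>

#define SAMPLES_PER_SECOND 1000u  /* one output every 1ms */
#define TABLE_LENGTH       256u   /* entries in the reference table */

/* Phase step per sample, as a 16Q8 table position.
 * freq_16q8 is the frequency in Hz multiplied by 256. */
uint16_t delta_from_frequency(uint16_t freq_16q8)
{
    /* Widen to 32 bits first: 256 * 25600 does not fit in 16 bits. */
    return (uint16_t)((TABLE_LENGTH * (uint32_t)freq_16q8) / SAMPLES_PER_SECOND);
}
```

For the 100Hz maximum (25600 in 16Q8) this gives 6553, i.e. the index moves about 25.6 table entries per sample, and for 0.1Hz (25 in 16Q8) it gives 6.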

The "PeriodOffset" value is our 16Q8 accumulator and keeps track of where we are within the wave for this frequency. The code where these two are used is below:

void Waves_PrepChannel(TAnalogOutput * const channel)
{
    // Only prepare the next value if the channel is ON
    if (channel->ChannelOn)
    {
        INT32 nextValue;  // use a 32 bit number to avoid overflow

        switch (channel->Wave)
        {
            ...
            case Sine:
                nextValue = (INT32) Waves_GetSineValue(channel->PeriodOffset.s.Hi);
                break;
            ...
        }

        // All values come back at full amplitude
        // Hence, adjust the value according to the set amplitude   
        ...
    }
    else
    {
        channel->NextValue = 0;
    }

    // Called every 1ms
    // Increase the current period offset by the precalculated delta amount for
    // this frequency. NB: This is 16Q8 value.
    channel->PeriodOffset.l += channel->Delta;
}

This might all seem a little bit like magic at this stage and stepping through an example is probably in order. If the frequency required is 38.8Hz:

The arriving packet has the frequency already multiplied by 256 (for reasons we won't go into).
Therefore,

Analog_Output[activeChannel].Frequency  
            = 38.8*256 = 9932.8 = 9932 (fixed point)
            = 0x26CC
            = 0b 0010 0110 1100 1100

Analog_Output[activeChannel].Delta
            = (UINT16) ((256 * (UINT32) Analog_Output[activeChannel].Frequency) / 1000); // 1ms = 1000/sec
            = 2542
            = 0x09EE
            = 0b 0000 1001 1110 1110

The following Excel table and chart illustrate how this works in practice. Basically, the Delta value ensures the accumulator progresses along (and rolls over) at the more accurate representation of the frequency while just using the "Hi" byte ensures the right value is pulled from the sine reference table.

Through inspection of the graph alone we can see that this sine wave has 3 full oscillations by approximately the 77.3ms (;-p) mark, which translates to a period of (0.0773/3) = 25.77ms and therefore a frequency of (1/0.02577) = 38.8Hz.
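The same check can be done without a spreadsheet by stepping the accumulator in a loop on a PC. This is a sketch only (the function name is hypothetical); it counts how many 1ms samples it takes for the 16-bit accumulator to roll over a given number of times:

```c
#include <stdint.h>

/* Step a 16-bit phase accumulator by `delta` once per sample and return
 * the number of samples taken to complete `cycles` full passes through
 * the table. A pass completes when the accumulator overflows.
 * Returns 0 if delta is 0, since the accumulator would never wrap. */
uint32_t samples_for_cycles(uint16_t delta, uint32_t cycles)
{
    uint16_t accumulator = 0;
    uint32_t samples = 0;
    uint32_t completed = 0;

    if (delta == 0)
    {
        return 0;
    }

    while (completed < cycles)
    {
        uint16_t previous = accumulator;

        accumulator = (uint16_t)(accumulator + delta);
        samples++;
        if (accumulator < previous)  /* wrapped past 65535 */
        {
            completed++;
        }
    }
    return samples;
}
```

With a Delta of 2542, three cycles complete on sample 78, which is the first 1ms tick after the 77.3ms mark, and 38 cycles complete on sample 980, or roughly 38.8 cycles per second.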

Another side note is that I could have chosen to store the values as 16Q9, as the top 7 bits would provide 0-127 values which is enough to represent the 0-100 Hz requirement. This would also allow greater fidelity between frequencies, as 9-bits would be used instead of 8. This would not be as clean as the 8 and 8 setup I chose and was not needed for the specifications of this system.

Constant Reconstruction Rate Issues

Before we discuss the issues around keeping the reconstruction rate constant, it is probably worth explaining how the table can be used to create different frequency waves.

Let's walk through some examples:

Example 1 - One sample every 1 ms.

So we have our 256 value sine wave in a table. What would the frequency be if we output each sequential value every (say) 1 ms? It would take 256ms for the wave to complete (the period), hence the frequency, f = (1/(1ms * 256)) = (1/0.256) = 3.9 Hz (appx). Pretty simple, right?

Example 2 - One sample every 0.5 ms

This is pretty similar to the above example, but instead of having a sample every 1ms, we output one every 0.5ms (i.e. we do NOT have a constant reconstruction rate). It would take 128ms for the wave to complete, hence the frequency, f = (1/0.128) = 7.8Hz (appx).

Example 3 - Test the boundaries

So it seems pretty reasonable that changing the reconstruction rate will let us generate the frequencies we need. Let's look at the 2 boundary cases:

For 0.1Hz, we would need the period to be, T = (1/f) = (1/0.1) = 10 seconds. Therefore, we would need to output 1 sample from the table every 10 / 256 seconds, or 39.1ms. Still pretty reasonable.

For 100Hz, we would need the period to be, T = (1/f) = (1/100) = 0.01 seconds. Therefore, we would need to output 1 sample from the table every 0.01 / 256 seconds, or 39.1µs. Still pretty reasonable, or not? Not really. The spec on the microcontroller itself is "25 MHz bus operation at 5V for 40 ns minimum instruction cycle time", so one sample every 39 microseconds leaves fewer than 1000 bus cycles to take the interrupt, work out the next value and move it out to the DAC, with very little left over for everything else the system has to do.
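The two boundary cases are easy to check with integer arithmetic. A minimal sketch, assuming the frequency is passed in tenths of a hertz (the name and units are my own choice for illustration):

```c
#include <stdint.h>

#define TABLE_LENGTH 256u

/* Time between samples, in nanoseconds, if every table entry is output
 * exactly once per cycle. freq_tenths_hz is the frequency in units of
 * 0.1Hz and must be non-zero. */
uint64_t sample_interval_ns(uint32_t freq_tenths_hz)
{
    /* period = 10 / freq_tenths_hz seconds = 1e10 / freq_tenths_hz ns */
    return 10000000000ULL / ((uint64_t)freq_tenths_hz * TABLE_LENGTH);
}
```

This gives 39062500ns (about 39.1ms) per sample at 0.1Hz and 39062ns (about 39.1 microseconds) per sample at 100Hz. At a 40ns bus cycle the second figure is roughly 977 cycles per sample.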

What to do instead?

Well, instead of changing the rate at which we output the samples, we change the rate at which we loop over the table. Instead of halving the period of outputting our samples to double the frequency of the output wave, we can just use every second value. The following chart probably explains this as quickly as text will. Example 1 and 2 from earlier are on the chart, along with a third example demonstrating that selecting every second value from the reference wave has the same general effect as doubling the output rate.
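In code form, the equivalence looks like this. It is a sketch under the same assumptions as the examples above (256 entry table, whole-entry strides); the function name is hypothetical:

```c
#include <stdint.h>

#define TABLE_LENGTH 256u

/* Output frequency in millihertz when stepping `stride` table entries
 * every `sample_period_us` microseconds (must be non-zero). */
uint32_t output_freq_millihz(uint32_t stride, uint32_t sample_period_us)
{
    return (uint32_t)(((uint64_t)stride * 1000000000ULL) /
                      ((uint64_t)TABLE_LENGTH * sample_period_us));
}
```

Example 1 is output_freq_millihz(1, 1000), which gives 3906 (3.9Hz). Example 2 is output_freq_millihz(1, 500), which gives 7812 (7.8Hz), and taking every second value at the original rate, output_freq_millihz(2, 1000), gives exactly the same 7812.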

And the issues...

As discussed at length in the application note, the decision to have a constant sample rate has some important consequences. The most important consequence (especially when combined with the lookup table and index decisions) is that the frequency resolution is not uniform in relative terms across the range of the index/accumulator (more on this next). I.e. changing the "Delta" value by 1 always shifts the output by the same fixed step in Hz, so it has a proportionally larger effect on the resulting frequency of the wave if Delta is small (e.g. changing from 10 to 11) than if it is large (e.g. changing from 65534 to 65535).

To see this demonstrated you should read the application note (I would just be reproducing their work).
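If you just want a feel for the numbers, the following sketch computes the output frequency for a given Delta, assuming the 16-bit accumulator and 1ms sample period described earlier (one accumulator wrap equals one cycle of the wave). The function name is hypothetical.

```c
#include <stdint.h>

/* Output frequency in microhertz for a given Delta, assuming a 16-bit
 * accumulator stepped 1000 times per second. */
uint32_t freq_microhz(uint16_t delta)
{
    return (uint32_t)(((uint64_t)delta * 1000000000ULL) / 65536u);
}
```

Each step of Delta is worth about 15259 microhertz (roughly 0.015Hz) wherever you are in the range. Going from 10 to 11 is therefore a 10% change in frequency, while going from 65534 to 65535 is a change of about 0.0015%. It also shows that the Delta of 2542 from the worked example produces 38.788Hz rather than exactly 38.8Hz.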

The Hard Real-time Requirements

The final version of this system with all the additional options for the project implemented would consist of:

Implementation of functionality via "threads" with differing priorities controlled by the real-time operating system.
Implementation of a variety of wave types, including additive gaussian white noise.

The functionality included things like:

Receiving and sending packets via serial port (RS-232)
Taking readings on the ADC inputs
Updating clocks
Outputting values on the DAC outputs (i.e. the AWG part)

Some of this functionality could take quite a while to perform, so the threads etc have to be constructed with care to ensure the system meets its real-time requirements.

What are these real-time requirements you might ask? Well, there is really only one for this system, and it is that the system outputs the next value of the function on each channel every 1ms without fail.

While this requirement seems simple at first (and in the context of this project handling it is probably pretty inconsequential), the devil is in the detail. First, you might notice that this microcontroller can only do "one thing at once" and cannot update both output channels at exactly the same time. Second, you might also notice that even if we set up our timer to create an interrupt every 1ms it takes time for the interrupt to fire and for us to get to the piece of code that moves the next value out to the DAC. However, these two problems point us to re-analyse the real-time requirement to reveal that to meet it we must update the value "every 1ms". This suggests that the actual time that the update occurs for each channel is not so important so long as the updates occur exactly 1ms apart, ergo so long as the interrupt fires every 1ms and we do exactly the same thing each time to update channels we will meet our requirement.

The reason this requirement is important in the AWG is that even slight variations in the timing of the output has the potential to upset the frequency of the waveform being generated.

The real-time operating system provided was pre-emptive. According to Wikipedia, "Preemptive multitasking allows the computer system to more reliably guarantee each process a regular "slice" of operating time". To achieve this multitasking we were required to implement the system in a certain way that included creating "Threads", ThreadControlBlocks and EventControlBlocks. We were also required to make calls to operating system functions such as:

OS_Init() - once on boot
OS_ISREnter() - To inform the OS that an interrupt has occurred and to re-enable interrupts (this allows interrupts to be interrupted (or pre-empted) themselves)
etc ...

Some Typical Answers

A typical structure that many students employed was to have an interrupt every 1ms that looked something like this:

void interrupt 26 Timer_MDC_ISR(void)
{
    ...
    UINT8 idx;

    // acknowledge interrupt by clearing status flag
    MCFLG_MCZF = 1; // set to clear
    OS_ISREnter(); // Tell the OS & reenable interrupts

    // Output any DAC values. 
    for (idx = 0; idx < NbAnalogOutputs.l; idx++)
    {
        Waves_PrepChannel(Analog_Output[idx].OutputChannel);
        Analog_Set(idx, Analog_Output[idx].NextValue);
    }

    ...

    OS_ISRExit();
}

Or maybe they will have implemented the two AWG channels in separate OS threads, and instead of the for loop they might signal the threads to "do their stuff again":

(void)OS_SemaphoreSignal(DACChannel0OutputNextValue);
(void)OS_SemaphoreSignal(DACChannel1OutputNextValue);

And the threads themselves would independently output the next values, something like:

static void AnalogOutputThread(void *pData)
{
    ...

    for (;;)
    {
        Waves_PrepChannel(thisChannel.OutputChannel);
        Analog_Set(thisIndex, thisChannel.NextValue);
        (void) OS_SemaphoreWait(DACChannel1OutputNextValue, 10); // then wait for signal
    }
}

This solution was actually quite common and, in my opinion, demonstrates a total lack of understanding about "real-time" and the operation of the OS.
While these options (and I am sure there are more) will generally perform to the level required, in my opinion they will not technically meet the real-time requirements of the system. My reasons are:

once you re-enable interrupts you lose all possible guarantee about real-timeness (as something else with a shorter period might fire off a whole set of interrupts in a row, blocking your access to the CPU)
if preparing the next value in "Waves_PrepChannel" possibly varies in the number of calculations it performs, as is the case when producing the noise wave, then you will have slight variations in the frequency produced.
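To make the second point concrete, here is a small model you can run on a PC. It is only a sketch: the function name is hypothetical and any timings you feed it are made up for illustration, not measurements from the board. It assumes the DAC write for tick i happens some number of microseconds after the tick, once the value has been prepared, and reports how far the gap between consecutive writes strays from the nominal 1ms.

```c
#include <stddef.h>
#include <stdint.h>

/* Largest deviation (in microseconds) of the gap between consecutive DAC
 * writes from the nominal tick period, when the write for tick i happens
 * prep_us[i] microseconds after that tick. The gap between writes i-1
 * and i is (tick period + prep_us[i] - prep_us[i-1]), so the deviation
 * is just the difference between neighbouring delays. */
uint32_t worst_jitter_us(const uint32_t prep_us[], size_t count)
{
    uint32_t worst = 0;
    size_t i;

    for (i = 1; i < count; i++)
    {
        uint32_t diff = (prep_us[i] > prep_us[i - 1])
                            ? prep_us[i] - prep_us[i - 1]
                            : prep_us[i - 1] - prep_us[i];
        if (diff > worst)
        {
            worst = diff;
        }
    }
    return worst;
}
```

If the delays are, say, 20, 20, 150 and 20 microseconds, the worst gap is off by 130 microseconds. If the delay is the same on every tick, however long it is, the result is 0, which is the whole point: a constant delay is harmless, a varying one is not.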

My Answer

My solution was to split the responsibility and the control over timing. I am sure there are other solutions, but here are the salient bits of mine. First, I decided to have a thread for each channel; this would let each channel prepare the next value (and perform any variable length calculations) anywhere within the 1ms time gap between outputs.

static void AnalogOutputThread(void *pData)
{
    ...

    for (;;)
    {
        Waves_PrepChannel(thisChannel.OutputChannel);
        (void) OS_SemaphoreWait(thisChannel.CalcNextFlag, 10); // then wait for signal
    }
}

Second, I ensured that in the ISR for my 1ms timer the values were output prior to re-enabling interrupts.

void interrupt 26 Timer_MDC_ISR(void)
{
    ...
    UINT8 idx;

    // acknowledge interrupt by clearing status flag
    MCFLG_MCZF = 1; // set to clear

    // Output any DAC values. Ensure this is performed prior to re-enabling
    // interrupt in order to guarantee this hard real-time requirement
    for (idx = 0; idx < NbAnalogOutputs.l; idx++)
    {
        Analog_Set(idx, Analog_Output[idx].NextValue);
    }

    OS_ISREnter();

    // Signal the Semaphores controlling the output threads
    (void)OS_SemaphoreSignal(DACChannel0Recalculate);
    (void)OS_SemaphoreSignal(DACChannel1Recalculate);

    ...

    OS_ISRExit();
}

The following diagrams might help illustrate the difference.

The Noise wave

TBC later (I'm a little over writing this today).

Blodd

Monday, 14 November 2011

Arbitrary Waveform Generator