FW upgrade on Embedded Systems

Software release cycles keep getting shorter, and everyone expects the latest features to be available everywhere immediately. That's not hard when you are developing cloud apps, but what happens when the update has to reach embedded hardware?

FW upgrade theory

The firmware upgrade is a straightforward process: just connect a J-Link or other programmer to your device and load a new binary, right? Well, not really. Your typical user doesn't have a J-Link, and you really don't want them to open up the device to locate the programming header (and break a few parts while doing so). You might also want to disable the programming interface completely during production so that nobody can tamper with the device, read the encryption keys or steal your know-how by disassembling the firmware. That means you need a better way to load new firmware onto your device, and you really want to make sure the upgrade process won't brick it (e.g. when power is lost or an invalid firmware image is loaded).

Let's say your device has an SD card slot and can load new firmware from a memory card. How does it work? The device needs to read the binary, verify that it's correct and replace the currently running application with the new one. To do this, a bootloader is needed. By definition, it's the software responsible for booting a computer and loading and launching the operating system or, in the embedded world, loading and starting the application firmware. It can also be used to write new firmware to the MCU flash memory (e.g. update over UART) or to un-brick your device (by providing some service-mode access to the internal memory).

The standard embedded bootloader functionality could be:

  • Looks for the firmware image in the MCU Flash, verifies the image integrity and executes it.
  • Reads new FW image from external sources (over UART or USB, from SD card, over Ethernet,...) and writes it to MCU Flash
  • When multiple FW images are present, selects the most recent one or loads the fallback image when the recent one is buggy or on user request

There are basically two ways to load the new firmware:

  • From bootloader:
    • Simplest option
    • Application is not running, we can simply rewrite it with a new firmware
    • Larger bootloader, because it has to contain the firmware upgrade logic (e.g. a USB stack can eat up a lot of memory)
    • Easy to recover from failed update - the MCU stays in the bootloader, waiting for another attempt
  • From application:
    • Bootloader can be very tiny and very simple
    • We can share the updater code with application - e.g. the FAT library or USB stack is already present in the app and we can use it to load the firmware
    • Twice as much memory is needed for the update - we can't rewrite the application while it's running so we need to store the new firmware somewhere else.
    • When the new firmware is buggy, it might not be possible to revert the update - either two application slots are needed (see below) or the device has to be recovered from the bootloader

Running the application

The basic memory layout of the MCU Flash could look like this:

[Figure: basic_structure]

  • The bootloader starts at address 0 as any other firmware
  • The firmware image is placed right after the bootloader, say at address 0x2000, leaving 8 kB of Flash for the bootloader itself
    • The firmware image usually starts with some metadata, such as a magic number to identify compatible firmware, the firmware binary length, a checksum, and maybe the firmware version or the git commit hash.
    • The firmware itself is placed after this header, say at 0x2080, leaving 128 bytes for the metadata

The metadata is kind of optional, but it's usually nice to know what's there and that it's valid, so you won't execute some random gibberish or an incomplete firmware (the MCU could have been restarted in the middle of the upgrade process, for example). I usually generate the metadata with a piece of Python code run from the Makefile right after objcopy produces the .bin files - the code takes the resulting binary, gets the version from the git tag (if present), calculates the checksum and prepends the gathered data to the binary, creating a nice file that can be shared directly with customers.
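Such a packing step could look roughly like the sketch below. The header layout (magic, length, CRC-32 and a version string, padded to 128 bytes), the magic value and the function name are assumptions made for this example, not a fixed format.

```python
import struct
import zlib

# Hypothetical 128-byte header: magic, payload length, CRC-32, version string.
HEADER_SIZE = 128
MAGIC = 0x46574D47          # arbitrary example value
HEADER_FMT = "<III32s"      # little-endian: magic, length, crc32, version


def pack_image(firmware: bytes, version: str) -> bytes:
    """Prepend a fixed-size metadata header to a raw firmware binary."""
    crc = zlib.crc32(firmware)
    # Keep at most 31 characters so the version stays NUL-terminated.
    fields = struct.pack(HEADER_FMT, MAGIC, len(firmware), crc,
                         version.encode("ascii")[:31])
    # Pad the rest of the header with 0xFF, the value of erased flash.
    header = fields.ljust(HEADER_SIZE, b"\xff")
    return header + firmware
```

In a Makefile rule, the version argument could come from something like `git describe --tags`, and the returned bytes would be written next to the .bin file.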

When the MCU is powered on, the bootloader:

  1) Reads the application metadata to check whether firmware is present.
  2) Calculates the checksum of the firmware image.
  3) If the checksum is valid, relocates the MCU vector table to the firmware's one and loads the program counter with the firmware entry function.
  4) If the firmware is considered invalid, signals the problem, e.g. by blinking an LED, or waits for a FW upgrade over UART or anything like that.
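The checks in steps 1 and 2 can be modelled on the host before writing the real bootloader. This is only a sketch: the header layout (magic, length, CRC-32 at the start of a 128-byte header), the magic value and the function name are assumptions, and a real bootloader would do the same in C directly on the flash contents.

```python
import struct
import zlib

HEADER_FMT = "<III"     # hypothetical header start: magic, length, crc32
HEADER_SIZE = 128
MAGIC = 0x46574D47      # arbitrary example value


def find_bootable(flash: bytes, slot_offset: int):
    """Return the offset of the firmware payload if the slot is valid, else None."""
    if slot_offset + HEADER_SIZE > len(flash):
        return None
    magic, length, crc = struct.unpack_from(HEADER_FMT, flash, slot_offset)
    if magic != MAGIC:
        return None                      # step 1: no firmware present
    start = slot_offset + HEADER_SIZE
    if length == 0 or start + length > len(flash):
        return None                      # length field is corrupted
    payload = flash[start:start + length]
    if zlib.crc32(payload) != crc:
        return None                      # step 2: checksum mismatch
    return start                         # step 3: safe to jump here
```

Erased flash reads as 0xFF, so an empty slot fails the magic check without any special handling.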

Vector table

Wait, vector table? The interrupt vector table usually sits at the very beginning of the firmware. On Cortex-M0 cores, for example, address 0x0 of the flash memory holds the initial stack pointer value (loaded into the stack pointer register by the MCU on boot), address 0x4 contains a pointer to the reset function (which initializes the RAM and a few other things and then calls your main function), address 0x8 is the Non-Maskable Interrupt handler pointer, and so on. Basically, the vector table points to all the interrupt handlers and to the program entry point.
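To get a feel for this, you can decode the first few entries of a raw .bin file on your PC. A small sketch, assuming a little-endian Cortex-M image that starts directly with the vector table; the function name is made up for this example.

```python
import struct


def read_vectors(image: bytes) -> dict:
    """Decode the first four Cortex-M vector table entries of a raw binary."""
    sp, reset, nmi, hardfault = struct.unpack_from("<4I", image, 0)
    return {
        "initial_sp": sp,
        # Handler entries have bit 0 set to mark Thumb code; mask it
        # off to get the address of the first instruction.
        "reset": reset & ~1,
        "nmi": nmi & ~1,
        "hardfault": hardfault & ~1,
    }
```

If the initial stack pointer does not point into RAM, or the reset handler does not point into the region the image was linked for, the binary was most likely built for a different location.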

Once the bootloader's work is finished, the MCU must use the vector table of the firmware itself. If it were still pointing to the bootloader, any interrupt would end up in the bootloader code, not in your application. Most MCUs have some way to relocate the vector table or, in other words, to tell the MCU to look for the vector table at a different memory address. Some MCUs allow relocating the vector table to any address in memory, some limit it to certain addresses or alignments, and sometimes you can only switch between address 0x0 in Flash and 0x0 in RAM. It's the duty of the bootloader to set up the relocation mechanism so that it points to the firmware vector table, and to find the reset vector there so it can jump to it or, in other words, run the firmware code.
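As an illustration of such an alignment limit, here is a minimal sketch that assumes the rule used by Cortex-M cores with a VTOR register (e.g. Cortex-M3/M4): the table address must be a multiple of the table size rounded up to a power of two, with a minimum of 128 bytes. Other MCU families have different rules, so check the reference manual of your part. The function names are made up for this example.

```python
def vtor_alignment(num_irqs: int) -> int:
    """Required vector table alignment in bytes (assumed Cortex-M3/M4 rule)."""
    table_size = (16 + num_irqs) * 4   # 16 system entries + device interrupts
    alignment = 128                    # minimum alignment
    while alignment < table_size:
        alignment *= 2                 # round up to the next power of two
    return alignment


def can_relocate_to(address: int, num_irqs: int) -> bool:
    """Check whether a vector table may be placed at the given address."""
    return address % vtor_alignment(num_irqs) == 0
```

With a firmware placed at 0x2080, `can_relocate_to(0x2080, 16)` is true but `can_relocate_to(0x2080, 32)` is not, so the size of the metadata header may have to follow the alignment the MCU requires.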

Linker, Position independent code

When you build your plain old firmware that is flashed directly to the target platform, it's built so that it must be placed at exactly the expected place in the Flash memory (= the start of the memory). The MCU expects the vector table to be at 0x0, loads the reset vector address from there and runs the code at that address. What happens if you move this firmware binary to another place in memory, read the reset vector address and let the MCU run the code at that address? You'll end up running some unexpected code, like something in the middle of the bootloader or the firmware, or you might even jump into the firmware metadata area.

The code is usually built using absolute addressing; it expects every function to be at an exact address in the flash memory. If you open the linker file, you'll most likely see something like this:

MEMORY
{
    rom (rx) : ORIGIN = 0x08000000, LENGTH = 32K
    ram (rwx) : ORIGIN = 0x20000000, LENGTH = 6K
}

The code is placed in the rom section at address 0x08000000 (on STM32 parts with a Cortex-M0 core, for example, the flash starts at this address but is also mapped at 0x0, so the previous text is still valid). But when using a bootloader, the bootloader is at that address and the application is placed a few kB after it. Easy to fix: just update the rom ORIGIN and LENGTH accordingly and you are good to go. The downside is that this ties the firmware to a single location in memory, which makes multi-slot firmware images harder to manage.
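The new values are simple arithmetic. A small sketch, assuming the numbers used earlier in this post (32K of flash at 0x08000000, an 8 kB bootloader and a 128-byte metadata header); the function names are hypothetical.

```python
def app_memory_region(flash_base, flash_size, bootloader_size, header_size):
    """Return (origin, length) of the rom region left for the application."""
    origin = flash_base + bootloader_size + header_size
    length = flash_size - bootloader_size - header_size
    if length <= 0:
        raise ValueError("bootloader and header do not fit into the flash")
    return origin, length


def render_rom_line(origin: int, length: int) -> str:
    """Format the values as a line for the linker script MEMORY block."""
    return f"    rom (rx) : ORIGIN = 0x{origin:08X}, LENGTH = {length}"


origin, length = app_memory_region(0x08000000, 32 * 1024, 0x2000, 0x80)
print(render_rom_line(origin, length))
```

For these inputs the application starts at 0x08002080 and has 24448 bytes available.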

A much more powerful (and much more complex) way is to build the firmware as position independent code. This is usually used for shared libraries and is non-trivial to get working, as running gcc with -fpic is not enough: the library loader (or the bootloader in our case) has to do some initialization, like filling lookup tables with the proper addresses. If you are interested, there's a nice blog about position independent code on embedded targets. I once spent a few weeks trying this on a large project. It kind of worked, but no one really trusted it for production code as it was very complex, with a bit of dark magic.

Memory layout

There are several ways to lay out the memory. Each has its advantages and disadvantages, and it's important to select the best solution early in the project, as it might be impossible to change later.

Single slot

[Figure: single_slot]

The bootloader updates the firmware by rewriting it, the application has no update logic.

Pros:

  • Very simple
  • Takes least amount of Flash memory
  • Hard to brick - the bootloader handles the update, if the app gets corrupted, just run the update again

Cons:

  • Bootloader tends to be larger as it contains the FW updater code
  • If the FW update is interrupted, the device remains in bootloader and is not fully functional until another update
  • With the FW updater logic in the bootloader, it's hard to change - that would need a bootloader update, which is risky, so you are stuck with the protocol you released the device with
  • Updates are usually not very user friendly as the functionality of the bootloader is usually very limited to fit into a small space
  • There's no fallback application firmware, if the new firmware misbehaves, you have to downgrade it manually through the bootloader

Dual slot, no PIC

[Figure: update_slot]

The application (or bootloader as fallback) updates the firmware by writing it to a dedicated memory area outside the application firmware. After the update, the bootloader detects a new valid firmware image in the update area and copies it to the application area.

Pros:

  • FW update can be done from the application code and be very user friendly (loading FW from USB Drive or over the internet, showing progress on the display,...)
  • The FW update logic is in the application itself, it can be completely redesigned between the app releases
  • When update is interrupted, the current application is not affected
  • The bootloader code can be very basic and tiny

Cons:

  • Consumes twice as much Flash space as the application itself
  • If the new firmware contains a bug in the update logic, it might not be possible to update the device again (fallback might be implemented in bootloader)
  • The update memory slot is used only briefly during a FW update and just takes up space the rest of the time
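The copy step the bootloader performs in this layout can be modelled on the host with a bytearray standing in for the flash. This is a sketch under assumptions: the slot addresses, the header layout (magic, length, CRC-32 in a 128-byte header) and the function names are all made up for the example.

```python
import struct
import zlib

HEADER_SIZE = 128
MAGIC = 0x46574D47                                 # arbitrary example value
APP, UPDATE, SLOT_SIZE = 0x2000, 0x5000, 0x3000    # hypothetical layout


def image_size(flash: bytearray, slot: int) -> int:
    """Return the image size (header + payload) if the slot is valid, else 0."""
    magic, length, crc = struct.unpack_from("<III", flash, slot)
    if magic != MAGIC or not 0 < length <= SLOT_SIZE - HEADER_SIZE:
        return 0
    start = slot + HEADER_SIZE
    if zlib.crc32(bytes(flash[start:start + length])) != crc:
        return 0
    return HEADER_SIZE + length


def apply_update(flash: bytearray) -> bool:
    """Copy a valid image from the update slot to the application slot."""
    size = image_size(flash, UPDATE)
    if size == 0:
        return False                               # nothing to install
    flash[APP:APP + SLOT_SIZE] = b"\xff" * SLOT_SIZE       # erase app slot
    flash[APP:APP + size] = flash[UPDATE:UPDATE + size]    # copy the image
    if image_size(flash, APP) != size:
        return False          # keep the update slot so the copy is retried
    # Erase the update slot last, so an interrupted copy restarts on next boot.
    flash[UPDATE:UPDATE + SLOT_SIZE] = b"\xff" * SLOT_SIZE
    return True
```

The order matters: the update slot is erased only after the copy has been verified, so a power loss in the middle leaves a valid image in the update slot and the bootloader simply repeats the copy.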

Dual slot, PIC or dual image

[Figure: dual_slot]

The application (or bootloader as fallback) updates the firmware by writing it to an inactive slot. After the update, the bootloader decides which slot is newer and launches it. The firmware must be able to run from any of the slots, either by building it as Position independent code or by distributing two firmware upgrade images, one for each slot, each compiled with a different linker file.

Pros:

  • FW update can be done from the application code
  • The FW update logic is in the application itself
  • When update is interrupted, the current application is not affected
  • If the new firmware is buggy (e.g. the updater code is broken), it's possible to switch back to the previous one, e.g. by a button press, or some mechanism to detect issues can be implemented (e.g. detect a boot loop and invalidate the new image).
  • Both slots are used all the time; one always contains a firmware to return to if needed.

Cons:

  • Consumes twice as much Flash space as the application itself
  • Hard to implement (PIC) or inconvenient (building and distributing two images)
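The slot selection in the bootloader might be sketched like this. The `Slot` fields, the failure limit and the idea of a monotonically increasing version number are assumptions for the example; a real bootloader would read them from the image metadata and from some persistent counter.

```python
from dataclasses import dataclass
from typing import Optional, Sequence

MAX_BOOT_FAILURES = 3   # hypothetical boot-loop threshold


@dataclass
class Slot:
    name: str
    valid: bool             # header and checksum verified
    version: int            # assumed to grow with every release
    boot_failures: int = 0  # boots that never reached a healthy state


def select_slot(slots: Sequence[Slot], force_fallback: bool = False) -> Optional[Slot]:
    """Pick the newest usable slot, or the older one on user request."""
    usable = [s for s in slots
              if s.valid and s.boot_failures < MAX_BOOT_FAILURES]
    if not usable:
        return None                       # nothing bootable, stay in bootloader
    newest_first = sorted(usable, key=lambda s: s.version, reverse=True)
    if force_fallback and len(newest_first) > 1:
        return newest_first[1]            # e.g. button held during power-on
    return newest_first[0]
```

A slot that keeps failing to boot drops out of the candidates, which is one simple way to get the boot-loop detection mentioned in the list above.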

Triple slot, external memory

[Figure: external]

Not very common, but it makes sense in certain scenarios. Imagine you have a large business-critical application that you update often. It makes sense to use the dual slot architecture, as you probably want a fallback image, but it just won't fit into the Flash memory. What to do? One solution is to use some external memory, e.g. SPI flash, to store the update images and copy the selected one into the internal Flash to run it (in case the MCU is not able to run it from the external flash directly or if that's too slow). It's actually a blend of the variants mentioned above - the code always runs from the same address, so no PIC magic is needed, but you still have two update images to select from.

Other options

There's an unlimited number of other layouts. The triple slot layout mentioned above can be done without external memory if you can fit the three slots into the MCU Flash memory. You could have some well-tested shared library placed somewhere in the flash to be used by both the bootloader and the application (e.g. for the Ethernet stack), reducing the size of both the application and the bootloader while keeping complex functionality in both of them.

If you have a large RAM (usually an external one) and very little Flash, you could load the whole firmware into RAM and run it from there. You could load the firmware at each boot from an SD card, a USB thumb drive, some internet server, etc.

The logic behind selecting the image to boot can also be quite complex, e.g. requiring interaction with the outside world: the application might be talking to some server during the update, and when the new firmware boots, it has to run for several minutes while talking to the server before the firmware slot is marked as reliable and used again after a reboot.
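One way to model such a "boot on trial, confirm later" scheme is a small state machine stored next to the image. The state names and function names below are assumptions for this sketch, not an established scheme.

```python
from enum import Enum
from typing import Tuple


class State(Enum):
    PENDING = "pending"       # freshly written, never booted
    TRIAL = "trial"           # booted once, not yet confirmed by the app
    CONFIRMED = "confirmed"   # app reported that it works
    REJECTED = "rejected"     # trial boot ended without confirmation


def on_boot(state: State) -> Tuple[State, bool]:
    """Bootloader side: return (new state, whether to boot this slot)."""
    if state is State.PENDING:
        return State.TRIAL, True          # give the new image one chance
    if state is State.TRIAL:
        return State.REJECTED, False      # rebooted before confirming
    return state, state is State.CONFIRMED


def on_confirm(state: State) -> State:
    """Application side: call once the server handshake has succeeded."""
    return State.CONFIRMED if state is State.TRIAL else state
```

If the new firmware crashes or never reaches the server, the next boot finds the slot still in the trial state, rejects it and falls back to the other image.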

Bootloader updates

Updating the bootloader is always risky: if you mess it up, the device is bricked. There's always a chance of a power outage during the update, someone tripping over the cable, a corrupted image being loaded, etc. The bootloader should be very well tested and rock solid, and you should never update it, never ever. So how do you update the bootloader?

  • Update from application
    • Very easy to do: the app is running and the bootloader part of the Flash is not in use at that moment, so just rewrite it with a new image and you are done. If it fails for some reason, congrats, your device is bricked and won't ever boot again.
  • Update from bootloader
    • You can't update the bootloader while it's running, at least not directly.
    • You could copy the updater code to RAM and run the update from there, with the same risks as when updating from the app.
    • You could build a specialized application firmware that updates the bootloader and flash it instead of the ordinary app. This is slightly better than updating from the application - the new bootloader code can be embedded in a very basic app, making the update really quick and shortening the time window in which the bootloader could be bricked.

There's always a chance to brick your device, but there's one way to reduce that chance close to zero - adding a preloader.

[Figure: preloader]

Think of it as the two slot layout from above, only the first slot is much smaller and contains the bootloader, the second slot contains the app, and the place of the bootloader is taken by a preloader. The preloader is very simple: it checks the bootloader image and boots it if it's valid. If not, it tries to boot the application. The preloader is never to be updated, but it's quite small and very simple, so it's reasonably easy to test it end to end. So, if the bootloader update gets messed up, the application boots again and you can give it another try.
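The whole decision logic of such a preloader fits into a few lines. In this sketch the two validity checks are passed in as callables (hypothetical names); on the real target they would be checksum checks over the bootloader and application areas.

```python
from typing import Callable, Optional


def preloader_pick(bootloader_ok: Callable[[], bool],
                   app_ok: Callable[[], bool]) -> Optional[str]:
    """Decide what the preloader should jump to."""
    if bootloader_ok():
        return "bootloader"      # normal case
    if app_ok():
        return "application"     # bootloader update failed, app can retry it
    return None                  # nothing valid, the device needs a programmer
```

For example, `preloader_pick(lambda: False, lambda: True)` models a device whose bootloader update was interrupted: the application is started so it can write the bootloader again.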

The preloader kind of doubles the bootloader functionality, especially in the single slot scenarios, but it is handy if you have a complex bootloader that you want to update from time to time (imagine a large app where you can't use multi-slot layout while the customer requires update over ethernet - the bootloader contains the whole ethernet stack and lots of related logic and it's very likely you'll need to update it in the future).

Other thoughts

Firmware encryption

Sometimes you care about your know-how; maybe you want to avoid being copied by clone vendors or a competitor. Most MCUs have some mechanism of flash read protection so the firmware cannot be read out, but that's useless if you distribute your FW upgrade binaries with no encryption. The beauty of a firmware upgrade process is the ability to modify the upgrade file on the fly: the MCU can easily decrypt the data before writing it to the flash memory.

The easiest encryption can be as simple as XORing every byte with a sequence known to both sides (the MCU and the builder of the FW update images). Using a simple counter to generate the sequence is probably not enough, but some form of pseudorandom generator could be OK and takes only a few dozen bytes of Flash. If the MCU is equipped with an AES accelerator, you have the best solution at hand.
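The image-builder side of such a scheme could look like the sketch below. The choice of xorshift32 as the pseudorandom generator and the function names are assumptions for this example; the MCU would run the same generator with the same secret seed. Keep in mind that this only obscures the image and is not a replacement for a real cipher like AES.

```python
def xorshift32(seed: int):
    """Yield an endless stream of key bytes from a 32-bit xorshift generator."""
    state = seed & 0xFFFFFFFF
    if state == 0:
        raise ValueError("seed must be non-zero")
    while True:
        state ^= (state << 13) & 0xFFFFFFFF
        state ^= state >> 17
        state ^= (state << 5) & 0xFFFFFFFF
        yield state & 0xFF               # use the low byte as key byte


def xor_crypt(data: bytes, seed: int) -> bytes:
    """XOR every byte with the key stream; the same call decrypts again."""
    stream = xorshift32(seed)
    return bytes(b ^ next(stream) for b in data)
```

Because XOR is its own inverse, calling `xor_crypt` on the encrypted image with the same seed gives back the original binary.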

Signed images

Another layer of security is image signing. This is the upper-class solution, the ultimate way to make sure you are the only one who can release an update image. Sign the image with a private key and let the bootloader verify the signature with the public key, classical cryptography. The only downside is the complexity of such a solution: the crypto libraries are huge and take a lot of processing power, not something you want to run on your ATtiny with 16 kB of Flash - there, simple image encryption does a good enough job. But it could make sense on larger MCUs connected to the internet, where mbedTLS or a similar library is used anyway.
