The official Nordic DFU bootloader is nice, but huge. On smaller devices like the NRF52811 it takes a large portion of memory and, combined with a softdevice, leaves only a few kB of Flash for the application. But what if I don't need a full-blown secure wireless bootloader and a tiny UART bootloader is all I want? And what if I could squeeze it into the 4 kB MBR section of the softdevice?
I was faced with developing a product that runs a beefy main MCU for all the hard work and a small Nordic NRF52 based radio for wireless connectivity to nearby devices. The problem was that the radio firmware would not be ready before the product's mass production date. The idea was to ship the device with only a bootloader in the radio chip and update it later together with the main CPU firmware. It had not been decided whether the nRF firmware would be based on the NRF5 SDK everyone is familiar with, or on the shiny new Zephyr based nRF Connect SDK that takes like an hour just to download and build a blinky example.
The official NRF5 SDK bootloader example kind of worked, but it takes like 30 kB of Flash for the bootloader and 8 kB for configuration. Add a 100+ kB softdevice (a binary blob from Nordic with the BLE stack) and you are left with like 40 kB of Flash for the application, make it 8 kB when using a larger softdevice. After removing all the firmware signing / crypto code and leaving only the UART protocol, I was able to shave the bootloader down to like 15 kB, which is still quite a lot! I've also hit a few walls trying to convince the bootloader to write just the application (no softdevice - fw for the Thread radio protocol) over app+softdevice, etc. And I'm not talking about Zephyr, that would probably add another layer of insanity.
So, the idea was to avoid the Nordic bootloader completely and squeeze all the fw upgrade related code into the initial 4 kB of Flash that is usually occupied (on NRF5 SDK based firmware) by a binary blob from Nordic called the MBR - 38 kB of Flash saved with almost no loss of functionality (well, image signing is still a nice thing, but that can be offloaded to the main MCU). The custom MBR wouldn't care what data is written to flash, so app, app+softdevice, Zephyr app and any other weird combinations of binary blobs are not a problem - let's future proof this crap.
The only issue is that the MBR does a little more than jump to the app or bootloader when requested... Let's dive deep into the mighty Nordic binary blob called the MBR.
The NRF5 SDK based app usually ends up with this Flash layout:
Based on the disassembly of the MBR (btw, Ghidra is a very good tool for such work), browsing the documentation, forums and Google results, I came up with the following boot process (for the MBR from the latest 17.1.0 SDK):
- The MBR looks up the address of its configuration at the location defined in nrf_mbr.h (0xffc); if this address contains 0xffffffff, the value at UICR.NRFFW[1] is used instead if valid. This configuration basically contains the details needed for copying a bootloader image to the correct place after a bootloader update, etc.; it is only written by the MBR. If not found, the MBR ignores it and continues.
- The MBR looks up the bootloader address at the location defined in nrf_mbr.h (0xff8) or in UICR.NRFFW[0]; if a bootloader address is found, the MBR passes execution to the bootloader, and the bootloader later passes execution to the reset vector at 0x1004 (0x1000 is the initial stack pointer).

The MBR_PARAM_ADDR and MBR_BOOTLOADER_ADDR are usually written by the bootloader when it starts for the first time. When flashing the MBR/softdevice and bootloader to a fresh MCU, the UICR.NRFFW[0] register is written with the bootloader address (check the bootloader linker script and the resulting .hex file, the register write is defined there), so the MBR knows where to find the bootloader although the MBR_BOOTLOADER_ADDR is not set yet.
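The lookup described above can be sketched as a small pure function. This is only an illustration: the name `resolve_bootloader_addr` is hypothetical, the flash-first, UICR-second order is assumed to match what is described for the configuration address, and whatever validity check the real MBR performs is not modelled.

```c
#include <stdint.h>

#define FLASH_ERASED 0xFFFFFFFFu /* value of an unwritten flash or UICR word */

/* Hypothetical helper, not part of the Nordic MBR.
 * flash_word: the word read from flash address 0xff8
 * uicr_word:  the word read from UICR.NRFFW[0]
 * Returns the bootloader address, or FLASH_ERASED if neither is set. */
uint32_t resolve_bootloader_addr(uint32_t flash_word, uint32_t uicr_word)
{
    if (flash_word != FLASH_ERASED) {
        return flash_word; /* assumed: the flash word wins when written */
    }
    return uicr_word;      /* may itself be erased: no bootloader present */
}
```

On the target, the two arguments would be read from `*(volatile uint32_t *)0xff8` and `NRF_UICR->NRFFW[0]`; keeping the decision in a pure function makes it easy to check on a PC.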
The MBR is responsible for passing execution to the app/softdevice/bootloader based on the configuration, but it also does some other dark magic. Let's ignore the fw upgrade support code, as that's not needed for the custom bootloader, and focus on the bare minimum - interrupt forwarding.
Before executing the reset vector of the application, the MCU needs to switch the vector table address, so the correct application functions are called when an interrupt is triggered. This can be done by writing the new vector table address to the VTOR register. On older/smaller MCUs with cores like the Cortex M0, the VTOR is not available, so another hack must be used - the new vector table is copied to the start of the RAM and the RAM is remapped to 0x0, effectively changing the vector table to the new one. On even dumber MCUs, the bootloader must implement all possible interrupt handlers to catch them all and forward them to the application manually.
The NRF5 SDK still works on older Cortex M0 based MCUs, so the VTOR is not always available. Additionally, some interrupts are to be processed by the softdevice and some by the application, so the system needs a way to select where each interrupt is forwarded based on its type. This is where the MBR steps in; the interrupt forwarding is designed like this by Nordic:
- each interrupt is forwarded to the handler whose address is stored at *((uint32_t *)0x20000000) + interrupt_id*4, usually the softdevice, which either processes the interrupt or forwards it to the application

To make everything work when using a custom MBR, the MBR must:
Putting it all together results in something like this:
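#include <stdint.h>
#include "nrf.h" // CMSIS core header from the Nordic MDK, provides __set_MSP()

extern uint32_t _estack; // top of the stack, defined in the linker script

__attribute__((naked)) static void Int_Handler(void)
{
    __asm("ldr r1, =0xE000ED04\n"
          "ldr r0, [r1]\n"        // load SCB->ICSR
          "ldr r1, =0x1ff\n"
          "and r0, r0, r1\n"      // mask VECTACTIVE bits
          "lsls r0, r0, #2\n"     // multiply interrupt id by 4
          "ldr r1, =0x20000000\n"
          "ldr r2, [r1]\n"        // load the forwarding vector table address
          "add r2, r2, r0\n"
          "ldr r3, [r2]\n"        // load pointer at *((uint32_t *)0x20000000) + int_id*4
          "bx r3\n"               // pass execution to app/softdevice handler
    );
}

static void Reset_Handler(void)
{
    // TODO you shall initialize .data and .bss segments here, pretty standard code
    const uint32_t *app = (const uint32_t *)0x1000; // app/softdevice vector table
    uint32_t sp = app[0];                           // initial stack pointer
    uint32_t entry = app[1];                        // reset vector
    *(volatile uint32_t *)0x20000000 = 0x1000;      // forward interrupts to that table
    __set_MSP(sp);
    ((void (*)(void))entry)();
}

void (*const vectors[])(void) __attribute__((section(".isr_vector"), used)) = {
    (void (*)(void))&_estack,
    Reset_Handler,
    [2 ... 99] = Int_Handler, // GCC range initializer: every other vector is forwarded
};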
This is a very basic implementation, but that's all that's needed for a tiny MBR that works with both a bare app and a softdevice based system. Each interrupt costs 10 extra instructions on top of the actual application handler code. And no, it's not possible to use the VTOR if you intend to use the softdevice: it seems to change the address stored at 0x20000000 sometimes, and I always ended up at some fixed address inside the softdevice.
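To double check the address arithmetic done by Int_Handler, the same lookup can be modelled in plain C and run on a PC. This is just a sketch; `forward_slot_addr` and `VECTACTIVE_MASK` are hypothetical names, and the function only computes which vector table slot would be read, it does not touch any hardware register.

```c
#include <stdint.h>

#define VECTACTIVE_MASK 0x1FFu /* SCB->ICSR bits 8:0, the active exception number */

/* Hypothetical host-side model of the lookup in Int_Handler.
 * table_base: the word stored at 0x20000000 (vector table to forward to)
 * icsr:       the value read from SCB->ICSR
 * Returns the address of the slot that holds the handler pointer. */
uint32_t forward_slot_addr(uint32_t table_base, uint32_t icsr)
{
    uint32_t exception_number = icsr & VECTACTIVE_MASK;
    return table_base + exception_number * 4u;
}
```

For example, with the table at 0x1000 and SysTick active (exception number 15), the handler pointer is read from 0x103c; any other bits set in ICSR are masked out and do not change the result.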
Take the code above, add some basic UART code, flash writing and some DFU protocol and you've made yourself a nice bootloader! Just make sure it fits into the first 4 kB of flash, and disable all the peripherals you've enabled before jumping to the application. I've managed to squeeze everything, including a custom UART DFU protocol, a 16-bit CRC of the written data and some minor debug functionality, into 3 kB of flash, leaving me with most of the Flash memory for the application.
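As an illustration of how little code a 16-bit CRC needs, here is a bitwise CRC-16/CCITT-FALSE in C. Which CRC variant the bootloader described here actually uses is not stated, so the polynomial (0x1021), the initial value (0xFFFF) and the name `crc16_ccitt` are assumptions; the bitwise loop trades speed for size, since it needs no lookup table.

```c
#include <stddef.h>
#include <stdint.h>

/* CRC-16/CCITT-FALSE: polynomial 0x1021, initial value 0xFFFF,
 * no bit reflection, no final XOR.
 * Assumption: the variant is chosen for illustration only. */
uint16_t crc16_ccitt(const uint8_t *data, size_t len)
{
    uint16_t crc = 0xFFFFu;

    for (size_t i = 0; i < len; i++) {
        crc ^= (uint16_t)((uint16_t)data[i] << 8); /* feed next byte, MSB first */
        for (int bit = 0; bit < 8; bit++) {
            if (crc & 0x8000u) {
                crc = (uint16_t)((crc << 1) ^ 0x1021u);
            } else {
                crc = (uint16_t)(crc << 1);
            }
        }
    }
    return crc;
}
```

The standard check value for this variant is 0x29B1 for the ASCII string "123456789", which is a handy test vector when implementing the same CRC on the main MCU side.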
The custom MBR implementation should be able to run anything you place at 0x1000: a barebone application, an application with a softdevice bundled, a Zephyr based application,... There are just a few details to remember:
nrf-dfu
author.