Approximately a year ago, I wrote about running Debian on a Synology DS212j. In that post, I noted that the stock kernel was quite old, and did not support kexec. At the time, that meant that the only option for having a permanent install of modern Debian was to completely replace the stock kernel and initrd on the boot flash with our own.

This isn’t particularly ideal because:

  • Kernel and initrd changes are tedious, since they involve special steps to convert them into the U-Boot image format and to re-flash the boot flash.
  • Extra steps are needed to get an initrd small enough to fit in flash.
  • There is no easy way to change the kernel boot command line.
    • As far as I can tell, it’s hard-coded into the bootloader.
    • Needing to patch the bootloader seems risky.
  • Going back to stock firmware is non-trivial.
    • It involves obtaining a copy of the stock firmware, extracting the stock kernel and initrd, and re-flashing them back to the boot flash.

Furthermore, messing with the boot flash is risky. A small mistake could easily result in a bricked device, requiring special equipment to fix. A bad kernel or initrd flash would require serial access and a TFTP server to remediate. Accidentally overwriting the bootloader may require desoldering the chip and using an SPI flasher to restore a working bootloader. For that reason, the less we touch the boot flash, the better.

A kexec-based second-stage bootloader

In the previous post, we already reverse engineered the boot process. We can already pretend to be Synology firmware, and have the stock kernel execute our userspace code as root. What if we could build a minimal userspace environment whose only purpose is to boot a more modern kernel and initrd using kexec - essentially creating a second-stage bootloader? Petitboot is one such project that implements this idea.

Unfortunately, as we discovered before, it appears that kexec support is not built into the stock kernel:

Cannot open /proc/atags: No such file or directory
Cannot open /proc/device-tree/memory/reg: No such file or directory
kexec_load failed: Function not implemented

It looks like we’ll need to do some more work in order to enable kexec support. This does mean that we’ll need to re-flash a patched kernel that can do kexec, but at least it’s only once.

The benefit of using a kexec approach is that it’s very low-touch. We can maintain near-perfect backwards compatibility with stock Synology firmware since we use essentially the same kernel (with all of the custom vendor changes), and retain the stock boot (and recovery) processes.

Getting kexec to work

Mysterious hanging

My first attempt was to simply recompile the Synology kernel with kexec functionality enabled. This produced a kernel with kexec that appeared to work at first. I was able to kexec into the stock kernel perfectly fine, yet attempting to kexec a modern Debian kernel resulted in a hang:

C:0x000080A0-0x0025F000->0x0073B300-0x00992260
Uncompressing Linux... done, booting the kernel.

Unfortunately, I didn’t have much to go on. The new kernel has earlyprintk support built in, and (in theory) should be spitting out serial output early on in the boot process. Since I wasn’t getting any serial output after the kernel had finished uncompressing, I assumed that something was going wrong very early in the boot process. However, without some kind of debugging mechanism available (e.g. JTAG, or KGDB), there was no easy way to get insight into what the CPU was doing.

DMA issues?

My first thought was that the stock kernel was failing to switch off certain peripherals, and that they were continuing to perform DMA, corrupting the new kernel’s memory while it was booting.

To test this hypothesis, I used the --mem-min flag in the kexec command line tool to cause the new kernel to be loaded at different addresses in memory. The reasoning was that by trying to load the new kernel at various locations in physical memory, it would be likely that at least one region would not overlap with any DMA regions. If we can successfully boot when loaded at certain memory regions, and not others, this would be strongly indicative of DMA corrupting memory somewhere.

However, the new kernel would hang regardless of where it was loaded. This didn’t rule out DMA as a contributing factor, but there was clearly a different, more immediate issue.
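The address sweep described above could be scripted roughly as follows. This is only a sketch: the RAM layout, image size, step and file names are placeholder assumptions, and the helper names are hypothetical. The only kexec options it relies on are --mem-min, --load and --initrd.

```python
import shlex

MIB = 1 << 20


def mem_min_candidates(ram_start, ram_size, image_size, step):
    """Yield load addresses that still leave room for the image in RAM."""
    addr = ram_start
    while addr + image_size <= ram_start + ram_size:
        yield addr
        addr += step


def kexec_load_command(kernel, initrd, mem_min):
    """Build a `kexec --load` command line pinned at or above mem_min."""
    args = [
        "kexec",
        f"--mem-min={mem_min:#x}",
        "--load",
        kernel,
        f"--initrd={initrd}",
    ]
    return " ".join(shlex.quote(a) for a in args)


if __name__ == "__main__":
    # Assumed layout: 256 MiB of RAM at physical address 0, 16 MiB set
    # aside for kernel plus initrd, probing every 32 MiB.
    for addr in mem_min_candidates(0, 256 * MIB, 16 * MIB, 32 * MIB):
        print(kexec_load_command("vmlinuz", "initrd.img", addr))
```

Each printed command would be followed by a kexec -e and a reboot back into the stock kernel before trying the next address.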

Prior art

One of the things I found unusual was the tiny amount of code for the kexec mechanism in the stock kernel. I’d expected the kexec mechanism to be quite complex, in order to ensure compatibility for all sorts of hardware. The stock kernel version was 2.6.32, which was probably very close to when kexec was first introduced. Perhaps the issue I was facing had been fixed already since then?

There were only a few dozen commits to machine_kexec.c since 2.6.32, so I started reading the commit messages. One commit described a fix for an issue suspiciously close to mine:

arm: Disable outer (L2) cache in kexec

kexec does not disable the outer cache before disabling the inner
caches in cpu_proc_fin(). So L2 is enabled across the kexec jump. When
the new kernel enables chaches again, it randomly crashes.

Sure enough, the Marvell 88F6281 SoC on the DS212j has a separate L2 cache, and a driver for it is enabled in the stock kernel. A second commit appears to add a fix to the cache driver so that it does not break kexec:

ARM: 7696/1: Fix kexec by setting outer_cache.inv_all for Feroceon

On Feroceon the L2 cache becomes non-coherent with the CPU
when the L1 caches are disabled. Thus the L2 needs to be invalidated
after both L1 caches are disabled.

On kexec before the starting the code for relocation the kernel,
the L1 caches are disabled in cpu_froc_fin (cpu_v7_proc_fin for Feroceon),
but after L2 cache is never invalidated, because inv_all is not set
in cache-feroceon-l2.c.
So kernel relocation and decompression may has (and usually has) errors.
Setting the function enables L2 invalidation and fixes the issue.

Cache coherency issues

I was able to quickly validate that the L2 cache was causing issues with kexec. I simply recompiled the stock kernel without the cache driver enabled, and tested kexec. Indeed, it was able to boot further into the new kernel, and I could see serial output from the new kernel. Hooray!

At this point, I could simply leave the cache driver disabled if all that’s needed is to be able to kexec. However, it would be nice to still be able to run stock Synology software without the performance penalty of not utilizing the L2 cache.

I ended up backporting all the fixes above into the stock kernel, and added some extra code to disable the L2 cache just prior to jumping to the new image (Linux expects all data caches to be disabled upon booting). Unfortunately, the documentation for the cache controller is under NDA, so I had to make an educated guess on how to properly disable the cache based on existing code comments, the SoC reference manual, and the U-Boot bootloader code. In practice, I don’t think this is strictly required if the new kernel is also compiled with the cache driver enabled, but I did it for completeness anyway.

I will share the patches as soon as I can publish them.

Device-tree handling

Now that it was possible to properly kexec a new kernel, I had to deal with a different issue:

C:0x000080A0-0x0025F000->0x0073B300-0x00992260
Uncompressing Linux... done, booting the kernel.

Error: invalid dtb and unrecognized/unsupported machine ID
  r1=0x0000020f, r2=0x00001000
  r2[]=05 00 00 00 01 00 41 54 00 00 00 00 00 00 00 00
Available machine support:

ID (hex)        NAME
ffffffff        Generic DT based system
ffffffff        Marvell Kirkwood (Flattened Device Tree)
0000054e        Marvell Orion-2 Development Board
000005e4        Marvell Orion-NAS Reference Design
00000631        Buffalo Linkstation Pro/Live
000005e5        Buffalo/Revogear Kurobox Pro
00000630        Buffalo Terastation Pro II/Live
000007d5        Buffalo Linkstation LS-HGL
0000061d        QNAP TS-109/TS-209
00000641        QNAP TS-409
00000661        Linksys WRT350N v2
00000674        Technologic Systems TS-78xx SBC
0000069d        HP Media Vault mv2120
00000926        LaCie 2Big Network
00000709        Netgear WNR854T
00000714        Marvell Orion-VoIP GE Reference Design
0000071a        Marvell Orion-VoIP FXO Reference Design
00000766        Marvell Orion-1-90 AP GE Reference Design
ffffffff        Marvell Orion5x (Flattened Device Tree)

Please check your kernel config and/or bootloader.

There are two main ways to pass parameters to an ARM Linux kernel: ATAGs and device trees.

Prior to the advent of device trees for ARM, kernels for ARM platforms weren’t generic. Each ARM platform needed to be allocated a machine ID, and needed a platform-specific directory for code to describe and initialize the hardware. On boot up, the kernel would need to look up the machine ID provided by the bootloader against the hardcoded list of platforms that had support compiled in. Auxiliary parameters like the kernel command line, initrd, and the physical memory region information were passed in via a tagged list called ATAGs.
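To give a feel for what a tagged list looks like, here is a minimal sketch of an ATAG walker. Each tag starts with two little-endian 32-bit words: the tag size in words (header included) and the tag ID. The list begins with ATAG_CORE and ends with ATAG_NONE. The tag constants are the ones used by the ARM Linux boot protocol; the function name and the example command line in the usage note are made up for illustration.

```python
import struct

ATAG_NONE = 0x00000000
ATAG_CORE = 0x54410001
ATAG_MEM = 0x54410002
ATAG_INITRD2 = 0x54420005
ATAG_CMDLINE = 0x54410009

TAG_NAMES = {
    ATAG_CORE: "ATAG_CORE",
    ATAG_MEM: "ATAG_MEM",
    ATAG_INITRD2: "ATAG_INITRD2",
    ATAG_CMDLINE: "ATAG_CMDLINE",
}


def parse_atags(buf):
    """Walk a little-endian ATAG list and return (name, payload) tuples."""
    tags = []
    offset = 0
    while offset + 8 <= len(buf):
        size_words, tag = struct.unpack_from("<II", buf, offset)
        if tag == ATAG_NONE:
            break  # terminator tag
        if size_words < 2:
            raise ValueError("corrupt ATAG list: tag smaller than its header")
        end = offset + size_words * 4
        # The payload is whatever follows the 8-byte header; the slice is
        # simply shorter if the buffer was cut off mid-tag.
        tags.append((TAG_NAMES.get(tag, hex(tag)), buf[offset + 8:end]))
        offset = end
    return tags
```

For example, an ATAG_MEM payload unpacks with struct.unpack("<II", payload) into a (size, start) pair, and an ATAG_CMDLINE payload is a NUL-padded string such as console=ttyS0,115200.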

The push to device trees changed the landscape for supporting new ARM platforms. Instead of describing the hardware in code, the bootloader would pass in a descriptor of the hardware (called a device tree blob) to the kernel. That descriptor would also contain information that would previously be in the ATAGs. Platforms that didn’t need any platform-specific initialization could (in theory) use a generic ARM Linux kernel, and simply build a device tree for it - no actual kernel development would be required to support the platform. The machine ID would only really matter for platforms with platform-specific initialization code. Device trees became the norm, and ATAGs were deprecated.

The error we see above shows that we were trying to boot a modern, generic device-tree-based kernel using the old ATAGs method, and it didn’t know anything about our platform. Modern kernels come with a build-time option to enable backwards compatibility with bootloaders that only support ATAGs booting. It works by having the kernel search for a device tree blob immediately after the compressed image in memory. If it finds one, the kernel behaves as if it was booted using that device tree. Old ATAGs-based bootloaders could be instructed to boot a kernel image with the device tree blob appended to it. The Debian kernel we’re booting indeed has this option enabled.
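The appended-blob convention can be sketched as below. A device tree blob starts with a big-endian header whose first two words are the magic number 0xd00dfeed and the total blob size, which is what makes it recognisable right after the kernel image. The helper names are hypothetical, and the real check lives in the kernel's decompressor rather than in anything like this Python code.

```python
import struct

FDT_MAGIC = 0xD00DFEED  # first big-endian word of every device tree blob


def append_dtb(zimage, dtb):
    """Concatenate a kernel image and a DTB, like `cat zImage board.dtb`."""
    magic, total_size = struct.unpack_from(">II", dtb, 0)
    if magic != FDT_MAGIC:
        raise ValueError("not a device tree blob")
    if total_size > len(dtb):
        raise ValueError("device tree blob is truncated")
    return zimage + dtb[:total_size]


def find_appended_dtb(image, kernel_size):
    """Return the DTB that sits directly after the kernel, or None."""
    if len(image) < kernel_size + 8:
        return None
    magic, total_size = struct.unpack_from(">II", image, kernel_size)
    if magic != FDT_MAGIC:
        return None
    return image[kernel_size:kernel_size + total_size]
```

The important detail is that the lookup only works if the bytes directly following the kernel image actually make it into memory.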

I tried kexec-ing using the modern boot method, by passing a device tree through the --dtb flag. I also tried kexec-ing with the backwards-compatible method using the --atags flag, with the device tree blob appended to the kernel image. Yet the kernel would not boot, no matter which method I tried: the booted kernel would not recognize that I’d passed in a device tree.

Kexec and device trees

The kexec tool does not do anything special when a device tree is passed in via --dtb. It merely adds it as yet-another-blob to be loaded into memory when kexec-ing. As seen in this commit, there’s actually extra code in the kernel to look for device tree blobs, and adjust the kernel boot arguments as needed. The stock kernel is too old, and does not include that code. The device tree blob is simply loaded into memory, but the new kernel has no way to tell that it exists.
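As far as I understand the ARM boot code, the new kernel decides between the two boot methods by looking at the memory that r2 points to: a device tree blob starts with the big-endian magic 0xd00dfeed, while an ATAG list starts with an ATAG_CORE tag. The sketch below mimics that decision in Python (the function name is hypothetical) and can be applied to the r2[] bytes from the error output earlier.

```python
import struct

FDT_MAGIC = 0xD00DFEED       # big-endian magic at the start of a DTB
ATAG_CORE = 0x54410001       # first tag of every ATAG list
ATAG_CORE_SIZE = 5           # header (2 words) + flags, pagesize, rootdev
ATAG_CORE_SIZE_EMPTY = 2     # header only


def classify_boot_params(blob):
    """Guess what the pointer in r2 refers to: 'dtb', 'atags' or 'invalid'."""
    if len(blob) < 8:
        return "invalid"
    if struct.unpack_from(">I", blob, 0)[0] == FDT_MAGIC:
        return "dtb"
    size_words, tag = struct.unpack_from("<II", blob, 0)
    if tag == ATAG_CORE and size_words in (ATAG_CORE_SIZE, ATAG_CORE_SIZE_EMPTY):
        return "atags"
    return "invalid"


if __name__ == "__main__":
    # The r2[] bytes printed in the boot failure above.
    dump = bytes.fromhex("05 00 00 00 01 00 41 54 00 00 00 00 00 00 00 00")
    print(classify_boot_params(dump))
```

The first eight bytes of that dump decode to a size of 5 words and the ATAG_CORE tag, so r2 was still pointing at an ATAG list rather than at the device tree blob.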

So, why does appending the device tree blob to the end of the kernel image and booting via the backwards compatibility mechanism not work? It turns out that the kexec tool is being clever, and is parsing the header of the kernel image we’re trying to boot. It sees that the size of the kernel image as reported by the headers does not match the file size, and will only load up to the size specified in the headers. That means that our device tree is “conveniently” chopped off when loading into memory.
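The truncation is easy to reproduce on paper. An ARM zImage carries a small header at offset 0x24 holding a magic number (0x016f2818) followed by the image's start and end addresses, so the size the image claims for itself is end minus start. The sketch below assumes a little-endian image and uses hypothetical helper names; it illustrates the behaviour described above and is not the kexec tool's actual code.

```python
import struct

ZIMAGE_MAGIC = 0x016F2818
ZIMAGE_HEADER_OFFSET = 0x24  # magic, start and end follow each other here


def zimage_declared_size(image):
    """Return the image size recorded in the zImage header (end - start)."""
    magic, start, end = struct.unpack_from("<III", image, ZIMAGE_HEADER_OFFSET)
    if magic != ZIMAGE_MAGIC:
        raise ValueError("not an ARM zImage")
    return end - start


def bytes_kexec_would_load(image):
    """Keep only the bytes covered by the header; anything appended is lost."""
    return image[:min(len(image), zimage_declared_size(image))]


def has_trailing_data(image):
    """True if the file is longer than the header says, e.g. an appended DTB."""
    return len(image) > zimage_declared_size(image)
```

Running has_trailing_data on a kernel with an appended device tree returns True, and bytes_kexec_would_load shows that exactly those trailing bytes are the ones dropped.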

I backported the commit that causes kexec to look for device tree blobs, and was finally able to boot the Debian kernel using the --dtb flag.

What’s next?

The next step is to write this modified kernel to the boot flash (bonus points if the upgrade mechanism of the Synology firmware can be utilised in order to do this more safely), so it permanently replaces the stock kernel. When this is complete, it’ll pave the way for us to create a small userspace environment to boot our more modern Debian kernel.