bFLT format - implementation notes
One of the biggest show-stoppers in writing a bFLT file loader for Ndless was the lack of documentation for the bFLT file format. Information was scattered everywhere, and it was hard to find out how to implement specific features. I decided to write a simple description of all the features in bFLT and how to implement them.
The first thing I should mention is that there are two versions of bFLT out there. I will be describing the simpler, version 4 format.
First, we have the file header structure:
#define FLAT_VERSION 0x00000004L
struct flat_hdr {
    char magic[4];
    unsigned long rev;          /* version (as above) */
    unsigned long entry;        /* Offset of first executable instruction
                                   with text segment from beginning of file */
    unsigned long data_start;   /* Offset of data segment from beginning of
                                   file */
    unsigned long data_end;     /* Offset of end of data segment
                                   from beginning of file */
    unsigned long bss_end;      /* Offset of end of bss segment from beginning
                                   of file */

    /* (It is assumed that data_end through bss_end forms the bss segment.) */

    unsigned long stack_size;   /* Size of stack, in bytes */
    unsigned long reloc_start;  /* Offset of relocation records from
                                   beginning of file */
    unsigned long reloc_count;  /* Number of relocation records */
    unsigned long flags;
    unsigned long build_date;   /* When the program/library was built */
    unsigned long filler[5];    /* Reserved, set to zero */
};
#define FLAT_FLAG_RAM 0x0001 /* load program entirely into RAM */
#define FLAT_FLAG_GOTPIC 0x0002 /* program is PIC with GOT */
#define FLAT_FLAG_GZIP 0x0004 /* all but the header is compressed */
#define FLAT_FLAG_GZDATA 0x0008 /* only data/relocs are compressed (for XIP) */
#define FLAT_FLAG_KTRACE 0x0010 /* output useful kernel trace for debugging */
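For simplicity, all integer fields are in network byte order (i.e. big endian) and are 32 bits wide. When you write your loader, make sure you convert them to the byte order of your machine. Note that unsigned long is wider than 32 bits on many 64-bit hosts, so use a fixed-width type such as uint32_t if you reuse this structure there.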
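One way to do the conversion without caring about the host's byte order is to assemble each field from its four bytes. This is only a sketch; the name be32_read is my own and is not part of any bFLT header.

```c
#include <stdint.h>

/* Read a 32-bit big-endian value from a byte buffer.
 * Gives the same result on little- and big-endian hosts, and does not
 * require the buffer to be aligned. (be32_read is a hypothetical name.) */
uint32_t be32_read(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) |
           ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |
            (uint32_t)p[3];
}
```

For example, the rev field lives at byte offset 4 of the header, so be32_read(file + 4) should give 4 for a version 4 binary.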
Loading Executables
The magic field should be the bytes “bFLT”. The rev field should be set to FLAT_VERSION. It’s advisable to check these fields before loading the binary.
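A check along those lines could look like the sketch below. It works on the raw bytes of the file, and it assumes the header is 64 bytes long (4 bytes of magic plus fifteen 32-bit fields, as in the structure above). The names flat_header_ok and FLAT_HDR_SIZE are made up for this example.

```c
#include <stddef.h>
#include <string.h>

#define FLAT_HDR_SIZE 64  /* 4-byte magic + 15 fields of 4 bytes each */

/* Return 1 if the buffer starts with a plausible version 4 bFLT header,
 * 0 otherwise. The rev field is compared byte by byte because it is
 * stored big endian in the file. (flat_header_ok is a hypothetical name.) */
int flat_header_ok(const unsigned char *file, size_t file_len)
{
    static const unsigned char rev4[4] = {0x00, 0x00, 0x00, 0x04};

    if (file_len < FLAT_HDR_SIZE)
        return 0;                       /* too short to hold a header */
    if (memcmp(file, "bFLT", 4) != 0)
        return 0;                       /* wrong magic */
    if (memcmp(file + 4, rev4, 4) != 0)
        return 0;                       /* not FLAT_VERSION (4) */
    return 1;
}
```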
The next 3 fields are all file offsets. If your system doesn’t differentiate between code and data, you can load the text and data sections in a single read, since the sections are stored one after the other in the file. It is assumed that text_start = entry, text_end = data_start and bss_start = data_end.
Although the bss_end field implies that a bss section exists in the file, it actually doesn’t. You need to make sure that (bss_end - data_end) bytes are available and zeroed at the end of the executable image.
The total executable image size can be calculated by (bss_end - entry).
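Putting the last few paragraphs together, a loader that doesn't separate code and data might build its image roughly like this. It is a sketch under some assumptions: the whole file is already in memory, the header fields have already been converted to host byte order, and the struct and function names (flat_layout, flat_build_image) are invented for the example. The stack is not handled here.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical host-order copy of the offsets needed for loading. */
struct flat_layout {
    uint32_t entry, data_start, data_end, bss_end;
};

/* Allocate an image of (bss_end - entry) bytes, copy text and data from
 * the file in one go, and leave the bss zero-filled.
 * Returns NULL on inconsistent offsets or allocation failure. */
unsigned char *flat_build_image(const unsigned char *file, size_t file_len,
                                const struct flat_layout *l, size_t *image_len)
{
    /* The offsets must be ordered, and text+data must lie inside the file. */
    if (l->entry > l->data_start || l->data_start > l->data_end ||
        l->data_end > l->bss_end || l->data_end > file_len ||
        l->bss_end == l->entry)
        return NULL;

    size_t loaded = l->data_end - l->entry;   /* bytes present in the file */
    size_t total  = l->bss_end - l->entry;    /* text + data + bss */

    unsigned char *image = calloc(total, 1);  /* calloc zeroes the bss */
    if (image == NULL)
        return NULL;

    memcpy(image, file + l->entry, loaded);
    *image_len = total;
    return image;
}
```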
The stack_size field is pretty much self-explanatory.
The flags field is a bitfield of flags. The FLAT_FLAG_* definition comments describe the meaning of the flags.
Relocations
There are two kinds of relocation you may need to do. The first is relocating global and static pointers; the other is fixing up the GOT.
Relocating global and static pointers
If reloc_count is more than 0, it means that there are pointers that need to be fixed up.
The reloc_start field is a file offset to a list of pointers (note, they are still big endian!). These entries point to pointers in the executable image and are expressed as zero-based offsets into it (the exception being shared library references, explained later).
If you’re implementing XIP and you’re keeping code and data separate, you can work out which section an offset points to by comparing it to the sizes of each section. For example:
if (offset < (data_start - entry)) {
    /* pointing to code section */
} else if (offset < (data_end - entry)) {
    /* pointing to data section */
} else if (offset < (bss_end - entry)) {
    /* pointing to bss section */
}
In most cases, you can derive the real address of the offsets by adding the executable image base address to the offset (obviously a little more work is required for XIP executables).
Dereference the pointer and you get another offset (this time, it’s in the native endianness of the target). This is the offset you need to fix up. In most cases you can add the executable image base address to it and store it back. With XIP, you will need to work out how to get the absolute address for the offset (not very difficult).
After you’ve gone through all reloc_count offsets, you’re done for this part.
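For the simple, non-XIP case, the whole procedure can be sketched as a single loop. This assumes a 32-bit target, an image that is already in memory, and a relocation table that is still in its raw big-endian form as read from the file. It ignores shared library references entirely. The function name flat_apply_relocs and its parameters are my own invention.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Apply reloc_count relocations to an in-memory image.
 * relocs points at the raw (big-endian) table from the file.
 * base is the address the image was loaded at (32-bit target assumed).
 * Returns 0 on success, -1 if an entry points outside the image. */
int flat_apply_relocs(unsigned char *image, size_t image_len,
                      const unsigned char *relocs, uint32_t reloc_count,
                      uint32_t base)
{
    for (uint32_t i = 0; i < reloc_count; i++) {
        const unsigned char *r = relocs + (size_t)i * 4;

        /* Table entries are big endian: convert to host order. */
        uint32_t offset = ((uint32_t)r[0] << 24) | ((uint32_t)r[1] << 16) |
                          ((uint32_t)r[2] << 8)  |  (uint32_t)r[3];

        if (image_len < 4 || offset > image_len - 4)
            return -1;

        /* The value stored in the image is in native byte order. */
        uint32_t value;
        memcpy(&value, image + offset, sizeof value);
        value += base;
        memcpy(image + offset, &value, sizeof value);
    }
    return 0;
}
```

Using memcpy for the read and write avoids unaligned accesses, which matters on targets that fault on them.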
GOT relocation
If your flags indicate that your program is PIC with GOT, you need to do some additional fixups.
At the start of your data section is a list of pointers terminated by a pointer value of -1. They need to be fixed up in the same way you fixed up the relocations above.
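A GOT fixup pass could be sketched as follows, again assuming a 32-bit target and a non-XIP image. Here data_offset is the offset of the data section within the image, i.e. (data_start - entry). Note one assumption that goes beyond the description above: this sketch leaves zero entries untouched so that null pointers stay null. The name flat_fixup_got is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Walk the GOT at the start of the data section and add base to each
 * entry until the 0xFFFFFFFF (-1) terminator is reached.
 * Returns 0 on success, -1 if no terminator is found inside the image. */
int flat_fixup_got(unsigned char *image, size_t image_len,
                   uint32_t data_offset, uint32_t base)
{
    for (size_t pos = data_offset; ; pos += 4) {
        if (image_len < 4 || pos > image_len - 4)
            return -1;                  /* ran off the end of the image */

        uint32_t entry;
        memcpy(&entry, image + pos, sizeof entry);

        if (entry == 0xFFFFFFFFu)
            return 0;                   /* terminator: done */

        if (entry != 0) {               /* assumption: keep NULL as NULL */
            entry += base;
            memcpy(image + pos, &entry, sizeof entry);
        }
    }
}
```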
Entry point
After all the relocation has taken place, you can simply jump to the base of the code section to run the executable.
Shared libraries
Shared libraries are equivalent to executables in bFLT. The executable linkage to the shared library is determined at compile-time rather than runtime.
Shared libraries are identified by numbers from 0 to 255, of which 0 and 255 are reserved.
During relocation, the high byte of the offset determines the library needed to complete the linkage.
For example, if the offset needed fixing during relocation was 0x030003a0, the fix up for that offset would be a pointer pointing to the offset 0x03a0 into the library identified by the number 0x03.
You would find the library associated with the number in the high byte, load it into memory, find the absolute address of the offset into the library and replace the fixup with that.
Using the same example, if I had an executable loaded into memory at 0x1000 and a library with an ID of 3, loaded into memory at 0x2000, the fix up for 0x030003a0 would be 0x23a0.
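The arithmetic in that example can be sketched like this. The table of base addresses is purely an assumption of mine: index 0 holds the base of the executable itself (so ordinary offsets with a zero high byte resolve against it), and a base of 0 marks a library that has not been loaded. The name flat_resolve is hypothetical.

```c
#include <stdint.h>

/* Split a relocation value into library ID (high byte) and offset
 * (low 24 bits), then resolve it against a table of load addresses.
 * bases[0] is assumed to be the executable's own base address.
 * Returns 0 and stores the absolute address in *out, or -1 if the ID
 * is the reserved value 255 or the library is not loaded (base of 0). */
int flat_resolve(uint32_t value, const uint32_t bases[256], uint32_t *out)
{
    uint32_t id     = value >> 24;
    uint32_t offset = value & 0x00FFFFFFu;

    if (id == 255 || bases[id] == 0)
        return -1;

    *out = bases[id] + offset;
    return 0;
}
```

With bases[0] = 0x1000 and bases[3] = 0x2000, resolving 0x030003a0 gives 0x23a0, matching the example above.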
When a shared library is loaded, I believe you also need to enter its entry point to initialize the library before the main binary runs.
Shared libraries are, otherwise, loaded and treated exactly the same as a normal executable.