Boot and Link¶
Responsibility¶
- Define how the bootloader hands control to the kernel (
loader→kernelMain). - Describe the kernel’s binary format, memory layout, and segment setup via the linker script and GDT.
- Ensure all subsystems share a consistent view of addresses and CPU mode.
Multiboot Header and Loader (loader.s)¶
Multiboot header¶
- Constants:
MAGIC = 0x1badb002FLAGS = (1 << 0) | (1 << 1)CHECKSUM = -(MAGIC + FLAGS)- Section:
- This forms a valid Multiboot header:
MAGICidentifies the kernel as Multiboot‑compatible.FLAGSrequest specific Multiboot info (e.g., memory info, aligned modules).CHECKSUMmakesMAGIC + FLAGS + CHECKSUM == 0(required by spec).- The linker script places
.multibootat the start of.text, so GRUB/Multiboot will find it before any code.
Entry symbol and call sequence¶
- Global entry symbol for the kernel image:
loaderis the ELF entry point (seeENTRY(loader)inlinker.ld).- Execution flow:
loader:
mov $kernel_stack, %esp # set up kernel stack (top)
call callConstructors # run C++ global/static constructors
push %eax # push Multiboot magic (convention)
push %ebx # push pointer to multiboot_info
call kernelMain # transfer control to C++ kernel
- Register conventions at entry:
- Multiboot bootloader sets:
%eax= multiboot magic%ebx= pointer tomultiboot_infostruct
loaderforwards these as arguments tokernelMainvia the stack.- Stop path:
- Used as a halt loop if the kernel ever returns to
_stop.
Kernel stack¶
- Reserves 2 MiB in
.bssfor the kernel stack. mov $kernel_stack, %espsets%espat the end of this region (growing downward).
Linker Script (linker.ld)¶
Target and entry¶
- Kernel is a 32‑bit i386 ELF binary.
- Linker entry symbol is
loader(matching the assembly label above).
Memory layout¶
SECTIONS
{
. = 0x0100000; # load/virtual address: 1 MiB
.text :
{
*(.multiboot)
*(.text*)
*(.rodata)
}
.data :
{
start_ctors = .;
KEEP(*( .init_array ));
KEEP(*(SORT_BY_INIT_PRIORITY( .init_array.* )));
end_ctors = .;
*(.data)
}
.bss :
{
*(.bss)
}
/DISCARD/ : { *(.fini_array*) *(.comment) }
}
- The link address starts at
0x0010_0000(1 MiB). - Traditional Multiboot‑friendly location; avoids low real‑mode memory.
- Sections:
.text:*(.multiboot)at the very start, so the Multiboot header lives at the beginning of the kernel image.- All code
*(.text*)and read‑only data*(.rodata).
.data:- Defines
start_ctorsandend_ctorsaround.init_array: KEEP(*( .init_array ))KEEP(*(SORT_BY_INIT_PRIORITY( .init_array.* )))- Used by
callConstructors()inkernel.ccto run C++ global constructors. - Then places
*(.data).
- Defines
.bss:- All zero‑initialized data
*(.bss)(including the 2 MiB kernel stack).
- All zero‑initialized data
/DISCARD/:- Drops
.fini_array*and.commentsections from the final binary.
- Drops
C++ constructors interface¶
linker.ldcreates two symbols:start_ctors– beginning of.init_arrayregion.end_ctors– end of.init_arrayregion.kernel.ccprovides:
typedef void (*constructor)();
extern "C" constructor start_ctors;
extern "C" constructor end_ctors;
extern "C" void callConstructors() {
for (constructor* i = &start_ctors; i != &end_ctors; i++)
(*i)();
}
loadercallscallConstructorsbeforekernelMain, so all global/static objects are initialized before kernel code uses them.
Global Descriptor Table (gdt.cc)¶
Responsibility¶
- Define and load a basic GDT with:
- Null descriptor.
- Unused placeholder descriptor.
- Code segment descriptor.
- Data segment descriptor.
- Provide segment selectors to other subsystems (e.g., multitasking, interrupts).
Layout and construction¶
GlobalDescriptorTable::GlobalDescriptorTable()
: nullSegmentSelector(0, 0, 0),
unusedSegmentSelector(0, 0, 0),
codeSegmentSelector(0, 64 * 1024 * 1024, 0x9A),
dataSegmentSelector(0, 64 * 1024 * 1024, 0x92)
{
uint32_t i; [github](https://github.com/pac-ac/osakaOS/actions)
i = (uint32_t)this; [github](https://github.com/pac-ac/osakaOS)
i = sizeof(GlobalDescriptorTable) << 16;
asm volatile("lgdt (%0)" : : "p"(((uint8_t*)i) + 2));
}
- Segment descriptors:
nullSegmentSelector– mandatory null descriptor (selector 0).unusedSegmentSelector– reserved/unused descriptor.codeSegmentSelector:- Base:
0. - Limit:
64 * 1024 * 1024bytes (64 MiB). - Type:
0x9A(present, ring 0, executable, readable code).
- Base:
dataSegmentSelector:- Base:
0. - Limit:
64 * 1024 * 1024bytes. - Type:
0x92(present, ring 0, writable data).
- Base:
lgdtload:- Builds a pseudo‑descriptor
iwith:i[0]high word = size of GDT.i [github](https://github.com/pac-ac/osakaOS/actions)= base address of the GDT object.
- Calls
lgdtto load the GDT register with the address of this table.
Segment selector accessors¶
uint16_t GlobalDescriptorTable::DataSegmentSelector() {
return (uint8_t*)&dataSegmentSelector - (uint8_t*)this;
}
uint16_t GlobalDescriptorTable::CodeSegmentSelector() {
return (uint8_t*)&codeSegmentSelector - (uint8_t*)this;
}
- These return the offsets (selector values) of the code and data descriptors within the GDT object.
- Other components (e.g., task state, segment loading code) should use these functions to obtain selectors instead of hard‑coding values.
SegmentDescriptor encoding¶
- Constructor:
GlobalDescriptorTable::SegmentDescriptor::SegmentDescriptor(uint32_t base, uint32_t limit, uint8_t type)
{
uint8_t* target = (uint8_t*)this;
if (limit <= 65536) {
target = 0x40; // 16‑bit segment, byte granularity [github](https://github.com/AlgorithMan-de/wyoos/actions)
} else {
// 32‑bit segment with 4 KiB granularity
if ((limit & 0xFFF) != 0xFFF)
limit = (limit >> 12) - 1;
else
limit = limit >> 12;
target = 0xC0; // granularity bit set, 32‑bit segment [github](https://github.com/AlgorithMan-de/wyoos/actions)
}
// Limit (lower 16 bits + high 4 bits)
target = limit & 0xFF;
target = (limit >> 8) & 0xFF; [github](https://github.com/pac-ac/osakaOS)
target [github](https://github.com/AlgorithMan-de/wyoos/actions) |= (limit >> 16) & 0xF;
// Base (32 bits)
target = base & 0xFF; [github](https://github.com/pac-ac/osakaOS/actions)
target = (base >> 8) & 0xFF; [github](https://github.com/pac-ac/osakaOS/issues)
target = (base >> 16) & 0xFF; [github](https://github.com/pac-ac/osakaOS/activity)
target = (base >> 24) & 0xFF; [github](https://github.com/pac-ac/osakaOS/issues/19)
// Type / access byte
target = type; [github](https://github.com/pac-ac/osakaOS/blob/main/Makefile)
}
- Limit handling:
- If
limit <= 65536, segment uses byte granularity (16‑bit style). -
Else:
- Uses 4 KiB granularity (
G=1). - Limit is shifted by 12 bits; if low bits are not all 1, it adjusts to keep the effective limit correct and within physical constraints.
- Uses 4 KiB granularity (
-
Helper methods:
uint32_t GlobalDescriptorTable::SegmentDescriptor::Base();
uint32_t GlobalDescriptorTable::SegmentDescriptor::Limit();
Base()reconstructs base address from descriptor bytes.Limit()reconstructs limit and re‑applies granularity (ifG=1, multiplies back by 4 KiB and or’s with0xFFF).
Invariants and Assumptions¶
- CPU boots under a Multiboot‑compliant bootloader (e.g., GRUB) that:
- Loads the kernel ELF at or compatible with the link address
0x0010_0000. - Provides
multiboot_infoand magic values in%ebxand%eaxbefore jumping toloader. loader:- Must be the first executed code in the kernel image.
- Must not use C/C++ runtime features before
callConstructors. - Stack:
- Kernel stack is a 2 MiB
.bssregion starting atkernel_stack, growing downward. - No stack guard or detection yet; overflow is undefined behavior.
- GDT:
- Assumes flat 0‑based memory with code and data segments spanning at least 64 MiB.
- Other subsystems rely on
CodeSegmentSelector()andDataSegmentSelector()values instead of hard‑coded selectors. - C++:
- All global/static objects that need initialization must have their constructors placed into
.init_array;callConstructors()must be called once beforekernelMain.
Open Questions / TODO¶
- Decide whether to explicitly document the transition into protected mode (currently assumed done by the bootloader) or add a protected‑mode setup stub.
- Confirm and document the exact physical vs virtual mapping strategy around
0x0010_0000(tie this tomemory.md). - Consider adding a small diagram for memory layout showing:
- bootloader, kernel
.text/.data/.bss, stack, heap start (10 MiB), etc. - If segmentation is later minimized (pure paging model), clarify which parts of GDT remain relevant (e.g., just flat code/data). ```