Relocatable vs. Absolute Machine Code

There is a fundamental difference between relocatable code and what is considered position-independent code.
I’ve been coding assembly for a long time and on many different architectures, and I’ve always thought of machine code as coming in three specific flavors:

  • Position-Independent-Code
  • Relocatable-Code
  • Absolute-Code

Let’s first discuss position-independent code. This is code that, when assembled, has all of its instructions relative to one another. Branches, for instance, specify an offset from the current instruction pointer (or program counter, whichever you want to call it). Code that is position independent will typically consist of only one segment of code and have its data contained within that same segment (or section). There are exceptions to data being embedded within the same segment, but these are benefits usually passed on to you by the operating system or loader.

It’s a very useful type of code because the operating system does not need to perform any post-loading operations on it before it can start executing. It will just run anywhere that it is loaded in memory. Of course, this type of code has its problems too. Code and data cannot be segregated into the different memory types that might suit each of them, and there are limits on size before relative offsets start moving out of range, to name but a few.
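To make the "runs anywhere" point concrete, here is a minimal sketch in Python of how a relative branch resolves its target. The function name and the numbers are made up for illustration, and it assumes the common convention that the displacement is measured from the instruction following the branch (some architectures measure it from a different point).

```python
def relative_branch_target(load_address, branch_offset, branch_length, displacement):
    """Return the address a PC-relative branch jumps to.

    Assumption: the displacement is added to the address of the
    instruction that follows the branch.
    """
    next_instruction = load_address + branch_offset + branch_length
    return next_instruction + displacement


# The same encoded branch, loaded at two different (hypothetical) addresses.
for base in (0x1000, 0x7F000):
    target = relative_branch_target(base, branch_offset=0x10,
                                    branch_length=2, displacement=0x20)
    # The target always sits at the same distance from the load address,
    # so nothing in the encoded instruction needs to change.
    print(hex(base), hex(target), hex(target - base))
```

The last column is identical for both load addresses, which is the whole trick: the encoded bytes never mention where the code actually lives.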

Relocatable code is quite like position-independent code in many ways, but it has a very subtle difference. As its name suggests, this type of code can be loaded anywhere in memory, but it usually has to be relocated, or fixed up, before it is executable. In fact, some executable formats that hold this type of code embed things like “reloc” sections for this very purpose of fixing up the relocatable parts of the code. The downside of this type of code is that once it is relocated and fixed up, it almost becomes absolute in nature and fixed at its address.

What gives relocatable code its major advantage, and the reason why it is the most prevalent code around, is that it allows code to be easily broken down into sections. Each section can be loaded anywhere in memory to fit its requirements, and then during relocation any code that references another section can be fixed up using a relocation table, so the sections are tied together nicely. The code itself is usually relative (as with the x86 architecture), but it need not be, as anything that might be out of range can be assembled as a relocatable instruction, one that consists of an offset which is added to the load address. It also means that the limitations imposed by relative addressing are no longer an issue.
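The fix-up step described above can be sketched in a few lines of Python. This is a toy model under stated assumptions, not any real loader or file format: the image layout, the `relocate` function, and the idea that every relocation entry points at a 32-bit little-endian slot holding an offset from the start of the image are all invented for illustration.

```python
import struct


def relocate(image, reloc_table, load_address):
    """Return a copy of image with every listed 32-bit slot fixed up.

    Each entry in reloc_table is the byte position of a slot that holds
    an offset from the start of the image (a hypothetical format).
    """
    fixed = bytearray(image)
    for slot in reloc_table:
        (offset,) = struct.unpack_from("<I", fixed, slot)
        # Offset + load address = the absolute address the code will use.
        struct.pack_into("<I", fixed, slot, (offset + load_address) & 0xFFFFFFFF)
    return bytes(fixed)


# Hypothetical 12-byte image: the slot at byte 4 holds 0x08,
# meaning "the item 8 bytes into this image".
image = bytes(4) + struct.pack("<I", 0x08) + bytes(4)
reloc_table = [4]

fixed = relocate(image, reloc_table, 0x400000)
print(hex(struct.unpack_from("<I", fixed, 4)[0]))  # prints 0x400008
```

Note that after the call the slot holds a plain absolute address, which is why relocated code is pretty much tied to the address it was fixed up for.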

The final type of code is absolute code. This is code that is assembled to work at one specific address and will only work when loaded at that specific address. Branch and jump instructions all contain a fixed, exact (absolute) address. It is a type of code usually found on embedded systems, where it can be guaranteed that a piece of code will be loaded at that specific address because it is the only thing that is loaded there. On a modern general-purpose computer, such absolute code generally wouldn’t work, because the code needs to be loaded wherever there is free memory and there’s never any guarantee that a certain memory range will be available. Absolute code does have its advantages though, mainly that it is generally the fastest to execute, but this can be platform dependent.
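As a rough illustration of why absolute code breaks when it is moved, here is a small Python sketch. The function name and the addresses are hypothetical; it simply compares the address baked into a jump at assembly time with where the intended instruction actually ends up.

```python
def absolute_jump_lands_correctly(assembled_for, loaded_at, target_offset):
    """True if an absolute jump still reaches its intended instruction."""
    encoded_target = assembled_for + target_offset  # fixed at assembly time
    actual_location = loaded_at + target_offset     # where the instruction really is
    return encoded_target == actual_location


# Loaded where it was assembled for: the jump works.
print(absolute_jump_lands_correctly(0x8000, 0x8000, 0x40))  # prints True
# Loaded 0x1000 bytes higher: the jump still points at the old address.
print(absolute_jump_lands_correctly(0x8000, 0x9000, 0x40))  # prints False
```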

[Figure: compilation process stages]

[Figure: linker and loader for ELF relocatable files]

