Answer by Martin Rosenau for How are static data variables addresses determined at compile time in MIPS1?

However, does the assembler determine these addresses at compile time, or are these addresses determined when the program is loaded into memory by the OS?

This varies from OS to OS and even depends on the kind of variable:

Ten years ago, most WLAN routers used a MIPS CPU running Linux.

On Linux, there is a kind of code called "position-dependent" code. This code is always loaded at the same address in memory. In this case, one tool of the compiler tool chain (compiler, assembler, linker, ...) (*) calculates the address of the variable and hard-codes it into the instructions. Example: If the variable is located at address 0x100024, the code will look like this:

lui t2, 0x10
lw  t3, 0x24(t2)
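
How the tool chain might split 0x100024 into the two immediates can be sketched in C. This is only an illustration: split_address, join_address and struct hi_lo are made-up names, not part of any real tool. One detail that the example address does not show: lw sign-extends its 16-bit offset, so the upper half has to be rounded up whenever bit 15 of the address is set.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper type: the two immediates of a lui/lw pair. */
struct hi_lo {
    uint32_t hi;  /* 16-bit immediate for lui */
    int32_t  lo;  /* signed 16-bit offset for lw, in the range -0x8000..0x7FFF */
};

/* Recombine the halves the way the CPU does: lui, then add the signed offset. */
static uint32_t join_address(struct hi_lo r)
{
    return (r.hi << 16) + (uint32_t)r.lo;
}

/* Split a 32-bit absolute address so that (hi << 16) + lo == addr. */
static struct hi_lo split_address(uint32_t addr)
{
    struct hi_lo r;
    r.hi = ((addr + 0x8000u) >> 16) & 0xFFFFu;  /* round up if bit 15 is set */
    r.lo = (int32_t)(addr & 0xFFFFu);
    if (r.lo >= 0x8000)
        r.lo -= 0x10000;                        /* interpret the low half as signed */
    assert(join_address(r) == addr);            /* the pair must reproduce the address */
    return r;
}
```

For 0x100024 this gives the halves 0x10 and 0x24 used in the two instructions above.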

Then there is "position-independent" code. This type of code can be loaded at any address. Let's assume that the variable is stored at address X and the label "nextAddress" is located at address Y. X and Y may change from one load to the next, but the difference (X-Y) is fixed. Let's say X-Y = 0x100024:

  bgezal zero, nextAddress
  nop
nextAddress:
  ; Now the address "nextAddress" is in register "ra"
  lui    t2, 0x10
  addu   t2, t2, ra
  ; The next instruction will access address Y+0x100024, which is X
  lw     t3, 0x24(t2)

Of course, the difference (X-Y) is calculated by a tool of the compiler tool chain at build time.
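
The arithmetic of that instruction sequence can be checked with a small C sketch. It only mimics the three instructions; pic_variable_address and VAR_MINUS_LABEL are names invented for this illustration, and the parameter label_addr stands in for register ra.

```c
#include <assert.h>
#include <stdint.h>

/* Fixed at build time: distance from the label to the variable (X - Y). */
#define VAR_MINUS_LABEL 0x100024u

/* Mimic the instruction sequence; label_addr plays the role of register ra (Y). */
static uint32_t pic_variable_address(uint32_t label_addr)
{
    assert(label_addr % 4 == 0);  /* MIPS instructions are word-aligned */
    uint32_t t2 = 0x10u << 16;    /* lui  t2, 0x10   */
    t2 += label_addr;             /* addu t2, t2, ra */
    return t2 + 0x24u;            /* address used by lw t3, 0x24(t2) */
}
```

Wherever the code ends up, the result is always the label address plus 0x100024, so the variable is found without any address being patched at load time.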

There were also handheld devices (similar to a smartphone today) running Windows CE on MIPS CPUs.

This operating system uses a so-called "base relocation table": The .EXE or .DLL file contains a "desired" address. The code in the file looks like "position-dependent" code described above. When the file is loaded to the "desired" address, no action needs to be taken.

The "base relocation table" contains information about addresses in the file, for example a list of all lui instructions in the file. Let's say the "desired" address for loading the file is 0x80000 but Windows has to load the file to address 0xA0000. This means that the address of every variable in the file changes by 0x20000. Windows will process the information in the "base relocation table" and add 0x2 (the upper 16 bits of 0x20000) to the low 16 bits (the immediate field) of each lui instruction in the file, so lui t2, 0x10 becomes lui t2, 0x12 in the example for "position-dependent" code above, and the variable is then accessed at 0x120024.
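
A minimal C sketch of this patching step, under loud assumptions: the table is modeled as a plain array of instruction indices (not the real on-disk format), the function name apply_lui_relocations is invented, and the load delta must be a multiple of 0x10000 so that only the lui immediates change. With a delta of 0x20000, each listed immediate grows by 0x2.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Add the upper 16 bits of the load delta to the immediate of each listed lui.
   code:    the loaded instruction words
   sites:   indices into code of the lui instructions to patch
            (hypothetical table format, for illustration only)
   desired: the address the file was built for
   actual:  the address the file was really loaded to */
static void apply_lui_relocations(uint32_t *code, const size_t *sites,
                                  size_t count, uint32_t desired, uint32_t actual)
{
    uint32_t delta = actual - desired;
    assert((delta & 0xFFFFu) == 0);  /* simplification: low halves stay unchanged */

    for (size_t i = 0; i < count; i++) {
        uint32_t insn = code[sites[i]];
        uint32_t imm = ((insn & 0xFFFFu) + (delta >> 16)) & 0xFFFFu;
        code[sites[i]] = (insn & 0xFFFF0000u) | imm;  /* keep opcode and registers */
    }
}
```

Applied to the word 0x3C0A0010 (the encoding of lui t2, 0x10) with desired = 0x80000 and actual = 0xA0000, the word becomes 0x3C0A0012, while instructions that are not listed in the table stay untouched.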

So the address of the variable is first calculated by the compiler tool chain, but later it is modified by the operating system.

Another possibility is the "global offset table" (Linux) or the "import table" (Windows). These are typically used for accessing variables in shared libraries and DLL files.

The global offset table contains the addresses of the actual variables. In C, this can be illustrated as follows:

Actual C code:

static int staticVariable;
...
staticVariable = 1234;

What the resulting assembly code actually does:

static int staticVariable;
static int * staticVariable_got = &staticVariable;
...
(*staticVariable_got) = 1234;

Assembly code:

lui t2, 0x10
lw  t3, 0x24(t2)
; Now t3 contains the address of the variable
; (t4 is assumed to hold the value 1234)
sw  t4, (t3)

In this case the compiler tool chain will calculate the address of the auxiliary "variable" (staticVariable_got). However, the address of the actual variable (staticVariable) is only known to the OS. The OS writes that address to the "variable" staticVariable_got.
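
The division of labor can be made visible with a runnable C sketch. It is only a model: loader_fill_got is an invented stand-in for what the OS loader does, and store_via_got stands for the code the compiler generates.

```c
#include <assert.h>
#include <stddef.h>

static int staticVariable;

/* Stand-in for the table entry: its own address is known at build time,
   its content is only filled in at load time. */
static int *staticVariable_got = NULL;

/* Hypothetical stand-in for the OS loader writing the real address into the entry. */
static void loader_fill_got(int *actual_address)
{
    staticVariable_got = actual_address;
}

/* What the generated code does: load the pointer from the entry, then store through it. */
static void store_via_got(int value)
{
    assert(staticVariable_got != NULL);  /* the loader must have run first */
    *staticVariable_got = value;
}
```

Calling loader_fill_got(&staticVariable) followed by store_via_got(1234) leaves 1234 in staticVariable, although the storing code never contains the variable's address itself.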


(*) Note that this need not be the linker; there are also compilers that write executable files directly (without needing an assembler or linker), and there are linkers whose output is post-processed by another tool.

