Executable files and shared library files are used to create a process image when a program is started by the system. This chapter describes the object file structures that relate to program execution and also describes how the process image is created from executable and shared object files.
This chapter addresses the following topics:
The following sections describe several general factors that are involved in the linking and loading process.
The following object file structures contain information that is used in linking and loading operations:
See Chapter 7 for further details on file headers, optional headers, and section headers.
Executable files and shared library files have a base address, which is the lowest virtual address associated with the process image of the program. The base address is used to relocate the process image during dynamic linking.
During program loading, the base address is calculated from the memory load address, the maximum page size, and the lowest virtual address of the program's loadable segment.
A program that is to be loaded by the system must have at least one loadable segment, even though this is not required by the file format. When the process image is created, the segments are assigned access permissions, which are determined by the type of segment and type of program image. Table 9-1 shows the access permissions for the various segment and image types.
Image | Segment | Access Permissions |
OMAGIC | text, data, bss | Read, Write, Execute |
NMAGIC | text | Read, Execute |
NMAGIC | data, bss | Read, Write, Execute |
ZMAGIC | text | Read, Execute |
ZMAGIC | data, bss | Read, Write, Execute |
An object file segment can contain one or more sections. The number of sections in a segment is not important for program loading, but specific information must be present for linking and execution. Figure 9-1 illustrates typical segment contents for executable files and shared object files. The order of sections within a segment may vary.
Text segments contain instructions and read-only data, and data segments contain writable data. Text segments and data segments typically include the sections shown in Figure 9-1.
As the system creates or augments a process image, it logically copies a file's segment to a virtual memory segment. The time at which the system physically reads the file depends on the program's execution behavior, system load, and other factors. A process does not require a physical page unless it references the logical page during execution.
Processes commonly leave many pages unreferenced. This improves system performance because delaying physical reads frequently obviates them. To obtain this efficiency in practice, shared executable files and shared library files must have segment images whose virtual addresses are zero, modulo the file system block size.
Virtual addresses for the text and data segments must be aligned on 64KB (0x10000) or larger power of 2 boundaries. File offsets must be aligned on 8KB (0x2000) or larger power of 2 boundaries.
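The alignment rule above can be checked mechanically. The following is a minimal sketch in C; the constant and function names are illustrative assumptions and do not come from any system header.

```c
#include <stdbool.h>
#include <stdint.h>

/* Minimum alignments quoted in the text above. */
#define SEG_VADDR_ALIGN   0x10000u  /* 64KB */
#define SEG_FILEOFF_ALIGN 0x2000u   /* 8KB  */

/* Hypothetical helper: true if value is a multiple of align,
 * where align must be a power of 2. */
static bool is_aligned(uint64_t value, uint64_t align)
{
    return (value & (align - 1)) == 0;
}

/* Hypothetical helper: checks one segment against both minimums. */
static bool segment_alignment_ok(uint64_t vaddr, uint64_t file_offset)
{
    return is_aligned(vaddr, SEG_VADDR_ALIGN) &&
           is_aligned(file_offset, SEG_FILEOFF_ALIGN);
}
```

A value aligned on a larger power of 2 also passes, which matches the "or larger" wording of the rule.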
Because the page size can be larger than the alignment restrictions of a segment's file offset, up to seven file pages (depending on page size) can hold text or data that is not logically part of the segment. The contents of the various file pages are as follows:
Logically, the system enforces the memory permissions as if each segment were complete and separate; segment addresses are adjusted to ensure that each logical page in the address space has a single set of permissions.
The end of the data segment requires special handling for uninitialized data, which must be set to zero. If a file's last data page includes information not in the logical memory page, the extraneous data must be set to zero, not the contents of the executable file.
An executable file is loaded at fixed addresses; the system creates its segments using the virtual addresses from the optional header. The system transfers control directly to the entry point of the executable file.
An executable file that uses dynamic linking requires one or more shared libraries to be loaded in addition to the executable file. Instead of loading the executable file, the system loads the dynamic loader, which in turn loads the executable file and its shared libraries.
When building an executable file that uses dynamic linking, the linker adds the flag F_MIPS_CALL_SHARED to the f_flags field of the file header. This flag tells the system to invoke the dynamic loader to load the executable file. Typically, the dynamic loader requested is /sbin/loader, the default loader. The exec function and the dynamic loader cooperate to create the process image. Creating the process image involves the following operations:
To assist the dynamic loader, the linker also constructs the following data items for shared library files and shared executable files:
These data items are located in loadable segments and are available during execution.
Shared library files may be located at virtual addresses that differ from the addresses in the optional header. The dynamic loader relocates the memory image and updates absolute addresses before control is given to the program.
If the environment variable LD_BIND_NOW has a non-null value, the dynamic loader processes all relocations before transferring control to the program. The dynamic loader may use the lazy binding technique to evaluate procedure linkage table entries, avoiding symbol resolution and relocation for functions that are not called. (See Section 9.3.3.1 for information about lazy binding.)
The following sections describe the various dynamic linking sections. The C language definitions are in the header files elf_abi.h and elf_mips.h.
The dynamic section acts as a table of contents for dynamic linking information within the object. Dynamic sections are present only in shared executable files and shared library files.
The dynamic section is located by its section header. This section header is identified by its name (.dynamic) or its section type (STYP_DYNAMIC) in the flags field (s_flags).
The dynamic section is an array with entries of the following type:
typedef struct {
    Elf32_Sword d_tag;
    union {
        Elf32_Word d_val;
        Elf32_Addr d_ptr;
    } d_un;
} Elf32_Dyn;
The structure and union members in the preceding structure definition provide the following information:
The d_tag requirements for shared executable files and shared library files are summarized in Table 9-2. "Mandatory" indicates that the dynamic linking array must contain an entry of that type; "optional" indicates that an entry for the tag may exist but is not required.
Name | Value | d_un | Executable | Shared Object |
DT_NULL | 0 | ignored | mandatory | mandatory |
DT_NEEDED | 1 | d_val | optional | optional |
DT_PLTRELSZ | 2 | d_val | optional | optional |
DT_PLTGOT | 3 | d_ptr | optional | optional |
DT_HASH | 4 | d_ptr | mandatory | mandatory |
DT_STRTAB | 5 | d_ptr | mandatory | mandatory |
DT_SYMTAB | 6 | d_ptr | mandatory | mandatory |
DT_RELA | 7 | d_ptr | mandatory | optional |
DT_RELASZ | 8 | d_val | mandatory | optional |
DT_RELAENT | 9 | d_val | mandatory | optional |
DT_STRSZ | 10 | d_val | mandatory | mandatory |
DT_SYMENT | 11 | d_val | mandatory | mandatory |
DT_INIT | 12 | d_ptr | optional | optional |
DT_FINI | 13 | d_ptr | optional | optional |
DT_SONAME | 14 | d_val | ignored | optional |
DT_RPATH | 15 | d_val | optional | ignored |
DT_SYMBOLIC | 16 | ignored | ignored | optional |
DT_REL | 17 | d_ptr | mandatory | optional |
DT_RELSZ | 18 | d_val | mandatory | optional |
DT_RELENT | 19 | d_val | mandatory | optional |
DT_PLTREL | 20 | d_val | optional | optional |
DT_DEBUG | 21 | d_ptr | optional | ignored |
DT_TEXTREL | 22 | ignored | optional | optional |
DT_JMPREL | 23 | d_ptr | optional | optional |
DT_LOPROC | 0x70000000 | unspecified | unspecified | unspecified |
DT_HIPROC | 0x7fffffff | unspecified | unspecified | unspecified |
Table Notes:
The uses of the various dynamic array tags are as follows:
Name | Value | d_un | Executable | Shared Object |
DT_MIPS_RLD_VERSION | 0x70000001 | d_val | mandatory | mandatory |
DT_MIPS_TIME_STAMP | 0x70000002 | d_val | optional | optional |
DT_MIPS_ICHECKSUM | 0x70000003 | d_val | optional | optional |
DT_MIPS_IVERSION | 0x70000004 | d_val | optional | optional |
DT_MIPS_FLAGS | 0x70000005 | d_val | mandatory | mandatory |
DT_MIPS_BASE_ADDRESS | 0x70000006 | d_ptr | mandatory | mandatory |
DT_MIPS_CONFLICT | 0x70000008 | d_ptr | optional | optional |
DT_MIPS_LIBLIST | 0x70000009 | d_ptr | optional | optional |
DT_MIPS_LOCAL_GOTNO | 0x7000000A | d_val | mandatory | mandatory |
DT_MIPS_CONFLICTNO | 0x7000000B | d_val | optional | optional |
DT_MIPS_LIBLISTNO | 0x70000010 | d_val | optional | optional |
DT_MIPS_SYMTABNO | 0x70000011 | d_val | optional | optional |
DT_MIPS_UNREFEXTNO | 0x70000012 | d_val | optional | optional |
DT_MIPS_GOTSYM | 0x70000013 | d_val | mandatory | mandatory |
DT_MIPS_HIPAGENO | 0x70000014 | d_val | mandatory | mandatory |
Table Notes:
The uses of the various processor-specific dynamic array tags are as follows:
Flag | Value | Meaning |
RHF_QUICKSTART | 0x00000001 | Object may be quickstarted by loader |
RHF_NOTPOT | 0x00000002 | Hash size not a power of two |
RHF_NO_LIBRARY_REPLACEMENT | 0x00000004 | Use default system libraries only |
RHF_NO_MOVE | 0x00000008 | Do not relocate |
RHF_RING_SEARCH | 0x10000000 | Symbol resolution same as DT_SYMBOLIC |
RHF_DEPTH_FIRST | 0x20000000 | Depth first symbol resolution |
RHF_USE_31BIT_ADDRESSES | 0x40000000 | TASO (Truncated Address Support Option) objects |
All other tag values are reserved. Entries may appear in any order, except for the relative order of the DT_NEEDED entries and the DT_NULL entry at the end of the array.
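Because the array ends with a DT_NULL entry, a consumer can scan it linearly until that terminator. The sketch below shows one way to look up a tag. The integer typedefs are stand-ins with assumed widths so that the example compiles without elf_abi.h, and find_dyn_entry is a hypothetical helper name.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-ins for the types in elf_abi.h (widths are assumptions). */
typedef int32_t  Elf32_Sword;
typedef uint32_t Elf32_Word;
typedef uint32_t Elf32_Addr;

typedef struct {
    Elf32_Sword d_tag;
    union {
        Elf32_Word d_val;
        Elf32_Addr d_ptr;
    } d_un;
} Elf32_Dyn;

/* Tag values taken from Table 9-2. */
#define DT_NULL   0
#define DT_STRTAB 5
#define DT_STRSZ  10

/* Hypothetical helper: return the first entry with the given tag, or
 * NULL if the terminating DT_NULL entry is reached first. */
static const Elf32_Dyn *find_dyn_entry(const Elf32_Dyn *dyn, Elf32_Sword tag)
{
    for (; dyn->d_tag != DT_NULL; dyn++) {
        if (dyn->d_tag == tag)
            return dyn;
    }
    return NULL;
}
```

Returning only the first match is adequate for tags that appear once; collecting the DT_NEEDED entries would need a loop that keeps scanning and preserves their relative order.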
When the linker processes an archive library, library members are extracted and copied into the output object file. These statically linked services are available during execution and do not involve the dynamic loader. Shared executable files also provide services that require the dynamic loader to include the appropriate shared library files in the process image. To accomplish this, shared executable files and shared library files must describe their dependencies.
The DT_NEEDED entries of the dynamic structure identify the shared library files that the program requires. The dynamic loader builds a process image by connecting the referenced shared library files and their dependencies. When resolving symbolic references, the dynamic loader looks first at the symbol table of the shared executable program, then at the symbol tables of the DT_NEEDED entries (in order), then at the second-level DT_NEEDED entries, and so on. Shared library files must be readable by the process.
Note
Even if a shared object is referenced more than once in the dependency list, the dynamic loader includes only one instance of the object in the process image.
Names in the dependency list are copies of the DT_SONAME strings.
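The search order described above (executable first, then each level of DT_NEEDED entries, with every object included once) amounts to a breadth-first walk of the dependency graph. The following sketch models that walk; struct object and search_order are hypothetical, and objects are identified by array index rather than by name to keep the example short.

```c
#define MAX_OBJS   16
#define MAX_NEEDED 4

/* Hypothetical in-memory model of an object and its DT_NEEDED entries. */
struct object {
    const char *name;
    int needed[MAX_NEEDED];  /* indexes into the object array */
    int n_needed;
};

/* Fill order[] with the search order: the executable (index 0) first,
 * then its DT_NEEDED entries in order, then second-level entries, and
 * so on. Each object appears once. Returns the number of entries. */
static int search_order(const struct object *objs, int n_objs, int *order)
{
    int seen[MAX_OBJS] = {0};
    int head = 0, tail = 0;

    if (n_objs <= 0 || n_objs > MAX_OBJS)
        return 0;
    order[tail++] = 0;
    seen[0] = 1;
    while (head < tail) {            /* order[] doubles as the queue */
        const struct object *o = &objs[order[head++]];
        for (int i = 0; i < o->n_needed; i++) {
            int dep = o->needed[i];
            if (dep >= 0 && dep < n_objs && !seen[dep]) {
                seen[dep] = 1;
                order[tail++] = dep;
            }
        }
    }
    return tail;
}
```

The seen[] array is what keeps a library that is referenced from several places from being listed twice.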
If a shared library name contains one or more slash characters, such as /usr/lib/libz, the dynamic loader uses the string as the pathname. If the name has no slashes, such as liba, the dynamic loader searches for the object as follows:
The following environment variables are defined:
_RLD_ARGS | Argument to dynamic loader |
_RLD_ROOT | Prefix that the dynamic loader adds to all paths except those specified by LD_LIBRARY_PATH |
Note
For security, the dynamic loader ignores environmental search specifications, such as LD_LIBRARY_PATH, for set-user-ID and set-group-ID programs.
Position-independent code cannot contain absolute virtual addresses. Global offset tables (GOTs) hold absolute addresses in private data, thus making the addresses available without compromising the position-independence and sharability of a program's text. A program references its global offset table using position-independent addressing and extracts absolute values, thus redirecting position-independent references to absolute locations.
The global offset table is split into two logically separate subtables - local and external:
The external entries for defined symbols must contain actual addresses. If an entry corresponds to an undefined symbol and the table entry contains a zero, the entry must be resolved by the dynamic loader, even if the dynamic loader is performing a quickstart. (See Section 9.3.10 for information about quickstart processing.)
After the system creates memory segments for a loadable object file, the dynamic loader may process the relocation entries. The only relocation entries remaining are type R_REFQUAD or R_REFLONG, referring to local entries in the GOT and data items containing addresses. The dynamic loader determines the associated symbol (or section) values, calculates their absolute addresses, and sets the proper values. Although the absolute addresses may be unknown when the linker builds an object file, the dynamic loader knows the addresses of all memory segments and can find the correct symbols and calculate the absolute addresses.
If a program requires direct access to the absolute address of a symbol, it uses the appropriate GOT entry. Because the shared executable file and shared library file have separate global offset tables, a symbol's address may appear in several tables. The dynamic loader processes all necessary relocations before giving control to the process image, thus ensuring the absolute addresses are available during execution.
The zero (first) entry of the .dynsym section is reserved and holds a null symbol table entry. The corresponding zero entry in the GOT is reserved to hold the address of the entry point in the dynamic loader to call when using lazy binding to resolve text symbols (see Section 9.3.3.1 for information about resolving text symbols using lazy binding).
The system may choose different memory segment addresses for the same shared library file in different programs; it may even choose different library addresses for different executions of the same program. Nonetheless, memory segments do not change addresses once the process image is established. As long as a process exists, its memory segments reside at fixed virtual addresses.
A single GOT can hold a maximum of 8190 local and global entries. If a program references 8K or more global symbols, it will have multiple GOTs. Each GOT in a multiple-GOT object is referenced by means of a different global pointer value. A single .got section holds all of the GOTs in a multiple-GOT object.
The DT_MIPS_LOCAL_GOTNO and DT_PLTGOT entries of the dynamic section describe the attributes of the global offset table.
The GOT is used to hold addresses of position-independent functions as well as data addresses. It is not possible to resolve function calls from one shared executable file or shared library file to another at static link time, so all of the function address entries in the GOT would normally be resolved at run time by the dynamic loader. Through the use of specially constructed pieces of code known as stubs, this run-time resolution can be deferred through a technique known as lazy binding.
Using the lazy binding technique, the linker builds a stub for each called function and allocates GOT entries that initially point to the stubs. Because of the normal calling sequence for position-independent code, the call invokes the stub the first time that the call is made.
stub_xyz:
    ldq   t12, .got_index(gp)
    lda   $at, .dynsym_index_low(zero)
    ldah  $at, .dynsym_index_high($at)
    jmp   t12, (t12)
The stub code loads register t12 with an entry from the GOT. The entry loaded into register t12 is the address of the procedure in the dynamic loader that handles lazy binding. The stub code also loads register $at with the index into the .dynsym section of the referenced external symbol. The code then transfers control to the dynamic loader and loads register t12 with the address following the stub. The dynamic loader determines the correct address for the called function and replaces the address of the stub in the GOT with the address of the function.
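The effect of this sequence can be modeled in portable C with a table of function pointers standing in for the GOT. This is only a simulation under simplifying assumptions: the names are hypothetical, the resolver is handed a GOT index instead of a .dynsym index, and the lookup is hard-wired rather than performed against symbol tables.

```c
#include <stddef.h>

typedef int (*func_t)(int);

/* The "real" function that would live in a shared library. */
static int xyz(int x) { return x * 2; }

static int stub_xyz(int x);

/* Simulated GOT: slot 0 is unused here; slot 1 starts out pointing
 * at the stub, as the linker would arrange. */
static func_t got[2] = { NULL, stub_xyz };

static int resolve_count = 0;

/* Stand-in for the dynamic loader's lazy-binding handler: find the
 * function, patch the GOT slot, and return the resolved address. */
static func_t lazy_resolve(int got_index)
{
    resolve_count++;
    got[got_index] = xyz;   /* a real loader would search symbol tables */
    return got[got_index];
}

/* Stand-in for the stub: enter the resolver, then continue to the target. */
static int stub_xyz(int x)
{
    return lazy_resolve(1)(x);
}

/* A call through the GOT, as position-independent code would make it. */
static int call_xyz(int x)
{
    return got[1](x);
}
```

In this model the first call_xyz invocation goes through the stub and the resolver, and later invocations reach xyz directly because the GOT slot has been overwritten.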
Most undefined text references can be handled by lazy text evaluation, except when the address of a function is used in other than a jsr instruction. In the exception case, the program uses the address of the stub instead of the actual address of the function. Determining which case is in effect is based on the following processing:
The LD_BIND_NOW environment variable can also change dynamic loader behavior. If its value is non-null, the dynamic loader evaluates all symbol-table entries of type STT_FUNC, replacing their stub addresses in the GOT with the actual address of the referenced function.
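The test on LD_BIND_NOW can be sketched as follows. The helper name is hypothetical, and "non-null" is interpreted here as "set and not the empty string", which is an assumption about the loader's behavior.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: value is whatever getenv("LD_BIND_NOW") returned.
 * Returns true when immediate binding is requested. */
static bool bind_now_requested(const char *value)
{
    return value != NULL && value[0] != '\0';
}
```

Taking the value as a parameter, rather than calling getenv inside the helper, keeps the decision easy to exercise with fixed inputs.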
Note
Lazy binding generally improves overall application performance because unused symbols do not incur the dynamic loader overhead. Two situations, however, make lazy binding undesirable for some applications:
- The initial reference to a function in a shared object file takes longer than subsequent calls because the dynamic loader intercepts the call to resolve the symbol. Some applications cannot tolerate this unpredictability.
- If an error occurs and the dynamic loader cannot resolve the symbol, the dynamic loader terminates the program. Under lazy binding, this might occur at arbitrary times. Once again, some applications cannot tolerate this unpredictability.
By turning off lazy binding, the dynamic loader forces the failure to occur during process initialization, before the application receives control.
The dynamic symbol section provides information on all external symbols, either imported or exported from an object.
All externally visible symbols, both defined and undefined, must be hashed into the hash table (see Section 9.3.7).
Undefined symbols of type STT_FUNC that have been referenced only by jsr instructions may contain nonzero values in their st_value field denoting the stub address used for lazy evaluation for this symbol. The dynamic loader uses this to reset the GOT entry for this external symbol to its stub address when unloading a shared library file. All other undefined symbols must contain zero in their st_value fields.
Defined symbols in a shared executable file cannot be preempted. The symbol table in the shared executable file is always searched first to resolve any symbol references.
The dynamic symbol section contains an array of entries of the following type:
typedef struct {
    Elf32_Word    st_name;
    Elf32_Addr    st_value;
    Elf32_Word    st_size;
    unsigned char st_info;
    unsigned char st_other;
    Elf32_Half    st_shndx;
} Elf32_Sym;
The structure members in the preceding structure definition provide the following information:
A symbol's binding determines the linkage visibility and behavior. The binding is encoded in the st_info field and can have one of the following values:
Value | Description |
STB_LOCAL | Indicates that the symbol is local to the object. |
STB_GLOBAL | Indicates that the symbol is visible to other objects. |
STB_WEAK | Indicates that the symbol is a weak global symbol. |
STB_DUPLICATE | Indicates the symbol is a duplicate. (Used for objects that have multiple GOTs.) |
A symbol's type identifies its use. The type is encoded in the st_info field and can have one of the following values:
Value | Description |
STT_NOTYPE | Indicates that the symbol has no type or its type is unknown. |
STT_OBJECT | Indicates that the symbol is a data object. |
STT_FUNC | Indicates that the symbol is a function. |
STT_SECTION | Indicates that the symbol is associated with a program section. |
STT_FILE | Indicates that the symbol is the name of a source file. |
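Both the binding and the type share the single st_info byte. The sketch below assumes the packing and the numeric values used by the generic System V ELF specification (binding in the high four bits, type in the low four bits); the text above does not state them, so verify them against elf_abi.h before relying on them. The helper names are hypothetical.

```c
#include <stdint.h>

/* Assumed values, following the generic ELF specification. */
#define STB_LOCAL  0
#define STB_GLOBAL 1
#define STB_WEAK   2
#define STT_NOTYPE 0
#define STT_OBJECT 1
#define STT_FUNC   2

/* Assumed packing: binding in the high 4 bits, type in the low 4 bits. */
static unsigned st_bind(uint8_t info) { return info >> 4; }
static unsigned st_type(uint8_t info) { return info & 0xf; }

static uint8_t st_make_info(unsigned bind, unsigned type)
{
    return (uint8_t)((bind << 4) | (type & 0xf));
}
```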
All symbols are defined relative to some program section. The st_shndx field identifies the section and can have one of the following values:
Value | Description |
SHN_UNDEF | Indicates that the symbol is undefined. |
SHN_ABS | Indicates that the symbol has an absolute value. |
SHN_COMMON | Indicates that the symbol has common storage (unallocated). |
SHN_MIPS_ACOMMON | Indicates that the symbol has common storage (allocated). |
SHN_MIPS_TEXT | Indicates that the symbol is in a text segment. |
SHN_MIPS_DATA | Indicates that the symbol is in a data segment. |
The entries of the dynamic symbol section are ordered as follows:
Figure 9-2 shows the layout of the .dynsym section and its relationship to the .got section.
The DT_SYMENT and DT_SYMTAB entries of the dynamic section describe the attributes of the dynamic symbol table.
The dynamic relocation section describes all locations within the object that must be adjusted if the object is loaded at an address other than its linked base address.
Only one dynamic relocation section is used to resolve addresses in data items, and it must be called .rel.dyn. Shared executable files can contain normal relocation sections in addition to a dynamic relocation section. The normal relocation sections may contain resolutions for any absolute values in the main program. The dynamic linker does not resolve these or relocate the main program.
As noted previously, only R_REFQUAD and R_REFLONG relocation entries are supported in the dynamic relocation section.
The dynamic relocation section is an array of entries of the following type:
typedef struct {
    Elf32_Addr r_offset;
    Elf32_Word r_info;
} Elf32_Rel;
The structure members in the preceding structure definition provide the following information:
The entries of the dynamic relocation section are ordered by symbol index value.
The DT_REL and DT_RELSZ entries of the dynamic section describe the attributes of the dynamic relocation section.
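Ordering entries by symbol index requires extracting that index from r_info. The sketch below assumes the generic ELF32 packing (symbol index in the upper 24 bits, relocation type in the low 8 bits); this layout is not stated in the text above and should be checked against elf_abi.h. The helper names are hypothetical.

```c
#include <stdint.h>

/* Assumed generic ELF32 packing of r_info. */
static uint32_t rel_sym(uint32_t r_info)  { return r_info >> 8; }
static uint32_t rel_type(uint32_t r_info) { return r_info & 0xff; }

static uint32_t rel_make_info(uint32_t sym, uint32_t type)
{
    return (sym << 8) | (type & 0xff);
}
```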
The optional .msym section contains precomputed hash values and dynamic relocation indexes for each entry in the dynamic symbol table. Each entry in the .msym section maps directly to an entry in the .dynsym section. The .msym section is an array of entries of the following type:
typedef struct {
    Elf32_Word ms_hash_value;
    Elf32_Word ms_info;
} Elf32_Msym;
The structure members in the preceding structure definition provide the following information:
The dynamic relocation index identifies the first entry in the .rel.dyn section that references the dynamic symbol corresponding to this msym entry. If the index is 0, no dynamic relocations are associated with the symbol.
The symbol flags field is reserved for future use.
The DT_MIPS_MSYM entry of the dynamic section contains the address of the .msym section.
A hash table of Elf32_Word entries provides fast access to symbol entries in the dynamic symbol section. Figure 9-3 shows the contents of a hash table.
The entries in the hash table contain the following information:
The hashing function accepts a symbol name and returns a value that can be used to compute a bucket index. If the hashing function returns the value X for a name, bucket[X % nbucket] gives an index, Y, into the symbol table and chain array. If the symbol table entry indicated is not the correct one, chain[Y] indicates the next symbol table entry with the same hash value. The chain links can be followed until the correct symbol table entry is located or until the chain entry contains the value STN_UNDEF.
The DT_HASH entry of the dynamic section contains the address of the hash table section.
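The bucket-and-chain lookup described above can be sketched as follows. Several details are assumptions: the hash function is the one from the generic System V ELF specification (the text does not name the function), STN_UNDEF is taken to be 0, and the names[] array is a stand-in for reading each symbol's name through st_name.

```c
#include <stdint.h>
#include <string.h>

#define STN_UNDEF 0   /* assumed value, as in generic ELF */

/* Hash function from the generic System V ELF specification. */
static uint32_t elf_hash(const char *name)
{
    uint32_t h = 0, g;

    while (*name) {
        h = (h << 4) + (unsigned char)*name++;
        g = h & 0xf0000000u;
        if (g)
            h ^= g >> 24;
        h &= ~g;
    }
    return h;
}

/* Hypothetical lookup: names[i] stands in for the name of symbol i.
 * Returns the symbol index, or STN_UNDEF if the name is not present. */
static uint32_t hash_lookup(const char *name, const char *const *names,
                            const uint32_t *bucket, uint32_t nbucket,
                            const uint32_t *chain)
{
    uint32_t y = bucket[elf_hash(name) % nbucket];

    /* Follow the chain until the name matches or the chain ends. */
    while (y != STN_UNDEF && strcmp(names[y], name) != 0)
        y = chain[y];
    return y;
}
```

Comparing names on every step is necessary because different symbols can land in the same bucket.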
The dynamic string section is the repository for all strings referenced by the dynamic linking sections. Strings are referenced by using a byte offset within the dynamic string section. The end of the string is denoted by a byte containing the value zero.
The DT_STRTAB and DT_STRSZ entries of the dynamic section describe the attributes of the dynamic string section.
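A string reference is therefore just an offset that must fall inside the section. The following sketch validates an offset against the section size before returning the string; dynstr_at is a hypothetical helper, and its parameters correspond to the values that DT_STRTAB and DT_STRSZ would supply.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: return the string at byte offset off within the
 * dynamic string section, or NULL if the offset is out of range or the
 * string is not terminated inside the section. */
static const char *dynstr_at(const char *strtab, size_t strsz, size_t off)
{
    if (off >= strsz)
        return NULL;
    if (memchr(strtab + off, '\0', strsz - off) == NULL)
        return NULL;
    return strtab + off;
}
```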
After the dynamic loader has created the process image and performed relocations, each shared object file gets the opportunity to execute initialization code. The initialization functions are called in reverse-dependency order. Each shared object file's initialization functions are called only after the initialization functions for its dependencies have been executed. All initialization of shared object files occurs before the executable file gains control.
Similarly, shared object files can have termination functions that are executed by the atexit mechanism when the process is terminating. Termination functions are called in dependency order - the exact opposite of the order in which initialization functions are called.
Shared object files designate initialization and termination functions through the DT_INIT and DT_FINI entries in the dynamic structure. Typically, the code for these functions resides in the .init and .fini sections.
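One ordering that satisfies these rules is a post-order walk of the dependency graph for initialization, reversed for termination. The sketch below illustrates the idea on a hypothetical in-memory model; it is not a description of how the dynamic loader is implemented.

```c
#define MAX_OBJS   16
#define MAX_NEEDED 4

/* Hypothetical model of a shared object and its dependencies. */
struct shobj {
    int needed[MAX_NEEDED];  /* indexes into the object array */
    int n_needed;
};

/* Post-order visit: an object is recorded only after its dependencies. */
static void visit(const struct shobj *objs, int idx, int *done,
                  int *order, int *count)
{
    if (done[idx])
        return;
    done[idx] = 1;                      /* also guards against cycles */
    for (int i = 0; i < objs[idx].n_needed; i++)
        visit(objs, objs[idx].needed[i], done, order, count);
    order[(*count)++] = idx;
}

/* Fill init[] so that every object follows its dependencies, and fini[]
 * with the exact reverse. Returns the number of objects ordered. */
static int init_fini_order(const struct shobj *objs, int n,
                           int *init, int *fini)
{
    int done[MAX_OBJS] = {0};
    int count = 0;

    if (n > MAX_OBJS)
        return 0;
    for (int i = 0; i < n; i++)
        visit(objs, i, done, init, &count);
    for (int i = 0; i < count; i++)
        fini[i] = init[count - 1 - i];
    return count;
}
```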
Note
Although atexit termination processing normally is done, it is not guaranteed to have executed when the process terminates. In particular, the process does not execute the termination processing if it calls _exit or if the process terminates because it received a signal that it neither caught nor ignored.
The quickstart capability provided by the assembler supports several sections that are useful for faster startup of programs that have been linked with shared library files. Some ordering constraints are imposed on these sections. The group of structures defined in these sections and the ordering constraints allow the dynamic loader to operate more efficiently. These additional sections are also used for more complete dynamic shared library file version control.
A shared object list section is an array of Elf32_Lib structures that contains information about the various dynamic shared library files used to statically link the shared object file. Each shared library file used has an entry in the array. Each entry has the following format:
typedef struct {
    Elf32_Word l_name;
    Elf32_Word l_time_stamp;
    Elf32_Word l_checksum;
    Elf32_Word l_version;
    Elf32_Word l_flags;
} Elf32_Lib;
The structure members in the preceding structure definition provide the following information:
The l_flags field can have one or both of the following flags set:
LL_EXACT_MATCH | At run time, use a unique ID composed of the l_time_stamp, l_checksum, and l_version fields to demand that the run-time dynamic shared library file match exactly the shared library file used at static link time. |
LL_IGNORE_INT_VER | At run time, ignore any version incompatibility between the dynamic shared library file and the shared library file used at static link time. Normally, if neither the LL_EXACT_MATCH bit nor the LL_IGNORE_INT_VER bit is set, the dynamic loader requires that the version of the dynamic shared library match at least one of the colon-separated version strings indexed by the l_version string table index. |
The DT_MIPS_LIBLIST and DT_MIPS_LIBLISTNO entries of the dynamic section describe the attributes of the shared object list section.
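The default check, in which a library's version must match at least one entry of a colon-separated list, can be sketched as follows. The helper name and the example version strings are hypothetical; the sketch covers only the string comparison, not the flag handling.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper: true if version equals one of the colon-separated
 * entries in list (for example "1.0:1.1:2.0"). */
static bool version_in_list(const char *version, const char *list)
{
    size_t vlen = strlen(version);

    while (*list != '\0') {
        size_t len = strcspn(list, ":");   /* length of current entry */

        if (len == vlen && strncmp(list, version, len) == 0)
            return true;
        list += len;
        if (*list == ':')
            list++;                        /* skip the separator */
    }
    return false;
}
```

Comparing lengths before comparing bytes keeps "1" from matching the entry "1.0".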
Each .conflict section is an array of indexes into the .dynsym section. Each index entry identifies a symbol that is multiply defined in either of the following ways:
The shared library files that the shared object file depends on are identified at static link time.
The symbols identified in this section must be resolved by the dynamic loader, even if the object is quickstarted. The dynamic loader resolves all references of a multiply-defined symbol to a single definition.
The .conflict section is an array of Elf32_Conflict elements:
typedef Elf32_Word Elf32_Conflict;
The DT_MIPS_CONFLICT and DT_MIPS_CONFLICTNO entries of the dynamic section describe the attributes of the conflict section.
In order to take advantage of the quickstart capability, ordering constraints are imposed on the .rel.dyn section. The .rel.dyn section must have all local entries first, followed by the external entries. Within these subsections, the entries must be ordered by symbol index. This groups each symbol's relocations together.
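The ordering constraint can be expressed as a two-level sort key. The sketch below uses a hypothetical struct dynrel in which is_external and sym_index stand in for information derived from each entry's symbol; how a linker actually classifies an entry as local or external is not covered here.

```c
#include <stdlib.h>

/* Hypothetical model of one .rel.dyn entry. */
struct dynrel {
    int is_external;        /* 0 = local entry, 1 = external entry */
    unsigned sym_index;     /* index of the referenced symbol */
    unsigned long offset;   /* location to be relocated */
};

/* qsort comparator: local entries first, then by symbol index. */
static int dynrel_cmp(const void *a, const void *b)
{
    const struct dynrel *x = a, *y = b;

    if (x->is_external != y->is_external)
        return x->is_external - y->is_external;
    if (x->sym_index != y->sym_index)
        return x->sym_index < y->sym_index ? -1 : 1;
    return 0;
}

static void sort_dynrel(struct dynrel *rels, size_t n)
{
    qsort(rels, n, sizeof rels[0], dynrel_cmp);
}
```

After sorting, all relocations for a given symbol are adjacent, so they can be found and processed as a single contiguous run.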