ksymless Android

ARM64 port of ksymless. finds sys_call_table and reconstructs kallsyms_lookup_name on Android GKI without exported kernel symbols.

background

Since Linux 5.7 kallsyms_lookup_name is no longer exported. on Android GKI kernels this means kernel modules cannot look up symbols by name.

solution

the module starts from a single exported symbol (sprint_symbol) and rebuilds the full kallsyms lookup chain in 8 steps.

1. anchor

&sprint_symbol is a function pointer — the linker fills in the real address via R_AARCH64_ABS64 relocation. mask off the page offset to get the kernel base.

sprint_addr = (unsigned long)&sprint_symbol;
kernel_base = sprint_addr & ~0x1FFFFFULL;

2. sys_call_table

walk the x29 frame pointer chain out of the module init call. strip PAC signatures from return addresses. scan backwards for do_el0_svc frames.

inside do_el0_svc, the compiler emits an ADRP+ADD+B triplet to load sys_call_table:

unsigned long find_sct(struct fp_ret *frames, int nf)
{
    for (int i = nf - 1; i >= 0; i--) {
        unsigned long base = frames[i].addr - 128;
        scan_adrp_add(base, MAX_SCAN, adrps, MAX_ADRP);
        for (int j = 0; j < na; j++) {
            if (!adrps[j].has_b) continue;
            if (check_sct(adrps[j].target)) {
                sys_call_table_addr = adrps[j].target;
                return adrps[j].target;
            }
        }
    }
    return 0;
}

scan_adrp_add decodes the ARM64 instructions: ADRP extracts bits 21-16 of the page address, ADD supplies the page offset, and B gives the final target address of the instruction sequence after the ADRP+ADD pair.

3. BL chain

trace the BL (branch-and-link) instruction chain recursively from sprint_symbol. each BL instruction at offset i targets fn + i*4 + imm26*4. collect all visited functions (37 on this device).

int follow_bl(unsigned long fn, unsigned long *visited, int *nv_cnt, int depth)
{
    for (int i = 0; i < 256; i++) {
        unsigned int insn = bl_buf[i];
        if ((insn & 0xFC000000) != 0x94000000) continue;
        long imm26 = insn & 0x3FFFFFF;
        if (imm26 & 0x2000000) imm26 |= ~0x3FFFFFF;
        unsigned long tgt = fn + i * 4 + imm26 * 4;
        if (!is_ktxt(tgt)) continue;
        visited[(*nv_cnt)++] = tgt;
        follow_bl(tgt, visited, nv_cnt, depth - 1);
    }
}

4. ADRP pages

for each visited function, scan 256 instructions for ADRP instructions. extract the target page: (pc & ~0xFFF) + (imm << 12). collect 74 unique pages across all functions.

int collect_adrp_pages(unsigned long fn, unsigned long *pages, int max)
{
    for (int i = 0; i < 256 && n < max; i++) {
        unsigned int insn = buf[i];
        if ((insn & 0x9F000000) != 0x90000000) continue;
        unsigned long imm = ((insn >> 5) & 0x7FFFF) << 2 | ((insn >> 29) & 3);
        unsigned long page = ((fn + i * 4) & ~0xFFF) + (imm << 12);
        pages[n++] = page;
    }
}

5. find klbase

scan every ADRP page for an 8-byte value matching kernel_base. this value is kallsyms_relative_base (= _text).

for (int pi = 0; pi < total_pages && !klbase_addr; pi++) {
    for (int off = 0; off < 0x1000; off += 8) {
        unsigned long v;
        safe_read(&v, (void *)(page + off), 8);
        if (v == kernel_base) {
            klbase_addr = page + off;
            break;
        }
    }
}

6. find kloffs

kallsyms_offsets[0] is always 0 — the first symbol’s address equals kallsyms_relative_base. scan every 4-byte position in every ADRP page for a u32 value of 0.

verify each candidate by passing the first 4 offsets through sprint_symbol — the kernel’s own resolver. if any returns raw hex (0x...) instead of a symbol name, the candidate is rejected. sprint_symbol always returns the correct answer because it uses the kernel’s internal lookup path.

walk the sorted u32 sequence from each verified candidate. pick the one with the longest run. the real kallsyms_offsets has num_syms entries (126252 on these devices), unmatched by any other .rodata region.

// offsets[0] is always 0 (_text - relative_base)
if (v != 0) continue;

// sprint_symbol verify: first 4 offsets must resolve to real names
for (int i = 0; i < 4 && ok; i++) {
    sprint_symbol(name, klbase_val + read_u32(addr + i * 4));
    if (name[0] == '0' && name[1] == 'x')
        ok = 0;  // sprint_symbol returned raw hex = no symbol here
}
if (!ok) continue;

// count consecutive sorted u32 entries, pick the longest
int len = 0, prev = -1;
for (int i = 0; i < 500000; i++) {
    unsigned int v = read_u32(addr + i * 4);
    if ((int)v < prev) break;
    prev = (int)v;
    len++;
}
if (len > best_len) { best_len = len; best_addr = addr; }

7. layout derivation

from the kernel source (scripts/kallsyms.c) the data structures are laid out in .rodata:

num_syms (4B)
names (compressed)
markers (u32 × ceil(num_syms/256))
token_table (256 null-terminated strings)
token_index (256 × u16 = 512B)
offsets (u32 × num_syms)
relative_base (u64)
seqs_of_names (3B × num_syms)

all addresses are derived from klbase and kloffs:

klnum_val    = (klbase_addr - kloffs_addr) / 4;   // num_syms
klindex_addr = kloffs_addr - 512;                   // token_index
klseqs_addr  = klbase_addr + 8;                     // seqs_of_names

// find klnum_addr: match value == klnum_val in ADRP pages
// find kltable_addr: backward scan from token_index for 256 strings
// compute klmarks_addr and klnames_addr from layout

8. name lookup

implement both directions:

unsigned long kallsyms_name_to_addr(const char *name)
{
    int low = 0, high = (int)klnum_val - 1;
    while (low <= high) {
        int mid = low + (high - low) / 2;
        unsigned int seq = get_sym_seq(mid);      // name-sorted -> original idx
        unsigned int off = get_sym_offset(seq);    // idx -> names offset
        expand_sym(off, nbuf, sizeof(nbuf));       // decompress
        int r = strcmp(name, nbuf);
        if (r > 0) low = mid + 1;
        else if (r < 0) high = mid - 1;
        else return sym_addr(seq);                  // offsets[seq] + klbase
    }
    return 0;
}

verify with kprobe + sprint_symbol:

unsigned long test_addr = resolve("kallsyms_lookup_name");
sprint_symbol_no_offset(truth, test_addr);
sym_name_at(test_addr, our, sizeof(our));                // addr -> name
unsigned long lookup = kallsyms_name_to_addr(truth);     // name -> addr

conclusion

full kallsyms lookup without any exported symbol dependency. sym_name_at and kallsyms_name_to_addr match kernel values on device.

repo

github.com/Dere3046/ksymless_Android

references