Re-learning Android — Binder Driver Loading

Preface

As we delve deeper into Binder, we keep getting stuck at certain boundaries. Why does hdr.type become BINDER_TYPE_HANDLE on the Client side? What does it mean that ServiceManager’s handle value is 0? What do binder_node and binder_ref represent? The answer always seems to be some form of “the Binder driver does xyz”. Fragmented knowledge like this never adds up to a thorough understanding, so from this chapter on we cross over into the kernel to see what the driver actually does behind the scenes.

To build a comprehensive Binder driver knowledge system, this chapter first introduces driver loading (initcall) and calling principles (syscall). If you already have a grasp of Linux-related kernel knowledge, you can skip this chapter and move on to the next one about binder driver key functions.

initcall Mechanism

The initcall mechanism is widely used for initializing kernel modules, subsystems, or device drivers. Compared to adding custom init functions in startup code, initcall is more flexible and decoupled. We’ll take a brief look at binder driver initialization as an example. If you want to learn more about initcall, please refer to other resources.

// common/drivers/android/binder.c

static int __init binder_init(void) {
    ... ...
}
device_initcall(binder_init);

In the binder driver source code, we can see that the initialization function binder_init is passed to a macro definition (device_initcall). This macro is defined as:

// common/include/linux/init.h

#define device_initcall(fn)        __define_initcall(fn, 6)
#define __define_initcall(fn, id) ___define_initcall(fn, id, .initcall##id)

... ...

// After several macro definitions are wrapped, the final definition is:
#define ____define_initcall(fn, __unused, __name, __sec)  \
        static initcall_t __name                \
        __used                                \
        __attribute__((__section__(__sec))) = fn;

Here, id is the level of the initcall: the lower the number, the higher the priority, so the kernel runs lower-numbered initcalls first. The binder driver’s initcall has level 6, and fn is the binder_init function. In the end, the macro expands to a single assignment statement (a variable definition with an initializer).

Let’s break it down:

• Modifiers and variable type: static and initcall_t are an ordinary storage-class specifier and type. initcall_t is an alias for a function pointer type that takes no arguments and returns an int.
• Variable name: it is assembled from __KBUILD_MODNAME, __COUNTER__, and __LINE__, which are filled in at build time. __COUNTER__ is an automatic counter whose value increases by 1 each time the macro is expanded.
• Compiler attributes: these macros expand to compiler attributes, for example #define __used __attribute__((__used__)). They are instructions for the compiler only. __used keeps the variable even if nothing references it, so the compiler does not optimize it away. __attribute__((__section__(...))) places the variable in the named section of the ELF file. The ELF file in question is vmlinux, the Linux kernel compiled into an executable.
• Function pointer: the function that needs to be executed (fn).

For those who do not understand sections, please refer to the relevant materials on ELF file format, such as Chapter 3 of “Programmer’s Self-Improvement — Linking, Loading and Libraries”.

After circling around, let’s summarize it. The macro xxx_initcall (in this case, device_initcall) is used to specify that the function pointer in the parameter is stored in a certain section of the vmlinux file. When the kernel program starts initializing, it will take out some initcall function pointers from the section in priority order and execute them accordingly, achieving the purpose of initializing modules, subsystems or drivers.
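The section mechanism can be tried out in user space. The following is a minimal sketch, not kernel code. It assumes GCC or Clang targeting ELF (for example Linux), where GNU ld and lld provide __start_<section> and __stop_<section> symbols for sections whose names are valid C identifiers. DEMO_INITCALL, demo_do_initcalls, and the section names are hypothetical names made up for illustration; the kernel itself orders its initcall sections with a linker script instead.

```c
typedef int (*initcall_t)(void);

/* Store a pointer to fn in the ELF section demo_initcall<level>.
 * "used" keeps the otherwise unreferenced variable alive. */
#define DEMO_INITCALL(fn, level)                                   \
    static initcall_t demo_initcall_entry_##fn                     \
    __attribute__((used, section("demo_initcall" #level))) = fn

/* Linker-provided section bounds (assumption: GNU ld or lld on ELF). */
extern initcall_t __start_demo_initcall1[], __stop_demo_initcall1[];
extern initcall_t __start_demo_initcall6[], __stop_demo_initcall6[];

int demo_call_log[8];   /* levels in the order they actually ran */
int demo_call_count;

static int early_setup(void)  { demo_call_log[demo_call_count++] = 1; return 0; }
static int driver_setup(void) { demo_call_log[demo_call_count++] = 6; return 0; }

/* Registration order in the source does not matter; the level does. */
DEMO_INITCALL(driver_setup, 6);
DEMO_INITCALL(early_setup, 1);

/* Call every function pointer stored between start and stop. */
static void demo_run_level(initcall_t *start, initcall_t *stop)
{
    for (initcall_t *fn = start; fn < stop; fn++)
        (*fn)();
}

/* Run level 1 before level 6 and return how many initcalls ran. */
int demo_do_initcalls(void)
{
    demo_call_count = 0;
    demo_run_level(__start_demo_initcall1, __stop_demo_initcall1);
    demo_run_level(__start_demo_initcall6, __stop_demo_initcall6);
    return demo_call_count;
}
```

Calling demo_do_initcalls() from main runs early_setup before driver_setup, even though driver_setup was registered first in the source file.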

Now let’s go back to the fifth article on the init process’s kernel_init function. Earlier, we focused on the part that starts the init process, and now we'll see how the kernel executes these initcall functions before starting the init process.

// kernel/common/init/main.c

#define __initdata    __section(".init.data")
// initcall_entry_t: alias for initcall_t
// initcall_levels array elements: pointers to level's first initialization function pointer
// __initcall_end: section end address
static initcall_entry_t *initcall_levels[] __initdata = {
    __initcall0_start,
    __initcall1_start,
    __initcall2_start,
    __initcall3_start,
    __initcall4_start,
    __initcall5_start,
    __initcall6_start,
    __initcall7_start,
    __initcall_end,
};

static int __ref kernel_init(void *unused) {
    ... ...
    kernel_init_freeable();
    ... ...
}

// kernel_init indirectly calls do_initcalls (through kernel_init_freeable and do_basic_setup)
static void __init do_initcalls(void) {
    ... ...
    // Iterate over all levels, handling one level at a time
    // The loop stops at ARRAY_SIZE - 1 because the last element (__initcall_end) only marks the end of the section; it is not the start of a level
    for (level = 0; level < ARRAY_SIZE(initcall_levels) - 1; level++) {
        ... ...
        do_initcall_level(level, command_line);
    }
    ... ...
}

static void __init do_initcall_level(int level, char *command_line) {
    initcall_entry_t *fn;
    ... ...
    // Since levels are connected, we can traverse the entire level by incrementing fn and checking if it's less than the next level start address
    // initcall_from_entry function converts fn to the actual initialization function address
    for (fn = initcall_levels[level]; fn < initcall_levels[level+1]; fn++)
        do_one_initcall(initcall_from_entry(fn));
}

// Execute initialization functions
int __init_or_module do_one_initcall(initcall_t fn) {
    ... ...
    do_trace_initcall_start(fn);
    ret = fn();
    do_trace_initcall_finish(fn, ret);
    ... ...
    return ret;
}
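The traversal relies on nothing more than pointer arithmetic over a contiguous table. Here is a small user-space model of it, a sketch under the assumption that a plain array can stand in for the initcall sections. All demo_ names and the three init functions are hypothetical.

```c
#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

typedef int (*initcall_t)(void);

int demo_trace[8];   /* records which function ran, in order */
int demo_trace_len;

static int init_a(void) { demo_trace[demo_trace_len++] = 'a'; return 0; }
static int init_b(void) { demo_trace[demo_trace_len++] = 'b'; return 0; }
static int init_c(void) { demo_trace[demo_trace_len++] = 'c'; return 0; }

/* All function pointers laid out back to back, like the initcall sections. */
static initcall_t demo_table[] = { init_a, init_b, init_c };

/* Start of each level; the last element only marks the end. */
static initcall_t *demo_levels[] = {
    &demo_table[0],   /* level 0: init_a */
    &demo_table[1],   /* level 1: empty, same start as level 2 */
    &demo_table[1],   /* level 2: init_b, init_c */
    &demo_table[3],   /* end marker, not a level */
};

/* Walk one level: from its start up to the start of the next level. */
static void demo_do_level(int level)
{
    for (initcall_t *fn = demo_levels[level]; fn < demo_levels[level + 1]; fn++)
        (*fn)();
}

/* Walk every level in order and return the number of calls made. */
int demo_do_all_levels(void)
{
    demo_trace_len = 0;
    for (int level = 0; level < (int)ARRAY_SIZE(demo_levels) - 1; level++)
        demo_do_level(level);
    return demo_trace_len;
}
```

An empty level needs no special case: its start equals the next level's start, so the inner loop body never runs.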

binder_init function

// common/drivers/android/binder.c

static int __init binder_init(void) {
    int ret;
    ... ...

    char *device_name, *device_tmp;
    struct binder_device *device;
    struct hlist_node *tmp;
    char *device_names = NULL;
    const struct binder_debugfs_entry *db_entry;

    // Create and register a memory shrinker. When system memory is under pressure,
    // the shrinker's callback function will be called to perform memory cleanup logic.
    // Returns 0 on success, or an error code on failure.
    ret = binder_alloc_shrinker_init();
    if (ret)
        return ret;

    ... ...

    // ① Create a directory for binder information in the DebugFS file system
    binder_debugfs_dir_entry_root = debugfs_create_dir("binder", NULL);

    // ② Create files for binder output, such as state, stats, and transactions
    binder_for_each_debugfs_entry(db_entry)
        debugfs_create_file(db_entry->name,
                db_entry->mode,
                binder_debugfs_dir_entry_root,
                db_entry->data,
                db_entry->fops);

    // ③ Create a directory for binder/proc, with files named after each process's PID
    binder_debugfs_dir_entry_proc = debugfs_create_dir("proc",
                binder_debugfs_dir_entry_root);

    ... ...
    // ④ Initialize the binder device
    while ((device_name = strsep(&device_tmp, ","))) {
        ret = init_binder_device(device_name);
        if (ret)
            goto err_init_binder_device_failed;
    }

    // ⑤ Initialize the binderfs file system
    ret = init_binderfs();
    ... ...
    return ret;
}

① DebugFS is a Linux kernel-provided memory file system for easy development and debugging of kernel programs. Developers can use user-space processes (e.g., shell) to easily observe kernel program output and logs written to this file system. This step creates the binder directory in the DebugFS file system, storing binder-related information and logs.

The dentry pointer returned by debugfs_create_dir represents a Linux directory entry structure, which can be a file, directory, or symbolic link.

For a deeper understanding of Linux file systems and related structures, we recommend reading Chapter 12 of “Deep Understanding of the Linux Kernel”.

② The binder_for_each_debugfs_entry(db_entry) macro looks unusual because it seems to lack a semicolon at the end. Upon closer inspection, it's defined as a macro in common/drivers/android/binder_internal.h:

#define binder_for_each_debugfs_entry(entry)    \
    for ((entry) = binder_debugfs_entries;    \
         (entry)->name;        \
         (entry)++)

The macro expands to a full for loop header, as shown in the expanded version in common/drivers/android/binder.c:

for ((entry) = binder_debugfs_entries; (entry)->name; (entry)++)
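The same pattern works in any C program: a macro that expands to a for loop header, iterating over a table terminated by an entry whose name is NULL. The sketch below is a stand-alone illustration. The demo_ names are hypothetical, the entry names come from the comment in binder_init, and the 0444 modes are placeholder values.

```c
#include <stddef.h>

struct demo_entry {
    const char *name;
    int mode;
};

/* The table ends with an entry whose name is NULL. */
static const struct demo_entry demo_entries[] = {
    { "state", 0444 },
    { "stats", 0444 },
    { "transactions", 0444 },
    { NULL, 0 },
};

/* Expands to a loop header only; the statement that follows is the body. */
#define demo_for_each_entry(entry)      \
    for ((entry) = demo_entries;        \
         (entry)->name;                 \
         (entry)++)

/* Count the entries before the NULL terminator. */
int demo_count_entries(void)
{
    const struct demo_entry *e;
    int count = 0;

    demo_for_each_entry(e)
        count++;            /* loop body supplied by the caller */
    return count;
}
```

Because the macro ends where the loop header ends, the call site reads like a built-in loop statement, which is why no semicolon follows the macro invocation.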

③ Create binder/proc directory

This directory will be created to hold files related to each process, making it easier to observe binder-related information from a process perspective.

Let’s take a look at what kind of data these files in the binder directory record. Although we may not fully understand all the data now, after reading the driver source code later, we’ll have a better understanding.

Note: Android emulator system does not mount DebugFS file system by default. To manually mount it, run the following command with root privileges: mount -t debugfs debugfs /sys/kernel/debug

stats file

The stats file contains overall binder statistics and per-process binder thread and transaction information.

state file

The state file contains information about each process’s thread, node, and reference status.

binder/proc directory

This directory contains files named after each process’s PID, with contents similar to the state file.

④ Initialize three binder devices

Why are we initializing three Binder devices? See the appendix for an introduction.

Initialization of Binder Device

static int __init init_binder_device(const char *name) {
    int ret;
    struct binder_device *binder_device;

    // Allocate memory for the binder device structure
    binder_device = kzalloc(sizeof(*binder_device), GFP_KERNEL);
    if (!binder_device)
        return -ENOMEM;

    // Set the file operations function for the binder device
    // The `binder_fops` is a file_operations type structure that contains functions for handling device file operations in user space
    binder_device->miscdev.fops = &binder_fops;
    ... ...

    // Initialize reference count
    refcount_set(&binder_device->ref, 1);
    ... ...

    // Register the binder as a misc-type device
    ret = misc_register(&binder_device->miscdev);
    if (ret < 0) {
        kfree(binder_device);
        return ret;
    }

    // Add the new device to the head of the global list `binder_devices` (an hlist)
    hlist_add_head(&binder_device->hlist, &binder_devices);

    return ret;
}
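To show what adding a device to binder_devices amounts to, here is a user-space sketch of an hlist-style list. The demo_hlist_add_head function follows the classic logic of the kernel's hlist_add_head helper, written from memory as an assumption rather than copied from the source tree. Every demo_ type and variable is hypothetical.

```c
#include <stddef.h>

struct demo_hlist_node { struct demo_hlist_node *next, **pprev; };
struct demo_hlist_head { struct demo_hlist_node *first; };

/* Push n at the front of the list headed by h. */
static void demo_hlist_add_head(struct demo_hlist_node *n,
                                struct demo_hlist_head *h)
{
    struct demo_hlist_node *first = h->first;

    n->next = first;
    if (first)
        first->pprev = &n->next;
    h->first = n;
    n->pprev = &h->first;
}

/* Recover the enclosing structure from a pointer to its embedded node. */
#define demo_container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

struct demo_device {
    const char *name;
    struct demo_hlist_node hlist;
};

static struct demo_hlist_head demo_devices;
static struct demo_device demo_binder   = { .name = "binder" };
static struct demo_device demo_hwbinder = { .name = "hwbinder" };

/* Register two devices and return how many the list now holds. */
int demo_register_devices(void)
{
    int count = 0;

    demo_devices.first = NULL;
    demo_hlist_add_head(&demo_binder.hlist, &demo_devices);
    demo_hlist_add_head(&demo_hwbinder.hlist, &demo_devices);
    for (struct demo_hlist_node *n = demo_devices.first; n; n = n->next)
        count++;
    return count;
}

/* Name of the most recently added device, or NULL for an empty list. */
const char *demo_first_device_name(void)
{
    if (!demo_devices.first)
        return NULL;
    return demo_container_of(demo_devices.first, struct demo_device, hlist)->name;
}
```

Since each insertion goes to the head, the device registered last is the first one found when walking the list.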

Initialization of Binder File System

⑤ Initialize the binderfs file system.

// common/drivers/android/binderfs.c

int __init init_binderfs(void) {
    int ret;
    const char *name;
    size_t len;

    ... ...

    // Register the file system
    // The `binder_fs_type` is a file_system_type structure that contains functions for initializing and killing the file system
    ret = register_filesystem(&binder_fs_type);
    if (ret) {
        unregister_chrdev_region(binderfs_dev, BINDERFS_MAX_MINOR);
        return ret;
    }

    return ret;
}

static struct file_system_type binder_fs_type = {
    .name         = "binder",  // File system name
    .init_fs_context   = binderfs_init_fs_context,  // Initialization function pointer for the file system context
    .parameters      = binderfs_fs_parameters,  // File system parameters
    .kill_sb     = binderfs_kill_super,  // Function pointer to kill the file system
    .fs_flags     = FS_USERNS_MOUNT,  // File system flags
};

After initializing the file system, in the init process execution phase of rc scripts, the file system is mounted under /dev/binderfs and a symbolic link for the binder device file is created under /dev.

// system/core/rootdir/init.rc

# Mount binderfs
mkdir /dev/binderfs
mount binder binder /dev/binderfs stats=global
chmod 0755 /dev/binderfs

symlink /dev/binderfs/binder /dev/binder
symlink /dev/binderfs/hwbinder /dev/hwbinder
symlink /dev/binderfs/vndbinder /dev/vndbinder

chmod 0666 /dev/binderfs/hwbinder
chmod 0666 /dev/binderfs/binder
chmod 0666 /dev/binderfs/vndbinder

We can check the mount status of the binder file system by running the following command:

emulator64_x86_64:/ # mount | grep binder
binder on /dev/binderfs type binder (rw,relatime,max=1048576,stats=global)

emulator64_x86_64:/ # ls -l /dev | grep binder
lrwxrwxrwx  1 root        root              20 2025-01-10 03:24 binder -> /dev/binderfs/binder
drwxr-xr-x  4 root        root               0 2025-01-10 03:24 binderfs
lrwxrwxrwx  1 root        root              22 2025-01-10 03:24 hwbinder -> /dev/binderfs/hwbinder
lrwxrwxrwx  1 root        root              23 2025-01-10 03:24 vndbinder -> /dev/binderfs/vndbinder
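The same check can be done from a program by scanning the mount table. The helper below is a hedged sketch: it assumes the /proc/mounts line format (device, mount point, type, options, two numbers), which differs slightly from the output of the mount command shown above. demo_parse_binderfs_mount is a hypothetical name.

```c
#include <stdio.h>
#include <string.h>

/* Return 1 if a /proc/mounts style line describes a mount of type "binder"
 * and copy its mount point into mnt (of size mnt_len); otherwise return 0. */
int demo_parse_binderfs_mount(const char *line, char *mnt, size_t mnt_len)
{
    char dev[64], dir[128], type[32];

    /* Fields are separated by whitespace: device, mount point, fs type. */
    if (sscanf(line, "%63s %127s %31s", dev, dir, type) != 3)
        return 0;
    if (strcmp(type, "binder") != 0)
        return 0;
    snprintf(mnt, mnt_len, "%s", dir);
    return 1;
}
```

A caller would read /proc/mounts line by line with fgets and pass each line to this function until it returns 1.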

There is a point of confusion here. After the binder file system is initialized, the driver initialization function ends without creating the corresponding binder device files. However, when we run the ls command, we can see that there are binder device files in the /dev/binderfs directory. So, when were these device files created? The creation of these files must be between the driver initialization function and the rc script that creates symbolic links.

Binderfs Mounting

Although mount is a shell command, it ultimately issues the mount system call, which reaches the kernel function do_mount to complete the mounting.

(https://cs.android.com/android/kernel/superproject/+/common-android14-6.1:common/fs/namespace.c;l=3390)

// common/fs/namespace.c

long do_mount(const char *dev_name, const char __user *dir_name,
        const char *type_page, unsigned long flags, void *data_page) {
    struct path path;
    int ret;

    // Parse and check the path string passed in
    ret = user_path_at(AT_FDCWD, dir_name, LOOKUP_FOLLOW, &path);
    if (ret)
        return ret;
    // Start the mount
    ret = path_mount(dev_name, &path, type_page, flags, data_page);
    path_put(&path);
    return ret;
}

int path_mount(const char *dev_name, struct path *path,
        const char *type_page, unsigned long flags, void *data_page) {
    ... ...
    return do_new_mount(path, type_page, sb_flags, mnt_flags, dev_name,
                data_page);
}

static int do_new_mount(struct path *path, const char *fstype, int sb_flags,
                        int mnt_flags, const char *name, void *data) {
    struct file_system_type *type;
    struct fs_context *fc;
    ... ...
    // Recall the binder_fs_type mentioned earlier? This function is responsible for finding it in the kernel file system list.
    type = get_fs_type(fstype);
    ... ...
    // ① Create a context instance for the file system, which will assign some function pointers used later to fs_context
    fc = fs_context_for_mount(type, sb_flags);
    ... ...
    if (!err)
        // ② Complete the creation of superblock, root directory inode, dentry, etc. data structures for the file system
        err = vfs_get_tree(fc);
    if (!err)
        // The basic data structures needed for mounting have been created in the previous step, and this function starts executing the mount action
        err = do_new_mount_fc(fc, path, mnt_flags);
    ... ...
    return err;
}

① Analysis of the fs_context_for_mount function

// common/fs/fs_context.c

struct fs_context *fs_context_for_mount(struct file_system_type *fs_type,
                    unsigned int sb_flags) {
    return alloc_fs_context(fs_type, NULL, sb_flags, 0,
                    FS_CONTEXT_FOR_MOUNT);
}

static struct fs_context *alloc_fs_context(struct file_system_type *fs_type,
                    struct dentry *reference,
                    unsigned int sb_flags,
                    unsigned int sb_flags_mask,
                    enum fs_context_purpose purpose) {
    // Declare a function pointer that takes a fs_context pointer and returns an int
    int (*init_fs_context)(struct fs_context *);
    struct fs_context *fc;
    int ret = -ENOMEM;

    // Allocate memory for fs_context
    fc = kzalloc(sizeof(struct fs_context), GFP_KERNEL_ACCOUNT);
    ... ...
    fc->purpose    = purpose;
    ... ...
    // Store the file_system_type in the fs_context field
    fc->fs_type    = get_filesystem(fs_type);
    ... ...
    fc->log.prefix    = fs_type->name;
    ... ...
    // Get the init_fs_context function pointer stored in fc->fs_type and call it
    // Here, fs_type->init_fs_context is binder_fs_type.init_fs_context, i.e. binderfs_init_fs_context
    init_fs_context = fc->fs_type->init_fs_context;
    ... ...
    ret = init_fs_context(fc);
    ... ...
    return fc;
}

// common/drivers/android/binderfs.c

static int binderfs_init_fs_context(struct fs_context *fc) {
    ... ...
    fc->ops = &binderfs_fs_context_ops;
    return 0;
}

// The binderfs_init_fs_context function sets fc->ops to binderfs_fs_context_ops,
// which holds the function pointers related to the file system context
static const struct fs_context_operations binderfs_fs_context_ops = {
    .free        = binderfs_fs_context_free,
    .get_tree    = binderfs_fs_context_get_tree,
    .parse_param = binderfs_fs_context_parse_param,
    .reconfigure = binderfs_fs_context_reconfigure,
};
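The chain of indirections (look up the type by name, let the type install its operations in the context, then call through the context) can be modelled in a few lines of user-space C. This is a simplified sketch, not the VFS API: every demo_ name is hypothetical, and a one-element array stands in for the kernel's list of registered file systems.

```c
#include <stddef.h>
#include <string.h>

struct demo_fs_context;

struct demo_fs_context_ops {
    int (*get_tree)(struct demo_fs_context *fc);
};

struct demo_fs_type {
    const char *name;
    int (*init_fs_context)(struct demo_fs_context *fc);
};

struct demo_fs_context {
    const struct demo_fs_type *fs_type;
    const struct demo_fs_context_ops *ops;
    int tree_built;     /* set by get_tree; stands in for the superblock */
};

static int demo_binderfs_get_tree(struct demo_fs_context *fc)
{
    fc->tree_built = 1;
    return 0;
}

static const struct demo_fs_context_ops demo_binderfs_context_ops = {
    .get_tree = demo_binderfs_get_tree,
};

/* The file system type installs its own operations in the context. */
static int demo_binderfs_init_fs_context(struct demo_fs_context *fc)
{
    fc->ops = &demo_binderfs_context_ops;
    return 0;
}

/* Stand-in for the kernel's list of registered file systems. */
static const struct demo_fs_type demo_registered[] = {
    { "binder", demo_binderfs_init_fs_context },
};

static const struct demo_fs_type *demo_get_fs_type(const char *name)
{
    for (size_t i = 0; i < sizeof(demo_registered) / sizeof(demo_registered[0]); i++)
        if (strcmp(demo_registered[i].name, name) == 0)
            return &demo_registered[i];
    return NULL;
}

/* Look up the type, let it fill in fc->ops, then call through fc->ops.
 * Returns 1 on success and -1 on failure. */
int demo_mount(const char *fstype)
{
    struct demo_fs_context fc = { 0 };

    fc.fs_type = demo_get_fs_type(fstype);
    if (!fc.fs_type)
        return -1;
    if (fc.fs_type->init_fs_context(&fc) != 0)
        return -1;
    if (fc.ops->get_tree(&fc) != 0)
        return -1;
    return fc.tree_built;
}
```

The generic code in demo_mount never names a binderfs function directly. It only follows pointers that the file system type registered, which is what keeps the mount path independent of any particular file system.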

② Analysis of the vfs_get_tree function

// common/fs/super.c

int vfs_get_tree(struct fs_context *fc) {
    struct super_block *sb;
    int error;
    ... ...
    // As analyzed earlier, fc->ops->get_tree is binderfs_fs_context_ops.get_tree, i.e. binderfs_fs_context_get_tree
    error = fc->ops->get_tree(fc);
    if (error < 0)
        return error;
    ... ...
    return 0;
}

// common/drivers/android/binderfs.c
static int binderfs_fs_context_get_tree(struct fs_context *fc) {
    // The get_tree_nodev function creates the file system's super block super_block
    // and in the process calls the binderfs_fill_super function to fill the super block
    return get_tree_nodev(fc, binderfs_fill_super);
}

// This function has many initialization operations, but we are most concerned with creating device files
static int binderfs_fill_super(struct super_block *sb, struct fs_context *fc) {
    ... ...
    // The binder_devices_param is "binder,hwbinder,vndbinder"
    // The for loop iteratively calls binderfs_binder_device_create
    name = binder_devices_param;
    for (len = strcspn(name, ","); len > 0; len = strcspn(name, ",")) {
        strscpy(device_info.name, name, len + 1);
        ret = binderfs_binder_device_create(inode, NULL, &device_info);
        if (ret)
            return ret;
        name += len;
        if (*name == ',')
            name++;
    }
    ... ...
    return 0;
}

// Create a device file and create the corresponding inode and dentry
static int binderfs_binder_device_create(struct inode *ref_inode,
                    struct binderfs_device __user *userp,
                    struct binderfs_device *req) {
    ... ...
    device = kzalloc(sizeof(*device), GFP_KERNEL);
    if (!device)
        goto err;

    inode = new_inode(sb);
    if (!inode)
        goto err;
    ... ...
    // Note binder_fops: it stores the function pointers for operating on binder device files.
    // All operations on the device file eventually map to the functions these pointers refer to.
    inode->i_fop = &binder_fops;
    inode->i_uid = info->root_uid;
    inode->i_gid = info->root_gid;
    ... ...
    device->binderfs_inode = inode;
    ... ...
    return 0;
}

The discussion about binder device files led to the analysis of the binderfs mounting process, and we finally found the creation process of device files in the mount process. We also confirmed that the inode corresponding to the device file stores function pointers for file operations. This is crucial for our subsequent analysis of binder driver functions. However, since our focus has always been on binder driver-related content, this section only briefly introduces Linux file systems, VFS, and mounting operations, and recommends readers consult other resources for a more comprehensive understanding.
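The name-splitting loop in binderfs_fill_super is worth a closer look, because it walks the comma-separated list without modifying it. Below is a user-space sketch of the same strcspn technique. demo_split_device_names and DEMO_NAME_MAX are hypothetical, and memcpy plus an explicit terminator replaces the kernel-only strscpy.

```c
#include <string.h>

#define DEMO_NAME_MAX 32

/* Split a comma-separated list into out[]; return the number of names,
 * or -1 if there are too many names or one of them is too long. */
int demo_split_device_names(const char *list, char out[][DEMO_NAME_MAX], int max)
{
    const char *name = list;
    int count = 0;

    /* strcspn returns the length of the run before the next ',' (or the end). */
    for (size_t len = strcspn(name, ","); len > 0; len = strcspn(name, ",")) {
        if (count == max || len >= DEMO_NAME_MAX)
            return -1;
        memcpy(out[count], name, len);
        out[count][len] = '\0';
        count++;
        name += len;
        if (*name == ',')
            name++;     /* skip the separator */
    }
    return count;
}
```

As in the driver loop, iteration stops at the first empty name, so a list such as "binder,,hwbinder" yields only the first entry.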

A Single System Call: binder_open

In the previous article about ServiceManager, we learned that ServiceManager creates a ProcessState instance during startup, and in the ProcessState constructor, it opens the binder device file. Let’s review this code:

// frameworks/native/libs/binder/ProcessState.cpp

ProcessState::ProcessState(const char* driver)
      : mDriverName(String8(driver)),
        mDriverFD(-1),
        mVMStart(MAP_FAILED),
        ... ...
        mMaxThreads(DEFAULT_MAX_BINDER_THREADS),
        mCurrentThreads(0),
        mKernelStartedThreads(0),
        ... ...
        mThreadPoolStarted(false),
        mThreadPoolSeq(1),
        mCallRestriction(CallRestriction::NONE) {
    // Open the Binder device file
    base::Result<int> opened = open_driver(driver);
    ... ...
}

static base::Result<int> open_driver(const char* driver) {
    // Open the device file using the system call open
    int fd = open(driver, O_RDWR | O_CLOEXEC);
    ... ...
    return fd;
}

The open function is implemented in bionic, which we described in Chapter 4 as the directory containing the POSIX standard library along with the implementations or macro definitions for the various system calls.

// bionic/libc/bionic/open.cpp

int open(const char* pathname, int flags, ...) {
  ... ...
  // FDTRACK_CREATE is a macro used to track file descriptor creation for debugging and performance analysis. We can skip it temporarily.
  return FDTRACK_CREATE(__openat(AT_FDCWD, pathname, force_O_LARGEFILE(flags), mode));
}

The implementation of __openat corresponds to assembly code, which is generated by gensyscalls.py based on the description in SYSCALLS.TXT (https://cs.android.com/android/platform/superproject/+/android-14.0.0_r28:bionic/libc/tools/gensyscalls.py;l=456).

// bionic/libc/SYSCALLS.TXT

# This file is used to automatically generate bionic's system call stubs.

The generated assembly code will be placed in the out directory corresponding to the CPU architecture. The example shown here is for the arm64 architecture. Assembly codes for other CPU architectures may differ slightly, but their purpose is the same: to complete the system call open and open a target file.

// out/soong/.intermediates/bionic/libc/syscalls-arm64/gen/syscalls-arm64.S

ENTRY(__openat)
    mov x8, __NR_openat
    svc #0

    cmn x0, #(MAX_ERRNO + 1)
    cneg x0, x0, hi
    b.hi __set_errno_internal

    ret
END(__openat)

The two most important statements in the assembly code above are mov x8, __NR_openat and svc #0.

// __NR_openat is the system call number, which is stored in a table by the kernel for easy lookup.
// This statement loads the value of __NR_openat into the x8 register.
mov x8, __NR_openat

// The svc instruction triggers a software interrupt, causing the current process to switch to kernel mode and execute the corresponding function from the system call table.
svc #0

The value of __NR_openat is 56 for arm64 architecture. Please refer to different header files according to your actual situation.

// bionic/libc/kernel/uapi/asm-generic/unistd.h
#define __NR_openat 56
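The libc function syscall() exposes the same mechanism from C: it loads the system call number and arguments into registers and traps into the kernel. The sketch below assumes Linux with glibc or bionic headers. SYS_openat from <sys/syscall.h> corresponds to __NR_openat for the architecture being built for, so its value is 56 on arm64 but may differ elsewhere (for example 257 on x86_64). demo_raw_openat is a hypothetical helper name.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <fcntl.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Open path read-only by invoking the openat system call by number.
 * Returns a file descriptor, or -1 with errno set by syscall(). */
int demo_raw_openat(const char *path)
{
    return (int)syscall(SYS_openat, AT_FDCWD, path, O_RDONLY | O_CLOEXEC, 0);
}
```

For example, demo_raw_openat("/dev/null") returns a usable descriptor on a typical Linux system, and the caller is responsible for closing it with close().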

In the kernel, system call numbers are also defined:

// common/include/uapi/asm-generic/unistd.h

#define __NR_openat 56
__SYSCALL(__NR_openat, sys_openat)

The macro __SYSCALL is defined as follows:

// common/arch/arm64/kernel/sys.c

#undef __SYSCALL
#define __SYSCALL(nr, sym)    asmlinkage long __arm64_##sym(const struct pt_regs *);

After expansion: asmlinkage long __arm64_sys_openat(const struct pt_regs *);

The expanded function will be defined by the macro SYSCALL_DEFINE4:

// common/fs/open.c

SYSCALL_DEFINE4(openat, int, dfd, const char __user *, filename, int, flags,
        umode_t, mode) {
    if (force_o_largefile())
        flags |= O_LARGEFILE;
    return do_sys_open(dfd, filename, flags, mode);
}
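How the number placed in x8 selects a handler can be illustrated with a toy dispatch table. This is a model under stated assumptions, not the arm64 entry code: the demo_ names and the table size are hypothetical, and the stand-in handler simply returns its flags argument (x2) so that the argument passing is visible.

```c
#include <errno.h>
#include <stddef.h>

#define DEMO_NR_openat   56      /* value quoted in the text for arm64 */
#define DEMO_NR_SYSCALLS 64      /* hypothetical table size */

struct demo_pt_regs {
    unsigned long regs[31];      /* x0..x30 as saved on kernel entry */
};

typedef long (*demo_syscall_fn_t)(const struct demo_pt_regs *);

/* Stand-in handler: x0=dfd, x1=filename, x2=flags, x3=mode. */
static long demo_sys_openat(const struct demo_pt_regs *regs)
{
    return (long)regs->regs[2];
}

/* Table indexed by system call number; unset slots stay NULL. */
static const demo_syscall_fn_t demo_sys_call_table[DEMO_NR_SYSCALLS] = {
    [DEMO_NR_openat] = demo_sys_openat,
};

/* Read the number from x8 and call the matching handler. */
long demo_invoke_syscall(const struct demo_pt_regs *regs)
{
    unsigned long nr = regs->regs[8];

    if (nr >= DEMO_NR_SYSCALLS || !demo_sys_call_table[nr])
        return -ENOSYS;
    return demo_sys_call_table[nr](regs);
}
```

Every handler takes a single pointer to the saved registers, which matches the shape of the __arm64_sys_openat prototype shown above.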

The do_sys_open function starts the standard file opening process. For more details on that process, please refer to other specialized texts. The focus here is on how the binder_fops structure is retrieved and executed at the end of the process, which closes the loop on the device file initialization we saw earlier.

// common/fs/open.c

static int do_dentry_open(struct file *f,
              struct inode *inode,
              int (*open)(struct inode *, struct file *)) {
    ... ...
    // The inode parameter corresponds to the file that needs to be opened, which is /dev/binder in this scenario.
    f->f_inode = inode;
    ... ...

    // Retrieve file operation function pointers from the inode, which were previously stored in binder_fops
    f->f_op = fops_get(inode->i_fop);
    ... ...

    /* normally all 3 are set; ->open() can clear them if needed */
    f->f_mode |= FMODE_LSEEK | FMODE_PREAD | FMODE_PWRITE;
    if (!open) {
        // If the open function pointer is empty, retrieve it from the inode
        // This allows the caller of do_dentry_open to specify an alternate open function to override the default behavior
        // In our case, we pass a null open function, so binder_open will be called
        open = f->f_op->open;
    }
    if (open) {
        error = open(inode, f);
        if (error)
            goto cleanup_all;
    }
    ... ...
}

Through this analysis process, we have restored the actual appearance of a system call and finally reached our destination, binder_open. In subsequent analyses of other system calls, we will no longer focus on the calling process but instead analyze the binder driver functions themselves.
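The hand-off from inode to file to open callback can be condensed into a user-space model. It is a sketch with hypothetical demo_ types. The detail that the open callback stores per-process state in private_data is an assumption about what binder_open does, not something shown in this chapter.

```c
#include <stddef.h>

struct demo_inode;
struct demo_file;

struct demo_file_operations {
    int (*open)(struct demo_inode *, struct demo_file *);
};

struct demo_inode {
    const struct demo_file_operations *i_fop;  /* set when the device file is created */
};

struct demo_file {
    const struct demo_file_operations *f_op;   /* copied from the inode on open */
    void *private_data;
};

static int demo_proc_state;   /* stands in for per-process driver state */

static int demo_binder_open(struct demo_inode *inode, struct demo_file *filp)
{
    (void)inode;
    filp->private_data = &demo_proc_state;
    return 0;
}

static const struct demo_file_operations demo_binder_fops = {
    .open = demo_binder_open,
};

/* Model of the dispatch: prefer the caller's open, else the one from f_op. */
int demo_dentry_open(struct demo_file *f, struct demo_inode *inode,
                     int (*open)(struct demo_inode *, struct demo_file *))
{
    f->f_op = inode->i_fop;
    if (!open)
        open = f->f_op->open;
    if (open)
        return open(inode, f);
    return 0;
}

/* Open a modelled binder device; returns 1 if demo_binder_open ran. */
int demo_open_binder_device(void)
{
    struct demo_inode inode = { .i_fop = &demo_binder_fops };
    struct demo_file file = { 0 };

    if (demo_dentry_open(&file, &inode, NULL) != 0)
        return -1;
    return file.private_data == &demo_proc_state;
}
```

Passing NULL as the third argument mirrors the case described above, where no alternate open function is supplied and the one stored in the inode's operations is used.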

Appendix

Three Binder Devices (binder, hwbinder, vndbinder)

Android 8 brought a major change: Project Treble. It separates hardware vendor code from system.img and moves it to vendor.img, decoupling the Android system code from the underlying vendor software so that the two can be upgraded independently. After the decoupling, some of the underlying software logic moved into independent HAL processes. To let these processes communicate, three Binder devices (binder, hwbinder, vndbinder) and their corresponding user-space managers (servicemanager, hwservicemanager, vndservicemanager) were introduced.

• System processes communicate with each other through the binder device
• HAL processes communicate with system processes through the hwbinder device
• HAL processes communicate with other HAL processes through the vndbinder device

The three work on the same principles and share similar code, which is why the driver creates three Binder devices during initialization.