Reverse Engineering OMNIVISION OS12D40 Driver

In Part 6 of our series, we structured a theoretical discussion regarding the devices, how they communicate, and the hardware elements that enable this.

Another very interesting folder that we come across during our investigation is /lib, which includes all the libraries for executables and device drivers developed by Novatek that allow the operating system to properly configure and set up all the hardware devices connected to the board. To begin, let’s introduce the topic by explaining what device drivers are and how they work on Linux-based systems.

Development of a Device Driver in Linux

The development of a device driver is done through the creation of a kernel module. Kernel modules are pieces of code that can be loaded dynamically at runtime. To load a kernel module, you can use the insmod tool (insmod name_module_to_load), where the argument is the path to the .ko file that contains the compiled module.

As the name implies, the main advantage of adopting modules is modularity. Each module is a self-contained unit that performs a specific task and is capable of interacting with the others. Do we need to disable a peripheral? We can disable just that one module. Are we installing another one? We can enable another module. All of this happens without touching the modules that manage other peripherals. The concept of software modularity is very close, if not identical, to physical modularity: in an “open” machine, I am free to install the peripherals I want with the devices I want.

The development of a device driver, however, is not risk-free. A device driver sits between the applications, the operating system, and the devices to be controlled, so it has privileged access to certain structures of the operating system. This can lead to problems such as race conditions and deadlocks. Another issue concerns system integrity: if a loaded module is able to access privileged sections, how can we prevent cyber attacks in case of intrusion? Challenges such as these remain open to this day.

Returning to Reolink, all kernel modules and extensions are contained within the /lib/modules folder and are loaded at system startup. Technically, a kernel module is an object file: a fragment of executable code that references external functions. .ko files are nothing more than relocatable ELF object files, so it is possible to reverse engineer them with the usual tools, without any special tricks.

For those interested in developing new modules for Linux, there is a well-known resource on the subject, "The Linux Kernel Module Programming Guide," which gives an excellent introduction to device driver development. This is recommended reading!

OMNIVISION Sensor Device Driver

The modules are loaded dynamically by the /etc/init.d/S10_SysInit2 script, which is started as soon as the operating system is loaded. Enumerated within this file are a number of modules to control, operate, and set up the various devices on the board ― including the optical sensor, artificial intelligence engine, and external peripherals.

[...]
insmod module_to_load.ko
[...]

What we want to do in this section is choose a device driver and, through some reverse engineering, try to understand how it works and how it interfaces with the system. For this purpose, we have chosen one of the most representative drivers on the board: the device driver for the OMNIVISION optical sensor.

The device driver for the optical sensor, nvt_sen_os12d40.ko, is located within the /lib/modules/4.19.91/hdal/sen_os12d40 folder, where os12d40 is the model of the sensor fitted in the Reolink RLC-810A camera. A quick inspection with the file utility shows that it is an ELF file for the ARM architecture.

nvt_sen_os12d40.ko: ELF 32-bit LSB relocatable, ARM, EABI5 version 1 (SYSV), BuildID[sha1]=01ccd81e7f0982593f7ca5b5c8c31f79e0e5f0aa, not stripped

In addition, we find that the binary contains some useful information, since it is “not stripped.” Let’s take a step back to better understand what “stripped” means. When a program is built, the toolchain stores a symbol table inside the file and, if requested, additional debugging information. This includes the names of the functions and of the global variables. Stripping a binary removes this data; when it is left in place, as it is here, it is very valuable for reverse engineering.

Is it useful to leave non-stripped binaries in production? Yes and no. On the one hand, it allows those who are more experienced to be able to better understand how a binary works and to be able to generate a more detailed core dump, but on the other hand it is all overhead that weighs on the size of each binary. Knowing this, it is practical for developers to include kernel modules with debug information to get more details in case of an error.

We gather more information regarding the kernel extension through the modinfo utility.

filename:       /lib/modules/4.19.91/hdal/sen_os12d40/nvt_sen_os12d40.ko
license:        GPL
description:    sen_os12d40
author:         Novatek Corp.
version:        1.40.000
srcversion:     E22953EBFA0A94EB836FD2B
depends:        kdrv_builtin,kwrap,kflow_videocapture
name:           nvt_sen_os12d40
vermagic:       4.19.91 SMP preempt mod_unload modversions ARMv7 
parm:           sen_cfg_path:Path of cfg file (charp)
parm:           sen_debug_level:Debug message level (int)

Let’s spend some time commenting on the result of the tool. The driver was developed by Novatek and GPL was specified as the license. At the moment, we can ignore the version field and the srcversion. The depends field is very important, because it specifies any dependencies of the kernel module on others ― in this case, the optical sensor module depends on kdrv_builtin, kwrap, and kflow_videocapture. There are two parameters that the extension accepts: the absolute path to the configuration file (charp: pointer to a char array) and the debug level (int: an integer), which specifies how verbose the module should be.

Speaking of licensing, when a module is declared GPL-compliant, the manufacturer should attach the original source code along with the object modules to allow end users to modify or add new functionality. In this case, it is not clear as to why Novatek did not release the source code despite having declared a GPL license.

Reverse Engineering via Ghidra

Ghidra is another essential set of tools for reverse engineering. Developed by the NSA and written largely in Java, Ghidra allows us to disassemble, decompile, and otherwise analyze a binary file.

We can extract the module and upload it to Ghidra. To do this, we create a new Ghidra project (Create new Project) and load it through the handy GUI (Load new binary...). Ghidra will ask us what kind of analysis should be performed on the binary. For now, let’s leave the ones it proposes by default, which are more than sufficient. To start the analysis, simply click the Analyze button.

Once the analysis by Ghidra is finished, we can start the exploration via the Symbol Tree panel at the bottom left. It is interesting to be able to see what operations the kernel module performs ― this is why we are interested in functions. We open the functions folder and search for the main function of the module.

For those of you who already have programming experience, you know that the entry point of a program is usually “main.” In the context of kernel module development, the entry point is not called main but init_module. Since the binary is not stripped, it contains the names of the functions, so just search within the Symbol Tree for init_module. Once found, double-click it to open the decompiler. Now let’s try to comment on the resulting code.

The decompiler is a feature of Ghidra that gives a more readable view of the program instructions. Each program has a section named .text containing machine instructions, which a disassembler shows as assembly. These instructions are very difficult to understand, because they require non-trivial knowledge of the target architecture and of assembly language. The decompiler produces higher-level, C-like code while keeping the original assembly alongside it. Obviously, the decompiler cannot recover the original source code, but it does give a good approximation of it.

void init_module(void){
  undefined *__haystack;
  undefined4 uVar1;
  char *pcVar2;
  size_t sVar3;
  int *piVar4;
  int iVar5;
  code *local_134;
  undefined4 local_130;
  undefined1 *local_12c;
  char local_125 [257];
  int local_24;
  
  local_24 = __stack_chk_guard;
  iVar5 = 0;
  memset(local_125,0,0x101);
  do {
    uVar1 = kdrv_builtin_is_fastboot();
    *(undefined4 *)(is_fastboot + iVar5 * 4) = uVar1;
    uVar1 = isp_builtin_get_i2c_id(iVar5);
    *(undefined4 *)(fastboot_i2c_id + iVar5 * 4) = uVar1;
    uVar1 = isp_builtin_get_i2c_addr(iVar5);
    *(undefined4 *)(fastboot_i2c_addr + iVar5 * 4) = uVar1;
    __haystack = sen_cfg_path;
    iVar5 = iVar5 + 1;
  } while (iVar5 != 8);
  pcVar2 = strstr(sen_cfg_path,"null");
  if ((pcVar2 == (char *)0x0) && (pcVar2 = strstr(__haystack,"NULL"), pcVar2 == (char *)0x0)) {
    if ((__haystack != (undefined *)0x0) && (sVar3 = strlen(__haystack), sVar3 < 0x101)) {
      strncpy(local_125,__haystack,0x100);
    }
    piVar4 = (int *)sen_common_open_cfg(local_125);
    if (piVar4 == (int *)0x0) {
      printk(&DAT_00018376,"sen_init_os12d40");
    }
    else {
      sen_common_load_cfg_map(piVar4,&sen_map);
      sen_common_load_cfg_preset(piVar4,(undefined4 *)sen_preset);
      sen_common_load_cfg_direction(piVar4,(undefined4 *)sen_direction);
      sen_common_load_cfg_power(piVar4,(undefined4 *)sen_power);
      sen_common_load_cfg_i2c(piVar4,(undefined4 *)sen_i2c);
      sen_common_close_cfg(piVar4);
    }
  }
  else {
    printk(&DAT_00018356,"sen_init_os12d40");
    local_125[0] = '\0';
  }
  local_130 = 0;
  local_134 = sen_pwr_ctrl_os12d40;
  local_12c = os12d40_sen_drv_tab;
  iVar5 = ctl_sen_reg_sendrv("nvt_sen_os12d40",&local_134);
  if (iVar5 == 0) {
    iVar5 = sensor_info_proc_init("nvt_sen_os12d40");
  }
  else {
    printk(&DAT_000183a2,"sen_init_os12d40");
  }
  if (local_24 != __stack_chk_guard) {
    __stack_chk_fail(iVar5);
  }
  return;
}

At first glance, this is barely legible as code. Variables like uVar1 and sVar3 may scare the reverse engineer in you. This situation is not unexpected, since compilation does not preserve local variable names. However, it is possible (with a little ingenuity) to arrive at names close to those in the original source and improve readability.

We start by cleaning up the code and adding some comments. To do this, we have two main options: add comments to the code within Ghidra (a bit cumbersome) or copy some of the decompiled code and make changes within a text editor. Personally, I prefer the second option, since I can always see the decompiled code produced by Ghidra.

The first block of code we notice is a do{..}while() loop that uses a counter named iVar5, which runs from 0 to 7. This is nothing more than a for loop over eight entries. What the entries represent is not obvious yet; since the configuration file examined later lists eight paths (path_1 to path_8), they are more likely eight sensor slots than, say, LEDs on the OMNIVISION chip. For each entry, the following parameters are retrieved: a flag named fastboot, an I2C id, and an I2C address. In all probability we are dealing with declarations of this type:

int is_fastboot[8];
int fastboot_i2c_id[8];
int fastboot_i2c_addr[8];

For now it does not matter exactly what kind of data we are talking about: we are dealing with a flag (first array), an id (a sequence number, second array), and an address (third array), all of which can be represented as integers. Furthermore, the fact that the index is multiplied by 4 to compute each memory location suggests that each element is a 32-bit integer. The resulting code is:

for(int i = 0; i<8; i++){
  is_fastboot[i] = kdrv_builtin_is_fastboot();
  fastboot_i2c_id[i] = isp_builtin_get_i2c_id(i);
  fastboot_i2c_addr[i] = isp_builtin_get_i2c_addr(i);
}
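
To see why an expression such as *(undefined4 *)(is_fastboot + iVar5 * 4) is just an array store, a small user-space sketch may help. The helper names store_u32_at and byte_offsets_match_indexing are hypothetical and are not part of the driver; the sketch only shows that writing at byte offset index * 4 lands on element index of a 32-bit array.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not in the driver): writes a 32-bit value at
 * byte offset index * 4 from base, the way the decompiled code does. */
static void store_u32_at(unsigned char *base, int index, uint32_t value)
{
    memcpy(base + index * 4, &value, sizeof value);
}

/* Fills an 8-entry table through byte offsets and returns 1 if every
 * entry can be read back through ordinary array indexing. */
static int byte_offsets_match_indexing(void)
{
    uint32_t table[8] = {0};

    for (int i = 0; i < 8; i++) {
        store_u32_at((unsigned char *)table, i, 100u + (uint32_t)i);
    }
    for (int i = 0; i < 8; i++) {
        if (table[i] != 100u + (uint32_t)i) {
            return 0;
        }
    }
    return 1;
}
```

This is why the three globals can reasonably be declared as arrays of eight 32-bit integers.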

Let’s now focus our efforts to better understand the meaning of each function. The kdrv_builtin_is_fastboot function returns true or false (guessable), depending on a condition named fastboot about which we know nothing at the moment. This function is probably contained in another module named kdrv_builtin.ko. The other two functions isp_builtin_get_i2c_id and isp_builtin_get_i2c_addr seem much more interesting, because they mention a communication system called I2C that allows serial communication between integrated circuits.

All device information travels over components named buses, in the form of electrical signals. Since multiple devices send electrical signals through a shared bus, it is essential that there be a protocol that determines how the information is to be communicated. The protocol used in this case is I2C, which we discussed in the previous article.

Once the program has populated these structures, we can locate the first parameter that is passed to the module, which is the absolute path to the sensor configuration file sen_cfg_path. The second block of instructions involves reading and setting the sensor configuration file.

Within the second block, the strstr function is used to find the first occurrence of a string s2 within a string s1. Declared in the <string.h> header, the strstr function has the following prototype:

char *strstr(const char *s1, const char *s2);

It returns a valid char pointer only in the case where string s2 is contained within string s1, otherwise a NULL pointer is returned. Given this, we know more or less what our kernel module does. From the code above, we get the following code by rewriting a few instructions:

char *sen_cfg_path;              /* module parameter */
char path_name[257] = {0};
int file_config_fd;

/* strstr returns a pointer, so compare it against NULL directly */
if(strstr(sen_cfg_path, "null") == NULL && strstr(sen_cfg_path, "NULL") == NULL){

  /* check the pointer before measuring the string */
  if(sen_cfg_path != NULL && strlen(sen_cfg_path) < 257){
    strncpy(path_name, sen_cfg_path, 256);
  }

  file_config_fd = sen_common_open_cfg(path_name);

  if(file_config_fd == 0){
    printk("Error");
  } else {
    sen_common_load_cfg_map(file_config_fd, &sen_map);
    sen_common_load_cfg_preset(file_config_fd, sen_preset);
    sen_common_load_cfg_direction(file_config_fd, sen_direction);
    sen_common_load_cfg_power(file_config_fd, sen_power);
    sen_common_load_cfg_i2c(file_config_fd, sen_i2c);
    sen_common_close_cfg(file_config_fd);
  }
} else {
  /* the path contains "null" or "NULL": no configuration file is loaded */
  path_name[0] = '\0';
}

The kernel module first searches the path for the substring null (or NULL). If neither is found (that is, if sen_cfg_path does not contain the literal text "null"), it counts the characters in the path. If there are fewer than 257 (the size of the buffer allocated at the beginning), it copies up to 256 characters of sen_cfg_path into the path_name buffer. The 257th byte is left for the end-of-string character \0, which the memset at the start of init_module already put in place.
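
The check described above can be tried out in user space with a minimal sketch. The helper copy_cfg_path and the macro CFG_PATH_BUF are hypothetical names introduced only for illustration. Unlike the decompiled code, this version tests for a NULL pointer before calling strstr and reports an over-long path as unusable instead of continuing with an empty buffer.

```c
#include <stddef.h>
#include <string.h>

#define CFG_PATH_BUF 257  /* same size as local_125 in the decompiled code */

/* Hypothetical helper mirroring the check in init_module.
 * Returns 1 and fills out when the path is usable, 0 when the path is
 * missing, too long, or contains "null"/"NULL". */
static int copy_cfg_path(const char *path, char out[CFG_PATH_BUF])
{
    memset(out, 0, CFG_PATH_BUF);
    if (path == NULL) {
        return 0;
    }
    if (strstr(path, "null") != NULL || strstr(path, "NULL") != NULL) {
        return 0;   /* substring match, not an exact comparison */
    }
    if (strlen(path) >= CFG_PATH_BUF) {
        return 0;
    }
    strncpy(out, path, CFG_PATH_BUF - 1);  /* the last byte stays '\0' */
    return 1;
}
```

One consequence of the substring test is worth noting: a path such as /tmp/nullable.cfg is rejected as well, because it contains the text "null" even though it is not equal to it.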

Next, the kernel module opens the sensor configuration file by calling the sen_common_open_cfg function. The sen_common_open_cfg function calls sen_cfg_open, which in turn calls the vos_file_open function. The vos_file_open function closely mirrors the open syscall of Unix-like operating systems. It takes as input the path of the file to open, the flags that control how it is opened (read, write, or both), and optionally another parameter named mode, used only if file creation is requested. The function returns a file descriptor, a kind of unique ID that lets an open file be referred to by a convenient number.

This file descriptor is used to read and set the different sensor configurations. To delve deeper into what each function does, we can proceed by obtaining the configuration file from the firmware. The configuration file is located inside the /mnt/src/sensor/ folder and is named sen_os12d40.cfg.

[MAP]
path_1 = 1                          #Path 1 Enable
path_2 = 0                          #Path 2 Disable
path_3 = 0                          #Path 3 Disable
path_4 = 0                          #Path 4 Disable
path_5 = 0                          #Path 5 Disable
path_6 = 0                          #Path 6 Disable
path_7 = 0                          #Path 7 Disable
path_8 = 0                          #Path 8 Disable

[PRESET]
id_0_expt_time = 10000              #10000us
id_0_gain_ratio = 1000              #1x gain

[DIRECTION]
id_0_mirror = 0                     #no mirror
id_0_flip = 0                       #no flip

[POWER]
id_0_mclk = 0                       #CTL_SEN_CLK_SEL_SIEMCLK
id_0_pwdn_pin = 0xFFFFFFFF          #no pwdn pin   
id_0_rst_pin = 0x44                    #S_GPIO_4
id_0_rst_time = 1                   #1ms
id_0_stable_time = 1                #1ms

[I2C]
id_0_i2c_id = 0                     #SEN_I2C_ID_1
id_0_i2c_addr = 0x36                #0x6C >> 1 = 0x36

As we can see, the file is divided into 5 main sections: [MAP], [PRESET], [DIRECTION], [POWER], and [I2C]. Each section corresponds to a particular function whose task is to read values from the file and interpret them. For example, for the section [MAP], there is the function sen_common_load_cfg_map and for the section [PRESET], there is the function sen_common_load_cfg_preset, and so on.
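
The comment next to id_0_i2c_addr (0x6C >> 1 = 0x36) reflects a common I2C convention: datasheets often quote the 8-bit value of the first byte on the bus, while drivers work with the 7-bit device address. A minimal sketch of the conversion follows; the function names are hypothetical and chosen only for illustration.

```c
#include <stdint.h>

/* Converts an 8-bit "write address" byte (7-bit address plus R/W bit)
 * into the 7-bit address, as the comment in the cfg file does. */
static uint8_t i2c_7bit_from_8bit(uint8_t addr8)
{
    return (uint8_t)(addr8 >> 1);
}

/* First byte sent on the bus: 7-bit address shifted left, with the
 * R/W bit in bit 0 (0 = write, 1 = read). */
static uint8_t i2c_address_byte(uint8_t addr7, int is_read)
{
    return (uint8_t)((addr7 << 1) | (is_read ? 1 : 0));
}
```

With the value from the configuration file, address 0x36 corresponds to the byte 0x6C for a write and 0x6D for a read.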

From here, we study only one of these functions, since all the others follow the same pattern.

void sen_common_load_cfg_map(int cfg_fd, uint *ptr)
{
  char str_to_find[16];
  char result[512];
  int bR;               /* bytes read for the requested key */
  unsigned long flag;
  for(int i = 0; i<8; i++){
    sprintf(str_to_find, "path_%u", i+1);   /* keys in the file run from path_1 to path_8 */
    bR = sen_cfg_get_field_str("MAP", str_to_find, result, 511, cfg_fd);
    if(bR < 1){
      printk("path_%u not exist \n", i+1);
    }
    else {
      flag = simple_strtoul(result, NULL, 0);   /* base 0 (auto-detect) is an assumption */
      if(flag == 1){
        *ptr = *ptr | 1 << (i & 0xff);
      }
    }
  }
}

In more detail, the sen_common_load_cfg_map function fetches 8 lines from the MAP section of the configuration file. Through the sprintf function, it appends the index of the path to be searched (from 1 to 8) to the string “path_”. Next, it calls the sen_cfg_get_field_str function to place in the result buffer the string that corresponds to the str_to_find key, using the cfg_fd file descriptor. The function returns the number of bytes read from the file.
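
We do not have the body of sen_cfg_get_field_str, but the kind of lookup it performs can be sketched in user space. The function cfg_get_field below is a hypothetical stand-in that works on an in-memory copy of the file rather than a file descriptor; its name, signature, and parsing rules are assumptions, not Novatek's implementation.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for sen_cfg_get_field_str: looks up "key" inside
 * "[section]" of an in-memory cfg text. Copies the value (without the
 * trailing #comment and surrounding blanks) into out and returns its
 * length, or 0 if the key is not found or has no value. */
static size_t cfg_get_field(const char *text, const char *section,
                            const char *key, char *out, size_t out_len)
{
    int in_section = 0;
    size_t sec_len = strlen(section);
    size_t key_len = strlen(key);
    const char *line = text;

    if (out_len == 0) {
        return 0;
    }
    while (*line != '\0') {
        const char *end = strchr(line, '\n');
        size_t len = end ? (size_t)(end - line) : strlen(line);
        const char *p = line;

        while (len > 0 && isspace((unsigned char)*p)) { p++; len--; }
        if (len > 0 && *p == '[') {
            /* Section header: compare the name between the brackets. */
            in_section = len >= sec_len + 2 &&
                         strncmp(p + 1, section, sec_len) == 0 &&
                         p[1 + sec_len] == ']';
        } else if (in_section && len > key_len &&
                   strncmp(p, key, key_len) == 0) {
            const char *v = p + key_len;
            const char *stop = p + len;

            while (v < stop && isspace((unsigned char)*v)) v++;
            if (v < stop && *v == '=') {
                size_t n = 0;

                v++;
                while (v < stop && isspace((unsigned char)*v)) v++;
                while (v < stop && *v != '#' &&
                       !isspace((unsigned char)*v) && n + 1 < out_len) {
                    out[n++] = *v++;
                }
                out[n] = '\0';
                return n;
            }
        }
        line = end ? end + 1 : line + strlen(line);
    }
    return 0;
}
```

Called with the section "MAP" and the key "path_1" on the file shown earlier, such a function would return 1 and leave the string "1" in the output buffer, which is the value the driver then converts to an integer.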

Once the module is sure it has read something, it attempts to convert the string to an integer using the simple_strtoul function. If that value is 1, then it updates the number stored at the address given as the ptr parameter. The new value is computed in a way that looks strange at first glance (1 << (i & 0xff)), but it makes sense.

We are used to seeing numbers almost always in base 10 notation. However, peripheral management, computing, and electronics rely on another base: binary. A number in binary notation is written using only two digits, 0 and 1. These digits are very important, because they correspond to the physical state of a signal (0: off and 1: on). We can therefore read a number (e.g., 204 in base 10, which is 11001100 in binary) as a sequence of signals to be applied to our circuit.

The statement *ptr = *ptr | 1 << (i & 0xff) sets the bit in the i-th position of the value pointed to by ptr to 1. The value looks like an ordinary number when printed with %d, but it actually represents a kind of array of bits. The result is written to sen_map through the pointer. As we can guess from the configuration file, the bit array is used to enable or disable a set of paths.
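
The bit manipulation can be checked with a short user-space sketch. The helpers build_path_map and path_is_enabled are hypothetical names; only the expression 1 << (i & 0xff) comes from the driver.

```c
#include <stdint.h>

/* Hypothetical helper: builds the path bitmask from eight 0/1 flags,
 * using the same expression as the driver. */
static uint32_t build_path_map(const int enabled[8])
{
    uint32_t map = 0;

    for (int i = 0; i < 8; i++) {
        if (enabled[i] == 1) {
            map = map | 1u << (i & 0xff);   /* set bit i */
        }
    }
    return map;
}

/* Returns 1 if bit i of map is set, 0 otherwise. */
static int path_is_enabled(uint32_t map, int i)
{
    return (int)((map >> i) & 1u);
}
```

With the configuration file shown earlier (only path_1 enabled), the resulting map is 1. Enabling the third, fourth, seventh, and eighth paths instead would give 204, the binary number 11001100 used as an example above.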

The other sections are very similar. We have flags to set the orientation of the image (mirror effect: id_0_mirror and flipped effect: id_0_flip) and the power settings (master clock: id_0_mclk, reset pin: id_0_rst_pin, reset time: id_0_rst_time). An interesting exercise for the reader might be to document what the other functions sen_common_load_cfg_preset, sen_common_load_cfg_direction, sen_common_load_cfg_power, and sen_common_load_cfg_i2c do.

Going back to the main init_module function, once the configuration file has been read, we can proceed with another block of code that is used to instantiate and register the device driver within the operating system.

iVar5 = ctl_sen_reg_sendrv("nvt_sen_os12d40",&local_134);
if (iVar5 == 0) {
    iVar5 = sensor_info_proc_init("nvt_sen_os12d40");
}
else {
    printk(&DAT_000183a2,"sen_init_os12d40");
}

The ctl_sen_reg_sendrv function is contained in another module, most likely kflow_videocapture (one of the dependencies reported by modinfo), which probably contains the “core” code for handling video capture. We now have two main avenues: delve deeper into the ctl_sen_reg_sendrv function to fully understand what goes on behind the scenes, or ignore it and delve into the other functions that are referenced in this module.

The sensor_info_proc_init function is quite interesting. This function takes care of creating a new entry named sensor_id under /proc. The /proc folder is a virtual file system mounted each time the operating system starts up, and it contains run-time information, such as details about active processes (including their file descriptors). It is also described as a virtual interface to kernel structures. In addition, it can contain information about external peripherals and their usage. After a bit of tweaking, here is the code for this function, stripped down to the essentials:

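int sensor_info_proc_init(){
  p_sensor_dir = proc_mkdir("nvt_sen_os12d40",0);   /* 0: no parent, so directly under /proc */
  if (p_sensor_dir == 0) {
    puVar1 = &DAT_00018302;                          /* error message for the directory */
  }
  else {
    /* 0x16d is 0555 in octal: read and execute for everyone */
    p_sensor_id = proc_create("sensor_id",0x16d,p_sensor_dir, proc_sensor_id_fops);
    if (p_sensor_id != 0) {
      return 0;
    }
    puVar1 = &DAT_0001832b;                          /* error message for the entry */
  }
  printk(puVar1);
  /* clean up whatever was created before the failure */
  if (p_sensor_id != 0) {
    proc_remove(p_sensor_id);
  }
  if (p_sensor_dir != 0) {
    proc_remove(p_sensor_dir);
  }
  return 0xffffffea;                                 /* -22 as a signed 32-bit value */
}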

The function first creates the /proc/nvt_sen_os12d40 folder and then the /proc/nvt_sen_os12d40/sensor_id entry, which presumably contains information about the sensor type, manufacturer, and other data. The construction of a new entry is made possible by the proc_create function, which accepts 4 parameters: the name of the entry, the permission bits, the parent folder, and proc_sensor_id_fops, a table of file operations (read, write, and so on) that the kernel invokes when user processes access the entry.
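
Two constants in the decompiled function are easier to read once decoded: the mode 0x16d passed to proc_create and the return value 0xffffffea. The helpers below are hypothetical and exist only to show the arithmetic in user space.

```c
#include <stdint.h>

/* Renders the low nine permission bits of a mode as "rwxrwxrwx" text.
 * out must have room for 10 bytes. */
static void mode_to_string(unsigned int mode, char out[10])
{
    static const char symbols[] = "rwxrwxrwx";

    for (int i = 0; i < 9; i++) {
        unsigned int bit = 1u << (8 - i);
        out[i] = (mode & bit) ? symbols[i] : '-';
    }
    out[9] = '\0';
}

/* Reinterprets a 32-bit pattern shown by the decompiler as a signed
 * two's-complement integer, without relying on overflow behavior. */
static int32_t as_signed32(uint32_t raw)
{
    if (raw <= (uint32_t)INT32_MAX) {
        return (int32_t)raw;
    }
    return -(int32_t)(UINT32_MAX - raw) - 1;
}
```

The mode 0x16d is 0555 in octal, which renders as r-xr-xr-x, so the entry is readable by every user but not writable. The return value 0xffffffea is -22, which on Linux corresponds to -EINVAL, the usual kernel convention of returning a negated errno code.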

This concludes the analysis of the init_module function. We now move on to analyze how the driver manages to set up and configure the device. We know from Part 6 that hardware devices connected to the core board can be driven in several ways, such as PIO or interrupts. We also delved into two serial communication protocols, UART and I2C. The OMNIVISION OS12D40 sensor uses I2C to communicate with the central board.

We can then proceed with decompiling two other functions that the driver provides ― sen_write_reg and sen_read_reg. The former is used to write some values to the control registers, while the latter is used to read the status and data registers.

Both functions read or write registers by calling other functions, such as i2c_transfer, which is implemented in the lib/i2c/i2c-core.ko module.

void sen_write_reg_os12d40(int reg_number,undefined4 *param_2){
    local_32 = 0;
    local_26 = (undefined)*param_2;
    local_27 = (undefined)((uint)*param_2 >> 8);
    local_30 = 3;
    local_2c = &local_27;
    local_25 = (undefined)param_2[2];
    content = (undefined2)*(int *)(sen_i2c + reg_number * 8 + 4);
    if (((*(int *)(is_fastboot + iVar1) == 0) ||
        (*(int *)(fastboot_i2c_id + iVar1) != *(int *)(sen_i2c + reg_number * 8))) ||
       (*(int *)(sen_i2c + reg_number * 8 + 4) != *(int *)(fastboot_i2c_addr + iVar1))) {
      iVar2 = 5;
      do {
        iVar1 = sen_i2c_transfer(reg_number,&content,1);
        if (iVar1 == 0) goto LAB_00012bc0;
        iVar2 = iVar2 + -1;
      } while (iVar2 != 0);
      iVar1 = -5;
    }
    else {
      isp_builtin_set_transfer_i2c(reg_number,&content,1);
      iVar1 = 0;
    }
}
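
Reading the decompiled function, the local variables appear to line up with the fields of the kernel's struct i2c_msg (addr, flags, len, buf), and the three bytes look like a 16-bit register address followed by one data byte. This is an inference from the stack layout, not something confirmed by source code. The user-space sketch below reproduces that interpretation; struct msg_sketch, build_write_msg, send_with_retry, and fake_transfer are hypothetical names, and the register and data values used with them are arbitrary.

```c
#include <stdint.h>

/* User-space mirror of the layout of the kernel's struct i2c_msg,
 * defined here only so the sketch compiles outside the kernel. */
struct msg_sketch {
    uint16_t addr;    /* 7-bit device address (from the sen_i2c table) */
    uint16_t flags;   /* 0 = write */
    uint16_t len;     /* number of bytes in buf */
    uint8_t *buf;
};

/* Fills buf with [register high byte, register low byte, data byte]
 * and points msg at it, following the decompiled assignments. */
static void build_write_msg(struct msg_sketch *msg, uint8_t buf[3],
                            uint16_t dev_addr, uint32_t reg, uint32_t data)
{
    buf[0] = (uint8_t)(reg >> 8);   /* local_27 */
    buf[1] = (uint8_t)reg;          /* local_26 */
    buf[2] = (uint8_t)data;         /* local_25 */
    msg->addr = dev_addr;           /* content  */
    msg->flags = 0;                 /* local_32 */
    msg->len = 3;                   /* local_30 */
    msg->buf = buf;                 /* local_2c */
}

/* Calls transfer up to 5 times; returns 0 on the first success and -5
 * if every attempt fails, like the do/while loop in the driver. */
static int send_with_retry(int (*transfer)(const struct msg_sketch *),
                           const struct msg_sketch *msg)
{
    for (int attempt = 0; attempt < 5; attempt++) {
        if (transfer(msg) == 0) {
            return 0;
        }
    }
    return -5;
}

/* Fake bus for demonstration: fails the first two calls, then succeeds. */
static int fake_calls;
static int fake_transfer(const struct msg_sketch *msg)
{
    (void)msg;
    fake_calls++;
    return (fake_calls < 3) ? -1 : 0;
}
```

Under this reading, the first parameter of sen_write_reg_os12d40 behaves like a sensor index into the sen_i2c table rather than a register number, and the condition on is_fastboot decides whether the message goes through the regular I2C path with up to five attempts or through isp_builtin_set_transfer_i2c. The value -5 matches -EIO numerically on Linux.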

We now understand how the driver for the OMNIVISION OS12D40 sensor is initialized and how the operating system creates the interface in order to interact. The next step will be to better understand how high-level applications interface with the device in order to control it.