Patch Diffing Entire Firmware Images, Squeezing Out Bugs

TL;DR

Firmware patch diffing is a relatively under-documented process, but one that can be really important in IoT security research. In this post, I’m going to pick apart a Netgear firmware image and figure out the root cause of one of the bugs in it.

We’ve also just released the first revision of our GraphQL API, which we’ve started using internally to speed up a number of repetitive tasks. One of those is patch diffing, so there will be some light references to the API in here today too.

What Do We Actually Mean By Patch Diffing?

As part of our ongoing research, we often “patch diff” firmware images. We might do this to identify the root causes of bugs we haven’t discovered ourselves, or to figure out exactly how a vendor has patched an issue we disclosed to them.

Patch diffing is the process of comparing the vulnerable and fixed versions of a product. With this, and some light metadata, more often than not we can pin down where the product has changed, and therefore infer exactly how it was vulnerable in the first place and how the issue was fixed.

Generic diff of a Lua script.
Diffing helps us figure out exactly how issues have been patched, and what the problem was in the first place.

Traditionally, patch diffing was done with two compiled binaries. Using tools like IDA, BinDiff, and Ghidra, it’s (relatively) simple to see exactly where and how a compiled executable has changed between revisions. However, turning this into actual insight is not necessarily a painless process. You have to be comfortable staring at endless assembly, figuring out exactly where the bug you are looking for might have been. And if there are lots of other non-security-related changes between revisions, you’ll also have to spend a lot of time sifting through noise.

Patch diffing firmware then adds another layer to the puzzle. For a start, we often have to work with entire filesystems, rather than the single patched file in isolation. We need to be able to drill down within a firmware image to focus only on files that have changed between two revisions. We then need to compare each of those files to figure out exactly which section within each file is relevant to the bug, and which changes are less relevant for us.

This can be quite time-consuming, especially if you do a lot of this kind of work. Luckily, we’ve recently added a GraphQL API to make diffing and querying firmware images for this purpose a bit quicker and easier.

Drilling Down on Potentially Interesting Bugs

Let’s identify a bug we want to figure out the root cause for. Of course, I’m interested in all bugs, but near the top of my bugs-of-interest list are exploitable WAN-side router bugs.

Netgear recently released a series of patches for a whole swathe of their devices. The details in the official advisory are thin (read: there are absolutely no details), as Netgear has bundled 9 different fixes into one post and communicated close to nothing regarding the actual issues they’ve fixed.

Netgear advisories, February 2021.
There are “multiple” issues in “some” products.

However, many of the issues also have ZDI IDs, which means that they were disclosed by the Zero Day Initiative (ZDI). The ZDI recently published a few advisories related to their 2020 Mobile Pwn2Own, a couple of which were related to a WAN-side RCE bug in the Netgear R7800. I’ve had my eye on that one since November, so let’s look at the details provided by ZDI.

WAN-Side Fun (CVE-2021-27251)

The issue we’re going to focus on is the snappily-titled “NETGEAR Nighthawk R7800 ready-genie-cloud Insecure Download of Critical Component Remote Code Execution Vulnerability”, aka CVE-2021-27251, used by “Team Flashback”. It affects firmware version 1.0.2.78 (and likely many before) of the R7800, but was fixed in version 1.0.2.80. At the time we did this diff, IoT Inspector wasn’t picking up bugs exactly like this, so it warranted a closer look to enhance our detection capabilities.

Armed with what we know from the advisory and both the vulnerable and fixed firmware versions, we can start trying to find the root causes of this issue.

Shall I Compare Thee to A Version 1.0.2.80?

While some firmware update blobs may contain only the files that have been patched, this is quite uncommon in router firmware. The firmware blobs we get from Netgear both contain pretty much the entire device filesystem.

R7800 firmware version 1.0.2.78 filesystem open in 7zip
The 1.0.2.78 squashfs filesystem as seen in 7zip.

This means that in each firmware image, not only are we going to have the changed files we’re interested in diffing, we’re also going to be sifting through files that haven’t been modified between revisions.

In order to lighten our workload, we need a way to figure out exactly which files have changed between these two revisions.

Hashing Everything

Probably the most efficient way to do this is to hash every file in the image. Then, for each comparable file, check whether the hash has changed between versions.
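As a minimal sketch of that comparison, assume each firmware version has already been reduced to a mapping of relative path to hash. The function name `diff_manifests` and every entry except `sbin/cloud` are illustrative placeholders, not output from a real image.

```python
def diff_manifests(old, new):
    """Compare two {relative_path: hash} mappings from different versions."""
    old_paths, new_paths = set(old), set(new)
    return {
        "added": sorted(new_paths - old_paths),
        "removed": sorted(old_paths - new_paths),
        "changed": sorted(
            path for path in old_paths & new_paths if old[path] != new[path]
        ),
    }


# The sbin/cloud hashes are the ones reported for the two R7800 images;
# every other entry is a made-up placeholder.
old = {
    "sbin/cloud": "37e7f3c294aef429c64fc647a2ffc8a21184d5cb4f6cdef845498c45e0871a35",
    "bin/unchanged": "aaaa",
    "usr/bin/oldtool": "bbbb",
}
new = {
    "sbin/cloud": "478df6a55689bbb294d122af95fa49c5bf2e0418da9e0880a731459c82a00ab7",
    "bin/unchanged": "aaaa",
    "usr/bin/newtool": "cccc",
}

result = diff_manifests(old, new)
print(result)
```

Keying on the path relative to the filesystem root matters here, because the extraction directory name usually contains the version number and would otherwise make every path look different.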

If you’re an IoT Inspector user, you can use our GraphQL API to query firmware images. The GraphQL API exposes a firmware type, for which you can list all the files and their associated hashes. I’m going to quickly go over how you might query the API so if you’re not (yet 😊) an IoT Inspector user, you can feel free to skip this bit (or get in touch for a free demo).

The query is pretty simple. For the given firmware, we can ask for a list of all the files, along with properties for each. Notably, we can also ask for the SHA256 hash for the file (SHA1 and MD5 are also available for those who like multiple hash types).

{ 
  firmware(id: "6d5dfcbd-6d92-43df-b0bf-770ae1760ecf") { 
    files { 
      path 
      name 
      binary 
      fileType 
      size 
      hash { 
        sha256 
      } 
    } 
  } 
}
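For readers who want to script this, here is a rough sketch of packaging the query as a conventional GraphQL-over-HTTP POST request using only the Python standard library. The endpoint URL, the bearer-token header, and the helper name `build_files_request` are assumptions for illustration; check the API documentation for the real endpoint and authentication scheme.

```python
import json
import urllib.request

FILES_QUERY = """
{
  firmware(id: %s) {
    files { path name binary fileType size hash { sha256 } }
  }
}
"""


def build_files_request(endpoint, token, firmware_id):
    """Build (but do not send) a POST request carrying the files query."""
    # json.dumps yields a double-quoted string literal, which is also a
    # valid GraphQL string for simple IDs like a UUID.
    query = FILES_QUERY % json.dumps(firmware_id)
    body = json.dumps({"query": query}).encode("utf-8")
    return urllib.request.Request(
        endpoint,
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": "Bearer " + token,  # assumed auth scheme
        },
        method="POST",
    )


# Hypothetical endpoint and token, used only to show the call shape.
request = build_files_request(
    "https://api.example.invalid/graphql",
    "TOKEN",
    "6d5dfcbd-6d92-43df-b0bf-770ae1760ecf",
)
# Sending it would look like:
#   with urllib.request.urlopen(request) as response:
#       data = json.load(response)
```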

The response is just a JSON object we can easily ingest into a script.

{"data": { 
  "firmware": { 
    "files": [ 
      {
        "path": "/R7800-V1.0.2.78.img",
        "name": "R7800-V1.0.2.78.img",
        "size": 30075009,
        "binary": true,
        "fileType": "FILE_UNKNOWN",
        "hash": {
            "sha256": "330b77f348bc97849800b9f3af7d1fe52fb6e145525ad494149d7a202c0c3cbf"
        }
      },
      { 
        "path": "/_R7800-V1.0.2.78.img.extracted/[", 
        "name": "[", 
        "size": 17, 
        "binary": false, 
        "fileType": "FILE_UNKNOWN", 
        "hash": { 
            "sha256": "6afeed78bb88f94590401d627865d72f2a86fd1cc8ff23599460c2f2a24db2b6" 
        } 
      }, 
...etc...
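Ingesting a response like this might look something like the sketch below. The `short_path` rule (dropping the version-specific `.extracted/` directory and the `squashfs-root/` prefix) is my assumption about how to make paths comparable across versions, and the sample response is trimmed to three records.

```python
import json


def short_path(path):
    """Strip the version-specific extraction prefix from an API path."""
    _, separator, rest = path.partition(".extracted/")
    if not separator:
        # Not inside an extraction directory, e.g. the image file itself.
        return path.lstrip("/")
    if rest.startswith("squashfs-root/"):
        rest = rest[len("squashfs-root/"):]
    return rest


def manifest_from_response(text):
    """Index the file records of a response by their short path."""
    files = json.loads(text)["data"]["firmware"]["files"]
    manifest = {}
    for record in files:
        record = dict(record, short_path=short_path(record["path"]))
        manifest[record["short_path"]] = record
    return manifest


SAMPLE_RESPONSE = """
{"data": {"firmware": {"files": [
  {"path": "/R7800-V1.0.2.78.img", "name": "R7800-V1.0.2.78.img",
   "size": 30075009, "binary": true, "fileType": "FILE_UNKNOWN",
   "hash": {"sha256": "330b77f348bc97849800b9f3af7d1fe52fb6e145525ad494149d7a202c0c3cbf"}},
  {"path": "/_R7800-V1.0.2.78.img.extracted/[", "name": "[",
   "size": 17, "binary": false, "fileType": "FILE_UNKNOWN",
   "hash": {"sha256": "6afeed78bb88f94590401d627865d72f2a86fd1cc8ff23599460c2f2a24db2b6"}},
  {"path": "/_R7800-V1.0.2.78.img.extracted/squashfs-root/sbin/cloud", "name": "cloud",
   "size": 4581, "binary": false, "fileType": "FILE_SHEBANG",
   "hash": {"sha256": "37e7f3c294aef429c64fc647a2ffc8a21184d5cb4f6cdef845498c45e0871a35"}}
]}}}
"""

manifest = manifest_from_response(SAMPLE_RESPONSE)
print(sorted(manifest))
```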

Quick Aside: The DIY Version

It won’t give you quite as much granular information for each file, but Quentin Kaiser has done a really nice writeup on “ghetto” patch diffing some Cisco firmware. His approach was simply to list every file within each extracted firmware image, run md5sum on each one, and store the results to a file. The resulting files can then be diffed using, well, diff. This is perhaps a little more work, but certainly a totally functional solution, especially for less complex device images.
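A rough Python equivalent of that DIY approach, assuming the firmware has already been extracted to a directory on disk, could look like this. The function name `hash_tree` is illustrative, and this is a sketch of the general idea rather than the method from that writeup.

```python
import hashlib
import os


def hash_tree(root, algorithm="md5"):
    """Return {path relative to root: hex digest} for every regular file."""
    manifest = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full_path = os.path.join(dirpath, name)
            # Skip symlinks and special files; firmware images are full of
            # links into busybox that would otherwise be hashed repeatedly.
            if os.path.islink(full_path) or not os.path.isfile(full_path):
                continue
            digest = hashlib.new(algorithm)
            with open(full_path, "rb") as handle:
                for chunk in iter(lambda: handle.read(65536), b""):
                    digest.update(chunk)
            manifest[os.path.relpath(full_path, root)] = digest.hexdigest()
    return manifest
```

Running it once per extracted squashfs root gives two mappings that can be compared key by key, or dumped to sorted text files and fed to diff.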

Once we have the list of all files for a given firmware image, we can filter and compare them however we want. In this case, since this particular Netgear model uses opkg, there are quite a lot of files in /usr/lib/opkg/ that vary very slightly between firmware versions, but which are basically only noise to us.

Barely any difference between two opkg control files.
Extremely arbitrary one-byte difference in an opkg .control file. There’s a lot of these.

Again, in our client-side script logic, we can filter out plaintext files in /usr/lib/opkg that have different hashes but exactly the same size.
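That filter could be sketched as a small predicate over a pair of file records. The record shape follows the API output, but the helper name `is_opkg_noise` and the sample records (including their hashes and sizes) are invented for illustration.

```python
def is_opkg_noise(old, new):
    """True for same-size plaintext opkg metadata whose hash changed."""
    return (
        old["short_path"].startswith("usr/lib/opkg/")
        and not old["binary"]
        and not new["binary"]
        and old["size"] == new["size"]
        and old["hash"]["sha256"] != new["hash"]["sha256"]
    )


def record(short_path, binary, size, sha256):
    # Helper for building illustrative records in the API's shape.
    return {
        "short_path": short_path,
        "binary": binary,
        "size": size,
        "hash": {"sha256": sha256},
    }


# Placeholder values, not taken from the real images.
pairs = [
    (record("usr/lib/opkg/info/foo.control", False, 312, "aa"),
     record("usr/lib/opkg/info/foo.control", False, 312, "bb")),
    (record("sbin/cloud", False, 4581, "cc"),
     record("sbin/cloud", True, 10536, "dd")),
]

candidates = [old["short_path"] for old, new in pairs if not is_opkg_noise(old, new)]
print(candidates)
```

An identical size with a different hash is only a heuristic for "uninteresting", so it is worth scoping a rule like this tightly to the one directory where the noise was actually observed.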

With a little tweaking, we have a pretty short list of files that may be candidates for further analysis.

Whittling Down

From what we know from the ZDI’s CVE-2021-27251 advisory, the affected component is “ready-genie-cloud”, and the issue stems from “a fallback to a[n] insecure protocol to deliver updates”.

There’s not an obvious “ready-genie-cloud” file in our list of changed files, but there is a /sbin/cloud.

Interestingly, there wouldn’t be much use in diffing these files directly. In version 1.0.2.78, /sbin/cloud is a shell script, and in 1.0.2.80 it’s an ELF binary.

{ 
  'binary': False,
  'fileType': 'FILE_SHEBANG', 
  'hash': {'sha256': '37e7f3c294aef429c64fc647a2ffc8a21184d5cb4f6cdef845498c45e0871a35'}, 
  'name': 'cloud', 
  'path': '/_R7800-V1.0.2.78.img.extracted/squashfs-root/sbin/cloud', 
  'short_path': 'sbin/cloud', 
  'size': 4581
}

 

{ 
  'binary': True,
  'fileType': 'FILE_ELF', 
  'hash': {'sha256': '478df6a55689bbb294d122af95fa49c5bf2e0418da9e0880a731459c82a00ab7'}, 
  'name': 'cloud', 
  'path': '/_R7800-V1.0.2.80.img.extracted/squashfs-root/sbin/cloud', 
  'short_path': 'sbin/cloud', 
  'size': 10536
}

We can’t do a precise diff, but we can do a quick manual overview to see if we’ve found our root cause.
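A change of file type like this is easy to flag automatically from the first few bytes of each file. The sketch below borrows the `FILE_SHEBANG`, `FILE_ELF`, and `FILE_UNKNOWN` labels from the API output, but the classification rule itself is my own simplification and not necessarily how the API decides.

```python
ELF_MAGIC = b"\x7fELF"  # first four bytes of every ELF file
SHEBANG = b"#!"         # interpreter line at the start of a script


def classify(header):
    """Classify a file from its leading bytes."""
    if header.startswith(ELF_MAGIC):
        return "FILE_ELF"
    if header.startswith(SHEBANG):
        return "FILE_SHEBANG"
    return "FILE_UNKNOWN"


def type_changed(old_header, new_header):
    """True when two versions of a file are different kinds of file."""
    return classify(old_header) != classify(new_header)


print(type_changed(b"#!/bin/sh\n", b"\x7fELF\x01\x01\x01"))
```

Pairs flagged this way are better candidates for a manual side-by-side read than for a byte-level or function-level diff.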

Checking for Vulns

The old /sbin/cloud is a shell script. It contains several callable functions, but the one that looks the most promising is update():

https_url="https://http.fw.updates1.netgear.com/sw-apps/ready-genie-cloud/r7800"
ftp_url="ftp://updates1.netgear.com/sw-apps/ready-genie-cloud/r7800"
[...snip...]

update() {
# local cloud_binary_install=$(/bin/config get cloud_binary_install)
[ -f /tmp/.cloud_updated ] && return 1
PID_file=/var/run/cloud.pid
[ -f $PID_file ] && return 1
# install_local
echo "$$" > $PID_file
retry_count=0
while [ 1 ]; do
echo "start to get info from $https_url"
curl -L --capath /etc/ssl/certs $https_url/fileinfo.txt -o /tmp/cloud_info 2>/dev/null
url_way="https"
[ -s /tmp/cloud_info ] && break
echo "cannot $https_url/ or don't find readygeniecloud tarball with version $version"
echo "start to get info from $ftp_url"
curl $ftp_url/fileinfo.txt -o /tmp/cloud_info 2>/dev/null
url_way="ftp"
[ -s /tmp/cloud_info ] && break
echo "cannot access $ftp_url/ or don't find readygeniecloud tarball with version $version"
dynamic_sleep
done

[...snip...]

This script seems very likely to be the root cause we’re looking for. A call is made to curl to download fileinfo.txt over an HTTPS connection. However, if the HTTPS download fails (leaving /tmp/cloud_info missing or empty), the fallback is to download over a plaintext FTP connection.

A well-positioned attacker on the WAN side could spoof DNS responses, force the HTTPS connection to fail, and host a malicious FTP server. Chained together, this could probably force the script to download whatever files the attacker wanted. All this correlates with the advisory description. In this case the “fallback to [the] insecure protocol to deliver updates” is likely a reference to a fallback to FTP.
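To make the control flow concrete, here is a toy Python model of the retry loop. The fetchers are stand-in callables rather than real network code, the loop is bounded where the original retries forever, and the names `fetch_fileinfo`, `blocked_https`, and `attacker_ftp` are invented for this sketch.

```python
def fetch_fileinfo(fetchers, max_rounds=3):
    """Return (scheme, body) for the first non-empty response."""
    for _ in range(max_rounds):
        # Same ordering as the script: HTTPS first, then FTP.
        for scheme in ("https", "ftp"):
            body = fetchers[scheme]()
            if body:  # stands in for `[ -s /tmp/cloud_info ]`
                return scheme, body
    return None, b""


def blocked_https():
    # Models an HTTPS request the attacker has caused to fail.
    return b""


def attacker_ftp():
    # Models a response from an attacker-run FTP server (placeholder data).
    return b"evil.tar.gz 1.0.2h 00000000000000000000000000000000\n"


scheme, body = fetch_fileinfo({"https": blocked_https, "ftp": attacker_ftp})
print(scheme)
```

The point of the model is that nothing after the fallback distinguishes a trusted response from an untrusted one; the script only records which scheme succeeded.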

Let’s look a little further into the script flow and try to get a little more certainty.

The fileinfo.txt file that gets downloaded to /tmp/cloud_info is simple. Its contents look something like the following:

readygeniecloud-r7800new2-20191014.tar.gz 1.0.2h 3a66705fd5b33d5c5ac272a63dad2b1a

Each line has three whitespace-separated fields: a filename, an OpenSSL version, and a hash.

We can see some of these values being extracted in the script:

    ssl_version=`openssl version | awk '{print $2}'`    
    fullversion=`cat /tmp/cloud_info | grep $ssl_version | awk '{print $1'}` 
    md5value=`cat /tmp/cloud_info | grep $ssl_version | awk '{print $3}'` 
[...snip...] 
   while [ 1 ]; do 
        if [ "x$url_way" = "xftp" ]; then 
           curl $ftp_url/$fullversion -o /tmp/cloud.tar.gz 2>/dev/null 
    [ "$(md5sum /tmp/cloud.tar.gz | awk '{print $1}')" = "$md5value" ] && break 
    echo "fail to download $ftp_url/$fullversion"

The $fullversion value is read from /tmp/cloud_info based on the OpenSSL version installed on the device. The file is downloaded from the same root as fileinfo.txt ($ftp_url), then run through md5sum, and the result is checked against the MD5 value read from fileinfo.txt. Because the expected hash arrives over the same untrusted channel as the file itself, the check only guards against corrupted downloads. An attacker in control of fileinfo.txt could therefore easily force an affected Netgear device to download an arbitrary update file, which would then be verified against an attacker-controlled MD5 hash.
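The weakness is easy to demonstrate offline. The sketch below loosely mirrors the grep/awk field extraction and the md5sum comparison in Python; the function names and the forged filename and payload are made up, and it takes only the first matching line, which is a simplification of what grep would do.

```python
import hashlib


def select_update(fileinfo_text, ssl_version):
    """Return (filename, md5) from the first line mentioning ssl_version."""
    for line in fileinfo_text.splitlines():
        fields = line.split()
        if len(fields) >= 3 and ssl_version in line:
            return fields[0], fields[2]
    return None


def passes_md5_check(payload, expected_md5):
    """Equivalent of comparing md5sum output against the listed value."""
    return hashlib.md5(payload).hexdigest() == expected_md5


# An attacker who serves fileinfo.txt simply lists the hash of their own file.
payload = b"attacker-chosen tarball bytes"
forged_fileinfo = "evil.tar.gz 1.0.2h %s\n" % hashlib.md5(payload).hexdigest()

name, md5value = select_update(forged_fileinfo, "1.0.2h")
print(name, passes_md5_check(payload, md5value))
```

A checksum delivered alongside the file can detect a truncated download, but it says nothing about who produced the file.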

The Final “Install”

Once the file has been downloaded and its checksum checked, the final stages of the install are done. A working folder is created and the downloaded update package is un-tar.gz’d to this folder. Then, critically, all the files that are un-tar.gz’d are copied to the system root recursively.

mkdir /tmp/clouddir 
tar xfz /tmp/cloud.tar.gz -C /tmp/clouddir 
echo $fullversion > /tmp/clouddir/opt/version 
touch /tmp/clouddir/opt/filelist 
find /tmp/clouddir -type f | sed 's/\/tmp\/clouddir/\/overlay/' > /tmp/clouddir/opt/filelist 
cp -fpR /tmp/clouddir/* / 
rm -f /tmp/cloud_info 
rm -f /tmp/cloud.tar.gz 
rm -rf /tmp/clouddir

As we can see, the update package is designed for exactly this behavior.

Update Package Contents

The update package mirrors a Linux filesystem so that, for example, files within the /opt/ directory will be written to (or overwrite) other files in the /opt/ directory on the R7800.

This means any file in the update package will simply overwrite the file at the same path on the system! So, there are quite a few ways an attacker might choose to practically exploit the device. It also likely means that the exploit would be very persistent, perhaps even surviving a factory reset.
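One way to see the impact is to list where each member of an update archive would land once it is copied onto the root filesystem. This sketch builds a small tar.gz in memory and maps its members to destination paths; the helper names and member names are hypothetical, and it only models the path mapping, not the actual install.

```python
import io
import posixpath
import tarfile


def build_archive(files):
    """Create an in-memory tar.gz from {member_name: bytes}."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w:gz") as archive:
        for name, data in files.items():
            info = tarfile.TarInfo(name)
            info.size = len(data)
            archive.addfile(info, io.BytesIO(data))
    return buffer.getvalue()


def install_targets(archive_bytes):
    """Paths that would be written when members are copied onto '/'."""
    targets = []
    with tarfile.open(fileobj=io.BytesIO(archive_bytes), mode="r:gz") as archive:
        for member in archive.getmembers():
            if member.isfile():
                relative = member.name.lstrip("/")
                targets.append(posixpath.normpath("/" + relative))
    return sorted(targets)


# Hypothetical contents of a malicious package.
archive = build_archive({
    "opt/version": b"placeholder\n",
    "etc/init.d/backdoor": b"#!/bin/sh\n",
})
print(install_targets(archive))
```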

What Was the Patch?

Since 1.0.2.80’s /sbin/cloud is an ELF binary rather than a script, we have to work a little harder to find the exact patch (not that hard, we just fire up Ghidra).

The logic previously handled by the update() shell script function is now handled by a series of compiled functions. As a quick example, the function at 0000924c is responsible for downloading the fileinfo.txt file, this time to a file called /tmp/fileinfo. Notably, the FTP fallback is gone. HTTPS is the only option now.

 

char * FUN_0000924c(void) 
{ 
  undefined4 uVar1; 
  char *pcVar2; 
  char acStack280 [256]; 
   
  snprintf(acStack280,0x100,"%s%s/fileinfo.txt", 
           "https://http.fw.updates1.netgear.com/sw-apps/ready-genie-cloud/",  
           &model); 
  DAT_00012550 = 0; 
  while( true ) { 
    printf("start to get info from %s%s/\n", 
           "https://http.fw.updates1.netgear.com/sw-apps/ready-genie-cloud/", 
           &model); 
    execve_custom("/tmp/fileinfo", 0, 0, "/usr/bin/curl", "-L", "--capath", "/etc/ssl/certs", acStack280, 0); 
    pcVar2 = (char *)FUN_00008bbc("/tmp/fileinfo"); 
    if ((pcVar2 != (char *)0x0) && (*pcVar2 != '\0')) break; 
    uVar1 = FUN_00008bbc(); 
    printf("cannot %s%s/ or don\'t find readygeniecloud tarball with version %s\n", 
           "https://http.fw.updates1.netgear.com/sw-apps/ready-genie-cloud/", 
           &model, 
           uVar1); 
    FUN_000091ec(); 
  } 
  return "/cloud_version"; 
}

Key Takeaways

Diffing can be a really valuable tool for identifying root causes of bugs in embedded systems. “Traditional” techniques sometimes need to be tweaked in order to apply them to entire filesystem images, update packages, or other kinds of firmware delivery mechanisms. Sometimes, you get a small curveball, like the type of file entirely changing, but serving the exact same purpose. It’s useful to go into these kinds of diffing exercises with an open mind, because device vendors can be quite unpredictable.
