Monday, July 18, 2022

Guessing, reconstructing and recovering data from a partially wiped disk

A friend asked me to help recover data from an encrypted external drive. He didn’t know how it was encrypted, but he had the password.

A first examination of the drive showed that it had only one 200MB FAT partition at the front of the disk, labeled “EFI System Partition”; the rest was unallocated. Since it’s a 2TB drive, the unallocated space probably contained partitions that had been deleted.

My first idea was to use dd to create an image of the disk, so that further work would not overwrite data on the original. dd ran for about six hours and exited after reading only about 900GB of data. dmesg showed a lot of read errors, so the drive itself was probably also damaged.

GNU ddrescue

ddrescue is a tool, not derived from dd, that is designed to image a failing disk and salvage as much uncorrupted data as possible. Its algorithm dumps the healthy sectors first and gradually moves on to the more difficult parts. It also saves a “mapfile”, which is like a map of the disk marking tried and non-tried areas, etc. The mapfile allows interrupted dumps to be resumed.

I first ran a quick pass that skips the slow scraping phase (-n):

ddrescue -n /dev/sdc sdc.img mapfile

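Then a second run, reusing the same mapfile, to go back over the difficult areas with one retry pass (-r 1):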

ddrescue -r 1 /dev/sdc sdc.img mapfile
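To see how much of the disk was actually rescued, the mapfile can be summarized. The following is a minimal sketch, assuming the mapfile layout described in the GNU ddrescue manual (comment lines starting with #, one status line, then "pos size status" block lines); the function name and the sample text are made up for illustration.

```python
from collections import defaultdict

# Status characters as described in the GNU ddrescue manual.
STATUS_NAMES = {
    "?": "non-tried",
    "*": "non-trimmed",
    "/": "non-scraped",
    "-": "bad-sector",
    "+": "finished",
}

# Hypothetical sample, not the mapfile from this recovery.
SAMPLE_MAPFILE = """# Mapfile. Created by GNU ddrescue
# current_pos  current_status  current_pass
0x00120000     +               1
#      pos        size  status
0x00000000  0x00100000  +
0x00100000  0x00010000  -
0x00110000  0x00010000  +
"""


def summarize_mapfile(text):
    """Return the total number of bytes per status in a ddrescue mapfile."""
    totals = defaultdict(int)
    seen_status_line = False
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        if not seen_status_line:
            # The first non-comment line is the current position/status,
            # not a data block, so it is skipped.
            seen_status_line = True
            continue
        _pos, size, status = line.split()[:3]
        # int(..., 0) accepts the 0x-prefixed hex values ddrescue writes.
        totals[STATUS_NAMES.get(status, status)] += int(size, 0)
    return dict(totals)


if __name__ == "__main__":
    for name, size in summarize_mapfile(SAMPLE_MAPFILE).items():
        print(f"{name}: {size} bytes")
```

For a real run, the text would come from `open("mapfile").read()`.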

Side note on ZFS: I forgot that the disk I was saving sdc.img to only has 1TB of space, yet ddrescue completed successfully and ls reported that the image file was indeed 2TB. So I was probably saved by ZFS compression!
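One way to check that guess is to compare the size ls reports (the apparent size) with the space the file really occupies. A small sketch, assuming a Unix-like system where st_blocks is counted in 512-byte units; the helper name is hypothetical.

```python
import os


def apparent_and_allocated(path):
    """Return (apparent size, bytes actually allocated on disk) for a file."""
    st = os.stat(path)
    # st_blocks is reported in 512-byte units on Linux and macOS,
    # independent of the filesystem's own block size.
    return st.st_size, st.st_blocks * 512


if __name__ == "__main__":
    apparent, allocated = apparent_and_allocated(__file__)
    print(f"apparent: {apparent} bytes, allocated: {allocated} bytes")
```

Calling `apparent_and_allocated("sdc.img")` on the image would show an allocated figure well below 2TB if compression (or holes in the file) made it fit.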

Is it BitLocker?

The crucial next step to recover data would be to rebuild the partition table to align with the actual partitions living on the disk. Once partitions have been identified, I can then mount the filesystems within them (or repair them). To do so, I must determine exactly what filesystem the lost partition holds.

I decided to look at the files on the only healthy partition: the EFI System Partition. Mounting this partition was easy, since it was just FAT. In it I found a few files generated by Windows, such as WPSettings.dat, so I assumed the disk had been encrypted with BitLocker.

The dislocker-find utility from the dislocker project is capable of finding BitLocker partitions. However, it can only check partitions that already exist in the partition table; it cannot search a whole disk for one.

Reading the dislocker-find source code shows that the program simply checks for a signature on each partition to determine whether it contains BitLocker. Searching for “signatures” in the repo turns up two relevant files: dislocker.c, which defines the Ruby signature variables (dislocker-find is written in Ruby), and common.h, which defines the signature constants:

#define BITLOCKER_SIGNATURE      "-FVE-FS-"
#define BITLOCKER_SIGNATURE_SIZE strlen(BITLOCKER_SIGNATURE)

#define NTFS_SIGNATURE           "NTFS    "
#define NTFS_SIGNATURE_SIZE      strlen(NTFS_SIGNATURE)

#define BITLOCKER_TO_GO_SIGNATURE "MSWIN4.1"
#define BITLOCKER_TO_GO_SIGNATURE_SIZE strlen(BITLOCKER_TO_GO_SIGNATURE)

I should be able to find the BitLocker partition boundaries by searching the entire disk for these strings. BitLocker To Go is the more likely case here, since it’s an external drive. I first tried a simple grep:

grep -a -b "MSWIN4.1" sdc.img #FAILED

This failed because grep reads a whole line into memory before processing it. A binary image can go on for gigabytes without a newline byte, so grep ate all the memory and crashed. Instead, one should use dd and pipe through fold before grep:

dd if=sdc.img | fold | grep -b -a "MSWIN4.1"
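One caveat with the fold approach is that fold inserts newlines, so a signature can be split across two lines and the offsets printed by grep -b refer to the folded stream rather than to the image. As an alternative, here is a minimal sketch of a chunked search that keeps a small overlap between reads and reports true byte offsets. The function names are hypothetical, and the alignment check assumes the signature sits in the 8-byte OEM ID field at byte 3 of a 512-byte boot sector.

```python
def find_signature(path, signature, chunk_size=1 << 20, limit=None):
    """Yield the byte offset of every occurrence of `signature` in a file.

    Reads the file in chunks so memory use stays bounded. `limit` stops the
    scan once roughly that many bytes have been examined.
    """
    overlap = len(signature) - 1
    base = 0      # file offset of buf[0]
    buf = b""
    with open(path, "rb") as f:
        while limit is None or base < limit:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            buf += chunk
            start = 0
            while True:
                i = buf.find(signature, start)
                if i < 0:
                    break
                yield base + i
                start = i + 1
            # Keep the tail so a signature straddling two chunks is still
            # found. The tail is shorter than the signature, so no match
            # is reported twice.
            keep = buf[-overlap:] if overlap else b""
            base += len(buf) - len(keep)
            buf = keep


def looks_like_boot_sector(offset, sector_size=512, field_offset=3):
    """True if a match sits where a boot sector's OEM ID field would be.

    Assumption: the signature is stored at byte 3 of the first sector.
    """
    return (offset - field_offset) % sector_size == 0
```

Usage could look like `for off in find_signature("sdc.img", b"MSWIN4.1", limit=8 << 30): print(off, looks_like_boot_sector(off))`.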

Before searching for the BitLocker signature, I confirmed that the method worked by searching for different strings known to exist on the disk. However, it couldn’t find any occurrence of the BitLocker signature. (I only let it search through the first few gigabytes of the disk: since the first partition only occupied the first 200MB, it wouldn’t make sense to create a second one with a ton of space in between.) I also tried “-FVE-FS-” with no result.

The findings so far indicated that either my method was incorrect, or the disk really didn’t contain a BitLocker partition. To test my method, I spun up a Windows VM, attached a virtual USB disk (disk image) to it, enabled BitLocker on the USB disk, and examined the disk image.

Note: To enable BitLocker, the USB disk has to be formatted as NTFS first, otherwise the BitLocker options won’t show up.

Windows gave two options for enabling BitLocker: encrypt the entire disk, or encrypt only the files (I forgot the exact wording).

Before actually testing the Windows-created BitLocker disk image with grep, I noticed that neither of the two methods created an “EFI System Partition”, which made me doubt whether the drive was actually encrypted with BitLocker.

Note: some useful information I found on recovering BitLocker.

Recover partitions with testdisk, is it HFS+?

I had grown doubtful that the disk used BitLocker, so I decided to take testdisk for another spin. testdisk found an additional HFS+ partition that comes after the EFI System Partition. I added it back to the partition table and tried to mount it, but the mount failed.

According to the fsck.hfsplus manpage, it seems to be able to repair the B-tree in the filesystem. I don’t actually know the internals of HFS+, but the B-tree sounds like an internal structure that supports storing files.

Before running fsck, I needed to set up a loop device that contains only the HFS+ partition. Then I could use -fryd to repair the B-tree:

fsck.hfsplus -fryd /dev/loop22

Unfortunately it failed to repair. I also tried using fsck_hfs on macOS which reported the same error.

Note: to mount a full disk image on macOS to a virtual device, use:

hdiutil attach -imagekey diskimage-class=CRawDiskImage -nomount filename

This command creates a new device node /dev/diskX, where X is a number that depends on your system. Then simply run fsck_hfs -fryd /dev/diskXsY, where Y is the number of the partition (slice) to check.

Is it encrypted HFS+?

I vaguely remembered the days when Apple had to maintain and build new features upon the antique HFS+ because APFS was still in development. HFS+ does not support encryption, and Apple’s way of adding disk encryption support was to develop a “volume manager” called Core Storage that does encryption at the block level, and to create the actual filesystem (HFS+) in an encrypted volume provided by Core Storage. Core Storage serves a similar purpose to LVM on Linux. This is probably why I couldn’t mount the HFS+ partition previously.

For comparison, I decided to create an “encrypted HFS+ external disk” with macOS. But Apple, being Apple, removed the feature from Disk Utility in recent macOS versions. Fortunately, I can easily create macOS VirtualBox VMs with this script. I used the High Sierra version. Note: though there are other projects that provision macOS KVM VMs, I decided to use VirtualBox because its GUI for passing through USB devices (or creating virtual USB disks) is much easier to use than editing qemu command-line arguments.

Here’s the disk layout of encrypted HFS+ created by macOS:

sudo fdisk -lu /dev/sdd
Disk /dev/sdd: 7.45 GiB, 8000110592 bytes, 15625216 sectors
Disk model: X
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: gpt
Disk identifier: X

Device        Start      End  Sectors  Size Type
/dev/sdd1        40   409639   409600  200M EFI System
/dev/sdd2    409640 15363031 14953392  7.1G Apple Core storage
/dev/sdd3  15363032 15625175   262144  128M Apple boot

That 200MiB EFI System partition is exactly the same as the one on my corrupted disk!

Verifying Core Storage signatures on disk

To verify that it is actually Apple Core storage and Apple boot partitions sitting on the disk, I have to find the partition signatures on it. The locations of signatures are also important because I need to recreate the partition table based on the actual partition location.

libfvde has documented the Core Storage signature and its location: at 88 bytes from the start of the partition, there is a signature “CS”. In ASCII, “CS” is the byte sequence 0x43 0x53. (Another reference here)

I then used SearchBin to search for the byte sequence in the disk image:

python3 SearchBin/searchbin.py -p 004353 sdc.img

“00” was added to rule out false positives. The field technically contains other header data and might not always be “00”, but on my healthy sample Core Storage partition it is “00”, so I used it anyway.

SearchBin quickly gave me a match at offset 209735767, which is just over 200MiB. This could very well be the beginning of the missing Core Storage partition, since the preceding partition is exactly 200MiB in size.

On my healthy sample, the Apple Core storage partition starts at sector 409640. 409640 times the sector size of 512 bytes equals 209735680 bytes, just over 200MiB. The match at 209735767 is the “00” byte of my search pattern, which sits one byte before the signature at offset 88, i.e. at offset 87 within the partition. 209735767 minus 87 equals 209735680! This means that the corrupted Apple Core storage partition starts at the same sector (409640) as the healthy one. This makes sense if both of them were created with default settings.
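The arithmetic can be written out as a short check. This is only a sketch using the numbers from this post; the constant names are mine.

```python
SECTOR_SIZE = 512
MATCH_OFFSET = 209735767   # offset SearchBin reported for the pattern 00 43 53
SIGNATURE_OFFSET = 88      # position of "CS" inside the partition (per libfvde)
PATTERN_PREFIX = 1         # the extra 00 byte placed before "CS" in the pattern

# The match points at the 00 byte, so "CS" itself is one byte further on.
partition_start_byte = MATCH_OFFSET + PATTERN_PREFIX - SIGNATURE_OFFSET
start_sector, remainder = divmod(partition_start_byte, SECTOR_SIZE)

if __name__ == "__main__":
    print("partition starts at byte", partition_start_byte)
    print("which is sector", start_sector, "with remainder", remainder)
```

A remainder of zero means the candidate start falls exactly on a sector boundary, which is what a real partition start should do.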

I also used hexdump to verify that the data on both samples are aligned and similar:

hexdump -C -n 10K -s 209735680 sdc.img

Finding the Apple boot partition

Next, I have to find the boundaries of the missing Apple boot partition. Looking at the healthy sample, the Apple boot partition sits near the end of the disk, but leaves some trailing sectors.

The healthy disk:

  • has 15625216 sectors in total
  • 15363032 to 15625175 is the Apple boot partition

The boundaries of the Apple boot partition are probably calculated backwards from the end of the disk, since this leaves most of the space in the middle for the actual Core Storage data.

Assuming these won’t change across different disks:

  • distance from the end of the Apple boot partition to the end of the disk
  • length of the Apple boot partition

I can work out the start and end of the missing Apple boot partition from the size of the disk (3907029168 sectors):

  • start: 3906766984
  • end: 3907029127

And, according to the healthy disk, there is also no space between the Core Storage partition and the Apple boot partition, so I can work out the end of the Core Storage partition:

  • start: 409640 (already known)
  • end: 3906766983 (one sector less than the start of the Apple boot partition)
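The same calculation as a small sketch, under the assumptions stated above (the trailing gap and the length of the Apple boot partition are the same on both disks). The function and parameter names are hypothetical.

```python
def infer_layout(total_sectors, ref_total, ref_boot_start, ref_boot_end, cs_start):
    """Infer partition boundaries (inclusive sectors) from a reference disk."""
    # Distance from the last sector of Apple boot to the end of the disk.
    tail_gap = ref_total - ref_boot_end
    # Length of the Apple boot partition in sectors.
    boot_len = ref_boot_end - ref_boot_start + 1

    boot_end = total_sectors - tail_gap
    boot_start = boot_end - boot_len + 1
    return {
        # Core Storage ends right before Apple boot begins.
        "core_storage": (cs_start, boot_start - 1),
        "apple_boot": (boot_start, boot_end),
    }


if __name__ == "__main__":
    layout = infer_layout(
        total_sectors=3907029168,   # the 2TB disk image
        ref_total=15625216,         # the healthy sample disk
        ref_boot_start=15363032,
        ref_boot_end=15625175,
        cs_start=409640,
    )
    for name, (start, end) in layout.items():
        print(f"{name}: {start} to {end} ({end - start + 1} sectors)")
```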

Similarly, I also used hexdump to verify the data on both disks are aligned (looking at the location of zeros vs non-zeros).
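Comparing zeros against non-zeros by eye in hexdump output works, but it can also be scripted. A minimal sketch, with a hypothetical helper that prints one character per 512-byte block, so the maps of the same region on two images can be compared directly:

```python
def zero_map(path, offset, length, block=512):
    """Return '.' for each all-zero block and '#' for each block holding data."""
    marks = []
    with open(path, "rb") as f:
        f.seek(offset)
        remaining = length
        while remaining > 0:
            data = f.read(min(block, remaining))
            if not data:
                break  # reached the end of the file
            # any() over bytes is True as soon as one byte is non-zero.
            marks.append("#" if any(data) else ".")
            remaining -= len(data)
    return "".join(marks)
```

For example, `zero_map("sdc.img", 3906766984 * 512, 64 * 512)` would describe the first 64 sectors of the presumed Apple boot partition.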

Recreating the lost partitions in the partition table

Now I have all information I need to recreate the lost partitions with fdisk sdc.img:

Note that, when specifying the type of a partition, one cannot simply type the name of the type, like “Apple boot” or “Apple Core storage”. One has to use the GUID. Also, not all GUIDs supported by fdisk are listed when you press ‘L’ to list them. This is weird, and I don’t know why it is the case.

$ fdisk sdc.img

Welcome to fdisk (util-linux 2.36.1).
Changes will remain in memory only, until you decide to write them.
Be careful before using the write command.


Command (m for help): p
Disk sdc.img: 1.82 TiB, 2000398934016 bytes, 3907029168 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: gpt
Disk identifier: X

Device        Start        End    Sectors  Size Type
sdc.img1         40     409639     409600  200M EFI System
sdc.img2     409640 3906766983 3906357344  1.8T Apple boot
sdc.img3 3906766984 3907029127     262144  128M Apple HFS/HFS+

Command (m for help): t
Partition number (1-3, default 3): 2
Partition type or alias (type L to list all): 53746F72-6167-11AA-AA11-00306543ECAC

Changed type of partition 'Apple boot' to 'Apple Core storage'.

Command (m for help): t
Partition number (1-3, default 3): 3
Partition type or alias (type L to list all): 426F6F74-0000-11AA-AA11-00306543ECAC

Changed type of partition 'Apple HFS/HFS+' to 'Apple boot'.

Command (m for help): p
Disk sdc.img: 1.82 TiB, 2000398934016 bytes, 3907029168 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: gpt
Disk identifier: X

Device        Start        End    Sectors  Size Type
sdc.img1         40     409639     409600  200M EFI System
sdc.img2     409640 3906766983 3906357344  1.8T Apple Core storage
sdc.img3 3906766984 3907029127     262144  128M Apple boot

Command (m for help): w
The partition table has been altered.
Syncing disks.
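The rewritten table can also be double-checked without fdisk by reading the GPT straight from the image. The sketch below assumes 512-byte sectors and the standard GPT layout from the UEFI specification (header at LBA 1, 128-byte entries holding a type GUID followed by first and last LBA); the function name is hypothetical and no checksums are verified.

```python
import struct
import uuid


def read_gpt_partitions(path, sector_size=512):
    """Return (type GUID, first LBA, last LBA) for each used GPT entry."""
    partitions = []
    with open(path, "rb") as f:
        f.seek(sector_size)                 # the primary GPT header is at LBA 1
        header = f.read(92)
        if header[:8] != b"EFI PART":
            raise ValueError("no GPT header found at LBA 1")
        # Offset 72: LBA of the entry array, entry count, size of one entry.
        entries_lba, count, entry_size = struct.unpack_from("<QII", header, 72)
        f.seek(entries_lba * sector_size)
        for _ in range(count):
            entry = f.read(entry_size)
            if len(entry) < 48:
                break                       # truncated image
            # GUIDs are stored mixed-endian on disk; bytes_le decodes that.
            type_guid = uuid.UUID(bytes_le=entry[:16])
            if type_guid.int == 0:
                continue                    # unused slot
            first_lba, last_lba = struct.unpack_from("<QQ", entry, 32)
            partitions.append((str(type_guid).upper(), first_lba, last_lba))
    return partitions
```

Running `read_gpt_partitions("sdc.img")` should list the same three start and end sectors shown by fdisk, with 53746F72-6167-11AA-AA11-00306543ECAC as the type of the second entry.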

After using fdisk to recreate the partition table on the disk image, I used the same hdiutil attach command as above to attach the disk image. macOS immediately detected the disk and prompted for the password. Once I entered the password, I could see everything inside. Success!

Revisiting testdisk

I was curious why testdisk did not identify the Apple Core storage partition, so I checked its source code. A quick search for the Core Storage signature yielded nothing. It seems that testdisk is not able to identify Core Storage partitions. I’m not sure why, because it seems easy; maybe relying on such a short, single signature would give too many false positives?

Lessons learned

  1. Most filesystems are robust in the sense that corrupted disk sectors only damage whatever is stored on them, and little else; i.e. an entire filesystem will not be brought down by a few corrupted sectors, even if it is an encrypted filesystem.
  2. Most filesystems mark their beginnings on disk with signatures or magic bytes. Deleting a partition from the partition table doesn’t change the partition data itself. Partition recovery can generally be achieved by finding those signatures, using their locations to determine partition boundaries, and recreating the lost partitions in the partition table.
  3. Well-known, easy-to-use tools can only automate so much, and they’re still dumb. Data recovery, forensics and reverse engineering require a high level of human ingenuity: calling on out-of-band information (such as knowing that the person is more likely to be using macOS encryption), recognizing patterns and weighing the nuances.



from Hacker News https://ift.tt/LSjrOs9
