Tuesday, July 27, 2021

Analysis of large binaries and games in Ghidra-SRE

Ghidra is a free and open source reverse engineering suite. It is flexible, with scripting and plugin support, and can be used for almost any architecture. If an architecture is not supported you can add it yourself, and if there’s a bug you can fix it yourself, which is the beauty of it compared to competitors such as IDA Pro. (Did I mention it was free as well???) I have been checking out Ghidra since the initial public release, and for “real work” situations it was less than ideal. I kept watching progress on the bug tracker, noticed that 10.x had gotten many improvements and bug fixes over the 9.x releases, and thought I would give it a second try.

The initial results were not spectacular. I ran into the same Swing timeout errors (which can be solved by looking here), and analysis of a 300MB executable with symbols ran for 24+ hours just to finally crash about a day and a half in, leaving me back at square one. This is the experience of many people I have talked to who are into games reverse engineering, where binary sizes can balloon very fast, even without symbols. I will go over the steps for large PC executables without symbols, but the process should be about the same for other platforms (PSX, PS2, PS3, PS4, PS5) provided you have the loaders/scripts and follow the instructions for them.

Prerequisites

Before beginning we will need a few things. First, you will need to have Ghidra downloaded and extracted; at the time of writing this is version 10.0.1. Since most games are written in C++ and may have some form of RTTI (Run-Time Type Information), having a plugin to handle this is ESSENTIAL. Luckily there is one, made by astrelsky, one of the most amazing developers contributing to Ghidra that I have met! His work in this area is some of the greatest I’ve seen come out of the OSS community. We will be using his C++ Class Analyzer plugin, which handles a plethora of C++ specific things.

You can download the C++ Class Analyzer from the releases section on GitHub. At the time of writing 10.0.1 is not “supported”, but it only takes a very minor tweak to get it to work, and it will probably be updated in the future. Open the downloaded zip file (at time of writing, ghidra_10.0_PUBLIC_20210623_Ghidra-Cpp-Class-Analyzer.zip) in your favorite archival tool (I use 7-Zip), find the file extension.properties in the folder Ghidra-Cpp-Class-Analyzer, and double click it to edit it.

extension.properties file in the archive

You will need to change the version line from the default version=10.0 to version=10.0.1 and save the file. 7-Zip will usually ask if you want to save the change to the archive; select yes.
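If you would rather not edit the archive by hand, the same change can be scripted. The following is only a sketch: the function name patch_extension_version is made up for this post, and it assumes the archive contains an extension.properties file with a single version= line, as described above. It rewrites that line and replaces the zip in place.

```python
import os
import tempfile
import zipfile
from pathlib import Path


def patch_extension_version(zip_path, new_version):
    """Rewrite the version= line of extension.properties inside an extension zip.

    Hypothetical helper (not part of Ghidra or the plugin). Returns the zip path.
    """
    zip_path = Path(zip_path)
    patched = False

    # Build the new archive next to the old one so the original survives a failure.
    fd, tmp_name = tempfile.mkstemp(suffix=".zip", dir=zip_path.parent)
    os.close(fd)

    try:
        with zipfile.ZipFile(zip_path) as src, zipfile.ZipFile(tmp_name, "w") as dst:
            for item in src.infolist():
                data = src.read(item.filename)
                if item.filename.rsplit("/", 1)[-1] == "extension.properties":
                    # latin-1 round-trips every byte unchanged, so nothing else is altered.
                    text = data.decode("latin-1")
                    out = []
                    for line in text.splitlines(keepends=True):
                        if line.startswith("version="):
                            # Keep the original line ending (\n or \r\n).
                            ending = line[len(line.rstrip("\r\n")):]
                            line = f"version={new_version}{ending}"
                            patched = True
                        out.append(line)
                    data = "".join(out).encode("latin-1")
                # Passing the ZipInfo keeps each entry's name and compression type.
                dst.writestr(item, data)

        if not patched:
            raise ValueError("no version= line found in extension.properties")
        os.replace(tmp_name, zip_path)
    finally:
        if os.path.exists(tmp_name):
            os.remove(tmp_name)
    return zip_path
```

For the release mentioned above, the call would be patch_extension_version("ghidra_10.0_PUBLIC_20210623_Ghidra-Cpp-Class-Analyzer.zip", "10.0.1"). Keep a copy of the original download around in case something goes wrong.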

Updated version information

Launch Ghidra and click File -> Install Extensions, then click the + and select the Ghidra-Cpp-Class-Analyzer zip file.

It should be added to the list. Check the checkbox to enable it and restart Ghidra.

Once Ghidra has restarted, create a new project, give it a name (I use GhidraTutorial for this), and drag and drop in your binaries. If you are loading PS1/2/3/4 binaries, ensure that the right architecture is selected. For PC binaries it usually detects correctly on the first try. In this case I am using the GOG release of Horizon Zero Dawn on PC. Allow the import to complete, but do not launch the CodeBrowser in Ghidra quite yet.

Import prompt for Horizon Zero Dawn

You will need to close Ghidra at this point because the CodeBrowser GUI causes nothing but issues while analyzing. This is where the normal usage of Ghidra comes to a pause. We will be using the Ghidra Headless Analyzer, which is a CLI tool for analyzing or importing files into a Ghidra database. Open up a new terminal window (I use PowerShell on Windows, Bash on Ubuntu/Linux) and navigate to the Ghidra support directory. In my case this is PS C:\Users\godiwik\Documents_tools\Ghidra\ghidra_10.0.1_PUBLIC\support .

Since we have already imported the executable with the correct architecture, we just need to analyze it using the -process flag. The command we will use is .\analyzeHeadless <directory to the Ghidra database> <DatabaseName> -process "*" -recursive which should handle analyzing anything we have already imported. If this does not work for you, try making the wildcard more specific, such as Horizon*, or if you have the binary in a folder, supply the folder name with a wildcard after it.

The example that I am using is .\analyzeHeadless ../../ GhidraTutorial -process "/*" -recursive since my Ghidra database (GhidraTutorial.gpr/GhidraTutorial.rep) is one directory above the Ghidra root directory. If no errors appear, go take a bathroom break, get some coffee, play with your favorite pet, feed your kids, whatever you have to do, because this will take a while…
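Since this step is just one command line, it is easy to wrap in a script if you want to kick it off unattended. Below is a rough sketch, not an official Ghidra API: the function names, the windows parameter and the headless.log file name are all my own inventions, and the only flags used are the -process and -recursive ones from the command above.

```python
import os
import subprocess
from pathlib import Path


def build_headless_command(support_dir, project_dir, project_name,
                           pattern="*", windows=None):
    """Build the analyzeHeadless argument list for an already imported project.

    Hypothetical helper; it only reproduces the command shown in this post.
    """
    if windows is None:
        windows = os.name == "nt"
    # Ghidra ships analyzeHeadless.bat for Windows and analyzeHeadless for Linux/macOS.
    script = "analyzeHeadless.bat" if windows else "analyzeHeadless"
    return [
        str(Path(support_dir) / script),
        str(project_dir),      # directory containing the .gpr/.rep
        project_name,          # e.g. GhidraTutorial
        "-process", pattern,   # which imported files to analyze
        "-recursive",
    ]


def run_headless(support_dir, project_dir, project_name,
                 pattern="*", log_path="headless.log"):
    """Run the analysis, sending all output to a log file. Returns the exit code."""
    cmd = build_headless_command(support_dir, project_dir, project_name, pattern)
    with open(log_path, "w") as log:
        result = subprocess.run(cmd, stdout=log, stderr=subprocess.STDOUT)
    return result.returncode


# Example, run from the Ghidra support directory (mirrors the command above):
#   run_headless(".", "../../", "GhidraTutorial", pattern="/*")
```

Passing the arguments as a list avoids shell quoting problems with the wildcard, and writing to a log file means you can check on progress later without keeping the terminal in view.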

In process of analyzing… Waiting for completion

After everything has completed, you can re-open Ghidra and double click on the executable to open up the CodeBrowser.

Headless Analyzer Completed in ~30m

At this point everything should be analyzed, and the C++ Class Analyzer should also have been run during the headless analysis. If you aren’t seeing good results, click Analysis->All Open->Deselect All->Windows (or GCC) C++ Class Analyzer (prototype), increase the Decompiler timeout from 30 seconds to 120 or 240, click Locate Constructors, and then click Analyze. Wait for that to complete and everything should be ready for you to explore!

Additional C++ Class Analyzer options

All Done!

I hope that this was able to help someone. I know I was frustrated with attempting to use the GUI for large binaries. This workaround is pretty awesome, and you could even script it for automatic analysis.

I’d like to thank OSM, Seremo, Shiro, krystalgamer_, and Z80, in no particular order, for taking interest and/or helping proofread.


