The Yahallo exploit uses a security vulnerability in the NVIDIA Tegra 3 / Tegra 4 UEFI firmware to disable the SMMU memory protection. As a result you can access all memory.
Disclaimer: Most of this information only applies to the Surface RT, but some of the Yahallo information applies to the Surface RT 2 too.
Because NVIDIA/Microsoft didn't implement a check for the maximum buffer size when an SMC instruction is used to register a new shared buffer, it is possible to create a buffer overflow, which is then used to overwrite the area where a BootServices->SetMem UEFI call writes its memory.
First, you need to know which values from the SMC are used in the calculation. Parameters 1 and 2 don't matter here. Parameter 3 is request_area, which is 0x40000000 in the Yahallo source. Parameter 4 is area_slice, which is 0x6001e0e0 in the Yahallo source.
The calculation uses the following formula:
response_area = (request_area + (area_slice >> 1))
Examples:
0x7000_f070 = (0x4000_0000 + (0x6001_e0e0 >> 1))
0x7000_f010 = (0x4000_0000 + (0x6001_e020 >> 1))
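The two examples above can be checked with a few lines of C. This is only a sketch of the arithmetic, assuming 32-bit unsigned values throughout. The helper area_slice_for() is hypothetical (it is not part of the Yahallo source) and simply inverts the formula to find an area_slice for a chosen target address.

```c
#include <stdint.h>

/* Address the firmware ends up writing to, following the formula above.
 * uint32_t arithmetic wraps modulo 2^32. */
static uint32_t response_area(uint32_t request_area, uint32_t area_slice)
{
    return request_area + (area_slice >> 1);
}

/* Hypothetical helper: choose an area_slice so that response_area()
 * lands on `target`. Exact only while (target - request_area) is
 * below 0x80000000, because the shift would otherwise lose a bit. */
static uint32_t area_slice_for(uint32_t request_area, uint32_t target)
{
    return (target - request_area) << 1;
}
```

With request_area = 0x40000000, area_slice_for() returns 0x6001e0e0 for the target 0x7000f070 and 0x6001e020 for the target 0x7000f010, matching the examples.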
The SMMU (System Memory Management Unit) is part of the SoC, not of the Cortex-A9 cores. The CPU cores have an MMU too, which you can disable with an ArmDisableMmu() call from the edk2 source tree. Yahallo reconfigures the SMMU so that all SMMU memory protection is gone. The CPU MMU still protects this memory, but it is easy to disable.
To disable the SMMU, you need to write zeros to the SMMU Enable Register (0x7000f010). This can only be done from secure mode, so just disabling the CPU MMU is not enough. With the help of the Yahallo exploit we can write to all memory locations in secure mode, because the BootServices->SetMem call executes in secure mode. (UEFI firmware generally operates in secure mode.)
Yahallo writes 32 bytes of 0's to 0x7000f070, so it writes the Secure Region Configuration Base register, Secure Region Configuration Bound register and the Protected Region Configuration Bound register.
This makes the TrustZone 0 MB in size (on the SMMU side), so you have access to non-TrustZone-secured* memory as long as the CPU doesn't restrict memory access. The CPU restrictions can be removed by disabling the MMU.
* Some registers are only accessible through TrustZone-secured CPU memory accesses. For example, the TRM says about the SMMU_ENABLE register: "This register can only be accessed by Trustzone-Secured accesses from the CPU."
This is from the Yahallo source. It gives a brief overview of the parameters; here is more detail about them:
The Tegra 3 Technical Reference Manual (TRM) has important information about virtual addressing and configuration registers in section 18. You can download it from NVIDIA, but downloading requires an NVIDIA developer account.
// Register a new shared memory buffer at 0x4000_0000 (IRAM) with size
// 0x6001_e0e0. The following algorithm is used to determine the end address
// that the QueryVariable call will write over:
// response_area = (request_area + (area_slice >> 1));
Ret = ArmCallSmcHelper(0x03, 0x06, 0x40000000, 0x6001e0e0);
We tried to get rid of the TrustZone with the help of the Yahallo exploit.
The CPU of the Surface RT's Tegra 3 SoC uses Cortex-A9 cores. These cores use the Virtual Memory System Architecture (VMSA), which means we have a Memory Management Unit (MMU).
This manual describes the ARMv7 A & R architecture (ARMv7 A is important for us) in Application Level Architecture and in System Level Architecture. The System Level Architecture is the interesting part for this type of development. Part B, System Level Architecture, only has 874 pages to read 🥳.
Download the manual here:
https://developer.arm.com/documentation/ddi0406/latest
The MMU is used for virtual addressing too, not only for memory protection. Virtual addressing is necessary for most applications and operating systems. Linux itself works without it, but most programs don't. Because of that we need to enable the MMU for Linux. When the MMU is enabled, however, it blocks access to TrustZone memory, so the TrustZone memory needs to be unmapped by modifying the page tables of the Cortex-A9 cores.
The advantage of completely eliminating UEFI and TrustZone is that we don't need to modify the Linux source code; only a device tree is needed, which can be mainlined. Unless we get this approach to work, our Linux work will probably never be mainlined.
Run arm32 UEFI in a virtual machine.
Emulating an arm32 UEFI device is useful for developing and debugging Linux.
In the GDB Debugging page you can find instructions on how to compile Linux for this virtual machine.
A premade ZIP with all required files can be found at the bottom.
Run the following commands to install the required packages.
You will need other stuff too, but that is probably already installed. (e.g. git)
You need the source code of edk2 and acpica.
Go to your source directory and run the following commands.
Your output OVMF firmware file for qemu is $WORKSPACE/Build/ArmVirtQemu-ARM/RELEASE_GCC5/FV/QEMU_EFI.fd
Create a directory for your files. Put the QEMU_EFI.fd firmware file that you compiled in the previous section into this directory. Now run the following commands to create some disk images:
Now create a directory named boot. This will be your EFI partition. You can now easily place your EFI files in there.
To start your virtual machine, make sure qemu-system-arm is installed and run the following command.
This will run qemu with 4 virtual Cortex-A15 CPU cores. They are used simply because they work.
The following ZIP includes all files set up in their proper location. In addition, its EFI partition folder has a UEFI shell in it. To run it, either execute the run.sh file or enter the command described in Run qemu.
Links where the above compiling information is from:
Booting linux with UEFI boot is now possible.
PMIC regulators don't work.
Audio, Wireless, Cameras, Sensors. This is also true for APX boot.
Implement ACPI for arm32. This will help in development for all other devices that run Windows RT too. Why? It removes the need for a device tree and enables us to communicate with the firmware, which will hopefully enable us to use PMIC stuff. We can also upstream our ACPI Parking Protocol driver, which is needed for SMP.
Get Audio, Wireless, Cameras, Sensors.
Possibly more, but we don't know about it yet.
Replace the Secure Monitor to execute user-defined code in Secure mode by using a Secure Monitor Call (SMC)
Using the Yahallo exploit we can get access to Trustzone memory. In the Trustzone you can find all of the firmware's functionality, where some code executes in Secure mode. On top of that, some of the memory is marked as secure, so you can execute it from Secure mode.
The goal is to get rid of Trustzone memory, so the page tables of the Trustzone memory need to be modified. To do this you need to be in the Secure mode.
But how do you get into Secure mode?
UEFI firmware provides runtime services to the running operating system. The communication happens with ACPI or other protocols, but also with interrupts, specifically the Secure Monitor Call (SMC). A user can trigger an SMC by executing the smc instruction. See the SMC Calling Convention for more information on how such a call happens.
When executing an SMC, the processor receives the interrupt and looks in the Monitor Vector Table at the Monitor Vector Base (specified in the Monitor Vector Base Address Register (MVBAR)) for the address of the Secure Monitor and jumps to it. The Secure Monitor is then responsible for reading the variables from the SMC and running the desired function in Secure mode*. Essentially, SMCs are used for communicating with the Secure world and the TrustZone kernel.
When you further think about it, you may notice that we could simply replace the Secure Monitor and put our own code there, by using the Yahallo exploit.
* The Secure Monitor executes in Monitor mode. Monitor mode ignores the SCR.NS bit and always executes in secure mode. You can use this to modify configurations which are specific to each execution state.
To replace the Secure Monitor, you first need to locate it. The Monitor Vector Base Address Register points to the Monitor Vector Table, where the third table entry points to the Secure Monitor.
So just read the MVBAR, right?
So they are only accessible from Secure state. A normal UEFI App executes at PL1 in non-Secure mode.
So reading the register at runtime won't work. But there are other options to get the address.
You can get your UEFI firmware binary from C:\Windows\Firmware\SurfaceRTUEFI.bin.
For reverse engineering we have used Ghidra. When importing, make sure to use ARM:LE:32:V7:default as the language.
mcr p15,0x0,r0,cr12,cr0,0x1 is the instruction which sets MVBAR. Search for it. You should only find one result. The disassembly looks something like this:
When further analyzing this, you will quickly notice that you will not be able to find out MVBAR by just analyzing the UEFI binary. Maybe it's possible, but only with heavy reverse-engineering.
As you might have thought, there is an easier way to get the value of MVBAR.
To analyze a memory dump you need to create one first. Our GitHub repository also builds a memory dump tool. Modify MemoryDumpTool/App.c if you need to change the memory address.
Import the dump into Ghidra. Use the settings from above, but make sure to go to the options and set the Base Address to 80000000. This is where the memory dump started from. After this change, continue the import as usual.
But what now? You have Megabytes of memory, and you need to pick an exact point out of it.
Of course we read the handful of existing and useful blog posts that are on the internet. Including this one: https://fredericb.info/2014/12/analysis-of-nexus-5-monitor-mode.html
It shows what a Monitor Vector Table looks like when disassembled:
So only search for some branch instructions, right?
Not really. There are a lot of jump tables in the memory dump. After hours upon hours of staring at Ghidra and the internet, I managed to find the right one.
I searched for multiple branch instructions next to each other using this command xxd "trustzone.bin" | grep -E '(.{3}00 00ea){4}'
(CTS helped me with that command)
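The same search can be sketched in C for cases where xxd and grep are not at hand. This is an illustration, not the tool we used: the function names are made up for this example, and the match is looser than the grep pattern above, since it accepts any unconditional ARM branch (top byte 0xEA) regardless of its offset.

```c
#include <stddef.h>
#include <stdint.h>

/* Read a little-endian 32-bit word from a byte buffer. */
static uint32_t read_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* True for an unconditional ARM "B <label>" (condition AL, bits 27..24 = 1010). */
static int is_branch_always(uint32_t insn)
{
    return (insn & 0xFF000000u) == 0xEA000000u;
}

/* Return the byte offset of the first run of at least `run` consecutive
 * branch instructions at or after `start` (rounded down to a word
 * boundary), or (size_t)-1 if there is none. */
static size_t find_branch_run(const uint8_t *dump, size_t len,
                              size_t start, size_t run)
{
    size_t count = 0;
    for (size_t off = start & ~(size_t)3; off + 4 <= len; off += 4) {
        if (is_branch_always(read_le32(dump + off))) {
            if (++count == run)
                return off - 4 * (run - 1);
        } else {
            count = 0;
        }
    }
    return (size_t)-1;
}
```

Calling find_branch_run() repeatedly, each time starting just past the previous hit, lists the jump table candidates. You still have to go through them by hand.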
I just went through the results and found a jump table at 0x811f8000.
At that point I already had an EFI app which loads a payload into memory, copies it to a desired location in the Trustzone and then fills some memory with an instruction to load the payload location into a register and an instruction to branch to this register. The payload was capable of printing something to UART. I have also tried other jump tables at that time, with no success.
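As an illustration of the "load the payload location into a register and branch to it" step, here is one possible way to encode such a trampoline as three 32-bit ARM words. This is an assumption about how it could be done, not necessarily the exact instruction pair our EFI app writes. The function name and the layout (address literal placed directly after the two instructions) are chosen for this sketch.

```c
#include <stdint.h>

/* ldr r0, [pc]: in ARM state PC reads as the instruction address + 8,
 * so this loads the word stored 8 bytes after the ldr itself. */
#define ARM_LDR_R0_PC 0xE59F0000u
/* bx r0: branch to the address held in r0. */
#define ARM_BX_R0     0xE12FFF10u

/* Fill three consecutive words: ldr, bx, then the address literal. */
static void write_trampoline(uint32_t out[3], uint32_t payload_addr)
{
    out[0] = ARM_LDR_R0_PC;
    out[1] = ARM_BX_R0;
    out[2] = payload_addr;
}
```

The three words would then be copied to the location that should jump to the payload.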
But this time it was different: The payload started executing.
So it was clear: MVBAR is 0x811f8000
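When working with the dump it helps to convert between physical addresses and file offsets, and to resolve where a table entry branches to. The sketch below assumes the dump starts at 0x80000000 (the base address used for the Ghidra import) and that the table entry is a plain ARM B instruction, as in the disassembly from the blog post. The helper names are made up for this example.

```c
#include <stdint.h>

#define DUMP_BASE 0x80000000u /* physical address of the first dumped byte */

/* File offset inside the dump for a physical address. */
static uint32_t dump_offset(uint32_t phys_addr)
{
    return phys_addr - DUMP_BASE;
}

/* Address of the SMC entry: the third entry of the Monitor Vector Table. */
static uint32_t smc_entry_addr(uint32_t mvbar)
{
    return mvbar + 0x08u;
}

/* Target of an ARM B/BL instruction located at insn_addr:
 * target = insn_addr + 8 + sign_extend(imm24 << 2). */
static uint32_t branch_target(uint32_t insn_addr, uint32_t insn)
{
    uint32_t imm24 = insn & 0x00FFFFFFu;
    uint32_t offset = imm24 << 2;
    if (imm24 & 0x00800000u)
        offset |= 0xFC000000u; /* sign-extend the 26-bit offset */
    return insn_addr + 8u + offset;
}
```

For MVBAR = 0x811f8000, the table starts at file offset 0x011f8000 and the SMC entry sits at 0x811f8008. Passing the word found there to branch_target() gives the address of the Secure Monitor code, provided the entry really is a B instruction.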
Right now the payload is placed at 0x8011219c. When analyzing a memory dump you will notice that this is right after an instruction that sets SCR.NS. In theory you don't need to place the payload exactly there. You could also place it before SCR.NS is set, because you will always execute in secure mode, as described above. In theory you could also replace just a single Secure Monitor "function" (when you pass parameters to an SMC, you specify what the Secure Monitor should do).
TODO: Add picture of Ghidra with the memory location
You can find a fully working payload example on GitHub. It includes the Makefile, a linker script, an assembly file and a C source file.
Our payload is position independent, which means it can execute from anywhere in memory. It's really easy to make GCC generate position-independent code: just add -fpic to the CFlags. (Yes, the payload is written in C; other languages such as C++ or Rust may be usable too.)
When configuring a large amount of registers, using a separate .S/.asm file is more convenient than using inline assembly.
But there is something worth knowing: you need to make sure that your assembly is relocatable, otherwise the code will break. In C the compiler makes the code relocatable, but the assembler can't make the assembly relocatable for you.
Here is a short example of an assembly file which loads the address of mybuffer into r0:
What to explain:
How the payload works and how it is executed and why exactly the strange memory location
verifying that payload executes in Secure mode
explain monitor mode a bit more (maybe)
what to do now
maybe how to use ghidra
Structure of Vector Tables (Including Monitor Vector Table)
ARM Processor Modes and Registers (Figure 3.3)
This is a place where we put configurations we tried that didn't work, or that worked only up to a certain point.
Note to members of the gitbook: I don't know if it is useful to make this person-specific.
This hangs at without supplying device tree with dtb=
If device tree is supplied, cpuidle complains at printed.
[ 0.000000] platform regulatory.0: Direct firmware load for regulatory.db failed with error -2
[ 0.000000] cfg80211: failed to load regulatory.db
Testing without device tree from here on
Changed: CONFIG_CFG80211=n
[ 0.000000] platform regulatory.0: Direct firmware load for regulatory.db failed with error -2
[ 0.000000] cfg80211: failed to load regulatory.db
This issue doesn't exist anymore; it just freezes at [ 0.000000] Freeing unused kernel memory: 1024K
Note this line; the clock should be 100 kHz:
[ 0.000000] sched_clock: 32 bits at 100 Hz, resolution 10000000ns, wraps every 21474836475000000ns
[ 0.000000] Serial: 8250/16550 driver, 4 ports, IRQ sharing disabled
[ 0.000000] Warning: unable to open an initial console.
It looks like it's using some serial and then is unable to open a console. -> No device tree so it doesn't know /dev/ttyS0 (UART-A)
CONFIG_FB_EFI=y
CONFIG_CMDLINE="console=ttyS0,115200 console=tty0 earlyprintk initcall_debug sched_debug lpj=10000"
It booted, then disabled uart and printed to the screen. Log only contains print from uart. Stuck at the same line.
CONFIG_CMDLINE="console=ttyS0,115200 console=tty0 earlycon earlyprintk initcall_debug sched_debug lpj=10000"
Log should be the same apart from the cmdline. Again, it printed to the display, so the log doesn't include that output.
CONFIG_CMDLINE="console=ttyS0,115200 earlycon earlyprintk initcall_debug sched_debug lpj=10000"
Screen with a cursor, all output on uart. But because there is no device tree, no console from initrd.
Testing with device tree from here on
Nothing changed in config. Only added device tree.
Kernel panic: [ 0.000000] Unable to handle kernel NULL pointer dereference at virtual address 00000010
Has to do with cpuidle. You can also see complaints about a bad device tree.
Only device tree edits.
Screen has no cursor anymore, only the backlight is turned on. You can't print anything on it with echo hello >> /dev/fb0 or echo hello >> /dev/tty0
Successful boot to Buildroot.
Note: Many log entries appear twice.
CONFIG_VGA_ARB=n
CONFIG_TEGRA_HOST1X=n
CONFIG_DRM=n
No different screen behaviour.
Log has some commands at the bottom.
No config changed. Only device tree edits. Added nodes to it until dtc didn't complain about anything.
Some strange screen behaviour.
[ 0.000000] tegra20-cpufreq tegra20-cpufreq: operating points not found
[ 0.000000] tegra20-cpufreq tegra20-cpufreq: please update your device tree
Looks like some cpu frequency nodes have to be added to device tree.
Only device tree edits.
Screen still doesn't work.
Boots fine to Buildroot. Log has commands at the bottom.
CONFIG_CMDLINE="console=ttyS0,115200 console=tty0 earlyprintk initcall_debug sched_debug lpj=10000"
Found out that screen works, but it gets cleared when Busybox (from ramdisk) starts. At least I think at that point it gets cleared, not 100% sure.
CONFIG_CMDLINE="console=ttyS0,115200 earlyprintk initcall_debug sched_debug lpj=10000"
Minimal device tree with sd-card+emmc+uart, it includes tegra30.dtsi
No display output with this command line. You get log output on the display when adding console=tty0, but it stops around the time Busybox takes over.
Added a raspberry pi rootfs. Specifying root= option with efi shell
Display stops working after the line [ 251.431349] tegra-devfreq 6000c800.actmon: Failed to get emc clock
Log contains output from rootfs.
Config reset. Started from a new one.
CONFIG_EFI_STUB=y
CONFIG_EFI=y
CONFIG_CMDLINE="console=ttyS0,115200n8 earlyprintk initcall_debug sched_debug lpj=10000 boot_delay=50"
CONFIG_CMDLINE_EXTEND=y
CONFIG_SMP=n
CONFIG_CACHE_L2X0=n
CONFIG_EARLY_PRINTK=y
CONFIG_DEBUG_TEGRA_UART=y
CONFIG_TEGRA_DEBUG_UARTA=y
CONFIG_BOOT_PRINTK_DELAY=y
CONFIG_BOOT_PRINTK_DELAY=y and boot_delay=50 (cmdline) are necessary so that the mmc driver gets loaded early enough.
Log contains a few commands from rootfs.
I'm not sure what changed.
Working display with booted raspberry pi os!
Debug Linux kernel with GDB
Clone a Linux tree and run make ARCH=arm defconfig to make a generic kernel configuration suited for qemu. Now edit the kernel configuration (.config) and add the following lines at the bottom:
Now run make ARCH=arm CROSS_COMPILE=arm-linux-gnueabihf- -j$(nproc) to compile the kernel. If you get asked about anything, just press enter to use the default value.
Copy the output zImage (arch/arm/boot/zImage) to efi/boot/bootarm.efi in the EFI partition folder in your qemu directory.
Run sudo apt-get install gdb-multiarch to install GDB on Ubuntu. gdb-multiarch is required because the normal gdb package doesn't have support for ARM.
Open the terminal you want GDB to run in and change to your Linux compilation directory. Then run gdb-multiarch vmlinux. This opens GDB, and you can now connect to a target with target remote localhost:1234. At this point GDB will wait for qemu to start. After that you can debug with qemu; there are tutorials online that show you how to do this.
SurfaceRT/2 uses UEFI Secure Boot.
We have a test key that can be used to sign our EFI binaries so that they are trusted by the Windows boot manager when Secure Boot is enabled.
5D7630097BE5BDB731FC40CD4998B69914D82EAD CN=Windows OEM Test Cert 2017 (TEST ONLY), O=Microsoft Partner, OU=Windows, L=Redmond, S=Washington, C=US
You can use signtool on Windows to sign our EFI builds, e.g.:
signtool.exe sign /tr http://timestamp.digicert.com /td sha1 /fd sha1 /sm /sha1 5d7630097be5bdb731fc40cd4998b69914d82ead *.efi
Debug Linux kernel within Visual Studio Code
The following steps have to be performed in your Linux source directory.
Create a file called tasks.json in the directory .vscode and paste the following contents into it:
You may want to change lines 26 and 34, as they point to the directory where your qemu files are located.
Create a file called c_cpp_properties.json in the directory .vscode and paste the following contents into it:
Press F5 to start debugging. The following steps will be performed:
Compile Kernel
Copy zImage to qemu EFI partition
Launch qemu
Start GDB debugging
The following keys are important:
F9 for creating a breakpoint
F10 for going a step forward
F11 for stepping into a function
F12 to step out of a function
Go to the directory where your qemu files are located and start qemu as described in Run qemu. The only change is that you need to add the -s parameter, which tells qemu to start a GDB server.
Windows bootmanager exploit is deprecated, as you can now disable secure boot. Without secure boot you don't need to sign your EFI binaries. Visit for more information.
See the Visual Studio Code documentation for the tasks.json format.
Create a file called launch.json in the directory .vscode and paste the following contents into it:
Use IntelliSense to learn about possible attributes. Hover to view descriptions of existing attributes. For more information, visit: