It turns out that aarch64 has exactly such support. Here's that support heading into the Linux kernel:
The original idea was to defeat ROP by randomizing all of the instructions a bit on a per-install basis. You know, the usual tricks, such as applying equivalence transforms to the opcode stream. Such an approach would have some obvious downsides, such as reduced diagnosability, and let's face it, implementing it would also feel a bit hacky. Can we do better?
Maybe we can. The original idea assumed the attacker knows where the binaries are in the virtual address space, but cannot read or otherwise predict their content. What if we instead keep the binary content stable but try to make sure the attacker cannot discern the location of the binaries? With enough ASLR entropy, this would be an interesting approach.
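To put a rough number on "enough ASLR entropy", here is a back-of-the-envelope sketch in Python. It assumes all bases are equally likely, that the attacker can tell when a guess is wrong (say, a crash), and that the layout is not re-randomized between guesses. The function names and the bit counts are made up for illustration and are not measurements of any real system.

```python
from fractions import Fraction


def success_probability(entropy_bits: int, guesses: int) -> Fraction:
    """Chance of hitting the right base within `guesses` distinct tries.

    Assumes 2**entropy_bits equally likely bases and no re-randomization,
    so every wrong guess eliminates exactly one candidate.
    """
    candidates = 2 ** entropy_bits
    return Fraction(min(guesses, candidates), candidates)


def expected_guesses(entropy_bits: int) -> Fraction:
    """Average number of distinct tries needed to find the base."""
    candidates = 2 ** entropy_bits
    # The correct base is equally likely to be the 1st, 2nd, ..., Nth try.
    return Fraction(candidates + 1, 2)


if __name__ == "__main__":
    for bits in (8, 16, 28):  # illustrative values only
        print(bits,
              float(success_probability(bits, 1000)),
              float(expected_guesses(bits)))
```

If the process is re-randomized after every failed guess, the picture changes (each try then succeeds independently with probability 1/2^n), so treat this as a lower bound on the attacker's effort rather than a full model.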
For the sake of the exercise, imagine the attacker has the most powerful of bugs: an arbitrary read/write primitive relative to an existing heap location. The attacker can follow heap pointers to the stack, the BSS, vtables, etc. At first, this sounds prohibitively hard to deal with. But for every way the attacker might try to leak the address of the binary, there currently seems to be a solution:
- The heap is riddled with vtable pointers. If the attacker follows a vtable pointer, they get to read function pointers and the location of the binary is revealed. We fix this in one of two ways. Either we get sneaky and turn vtables into code (jmp 0xblah) instead of data, reusing our exec-without-read primitive, or we burn a register (aarch64 has lots) as storage for a secret ASLR base for the binary.
- The heap is riddled with raw function pointers. We can redo function pointers as something like single-slot vtables and use the above trick. We don't want to store function pointers in writable memory as offsets relative to our secret register, because the attacker could then easily jump to an arbitrary point in the binary.
- The BSS and data sections are typically stacked adjacent to the binary. We need to stop doing this, so that pointers into the BSS and data sections do not reveal the location of the binary.
- The stack contains saved return addresses. These return addresses reveal the address of the binary. And for sure, the heap will contain pointers to the stack from time to time. Separating your stack into control flow and data will sort this out -- perhaps burning another register to keep the control flow stack separate and at a secret location.
- JIT engines are a pain. And your heap is going to contain chains of pointers leading to the JIT pages. Depending on the type of JIT engine, there are various tricks that can be pulled. Enumerating them here would make the post too long. Some of the more amusing tricks include having the kernel ban syscalls from a writable page.
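The function-pointer bullet above is easier to see with a toy model. The Python sketch below contrasts the two ways of storing a call target in writable memory: as an offset from the secret base, or as a reference to one of a fixed set of stubs. Everything in it is hypothetical: the function names, the offsets, the binary size, and in particular the bounds check, which stands in for whatever mechanism would confine calls to stub boundaries in a real design.

```python
# Hypothetical layout: offsets of the legitimate function entry points.
FUNCTION_ENTRIES = {0x000: "parse", 0x140: "render"}
BINARY_SIZE = 0x400

# One stub per legitimate entry point, modelling the "vtables as code" idea.
STUB_TABLE = ("parse", "render")


def call_by_offset(offset: int) -> str:
    """Rejected scheme: writable memory holds an offset from the secret base.

    Returns a label for where control lands. Any offset inside the binary
    is accepted, so a corrupted value can land mid-function.
    """
    if not 0 <= offset < BINARY_SIZE:
        raise ValueError("outside the binary")
    return FUNCTION_ENTRIES.get(offset, f"gadget@{offset:#x}")


def call_by_slot(slot: int) -> str:
    """Stub scheme: writable memory only ever names a stub.

    A corrupted value can select a different stub, but never an
    arbitrary point inside the binary.
    """
    if not 0 <= slot < len(STUB_TABLE):
        raise ValueError("not a valid stub")
    return STUB_TABLE[slot]


if __name__ == "__main__":
    print(call_by_offset(0x140))  # legitimate entry point
    print(call_by_offset(0x17))   # attacker-chosen offset, lands mid-function
    print(call_by_slot(1))        # legitimate stub
```

In both schemes the attacker who overwrites the stored value learns nothing about the base, but only the offset scheme hands them a jump to any instruction in the binary.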
[Thanks to Lee Campbell for helping with discussions and this blog post]