-
Notifications
You must be signed in to change notification settings - Fork 612
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
x86: shadow stack support #2306
Conversation
Signed-off-by: Mike Rapoport (IBM) <[email protected]>
All architectures create on-stack structure for floating point save area in compel_get_task_regs() if the caller passes NULL rather than a valid pointer. The only place that calls compel_get_task_regs() with NULL for floating point save area is parasite_start_daemon() and it is simpler to define this strucuture on stack of parasite_start_daemon(). The availability of floating point save data is required in parasite_start_daemon() to detect shadow stack presence early during parasite infection and will be used in later patches. Signed-off-by: Mike Rapoport (IBM) <[email protected]>
Signed-off-by: Mike Rapoport (IBM) <[email protected]>
To support sigreturn with CET enabled parasite must rewind its stack before calling sigreturn so that shadow stack will be compatible with actual calling sequence. In addition, calling sigreturn from top level routine (__export_parasite_head_start) will significantly simplify the shadow stack manipulations required to execute sigreturn. For x86 make fini_sigreturn() return the stack pointer for the signal frame that will be used by sigreturn and propagate that return value up to __export_parasite_head_start. In non-daemon mode parasite_trap_cmd() returns non-positive value which allows to distinguish daemon and non-daemon mode and properly stop at int3 in non-daemon mode. Architectures other than x86 remain unchanged and will still call sigreturn from fini_sigreturn(). Signed-off-by: Mike Rapoport (IBM) <[email protected]>
When calling sigreturn with CET enabled, the kernel verifies that the shadow stack has proper address of sa_restorer and a "restore token". Normally, they pushed to the shadow stack when signal processing is started. Since compel calls sigreturn directly, the shadow stack should be updated to match the kernel expectations for sigreturn invocation. Add parasite_setup_shstk() that sets up the shadow stack with the address of __export_parasite_head_start as sa_restorer and with the required restore token. Signed-off-by: Mike Rapoport (IBM) <[email protected]>
The shadow stack VMAs require special care because they can only be created and populated using special system calls. Add VMA_AREA_SHSTK flag and set it for VMAs that are marked as "ss" in /proc/pid/smaps Signed-off-by: Mike Rapoport (IBM) <[email protected]>
Shadow stack VMAs cannot be mmap()ed, they must be created using map_shadow_stack() system call and populated using special wrss instruction available only when shadow stack is enabled. Premap them to reserve virtual address space and populate it to have there contents available for later copying after enabling shadow stack. Along with the space required by shadow stack VMAs also reserve an extra page that will be later used as a temporary shadow stack. Signed-off-by: Mike Rapoport (IBM) <[email protected]>
Shadow stacks must be populated using special WRSS instruction. This instruction is only available when shadow stack is enabled, calling it with disabled shadow stack causes #UD. Moreover, shadow stack VMAs cannot be mremap()ed and they must be created using map_shadow_stack() system call. This requires delaying the restore of shadow stacks to restorer blob after the CRIU mappings are cleared. Introduce rst_shstk_info structure to hold shadow stack parameters required in the restorer blob and populate this structure in arch_prepare_shstk() method. Signed-off-by: Mike Rapoport (IBM) <[email protected]>
Detect if CRIU runs with shadow stack enabled and store the result in kerndat. Unlike most kerndat knobs, kdat_has_shstk() does not check for availability of the shadow stack in the kernel, but rather checks if criu runs with shadow stack enabled. This depends on hardware availabilty, kernel and glibc support, compiler options and glibc tunables, so kdat_has_shstk() must be called every time CRIU starts and its result cannot be cached. The result will be used by the code that controls shadow stack enablement in the next commit. Signed-off-by: Mike Rapoport (IBM) <[email protected]>
There are several gotachs when restoring a task with shadow stack: * depending on the compiler options, glibc version and glibc tunables CRIU can run with or without shadow stack. * shadow stack VMAs are special, they must be created using a dedicated map_shadow_stack() system call and can be modified only by a special instruction (wrss) that is only available when shadow stack is enabled. * once shadow stack is enabled, it is not writable even with wrss; writes to shadow stack can be only enabled with ptrace() and only when shadow stack is enabled in the tracee. * if the shadow stack is enabled during restore rather than by glibc, calling retq after arch_prctl() that enables the shadow stack causes #CP, so the function that enables shadow stack can never return. Add the infrastructure required to cope with all of those: * modify the restore code to allow trampoline (arch_shstk_trampoline) that will enable shadow stack and call restore_task_with_children(). * add call to arch_shstk_unlock() right after the tasks are clone()ed; this will allow unlocking shadow stack features and making shadow stack writable. * add stubs for architectures that do not support shadow stacks * add implementation of arch_shstk_trampoline() and arch_shstk_unlock() for x86, but keep it disabled; it will be enabled along with addtion of the code that will restore shadow stack in the restorer blob Signed-off-by: Mike Rapoport (IBM) <[email protected]>
The restore of a task with shadow stack enabled adds these steps: * switch from the default shadow stack to a temporary shadow stack allocated in the premmaped area * unmap CRIU mappings; nothing changed here, but it's important that CRIU mappings can be removed only after switching to a temporary shadow stack * create shadow stack VMA with map_shadow_stack() * restore shadow stack contents with wrss * switch to "real" shadow stack * lock shadow stack features Signed-off-by: Mike Rapoport (IBM) <[email protected]>
Codecov ReportAttention:
Additional details and impacted files@@ Coverage Diff @@
## criu-dev #2306 +/- ##
============================================
- Coverage 70.62% 70.17% -0.45%
============================================
Files 134 135 +1
Lines 33316 34153 +837
============================================
+ Hits 23528 23968 +440
- Misses 9788 10185 +397 ☔ View full report in Codecov by Sentry. |
A friendly reminder that this PR had no activity for 30 days. |
Sorry for the delay. It is still in my todo list. @0x7f454c46 @mihalicyn, you help will be welcome too;) |
Merged. Thanks a lot. |
Shadow stack support for userspace finally made it to the kernel and varying level of success to glibc.
This PR enables shadow stack support in CRIU.
Aside from saving/restoring the actual shadow stack contents and control, there are some changes to the way CRIU calls rt_sigreturn and a bit of black magic around restoring of the shadow stack contents.
As it's still unclear what will be glibc policy about making shadow stack on or off by default, this patchset takes care of both cases and lets CRIU fully control shadow stack for the restored tasks.
Testing: