Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Preallocate space for Win64 shadow args #17095

Merged
merged 11 commits into from
Dec 12, 2024
Merged

Conversation

dstogov
Copy link
Member

@dstogov dstogov commented Dec 9, 2024

Otherwise "fixed_call_stack" doesn't make sense

Otherwise "fixed_call_stack" doesn't make sense
ext/opcache/jit/zend_jit_ir.c Outdated Show resolved Hide resolved
@dstogov
Copy link
Member Author

dstogov commented Dec 9, 2024

@cmb69 I think this PR is a step into the right direction. It leads to few new tests failures. One of them is reproducible with the following command:
x64\Release_TS\php.exe -d opcache.jit_debug=0x1ff405 C:\WORK\php-master\tests\basic\timeout_variation_0.phpt.

I tried to attach the necessary hard-coded SEH info with the following patch (on top of this PR), and this seems fixed the crash (debug backtrace at _zend_bailut() also became much better). Can you please take a look.

diff --git a/ext/opcache/jit/ir/ir_x86.dasc b/ext/opcache/jit/ir/ir_x86.dasc
index 284e1480d38..9bb56476c47 100644
--- a/ext/opcache/jit/ir/ir_x86.dasc
+++ b/ext/opcache/jit/ir/ir_x86.dasc
@@ -10806,24 +10806,48 @@ next_block:;
 	}
 	size = *size_ptr;
 
+	size_t extra_size = 0;
+
+#ifdef _WIN64
+	/* Hardcoded SEH info for PHP */
+	static const unsigned char uw_data[] = {
+		0x01, // UBYTE: 3 Version , UBYTE: 5 Flags
+		0x10, // UBYTE Size of prolog
+		0x09, // UBYTE Count of unwind codes
+		0x00, // UBYTE: 4 Frame Register, UBYTE: 4 Frame Register offset (scaled)
+		// USHORT * n Unwind codes array
+		0x10, 0x82, // c: subq $0x48, %rsp
+		0x0c, 0xf0, // a: pushq %r15
+		0x0a, 0xe0, // 8: pushq %r14
+		0x08, 0xd0, // 6: pushq %r13
+		0x06, 0xc0, // 4: pushq %r12
+		0x04, 0x70, // 3: pushq %rdi
+		0x03, 0x60, // 2: pushq %rsi
+		0x02, 0x50, // 1: pushq %rbp
+		0x01, 0x30, // 0: pushq %rbx
+	};
+	size = IR_ALIGNED_SIZE(size, 16);
+	extra_size = IR_ALIGNED_SIZE(sizeof(RUNTIME_FUNCTION) + sizeof(uw_data), 16);
+#endif
+
 	if (ctx->code_buffer) {
 		entry = ctx->code_buffer->pos;
 		entry = (void*)IR_ALIGNED_SIZE(((size_t)(entry)), 16);
-		if (size > (size_t)((char*)ctx->code_buffer->end - (char*)entry)) {
+		if (size + extra_size > (size_t)((char*)ctx->code_buffer->end - (char*)entry)) {
 			ctx->data = NULL;
 			ctx->status = IR_ERROR_CODE_MEM_OVERFLOW;
 			return NULL;
 		}
-		ctx->code_buffer->pos = (char*)entry + size;
+		ctx->code_buffer->pos = (char*)entry + size + extra_size;
 	} else {
-		entry = ir_mem_mmap(size);
+		entry = ir_mem_mmap(size + extra_size);
 		if (!entry) {
 			dasm_free(&data.dasm_state);
 			ctx->data = NULL;
 			ctx->status = IR_ERROR_CODE_MEM_OVERFLOW;
 			return NULL;
 		}
-		ir_mem_unprotect(entry, size);
+		ir_mem_unprotect(entry, size + extra_size);
 	}
 
 	ret = dasm_encode(&data.dasm_state, entry);
@@ -10831,12 +10855,12 @@ next_block:;
 		IR_ASSERT(0);
 		dasm_free(&data.dasm_state);
 		if (ctx->code_buffer) {
-			if (ctx->code_buffer->pos == (char*)entry + size) {
+			if (ctx->code_buffer->pos == (char*)entry + size + extra_size) {
 				/* rollback */
-				ctx->code_buffer->pos = (char*)entry - size;
+				ctx->code_buffer->pos = (char*)entry - (size + extra_size);
 			}
 		} else {
-			ir_mem_unmap(entry, size);
+			ir_mem_unmap(entry, size + extra_size);
 		}
 		ctx->data = NULL;
 		ctx->status = IR_ERROR_ENCODE;
@@ -10868,7 +10892,16 @@ next_block:;
 
 	dasm_free(&data.dasm_state);
 
-	ir_mem_flush(entry, size);
+#ifdef _WIN64
+	RUNTIME_FUNCTION *f = (RUNTIME_FUNCTION *)((char*)entry + size);
+	f->BeginAddress = 0;
+	f->EndAddress = size;
+	f->UnwindData = size + sizeof(RUNTIME_FUNCTION);
+	memcpy((char*)f + sizeof(RUNTIME_FUNCTION), uw_data, sizeof(uw_data));
+	RtlAddFunctionTable(f, 1, (DWORD64)(uintptr_t)entry);
+#endif
+
+	ir_mem_flush(entry, size + extra_size);
 
 #if defined(__GNUC__)
 	if ((ctx->flags & IR_GEN_CACHE_DEMOTE) && (ctx->mflags & IR_X86_CLDEMOTE)) {
@@ -10879,12 +10912,12 @@ next_block:;
 			/* _cldemote(p); */
 			asm volatile(".byte 0x0f, 0x1c, 0x06" :: "S" (p));
 			p += 64;
-		} while (p < start + size);
+		} while (p < start + size + extra_size);
 	}
 #endif
 
 	if (!ctx->code_buffer) {
-		ir_mem_protect(entry, size);
+		ir_mem_protect(entry, size + extra_size);
 	}
 
 	ctx->data = NULL;

@cmb69
Copy link
Member

cmb69 commented Dec 10, 2024

Wow, great job @dstogov! I can confirm that without the additional SEH info patch, the test crashes (STATUS_BAD_STACK), but with the patch applied, it succeeds, and the debug backtrace is fine now.

@dstogov
Copy link
Member Author

dstogov commented Dec 10, 2024

@cmb69 thanks for testing. Unfortunately, the general solution is not going to be simple...

  1. hard-coded SEH info should be replaced with a data specific to prologue of a concrete function
  2. RtlAddFunctionTable() should be replaced with RtlInstallFunctionTableCallback() (to support multi-process environment)
  3. information about all generated functions and associated SEH info should be stored somewhere (to support RtlInstallFunctionTableCallback())
  4. it would be great to provide the similar functionality for ELF systems as well.

I afraid, this is going to be a too complex change for PHP-8.4...
Any ideas are welcome.

@cmb69
Copy link
Member

cmb69 commented Dec 10, 2024

I'm seriously lacking understanding of the JIT implementation (and only barely understand how the stack unwinding works now), so the following might be nonsense.

Maybe we could store relevant unwind info for each jitted function right after the function in SHM (similar as you've done in the patch above), and then somehow link this info (if we know the size of the jitted code for a function before it is written to SHM, the layout might be size of jitted code, jitted code, size of unwind info, unwind info, size of next jitted code, …). Install a single function table callback, which then traverses the linked list until it finds the desired function, and then returns the unwind info. Of course, traversing the linked list would be slow, but stack unwinding should be an exceptional case anyway.

If the above would work, a possible optimization might be to store only relevant parts of the unwind info (it seems that the volatile registers are always pushed in the same way, so this info might not need to be stored for each function, etc.), and to reconstruct the proper unwind info in the function table callback.

I afraid, this is going to be a too complex change for PHP-8.4...

It would be great to solve the crashes for PHP-8.4, but if we can't, we could document that tracing JIT (or more generally certain JIT modes) are not supported on Windows (and maybe even disable them).

This is not a general solution.
It works while all the JIT-ed functions have the same "fixed stack frame".
Unwinder uses hard-coded unwind data this "fixed dtack frame".
@dstogov
Copy link
Member Author

dstogov commented Dec 11, 2024

This should fix #15709 now

@cmb69
Copy link
Member

cmb69 commented Dec 11, 2024

Thank you @dstogov!

Maybe we should revert the following commits to see if all issues are fixed:

@dstogov
Copy link
Member Author

dstogov commented Dec 11, 2024

@cmb69 thanks for the idea. Done.

@cmb69
Copy link
Member

cmb69 commented Dec 11, 2024

Two tests are (still) failing with STATUS_BAD_FUNCTION_TABLE, but the STATUS_BAD_STACK issues all seem to have been resolved. That's great!

I'll try to have a closer look this evening, bad so far I have not been able to reproduce any STATUS_BAD_FUNCTION_TABLE errors locally.

@dstogov
Copy link
Member Author

dstogov commented Dec 12, 2024

@cmb69 it seems I fixed that tests and support for multi-process environment.
Few JIT stubs may still cause the problem. Anyway, I would prefer to commit this.
Could you please test, if this doesn't introduce problems for multi-process PHP.

Copy link
Member

@cmb69 cmb69 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you please test, if this doesn't introduce problems for multi-process PHP.

I don't have a good testing environment available; I've just did some stress testing on the Symfony demo app, and didn't notice any issues with the patch.

Given that this certainly fixes several crashes, and that we're very early in the PHP-8.4 release cycle, this looks good to be merged.

Thank you!


static void zend_jit_setup_unwinder(void)
{
/* Hardcoded SEH unwinf data for JIT-ed PHP functions with "fixed stack frame" */
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
/* Hardcoded SEH unwinf data for JIT-ed PHP functions with "fixed stack frame" */
/* Hardcoded SEH unwind data for JIT-ed PHP functions with "fixed stack frame" */

@dstogov dstogov merged commit ccc6c0f into php:PHP-8.4 Dec 12, 2024
9 of 10 checks passed
dstogov added a commit that referenced this pull request Dec 12, 2024
* PHP-8.4:
  Fix GH-15709: Crashing tests on Windows x64  (#17095)
@dstogov
Copy link
Member Author

dstogov commented Dec 16, 2024

After this commit, Nightly and Push tests work-flows start to fail on WINDOWS_X64_ZTS Zend\tests\stack_limit\stack_limit_014.phpt. The test was XFAILED before. Unfortunately, I'm not able to reproduce the failure locally (it seems not always fail at github as well).

I verified the test in debugger: set breakpoint on _zend_bailout() and then checked that debug backtrace looks correct.
I didn't find any problems...

@cmb69 @arnaud-lb may be you get ideas how to debug this.

@cmb69
Copy link
Member

cmb69 commented Dec 16, 2024

@dstogov, stack_limit_014.phpt times out now, because we're using 64MB stack size on Windows by default. See #17166 for a simple fix for CI.

@dstogov
Copy link
Member Author

dstogov commented Dec 16, 2024

@cmb69 I see, you already found the fix and created PR (just not committed yet). Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants