Perform handle spill IO outside of locked section in SpillFramework #11880

Open
wants to merge 28 commits into branch-25.02
Conversation

@zpuller zpuller (Collaborator) commented Dec 16, 2024

Addresses #11830

Moves the IO outside of the critical section in the spillable buffer handle spill functions, so that threads interacting with the SpillFramework can manage the spill state of a handle consistently without being blocked on IO. For example, if thread t1 is in the middle of spilling and thread t2 wants to check whether this handle is currently spilling, t2 does not need to wait for the spill IO operation to complete in order to check whether the handle is spillable.
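
To illustrate the intent, here is a minimal sketch with stand-in types (not the actual SpillFramework code): `spillable` only inspects state under the handle lock, so it returns immediately even while another thread's spill IO is in flight.

```scala
class HandleSketch {
  // Stand-ins for the real buffers and spill state; names are illustrative only.
  private var dev: Option[Array[Byte]] = Some(new Array[Byte](1024))
  private var spilling: Boolean = false

  // t2 can call this at any time; it never waits on spill IO.
  def spillable: Boolean = synchronized { dev.isDefined && !spilling }

  def spill(): Unit = {
    // Short critical section: claim the spill and grab a reference.
    val toWrite: Option[Array[Byte]] = synchronized {
      if (dev.isDefined && !spilling) { spilling = true; dev } else None
    }
    toWrite.foreach { buf =>
      writeToHost(buf) // slow IO happens here, outside the lock
      // Short critical section again: publish the new state.
      synchronized {
        dev = None
        spilling = false
      }
    }
  }

  private def writeToHost(buf: Array[Byte]): Unit = { /* pretend this is slow IO */ }
}
```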

Signed-off-by: Zach Puller <[email protected]>
Signed-off-by: Zach Puller <[email protected]>
Signed-off-by: Zach Puller <[email protected]>
@zpuller zpuller changed the title from "Spill lock" to "Add an explicit spill lock to the Spill Framework" Dec 16, 2024
Signed-off-by: Zach Puller <[email protected]>
Signed-off-by: Zach Puller <[email protected]>
Signed-off-by: Zach Puller <[email protected]>
@revans2 revans2 (Collaborator) left a comment

Could you please update the docs for all of the public APIs in SpillableHandle, and for SpillableHandle itself, to make it clear which APIs need to be thread safe and how they are supposed to be protected? For example, spill has no indication of how it should behave when multiple callers invoke it; we can infer that one wins and the others fail. I want this mostly so it is clear what the contract is for each of these APIs. spillable being a best-effort API intended only for quick filtering is fine, but that needs to be documented, so that if someone tries to use it in a way that requires it to be exact we know that usage is wrong and violates the contract. This will also let me reason about the code and who is calling it, so that I can better reason about the correctness.

Also, could you please explain how error handling/error recovery is intended to happen when spilling? For example, if an exception is thrown while we are in the middle of trying to spill something, what should happen? Are we supposed to just keep it in the spilling state?
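
For reference, a hypothetical sketch of the kind of contract documentation being asked for here (the trait name and wording are illustrative, not the actual SpillableHandle API):

```scala
trait SpillableHandleDocSketch {
  /**
   * Spill this handle's buffer to the next tier.
   *
   * Thread safety: may be called concurrently from multiple threads; at most one
   * caller performs the actual spill, and the others return 0 because the handle
   * is already spilling (or already spilled).
   *
   * @return the number of bytes spilled by this call, or 0 if another thread won the race
   */
  def spill(): Long

  /**
   * Best-effort, point-in-time check intended only for quick filtering of spill
   * candidates. The answer may already be stale when the caller acts on it, so it
   * must not be relied on for correctness.
   */
  def spillable: Boolean
}
```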

@zpuller zpuller (Collaborator, Author) commented Dec 17, 2024

> Could you please update the docs for all of the public APIs in SpillableHandle, and for SpillableHandle itself, to make it clear which APIs need to be thread safe and how they are supposed to be protected?

Absolutely yeah, I'm just planning to resolve the other outstanding issues in the PR and ensure I have a clear understanding and answer to the other questions, and then will document accordingly.

@zpuller zpuller marked this pull request as draft December 19, 2024 18:30
@zpuller zpuller (Collaborator, Author) commented Dec 19, 2024

I discussed this offline with @abellina and he gave me some ideas on how to rework this so it does not require a separate lock, and instead just moves the IO component out of the protected section of spill and swaps the buffers atomically. I pushed some changes, but I'm also converting this back to a draft as it's not fully ready for review yet. I still have to address some comments above, and I am planning to do some manual testing after pulling in the most recent 25.02 branch changes.
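
Roughly, the rework being described looks like the following sketch (stand-in types again; the real code swaps device/host buffers inside the handle):

```scala
object SwapSketch {
  private var dev: Option[Array[Byte]] = Some(new Array[Byte](1024))
  private var host: Option[Array[Byte]] = None

  // Stand-in for the actual device-to-host copy, which is the slow IO step.
  private def copyToHost(d: Array[Byte]): Array[Byte] = d.clone()

  def spillToHost(): Unit = {
    val toCopy = synchronized { dev }   // take a reference under the lock
    toCopy.foreach { d =>
      val staging = copyToHost(d)       // IO performed without holding the lock
      synchronized {
        host = Some(staging)            // publish the host copy and drop the
        dev = None                      // device reference in one locked step
      }
    }
  }
}
```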

@zpuller zpuller changed the title from "Add an explicit spill lock to the Spill Framework" to "Perform handle spill IO outside of locked section in SpillFramework" Dec 19, 2024
@jihoonson jihoonson (Collaborator) commented:

I'm triggering a build job to see if some test consistently fails for all PRs.

@jihoonson jihoonson (Collaborator) commented:

build

zpuller added 10 commits January 8, 2025 10:10
Signed-off-by: Zach Puller <[email protected]>
Signed-off-by: Zach Puller <[email protected]>
Signed-off-by: Zach Puller <[email protected]>
Signed-off-by: Zach Puller <[email protected]>
Signed-off-by: Zach Puller <[email protected]>
Signed-off-by: Zach Puller <[email protected]>
Signed-off-by: Zach Puller <[email protected]>
Signed-off-by: Zach Puller <[email protected]>
Signed-off-by: Zach Puller <[email protected]>
Signed-off-by: Zach Puller <[email protected]>
Signed-off-by: Zach Puller <[email protected]>
@zpuller zpuller (Collaborator, Author) commented Jan 14, 2025

I've pushed several updates to the branch. Per my previous comment, the changes are now structured in such a way that the interface does not need to change.

I was able to test this and see evidence in traces showing that it allows multiple threads to spill concurrently.

I considered refactoring some duplicate code fragments shared between the handle types around how we do the buffer swaps etc., but I'm not sure it would actually improve the overall readability of the code or the PR, so I held off for now.

@zpuller zpuller marked this pull request as ready for review January 14, 2025 19:28
Signed-off-by: Zach Puller <[email protected]>
@zpuller zpuller requested review from revans2, jlowe and abellina and removed request for jlowe January 15, 2025 17:14
}
}

override def close(): Unit = {
// do we need to Cuda.deviceSynchronize here?
// what if we don't spill
Collaborator:

I don't see a need here. close is called from the user code, and that's who owns the handle, so the caller needs to be careful about calling synchronize before closing memory. At the same time, we use stream-aware allocators, so I don't see cases where we need to add extra synchronization.

Collaborator (author):

Oh sorry, I know we just discussed that offline; I forgot to delete this comment.

@@ -527,33 +601,64 @@ class SpillableColumnarBatchHandle private (
materialized
}


private var toSpill: Option[ColumnarBatch] = None
private var spilled: Option[ColumnarBatch] = None
Collaborator:

why do we need both toSpill and spilled?

Collaborator (author):

toSpill is used both to indicate which thread is currently spilling and to hold a reference to the underlying buffer during the spill. spilled is used to make sure the resource gets cleaned up properly after spilling. It's possible there's some clever way to consolidate them that I couldn't think of.

Collaborator (author):

But, for instance, if you are spilling and only have toSpill set, and on closing the handle you do toSpill = None, dev = None, then when the spill IO finishes and tries to swap the buffers you'll end up with all references set to None.
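
As a simplified rendering of the fields being discussed (my reading of the thread and the diff fragments, not the PR's actual code; the follow-up comments below point out remaining gaps in the close-during-spill path):

```scala
class TwoFieldSketch {
  // Stand-ins: `dev` is the live buffer, `toSpill` marks/holds the buffer being
  // spilled, and `spilled` records what the spill produced so it can be released.
  private var dev: Option[Array[Byte]] = Some(new Array[Byte](16))
  private var toSpill: Option[Array[Byte]] = None
  private var spilled: Option[Array[Byte]] = None

  def beginSpill(): Option[Array[Byte]] = synchronized {
    toSpill = dev          // claim the buffer; also signals "a spill is in flight"
    toSpill
  }

  def finishSpill(): Unit = synchronized {
    spilled = dev          // if close() already ran, dev is None by now
    dev = None
    toSpill = None
  }

  def close(): Unit = synchronized {
    // Mid-spill we only drop `dev`; `toSpill` stays so the spill thread can finish,
    // and whatever it records in `spilled` can be released afterwards.
    dev = None
  }
}
```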

Collaborator:

So what do you expect the final state of a closed handle to look like? I would expect them all to be set to None, after we have closed anything that could not be closed while the spill was happening.

To me it just feels simpler that if we see a close while a spill is happening, we mark the handle as closed but don't try to close anything. Then, when the spill is finished but still in the synchronized step, we call close again (or perhaps an underlying close method) to finish the job.

I would also really like some tests to verify that this is working because I keep seeing race conditions and I don't trust myself to have caught all of them.
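
A minimal sketch of the proposed pattern (hypothetical names; the real handle has more state): close() during a spill only marks the handle closed, and the spill thread finishes the close in its final synchronized step.

```scala
class ClosedFlagSketch {
  private var dev: Option[Array[Byte]] = Some(new Array[Byte](16))
  private var spilling = false
  private var closed = false

  def close(): Unit = synchronized {
    closed = true
    if (!spilling) doClose()  // nothing in flight: release right away
    // otherwise the spill thread finishes the close when it is done
  }

  def spill(): Unit = {
    val toWrite = synchronized {
      if (dev.isEmpty || spilling || closed) None
      else { spilling = true; dev }
    }
    toWrite.foreach { buf =>
      writeOut(buf)            // the IO, outside the lock
      synchronized {
        spilling = false
        dev = None
        if (closed) doClose()  // finish the close that arrived mid-spill
      }
    }
  }

  private def doClose(): Unit = { dev = None /* plus release any other resources */ }

  private def writeOut(buf: Array[Byte]): Unit = { /* pretend this is slow IO */ }
}
```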

Collaborator (author):

Yep, they should all be set to None. I've tried to rework it to put as much as possible into a separate close impl method that we can invoke from close or spill, as you described. The only thing it doesn't handle is local variables like staging, but I still set those to None and close them separately within the spill call.

Also added some Monte Carlo style tests that overlap async closing and spilling at different delays. In my local testing I could see that I'm hitting all possible branches of the race, so to speak, but I removed the print statements for the PR. I'm not sure whether it's sufficient to simply run the tests and see that no buffers leak, or whether we want to add some test helper logic to SpillFramework to actually assert that we triggered the race; for now I left that out as it may be overkill.
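
Roughly the shape such a race test can take (plain threads and random delays; a sketch, not the PR's actual test code, which would use the framework's own handles and leak tracking):

```scala
import java.util.concurrent.CountDownLatch
import scala.util.Random

object SpillCloseRaceSketch {
  // Hypothetical minimal interface for whatever handle is under test.
  trait RaceableHandle { def spill(): Unit; def close(): Unit }

  // One trial fires spill() and close() from two threads with small random delays;
  // many trials exercise the different interleavings of the race.
  def raceCloseAgainstSpill(makeHandle: () => RaceableHandle, trials: Int = 1000): Unit = {
    val rng = new Random(42)
    (0 until trials).foreach { _ =>
      val handle = makeHandle()
      val start = new CountDownLatch(1)
      def contender(action: () => Unit): Thread = new Thread(() => {
        start.await()
        Thread.sleep(0, rng.nextInt(100000)) // jitter of up to ~0.1 ms
        action()
      })
      val spiller = contender(() => handle.spill())
      val closer = contender(() => handle.close())
      spiller.start(); closer.start()
      start.countDown()
      spiller.join(); closer.join()
      // After both finish: assert nothing leaked (the real tests could use the
      // framework's leak tracking for this).
    }
  }
}
```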

Signed-off-by: Zach Puller <[email protected]>
Signed-off-by: Zach Puller <[email protected]>
@zpuller zpuller requested review from revans2 and abellina January 17, 2025 17:17
host = stagingHost
}
spilled = dev
dev = None
Collaborator:

I think we need to have code to handle the case where close was called in the middle of a spill.

It looks like, with the current code, we are going to leak stagingHost, and also dev, which is then stored into spilled.

Signed-off-by: Zach Puller <[email protected]>
@zpuller zpuller requested a review from revans2 January 22, 2025 16:11
@@ -160,6 +155,11 @@ trait StoreHandle extends AutoCloseable {
* removed on shutdown, or by handle.close, but 0-byte handles are not spillable.
*/
val approxSizeInBytes: Long

/**
* This is used to resolve races between closing a handle while spilling.
Collaborator:

Suggested change:
- * This is used to resolve races between closing a handle while spilling.
+ * This is used to resolve races between closing a handle and spilling.

}
sizeInBytes
} else {
0
Collaborator:

spacing is odd here.

} else {
host = staging
}
// set spilled to dev instead of toSpill so that if dev was already closed during spill,
Collaborator:

I am a little confused by this comment. toSpill is dev; it's just a reference to it. So when you increased the refcount at line 649, you increased it for both dev and toSpill. I don't believe it makes much of a difference whether we set spilled to dev or to toSpill, but I think we should just remove the comment.

Collaborator (author):

The logic here is an issue of timing: yes, we set toSpill to equal dev, but if close is called later we intentionally leave toSpill as is (in case we are mid-spill) while setting dev to None, so the two variables are no longer equal at that point.

Having said that, I think I may be able to simplify this now that we explicitly keep track of closed; let me check.

}
}

private def withChunkedPacker[T](body: ChunkedPacker => T): T = {
val tbl = synchronized {
if (dev.isEmpty) {
if (toSpill.isEmpty) {
Collaborator:

This is a bad design for this method, and it's my fault. Could we pass a batch to withChunkedPacker instead? That way we don't have this exception and don't rely on state.

So the signature would become:

private def withChunkedPacker[T](batchToPack: ColumnarBatch)(body: ChunkedPacker => T): T
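
As a generic sketch of that refactor (a loan pattern with an explicit argument; PackerSketch and the other names below are invented, not the real ChunkedPacker API):

```scala
// The resource to operate on is passed in explicitly instead of being read from
// the object's mutable state (dev/toSpill) inside the method.
final class PackerSketch(batch: String) extends AutoCloseable {
  def nextChunk(): String = s"chunk-of-$batch"
  override def close(): Unit = ()
}

object PackerLoanSketch {
  def withPacker[T](batchToPack: String)(body: PackerSketch => T): T = {
    val packer = new PackerSketch(batchToPack) // no state check, no "empty" exception path
    try body(packer) finally packer.close()
  }

  // Usage: the caller decides which batch gets packed.
  val firstChunk: String = withPacker("myBatch")(p => p.nextChunk())
}
```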

} else {
host = staging
}
// set spilled to dev instead of toSpill so that if dev was already closed during spill,
Collaborator:

same here, the comment is confusing to me.

}
}

override def close(): Unit = {
private def doClose(): Unit = synchronized {
releaseDeviceResource()
Collaborator:

So now releaseDeviceResource is inside of the handle lock. This method ends up calling the spill store and taking its lock, and I was trying to avoid that to prevent lock-inversion deadlocks. Do you have a reason to move this inside of the handle lock?
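
To make the concern concrete, a generic illustration of the lock-inversion hazard (not the actual SpillFramework locks):

```scala
object LockInversionSketch {
  private val handleLock = new Object
  private val storeLock = new Object

  // Thread A: handle lock first, then the store lock (what the questioned code does).
  def closeUnderHandleLock(): Unit = handleLock.synchronized {
    storeLock.synchronized { /* e.g. remove the handle from the store */ }
  }

  // Thread B: store lock first, then the handle lock (some other store operation).
  def storeOperationTouchingHandle(): Unit = storeLock.synchronized {
    handleLock.synchronized { /* e.g. query the handle */ }
  }

  // If A and B run concurrently, each can end up holding one lock while waiting for
  // the other: a classic lock-ordering deadlock. Keeping the store call outside the
  // handle lock removes one side of this cycle.
}
```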

Collaborator (author):

I think it shouldn't be needed, let me try to undo that.
