CP-32622: avoid using select and instead use epoll #4877
ocaml/xapi-idl/lib/posix_channel.ml
Outdated
let epoll = Polly.create () in
List.iter (fun fd -> Polly.add epoll fd Polly.Events.inp) (r @ w) ;
ignore
@@ Polly.wait epoll 4 (-1) (fun _ fd _ ->
I think that the old code can be summarised as: "Wait for any of the r to be ready for reading or any of the w to be ready for writing. Then read/write to the fd found ready. Start again." In every round of the loop (every time select returns) one or more fds would be ready, and it may often be just one. So we can't assume that we can write + read in the same round.
The third argument of the function passed in here, currently ignored with _, is set to the kind of event that happened and indicates whether we can read from or write to fd.
ocaml/networkd/lib/jsonrpc_client.ml
Outdated
inner remain_time max_bytes
else
inner remain_time max_bytes
Unix.setsockopt_float fd Unix.SO_RCVTIMEO (Int64.to_float max_time) ;
This line is giving a parse error on make test
4a79673 to a562d49
if !i < !buf_remote_end + b then final := true ;
buf_remote_end := !i
) ;
if fd = file_desc then
An alternative style could use match:
match file_desc with
| file_desc when file_desc = Unix.stdin -> ..
| file_desc when file_desc = fd -> ..
| _otherwise ->
Do we want to handle one or all FDs that are ready? Up to 2 can be ready. This code handles two if they are ready but we have other places where we handle only one. It's not obvious why we have different behavior.
823ef93 to ea9cb03
ocaml/forkexecd/src/child.ml
Outdated
[comms_sock; fd_sock] ;
(* Although there are two fds, we set max_fds to 1 here as we only want this
function to trigger once so that we get one return value *)
Polly.wait_fold epoll 1 (-1) state (fun _ fd _ _ ->
Is 1 correct here, though? Would that prevent it from even watching the other FD?
My understanding is that one FD would be processed per round - and any other FD that is ready would be reported again in the next round. But does this guarantee fairness?
This was the only way I could think of to make it work. Do you have any alternatives?
535388b to 596a6dc
596a6dc to 33c4bb6
Remember that the timeout in |
33c4bb6 to d4d6c10
~finally:(fun () -> Polly.close epoll)
(fun () ->
ignore
@@ Polly.wait epoll 4 (-1) (fun _ fd _ ->
Note that the code below matches exactly one condition and not all fds that are ready; could this be written as
if fd = x then do something;
if fd = y then do something;
..
Maybe leave a comment why the selected behavior is the right one.
ocaml/xapi-idl/lib/posix_channel.ml
Outdated
(fun () ->
ignore
@@ Polly.wait epoll 4 (-1) (fun _ fd event ->
if event = Polly.Events.inp then
Again, this handles at most one FD. I think the unconditional else case is a bit dangerous and I would explicitly check that any FD is ready before acting on it. This applies to other places as well.
match Unix.read fd bytes 0 4096 with
| 0 ->
Buffer.contents buf (* EOF *)
| n ->
Could be expressed as
| n when n >= max_bytes -> ..
| n (* otherwise *) ->
| exception .. ->
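A minimal sketch of the suggested match shape, assuming a hypothetical read_up_to function (not the PR's actual code) that collects at most max_bytes from a descriptor. It uses only Unix.read and the standard Buffer and Bytes modules; retrying on EINTR is an illustrative choice for the exception case.

```ocaml
(* Hypothetical helper, for illustration only. Requires the unix library. *)
let read_up_to fd max_bytes =
  let buf = Buffer.create 4096 in
  let chunk = Bytes.create 4096 in
  let rec loop remaining =
    match Unix.read fd chunk 0 (min remaining 4096) with
    | 0 ->
        Buffer.contents buf (* EOF, or nothing left to ask for *)
    | n when n >= remaining ->
        (* the byte budget is used up: stop reading *)
        Buffer.add_subbytes buf chunk 0 n ;
        Buffer.contents buf
    | n (* otherwise *) ->
        Buffer.add_subbytes buf chunk 0 n ;
        loop (remaining - n)
    | exception Unix.Unix_error (Unix.EINTR, _, _) ->
        (* interrupted by a signal before any data arrived: retry *)
        loop remaining
  in
  loop max_bytes
```

Because the exception case is part of the match, the recursive calls stay in tail position, which wrapping the whole match in try ... with would not allow.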
List.iter (fun fd -> Polly.add epoll fd Polly.Events.inp) r ;
List.iter (fun fd -> Polly.add epoll fd Polly.Events.out) w ;
Fun.protect
~finally:(fun () -> Polly.close epoll)
We use this pattern quite a lot; should we have a with_polly somewhere?
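One possible shape for such a helper, sketched here as a generic with_resource bracket (a hypothetical name) so that it runs with only the standard library. The commented-out with_polly definition assumes the Polly.create and Polly.close calls that appear in the diff above.

```ocaml
(* Hypothetical generic bracket: acquire a resource, use it, and always
   release it, even when the body raises. *)
let with_resource ~create ~close f =
  let resource = create () in
  Fun.protect ~finally:(fun () -> close resource) (fun () -> f resource)

(* With the polly library linked in, the helper asked about could then be:

   let with_polly f =
     with_resource ~create:Polly.create ~close:Polly.close f
*)
```

Call sites would then only contain the Polly.add and Polly.wait logic, and the close could not be forgotten on an error path.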
ocaml/database/block_device_io.ml
Outdated
(fun () ->
Unix.setsockopt_float s Unix.SO_RCVTIMEO timeout ;
try fst (Unix.accept s)
with Unix.Unix_error (Unix.EAGAIN, _, _) -> raise Unixext.Timeout
Does SO_RCVTIMEO work with accept? I can't find a clear statement for or against in the socket(7) manpage; it mentions read/recvmsg/send/sendmsg.
It looks like you're right; the only examples online of using a timeout with accept use select.
inner remain_time max_bytes
else
inner remain_time max_bytes
( try Unix.setsockopt_float fd SO_RCVTIMEO (Int64.to_float max_time)
We do seem to repeat this quite often. Perhaps, as a future cleanup PR, we could define a helper function 'timed_read' which does the setsockopt + read + Fun.protect to set the option back. For writes, a 'timed_write' might be useful too.
Yes, this would avoid mistakes, such as around resetting the socket option.
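A rough sketch of what the proposed 'timed_read' could look like, assuming a locally defined Timeout exception (the PR itself raises Unixext.Timeout) and assuming that restoring the previously configured timeout is the desired reset behaviour. Only functions from the standard Unix module are used.

```ocaml
(* Hypothetical helper, for illustration only. Requires the unix library. *)
exception Timeout

let timed_read fd ~timeout buf off len =
  (* remember the current setting so it can be restored afterwards *)
  let previous = Unix.getsockopt_float fd Unix.SO_RCVTIMEO in
  Unix.setsockopt_float fd Unix.SO_RCVTIMEO timeout ;
  Fun.protect
    ~finally:(fun () ->
      (* runs on normal return and on any exception *)
      Unix.setsockopt_float fd Unix.SO_RCVTIMEO previous)
    (fun () ->
      try Unix.read fd buf off len
      with Unix.Unix_error ((Unix.EAGAIN | Unix.EWOULDBLOCK), _, _) ->
        raise Timeout)
```

A 'timed_write' would follow the same pattern with SO_SNDTIMEO and Unix.write.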
ocaml/xapi-idl/lib/posix_channel.ml
Outdated
to_close := fd :: !to_close ;
proxy fd proxy_socket
) else
assert false (* can never happen *)
I wouldn't be so sure: we could get spurious notifications from 'epoll', and we don't want to crash the application (or thread). Perhaps just log it as an error for now?
Especially since you've added 2 file descriptors, s_ip and s_unix: this will NOT be just s_unix. It is entirely possible that s_ip will have something available; otherwise why add it to epoll in the first place?
8f13617 to 89bb393
I believe all of the comments have now been addressed.
d4b4878 to e124de2
ocaml/database/block_device_io.ml
Outdated
if fds = [] then (* We must have timed out *)
raise Unixext.Timeout
else (* There will only ever be a maximum of one fd *)
List.hd fds
While List.hd is guarded here, it's preferable to destructure the list to obtain values from it:
match fds with
| [] -> (* We must have timed out *)
  raise Unixext.Timeout
| fd :: _ -> (* There will only ever be a maximum of one fd *)
  fd
e124de2 to da3f2d9
eed05d5 to f259cae
All comments addressed and suiterun 179915 passing.
ocaml/libs/stunnel/stunnel.ml
Outdated
@@ -394,7 +394,7 @@ let rec retry f = function
 try f ()
 with Stunnel_initialisation_failed ->
 (* Leave a few seconds between each attempt *)
-ignore (Unix.select [] [] [] 3.) ;
+Unix.sleep 3 ;
Use Thread.delay
Signed-off-by: Steven Woods <[email protected]>
of pipes Signed-off-by: Steven Woods <[email protected]>
loop as its contents has been removed in a previous commit. Signed-off-by: Steven Woods <[email protected]>
f259cae to f4c2087
n
with Unix.Unix_error (Unix.EAGAIN, _, _) -> -1
in
Unix.setsockopt_float ic.fd Unix.SO_RCVTIMEO 0. ;
I think we'd better use finally to ensure that this timeout is reset in case of any exception (only one kind is caught above).
Every read on ic.fd seems to go through here, and each read is preceded by setting a timeout, so I don't think we even need to reset the timeout; just drop this line.
There are more reads in http.ml but they also set a socket timeout. Will need to check the other callers too.
r = []
try
ignore (Unix.read sock_out (Bytes.create 1) 0 1) ;
Unix.setsockopt_float sock_out Unix.SO_RCVTIMEO 0. ;
Here we'd miss resetting the socket option in case of an exception.
Testing whether 'select' is completely gone or not is tricky due to a lot of dead code in the linked binary.
unix_select.gawk:
It shows these for XAPI and xenopsd:
And for xenopsd:
The 'select' in watch.ml can be replaced similarly as elsewhere by changing pipe->socket and using socket timeouts.
Removed some Unix.select and Thread.wait* usages in xapi-project/stdext#80, and a few more from XAPI and xenopsd as well: https://github.com/edwintorok/xen-api/commits/private/edvint/CP-32622
We'll probably need a CA ticket and a backport of the watch.ml fix, since it affects xenopsd, which already claims to support >1024 fds.
I think these now remove all the obvious Unix.select usages, but we should implement what has been suggested here and add some CI checks in xs-opam with various methods to check that we really don't call it anymore (and then add some unit tests to all these functions that we modified to check that they still work correctly):
I've updated the |
I'm closing this and have opened a PR against Edwin's fork so that he can still keep track of it: edwintorok#4
To prevent xapi from running out of file descriptors, replace select with epoll in most cases.