-
Notifications
You must be signed in to change notification settings - Fork 433
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
UCP/PERF: Use final sync up for all UCP ucx_perftest tests #10425
base: master
Are you sure you want to change the base?
UCP/PERF: Use final sync up for all UCP ucx_perftest tests #10425
Conversation
a86c6f9
to
fad3334
Compare
|
||
return (CMD == UCX_PERF_CMD_TAG) || (CMD == UCX_PERF_CMD_TAG_SYNC) || | ||
(CMD == UCX_PERF_CMD_AM); | ||
return !m_perf.params.ucp.is_daemon_mode; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is it ok that for UCX_PERF_TEST_TYPE_STREAM_UNI
with UCX_PERF_CMD_PUT
progress is not invoked on the receiver?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is intentional since put should be a one-sided operation that does not require target side progress
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually this is a very good question...
Unfortunately requester didn't confirm that this fix works for him, so I'll try to reproduce the issue locally and check it this solution actually helps
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think CMD_PUT
is already synchronized with wait_last_iter
and send_last_iter
functions, which are only enabled for UCX_PERF_CMD_PUT
.
Strictly speaking, ACK is not really needed for UCX_PERF_CMD_PUT
, but it doesn't harm either.
However I believe it does not fix the requester issue, I will try to debug on his setup.
There must be something else
|
||
return (CMD == UCX_PERF_CMD_TAG) || (CMD == UCX_PERF_CMD_TAG_SYNC) || | ||
(CMD == UCX_PERF_CMD_AM); | ||
return !m_perf.params.ucp.is_daemon_mode; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is intentional since put should be a one-sided operation that does not require target side progress
What?
Users complain on too optimistic performance reports provided by some ucx_perftest tests (e.g. ucp_put_bw), especially visible on huge messages and low iteration count. The root cause is the absence of synchronisation between client and server at the end of the test execution. This synchronisation was added (#10310) only for some tests: (UCX_PERF_CMD_TAG, UCX_PERF_CMD_TAG_SYNC, UCX_PERF_CMD_AM). It should be extended to others
Why?
There is a discrepancy between client and server final perf result, which is especially visible with pipeline protocols, but it occurs also with many others. On the client side we measure time needed to send all the messages, and with pipeline protocol we essentially measure the time needed to send message to the remote bounce buffer. With default window size of 32 we see an initial performance spike on the client side, but with larger amount of iterations the client and server results are converging to the same value due to implicit synchronisation between send/recv.
Workaround: increase iteration count (-n 1000) or decrease window size (-O 1)
How?
The fix is to send ack message from receiver to the sender at the end of processing, to confirm that all messages were received. For now we only fix the final report, while intermediate reports may still have a discrepancy.
Tested manually by running all UCP STREAM_UNI tests with sync up message