Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

router: add callro methods based on node zone #391

Draft
wants to merge 4 commits into
base: master
Choose a base branch
from
Draft
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 2 additions & 0 deletions test/router/router.result
Original file line number Diff line number Diff line change
Expand Up @@ -1465,6 +1465,8 @@ error_messages
- Use replicaset:callbro(...) instead of replicaset.callbro(...)
- Use replicaset:callre(...) instead of replicaset.callre(...)
- Use replicaset:callro(...) instead of replicaset.callro(...)
- Use replicaset:callbzre(...) instead of replicaset.callbzre(...)
- Use replicaset:callbzro(...) instead of replicaset.callbzro(...)
- Use replicaset:callrw(...) instead of replicaset.callrw(...)
- Use replicaset:connect(...) instead of replicaset.connect(...)
- Use replicaset:connect_all(...) instead of replicaset.connect_all(...)
Expand Down
26 changes: 25 additions & 1 deletion vshard/replicaset.lua
Original file line number Diff line number Diff line change
Expand Up @@ -577,13 +577,33 @@ local function replicaset_template_multicallro(prefer_replica, balance)
local function pick_next_replica(replicaset, now)
local r
local master = replicaset.master

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please, drop not needed empty lines.

if balance then
local prefered_zone = replicaset.priority_list[1].zone
local i = #replicaset.priority_list
while i > 0 do
r = replicaset_balance_replica(replicaset)
i = i - 1
if r:is_connected() and (not prefer_replica or r ~= master) and
replica_check_backoff(r, now) then
replica_check_backoff(r, now)
then
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please, revert unnecessary changes like these. These 3 lines didn't have to change at all. Besides, we never put then on a separate line. Lets keep it consistent with our code style.

-- Pick a replica according prefered zone (highest priority replica zone) in round-robin manner
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please, keep line width limit at 80 symbols.

if balance == "prefer_zone" and prefered_zone then
local cbi = replicaset.balance_i
local nr = replicaset_balance_replica(replicaset)

if prefered_zone and r.zone and r.zone == prefered_zone
and nr.zone and nr.zone ~= prefered_zone
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No need to check r.zone and r.zone == .... You can just do r.zone == .... nil is comparable with other types just fine.

and (not prefer_replica or nr ~= master)
then
-- Reset cursor to the main position if next replica is in different zone.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Replicas are not sorted by zones. Hence if in the replica list the first one is in zone1, then one in zone2, and others in zone1, your code will skip the tail-replicas always. That doesn't look good.

There are several solutions - don't touch balance_i. Simply skip replicas from other zones manually. Or sort the replicas by zones in the replica list. Or add a key-value dictionary to each replicaset object, where key = zone name, value = replica list in this zone.

I think it might be too complicated to do the last 2 options, so I suggest to just skip not fitting replicas during the iteration. It seems this code solves your problem and is multiple times shorter:

@@ -577,33 +577,18 @@ local function replicaset_template_multicallro(prefer_replica, balance)
     local function pick_next_replica(replicaset, now)
         local r
         local master = replicaset.master
-
         if balance then
-            local prefered_zone = replicaset.priority_list[1].zone
+            local prefered_zone
+            if balance == 'prefered_zone' then
+                prefered_zone = replicaset.priority_list[1].zone
+            end
             local i = #replicaset.priority_list
             while i > 0 do
                 r = replicaset_balance_replica(replicaset)
                 i = i - 1
                 if r:is_connected() and (not prefer_replica or r ~= master) and
-                    replica_check_backoff(r, now)
-                then
-                    -- Pick a replica according prefered zone (highest priority replica zone) in round-robin manner
-                    if balance == "prefer_zone" and prefered_zone then
-                        local cbi = replicaset.balance_i
-                        local nr = replicaset_balance_replica(replicaset)
-
-                        if prefered_zone and r.zone and r.zone == prefered_zone
-                            and nr.zone and nr.zone ~= prefered_zone
-                            and (not prefer_replica or nr ~= master)
-                        then
-                            -- Reset cursor to the main position if next replica is in different zone.
-                            replicaset.balance_i = 1
-                        else
-                            -- Restore rr-cursor position.
-                            replicaset.balance_i = cbi
-                        end
-                    end
-
+                   replica_check_backoff(r, now) and
+                   (not prefered_zone or prefered_zone == r.zone) then
                     return r
                 end
             end

Of course, we would need tests to be sure.

replicaset.balance_i = 1
else
-- Restore rr-cursor position.
replicaset.balance_i = cbi
end
end

Comment on lines +590 to +606
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You should strive to keep the code indentation to the minimum. For example, here it is easily achievable by just handling simple cases first and doing early return. Then you can handle the zone balancing with -4 spaces indentation:

                 if r:is_connected() and (not prefer_replica or r ~= master) and
-                    replica_check_backoff(r, now)
-                then
+                   replica_check_backoff(r, now) then
+                    if balance ~= 'prefered_zone' or not prefered_zone then
+                        return r
+                    end
                     -- Pick a replica according prefered zone (highest priority replica zone) in round-robin manner
-                    if balance == "prefer_zone" and prefered_zone then
-                        local cbi = replicaset.balance_i
-                        local nr = replicaset_balance_replica(replicaset)
+                    local cbi = replicaset.balance_i
+                    local nr = replicaset_balance_replica(replicaset)
 
-                        if prefered_zone and r.zone and r.zone == prefered_zone
-                            and nr.zone and nr.zone ~= prefered_zone
-                            and (not prefer_replica or nr ~= master)
-                        then
-                            -- Reset cursor to the main position if next replica is in different zone.
-                            replicaset.balance_i = 1
-                        else
-                            -- Restore rr-cursor position.
-                            replicaset.balance_i = cbi
-                        end
+                    if prefered_zone and r.zone and r.zone == prefered_zone
+                        and nr.zone and nr.zone ~= prefered_zone
+                        and (not prefer_replica or nr ~= master)
+                    then
+                        -- Reset cursor to the main position if next replica is in different zone.
+                        replicaset.balance_i = 1
+                    else
+                        -- Restore rr-cursor position.
+                        replicaset.balance_i = cbi
                     end
-

return r
end
end
Expand Down Expand Up @@ -637,6 +657,7 @@ local function replicaset_template_multicallro(prefer_replica, balance)
end
end
opts.timeout = timeout

net_status, storage_status, retval, err =
replica_call(replica, func, args, opts)
now = fiber_clock()
Expand Down Expand Up @@ -990,8 +1011,10 @@ local replicaset_mt = {
callrw = replicaset_master_call;
callro = replicaset_template_multicallro(false, false);
callbro = replicaset_template_multicallro(false, true);
callbzro = replicaset_template_multicallro(false, "prefer_zone");
callre = replicaset_template_multicallro(true, false);
callbre = replicaset_template_multicallro(true, true);
callbzre = replicaset_template_multicallro(true, "prefer_zone");
map_call = replicaset_map_call,
update_master = replicaset_update_master,
};
Expand Down Expand Up @@ -1235,6 +1258,7 @@ local function buildall(sharding_cfg)
local function replica_cmp_weight(r1, r2)
-- Master has priority over replicas with the same
-- weight.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please, drop this line.

if r1.weight == r2.weight then
return r1 == new_replicaset.master
else
Expand Down
16 changes: 15 additions & 1 deletion vshard/router/init.lua
Original file line number Diff line number Diff line change
Expand Up @@ -576,11 +576,15 @@ local function router_call_impl(router, bucket_id, mode, prefer_replica,
local call
if mode == 'read' then
if prefer_replica then
if balance then
if balance == "prefer_zone" then
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please, change 'balance' to numeric IDs. String comparison in Lua not always works for constant time and might degrade to linear time. Like this:

@@ -65,6 +65,10 @@ local fiber_yield = fiber.yield
 local fiber_cond_wait = util.fiber_cond_wait
 local gsc = util.generate_self_checker
 
+local BALANCE_NONE = 0
+local BALANCE_RR = 1
+local BALANCE_BEST_ZONE_RR = 2
+
 --
 -- on_connect() trigger for net.box
 --
@@ -578,7 +582,7 @@ local function replicaset_template_multicallro(prefer_replica, balance)
         local r
         local master = replicaset.master
 
-        if balance then
+        if balance ~= BALANCE_NONE then
             local prefered_zone = replicaset.priority_list[1].zone
             local i = #replicaset.priority_list
             while i > 0 do
@@ -588,7 +592,7 @@ local function replicaset_template_multicallro(prefer_replica, balance)
                     replica_check_backoff(r, now)
                 then
                     -- Pick a replica according prefered zone (highest priority replica zone) in round-robin manner
-                    if balance == "prefer_zone" and prefered_zone then
+                    if balance == BALANCE_BEST_ZONE_RR and prefered_zone then
                         local cbi = replicaset.balance_i
                         local nr = replicaset_balance_replica(replicaset)
 
@@ -1009,12 +1013,12 @@ local replicaset_mt = {
         wait_connected_all = replicaset_wait_connected_all,
         call = replicaset_master_call;
         callrw = replicaset_master_call;
-        callro = replicaset_template_multicallro(false, false);
-        callbro = replicaset_template_multicallro(false, true);
-        callbzro = replicaset_template_multicallro(false, "prefer_zone");
-        callre = replicaset_template_multicallro(true, false);
-        callbre = replicaset_template_multicallro(true, true);
-        callbzre = replicaset_template_multicallro(true, "prefer_zone");
+        callro = replicaset_template_multicallro(false, BALANCE_NONE);
+        callbro = replicaset_template_multicallro(false, BALANCE_RR);
+        callbzro = replicaset_template_multicallro(false, BALANCE_BEST_ZONE_RR);
+        callre = replicaset_template_multicallro(true, BALANCE_NONE);
+        callbre = replicaset_template_multicallro(true, BALANCE_RR);
+        callbzre = replicaset_template_multicallro(true, BALANCE_BEST_ZONE_RR);
         map_call = replicaset_map_call,
         update_master = replicaset_update_master,
     };

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the comments. Will check it.

call = 'callbzre'
elseif balance then
call = 'callbre'
else
call = 'callre'
end
elseif balance == "prefer_zone" then
call = 'callbzro'
elseif balance then
call = 'callbro'
else
Expand Down Expand Up @@ -698,6 +702,10 @@ local function router_callbro(router, bucket_id, ...)
return router_call_impl(router, bucket_id, 'read', false, true, ...)
end

local function router_callbzro(router, bucket_id, ...)
return router_call_impl(router, bucket_id, 'read', false, "prefer_zone", ...)
end

local function router_callrw(router, bucket_id, ...)
return router_call_impl(router, bucket_id, 'write', false, false, ...)
end
Expand All @@ -710,6 +718,10 @@ local function router_callbre(router, bucket_id, ...)
return router_call_impl(router, bucket_id, 'read', true, true, ...)
end

local function router_callbzre(router, bucket_id, ...)
return router_call_impl(router, bucket_id, 'read', true, "prefer_zone", ...)
end

local function router_call(router, bucket_id, opts, ...)
local mode, prefer_replica, balance
if opts then
Expand Down Expand Up @@ -1660,6 +1672,8 @@ local router_mt = {
call = router_make_api(router_call),
callro = router_make_api(router_callro),
callbro = router_make_api(router_callbro),
callbzro = router_make_api(router_callbzro),
callbzre = router_make_api(router_callbzre),
callrw = router_make_api(router_callrw),
callre = router_make_api(router_callre),
callbre = router_make_api(router_callbre),
Expand Down