-
Notifications
You must be signed in to change notification settings - Fork 30
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Do not merge] Add support for subaddresses #23
Conversation
caveat: a user can see if another user's subaddress is stored on the server by querying with a bad view key
All addresses stored prior *must* have been primary addresses, therefore explicitly mark them all as primary addresses
d8cbb3d
to
e8812bd
Compare
I'm curious to why you choose to return |
TL;DR because I believe a timing attack could theoretically still leak whether or not it's stored, and figured since it's not a perfect defense, may as well return the clearest possible error to users who fat finger the view key (or copy the wrong one). There are 2 cases that need to be handled: On (1) the signup case, if a user provides a duplicate subaddress but different view key than one already stored, it would need to return successfully without writing the create request to the db, otherwise a failure response indicates the subaddress is already stored (in the happy case, when you provide a fresh subaddress/view key, it writes the create request to the db). Not writing the create request to the db takes less time than writing it to the db. So someone should be able to theoretically infer if the account is already stored or not by timing how long the response takes, regardless of what view key they provide. On (2) it's a bit more subtle. If you hit a route and the server sees the subaddress is stored, it will compare the view key provided in the REST request with the one stored. This would take more time to do than if the subaddress simply were not present in the db in the first place. Therefore someone recording response times could potentially still learn that the view key they're providing is incorrect, and subaddress is stored or not. So I figured as long as it's not a perfect defense, might as well return The reason I suggested actually storing all address/view key pairs as a solution is to guarantee the round trip time to server is expected not to reveal any additional information to someone who doesn't have the correct view key. EDIT: clarification |
Well, okay.. That kinda makes sense 😃 The timing attack part. Wouldn't that require the attacker to be on the same server to make sense? |
I think you're right there's a solid chance the timing attack is impractical to pull off in a lot of (maybe even all?) cases. And perhaps that's a good enough reason to handle the return cases as described (signup always succeeds, others fail with non-descript error), rather than just assume all servers are automatically compromise-able by this theoretical timing attack. But I would also guess for certain honest users and certain circumstances, they may be more likely to be vulnerable. E.g. maybe a user who runs the server on machine with a slow HDD or running low on space or other poor specs, and is only running monero-lws so response times are more consistent. Even then I'm not sure how feasible it is. Maybe there is also some way for an attacker to hit the server in a way that would make it more likely to pull off with accuracy. I'm not sure. It feels like leaving a potential blind spot that I felt more comfortable either just assuming is compromised, or removing the blind spot entirely. But perhaps you are right that maybe it's worth assuming it may be difficult if not impossible to pull off. Still, I wouldn't feel comfortable assuming I'm safe from the attack. I originally wrote the code this way we're discussing, and included a comment that there may be a timing attack vulnerability and it just felt wrong to leave in and justify the more complex behavior to always return success on signup, even when not technically succeeding and not writing the request to the db |
So, assuming your environment has not been compromised, timing attacks are off the table. Edit: No matter what, one should always assume a hostile environment, and all users are not friendly. Just in case 😃 BTW: Happy new year 🍻 |
Upon re-reading prior discussions, I forgot I didn't handle the case when there are 3+ outputs in a tx involving subaddresses. Will implement it as part of this PR too.
The server could implement a lookahead when scanning for outputs like wallet2 does. So for all primary addresses stored, it could also derive X subaddresses (50*200 is the default in wallet2), and then compare against all of these subaddresses when scanning for outputs. This would be harder to scale for a server (10k subaddress comparisons per output, for all primary addresses stored on the server). Generally this would also be a bigger change to the light wallet REST API in that the tx's and outputs and info returned for a primary address are also from all subaddresses. I think a larger change to the API is ultimately warranted to support subaddresses in the most optimal way (also including support for the account/subaddress abstraction). The idea with this PR was to support them in a way that meshes with the API today without too much hassle, and later on think through a larger API change and collaborate with other light wallet ecosystem folks. I discussed this^ a bit with @vtnerd over DM, and you can see some related discussion on it here.
Alternatively, without too much hassle we could maintain the format of the API, but on the backend require the primary address to be stored first before allowing new subaddresses to be added to the watch-list via
I wouldn't say this with 100% confidence. I read this as "the attacker needs to have access to the environment to pull off the timing attack", but I think in theory the attacker may be able to pull it off without special privileges over the environment where the server is hosted. For example, if the user hosting the server has a disk that happens to write extremely slowly, then I think they'd be vulnerable to the timing attack in the signup case (i.e. in this case, it's just dependent on the user's existing environment, and isn't something the attacker necessarily needs to have control over). Also: happy New Year to you as well!! 😃 an enjoyable discussion to cap the year :) |
Look-ahead on 10K sub-addresses could become expensive on a server hosting wallets for multiple users, and is probably not a good idea for a program with If sub-addresses are treated as new accounts, that's pretty much it. But if they are linked to a primary address, I think some kind of short lived session should exist, created by a login primary-address + view-key, or the adding of sub-addresses should be an extension to the |
The wallet is light, not the server :) But did you measure how much resources it takes for 10k subaddresses lookahead? If I had to guess it should be next to unnoticeable with a slight RAM increase. Subaddresses use a lookup table so there is no impact on scanning time. |
Syncing 10K blocks on a i7-5500/16GB gobbles up 80% CPU for 3-4 mins. one primary and no account/sub-addresses used, both client and server running off SSD. Not exactly light load. But, if this thing can handle, say 100 primary, with 10K look-ahead, without noticeable impact on performance, that would be nice 😃 |
3-4 mins sounds a bit long for scanning 10k wallet blocks, but I don't have
If implemented correctly subaddresses shouldn't have any impact on scanning time, only RAM, but this has to be tested :) |
The 10K blocks synced was done by Connecting to |
The 10,000 subaddress comparisons per output could just be a hash table lookup (constant time avg lookup), but I would think the 10,000 subaddress public spend keys would be read into memory for each primary address stored on the server. Napkin math: 10,000 * 32 bytes * number of primary addresses stored = 0.32 mb per primary address stored = 1 GB for ~3,000 primary addresses = 10 GB for ~30,000 addresses = 100 GB for ~300,000 addresses etc. So it would become a significant consumer of RAM when we're talking 10's-100's of thousands of primary addresses stored on the server. An additional caveat, as mentioned in that other subaddress PR, because of the 3+ output case, when scanning tx's involving subaddresses searching for outputs belonging to subaddresses, I'll need to use the additional tx public keys stored for each output to derive shared secrets for each additional tx public key. This will add to scanning time depending on how many 3+ output tx's involving subaddresses there are. Naturally it makes sense to discuss the resource cost of a lookahead approach when scanning here, but just to be clear I think deciding on this optimal lookahead approach is out of scope for this PR, mostly since it would need a larger change to the light wallet API which is upstream The goal with this PR is to get subaddresses working without a change to the API. Which makes sense to do for the long term anyway imo in case a user wants to more granularly work with subaddresses.
That's this PR's approach to get initial subaddress support working :)
This is doable without adding to the endpoint's expected params instead of the approach taken in this PR, like how I mentioned starting at "Alternatively" in the above comment. Just require the user hit the server with the primary address first. This is a tradeoff though: if you give the server your primary address and view key, the server can derive all subaddresses and see all outputs belonging to your account. Maybe users might not want to give their primary address to the server for this reason, to allow scanning of some subaddresses on a case-by-case basis when the user wants. Figure it would make sense to leave as an option for users |
I don't think the memory requirement are scary at all. If you need to host 10K primary accounts with sub-addresses, you probably can afford dedicated hardware to run it on. Also, I don't see the use-case, where you would need a hosted wallet, for single dedicated sub-addresses, but maybe that's just me. Wasn't sub-addressed supposed to be single use anyway? Yeah, we got a bit side tracked withe the look-ahead discussion. I still think a |
Fair point, though at some point it would be infeasible (3 million addresses = 1 TB of RAM, 30 million = 10 TB etc.) [Edit: though at that point, some type of sharding solution probably makes sense to scan on different shards in parallel anyway]
Fair enough, I agree not a strong use case. Probably would be very few users if any who wouldn't send their primary address to a light waller server if they're using a light wallet server and were given the option not to. Not much reason to do that. So perhaps it would be a reasonable solution in this PR to require the primary address be stored before storing any subaddresses for an existing view key. The only challenge with this approach then is dealing with the lookahead in a solid way. I'm ok with the non-descriptive error approach as well (even though I would still feel uneasy about the timing attack vector), but curious to hear your thoughts on the solution to allow duplicate subaddresses to be stored for different view keys, and for admins to manually accept create/import requests by referencing a view key rather than an address? This should definitively close the timing attack vector. In summary, there are 4 options on the table for this PR: I personally still like option 4 if we don't want to leak that a subaddress is stored. It is simplest to reason through I think |
I guess I don't understand your option 4. Is the view-key derived per sub-address or the primary view-key? I like the login with primary address/key thingy. I also think, you are much deeper into this Monero thingy, than me, so I'm probably better served by your instinct 😃 |
3+ output transactions involving subaddresses place derivations in tx extra additional tx pub keys. Be sure to check additional derivations when scanning outputs sent to subaddress
Update: scanner now picks up on 3+ output tx's involving subaddresses
In option 4, there would be 0 server-side validation that the subaddress is derived from the secret view key. The server would simply assume any subaddress / secret view key pairs it receives are valid and store them. Even if the provided subaddress is not derived from the secret view key, it will store in the db and the server would attempt to scan outputs sent to this invalid subaddress / secret view key pair. The scanner simply would not pick up on outputs sent to this invalid pair, and the server would never return tx's associated with this invalid key pair ( Option 3 has grown on me too for the same reason: it fails early on and the DB is kept clean with valid subaddresses. I'll think further/look to wallet2 some more on handling the lookahead in that option.
You've been around longer than me so your instincts/insights have been super helpful :) EDIT: like on the subaddress use case thing discussed above where I thought maybe users might not want to send primary addresses to the server if they can avoid it. Helpful to realize that's probably not something people would want to do, and so option 3 actually seems pretty solid |
Heads up: going to start using the term "standard" address instead of "primary" since it seems more widely referred to as "standard" address. More importantly, I think this is what both you guys (@trasherdk and @selsta) were initially thinking too, but I'm starting to think the simplest (and most future-proof) thing would be for the server to just not accept subaddresses over the API at all, rather:
With this approach, the API wouldn't really change much, just that the aforementioned endpoints would return wallet-level info by default, rather than standard address-level info. If there is demand from lots of individual light wallet users to pre-load tons of subaddresses before accepting payments to them, the server can also support a user-specific desired lookahead over the I think this API approach is also most in line with what's brewing with JAMTIS. In JAMTIS, the server wouldn't even be able to separate data by subaddresses, and would just scan wallet-level potential received outs. The client in JAMTIS would manage the lookahead and subaddress state. Ultimately, it just does not seem desirable to treat subaddresses as distinct scanned wallets at the API layer and on the backend as this PR does. As discussed above, users aren't going to use the feature like that. Users will want wallet-level data, and that seems to be the long term direction we are headed anyway with JAMTIS. It doesn't really seem useful to allow subaddresses in place of the address param in the API, and it brings annoying complications with it (like how to safely deal with the leak as discussed above). Planning to submit a PR to the light wallet REST API spec fleshing this out. And I'll probably end up closing this PR and opening a new one implementing that. |
This has already been implemented, so the PR is no longer relevant. |
PLEASE READ
Submitted a PR to the light wallet spec highlighting a different way of supporting subaddresses: monero-project/meta#647
I anticipate closing this PR, and opening a new PR implementing the proposal accepted there.
Update [1/3/21]: scanner picks up on outputs sent to subaddresses when there are 3+ outputs in a transaction involving subaddresses
Overview
This PR adds support for subaddresses in
monero-lws
. Allmonero-lws
server endpoints that take an address can instead take a subaddress with this PR. For example, to login with a testnet subaddress:I implemented a migration to add a falsey
is_subaddress
boolean to every address stored in the db, since all addresses already stored must be primary addresses. The migration runs the first time you runmonero-lws
, then doesn't run again. I took inspiration frommonero/src/blockchain_db/lmdb/db_lmdb.cpp
in how to structure the migration.Also, drawing attention to 2 things in particular in this PR:
1. Potential subaddress leak
If you give someone your subaddress, they can use your subaddress to query the server to see if your subaddress is stored on the server, but cannot gain information beyond the fact that it is or is not stored. They would still need your view key to get more information beyond the fact that the subaddress is stored. The fact that it's stored is revealed to someone calling any of the endpoints and providing a subaddress with a different view key from the one already stored, which would return with a
"bad view key"
error iff already stored. This leak does not affect primary addresses.I figure the added complexity to avoid this leak may not be worth the gain for
monero-lws
, since if you give someone your subaddress, and also give them the endpoint of yourmonero-lws
server, you may already be comfortable with them knowing you're monitoring that subaddress on your own server already. I figure this would be a worse implication for MyMonero (and not monero-lws) since you probably don't want to tell someone who you give your subaddress to that you're using MyMonero by default (they would be able to find out if you're using MyMonero with just your subaddress), and therefore it's worth preventing in MyMonero's case if they start supporting subaddresses.Without modifying the light wallet API, the simplest safe solution to prevent this I can think of (with no timing attack) would be to allow storage of all subaddress/view key pairs (i.e. allow duplicate subaddresses across different view keys). monero-lws admins could approve create/import requests by referencing the view key, rather than address (so an admin accepting a single view key could approve multiple create/import requests at once). There would be no uniqueness constraint on addresses this way, but rather on address/view key pairs. The plumbing to get this working would be a bit more work, but happy to do it if it's the desired solution.
2. Equality check on secret keys
The equality check on
crypto::secret_key
uses libsodium's constant time crypto_verify_32. This is critical. If it didn't compare in constant time, and compared thelws::db::view key
types with a naive byte comparison, it could leak the view key via a timing attack.