-
Notifications
You must be signed in to change notification settings - Fork 326
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
chore: Creating a unified mechanism to generate blocklists for ckBTC and ckETH #3401
base: master
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks a lot @THLO for making this script! Some minor comments and python newbie questions.
# | ||
# 2) Run this script as follows: | ||
# | ||
# python generate_blocklist.py [currency (BTC or ETH)] [path to the SDN.XML file] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
could we specify the output file as a 3rd argument instead of using the constant BLOCKLIST_FILENAME
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or alternatively, just write output to stdout?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, I can introduce a third argument.
Note that the script writes stuff to stdout: The written file is supposed to be a valid Rust source file whereas the extracted addresses are written to stdout, which could be used for other purposes including sanity checks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added the argument.
blocklist_file.write('#[cfg(test)]\nmod tests;\n\nuse bitcoin::Address;\n\n') | ||
blocklist_file.write('/// The script to generate this file, including information about the source data, can be found here:\n') | ||
blocklist_file.write('/// /rs/cross-chain/scripts/generate_blocklist.py\n\n') | ||
blocklist_file.write('/// BTC is not accepted from nor sent to addresses on this list.\n') | ||
blocklist_file.write('/// NOTE: Keep it sorted!\n') | ||
blocklist_file.write('pub const BTC_ADDRESS_BLOCKLIST: &[&str] = &[\n') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: is there a multiline string concept in Python, similar to what Rust has? This could simplify readability. Similar comment for the other preamble/postamble.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think Python uses """
for multi-line strings
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fair point. I'll try to improve readability.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
def store_blocklist(currency, addresses): | ||
blocklist_file = open(BLOCKLIST_FILENAME, 'w') | ||
|
||
if currency == 'BTC': |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: maybe that's overkill for now but it might make sense to have a writer per currency, so that we don't have to intertwine if/else statements for preamble/body/postamble
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought that it would probably be overkill. I'll see what I can do (without spending too much effort).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It didn't take all that much effort. Please check out the new version!
if currency == 'ETH': | ||
address = address.lower() | ||
if address in INVALID_ADDRESSES: | ||
blocklist_file.write(' // ' + address_prefix + '"' + address[offset:] + '"' + address_suffix + ' (Invalid address prefix)\n') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure I would handle invalid addresses in that script:
- Separation of concerns: the purpose of that script is to parse the SDN into a list of addresses that can be understood by our Rust code. Our Rust unit tests should catch any invalid address.
- The stated reason there
(Invalid address prefix)
cannot work for all invalid addresses the script tries cannot really determine the reason why an address is invalid (it could try to parse the addresses but I'm not sure it's worth it, in particular because the parsing will be done by another library that we use in our canisters.)
WDYT?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just added it for the sake of convenience. Otherwise, we always run into the issue that the Rust unit tests first need to catch the invalid address(es), which requires an additional update step.
I expect/hope that the guys compiling the list won't introduce many more invalid addresses.
But your comment regarding separation of concerns is valid, of course. If you feel that this concern trumps convenience, then I can remove the check.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I decided to remove the check for the sake of simplicity.
for address in addresses: | ||
# Ethereum addresses are case-insensitive. By contrast, only Bech32 Bitcoin addresses are case-insensitive. | ||
if currency == 'ETH': | ||
address = address.lower() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would leave the address as is: while the address is case-insensitive, the case sensitivity is used as an optional checksum (see EIP-55). Also I have the impression that the generated list may no longer be sorted and this also produces a large diff with the existing blocklist.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agree with @gregorydemay that we shouldn't do the case conversion here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, I'll remove the check.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
addresses = [] | ||
|
||
# Generate the currency-specific suffix. | ||
currency_suffix = 'XBT' if currency == 'BTC' else 'ETH' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
dumb question: why is XBT
used for BTC
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"XBT" is the Bitcoin ticker that complies with the ISO 4217 standard. According to this standard, currencies not tied to a specific nation begin with the letter “X”, and "BT" stands for "Bitcoin". 🙂
Co-authored-by: gregorydemay <[email protected]>
blocklist_file.close() | ||
|
||
if __name__ == '__main__': | ||
if len(sys.argv) < 3: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, I'm a bit late to the party. Just a suggestion: the Python argparse library is pretty powerful and e.g. generates useful help/error message for the script CLI. Could be useful here to e.g. restrict currency in {BTC, ETH}
and provide a default value for filename
.
}''' | ||
|
||
def format_address(self, address): | ||
return 'ethereum_address!("' + address[2:] + '")' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also a small suggestion: I find f-strings generally easier to read, it is for sure also a matter of taste :)
This PR adds a Python script to generate blocklists for ckBTC and ckETH based on the OFAC SDN list.
There are some differences in the produced blocklist files compared to the current files:
Differences to the current ckBTC blocklist file:
Differences to the current ckETH blocklist file:
from_address
in the functionis_blocked
has been changed toaddress
to make it consistent with the corresponding function for ckBTC.Corresponding ticket: https://dfinity.atlassian.net/browse/XC-250