BUG: ICS721 Recursive Loop #479
Comments
This was entirely unnecessary to do on the active channels of the testnet. It could just as easily have been demonstrated in a less intrusive way, even locally. Proof could have been provided without making the life of everyone else suck. Even if you thought it was better to test this live, you could have had it running for a short while and then terminated it. You knew this worked at least 4 days ago and have not stopped it: https://discord.com/channels/669268347736686612/1074987031270268958/1084200758334996561 This is not, however, a new issue, nor is it specific to ICS721: https://github.com/0xekez/cw-ibc-example/tree/zeke/ibc-replay All in all, I think that next time you want to show an exploit, you should do it in a nicer way. This was unnecessarily painful for all the other participants (and the organizers). |
Be aware that the prior #246 claim was updated after my contract notably disrupted the Juno <> Stars connection. The original bug report had been dealt with in Zeke's ics721-tester contract (infinite in-contract loops are futile as properly configured relayers impose gas limits and simulate transactions before sending them). Relevant Links
|
I've been in touch with an organizer, and we agreed to sit back and watch the show. The testnet exists to stress-test the protocol, not to soothe bruised egos or supplement validator incomes. If we don't stretch the interchain standards to their limits now, we'll pay the price later. The Ark Protocol team prevailed and produced a self-relaying solution.
I accomplished this locally weeks prior, as showcased in the integration tests. Alas, a false negative in the ts-relayer test suite misconstrued it as inert until I deployed it on the live connection.
If you'd perused the IBC-replay description and my issue, you'd see they're worlds apart. Zeke's protocol hinges on an ack to keep the loop going – a crucial difference, demanding full control over the IBC contract. Relayers set up their ports based on trust. An untrusted port can be dismissed at will. A highly trusted port acting in an untrusted way poses a significant threat with significant capital in the balance. Zeke's competence is undeniable, and I have no doubt that he would have taken the required measures if he'd been aware of the issue beforehand. To insinuate that he knew about the exploit and deliberately left the protocol vulnerable is absurd.
In future IBC games, try seeking solutions instead of resigning to the circumstances at the first sign of trouble. Your swift reaction wasn't inconsequential, influencing both my decision and the organizer's to let events unfold. |
We don't know whom you have been in touch with. At least from IRIS side, we have no idea about this. |
If that is true, it was unnecessary and didn't actually stress-test anything beyond the initial phase when its effectiveness was proven. One of the effects that can be seen, on the other hand, is that testers started dropping off because the only way to proceed was to have the technical abilities to get to a self-relaying solution. I'm sure setting the bar that high for the GoN was not intended by the organizers.
Not sure they did, haven't seen it. Their CLI, while excellent, currently only provides the ability to flush channels.
Actually, they're not. They use different mechanisms and are, indeed, different. But they both exploit the exact same issue which is not new and is still very open: IBC relaying economy. It's the only real issue here and is not new. That said, I do applaud you for finding this issue. If it can be mitigated or solved on the contract, or even protocol, level for this particular way of exploiting IBC relaying economy: Awesome! I will, however, stick to my initial assessment that the way this was done was unnecessary and could have been done in a way that didn't hurt participation.
I'm afraid my only resignation was to stop spending time relaying a hopeless situation that was deliberately being sabotaged. I spent 3 days getting a self-relaying solution up and helped a bunch of people get their own self-relaying up and running: #500
Hmm, I'm not sure mine was the ego that was bruised. |
Me neither. I also agree with @gjermundgaraba: it was and still is a pain for all participants. I was also thinking of testing this infinite loop, but by setting up my own IBC channel. This way it wouldn't harm anyone except myself. |
@taitruong @gjermundgaraba See, the cool thing is, I didn't harm anyone. Because this is a testnet. |
@mccallofthewild this is an awesome attack!! thank you for the detailed writeup and all the thought you put into it. I thought about very similar attacks while writing the contract. do you have any suggestions about how to defend against this in the ICS-721 contract beyond the self-relaying approach i described here? i thought long and deeply about this class of problem, and pretty much came to the conclusion that this is a bug in IBC and relayer software, and not something I can defend against at the contract level. Oak even flagged this as an issue in their audit report, but once again the conclusion was that this was an IBC relayer problem, and nothing could be done to defend against it on a permissionless protocol like IBC. |
also, @mccallofthewild, if you would like to contribute the test you wrote to the ICS-721 repo, i would love to merge it! |
Adding a gas_limit to the SubMsg, like cw20-ics20 does, would help. The malicious contract would not have enough gas to send another packet.
@giansalex, i've thought about gas limits a fair bit. it's not clear to me that this is a good solution, as:
|
another detail here is that this bug is entirely removed if relayers just relay packets over connections they know to be using non-malicious cw721 implementations. i'm a little unclear on how deploying a new ICS-721 contract (which has a different port than other ones) would cause a different channel to be clogged. this once again seems like an issue with relayer software to me. 🤷 open to other opinions here, these are just mine. |
yup I suggested this in the issue and started adding it:
It's a tad bit different from But it still feels like there is room for attackers to work around this. |
Something else I've considered is if contracts could grant foreign write access to certain namespaces in their k/v store. All cw721 conforming contracts share the same ownership map structure, so we could execute the write directly from the bridge contract and make the protocol immutable / gov-maintained. |
Summary of Bug
The CosmWasm ICS721 implementation lacks sufficient constraints for safe relayer operations. To demonstrate this, I created the ICS721 virus.
Overview
The ICS721 virus is a malicious contract that infects either or both sides of a CosmWasm IBC connection. It creates an asynchronous recursive loop: relayers send and time out ICS721 packets, each of which executes a malicious transfer function that mints more NFTs and sends them as new ICS721 packets, which the relayers must in turn transfer or time out, and so the loop repeats.
With only one packet required from the attacker, an IBC connection can be brought to its knees overnight. In the case of Game-of-NFTs, relayers minted over 500,000 NFTs, ran out of gas funds, and decommissioned their operations, and organizers had to set up new IBC channels for participants to complete their tasks.
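The amplification described above can be illustrated with a toy model. This is only a sketch under stated assumptions, not the virus contract itself: it assumes every relayed packet (whether received or timed out) triggers the malicious transfer hook exactly once, and that each invocation emits a fixed number of new packets. The names `simulate_loop`, `fanout`, and `seed_packets` are hypothetical.

```python
from __future__ import annotations


def simulate_loop(fanout: int, rounds: int, seed_packets: int = 1) -> list[int]:
    """Return the number of packets relayers must handle in each round.

    Toy model (assumption): each handled packet invokes the malicious
    transfer hook once, and the hook emits `fanout` new packets.
    """
    in_flight = seed_packets
    handled_per_round = []
    for _ in range(rounds):
        handled_per_round.append(in_flight)
        # Every handled packet is replaced by `fanout` fresh packets.
        in_flight *= fanout
    return handled_per_round


if __name__ == "__main__":
    per_round = simulate_loop(fanout=2, rounds=20)
    print(per_round[:5])   # packets handled in the first five rounds
    print(sum(per_round))  # total packets handled after 20 rounds
```

Even with a fan-out of two, the total in this model roughly doubles every round, which is why a single seed packet is enough to saturate a connection.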
Steps to Reproduce
Sequence Diagram
Written Steps With Code
4.1. Alice initiates an IBC transfer of NFT-A (of Virus A) from Chain A to Virus B Contract address on Chain B.
4.2. Alice initiates an IBC transfer of NFT-B (of Virus B) from Chain B to Virus A Contract address on Chain A.
#ibc_packet_receive entrypoints on each chain's cw-ics721-bridge, instantiating cw721 voucher contracts for each class id
🔁 loop begins:
Virus-A.Transfer on Chain A.
Virus-A.Transfer mints new NFT-A on Chain A and IBC transfers NFT-A1 and NFT-A2 to Chain B.
Virus-A.Transfer IBC transfers NFT-B back to Virus-B.
Virus-B.Transfer on Chain A.
Virus-B.Transfer mints new NFT-B on Chain B and IBC transfers NFT-B1 and NFT-B2.
The above loop continues until the packets timeout in the relayer queue.
🔁 Packet timeouts initiate their own asynchronous loops as follows:
Virus-B.Transfer on Chain A.
Virus-B.Transfer mints new NFT-B on Chain B and IBC transfers NFT-B1 and NFT-B2.
Impact
The demonstrated attack can instigate significant harm by causing relayers to self-inflict exponentially increasing packet volume and gas expenditures.
This is not necessarily a denial of service attack. While the deployed version mints 4-10 new NFTs per invocation, the most profitable variant need only transfer two packets at a time, silently and continuously collecting income from Juno's FeeShare module at the expense of relayers.
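To make the economic side of this concrete, here is a rough sketch of how a two-packet variant could drain a relayer's gas funds while a share of the fees flows to the contract's deployer. All numbers and names are assumptions for illustration: `share_percent` stands in for whatever fraction a fee-sharing module pays out, `capacity_per_round` is a hypothetical relayer throughput limit, and the fee per packet is treated as constant.

```python
from dataclasses import dataclass


@dataclass
class DrainResult:
    packets_relayed: int
    relayer_spent: int
    attacker_income: int


def drain_relayer(budget: int, fee_per_packet: int, share_percent: int,
                  fanout: int = 2, capacity_per_round: int = 100) -> DrainResult:
    """Relay packets until the relayer can no longer afford a single one."""
    if fee_per_packet <= 0:
        raise ValueError("fee_per_packet must be positive")
    queue = 1  # the attacker's single seed packet
    relayed = 0
    spent = 0
    while queue > 0 and budget - spent >= fee_per_packet:
        affordable = (budget - spent) // fee_per_packet
        batch = min(queue, capacity_per_round, affordable)
        spent += batch * fee_per_packet
        relayed += batch
        # Each relayed packet leaves the queue and spawns `fanout` new ones.
        queue += batch * (fanout - 1)
    # Assumed fee-share payout: a fixed percentage of the fees the relayer paid.
    income = spent * share_percent // 100
    return DrainResult(relayed, spent, income)


if __name__ == "__main__":
    print(drain_relayer(budget=1000, fee_per_packet=10, share_percent=50))
```

In this model the relayer's entire budget is consumed, and the attacker's income scales with the relayer's spending rather than with any cost borne by the attacker after the first packet.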
Conclusion
If ICS721 is to utilize the altruistic relayer economy of ICS20, packets must have semi-fungible assurances in terms of cost and functionality. This can be achieved via submessage gas limits and/or code id whitelists.
Optimally, users should be relaying their own transactions, which addresses most of the described risks.
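The two contract-level mitigations named above, submessage gas limits and code id whitelists, can be sketched as a simplified model. This is not the cw-ics721-bridge code: `handle_receive`, `GasMeter`, the two callbacks, and all gas figures are hypothetical, and the model assumes that a callback exceeding its cap is reverted together with any packets it tried to send. Whether these checks are sufficient is debated in the comments.

```python
class OutOfGas(Exception):
    """Raised when a callback uses more gas than its cap allows."""


class GasMeter:
    def __init__(self, limit: int) -> None:
        self.limit = limit
        self.used = 0

    def charge(self, amount: int) -> None:
        self.used += amount
        if self.used > self.limit:
            raise OutOfGas(f"used {self.used}, limit {self.limit}")


def honest_transfer(meter: GasMeter) -> None:
    meter.charge(60_000)  # assumed cost of a single ownership update


def malicious_transfer(meter: GasMeter) -> None:
    meter.charge(60_000)
    for _ in range(4):
        meter.charge(150_000)  # assumed cost of one mint plus one new IBC packet


def handle_receive(code_id, allowed_code_ids, callback, gas_limit) -> str:
    # Mitigation 1: only call into collection contracts with a whitelisted code id.
    if code_id not in allowed_code_ids:
        return "rejected: code id not whitelisted"
    # Mitigation 2: run the transfer callback under a fixed gas cap.
    meter = GasMeter(gas_limit)
    try:
        callback(meter)
    except OutOfGas:
        return "reverted: callback exceeded gas limit"
    return "ok"


if __name__ == "__main__":
    allowed = {1}
    print(handle_receive(1, allowed, honest_transfer, 100_000))
    print(handle_receive(2, allowed, honest_transfer, 100_000))
    print(handle_receive(1, allowed, malicious_transfer, 100_000))
```

The gas cap bounds the cost of each packet regardless of which contract receives it, while the whitelist restricts which code can run at all; the two checks are independent and can be combined.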
Links