Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Page: Editor] [Rust] Exercise doesn't build (when external packages are required?) #26

Open
wischi-chr opened this issue May 21, 2021 · 12 comments

Comments

@wischi-chr
Copy link

This happened on the gigasecond exercise in rust:
https://exercism.lol/tracks/rust/exercises/gigasecond/edit

Steps to reproduce

  1. Copy-Paste the .meta/example.rs into the editor
  2. Run the tests
We received the following error when we ran your code:
WARNING: student did not upload Cargo.lock. This may cause build errors.
error: no matching package named `chrono` found
location searched: registry `https://github.com/rust-lang/crates.io-index`
required by package `gigasecond v2.0.0 (/mnt/exercism-iteration)`
As a reminder, you're using offline mode (--offline) which can sometimes cause surprising resolution failures,
if this error is too confusing you may wish to retry without the offline flag.
@iHiD
Copy link
Member

iHiD commented May 21, 2021

Thanks for reporting 🙂

Yes. The test runners are run offline so do not have access to the internet.

cc @exercism/rust - It's probably worth changing the examples.

@coriolinus
Copy link
Member

That's a tricky one: the point of the gigasecond exercise is to teach how to use external crates. It may be possible to solve that exercise using the standard library (I haven't tried), but it will be both harder and less idiomatic.

@ErikSchierboom
Copy link
Member

Will there be a wide variety in crates used? If not, you could pre-install the most used packages in the test runner.

@coriolinus
Copy link
Member

Given access to a database of student solutions, I could try to answer that question analytically and give a real answer. As things stand, I can only go by personal experience, which provides anecdotes, not data.

Given that caveat, I'd say that it's almost certainly possible to cover the majority of use cases with a careful selection of crates to include. There's precedent: the rust playground makes available the 100 most-downloaded crates.

That said, this is a bandaid for the online environment, not a real solution. There will always be students doing interesting and cool things which require crates outside the top N for any reasonable N that we include. I suggest that the website should present a notification in the case of this kind of error that the student attempt their solution on their local machine.

@ErikSchierboom
Copy link
Member

@coriolinus For the moment, you could have the test runner output a special message if a crate is missing and give instructions on what to do. That would make it clear to the student what they need to do.

@iHiD
Copy link
Member

iHiD commented May 22, 2021

I like both the idea of bunding commonly used crates into the test runner, and also the extra message. We could probably grep the existing solutions to see what crates people commonly use?

@coriolinus
Copy link
Member

Given a large corpus of Cargo.toml files, it's easy to get the list of the top 100 dependencies used:

rg 'dependencies' --glob Cargo.toml --files-with-matches | 
xargs -L1 toml2json | 
jq -r '.dependencies | keys  | .[]' | 
sort | 
uniq -c | 
sort -rn | 
head -100

A variant of that command is likely to work well given the database of student solutions. However, the tricky part is that students may or may not upload their Cargo.toml file to exercism.

There are additional greppable markers; extern crate foo; used to be a reliable signal that the foo crate was in use. However, that usage was phased out in the 2018 edition. These days, without Cargo.toml, it's pretty hard to identify external crates using only CLI tooling.

@iHiD
Copy link
Member

iHiD commented May 25, 2021

@coriolinus exercism/groovy-test-runner#3 got me thinking that we could host a mirror of crates.io with the top ~200 crates in it. Any reason that's a stupid idea?

@coriolinus
Copy link
Member

The most straightforward response is that hosting a partial copy of crates.io makes life easier for us if the docker image is allowed to reach it, but doesn't do anything for us if the docker image isn't. If we're thinking of allowing the docker images firewalled access to a short list of external websites, then whitelisting crates.io and github.com will probably give us the same properties without requiring the effort of actually setting up a mirror.

@iHiD
Copy link
Member

iHiD commented May 25, 2021

I wouldn't want to give internet access as it opens us up to being used as an attack vector for other sites (ie we're used to DDOS them across test runs) and also it adds a financial risk as someone could run up a large AWS bill for data transfer outside of AWS (internal transfers are free/cheap - external transfers rack up). I also just don't want to give access to the internet on these machines as it's quite possible (probable?) that someone will break out of any containers we have any be able to change and restrictive rules we put in place.

But if hosting a partial copy would work, then that's a possibility we can look into (almost certainly post launch).

@coriolinus
Copy link
Member

I guess I don't understand your intent. There's lots of ways bad actors could abuse an unlimited internet connection; I was thinking in terms of crypto mining, but I guess it depends on the attacker's needs. Point is, we agree that an unlimited connection is a bad idea.

It sounded to me like your plan goes like this:

  • establish crates.exercism.io
  • allow test runners to connect to that (and only that) site
  • perform some necessary magic within the docker container to ensure that cargo connects to the exercism clone of the site.

If that were the plan, then my point was that whitelisting both crates.io and github.com would have the same risk profile, without needing us to go to the effort of setting up a crates.io clone. If that's not the plan, then I don't understand what the plan was supposed to be.

@iHiD
Copy link
Member

iHiD commented May 26, 2021

We can allow access from one AWS machine to another without allowing any internet access at all. So the only connections available to the machine (at the AWS level) is to a specific other machine.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants