Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Write a test showing InProcKeystore can restart successfully #120

Open
1 task done
guillemcordoba opened this issue Jan 19, 2024 · 15 comments
Open
1 task done

Write a test showing InProcKeystore can restart successfully #120

guillemcordoba opened this issue Jan 19, 2024 · 15 comments
Assignees
Labels
bug Something isn't working

Comments

@guillemcordoba
Copy link

guillemcordoba commented Jan 19, 2024

This issue has diverged into android load timing. I've split out the load timing into a separate issue: #139

  • This issue should now just be writing a test that shows in-proc-keystore loads after a restart.

Original Issue text retained:


I've tried to use directly an InProcKeystore with holochain, to reduce the startup time of my Android app. It works fine when first started, but when the conductor restarts, signing zome calls fails with Query returned no rows. This is my code:

    let config = get_config(&fs.keystore_config_path(), passphrase.clone())
        .await
        .map_err(|err| crate::Error::LairError(err))?;

    let store_factory = lair_keystore::create_sql_pool_factory(&fs.keystore_store_path());

    // create an in-process keystore with an in-memory store
    let keystore = InProcKeystore::new(config.clone(), store_factory, passphrase.clone())
        .await
        .map_err(|err| crate::Error::LairError(err))?;

    let client = keystore
         .new_client()
         .await
         .map_err(|err| crate::Error::LairError(err))?;

     let lair_client = MetaLairClient::new_with_client(client.clone())
         .await
         .map_err(|err| crate::Error::LairError(err))?;

    let connection_url = config.connection_url.clone();

    let lc = lair_client.clone();

    log::info!("Lair keystore spawned");

    tauri::async_runtime::spawn(async move {
        let config = crate::config::conductor_config(
            &fs,
            admin_port,
            connection_url.into(),
            override_gossip_arc_clamping(),
        );

        let conductor = Conductor::builder()
            .config(config)
            .passphrase(Some(passphrase))
            .with_keystore(lc)
            .build()
            .await
            .expect("Can't build the conductor");

        let p: either::Either<u16, AppInterfaceId> = either::Either::Left(app_port);
        conductor
            .clone()
            .add_app_interface(p)
            .await
            .expect("Can't add app interface");
    });

To do that, I've needed to fork holochain and add this method.

I've tried to reproduce the issue by implementing a test in this repository, which is mostly a copy of this test, here, but that fails with another error:

     Running tests/in_proc_sql.rs (target/debug/deps/in_proc_sql-05e6d2769c280074)

running 1 test
test in_proc_sql ... FAILED

failures:

---- in_proc_sql stdout ----
connectionUrl: unix:///socket?k=CZHNtNI5GCBYmxuMdWYYVSS3-7R6wOZkx_4iOI5zfys

# The pid file for managing a running lair-keystore process
pidFile: /pid_file

# The sqlcipher store file for persisting secrets
storeFile: /store_file

# Configuration for managing sign_by_pub_key fallback
# in case the pub key does not exist in the lair store.
# - `signatureFallback: none`
# - ```
#   signatureFallback: !command
#     # 'program' will resolve to a path, specifying 'echo'
#     # will try to run './echo', probably not what you want.
#     program: "./my-executable"
#     # args are optional
#     args:
#       - test-arg1
#       - test-arg2
#   ```
signatureFallback: none

# -- cryptographic secrets --
# If you modify the data below, you risk losing access to your keys.
runtimeSecretsSalt: ooG0RUxi1JEdD3X29eX-wQ
runtimeSecretsMemLimit: 67108864
runtimeSecretsOpsLimit: 2
runtimeSecretsContextKey:
- PZf3TQcscJKOfY3NM2T5oYjuV7SNLBr1
- Zl7vhBHMFgEhudDk-6dWhvAbYJJYS1Pznh8OcXBRs_Ac3cQPcaXaPXnYfvOqjCICSQ
runtimeSecretsIdSeed:
- se1u7x7-l5B8Y0WnqnQ8zQOomgFJoxo0
- 6RIMEN_B-YJEUFDKkMi9yDgAQyCvpDQg3uKPDgJ7CNZLHCSLE5El_4SLam16B97aRQ

[
    Seed {
        tag: "test-tag",
        seed_info: SeedInfo {
            ed25519_pub_key: BinDataSized<32>(oVB-1RAn5jaa7age6crq_0gkZ1bg0aNS_II7Aj2ycr4),
            x25519_pub_key: BinDataSized<32>(w1vx2mOy3RXxBok6VktHnWT-4L-rfn2-2n9qWqcl8h8),
            exportable: false,
        },
    },
    DeepLockedSeed {
        tag: "test-tag-deep",
        seed_info: SeedInfo {
            ed25519_pub_key: BinDataSized<32>(heKKE7_gIqWnthzSYR1DwCKvjGWQ8-rc9jtJcUl2iPI),
            x25519_pub_key: BinDataSized<32>(Y-2JERtk89mS2E3pMP4e6TowiKUm4ZIy8Cgo8NaC9Hg),
            exportable: false,
        },
    },
]
thread 'in_proc_sql' panicked at 'called `Result::unwrap()` on an `Err` value: {"error":"InternalSodium"}', crates/lair_keystore/tests/in_proc_sql.rs:124:14
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace


failures:
    in_proc_sql

test result: FAILED. 0 passed; 1 failed; 0 ignored; 0 measured; 0 filtered out; finished in 4.28s

error: test failed, to rerun pass `-p lair_keystore --test in_proc_sql`

Is InProcKeystore supposed to work? Am I doing something wrong? This would be a big improvement on the UX experience of mobile apps.

@ThetaSinner
Copy link
Member

Yes it's supposed to work, it's the default in Holochain if you don't configure an external Lair to connect to.

@neonphog any ideas about these errors?

@guillemcordoba
Copy link
Author

It actually doesn't @ThetaSinner . The default is to run an lair-keystore server which uses StandaloneServer which uses the IpcKeystoreServer, which is not the same as the InProcKeystore. Running the server is what I've found to be really slow (5 seconds) on mobile, and I was hoping that doing away with that would makes things much faster.

@neonphog
Copy link
Collaborator

neonphog commented Jan 23, 2024

@guillemcordoba - The main difference between the ipc keystore and the in proc keystore is whether they use an ipc channel. The "Standalone" is a bit misleading, as it actually will also be run in process.

I suspect the startup time has more to do with the argon password hashing algorithm, as it is designed specifically to use a lot of memory and CPU, and is using the "Moderate" settings which largely target desktops instead of mobile devices.

Can you run a test where you set the mem limit and cpu/ops limit to mem min and ops min respectively?

For testing, you could potentially set them here. If that makes a difference, we can figure out the best way forward for actually configuring that...

@neonphog
Copy link
Collaborator

Not sure it's more readable, but looks like LairServerConfigInner::new respects PwHashLimits::with_exec if you want to set the limits that way.

@guillemcordoba
Copy link
Author

Thanks @neonphog that helps. I'll do some testing and see if that reduces the start time.

@abe-njama
Copy link

@guillemcordoba were you able to successfully test withe the configs that @neonphog shared? What's the experience like?

@guillemcordoba
Copy link
Author

I just did, I tried to do it with this code:

pub async fn write_config(
    config_path: &std::path::Path,
    passphrase: BufRead,
) -> LairResult<LairServerConfig> {
    let lair_root = config_path
        .parent()
        .ok_or_else(|| one_err::OneErr::from("InvalidLairConfigDir"))?;

    tokio::fs::DirBuilder::new()
        .recursive(true)
        .create(&lair_root)
        .await?;

    let mut config = LairServerConfigInner::new(lair_root, passphrase).await?;

    config.runtime_secrets_mem_limit = sodoken::hash::argon2id::MEMLIMIT_MIN;
    config.runtime_secrets_ops_limit = sodoken::hash::argon2id::OPSLIMIT_MIN;

    let mut config_f = tokio::fs::OpenOptions::new()
        .write(true)
        .create_new(true)
        .open(config_path)
        .await?;

    config_f.write_all(config.to_string().as_bytes()).await?;
    config_f.shutdown().await?;
    drop(config_f);

    Ok(Arc::new(config))
}

Using this code, lair doesn't even start. As a matter of fact, I lose all logs and I don't know at which point it gets stuck, which is really weird. But I'm certain that the whole holochain start up process doesn't complete at all. I can't offer much more on my end, I think this needs thorough investigation on how we can make both lair-keystore and holochain run as fast as possible in mobile.

@abe-njama
Copy link

@neonphog from Guillem's update, it looks like we might want to prioritize debugging lair startup on mobile or optimizing the process. We can discuss this internally in light of the other competing priorities.

@neonphog
Copy link
Collaborator

@guillemcordoba - can you give this a try: holochain/holochain#3391
Thanks!

@guillemcordoba
Copy link
Author

Will do! Thanks @neonphog

@github-project-automation github-project-automation bot moved this to Backlog in Holochain Mar 11, 2024
@abe-njama abe-njama moved this from Backlog to In progress in Holochain Mar 15, 2024
@guillemcordoba
Copy link
Author

Hi @neonphog , sorry for taking long with this.

I just tried it, and it hasn't made any difference as far as I could tell. I timed the startup times with the setting set to Interactive and the setting set to None, and both startup times took around 7s with no noticeable difference in them.

Any ideas on things that we could try next?

@guillemcordoba
Copy link
Author

Also ping @abe-njama about this.

@abe-njama abe-njama moved this from In progress to Ready in Holochain Sep 2, 2024
@neonphog neonphog self-assigned this Oct 22, 2024
@jost-s jost-s removed the status in Holochain Oct 22, 2024
@jost-s jost-s moved this to Ready for refinement in Holochain Oct 22, 2024
@zippy zippy added the bug Something isn't working label Oct 22, 2024
@neonphog neonphog moved this from Ready for refinement to Ready in Holochain Oct 22, 2024
@neonphog
Copy link
Collaborator

@guillemcordoba - I've split out the timing into a separate issue here: #139 -- This issue will be closed with a test that shows InProcKeystore is able to load after restart.

@neonphog neonphog changed the title InProcKeystore fails on restart Write a test showing InProcKeystore can restart successfully Jan 9, 2025
@matthme matthme moved this from Ready to In progress in Holochain Jan 10, 2025
@matthme matthme self-assigned this Jan 10, 2025
@matthme
Copy link

matthme commented Jan 10, 2025

I don't know how to reproduce the InternalSodium panic because the link to the test is a 404.

@matthme
Copy link

matthme commented Jan 13, 2025

Test written with holochain/holochain#4608

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
Status: In progress
Development

No branches or pull requests

6 participants