Using deepmod on basecalled fast5 from latest guppy #42

Open
hasindu2008 opened this issue Dec 9, 2020 · 8 comments

@hasindu2008

In the usage page it is stated that FAST5 must be basecalled and events data must be available in them. However, it seems that the latest Guppy basecaller does not include any events data as Albacore used to do (see below). As mentioned in the readme, it is possible to convert multi-fast5 to single-fast5 using ont-fast5-api. However, I am not sure how Guppy can be asked to save events data in FAST5. Could you shed some light on this?

[Image attached to the original issue; not preserved here]

@liuqianhn
Collaborator

@hasindu2008 You are right: the latest Guppy uses a move table rather than an event table. The move table is now supported by DeepMod with --move True. Please note that we have not retrained new models or tested the old models on such data (though we have been improving it). If you have any performance results regarding this, please feel free to share them. Thanks.

@hasindu2008
Author

Do you have an example Guppy command for the latest Guppy 4 that makes it generate this move table? I have some fast5 files generated from Guppy 4.0.3 live basecalling which seem to have the FASTQ read inside but no such move table. Is it supposed to be inside the Analyses/segmentation group? In my files, that group has a few attributes but no data tables.

@liuqianhn
Collaborator

@hasindu2008 could you please share what you get from h5ls -r your-fast5 | head -n 50? In some cases, move/event tables are not available, and you need to re-basecall with the appropriate options so that move tables are generated in the fast5 files before using DeepMod.

@hasindu2008
Author

hasindu2008 commented Dec 11, 2020

/                        Group
/read_0013515e-5b4e-4588-843e-b5af4a4b87da Group
/read_0013515e-5b4e-4588-843e-b5af4a4b87da/Analyses Group
/read_0013515e-5b4e-4588-843e-b5af4a4b87da/Analyses/Basecall_1D_000 Group
/read_0013515e-5b4e-4588-843e-b5af4a4b87da/Analyses/Basecall_1D_000/BaseCalled_template Group
/read_0013515e-5b4e-4588-843e-b5af4a4b87da/Analyses/Basecall_1D_000/BaseCalled_template/Fastq Dataset {SCALAR}
/read_0013515e-5b4e-4588-843e-b5af4a4b87da/Analyses/Basecall_1D_000/Summary Group
/read_0013515e-5b4e-4588-843e-b5af4a4b87da/Analyses/Basecall_1D_000/Summary/basecall_1d_template Group
/read_0013515e-5b4e-4588-843e-b5af4a4b87da/Analyses/Segmentation_000 Group
/read_0013515e-5b4e-4588-843e-b5af4a4b87da/Analyses/Segmentation_000/Summary Group
/read_0013515e-5b4e-4588-843e-b5af4a4b87da/Analyses/Segmentation_000/Summary/segmentation Group
/read_0013515e-5b4e-4588-843e-b5af4a4b87da/Raw Group
/read_0013515e-5b4e-4588-843e-b5af4a4b87da/Raw/Signal Dataset {8409/Inf}
/read_0013515e-5b4e-4588-843e-b5af4a4b87da/channel_id Group
/read_0013515e-5b4e-4588-843e-b5af4a4b87da/context_tags Group
/read_0013515e-5b4e-4588-843e-b5af4a4b87da/tracking_id Group
/read_002f7800-db08-4ff5-b2b5-c78d9e72ac3a Group
/read_002f7800-db08-4ff5-b2b5-c78d9e72ac3a/Analyses Group
/read_002f7800-db08-4ff5-b2b5-c78d9e72ac3a/Analyses/Basecall_1D_000 Group
/read_002f7800-db08-4ff5-b2b5-c78d9e72ac3a/Analyses/Basecall_1D_000/BaseCalled_template Group
/read_002f7800-db08-4ff5-b2b5-c78d9e72ac3a/Analyses/Basecall_1D_000/BaseCalled_template/Fastq Dataset {SCALAR}
/read_002f7800-db08-4ff5-b2b5-c78d9e72ac3a/Analyses/Basecall_1D_000/Summary Group
/read_002f7800-db08-4ff5-b2b5-c78d9e72ac3a/Analyses/Basecall_1D_000/Summary/basecall_1d_template Group
/read_002f7800-db08-4ff5-b2b5-c78d9e72ac3a/Analyses/Segmentation_000 Group
/read_002f7800-db08-4ff5-b2b5-c78d9e72ac3a/Analyses/Segmentation_000/Summary Group
/read_002f7800-db08-4ff5-b2b5-c78d9e72ac3a/Analyses/Segmentation_000/Summary/segmentation Group
/read_002f7800-db08-4ff5-b2b5-c78d9e72ac3a/Raw Group
/read_002f7800-db08-4ff5-b2b5-c78d9e72ac3a/Raw/Signal Dataset {24867/Inf}
/read_002f7800-db08-4ff5-b2b5-c78d9e72ac3a/channel_id Group
/read_002f7800-db08-4ff5-b2b5-c78d9e72ac3a/context_tags Group, same as /read_0013515e-5b4e-4588-843e-b5af4a4b87da/context_tags
/read_002f7800-db08-4ff5-b2b5-c78d9e72ac3a/tracking_id Group, same as /read_0013515e-5b4e-4588-843e-b5af4a4b87da/tracking_id
/read_00457254-a6e4-429e-b8f3-3dc6337b1554 Group
/read_00457254-a6e4-429e-b8f3-3dc6337b1554/Analyses Group
/read_00457254-a6e4-429e-b8f3-3dc6337b1554/Analyses/Basecall_1D_000 Group
/read_00457254-a6e4-429e-b8f3-3dc6337b1554/Analyses/Basecall_1D_000/BaseCalled_template Group
/read_00457254-a6e4-429e-b8f3-3dc6337b1554/Analyses/Basecall_1D_000/BaseCalled_template/Fastq Dataset {SCALAR}
/read_00457254-a6e4-429e-b8f3-3dc6337b1554/Analyses/Basecall_1D_000/Summary Group
/read_00457254-a6e4-429e-b8f3-3dc6337b1554/Analyses/Basecall_1D_000/Summary/basecall_1d_template Group
/read_00457254-a6e4-429e-b8f3-3dc6337b1554/Analyses/Segmentation_000 Group
/read_00457254-a6e4-429e-b8f3-3dc6337b1554/Analyses/Segmentation_000/Summary Group
/read_00457254-a6e4-429e-b8f3-3dc6337b1554/Analyses/Segmentation_000/Summary/segmentation Group
/read_00457254-a6e4-429e-b8f3-3dc6337b1554/Raw Group
/read_00457254-a6e4-429e-b8f3-3dc6337b1554/Raw/Signal Dataset {57910/Inf}
/read_00457254-a6e4-429e-b8f3-3dc6337b1554/channel_id Group
/read_00457254-a6e4-429e-b8f3-3dc6337b1554/context_tags Group, same as /read_0013515e-5b4e-4588-843e-b5af4a4b87da/context_tags
/read_00457254-a6e4-429e-b8f3-3dc6337b1554/tracking_id Group, same as /read_0013515e-5b4e-4588-843e-b5af4a4b87da/tracking_id
/read_0073ec46-24e3-40c7-980d-5b8c0c6059bd Group
/read_0073ec46-24e3-40c7-980d-5b8c0c6059bd/Analyses Group
/read_0073ec46-24e3-40c7-980d-5b8c0c6059bd/Analyses/Basecall_1D_000 Group
/read_0073ec46-24e3-40c7-980d-5b8c0c6059bd/Analyses/Basecall_1D_000/BaseCalled_template Group

Seems like the move table is not there in this file? Do you know which options I should pass to modern Guppy?
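
The manual inspection above can be scripted. The following is a minimal sketch, not part of DeepMod or Guppy, that scans the text printed by h5ls -r and reports any dataset named Move or Events. The function name find_tables, the TABLE_NAMES tuple, and the sample listing variable are hypothetical, and the dataset names are assumptions based on the naming used in this thread.

```python
# Dataset names to look for (assumed spellings: "Move" for Guppy move
# tables, "Events" for Albacore-style event tables).
TABLE_NAMES = ("Move", "Events")


def find_tables(h5ls_output):
    """Map each table name to the HDF5 paths that h5ls listed as a Dataset."""
    found = {name: [] for name in TABLE_NAMES}
    for line in h5ls_output.splitlines():
        parts = line.split()
        # `h5ls -r` prints "<path> Group" or "<path> Dataset {shape}".
        if len(parts) < 2 or parts[1] != "Dataset":
            continue
        path = parts[0]
        leaf = path.rsplit("/", 1)[-1]  # last component of the HDF5 path
        if leaf in found:
            found[leaf].append(path)
    return found


# A few lines copied from the listing above (first read only).
listing = """\
/read_0013515e-5b4e-4588-843e-b5af4a4b87da/Analyses/Basecall_1D_000/BaseCalled_template Group
/read_0013515e-5b4e-4588-843e-b5af4a4b87da/Analyses/Basecall_1D_000/BaseCalled_template/Fastq Dataset {SCALAR}
/read_0013515e-5b4e-4588-843e-b5af4a4b87da/Analyses/Segmentation_000/Summary/segmentation Group
/read_0013515e-5b4e-4588-843e-b5af4a4b87da/Raw/Signal Dataset {8409/Inf}
"""

tables = find_tables(listing)
if not any(tables.values()):
    print("No Move or Events datasets found; re-basecalling is likely needed.")
```

To use it on a real file, the listing text could be captured with something like subprocess.run(["h5ls", "-r", path], capture_output=True, text=True).stdout, assuming h5ls is installed.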

@liuqianhn
Collaborator

@hasindu2008 yes, the move/event table is not in the file, and you need to re-basecall it. You can find the help documentation for your basecaller using guppy_basecaller --help or from the Nanopore Community.

@123chenshixin

@hasindu2008 I re-basecalled my own single-read fast5 files, which had no move or event data, with guppy_basecaller, and it successfully created the move data. My command is as follows; I hope it is of some help to you.
guppy_basecaller -i /home/cxs3_z4/ds/single_fast5_J1-019A -r -s ./J1-019A --config dna_r9.4.1_450bps_hac_prom.cfg --fast5_out

The guppy_basecaller version:
guppy_basecaller --version

Guppy Basecalling Software, (C) Oxford Nanopore Technologies, Limited. Version 3.1.5+781ed57

It creates a "workspace" directory in my working path. One of the output single-read fast5 files looks as follows.
h5ls -r fff36937-dff8-4f8b-a343-d1b680e7f99c.fast5

/ Group
/Analyses Group
/Analyses/Basecall_1D_000 Group
/Analyses/Basecall_1D_000/BaseCalled_template Group
/Analyses/Basecall_1D_000/BaseCalled_template/Fastq Dataset {SCALAR}
/Analyses/Basecall_1D_000/Summary Group
/Analyses/Basecall_1D_000/Summary/basecall_1d_template Group
/Analyses/Basecall_1D_001 Group
/Analyses/Basecall_1D_001/BaseCalled_template Group
/Analyses/Basecall_1D_001/BaseCalled_template/Fastq Dataset {SCALAR}
/Analyses/Basecall_1D_001/BaseCalled_template/Move Dataset {91653}
/Analyses/Basecall_1D_001/BaseCalled_template/Trace Dataset {91653, 8}
/Analyses/Basecall_1D_001/Summary Group
/Analyses/Basecall_1D_001/Summary/basecall_1d_template Group
/Analyses/RawGenomeCorrected_001 Group
/Analyses/Segmentation_000 Group
/Analyses/Segmentation_000/Summary Group
/Analyses/Segmentation_000/Summary/segmentation Group
/Analyses/Segmentation_001 Group
/Analyses/Segmentation_001/Summary Group
/Analyses/Segmentation_001/Summary/segmentation Group
/Raw Group
/Raw/Reads Group
/Raw/Reads/Read_27254 Group
/Raw/Reads/Read_27254/Signal Dataset {184511/Inf}
/UniqueGlobalKey Group
/UniqueGlobalKey/channel_id Group
/UniqueGlobalKey/context_tags Group
/UniqueGlobalKey/tracking_id Group

The move data is in the Basecall_1D_001/BaseCalled_template path.
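
For anyone scripting this step, here is a minimal sketch that assembles the same guppy_basecaller invocation from Python. It uses only the flags quoted in the command above (-i, -r, -s, --config, --fast5_out), which were reported to work with Guppy 3.1.5. Whether the same flags produce a move table in Guppy 4 is not confirmed in this thread. The function name guppy_fast5_command is hypothetical.

```python
import shlex
import subprocess


def guppy_fast5_command(input_dir, save_dir, config, guppy="guppy_basecaller"):
    """Build the argument list for the re-basecalling command quoted above."""
    return [
        guppy,
        "-i", str(input_dir),  # input directory containing fast5 files
        "-r",                  # search the input directory recursively
        "-s", str(save_dir),   # output directory
        "--config", config,    # basecalling config file
        "--fast5_out",         # also write basecalled fast5 files
    ]


cmd = guppy_fast5_command(
    "/home/cxs3_z4/ds/single_fast5_J1-019A",
    "./J1-019A",
    "dna_r9.4.1_450bps_hac_prom.cfg",
)
print(shlex.join(cmd))  # show the command line before running anything

# Uncomment to actually run Guppy (assumes guppy_basecaller is on PATH):
# subprocess.run(cmd, check=True)
```

Passing the arguments as a list avoids shell quoting problems with unusual paths. The config name should be replaced with the one matching your flow cell and kit.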

@shaodongyan

Basecall_1D_001/BaseCalled_template
Hello, can I know your deepmod order?

@liuqianhn
Collaborator

@shaodongyan I have no idea what "deepmod order" means. But based on my understanding, DeepMod has never been tested on Guppy basecalling and has some issues with it. I would not recommend using it on Guppy-basecalled data until we finish a test; otherwise, the results are not correct.
