diff --git a/CHANGELOG.md b/CHANGELOG.md
index 07866b837..ef8a78697 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -2,6 +2,30 @@
 All notable changes to Dorado will be documented in this file.
+# [0.4.3] (14 Nov 2023)
+
+This release of Dorado introduces a new RNA m6A modified base model and initial support for poly(A)/poly(T) tail length estimation. It also introduces duplex performance enhancements and bug fixes to improve the stability of Dorado.
+
+* 803e3a7ce2590b1c95b4754117185983ac2ad560 - Add RNA m6A DRACH-context model
+* 0f282cde507a36bf91863270bd0323564235c15b - Add poly(A)/poly(T) tail length estimation support for RNA and cDNA
+* 54e14ca01e7391c8857989da7db086a4591375a1 - Add RNA read splitting
+* 2dc1f039cac7f3e6cd082b77a5b020fed5488e2f - Enable RNA adapter trimming
+* 80114c08c45bc902843a2e18b5949ebf5cfefdf2 - Correctly update CIGAR and POS entries when trimming barcodes
+* 4b2025c57fd3b87b2ce6cd52be07adfd9ae5acf9 - Add documentation for sample sheet support
+* 641cb08b457d727c3da682185c6fe491df49dab2 - Reduce host memory footprint for duplex basecalling
+* 7c1c0f04d93113d4dd2c632bdcd242304b54d270 - Reduce working reads size, in particular for duplex.
+* 831f0a91f0100c2586720f6026450fdbae1a8d21 - Fix pairing check for split reads in duplex basecalling
+* b63056743be6e5442f2f5af65a36c592bbf96184 - Account for split reads during progress tracking
+* 383fe0226bfa7956705376ac5e4a32096ff80c45 - Update to Koi v0.4.1
+* 873c6b11e0113735b21305afce5057138558388d - Fix warnings about `ONLY_C_LOCAL` mismatches in PCH builds
+* 52cbabff83de3c9fb6f1a0db9194828b92418855 - Encapsulate `date` dependency
+* 8fb8a4df567ba22df6a298f4e30277a0d47ceaa4 - Disable Cutlass LSTM codepath for 128-wide LSTM layers because this kernel does not work
+* 6a9dad907af8dd2b4e556d49a329a8a0fbc5c32c - Enable warnings as errors at build time
+* 5aaef312027836ffbd6e2b944e6cd3ba4a259267 - Address auto batchsize issues on unified memory Linux systems
+* 92b5a6792fca4d2bb2b76727ec486efe8bdfae97 - Reduce compilation times
+* 062e3fd53f58380070efff660303b71c03cd02c0 - Minor speed improvements to CPU beam search
+
+
 # [0.4.2] (30 Oct 2023)
 This release of Dorado fixes a bug with the CpG-context 5mC/5hmC model calling all contexts and adds beta support for using a barcode alias from a sample sheet.
diff --git a/README.md b/README.md
index 7baa08c24..77c3e9128 100644
--- a/README.md
+++ b/README.md
@@ -19,10 +19,10 @@ If you encounter any problems building or running Dorado, please [report an issu
 ## Installation
- - [dorado-0.4.2-linux-x64](https://cdn.oxfordnanoportal.com/software/analysis/dorado-0.4.2-linux-x64.tar.gz)
- - [dorado-0.4.2-linux-arm64](https://cdn.oxfordnanoportal.com/software/analysis/dorado-0.4.2-linux-arm64.tar.gz)
- - [dorado-0.4.2-osx-arm64](https://cdn.oxfordnanoportal.com/software/analysis/dorado-0.4.2-osx-arm64.zip)
- - [dorado-0.4.2-win64](https://cdn.oxfordnanoportal.com/software/analysis/dorado-0.4.2-win64.zip)
+ - [dorado-0.4.3-linux-x64](https://cdn.oxfordnanoportal.com/software/analysis/dorado-0.4.3-linux-x64.tar.gz)
+ - [dorado-0.4.3-linux-arm64](https://cdn.oxfordnanoportal.com/software/analysis/dorado-0.4.3-linux-arm64.tar.gz)
+ - [dorado-0.4.3-osx-arm64](https://cdn.oxfordnanoportal.com/software/analysis/dorado-0.4.3-osx-arm64.zip)
+ - [dorado-0.4.3-win64](https://cdn.oxfordnanoportal.com/software/analysis/dorado-0.4.3-win64.zip)
 ## Platforms
@@ -201,12 +201,12 @@ SQK-RPB004_barcode03.bam
 unclassified.bam
 ```
-#### Using a Sample Sheet
-`dorado` is able to use a sample sheet to restrict the barcode classifications to only those present, and to apply aliases to the detected classifications. This is enabled by passing the path to a sample sheet to the `--sample-sheet` argument when using the `basecaller` or `demux` commands. See [here](documentation/SampleSheets.md) for more information.
+#### Using a sample sheet
+Dorado is able to use a sample sheet to restrict the barcode classifications to only those present, and to apply aliases to the detected classifications. This is enabled by passing the path to a sample sheet to the `--sample-sheet` argument when using the `basecaller` or `demux` commands. See [here](documentation/SampleSheets.md) for more information.
 ### Poly(A) tail estimation
-Dorado has initial support for estimating poly(A) tail lengths for DNA and RNA. Note that Oxford Nanopore cDNA reads sequence in two different orientations and transcript poly(A) length estimation handles both (A and T homopolymers). This feature can be enabled by passing `--estimate-poly-a` to the `basecaller` command. It is disabled by default. The estimated tail length is stored in the `pt:i` tag of the output record. Reads for which the tail length could not be estimated will not have the `pt:i` tag.
+Dorado has initial support for estimating poly(A) tail lengths for cDNA and RNA. Note that Oxford Nanopore cDNA reads are sequenced in two different orientations and Dorado poly(A) tail length estimation handles both (A and T homopolymers). This feature can be enabled by passing `--estimate-poly-a` to the `basecaller` command. It is disabled by default. The estimated tail length is stored in the `pt:i` tag of the output record. Reads for which the tail length could not be estimated will not have the `pt:i` tag.
 ## Available basecalling models
@@ -273,7 +273,7 @@ Below is a table of the available basecalling models and the modified basecallin
 | :-------- | :------- | :--- | :--- |
 | **rna004_130bps_fast@v3.0.1** | N/A | N/A | 4 kHz |
 | **rna004_130bps_hac@v3.0.1** | N/A | N/A | 4 kHz |
-| **rna004_130bps_sup@v3.0.1** | 6mA_DRACH | v1 | 4 kHz |
+| **rna004_130bps_sup@v3.0.1** | m6A_DRACH | v1 | 4 kHz |
 | rna002_70bps_fast@v3 | N/A | N/A | 3 kHz |
 | rna002_70bps_hac@v3 | N/A | N/A | 3 kHz |
diff --git a/cmake/DoradoVersion.cmake b/cmake/DoradoVersion.cmake
index c261fb60d..69da9eb6f 100644
--- a/cmake/DoradoVersion.cmake
+++ b/cmake/DoradoVersion.cmake
@@ -1,6 +1,6 @@
 set(DORADO_VERSION_MAJOR 0)
 set(DORADO_VERSION_MINOR 4)
-set(DORADO_VERSION_REV 2)
+set(DORADO_VERSION_REV 3)
 find_package(Git QUIET)
 if(GIT_FOUND AND EXISTS "${PROJECT_SOURCE_DIR}/.git")
diff --git a/documentation/SampleSheets.md b/documentation/SampleSheets.md
index 22178a8cf..36e7c708e 100644
--- a/documentation/SampleSheets.md
+++ b/documentation/SampleSheets.md
@@ -1,4 +1,4 @@
-# Sample Sheet specification
+# Sample sheet specification
 `dorado` can make use of a MinKNOW-compatible sample sheet containing data used to identify a particular classification of read. To apply a sample sheet, provide the path to the appropriate CSV file using the `--sample-sheet` argument:
@@ -20,7 +20,7 @@ Note that `dorado` currently uses the sample sheet only for barcode filtering an
 In the case of `demux`, the sample sheet must contain a 1-to-1 mapping of `barcode` identifiers to `flow_cell_id`/`position_id` - i.e. all entries in the `barcode` column must be unique.
-#### Column Headers
+#### Column headers
 A sample sheet may only contain the column names below:
 | | | |