This repository has been archived by the owner on Sep 15, 2020. It is now read-only.
Commit
Merge pull request #4 from tskir/eva-1937-repeat-expansion-tests
EVA-1937 — Tests for the repeat expansion pipeline
Showing 16 changed files with 698 additions and 13 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,3 @@ | ||
.pytest_cache | ||
**/__pycache__ | ||
*.egg-info |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,22 +1,23 @@ | ||
dist: bionic | ||
language: python | ||
python: | ||
- "3.8" | ||
|
||
# The pipeline requires GNU parallel; bcftools; and certain Python modules | ||
before_install: | ||
- sudo apt-get -y install parallel | ||
install: | ||
- sudo apt update | ||
- sudo apt -y install samtools bcftools parallel libbz2-dev liblzma-dev | ||
- pip -q install -r requirements.txt | ||
- git clone -q git://github.com/samtools/htslib.git | ||
- git clone -q git://github.com/samtools/bcftools.git | ||
- cd bcftools && make --quiet --jobs `nproc` && cd .. | ||
- export PATH=$PATH:bcftools | ||
|
||
# For the actual test, we're running a set of 2,000 ClinVar variants through VEP and comparing the result with the | ||
# expected one (diff will exit with return code 0 if the files are identical, and with 1 otherwise). Of course, this | ||
# means that when VEP updates, the test will break; however, this is exactly the intention, as in this case we will be | ||
# able to compare the results and see if they make sense. | ||
script: | ||
- ls | ||
- echo 'Test 1. VEP mapping pipeline' | ||
- bash run_consequence_mapping.sh vep_mapping_pipeline/test/input.vcf output_mappings.tsv | ||
- diff vep_mapping_pipeline/test/output_mappings.tsv output_mappings.tsv | ||
|
||
- echo 'Test 2. Repeat expansion pipeline' | ||
- pip install --editable . | ||
- pytest |
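
The comparison step in Test 1 relies on `diff` returning 0 for identical files and 1 otherwise. The following is a minimal Python sketch of the same check, not part of this commit; the function names `diff_lines` and `diff_exit_code` are hypothetical and chosen only for illustration.

```python
import difflib
from pathlib import Path


def diff_lines(expected_path, actual_path):
    """Return unified-diff lines between two text files (empty if identical)."""
    expected = Path(expected_path).read_text().splitlines(keepends=True)
    actual = Path(actual_path).read_text().splitlines(keepends=True)
    return list(
        difflib.unified_diff(
            expected,
            actual,
            fromfile=str(expected_path),
            tofile=str(actual_path),
        )
    )


def diff_exit_code(expected_path, actual_path):
    """Mimic the exit code convention of `diff`: 0 if identical, 1 otherwise."""
    return 1 if diff_lines(expected_path, actual_path) else 0
```

When the expected and actual mappings differ, printing the result of `diff_lines` shows which records changed, which is the comparison the comment above describes doing after a VEP update.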
Empty file.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,15 @@ | ||
# Tests for the repeat expansion pipeline | ||
|
||
ClinVar repeat expansion data includes a number of peculiarities. To check them all in separate unit tests would be | ||
expensive to develop and maintain. Hence, the pipeline uses a hybrid integration test with an annotated dataset. | ||
|
||
The dataset includes the input file [`input_variant_summary.tsv`](input_variant_summary.tsv) and two expected output | ||
files: [`output_dataframe.tsv`](output_dataframe.tsv) and [`output_consequences.tsv`](output_consequences.tsv). The | ||
input file is not a sample, but rather a complete selection of “NT expansion” variants from ClinVar data as of | ||
2020-04-08. The expected output files were produced by the pipeline and checked manually for correctness. The idea | ||
behind including the entire dataset is that it will make the tests sensitive to even minor changes. | ||
|
||
The test files are annotated using comments, which the testing function removes before using those files. The records | ||
of special interest are listed at the top, and their peculiarities are documented. This makes it possible to trace the | ||
fate of each such record from the input to the full dataframe to the collapsed final output. | ||
|
||
In addition to the hybrid integration test, the code of the pipeline itself performs sanity checks whenever possible. |
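
The README above says that comments are stripped from the annotated test files before they are used. A minimal sketch of such a step is shown below. It is not the code from this commit: it assumes that comment lines start with `#`, and the names `strip_comments` and `read_annotated_tsv` are hypothetical.

```python
import csv
import io


def strip_comments(text, comment_prefix="#"):
    """Drop blank lines and lines starting with the (assumed) comment prefix."""
    kept = [
        line
        for line in text.splitlines()
        if line.strip() and not line.lstrip().startswith(comment_prefix)
    ]
    return ("\n".join(kept) + "\n") if kept else ""


def read_annotated_tsv(text, comment_prefix="#"):
    """Parse annotated TSV text into a list of rows, ignoring comments."""
    cleaned = strip_comments(text, comment_prefix)
    return list(csv.reader(io.StringIO(cleaned), delimiter="\t"))
```

With a helper like this, the annotated expected output and the pipeline's actual output can be compared row by row, while the annotations stay next to the records they describe.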
Empty file.