CHANGELOG.md

  • V 1.6.1.0:
    • Added a feature to list bibliography information via trident list --bibliography.
    • Added a new Server API /bibliography to serve bibliography information via HTTP.
  • V 1.6.0.0:
    • Added support to write gzipped EIGENSTRAT and PLINK files with genoconvert and forge. Both commands get a new option -z which creates gzipped output.
  • V 1.5.7.4:
    • Fixed a bug that broke the long-form genotype data input option (with --genoFile + --snpFile + ...).
  • V 1.5.7.3:
    • Allowed 0 in the Nr_SNPs .janno column.
  • V 1.5.7.2:
    • Fixed a bug introduced in Version 1.5.5.0, where input using option "-p" (for example in init) would not behave correctly if input files have multiple file endings, separated by dots.
  • V 1.5.7.1:
    • Fixed a bug in the .janno reading triggered by trailing à characters and caused by premature whitespace trimming.
    • Removed the hacky removeNoBreakSpace function from the .janno reading pipeline. It is not necessary any more.
    • Added a golden test that ensures both changes perform as expected.
  • V 1.5.7.0:
    • Added support for VCF files (Variant Call Format) in Janno-packages.
    • Restructured test package structure, affecting some of the unit- and golden tests.
  • V 1.5.6.0:
    • Introduced individual Janno... types for every .janno column (except Poseidon_ID) in a new module ColumnTypes. This was done to improve .janno validation error messages.
    • Defined a typeclass Makeable with a function make to write smart constructors for the column types.
    • Added a bit of TemplateHaskell to automatically create instance definitions for Makeable and cassava ToField/FromField typeclasses in a new module ColumnTypeUtils.
    • Switched to Text for the string types and as an intermediate format, so that we can reliably check for non UTF-8 characters with T.decodeUtf8'.
    • The general .csv field parsing sequence is now as follows: Transform Bytestring to UTF-8 encoded Text and fail upon exceptions. Then transform Text to the desired type with a make constructor function. This function does additional validation and fails if the checks can not be satisfied.
    • Removed the many unused aeson ToJSON/FromJSON instances for .janno column types.
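The two-step field parsing sequence described for V 1.5.6.0 can be sketched as follows. This is only an illustration in Python, not the Haskell code in ColumnTypes; the `Latitude` type, its range check and all function names are hypothetical stand-ins for a column type and its make function.

```python
from dataclasses import dataclass


class FieldParseError(ValueError):
    """Raised when a raw .csv field cannot be turned into a column value."""


@dataclass(frozen=True)
class Latitude:
    """Hypothetical column type whose value must lie in [-90, 90]."""
    value: float


def decode_field(raw: bytes) -> str:
    # Step 1: strict UTF-8 decoding; invalid bytes abort the parse.
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError as err:
        raise FieldParseError(f"field is not valid UTF-8: {err}") from err


def make_latitude(text: str) -> Latitude:
    # Step 2: the "make" smart constructor does the additional validation.
    try:
        number = float(text)
    except ValueError as err:
        raise FieldParseError(f"{text!r} is not a number") from err
    if not -90.0 <= number <= 90.0:
        raise FieldParseError(f"latitude {number} is outside [-90, 90]")
    return Latitude(number)


def parse_latitude_field(raw: bytes) -> Latitude:
    # Bytes -> text -> validated column value; either step may fail.
    return make_latitude(decode_field(raw))
```

Keeping decoding and validation separate means an encoding problem and an invalid value produce different error messages, which is the stated motivation for the change.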
  • V 1.5.5.0:
    • Linked to sequence-formats 1.8.1.0, which adds reading support for gzipped Plink (*.bed and *.bim) and Eigenstrat (.geno and .snp) files.
    • gzipped files are recognised automatically by their file ending.
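Recognising gzipped input by its file ending can be sketched like this in Python. The actual detection happens in Haskell in sequence-formats; the `.gz` suffix and the helper names below are assumptions made for illustration.

```python
import gzip
from pathlib import Path
from typing import List, TextIO


def is_gzipped(path: str) -> bool:
    # Assumption: only the last file ending decides, so "data.snp.gz" counts.
    return Path(path).suffix == ".gz"


def open_text(path: str) -> TextIO:
    # Pick the reader from the file ending; callers see plain text either way.
    if is_gzipped(path):
        return gzip.open(path, "rt", encoding="utf-8")
    return open(path, "r", encoding="utf-8")


def read_lines(path: str) -> List[str]:
    with open_text(path) as handle:
        return [line.rstrip("\n") for line in handle]
```

With this shape, downstream parsing code does not need to know whether its input was compressed.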
  • V 1.5.4.0:
    • Better error messages for parsec operations on command line input.
    • Better error messages for YML file parsing.
    • Refactored --errLength to exclusively cover exceptions of type PoseidonGenotypeExceptionForward emerging during genotype data parsing.
  • V 1.5.3.0:
    • Introduced a new output option forge --preservePyml to preserve certain features of the input package, in case there is only one.
    • Refactored the out mode selection in forge to replace an increasingly complex boolean logic with a clear separation of modes based on the new type ForgeOutMode.
    • Thus made the output modes --onlyGeno, --minimal, --preservePyml and the default mutually exclusive to rule out (increasingly complex) interactions.
  • V 1.5.2.0:
    • A new option forge --ordered was added, which outputs the resulting package with individuals ordered according to the entered entities.
  • V 1.5.1.0:
    • A new option list --individuals --fullJanno adds all standard columns from the Janno to the per-individual output.
    • A new API option /individuals?additionalJannoColumns=ALL triggers the same behaviour for the Web API.
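A client could assemble the request URL for this option roughly as below. The sketch is in Python; the base URL is a placeholder, the function name is hypothetical, and combining the option with the `archive` parameter from V 1.2.2.0 is an assumption.

```python
from typing import Optional
from urllib.parse import urlencode


def individuals_url(base_url: str,
                    all_janno_columns: bool = False,
                    archive: Optional[str] = None) -> str:
    # Collect only the query parameters that were actually requested.
    params = {}
    if all_janno_columns:
        params["additionalJannoColumns"] = "ALL"
    if archive is not None:
        params["archive"] = archive  # assumed to combine with the option above
    url = base_url.rstrip("/") + "/individuals"
    return url + ("?" + urlencode(params) if params else "")


# "https://example.org" is a placeholder host, not a real Poseidon server.
print(individuals_url("https://example.org", all_janno_columns=True))
```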
  • V 1.5.0.1: Changed the release pipeline: trident-macOS was replaced by trident-macOS-X64 and trident-macOS-ARM64.
  • V 1.5.0.0:
    • Removed Josiah Carberry from newPackageTemplate, so that he doesn't get added any more to new packages created by init and forge - the contributor field is missing in the output of these commands now.
    • Adjusted the golden test output accordingly, but also added Josiah to one of the test packages (test/testDat/testPackages/ancient/Schiffels_2016) to keep him around at least in one way and make sure that the ORCID parser runs in the tests.
    • Added a new warning to the package reading process to point out an empty contributor field.
  • V 1.4.1.0:
    • Added new tool trident jannocoalesce, which merges information from a source .janno file to a target .janno file.
  • V 1.4.0.4:
    • Added better error messages for generic cassava parsing (e.g. for broken Int and Double fields) in .janno files.
    • Added better error handling and messages for inconsistent Date_*, Contamination_* and Relation_* columns in .janno files using an Except & Writer monad stack.
    • Cleaned up the SequencingSource module a bit.
  • V 1.4.0.3:
    • Fixed a severe performance leak in code around resolveEntityIndices, which was called in various functions and wastefully recomputed isLatestInCollection way too often. This affected simple commands, such as fetching a few packages from the server or forging, and also has effects in xerxes.
    • Bumped to a newer Compiler (GHC 9.4.7) and new Stackage Snapshot (LTS-21.17)
  • V 1.4.0.2:
    • Strictly checking ploidy information across the .janno file and the genotype data in the package reading process has unforeseen consequences. Activating this will require some more changes, so we decided to comment out this code for now.
  • V 1.4.0.1:
    • This patch makes the error output in case of ploidy-mismatches between the Genotype_Ploidy information in Janno and heterozygote genotypes more user-friendly.
    • We fixed a bug in Fetch.hs in the output for comparing local and remote package versions
    • trident fetch --downloadAll now considers only latest versions.
  • V 1.4.0.0:
    • Major version bump, due to forgescript semantics change.
    • forgeScript now allows for versions of packages to be specified.
    • fetch and forge now work with all package versions.
    • Various other subcommands now load all versions and got an option --onlyLatest to specify listing only latest versions.
    • forge semantics has subtly changed in its behaviour of duplicate resolution. Essentially, there is no automatic duplicate resolution anymore.
    • Genotype Ploidy in Janno Files is now checked with validate: If a sample marked as "haploid" has a heterozygote SNP, this now throws an error.
    • The output of the webAPI and list now include an isLatest field.
    • list --raw now includes column headers.
    • The SecondaryTypes module was dissolved and instead we now have Contributor, ServerClient and Version. EntitiesList (or what is left of it) now lives in EntityTypes.
    • We reworked the tests and added new ones according to all these changes.
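The ploidy check introduced for validate in V 1.4.0.0 boils down to the rule sketched here in Python. The genotype encoding (0, 1 or 2 copies of an allele, `None` for missing) and all names are assumptions for illustration; only the "haploid" label and the rule itself come from the entry above.

```python
from typing import Optional, Sequence


class PloidyMismatchError(ValueError):
    """A sample marked as haploid carries a heterozygote genotype."""


def is_heterozygote(allele_count: Optional[int]) -> bool:
    # Assumed encoding: 0 or 2 = homozygote, 1 = heterozygote, None = missing.
    return allele_count == 1


def check_ploidy(sample_id: str, ploidy: str,
                 genotypes: Sequence[Optional[int]]) -> None:
    # Only samples marked as haploid are constrained by this check.
    if ploidy != "haploid":
        return
    for snp_index, call in enumerate(genotypes):
        if is_heterozygote(call):
            raise PloidyMismatchError(
                f"{sample_id} is marked as haploid but is heterozygote "
                f"at SNP index {snp_index}"
            )
```

Missing calls never trigger the error in this sketch, since they carry no information about ploidy.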
  • V 1.3.0.4: Added an option --ignorePoseidonVersion to validate.
  • V 1.3.0.3: Small code layout changes in the golden test setup and slightly better error handling for http requests in fetch and list --remote.
  • V 1.3.0.2: Added a --ignoreChecksums option to validate.
  • V 1.3.0.1: Added a global option --debug, which is short for --logMode VerboseLog.
  • V 1.3.0.0: Replaced update with rectify.
  • V 1.2.3.4: Some cleaning of the trident command line documentation. Added meaningful meta variables to the subcommand arguments. Shortened fetch command logInfo output.
  • V 1.2.3.3: Fixed the behaviour of chronicle when updating a chronicle file (with -u): The lastModified field is now only touched if there is actually a change in the package list.
  • V 1.2.3.2: Some refactoring of summarise to make the code more neat and the result counts more accurate.
  • V 1.2.3.1: Fixed the behaviour of forge when combining .bib files. Duplicates are now properly removed upon merging and the output is alphabetically sorted.
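The merging behaviour fixed in V 1.2.3.1 can be pictured with this Python sketch. It assumes that duplicates are identified by citation key and that sorting is by that key; the entry above does not spell out either detail, and the names are hypothetical.

```python
from typing import Dict, Iterable, List, Tuple

BibEntry = Tuple[str, str]  # (citation key, raw entry text)


def merge_bib_entries(*bibliographies: Iterable[BibEntry]) -> List[BibEntry]:
    merged: Dict[str, str] = {}
    for bibliography in bibliographies:
        for key, entry in bibliography:
            # Keep the first entry seen for each key; later duplicates are dropped.
            merged.setdefault(key, entry)
    # Alphabetical output, here by citation key (case-sensitive).
    return [(key, merged[key]) for key in sorted(merged)]
```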
  • V 1.2.3.0: Gave validate the ability to check not just entire packages, but also individual package components (e.g. .janno or .bib files).
  • V 1.2.2.0: Taught the server (serve) how to provide multiple named archives in parallel, with a modified -d interface on the command line and a new option ?archive=... in the Web API. The client commands fetch and list can request information from different archives with a new option --archive.
  • V 1.2.1.0: Introduced the interacting trident subcommands chronicle and timetravel for archive versioning. They are hidden from the command line documentation, because they are not relevant for the end user. As part of the changes a new runtime test mode was defined in the PoseidonIO environment, which can be enabled on the command line with trident --testmode Testing.
  • V 1.2.0.0: Massive rework of the server client infrastructure, including the Web API. The server is now a (hidden) subcommand of trident, is capable of serving multiple versions of one package, and returns proper error messages in case of client version incompatibility. The server-client golden tests are running locally now without querying the production setup. These changes required significant refactoring in the code for the subcommands fetch and list, as well as in the internal modules Package.hs, SecondaryTypes.hs, with effects for the whole project.
  • V 1.1.12.0: Implemented the changes for Poseidon v2.7.1, added stricter validation for the .ssf file, elevated the log level of broken lines from debug to error and switched to a new stylish-haskell version for linting.
  • V 1.1.11.4: Fixed an issue in the .ssf implementation: Multiple columns must be treated as list columns.
  • V 1.1.11.3: Re-implemented the survey subcommand with advanced type level magic to avoid hard-to-maintain boilerplate code. Again no user-facing changes in trident.
  • V 1.1.11.2: Switch to stackage resolver LTS 20.17 for ghc-9.2.7. No user-facing changes in trident.
  • V 1.1.11.1: Reworked the parts of the test infrastructure to make the golden tests structurally simpler and cleaner. No user-facing changes to trident.
  • V 1.1.11.0: Implemented the changes and additions for the new schema release Poseidon v2.7.0: The sequencingSourceFile (.ssf) file, the new .janno columns (Country_ISO, Library_Names) and the small changes to existing columns (Library_Built).
  • V 1.1.10.2: Added a missing default (asFamily) for the --outPlinkPopName option
  • V 1.1.10.1: Internal refactoring. Introduced a newtype wrapper JannoRows for [JannoRow], which is an instance of Monoid. This should encourage the use of a dedicated implementation of mconcat for JannoRows
  • V 1.1.10.0: Added an option to validate (and therefore readPoseidonPackage) to test parsing the entire .bed/.geno file, not just the first 100 SNPs
  • V 1.1.9.1: Small changes: made trident update write messages to the CHANGELOG file now with a prefix - (to make it proper markdown), turned off verbose debug-level warnings about missing standard columns in the .janno file and made the schema version mismatch error message clearer
  • V 1.1.9.0: Added option to control the read/write of the population name from Plink FAM files more flexibly.
  • V 1.1.8.6: Refactored the -j mechanism by which list --individuals includes additional variables in the output table. It is now possible to query arbitrary additional columns
  • V 1.1.8.5: Rolled back some of the ToJSON instances changed in 1.1.8.4 because they broke backwards compatibility of the server-client communication. Added some additional tests to prevent such oversights in the future. Slightly reorganized the golden tests
  • V 1.1.8.4: Unified the implementation of ToJSON/FromJSON and ToField/FromField instances for .janno datatypes to perform input validation through smart constructors
  • V 1.1.8.3: The fix introduced in 1.1.8.1 introduced a bug: It broke valid unicode characters in .janno files and prevented reading them. The solution implemented here solves this issue
  • V 1.1.8.2: Improved the behaviour of list when provided with undefined .janno columns in the -j argument
  • V 1.1.8.1: Fixed a decoding-encoding bug in the janno code by generally trimming all whitespaces on reading and deleting No-Break Space characters
  • V 1.1.8.0: Renamed --no-extract to --packagewise, fixed its behaviour with implicit package selection, and clarified help text
  • V 1.1.7.2: Added a proper warning in readPoseidonPackageCollection if one or all baseDirs do not exist
  • V 1.1.7.1: Added a hint about --logMode VerboseLog to the important "Broken lines" error message in the .janno reading process
  • V 1.1.7.0: Reorganized handling of duplicate individuals: Duplicates are now generally ignored, except in validate (can also be turned off with a new flag) and forge. The forgeString language features a new syntactic entity to select individuals specifically and thus resolve duplication conflicts
  • V 1.1.6.0: Removed outdated --verbose from validate and ignore trailing slashes from --outPath
  • V 1.1.5.0: Enabled reading and forging additional, unspecified variables in .janno files
  • V 1.1.4.2: Added parsing for Accession IDs (.janno file). Wrong entries are ignored, so this is non-breaking
  • V 1.1.4.1: Added a small validation check for calibrated ages in the .janno file
  • V 1.1.4.0: Changes to make poseidon-hs compatible with Poseidon v2.6.0 (backwards compatible with v2.5.0): contributor field optional, added orcid field for contributors, added more capture type options in janno files
  • V 1.1.3.1: Package reading will now fail if bib-entries are not found due to missing bibtex files
  • V 1.1.3.0: Added new features to the server, updated logging, and new API for compatibility checks
  • V 1.1.2.0: Replaced progress indicators with simple, sequential log messages for the download in fetch and the SNP-wise operations in forge and genoconvert
  • V 1.1.1.3: Tiny change to make the documentation of --snpSet more clear
  • V 1.1.1.2: Outsourced optparse-applicative parsers to an own library module. This is helpful for xerxes and other derived tools/libraries
  • V 1.1.1.1: Finally removed the fstats dummy subcommand from trident. It has lived in xerxes for a long time already
  • V 1.1.1.0: More complete genotype data error handling to log them properly. Added an option --errLength to truncate overly long error messages
  • V 1.1.0.2: Internal change of the Logging Monad, should not change anything on the user-end
  • V 1.1.0.1: Added Ord instance to PoseidonEntity and SignedEntity
  • V 1.1.0.0: Removed the short options (-r + -g + -s + -i) for the direct genotype data input. Also improved the trident input package and genotype data parsing by making pointless no-input situations impossible
  • V 1.0.1.1: Output directories in fetch and genoconvert are now created if they don't exist
  • V 1.0.1.0: Allowing flexible input of entity lists in fetch and forge
  • V 1.0.0.2: Switched to GHC 8.10.7 and Stackage lts-18.28
  • V 1.0.0.1: Fixed memory leak in genoconvert
  • V 1.0.0.0: Enabled logging with the co-log library
  • V 0.29.1: JSON support for entities
  • V 0.29.0: Added a simpler input interface for unpackaged genotype data and an option to set the output path of genoconvert explicitly
  • V 0.28.0: Added direct interaction with genotype data files (not packaged in a Poseidon package) for genoconvert and forge
  • V 0.27.2: Allow forge lists to be empty, thus triggering forging all packages
  • V 0.27.1: Added a file existence check for the README and CHANGELOG files in the package reading process
  • V 0.27.0: Improved EntitiesList parsing in Forge and Fetch
  • V 0.26.4: Replaced special characters in the validate subcommand report message
  • V 0.26.3: Added an option --ignorePoseidonVersion to the update subcommand to allow updating packages that are outdated by Poseidon version. Poseidon versions are not ignored by default any more, thus reversing a change introduced in 0.26.1.
  • V 0.26.2: Added a check to prevent an empty output package name in the init and forge subcommands
  • V 0.26.1: Added an option to ignore a package's Poseidon version when reading it and activated that by default for the update subcommand
  • V 0.26.0: Updated the library to Poseidon v2.5. This means a number of (breaking) changes in the structure of .janno files
  • V 0.25.0: Moved fstats command into new tool xerxes provided by github package poseidon-analysis-hs in the same organisation
  • V 0.24.4: Switched off geno-check upon server start
  • V 0.24.3: Incongruent SNPs are now skipped, with an optional warning
  • V 0.24.2: Better error messages for broken .janno files
  • V 0.24.1: Added the subcommand summarize as a synonym for summarise to trident
  • V 0.24.0: Made entities parsing in forgeFiles much more powerful, adding exclusion of entities and comments. The changes should be fully backwards compatible
  • V 0.23.1: Cleaned up redundant genotype checking in validate
  • V 0.23.0: Added a feature to select SNPs during forge
  • V 0.22.0: Added a --minimal option for init and forge to create minimal packages without .janno and .bib (e.g. in automatic pipelines)
  • V 0.21.3: Added a column name suggestion mechanism to the .janno file reading procedure
  • V 0.21.2: Made trident survey more useful
  • V 0.21.1: Simplified package creation in init and forge by enabling creation of deeper paths and by making the output package name argument optional
  • V 0.21.0: Added a poseidonVersion pre-parsing check in the reading pipeline, which strictly excludes packages with missing or wrong version
  • V 0.20.1: Updated poseidon-http-server with new APIs and updated previous APIs
  • V 0.20.0: New forging algorithm, much faster, and new option --no-extract in forge. Also new minimal package template
  • V 0.19.0: Replaced trident checksumupdate with the much more powerful trident update
  • V 0.18.2: Added a golden test feature to make sure code changes do not accidentally modify the output of trident modules
  • V 0.18.1: Removed the Pandoc dependency by implementing a custom bibtex parser
  • V 0.18.0: checksumupdate now also increments version numbers and updates lastModified fields
  • V 0.17.4: Fixed a critical bug in janno file encoding
  • V 0.17.3: Fixed all issues flagged by the --pedantic compiler setting
  • V 0.17.2: A better internal configuration solution for readPoseidonPackage(Collection)
  • V 0.17.1: Made snpSet non-mandatory to keep backwards-compatibility
  • V 0.17.0: Added and changed fields in the POSEIDON.yml and .janno file data types as defined in Poseidon V 2.3.1
  • V 0.16.0: Added a verbose switch to the package reading functions. It's only available with validate (--verbose) to show all sorts of additional output, at the moment only unspecified and missing janno columns
  • V 0.15.2: Better handling of individuals linked to multiple groups/populations in list
  • V 0.15.1: Completed the task in 0.14.2 with a newtype JannoSex to enable a custom Show action
  • V 0.15.0: Introduced genotype data structure validation on the first 100 SNPs for the package reading process
  • V 0.14.3: Small change in how progress indicators for reading/downloading packages and processing SNPs are printed. This should improve the output in CLI environments without the ability to overwrite already printed output
  • V 0.14.2: Modified show instances of multiple janno column types and fixed output of list
  • V 0.14.1: Added more helpful error messages to forge
  • V 0.14.0: Multiple minor changes: allowed multiple forgeStrings/fetchStrings and forgeFiles/fetchFiles, added readmeFile and changelogFile to the package data structures, relaxed duplicate check in forge to only stop if there is an overlap within the specific selection relevant for the new package
  • V 0.13.1: Added /server_version API and changed server behaviour to only zip files when genotype data is not ignored
  • V 0.13.0: Renamed the update module to checksumupdate, allowed trident to ignore duplicate individuals for some inspection modules, switched to Unix file endings for janno file encoding
  • V 0.12.0: List can now also display janno columns with the -j option for the --individuals case
  • V 0.11.0: Added the module genoconvert to automatically switch the genotype data format in packages
  • V 0.10.0: List now has --remote option to view packages, groups and individuals on a remote server
  • V 0.9.0: Plink output now supported in forge.
  • V 0.8.0: Forge now has new option --intersect to control merging behaviour
  • V 0.7.3: Enabled the validate module to handle duplicated packages
  • V 0.7.2: Turned off checksum validation for every module but validate
  • V 0.7.1: Added a check for ID duplicates when reading package collections
  • V 0.7.0: Made janno reading a lot more flexible
  • V 0.6.0: Added the /individuals_all API to the webserver
  • V 0.5.3: Replaced the .csl data-file setup for bibtex parsing with an in-code solution to solve a related issue with the binary executables
  • V 0.5.2: Fixed a bug that prevented fetch from running after an interrupted run
  • V 0.5.1: Added --downloadAll option to fetch
  • V 0.5.0: Added fetch command to download data from a poseidon server
  • ...
  • V 0.2.2: Various new commands and rename executable to trident.
  • V 0.2.1: Added option to read F-Statistics by file.
  • V 0.2.0: List command and Fstat commands seem to work correctly. Testing needed.
  • V 0.1.0: List command and Fstat commands work in early tests.
  • V 0.0.1: First working command line utility, poseidon summary