Choose max value to identify appropriate NumPy dtype to represent seqnames (#122)
jkanche authored Sep 23, 2024
1 parent 6c8d32e commit 5f6756e
Showing 2 changed files with 11 additions and 7 deletions.
3 changes: 2 additions & 1 deletion CHANGELOG.md
@@ -1,10 +1,11 @@
 # Changelog

-## Version 0.4.27 - 0.4.28
+## Version 0.4.27 - 0.4.30

 - Implement `subtract` method, add tests.
 - Use accessor methods to access properties especially `get_seqnames()`
 - Modify search and overlap methods for strand-awareness.
+- Choose appropriate NumPy dtype for sequences.
 - Update tests and documentation.

 ## Version 0.4.26
15 changes: 9 additions & 6 deletions src/genomicranges/GenomicRanges.py
@@ -222,13 +222,16 @@ def _sanitize_seqnames(self, seqnames, seqinfo):
         if not isinstance(seqnames, np.ndarray):
             seqnames = np.asarray([self._reverse_seqindex[x] for x in seqnames])

-        num_uniq = len(np.unique(seqnames))
-        if num_uniq < 2**8:
+        if len(seqnames) == 0:
             seqnames = seqnames.astype(np.int8)
-        elif num_uniq < 2**16:
-            seqnames = seqnames.astype(np.int16)
-        elif num_uniq < 2**32:
-            seqnames = seqnames.astype(np.int32)
+        else:
+            num_uniq = np.max(seqnames)
+            if num_uniq < 2**8:
+                seqnames = seqnames.astype(np.int8)
+            elif num_uniq < 2**16:
+                seqnames = seqnames.astype(np.int16)
+            elif num_uniq < 2**32:
+                seqnames = seqnames.astype(np.int32)

         return seqnames
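
Note on the thresholds in this hunk: the cutoffs `2**8`, `2**16`, and `2**32` are the sizes of unsigned ranges, while the target dtypes are signed. `np.int8` tops out at 127, so a maximum index between 128 and 255 would pass the first check but not fit in the chosen dtype. The sketch below is not part of this commit. It shows one possible way to pick the dtype from `np.iinfo` limits instead of hard-coded powers of two. The function names `smallest_signed_dtype` and `downcast_indices` are hypothetical, and the sketch assumes the indices are non-negative integers.

```python
import numpy as np


def smallest_signed_dtype(max_value: int) -> np.dtype:
    """Return the narrowest signed integer dtype that can hold max_value."""
    for candidate in (np.int8, np.int16, np.int32, np.int64):
        # np.iinfo reports the true upper bound of each dtype (127 for int8).
        if max_value <= np.iinfo(candidate).max:
            return np.dtype(candidate)
    raise ValueError(f"{max_value} does not fit in a 64-bit signed integer")


def downcast_indices(indices: np.ndarray) -> np.ndarray:
    """Cast non-negative integer indices to the narrowest safe signed dtype."""
    if indices.size == 0:
        # np.max raises ValueError on an empty array, so handle it first,
        # mirroring the empty-input branch added in the hunk above.
        return indices.astype(np.int8)
    return indices.astype(smallest_signed_dtype(int(indices.max())))
```

With this approach, an array whose largest index is 200 would be stored as `np.int16` rather than `np.int8`, at the cost of one extra byte per element.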
