Compatibility issues with R's BiocFileCache #27
Comments
Hi @jwokaty, thank you for reporting this. The first should be an easy fix. Let me test out the second scenario. The only other time we ran into this was when the file added to the cache was large and took a while to move or copy, which is why we added a timeout constraint.
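For readers unfamiliar with this kind of constraint, here is a rough sketch of waiting for a file to appear with a timeout. It is an illustration only and not the package's actual code: `wait_for_path`, `timeout`, and `poll_interval` are hypothetical names chosen for this example.

```python
import time
from pathlib import Path


def wait_for_path(path, timeout=30.0, poll_interval=0.1):
    """Block until `path` exists, or raise TimeoutError after `timeout` seconds.

    Hypothetical helper for illustration; not part of pybiocfilecache.
    """
    target = Path(path)
    deadline = time.monotonic() + timeout
    while not target.exists():
        if time.monotonic() >= deadline:
            raise TimeoutError(f"{target} did not appear within {timeout} s")
        # Sleep briefly so a slow move/copy has time to finish
        time.sleep(poll_interval)
    return target
```

A large file that is still being copied would make the loop wait; a path that never appears would end in a `TimeoutError`, which is roughly the situation described in the second scenario.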
Hi @jwokaty, GitHub automatically closes issues now, but otherwise this works for me. Please install the most recent version of the package.
Starting with Python:
import pybiocfilecache as bfc
from pathlib import Path
cache_dir = "./cache_with_py"
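# remove any previous cache with the same name (no error if it does not exist)
import shutil
shutil.rmtree(cache_dir, ignore_errors=True)
# create the cache object used by the calls below
cache = bfc.BiocFileCache(cache_dir)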
# download the human gtf reference and save it as `hsapiens.gtf.gz` in the current working directory
import urllib.request
url = "ftp://ftp.ensembl.org/pub/release-71/gtf/homo_sapiens/Homo_sapiens.GRCh37.71.gtf.gz"
urllib.request.urlretrieve(url, filename = "hsapiens.gtf.gz" )
# add to the cache
cache.add(rname="homosapiens", fpath=Path("./hsapiens.gtf.gz"))
cache.list_resources()
Switch to R:
library(BiocFileCache)
cache_dir <- "./cache_with_py"
rbfc <- BiocFileCache(cache_dir)
length(rbfc)
bfcinfo(rbfc)
show(rbfc)
rbfc[["BFC1"]]
hsap <- file("./hsapiens.gtf.gz")
add2 <- bfcadd(rbfc, "hsapiens_from_R", "./hsapiens.gtf.gz", download=FALSE)
add2
Roundtrip to Python:
cache.list_resources()
cache.get(rname="hsapiens_from_R")
# or
cache.get(rid="BFC2")
downurl = "https://bioconductor.org/packages/stats/bioc/BiocFileCache/BiocFileCache_2024_stats.tab"
add_url = cache.add(rname="download_link", fpath=downurl, rtype="web")
cache.list_resources()
Let me know if you run into any issues.
Thanks for your quick attention to my issue! I reinstalled the new version; however, after running the second script and opening the cache with a file made with Python, I get
When I attempt the round trip, I am still not able to access the resource created in R. I am able to access the resource I previously created in Python. (Note, this is from a different session I tried in the Bioconductor Docker to make sure I'm not experiencing a problem local to my system.)
The resource created in R doesn't have the cache directory in the path, which might be why it's not found.
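To illustrate the suspicion about the path, here is a minimal sketch of resolving a stored path against the cache directory when it is relative. It assumes the R-created entry stores only a path relative to the cache directory; `resolve_rpath` is a hypothetical helper and not part of pybiocfilecache.

```python
from pathlib import Path


def resolve_rpath(cache_dir, rpath):
    """Anchor a stored resource path at the cache directory if it is relative.

    Hypothetical helper: assumes relative entries are meant to live inside
    `cache_dir`, and leaves absolute paths untouched.
    """
    candidate = Path(rpath)
    if candidate.is_absolute():
        return candidate
    return Path(cache_dir) / candidate
```

With this, an entry stored as a bare file name would be looked up inside `./cache_with_py` rather than in the current working directory.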
I'm experiencing two issues related to compatibility with R's BiocFileCache.
First, using a cache created by pyBiocFileCache with BiocFileCache. When I attempt to set up, in an R session, a cache that was created with pyBiocFileCache, I get the following error in my R session:
Lori suggested that this is related to schema_version, which is missing in the metadata table of my BiocFileCache.sqlite file although it should be inserted when creating the database:
pyBiocFileCache/src/pybiocfilecache/cache.py, lines 104 to 109 in 0c2d3a2
If I insert it manually into my BiocFileCache.sqlite file, I am able to open the cache that I created via pyBiocFileCache in an R session using BiocFileCache.
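A minimal sketch of that manual insert, using Python's standard sqlite3 module. It assumes the metadata table has `key` and `value` columns and that "0.99.4" is an acceptable schema version; both are assumptions to check against the actual database, and `ensure_schema_version` is a hypothetical helper name.

```python
import sqlite3


def ensure_schema_version(db_path, version="0.99.4"):
    """Insert a schema_version row into the metadata table if none exists.

    Assumes a `metadata` table with `key` and `value` columns (an assumption,
    not confirmed by the issue). Returns True if a row was inserted and
    False if one was already present.
    """
    con = sqlite3.connect(db_path)
    try:
        row = con.execute(
            "SELECT value FROM metadata WHERE key = 'schema_version'"
        ).fetchone()
        if row is not None:
            return False
        con.execute(
            "INSERT INTO metadata (key, value) VALUES ('schema_version', ?)",
            (version,),
        )
        con.commit()
        return True
    finally:
        con.close()
```

Calling `ensure_schema_version("./cache_with_py/BiocFileCache.sqlite")` would leave an existing value alone and only add the row when it is missing.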
Second, accessing with pyBiocFileCache a resource that was created in a cache initially through R's BiocFileCache. I get an RpathTimeoutError:
Here's how I created the caches.
Create a resource with pybiocfilecache.
Create a resource with BiocFileCache in an R session.
In a terminal running Python, try accessing the resource created in the R session.
In an R session, try creating a cache with the one made in the Python session.
If it's helpful, here are my virtualenv and my sessionInfo():