Openssl is not available in netConnectHttps for importing remote BigWig files #83

Open
lcolladotor opened this issue Feb 20, 2023 · 7 comments

Comments

@lcolladotor

Hi,

I'm having trouble importing remote BigWig files with derfinder, which internally uses rtracklayer::import(). I noticed this when looking at recount which is failing on BioC 3.16 and 3.17 (details at leekgroup/recount#23).

You can reproduce this issue with:

library("GenomicRanges")
library("rtracklayer")
range <- GRanges(seqnames = "chrY", ranges = IRanges(1, 57227415))
rtracklayer::import("http://sciserver.org/public-data/recount2/data/SRP002001/bw/mean_SRP002001.bw", selection = reduce(range), as = "RleList")
Error in seqinfo(con) : UCSC library operation failed
In addition: Warning message:
In seqinfo(con) :
  No openssl available in netConnectHttps for sciserver.org : 443
> traceback()
7: seqinfo(con)
6: seqinfo(con)
5: .local(con, format, text, ...)
4: import(FileForFormat(con), ...)
3: import(FileForFormat(con), ...)
2: rtracklayer::import("http://sciserver.org/public-data/recount2/data/SRP002001/bw/mean_SRP002001.bw", 
       selection = reduce(range), as = "RleList")
1: rtracklayer::import("http://sciserver.org/public-data/recount2/data/SRP002001/bw/mean_SRP002001.bw", 
       selection = reduce(range), as = "RleList")
> packageVersion("rtracklayer")
[1] ‘1.58.0’

I noticed issue #63 and saw that PR #68 fixed that issue with rtracklayer version 1.55.4 https://github.com/sanchit-saini/rtracklayer/blob/fa2a29d01f4f2975d8e2fe0de5ce4073b4e6b187/DESCRIPTION#L3 (which I'm assuming is already part of version 1.58.0 I'm using).

I also saw #73, and can tell that these are different errors since the canonical message here is No openssl available in netConnectHttps for sciserver.org : 443.

You get a similar error with duffel:

library("GenomicRanges")
library("rtracklayer")
range <- GRanges(seqnames = "chrY", ranges = IRanges(1, 57227415))
rtracklayer::import("http://duffel.rail.bio/recount/SRP002001/bw/mean_SRP002001.bw", selection = reduce(range), as = "RleList")
Error in seqinfo(con) : UCSC library operation failed
In addition: Warning message:
In seqinfo(con) :
  No openssl available in netConnectHttps for recount-opendata.s3.amazonaws.com : 443
> traceback()
7: seqinfo(con)
6: seqinfo(con)
5: .local(con, format, text, ...)
4: import(FileForFormat(con), ...)
3: import(FileForFormat(con), ...)
2: rtracklayer::import("http://duffel.rail.bio/recount/SRP002001/bw/mean_SRP002001.bw", 
       selection = reduce(range), as = "RleList")
1: rtracklayer::import("http://duffel.rail.bio/recount/SRP002001/bw/mean_SRP002001.bw", 
       selection = reduce(range), as = "RleList")

(duffel currently redirects to AWS, see nellore/digitalocean-duffel@c6e53d5, so the duffel and AWS links are equivalent. IDIES is a different mirror of the recount2 data.)

> library("GenomicRanges")
> library("rtracklayer")
> range <- GRanges(seqnames = "chrY", ranges = IRanges(1, 57227415))
> rtracklayer::import("https://recount-opendata.s3.amazonaws.com/recount2/SRP002001/bw/mean_SRP002001.bw", selection = reduce(range), as = "RleList")
Error in seqinfo(con) : UCSC library operation failed
In addition: Warning message:
In seqinfo(con) :
  No openssl available in netConnectHttps for recount-opendata.s3.amazonaws.com : 443
> options(width = 120); sessioninfo::session_info()
─ Session info ───────────────────────────────────────────────────────────────
 setting  value
 version  R version 4.2.2 (2022-10-31)
 os       macOS Ventura 13.0.1
 system   aarch64, darwin20
 ui       X11
 language (EN)
 collate  en_US.UTF-8
 ctype    en_US.UTF-8
 tz       America/New_York
 date     2023-02-20
 pandoc   2.17.1.1 @ /opt/homebrew/bin/pandoc

─ Packages ───────────────────────────────────────────────────────────────────
 package              * version   date (UTC) lib source
 Biobase                2.58.0    2022-11-01 [1] Bioconductor
 BiocGenerics         * 0.44.0    2022-11-01 [1] Bioconductor
 BiocIO                 1.8.0     2022-11-01 [1] Bioconductor
 BiocParallel           1.32.5    2022-12-25 [1] Bioconductor
 Biostrings             2.66.0    2022-11-01 [1] Bioconductor
 bitops                 1.0-7     2021-04-24 [1] CRAN (R 4.2.0)
 cli                    3.6.0     2023-01-09 [1] CRAN (R 4.2.0)
 codetools              0.2-19    2023-02-01 [1] CRAN (R 4.2.0)
 crayon                 1.5.2     2022-09-29 [1] CRAN (R 4.2.0)
 DelayedArray           0.24.0    2022-11-01 [1] Bioconductor
 GenomeInfoDb         * 1.34.9    2023-02-02 [1] Bioconductor
 GenomeInfoDbData       1.2.9     2022-11-02 [1] Bioconductor
 GenomicAlignments      1.34.0    2022-11-01 [1] Bioconductor
 GenomicRanges        * 1.50.2    2022-12-18 [1] Bioconductor
 IRanges              * 2.32.0    2022-11-01 [1] Bioconductor
 lattice                0.20-45   2021-09-22 [1] CRAN (R 4.2.2)
 Matrix                 1.5-3     2022-11-11 [1] CRAN (R 4.2.0)
 MatrixGenerics         1.10.0    2022-11-01 [1] Bioconductor
 matrixStats            0.63.0    2022-11-18 [1] CRAN (R 4.2.0)
 RCurl                  1.98-1.10 2023-01-27 [1] CRAN (R 4.2.0)
 restfulr               0.0.15    2022-06-16 [1] CRAN (R 4.2.0)
 rjson                  0.2.21    2022-01-09 [1] CRAN (R 4.2.0)
 Rsamtools              2.14.0    2022-11-01 [1] Bioconductor
 rtracklayer          * 1.58.0    2022-11-01 [1] Bioconductor
 S4Vectors            * 0.36.1    2022-12-07 [1] Bioconductor
 sessioninfo            1.2.2     2021-12-06 [1] CRAN (R 4.2.0)
 SummarizedExperiment   1.28.0    2022-11-01 [1] Bioconductor
 XML                    3.99-0.13 2022-12-04 [1] CRAN (R 4.2.0)
 XVector                0.38.0    2022-11-01 [1] Bioconductor
 yaml                   2.3.7     2023-01-23 [1] CRAN (R 4.2.0)
 zlibbioc               1.44.0    2022-11-01 [1] Bioconductor

 [1] /Library/Frameworks/R.framework/Versions/4.2-arm64/Resources/library

──────────────────────────────────────────────────────────────────────────────

Let me know if there's any piece of info that might be helpful to you.

Best,
Leo

@lcolladotor lcolladotor changed the title Openssl is not available in netConnectHttps for importing some BigWig files Openssl is not available in netConnectHttps for importing remote BigWig files Feb 20, 2023
@lcolladotor
Author

Ahh, note that @ChristopherWilks posted at leekgroup/recount#23 (comment) that the same code works with rtracklayer version 1.50.0 from BioC 3.11.

lcolladotor added a commit to leekgroup/recount that referenced this issue Feb 20, 2023
…lks/snaptron#17. Also #23. I tried insulating recount from these tests, so they'll be reported as warnings instead of errors on the BioC build machines for now.
@lawremi
Owner

lawremi commented Feb 21, 2023

Support for SSL depends on having the openssl library available at build time. It's conceivable that either the user is building the package from source without openssl, or Bioconductor at some point stopped providing Mac binaries with openssl support.

@lcolladotor
Author

Hi Michael,

The user in this case is me, but the Bioconductor build machines were also reporting the same error. I did notice that the openssl package isn't listed in my R session info above. http://bioconductor.org/checkResults/release/bioc-LATEST/recount/ doesn't show the error anymore, but only because I turned it into a warning by editing the tests at leekgroup/recount@5f2696d to rely on tryCatch().
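
For readers who want to do something similar, here is a minimal sketch of the tryCatch() pattern, not the actual code from leekgroup/recount@5f2696d. The names warn_on_error() and failing_import() are hypothetical; failing_import() is only a stand-in for a remote rtracklayer::import() call so that the example runs without network access.

```r
## Hypothetical helper (not part of recount or rtracklayer): call `fun`
## and downgrade any error it throws to a warning, returning NULL instead.
warn_on_error <- function(fun, ...) {
    tryCatch(
        fun(...),
        error = function(e) {
            warning("Remote BigWig import failed: ", conditionMessage(e),
                call. = FALSE
            )
            NULL
        }
    )
}

## Stand-in for rtracklayer::import() on a remote file (assumption: it
## simply reproduces the error message reported in this thread).
failing_import <- function(url) {
    stop("UCSC library operation failed")
}

## The error surfaces as a warning, which is reported and then muffled here.
result <- withCallingHandlers(
    warn_on_error(
        failing_import,
        "http://duffel.rail.bio/recount/SRP002001/bw/mean_SRP002001.bw"
    ),
    warning = function(w) {
        message("Caught warning: ", conditionMessage(w))
        invokeRestart("muffleWarning")
    }
)
```

When the wrapped call succeeds, its value is returned unchanged, so only the failing remote imports are affected.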

I'll ask on bioc-devel to see if someone else knows about a change in the rtracklayer binaries.

Best,
Leo

@lcolladotor
Author

Also, my collaborator @nellore pointed out we had run into a similar issue back in 2016 as noted at https://support.bioconductor.org/p/81267/

@lawremi
Owner

lawremi commented Feb 24, 2023

By "openssl" I mean the C library, not the R package. I'm guessing that the Bioconductor build machine needs to be configured to build openssl support into the Mac binary. This should be as simple as installing openssl with brew. Would you happen to know the right person to contact about that?

@lcolladotor
Author

lcolladotor commented Mar 3, 2023

Hi Michael,

Jennifer and Hervé replied at https://stat.ethz.ch/pipermail/bioc-devel/2023-March/019503.html. It sounds like BioC is building the packages with openssl C Library support.

Do you have other leads? cc @ChristopherWilks @nellore.

Best,
Leo

@lcolladotor
Author

Hi,

I no longer get the "No openssl available in netConnectHttps for" part of the error message. However, just like #73, I'm still encountering issues with derfinder, and thus also with recount, when importing remote BigWig files through rtracklayer.

The above link (first message on this thread) has changed from "http://sciserver.org/public-data/recount2/data/SRP002001/bw/mean_SRP002001.bw" to "http://data.idies.jhu.edu/recount2/data/SRP002001/bw/mean_SRP002001.bw". duffel currently points to AWS, but with all 3 links I get the same type of error.

Duffel link reproducible example

Here's the small reproducible code:

## Remotely access from duffel link
library("GenomicRanges")
library("rtracklayer")
range <- GRanges(seqnames = "chrY", ranges = IRanges(1, 57227415))
rtracklayer::import("http://duffel.rail.bio/recount/SRP002001/bw/mean_SRP002001.bw", selection = reduce(range), as = "RleList")
traceback()
options(width = 120)
sessioninfo::session_info()
curl::curl_version()

Here's the R output

> ## Remotely access from duffel link
> library("GenomicRanges")
> library("rtracklayer")
> range <- GRanges(seqnames = "chrY", ranges = IRanges(1, 57227415))
> rtracklayer::import("http://duffel.rail.bio/recount/SRP002001/bw/mean_SRP002001.bw", selection = reduce(range), as = "RleList")
Error in seqinfo(con) : UCSC library operation failed
In addition: Warning message:
In seqinfo(con) :
  Couldn't open https://recount-opendata.s3.amazonaws.com/recount2/SRP002001/bw/mean_SRP002001.bw
> traceback()
7: seqinfo(con)
6: seqinfo(con)
5: .local(con, format, text, ...)
4: import(FileForFormat(con), ...)
3: import(FileForFormat(con), ...)
2: rtracklayer::import("http://duffel.rail.bio/recount/SRP002001/bw/mean_SRP002001.bw",
       selection = reduce(range), as = "RleList")
1: rtracklayer::import("http://duffel.rail.bio/recount/SRP002001/bw/mean_SRP002001.bw",
       selection = reduce(range), as = "RleList")
> options(width = 120)
> sessioninfo::session_info()
─ Session info ───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
 setting  value
 version  R version 4.4.0 (2024-04-24)
 os       macOS Sonoma 14.5
 system   aarch64, darwin20
 ui       X11
 language (EN)
 collate  en_US.UTF-8
 ctype    en_US.UTF-8
 tz       America/New_York
 date     2024-05-20
 pandoc   3.1.12.1 @ /opt/homebrew/bin/pandoc

─ Packages ───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
 package              * version     date (UTC) lib source
 abind                  1.4-5       2016-07-21 [1] CRAN (R 4.4.0)
 Biobase                2.64.0      2024-04-30 [1] Bioconductor 3.19 (R 4.4.0)
 BiocGenerics         * 0.50.0      2024-04-30 [1] Bioconductor 3.19 (R 4.4.0)
 BiocIO                 1.14.0      2024-04-30 [1] Bioconductor 3.19 (R 4.4.0)
 BiocParallel           1.38.0      2024-04-30 [1] Bioconductor 3.19 (R 4.4.0)
 Biostrings             2.72.0      2024-04-30 [1] Bioconductor 3.19 (R 4.4.0)
 bitops                 1.0-7       2021-04-24 [1] CRAN (R 4.4.0)
 cli                    3.6.2       2023-12-11 [1] CRAN (R 4.4.0)
 codetools              0.2-20      2024-03-31 [1] CRAN (R 4.4.0)
 crayon                 1.5.2       2022-09-29 [1] CRAN (R 4.4.0)
 curl                   5.2.1       2024-03-01 [1] CRAN (R 4.4.0)
 DelayedArray           0.30.1      2024-05-07 [1] Bioconductor 3.19 (R 4.4.0)
 GenomeInfoDb         * 1.40.0      2024-04-30 [1] Bioconductor 3.19 (R 4.4.0)
 GenomeInfoDbData       1.2.12      2024-05-03 [1] Bioconductor
 GenomicAlignments      1.40.0      2024-04-30 [1] Bioconductor 3.19 (R 4.4.0)
 GenomicRanges        * 1.56.0      2024-05-01 [1] Bioconductor 3.19 (R 4.4.0)
 httr                   1.4.7       2023-08-15 [1] CRAN (R 4.4.0)
 IRanges              * 2.38.0      2024-04-30 [1] Bioconductor 3.19 (R 4.4.0)
 jsonlite               1.8.8       2023-12-04 [1] CRAN (R 4.4.0)
 lattice                0.22-6      2024-03-20 [1] CRAN (R 4.4.0)
 Matrix                 1.7-0       2024-03-22 [1] CRAN (R 4.4.0)
 MatrixGenerics         1.16.0      2024-04-30 [1] Bioconductor 3.19 (R 4.4.0)
 matrixStats            1.3.0       2024-04-11 [1] CRAN (R 4.4.0)
 R6                     2.5.1       2021-08-19 [1] CRAN (R 4.4.0)
 RCurl                  1.98-1.14   2024-01-09 [1] CRAN (R 4.4.0)
 restfulr               0.0.15      2022-06-16 [1] CRAN (R 4.4.0)
 rjson                  0.2.21      2022-01-09 [1] CRAN (R 4.4.0)
 Rsamtools              2.20.0      2024-04-30 [1] Bioconductor 3.19 (R 4.4.0)
 rtracklayer          * 1.64.0      2024-04-30 [1] Bioconductor 3.19 (R 4.4.0)
 S4Arrays               1.4.0       2024-04-30 [1] Bioconductor 3.19 (R 4.4.0)
 S4Vectors            * 0.42.0      2024-04-30 [1] Bioconductor 3.19 (R 4.4.0)
 sessioninfo            1.2.2       2021-12-06 [1] CRAN (R 4.4.0)
 SparseArray            1.4.3       2024-05-07 [1] Bioconductor 3.19 (R 4.4.0)
 SummarizedExperiment   1.34.0      2024-04-30 [1] Bioconductor 3.19 (R 4.4.0)
 UCSC.utils             1.0.0       2024-04-30 [1] Bioconductor 3.19 (R 4.4.0)
 XML                    3.99-0.16.1 2024-01-22 [1] CRAN (R 4.4.0)
 XVector                0.44.0      2024-04-30 [1] Bioconductor 3.19 (R 4.4.0)
 yaml                   2.3.8       2023-12-11 [1] CRAN (R 4.4.0)
 zlibbioc               1.50.0      2024-04-30 [1] Bioconductor 3.19 (R 4.4.0)

 [1] /Library/Frameworks/R.framework/Versions/4.4-arm64/Resources/library

──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
> curl::curl_version()
$version
[1] "8.6.0"

$ssl_version
[1] "(SecureTransport) LibreSSL/3.3.6"

$libz_version
[1] "1.2.12"

$libssh_version
[1] NA

$libidn_version
[1] NA

$host
[1] "x86_64-apple-darwin23.0"

$protocols
 [1] "dict"    "file"    "ftp"     "ftps"    "gopher"  "gophers" "http"    "https"   "imap"    "imaps"   "ldap"
[12] "ldaps"   "mqtt"    "pop3"    "pop3s"   "rtsp"    "smb"     "smbs"    "smtp"    "smtps"   "telnet"  "tftp"

$ipv6
[1] TRUE

$http2
[1] TRUE

$idn
[1] FALSE

IDIES/AWS links

Here's more code for testing with the IDIES or AWS links directly, thus bypassing the redirect service provided by duffel. The results are the same.

Note that downloading the file with download.file(mode = "wb") and then using rtracklayer::import() on the local copy works.

R code:

## Remotely access from IDIES link
library("GenomicRanges")
library("rtracklayer")
range <- GRanges(seqnames = "chrY", ranges = IRanges(1, 57227415))
rtracklayer::import("http://data.idies.jhu.edu/recount2/data/SRP002001/bw/mean_SRP002001.bw", selection = reduce(range), as = "RleList")
traceback()

## Remotely access from AWS link
library("GenomicRanges")
library("rtracklayer")
range <- GRanges(seqnames = "chrY", ranges = IRanges(1, 57227415))
rtracklayer::import("https://recount-opendata.s3.amazonaws.com/recount2/SRP002001/bw/mean_SRP002001.bw", selection = reduce(range), as = "RleList")
traceback()


## Locally download data from duffel link
temp_duffel <- tempfile("mean_SRP002001_duffel.bw")
download.file("http://duffel.rail.bio/recount/SRP002001/bw/mean_SRP002001.bw", temp_duffel, mode = "wb")
chr <- "chrY"
rtracklayer::import(BigWigFile(temp_duffel), selection = reduce(range), as = "RleList")[[chr]]

## Locally download data from IDIES link
temp_idies <- tempfile("mean_SRP002001_idies.bw")
download.file("http://data.idies.jhu.edu/recount2/data/SRP002001/bw/mean_SRP002001.bw", temp_idies, mode = "wb")
rtracklayer::import(BigWigFile(temp_idies), selection = reduce(range), as = "RleList")[[chr]]

## Locally download data from AWS link
temp_aws <- tempfile("mean_SRP002001_aws.bw")
download.file("https://recount-opendata.s3.amazonaws.com/recount2/SRP002001/bw/mean_SRP002001.bw", temp_aws, mode = "wb")
rtracklayer::import(BigWigFile(temp_aws), selection = reduce(range), as = "RleList")[[chr]]

R output:

> ## Remotely access from IDIES link
> library("GenomicRanges")
> library("rtracklayer")
> range <- GRanges(seqnames = "chrY", ranges = IRanges(1, 57227415))
> rtracklayer::import("http://data.idies.jhu.edu/recount2/data/SRP002001/bw/mean_SRP002001.bw", selection = reduce(range), as = "RleList")
Error in seqinfo(con) : UCSC library operation failed
In addition: Warning message:
In seqinfo(con) :
  Couldn't open https://data.idies.jhu.edu/recount2/data/SRP002001/bw/mean_SRP002001.bw
> traceback()
7: seqinfo(con)
6: seqinfo(con)
5: .local(con, format, text, ...)
4: import(FileForFormat(con), ...)
3: import(FileForFormat(con), ...)
2: rtracklayer::import("http://data.idies.jhu.edu/recount2/data/SRP002001/bw/mean_SRP002001.bw",
       selection = reduce(range), as = "RleList")
1: rtracklayer::import("http://data.idies.jhu.edu/recount2/data/SRP002001/bw/mean_SRP002001.bw",
       selection = reduce(range), as = "RleList")


> ## Remotely access from AWS link
> library("GenomicRanges")
> library("rtracklayer")
> range <- GRanges(seqnames = "chrY", ranges = IRanges(1, 57227415))
> rtracklayer::import("https://recount-opendata.s3.amazonaws.com/recount2/SRP002001/bw/mean_SRP002001.bw", selection = reduce(range), as = "RleList")

Error in seqinfo(con) : UCSC library operation failed
In addition: Warning message:
In seqinfo(con) :
  Couldn't open https://recount-opendata.s3.amazonaws.com/recount2/SRP002001/bw/mean_SRP002001.bw
>
> traceback()
7: seqinfo(con)
6: seqinfo(con)
5: .local(con, format, text, ...)
4: import(FileForFormat(con), ...)
3: import(FileForFormat(con), ...)
2: rtracklayer::import("https://recount-opendata.s3.amazonaws.com/recount2/SRP002001/bw/mean_SRP002001.bw",
       selection = reduce(range), as = "RleList")
1: rtracklayer::import("https://recount-opendata.s3.amazonaws.com/recount2/SRP002001/bw/mean_SRP002001.bw",
       selection = reduce(range), as = "RleList")
>
>
> ## Locally download data from duffel link
> temp_duffel <- tempfile("mean_SRP002001_duffel.bw")
> download.file("http://duffel.rail.bio/recount/SRP002001/bw/mean_SRP002001.bw", temp_duffel, mode = "wb")
trying URL 'http://duffel.rail.bio/recount/SRP002001/bw/mean_SRP002001.bw'
Content type 'binary/octet-stream' length 50936703 bytes (48.6 MB)
==================================================
downloaded 48.6 MB

> chr <- "chrY"
> rtracklayer::import(BigWigFile(temp_duffel), selection = reduce(range), as = "RleList")[[chr]]
numeric-Rle of length 57227415 with 2855 runs
  Lengths:  2838976       36    62104       36    21210        1 ...      573       36    30043       36   341328
  Values :   0.0000  25.7608   0.0000  12.8804   0.0000  12.8804 ...   0.0000  12.8804   0.0000  12.8804   0.0000



> ## Locally download data from IDIES link
> temp_idies <- tempfile("mean_SRP002001_idies.bw")
> download.file("http://data.idies.jhu.edu/recount2/data/SRP002001/bw/mean_SRP002001.bw", temp_idies, mode = "wb")
trying URL 'http://data.idies.jhu.edu/recount2/data/SRP002001/bw/mean_SRP002001.bw'
Content type 'text/plain' length 50936703 bytes (48.6 MB)
==================================================
downloaded 48.6 MB

> rtracklayer::import(BigWigFile(temp_idies), selection = reduce(range), as = "RleList")[[chr]]
numeric-Rle of length 57227415 with 2855 runs
  Lengths:  2838976       36    62104       36    21210        1 ...      573       36    30043       36   341328
  Values :   0.0000  25.7608   0.0000  12.8804   0.0000  12.8804 ...   0.0000  12.8804   0.0000  12.8804   0.0000



> ## Locally download data from AWS link
> temp_aws <- tempfile("mean_SRP002001_aws.bw")
> download.file("https://recount-opendata.s3.amazonaws.com/recount2/SRP002001/bw/mean_SRP002001.bw", temp_aws, mode = "wb")
trying URL 'https://recount-opendata.s3.amazonaws.com/recount2/SRP002001/bw/mean_SRP002001.bw'
Content type 'binary/octet-stream' length 50936703 bytes (48.6 MB)
==================================================
downloaded 48.6 MB

> rtracklayer::import(BigWigFile(temp_aws), selection = reduce(range), as = "RleList")[[chr]]
numeric-Rle of length 57227415 with 2855 runs
  Lengths:  2838976       36    62104       36    21210        1 ...      573       36    30043       36   341328
  Values :   0.0000  25.7608   0.0000  12.8804   0.0000  12.8804 ...   0.0000  12.8804   0.0000  12.8804   0.0000

recount reprex

Building recount is failing on BioC 3.19 and 3.20 ultimately due to this issue, but with a much larger BigWig file (1327.5 MB vs 48.6 MB in the earlier reproducible example). This was reported to me at leekgroup/recount#25. https://bioconductor.org/checkResults/release/bioc-LATEST/recount/nebbiolo1-buildsrc.html points to https://github.com/leekgroup/recount/blob/c3fa29a46c64598a51c54df73cfbcf1252389c80/vignettes/recount-quickstart.Rmd#L524-L528, which can be reduced to just this code:

library("GenomicRanges")
library("rtracklayer")
files <- "http://duffel.rail.bio/recount/SRP009615/bw/mean_SRP009615.bw"
chr <- "chrY"
chrlen <- 57227415
bList <- BigWigFileList(files)
which <- GRanges(seqnames = chr, ranges = IRanges(1, chrlen))
x <- import(bList[[1]], selection = reduce(which), as = "RleList")
traceback()

Here's the output:

> files <- "http://duffel.rail.bio/recount/SRP009615/bw/mean_SRP009615.bw"
> chr <- "chrY"
> chrlen <- 57227415
> bList <- BigWigFileList(files)
> which <- GRanges(seqnames = chr, ranges = IRanges(1, chrlen))
> x <- import(bList[[1]], selection = reduce(which), as = "RleList")
Error in seqinfo(con) : UCSC library operation failed
In addition: Warning message:
In seqinfo(con) :
  Couldn't open https://recount-opendata.s3.amazonaws.com/recount2/SRP009615/bw/mean_SRP009615.bw
> traceback()
5: seqinfo(con)
4: seqinfo(con)
3: .local(con, format, text, ...)
2: import(bList[[1]], selection = reduce(which), as = "RleList")
1: import(bList[[1]], selection = reduce(which), as = "RleList")

Note that downloading the file locally does work. However, connection issues are more likely to occur while downloading such a large file.

> temp_SRP009615 <- tempfile("mean_SRP009615_duffel.bw")
> download.file(files, temp_SRP009615, mode = "wb")
trying URL 'http://duffel.rail.bio/recount/SRP009615/bw/mean_SRP009615.bw'
Content type 'binary/octet-stream' length 1392034077 bytes (1327.5 MB)
==================================================
downloaded 1327.5 MB

> rtracklayer::import(BigWigFile(temp_SRP009615), selection = reduce(which), as = "RleList")[[chr]]
numeric-Rle of length 57227415 with 150774 runs
  Lengths:  2781486       36        1       36        8       18 ...       57       36       49       36   339535
  Values : 0.000000 0.286485 0.000000 0.372865 0.000000 0.323784 ... 0.000000 0.323784 0.000000 0.310604 0.000000

I guess that I could expand derfinder to attempt to download the file locally with 3 retries, similar to how it currently tries to import the data remotely with rtracklayer 3 times (https://github.com/lcolladotor/derfinder/blob/f9cd986e0c1b9ea6551d0d8d2077d4501216a661/R/loadCoverage.R#L396-L410). But it doesn't seem like the best solution to me, as one of the appeals of the BigWig format is the option to remotely access parts of a file.
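
A rough sketch of what such a retry wrapper could look like is below. This is not derfinder's existing code: with_retries() and flaky_download() are hypothetical names, and flaky_download() only simulates a download that fails twice before succeeding, so the example runs offline.

```r
## Hypothetical helper, not derfinder's actual implementation: call `fun`
## up to `max_tries` times, pausing `wait` seconds between failed attempts.
## Assumes `fun` signals failure by throwing an error.
with_retries <- function(fun, max_tries = 3, wait = 0) {
    last_error <- NULL
    for (attempt in seq_len(max_tries)) {
        result <- tryCatch(fun(), error = function(e) e)
        if (!inherits(result, "error")) {
            return(result)
        }
        last_error <- result
        if (attempt < max_tries) Sys.sleep(wait)
    }
    stop("All ", max_tries, " attempts failed. Last error: ",
        conditionMessage(last_error),
        call. = FALSE
    )
}

## Stand-in for a download step that fails twice before succeeding.
calls <- 0
flaky_download <- function() {
    calls <<- calls + 1
    if (calls < 3) stop("connection reset")
    "downloaded"
}

status <- with_retries(flaky_download)
```

In practice, `fun` could wrap the download.file(files, destfile, mode = "wb") call shown above, followed by rtracklayer::import(BigWigFile(destfile), ...) on the local copy. Depending on the download method, download.file() may report failure through a non-zero return value rather than an error, so that case would need an explicit check.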

Let me know if I can provide any more useful information.

Best,
Leo

lcolladotor added a commit to leekgroup/recount that referenced this issue May 20, 2024
lcolladotor added a commit to LieberInstitute/recountWorkflow that referenced this issue May 20, 2024
lcolladotor added a commit to leekgroup/recount that referenced this issue May 21, 2024
lcolladotor added a commit to LieberInstitute/recountWorkflow that referenced this issue May 21, 2024
lcolladotor added a commit to leekgroup/recount that referenced this issue May 21, 2024
…s. So we can now run all this code on Windows, unlike before. It's not working for remote BigWig files, but I don't know if that's due to lawremi/rtracklayer#83 or something else Windows-specific.