[R] Loading large tar.gz XenaHub Data into R

William Dunlap wdun|@p @end|ng |rom t|bco@com
Fri Aug 2 00:46:50 CEST 2019


By the way, instead of saying only that there were warnings, it would be
nice to show some of them.  E.g.,
> z <- readLines("
https://tcga.xenahubs.net/download/TCGA.GBMLGG.sampleMap/HumanMethylation450.gz
")
[ Hit control-C or Esc to interrupt, or wait a long time ]
There were 50 or more warnings (use warnings() to see the first 50)
> warnings()
Warning messages:
1: In readLines("
https://tcga.xenahubs.net/download/TCGA.GBMLGG.sampleMap/HumanMethylation450.gz")
:
  line 1 appears to contain an embedded nul
2: In readLines("
https://tcga.xenahubs.net/download/TCGA.GBMLGG.sampleMap/HumanMethylation450.gz")
:
  line 4 appears to contain an embedded nul
3: In readLines("
https://tcga.xenahubs.net/download/TCGA.GBMLGG.sampleMap/HumanMethylation450.gz")
:
  line 7 appears to contain an embedded nul

Burt's guess looks right, as the following gives 10 long lines of
reasonable-looking data.  Remove the 'n=10' to get all of it.

z <- readLines(gzcon(url("
https://tcga.xenahubs.net/download/TCGA.GBMLGG.sampleMap/HumanMethylation450.gz")),
n=10)

Bill Dunlap
TIBCO Software
wdunlap tibco.com


On Thu, Aug 1, 2019 at 3:37 PM Bert Gunter <bgunter.4567 using gmail.com> wrote:

> These are gzipped files, I assume. So see ?gzfile and associated info
> for how to open a gzip connection and read from it. You may also
> prefer to search (e.g. at rseek.org) on "read a gzipped file" or
> similar for possible alternatives.
>
> Of course, if they're not gzipped files, then ignore the above. If
> they are, your current approach is hopeless.
>
>
> Cheers,
> Bert
>
> On Thu, Aug 1, 2019 at 3:13 PM Spencer Brackett
> <spbrackett20 using saintjosephhs.com> wrote:
> >
> > Good evening,
> >
> > I am attempting to load the following Xena dataset
> >
> https://tcga.xenahubs.net/download/TCGA.GBMLGG.sampleMap/HumanMethylation450.gz
> >
> > I am trying to unpack the dataset and read it into R as a table, but due
> to
> > the size of the file, I am having some trouble. The following are the
> > commands I have tried thus far.
> >
> > HumanMethylation450 <- fread("
> >
> https://tcga.xenahubs.net/download/TCGA.GBMLGG.sampleMap/HumanMethylation450.gz
> > ")
> >
> > readLines("
> >
> https://tcga.xenahubs.net/download/TCGA.GBMLGG.sampleMap/HumanMethylation450.gz
> > ")
> >
> >                  ###These two above attempts failed with warning messages
> > from R###
> >
> > Methyl <-read.delim("
> >
> https://tcga.xenahubs.net/download/TCGA.GBMLGG.sampleMap/HumanMethylation450.gz
> > ")
> >
> >                ##This attempt is still processing, but has been doing so
> > for quite some time##
> >
> > Any ideas as to what else I could try?
> >
> > Best,
> >
> > Spencer
> >
> >         [[alternative HTML version deleted]]
> >
> > ______________________________________________
> > R-help using r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
>
> ______________________________________________
> R-help using r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

	[[alternative HTML version deleted]]



More information about the R-help mailing list