[R] Fwd: UPDATE

Richard M. Heiberger rmh using temple.edu
Thu Dec 27 05:23:27 CET 2018


This is wrong because the file is a csv file.  read_excel is designed
for xls and xlsx files.
GBM_protein_expression <- read_excel("C:/Users/Spencer/Desktop/GBM protein_expression.csv")

How did you get a csv? It downloads as tsv.
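
One way to see which delimiter a file really uses, whatever its extension
says, is to look at its header line. A minimal sketch, assuming a small
made-up file as a stand-in for the download; the column names and the
helper guess_sep are hypothetical, for illustration only:

```r
# Write a small tab-separated file under a misleading .csv name.
# The contents are invented; they only stand in for the real download.
path <- tempfile(fileext = ".csv")
writeLines(c("donor_id\tgene_name\texpressed",
             "D1\tEGFR\tTRUE",
             "D2\tTP53\tFALSE"), path)

# Hypothetical helper: guess the separator by counting tabs and commas
# in the header line. A rough heuristic, not a general format detector.
guess_sep <- function(file) {
  header  <- readLines(file, n = 1)
  n_tab   <- lengths(regmatches(header, gregexpr("\t", header, fixed = TRUE)))
  n_comma <- lengths(regmatches(header, gregexpr(",", header, fixed = TRUE)))
  if (n_tab > n_comma) "\t" else ","
}

sep_found <- guess_sep(path)
```

If sep_found is the tab character, the file is a TSV even though its name
ends in .csv.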

The statement you should use is in base R, so no library() statement is needed.

GBM_protein_expression <- read.delim("C:/Users/Spencer/Desktop/GBM protein_expression.csv")

read.delim is the same as read.csv except that it sets the sep
argument to "\t".
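
To illustrate, here is a minimal sketch on a small made-up tab-separated
file. The contents and object names are placeholders, not the ICGC data.
Both calls should return the same data frame:

```r
# Create a tiny tab-separated file to read back in (invented contents).
path <- tempfile(fileext = ".tsv")
writeLines(c("donor_id\tgene_name\texpressed",
             "D1\tEGFR\tTRUE",
             "D2\tTP53\tFALSE"), path)

# read.delim defaults to sep = "\t"; read.csv needs it set explicitly.
via_delim <- read.delim(path)
via_csv   <- read.csv(path, sep = "\t")

# Compare the two results.
same_result <- identical(via_delim, via_csv)

# A column holding only TRUE/FALSE entries is read as logical.
str(via_delim)
```

A file with only TRUE or FALSE entries is not a problem for either
function; such columns simply come in as logical vectors.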



On Wed, Dec 26, 2018 at 11:11 PM Spencer Brackett
<spbrackett20 using saintjosephhs.com> wrote:
>
> Sorry, my mistake.
>
> So I could still use read.table and should I try using a .txt version of
> the file to avoid the silent changes you described?
>
> Also, when I tried to simplify this process by downloading the dataset into
> RStudio as opposed to R (GUI), I received the following...
>  library(readxl)
> > GBM_protein_expression <- read_excel("C:/Users/Spencer/Desktop/GBM protein_expression.csv")
> Error: Can't establish that the input is either xls or xlsx.
> > View(GBM_protein_expression)
> Error in View : object 'GBM_protein_expression' not found
> Error in gzfile(file, mode) : cannot open the connection
> In addition: Warning message:
> In gzfile(file, mode) :
>   cannot open compressed file
> 'C:/Users/Spencer/AppData/Local/Temp/RtmpQNQrMh/input147c61fc5b52.rds',
> probable reason 'No such file or directory'
> > library(readxl)
> > GBM_protein_expression <-
> read_excel("C:/Users/Spencer/Desktop/GBM_protein_ expression.xlsx")
> readxl works best with a newer version of the tibble package.
> You currently have tibble v1.4.2.
> Falling back to column name repair from tibble <= v1.4.2.
> Message displays once per session.
> > View(GBM_protein_expression)
>
>
> Is this perhaps the result of lack of preview (which I did not complete at
> the time I hit import as the preview failed to load), or the fact that the
> excel file itself contains no numerical data, but only TRUE or FALSE
> entries?
>
> On Wed, Dec 26, 2018 at 10:59 PM Jeff Newmiller <jdnewmil using dcn.davis.ca.us>
> wrote:
>
> > Please always reply-all to keep the list involved.
> >
> > If you used Save As to change the data format to Excel AND the file
> > extension to xlsx, then yes, you should be able to read with readxl. I
> > don't recommend it, though... Excel often changes data silently and in
> > irregularly located places in your file.
> >
> > On December 26, 2018 7:38:16 PM PST, Spencer Brackett <
> > spbrackett20 using saintjosephhs.com> wrote:
> > >So even if I imported the file from ICGC to my desktop as an Excel file,
> > >and can view and save the data as such, it is still a TSV?
> > >
> > >On Wed, Dec 26, 2018 at 10:35 PM Jeff Newmiller
> > ><jdnewmil using dcn.davis.ca.us>
> > >wrote:
> > >
> > >> CSV and TSV are not Excel files. Yes, I know Excel will open them, but
> > >> that does not make them Excel files.
> > >>
> > >> Read a TSV file with read.table or read.csv, setting the sep argument to
> > >> "\t".
> > >>
> > >> On December 26, 2018 7:26:35 PM PST, Spencer Brackett <
> > >> spbrackett20 using saintjosephhs.com> wrote:
> > >> >I tried importing the file without preview and received the
> > >> >following....
> > >> >
> > >> >library(readxl)
> > >> >> GBM_protein_expression <- read_excel("C:/Users/Spencer/Desktop/GBM protein_expression.csv")
> > >> >Error: Can't establish that the input is either xls or xlsx.
> > >> >> View(GBM_protein_expression)
> > >> >Error in View : object 'GBM_protein_expression' not found
> > >> >Error in gzfile(file, mode) : cannot open the connection
> > >> >In addition: Warning message:
> > >> >In gzfile(file, mode) :
> > >> >  cannot open compressed file
> > >> >'C:/Users/Spencer/AppData/Local/Temp/RtmpQNQrMh/input147c61fc5b52.rds',
> > >> >probable reason 'No such file or directory'
> > >> >> library(readxl)
> > >> >> GBM_protein_expression <-
> > >> >read_excel("C:/Users/Spencer/Desktop/GBM_protein_ expression.xlsx")
> > >> >readxl works best with a newer version of the tibble package.
> > >> >You currently have tibble v1.4.2.
> > >> >Falling back to column name repair from tibble <= v1.4.2.
> > >> >Message displays once per session.
> > >> >> View(GBM_protein_expression)
> > >> >
> > >> >Also, the area above my console says that no data is available in the
> > >> >table. Is this perhaps the result of lack of preview or the fact that the
> > >> >Excel file itself contains no numerical data, but only TRUE or FALSE
> > >> >entries?
> > >> >
> > >> >On Wed, Dec 26, 2018 at 9:57 PM Spencer Brackett <
> > >> >spbrackett20 using saintjosephhs.com> wrote:
> > >> >
> > >> >> Hello again,
> > >> >>
> > >> >> I worked on directly downloading the file into R as was suggested, but
> > >> >> have thus far been unsuccessful. This is what I generated on my second
> > >> >> attempt...
> > >> >>
> > >> >>  GBM protein_expression<-(file.choose(), header=TRUE, sep="\t")
> > >> >> Error: unexpected symbol in "GBM protein_expression"
> > >> >> > GBM protein_expression<-(file.choose(GBM_protein_expression.xlsx),header=TRUE,
> > >> >> sep="\t")
> > >> >> Error: unexpected symbol in "GBM protein_expression"
> > >> >> >
> > >> >>
> > >> >> What part of the argument is in error?
> > >> >>
> > >> >> Also I tried importing the dataset as an Excel file in RStudio to see if I
> > >> >> could solve my problem that way. However, my imported Excel file has been
> > >> >> stuck in the 'retrieving preview data' state and no data is appearing. Is the
> > >> >> data file perhaps too large or in the wrong format?
> > >> >>
> > >> >>
> > >> >>
> > >> >> On Wed, Dec 26, 2018 at 6:42 PM Spencer Brackett <
> > >> >> spbrackett20 using saintjosephhs.com> wrote:
> > >> >>
> > >> >>> Mr. Heiberger,
> > >> >>>
> > >> >>>  Thank you for the insight! I will try out your suggestion.
> > >> >>>
> > >> >>> Best,
> > >> >>>
> > >> >>> Spencer Brackett
> > >> >>>
> > >> >>> On Wed, Dec 26, 2018 at 6:34 PM Richard M. Heiberger
> > >> ><rmh using temple.edu>
> > >> >>> wrote:
> > >> >>>
> > >> >>>> I looked at the first file.  It gives an option to download as TSV
> > >> >>>> (tab separated values).
> > >> >>>> That is the same as CSV except with tabs instead of commas.
> > >> >>>> You do not need any external software to read it.  Read the downloaded
> > >> >>>> file directly into R.
> > >> >>>>
> > >> >>>> read.delim looks as if it would work directly on the downloaded file.
> > >> >>>> ?read.delim
> > >> >>>> The notation "\t" means the tab character.
> > >> >>>>
> > >> >>>> As an aside, stay away from Notepad. It is too naive for almost
> > >> >>>> anything interesting.
> > >> >>>> The specific case I often see is people reading Linux-style text files
> > >> >>>> with Notepad, which doesn't understand NL-terminated lines. Nicely
> > >> >>>> formatted text files become illegible.
> > >> >>>>
> > >> >>>> On Wed, Dec 26, 2018 at 6:04 PM Spencer Brackett
> > >> >>>> <spbrackett20 using saintjosephhs.com> wrote:
> > >> >>>> >
> > >> >>>> > Good evening,
> > >> >>>> >
> > >> >>>> > I am attempting to analyze the protein expression data contained within
> > >> >>>> > these two ICGC, TCGA datasets (one for GBM and the other for LGG)
> > >> >>>> >
> > >> >>>> > *File for GBM protein expression*:
> > >> >>>> >
> > >> >>>> > https://dcc.icgc.org/search?filters=%7B%22donor%22:%7B%22projectId%22:%7B%22is%22:%5B%22GBM-US%22%5D%7D,%22availableDataTypes%22:%7B%22is%22:%5B%22pexp%22%5D%7D%7D%7D
> > >> >>>> >
> > >> >>>> > *File for LGG protein expression:*
> > >> >>>> >
> > >> >>>> > https://dcc.icgc.org/search?filters=%7B%22donor%22:%7B%22projectId%22:%7B%22is%22:%5B%22LGG-US%22%5D%7D,%22availableDataTypes%22:%7B%22is%22:%5B%22pexp%22%5D%7D%7D%7D
> > >> >>>> >
> > >> >>>> >   When I tried to transfer the files from .txt (via Notepad) to .csv
> > >> >>>> > (via Excel), the data appeared in the columns as unorganized and random
> > >> >>>> > script... not like how a typical csv should be arranged at all. I need the
> > >> >>>> > dataset to be converted into .csv in order to analyze it in R, which is why
> > >> >>>> > I am hoping someone here might help me in doing that. If not, is there
> > >> >>>> > perhaps some other way that I could analyze the datasets in R, which again
> > >> >>>> > are downloaded from the ICGC data portal?
> > >> >>>> >
> > >> >>>> > Best,
> > >> >>>> >
> > >> >>>> > Spencer Brackett
> > >> >>>> >
> > >> >>>> >         [[alternative HTML version deleted]]
> > >> >>>> >
> > >> >>>> > ______________________________________________
> > >> >>>> > R-help using r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > >> >>>> > https://stat.ethz.ch/mailman/listinfo/r-help
> > >> >>>> > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> > >> >>>> > and provide commented, minimal, self-contained, reproducible code.
> > >> >>>>
> > >> >>>
> > >> >
> > >>
> > >> --
> > >> Sent from my phone. Please excuse my brevity.
> > >>
> >
> > --
> > Sent from my phone. Please excuse my brevity.
> >
>


