[R] Fwd: UPDATE

Caitlin Gibbons bioprogr@mmer @ending from gm@il@com
Thu Dec 27 05:55:34 CET 2018


Does this help, Spencer? The read.delim() function assumes a tab separator by default, but here I set it explicitly through the sep argument of read.csv(). The downloaded file is NOT an Excel file, so this should help.

GBM_protein_expression <- read.csv("C:/Users/Spencer/Desktop/GBM protein_expression.tsv", sep = "\t")
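
A minimal sketch of the same idea on a throwaway file, in case it helps to test it away from the real download. The file contents and the names tsv_path, via_delim and via_csv are made up for illustration; only read.delim() and read.csv() come from the thread.

```r
# Write a tiny tab-separated file to a temporary path.
# (Hypothetical data standing in for the ICGC download.)
tsv_path <- tempfile(fileext = ".tsv")
writeLines(c("gene\texpressed", "EGFR\tTRUE", "TP53\tFALSE"), tsv_path)

# read.delim() defaults to sep = "\t"; read.csv() needs it spelled out.
# Type the quotes as plain ASCII: curly quotes pasted from a phone or
# word processor are a syntax error in R.
via_delim <- read.delim(tsv_path)
via_csv   <- read.csv(tsv_path, sep = "\t")

# Compare the two results and look at the column types.
print(identical(via_delim, via_csv))
str(via_delim)
```

Neither call needs a library() statement, since both functions are in base R.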

Sent from my iPhone

> On Dec 26, 2018, at 9:23 PM, Richard M. Heiberger <rmh using temple.edu> wrote:
> 
> this is wrong because the file is a csv file.  read_excel is designed
> for xls files.
> GBM_protein_expression <- read_excel("C:/Users/Spencer/Desktop/GBM
> protein_expression.csv")
> 
> How did you get a csv? it downloads as tsv.
> 
> the statement you should use is in base, no library() statement is needed.
> 
> GBM_protein_expression <- read.delim("C:/Users/Spencer/Desktop/GBM protein_expression.csv")
> 
> read.delim is the same as read.csv except that it sets the sep
> argument to "\t".
> 
> 
> 
> On Wed, Dec 26, 2018 at 11:11 PM Spencer Brackett
> <spbrackett20 using saintjosephhs.com> wrote:
>> 
>> Sorry, my mistake.
>> 
>> So I could still use read.table and should I try using a .txt version of
>> the file to avoid the silent changes you described?
>> 
>> Also, when I tried to simplify this process by importing the dataset into
>> RStudio as opposed to the R GUI, I received the following...
>> library(readxl)
>>> GBM_protein_expression <- read_excel("C:/Users/Spencer/Desktop/GBM
>> protein_expression.csv")
>> Error: Can't establish that the input is either xls or xlsx.
>>> View(GBM_protein_expression)
>> Error in View : object 'GBM_protein_expression' not found
>> Error in gzfile(file, mode) : cannot open the connection
>> In addition: Warning message:
>> In gzfile(file, mode) :
>>  cannot open compressed file
>> 'C:/Users/Spencer/AppData/Local/Temp/RtmpQNQrMh/input147c61fc5b52.rds',
>> probable reason 'No such file or directory'
>>> library(readxl)
>>> GBM_protein_expression <-
>> read_excel("C:/Users/Spencer/Desktop/GBM_protein_ expression.xlsx")
>> readxl works best with a newer version of the tibble package.
>> You currently have tibble v1.4.2.
>> Falling back to column name repair from tibble <= v1.4.2.
>> Message displays once per session.
>>> View(GBM_protein_expression)
>> 
>> 
>> Is this perhaps the result of lack of preview (which I did not complete at
>> the time I hit import as the preview failed to load), or the fact that the
>> excel file itself contains no numerical data, but only TRUE or FALSE
>> entries?
>> 
>> On Wed, Dec 26, 2018 at 10:59 PM Jeff Newmiller <jdnewmil using dcn.davis.ca.us>
>> wrote:
>> 
>>> Please always reply-all to keep the list involved.
>>> 
>>> If you used Save As to change the data format to Excel AND the file
>>> extension to xlsx, then yes, you should be able to read with readxl. I
>>> don't recommend it, though... Excel often changes data silently and in
>>> irregularly located places in your file.
>>> 
>>> On December 26, 2018 7:38:16 PM PST, Spencer Brackett <
>>> spbrackett20 using saintjosephhs.com> wrote:
>>>> So even if I imported the file from ICGC to my desktop as an Excel file,
>>>> and can view and save the data as such, it is still a TSV?
>>>> 
>>>> On Wed, Dec 26, 2018 at 10:35 PM Jeff Newmiller
>>>> <jdnewmil using dcn.davis.ca.us>
>>>> wrote:
>>>> 
>>>>> CSV and TSV are not Excel files. Yes, I know Excel will open them, but
>>>>> that does not make them Excel files.
>>>>> 
>>>>> Read a TSV file with read.table or read.csv, setting the sep argument to
>>>>> "\t".
>>>>> 
>>>>> On December 26, 2018 7:26:35 PM PST, Spencer Brackett <
>>>>> spbrackett20 using saintjosephhs.com> wrote:
>>>>>> I tried importing the file without preview and received the
>>>>>> following....
>>>>>> 
>>>>>> library(readxl)
>>>>>>> GBM_protein_expression <- read_excel("C:/Users/Spencer/Desktop/GBM
>>>>>> protein_expression.csv")
>>>>>> Error: Can't establish that the input is either xls or xlsx.
>>>>>>> View(GBM_protein_expression)
>>>>>> Error in View : object 'GBM_protein_expression' not found
>>>>>> Error in gzfile(file, mode) : cannot open the connection
>>>>>> In addition: Warning message:
>>>>>> In gzfile(file, mode) :
>>>>>> cannot open compressed file
>>>>>> 'C:/Users/Spencer/AppData/Local/Temp/RtmpQNQrMh/input147c61fc5b52.rds',
>>>>>> probable reason 'No such file or directory'
>>>>>>> library(readxl)
>>>>>>> GBM_protein_expression <-
>>>>>> read_excel("C:/Users/Spencer/Desktop/GBM_protein_ expression.xlsx")
>>>>>> readxl works best with a newer version of the tibble package.
>>>>>> You currently have tibble v1.4.2.
>>>>>> Falling back to column name repair from tibble <= v1.4.2.
>>>>>> Message displays once per session.
>>>>>>> View(GBM_protein_expression)
>>>>>> 
>>>>>> Also, the area above my console says that no data is available in the
>>>>>> table. Is this perhaps the result of lack of preview or the fact that the
>>>>>> Excel file itself contains no numerical data, but only TRUE or FALSE
>>>>>> entries?
>>>>>> 
>>>>>> On Wed, Dec 26, 2018 at 9:57 PM Spencer Brackett <
>>>>>> spbrackett20 using saintjosephhs.com> wrote:
>>>>>> 
>>>>>>> Hello again,
>>>>>>> 
>>>>>>> I worked on directly downloading the file into R as was suggested, but
>>>>>>> have thus far been unsuccessful. This is what I generated on my second
>>>>>>> attempt...
>>>>>>> 
>>>>>>> GBM protein_expression<-(file.choose(), header=TRUE, sep="\t")
>>>>>>> Error: unexpected symbol in "GBM protein_expression"
>>>>>>>> GBM protein_expression<-(file.choose(GBM_protein_expression.xlsx),header=TRUE,
>>>>>>> sep="\t")
>>>>>>> Error: unexpected symbol in "GBM protein_expression"
>>>>>>>> 
>>>>>>> 
>>>>>>> What part of the argument is in error?
>>>>>>> 
>>>>>>> Also, I tried importing the dataset as an Excel file in RStudio to see if I
>>>>>>> could solve my problem that way. However, my imported Excel file has been
>>>>>>> stuck at 'retrieving preview data' and no data is appearing. Is the
>>>>>>> data file perhaps too large or in the wrong format?
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> On Wed, Dec 26, 2018 at 6:42 PM Spencer Brackett <
>>>>>>> spbrackett20 using saintjosephhs.com> wrote:
>>>>>>> 
>>>>>>>> Mr. Heiberger,
>>>>>>>> 
>>>>>>>> Thank you for the insight! I will try out your suggestion.
>>>>>>>> 
>>>>>>>> Best,
>>>>>>>> 
>>>>>>>> Spencer Brackett
>>>>>>>> 
>>>>>>>> On Wed, Dec 26, 2018 at 6:34 PM Richard M. Heiberger
>>>>>> <rmh using temple.edu>
>>>>>>>> wrote:
>>>>>>>> 
>>>>>>>>> I looked at the first file.  It gives an option to download as TSV
>>>>>>>>> (tab separated values).
>>>>>>>>> That is the same as CSV except with tabs instead of commas.
>>>>>>>>> You do not need any external software to read it.  Read the downloaded
>>>>>>>>> file directly into R.
>>>>>>>>> 
>>>>>>>>> read.delim looks as if it would work directly on the downloaded file.
>>>>>>>>> ?read.delim
>>>>>>>>> The notation "\t" means the tab character.
>>>>>>>>> 
>>>>>>>>> As an aside, stay away from Notepad. It is too naive for almost
>>>>>>>>> anything interesting.
>>>>>>>>> The specific case I often see is people reading Linux-style text files
>>>>>>>>> with Notepad, which doesn't understand NL terminated lines.  Nicely
>>>>>>>>> formatted text files become illegible.
>>>>>>>>> 
>>>>>>>>> On Wed, Dec 26, 2018 at 6:04 PM Spencer Brackett
>>>>>>>>> <spbrackett20 using saintjosephhs.com> wrote:
>>>>>>>>>> 
>>>>>>>>>> Good evening,
>>>>>>>>>> 
>>>>>>>>>> I am attempting to analyze the protein expression data contained within
>>>>>>>>>> these two ICGC, TCGA datasets (one for GBM and the other for LGG)
>>>>>>>>>> 
>>>>>>>>>> *File for GBM protein expression*:
>>>>>>>>>> 
>>>>>>>>>> https://dcc.icgc.org/search?filters=%7B%22donor%22:%7B%22projectId%22:%7B%22is%22:%5B%22GBM-US%22%5D%7D,%22availableDataTypes%22:%7B%22is%22:%5B%22pexp%22%5D%7D%7D%7D
>>>>>>>>>> 
>>>>>>>>>> *File for LGG protein expression:*
>>>>>>>>>> 
>>>>>>>>>> https://dcc.icgc.org/search?filters=%7B%22donor%22:%7B%22projectId%22:%7B%22is%22:%5B%22LGG-US%22%5D%7D,%22availableDataTypes%22:%7B%22is%22:%5B%22pexp%22%5D%7D%7D%7D
>>>>>>>>>> 
>>>>>>>>>> When I tried to transfer the files from .txt (via Notepad) to .csv (via
>>>>>>>>>> Excel), the data appeared in the columns as unorganized and random
>>>>>>>>>> script... not like how a typical csv should be arranged at all. I need the
>>>>>>>>>> dataset to be converted into .csv in order to analyze it in R, which is why
>>>>>>>>>> I am hoping someone here might help me in doing that. If not, is there
>>>>>>>>>> perhaps some other way that I could analyze the datasets in R, which again
>>>>>>>>>> are downloaded from the ICGC data portal?
>>>>>>>>>> 
>>>>>>>>>> Best,
>>>>>>>>>> 
>>>>>>>>>> Spencer Brackett
>>>>>>>>>> 
>>>>>>>>>>        [[alternative HTML version deleted]]
>>>>>>>>>> 
>>>>>>>>>> ______________________________________________
>>>>>>>>>> R-help using r-project.org mailing list -- To UNSUBSCRIBE and more, see
>>>>>>>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>>>>>>>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>>>>>>>>>> and provide commented, minimal, self-contained, reproducible code.
>>>>>>>>> 
>>>>>>>> 
>>>>>> 
>>>>> 
>>>>> --
>>>>> Sent from my phone. Please excuse my brevity.
>>>>> 
>>> 
>>> --
>>> Sent from my phone. Please excuse my brevity.
>>> 
>> 
> 
