[R] Using options(max.print = 1000000) to read in data

Rui Barradas ruipbarradas using sapo.pt
Mon Jul 8 17:16:48 CEST 2019


Hello,

Look at the output of these two commands:

1) str(<matrix object>)

str(matrix(0, nrow = 20530, ncol =173))
num [1:20530, 1:173] 0 0 0 0 0 0 0 0 0 0 ...


The important part is

num [1:20530, 1:173]


This says it is a numeric vector with dimensions 20530 and 173, that is, a matrix.

2) str(<data.frame object>)

str(as.data.frame(matrix(0, nrow = 20530, ncol =173)))
'data.frame':	20530 obs. of  173 variables:
  $ V1  : num  0 0 0 0 0 0 0 0 0 0 ...
  $ V2  : num  0 0 0 0 0 0 0 0 0 0 ...
  $ V3  : num  0 0 0 0 0 0 0 0 0 0 ...
  [...]


Now it's expressly written

'data.frame':	20530 obs. of  173 variables:


There are 173 variables, each with the same number of observations, 20530.
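
If you prefer a direct TRUE/FALSE answer over reading the str() output, the checks below are a minimal sketch. The objects m and df are small stand-ins made up for illustration, not your GBM.txt:

```r
# Small stand-ins for the real objects (sizes are illustrative only)
m  <- matrix(0, nrow = 5, ncol = 3)
df <- as.data.frame(m)

is.matrix(m)        # TRUE
is.matrix(df)       # FALSE
is.data.frame(df)   # TRUE
class(df)           # "data.frame"
dim(df)             # 5 3
```

Applied to your own object, is.matrix(GBM.txt) should return FALSE, given that class(GBM.txt) reports "data.frame".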


When you read in data files with any form of read.table (read.csv is the
same function with different default argument values), you will
always get a data.frame.
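
As a rough sketch of that point, the example below reads a tiny made-up CSV from a string (csv_text is hypothetical data, used so that no file is needed) and then converts the result explicitly, in case a matrix is what you need:

```r
# Hypothetical in-line data standing in for a real file
csv_text <- "a,b\n1,2\n3,4"
dat <- read.csv(text = csv_text)

is.data.frame(dat)   # TRUE

# If a matrix is really needed, convert explicitly
mat <- as.matrix(dat)
is.matrix(mat)       # TRUE
is.numeric(mat)      # TRUE, because every column is numeric
```

Note that as.matrix() only gives a numeric matrix when all columns are numeric; with mixed column types the result is typically a character matrix.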


Hope this helps,

Rui Barradas


At 15:43 on 08/07/19, Spencer Brackett wrote:
> Using str(GBM.txt) produced the same output as last time, which lists 
> the number of objects acting on a particular number of variables for the 
> said dataset and a few rows read from the original file.
> 
> The result of class(GBM.txt) generates the following..
> 
>  > class(GBM.txt)
> [1] "data.frame"
> 
> Is this to say that the object is set as a 'data frame', as opposed to a 
> 'matrix'?
> 
> I will try running ?is.matrix now
> 
> 
> 
> On Mon, Jul 8, 2019 at 10:33 AM Rui Barradas <ruipbarradas using sapo.pt> wrote:
> 
>     Hello,
> 
>     Inline.
> 
>     At 15:26 on 08/07/19, Spencer Brackett wrote:
>      > Thank you,
>      >
>      > Here is a summary of the resulting output....
>      >
>      >> nrow(GBM.txt)
>      > [1] 20530
>      >> ncol(GBM.txt)
>      > [1] 173
>      >
>      > This corresponds with the info found in my global environment for the
>      > object indicated. Now, how do I go about determining if the dataset is a
>      > matrix?
> 
>     Try any of
> 
>     str(GBM.txt)
>     class(GBM.txt)
> 
>     Also, like Kevin said, max.print only affects how much is printed, not
>     the read functions. Why change max.print at all? The default value
>     (1000) is large enough, I have never needed to see more than this at a
>     time. In fact, to have an idea of the data I would rather further limit
>     the number of matrix lines printed with
> 
>     head(object)
>     tail(object)
> 
> 
> 
>     Hope this helps,
> 
>     Rui Barradas
>      >
>      >
>      > On Mon, Jul 8, 2019 at 10:16 AM Kevin Thorpe <kevin.thorpe using utoronto.ca>
>      > wrote:
>      >
>      >>
>      >>> On Jul 8, 2019, at 10:06 AM, Spencer Brackett
>      >>> <spbrackett20 using saintjosephhs.com> wrote:
>      >>>
>      >>> Hello,
>      >>>
>      >>>   I am trying to reload some data into R in order to check to see if it is
>      >>> formatted as a matrix. I used the command options(max.print = 10000000) to
>      >>> account for the 20,000 some rows omitted previously when just using the
>      >>> basic version of this function. After entering this command, the dataset
>      >>> mostly loaded into R, but 14717 rows were still omitted.
>      >>>
>      >>>   Can I simply increase the number indicated after 'max.print =' to read in
>      >>> the remaining rows, or should I use 'bigfile.sample <-' or
>      >>> 'bigfile.colclass <-' instead? Do I even need to read in all of the rows to
>      >>> test for a matrix?
>      >>>
>      >>> Best,
>      >>>
>      >>> Spencer
>      >>>
>      >>
>      >> I don’t think this option affects how much data is read in, just how much
>      >> is printed to the screen. Use the function str() on your imported object to
>      >> see how many rows, among other things, were brought in.
>      >>
>      >>
>      >>>        [[alternative HTML version deleted]]
>      >>>
>      >>> ______________________________________________
>      >>> R-help using r-project.org <mailto:R-help using r-project.org> mailing list
>     -- To UNSUBSCRIBE and more, see
>      >>> https://stat.ethz.ch/mailman/listinfo/r-help
>      >>> PLEASE do read the posting guide
>      >> http://www.R-project.org/posting-guide.html
>      >>> and provide commented, minimal, self-contained, reproducible code.
>      >>
>      >>
>      >> --
>      >> Kevin E. Thorpe
>      >> Head of Biostatistics,  Applied Health Research Centre (AHRC)
>      >> Li Ka Shing Knowledge Institute of St. Michael's
>      >> Assistant Professor, Dalla Lana School of Public Health
>      >> University of Toronto
>      >> email: kevin.thorpe using utoronto.ca
>     <mailto:kevin.thorpe using utoronto.ca>  Tel: 416.864.5776  Fax: 416.864.3016
>      >>
>      >>
>      >
>      >
>


