[R] reading dataset

Frede Aakmann Tøgersen frtog at vestas.com
Thu Mar 27 14:14:22 CET 2014


No, you're not right, as far as I can tell from the Fortran code read_v1100r2.f90 that you can find in the download folder.

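For day 1 in 1961 I think that prcp = ccc[1:25200], rstn = ccc[25201:50400], and rsnw = ccc[50401:75600], and the same rule applies to the following days with appropriately shifted indices. In other words, the values are stored one complete 180 x 140 field after another, not interleaved element by element.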

Have a look at this (I think -99.9 is the code for missing values, so set these to NA):

## 4-byte little-endian floats; n is only an upper bound on the number of values read
ccc <- readBin("APHRO_MA_050deg_V1101R2.1961", numeric(), n=1e8, size=4, signed=TRUE, endian='little')

n <- 180
m <- 140
recl <- n*m # = 25200

## start and stop indices of the six records covering days 1 and 2 (three records per day)
for (i in 1:6){
    strt <- (i - 1)*recl + 1
    stp <- strt + recl - 1
    print(c(strt, stp))
}
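
The same arithmetic can be wrapped in a small function. This is only a sketch under the layout I assume above (three records per day, in the order prcp, rstn, rsnw); the name rec_range and its arguments are made up for illustration and are not part of the data provider's code.

```r
## Hypothetical helper: first and last element of one record in the flat vector.
## day = 1, 2, ...; var = 1 (prcp), 2 (rstn), 3 (rsnw), assuming that record order.
rec_range <- function(day, var, recl = 180 * 140, nvar = 3) {
    strt <- ((day - 1) * nvar + (var - 1)) * recl + 1
    c(strt, strt + recl - 1)
}

rec_range(1, 1)  # prcp, day 1:     1  25200
rec_range(1, 3)  # rsnw, day 1: 50401  75600
rec_range(2, 1)  # prcp, day 2: 75601 100800
```

With the real data, something like ccc[seq(r[1], r[2])] with r <- rec_range(day, var) would then pull out one field.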

## for day 1 in 1961
prcp <- ccc[1:25200]
prcp[prcp < -90] <- NA
dim(prcp) <- c(n, m)
image(prcp)

rstn <- ccc[25201:50400]
rstn[rstn < -90] <- NA
dim(rstn) <- c(n, m)
image(rstn)

rsnw <- ccc[50401:75600]
rsnw[rsnw < -90] <- NA
dim(rsnw) <- c(n, m)
image(rsnw)
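
If you want all days at once, the whole vector can be given four dimensions instead of slicing by hand. The sketch below again assumes the record order prcp, rstn, rsnw within each day. It uses a small synthetic vector (ccc_demo, two days) so that it runs without the file; with the real data you would use ccc from readBin() instead.

```r
n <- 180; m <- 140; nvar <- 3
recl <- n * m

## Synthetic stand-in for ccc (2 days of 3 records), NOT real APHRODITE data
ccc_demo <- as.numeric(seq_len(recl * nvar * 2))
ccc_demo[5] <- -99.9                      # pretend one value is missing

## number of days implied by the length; should be a whole number
ndays <- length(ccc_demo) / (recl * nvar)
stopifnot(ndays == round(ndays))

ccc_demo[ccc_demo < -90] <- NA

## dimensions: longitude x latitude x variable x day
arr <- array(ccc_demo, dim = c(n, m, nvar, ndays),
             dimnames = list(NULL, NULL, c("prcp", "rstn", "rsnw"), NULL))

prcp2 <- arr[, , "prcp", 2]               # precipitation field for day 2
dim(prcp2)
```

Because R fills arrays with the first index varying fastest, this gives the same fields as the explicit index ranges above, provided the assumed record order is right.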



I will leave it to you to interpret rstn and rsnw in regards to prcp. 
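
To put geographic axes on the image() plots, the cell centres can be built from the grid description in the original question (first cell at 60.25E, 14.75S, 0.5-degree spacing, 180 x 140 cells). This is a sketch under those stated numbers; cell_centre and fld are names invented for the example, and fld is a dummy field rather than real data.

```r
n <- 180; m <- 140
lon <- seq(60.25, by = 0.5, length.out = n)    # cell-centre longitudes (deg E)
lat <- seq(-14.75, by = 0.5, length.out = m)   # cell-centre latitudes (deg N)

## Hypothetical helper: centre of the k-th element of a field,
## with longitude varying fastest as in the file description
cell_centre <- function(k) c(lon = lon[(k - 1) %% n + 1],
                             lat = lat[(k - 1) %/% n + 1])

cell_centre(1)
cell_centre(180)
cell_centre(181)

## dummy field with the same shape as prcp, only to show the plotting call
fld <- matrix(seq_len(n * m), nrow = n, ncol = m)
if (interactive()) image(lon, lat, fld, xlab = "longitude", ylab = "latitude")
```

Rows of the n x m matrix correspond to longitude and columns to latitude, which is the orientation image() expects for its x and y arguments.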




Yours sincerely / Med venlig hilsen


Frede Aakmann Tøgersen
Specialist, M.Sc., Ph.D.
Plant Performance & Modeling

Technology & Service Solutions
T +45 9730 5135
M +45 2547 6050
frtog at vestas.com
http://www.vestas.com

Company reg. name: Vestas Wind Systems A/S


> -----Original Message-----
> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org]
> On Behalf Of eliza botto
> Sent: 27. marts 2014 13:26
> To: Pascal Oettli
> Cc: r-help at r-project.org
> Subject: Re: [R] reading dataset
> 
> Dear Pascal,
> Thanks for your reply. From your answer I understood that if the following are
> the first three elements of a file
> > dput(ccc[1:3])
> c(0.15912090241909, 0.167244642972946, 0.192471280694008)
> then 0.15912090241909 is the precipitation magnitude, 0.167244642972946 is
> RSTN and 0.192471280694008 is the flag value. Did I get it right?
> Eliza
> 
> 
> 
> > From: kridox at ymail.com
> > Date: Thu, 27 Mar 2014 12:39:30 +0900
> > Subject: Re: [R] reading dataset
> > To: eliza_botto at hotmail.com
> > CC: r-help at r-project.org
> >
> > Hello,
> >
> > Some hints:
> >    - for the year 1961, the total number of values is 27594000,
> >    - there are 180 longitudes and 140 latitudes,
> >    - there are 365 days,
> >    - there are 3 variables,
> >
> > Compare the total number of values and the result of (180 x 140 x 365 x 3).
> >
> > The order is "precip", "rstn", "flag", "precip", "rstn", "flag",
> > "precip", "rstn", "flag"...
> >
> > Hope this helps,
> > Pascal
> >
> > On Thu, Mar 27, 2014 at 9:45 AM, eliza botto <eliza_botto at hotmail.com> wrote:
> > > Dear useRs,
> > > A similar question has previously been asked by another user
> > > (https://stat.ethz.ch/pipermail/r-sig-geo/2011-September/012791.html), but
> > > I'll try to discuss it from another angle. It's about data reading. I am
> > > trying to read a data set, APHRO_MA_050deg_V1101R2.1961.gz, from
> > > http://www.chikyu.ac.jp/precip/cgi-bin/aphrodite/script/aphrodite_cgi.cgi/download?file=%2FV1101R2%2FAPHRO_MA%2F050deg.
> > > I copied the command from the previous post, which is
> > > ccc <-readBin("APHRO_MA_050deg_V1101R2.1961", numeric(), n=1e8, size=4, signed=TRUE, endian='little')
> > > The following is what I know about the structure of the data set. The file
> > > contains daily fields for 365 days. These daily fields are arranged
> > > according to the Julian calendar. Daily fields (data arrays) contain
> > > information on the precipitation amount and the ratio of 0.05-degree cells
> > > containing a rain gauge. In the case of the given file, which is a
> > > 0.5-degree grid file, each field consists of a data array with longitude
> > > by latitude dimensions of 180 x 140 elements for APHRO_MA.
> > > The first element is a cell at the southwest corner centered at [60.25E,
> > > 14.75S], the second is a cell at [60.75E, 14.75S], ..., the 180th is a
> > > cell at [149.75E, 14.75S], and the 181st is a cell at [60.25E, 14.25S].
> > > The data files are written in PLAIN DIRECT ACCESS BINARY. In each daily
> > > field, the array for precipitation comes first, followed by
> > > information on the rain gauge. Each element (both precipitation and
> > > rain gauge information) is written as a 4-byte floating-point number
> > > in little endian byte order. Users should swap the byte order to
> > > big endian if necessary. There are no 'space', 'end of record', or
> > > 'end of file' marks in between.
> > > Since it says that the precipitation data comes first as an array,
> > > followed by the information on the rain gauge, how do I know which
> > > element is precipitation data and which is rain gauge information?
> > > Thank you very much in advance.
> > >
> > > Eliza
> > >
> > > ______________________________________________
> > > R-help at r-project.org mailing list
> > > https://stat.ethz.ch/mailman/listinfo/r-help
> > > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> > > and provide commented, minimal, self-contained, reproducible code.
> >
> >
> >
> > --
> > Pascal Oettli
> > Project Scientist
> > JAMSTEC
> > Yokohama, Japan
> 



