[R] [newbie] aggregating table() results and simplifying code with loop

John Kane jrkrideau at inbox.com
Sat Sep 15 17:51:00 CEST 2012


I have not seen any replies to your questions so I will suggest an approach that may work if I can get a function to work.

If I understand what you want, you have a pattern something like this:
pattern1  <-  c("2Ma", "no2Ma","no2Ma", "no2Ma","no2Ma")
pattern2  <-  c("no2Ma", 'no2Ma', "no2Ma", "no2Ma", "2Ma")

for each five-year period, where 2Ma stands for maize, one of 11 different grains:
  1AU   2BC   2Co   2Ma   2MG   2ML   2oc   2PG   2SA   2We   3sN 

and what you want to know is whether each row gives a pattern like

check1 <-  c(TRUE, FALSE, FALSE, FALSE, FALSE)
check2  <-  c(FALSE, FALSE, FALSE, FALSE, TRUE)

If I understand correctly, you only care about the two patterns above. Is that right?

I am running out of time today, but I think this approach will get you started:
===========================================================

T80  <-  read.table(file = "C:/sample.txt", header = TRUE, sep = ";")

# Reminder of just what we want to get as a final result.
check1 <-  c(TRUE, FALSE, FALSE, FALSE, FALSE)
check2  <-  c(FALSE, FALSE, FALSE, FALSE, TRUE)

pattern1  <-  c("2Ma", "2Ma","2Ma", "2Ma","2Ma")

# One-row examples to see what is happening.
T80[1,3:7]
T80[1, 3:7] == pattern1

T80[405, 3:7]
T80[405, 3:7] == pattern1

# Now we apply the patterns to the entire data set.
# (pattern2 is the vector defined at the top of this message.)
pp1  <-  T80[, 3:7] == pattern1
pp2  <-  T80[, 3:7] == pattern2

# reassign the WS values so we know where the data is from
WSnames  <-  rep(T80$WS, 2)

# Assemble new data frame.
maizedata  <-  data.frame(WSnames, rbind(pp1,pp2))
========================================================
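
One thing to watch in the step that applies the patterns to the entire data set: when a data frame is compared with a vector using ==, R recycles the vector down the columns, not across each row. That is harmless for a constant vector such as pattern1, and it works for the one-row examples, but a mixed pattern will not line up with the years. A minimal sketch of a safer comparison against the single code "2Ma" follows. The `toy` data frame and its column layout are made up for illustration and are not the real sample.txt.

```r
# Hypothetical toy data: id, WS, then five yearly land-cover codes.
toy <- data.frame(
  id  = 1:4,
  WS  = c("WS1", "WS1", "WS2", "WS2"),
  y01 = c("2Ma", "2We", "2We", "2Ma"),
  y02 = c("2We", "2We", "2BC", "2Ma"),
  y03 = c("2BC", "2Co", "2Co", "2We"),
  y04 = c("2We", "2BC", "2We", "2BC"),
  y05 = c("2Co", "2Ma", "2Ma", "2We"),
  stringsAsFactors = FALSE
)

# Comparing with a single string avoids any recycling surprises:
# each cell is TRUE where the land cover is maize.
is_maize <- as.matrix(toy[, 3:7]) == "2Ma"
is_maize
```

Each row of `is_maize` can then be compared directly with check1 or check2.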

Now, assuming this runs for you and I have not made a serious mistake in logic, you should be able to do some subsetting (?subset) to extract only the
check1 and check2 patterns above.

This is where I ran into trouble, as I don't have the time this morning to work out the subsetting conditions. It looks tricky and you will probably need a couple of subsetting moves.
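
One possible way to do that subsetting, sketched on a small hand-made logical matrix (the values are hypothetical and only stand in for pp1): test each row against check1 and check2 with apply(), then keep the rows that match either one. The helper name `matches` is made up for this sketch.

```r
check1 <- c(TRUE, FALSE, FALSE, FALSE, FALSE)
check2 <- c(FALSE, FALSE, FALSE, FALSE, TRUE)

# Hypothetical stand-in for pp1: one row per point, one column per year.
pp <- rbind(
  c(TRUE,  FALSE, FALSE, FALSE, FALSE),  # matches check1
  c(FALSE, FALSE, FALSE, FALSE, TRUE),   # matches check2
  c(TRUE,  TRUE,  FALSE, FALSE, FALSE),  # maize twice: no match
  c(FALSE, FALSE, FALSE, FALSE, FALSE)   # no maize: no match
)

# TRUE for each row that is identical to the given pattern.
matches <- function(m, pattern) apply(m, 1, function(r) all(r == pattern))

hit1 <- matches(pp, check1)
hit2 <- matches(pp, check2)

# Rows showing either pattern.
pp[hit1 | hit2, , drop = FALSE]
```

The logical vectors `hit1` and `hit2` have one element per point, so they can also be used to subset the WS column.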

It's not a pretty solution and, in particular, I expect someone could clean it up to make the subsetting easier or even unnecessary, but I hope it helps.

Once you have extracted what you want, use apply() or perhaps the plyr package to aggregate the results.
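
As a rough illustration of the aggregation step in base R (the WS labels and hits below are invented): if WS is a factor carrying all of its levels, table() reports a zero for a WS with no matching points, so every result has the same length and the tables can be summed or stacked without the "not a multiple of vector length" warning.

```r
# Hypothetical data: the WS of each point and whether it showed a pattern.
ws   <- factor(c("WS1", "WS1", "WS2", "WS3", "WS3", "WS3"),
               levels = c("WS1", "WS2", "WS3"))
hit1 <- c(TRUE,  FALSE, FALSE, TRUE,  FALSE, TRUE)   # e.g. check1 matches
hit2 <- c(FALSE, TRUE,  FALSE, FALSE, FALSE, FALSE)  # e.g. check2 matches

# Because ws is a factor, levels with no hits are kept with a count of 0.
n1 <- table(ws[hit1])
n2 <- table(ws[hit2])

# Both tables have one entry per WS, so they can be summed or stacked.
total <- n1 + n2
rbind(check1 = n1, check2 = n2, total = total)
```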

Repeat for all grains. Actually, look into setting the whole thing up as a function: you should be able to write the program once as a function and use a loop or an apply() call to do all 11 grains in one go.
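
A minimal sketch of what such a function and loop might look like, on made-up data. The function name `return5`, the `toy` data frame and the choice of two crops and two windows are all assumptions for illustration, and only the two five-year patterns discussed in this message are counted.

```r
# Hypothetical toy data: WS plus six yearly land-cover columns.
toy <- data.frame(
  WS  = factor(c("WS1", "WS1", "WS2", "WS2")),
  y01 = c("2Ma", "2We", "2We", "2Ma"),
  y02 = c("2We", "2We", "2BC", "2Ma"),
  y03 = c("2BC", "2Co", "2Co", "2We"),
  y04 = c("2We", "2BC", "2We", "2BC"),
  y05 = c("2Co", "2Ma", "2Ma", "2We"),
  y06 = c("2Ma", "2We", "2We", "2We"),
  stringsAsFactors = FALSE
)

# Count, per WS, the points where `crop` occurs exactly once in the window,
# in either the first or the last year (the check1 / check2 patterns).
return5 <- function(data, crop, cols) {
  m   <- as.matrix(data[, cols]) == crop
  hit <- rowSums(m) == 1 & (m[, 1] | m[, ncol(m)])
  # split() keeps every WS level, so a WS with no hits gets a count of 0.
  vapply(split(hit, data$WS), sum, integer(1))
}

crops   <- c("2Ma", "2We")
windows <- list("01" = 2:6, "02" = 3:7)  # column positions of each window

res <- list()
for (crop in crops) {
  for (w in names(windows)) {
    counts <- return5(toy, crop, windows[[w]])
    # One row per crop and period, one column per WS.
    res[[paste(crop, w)]] <- data.frame(t(counts), crop = crop, period = w)
  }
}
result <- do.call(rbind, res)
result
```

The result has one row per crop and period and one column per WS. Extending `crops` to all 11 codes and `windows` to all six periods should need no other change under these assumptions.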

Best of luck.

John Kane
Kingston ON Canada


> -----Original Message-----
> From: ridavide at gmail.com
> Sent: Thu, 13 Sep 2012 15:36:28 +0200
> To: r-help at r-project.org
> Subject: [R] [newbie] aggregating table() results and simplifying code
> with loop
> 
> Dear all,
> I'm looking for basic help with aggregating table() results and with
> writing a loop (if useful).
> 
> My dataset ( http://goo.gl/gEPKW ) is composed of 23k rows, each one
> representing a point in the space of which we know the land cover over
> 10 years (column y01 to y10).
> 
> I need to analyse it with a temporal sliding window of 5 years (y01 to
> y05, y02 to y06 and so forth)
> For each period I'm looking for specific sequences (e.g., Maize,
> -noMaize, -noMaize, -noMaize, -noMaize) to calculate the "return time"
> of principal land covers: barley (2BC), colza (2Co), maize (2Ma), etc.
> I define the "return time" as the presence of a given land cover
> according to a given sequence. Hence, each return time could require
> the sum of different sequences (e.g., a return time of 5 years derives
> from the sum of [2Ma,no2Ma,no2Ma,no2Ma,no2Ma] +
> [no2Ma,no2Ma,no2Ma,no2Ma,2Ma]).
> I need to repeat the calculation for each land cover for each time
> window. In addition, I need to repeat the process over three datasets
> (the one I give is the first one, the second one is from year 12 to
> year 24, the third one from year 27 to year 31; the breaks in the
> monitoring of land cover prevent me from creating a continuous
> dataset). At the end I expect to aggregate the sum for each spatial
> entity (column WS).
> 
> I've started writing the code for the first crop in the first 5yrs
> period (http://goo.gl/FhZNx) then copying and pasting it for each crop
> then for each time window...
> Moreover I do not know how to aggregate the results of table(). (NB
> sometimes I have a different number of WS per table because a given
> sequence could be absent in a given spatial entity... so I have the
> following warning msg: number of columns of result is not a multiple
> of vector length (arg 1)). Therefore, I'm "obliged" to copy&paste the
> table corresponding to each sequence....
> 
> FIRST QUEST. How to aggregate the results of table() when the number
> of columns is different?
> Or the other way around: Is there a way to have a table where each row
> reports the number of points per time return per WS? something like
> 
> WS1    WS2    WS3    WS4    ...    WS16    crop    period
> 23     15     18     43     ...    52      Ma5     01
> 18     11     25     84     ...    105     Ma2     01
> ...    ...    ...    ...    ...    ...     ...     ...
> ...    ...    ...    ...    ...    ...     Co5     01
> ...    ...    ...    ...    ...    ...     ...     ...
> ...    ...    ...    ...    ...    ...     Ma5     02
> ...    ...    ...    ...    ...    ...     ...     ...
> In this table each row should represent a return time for a given land
> cover in a given period (one of the 6 time windows of 5 years).
> 
> SECOND QUEST. Could a loop (instead of modular copy/paste code)
> improve the time/reliability of the calculation? If yes, could you
> please point me to some entry-level references for writing it?
> 
> I am aware these are newbie questions, but I have not been able to
> solve them using manuals and available sources.
> Thank you in advance for your help.
> 
> Greetings,
> Dd
> 
> PS
> R: version 2.14.2 (2012-02-29)
> OS: MS Windows XP Home 32-bit SP3
>
> *****************************
> Davide Rizzo
> post-doc researcher
> INRA UR055 SAD-ASTER
> website :: http://sites.google.com/site/ridavide/
