[R] Help... Organizing multiple spreadsheets data into a huge R data structure!

Duncan Murdoch murdoch at stats.uwo.ca
Mon Sep 15 22:29:14 CEST 2008


On 15/09/2008 12:27 PM, John Wong wrote:
> Hello R users,
> 
> I am relatively new to the R program, and I hope some of you can offer
> me some suggestions on how to organize my data in R using some of the
> more advanced data structuring techniques. Here's my scenario:
> 
> I have a data set of 50 participants (each with conditions and
> demographic data). Each participant performed 2x16 trials, and for each
> trial there is specific information about the trial (i.e. errors
> and timing) and a large spreadsheet-like data set with headers. I
> have to extract data from each spreadsheet-like data set according to
> the information about the specific trial, and then group them according
> to trial nature in the 2x16 structure. Then I can further analyse them
> according to the demographic data, grouping the 50 participants.
> 
> 1. I have no idea what the best way is to organize this data
> set in R so that it is as efficient as possible to analyse.
> 50 (demographic data set) X  2 (phase) X 16 (trials of varied nature)
> X Trial Data set + Trial Online Recording Physiological Data Set
> Spreadsheet (in text format)

Generally the easiest format to use in R is a dataframe, with one row 
per observation. In your case this would be something like:

participant phase trial trialdata spreadsheetrow spreadsheetcolumn observation
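
As a rough sketch of that layout (the sizes and values below are invented for illustration and are not taken from the real data set), the index columns can be generated with expand.grid():

```r
# Hypothetical sizes: 2 participants, 2 phases, 3 trials,
# and a 4-row by 2-column spreadsheet recorded for each trial.
long <- expand.grid(
  spreadsheetcolumn = 1:2,
  spreadsheetrow    = 1:4,
  trial             = 1:3,
  phase             = 1:2,
  participant       = 1:2
)

# Placeholder trial-level value (standing in for e.g. an error count);
# it is repeated on every row that belongs to the same trial.
long$trialdata <- with(long, (participant - 1) * 6 + (phase - 1) * 3 + trial)

# Placeholder measurements; real values would be read from the spreadsheet files.
set.seed(1)
long$observation <- rnorm(nrow(long))

# One row per observation: 2 * 2 * 3 * 4 * 2 = 96 rows
nrow(long)

# Example summary: mean observation for each participant/phase/trial
trial_means <- aggregate(observation ~ participant + phase + trial,
                         data = long, FUN = mean)
```

With everything in one data frame, grouping by any of the index columns is a single call to aggregate() or tapply().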

This is repetitive (you repeat the trialdata for every observation in 
the spreadsheet); if that's a problem, I'd split it into two dataframes, 
one for the trial data, one for the spreadsheet data:

participant phase trial trialdata

participant phase trial spreadsheetrow spreadsheetcolumn observation

This makes more sense from a database point of view, but it can be 
harder to work with in R, if you want to use the trialdata when 
analyzing the spreadsheet data.
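
If you do split the data, merge() can bring the trial data back alongside the spreadsheet data when you need it. A minimal sketch, assuming the three key columns are named as above; the "errors" column and all values are made up for illustration:

```r
# Hypothetical trial-level table: one row per participant/phase/trial.
trials <- expand.grid(trial = 1:3, phase = 1:2, participant = 1:2)
trials$errors <- c(0, 1, 0, 2, 0, 1, 1, 0, 0, 3, 1, 0)  # invented values

# Hypothetical spreadsheet table: 4 rows by 2 columns for each trial.
sheet <- expand.grid(spreadsheetcolumn = 1:2, spreadsheetrow = 1:4,
                     trial = 1:3, phase = 1:2, participant = 1:2)
sheet$observation <- seq_len(nrow(sheet))  # placeholder measurements

# Join the two tables on their shared key columns, so that each
# spreadsheet observation carries the data for its trial.
combined <- merge(sheet, trials, by = c("participant", "phase", "trial"))
```

The merged result has the same repetition as the single data frame, but it only needs to exist for the duration of an analysis; the two separate tables remain the stored form.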

Duncan Murdoch

> 2. I don't have a clear idea on how to manage this data structure in
> R. Can somebody point me to the corresponding R resource / examples so
> that I can read and try it out on my data set?
> 
> I tried to hurry for my project but there's no cohort here that is
> particularly polished in R...
> Thanks a LOT...!
> 
> - John
> 
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.


