[R] Simple table generation question

Bartjoosen bartjoosen at hotmail.com
Tue Jul 10 16:55:56 CEST 2007


Maybe this is what you want:

You are right that rbind() keeps re-allocating the tables, but you don't
need the loop at all: you can subset your table into new ones directly:

selection <- which(device_Prob_Vector > 0.5)
# or draw a random half of the row indices directly:
# selection <- sample(num_Devices, size = num_Devices %/% 2)
training_Set <- measurements[selection, ]
validation_Set <- measurements[-selection, ]
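
If you want the split to be repeatable, seed the generator first. A minimal
sketch (the seed value and the 50/50 split size are just illustrative
choices here, not anything from your code):

set.seed(42)                       # arbitrary seed, only so the split is repeatable
num_Devices <- nrow(measurements)  # take the count from the table itself
selection <- sample(num_Devices, size = num_Devices %/% 2)
training_Set <- measurements[selection, ]
validation_Set <- measurements[-selection, ]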

good luck

Bart



natekupp wrote:
> 
> Hey all,
> 
> I'm doing some work with machine learning in R (I'm a fairly new user of
> R), and I have a question about generating new tables from existing
> tables.  I'm currently using a table of measurements read in from a CSV
> file to generate training and validation data set tables for later use
> in a machine learning algorithm, with the code below:
> 
> # generate probabilities to divide up training / validation data sets randomly
> device_Prob_Vector <- runif(num_Devices)
> 
> # NULL-initialize training and validation sets.  This seems like a bit of a hack...
> training_Set <- measurements[0]
> validation_Set <- measurements[0]
> 
> #divide up the training and validation data sets from measurements.
> for (i in 1:num_Devices)
> {
>     if (device_Prob_Vector[i] > 0.5)
>     {
>         training_Set <- rbind(training_Set, measurements[i,])
>     }
>     else
>     {
>         validation_Set <- rbind(validation_Set, measurements[i,])
>     }
> }
> 
> This code works correctly, but takes quite a long time to execute.  I
> suspect this is because rbind() is dynamically resizing the tables as it
> adds new rows to each table of data.  Is there a way to pre-allocate
> memory for each of the two tables, and then shrink them after the loop has
> completed?  Thanks for the help.
> 
> ~Nate
> 
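
To answer the pre-allocation question quoted above directly: you don't have
to grow the tables inside the loop at all. One way (a sketch along the lines
you asked about, assuming measurements is a data frame with num_Devices
rows) is to pre-allocate a full-size logical vector, fill it in the loop,
and subset once at the end:

# pre-allocate the full-size index vector once; no rbind(), no growing tables
is_training <- logical(num_Devices)
for (i in 1:num_Devices) {
    is_training[i] <- device_Prob_Vector[i] > 0.5
}
training_Set <- measurements[is_training, ]
validation_Set <- measurements[!is_training, ]

The loop is now only bookkeeping; each subset is built in a single
allocation, which is where the time goes with repeated rbind() calls.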
