[R] Having some Trouble Data Structures
Uwe Ligges
ligges at statistik.tu-dortmund.de
Sun Oct 28 15:49:07 CET 2012
On 28.10.2012 10:32, Benjamin Ward (ENV) wrote:
> Hi All,
>
> I'm trying to run a simulation of host-pathogen evolution based around individuals.
> What I need to have is a dataframe or table of some description - describing all the individuals of a pathogen population (so far I've implemented this as a matrix):
>
> ID No_of_Effectors Effectors (Sequences)
> [1,] 0001 3 ## 3 Random Numbers ##
>
> There will be many such rows for many individuals. They have something called effectors, the number of which is randomly generated, so say you get 3 in the No_of_Effectors column. Then I make R generate 3 numbers between 1 and 10,000; this gives me three numerical representations of genes. These numbers will be compared to a similar data structure for the host individuals, whose immune genes are represented by similar numbers.
>
> My problem is that obviously I can't stick 3 numbers in one "cell" of the matrix (I've tried) :
>
> Pathogen_Individuals[1,3] <- c(2,3,4)
Consider using a data.frame whose third column (Effectors) is of list
type. Then you can do:
Pathogen_Individuals$Effectors[1] <- list(c(2,3,4))
And what you get is:
> Pathogen_Individuals
    ID No_of_Effectors Effectors
1 0001               3   2, 3, 4
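A minimal sketch of building such a data frame from scratch may help. The population size, the seed, and the range 1 to 5 for the number of effectors are assumptions made for illustration; only the 1 to 10,000 range for the gene numbers comes from the question.

```r
set.seed(1)          # illustrative seed, only for reproducibility
n_individuals <- 4   # hypothetical population size

Pathogen_Individuals <- data.frame(
  ID = sprintf("%04d", seq_len(n_individuals)),
  No_of_Effectors = sample(1:5, n_individuals, replace = TRUE),  # assumed range
  stringsAsFactors = FALSE
)

# List column: one integer vector per individual, each drawn from 1..10000
Pathogen_Individuals$Effectors <- lapply(
  Pathogen_Individuals$No_of_Effectors,
  function(n) sample(1:10000, n)
)

# [[ ]] returns the vector stored in one "cell";
# [ ] would return a list of length one instead
Pathogen_Individuals$Effectors[[1]]

# Replace the effectors of individual 1, then keep the count column in sync
Pathogen_Individuals$Effectors[1] <- list(c(2, 3, 4))
Pathogen_Individuals$No_of_Effectors <- lengths(Pathogen_Individuals$Effectors)
```

The list column is added after the data frame is created because data.frame() would otherwise try to split a plain list argument into several columns.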
Uwe Ligges
> Error in Pathogen_Individuals[1, 3] <- c(345, 567, 678) :
> number of items to replace is not a multiple of replacement length
>
> In future I'm also going to have more variables such as whether a gene is expressed. Such information may require a matrix in itself - something like:
>
>
> Effector ID Sequence Expressed?
> [1,] 0001 345,567,678 1 (or 0).
>
> Is there a way to put more than one value in a cell, such as a list of values, or to put objects in a cell of a data frame, matrix or table? Almost an inception deal: data structures nested in a data structure. If I search for things like "insert list into matrix", I get results on how to turn one into the other, which is not what I think I need to be doing.
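One possible way to nest a whole table per individual is to let each element of a list column be a small data frame. The sketch below is only an illustration: make_effector_table() is a hypothetical helper, and the randomly drawn 0/1 Expressed flag is an assumption about how expression might be recorded.

```r
# Hypothetical helper: one row per effector of a single individual
make_effector_table <- function(sequences) {
  data.frame(
    Sequence  = sequences,
    Expressed = sample(0:1, length(sequences), replace = TRUE)  # assumed 0/1 flag
  )
}

Pathogen_Individuals <- data.frame(ID = c("0001", "0002"),
                                   stringsAsFactors = FALSE)

# Each "cell" of the Effectors column now holds a data frame
Pathogen_Individuals$Effectors <- list(
  make_effector_table(c(345, 567, 678)),
  make_effector_table(c(12, 9001))
)

# Pull out the nested table of individual 1,
# then the sequences of the effectors that are expressed
eff1 <- Pathogen_Individuals$Effectors[[1]]
eff1$Sequence[eff1$Expressed == 1]
```

Whether a nested table or one long table with an individual ID column is the better layout depends on how the simulation loops access the data.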
>
> I have been considering having several data structures not nested in each other, for example creating a new matrix object for every individual with the name Effectors_[Individual_ID], and somehow getting my simulation loops to operate on those objects. But I find it hard to see how to tell R that all of those matrices are to be included in an operation, as you can with all rows of a data frame using for loops.
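Rather than one separately named object per individual, the per-individual objects can be kept as elements of a single named list, which can then be processed in one operation. The following is a sketch under assumed data: the effector values, the host_immune_genes vector, and the matching rule (exact equality of numbers) are all made up for illustration.

```r
# Hypothetical effector vectors, keyed by individual ID instead of being
# stored in separate objects such as Effectors_0001, Effectors_0002, ...
effectors <- list(
  "0001" = c(345, 567, 678),
  "0002" = c(12, 9001),
  "0003" = c(567, 42, 7, 8800)
)

host_immune_genes <- c(567, 678, 9001)  # assumed host repertoire

# Apply the same operation to every individual:
# count the effectors that match one of the host's immune genes
n_recognised <- vapply(
  effectors,
  function(e) sum(e %in% host_immune_genes),
  integer(1)
)
n_recognised

# An explicit for loop over the same list works as well
for (id in names(effectors)) {
  effectors[[id]] <- sort(effectors[[id]])
}
```

The same vapply() call works unchanged on a list column of a data frame, since such a column is itself a list.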
> This is strange for me because this model was written in a macro-code for another program which handles data in a different format and layout to R.
>
> My problem, I think, is that each individual in the model has many variables, in this case representations of genes. So I'm having trouble getting my head around this.
>
> Hopefully someone more experienced will be able to offer advice or a solution; it would be much appreciated.
>
> Many Thanks,
> Ben Ward (ENV, UEA & The Sainsbury Lab, JIC).
>
> P.S. I have searched previous queries to the list, and I'm not sure, but this may be useful or relevant:
>
>
> Have you thought of using a list?
>
>> a <- matrix(1:10, nrow=2)
>> b <- 1:5
>> x <- list(a=a, b=b)
>> x
> $a
> [,1] [,2] [,3] [,4] [,5]
> [1,] 1 3 5 7 9
> [2,] 2 4 6 8 10
>
> $b
> [1] 1 2 3 4 5
>
>> x$a
> [,1] [,2] [,3] [,4] [,5]
> [1,] 1 3 5 7 9
> [2,] 2 4 6 8 10
>> x$b
> [1] 1 2 3 4 5
>
> oliveoil and yarn datasets have been mentioned.