[R] how to convert a data.frame to a list of dist objects for individual differences MDS?

Phil Spector spector at stat.berkeley.edu
Tue Mar 22 17:47:54 CET 2011


Michael -
    I think this does what you want:

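helm.raw <- read.table("http://euclid.psych.yorku.ca/datavis/Private/mdshelm.dat",
                       header=TRUE, row.names=1)
# map the 1-letter codes to color name abbreviations
trans = c('A'='RPur','C'='Red','E'='Yel','G'='Gy1','I'='Gy2',
          'K'='Green','M'='Blue','O'='BlP','Q'='Pur1','S'='Pur2')
# split each row name ("AC") into its two letters: a 45 x 2 character matrix
cnames = do.call(rbind,strsplit(rownames(helm.raw), ""))
cnames = apply(cnames,2,function(x)trans[x])
# the 10 color labels, in order of first appearance
uu = unique(as.vector(cnames))

# turn one column of dissimilarities into a dist object
onecol = function(col){
    themat = matrix(NA,10,10)
    dimnames(themat) = list(uu,uu)
    themat[cnames] = col      # character-matrix indexing fills the upper triangle
    as.dist(t(themat))        # as.dist() reads the lower triangle, so transpose
}

result = lapply(as.data.frame(helm.raw),onecol)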

> result$CD1
       RPur  Red  Yel  Gy1  Gy2 Green Blue  BlP Pur1
Red   11.5 
Yel   13.1  6.0 
Gy1   12.6  7.9  6.2 
Gy2   10.6  8.4  8.4  5.2 
Green 10.6  9.4  9.9  6.5  4.1 
Blue  10.8 10.2 10.3  8.8  7.0   6.4 
BlP    7.3 11.3 12.7 11.2 10.4   9.9  4.2 
Pur1   5.4 11.5 12.9 11.7 10.8   9.4  8.4  4.5 
Pur2   5.0 11.5 10.7 10.2 10.6  10.1  8.1  6.4  3.0
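
If it helps to see the idea without downloading the data, here is a minimal
sketch on a made-up 4-color example. The data frame toy, its values, and the
names labs, pairs, lev, by_name and by_order are all invented for
illustration. It shows the same character-matrix indexing used above, plus a
possible shortcut that assumes, as you describe, that the rows are already in
lower-triangle column order.

```r
# Toy data (made up): 4 stimuli, 6 = 4*3/2 pairs, 2 hypothetical subjects.
# Rows are ordered by columns of the lower triangle, as described for helm.raw.
toy <- data.frame(S1 = c(1, 2, 3, 4, 5, 6),
                  S2 = c(6, 5, 4, 3, 2, 1),
                  row.names = c("AC", "AE", "AG", "CE", "CG", "EG"))
labs <- c(A = "RPur", C = "Red", E = "Yel", G = "Gy1")

# Split each row name into its two letters, then map letters to labels.
pairs <- do.call(rbind, strsplit(rownames(toy), ""))   # 6 x 2 character matrix
pairs <- cbind(unname(labs[pairs[, 1]]), unname(labs[pairs[, 2]]))
lev <- unique(as.vector(pairs))                        # labels, first-seen order

# Approach 1: index the matrix by (row name, column name) pairs.
by_name <- function(col) {
    m <- matrix(NA_real_, length(lev), length(lev), dimnames = list(lev, lev))
    m[pairs] <- col      # fills the upper triangle, e.g. m["RPur", "Red"]
    as.dist(t(m))        # as.dist() reads the lower triangle
}

# Approach 2 (assumption: row order matches the lower triangle, by columns).
by_order <- function(col) {
    m <- matrix(0, length(lev), length(lev), dimnames = list(lev, lev))
    m[lower.tri(m)] <- col
    as.dist(m)
}

res1 <- lapply(toy, by_name)
res2 <- lapply(toy, by_order)

# Compare as full matrices; the dist objects themselves differ in their
# "call" attribute, so identical(res1, res2) would not be a fair check.
same <- isTRUE(all.equal(as.matrix(res1$S1), as.matrix(res2$S1)))
```

Both approaches should give the same distances for the toy data. The
name-based version does not depend on the row order, so it is probably the
safer choice if the ordering is ever in doubt.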

 					- Phil Spector
 					 Statistical Computing Facility
 					 Department of Statistics
 					 UC Berkeley
 					 spector at stat.berkeley.edu



On Tue, 22 Mar 2011, Michael Friendly wrote:

> I have a 45 x 16 data frame consisting of dissimilarities among 10 colors,
> giving in each column the 45 = 10*9/2 pairwise judgments for one of 16
> subjects.  The rownames identify each pair of colors, e.g., "AC" = ("A","C"),
> and the pairs are ordered by columns in the lower triangle of each distance
> matrix.
>
>> helm.raw <- read.table("http://euclid.psych.yorku.ca/datavis/Private/mdshelm.dat",
>                        header=TRUE, row.names=1)
>> head(helm.raw)
>      N1   N2   N3   N4   N5  N6a  N6b   N7   N8  N9  N10  CD1 CD2a CD2b  CD3  CD4
> AC  6.8  5.9  7.1  7.5  6.6  5.2  5.8  6.2  7.5 6.0  9.2 11.5  9.3  9.0 10.4  9.9
> AE 12.5 11.1 10.2 10.3 10.5  9.4 10.5 10.8  9.1 9.4 10.8 13.1 10.7 10.0 12.4 13.2
> AG 13.8 18.8 11.1 10.7 10.2 11.4 13.4  9.9 10.2 9.5  9.7 12.6 10.7 10.4 12.8 12.3
> AI 14.2 17.3 12.5 11.6  9.6 13.3 14.0 11.1 12.1 9.5 10.1 10.6 11.9 10.0 13.7 11.1
> AK 12.5 16.6 11.8 10.6 10.8 12.0 13.2 10.3 12.5 9.8 10.3 10.6 11.0  9.3 11.8  8.7
> AM 11.0 16.5  9.9  9.7  9.7 12.3 11.7  8.8  9.7 8.7  9.7 10.8  9.8  8.6  4.3  5.6
>> row.names(helm.raw)
>  [1] "AC" "AE" "AG" "AI" "AK" "AM" "AO" "AQ" "AS" "CE" "CG" "CI" "CK" "CM" "CO" "CQ" "CS" "EG" "EI" "EK"
> [21] "EM" "EO" "EQ" "ES" "GI" "GK" "GM" "GO" "GQ" "GS" "IK" "IM" "IO" "IQ" "IS" "KM" "KO" "KQ" "KS" "MO"
> [41] "MQ" "MS" "OQ" "OS" "QS"
>>
>
> To analyse this (with individual differences MDS, e.g., smacofDiff()), I
> need to:
>
> (a) convert this to a list of objects of class "dist", one for each column
> of helm.raw
> (b) rename the 1-letter codes to color name abbreviations as row/col labels
> for each distance matrix, according to:
>  'A'='RPur'
>  'C'='Red'
>  'E'='Yel'
>  'G'='Gy1'
>  'I'='Gy2'
>  'K'='Green'
>  'M'='Blue'
>  'O'='BlP'
>  'Q'='Pur1'
>  'S'='Pur2'
>
> I've done this in SAS, but I don't know how to do it in R because neither
> dist() nor as.dist() seems to be able to work with data in this format.  I
> could try brute-force, but maybe there is an easier way.  Can someone help?
>
> As a distance matrix, the column helm.raw$CD1 for subject CD1 should appear
> something like shown below (without the Obs column, where stim is the
> rowname)
>
> --------------------------------- Subject=CD1 ----------------------------------
>
>     Obs  stim   RPur   Red   Yel   Gy1   Gy2  Green  Blue  BlP  Pur1  Pur2
>
>       1  RPur     .     .     .     .     .      .     .    .     .     .
>       2  Red    11.5    .     .     .     .      .     .    .     .     .
>       3  Yel    13.1   6.0    .     .     .      .     .    .     .     .
>       4  Gy1    12.6   7.9   6.2    .     .      .     .    .     .     .
>       5  Gy2    10.6   8.4   8.4   5.2    .      .     .    .     .     .
>       6  Green  10.6   9.4   9.9   6.5   4.1     .     .    .     .     .
>       7  Blue   10.8  10.2  10.3   8.8   7.0    6.4    .    .     .     .
>       8  BlP     7.3  11.3  12.7  11.2  10.4    9.9   4.2   .     .     .
>       9  Pur1    5.4  11.5  12.9  11.7  10.8    9.4   8.4  4.5    .     .
>      10  Pur2    5.0  11.5  10.7  10.2  10.6   10.1   8.1  6.4    3     .
>
>
> -- 
> Michael Friendly     Email: friendly AT yorku DOT ca
> Professor, Psychology Dept.
> York University      Voice: 416 736-5115 x66249 Fax: 416 736-5814
> 4700 Keele Street    Web:   http://www.datavis.ca
> Toronto, ONT  M3J 1P3 CANADA
>


