[R] Data arrangement for PLSDA using the ropls package

Mon Sep 18 09:51:57 CEST 2017

Hello,
I would like to do a partial least squares discriminant analysis (PLSDA) in R using the package "ropls",
which can be installed from Bioconductor via the R commands:

source("https://bioconductor.org/biocLite.R")
biocLite("ropls")

I want to illustrate the impact of two genders (AP, C) on 5 compounds measured in persons (samples). When I try to do a PLSDA using my own data, I get the warning message:

"Single component model: only 'overview' and 'permutation' (in case of single response (O)PLS(-DA)) plots available"

I assume it has something to do with the way I arrange my data in R. I tried to arrange it in the same way as in the package example that uses the sacurine data set (bioconductor.org/packages/release/bioc/vignettes/ropls/inst/doc/ropls-vignette.pdf)

Can somebody tell me how I have to arrange my data in order to perform a PLSDA using the "ropls" package?

Thank you very much,

Mike

Please find my code and an example data set below:

CODE:

library(ropls)

# Read the data into a data frame and use the "Sample" column as row names
dta <- read.csv("Demo.csv", sep = ";", header = TRUE)
rownames(dta) <- dta$Sample
dta

# Remove the non-numeric "Sample" and "Gender" columns and convert to a matrix
dta.exp <- dta[, c(-1, -7)]
matrix <- as.matrix(dta.exp)
str(matrix)
matrix

# Create a factor with "Gender" as the y response
dta.treatments <- dta[, 7]
dta.treatments
dta.factor <- as.factor(dta.treatments)

# Fit the PLSDA model (this call produces the warning quoted above)
dta.plsda <- opls(matrix, dta.factor)
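For reference, here is a minimal base-R sketch of the arrangement the code above aims for: a numeric matrix with samples in rows and compounds in columns, plus a factor with one entry per sample in the same row order. The toy values and the names toy, x and y are made up for illustration and are not part of my data. The commented-out last line is only an assumption: as far as I understand, the predI argument of opls() sets the number of predictive components, and leaving it at its default lets the function choose the number automatically, which may be why a single-component model is returned.

```r
# Toy data with the same layout as Demo.csv: an ID column, numeric
# compound columns, and a grouping column (all values are made up).
toy <- data.frame(
  Sample = paste0("sa", 1:6),
  Comp1  = c(1.8, 0.7, 1.2, 0.5, 0.7, 0.6),
  Comp2  = c(0.0, 0.9, 0.3, 2.4, 1.8, 3.0),
  Gender = c("AP", "AP", "AP", "C", "C", "C"),
  stringsAsFactors = FALSE
)

# Select the predictor columns by name rather than by position.
x <- as.matrix(toy[, c("Comp1", "Comp2")])
rownames(x) <- toy$Sample

# Response: one factor entry per sample, in the same row order as x.
y <- factor(toy$Gender)

# Sanity checks on the arrangement.
stopifnot(is.numeric(x), nrow(x) == length(y), !anyNA(x))
table(y)

# Assumption: with ropls installed, the number of predictive components
# could be requested explicitly instead of being chosen automatically:
# model <- ropls::opls(x, y, predI = 2)
```

Selecting the columns by name instead of with c(-1, -7) keeps the split correct even if the column order in the CSV file changes.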

DATA:

> dput(dta)

structure(list(Sample = structure(c(1L, 12L, 23L, 34L, 36L, 37L,

38L, 39L, 40L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 11L, 13L,

14L, 15L, 16L, 17L, 18L, 19L, 20L, 21L, 22L, 24L, 25L, 26L, 27L,

28L, 29L, 30L, 31L, 32L, 33L, 35L), .Label = c("sa1", "sa10",

"sa11", "sa12", "sa13", "sa14", "sa15", "sa16", "sa17", "sa18",

"sa19", "sa2", "sa20", "sa21", "sa22", "sa23", "sa24", "sa25",

"sa26", "sa27", "sa28", "sa29", "sa3", "sa30", "sa31", "sa32",

"sa33", "sa34", "sa35", "sa36", "sa37", "sa38", "sa39", "sa4",

"sa40", "sa5", "sa6", "sa7", "sa8", "sa9"), class = "factor"),

    Comp1 = c(1.7686, 0.6873, 1.2322, 1.4874, 1.8986, 1.3484,

    1.0959, 0.583, 1.039, 1.6133, 0.9595, 1.6377, 1.4538, 0.8737,

    1.3363, 1.7881, 2.3604, 1.1239, 2.1281, 2.037, 0.5314, 0.7147,

    0.5917, 0.6671, 0.6645, 0.9865, 1.019, 0.9664, 0.6966, 0.679,

    0.7976, 0.8503, 1.2566, 0.5881, 0.8838, 0.6657, 0.7399, 0.5778,

    0.7121, 1.1909), Comp2 = c(0.0284, 0.9064, 0, 0.7053, 0.7695,

    0.337, 1.0418, 0.8346, 0.3884, 1.9946, 1.3296, 0.119, 0.0106,

    0.7872, 1.0174, 0.0704, 0.0854, 0.4259, 0.0395, 0.0549, 2.4471,

    1.8418, 2.9805, 1.1181, 0.5403, 2.7181, 1.4835, 0.875, 2.2205,

    2.4106, 1.1967, 0.303, 0.1129, 2.5432, 2.328, 0.9839, 2.3583,

    1.9589, 1.9918, 1.2232), Comp3 = c(2.9976, 1.6201, 0.7497,

    1.371, 2.7035, 0.4533, 0.9927, 1.0973, 1.6702, 1.3696, 0.3392,

    1.1489, 2.1086, 1.1586, 1.3645, 1.6008, 2.9567, 1.5721, 2.9633,

    2.4623, 0.1103, 0.3137, 0.313, 0.2969, 0.5148, 0.7419, 0.5641,

    0.7871, 0.7362, 0.8754, 0.4883, 0.8504, 1.4582, 0.1934, 0.764,

    0.7515, 0.7143, 0.2139, 0.5743, 1.7305), Comp4 = c(0, 0,

    0.603, 0, 1.6524, 0, 0, 0, 0, 1.1056, 0, 0, 0, 0, 0, 0, 5.7848,

    0, 0, 0, 0, 0, 0, 0, 0, 0.7895, 3.4641, 0, 0, 1.7446, 0,

    0, 1.5165, 0, 5.9645, 4.1878, 0.7313, 5.7994, 3.0168, 0),

    Comp5 = c(18.6058, 5.6489, 12.0842, 4.2708, 3.8489, 10.2139,

    6.1149, 11.3373, 8.9013, 5.8342, 18.532, 17.9267, 8.7386,

    6.9455, 7.3044, 19.0811, 10.8809, 10.7149, 4.7057, 0, 10.3088,

    5.1514, 19.1218, 21.1768, 8.3797, 2.7146, 8.7405, 14.4817,

    8.6571, 17.4254, 17.5725, 5.1233, 13.7539, 6.7396, 2.1342,

    14.4216, 9.2952, 19.9525, 2.2317, 16.501), Gender = structure(c(1L,

    1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,

    1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,

    2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L), .Label = c("AP", "C"

    ), class = "factor")), .Names = c("Sample", "Comp1", "Comp2",

"Comp3", "Comp4", "Comp5", "Gender"), class = "data.frame", row.names = c("sa1",

"sa2", "sa3", "sa4", "sa5", "sa6", "sa7", "sa8", "sa9", "sa10",

"sa11", "sa12", "sa13", "sa14", "sa15", "sa16", "sa17", "sa18",

"sa19", "sa20", "sa21", "sa22", "sa23", "sa24", "sa25", "sa26",

"sa27", "sa28", "sa29", "sa30", "sa31", "sa32", "sa33", "sa34",

"sa35", "sa36", "sa37", "sa38", "sa39", "sa40"))

Eisenring Michael, Dr.

Federal Department of Economic Affairs, Education and Research
EAER
Agroecology and Environment
Biosafety

Reckenholzstrasse 191, CH-8046 Zürich
Tel. +41 58 468 7181
Fax +41 58 468 7201
michael.eisenring at agroscope.admin.ch
www.agroscope.ch
