[R] r code to generate interaction columns

kMan kchamberln at gmail.com
Wed Mar 10 03:52:40 CET 2010


Dear Dhruv,

Your clarification helps, but I'm stumped. Sorry I cannot be of more help.

Sincerely,
KeithC.

-----Original Message-----
From: Sharma, Dhruv [mailto:Dhruv.Sharma at PenFed.org] 
Sent: Monday, March 08, 2010 7:51 AM
To: kMan; r-help at r-project.org
Subject: RE: [R] r code to generate interaction columns


Thanks, Keith.  I wanted generic code that checks each column's data type,
loops through the columns, and creates the interaction columns automatically,
as I want to test this out as a new algorithm for data mining.

Traditional regression may give misleading results under multicollinearity,
so I wanted to take the interaction terms and run them through random
forests and rpart instead. For those methods the interaction terms would
need to be created manually as columns.
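A minimal sketch of what feeding such columns to rpart could look like. The data is simulated, the column names CD and CE are made up for illustration, and the response is deliberately built from the C*D product, so this shows only the mechanics and says nothing about whether the approach helps on real data.

```r
library(rpart)  # recommended package, normally installed alongside R

set.seed(42)
n <- 200

# Simulated predictors (assumption: the real data has numeric C, D, E).
dat <- data.frame(C = runif(n), D = runif(n), E = runif(n))

# Made-up response that depends on the product C*D plus a little noise.
dat$A <- dat$C * dat$D + rnorm(n, sd = 0.05)

# Manually created interaction columns.
dat$CD <- dat$C * dat$D
dat$CE <- dat$C * dat$E

# Regression tree on all predictors, including the new columns.
fit <- rpart(A ~ ., data = dat)

# Named vector of importance scores, one entry per variable the tree used.
print(fit$variable.importance)
```

The same data frame could be handed to a random forest implementation in the same way, since the interaction terms are now ordinary columns.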

Hope that clarifies.

Dhruv

-----Original Message-----
From: kMan [mailto:kchamberln at gmail.com]
Sent: Sunday, March 07, 2010 8:08 PM
To: Sharma, Dhruv; r-help at r-project.org
Subject: RE: [R] r code to generate interaction columns

Dear Dhruv,

You could create interaction variables manually (assuming A is your
dependent variable). Just multiply the variables together:
cd.int<-C*D
ce.int<-C*E
cde.int<-C*D*E # what about D*E, or interactions with B?
Include those in your model, such as
A~B+C+D+E+cd.int+ce.int+cde.int
Then you can compare that model to the results you get when you specify
the interactions in the model formula directly, using the documented syntax.
In your R console, type ?formula or help("formula") for details.
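The comparison described above can be sketched as follows. The data is simulated and lm() is used only as an assumed example of a modelling function. The `:` operator in a formula requests the product term without the implied main effects.

```r
# Simulated data; the column names follow the thread, the values are made up.
set.seed(1)
dat <- data.frame(
  A = rnorm(50),
  B = factor(sample(c("x", "y"), 50, replace = TRUE)),
  C = rnorm(50),
  D = rnorm(50),
  E = rnorm(50)
)

# Manual product columns.
dat$cd.int  <- dat$C * dat$D
dat$ce.int  <- dat$C * dat$E
dat$cde.int <- dat$C * dat$D * dat$E

# Model using the manually created columns.
fit.manual <- lm(A ~ B + C + D + E + cd.int + ce.int + cde.int, data = dat)

# Same terms requested through the formula interface.
fit.formula <- lm(A ~ B + C + D + E + C:D + C:E + C:D:E, data = dat)

# Compare the coefficient estimates, ignoring their names.
same <- all.equal(unname(coef(fit.manual)), unname(coef(fit.formula)))
print(same)
```

Only the coefficient names should differ between the two fits (for example cd.int versus C:D), because both model matrices contain the same product columns.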

Sincerely,
KeithC.


-----Original Message-----
From: Sharma, Dhruv [mailto:Dhruv.Sharma at PenFed.org]
Sent: Saturday, March 06, 2010 10:30 AM
To: r-help at r-project.org
Subject: [R] r code to generate interaction columns

Hi,
   Is there a way to take a dataset, extract its numeric columns, and create
interaction columns from them automatically?

   For example, there are 5 columns of data: A, B, C, D, E.

   C, D, and E are numeric.

   Can someone provide code to automatically create more columns such
as:

   1) C*D, C*E, C*D*E, (C+E)/(D+.01), (D+E)/(C+.01), (C+D)/(E+.01)
(the .01 in each denominator is there to avoid dividing by zero)

?

I know that multiplying variables in a glm formula creates interaction
terms, but I want the columns to be part of the data set so that I can feed
them into a random forest to pick out predictive interaction terms, since
regression cannot reliably handle correlated interaction terms.

If anyone has some simple code that can do this, that would be helpful.
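One possible sketch of such code in base R. The helper name add_interactions(), its arguments, and the generated column names are hypothetical and not taken from any package. It builds every pairwise and three-way product of the numeric columns, plus the (x+y)/(z+.01) ratios listed above.

```r
# Hypothetical helper: adds product and ratio columns for numeric columns.
# 'exclude' names numeric columns to skip (e.g. the response);
# 'eps' is the small constant added to each denominator.
add_interactions <- function(df, exclude = character(0), eps = 0.01) {
  is_num <- vapply(df, is.numeric, logical(1))
  num <- setdiff(names(df)[is_num], exclude)
  out <- df

  # Products of every pair and every triple of numeric columns.
  for (k in 2:3) {
    if (length(num) < k) next
    for (v in combn(num, k, simplify = FALSE)) {
      out[[paste(v, collapse = "_x_")]] <- Reduce(`*`, df[v])
    }
  }

  # Ratios: (sum of two columns) / (a third column + eps).
  if (length(num) >= 3) {
    for (d in num) {
      for (v in combn(setdiff(num, d), 2, simplify = FALSE)) {
        new_name <- paste0(v[1], "_plus_", v[2], "_over_", d)
        out[[new_name]] <- (df[[v[1]]] + df[[v[2]]]) / (df[[d]] + eps)
      }
    }
  }
  out
}

# Made-up example data with the column layout from the question.
dat <- data.frame(
  A = c(1, 0, 1, 0),
  B = c("u", "v", "u", "v"),
  C = 1:4,
  D = c(2, 0, 1, 3),
  E = c(0.5, 1, 1.5, 2)
)
res <- add_interactions(dat, exclude = "A")
print(names(res))
```

Note that adding .01 only prevents division by zero when the denominator column is non-negative, and that the number of generated columns grows quickly with the number of numeric inputs.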

thanks
Dhruv
    
