[R] Cleaning database: grep()? apply()?

Jonas Malmros jonas.malmros at gmail.com
Tue Nov 13 20:34:04 CET 2007


Dear R users,

I have a huge database and I need to adjust it somewhat.

Here is a small excerpt from the database:

CODE  NAME                          DATE  DATA1
4813  ADVANCED TELECOM              1987  0.013
3845  ADVANCED THERAPEUTIC SYS LTD  1987  10.1
3845  ADVANCED THERAPEUTIC SYS LTD  1989  2.463
3845  ADVANCED THERAPEUTIC SYS LTD  1988  1.563
2836  ADVANCED TISSUE SCI  -CL A    1987  0.847
2836  ADVANCED TISSUE SCI  -CL A    1989  0.872
2836  ADVANCED TISSUE SCI  -CL A    1988  0.529

What I need is:
1) Delete all cases whose NAME ends in -CL A (or -OLD, -ADS, etc.)
2) Delete all cases that have fewer than 3 years of data
3) For each remaining case, compute the ratio DATA1(1989) / DATA1(1987)
[and then ratios involving other data variables] and write the results to a
new database consisting of CODE, NAME, and the RATIOs.
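
To make the three steps concrete, here is a rough base R sketch run on the
excerpt above. It is only one possible approach, under several assumptions:
the data are already in a data frame (called db here, a made-up name), the
suffix list is limited to the three suffixes named above, "3 years of data"
means at least 3 distinct DATE values per CODE, and RATIO1 is an
illustrative column name.

```r
# Toy data copied from the excerpt; "db" is a hypothetical name.
db <- data.frame(
  CODE  = c(4813, 3845, 3845, 3845, 2836, 2836, 2836),
  NAME  = c("ADVANCED TELECOM",
            rep("ADVANCED THERAPEUTIC SYS LTD", 3),
            rep("ADVANCED TISSUE SCI  -CL A", 3)),
  DATE  = c(1987, 1987, 1989, 1988, 1987, 1989, 1988),
  DATA1 = c(0.013, 10.1, 2.463, 1.563, 0.847, 0.872, 0.529),
  stringsAsFactors = FALSE
)

# Step 1: drop cases whose NAME ends in one of the unwanted suffixes.
# The suffix list is an assumption; extend it to cover the "etc.".
suffixes <- c("-CL A", "-OLD", "-ADS")
pattern  <- paste0("(", paste(suffixes, collapse = "|"), ")\\s*$")
db <- db[!grepl(pattern, db$NAME), ]

# Step 2: keep only codes with at least 3 distinct years (assumed meaning
# of "3 years of data").
n_years <- tapply(db$DATE, db$CODE, function(x) length(unique(x)))
keep    <- names(n_years)[n_years >= 3]
db      <- db[as.character(db$CODE) %in% keep, ]

# Step 3: put the 1987 and 1989 values side by side, then divide.
first  <- db[db$DATE == 1987, c("CODE", "NAME", "DATA1")]
last   <- db[db$DATE == 1989, c("CODE", "DATA1")]
ratios <- merge(first, last, by = "CODE", suffixes = c("_1987", "_1989"))
ratios$RATIO1 <- ratios$DATA1_1989 / ratios$DATA1_1987

# Final table: CODE, NAME and the ratio (RATIO1 is an illustrative name).
result <- ratios[, c("CODE", "NAME", "RATIO1")]
print(result)
```

On the excerpt, only CODE 3845 should survive both filters. Ratios for the
other data variables could presumably be added the same way, by carrying
those columns through the merge().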

Could someone suggest an efficient way to do these things? I
imagine the first one would involve grep(), and 2 and 3 would involve
the apply family of functions, but I cannot get my mind around the actual
code to perform these adjustments. I am new to R; I do write code, but
it usually consists of for loops and plotting. I would very much
appreciate your help.
Thank you in advance!
-- 
Jonas Malmros
Stockholm University
Stockholm, Sweden


