[R] how to define a function in R

Joshua Wiley jwiley.psych at gmail.com
Thu Jul 8 08:29:25 CEST 2010


Hi Jason,

I did not have time to actually test this code so there may be typos
and some of it may not work as I thought.  I would create a copy of
your data in a test directory and experiment with that until you are
confident you have everything working how you want.  As a side note,
since you are new to R, I would highly recommend reading an intro book
(as has been suggested) and using a good text editor (if you are not
already).  Jumping straight into searching, reading, subsetting, and
writing 400-some data files is a tough way to learn R.

Good luck with your project,

Joshua

##code/comments


setwd(dir = "the directory path with your files here")
getwd() # just to make sure it worked

#An alternative to changing the working directory
#would be to include the full path with the file names
#but to me, that is more cumbersome

#The idea is to get a vector of character strings where each element
#is a different file name

filenames <- dir(path = ".", pattern = "ggg|fff", full.names = FALSE,
                 ignore.case = TRUE)

#print the results to screen
#you can peruse these to make sure they look about right
filenames
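
#A quick way to check the pattern before touching any real files.
#This is only a sketch: the file names below are made up to resemble
#the ones in your message, and grepl() is used just to show which
#names the regular expression would match

```r
# Hypothetical file names, modeled on the examples in the question
example.names <- c("xxxxx_ggg_yyyyy_1.txt", "yyyy_fff_yyyy_xxx.txt",
                   "xxxx_aaa_yyyy.txt", "XXXX_GGG_2.TXT")

# grepl() returns TRUE where the regular expression matches;
# ignore.case = TRUE mirrors the dir() call above
keep <- grepl(pattern = "ggg|fff", x = example.names, ignore.case = TRUE)

# Only the names containing ggg or fff (in any case) remain
example.names[keep]
```

#If other file types live in the same directory, a stricter pattern
#such as "(ggg|fff).*\\.txt$" may be safer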

#Here I initialize a list with as many elements as filenames
#This will hold all the data to be read in from the text files
my.data <- vector(mode = "list", length = length(filenames))

#Name the list
names(my.data) <- filenames

#The next step is to read in each of the files
#and assign each object created to the relevant element of the list
#I am using a for loop
#it could also be done with the apply family of functions, I believe

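#If you do not set stringsAsFactors to FALSE, I believe the 'name'
#column of your data will be converted to a factor
#which might be problematic
for(i in filenames) {
  #row.names is left at its default (row.names = TRUE is not a valid
  #setting): when the header line has one fewer field than the data rows,
  #read.table() uses the first column as the row names
  my.data[[i]] <- read.table(file = i, header = TRUE, sep = "",
                             stringsAsFactors = FALSE)
}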
#Obviously you need to make sure the settings of read.table() are appropriate
#for your text files
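
#The apply-family version mentioned above might look something like this.
#It is a sketch only: the toy files, their names, and their contents are
#made up, and everything happens in a temporary directory so none of
#your real files are touched

```r
# Create two toy files in a temporary directory (hypothetical names/contents)
tmp <- tempfile("demo")
dir.create(tmp)
toy <- data.frame(name = c("aaa", "bbb", "ccc"), count = c(100, 2000, 300),
                  stringsAsFactors = FALSE)
for (f in c("x_ggg_1.txt", "y_fff_2.txt")) {
  write.table(x = toy, file = file.path(tmp, f), sep = "\t", quote = FALSE)
}

# full.names = TRUE returns paths that can be read without setwd()
toy.files <- dir(path = tmp, pattern = "ggg|fff", full.names = TRUE)

# lapply() returns a list with one data frame per file
toy.data <- lapply(toy.files, function(f) {
  read.table(file = f, header = TRUE, sep = "", stringsAsFactors = FALSE)
})

# Name the list elements by file name (without the directory part)
names(toy.data) <- basename(toy.files)
str(toy.data)
```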

my.data.subset <- lapply(my.data, function(x) {
  subset(x, name == "aaa" | name == "bbb")
})

#Assuming that there is at least some data kept for each file
#Something like this should work
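
#To see what the subsetting does, here it is on a small made-up data
#frame (the values are hypothetical).  The %in% form is an alternative
#I would expect to select the same rows, and it is easier to extend
#when there are many names to keep

```r
# Toy data in the same shape as the files described in the question
toy <- data.frame(name = c("aaa", "bbb", "ccc", "ddd", "aaa"),
                  count = c(100, 2000, 300, 3000, 300),
                  stringsAsFactors = FALSE)

# Keep rows whose name is aaa or bbb, written two ways
kept.or <- subset(toy, name == "aaa" | name == "bbb")
kept.in <- toy[toy$name %in% c("aaa", "bbb"), ]

# Compare the two results and count the rows that were kept
all.equal(kept.or, kept.in)
nrow(kept.or)
```

#A file with no aaa or bbb rows would give a data frame with zero rows,
#so checking sapply(my.data.subset, nrow) before writing may be worthwhile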

#Name the new list with your subset data
names(my.data.subset) <- filenames

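#Use a loop to write each element of your data to the appropriate file
#Also note that because the file names are unchanged, writing into the
#directory the files were read from will overwrite your files, so be sure
#you have them saved elsewhere
#unless you are very certain that nothing has gone or will go wrong
for(i in filenames) {
  #sep = "" would run the fields together, so a tab is used instead
  #quote = FALSE keeps the names unquoted, as in the input files
  write.table(x = my.data.subset[[i]], file = i, sep = "\t", quote = FALSE,
              row.names = TRUE, col.names = TRUE)
}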
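
#For the renaming question, something along these lines might be a start.
#The rule is an assumption on my part: it keeps the ggg/fff tag and the
#number just before ".txt".  Your example target (ggg1a.txt) has an extra
#"a" whose rule is not stated, so it is left out here.  The file names
#are made up and the files are empty ones in a temporary directory

```r
# Hypothetical old names, modeled on the examples in the question
old.names <- c("xxxxx_ggg_yyyyy_1.txt", "yyyy_fff_yyyy_2.txt")

# Assumed rule: keep the ggg/fff tag and the number just before ".txt"
new.names <- sub(pattern = "^.*(ggg|fff).*_([0-9]+)\\.txt$",
                 replacement = "\\1\\2.txt", x = old.names)
new.names

# Demonstrate file.rename() on empty toy files in a temporary directory
tmp <- tempfile("rename")
dir.create(tmp)
file.create(file.path(tmp, old.names))

# file.rename() returns TRUE for each file that was renamed successfully
ok <- file.rename(from = file.path(tmp, old.names),
                  to = file.path(tmp, new.names))
dir(tmp)
```

#Names that do not match the pattern are returned unchanged by sub(),
#so it is worth comparing old.names and new.names before renaming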

#Some functions you might read about

?read.table
?write.table
?dir
?regexp # for info about how to select only the relevant filenames using dir()
?subset
?'|' #for info on logical operators
?lapply
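
#Since the original question was about defining a function: the read,
#subset, and write steps could be wrapped into one function and applied
#to each file.  This is a sketch under assumptions, not tested on your
#data: filter.file and its arguments are names I made up, the toy file
#is invented, and the output goes to a separate directory so the input
#is never overwritten

```r
# Hypothetical helper: read one file, keep selected names, write the result
filter.file <- function(infile, outfile, keep = c("aaa", "bbb")) {
  dat <- read.table(file = infile, header = TRUE, sep = "",
                    stringsAsFactors = FALSE)
  dat <- dat[dat$name %in% keep, , drop = FALSE]
  write.table(x = dat, file = outfile, sep = "\t", quote = FALSE,
              row.names = TRUE, col.names = TRUE)
  invisible(nrow(dat))  # number of rows kept
}

# Toy input and output directories under the temporary directory
in.dir <- tempfile("in")
out.dir <- tempfile("out")
dir.create(in.dir)
dir.create(out.dir)

# One made-up input file
toy <- data.frame(name = c("aaa", "bbb", "ccc", "ddd"),
                  count = c(100, 2000, 300, 3000),
                  stringsAsFactors = FALSE)
write.table(x = toy, file = file.path(in.dir, "x_ggg_1.txt"),
            sep = "\t", quote = FALSE)

# Apply the function to every matching file
for (f in dir(path = in.dir, pattern = "ggg|fff")) {
  filter.file(infile = file.path(in.dir, f), outfile = file.path(out.dir, f))
}

# Read the result back to check what was written
result <- read.table(file = file.path(out.dir, "x_ggg_1.txt"),
                     header = TRUE, stringsAsFactors = FALSE)
result
```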


On Wed, Jul 7, 2010 at 12:11 PM, jd6688 <jdsignature at gmail.com> wrote:
>
> Hi Joshua:
>
> Here is what I am trying to accomplish:
>
> I have 400 files named like xxx.txt. The content of each file looks like the
> following:
>
>    name    count
>
> 1. aaa     100
> 2. bbb    2000
> 3. ccc    300
> 4. ddd   3000
>
> ........
> more than 1000 rows in each file.
>
> These are the areas I need help with:
> 1. How can I read in only the files with the string patterns ggg or fff as
> part of the file names?
>  For instance, I only need the file names with ggg or fff in them:
>     xxxxx_ggg_yyyyy_1.txt
>     yyyy_fff_yyyy_xxx.txt
>
>    I don't need to read in files such as xxxx_aaa_yyyy.txt
>
> 2. How can I rename the files?
>
>  For instance: xxxxx_ggg_yyyyy_1.txt ======> changed to ggg1a.txt
>
>
> 3. After the files are read in, how can I keep only the rows with aaa and
> bbb? Everything else should be removed from the files, but the files should
> keep the same file names.
>
>   For instance, the xxxxx_ggg_yyyyy_1.txt file should look like:
>  name    count
>
> 1. aaa    100
> 2. bbb    2000
> 3. aaa    300
> 4. bbb    400
>
>
> Thanks a lot. I am very new to R, and I am looking forward to any help from
> you.
>
>
> On Tue, Jul 6, 2010 at 9:23 PM, Joshua Wiley-2 [via R] wrote:
>
> > Hello,
> >
> > As others have said, it's hard to give specific advice without specific
> > needs, but that's okay; I made up some example needs and some (rather
> > silly) code that might handle it.  Depending what you need to do, it
> > may help you get started.  I tried to explicitly name all the
> > arguments in any functions I used.
> >
> > When I make gmail use basic text format instead of html, code is sent
> > poorly, so you can trundle off here to see the example, if you like.
> >
> > http://gist.github.com/466164
> >
> > Cheers,
> >
> > Josh
> >
> > On Tue, Jul 6, 2010 at 3:48 PM, jd6688 <[hidden email]> wrote:
> >
> > >
> > > 1. How to write an R script?
> > > 2. How to write a SAS-like macro/generic process to process multiple files
> > > by using the same function in R?
> > >
> > > Thanks in advance
> > > --
> > > View this message in context:
> > > http://r.789695.n4.nabble.com/how-to-define-a-function-in-R-tp2280290p2280290.html
> > > Sent from the R help mailing list archive at Nabble.com.
> > >
> > > ______________________________________________
> > > [hidden email] mailing list
> > > https://stat.ethz.ch/mailman/listinfo/r-help
> > > PLEASE do read the posting guide
> > > http://www.R-project.org/posting-guide.html
> > > and provide commented, minimal, self-contained, reproducible code.
> > >
> >
> >
> >
> > --
> > Joshua Wiley
> > Ph.D. Student, Health Psychology
> > University of California, Los Angeles
> > http://www.joshuawiley.com/
> >
>



--
Joshua Wiley
Ph.D. Student, Health Psychology
University of California, Los Angeles
http://www.joshuawiley.com/


