[R] If find x, then y, else do nothing

Joshua Wiley jwiley.psych at gmail.com
Mon Aug 8 20:00:26 CEST 2011


Possibly.  If any(grepl(4, x)) somehow returned TRUE for a subject
that only had 1s & 2s, every 2 would be recoded to 1, so the effect
would be to return all 1s.  For example:

dummy <- factor(c("2", "1", "NA", "4"))
foo(dummy)

> foo(dummy)
[1] 1  1  NA 2
Levels: 1 2 NA

Note the internal representation of the dummy factor:

> as.numeric(dummy)
[1] 2 1 4 3

The string "NA" is stored as a factor level, the 4th, so its numeric
representation is 4.  That is not how it appears when you print it to
the screen.
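
To make the difference between the printed labels and the stored codes
concrete, here is a small base R sketch (no packages needed; the names
f, codes, labels, and vals are made up for illustration):

```r
## a factor built from character data; "NA" here is a string, not a missing value
f <- factor(c("2", "1", "NA", "4"))

levels(f)                    # the distinct labels, sorted
codes  <- as.numeric(f)      # integer codes: position of each label in levels(f)
labels <- as.character(f)    # the values as they print

## going through as.character() recovers the numbers seen on screen;
## the string "NA" turns into a real missing value
vals <- as.numeric(labels)

codes
vals
```

If the real column turns out to be a factor, converting via
as.character() first is probably what you want before any numeric
recoding.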

For your *real* data set, please send str(yourdata) and also a copy of
exactly how you are running the code.
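
In case it helps while you gather that, below is a rough sketch of a
variant that tests for an exact 4 and works on the printed values
rather than the internal codes.  It is untested on your real data: the
data frame blah here is a made-up stand-in (only the names blah,
subject and resp come from this thread), and foo2 is a hypothetical
name.  It uses nested ifelse() from base R instead of car's recode();
that swap is an assumption for illustration, not what the earlier code
did.

```r
## made-up stand-in for the real data: resp stored as a factor, with NAs
blah <- data.frame(subject = rep(c(8L, 42L), each = 4),
                   resp = factor(c(2, 1, NA, 1, 4, 2, NA, 4)))
str(blah)   # shows at a glance whether resp is numeric or a factor

## recode 2 -> 1 and 4 -> 2, but only if the group contains an exact 4
foo2 <- function(x) {
  v <- as.numeric(as.character(x))      # use the labels, not the codes
  if (any(v == 4, na.rm = TRUE)) {      # exact comparison, NAs ignored
    v <- ifelse(v == 2, 1, ifelse(v == 4, 2, v))
  }
  v
}

## apply per subject; ave() is given a plain numeric vector
blah$respalt <- with(blah,
                     ave(as.numeric(as.character(resp)), subject, FUN = foo2))
blah
```

Subjects without any 4 come back unchanged, and NAs stay NA either way.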

On Mon, Aug 8, 2011 at 10:35 AM, Edward Patzelt <patze003 at umn.edu> wrote:
> Here's a thought, when I execute it on a data frame with only the subject
> and resp variables I have the same problem.  But not on the subset of data
> we've been working with.  The difference between the full vector and the
> subset vector is that the full vector contains NA's.  Only 35 out of 17,000
> trials, but could this screw up the function?
>
> unique(blah$resp)
>
> [1] 2 1 NA 4
>
> On Mon, Aug 8, 2011 at 11:46 AM, Joshua Wiley <jwiley.psych at gmail.com>
> wrote:
>>
>> These are shots in the dark, but you could try: running it in a clean
>> R session with only the necessary packages and data loaded; testing on
>> subsets of your data; examining the classes of all your variables and
>> making sure they are what is expected; upgrading your version of R.
>>
>> I just retested this on:
>>
>> R version 2.12.1 (2010-12-16)
>> Platform: x86_64-pc-mingw32/x64 (64-bit)
>>
>> locale:
>> [1] LC_COLLATE=English_United States.1252
>> [2] LC_CTYPE=English_United States.1252
>> [3] LC_MONETARY=English_United States.1252
>> [4] LC_NUMERIC=C
>> [5] LC_TIME=English_United States.1252
>>
>> attached base packages:
>> [1] grid      splines   stats     graphics  grDevices utils     datasets
>> [8] methods   base
>>
>> other attached packages:
>> [1] car_2.0-9       survival_2.36-5 nnet_7.3-1      MASS_7.3-11
>>
>> again without issue, so I am thinking there must be something
>> particular perhaps to your session going on, but it is difficult to
>> say.
>>
>> On Mon, Aug 8, 2011 at 9:35 AM, Edward Patzelt <patze003 at umn.edu> wrote:
>> > Hmmmm, I pulled out a portion of the data set to create the code for
>> > posting.  When I execute this on the data frame I get the following for
>> > subject 8 which is clearly incorrect.  I get this for all subjects who
>> > originally had "1 & 2".
>> >
>> >  dat$respalt <- with(dat, ave(Slide1_RESP, factor(grid), FUN = foo))
>> >> head(dat$respalt, 20)
>> >  [1] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
>> >
>> > On Mon, Aug 8, 2011 at 11:14 AM, Joshua Wiley <jwiley.psych at gmail.com>
>> > wrote:
>> >>
>> >> Hmm, well I suppose it technically does "touch" in some sense still if
>> >> there are 1 & 2s, but it should just return it as is, not changed.
>> >> Thanks for the data, very easy!  Here is what I get:
>> >>
>> >> dat <- structure(list(subject = c(8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L,
>> >> 8L, 42L, 42L, 42L, 42L, 42L, 42L, 42L, 42L, 42L, 42L, 6L, 6L,
>> >> 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L), resp = c(2, 1, 1, 2, 1, 2, 1,
>> >> 1, 1, 1, 4, 4, 2, 2, 4, 4, 2, 2, 2, 2, 2, 2, 1, 1, 1, 1, 2, 2,
>> >> 1, 2)), .Names = c("subject", "resp"), row.names = c(1L, 2L,
>> >> 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 4161L, 4162L, 4163L, 4164L,
>> >> 4165L, 4166L, 4167L, 4168L, 4169L, 4170L, 166L, 167L, 168L, 169L,
>> >> 170L, 171L, 172L, 173L, 174L, 175L), class = "data.frame")
>> >>
>> >> require(car)
>> >> foo <- function(x) {
>> >>  if (any(grepl(4, x))) {
>> >>    x <- recode(x, "2 = 1; 4 = 2")
>> >>  }
>> >>  return(x)
>> >> }
>> >>
>> >> dat$respalt <- with(dat, ave(resp, factor(subject), FUN = foo))
>> >>
>> >> ## which at least for me gives:
>> >> structure(list(subject = c(8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L,
>> >> 8L, 42L, 42L, 42L, 42L, 42L, 42L, 42L, 42L, 42L, 42L, 6L, 6L,
>> >> 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L), resp = c(2, 1, 1, 2, 1, 2, 1,
>> >> 1, 1, 1, 4, 4, 2, 2, 4, 4, 2, 2, 2, 2, 2, 2, 1, 1, 1, 1, 2, 2,
>> >> 1, 2), respalt = c(2, 1, 1, 2, 1, 2, 1, 1, 1, 1, 2, 2, 1, 1,
>> >> 2, 2, 1, 1, 1, 1, 2, 2, 1, 1, 1, 1, 2, 2, 1, 2)), .Names = c("subject",
>> >> "resp", "respalt"), row.names = c(1L, 2L, 3L, 4L, 5L, 6L, 7L,
>> >> 8L, 9L, 10L, 4161L, 4162L, 4163L, 4164L, 4165L, 4166L, 4167L,
>> >> 4168L, 4169L, 4170L, 166L, 167L, 168L, 169L, 170L, 171L, 172L,
>> >> 173L, 174L, 175L), class = "data.frame")
>> >>
>> >> Does that work for you?  I am running:
>> >>
>> >> R Under development (unstable) (2011-07-30 r56564)
>> >> Platform: x86_64-pc-mingw32/x64 (64-bit)
>> >> with car_2.0-10
>> >>
>> >> HTH,
>> >>
>> >> Josh
>> >>
>> >> On Mon, Aug 8, 2011 at 9:02 AM, Edward Patzelt <patze003 at umn.edu>
>> >> wrote:
>> >> > Here's the code.  I don't want it to even touch the vector if there are
>> >> > already "1's & 2's"
>> >> >
>> >> > structure(list(subject = c(8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L,
>> >> > 8L, 42L, 42L, 42L, 42L, 42L, 42L, 42L, 42L, 42L, 42L, 6L, 6L,
>> >> > 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L), resp = c(2, 1, 1, 2, 1, 2, 1,
>> >> > 1, 1, 1, 4, 4, 2, 2, 4, 4, 2, 2, 2, 2, 2, 2, 1, 1, 1, 1, 2, 2,
>> >> > 1, 2)), .Names = c("subject", "resp"), row.names = c(1L, 2L,
>> >> > 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 4161L, 4162L, 4163L, 4164L,
>> >> > 4165L, 4166L, 4167L, 4168L, 4169L, 4170L, 166L, 167L, 168L, 169L,
>> >> > 170L, 171L, 172L, 173L, 174L, 175L), class = "data.frame")
>> >> >
>> >> > R version 2.12.2 (2011-02-25)
>> >> > Platform: x86_64-pc-mingw32/x64 (64-bit)
>> >> >
>> >> > locale:
>> >> > [1] LC_COLLATE=English_United States.1252
>> >> > [2] LC_CTYPE=English_United States.1252
>> >> > [3] LC_MONETARY=English_United States.1252
>> >> > [4] LC_NUMERIC=C
>> >> > [5] LC_TIME=English_United States.1252
>> >> >
>> >> > attached base packages:
>> >> > [1] splines   stats     graphics  grDevices utils     datasets  methods
>> >> > [8] base
>> >> >
>> >> > other attached packages:
>> >> > [1] car_2.0-10      nnet_7.3-1      MASS_7.3-11     Hmisc_3.8-3
>> >> > [5] survival_2.36-5 RODBC_1.3-2
>> >> >
>> >> > loaded via a namespace (and not attached):
>> >> > [1] cluster_1.13.3  grid_2.12.2     lattice_0.19-17 tools_2.12.2
>> >> >
>> >> > On Mon, Aug 8, 2011 at 10:43 AM, Joshua Wiley
>> >> > <jwiley.psych at gmail.com>
>> >> > wrote:
>> >> >>
>> >> >> On Mon, Aug 8, 2011 at 8:23 AM, Edward Patzelt <patze003 at umn.edu>
>> >> >> wrote:
>> >> >> > Thanks Josh for the code to post I have been trying to figure out how
>> >> >> > to do that.  Your code works except that it changes subjects that
>> >> >> > responded with "1 & 2" to all 1's.  What does the "ave" argument mean
>> >> >> > in the execution of
>> >> >>
>> >> >> Not in the example data you provided on my system.  If you can provide
>> >> >> data (preferably using dput() or uploading a txt file on a file
>> >> >> hosting service) that reproduces this issue, I will be happy to look
>> >> >> at it for you.  You might also try reporting your sessionInfo(); this
>> >> >> may be related to the version of R or the packages you are using, but
>> >> >> at present I have no information.
>> >> >>
>> >> >> > the function?
>> >> >>
>> >> >> 'ave' is not an argument in the function.  ave() *is* a function.  I
>> >> >> call with() to have the ave() function evaluated in an environment
>> >> >> created from dat (the data).  See ?ave and ?with.  I could have
>> >> >> equivalently (though more cumbersomely) written:
>> >> >>
>> >> >> ave(dat$Slide1_RESP, dat$Subject, FUN = foo)
>> >> >>
>> >> >> Because ave() is then evaluated in the global environment, it does
>> >> >> not have access to the variables stored in 'dat' unless explicitly
>> >> >> told that they are in dat (as above).
>> >> >>
>> >> >> Cheers,
>> >> >>
>> >> >> Josh
>> >> >>
>> >> >> > library(car)
>> >> >> > foo <- function(x) {
>> >> >> >  if (any(grepl(4, x))) {
>> >> >> >    x <- recode(x, "2 = 1; 4 = 2")
>> >> >> >  }
>> >> >> >  return(x)
>> >> >> > }
>> >> >> > ## do it
>> >> >> > dat$test <- with(dat, ave(Slide1_RESP, Subject, FUN = foo))
>> >> >> > On Fri, Aug 5, 2011 at 4:56 PM, Joshua Wiley
>> >> >> > <jwiley.psych at gmail.com>
>> >> >> > wrote:
>> >> >> >>
>> >> >> >> On Fri, Aug 5, 2011 at 2:34 PM, Edward Patzelt <patze003 at umn.edu>
>> >> >> >> wrote:
>> >> >> >> >
>> >> >> >> > The problem is that we were using a task where some subjects
>> >> >> >> > responded with "1 & 2" and some responded with "2 & 4".  So there is
>> >> >> >> > overlap for 2 because it means stimulus 1 for subject 1 and it means
>> >> >> >> > stimulus 2 for subject 2.
>> >> >> >> >
>> >> >> >> > subject
>> >> >> >> > subject_1    1
>> >> >> >> > subject_1    1
>> >> >> >> > subject_1    1
>> >> >> >> > subject_1    2
>> >> >> >> > subject_1    2
>> >> >> >> > subject_2    4
>> >> >> >> > subject_2    2
>> >> >> >> > subject_2    2
>> >> >> >> > subject_2    4
>> >> >> >> > subject_2    2
>> >> >> >> > subject_2    2
>> >> >> >> > subject_2    2
>> >> >> >> > subject_2    4
>> >> >> >>
>> >> >> >> FYI providing data in the format above (this list is plain text) makes
>> >> >> >> the job of those trying to help substantially harder.  You can use
>> >> >> >> dput().  For example, if I wanted to share the first 10 rows of the
>> >> >> >> built in mtcars data set, I would just copy and paste the output from
>> >> >> >> running:
>> >> >> >>
>> >> >> >> dput(mtcars[1:10, ])
>> >> >> >>
>> >> >> >> Anyway, here you go, this should be directly executable as long as you
>> >> >> >> have installed the 'car' package.
>> >> >> >>
>> >> >> >> ## your data in a form easily copied and pasted into the console
>> >> >> >> ## created using dput() (highly recommended for future posts)
>> >> >> >> dat <- structure(list(subject = c("subject_1", "subject_1",
>> >> >> >> "subject_1",
>> >> >> >> "subject_1", "subject_1", "subject_2", "subject_2", "subject_2",
>> >> >> >> "subject_2", "subject_2", "subject_2", "subject_2", "subject_2"
>> >> >> >> ), val = c(1, 1, 1, 2, 2, 4, 2, 2, 4, 2, 2, 2, 4)), .Names =
>> >> >> >> c("subject",
>> >> >> >> "val"), row.names = c(NA, -13L), class = "data.frame")
>> >> >> >>
>> >> >> >> ## load the required package for recoding
>> >> >> >> ## though it is overkill for only two levels
>> >> >> >> require(car)
>> >> >> >>
>> >> >> >> ## define a function to do the recoding
>> >> >> >> foo <- function(x) {
>> >> >> >>  if (any(grepl(4, x))) {
>> >> >> >>    x <- recode(x, "2 = 1; 4 = 2")
>> >> >> >>  }
>> >> >> >>  return(x)
>> >> >> >> }
>> >> >> >>
>> >> >> >> ## do it
>> >> >> >> dat$altval <- with(dat, ave(val, subject, FUN = foo))
>> >> >> >>
>> >> >> >> Cheers,
>> >> >> >>
>> >> >> >> Josh
>> >> >> >>
>> >> >> >>
>> >> >> >> >
>> >> >> >> > On Fri, Aug 5, 2011 at 4:25 PM, Joshua Wiley
>> >> >> >> > <jwiley.psych at gmail.com>
>> >> >> >> > wrote:
>> >> >> >> >>
>> >> >> >> >> Hi Edward,
>> >> >> >> >>
>> >> >> >> >> You can try something like:
>> >> >> >> >>
>> >> >> >> >> u.ppl <- unique(init.dat1$grid)
>> >> >> >> >> l.ppl <- ifelse(grepl(4, init.dat1$Slide1_RESP), 2,
>> >> >> >> >>                init.dat1$Slide1_RESP)
>> >> >> >> >>
>> >> >> >> >> Note that this is not exact as you have not provided a reproducible
>> >> >> >> >> example.  I am not exactly sure how you are putting 1 for 2 and 2 for
>> >> >> >> >> 4, if the value is equal to 4, but presumably it is clearer with data.
>> >> >> >> >> In any event, look at ?ifelse; it is something like a vectorized if
>> >> >> >> >> statement and is, I believe, preferable to your use of a for loop.  I
>> >> >> >> >> can probably give you a runnable solution if you can give the first
>> >> >> >> >> few rows of the relevant data.
>> >> >> >> >>
>> >> >> >> >> Cheers,
>> >> >> >> >>
>> >> >> >> >> Josh
>> >> >> >> >>
>> >> >> >> >> On Fri, Aug 5, 2011 at 2:15 PM, Edward Patzelt
>> >> >> >> >> <patze003 at umn.edu>
>> >> >> >> >> wrote:
>> >> >> >> >> > I want to write code that says "If you find an element equal to 4 in
>> >> >> >> >> > this vector for each person in the data set tested separately, then
>> >> >> >> >> > put in 1 for 2 and 2 for 4, else leave the variable as is"
>> >> >> >> >> >
>> >> >> >> >> >  u.ppl <- (unique(init.dat1$grid))
>> >> >> >> >> >      l.ppl <- length(u.ppl)
>> >> >> >> >> >        for (i in 1:l.ppl)
>> >> >> >> >> >        {
>> >> >> >> >> >          if (grep("4",init.dat1$Slide1_RESP)) {2 == 1, 4 == 2};
>> >> >> >> >> >          else init.dat1$Slide1_RESP
>> >> >> >> >> >
>> >> >> >> >> >        }
>> >> >> >> >> >
>> >> >> >> >> > --
>> >> >> >> >> > Edward H. Patzelt
>> >> >> >> >> > Research Assistant – TRiCAM Lab
>> >> >> >> >> > University of Minnesota – Psychology/Psychiatry
>> >> >> >> >> > VA Medical Center
>> >> >> >> >> > Office: S355 Elliot Hall - Twin Cities Campus
>> >> >> >> >> > Phone: 612-626-0072  Email: patze003 at umn.edu
>> >> >> >> >> >
>> >> >> >> >> > Please consider the environment before printing this email
>> >> >> >> >> > www.psych.umn.edu/research/tricam
>> >> >> >> >> >
>> >> >> >> >> >        [[alternative HTML version deleted]]
>> >> >> >> >> >
>> >> >> >> >> >
>> >> >> >> >> > ______________________________________________
>> >> >> >> >> > R-help at r-project.org mailing list
>> >> >> >> >> > https://stat.ethz.ch/mailman/listinfo/r-help
>> >> >> >> >> > PLEASE do read the posting guide
>> >> >> >> >> > http://www.R-project.org/posting-guide.html
>> >> >> >> >> > and provide commented, minimal, self-contained, reproducible
>> >> >> >> >> > code.
>> >> >> >> >> >
>> >> >> >> >> >
>> >> >> >> >>
>> >> >> >> >>
>> >> >> >> >>
>> >> >> >> >> --
>> >> >> >> >> Joshua Wiley
>> >> >> >> >> Ph.D. Student, Health Psychology
>> >> >> >> >> Programmer Analyst II, ATS Statistical Consulting Group
>> >> >> >> >> University of California, Los Angeles
>> >> >> >> >> https://joshuawiley.com/



-- 
Joshua Wiley
Ph.D. Student, Health Psychology
Programmer Analyst II, ATS Statistical Consulting Group
University of California, Los Angeles
https://joshuawiley.com/


