[R] Mailinglist

Rachel Thompson rachel.thompson using student.uva.nl
Sun Jan 6 17:46:56 CET 2019


Hi Michael

Thanks, I'll check it out.

Best,

Rachel

On Sun, Jan 6, 2019 at 11:45 AM Michael Dewey <lists using dewey.myzen.co.uk>
wrote:

> Dear Rachel
>
> Not sure if this is going to help, but if it is a CSV file then
> read.csv() is your friend. Read the help first in case you need to
> specify what is being used for the decimal point and the separator: if
> the file is from the Netherlands, they may not be the default settings.
>
> michael
>
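
To illustrate the point about separators and decimal marks, here is a minimal sketch. It assumes a file that uses a semicolon as the field separator and a comma as the decimal mark, which is common in Dutch locales. The file contents and the column names (Participant, Probetype, Value) are made up for the example and are not taken from the real data.

```r
# Hypothetical sample file: semicolon-separated fields, comma as decimal mark.
csv_path <- tempfile(fileext = ".csv")
writeLines(c("Participant;Probetype;Value",
             "1;ActivityProbe;0,5",
             "2;ScreenProbe;1,25"),
           csv_path)

# read.csv() defaults to sep = "," and dec = "."; override both here.
probes <- read.csv(csv_path, sep = ";", dec = ",", stringsAsFactors = FALSE)

# read.csv2() already defaults to sep = ";" and dec = ",".
probes2 <- read.csv2(csv_path, stringsAsFactors = FALSE)

# Check that Value was read as a number rather than as text.
str(probes)
```

If str() shows a numeric column as character, a wrong dec or sep setting is a likely cause.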
> On 06/01/2019 16:37, Rachel Thompson wrote:
> > Hi Jeff,
> >
> > Thanks for your email.
> > I am an intern from Amsterdam and I have to do an analysis in R. I spoke
> > to my professor in Amsterdam and my supervisors here in Boston, but they
> > are too busy to help. I informed them from the start that I am not
> > familiar with R (RStudio) and they told me that I would receive guidance.
> > Since they cannot help me, I decided to share my problem online.
> > (It is a CSV file imported into R.)
> >
> > Please understand that I am new to this. I will unsubscribe from the
> > mailing list if my question does not belong here.
> >
> > Thanks,
> >
> > Rachel
> >
> > On Sun, Jan 6, 2019 at 11:01 AM Jeff Newmiller <jdnewmil using dcn.davis.ca.us>
> > wrote:
> >
> >> I would not want to leave the impression that I think the task at hand
> >> is merely tedious... my point is that there are numerous steps involved
> >> and each step depends on information that has not been communicated to
> >> the list, and there is a learning curve even in knowing what to include
> >> in an email question. What I do think is that knowing enough basic R
> >> syntax to express small bits of the problem in R will be a vast
> >> improvement over attempting to use only English descriptions, and Rachel
> >> has to bridge that initial gap.
> >>
> >> For example, some images of data were apparently sent to Jim only, yet
> >> he still does not know in what format the data file is stored, so that
> >> technique was not very effective. One way for the question to become
> >> more focused is for Rachel to study up on her own how to import data and
> >> provide us with a "dput" (see the StackOverflow discussion I referenced
> >> before) of a small sample of data. Another is for Rachel to use basic R
> >> syntax to create an anonymous data set from scratch (also outlined in
> >> the SO discussion). These approaches allow us to keep the focus of our
> >> mailing list discussion on manipulating the data into summaries. Another
> >> approach is to re-focus the question on importing data by supplying a
> >> download link to the data so we can make suggestions as to what R
> >> commands will handle this data in its raw form. In any case, we cannot
> >> leapfrog over the data to the analysis as the question stands.
> >>
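
As a rough sketch of the two approaches described above (a "dput" of a small sample, and an anonymous data set built from scratch), something like the following could be pasted into a plain-text email. The data frame and its column names are invented stand-ins guessed from the descriptions in this thread, not the real data.

```r
# Hypothetical anonymous data set built from scratch; the column names
# and values are guesses, not the real data.
probes <- data.frame(
  Participant = c(1, 1, 2, 2, 3),
  Probetype   = c("ActivityProbe", "ScreenProbe", "ActivityProbe",
                  "CallLogProbe", "SMSProbe"),
  Value       = c("high", "TRUE", "none", "3271", "3302"),
  stringsAsFactors = FALSE
)

# dput() prints R code that recreates the object; copy that output
# into the email so readers can rebuild the same data.
sample_rows <- head(probes, 3)
dput(sample_rows)

# Round trip: evaluating the dput() text gives back an equivalent data frame.
dput_text <- paste(capture.output(dput(sample_rows)), collapse = "\n")
rebuilt <- eval(parse(text = dput_text))
```

With real data, head() on the imported data frame plays the role of sample_rows, after removing or recoding anything that identifies a participant.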
> >> Given the above, I have to wonder why Rachel hasn't simply used the tool
> >> she is familiar with... SPSS... to do this? If it is because this is an
> >> academic assignment to learn R then she should be talking to her
> >> institutional support (instructor/teaching assistant/tutoring staff)
> >> anyway, since there is a no-homework policy on this list (and that avenue
> >> would have the benefit of being conducted orally and most likely in her
> >> native language).
> >>
> >>
> >> On January 6, 2019 1:12:46 AM PST, Jim Lemon <drjimlemon using gmail.com>
> >> wrote:
> >>> Hi Rachel,
> >>> It looks to me as though the first thing you want to do is to get your
> >>> data, which you attach as images, into a data frame. If these are flat
> >>> files like CSV or TAB, you should be able to read them in with some
> >>> variant of the read.table function. If Excel, look at the various
> >>> Excel import packages. Then you can operate on the data frame by doing
> >>> things like tabulating Participant ID against the code for SMS or call
> >>> (which I assume are those 3000+ numbers). You can take the differences
> >>> in what look like POSIX time values between successive TRUE and FALSE
> >>> screen values to get the duration of screen activity and it looks like
> >>> participant activity is recorded at regular intervals. As Jeff
> >>> suggested, this is really just boring work figuring out how to extract
> >>> the events:
> >>>
> >>> call_indices <- which(Probetype == "xxxxxxCallLogProbe" &
> >>>                       ValueSpecified == "_id" & Valuedetailed == 3271)
> >>>
> >>> using suitable logical statements and then tabulating them by
> >>> ParticipantID. If you know how to do that in SPSS, it won't be too
> >>> hard to translate the logical statements into R syntax as above. I may
> >>> have misunderstood the variable names, but I think the logic is clear.
> >>>
> >>> Jim
> >>>
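
The tabulation and the time-difference idea described above can be sketched as follows. The data frame is invented: the column names (ParticipantID, Probetype, Value, Time) and the values are assumptions, and the duration step assumes that TRUE and FALSE screen events strictly alternate, starting with TRUE.

```r
# Hypothetical long-format event log; names and values are assumptions.
probes <- data.frame(
  ParticipantID = c(1, 1, 1, 1, 2, 2, 2),
  Probetype     = c("CallLogProbe", "ScreenProbe", "ScreenProbe", "SMSProbe",
                    "CallLogProbe", "CallLogProbe", "SMSProbe"),
  Value         = c("3271", "TRUE", "FALSE", "3302", "3275", "3280", "3310"),
  Time          = as.POSIXct("2019-01-06 10:00:00", tz = "UTC") +
                    c(0, 60, 150, 200, 30, 90, 400),
  stringsAsFactors = FALSE
)

# Count events of each probe type per participant.
event_counts <- table(probes$ParticipantID, probes$Probetype)
print(event_counts)

# Screen-on duration: time from each TRUE to the FALSE that follows it.
# This pairing only works if the two states strictly alternate.
screen <- probes[probes$Probetype == "ScreenProbe", ]
screen <- screen[order(screen$ParticipantID, screen$Time), ]
on  <- screen$Time[screen$Value == "TRUE"]
off <- screen$Time[screen$Value == "FALSE"]
screen_secs <- as.numeric(difftime(off, on, units = "secs"))
```

Real logs rarely alternate perfectly, so it is worth checking that on and off have the same length before trusting the durations.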
> >>> On Sun, Jan 6, 2019 at 4:07 PM Rachel Thompson
> >>> <rachel.thompson using student.uva.nl> wrote:
> >>>>
> >>>> Hi Jim,
> >>>>
> >>>> Thank you for the clarification. Since I only work in SPSS and I am
> >>>> from Amsterdam, I have had problems specifying what I am trying to do
> >>>> in this specific program, and also in clear English.
> >>>>
> >>>> I think I do indeed want to aggregate these events for each subject
> >>>> over the observation, but in this case over several observations.
> >>>> 1. I want a summary of how many times a specific subject got called
> >>>>    (CallLogProbe).
> >>>> 2. I want a summary of how many times a specific subject got a text
> >>>>    message (SMSProbe).
> >>>> 3. I want a summary of how many times a specific subject
> >>>>    - turned their screen on - TRUE (ScreenProbe)
> >>>>    - or did not turn their screen on - FALSE (ScreenProbe)
> >>>> 4. I want a summary of the activity level of a specific subject
> >>>>    - Activity level - none (ActivityProbe)
> >>>>    - Activity level - low  (ActivityProbe)
> >>>>    - Activity level - high (ActivityProbe)
> >>>>
> >>>> I want to do this for all 36 subjects (participants).
> >>>>
> >>>> In the end, I have to define percentages, so that I am able to say,
> >>>> for example, that Subject 36 has low social interaction (because they
> >>>> were called and texted only 500 times in total, while the average
> >>>> across all participants is 10000 or something). I have to come up with
> >>>> the percentages myself and define cutoff points for what is considered
> >>>> low, medium, or high, based on the results of all the subjects.
> >>>>
> >>>> I hope that this is as clear as possible.
> >>>>
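
One possible way to turn per-subject totals into percentages and low/medium/high groups is sketched below. The totals are invented, and splitting at tertiles is only one assumption about where the cutoff points might go; the thread does not say which cutoffs should be used.

```r
# Hypothetical totals of calls plus texts per participant.
contacts <- c(P01 = 500, P02 = 12000, P03 = 9500,
              P04 = 15000, P05 = 8000, P06 = 11000)

# Share of all recorded contacts that belongs to each participant.
pct <- round(100 * prop.table(contacts), 1)

# Tertile cutoffs: one possible choice, not the only defensible one.
breaks <- quantile(contacts, probs = c(0, 1/3, 2/3, 1))
level <- cut(contacts, breaks = breaks,
             labels = c("low", "medium", "high"),
             include.lowest = TRUE)

# One row per participant: total, percentage, and assigned group.
summary_table <- data.frame(contacts, pct, level)
print(summary_table)
```

Fixed cutoffs chosen on substantive grounds can be passed to cut() through the same breaks argument in place of the quantiles.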
> >>>>
> >>>> I feel as if I am on my way to understanding it, but since I am not
> >>>> sure, I am trying out a lot of different code and I do not know
> >>>> whether I am doing the right thing. I did indeed make a new data
> >>>> frame, but I still feel a bit lost. Do I need to make one per subject
> >>>> or one per probe?
> >>>>
> >>>>
> >>>> Thanks for your help. I hope that you can help me resolve this issue.
> >>>>
> >>>>
> >>>> Best,
> >>>>
> >>>>
> >>>> Rachel
> >>>>
> >>>>
> >>>>
> >>>>
> >>>>
> >>>>
> >>>> On Sat, Jan 5, 2019 at 9:03 PM Jim Lemon <drjimlemon using gmail.com>
> >>>> wrote:
> >>>>>
> >>>>> Hi Rachel,
> >>>>> I'll take a guess and assume that you are monitoring the mobile
> >>>>> phones of 36 people, adding an observation every time some specified
> >>>>> change of state is sensed on each device. I'll also assume that you
> >>>>> are only recording four types of measurement. It seems that you want
> >>>>> to aggregate these events for each subject over the interval of
> >>>>> observation (or over each day or something). I think you are going to
> >>>>> create a new data frame of these summaries from the one you have of
> >>>>> individual observations. Creating each summary doesn't look too hard,
> >>>>> but you will have to define more precisely what you want those
> >>>>> summaries to be. For instance, "I want the mean activity level for
> >>>>> each subject during the overall time that their mobile phone is
> >>>>> switched on". Once you have clearly defined your goals, it probably
> >>>>> won't be too hard to get to them.
> >>>>>
> >>>>> Jim
> >>>>>
> >>>>> On Sun, Jan 6, 2019 at 5:39 AM Rachel Thompson
> >>>>> <rachel.thompson using student.uva.nl> wrote:
> >>>>>>
> >>>>>> Dear Mr/Mrs,
> >>>>>>
> >>>>>> This is my first time working in RStudio.
> >>>>>> I have a database of 36 participants but it has 150600 entries.
> >>>>>>
> >>>>>> Column        Column          Column           Column
> >>>>>> Participant   ActivityProbe   activity level   high/low/none
> >>>>>> Participant   ScreenProbe     screen on/off
> >>>>>> Participant   SMSProbe        etc.
> >>>>>> Participant   CallLogProbe    etc.
> >>>>>>
> >>>>>> I need code that helps me count the activity levels of all the
> >>>>>> participants: high, low, and no activity. I also need to find out,
> >>>>>> for every participant, what percentage of their activity is high,
> >>>>>> low, or none.
> >>>>>>
> >>>>>> For ScreenProbe I need to count how many times each participant
> >>>>>> turned their screen on and how many times they turned it off, and
> >>>>>> the percentage of screen on/off.
> >>>>>>
> >>>>>> For CallLogProbe I need to count how many times each participant got
> >>>>>> called, and the percentage.
> >>>>>>
> >>>>>> For SMSProbe I need to count the number of SMS messages for each
> >>>>>> participant, and their percentage.
> >>>>>>
> >>>>>> I also need to categorize the probes, so that my database shows all
> >>>>>> the activity levels first, organized by none/high/low, and then all
> >>>>>> the ScreenProbes, organized by on and off, etc.
> >>>>>>
> >>>>>> I hope that my description is clear and that you can maybe help me.
> >>>>>>
> >>>>>> Best,
> >>>>>>
> >>>>>> Rachel
> >>>>>>
> >>>>>>          [[alternative HTML version deleted]]
> >>>>>>
> >>>>>> ______________________________________________
> >>>>>> R-help using r-project.org mailing list -- To UNSUBSCRIBE and more, see
> >>>>>> https://stat.ethz.ch/mailman/listinfo/r-help
> >>>>>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> >>>>>> and provide commented, minimal, self-contained, reproducible code.
> >>>
> >>
> >> --
> >> Sent from my phone. Please excuse my brevity.
> >>
> >
> >
>
> --
> Michael
> http://www.dewey.myzen.co.uk/home.html
>
