[R] Please help(urgent) - How to simulate transactional data for reliability/survival analysis

Sunny Singha sunnysingha.analytics at gmail.com
Wed Jul 5 08:03:03 CEST 2017


Mark,
Below is a sample of the simulated granular data for the pumps, covering
a trial period of 3 months, which I need to transform for survival
analysis (3 months = 60*24*90 minutes, i.e. 129600 minutes):

pump_id  timings           events  vibration  temperature  flow
pump1    01-07-2017 00:00  0       3.443      69.6         139.806
pump1    01-07-2017 00:10  1       0.501      45.27        140.028
pump1    01-07-2017 00:20  0       2.031      52.9         137.698
pump1    01-07-2017 00:30  0       2.267      60.12        139.054
pump1    01-07-2017 00:40  1       2.267      60.12        139.054
pump1    01-07-2017 00:50  0       2.267      60.12        139.054
pump2    01-07-2017 00:00  0       3.443      69.6         139.806
pump2    01-07-2017 00:10  0       0.501      45.27        140.028
pump2    01-07-2017 00:20  0       2.031      52.9         137.698
pump2    01-07-2017 00:30  0       2.267      60.12        139.054
pump2    01-07-2017 00:40  1       2.267      60.12        139.054
pump2    01-07-2017 00:50  0       2.267      60.12        139.054

The above data set records the observations and the times at which the
pumps experienced a failure, tagged as '1' in the 'events' column.
In this granular data set, pump1 experiences 2 "event episodes".

Below is the desired transformed format (the covariates in this data
set will take their mean values):

pump_id  event_episodes  event_status  start(minutes)  stop(minutes)
pump1    1               1             0               10
pump1    2               1             10              40
pump1    3               0             40              129600
pump2    1               1             0               40
pump2    2               0             40              129600
.........

The 'start' and 'stop' columns are derived from the 'timings'
column. I need help performing this transformation in R.
Please guide and help.

Regards,
Sandeep
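
One possible way to do this transformation, as a minimal base-R sketch
rather than a definitive answer. It assumes that 'timings' is in
day-month-year format, that each pump's clock starts at its first
reading, that each episode starts where the previous one stopped, and
that the last episode is censored at the end of the 129600-minute trial.
The names granular, to_episodes and trial_minutes are made up for
illustration, and the covariate means are left out.

```r
# Granular readings copied from the sample above (only the columns needed here).
clock <- c("00:00", "00:10", "00:20", "00:30", "00:40", "00:50")
granular <- data.frame(
  pump_id = rep(c("pump1", "pump2"), each = 6),
  timings = rep(paste("01-07-2017", clock), times = 2),
  events  = c(0, 1, 0, 0, 1, 0,
              0, 0, 0, 0, 1, 0),
  stringsAsFactors = FALSE
)

trial_minutes <- 60 * 24 * 90  # 129600, the 3-month trial length

# Turn one pump's readings into (start, stop] episodes.
to_episodes <- function(d, trial_end = trial_minutes) {
  # Assumption: timestamps are day-month-year hour:minute.
  t <- as.POSIXct(d$timings, format = "%d-%m-%Y %H:%M", tz = "UTC")
  # Minutes elapsed since this pump's first reading.
  minutes <- as.numeric(difftime(t, min(t), units = "mins"))
  fail <- sort(minutes[d$events == 1])
  data.frame(
    pump_id        = d$pump_id[1],
    event_episodes = seq_len(length(fail) + 1),
    # Every failure closes an episode; the final episode is censored (0).
    event_status   = c(rep(1, length(fail)), 0),
    start          = c(0, fail),
    stop           = c(fail, trial_end)
  )
}

# Apply per pump and stack the results.
episodes <- do.call(rbind, lapply(split(granular, granular$pump_id), to_episodes))
rownames(episodes) <- NULL
print(episodes)
```

A table in this shape could then be passed to
survival::Surv(start, stop, event_status) for a counting-process model.
A failure recorded at minute 0 would give a zero-length interval and
would need separate handling.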

On Wed, Jul 5, 2017 at 7:26 AM, Mark Sharp <msharp at txbiomed.org> wrote:
> A small example data set that illustrates your question will be of great value to those trying to help. You appear to want a transformation from timestamps to units of time, so please provide one data set representing what you have (dput() is handy for this) and one representing what you want, along with guidance on how the other columns in your data set (e.g., the event (0/1) column) should be used.
>
> Mark
> R. Mark Sharp, Ph.D.
> msharp at TxBiomed.org
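
To illustrate the dput() suggestion above, here is a small sketch of how
a sample could be shared on the list. The object name sample_pumps is
made up for illustration; the values are copied from the sample at the
top of this thread.

```r
# A few rows of the granular data, small enough to paste into an email.
sample_pumps <- data.frame(
  pump_id = c("pump1", "pump1", "pump1"),
  timings = c("01-07-2017 00:00", "01-07-2017 00:10", "01-07-2017 00:20"),
  events  = c(0, 1, 0),
  stringsAsFactors = FALSE
)

# dput() prints R code that rebuilds the object, so readers can paste it.
dput(sample_pumps)

# Round trip: evaluating the printed text recreates the data frame.
printed <- paste(capture.output(dput(sample_pumps)), collapse = "\n")
rebuilt <- eval(parse(text = printed))
print(all.equal(rebuilt, sample_pumps))
```

Posting the dput() output instead of a pasted table keeps the column
types intact for whoever tries to reproduce the problem.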
>
>
>
>
>
>> On Jul 4, 2017, at 7:02 AM, Sunny Singha <sunnysingha.analytics at gmail.com> wrote:
>>
>> Thanks Boris and Bret,
>> I was successful in simulating the granular/transactional data.
>> Now I need some guidance on transforming the same data into a format
>> acceptable for survival analysis, i.e. the format below:
>>
>> pump_id | event_episode_no. | event(0/1) | start | stop | time_to_dropout
>>
>> The challenge I'm experiencing is generating 'start' and 'stop', in units
>> of minutes/days, from the single 'Timestamp' column of the
>> transactional/granular data, based on the condition tagged in a
>> separate column, 'event' (0/1).
>>
>> Please guide me on how to do such a transformation in R.
>>
>> Regards,
>> Sandeep
>>
>>
>>
>> On Wed, Jun 28, 2017 at 2:51 PM, Boris Steipe <boris.steipe at utoronto.ca>
>> wrote:
>>
>>> In principle what you need to do is the following:
>>>
>>> - break down the time you wish to simulate into intervals.
>>> - for each interval, and each failure mode, determine the probability of
>>>   an event. Determining the probability is the fun part, where you make
>>>   your domain knowledge explicit and include all the factors into your
>>>   model: cumulative load, failure history, pressure, temperature, phase
>>>   of the moon ...
>>> - once you have a probability of failure, use the runif() function to
>>>   give you a uniformly distributed random number in [0, 1]. If the number
>>>   is smaller than your failure probability, accept the failure event, and
>>>   record it.
>>> - Repeat many times.
>>>
>>> Hope this helps.
>>> B.
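
The steps above can be sketched in base R as follows. This is only an
illustration under stated assumptions: the simulated period is one day
of 1-minute intervals, and the per-interval failure probabilities for
the three failure modes are placeholder numbers, not estimates. The
names p_fail, simulate_mode and sim are made up for this sketch.

```r
set.seed(42)  # fixed seed so the sketch is reproducible

n_intervals <- 60 * 24  # one day in 1-minute intervals (assumed length)

# Hypothetical per-interval failure probabilities for the three failure
# modes named in the original question; placeholders only.
p_fail <- c(cavitation = 1e-2, worn_shaft = 1e-3, bearing_damage = 1e-4)

simulate_mode <- function(p, n) {
  # Draw one uniform number per interval and accept a failure whenever
  # the draw is smaller than the failure probability.
  as.integer(runif(n) < p)
}

# One 0/1 event column per failure mode.
events <- sapply(p_fail, simulate_mode, n = n_intervals)
sim <- data.frame(minute = seq_len(n_intervals), events)

# Number of simulated failures per mode.
print(colSums(sim[names(p_fail)]))
```

Because runif(n) < p is vectorised, p could also be a vector of length n,
so the probability could vary per interval with load, temperature or
failure history.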
>>>
>>>
>>>
>>>
>>>> On Jun 27, 2017, at 10:58 AM, sandeep Rana <sandykido at gmail.com> wrote:
>>>>
>>>> Hi friends,
>>>> I haven't done such a simulation before, and any help would be greatly
>>>> appreciated. I need your guidance.
>>>>
>>>> I need to simulate end-to-end data for reliability/survival analysis of
>>>> a pump, with correlation in place, at the 'transactional level', i.e. at
>>>> a granularity of minutes, where each observation is a reading
>>>> captured by the pump's sensors each minute.
>>>> Once the transactional data is prepared, I then need to summarise it
>>>> for reliability/survival analysis.
>>>>
>>>> To begin with, below is the transactional data format that I want to prepare:
>>>> Pump-id | Timestamp | temp | vibration | suction pressure | discharge pressure | Flow
>>>>
>>>> The above transactional data has to be prepared with the failure modes below.
>>>> Defects:
>>>> (1)    Cavitation – very high in frequency but low impact
>>>> (2)    Bearing Damage – very low in frequency but high impact
>>>> (3)    Worn Shaft – medium frequency but medium impact
>>>>
>>>> I have used the survsim package, but that's not what I need here.
>>>> Please help and guide.
>>>>
>>>> Regards,
>>>> Sandeep
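
For the "with correlation in place" part of the request, one common
approach is to draw correlated normal readings through a Cholesky
factor of the covariance matrix. The sketch below uses base R only. The
means, standard deviations and correlations are invented placeholders,
normality is an assumption, and only three of the listed sensors are
shown.

```r
set.seed(1)
n <- 1000  # number of 1-minute readings (assumed)

# Hypothetical means, standard deviations and correlations for three sensors.
mu    <- c(temp = 60, vibration = 2, flow = 139)
sigma <- c(temp = 8, vibration = 0.8, flow = 1)
corr  <- matrix(c( 1.0,  0.6, -0.3,
                   0.6,  1.0, -0.2,
                  -0.3, -0.2,  1.0), nrow = 3)

# Covariance matrix built from the standard deviations and correlations.
Sigma <- diag(sigma) %*% corr %*% diag(sigma)

# Independent standard normals, then impose the covariance via chol():
# chol() returns upper-triangular U with t(U) %*% U equal to Sigma,
# so the rows of Z %*% U have covariance Sigma.
Z <- matrix(rnorm(n * 3), ncol = 3)
X <- sweep(Z %*% chol(Sigma), 2, mu, "+")  # shift each column to its mean
colnames(X) <- names(mu)

readings <- data.frame(minute = seq_len(n), X)
print(round(cor(readings[names(mu)]), 2))
```

The printed sample correlations should land near the values in corr,
with some sampling noise.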
>>>>
>>>> ______________________________________________
>>>> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>>>> and provide commented, minimal, self-contained, reproducible code.