[R] Please help(urgent) - How to simulate transactional data for reliability/survival analysis

Bert Gunter bgunter.4567 at gmail.com
Wed Jul 5 16:42:45 CEST 2017


Strictly speaking, this is reliability, not survival, data, since
failed pumps are apparently repaired and put back in service as new.
Also, it is not clear from your data whether there is interval
censoring: is the recorded "event" time (failure) the actual failure
time -- so no censoring -- or the time at which the pump was
discovered to have failed, so that it is known to have failed in the
interval since the last time it was recorded, but exactly when is
unknown? Presumably there is also standard right censoring -- the pump
is still running when the testing period concludes.
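
To make the distinction concrete, here is a minimal base-R sketch of how the
two readings of a recorded failure could be written down as (left, right)
bounds. The toy records, the 10-minute gap between readings and all column
names are assumptions for illustration, not taken from your data. The
NA-for-unknown-bound layout follows the convention used by
Surv(left, right, type = "interval2") in the survival package.

```r
# Hypothetical records: minute at which each pump was seen failed,
# or NA if it was still running when the trial ended.
obs <- data.frame(
  pump_id   = c("pump1", "pump2", "pump3"),
  seen_fail = c(10, 40, NA)
)
gap       <- 10      # assumed spacing between readings (minutes)
trial_end <- 129600  # 90 days expressed in minutes

failed <- !is.na(obs$seen_fail)

# Reading 1: the recorded time is the exact failure time
# (no interval censoring, left == right for failures).
exact <- data.frame(
  pump_id = obs$pump_id,
  left    = ifelse(failed, obs$seen_fail, trial_end),
  right   = ifelse(failed, obs$seen_fail, NA)  # NA = right censored
)

# Reading 2: the failure happened somewhere since the previous reading,
# so only the interval (seen_fail - gap, seen_fail] is known.
interval <- data.frame(
  pump_id = obs$pump_id,
  left    = ifelse(failed, obs$seen_fail - gap, trial_end),
  right   = ifelse(failed, obs$seen_fail, NA)
)

print(exact)
print(interval)
```

Which of the two tables is appropriate depends on how the "event" flag was
actually recorded, which only you can say.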

Cheers,
Bert


Bert Gunter

"The trouble with having an open mind is that people keep coming along
and sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Tue, Jul 4, 2017 at 11:03 PM, Sunny Singha
<sunnysingha.analytics at gmail.com> wrote:
> Mark,
> Below is a sample of the simulated granular data for pumps over a
> trial period of 3 months, which I need to transform for survival
> analysis:
> 3 months = (60*24*90) minutes, i.e. 129600 minutes
>
> pump_id timings                events   vibration temprature flow
> pump1 01-07-2017 00:00       0        3.443    69.6           139.806
> pump1 01-07-2017 00:10       1        0.501    45.27         140.028
> pump1 01-07-2017 00:20       0        2.031    52.9           137.698
> pump1 01-07-2017 00:30       0        2.267    60.12         139.054
> pump1 01-07-2017 00:40       1        2.267    60.12         139.054
> pump1 01-07-2017 00:50       0        2.267    60.12         139.054
> pump2 01-07-2017 00:00       0        3.443    69.6           139.806
> pump2 01-07-2017 00:10       0        0.501    45.27         140.028
> pump2 01-07-2017 00:20       0        2.031    52.9           137.698
> pump2 01-07-2017 00:30       0        2.267    60.12         139.054
> pump2 01-07-2017 00:40       1        2.267    60.12         139.054
> pump2 01-07-2017 00:50       0        2.267    60.12         139.054
>
> The above data set records observations and timings where 'pumps'
> experienced failure, tagged as '1' in column 'events'.
> In the above granular dataset, pump1 experiences 2 "event episodes."
>
> Below is the desired transformed format. The covariates in this data
> set will take their mean values:
> pump_id  event_episodes  event_status  start(minutes)  stop(minutes)
> pump1    1               1             0               10
> pump1    2               1             10              40
> pump1    3               0             40              129600
> pump2    1               1             0               40
> pump2    2               0             40              129600
> .........
> .........
>
> The 'start' and 'stop' columns are computed from the 'timings'
> column. I need help in performing such a transformation in 'R'.
> Please guide and help.
>
> Regards,
> Sandeep
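
One way the requested reshaping could be done in base R is sketched below.
This is only a sketch under stated assumptions: the timestamps are read as
day-month-year, time zero is the earliest timestamp, a repaired pump restarts
immediately so each episode starts where the previous one stopped, and the
last episode of every pump is right censored at the end of the trial. The
helper name to_episodes() and the column 'minutes' are made up for the
example.

```r
# Toy version of the granular data quoted above (events column only).
raw <- data.frame(
  pump_id = rep(c("pump1", "pump2"), each = 6),
  timings = rep(paste0("01-07-2017 00:", c("00", "10", "20", "30", "40", "50")),
                times = 2),
  events  = c(0, 1, 0, 0, 1, 0,   0, 0, 0, 0, 1, 0),
  stringsAsFactors = FALSE
)
trial_end <- 60 * 24 * 90  # 129600 minutes

# Minutes elapsed since the first timestamp (assumed start of the trial).
t <- as.POSIXct(raw$timings, format = "%d-%m-%Y %H:%M", tz = "UTC")
raw$minutes <- as.numeric(difftime(t, min(t), units = "mins"))

# Hypothetical helper: one row per episode for a single pump.
to_episodes <- function(d) {
  fail_at <- sort(d$minutes[d$events == 1])  # failure times, in order
  data.frame(
    pump_id        = d$pump_id[1],
    event_episodes = seq_len(length(fail_at) + 1),
    event_status   = c(rep(1, length(fail_at)), 0),  # last row is censored
    start          = c(0, fail_at),
    stop           = c(fail_at, trial_end)
  )
}

episodes <- do.call(rbind, lapply(split(raw, raw$pump_id), to_episodes))
rownames(episodes) <- NULL
print(episodes)
```

Episode-level covariate means are left out here; they would need the readings
to be grouped by the same start/stop boundaries before averaging.
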
>
> On Wed, Jul 5, 2017 at 7:26 AM, Mark Sharp <msharp at txbiomed.org> wrote:
>> A small example data set that illustrates your question will be of great value to those trying to help. It appears that you want a transformation (timestamp to units of time), so please provide a data set representing what you have (dput() is handy for this) and one representing what you want, with any guidance on how to use the other columns in your data set (e.g., the event (0/1)).
>>
>> Mark
>> R. Mark Sharp, Ph.D.
>> msharp at TxBiomed.org
>>
>>
>>
>>
>>
>>> On Jul 4, 2017, at 7:02 AM, Sunny Singha <sunnysingha.analytics at gmail.com> wrote:
>>>
>>> Thanks Boris and Bert,
>>> I was successful in simulating the granular/transactional data.
>>> Now I need some guidance to transform the same data into a format acceptable
>>> for survival analysis, i.e. the format below:
>>>
>>> pump_id | event_episode_no. | event(0/1) | start | stop | time_to_dropout
>>>
>>> The challenge I'm experiencing is generating 'start' and 'stop' in units
>>> of minutes/days from the single 'Timestamp' column of the
>>> transactional/granular data, based on the condition tagged in a
>>> separate column, 'event' (0/1).
>>>
>>> Please guide how to do such transformation in 'R'.
>>>
>>> Regards,
>>> Sandeep
>>>
>>>
>>>
>>> On Wed, Jun 28, 2017 at 2:51 PM, Boris Steipe <boris.steipe at utoronto.ca>
>>> wrote:
>>>
>>>> In principle what you need to do is the following:
>>>>
>>>> - break down the time you wish to simulate into intervals.
>>>> - for each interval, and each failure mode, determine the probability of
>>>>   an event. Determining the probability is the fun part, where you make
>>>>   your domain knowledge explicit and include all the factors into your
>>>>   model: cumulative load, failure history, pressure, temperature, phase
>>>>   of the moon ...
>>>> - once you have a probability of failure, use the runif() function to
>>>>   give you a uniformly distributed random number in [0, 1]. If the number
>>>>   is smaller than your failure probability, accept the failure event, and
>>>>   record it.
>>>> - Repeat many times.
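
The recipe above could be sketched in R roughly as follows. Everything
numeric here is an assumption chosen for illustration: the number of
intervals, the per-interval probabilities for the three failure modes
mentioned in the thread, and the way temperature scales them. Real values
would have to come from domain knowledge.

```r
set.seed(1)  # reproducible draws

n_steps <- 1000  # assumed number of 1-minute intervals

# Hypothetical per-interval failure probabilities, one per failure mode.
p_fail <- c(cavitation = 0.02, bearing_damage = 0.001, worn_shaft = 0.005)

# Stand-in covariate: higher temperature raises every probability a little.
temp  <- rnorm(n_steps, mean = 60, sd = 8)
scale <- pmin(pmax(1 + (temp - 60) / 100, 0), 2)  # multiplier kept in [0, 2]

# For each mode, draw one uniform number per interval and accept the
# failure when the draw is smaller than that interval's probability.
events <- sapply(p_fail, function(p) {
  u <- runif(n_steps)        # uniform draws in [0, 1]
  as.integer(u < p * scale)  # 1 = failure recorded, 0 = no failure
})

sim <- data.frame(minute = seq_len(n_steps), temp = round(temp, 2), events)
print(colSums(events))  # number of simulated failures per mode
```

The resulting 'sim' has one row per interval and one 0/1 column per failure
mode, which is the granular shape the later questions in this thread start
from.
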
>>>>
>>>> Hope this helps.
>>>> B.
>>>>
>>>>
>>>>
>>>>
>>>>> On Jun 27, 2017, at 10:58 AM, sandeep Rana <sandykido at gmail.com> wrote:
>>>>>
>>>>> Hi friends,
>>>>> I haven't done such a simulation before and any help would be greatly
>>>>> appreciated. I need your guidance.
>>>>>
>>>>> I need to simulate end-to-end data for reliability/survival analysis of
>>>>> a pump, with correlation in place, at the 'transactional level', i.e. at
>>>>> the granularity of minutes, where each observation is a reading
>>>>> captured via the pump's sensors each minute.
>>>>> Once the transactional data is prepared, I then need to summarise it
>>>>> for reliability/survival analysis.
>>>>>
>>>>> To begin with, below is the transactional data format that I want to prepare:
>>>>> Pump-id | Timestamp | temp | vibration | suction pressure | discharge pressure | Flow
>>>>>
>>>>> The above transactional data has to be prepared with the failure modes
>>>>> (defects) below:
>>>>> (1)    Cavitation – very high in frequency but low impact
>>>>> (2)    Bearing Damage – very low in frequency but high impact
>>>>> (3)    Worn Shaft – medium frequency but medium impact
>>>>>
>>>>> I have used the survsim package, but that's not what I need here.
>>>>> Please help and guide.
>>>>>
>>>>> Regards,
>>>>> Sandeep
>>>>>
>>>>> ______________________________________________
>>>>> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
>>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>>> PLEASE do read the posting guide http://www.R-project.org/
>>>> posting-guide.html
>>>>> and provide commented, minimal, self-contained, reproducible code.


