[R] Fwd: Joining two datasets - recursive procedure?

Sun Mar 22 18:32:13 CET 2015

Nonsense. You are not telling us something or I have failed to
understand something.

Consider:

v1 = c("a","b")
v2 = c("a","a")

It is not possible to change the value of a sum of values
corresponding to v2="a" without also changing that for v1, which is
not supposed to change according to my understanding of your
specification.
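
One way to read this example, sketched with made-up v4 values (the numbers 1 and 2 are assumptions, not from the thread): when v2 has a single level, its total is just the sum of the v1 totals, so it cannot move on its own.

```r
# Hypothetical v4 values; only v1 and v2 come from the example above
d <- data.frame(v1 = c("a", "b"), v2 = c("a", "a"), v4 = c(1, 2),
                stringsAsFactors = FALSE)

sum_v1 <- tapply(d$v4, d$v1, sum)  # one total per v1 level
sum_v2 <- tapply(d$v4, d$v2, sum)  # single total for v2 == "a"

# The v2 total equals the sum of the v1 totals, so one cannot change alone
total_matches <- isTRUE(all.equal(sum_v2[["a"]], sum(sum_v1)))
print(total_matches)
```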

So I'm done.

-- Bert

Bert Gunter
Genentech Nonclinical Biostatistics
(650) 467-7374

"Data is not information. Information is not knowledge. And knowledge
is certainly not wisdom."
Clifford Stoll

On Sun, Mar 22, 2015 at 8:28 AM, Luca Meyer <lucam1968 at gmail.com> wrote:
> Sorry forgot to keep the rest of the group in the loop - Luca
> ---------- Forwarded message ----------
> From: Luca Meyer <lucam1968 at gmail.com>
> Date: 2015-03-22 16:27 GMT+01:00
> Subject: Re: [R] Joining two datasets - recursive procedure?
> To: Bert Gunter <gunter.berton at gene.com>
>
>
> Hi Bert,
>
> That is exactly what I am trying to achieve. Please notice that negative v4
> values are allowed. I have done a similar task in the past manually by
> recursively altering the v4 distribution across v3 categories within each
> fixed v1&v2 combination, so I am quite positive it can be achieved. But
> honestly it took me forever to do it manually, and since this is likely to
> be an exercise I need to repeat from time to time, I wish I could learn how
> to do it programmatically.
>
> Thanks again for any further suggestion you might have,
>
> Luca
>
>
> 2015-03-22 16:05 GMT+01:00 Bert Gunter <gunter.berton at gene.com>:
>
>> Oh, wait a minute ...
>>
>> You still want the marginals for the other columns to stay as they were originally?
>>
>> If so, then this is impossible in general as the sum of all the values
>> must be what they were originally and you cannot therefore choose your
>> values for V3 arbitrarily.
>>
>> Or at least, that seems to be what you are trying to do.
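
The constraint on the grand total can be checked directly. The sketch below uses the v3 labels and v4 values of the sample f1 quoted further down, plus the requested totals B=29 and C=2.56723 from the question; the names `target` and `feasible` are made up for illustration. Matching totals are only a necessary condition, not a guarantee that a solution exists.

```r
# v3 labels and v4 values copied from the sample f1 in this thread
v3 <- c("B", "B", "B", "C", "C", "C", "B", "B", "B", "C", "C", "C")
v4 <- c(18.18530, 3.43806, 0.00273, 1.42917, 1.05786, 0.00042,
        2.37232, 3.01835, 0, 1.13430, 0.92872, 0)

# Requested v3 totals from the question (hypothetical variable name)
target <- c(B = 29, C = 2.56723)

# Necessary condition: the targets must add up to the original grand total
feasible <- isTRUE(all.equal(sum(target), sum(v4)))
print(feasible)
```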
>>
>> -- Bert
>>
>> On Sun, Mar 22, 2015 at 7:55 AM, Bert Gunter <bgunter at gene.com> wrote:
>> > I would have thought that this is straightforward given my previous
>> > email...
>> >
>> > Just set z to what you want -- e.g., all B values to 29/number of B's,
>> > and all C values to 2.567/number of C's (etc. for more categories).
>> >
>> > A slick but sort of cheat way to do this programmatically -- in the
>> > sense that it relies on the implementation of factor() rather than its
>> > API -- is:
>> >
>> > y <- f1$v3  ## to simplify the notation; could be done using with()
>> > z <- (c(29,2.567)/table(y))[c(y)]
>> >
>> > Then proceed to z1 as I previously described
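
A name-based variant of the same lookup, as a rough sketch: it indexes the per-level values by the v3 labels instead of relying on how factor() stores its codes. The names `per_level` and `n_same` are made up for illustration; 29 and 2.567 are the values from the message above.

```r
# v3 labels as in the sample f1
y <- c("B", "B", "B", "C", "C", "C", "B", "B", "B", "C", "C", "C")

# Hypothetical name: the value to spread over each v3 level
per_level <- c(B = 29, C = 2.567)

# Number of rows sharing each row's v3 level
n_same <- ave(seq_along(y), y, FUN = length)

# Each row gets its level's value divided by that level's row count
z <- unname(per_level[y]) / n_same

# The z values add up to the chosen value within each level
print(tapply(z, y, sum))
```

Note that this only fixes the sums of z itself; z1 is still built from z as in the earlier message.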
>> >
>> > -- Bert
>> >
>> >
>> > On Sun, Mar 22, 2015 at 2:00 AM, Luca Meyer <lucam1968 at gmail.com> wrote:
>> >> Hi Bert, hello R-experts,
>> >>
>> >> I am close to a solution but I still need one hint w.r.t. the following
>> >> procedure (available also from
>> >> https://www.dropbox.com/s/qhmpkkrejjkpbkx/sample_code.txt?dl=0)
>> >>
>> >> rm(list=ls())
>> >>
>> >> # this is (an extract of) the INPUT file I have:
>> >> f1 <- structure(list(v1 = c("A", "A", "A", "A", "A", "A", "B", "B", "B",
>> >> "B", "B", "B"), v2 = c("A", "B", "C", "A", "B", "C", "A", "B", "C", "A",
>> >> "B", "C"), v3 = c("B", "B", "B", "C", "C", "C", "B", "B", "B", "C", "C",
>> >> "C"), v4 = c(18.18530, 3.43806, 0.00273, 1.42917, 1.05786, 0.00042,
>> >> 2.37232, 3.01835, 0, 1.13430, 0.92872, 0)), .Names = c("v1", "v2", "v3",
>> >> "v4"), class = "data.frame", row.names = c(2L, 9L, 11L, 41L, 48L, 50L,
>> >> 158L, 165L, 167L, 197L, 204L, 206L))
>> >>
>> >> # this is the procedure that Bert suggested (slightly adjusted):
>> >> z <- rnorm(nrow(f1)) ## or anything you want
>> >> z1 <- round(with(f1,v4 + z -ave(z,v1,v2,FUN=mean)), digits=5)
>> >> aggregate(v4~v1*v2,f1,sum)
>> >> aggregate(z1~v1*v2,f1,sum)
>> >> aggregate(v4~v3,f1,sum)
>> >> aggregate(z1~v3,f1,sum)
>> >>
>> >> My question to you is: how can I set z so that I can obtain specific
>> >> values for z1-v4 in the v3 aggregation?
>> >> In other words, how can I configure the procedure so that e.g. B=29 and
>> >> C=2.56723 after running the procedure:
>> >> aggregate(z1~v3,f1,sum)
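
For this particular sample, one possible way to hit the requested v3 totals is a constant shift per v3 level. This is only a sketch under two assumptions that hold for the sample but may not hold for the full dataset: every v1 x v2 cell contains exactly one row per v3 level, and the targets add up to the original grand total. The names `target`, `current`, `n_cells` and `shift` are made up for illustration.

```r
# Sample data from this thread, rebuilt without the original row names
f1 <- data.frame(
  v1 = rep(c("A", "B"), each = 6),
  v2 = rep(c("A", "B", "C"), times = 4),
  v3 = rep(rep(c("B", "C"), each = 3), times = 2),
  v4 = c(18.18530, 3.43806, 0.00273, 1.42917, 1.05786, 0.00042,
         2.37232, 3.01835, 0, 1.13430, 0.92872, 0),
  stringsAsFactors = FALSE
)

# Assumption 1: each v1 x v2 cell holds exactly one row per v3 level
stopifnot(all(table(f1$v1, f1$v2, f1$v3) == 1))

target  <- c(B = 29, C = 2.56723)             # requested v3 totals
current <- sapply(split(f1$v4, f1$v3), sum)   # present v3 totals

# Assumption 2: targets and data share the same grand total
stopifnot(isTRUE(all.equal(sum(target), sum(current))))

# Spread each level's gap evenly over the v1 x v2 cells
n_cells <- nrow(unique(f1[c("v1", "v2")]))
shift   <- (target[names(current)] - current) / n_cells

# The shifts cancel within every cell, so the v1 x v2 sums stay put
z1 <- f1$v4 + unname(shift[f1$v3])

print(aggregate(z1 ~ v1 * v2, f1, sum))
print(aggregate(z1 ~ v3, f1, sum))
```

Some C rows come out slightly negative here, which the question says is acceptable. With unbalanced cells or more elaborate targets this simple shift would not be enough.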
>> >>
>> >> Thank you,
>> >>
>> >> Luca
>> >>
>> >> PS: to avoid any doubts you might have about who I am, the following is
>> >> my web page: http://lucameyer.wordpress.com/
>> >>
>> >>
>> >> 2015-03-21 18:13 GMT+01:00 Bert Gunter <gunter.berton at gene.com>:
>> >>>
>> >>> ... or cleaner:
>> >>>
>> >>> z1 <- with(f1,v4 + z -ave(z,v1,v2,FUN=mean))
>> >>>
>> >>>
>> >>> Just for curiosity, was this homework? (in which case I should
>> >>> probably have not provided you an answer -- that is, assuming that I
>> >>> HAVE provided an answer).
>> >>>
>> >>> Cheers,
>> >>> Bert
>> >>>
>> >>> On Sat, Mar 21, 2015 at 7:53 AM, Bert Gunter <bgunter at gene.com> wrote:
>> >>> > z <- rnorm(nrow(f1)) ## or anything you want
>> >>> > z1 <- f1$v4 + z - with(f1,ave(z,v1,v2,FUN=mean))
>> >>> >
>> >>> >
>> >>> > aggregate(v4~v1,f1,sum)
>> >>> > aggregate(z1~v1,f1,sum)
>> >>> > aggregate(v4~v2,f1,sum)
>> >>> > aggregate(z1~v2,f1,sum)
>> >>> > aggregate(v4~v3,f1,sum)
>> >>> > aggregate(z1~v3,f1,sum)
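
Why the v1 and v2 sums survive for any z: subtracting the per-cell mean makes the perturbation add up to zero inside every v1 x v2 cell. A small check of that property, using the rounded sample values from this thread and an arbitrary seed (the seed and the name `same_cells` are assumptions for illustration):

```r
set.seed(1)  # arbitrary seed, only to make the sketch repeatable

v1 <- rep(c("A", "B"), each = 6)
v2 <- rep(c("A", "B", "C"), times = 4)
v4 <- c(18.18530, 3.43806, 0.00273, 1.42917, 1.05786, 0.00042,
        2.37232, 3.01835, 0, 1.13430, 0.92872, 0)

z  <- rnorm(length(v4))                     # any perturbation
z1 <- v4 + z - ave(z, v1, v2, FUN = mean)   # centred within each cell

# Sums per v1 x v2 cell before and after the perturbation
cell <- interaction(v1, v2)
same_cells <- isTRUE(all.equal(as.numeric(tapply(z1, cell, sum)),
                               as.numeric(tapply(v4, cell, sum))))
print(same_cells)
```

Since the v1 and v2 marginals are sums of these cell totals, they are preserved as well; the v3 sums are the ones that move.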
>> >>> >
>> >>> >
>> >>> > Cheers,
>> >>> > Bert
>> >>> >
>> >>> > On Sat, Mar 21, 2015 at 6:49 AM, Luca Meyer <lucam1968 at gmail.com> wrote:
>> >>> >> Hi Bert,
>> >>> >>
>> >>> >> Thank you for your message. I am looking into ave() and tapply() as you
>> >>> >> suggested, but at the same time I have prepared an example of input and
>> >>> >> output files, just in case you or someone else would like to make an
>> >>> >> attempt to generate code that goes from input to output.
>> >>> >>
>> >>> >> Please see below or download it from
>> >>> >> https://www.dropbox.com/s/qhmpkkrejjkpbkx/sample_code.txt?dl=0
>> >>> >>
>> >>> >> # this is (an extract of) the INPUT file I have:
>> >>> >> f1 <- structure(list(v1 = c("A", "A", "A", "A", "A", "A", "B", "B",
>> >>> >> "B", "B", "B", "B"), v2 = c("A", "B", "C", "A", "B", "C", "A",
>> >>> >> "B", "C", "A", "B", "C"), v3 = c("B", "B", "B", "C", "C", "C",
>> >>> >> "B", "B", "B", "C", "C", "C"), v4 = c(18.18530, 3.43806, 0.00273,
>> >>> >> 1.42917, 1.05786, 0.00042, 2.37232, 3.01835, 0, 1.13430, 0.92872,
>> >>> >> 0)), .Names = c("v1", "v2", "v3", "v4"), class = "data.frame",
>> >>> >> row.names = c(2L, 9L, 11L, 41L, 48L, 50L, 158L, 165L, 167L, 197L,
>> >>> >> 204L, 206L))
>> >>> >>
>> >>> >> # this is (an extract of) the OUTPUT file I would like to obtain:
>> >>> >> f2 <- structure(list(v1 = c("A", "A", "A", "A", "A", "A", "B", "B",
>> >>> >> "B", "B", "B", "B"), v2 = c("A", "B", "C", "A", "B", "C", "A",
>> >>> >> "B", "C", "A", "B", "C"), v3 = c("B", "B", "B", "C", "C", "C",
>> >>> >> "B", "B", "B", "C", "C", "C"), v4 = c(17.83529, 3.43806, 0.00295,
>> >>> >> 1.77918, 1.05786, 0.0002, 2.37232, 3.01835, 0, 1.13430, 0.92872,
>> >>> >> 0)), .Names = c("v1", "v2", "v3", "v4"), class = "data.frame",
>> >>> >> row.names = c(2L, 9L, 11L, 41L, 48L, 50L, 158L, 165L, 167L, 197L,
>> >>> >> 204L, 206L))
>> >>> >>
>> >>> >> # please notice that while the aggregated v4 on v3 has changed …
>> >>> >> aggregate(f1[,c("v4")],list(f1$v3),sum)
>> >>> >> aggregate(f2[,c("v4")],list(f2$v3),sum)
>> >>> >>
>> >>> >> # … the aggregated v4 over v1xv2 has remained unchanged:
>> >>> >> aggregate(f1[,c("v4")],list(f1$v1,f1$v2),sum)
>> >>> >> aggregate(f2[,c("v4")],list(f2$v1,f2$v2),sum)
>> >>> >>
>> >>> >> Thank you very much in advance for your assistance.
>> >>> >>
>> >>> >> Luca
>> >>> >>
>> >>> >> 2015-03-21 13:18 GMT+01:00 Bert Gunter <gunter.berton at gene.com>:
>> >>> >>>
>> >>> >>> 1. Still not sure what you mean, but maybe look at ?ave and ?tapply,
>> >>> >>> for which ave() is a wrapper.
>> >>> >>>
>> >>> >>> 2. You still need to heed the rest of Jeff's advice.
>> >>> >>>
>> >>> >>> Cheers,
>> >>> >>> Bert
>> >>> >>>
>> >>> >>> On Sat, Mar 21, 2015 at 4:53 AM, Luca Meyer <lucam1968 at gmail.com> wrote:
>> >>> >>> > Hi Jeff & other R-experts,
>> >>> >>> >
>> >>> >>> > Thank you for your note. I have tried to solve the issue myself,
>> >>> >>> > without success.
>> >>> >>> >
>> >>> >>> > Following your suggestion, I am providing a sample of the dataset I
>> >>> >>> > am using below (also downloadable in plain text from
>> >>> >>> > https://www.dropbox.com/s/qhmpkkrejjkpbkx/sample_code.txt?dl=0):
>> >>> >>> >
>> >>> >>> > #this is an extract of the overall dataset (n=1200 cases)
>> >>> >>> > f1 <- structure(list(v1 = c("A", "A", "A", "A", "A", "A", "B", "B",
>> >>> >>> > "B", "B", "B", "B"), v2 = c("A", "B", "C", "A", "B", "C", "A",
>> >>> >>> > "B", "C", "A", "B", "C"), v3 = c("B", "B", "B", "C", "C", "C",
>> >>> >>> > "B", "B", "B", "C", "C", "C"), v4 = c(18.1853007621835,
>> >>> >>> > 3.43806581506388, 0.002733567617055, 1.42917483425029,
>> >>> >>> > 1.05786640463504, 0.000420548864162308, 2.37232740842861,
>> >>> >>> > 3.01835841813241, 0, 1.13430282139936, 0.928725667117666, 0)),
>> >>> >>> > .Names = c("v1", "v2", "v3", "v4"), class = "data.frame",
>> >>> >>> > row.names = c(2L, 9L, 11L, 41L, 48L, 50L, 158L, 165L, 167L, 197L,
>> >>> >>> > 204L, 206L))
>> >>> >>> >
>> >>> >>> > I need to find an automated procedure that allows me to adjust the v3
>> >>> >>> > marginals while keeping the v1xv2 marginals unchanged.
>> >>> >>> >
>> >>> >>> > That is: modify the v4 values you can find by running:
>> >>> >>> >
>> >>> >>> > aggregate(f1[,c("v4")],list(f1$v3),sum)
>> >>> >>> >
>> >>> >>> > while keeping constant the values you can find by running:
>> >>> >>> >
>> >>> >>> > aggregate(f1[,c("v4")],list(f1$v1,f1$v2),sum)
>> >>> >>> >
>> >>> >>> > Does it make sense now?
>> >>> >>> >
>> >>> >>> > Please notice I have tried to build some syntax that modifies values
>> >>> >>> > within each v1xv2 combination by computing the sum of v4 and the row
>> >>> >>> > percentage in terms of v4, and that is where my effort is blocked. I
>> >>> >>> > am not really sure how I should proceed. Any suggestions?
>> >>> >>> >
>> >>> >>> > Thanks,
>> >>> >>> >
>> >>> >>> > Luca
>> >>> >>> >
>> >>> >>> >
>> >>> >>> > 2015-03-19 2:38 GMT+01:00 Jeff Newmiller <jdnewmil at dcn.davis.ca.us>:
>> >>> >>> >
>> >>> >>> >> I don't understand your description. The standard practice on this
>> >>> >>> >> list is to provide a reproducible R example [1] of the kind of data
>> >>> >>> >> you are working with (and any code you have tried) to go along with
>> >>> >>> >> your description. In this case, that would be two dputs of your input
>> >>> >>> >> data frames and a dput of an output data frame (generated by hand
>> >>> >>> >> from your input data frame). (Probably best to not use the full
>> >>> >>> >> number of input values just to keep the size down.) We could then
>> >>> >>> >> make an attempt to generate code that goes from input to output.
>> >>> >>> >>
>> >>> >>> >> Of course, if you post that hard work using HTML then it will get
>> >>> >>> >> corrupted (much like the text below from your earlier emails) and we
>> >>> >>> >> won't be able to use it. Please learn to post from your email
>> >>> >>> >> software using plain text when corresponding with this mailing list.
>> >>> >>> >>
>> >>> >>> >> [1] http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example
>> >>> >>> >>
>> >>> >>> >>
>> >>> >>> >>
>> >>> >>> >> ---------------------------------------------------------------------------
>> >>> >>> >> Jeff Newmiller
>> >>> >>> >> DCN:<jdnewmil at dcn.davis.ca.us>
>> >>> >>> >> Research Engineer (Solar/Batteries/Software/Embedded Controllers)
>> >>> >>> >> ---------------------------------------------------------------------------
>> >>> >>> >> Sent from my phone. Please excuse my brevity.
>> >>> >>> >>
>> >>> >>> >> On March 18, 2015 9:05:37 AM PDT, Luca Meyer <lucam1968 at gmail.com> wrote:
>> >>> >>> >> >Thanks for your input Michael,
>> >>> >>> >> >
>> >>> >>> >> >The continuous variable I have measures quantities (down to the 3rd
>> >>> >>> >> >decimal level), so unfortunately they are not frequencies.
>> >>> >>> >> >
>> >>> >>> >> >Any more specific suggestions on how that could be tackled?
>> >>> >>> >> >
>> >>> >>> >> >Thanks & kind regards,
>> >>> >>> >> >
>> >>> >>> >> >Luca
>> >>> >>> >> >
>> >>> >>> >> >
>> >>> >>> >> >===
>> >>> >>> >> >
>> >>> >>> >> >Michael Friendly wrote:
>> >>> >>> >> >I'm not sure I understand completely what you want to do, but
>> >>> >>> >> >if the data were frequencies, it sounds like a task for fitting a
>> >>> >>> >> >loglinear model with the model formula
>> >>> >>> >> >
>> >>> >>> >> >~ V1*V2 + V3
>> >>> >>> >> >
>> >>> >>> >> >On 3/18/2015 2:17 AM, Luca Meyer wrote:
>> >>> >>> >> >> Hello,
>> >>> >>> >> >>
>> >>> >>> >> >> I am facing a quite challenging task (at least to me) and I was
>> >>> >>> >> >> wondering if someone could advise how R could assist me to speed the
>> >>> >>> >> >> task up.
>> >>> >>> >> >>
>> >>> >>> >> >> I am dealing with a dataset with 3 discrete variables and one
>> >>> >>> >> >> continuous variable. The discrete variables are:
>> >>> >>> >> >>
>> >>> >>> >> >> V1: 8 modalities
>> >>> >>> >> >> V2: 13 modalities
>> >>> >>> >> >> V3: 13 modalities
>> >>> >>> >> >>
>> >>> >>> >> >> The continuous variable V4 is a decimal number always greater than
>> >>> >>> >> >> zero in the marginals of each of the 3 variables but it is sometimes
>> >>> >>> >> >> equal to zero (and sometimes negative) in the joint tables.
>> >>> >>> >> >>
>> >>> >>> >> >> I have got 2 files:
>> >>> >>> >> >>
>> >>> >>> >> >> => one with distribution of all possible combinations of V1xV2 (some
>> >>> >>> >> >> of which are zero or negative) and
>> >>> >>> >> >> => one with the marginal distribution of V3.
>> >>> >>> >> >>
>> >>> >>> >> >> I am trying to build the long and narrow dataset V1xV2xV3 in such a
>> >>> >>> >> >> way that each V1xV2 cell does not get modified and V3 fits as closely
>> >>> >>> >> >> as possible to its marginal distribution. Does it make sense?
>> >>> >>> >> >>
>> >>> >>> >> >> To be even more specific, my 2 input files look like the following.
>> >>> >>> >> >>
>> >>> >>> >> >> FILE 1
>> >>> >>> >> >> V1,V2,V4
>> >>> >>> >> >> A, A, 24.251
>> >>> >>> >> >> A, B, 1.065
>> >>> >>> >> >> (...)
>> >>> >>> >> >> B, C, 0.294
>> >>> >>> >> >> B, D, 2.731
>> >>> >>> >> >> (...)
>> >>> >>> >> >> H, L, 0.345
>> >>> >>> >> >> H, M, 0.000
>> >>> >>> >> >>
>> >>> >>> >> >> FILE 2
>> >>> >>> >> >> V3, V4
>> >>> >>> >> >> A, 1.575
>> >>> >>> >> >> B, 4.294
>> >>> >>> >> >> C, 10.044
>> >>> >>> >> >> (...)
>> >>> >>> >> >> L, 5.123
>> >>> >>> >> >> M, 3.334
>> >>> >>> >> >>
>> >>> >>> >> >> What I need to achieve is a file such as the following
>> >>> >>> >> >>
>> >>> >>> >> >> FILE 3
>> >>> >>> >> >> V1, V2, V3, V4
>> >>> >>> >> >> A, A, A, ???
>> >>> >>> >> >> A, A, B, ???
>> >>> >>> >> >> (...)
>> >>> >>> >> >> D, D, E, ???
>> >>> >>> >> >> D, D, F, ???
>> >>> >>> >> >> (...)
>> >>> >>> >> >> H, M, L, ???
>> >>> >>> >> >> H, M, M, ???
>> >>> >>> >> >>
>> >>> >>> >> >> Please notice that FILE 3 needs to be such that if I aggregate on
>> >>> >>> >> >> V1+V2 I recover exactly FILE 1 and that if I aggregate on V3 I can
>> >>> >>> >> >> recover a file as close as possible to FILE 2 (ideally the same file).
>> >>> >>> >> >>
>> >>> >>> >> >> Can anyone suggest how I could do that with R?
>> >>> >>> >> >>
>> >>> >>> >> >> Thank you very much indeed for any assistance you are able to
>> >>> >>> >> >> provide.
>> >>> >>> >> >>
>> >>> >>> >> >> Kind regards,
>> >>> >>> >> >>
>> >>> >>> >> >> Luca
>> >>> >>> >> >
>> >>> >>> >> >       [[alternative HTML version deleted]]
>> >>> >>> >> >
>> >>> >>> >> >______________________________________________
>> >>> >>> >> >R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> >>> >>> >> >https://stat.ethz.ch/mailman/listinfo/r-help
>> >>> >>> >> >PLEASE do read the posting guide
>> >>> >>> >> >http://www.R-project.org/posting-guide.html
>> >>> >>> >> >and provide commented, minimal, self-contained, reproducible code.
>> >>> >>> >>
>> >>> >>> >>
>> >>> >>> >
>> >>> >>
>> >>> >>
>> >>
>> >>
>>
>