[R] SDLC methodology for R and Data science......

Richard O'Keefe raoknz using gmail.com
Sun Feb 20 09:38:14 CET 2022


Come to think of it, there's a CMU report
you can get free:
https://resources.sei.cmu.edu/asset_files/TechnicalReport/2000_005_001_13751.pdf
The books are "Introduction to the Personal
Software Process" by Watts S. Humphrey,
https://www.amazon.com/Introduction-Personal-Software-Process-Humphrey/dp/0201548097
and his later "PSP: A Self-Improvement Process for
Software Engineers",
https://www.amazon.com/gp/product/B001EWOG8A/ref=dbs_a_def_rwt_bibl_vppi_i0

You might like to look into using R's "testthat"
package (or some other testing framework), see
http://rstudio-pubs-static.s3.amazonaws.com/278724_4d8935a2955c49d9934e2113c737e70e.html
for an introduction.

Way back in 1982(?) when I was getting stuck on
my PhD thesis, my co-supervisor (Lawrence Byrd)
said "write an example of using your system, and
then explain how it does that."  That unstuck me,
and it was pretty much test-driven development
in a nutshell (as we put it in those days, program
by debugging the empty program).
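
To make that concrete, here is a minimal sketch of the "example first"
habit in R. The function daily_returns() and the numbers are made up for
illustration (they are not from any package), and the testthat part only
runs if that package happens to be installed.

```r
# Hypothetical example: daily_returns() is invented for illustration.

# Step 1: write down how you want to use the thing, before it exists:
#   daily_returns(c(100, 110, 99))  should give  c(0.10, -0.10)

# Step 2: write just enough code to make that example work.
daily_returns <- function(prices) {
  stopifnot(is.numeric(prices), length(prices) >= 2)
  diff(prices) / head(prices, -1)
}

# Step 3: keep the example as a test so it is rechecked after every change.
if (requireNamespace("testthat", quietly = TRUE)) {
  library(testthat)
  test_that("daily_returns matches the worked example", {
    expect_equal(daily_returns(c(100, 110, 99)), c(0.10, -0.10))
    expect_error(daily_returns(100))  # a single price has no return
  })
} else {
  # Base R fallback when testthat is not installed.
  stopifnot(isTRUE(all.equal(daily_returns(c(100, 110, 99)), c(0.10, -0.10))))
}
```

The point is the order of the steps, not the function: the usage example
comes first and then stays around as a regression test.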

If you're looking for whizzy tools, try:
- knitr (documentation, there's a good little book)
- testthat (testing)
- lintr (static checking).
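
Of those, lintr is probably the quickest to try. A rough sketch, assuming
the package is installed; the two-line script is invented for the example
and written to a temporary file so nothing else is needed.

```r
# Write a tiny, deliberately sloppy script to a temporary file.
script <- tempfile(fileext = ".R")
writeLines(c("x = c(1,2,3)", "print(mean(x))"), script)

# lint() inspects the source without running it and returns a list of
# findings. With default settings I would expect complaints about "=" used
# for assignment and the missing spaces after commas, but the exact output
# depends on the lintr version and any local configuration.
if (requireNamespace("lintr", quietly = TRUE)) {
  print(lintr::lint(script))
} else {
  message("lintr is not installed; install.packages(\"lintr\") to try this")
}
```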



On Wed, 16 Feb 2022 at 05:27, akshay kulkarni <akshay_e4 using hotmail.com> wrote:

> Dear Richard,
>                       I am very grateful for your informative reply.
>
> The fact is, I am doing a project which is no less complex (if not more so)
> than those of Microsoft or Accenture or Google, but I am doing it all by
> myself. Can you please let me know the full title of the book by Watts
> Humphrey? Or any online resources for "personal software process"? Perhaps
> I can get some tips on how to go about my project (I've mostly taken into
> account standard methods of the state of the art; I am looking for
> something "whizzy" that aids development by one person).
>
> Thanks again,
> Yours sincerely,
> AKSHAY M KULKARNI
> ------------------------------
> *From:* Richard O'Keefe <raoknz using gmail.com>
> *Sent:* Monday, February 14, 2022 5:23 AM
> *To:* akshay kulkarni <akshay_e4 using hotmail.com>
> *Cc:* R help Mailing list <r-help using r-project.org>
> *Subject:* Re: [R] SDLC methodology for R and Data science......
>
> There are at least two ways to use R.
> If you have devised a statistical/data science technique
> and are writing a package to be used by other people,
> that is normal software development that happens to be
> using R and the R tools.  Lots of attention to documentation
> and tests.  Test-Driven Development is one approach.
>
> Many R users aren't developing code for other people.
> They are trying to make sense of some kind of data.
> This is what used to be called "exploratory programming".
> And heavyweight development processes aren't really
> appropriate for this kind of work.  In traditional terms,
> when you are doing exploratory programming, you spend
> most of your time in the requirements phase.
>
> Perhaps the most important thing here is to keep a log
> of what you are doing and record things that didn't work,
> why they didn't work, and what you learned from it.
> When something DOES give you some insight, you want to
> be able to do it again.
>
> The tricky thing is scaling from exploration to development.
> After playing around with one data set, you might want to
> provide a script that other people can use to process
> similar data sets the same way.
> Use a lightweight process, but make sure you have plenty
> of tests, and adequate documentation.
>
> Watts Humphrey developed something he called the "Personal
> Software Process" and wrote a book about it.  I don't like
> his examples for several reasons, but the point about
> watching what you do and measuring it so you can improve is
> well made.
>
>
>
> On Mon, 14 Feb 2022 at 05:33, akshay kulkarni <akshay_e4 using hotmail.com>
> wrote:
>
> Dear members,
>                          I am a stock trader and use R for research.
>
> Until now I was coding very haphazardly, but recently I stumbled upon the
> Software Development Life Cycle (SDLC), which introduced me to principled
> software design. I am a college dropout and don't have in-depth knowledge
> of software engineering principles. However, I now want to work in a
> structured manner.
>
> I googled for an SDLC method (like XP, Agile and Waterfall) that suits the
> R programming language and specifically data science, but the search was
> fruitless. Do you people have any idea which software engineering
> methodology to use in R and data science, so that I can code efficiently
> and in a structured manner? The point to note, with regard to R, is that
> statistical ANALYSIS sometimes takes very little code compared to other
> programming languages. Is there any SDLC method for these types of
> analysis, besides rigorous scripting with R?
>
> Thanking you,
> Yours sincerely,
> AKSHAY M KULKARNI