[R] selecting a subset of files to be processed

Erin Hodgess erinm.hodgess at gmail.com
Sun Jul 29 06:03:23 CEST 2012


Thanks so much!


On Sat, Jul 28, 2012 at 1:32 PM, Ted Harding <Ted.Harding at wlandres.net> wrote:
> And, in addition to the tip from Rui (and similar from Joshua) below,
> I would advise that there is one good reason not to try doing it
> in "pure Linux".
>
> The only source (that I know of) in Linux itself for random numbers
> can be tapped by something like
>
>   cat /dev/random > filename
>
> /dev/random stores noise generated by the timings of system events
> (keyboard presses, mouse-clicks, disk accesses, interrupts, etc.)
> after subjecting them to a high-entropy stirring process. See:
>
>   man random
>
> It yields them in the form of random bytes (each of 8 random 0/1 bits)
> and you would have to devise some means of converting those into a
> form suitable for accessing a directory listing at random. Not a
> pretty task!
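
To make the conversion step concrete, here is a minimal sketch in C of one way to turn raw random bytes into an index into a listing of n entries. It is an illustration, not anything from the thread: the names bytes_to_u32 and random_index are made up, and the caller is assumed to pass an open byte source such as fopen("/dev/urandom", "rb"). Values that would bias the result towards the low indices are discarded (rejection sampling).

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Combine four raw bytes into one 32-bit unsigned value (big-endian order). */
static uint32_t bytes_to_u32(const unsigned char b[4]) {
    return ((uint32_t)b[0] << 24) | ((uint32_t)b[1] << 16) |
           ((uint32_t)b[2] << 8)  |  (uint32_t)b[3];
}

/* Hypothetical helper: read bytes from src until they yield an unbiased
   index in [0, n). Returns 0 and stores the index in *out, or returns -1
   if src cannot supply four more bytes. */
static int random_index(FILE *src, uint32_t n, uint32_t *out) {
    unsigned char b[4];
    uint32_t rem, v;

    assert(n > 0);
    rem = (UINT32_MAX % n + 1u) % n;      /* 2^32 mod n */
    for (;;) {
        if (fread(b, 1, sizeof b, src) != sizeof b)
            return -1;                    /* byte source exhausted or failed */
        v = bytes_to_u32(b);
        /* Accept only v < 2^32 - rem, so each index is equally likely. */
        if (v <= UINT32_MAX - rem) {
            *out = v % n;
            return 0;
        }
    }
}
```

With n = 3000 almost every draw is accepted on the first try, since only the top 2^32 mod 3000 values are rejected. Drawing 45 distinct files would still need extra bookkeeping to skip repeated indices.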
>
> There is also the command 'rand' available in the OpenSSL toolkit,
> but that still outputs the results in the same format as /dev/random.
>
> If you really want to do this outside R, then I would suggest writing
> a little C program (to be run from the Linux command line). C can
> do its own random number generation, with results returned as
> real (double), and then apply these to select at random from the
> contents of a file generated by something like
>
>   ls filesdir > filelist.txt
>
> and output the random selection.
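
A rough sketch of what such a C program's core might look like follows. It is only one possible reading of the suggestion above, under stated assumptions: the helper names unif01, read_lines and sample_lines are invented for illustration, each file name is assumed to fit in MAX_NAME bytes, and the standard rand() generator is scaled to a double in [0, 1). The selection uses a partial Fisher-Yates shuffle so that no file is picked twice.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define MAX_NAME 4096   /* assumed upper bound on the length of one line */

/* Uniform double in [0, 1) from the C library generator. */
static double unif01(void) {
    return (double)rand() / ((double)RAND_MAX + 1.0);
}

/* Read every line of `in` into a newly allocated array of strings.
   The number of lines is stored in *n; exits on allocation failure. */
static char **read_lines(FILE *in, size_t *n) {
    char buf[MAX_NAME];
    char **lines = NULL;
    size_t used = 0, cap = 0;

    while (fgets(buf, sizeof buf, in) != NULL) {
        buf[strcspn(buf, "\n")] = '\0';          /* strip trailing newline */
        if (used == cap) {
            cap = cap ? 2 * cap : 64;
            lines = realloc(lines, cap * sizeof *lines);
            if (lines == NULL) { perror("realloc"); exit(EXIT_FAILURE); }
        }
        lines[used] = malloc(strlen(buf) + 1);
        if (lines[used] == NULL) { perror("malloc"); exit(EXIT_FAILURE); }
        strcpy(lines[used], buf);
        used++;
    }
    *n = used;
    return lines;
}

/* Partial Fisher-Yates shuffle: afterwards lines[0] .. lines[k-1] hold a
   random sample of k entries drawn without replacement. */
static void sample_lines(char **lines, size_t n, size_t k) {
    assert(k <= n);
    for (size_t i = 0; i < k; i++) {
        size_t j = i + (size_t)(unif01() * (double)(n - i));  /* i <= j < n */
        char *tmp = lines[i];
        lines[i] = lines[j];
        lines[j] = tmp;
    }
}
```

A main() built on this would seed the generator once (for example srand((unsigned)time(NULL)), which needs <time.h>), open filelist.txt, call read_lines and then sample_lines(lines, n, 45), and print the first 45 entries. Note that rand() is only guaranteed to have RAND_MAX of at least 32767, which is enough for about 3000 files but coarse for much longer lists.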
>
> Ted.
>
> On 28-Jul-2012 18:00:38 Rui Barradas wrote:
>> Hello,
>>
>> If the files are to be processed in R, select the random sample in R.
>> Using list.files() you can assign a character vector with the filenames
>> of interest and then sample from that vector.
>>
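>> ?list.files
>> # 'path' and 'pattern' stand for your own directory and filename regex
>> filenames <- list.files(path, pattern)
>>
>> # draw 45 of the filenames at random, without replacement
>> rand.sampl <- sample(filenames, 45)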
>>
>> Hope this helps,
>>
>> Rui Barradas
>>
>> Em 28-07-2012 18:49, Erin Hodgess escreveu:
>>> Dear R People:
>>>
>>> I am using a Linux system in which I have about 3000 files.
>>>
>>> I would like to randomly select about 45 of those files to be processed in
>>> R.
>>>
>>> Could I make the selection in R or should I do it in Linux, please?
>>>
>>> This is with R-2.15.1.
>>>
>>> Thanks,
>>> erin
>>>
>>>
>>
>> ______________________________________________
>> R-help at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>
> -------------------------------------------------
> E-Mail: (Ted Harding) <Ted.Harding at wlandres.net>
> Date: 28-Jul-2012  Time: 19:32:26
> This message was sent by XFMail
> -------------------------------------------------



-- 
Erin Hodgess
Associate Professor
Department of Computer and Mathematical Sciences
University of Houston - Downtown
mailto: erinm.hodgess at gmail.com


