[R] How to parallelize a process called by a socket connection

Hervé Pagès hpages at fredhutch.org
Sun Feb 2 01:07:35 CET 2020


It seems you replied to an existing thread to ask a new question 
(your post is buried deep inside the "How to extract or sort values 
from one column" thread in my Thunderbird). Unfortunately, this means 
that many people who might be able to help you will miss it.

H.


On 2/1/20 11:24, James Spottiswoode wrote:
> Hi R Experts,
> 
> I’m using R version 3.4.3 running under Linux on an AWS EC2 instance.  I have R code listening on a port for a socket connection; it passes incoming data to a function, and the results are then passed back to the calling machine.  Here’s the function that listens for a socket connection:
> 
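> # define server function
> # (server_port and check() must be defined before server() is called)
> server <- function() {
>   while (TRUE) {
>     # block until a client connects
>     con <- socketConnection(host = "localhost", port = server_port,
>                             blocking = TRUE, server = TRUE, open = "r+",
>                             timeout = 100000000)
>     # read one line of input and pass it to check()
>     data <- readLines(con, 1L, skipNul = TRUE, ok = TRUE)
>     response <- check(data)
>     if (!is.null(response)) writeLines(response, con)
>     close(con)  # release the connection before accepting the next one
>   }
> }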
> 
> The server function expects to receive a character string which is then passed to the function check().  check() is a large, complex routine which does text analysis and many other things and returns a JSON string to be passed back to the calling machine.
> 
> This all works perfectly except that while check() spends ~50ms doing its work, no further requests can be received or processed. Therefore, if a new request arrives less than ~50ms after the previous one, it is not processed. I would therefore like to parallelize this so that the box can run more than one check() process simultaneously.  I’m familiar with several of the R parallelization packages, but I cannot see how to integrate them with the socket connection side of things.
> 
> Currently I have a kludge which is a round-robin approach to solving the problem.  I have 4 versions of the whole R code listening on 4 different ports, say P1, P2, P3, P4, and the calling machine issues calls in sequence to ports P1,P2,P3,P4,P1… etc. This mitigates, but doesn’t solve, the problem.
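
For readers trying to picture the round-robin workaround described above, here is a minimal sketch of what the calling side might look like if it were also written in R (the original post does not say what the caller is written in). The port numbers and the names make_round_robin(), next_port() and send_request() are made up for illustration; only socketConnection(), writeLines(), readLines() and close() are real base R functions.

```r
# Hypothetical ports standing in for P1..P4 from the post
ports <- c(6011L, 6012L, 6013L, 6014L)

# Return a function that hands out the ports in turn: P1, P2, P3, P4, P1, ...
make_round_robin <- function(ports) {
  i <- 0L
  function() {
    i <<- i %% length(ports) + 1L  # advance and wrap around
    ports[i]
  }
}

next_port <- make_round_robin(ports)

# Hypothetical client helper: send one line of text to a listener on
# `port` and read back one line (the JSON string returned by check()).
send_request <- function(text, port, host = "localhost") {
  con <- socketConnection(host = host, port = port, blocking = TRUE,
                          server = FALSE, open = "r+", timeout = 10)
  on.exit(close(con))
  writeLines(text, con)
  readLines(con, n = 1L)
}

# Usage, assuming four listeners are already running on those ports:
# response <- send_request("some text to check", next_port())
```

As the post itself notes, this only spreads requests across the four listeners; it does not remove the underlying limitation that each listener handles one request at a time.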
> 
> Any advice would be greatly appreciated!  Thanks.
> 
> James
> 

-- 
Hervé Pagès

Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024

E-mail: hpages at fredhutch.org
Phone:  (206) 667-5791
Fax:    (206) 667-1319


