[R] detect and replace outliers by the average

AbouEl-Makarim Aboueissa @boue|m@k@r|m1962 @end|ng |rom gm@||@com
Thu Apr 20 20:46:55 CEST 2023


Hi Rui:


Here is the dataset:

factor x1 x2
0 700 700
0 700 500
0 470 470
0 710 560
0 5555 520
0 610 720
0 710 670
0 610 9999
1 690 620
1 580 540
1 690 690
1 NA 401
1 450 580
1 700 700
1 400 8888
1 6666 600
1 500 400
1 680 650
2 117 63
2 120 68
2 130 73
2 120 69
2 125 54
2 999 70
2 165 62
2 130 987
2 123 70
2 78
2 98
2 5
2 321 NA

with many thanks
abou
______________________


*AbouEl-Makarim Aboueissa, PhD*

*Professor, Mathematics and Statistics*
*Graduate Coordinator*

*Department of Mathematics and Statistics*
*University of Southern Maine*



On Thu, Apr 20, 2023 at 2:44 PM Rui Barradas <ruipbarradas using sapo.pt> wrote:

> At 19:36 on 20/04/2023, AbouEl-Makarim Aboueissa wrote:
> > Dear All:
> >
> >
> >
> > *Re:* detect and replace outliers by the average
> >
> >
> >
> > The dataset (please see attached) contains a grouping column
> > “*factor*” and two data columns, “x1” and “x2”, with some NA values.
> > I need some help to detect the outliers and replace them and the NAs
> > with the average within each level (0, 1, 2) of “factor”, separately
> > for “x1” and “x2”.
> >
> >
> >
> > I tried the code below, but it did not accomplish what I want to do.
> >
> >
> >
> >
> >
> > data<-read.csv("G:/20-Spring_2023/Outliers/data.csv", header=TRUE)
> >
> > data
> >
> > replace_outlier_with_mean <- function(x) {
> >
> >    # na.rm=TRUE NOT working
> >    replace(x, x %in% boxplot.stats(x)$out, mean(x, na.rm=TRUE))
> >
> > }
> >
> > data[] <- lapply(data, replace_outlier_with_mean)
> >
> >
> >
> >
> >
> > Thank you all very much for your help in advance.
> >
> >
> >
> >
> >
> > with many thanks
> >
> > abou
> >
> >
> Hello,
>
> There is no data set attached, see the posting guide on what file
> extensions are allowed as attachments.
>
> As for the question, try to compute mean(x, na.rm = TRUE)  first, then
> use this value in the replace instruction. Without data I'm just guessing.
>
> Hope this helps,
>
> Rui Barradas
>
>
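
Rui's suggestion above (compute the mean first, then use it in the
replacement) could be sketched roughly as follows for the data posted at
the top of this message. This is only an illustration under stated
assumptions, not a tested answer from the thread: the names `dat` and
`cols` are made up here, only the rows for levels 0 and 1 are used
(the level 2 rows with a blank column are left out because it is unclear
which column is missing), and the mean is taken over the values that are
neither NA nor flagged by boxplot.stats(). Note that in the original
function `x %in% boxplot.stats(x)$out` is FALSE for NA entries, so NAs
are never replaced, and lapply() over the whole data frame ignores the
grouping column.

```r
# Levels 0 and 1 copied from the posted data (level 2 omitted, see above).
dat <- read.table(header = TRUE, text = "
factor x1 x2
0 700 700
0 700 500
0 470 470
0 710 560
0 5555 520
0 610 720
0 710 670
0 610 9999
1 690 620
1 580 540
1 690 690
1 NA 401
1 450 580
1 700 700
1 400 8888
1 6666 600
1 500 400
1 680 650
")

# Flag NAs and boxplot outliers first, compute the mean of the remaining
# values, then use that mean as the replacement value.
replace_outlier_with_mean <- function(x) {
  bad <- is.na(x) | x %in% boxplot.stats(x)$out
  x[bad] <- mean(x[!bad])
  x
}

# Apply the helper within each level of 'factor', one column at a time.
cols <- c("x1", "x2")   # hypothetical name for the columns to clean
dat[cols] <- lapply(dat[cols], function(col) {
  ave(col, dat$factor, FUN = replace_outlier_with_mean)
})

dat
```

Whether the replacement mean should include or exclude the flagged
values is a judgment call; the sketch excludes them so that a value such
as 5555 does not inflate its own replacement.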
