[R] How to show a specific value of a ggplot2

Rui Barradas ruipbarradas at sapo.pt
Fri May 27 12:17:17 CEST 2022


Hello,

I think the function find_x_from_profile below does what you want.
I have used the data set from the first example of ?readARFF, the built-in
and ever-present data set iris.

The function returns a one-row data.frame whose column names are "x"
and "y". Pass the y-axis value in argument ynew; the value you want
is in output column "x".
The function takes only one y value at a time, but this can be changed if
needed.
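
For the opposite direction (the y value at a given x, e.g. x = 75), a minimal sketch is linear interpolation along the profile with base R's approx(). It assumes the plotted line joins the profile's grid points with straight segments. The data frame `profile` and the function name `y_at_x` are hypothetical; `profile` only stands in for m$agr_profiles with its `_x_` and `_yhat_` columns.

```r
# Hypothetical stand-in for m$agr_profiles: grid values in `_x_`,
# average predictions in `_yhat_` (made-up numbers, for illustration only).
profile <- data.frame(
  `_x_`    = c(0, 50, 100, 150, 200),
  `_yhat_` = c(1, 2, 4, 5, 7),
  check.names = FALSE
)

# y value of the profile at a given x, by linear interpolation
# between the two neighbouring grid points
y_at_x <- function(profile, xnew) {
  out <- approx(x = profile[["_x_"]], y = profile[["_yhat_"]], xout = xnew)
  data.frame(x = out$x, y = out$y)
}

res <- y_at_x(profile, 75)
print(res)
```

Unlike find_x_from_profile, this direction is always well defined inside the range of x, because each x has exactly one y on the profile.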


suppressPackageStartupMessages({
   library(farff)
   library(mlr3)
   library(mlr3learners)
   library(mlr3filters)
   library(mlr3extralearners)
   library(DALEX)
   library(DALEXtra)
   library(readr)
   library(ggplot2)
})

# make the results reproducible
set.seed(2022)

# this is the data for the reprex
path <- tempfile()
writeARFF(iris, path = path)
data <- readARFF(path)
#  Parse with reader=readr : C:\Users\ruipb\AppData\Local\Temp\RtmpUxSDP3\file578778a1417
#  header: 0.000000; preproc: 0.000000; data: 0.110000; postproc: 0.000000; total: 0.110000

# data = readARFF("ant.arff")
index <- sample(1:nrow(data), 0.7*nrow(data))
train <- data[index,]
test <- data[-index,]
task <- TaskRegr$new("data", backend = train, target = "Sepal.Length")

learner <- lrn("regr.randomForest")
model <- learner$train(task )

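# Note: 'test[,-16]' and the '- 1' are carried over from the original post.
# iris has only 5 columns, so no column is dropped here (see "5 cols" in the
# output), and the '- 1' shifts y by one, hence the residuals centred near -1.
explainer <- explain_mlr3(model,
                          data = test[,-16],
                          y = as.numeric(test$Sepal.Length)-1,
                          label="RF")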
#  Preparation of a new explainer is initiated
#    -> model label       :  RF
#    -> data              :  45  rows  5  cols
#    -> target variable   :  45  values
#    -> predict function  :  yhat.LearnerRegr  will be used (  default  )
#    -> predicted values  :  No value for predict function target column. (  default  )
#    -> model_info        :  package mlr3 , ver. 0.13.3 , task regression (  default  )
#    -> predicted values  :  numerical, min =  4.775823 , mean =  5.892271 , max =  7.226967
#    -> residual function :  difference between y and yhat (  default  )
#    -> residuals         :  numerical, min =  -1.642701 , mean =  -0.9922714 , max =  -0.2101927
#    A new explainer has been created!

m <- model_profile(explainer = explainer, variables = "Sepal.Width")

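find_x_from_profile <- function(model, xvar, ynew) {
   if(length(ynew) > 1) {
     warn <- "'ynew' length is greater than 1, only the first is considered."
     warning(warn)
     ynew <- ynew[1]
   }
   # use the 'model' argument, not the global object 'm'
   ap <- model$agr_profiles
   # keep only the rows of the requested variable (column `_vname_`)
   ap <- ap[ap[["_vname_"]] == xvar, c("_yhat_", "_x_")]
   names(ap) <- c("yhat", "x")
   i <- order(ap$yhat)
   ap <- ap[i, ]
   j <- findInterval(ynew, ap$yhat)
   # NOTE: [order(i)] restores the original row order, so taking rows
   # j:(j + 1) is only right when yhat increases with x (monotone profile)
   olddata <- data.frame(
     x = ap$yhat[order(i)][j:(j + 1)],
     y = ap$x[order(i)][j:(j + 1)]
   )
   # linear interpolation of the x-axis value as a function of yhat
   newdata <- approx(olddata, xout = ynew)
   newdata <- as.data.frame(newdata)
   names(newdata) <- rev(names(newdata))
   newdata[2:1]
}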

find_x_from_profile(m, xvar = "Sepal.Width", 5.85)
#           x    y
#  1 2.941472 5.85

newdata <- find_x_from_profile(m, xvar = "Sepal.Width", 5.85)

p <- plot(m)
p +
   geom_point(
     data = newdata,
     mapping = aes(x, y),
     color = "red",
     size = 2,
     inherit.aes = FALSE
   )
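
A profile need not be monotone, in which case a given y value can be reached at several x values. The following is only a sketch of one way to handle that, not part of the function above: walk the profile in x order and interpolate linearly inside every segment that crosses the requested level. The name `find_all_x` and the example vectors are hypothetical; `x` and `yhat` stand in for the `_x_` and `_yhat_` columns of m$agr_profiles.

```r
# All x positions where a piecewise-linear profile crosses the level ynew.
# 'x' and 'yhat' are plain numeric vectors of equal length.
find_all_x <- function(x, yhat, ynew) {
  o <- order(x)
  x <- x[o]
  yhat <- yhat[o]
  d <- yhat - ynew
  n <- length(x)
  # grid points that hit the level exactly
  hits <- x[d == 0]
  # segments whose end points lie strictly on opposite sides of ynew
  k <- which(d[-n] * d[-1] < 0)
  # linear interpolation inside each crossing segment
  cross <- x[k] + (ynew - yhat[k]) * (x[k + 1] - x[k]) / (yhat[k + 1] - yhat[k])
  sort(unique(c(hits, cross)))
}

# made-up, non-monotone profile for illustration only
x    <- c(2.0, 2.5, 3.0, 3.5, 4.0)
yhat <- c(5.6, 6.0, 5.8, 5.7, 6.1)
roots <- find_all_x(x, yhat, 5.85)
print(roots)
```

Each returned value can be paired with ynew in a data.frame and passed to geom_point() exactly as newdata is above.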


Hope this helps,

Rui Barradas



At 08:54 on 27/05/2022, Neha gupta wrote:
> I am sorry for that.
> 
> I used
> 
> library(farff)
> library(mlr3learners)
> library(mlr3filters)
> library(mlr3extralearners)
> library(mlr3)
> library(DALEX)
> library(DALEXtra)
> 
> data = readARFF("ant.arff")
> index= sample(1:nrow(data), 0.7*nrow(data))
> train= data[index,]
> test= data[-index,]
> task = TaskRegr$new("data", backend = train, target = "bug")
> 
> learner= lrn("regr.randomForest")
> model= learner$train(task )
> 
> explainer = explain_mlr3(model,
>                            data = test[,-16],
>                            y = as.numeric(test$bug)-1,
>                            label="RF")
> 
> m=model_profile(explainer = explainer, variables = "rfc")
> 
> plot(m)
> 
> And it shows a plot, with rfc on the x axis and bug on the y axis.
> 
> I can read the value of bug at rfc=75 off the plot by eye, but I need the 
> exact value, and a guess made from the plot might not be the exact value 
> I need.
> 
> Thank you
> 
> On Fri, May 27, 2022 at 9:39 AM Rui Barradas <ruipbarradas at sapo.pt> wrote:
> 
>     Hello,
> 
>     Neha, this is not the first time you have posted questions to R-Help.
>     Please, please start your scripts by loading the packages needed.
> 
>     I have never used package DALEX, but from what I understand of its
>     documentation it helps to explore and explain model behavior. If your
>     profile plot was output by method plot.model_profile(), the workflow is,
>     or seems to be,
> 
>     1. fit a model;
>     2. create an object of S3 class "model_profile" with functions
>        explain() and model_profile();
>     3. plot that object.
> 
> 
>     So to find the value of y for a given x, predict from the fitted
>     model; package DALEX and its plots have nothing to do with it.
>     If there's a predict method for the fitting function, then it should be
>     as simple as
> 
> 
>     newdata75 <- data.frame(x = 75)
>     y75 <- predict(fit, newdata = newdata75)
> 
> 
>     or something similar.
> 
>     I have never used this package so I might be completely wrong.
> 
>     Hope this helps,
> 
>     Rui Barradas
> 
>     At 08:09 on 27/05/2022, Neha gupta wrote:
>      > Thank you Rui, Avi
>      >
>      > I am using plot() from the DALEX package, and it is implemented
>      > with ggplot.
>      >
>      > So I only used plot(mydata) and it displays the ggplot. If we need to
>      > adjust or make further changes to the plot, I think people use
>      >
>      > plot + .....
>      > I don't know if this group supports pasting images, but my plot looks
>      > like the one below. (bug is a variable in my data whose values are
>      > displayed on the y-axis, and rfc is another variable in my dataset whose
>      > values are shown on the x-axis.) I want to know exactly (not necessarily
>      > using the plot; a simple print function would also work for me) what
>      > the value of 'bug' is when the value of 'rfc' is 75.
>      >
>      > image.png
>      >
>      >
>      > On Fri, May 27, 2022 at 7:49 AM Rui Barradas <ruipbarradas at sapo.pt> wrote:
>      >
>      >     Hello,
>      >
>      >     If you cannot determine the exact value of y for a given x,
>      >     then isn't your problem how to determine an approximate value
>      >     of y? Once you have it, it's easy to plot it.
>      >
>      >     With newdata = data.frame(x = 75, y = ???),
>      >
>      >
>      >     ggplot(mydata, mapping = aes(x, y)) +
>      >         geom_point(color = "black") +
>      >         geom_point(data = newdata, mapping = aes(x, y), color = "red") +
>      >         xlim(0, 200)
>      >
>      >
>      >     The question is how to find newdata$y: interpolation, or some
>      >     other method?
>      >
>      >     Hope this helps,
>      >
>      >     Rui Barradas
>      >
>      >     At 00:40 on 27/05/2022, Neha gupta wrote:
>      >      > I have a ggplot2 plot which has x values 0-200 and y values 0-10
>      >      >
>      >      > p=plot(mydata)
>      >      > p+xlim(0, 200)
>      >      >
>      >      > I want to show what the y value is when x is 75. The graph
>      >      > that is displayed has broad ranges on the x axis (like 0-50,
>      >      > 50-100 etc.), so I cannot determine the exact value of y at
>      >      > x = 75.
>      >      >
>      >      > Thank you
>      >      >


