[R] RegExp question

Andrej andrej.kastrin at gmail.com
Wed Jun 16 19:05:06 CEST 2010


Sorry, I apologize. Below is the minimal example.

> library(RWeka)
> model <- J48(as.factor(Species) ~ ., data = iris)
> model
J48 pruned tree
------------------

Petal.Width <= 0.6: setosa (50.0)
Petal.Width > 0.6
|   Petal.Width <= 1.7
|   |   Petal.Length <= 4.9: versicolor (48.0/1.0)
|   |   Petal.Length > 4.9
|   |   |   Petal.Width <= 1.5: virginica (3.0)
|   |   |   Petal.Width > 1.5: versicolor (3.0/1.0)
|   Petal.Width > 1.7: virginica (46.0/1.0)

Number of Leaves  : 	5

Size of the tree : 	9

So, the task is to extract the number of leaves.
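
One way to do this, sketched here under the assumption that
capture.output(print(model)) returns the same lines as the printout above,
is to pick out the "Number of Leaves" line and keep only its digits. The
lines are typed in by hand below so that the example runs without RWeka or
Java; the names printed, leaf_line and n_leaves are only illustrative.

```r
# Lines of the printed tree, copied by hand from the output above.
# Assumption: with RWeka loaded, the same vector could be obtained with
#   printed <- capture.output(print(model))
printed <- c(
  "J48 pruned tree",
  "------------------",
  "",
  "Petal.Width <= 0.6: setosa (50.0)",
  "Petal.Width > 0.6",
  "|   Petal.Width <= 1.7",
  "|   |   Petal.Length <= 4.9: versicolor (48.0/1.0)",
  "|   |   Petal.Length > 4.9",
  "|   |   |   Petal.Width <= 1.5: virginica (3.0)",
  "|   |   |   Petal.Width > 1.5: versicolor (3.0/1.0)",
  "|   Petal.Width > 1.7: virginica (46.0/1.0)",
  "",
  "Number of Leaves  : \t5",
  "",
  "Size of the tree : \t9"
)

# Keep only the line that reports the leaf count
leaf_line <- grep("^Number of Leaves", printed, value = TRUE)

# Match the label, the colon and surrounding whitespace, then capture the digits
n_leaves <- as.integer(sub("^Number of Leaves\\s*:\\s*(\\d+)\\s*$", "\\1", leaf_line))
n_leaves
```

With a fitted model, replacing the hand-typed vector by the capture.output()
call should give the same result, though that step is untested here.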

Andrej
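
For the string in the earlier message quoted below, a pattern anchored on
the label itself may be safer than one that relies on the tab alone: the
leading ".+" in the quoted pattern is greedy, so it would most likely pick
up the last tab-delimited number (the tree size) whenever the two counts
differ. The sketch below changes the counts to 5 and 9 (made-up values) to
make that visible. It also assumes that a jobjRef can first be turned into a
character string with rJava's .jcall(obj, "S", "toString"), where obj is a
placeholder name; that call is shown only as a comment.

```r
# Assumption: for a jobjRef named obj (hypothetical), the text could be
# obtained with
#   string <- rJava::.jcall(obj, "S", "toString")
# Here the string is typed in by hand, with made-up counts 5 and 9 so that
# the two numbers can be told apart.
string <- "Java-Object{J48 pruned tree\n------------------\n: 0 (15.0/3.0)\n\nNumber of Leaves  : \t5\n\nSize of the tree : \t9\n}"

# Find the label, the colon with surrounding whitespace, and the digits
m <- regmatches(string, regexpr("Number of Leaves\\s*:\\s*\\d+", string))

# Strip everything before the digits and convert to an integer
n_leaves <- as.integer(sub("^\\D+", "", m))
n_leaves
```

Anchoring on "Number of Leaves" means the result does not depend on how many
other tab-delimited numbers the printout contains.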

On Jun 16, 6:58 pm, David Winsemius <dwinsem... at comcast.net> wrote:
> Publicly produce something we can work with. I have no idea how to  
> create an example that will match such an object.
>
> ?dput
> ?dump
>
> Read Posting Guide.
> --
> David.
>
> On Jun 16, 2010, at 12:54 PM, Andrej wrote:
>
>
>
> > Thanks, David, for your fast reply, but I have now realized that "string" is
> > of type:
>
> >> class(string)
> > [1] "jobjRef"
> > attr(,"package")
> > [1] "rJava"
>
> > so I get an error when I try gsub or sub:
>
> >> sub("^.+\\t(\\d+)\\n.+$", "\\1", string)
> > Error in as.character.default(x) :
> >  no method for coercing this S4 class to a vector
>
> > I think there should be a trivial solution, but... Any further
> > ideas?
>
> > Regards, Andrej
>
> > On Jun 16, 6:47 pm, David Winsemius <dwinsem... at comcast.net> wrote:
> >> On Jun 16, 2010, at 12:04 PM, Andrej wrote:
>
> >>> Dear all,
>
> >>> I'm trying to filter out the "number of leaves" (it should be 1 in the
> >>> example below) from the following string:
>
> >>>> string
> >>> [1] "Java-Object{J48 pruned tree\n------------------\n: 0  
> >>> (15.0/3.0)\n
> >>> \nNumber of Leaves  : \t1\n\nSize of the tree : \t1\n}"
>
> >>> Any idea how to do that as simply as possible? Thanks in advance for
> >>> any advice.
>
> >> ?sub   # or ?gsub if you need more than one pattern matched (they are
> >> on the same page).
>
> >> This should find the first occurrence of digits following a tab
> >> terminated by a line feed and then return only the digits:
>
> >> string <- "Java-Object{J48 pruned tree\n------------------\n: 0
> >> (15.0/3.0)\n \nNumber of Leaves  : \t1\n\nSize of the tree : \t1\n}"
> >> sub("^.+\\t(\\d+)\\n.+$", "\\1", string)
> >> [1] "1"
>
> >> The parenthesized group in the search pattern is referred to as "\\1" in
> >> the replacement. Backslashes need to be doubled within patterns.
>
> >>> Regards, Andrej
>
> >> --
>
> >> David Winsemius, MD
> >> West Hartford, CT
>
> >> ______________________________________________
> >> R-h... at r-project.org mailing list
> >> https://stat.ethz.ch/mailman/listinfo/r-help
> >> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> >> and provide commented, minimal, self-contained, reproducible code.
>
> David Winsemius, MD
> West Hartford, CT


