[Rd] Re: ichar() function in R : 1st implementation, RFC

Kurt Hornik Kurt.Hornik at wu-wien.ac.at
Thu Oct 23 17:14:02 MEST 2003


>>>>> Martin Maechler writes:

> (RFC := Request For Comments)

My preference would be that we start by adding a version of sscanf().

-k

>>>>> "Tim" == Tim Keighley <Tim.Keighley at csiro.au>
>>>>>     on Thu, 23 Oct 2003 11:45:22 +1000 writes:

Tim> Hi Martin,
Tim> In October 2000 you wrote to r-help:

>>>> which reminds me that I've had a desire for something like
>>>> the old S function [from the blue book, and library(examples) I think]
>>>> ichar(ch)
>>>> which would return a vector of integers, each the (decimal) equivalent of
>>>> the (ISO-latin1) representation of the corresponding characters in ch.
>>>> 
>>>> This should be easy enough (and be done in C).
>>>> Any volunteers?

Tim> Did you get any volunteers?
> no.
> Thank you for reminding me!

Tim> Is this function or an
Tim> equivalent now available in R? I have been unsuccessful
Tim> in my investigation but I do not have all the available
Tim> CRAN packages, so it might be in one of them.

> I've searched myself (we have almost all of them installed), and
> didn't find anything.

Tim> Cheers,
Tim> Tim Keighley

> I now did a first cut, using R only code,
> (and realizing that most of  chars8bit() should really happen in C).

> I'm proposing to add something like this to R-devel in the near
> future.
> Note that  AsciiToInt() and ichar() ar for S-plus and "old S"
> compatibility, whereas I think we'd really want (equivalents) of
> the three functions
>     digitsBase()
>     chars8bit()
>     strcodes()
> in R eventually.  

> I'm very interested in feedback, 
> particularly
> - function and arguments' naming
> - proposals for improvements
> - neat examples of usage.

> Martin

> Martin Maechler <maechler at stat.math.ethz.ch>	http://stat.ethz.ch/~maechler/
> Seminar fuer Statistik, ETH-Zentrum  LEO C16	Leonhardstr. 27
> ETH (Federal Inst. Technology)	8092 Zurich	SWITZERLAND
> phone: x-41-1-632-3408		fax: ...-1228			<><


> ### This was  digits.v()  in library(sfsmisc):
> ### --> get it's help() file /u/maechler/R/Pkgs/sfsmisc/man/digits.Rd
> digitsBase <- function(x, base = 2, ndigits = 1 + floor(log(max(x),base)))
> {
>     ## Purpose: Give the vector A of the base-_base_ representation of _n_:
>     ## -------  n = sum_{k=0}^M  A_{M-k} base ^ k ,   where  M = length(a) - 1
>     ## Value: MATRIX  M where  M[,i]  corresponds to  x[i]
>     ##	c( result ) then contains the blocks in proper order ..
>     ## Author: Martin Maechler, Date:  Wed Dec  4 14:10:27 1991
>     ## ----------------------------------------------------------------
>     ## Arguments: x: vector of non-negative integers
>     ##	base: Base for representation
>     ##	ndigits: Number of digits/bits to use
>     ## EXAMPLE: digitsBase(1:24, 8) #-- octal representation
>     ## ----------------------------------------------------------------
>     if(any((x <- as.integer(x)) < 0))
>         stop("`x' must be non-negative integers")
>     r <- matrix(0, nrow = ndigits, ncol = length(x))
>     if(ndigits >= 1) for (i in ndigits:1) {
>         r[i,] <- x %% base
>         if (i > 1) x <- x %/% base
>     }
>     r
> }


> ### This is an improved version of  make.ASCII() in 1991's ~/S/Good-string.S !
> chars8bit <- function(i = 0:255)
> {
>     ## Purpose: Compute a character vector from its "ASCII" codes.
>     ## We seem to have to use this complicated way thru text and parse.

>     ## Author: Martin Maechler, Original date: Wed Dec 4, 1991
>     ## ----------------------------------------------------------------
>     i <- as.integer(i)
>     if(any(i < 0 | i > 255)) stop("`i' must be in 0:255")
>     i8 <- apply(digitsBase(i, base = 8), 2, paste, collapse="")
>     c8 <- paste('"\\', i8, '"', sep="")
>     eval(parse(text = paste("c(",paste(c8, collapse=","),")", sep="")))
> }

> strcodes <- function(x, table = chars8bit(0:255))
> {
>     ## Purpose: R (code) implementation of old S's ichar()
>     ## ----------------------------------------------------------------------
>     ## Arguments: x: character vector
>     ## ----------------------------------------------------------------------
>     ## Author: Martin Maechler, Date: 23 Oct 2003, 12:42

>     lapply(strsplit(x, ""), match, table = table)
> }

> ## S-PLUS has  AsciiToInt() officially, and   ichar() in  library(examples):
> AsciiToInt <- ichar <- function(strings) unname(unlist(strcodes(strings)))

> ## Examples:

> all8bit <- chars8bit(0:255)
> matrix(all8bit, 32, 8, byrow = TRUE)

> x <- c(a = "abc", bb = "BLA & blu", Person = "Mächler, Zürich")
> strcodes(x)
> AsciiToInt(x)
> ______________________________________________
> R-devel at stat.math.ethz.ch mailing list
> https://www.stat.math.ethz.ch/mailman/listinfo/r-devel



More information about the R-devel mailing list