# [R] Zero Index Origin?

Richard A. O'Keefe ok at cs.otago.ac.nz
Fri Apr 2 02:13:04 CEST 2004

```I asked where index origin 0 would help.
(I am familiar with Dijkstra's article on why counting should start
at 0 and agree whole-heartedly.  I am also convinced that "kicking

Peter Wolf <s-plus at wiwi.uni-bielefeld.de>
has provided two examples.

(1) Converting pseudo-code which assumes index origin 0.

I've done this myself, although not yet in R.  I've also done it
the other way, converting index origin 1 code to C.

A good method, I've found, is to to start by converting to
index-free form.  This applies whatever the source and target
languages are.

(2) A sorting algorithm.

sort.6 <- function (a) {
n <- length(a)
adapt <- function (i) {i+1}
a <- c(0,a)
for (i in 2:n) {
j <- i-1
j <- j-1
}
}
a[-1]
}

The really interesting thing here is that this basically is an
index origin 1 algorithm.  The original array and the final result
start at 1, not 0, and position 0 is used for a "sentinel".

Let's convert it to 1-origin.  I'll skip the details of how I
did it because that's not the point I want to make.

sort.VI <- function (a) {
a <- c(0, a)
for (i in 3:length(a)) {
j <- i-1
a <- a[i]
while (a[j] > a) {
a[j+1] <- a[j]
j <- j-1
}
a[j+1] <- a
}
a[-1]
}

What do you get if you move up to index-free form?

sort.six <- function (a) {
s <- c()
for (x in a) {
f <- s <= x
s <- c(s[f], x, s[!f])    # insert x stably into s
}
s
}

It's clear that sort.six is shorter, clearer, and easier to get
right than sort.VI.  But how much do we have to pay for this?
How much efficiency do we lose?

> a <- runif(400)
> system.time(sort.VI(a))
  3.64  0.02 12.56  0.00  0.00
> system.time(sort.six(a))
 0.15 0.01 0.16 0.00 0.00

We don't lose any efficiency at all.  We gain, considerably.
(Not as much as we'd gain by using the built-in sort(), of course.)

I expect that this will happen fairly often:  rewriting the code to be
index-free will *usually* make it shorter and clearer, and will *always*
make it easier to adapt to a language with a different index origin.
When the target language is R or S, the result is likely to be faster
than a direct conversion.

```