[R] Various newbie questions

tobias.verbeke at bivv.be
Wed Feb 4 14:40:22 CET 2004

r-help-bounces at stat.math.ethz.ch wrote on 04/02/2004 12:33:15:

> Hello,
> 1) What is the difference between a "data frame" (J H Maindonald, Using
> R, p. 12) and a "vector"?

A vector is a sequence of values that all share one type ("of a certain mode").
You can have a vector of numbers

> myvector <- c(1,2,3,4,5,6)
> mode(myvector)
[1] "numeric"

or a vector of character strings

> myvector <- c("have", "a", "look", "at", "Manuals", "section", "on")
> mode(myvector)
[1] "character"

or vectors of other kinds (e.g. logical).

A data frame is what you would call 'une matrice de données' (a data
matrix) in French. In R a matrix can hold only one type of data (e.g. all
numbers or all character strings), whereas a data.frame can hold a
different type in each column (but only one type per column).
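
A small sketch of that difference, using made-up values (the names m and df
are only for illustration): putting numbers and strings into one matrix turns
everything into character strings, while a data frame keeps a separate type
for each column.

```r
# A matrix holds a single type: mixing numbers and strings coerces all to character
m <- cbind(c(1, 2, 3), c("a", "b", "c"))
mode(m)

# A data frame keeps one type per column
df <- data.frame(num = c(1, 2, 3), txt = c("a", "b", "c"))
sapply(df, class)
```

Whether the txt column is stored as a factor or as character strings depends
on the R version and the stringsAsFactors setting, but the num column stays
numeric either way.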

These things are explained more clearly in "An Introduction to R",
which you can find on CRAN in the Manuals section.

> In Using R, the author asks the reader to enter the following data in a
> data frame, which I will call "mydata":
> year snow.cover
> 1970 6.5
> 1971 12.0
> 1972 14.9
> 1973 10.0
> 1974 10.7
> 1975 7.9
> ...
> mydata=data.frame(year=c(1970,...),snow.cover=c(6.5,...))
> 2) How do you retrieve, say, snow.cover's second data item? mydata[1][2]
> does not work, neither does mydata[1,2].

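mydata[i,j] gives you the element in the ith row and jth column, so
mydata[2,2] will give what you want (second row, second column).
mydata[1,2] does run, but it returns the first snow.cover value, not
the second.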
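
To make this concrete, here is a sketch that uses only the six rows quoted
above (the rest of the data is elided in the question, so this mydata is
shorter than the one in the book). Besides indexing by row and column, you can
pick out the column first and then take its second element.

```r
# Only the six rows shown in the question; the remaining rows are not reproduced here
mydata <- data.frame(year = 1970:1975,
                     snow.cover = c(6.5, 12.0, 14.9, 10.0, 10.7, 7.9))

mydata[2, 2]            # row 2, column 2
mydata$snow.cover[2]    # column by name, then its second element
mydata[[2]][2]          # column by position, then its second element
```

All three give 12. mydata[1][2] fails because mydata[1] is itself a data frame
with a single column, so asking it for a second column is an error.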

[question on histogram]

> In a French statistics book, the author provides the following data:
> Group A   Number: 35   Mean:27
> Group B  Number: 42  Mean:24
> and asks: "what is the mean of the group constituted by the reunion of
> the two groups?"
> The answer is of course ((27 x 35) + (24 x 42)) / 77
> 3) Is there a way to compute this mean in R (apart from doing the above
> operation, of course) if you have two sets of data?

R is a very flexible programming language, so you can do
a lot of what you can imagine and more.

If you have the two sets, there is no need for the weighted formula:
just concatenate the two sets and calculate the mean.
If you want to write your own function for doing this,
it could be done as follows:

myfun <- function(set1, set2){
  set1and2 <- c(set1, set2)        # pool the two sets
  overallmean <- mean(set1and2)
  overallmean                      # return the mean of the pooled data
}

Then use this new user-defined function with
two vectors of your own, say a and b

> a <- c(1,2,3)
> b <- c(4,5,6)
> myfun(a, b)
[1] 3.5

If you only have the data of the exercise in the
statistics textbook, you can use
the weighted.mean function of R:

> weighted.mean(c(27,24), w=c(35,42))
[1] 25.36364

which agrees with the calculation by hand:

> (27*35+24*42)/77
[1] 25.36364
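
The two approaches are the same thing seen from two sides. As a quick check,
reusing the vectors a and b from the example above, the weighted mean of the
two group means (with the group sizes as weights) equals the mean of the
pooled data. The names pooled and weighted are only for illustration.

```r
a <- c(1, 2, 3)
b <- c(4, 5, 6)

# Mean of the concatenated data
pooled <- mean(c(a, b))

# Weighted mean of the group means, weighted by group size
weighted <- weighted.mean(c(mean(a), mean(b)), w = c(length(a), length(b)))

all.equal(pooled, weighted)
```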

> 4) How do you set class limits in R, for instance
> 10-20
> 21-31
> etc.

For this you could use the cut() function.
Type ?cut at the R prompt and the help page
for this function will show up.
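
A minimal sketch of how that might look for the classes in your question. The
values in x are made up, and the third class (32-42) and the labels are my own
choice for illustration. By default each interval is open on the left and
closed on the right, so with whole numbers the breaks 20 and 31 put 20 in the
first class and 21 to 31 in the second; include.lowest = TRUE makes sure the
value 10 itself is included.

```r
x <- c(12, 20, 21, 25, 31, 35, 42)    # made-up data

classes <- cut(x,
               breaks = c(10, 20, 31, 42),
               labels = c("10-20", "21-31", "32-42"),
               include.lowest = TRUE)

table(classes)    # number of observations per class
```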

> 5) How do you determine quartiles in R? Is there a way to determine the
> "semi-inter-quartile deviation" ("écart semi-inter-quartile" in
> French)?

I know of IQR() (the interquartile range), but am not sure it is what you want.
Read its help page by typing ?IQR
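
For the quartiles themselves, quantile() is the function to look at. If by
"écart semi-inter-quartile" you mean half the distance between the first and
third quartiles (that is my assumption), a sketch with made-up data could be:

```r
x <- c(1, 2, 3, 4, 5, 6, 7, 8, 9)          # made-up data

quantile(x)                                # minimum, quartiles and maximum
q <- quantile(x, probs = c(0.25, 0.75))    # first and third quartiles only
semi.iqr <- IQR(x) / 2                     # half the interquartile range
```

Note that quantile() offers several definitions of a sample quantile (see the
type argument in ?quantile), so the result can differ slightly from a textbook
calculation done by hand.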

I hope that this helps,

