[R] how to address last and all but last column in dataframe

David Winsemius dwinsemius at comcast.net
Sat Sep 6 21:52:17 CEST 2008


Not sure where your "input" came from. It's not in a format I would
have expected of an R object, and the first line is not in a form that
would be particularly easy to read into a valid R object: numbers are
not legitimate object names. It's also not clear what you want to do
with the duplicated line numbers at the beginning. Your question
implies that you do not consider them part of the data.

In the future a worked example along the lines of that constructed by  
Jorge Ivan Velez in a recent answer to another question might increase  
chances of a prompt reply with tested code:

# Data set
DF=read.table(textConnection("V1 V2 V3
a    b    0:1:12
d    f    1:2:1
c    d    1:0:9
b    e    2:2:6
f    c    5:5:0"),header=TRUE)
closeAllConnections()
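
As a side note, more recent versions of R let read.table() take the data directly through its `text` argument, so the connection does not have to be opened and closed by hand. A minimal sketch using the same toy data as above; dput() is shown only as one common way to paste an existing object into a message:

```r
# Same toy data as above, read through the `text` argument of read.table()
DF <- read.table(text = "V1 V2 V3
a    b    0:1:12
d    f    1:2:1
c    d    1:0:9
b    e    2:2:6
f    c    5:5:0", header = TRUE)

str(DF)    # compact overview of the columns and their types
dput(DF)   # prints R code that recreates DF, handy for posting to the list
```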

The "length" of a dataframe is the number of columns.

?length
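
A quick check of that statement on a made-up data frame (the object and its column names are only for illustration):

```r
# Toy data frame with three columns and five rows
DF <- data.frame(V1 = letters[1:5], V2 = LETTERS[1:5], V3 = 1:5)

length(DF)  # number of columns
ncol(DF)    # same value, and arguably clearer to read
nrow(DF)    # number of rows, for contrast
```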

Dataframes can be referenced using the extract operation, e.g.
df[<row>, <col>]

?Extract       # for additional information on indexing with column vectors
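
A few indexing forms, sketched on a made-up data frame (the names DF, V1, V2 and V3 are illustrative and not taken from the question):

```r
DF <- data.frame(V1 = letters[1:5], V2 = LETTERS[1:5], V3 = 1:5)

DF[2, ]           # second row, all columns (a one-row data frame)
DF[, "V3"]        # one column selected by name, returned as a plain vector
DF[, c(1, 3)]     # columns 1 and 3, still a data frame
DF[, -1]          # negative index: every column except the first
DF[DF$V3 > 3, ]   # logical row index: rows where V3 exceeds 3
```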

So:

# should return the last column as a vector, although it will no
# longer be named ("input" is the dataframe from your message)
input[, length(input)]

The rest of the dataframe, with its column names intact, could be
obtained with:

input[, -length(input)]
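
Putting this together for the task in the question, here is a minimal sketch that assumes the data frame is called `input`, that its first column is the video id and that its last column is the reference. The miniature data set and the column names r1, r2, r3 and gold are made up for illustration, and table() stands in for caret's confusionMatrix() so that the example needs no extra package:

```r
# Hypothetical miniature of the questioner's table: a "video" id column,
# three rater columns, and the gold standard as the last column.
input <- data.frame(video = 1:4,
                    r1    = c(0, 1, 0, 1),
                    r2    = c(0, 1, 1, 1),
                    r3    = c(0, 0, 0, 1),
                    gold  = c(0, 1, 0, 1))

last <- ncol(input)                  # position of the last column

ref_col  <- input[[last]]            # last column as a plain vector
ref_keep <- input[, last, drop = FALSE]         # same column, name kept
raters   <- input[, -c(1, last), drop = FALSE]  # drop first and last columns

# Stack the rater columns and repeat the reference once per rater so that
# both vectors have the same length.
pred <- factor(unlist(raters), levels = c(0, 1))
ref  <- factor(rep(ref_col, times = ncol(raters)), levels = c(0, 1))

tab <- table(pred, ref)   # base-R cross-tabulation of ratings vs. reference
tab
```

Because `last` is computed from the data frame itself, the same lines work unchanged for tables with any number of rater columns.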

-- 
David Winsemius


On Sep 6, 2008, at 3:00 PM, drflxms wrote:

> Dear R-colleagues,
>
> another question from a newbie: I am creating a lot of simple
> pivot-charts from my raw data using the reshape-package. In these  
> charts
> we have medical doctors judging videos in the columns and the videos
> they judge in the rows. Simple example of chart/data.frame "input"  
> with
> two categories 1/0:
>
> video 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21
>
> 1      1 0 0 0 0 0 0 0 0 0  0  0  0  0  0  0  0  0  0  0  0  0
> 2      2 0 0 0 0 0 0 0 0 0  0  0  0  0  0  0  0  0  0  0  0  1
> 3      3 0 0 0 0 0 0 0 0 0  0  0  0  0  0  0  0  0  0  0  0  0
> 4      4 0 0 0 0 0 0 0 0 0  0  0  0  0  0  0  0  0  0  0  0  0
> 5      5 0 0 0 0 0 0 0 0 0  0  0  0  0  0  0  0  0  0  0  1  0
> 6      6 0 0 0 0 0 0 0 0 0  0  0  0  0  1  0  0  0  0  0  0  0
> 7      7 0 0 0 0 0 0 0 0 0  0  0  0  0  0  0  0  0  0  0  0  0
> 8      8 0 0 0 0 0 0 0 0 0  0  0  0  0  0  1  0  0  0  0  0  0
> 9      9 0 0 0 0 0 0 0 0 0  1  0  1  1  0  1  1  0  0  0  1  0
> 10    10 0 0 0 0 0 0 0 0 0  0  0  0  0  0  0  0  0  0  0  0  0
>
> I recently learned, that I can easily create a confusion matrix out of
> this data using the following commands:
>
> pairs<-data.frame(pred=factor(unlist(input[2:21])),ref=factor(input[, 
> 22]))
> pred<-pairs$pred
> ref <- pairs$ref
> library (caret)
> confusionMatrix(pred, ref, positive=1)
>
> - where column 21 is the reference/goldstandard.
>
> My problem is now, that I analyse data.frames with an unknown count of
> columns. So to get rid of the first and last column for the "pred"
> variable and to select the last column for the "ref" variable, I  
> have to
> look at the data.frame before doing the above commands to set the  
> proper
> column numbers.
>
> It would be very comfortable, if I could address the last column not  
> by
> number (where I have to count beforehand) but by a variable "last  
> column".
>
> Probably there is a more easy solution for this problem using the  
> names
> of the columns as well: the reference is always number "21" the first
> column is always called "video". So I tried:
>
> attach(input)
> pairs<-data.frame(pred=factor(unlist(input[[,-c(video, 
> 21)]])),ref=factor(input[[21]]))
>
> which does not work unfortunately :-(.
>
> I'd be very happy in case someone could help me out, cause I am really
> tired of counting - there are a lot of tables to analyse...
>
> Cheers and greetings from Munich,
> Felix
>
