[R] Elementary Help

arun smartpink111 at yahoo.com
Wed Jun 19 18:42:08 CEST 2013


Hi,
Based on the information you provided, the solution should be the one I gave earlier.  Otherwise, I must have misunderstood your question.  In your first post, you mentioned that the IDs range from 5 to 200, whereas the question below uses 1 to 161, so the question is still not clear.


dat1<- read.table(text="
timeSec pupilId pupilName
137237 57 LaurenColes
137250 57 LaurenColes
137254 59 JackGough
137262 57 LaurenColes
137275 92 GraceChapman
137281 59 JackGough
137285 111 DavidHenderson
137291 57 LaurenColes
137297 92 GraceChapman
137305 68 AmeliaNorth
137306 82 AlexBruce
137309 92 GraceChapman
137311 111 DavidHenderson
137325 57 LaurenColes
137328 82 AlexBruce
137329 68 AmeliaNorth
137330 111 DavidHenderson
137330 104 SofiaMorrison
137335 15 KieraNoble
137340 34 LouisTalbot
137342 20 EllaOConnor
137345 68 AmeliaNorth
137346 57 LaurenColes
137349 65 AmeliaMiller
137351 40 KatieWinter
137353 34 LouisTalbot
137357 115 NoahStorey
137357 92 GraceChapman
",sep="",header=TRUE,stringsAsFactors=FALSE) 

 IDs<-1:161
 setdiff(IDs,dat1$pupilId)
#  [1]   1   2   3   4   5   6   7   8   9  10  11  12  13  14  16  17  18  19
# [19]  21  22  23  24  25  26  27  28  29  30  31  32  33  35  36  37  38  39
# [37]  41  42  43  44  45  46  47  48  49  50  51  52  53  54  55  56  58  60
# [55]  61  62  63  64  66  67  69  70  71  72  73  74  75  76  77  78  79  80
# [73]  81  83  84  85  86  87  88  89  90  91  93  94  95  96  97  98  99 100
# [91] 101 102 103 105 106 107 108 109 110 112 113 114 116 117 118 119 120 121
#[109] 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139
#[127] 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157
#[145] 158 159 160 161
 sum(setdiff(IDs,dat1$pupilId))
#[1] 12179
 length(setdiff(IDs,dat1$pupilId))  #changes when you use the actual dataset
#[1] 148
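
As a cross-check on the numbers above, here is a minimal sketch that gets the same sum by arithmetic instead of listing the unused IDs: the sum of the unused IDs equals the sum of 1:161 minus the sum of the distinct IDs that are present. The name `present` is only illustrative and holds the 13 distinct pupilId values from the 28-row excerpt; it assumes every ID in the data lies within 1:161, and the results will change with the full dataset.

```r
# Distinct pupil IDs appearing in the excerpt (`present` is an illustrative name)
present <- c(57, 59, 92, 111, 68, 82, 104, 15, 34, 20, 65, 40, 115)
IDs <- 1:161  # the possible ID range stated in the question

# Sum of unused IDs = sum of all possible IDs - sum of distinct IDs present
sum_unused <- sum(IDs) - sum(unique(present))
# Number of unused IDs = number of possible IDs - number of distinct IDs present
n_unused <- length(IDs) - length(unique(present))

sum_unused  # 12179, matching sum(setdiff(IDs, dat1$pupilId))
n_unused    # 148, matching length(setdiff(IDs, dat1$pupilId))
```

If the count on the full dataset does not come out as 30, the ID range or the data read in is probably not what the question assumes.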


A.K.


Sorry, this is my first time using the forum. The data I have is the following: 

timeSec	pupilId	pupilName 
137237	57	LaurenColes 
137250	57	LaurenColes 
137254	59	JackGough 
137262	57	LaurenColes 
137275	92	GraceChapman 
137281	59	JackGough 
137285	111	DavidHenderson 
137291	57	LaurenColes 
137297	92	GraceChapman 
137305	68	AmeliaNorth 
137306	82	AlexBruce 
137309	92	GraceChapman 
137311	111	DavidHenderson 
137325	57	LaurenColes 
137328	82	AlexBruce 
137329	68	AmeliaNorth 
137330	111	DavidHenderson 
137330	104	SofiaMorrison 
137335	15	KieraNoble 
137340	34	LouisTalbot 
137342	20	EllaOConnor 
137345	68	AmeliaNorth 
137346	57	LaurenColes 
137349	65	AmeliaMiller 
137351	40	KatieWinter 
137353	34	LouisTalbot 
137357	115	NoahStorey 
137357	92	GraceChapman 
etc... 

The exact question is the following: 

Some IDs in the range 1 to 161 are unused (e.g. 4, 7, 9). In fact,
there are 30 unused pupil IDs between 1 and 161. What is the sum of
these 30 integers? 
----- Original Message -----
From: arun <smartpink111 at yahoo.com>
To: R help <r-help at r-project.org>
Cc: 
Sent: Wednesday, June 19, 2013 9:26 AM
Subject: Re: Elementary Help

Hi,
Probably, this is the case.  It is better to provide a reproducible example dataset, as mentioned in the posting guide.
set.seed(24)
dat1<- data.frame(ID=c(1:3,5:8,10:14),value=sample(1:40,12,replace=TRUE))
 IDs<- 1:14  #the possible ID list
setdiff(IDs,dat1$ID)
#[1] 4 9
length(setdiff(IDs,dat1$ID))
#[1] 2
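
An equivalent way to write this, which may make it clearer that nothing here depends on NA, is to test membership with %in%. This is only a sketch on the same toy IDs as above (the value column is left out because it is not needed), and `missing_ids` is an illustrative name.

```r
# Same toy IDs as above: 4 and 9 are absent
dat1 <- data.frame(ID = c(1:3, 5:8, 10:14))
IDs <- 1:14  # the possible ID list

# Keep the possible IDs that do NOT occur in the data
missing_ids <- IDs[!(IDs %in% dat1$ID)]

missing_ids       # 4 9
sum(missing_ids)  # 13
```

setdiff() also drops duplicates, but since IDs has none, both forms give the same result here.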
A.K.

Hi, unfortunately that does not help. The unused values are not NA;
the unused values are simply not there. Since these are student IDs,
there is, for instance, no 4, 8, 9, etc. I need to find out which of
these are not there. 



----- Original Message -----
From: arun <smartpink111 at yahoo.com>
To: R help <r-help at r-project.org>
Cc: 
Sent: Tuesday, June 18, 2013 10:32 AM
Subject: Re: Elementary Help

Hi,
Maybe this helps:
set.seed(24)
dat1<- data.frame(ID=1:200,value=sample(c(5:200,NA),200,replace=TRUE))
 which(is.na(dat1$value))
#[1]  56 146 184
sum(which(is.na(dat1$value)))  #Not clear about the 2nd part of the question
#[1] 386

 sum(is.na(dat1$value))
#[1] 3
table(is.na(dat1$value))
#FALSE  TRUE 
#  197     3 
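
If the IDs are absent from the data rather than stored as NA, the is.na() approach can still be used after merging against the full list of possible IDs, which turns each absent ID into a row with NA. This is a rough sketch with made-up toy values; `full` and `missing_ids` are illustrative names, and the range 1:14 is an assumption for the example only.

```r
# Toy data (made-up values): IDs 4 and 9 have no row at all
dat1 <- data.frame(ID = c(1:3, 5:8, 10:14),
                   value = c(10, 20, 30, 50, 60, 70, 80, 100, 110, 120, 130, 140))

# Left-join onto every possible ID; absent IDs get value = NA
full <- merge(data.frame(ID = 1:14), dat1, by = "ID", all.x = TRUE)

# Now the NA-based approach finds the absent IDs
missing_ids <- full$ID[is.na(full$value)]

missing_ids       # 4 9
sum(missing_ids)  # 13
```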
A.K.


>I am totally new to R, therefore probably this question will be very
>easy for most of you. I have a range of values in a column ranging from
>5 to 200. Some of the values are missing, that is, not all student
>numbers are there. How do I find which are these missing numbers and
>obtain the sum of these integers?


