[R] [FORGED] data$variable=factor(....) <NA> <NA> <NA>

Rolf Turner r.turner at auckland.ac.nz
Sun Jul 12 23:12:03 CEST 2015


(1) Please keep the discourse on list.

(2) Moral of your story:  Don't use Excel --- for *anything*!!!

(3) Why didn't you follow my suggestion?

(4) Naturally you get NAs!  There are no levels of "1" or "2" in your 
data.  The levels are "F" and "M", for crying out loud!!!  Why *on 
earth* did you say "levels=c(1:2)"?  This could never possibly make any 
sense at all.

(5) And by the way, why on earth do you write "c(1:2)" rather than just 
"1:2"?  What do you think the "c()" is doing for you?  Understand what 
things *mean*; don't just slap code down and hope.
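
To make points (4) and (5) concrete, here is a minimal sketch on made-up data. The vector `g` is a hypothetical stand-in for the imported column, not the poster's actual file, and its levels are spelled out explicitly so that the level order does not depend on the locale. It assumes R >= 3.2.0 for `trimws()`.

```r
# Hypothetical toy column: some values carry a leading space, as in the
# original import.  Levels are given explicitly to fix their order.
g <- factor(c(" F", "M", "F", " M", "F"),
            levels = c(" F", " M", "F", "M"))

# 'levels' is matched against the values as character strings.  None of
# the values is "1" or "2", so every element turns into NA.
bad <- factor(g, levels = 1:2, labels = c("F", "M"))
all(is.na(bad))                  # TRUE

# Fix 1: strip the stray whitespace, then rebuild the factor.
fix1 <- factor(trimws(as.character(g)), levels = c("F", "M"))

# Fix 2: lookup-vector trick.  Indexing by a factor uses its integer
# codes, so ggg needs one entry per level, in level order.
ggg <- c("F", "M", "F", "M")
fix2 <- factor(ggg[g])

identical(fix1, fix2)            # TRUE
identical(c(1:2), 1:2)           # TRUE: the c() adds nothing
```

If the spaces come from the csv file itself, passing `strip.white = TRUE` to `read.csv()` may remove them at import time, which avoids the repair step altogether.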

cheers,

Rolf Turner

-- 
Technical Editor ANZJS
Department of Statistics
University of Auckland
Phone: +64-9-373-7599 ext. 88276

On 13/07/15 00:25, Dagmar Juranková wrote:
> Hello.
>
> There was a gap in front of F and M in my Excel table, that's why there
> were more levels of F and M. I corrected it (and saved as csv) and it
> still shows NA NA NA.
>
>
> > selbst <- read.csv("C:/Users/Dadka/Desktop/Rcsv/doc.ex.csv/selbst.csv")
> > View(selbst)
> > df = read.csv("C:/Users/Dadka/Desktop/Rcsv/doc.ex.csv/selbst.csv")
> > selbst$q_2
>  [1] F F M F F M F M F M M F F M M F M F F F F F F M M F M F F F M F F F F F F F F F F M F F M F M F F F F
> [52] F F F M M M M F M F F F F F F M F
> Levels: F M
> > attributes(selbst$q_2)
> $levels
> [1] "F" "M"
>
> $class
> [1] "factor"
>
> > selbst$q_2 = factor(selbst$q_2, levels=c(1:2), labels=c("F","M"), exclude=NA, nmax=NA)
> > selbst$q_2
>  [1] <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA>
> [21] <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA>
> [41] <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA>
> [61] <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA>
> Levels: F M
>
>
> I think the problem might be in my csv document.
> Screenshot in png format is attached below.
> Could you maybe have a look at it please?
> Thank you.
>
> 2015-07-12 1:00 GMT+02:00 Rolf Turner <r.turner at auckland.ac.nz>:
>
>
>     Try:
>
>     ggg <- c("F","M","F","M")
>     data$gender <- factor(ggg[data$gender])
>
>     This in effect converts the (spurious) " F" and " M" levels into "F"
>     and "M" respectively, giving you a factor with the two levels that
>     you really want.
>
>     cheers,
>
>     Rolf Turner
>
>     P. S.  *Not* a good idea to use "data" as the name of your data frame.
>     See fortune("dog").
>
>     R. T.
>
>     --
>     Technical Editor ANZJS
>     Department of Statistics
>     University of Auckland
>     Phone: +64-9-373-7599 ext. 88276
>
>
>     On 12/07/15 07:21, Dagmar Juranková wrote:
>
>         Hello everybody, I have a problem with R.
>
>
>         I uploaded a questionnaire saved as csv into R and I tried to test
>         independence between two variables.
>
>
>
>         > data <- read.csv("C:/Users/Me/Desktop/data.csv")
>         > View(data)
>         > df = read.csv("C:/Users/Me/Desktop/data.csv")
>         > ls()
>         [1] "df"     "data"
>         > attributes(data$gender)
>         $levels
>         [1] " F" " M" "F"  "M"
>
>         $class
>         [1] "factor"
>
>
>         I changed my variable "gender" into a factor using:
>
>
>         data$gender=factor(data$gender, levels=c(1:2), labels= c( "F", "M"),
>         exclude= NA, nmax= NA).
>
>
>         Then I wrote data$gender and the only thing I got was:
>
>
>          [1] <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA>
>         [21] <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA>
>         [41] <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA>
>         [61] <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA>
>         Levels: F M
>
>
>         Does anybody know why?
>
>
>         -My csv doc in the column gender is filled out properly.
>         (M=Male, F= Female)
>
>         -My imported dataset in R is complete (all values)
>
>
>         ! I have done this with a different Excel document and it worked out
>         without any problems. I am really clueless. I can't go further and
>         compare the variables and do t-tests without this working.
>
>
>         Could someone please help me out?
>
>         Thank you.


