[R] R combining vectors into a data frame but without a continuous common variable

arun smartpink111 at yahoo.com
Thu Oct 4 21:03:04 CEST 2012


Hi Lucy,

No problem.

Just a correction to my earlier email.

dat1<-read.table("Landeck_vec.txt",sep="",header=TRUE,stringsAsFactors=FALSE)
dat2<-read.table("Kaurnetal_vec.txt",sep="",header=TRUE,stringsAsFactors=FALSE)
colnames(dat1)[1]<-"Date"

(Rui:
#dat2 Date format is inconsistent.)
dat2$Date<-gsub("\\.","\\/",dat2$Date)
dat1$Date<-as.POSIXct(dat1$Date,format="%d.%m.%Y")
dat2$Date<-as.POSIXct(dat2$Date,format="%d/%m/%Y")

 str(dat1)
#'data.frame':    22623 obs. of  2 variables:
# $ Date : POSIXct, format: "1900-04-01" "1900-04-02" ...
# $ Event: int  0 0 0 0 0 0 0 0 0 0 ...
 str(dat2)
#'data.frame':    36598 obs. of  2 variables:
# $ Date  : POSIXct, format: "1900-01-01" "1900-01-02" ...
# $ Precip: chr  "0" "0" "0" "0" ...
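
As a small self-contained illustration of the date clean-up above, the sketch below uses made-up date strings (not the contents of the real files) that mix "." and "/" separators, rewrites them to a single separator, and then parses them. It uses as.Date() instead of as.POSIXct(), on the assumption that only calendar days matter here; the names raw, clean and dates are hypothetical.

```r
# Toy dates with mixed separators (made-up values, not from the real files)
raw <- c("01.04.1900", "02/04/1900", "03.04.1900")

# Replace every literal "." with "/" so one format string fits all entries
clean <- gsub(".", "/", raw, fixed = TRUE)

# Parse as day/month/year; anything that does not match becomes NA
dates <- as.Date(clean, format = "%d/%m/%Y")

# Stop early if any entry failed to parse
stopifnot(!anyNA(dates))
print(dates)
```

Using fixed = TRUE avoids having to escape the dot, which would otherwise be read as a regular-expression wildcard.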

Precip is of type "character", so I convert it to numeric:
dat2<-within(dat2,{Precip<-as.numeric(Precip)})
#Warning message:
#In eval(expr, envir, enclos) : NAs introduced by coercion

The reason is that some data points contain unusual characters.

which(is.na(dat2$Precip))
# [1]  7060  8584  8798 11235 12848 13701 14006 14038 14098 14311 16016 16748
#[13] 18575 19307 19489 19702 19764 21196
dat2[8584,]
#           Date Precip
#8584 1923-09-01     NA

When I looked into the data, I found this ("Lücke" is German for "gap"):

01/09/1923	Lücke
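
One way to find such entries without scanning the file by eye is to compare the text column with its numeric conversion. The following is only a sketch on made-up values; precip_raw, precip_num and bad are hypothetical names, and "Luecke" is an ASCII stand-in for the gap marker shown above.

```r
# Toy precipitation column as read from a text file (made-up values)
precip_raw <- c("0", "3.2", "Luecke", "15.6", "n/a")

# Convert to numeric, silencing the expected coercion warning
precip_num <- suppressWarnings(as.numeric(precip_raw))

# Entries that were present as text but could not be converted
bad <- is.na(precip_num) & !is.na(precip_raw)

which(bad)               # positions of the offending entries
unique(precip_raw[bad])  # the strings that produced the NAs
```

Looking at the offending strings first helps to decide whether the rows can safely be dropped or whether they encode something that should be kept.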

library(plyr)  # count() comes from the plyr package
count(is.na(dat2$Precip))
#      x  freq
#1 FALSE 36580
#2  TRUE    18
#Removed those rows.
dat3<-subset(dat2,!is.na(Precip))
 nrow(dat3)
#[1] 36580

dat4<-merge(dat1,dat3,by="Date")
 dat5<-subset(dat4,Event!=0)
 nrow(dat5)
#[1] 132
 rownames(dat5)<-1:nrow(dat5)
 head(dat5)
#        Date Event Precip
#1 1901-06-02     1    0.0
#2 1905-06-02     1    0.0
#3 1906-08-03     1   15.6
#4 1908-05-08     1    0.0
#5 1911-06-02     1    3.0
#6 1911-09-15     1   23.2
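
Note that merge() by default keeps only the dates that occur in both data frames. Since the original question asks for every day of the event file, a variant with all.x = TRUE may be closer to what is wanted: it keeps all rows of the first data frame and fills in NA where no precipitation value exists. The sketch below shows the difference on made-up data; events, rain, both and left are hypothetical names.

```r
# Toy stand-ins for the event and precipitation tables (made-up values)
events <- data.frame(Date  = as.Date(c("1901-06-01", "1901-06-02", "1901-06-03")),
                     Event = c(0L, 1L, 0L))
rain   <- data.frame(Date   = as.Date(c("1901-06-02", "1901-06-03", "1901-06-04")),
                     Precip = c(4.5, 0, 12.1))

# Default merge: only dates found in both tables
both <- merge(events, rain, by = "Date")

# all.x = TRUE: every date from `events`, NA where no precipitation exists
left <- merge(events, rain, by = "Date", all.x = TRUE)

nrow(both)  # 2
nrow(left)  # 3
left
```

If the merged table is meant to feed a logistic regression, the rows with Event equal to 0 are presumably needed as well, so dat4 rather than dat5 would be the starting point.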

A.K.





----- Original Message -----
From: lucy88 <lucy.foggin at gmail.com>
To: r-help at r-project.org
Cc: 
Sent: Thursday, October 4, 2012 7:18 AM
Subject: [R] R combining vectors into a data frame but without a continuous common variable

Hello,

I have two different files which I'd like to combine to make one data frame
but I've no idea how to do it! The first file has two columns; one is the
date, the following is a binary code for debris flow events. Then my other
file has also two columns; the date and then precipitation data.

The thing is that the two date columns don't all contain the same dates.
The binary one is every day from April - October from 1900 - 2005, yet the
precipitation file has dates from, say, 1911 to 2004, with some missing
data in certain months and during certain years.

So my question is how to make a data frame which would have the date, the
binary 0 or 1, and then the corresponding precip value from that particular
date. I only want the precip information for the days where I have
information in the binary file; the others can be disregarded.

I have tried using codes which I found in answer to other questions asked
but none of them work with my issue. If I'm honest I don't really know if
this is what I need. I'm hoping to end up doing a logistic regression. I've
uploaded the two files in case I've not been very clear...

I'd be really grateful if anyone could help me and suggest a way to do it!
I'm also really not very technical and am not at all comfortable with R so
if you could be really basic in your advice I'd appreciate it!

Many thanks in advance,
Lucy

Landeck_vec.txt
<http://r.789695.n4.nabble.com/file/n4644986/Landeck_vec.txt>  

Kaurnetal_vec.txt
<http://r.789695.n4.nabble.com/file/n4644986/Kaurnetal_vec.txt>  




