[R] Problem with filling dataframe's column

Eric Berger er|cjberger @end|ng |rom gm@||@com
Tue Jun 13 10:05:49 CEST 2023


Hi Javed,
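grep() returns the positions (indices) of the elements that match; it does
not modify its input. To drop rows, use that result as an index and assign
the subset to a variable. See an example below.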

> v <- c("abc", "bcd", "def")
> v
[1] "abc" "bcd" "def"
> grep("cd",v)
[1] 2
> w <- v[-grep("cd",v)]
> w
[1] "abc" "def"
>
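
Applied to a data frame, the same idea might look like the sketch below. The
toy data2 and its values are made up for illustration; only the column name
Layer and the three unwanted values come from this thread. Note that a
pattern such as "0" matches any string that contains a zero (for example
"Level 10"), so if the intent is to drop exact values, %in% may be the safer
choice.

```r
# Hypothetical stand-in for data2; the real data frame has 281549 rows.
data2 <- data.frame(
  Layer = c("Level 1", "_esmdes", "Level 2", "_Des Section", "0", "Level 10"),
  LU    = c("Park", NA, "Agri", NA, NA, "GS"),
  stringsAsFactors = FALSE
)

# One regular expression that matches any of the unwanted values.
unwanted <- c("_esmdes", "_Des Section", "0")
pat <- paste(unwanted, collapse = "|")

# grep() only reports positions; data2 itself is unchanged.
hits <- grep(pat, data2$Layer)
print(hits)

# Keep the rows that do NOT match and store the result in a new object.
# !grepl() is used rather than -grep() because -grep() drops every row
# when there are no matches at all.
data3 <- data2[!grepl(pat, data2$Layer), ]

# Alternative, assuming whole-value matches are wanted: "Level 10" is kept.
data4 <- data2[!(data2$Layer %in% unwanted), ]

print(dim(data3))
print(dim(data4))
```

In this toy example the regular expression also removes "Level 10", while
the %in% version keeps it; which one is right depends on what the Layer
values really look like.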


On Tue, Jun 13, 2023 at 8:50 AM javad bayat <j.bayat194 using gmail.com> wrote:
>
> Dear Rui;
> Hi. I used your code, but it doesn't seem to work for me.
>
> > pat <- c("_esmdes|_Des Section|0")
> > dim(data2)
>     [1]  281549      9
> > grep(pat, data2$Layer)
> > dim(data2)
>     [1]  281549      9
>
> What does the grep function do? I expected it to remove 3 rows from the
> dataframe, and I do not know why it did not.
>
> On Mon, Jun 12, 2023 at 5:16 PM Rui Barradas <ruipbarradas using sapo.pt> wrote:
>
> > At 23:13 on 12/06/2023, javad bayat wrote:
> > > Dear Rui;
> > > Many thanks for the email. I tried your code and found that the
> > > "Values" and "Names" vectors must have equal length, otherwise the
> > > results will not be useful.
> > > For the entries in the Layer column that should not be filled in the
> > > LU column, I used "NA".
> > > But I need to delete some rows from the table as they are useless for
> > > me. I tried the code below to delete every row of the dataframe that
> > > contains one of these three values in the Layer column, but it gave me
> > > the following warning.
> > >
> > >> data3 = data2[-grep(c("_esmdes","_Des Section","0"), data2$Layer),]
> > >       Warning message:
> > >        In grep(c("_esmdes", "_Des Section", "0"), data2$Layer) :
> > >        argument 'pattern' has length > 1 and only the first element will be used
> > >
> > >> data3 = data2[!grepl(c("_esmdes","_Des Section","0"), data2$Layer),]
> > >      Warning message:
> > >      In grepl(c("_esmdes", "_Des Section", "0"), data2$Layer) :
> > >      argument 'pattern' has length > 1 and only the first element will be used
> > >
> > > How can I do this?
> > > Sincerely
> > >
> > > On Sun, Jun 11, 2023 at 5:03 PM Rui Barradas <ruipbarradas using sapo.pt> wrote:
> > >
> > >> At 13:18 on 11/06/2023, Rui Barradas wrote:
> > >>> At 22:54 on 11/06/2023, javad bayat wrote:
> > >>>> Dear Rui;
> > >>>> Many thanks for your email. I used one of your snippets,
> > >>>> "data2$LU[which(data2$Layer == "Level 12")] <- "Park"", and it works
> > >>>> correctly for me.
> > >>>> Actually I need to extend the code to cover all "Levels" in the
> > >>>> "Layer" column. There are more than a hundred levels in the Layer
> > >>>> column. If I use the code you provided, I have to write it hundreds
> > >>>> of times, as below:
> > >>>> data2$LU[which(data2$Layer == "Level 1")] <- "Park";
> > >>>> data2$LU[which(data2$Layer == "Level 2")] <- "Agri";
> > >>>> ...
> > >>>> ...
> > >>>> ...
> > >>>> .
> > >>>> Is there any other way to extend the code so that it handles all of
> > >>>> the levels simultaneously? Something like the code below:
> > >>>> data2$LU[which(data2$Layer == c("Level 1","Level 2", "Level 3", ...))] <-
> > >>>>     c("Park", "Agri", "GS", ...)
> > >>>>
> > >>>>
> > >>>> Sincerely
> > >>>>
> > >>>>
> > >>>>
> > >>>>
> > >>>> On Sun, Jun 11, 2023 at 1:43 PM Rui Barradas <ruipbarradas using sapo.pt>
> > >>>> wrote:
> > >>>>
> > >>>>> At 21:05 on 11/06/2023, javad bayat wrote:
> > >>>>>> Dear R users;
> > >>>>>> I am trying to fill a column based on a specific value in another
> > >>>>>> column of a dataframe, but it seems there is a problem with the code!
> > >>>>>> The "Layer" and the "LU" are two different columns of the dataframe.
> > >>>>>> How can I fix this?
> > >>>>>> Sincerely
> > >>>>>>
> > >>>>>>
> > >>>>>> for (i in 1:nrow(data2$Layer)){
> > >>>>>>              if (data2$Layer == "Level 12") {
> > >>>>>>                  data2$LU == "Park"
> > >>>>>>                  }
> > >>>>>>              }
> > >>>>>>
> > >>>>>>
> > >>>>>>
> > >>>>>>
> > >>>>> Hello,
> > >>>>>
> > >>>>> There are three bugs in your code,
> > >>>>>
> > >>>>> 1) the index i is not used in the loop
> > >>>>> 2) the assignment operator is `<-`, not `==`
> > >>>>> 3) nrow(data2$Layer) is NULL because data2$Layer is a vector; use
> > >>>>>    nrow(data2) instead
> > >>>>>
> > >>>>>
> > >>>>> Here is the loop corrected.
> > >>>>>
> > >>>>> for (i in seq_len(nrow(data2))){
> > >>>>>      if (data2$Layer[i] == "Level 12") {
> > >>>>>        data2$LU[i] <- "Park"
> > >>>>>      }
> > >>>>> }
> > >>>>>
> > >>>>>
> > >>>>>
> > >>>>> But R is a vectorized language; the following two forms are the
> > >>>>> idiomatic ways of doing what you want.
> > >>>>>
> > >>>>>
> > >>>>>
> > >>>>> i <- data2$Layer == "Level 12"
> > >>>>> data2$LU[i] <- "Park"
> > >>>>>
> > >>>>> # equivalent one-liner
> > >>>>> data2$LU[data2$Layer == "Level 12"] <- "Park"
> > >>>>>
> > >>>>>
> > >>>>>
> > >>>>> If there are NAs in data2$Layer, it's probably safer to wrap the
> > >>>>> logical index in which() (see ?which) to get a numeric index.
> > >>>>>
> > >>>>>
> > >>>>>
> > >>>>> i <- which(data2$Layer == "Level 12")
> > >>>>> data2$LU[i] <- "Park"
> > >>>>>
> > >>>>> # equivalent one-liner
> > >>>>> data2$LU[which(data2$Layer == "Level 12")] <- "Park"
> > >>>>>
> > >>>>>
> > >>>>> Hope this helps,
> > >>>>>
> > >>>>> Rui Barradas
> > >>>>>
> > >>>>
> > >>>>
> > >>> Hello,
> > >>>
> > >>> You don't need to repeat the same instruction 100+ times; there is a
> > >>> way of assigning all new LU values at the same time with match().
> > >>> This assumes that you have the new values in a vector.
> > >>
> > >> Sorry, this is not clear. I mean
> > >>
> > >>
> > >> This assumes that you have the new values in a vector, the vector Names
> > >> below. The vector of values to be matched is created from the data.
> > >>
> > >>
> > >> Rui Barradas
> > >>
> > >>>
> > >>>
> > >>> Values <- sort(unique(data2$Layer))
> > >>> Names <- c("Park", "Agri", "GS")
> > >>>
> > >>> i <- match(data2$Layer, Values)
> > >>> data2$LU <- Names[i]
> > >>>
> > >>>
> > >>> Hope this helps,
> > >>>
> > >>> Rui Barradas
> > >>>
> > >>> ______________________________________________
> > >>> R-help using r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > >>> https://stat.ethz.ch/mailman/listinfo/r-help
> > >>> PLEASE do read the posting guide
> > >>> http://www.R-project.org/posting-guide.html
> > >>> and provide commented, minimal, self-contained, reproducible code.
> > >>
> > >>
> > >
> > Hello,
> >
> > Please cc the r-help list; R-Help is threaded and this may be helpful
> > to others in the future.
> >
> > You can combine several patterns like this:
> >
> >
> > pat <- c("_esmdes|_Des Section|0")
> > grep(pat, data2$Layer)
> >
> > or, programmatically,
> >
> >
> > pat <- paste(c("_esmdes","_Des Section","0"), collapse = "|")
> >
> >
> > Hope this helps,
> >
> > Rui Barradas
> >
> >
>
> --
> Best Regards
> Javad Bayat
> M.Sc. Environment Engineering
> Alternative Mail: bayat194 using yahoo.com


