[R] Reorder file names read by list.files function

William Dunlap wdunlap at tibco.com
Thu Oct 11 02:05:21 CEST 2018


You can paste the directory names, dirname(file.names), back on with
file.path() after you do the sorting.  A better idiom is to use order()
instead of sort() and use order's output to subscript file.names.  E.g.,
the following sorts by year and month number.

> file.names <- c("C:/tmp/June_2018.PDF", "C:/tmp/May_2018.PDF",
+                 "C:/tmp/October_2016.PDF")
> bfile.names <- sub("\\..*$", "", basename(file.names))
> bfile.names
[1] "June_2018"    "May_2018"     "October_2016"
> month <- sub("^([[:alpha:]]+)_.*$", "\\1", bfile.names)
> month
[1] "June"    "May"     "October"
> month.names
Error: object 'month.names' not found
> month.names <-
+   c("January", "February", "March", "April", "May", "June", "July",
+     "August", "September", "October", "November", "December")
> month.number <- match(month, month.names)
> month.number
[1]  6  5 10
> year <- as.integer(sub("^.*_([[:digit:]]+)$", "\\1", bfile.names))
> year
[1] 2018 2018 2016
> file.names[ order(year, month.number) ]
[1] "C:/tmp/October_2016.PDF" "C:/tmp/May_2018.PDF"     "C:/tmp/June_2018.PDF"

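The steps in the transcript can be collected into one function.  This is
only a sketch: sort_by_year_month is a hypothetical name, not something from
base R, and it assumes every file is named <Month>_<Year>.<ext> with a full
English month name.  It uses the built-in constant month.name rather than a
hand-typed vector.

```r
# Hypothetical helper (not part of base R): put "<Month>_<Year>.<ext>" paths
# into calendar order without ever taking the paths apart.
sort_by_year_month <- function(file.names) {
  # Drop directory and extension: "C:/tmp/June_2018.PDF" -> "June_2018"
  stem <- sub("\\.[^.]*$", "", basename(file.names))
  month <- sub("_.*$", "", stem)              # text before the underscore
  year <- as.integer(sub("^.*_", "", stem))   # text after the underscore
  month.number <- match(month, month.name)    # month.name is built into R
  # order() returns a permutation, so the full paths are only subscripted
  file.names[order(year, month.number)]
}

file.names <- c("C:/tmp/June_2018.PDF", "C:/tmp/May_2018.PDF",
                "C:/tmp/October_2016.PDF")
sorted <- sort_by_year_month(file.names)
print(sorted)
```

Because order() only produces subscripts, the directory and extension never
need to be pasted back on.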



Bill Dunlap
TIBCO Software
wdunlap tibco.com

On Wed, Oct 10, 2018 at 4:23 PM, Ek Esawi <esawiek using gmail.com> wrote:

> Thank you Bill and Rui. I used month.name with sort and basename, as
> suggested by Bill. I got the sorted numerical values, then I used
> month.name to get the properly ordered month names. The problem is that I
> have to paste the extension PDF onto the names to get the correctly
> ordered file names, but then I get the same error message, which
> suggests that the code is not reading the files properly.
>
> I have not tried Rui's yet, but I will if nothing else works out.
>
> Thanks again--EK
>
> I had to strip the extension PDF off file.names, but when I paste
> month.name with .PDF to get the correct file names, I am getting
> the same error.
> On Tue, Oct 9, 2018 at 4:47 PM William Dunlap <wdunlap using tibco.com> wrote:
> >
> > Use basename(filename) to remove the leading parts of the full path to
> > the file.  E.g., replace
> >    FNs <- sort(match(sub("\\.PDF", "", file.names), month.name))
> > with (the untested)
> >    FNs <- sort(match(sub("\\.PDF", "", basename(file.names)), month.name))
> >
> > Bill Dunlap
> > TIBCO Software
> > wdunlap tibco.com
> >
> > On Tue, Oct 9, 2018 at 1:38 PM, Ek Esawi <esawiek using gmail.com> wrote:
> >>
> >> Hi again,
> >>
> >> I worked with Rui's idea of using the match function with month.name.
> >> I got numerical values for the months, then I sorted them and pasted on
> >> the PDF file extension. It gave me the file order I wanted, but now
> >> statements 8, 9, and 10 don't work and I keep getting the error listed
> >> below. The dilemma is that if I add full.names=TRUE in statement 6, then
> >> statements 9 and 10 don't produce what they did earlier. If I put
> >> full.names=FALSE, then I am back to square one.
> >> Any idea is greatly appreciated.
> >>
> >> The code
> >>
> >> 1. install.packages("tabulizer")
> >> 2. install.packages("stringr")
> >> 3. library(stringr)
> >> 4. library(tabulizer)
> >> 5. path = "C:/Users/namei/Documents/TextMining/S2017"
> >> 6. file.names <- dir(path, pattern =".PDF",full.names = TRUE)
> >> 7. file.names <- str_remove(file.names,"\\s[0-9][0-9]")
> >> 8. FNs <- sort(match(sub("\\.PDF", "", file.names), month.name))
> >> 9. FNs1 <- paste0(month.name[FNs],".","PDF")
> >> 10. A <- lapply(FNs1, function(i) extract_tables(i))
> >>
> >> Output and the error message.
> >>
> >> path = "C:/Users/eesawi/Documents/TextMining/S2017"
> >> > file.names <- dir(path, pattern =".PDF",full.names = TRUE)
> >> > file.names <- str_remove(file.names,"\\s[0-9][0-9]")
> >> > FNs <- sort(match(sub("\\.PDF", "", file.names), month.name))
> >> > FNs1 <- paste0(month.name[FNs],".","PDF")
> >> > A <- lapply(FNs1, function(i) extract_tables(i))
> >>
> >>  Error in normalizePath(path.expand(path), winslash, mustWork) :
> >>   path[1]=".PDF": The system cannot find the file specified
> >> On Tue, Oct 9, 2018 at 9:44 AM Ek Esawi <esawiek using gmail.com> wrote:
> >> >
> >> > Hi All--
> >> >
> >> > I used the base R list.files function to read files from a directory.
> >> > The file names are months (April, August, etc.). The system reads
> >> > them in alphabetical order, but I want to reorder them in calendar
> >> > order (January, February, ... December). I thought I might be able to
> >> > do it via regex or possibly the gtools package; I am wondering if
> >> > there is an easier way.
> >> >
> >> > Thanks--EK
> >> >
> >> > Example
> >> > path = "C:/Users/name/Downloads/MyFiles"
> >> > file.names <- dir(path, pattern =".PDF")
> >> >
> >> > Current output
> >> > "February.PDF"  "January.PDF" "March.PDF"
> >> > Desired output
> >> > "January.PDF"  "February.PDF" "March.PDF"
> >>
