[Rd] Limitation of dirname() and basename(): file.name() and file.dir() ?

cstrato cstrato at aon.at
Wed Mar 28 20:00:31 CEST 2007


I am glad to hear that there seems to be some commitment to improvement,
although I must admit that I did not realize that neither function checks
whether a name is a directory or a file, even though the definition in "The
Open Group Base Specifications" says:
   dirname  - return the directory portion of a pathname
   basename - return non-directory portion of a pathname
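
A minimal sketch of this behaviour, assuming a Unix-like system; the
directory and file names under tempdir() are chosen only for
illustration. Both functions split the string and never consult the
file system:

```r
# Illustrative paths only: a real directory and a real file under tempdir().
d <- file.path(tempdir(), "Diverses")
dir.create(d, showWarnings = FALSE)
f <- file.path(d, "tmp.txt")
file.create(f)

# The file system knows which one is a directory ...
file.info(d)$isdir              # TRUE
file.info(f)$isdir              # FALSE

# ... but basename() only takes the last path component of the string.
basename(f)                     # "tmp.txt"
basename(d)                     # "Diverses"
basename(paste0(d, "/"))        # "Diverses" (trailing slash is dropped first)

# A path that does not exist is split in exactly the same way.
basename("/no/such/dir/file.txt")   # "file.txt"
dirname("/no/such/dir/file.txt")    # "/no/such/dir"
```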

Using basename() and dirname(), I tried to define functions that show
what I want to achieve:

file.name <- function(fullname, name.exists=TRUE) {
   # isdir is NA if fullname does not exist (single path only)
   isdir <- file.info(fullname)[,"isdir"]
   if (is.na(isdir)) return(if (name.exists) NA else basename(fullname))
   # an existing directory has no filename part
   if (isdir) "" else basename(fullname)
}

file.dir <- function(fullname, name.exists=TRUE) {
   isdir <- file.info(fullname)[,"isdir"]
   # for a non-existing name, fall back to its parent directory
   if (is.na(isdir)) return(if (name.exists) NA else file.dir(dirname(fullname)))
   path.expand(if (isdir) fullname else dirname(fullname))
}

Here are some examples:

 > file.name("/net/home/stratowa/Diverses/tmp.txt")
[1] "tmp.txt"
 > file.name("/net/home/stratowa/Diverses/")
[1] ""
 > file.name("/net/home/stratowa/Diverses")
[1] ""
 > file.dir("/net/home/stratowa/Diverses/tmp.txt")
[1] "/net/home/stratowa/Diverses"
 > file.dir("/net/home/stratowa/Diverses/")
[1] "/net/home/stratowa/Diverses/"
 > file.dir("/net/home/stratowa/Diverses")
[1] "/net/home/stratowa/Diverses"

To get the filename part of a file that does not exist yet, I can set
"name.exists=FALSE".
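
The same idea could also be applied to several paths at once. The
following is only a sketch: the name file.name.v and the paths under
tempdir() are hypothetical, and it assumes that file.info() reports NA
in its isdir column for paths that do not exist.

```r
# Hypothetical vectorized variant of file.name(); the name is illustrative only.
file.name.v <- function(fullname, name.exists = TRUE) {
  isdir <- file.info(fullname)$isdir        # NA for paths that do not exist
  out <- basename(fullname)
  out[!is.na(isdir) & isdir] <- ""          # existing directories: no file part
  if (name.exists) out[is.na(isdir)] <- NA_character_
  out
}

# Illustrative setup: one directory, one existing file, one new file name.
d <- file.path(tempdir(), "Diverses")
dir.create(d, showWarnings = FALSE)
f <- file.path(d, "tmp.txt")
file.create(f)
new <- file.path(d, "new.txt")

file.name.v(c(f, d, new))                       # "tmp.txt" ""  NA
file.name.v(c(f, d, new), name.exists = FALSE)  # "tmp.txt" ""  "new.txt"
```

file.dir() could be treated analogously.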

I think that it would be really helpful if R could add these (or similar)
functions to the base package, but this is my personal opinion.

Best regards
Christian


Simon Urbanek wrote:
>
> On Mar 27, 2007, at 5:42 PM, Herve Pages wrote:
>
>> Simon Urbanek wrote:
>>> Your proposed behavior is inconsistent, anyway. The purpose of
>>> dirname is to return the parent directory of the entity represented by
>>> the pathname.
>>
>> Mmmm, I don't think this is true:
>>
>>> dirname("aaa/..")
>>   [1] "aaa"
>>
>> "aaa" is not the parent directory of "aaa/.."
>>
>> Same here:
>>
>>> dirname("/usr/./.")
>>   [1] "/usr/."
>>
>
> Yes, the problem is that most dirname implementations don't 
> canonicalize the path - they are working on the string representation 
> and don't use the underlying FS. I wasn't saying that dirname is 
> perfect, in fact I refrained from commenting on this earlier exactly 
> because of the behavior you describe, but the decision to remove 
> trailing slashes was deliberate, as can be seen from the specs.
> Semantically correct behavior (taking the definition of the dirname
> function literally) would be dirname "/usr/." = "/", basename "/usr/."
> = "usr". However, I suspect that not many people would expect this ;).
> As was proposed earlier, one could think of a "true dirname" or perhaps
> better a "parentdir" function, although admittedly I don't see an issue
> here ...
>
> Cheers,
> Simon
>
>
>
>>
>>> "/my/path" and "/my/path/" are equivalent as they both
>>> represent the directory "path" whose parent is "/my", therefore
>>> returning "/my/path" in either case is inconsistent with the purpose
>>> of this function. As for trailing slashes (independently of dirname),
>>> sadly, some programs exploit the equivalence of both representations
>>> by encoding meta-information in the representation, but this behavior
>>> is quite confusing and error-prone. You're free to add such special
>>> cases to your application, but there is no reason to add such
>>> confusion to R.
>>
>> Note that Python's designers were not afraid to break away from Unix in
>> this particular case:
>>
>>>>> import os.path
>>>>> os.path.dirname("aaa/..")
>>   'aaa'
>>>>> os.path.dirname("aaa/../")
>>   'aaa/..'
>>
>>
>> Also note that, if the goal was to mimic Unix behaviour, then why not
>> fully go for it, even for edge-cases:
>>
>>   R
>>   ----
>>> dirname("/")
>>   [1] "/"
>>> basename("/")
>>   [1] ""
>>
>>   Unix
>>   ----
>>   hpages at lamb1:~> dirname "/"
>>   /
>>   hpages at lamb1:~> basename "/"
>>   /
>>
>> Just my 2 cents...
>>
>> Cheers,
>> H.
>>
>>


