[Rd] suggestion how to use memcpy in duplicate.c

Matthew Dowle mdowle at mdowle.plus.com
Thu Apr 22 13:12:53 CEST 2010


Is this a thumbs up for memcpy for DUPLICATE_ATOMIC_VECTOR at least?

If there is further specific testing to do then let me know, happy to help,
but you seem to have beaten me to it.

Matthew


"Simon Urbanek" <simon.urbanek at r-project.org> wrote in message 
news:65D21B93-A737-4A94-BDF4-AD7E90518AC0 at r-project.org...
>
> On Apr 21, 2010, at 2:15 PM, Seth Falcon wrote:
>
>> On 4/21/10 10:45 AM, Simon Urbanek wrote:
>>> Won't that miss the last incomplete chunk? (and please don't use
>>> DATAPTR on INTSXP even though the effect is currently the same)
>>>
>>> In general it seems that it depends on nt whether this is
>>> efficient or not since calls to short memcpy are expensive (very
>>> small nt that is).
>>>
>>> I ran some empirical tests to compare memcpy vs for() (x86_64, OS X)
>>> and the results were encouraging - depending on the size of the
>>> copied block the difference could be quite big:
>>>   tiny block (ca. n = 32 or less) - for() is faster
>>>   small block (n ~ 1k) - memcpy is ca. 8x faster
>>>   as the size increases the gap closes (presumably due to RAM
>>>   bandwidth limitations) so for n = 512M it is ~30%.
>>>
>>
>>> Of course this is contingent on the implementation of memcpy,
>>> compiler, architecture etc. And will only matter if copying is what
>>> you do most of the time ...
>>
>> Copying of vectors is something that I would expect to happen fairly 
>> often in many applications of R.
>>
>> Is for() faster on small blocks by enough that one would want to branch 
>> based on size?
>>
>
> Good question. Given that the branching itself adds overhead, possibly not. 
> In the best case for() can be ~40% faster (for single-digit n) but that 
> means billions of copies to make a difference (since the operation itself 
> is so fast). The break-even point on my test machine is n=32 and when I 
> added the branching it took a 20% hit so I guess it's simply not worth it. 
> The only case that may be worth branching on is n=1 since that is likely a 
> fairly common use (the branching penalty in copy routines is lower than 
> comparing memcpy/for implementations since the branching can be done 
> before the outer for loop so this may vary case-by-case).
>
> Cheers,
> Simon
>
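
For readers following the thread, here is a minimal sketch of the copy
strategies discussed above. It assumes plain int buffers rather than R's
SEXP vectors, and every function name in it (copy_loop, copy_memcpy,
copy_branch, copy_chunked) is hypothetical; none of this is taken from
duplicate.c or from the DUPLICATE_ATOMIC_VECTOR macro. It shows the for()
loop, a single memcpy, a variant that branches only on n == 1, and a
chunked copy that handles the last incomplete chunk.

```c
#include <stddef.h>
#include <string.h>

/* Element-by-element copy, in the style of a plain for() loop. */
void copy_loop(int *dst, const int *src, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = src[i];
}

/* One memcpy over the whole vector: no chunking, so no tail to miss. */
void copy_memcpy(int *dst, const int *src, size_t n)
{
    if (n > 0)
        memcpy(dst, src, n * sizeof(int));
}

/* Hypothetical variant: special-case only n == 1 and use memcpy
 * otherwise. Whether this branch pays off depends on the memcpy
 * implementation, compiler and architecture, so measure it. */
void copy_branch(int *dst, const int *src, size_t n)
{
    if (n == 1)
        dst[0] = src[0];
    else
        copy_memcpy(dst, src, n);
}

/* Copy in chunks of `chunk` elements. The final memcpy handles the
 * last incomplete chunk, which a loop over full chunks alone skips. */
void copy_chunked(int *dst, const int *src, size_t n, size_t chunk)
{
    size_t full = (chunk > 0) ? n / chunk : 0;  /* number of whole chunks */
    size_t done = full * chunk;                 /* elements copied by them */

    for (size_t k = 0; k < full; k++)
        memcpy(dst + k * chunk, src + k * chunk, chunk * sizeof(int));

    if (done < n)                               /* remaining tail, if any */
        memcpy(dst + done, src + done, (n - done) * sizeof(int));
}
```

All four routines should produce identical output for non-overlapping
buffers. Which one is fastest at a given n is an empirical question and
would need to be timed on the target platform.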


