I have a function similar to the following:
[MethodImpl(MethodImplOptions.AggressiveInlining)]
public void SetVariable<T>(T newValue) where T : struct {
// I know by this point that T is blittable (i.e. only unmanaged value types)
// varPtr is a void*, and is where I want to copy newValue to
*varPtr = newValue; // This won't work, but is basically what I want to do
}
I've seen Marshal.StructureToPtr(), but it looks rather slow, and this is performance-sensitive code. If I knew the type T, I could just declare varPtr as a T*, but... well, I don't.
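Either way, I'm looking for the fastest possible way to do this. "Safety" is not a concern: by this point in the code, I know that the size of the struct T will fit exactly into the memory pointed to by varPtr.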
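One answer is to reimplement native memcpy in C#, using the same optimization tricks that native memcpy attempts. You can see Microsoft doing exactly this in their own source. See the Buffer.cs file in the Microsoft Reference Source (http://referencesource.microsoft.com/mscorlib/system/buffer.cs.html#0d691e54a9fdddc3):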
// This is tricky to get right AND fast, so lets make it useful for the whole Fx.
// E.g. System.Runtime.WindowsRuntime!WindowsRuntimeBufferExtensions.MemCopy uses it.
internal unsafe static void Memcpy(byte* dest, byte* src, int len) {
// This is portable version of memcpy. It mirrors what the hand optimized assembly versions of memcpy typically do.
// Ideally, we would just use the cpblk IL instruction here. Unfortunately, cpblk IL instruction is not as efficient as
// possible yet and so we have this implementation here for now.
switch (len)
{
case 0:
return;
case 1:
*dest = *src;
return;
case 2:
*(short *)dest = *(short *)src;
return;
case 3:
*(short *)dest = *(short *)src;
*(dest + 2) = *(src + 2);
return;
case 4:
*(int *)dest = *(int *)src;
return;
...
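Interestingly, they implement memcpy themselves, in C#, for all sizes up to 512; most sizes use pointer aliasing tricks to get the VM to emit instructions that operate on different widths. Only at 512 do they finally drop into the native memcpy: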
// P/Invoke into the native version for large lengths
if (len >= 512)
{
_Memcpy(dest, src, len);
return;
}
Presumably, native memcpy is even faster, since it can be hand-optimized to use SSE/MMX instructions to perform the copy.
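For the specific case in the question, where T is known to be blittable, a hand-rolled memcpy may not be needed at all. The following is only a minimal sketch, assuming C# 7.3 or later (which added the unmanaged generic constraint) and a project compiled with unsafe code enabled. The VariableSlot class and the Main harness are hypothetical names for illustration, and the varPtr field stands in for the question's pointer.

```csharp
using System;
using System.Runtime.CompilerServices;
using System.Runtime.InteropServices;

public static unsafe class VariableSlot
{
    // Hypothetical stand-in for the question's varPtr: points at memory
    // that is assumed to be at least sizeof(T) bytes and suitably aligned.
    private static void* varPtr;

    [MethodImpl(MethodImplOptions.AggressiveInlining)]
    public static void SetVariable<T>(T newValue) where T : unmanaged
    {
        // The unmanaged constraint makes T* a legal pointer type,
        // so the value can be stored with a single typed write.
        *(T*)varPtr = newValue;
    }

    public static void Main()
    {
        // Allocate 8 bytes of native memory to act as the variable's storage.
        varPtr = (void*)Marshal.AllocHGlobal(sizeof(long));
        try
        {
            SetVariable(42L);
            // Read the value back through the same pointer.
            Console.WriteLine(*(long*)varPtr);
        }
        finally
        {
            Marshal.FreeHGlobal((IntPtr)varPtr);
        }
    }
}
```

On compilers older than C# 7.3, where only the struct constraint is available, Unsafe.Write<T>(void*, T) from the System.Runtime.CompilerServices.Unsafe package performs the same kind of store. Neither route checks the size of the destination, which matches the question's premise that the caller already guarantees the fit.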