You are right, just a pointer. However, a normal system will, during the call, copy the data into its own buffer. Here's the typical sequence:
Code:
.----------------.
| Program Buffer |
'----------------'
        |
        |    .----------------.
        '--->| Library Buffer |
             '----------------'
                     |
Kernel Mode =========|=======================
                     |
                     v
             .---------------.
             | Kernel Buffer |
             '---------------'
                     |
                     |      .---------------.
                     '----->| Device Buffer |
                            '---------------'
If your driver uses zero-copy with scatter/gather, the kernel -> device buffer copy can be skipped.
As you can see, that's a lot of RAM copies. And they are CPU RAM copies, not DMA, so it's very wasteful.
The library receives a pointer to the data and a length. It then copies that data into its own buffer(s) before calling the kernel, again passing a pointer and a length, and the kernel in turn copies the data into its buffer(s) before returning from the call.
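To make the copy chain concrete, here is a minimal user-space sketch in C. It is only an illustration: library_send, kernel_send and the fixed-size buffers are hypothetical stand-ins I made up for the layers in the diagram, not any real library or kernel API.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define BUF_SIZE 64

/* Hypothetical stand-ins for the library and kernel buffers. */
static unsigned char library_buf[BUF_SIZE];
static unsigned char kernel_buf[BUF_SIZE];
static size_t kernel_len;

/* "Kernel" layer: copies the caller's data into its own buffer
 * before returning, so the caller's memory is no longer needed. */
static int kernel_send(const void *data, size_t len)
{
    if (len > BUF_SIZE)
        return -1;
    memcpy(kernel_buf, data, len);      /* CPU copy #2 */
    kernel_len = len;
    return 0;
}

/* "Library" layer: copies into its own buffer, then calls down. */
static int library_send(const void *data, size_t len)
{
    if (len > BUF_SIZE)
        return -1;
    memcpy(library_buf, data, len);     /* CPU copy #1 */
    return kernel_send(library_buf, len);
}

/* Returns 1 if the queued packet survives the program
 * overwriting its buffer right after the call returns. */
static int demo_copy_semantics(void)
{
    char program_buf[BUF_SIZE] = "packet 1";
    int rc = library_send(program_buf, strlen(program_buf) + 1);
    assert(rc == 0);

    /* The call has returned, so the program reuses its buffer. */
    strcpy(program_buf, "packet 2");

    return kernel_len == 9 &&
           strcmp((const char *)kernel_buf, "packet 1") == 0;
}
```

Calling demo_copy_semantics() shows the trade-off: the program can reuse program_buf immediately, but the payload was copied by the CPU twice to get there.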
The whole reason this is done is so that the caller can consider its buffer 'handled' as soon as the call returns, and can overwrite it with new data for the next packet. Without the copy, it would need a complicated callback mechanism to be told when the buffer was really free. That's just not feasible. Hence the idea of passing ownership of the buffer back and forth instead, where the only data being copied is the pointer, not the block of data itself.
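The ownership idea can be sketched the same way. Again, this is only an assumed illustration: the pool, buf_acquire, buf_submit and tx_complete are hypothetical names, and a real driver would need locking and a proper ring instead of these small arrays. The point is that only pointers move between the free list and the transmit queue.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define POOL_SIZE 4
#define BUF_SIZE  64

struct packet_buf {
    size_t len;
    unsigned char data[BUF_SIZE];
};

/* Hypothetical buffer pool shared by the program and the "driver". */
static struct packet_buf pool[POOL_SIZE];
static struct packet_buf *free_list[POOL_SIZE];
static size_t free_count;
static struct packet_buf *tx_queue[POOL_SIZE];
static size_t tx_count;

static void pool_init(void)
{
    free_count = 0;
    tx_count = 0;
    for (size_t i = 0; i < POOL_SIZE; i++)
        free_list[free_count++] = &pool[i];
}

/* The program takes ownership of a free buffer (NULL if none left). */
static struct packet_buf *buf_acquire(void)
{
    return free_count ? free_list[--free_count] : NULL;
}

/* The program hands ownership to the driver: only the pointer is queued. */
static void buf_submit(struct packet_buf *b)
{
    assert(tx_count < POOL_SIZE);
    tx_queue[tx_count++] = b;
}

/* The driver is done with the oldest buffer and returns it to the pool. */
static struct packet_buf *tx_complete(void)
{
    if (tx_count == 0)
        return NULL;
    struct packet_buf *b = tx_queue[0];
    memmove(tx_queue, tx_queue + 1, (tx_count - 1) * sizeof tx_queue[0]);
    tx_count--;
    free_list[free_count++] = b;
    return b;
}

/* Returns 1 if two packets can be queued without copying any payload. */
static int demo_ownership(void)
{
    pool_init();

    struct packet_buf *a = buf_acquire();
    strcpy((char *)a->data, "packet 1");
    a->len = 9;
    buf_submit(a);                      /* 'a' now belongs to the driver */

    struct packet_buf *b = buf_acquire(); /* next packet: a different buffer */
    strcpy((char *)b->data, "packet 2");
    b->len = 9;
    buf_submit(b);

    int ok = (a != b) && tx_queue[0] == a &&
             strcmp((const char *)tx_queue[0]->data, "packet 1") == 0;
    ok = ok && tx_complete() == a && free_count == POOL_SIZE - 1;
    return ok;
}
```

Instead of overwriting the buffer it just sent, the program asks for a fresh one, and the old buffer comes back to the pool once the driver has finished with it.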