Buffers¶
Buffers are objects that contain data that can be read or written by shaders on the GPU. Buffers must contain homogeneous data (all elements in the buffer have the same type) and can be numeric or composite. Composite data can be vectors, matrices, or draw commands for indirect drawing.
Info
The various objects described on this page are declared in the package Orka.Rendering.Buffers and its child packages.
Creating a buffer¶
To create a buffer, call Create_Buffer:
Buffer_1 : Buffer := Create_Buffer
(Flags => (Dynamic_Storage => True, others => False),
Kind => Orka.Types.UInt_Type,
Length => 64);
Length specifies the number of elements in the buffer, not the number of bytes. The size of a buffer can be queried with the function Length.
Alternatively, Create_Buffer can be called with the parameters Flags and Data to initialize the buffer with the given data:
Indices : Unsigned_32_Array := (1, 2, 0, 0, 2, 3);
Buffer_2 : Buffer := Create_Buffer ((others => False), Indices);
Types¶
Buffers can contain data of one of the following types from package Orka:
Unsigned_8_Array
Unsigned_16_Array
Unsigned_32_Array
Integer_8_Array
Integer_16_Array
Integer_32_Array
Float_16_Array
Float_32_Array
Float_64_Array
and the following types from package GL.Types.Indirect:
Arrays_Indirect_Command_Array
Elements_Indirect_Command_Array
Dispatch_Indirect_Command_Array
and the following types from packages Orka.Types.Singles and Orka.Types.Doubles:
Vector4_Array
Matrix4_Array
Additionally, for mapped buffers, the non-array versions of the last five types can also be written.
Uploading data¶
To upload data from the CPU after the buffer has been created, the buffer must have been created with Dynamic_Storage set to True. Call Set_Data with the parameter Data to upload the data:
Buffer_1.Set_Data (Indices, Offset => 42);
Parameter Offset is optional (default value is 0) and specifies the position in the buffer of the first element of the given data.
Downloading data¶
To synchronously download data, first set a barrier:
GL.Barriers.Memory_Barrier ((Buffer_Update => True, others => False));
and then call procedure Get_Data:
declare
Data : Float_32_Array (1 .. 16) := (others => 0.0);
begin
Buffer_0.Get_Data (Data);
end;
This procedure may stall the CPU.
Asynchronously downloading data is not yet supported
See issue #32 on GitHub.
Clearing data¶
Regardless of the value of Dynamic_Storage, data from a buffer can be cleared with the procedure Clear_Data:
declare
Data : Unsigned_32_Array := (1, 2, 0);
begin
Buffer_2.Clear_Data (Data);
end;
This will write (repeatedly) 1, 2, and 0 to the buffer. To efficiently clear the buffer with zeros, use an array with one zero:
Buffer_2.Clear_Data (Unsigned_32_Array'(1 => 0));
Copying data to another buffer¶
Regardless of the value of Dynamic_Storage, data from a buffer can be copied to another buffer by calling the procedure Copy_Data on the source buffer.
Tip
Disable Dynamic_Storage if the buffer is only written by shaders on the GPU. This makes the buffer immutable and may give the video driver the freedom to allocate it in faster memory and/or perform faster validation.
To upload data to an immutable buffer, an extra buffer with Dynamic_Storage can be created as a staging buffer. After having called Set_Data on this staging buffer, the data can be copied to the immutable buffer by calling Copy_Data.
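The staging approach described above can be sketched as follows. This is only an illustration built from the calls shown earlier on this page; the names Staging and Target are hypothetical, the usual use clauses for Orka and Orka.Rendering.Buffers are assumed, and the code must run in the rendering task with a current OpenGL context:

```ada
declare
   Indices : constant Unsigned_32_Array := (1, 2, 0, 0, 2, 3);

   --  Staging buffer (hypothetical name): writable from the CPU via Set_Data
   Staging : Buffer := Create_Buffer
     (Flags  => (Dynamic_Storage => True, others => False),
      Kind   => Orka.Types.UInt_Type,
      Length => Indices'Length);

   --  Immutable buffer (hypothetical name): no Dynamic_Storage, so it can
   --  only be filled by the GPU or by copying from another buffer
   Target : Buffer := Create_Buffer
     (Flags  => (others => False),
      Kind   => Orka.Types.UInt_Type,
      Length => Indices'Length);
begin
   --  Upload from the CPU to the staging buffer
   Staging.Set_Data (Indices);

   --  Copy_Data is called on the source buffer
   Staging.Copy_Data (Target);
end;
```

Both buffers use the same Kind and Length here so that the copy covers the whole target buffer.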
Binding buffers¶
Buffer objects implement the interface Bindable_Buffer, which provides the procedure Bind. This procedure can be used to bind the buffer to a target so that it can be used by certain operations like indirect drawing.
Valid targets are:
Index
Dispatch_Indirect
Draw_Indirect
Parameter
Pixel_Pack
Pixel_Unpack
Query
Tip
The targets can be made directly visible with use all type Orka.Rendering.Buffers.Buffer_Target.
Accessing buffers in shaders¶
A second procedure Bind exists to bind the buffer object to the index of a target so that the buffer can be accessed in a shader. Valid targets are:
Shader_Storage (SSBO)
Uniform (UBO)
A UBO should only be used for small amounts of data (no more than 64 KiB) that are accessed uniformly by all threads of a shader. Otherwise it is recommended to use an SSBO, which does not have these limitations.
Tip
The targets can be made directly visible with use all type Orka.Rendering.Buffers.Indexed_Buffer_Target.
SSBO¶
SSBOs are large writable buffers:
- Guaranteed to be at least 128 MiB. The storage size can be variable.
- Can be read and written. Writes can be atomic via special functions. A barrier is required after a shader has written to the buffer.
To use a buffer as an SSBO, declare a buffer block with a binding index in a shader:
layout(std430, binding = 0) buffer matrixBuffer {
mat4 matrices[];
};
and then bind the buffer to the used index:
Buffer_3.Bind (Shader_Storage, 0);
If data has been uploaded to the buffer or data was written to it by a shader, a memory barrier must be inserted before the buffer can be read by another shader:
GL.Barriers.Memory_Barrier ((Shader_Storage => True, others => False));
The buffer can then be accessed in the shader via the variable matrices.
Barriers
If a shader program has written data to the buffer, you must add a barrier before another program or OpenGL command reads from that buffer again. The kind of barrier that is needed depends on how the buffer is subsequently read.
Memory qualifiers
See Memory qualifiers on the OpenGL Wiki for a list of memory qualifiers that can be added to the buffer variable.
UBO¶
A uniform buffer is a buffer that provides uniform data and can be used as an alternative to a set of separate uniforms. If several different shader programs require the same set of uniforms, a UBO is a good fit because it avoids having to set the same uniforms for different programs; the buffer needs to be bound only once.
Compared to SSBOs, UBOs are severely restricted:
- UBOs are at least 16 KiB, but often 64 KiB or even 2 GiB for some vendors. Storage size is fixed.
- Can only be read, not written.
- Should be accessed uniformly by the shader invocations. May be faster than SSBOs.
To use a buffer as a UBO, declare a uniform block with a binding index in a shader:
layout(std140, binding = 0) uniform cameraBuffer {
mat4 viewTM;
mat4 projTM;
};
and then bind the buffer to the used index:
Buffer_3.Bind (Uniform, 0);
Padding
Note that the std140 layout pads vectors to 16 bytes (vec4). You should avoid using vec3. See Memory layout on the OpenGL Wiki.
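To make the padding concrete, the following stand-alone sketch (plain Ada, no Orka or OpenGL calls) computes where two consecutive vec3 members of a std140 block would start. The helper Align_Up and the constants are hypothetical names used only for this illustration; the rule applied is that std140 gives a vec3 the same 16-byte base alignment as a vec4, even though it only occupies 12 bytes:

```ada
with Ada.Text_IO;

procedure Std140_Offsets is
   --  Round Offset up to the next multiple of Alignment
   function Align_Up
     (Offset : Natural; Alignment : Positive) return Natural
   is ((Offset + Alignment - 1) / Alignment * Alignment);

   Vec3_Size      : constant := 12;  --  three 32-bit floats
   Vec3_Alignment : constant := 16;  --  std140 aligns a vec3 like a vec4

   --  Byte offsets of two consecutive vec3 members in a std140 block
   First  : constant Natural := Align_Up (0, Vec3_Alignment);
   Second : constant Natural :=
     Align_Up (First + Vec3_Size, Vec3_Alignment);
begin
   Ada.Text_IO.Put_Line
     ("first vec3 starts at byte" & Natural'Image (First));
   Ada.Text_IO.Put_Line
     ("second vec3 starts at byte" & Natural'Image (Second));
end Std140_Offsets;
```

The second member starts at byte 16 rather than byte 12, so data written from the CPU as tightly packed groups of three floats would not line up with what the shader reads. Using vec4 on both sides avoids the mismatch.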
TBO¶
A third way to access a buffer in a shader is as a TBO. Data is fetched in the shader via a texture unit, which can do format conversion in hardware.
In order to use a buffer as a TBO, the buffer must be attached to a Buffer_Texture object (a special kind of texture):
Buffer_Texture_1.Attach_Buffer (GL.Pixels.RGBA32F, Buffer_3.GL_Buffer);
Furthermore, a uniform with an explicit binding index must be declared in the shader:
layout(binding = 0) uniform samplerBuffer matrixBuffer;
and the buffer texture must be bound to this binding point:
declare
use all type Orka.Rendering.Textures.Indexed_Texture_Target;
begin
Orka.Rendering.Textures.Bind (Buffer_Texture_1, Texture, 0);
end;
Data can then be fetched via the texelFetch function in the shader.
Mapped buffers¶
Mapped buffers can be read from or written to by any task, not just the rendering task. Normally, an OpenGL subprogram may only be called from the task for which the OpenGL context is current. Creating and deleting mapped buffers must still happen in the rendering task, just like any other OpenGL object.
The package Orka.Rendering.Buffers.Mapped contains the type Mapped_Buffer, which is used and extended by all implementations of mapped buffers.
The type Mapped_Buffer has two discriminants: Kind and Mode.
If a record type containing a mapped buffer is needed, the type and the component containing the mapped buffer can be declared as follows:
type Record_1 is record
Component_1 : Some_Mapped_Buffer
(Kind => Orka.Types.Single_Matrix_Type,
Mode => Orka.Rendering.Buffers.Mapped.Write);
end record;
Just like a regular Buffer, a Mapped_Buffer can be bound to a target or to a binding point with the procedure Bind.
Writing and reading data¶
Data can be written to the buffer with procedure Write_Data if the discriminant Mode has the value Write, and read with procedure Read_Data if the discriminant has the value Read.
To write data, call Write_Data:
Buffer_3.Write_Data (Matrix, Offset => Instance_Index);
To read elements from a mapped buffer, create an array on the stack and then call Read_Data:
declare
Data : Integer_32_Array (1 .. 16) := (others => 0);
begin
Buffer_4.Read_Data (Data);
end;
Persistent mapped buffers¶
Persistent mapped buffers are buffers that are and remain mapped indefinitely (until the buffer is deleted). This kind of mapped buffer is useful for data that is updated every frame.
Because the mapping is persistent, the GPU may read from or write to the buffer while it is mapped. To guarantee mutually exclusive access between the GPU and the CPU, the buffer must be split into multiple regions and fences must be used to make sure the GPU and CPU never operate on the same region.
To create a persistent mapped buffer, use the function Create_Buffer in package Orka.Rendering.Buffers.Mapped.Persistent:
Buffer_4 : Persistent_Mapped_Buffer := Create_Buffer
(Kind => Orka.Types.Single_Matrix_Type,
Length => 1024,
Mode => Orka.Rendering.Buffers.Mapped.Write,
Regions => 3);
At the end of a frame, after reading from or writing to the buffer, the buffer can be set to the next region:
Buffer_4.Advance_Index;
Note that this procedure does not set or wait for any fence. See Fences on how to set and wait for a fence.
Writing data¶
If the buffer was created with Mode set to Write, you must wait for the fence of the current region to complete before writing, and then set a new fence after the drawing or dispatch commands that use the buffer.
Reading data¶
In the case of reading from the buffer, set a new fence after the drawing or dispatch commands and then later wait for it to complete before reading the data.
Note
Persistent mapped buffers in Orka are coherent. This means that writes by the GPU or CPU will be automatically visible to the other. There is no need to add a barrier before a fence is set in the case of reading from a mapped buffer.
Unsynchronized mapped buffers¶
For use cases where a buffer does not need to be mapped indefinitely, an unsynchronized mapped buffer can be created instead. An unsynchronized mapped buffer can be mapped and then later unmapped. This is useful so that data can be written to it by a non-rendering task.
To create an unsynchronized mapped buffer, use the function Create_Buffer in package Orka.Rendering.Buffers.Mapped.Unsynchronized:
Buffer_5 : Unsynchronized_Mapped_Buffer := Create_Buffer
(Kind => Orka.Types.Int_Type,
Length => 16,
Mode => Orka.Rendering.Buffers.Mapped.Write);
After the buffer has been created, it can be mapped:
Buffer_5.Map;
While the buffer is mapped, the procedure Read_Data or Write_Data can be used to read data from or write data to the buffer, depending on the Mode given in the Create_Buffer call.
These procedures can be called from any task.
After data has been read or written, the mapped buffer must be unmapped again so that it can be used by the GPU:
Buffer_5.Unmap;
If the buffer is not mapped, the function Buffer can be called to retrieve the actual Buffer object.
This object can then be used for other purposes, such as binding it to some target so that it can be used by shaders.
See Binding buffers on how to bind a Buffer object.
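Put together, the life cycle of an unsynchronized mapped buffer might look like the sketch below. It reuses Buffer_5 from above and only the subprograms named in this section; the Data array, the explicit Offset value, and the choice of Shader_Storage with index 0 are illustrative assumptions:

```ada
declare
   use all type Orka.Rendering.Buffers.Indexed_Buffer_Target;

   --  Illustrative data matching Kind => Int_Type and Length => 16
   Data : constant Integer_32_Array (1 .. 16) := (others => 1);
begin
   --  Map the buffer so that the CPU may access it
   Buffer_5.Map;

   --  May be called from any task while the buffer is mapped
   Buffer_5.Write_Data (Data, Offset => 0);

   --  Unmap before the GPU uses the buffer
   Buffer_5.Unmap;

   --  Retrieve the underlying Buffer object and bind it for a shader
   Buffer_5.Buffer.Bind (Shader_Storage, 0);
end;
```

If the write happens in a non-rendering task, the mapping, unmapping, and binding still belong in the rendering task, because they are ordinary OpenGL operations.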
Fences¶
A fence is needed when using persistent mapped buffers or asynchronously downloading data. The fence is retired after the previous rendering commands have been completed.
To create a fence, call the function Create_Buffer_Fence in the package Orka.Rendering.Fences and make sure the parameter Regions is equal to the value used by Create_Buffer when creating the persistent mapped buffer:
Fence_1 : Buffer_Fence := Create_Buffer_Fence (Regions => 3);
It will actually create multiple fences, one for each region of the buffer.
One Buffer_Fence (with multiple regions) is sufficient for multiple persistent mapped buffers as long as the buffers all have the same number of regions and are all moved to the next region before the fence is moved to the next region.
To wait for the fence of the current region to retire, call procedure Prepare_Index:
declare
Status : Fence_Status;
begin
Fence_1.Prepare_Index (Status);
end;
This call may be done once at the start of a frame.
The Status will be Signaled if the fence was retired during or before the call.
If waiting failed or timed out, then the value will be Not_Signaled.
At the end of a frame, the procedure Advance_Index must be called to move the fence to the next region:
Fence_1.Advance_Index;
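As a rough sketch, one frame that writes to a persistent mapped buffer could be ordered as follows. It reuses Buffer_4 and Fence_1 from above, and Matrix and Instance_Index from the Write_Data example. That Advance_Index on the fence is also the call that sets the new fence for the region just used is an assumption of this sketch, not something stated on this page:

```ada
declare
   Status : Fence_Status;
begin
   --  Start of frame: wait until the GPU is done with the current region
   Fence_1.Prepare_Index (Status);

   --  Status is Signaled if the fence was retired; Not_Signaled means
   --  waiting failed or timed out and the region may still be in use
   if Status = Signaled then
      Buffer_4.Write_Data (Matrix, Offset => Instance_Index);
   end if;

   --  Issue the drawing or dispatch commands that use Buffer_4 here

   --  End of frame: move the buffer(s) first, then the fence
   Buffer_4.Advance_Index;
   Fence_1.Advance_Index;
end;
```

Skipping the write when the fence is not signaled is just one possible policy; an application may prefer to retry or to treat this as an error.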
Barriers¶
A barrier must be inserted between compute and rendering commands to make sure that the data becomes visible to a shader program or the video driver.
For example, if data was uploaded to a buffer or written by some compute shader, and is then subsequently read as an SSBO by another program running on the GPU, a Shader_Storage memory barrier must be inserted between the two commands:
GL.Barriers.Memory_Barrier ((Shader_Storage => True, others => False));
-- Issue a rendering command in which a shader accesses the buffer as an SSBO
In another example, if a compute shader has written data to a buffer and the buffer must then be copied to another buffer, insert a Buffer_Update barrier:
GL.Barriers.Memory_Barrier ((Buffer_Update => True, others => False));
Buffer_1.Copy_Data (Buffer_2);
For use cases such as deferred shading where ordering of reads and writes matters only to fragment shaders, the procedure Memory_Barrier_By_Region can be used instead. In this case both By_Region and the requested barrier must be set to True:
GL.Barriers.Memory_Barrier_By_Region
((By_Region => True, Texture_Fetch => True, others => False));
Using Memory_Barrier instead is not harmful, but Memory_Barrier_By_Region may provide better performance.
The following barriers can be inserted:
Name | Usable with By_Region | Usage |
---|---|---|
Uniform | Yes | UBOs |
Texture_Fetch | Yes | texture*() |
Shader_Image_Access | Yes | image*() |
Framebuffer | Yes | Textures attached to framebuffers |
Shader_Storage | Yes | SSBOs |
Element_Array | No | Buffers bound to Index |
Command | No | Buffers bound to Draw_Indirect or Dispatch_Indirect |
Pixel_Buffer | No | Buffers bound to Pixel_Pack or Pixel_Unpack |
Texture_Update | No | Textures |
Buffer_Update | No | Buffers and mapped buffers |
Query_Buffer | No | Buffers bound to Query |
Summary
The kind of barrier that is needed depends on how the buffer is subsequently read.