Parallel and asynchronous programming¶
Modern CPUs have multiple cores, and various parallel programming techniques should be employed to maximize hardware utilization. Besides Single Instruction Multiple Data (SIMD), which provides very fine-grained parallelism at the hardware level, a few techniques can be used to keep the cores of a CPU busy:
- Fork and join. Divide some of the work of a frame into n batches.
- Multiple tasks. One task for each subsystem.
- Jobs. Divide all the work up into small jobs, which are processed by a pool of n tasks.
In Orka, a job graph processing system provides flexible multitasking by allowing work to be split into multiple small jobs which are then processed by any available task from a task pool. Jobs can be processed in parallel (fork and join) as well as sequentially.
Jobs can be executed sequentially by setting some jobs as dependencies of others, forming a job graph. The execution status of a job graph can be tracked with a future.
Creating a system¶
A job graph processing system can be created by instantiating the Orka.Jobs.System package:
package Job_System is new Orka.Jobs.System
  (Maximum_Queued_Jobs => 16,
   Maximum_Job_Graphs  => 4);
When this generic package is instantiated, a number of worker tasks will be created. These tasks will try to dequeue and process jobs from a shared Queue. The number of worker tasks is stored in the constant Number_Of_Workers.
Instantiating the package requires some parameters to be provided:
- Maximum_Queued_Jobs: the maximum number of jobs that can be enqueued (the capacity of the queue).
- Maximum_Job_Graphs: the maximum number of separate job graphs that can be enqueued. For each job graph, a Future object is acquired. The number of job graphs that can be processed concurrently is bounded by this number.
Jobs can be enqueued by calling Queue.Enqueue. The worker tasks can be shut down by calling the procedure Shutdown.
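For example, at the end of the application the workers can be stopped like this:
-- Stop the worker tasks so the application can exit
Job_System.Shutdown;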
Jobs¶
A job is a piece of code that performs some work on some data. It is an instance of the limited interface Job. When a worker picks up a job, it calls the procedure Execute of the job.
A job can have zero or one dependent job (thus the job is a dependency of the other job). After a job has been executed by a worker, the worker will decrement the number of dependencies in its dependent job (if there is one). If the number of dependencies of that other job gets reduced to zero, then that job will be enqueued to the shared queue so that it can be processed later by a worker.
The dependent job of a job can be retrieved with the function Dependent. If a job has no dependent job, then a pointer to the Null_Job is returned. The function Has_Dependencies returns True if the job has one or more dependencies, that is, if it is the dependent job of another job, and False otherwise.
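A hypothetical sketch of calling these functions (the use of prefixed calls on a Job_Ptr here is an assumption):
declare
   -- Dependent returns a pointer to Null_Job if Job_1 has no dependent job
   Dependent_Job : constant Orka.Jobs.Job_Ptr := Job_1.Dependent;
begin
   if Job_2.Has_Dependencies then
      -- Job_2 is the dependent job of at least one other job
      null;
   end if;
end;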
Creating a job¶
A job should inherit from Abstract_Job, which implements the interface Job:
type Example_Job is new Orka.Jobs.Abstract_Job with record
   ...
end record;

overriding
procedure Execute
  (Object  : Example_Job;
   Context : Orka.Jobs.Execution_Context'Class);
Create an instance of the job and assign it to a variable of the type Job_Ptr:
Job_1 : Orka.Jobs.Job_Ptr := new Example_Job;
Job_1 can then be enqueued via the entry Queue.Enqueue.
When the job is executed, it can optionally enqueue new jobs via Context.Enqueue. These jobs will be enqueued and executed before the dependent job (if any) of the job.
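A minimal sketch of such an Execute body (the extra job and the assumption that Context.Enqueue takes a Job_Ptr are illustrative only):
overriding
procedure Execute
  (Object  : Example_Job;
   Context : Orka.Jobs.Execution_Context'Class)
is
   Extra_Job : constant Orka.Jobs.Job_Ptr := new Example_Job;
begin
   -- Perform the work of the job here

   -- Optionally enqueue another job; it will be executed before
   -- the dependent job (if any) of this job
   Context.Enqueue (Extra_Job);
end Execute;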
Graphs¶
Jobs can be connected to form a graph by setting some jobs as dependencies of other jobs. To create a chain of jobs such that each job is a dependency of the next job, call procedure Chain:
Orka.Jobs.Chain ((Job_1, Job_2, Job_3));
This will create the graph Job_1 → Job_2 → Job_3.
Afterwards, only Job_1 needs to be enqueued manually; the workers will follow the edges of the graph and enqueue the remaining jobs.
There are some alternative ways to set dependencies. To set one job as the dependency of another, call Set_Dependency:
Job_2.Set_Dependency (Job_1);
This will create the graph Job_1 → Job_2.
To set multiple jobs as dependencies, use procedure Set_Dependencies:
Job_3.Set_Dependencies ((Job_1, Job_2));
This will create a graph with the edges Job_1 → Job_3 and Job_2 → Job_3.
Dependencies and dependents
If a job graph has an edge Job_1 → Job_2, then Job_1 is a dependency of Job_2 and Job_2 is the dependent job of Job_1. A job can have 0 or 1 dependents and 0 to n dependencies.
Enqueueing jobs¶
For the system to execute a job graph, the jobs in the graph which have no dependencies need to be enqueued via Queue.Enqueue. Jobs are considered to be in the same graph if they share a smart pointer to an instance of the synchronized interface Future. If a smart pointer is null, it is set by Queue.Enqueue:
declare
   Future_Pointer : Orka.Futures.Pointers.Mutable_Pointer;
   -- Future_Pointer.Is_Null is True
begin
   Job_System.Queue.Enqueue (Job_1, Future_Pointer);
   -- Future_Pointer.Is_Null is False
   Job_System.Queue.Enqueue (Job_2, Future_Pointer);
   -- Future_Pointer.Is_Null is False
end;
Note that Job_1 and Job_2 must have the same root dependent job (the last job of the graph that gets executed); they must be part of one connected graph. Otherwise the graph will be considered complete as soon as one of its jobs without a dependent has been executed, which will cause the remaining jobs to skip execution.
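For example, giving both jobs a common root dependent job, like the Job_3 from the graphs section above, makes the graph connected:
-- Create the edges Job_1 → Job_3 and Job_2 → Job_3,
-- making Job_3 the root dependent job of the graph
Job_3.Set_Dependencies ((Job_1, Job_2));

-- Enqueue only the jobs that have no dependencies
Job_System.Queue.Enqueue (Job_1, Future_Pointer);
Job_System.Queue.Enqueue (Job_2, Future_Pointer);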
To get the current status of a future, call function Current_Status:
if Future_Pointer.Get.Current_Status = Orka.Futures.Done then
   -- All jobs in the job graph have been executed
   null;
end if;
Awaiting completion¶
Sometimes you need to wait until a job graph has finished before continuing. The entry Wait_Until_Done can be used to block until the status becomes Done or Failed:
declare
   Status : Orka.Futures.Status;
   Future : constant Orka.Futures.Future_Access := Future_Pointer.Get.Value;
begin
   select
      Future.Wait_Until_Done (Status);
      -- Status is Done
   or
      delay until Clock + Milliseconds (10);
      -- Execution of the graph did not complete within 10 ms
      -- Current status is Future.Current_Status
   end select;
end;
If one of the jobs in the graph raises an exception, Status becomes Failed and Wait_Until_Done will reraise this exception.
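A minimal sketch of handling such a failure (printing the exception via Ada.Exceptions is only an example and assumes the relevant packages are with'ed):
begin
   Future.Wait_Until_Done (Status);
exception
   when Error : others =>
      -- One of the jobs raised an exception, which Wait_Until_Done reraises
      Ada.Text_IO.Put_Line (Ada.Exceptions.Exception_Information (Error));
end;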
Parallel jobs¶
So far we have seen how jobs can be created and be made part of a job graph. Each of these jobs will operate on some data. Because each job is executed by a single worker, the data is processed sequentially in the job. However, the amount of data might be big enough that it should be split up so that it can be processed by multiple workers, each processing a slice of the data.
A job implementing the interface Parallel_Job can be made parallelizable with the function Parallelize. This function returns a pointer to a job which, when it is executed by a worker, will enqueue multiple instances of the job it parallelizes, each with a different range. These jobs will run in parallel if there are multiple workers.
Jobs wishing to implement the interface Parallel_Job should inherit from Abstract_Parallel_Job and override a version of Execute that additionally receives the range to process:
type Example_Parallel_Job is new Jobs.Abstract_Parallel_Job with record
   ...
end record;

overriding
procedure Execute
  (Object   : Example_Parallel_Job;
   Context  : Orka.Jobs.Execution_Context'Class;
   From, To : Positive);
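A minimal sketch of such an Execute body (what is done per element is an assumption):
overriding
procedure Execute
  (Object   : Example_Parallel_Job;
   Context  : Orka.Jobs.Execution_Context'Class;
   From, To : Positive) is
begin
   -- Process only the slice From .. To of the data; other instances
   -- of the job process the other slices in parallel
   for Index in From .. To loop
      null;  -- Do the actual work for element Index here
   end loop;
end Execute;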
A function to clone the job must be defined as well:
function Clone_Job
  (Job    : Orka.Jobs.Parallel_Job_Ptr;
   Length : Positive) return Orka.Jobs.Dependency_Array
is
   Object : constant Example_Parallel_Job := Example_Parallel_Job (Job.all);
begin
   return Result : constant Orka.Jobs.Dependency_Array (1 .. Length)
     := (others => new Example_Parallel_Job'(Object));
end Clone_Job;
An instance of this job can then be created and parallelized with the function Parallelize:
Job_1 : Jobs.Parallel_Job_Ptr := new Example_Parallel_Job;
Job_2 : Jobs.Job_Ptr := Jobs.Parallelize
  (Job_1, Clone_Job'Access, Length => 24, Slice => 6);
When Job_2 is executed, it will enqueue four instances of Job_1 with the ranges 1..6, 7..12, 13..18, and 19..24. These four jobs will be executed before the dependent job of Job_2 gets executed (if it has one).
GPU jobs¶
Some jobs may use subprograms from the packages Orka.Rendering or GL, and thus need to run in a task that holds the OpenGL context. A job can indicate that this is the case by implementing the empty interface GPU_Job:
type Render_Scene_Job is new Jobs.Abstract_Job and Jobs.GPU_Job with record
   ...
end record;
This interface does not require implementing any subprograms and is simply used as a marker so that the system knows that the job cannot be processed by any of the workers (which do not hold the OpenGL context).
In order to process these GPU jobs, the package Orka.Jobs.System actually provides two queues:
- CPU: Jobs from this queue are dequeued by the workers.
- GPU: Jobs from this queue are not dequeued by the workers, but can be dequeued by a special user-defined task.
Jobs are automatically enqueued in the correct queue by Queue.Enqueue depending on whether the job implements the interface GPU_Job.
Jobs from the GPU queue must be dequeued and executed in a task that
holds the OpenGL context. In this task, the following code will dequeue
and execute jobs from this queue:
Job_System.Execute_GPU_Jobs;
Note that this procedure blocks until the system is shut down.
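A sketch of such a task (the task name and how the OpenGL context is made current on it are assumptions):
task body Rendering_Task is
begin
   -- Assumption: the OpenGL context has already been made current
   -- on this task before any GPU job is executed

   -- Blocks until Job_System.Shutdown is called
   Job_System.Execute_GPU_Jobs;
end Rendering_Task;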