# Alternatives to C++ Function Pointers in SYCL using Function Objects

## 24 September 2019

Function pointers are a feature of the C language and so form part of the C++ standard. As such, a function pointer allows the following behavior:

- "A `pointer` to a `function` can be passed as a `parameter` to another function"

In C++, especially in modern C++, function pointers are a legacy feature from the C language but they still exist in some code bases.

At Codeplay we develop ComputeCpp, an implementation of the SYCL standard. SYCL (pronounced ‘sickle’) is a royalty-free, cross-platform abstraction layer that enables code for heterogeneous processors to be written in a “single-source” style using completely standard C++. SYCL enables single-source development where template functions can contain both host and device code to construct complex algorithms that use acceleration. However, SYCL does not support function pointers, since this is a limitation imposed by the design of OpenCL v1.2, which is the basis of the current SYCL v1.2.1 definition.

But there is good news: we can use modern C++ to implement a solution that works with SYCL. SYCL is built on C++11 (and later, depending on the implementation), meaning features like anonymous functions, known as "lambdas", can be used with little to no overhead. Even going back to C++98/03, it is possible to use function objects defined as either structs or classes. Additionally, you can template your operation (the computation logic) to provide a generic way to consume the function objects or lambdas.

Although function pointers are a legacy C/C++ approach, their main benefit is fairly obvious: they provide a straightforward mechanism for choosing a function to execute at run-time.

Here is a short example using a function pointer to perform binary operations such as `addition`, `subtraction`, `multiplication` and `division`.

First we define the operations:

```
template<typename T>
auto add(T left, T right) -> T { return left + right; }
template<typename T>
auto subtract(T left, T right) -> T { return left - right; }
template<typename T>
auto multiply(T left, T right) -> T { return left * right; }
template<typename T>
auto divide(T left, T right) -> T { return left / right; }
```

Now we can define the function that will compute the result based on the provided operation via the function pointer parameter:

```
template<typename T>
auto calculate(T left, T right, T (*binary_op)(T, T)) -> T {
  return (*binary_op)(left, right);
}
// usage: calculate(6.0, 3.5, add<double>);
```

On the `return (*binary_op)(left, right);` call, we are dereferencing the function pointer, which allows us to execute the code in the memory block it points to.
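To make the run-time choice concrete, here is a minimal host-only sketch (no SYCL involved). The operations are restated so the block compiles on its own; `binary_fn`, `select_op` and `function_pointer_demo` are hypothetical names introduced only for this illustration.

```cpp
#include <cassert>

template <typename T>
auto add(T left, T right) -> T { return left + right; }
template <typename T>
auto subtract(T left, T right) -> T { return left - right; }

// Calls whichever function the pointer refers to.
template <typename T>
auto calculate(T left, T right, T (*binary_op)(T, T)) -> T {
  return (*binary_op)(left, right);
}

// Hypothetical helper: maps a symbol known only at run-time to an operation.
using binary_fn = double (*)(double, double);
auto select_op(char symbol) -> binary_fn {
  if (symbol == '-') {
    return &subtract<double>;
  }
  return &add<double>;
}

// Hypothetical demo: the callee is decided while the program runs.
inline void function_pointer_demo() {
  binary_fn op = select_op('-');
  assert(calculate(6.0, 3.5, op) == 2.5);
  assert(calculate(6.0, 3.5, select_op('+')) == 9.5);
}
```

Because the callee is only known once `select_op` has run, the compiler cannot in general resolve the call statically, and that is the property that causes trouble for device code.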

If we try to call our `calculate` function, or call any of the binary operators (`add`, `subtract`, `multiply`, `divide`) through a function pointer, inside a SYCL kernel, we will get an error from the SYCL device compiler indicating that the C++ construct being used cannot be converted into valid OpenCL code.

Therefore, we need an alternative approach: changing the binary operations (our algebraic functions) into function objects instead of global functions. Additionally, function objects can encapsulate logic and state within a unique type.

To produce valid C++ code that can be resolved at compile time, we define a template object type such as a `struct` or `class` and overload the function call operator, `operator()`, to carry out the computation. Because C++ allows `operator()` to be overloaded, an object instantiated from such a class can be invoked like a function.

Let's modify the calculate function to take a function object as an argument instead of a function pointer.

There is a slight difference in the definition: the type of the function object is passed as a template parameter. It is in fact easier to understand, as no pointer dereferencing is involved.

I actually think this makes things clearer and more readable:

```
template <typename T, class Operation>
auto calculate(T left, T right, Operation binary_op) -> T {
  return binary_op(left, right);
}
// usage: calculate(6.0, 3.5, add<double>{});
```

On the `return binary_op(left, right);` call, we are just calling the overloaded `operator()` of the function object.

Next let's define the function objects:

```
template<typename T>
struct add {
  auto operator()(T left, T right) -> T {
    return left + right;
  }
};
...
```

The rest of the function objects are defined in the same way with their specific logic inside the overloaded `operator()`.
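For reference, the four operations written out as function objects might look like the following host-only sketch. It follows the pattern of `add` above; `function_object_demo` is a hypothetical name added only to show the calls.

```cpp
#include <cassert>

template <typename T>
struct add {
  auto operator()(T left, T right) -> T { return left + right; }
};
template <typename T>
struct subtract {
  auto operator()(T left, T right) -> T { return left - right; }
};
template <typename T>
struct multiply {
  auto operator()(T left, T right) -> T { return left * right; }
};
template <typename T>
struct divide {
  auto operator()(T left, T right) -> T { return left / right; }
};

// The operation's type is a template parameter, so the call target is
// known when the template is instantiated.
template <typename T, class Operation>
auto calculate(T left, T right, Operation binary_op) -> T {
  return binary_op(left, right);
}

// Hypothetical demo of the calls.
inline void function_object_demo() {
  assert(calculate(6.0, 3.5, add<double>{}) == 9.5);
  assert(calculate(6.0, 3.5, subtract<double>{}) == 2.5);
  assert(calculate(6.0, 3.5, multiply<double>{}) == 21.0);
  assert(calculate(7, 2, divide<int>{}) == 3);  // integer division
}
```

Each call instantiates `calculate` for one concrete `Operation` type, so no indirection remains by the time the code is compiled.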

Although this may add a small amount of complexity, it provides a good alternative to using
function pointers in SYCL kernels and offers some additional benefits. The main benefit is that
they are *objects* and hence can contain state, either statically across all instances
of the function objects or individually on a particular instance.
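As a small illustration of per-instance state, the sketch below uses a hypothetical `add_with_offset` functor (not part of the original example) whose member is set at construction and used on every call. It is host-only code, with `calculate` restated so the block is self-contained.

```cpp
#include <cassert>

// Hypothetical functor: carries an offset as per-instance state.
template <typename T>
struct add_with_offset {
  T offset;  // set when the object is created
  auto operator()(T left, T right) -> T { return left + right + offset; }
};

template <typename T, class Operation>
auto calculate(T left, T right, Operation binary_op) -> T {
  return binary_op(left, right);
}

// Hypothetical demo: two instances of the same type behave differently.
inline void stateful_functor_demo() {
  assert(calculate(6.0, 3.5, add_with_offset<double>{0.5}) == 10.0);
  assert(calculate(6, 3, add_with_offset<int>{-9}) == 0);
}
```

A plain function pointer has no equivalent of `offset`; the extra value would have to be passed as another argument.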

Now that this is clear let's walk through some examples.

## Example Implementation for SYCL Code Using Function Objects

The following code is intended to offer a *generic* program that generates sequences of any type.
It is very simplified for the purpose of showcasing an alternative to function pointers by
using function objects.

We can define our abstract SYCL kernel functor, where the real
logic is not implemented as a part of this class.

```
template <typename T, class Operation>
class generator_kernel {
  // Accessor settings (assumes the cl::sycl namespace is in scope).
  static constexpr auto read_mode = access::mode::read;
  static constexpr auto write_mode = access::mode::write;
  static constexpr auto target = access::target::global_buffer;

 public:
  generator_kernel(accessor<T, 1, read_mode, target> input_acc,
                   accessor<T, 1, write_mode, target> output_acc, Operation op)
      : m_input_acc(input_acc), m_output_acc(output_acc), m_op(op) {}

  // Runs once per work-item: applies the operation to one input element.
  void operator()(item<1> item) {
    auto id = item.get_linear_id();
    m_output_acc[id] = m_op(m_input_acc[id]);
  }

 private:
  accessor<T, 1, read_mode, target> m_input_acc;
  accessor<T, 1, write_mode, target> m_output_acc;
  Operation m_op;
};
```

On the first line of the listing, we declare our function object template parameter. I like to use `class` instead of `typename` in this case because it communicates better that `Operation` is going to be an actual `class` (or `struct`).

The kernel becomes rather simple, as all of the computational logic is now defined by the function object `op`, which is passed in later when the kernel is invoked in the application.

Declaring the operation as a template parameter allows us to pass a functor with an overloaded `operator()`, which behaves similarly to a function pointer.

Here is the function we will call to generate a sequence:

```
template <typename T, class Operation>
void generate_seq(const std::vector<T>& input, std::vector<T>& output,
                  size_t num_elements, Operation op) {
  << setup buffers >>
  queue.submit([&](handler& cgh) {
    << setup accessors >>
    auto kernel = generator_kernel<T, Operation>(input_acc, output_acc, op);
    cgh.parallel_for(range<1>(num_elements), kernel);
  });
}
```

The `input` array in the function parameters is used to store the `index` values of the sequence that is to be generated.

We instantiate the kernel function object `generator_kernel` with a template operation: `auto kernel = generator_kernel<T, Operation>(input_acc, output_acc, op);`.

Now we can create the function objects that we pass as operations.

**1.** A functor that generates an incremental sequence (it increments each element's value by 1).

```
template <typename T>
struct increment {
  auto operator()(int idx) -> T {
    T res = 0;
    for (auto i = 0; i < idx; i++) {
      res++;
    }
    return res;
  }
};
```

`int idx` is the work-item id or our index derived from the input array of index values.

We can also invoke functors from the body of other functors, thus we can re-use existing code to extend our functionality.

**2.** The following functor will increment each element in the array and multiply it by a scalar, demonstrating a more specialized sequence based on our `input indexes`.

```
template <typename T, int scalar>
struct increment_and_multiply {
  auto operator()(int idx) -> T {
    T res = 0;
    res = increment<T>{}(idx);
    res *= scalar;
    return res;
  }
};
```
```

First, we declare another template parameter to tune the function object to accept a scalar value for the multiplication part.

In this functor we also make use of other function objects to avoid copy-pasting code by doing an inline instantiation of the `increment<T>` function object and invoking it through the overloaded function call operator.

**3.** Finally, as a more interesting and practical example, we can have a functor that will compute and return the `Fibonacci sequence`.

```
template <typename T>
struct fibonacci {
  auto operator()(int idx) -> T {
    T res = 0, a = 1, b = 1;
    if (idx <= 0) {
      return 0;
    }
    if (idx > 0 && idx < 3) {
      return 1;
    }
    for (auto i = 2; i < idx; i++) {
      res = a + b;
      a = b;
      b = res;
    }
    return res;
  }
};
```

Now it is time to use these functors. Below is how you pass them to the `generate_seq` function, where they will be called like normal functions via the `operator()`.

```
// 1 - use 'increment'
generate_seq(input, output, num_elements, increment<int>{});
// 2 - use 'increment multiply by scalar'
//     (the scalar is a template argument; 2 is only an example value)
generate_seq(input, output, num_elements, increment_and_multiply<int, 2>{});
// 3 - use 'fibonacci'
generate_seq(input, output, num_elements, fibonacci<int>{});
```
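To check what each functor produces without a SYCL device, the sketch below runs the same three functors through a hypothetical host-only stand-in, `generate_seq_host`, built on `std::transform`. It is not the SYCL `generate_seq` above: there is no queue, buffer or accessor, and the scalar value 3 is an arbitrary choice for the illustration. The functors are restated in compact form so the block compiles on its own.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

template <typename T>
struct increment {
  auto operator()(int idx) -> T {
    T res = 0;
    for (auto i = 0; i < idx; i++) { res++; }
    return res;
  }
};

template <typename T, int scalar>
struct increment_and_multiply {
  auto operator()(int idx) -> T { return increment<T>{}(idx) * scalar; }
};

template <typename T>
struct fibonacci {
  auto operator()(int idx) -> T {
    if (idx <= 0) { return 0; }
    if (idx < 3) { return 1; }
    T res = 0, a = 1, b = 1;
    for (auto i = 2; i < idx; i++) { res = a + b; a = b; b = res; }
    return res;
  }
};

// Hypothetical host-only stand-in: applies `op` to every input index.
template <typename T, class Operation>
auto generate_seq_host(const std::vector<T>& input, Operation op)
    -> std::vector<T> {
  std::vector<T> output(input.size());
  std::transform(input.begin(), input.end(), output.begin(), op);
  return output;
}

// Hypothetical demo with index values 0..5.
inline void host_sequence_demo() {
  const std::vector<int> input{0, 1, 2, 3, 4, 5};
  assert((generate_seq_host(input, increment<int>{}) ==
          std::vector<int>{0, 1, 2, 3, 4, 5}));
  assert((generate_seq_host(input, increment_and_multiply<int, 3>{}) ==
          std::vector<int>{0, 3, 6, 9, 12, 15}));
  assert((generate_seq_host(input, fibonacci<int>{}) ==
          std::vector<int>{0, 1, 1, 2, 3, 5}));
}
```

The functors themselves are unchanged between the host stand-in and the kernel; only the code that drives them differs.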

Since we cannot use function pointers in SYCL kernels, function objects can be used as a standard alternative to them, and we also gain the other benefits that come with function objects. They are flexible, work well with templates and integrate seamlessly within OOP application designs, allowing you to take advantage of `object state`, `composition` and even `compile-time polymorphism` (if you are interested you can also read my blog post on using polymorphism in SYCL).

## Another alternative to function pointers - Generic Lambdas

We mentioned lambdas as an alternative to function pointers in the first section. They were introduced in
C++11 and improved in C++14, which added generic lambdas.

If you are interested
in using generic lambdas you can read my
blog post that explains how
to use them in SYCL device code with practical examples. The SYCL application example in that blog matches
the one in this post.

While SYCL v1.2.1 was defined with C++11 in mind,
ComputeCpp, our implementation of
SYCL, supports C++14 features and makes it possible to use generic lambdas. Other SYCL implementations
may not if they are limited to C++11.
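As a minimal host-side sketch of the idea (C++14 or later, no SYCL involved), a single generic lambda can be passed to the templated `calculate` from earlier in place of a function object. `generic_lambda_demo` is a hypothetical name used only for this illustration.

```cpp
#include <cassert>

template <typename T, class Operation>
auto calculate(T left, T right, Operation binary_op) -> T {
  return binary_op(left, right);
}

// Hypothetical demo: one lambda serves several argument types.
inline void generic_lambda_demo() {
  // `auto` parameters make the lambda's call operator a template (C++14).
  auto multiply = [](auto left, auto right) { return left * right; };
  assert(calculate(6, 3, multiply) == 18);
  assert(calculate(1.5, 2.0, multiply) == 3.0);
}
```

The lambda's closure type plays the same role as the `Operation` struct, so `calculate` needs no changes.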

### Another Possible Solution

There is a possibility that you really do need to call function pointers inside a SYCL kernel. This might be the case when adding SYCL support to an existing C++ code base that already makes use of function pointers and where they would be difficult to replace with the proposed solution.

ComputeCpp may be able to support this behavior; however, the body of the function you are pointing to must be resolvable at compile time. If you'd like to find out more, ask on our forum to discuss your problem. The solution may vary based on the specifics of your use case.