Array programming

The easiest way to use the GPU's massive parallelism, is by expressing operations in terms of arrays: Metal.jl provides an array type, MtlArray, and many specialized array operations that execute efficiently on the GPU hardware. In this section, we will briefly demonstrate use of the MtlArray type. Since we expose Metal's functionality by implementing existing Julia interfaces on the MtlArray type, you should refer to the upstream Julia documentation for more information on these operations.

If you encounter missing functionality, or are running into operations that trigger so-called "scalar iteration", have a look at the issue tracker and file a new issue if there's none. Do note that you can always access the underlying Metal APIs by calling into the relevant submodule.

Construction and Initialization

The MtlArray type aims to implement the AbstractArray interface, and provide implementations of methods that are commonly used when working with arrays. That means you can construct MtlArrays in the same way as regular Array objects:

julia> MtlArray{Int}(undef, 2)
2-element MtlVector{Int64, Metal.PrivateStorage}:
 0
 0

julia> MtlArray{Int}(undef, (1,2))
1×2 MtlMatrix{Int64, Metal.PrivateStorage}:
 0  0

julia> similar(ans)
1×2 MtlMatrix{Int64, Metal.PrivateStorage}:
 0  0

Copying memory to or from the GPU can be expressed using constructors as well, or by calling copyto!:

julia> a = MtlArray([1,2])
2-element MtlVector{Int64, Metal.PrivateStorage}:
 1
 2

julia> b = Array(a)
2-element Vector{Int64}:
 1
 2

julia> copyto!(b, a)
2-element Vector{Int64}:
 1
 2

Higher-order abstractions

The real power of programming GPUs with arrays comes from Julia's higher-order array abstractions: Operations that take user code as an argument, and specialize execution on it. With these functions, you can often avoid having to write custom kernels. For example, to perform simple element-wise operations you can use map or broadcast:

julia> a = MtlArray{Float32}(undef, (1,2));

julia> a .= 5
1×2 MtlMatrix{Float32, Metal.PrivateStorage}:
 5.0  5.0

julia> map(sin, a)
1×2 MtlMatrix{Float32, Metal.PrivateStorage}:
 -0.958924  -0.958924

To reduce the dimensionality of arrays, Metal.jl implements the various flavours of (map)reduce(dim):

julia> a = Metal.ones(2,3)
2×3 MtlMatrix{Float32, Metal.PrivateStorage}:
 1.0  1.0  1.0
 1.0  1.0  1.0

julia> reduce(+, a)
6.0f0

julia> mapreduce(sin, *, a; dims=2)
2×1 MtlMatrix{Float32, Metal.PrivateStorage}:
 0.59582335
 0.59582335

julia> b = Metal.zeros(1)
1-element MtlVector{Float32, Metal.PrivateStorage}:
 0.0

julia> Base.mapreducedim!(identity, +, b, a)
1×1 MtlMatrix{Float32, Metal.PrivateStorage}:
 6.0

Random numbers

Base's convenience functions for generating random numbers are available in Metal as well:

julia> Metal.rand(2)
2-element MtlVector{Float32, Metal.PrivateStorage}:
 0.89025915
 0.8946847

julia> Metal.randn(Float32, 2, 1)
2×1 MtlMatrix{Float32, Metal.PrivateStorage}:
 1.2279074
 1.2518331

Behind the scenes, these random numbers come from two different generators: one backed by Metal Performance Shaders, another by using the GPUArrays.jl random methods. Operations on these generators are implemented using methods from the Random standard library:

julia> using Random, GPUArrays

julia> a = Random.rand(MPS.default_rng(), Float32, 1)
1-element MtlVector{Float32, Metal.PrivateStorage}:
 0.89025915

julia> a = Random.rand!(GPUArrays.default_rng(MtlArray), a)
1-element MtlVector{Float32, Metal.PrivateStorage}:
 0.0705002
Note

MPSMatrixRandom functionality requires Metal.jl >= v1.4

Warning

Random.rand!(::MPS.RNG, args...) and Random.randn!(::MPS.RNG, args...) have a framework limitation that requires the byte offset and byte size of the destination array to be a multiple of 4.