Julia

Introduction#

Julia is a high-level, general-purpose dynamic programming language originally designed for high-performance numerical analysis and computational science. It is fast without the usual need for a separate compilation step, and it is also usable for client and server web applications, low-level systems programming, or as a specification language. Julia aims to combine ease of use, power, and efficiency in a single language.

Julia on Sherlock#

Julia is available on Sherlock and the corresponding module can be loaded with:

$ ml julia

For a list of available versions, you can execute ml spider julia at the Sherlock prompt, or refer to the Software list page.

Using Julia#

Once your environment is configured (i.e. when the julia module is loaded), Julia can be started by typing julia at the shell prompt:

$ julia

               _
   _       _ _(_)_     |  Documentation: https://docs.julialang.org
  (_)     | (_) (_)    |
   _ _   _| |_  __ _   |  Type "?" for help, "]?" for Pkg help.
  | | | | | | |/ _` |  |
  | | |_| | | | (_| |  |  Version 1.0.0 (2018-08-08)
 _/ |\__'_|_|_|\__'_|  |  Official https://julialang.org/ release
|__/                   |

julia>

For a listing of command line options:

$ julia --help

julia [switches] -- [programfile] [args...]
 -v, --version             Display version information
 -h, --help                Print this message

 -J, --sysimage <file>     Start up with the given system image file
 -H, --home <dir>          Set location of `julia` executable
 --startup-file={yes|no}   Load `~/.julia/config/startup.jl`
 --handle-signals={yes|no} Enable or disable Julia's default signal handlers
 --sysimage-native-code={yes|no}
                           Use native code from system image if available
 --compiled-modules={yes|no}
                           Enable or disable incremental precompilation of modules

 -e, --eval <expr>         Evaluate <expr>
 -E, --print <expr>        Evaluate <expr> and display the result
 -L, --load <file>         Load <file> immediately on all processors

 -p, --procs {N|auto}      Integer value N launches N additional local worker processes
                           "auto" launches as many workers as the number
                           of local CPU threads (logical cores)
 --machine-file <file>     Run processes on hosts listed in <file>

 -i                        Interactive mode; REPL runs and isinteractive() is true
 -q, --quiet               Quiet startup: no banner, suppress REPL warnings

Running a Julia script#

A Julia program is easy to run on the command line outside of its interactive mode.

Here is an example where we create a simple Hello World program and run it with Julia:

$ echo 'println("hello world")' > helloworld.jl

That script can now simply be executed by calling julia <script_name>:

$ julia helloworld.jl
hello world
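
Scripts often take arguments. The usage line printed by julia --help (julia [switches] -- [programfile] [args...]) indicates that anything after the script name is passed to the program, and in Julia those values are available in the global ARGS vector. The following is a minimal sketch, assuming a hypothetical script named greet.jl; the greeting and main functions are illustrative names, not part of Julia:

```julia
# greet.jl (hypothetical file name): greet each name given on the command line.

# Build the greeting for one name; kept as a function so it is easy to reuse.
greeting(name::AbstractString) = "hello $(name)"

# `args` is expected to be the ARGS vector, which holds the command-line
# arguments that follow the script name.
function main(args)
    # Fall back to "world" when no argument is given.
    names = isempty(args) ? ["world"] : args
    for name in names
        println(greeting(name))
    end
end

main(ARGS)
```

Running julia greet.jl alice bob would print one greeting per name, and running it without arguments prints the same hello world line as the script above.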

Submitting a Julia job#

Here's an example batch script that runs a Julia program and can be submitted with sbatch:

julia_test.sbatch

#!/bin/bash

#SBATCH --time=00:10:00
#SBATCH --mem=4G
#SBATCH --output=julia_test.log

# load the module
ml julia

# run the Julia application
julia helloworld.jl

You can save this script as julia_test.sbatch and submit it to the scheduler with:

$ sbatch julia_test.sbatch

Once the job is done, you should get a julia_test.log file in the current directory, with the following contents:

$ cat julia_test.log
hello world
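
When a script runs inside a job, it can be useful to record which job and node produced a given log. The sketch below assumes the standard Slurm environment variables SLURM_JOB_ID and SLURM_CPUS_PER_TASK (the latter is only set when the job requests --cpus-per-task); the file name report_job.jl and the env_or and job_summary helpers are hypothetical names used for illustration:

```julia
# report_job.jl (hypothetical file name): print a summary of the job environment.

# Read an environment variable, returning `default` when it is not set,
# for example when the script runs outside of a Slurm job.
env_or(name::AbstractString, default::AbstractString) = get(ENV, name, default)

# Assemble a one-line description of the current job.
function job_summary()
    job_id = env_or("SLURM_JOB_ID", "none")
    cpus   = env_or("SLURM_CPUS_PER_TASK", "1")
    return "job=$(job_id) host=$(gethostname()) cpus=$(cpus)"
end

println(job_summary())
```

Replacing helloworld.jl with this script in the batch file above would write the summary line to julia_test.log instead.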

Julia packages#

Julia provides an ever-growing list of packages that add functionality to your Julia code.

Installing packages is straightforward: Julia's base installation includes a package manager module, Pkg, that handles installing, updating, and removing packages.

First import the Pkg module:

julia> import Pkg
julia> Pkg.status()
    Status `~/.julia/environments/v1.0/Project.toml`

Julia packages only need to be installed once

You only need to install Julia packages once on Sherlock. Since filesystems are shared, packages installed on one node are immediately available on all nodes of the cluster.
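
Packages are installed in the Julia depot, which by default sits under ~/.julia in your home directory (the Pkg.status() output above shows an environment under that path). The sketch below prints the depot location and checks whether a package can already be loaded. Note the assumptions: is_installed is a hypothetical helper name, and Base.find_package is a non-exported Julia function, so this is illustrative rather than a recommended interface:

```julia
# The first entry of DEPOT_PATH is where Pkg installs packages.
println("package depot: ", first(DEPOT_PATH))

# Hypothetical helper: true when `name` can be found in the current load path.
is_installed(name::String) = Base.find_package(name) !== nothing

# Pkg ships with Julia; Distributions is only found once it has been added.
for name in ["Pkg", "Distributions"]
    println(name, " installed: ", is_installed(name))
end
```

Because the depot lives on a shared filesystem, the same check gives the same answer on a login node and on a compute node.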

Installing packages#

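You can first check which packages are installed using the status function of the Pkg module. Note that the output shown in the examples below comes from an older Julia release, and the exact messages differ between versions: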

julia> Pkg.status()
No packages installed.

You can then add packages using the add function of the Pkg module:

julia> Pkg.add("Distributions")
INFO: Cloning cache of Distributions from git://github.com/JuliaStats/Distributions.jl.git
INFO: Cloning cache of NumericExtensions from git://github.com/lindahua/NumericExtensions.jl.git
INFO: Cloning cache of Stats from git://github.com/JuliaStats/Stats.jl.git
INFO: Installing Distributions v0.2.7
INFO: Installing NumericExtensions v0.2.17
INFO: Installing Stats v0.2.6
INFO: REQUIRE updated.

Using the status function again, you can see that the package and its dependencies have been installed:

julia> Pkg.status()
Required packages:
 - Distributions                 0.2.7
Additional packages:
 - NumericExtensions             0.2.17
 - Stats                         0.2.6

Updating packages#

The update function of the Pkg module updates all installed packages:

julia> Pkg.update()
INFO: Updating METADATA...
INFO: Computing changes...
INFO: Upgrading Distributions: v0.2.8 => v0.2.10
INFO: Upgrading Stats: v0.2.7 => v0.2.8

Removing packages#

The rm function of the Pkg module removes installed packages:

julia> Pkg.rm("Distributions")
INFO: Removing Distributions v0.2.7
INFO: Removing Stats v0.2.6
INFO: Removing NumericExtensions v0.2.17
INFO: REQUIRE updated.

julia> Pkg.status()
Required packages:
 - SHA                           0.3.2

julia> Pkg.rm("SHA")
INFO: Removing SHA v0.3.2
INFO: REQUIRE updated.

julia> Pkg.status()
No packages installed.

Examples#

Parallel job#

Julia can natively spawn parallel workers across multiple compute nodes, without using MPI. There are two main modes of operation:

1. ClusterManager: in this mode, you spawn workers from within the Julia interpreter, and each worker will actually submit jobs to the scheduler, executing instructions within those jobs.
2. Using the --machine-file option: here, you submit a SLURM job and run the Julia interpreter in parallel mode within the job's resources.

The second mode is easier to use and more convenient, since all your resources are available and ready to use when the job starts. In the first mode, you need to wait for jobs to be dispatched and executed from inside Julia.
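
Both modes rely on the same worker model from Julia's Distributed standard library, which can be tried on a single node first. The sketch below starts two local workers with addprocs (comparable to launching julia -p 2, which per the option list above starts additional local worker processes) and spreads a small computation over them. The worker count and the squaring function are arbitrary choices for illustration:

```julia
using Distributed

# Start two local worker processes on the current node.
addprocs(2)

# Every process, including the main one (id 1), reports where it runs.
@everywhere println("process: $(myid()) on host $(gethostname())")

# Distribute a small computation over the workers; pmap returns the
# results in the same order as the inputs.
squares = pmap(x -> x^2, 1:4)
println(squares)

# Shut the workers down once they are no longer needed.
rmprocs(workers())
```

Here squares holds [1, 4, 9, 16]. The order of the process lines may vary from run to run, since the workers print independently.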

Here is a quick example of how to use the --machine-file option on Sherlock.

Given the following Julia script (julia_parallel_test.jl), which prints a line with the process id and the node it's executing on, in parallel:

julia_parallel_test.jl

using Distributed
@everywhere println("process: $(myid()) on host $(gethostname())")

You can submit the following job: