Julia has quickly become one of the main programming languages for computational sciences, mainly due to its speed and flexibility. Its speed and efficiency are also the main reasons why researchers in High Performance Computing (HPC) have started porting their applications to Julia.
Since calling C code from Julia incurs very little overhead, many efficient computational kernels can be integrated into Julia without any noticeable performance drop. For that reason, highly tuned libraries, such as the Intel MKL or OpenBLAS, allow Julia applications to achieve computational performance similar to that of their C counterparts. Yet, two questions remain: 1) How fast is Julia for memory-bound applications? 2) How efficiently can MPI functions be called from a Julia application?
In this paper, we assess the performance of Julia for HPC workloads. To that end, we examine the raw memory throughput achievable with Julia using a new Julia port of the well-known STREAM benchmark. We also compare the running times of the most commonly used MPI collective operations (e.g., MPI_Allreduce) with those of their C counterparts. Our analysis shows that the HPC performance of Julia is on par with C in the majority of cases.
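To illustrate what a STREAM-style throughput measurement involves, the following is a minimal C sketch of the benchmark's triad kernel, together with the byte accounting STREAM uses to turn a measured time into a bandwidth figure. It is not the Julia port evaluated in this paper: the function names `stream_triad`, `triad_bytes`, and `bandwidth_gbs` are hypothetical, and the timing loop, array initialization, and repetition logic of the real benchmark are omitted.

```c
#include <assert.h>
#include <stddef.h>

/* STREAM "triad" kernel: a[i] = b[i] + scalar * c[i].
 * Each iteration reads two doubles and writes one, so the loop is
 * limited by memory bandwidth rather than by arithmetic. */
void stream_triad(double *a, const double *b, const double *c,
                  double scalar, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        a[i] = b[i] + scalar * c[i];
    }
}

/* Bytes counted for one triad pass over n elements:
 * two arrays read plus one array written. */
double triad_bytes(size_t n)
{
    return 3.0 * (double)sizeof(double) * (double)n;
}

/* Convert bytes moved and elapsed wall-clock seconds into GB/s
 * (1 GB = 1e9 bytes). The caller supplies the measured time. */
double bandwidth_gbs(double bytes, double seconds)
{
    assert(seconds > 0.0);
    return bytes / seconds / 1e9;
}
```

In a full measurement, the arrays would be chosen much larger than the last-level cache, the kernel would be timed over several repetitions, and the best time would be passed to `bandwidth_gbs(triad_bytes(n), seconds)`.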