Heterogeneous Development at the DHPCC++ 2018 Workshop

Posted on May 10, 2018 by Michael Wong.

DHPCC++ is a workshop designed to cover all forms of heterogeneous and distributed computing for the C and C++ languages. It grew out of two successful SYCL workshops. While SYCL continues to be a key component of the workshop, its scope has been broadened in response to industry demand. SYCL papers are still presented at this workshop, but there are now also papers on HPX and HiHat, for example. To find out more about SYCL activities at IWOCL, please see our IWOCL 2018 blog.
There are several major workshops on heterogeneous computing:

  • P3MA
  • WACCPD
  • Repara
  • HeteroPar
  • DHPCC++

While P3MA is focused on Performance, Productivity and Portability (hence P cubed) at ISC for any language, WACCPD is aimed at directive-based heterogeneous computing through OpenMP and OpenACC at SC. Repara is a workshop within Euro-Par that also has elements of heterogeneous computing, with some focus on refactoring legacy application code to support the new paradigm. HeteroPar is also held in conjunction with Euro-Par and focuses on the general paradigm of heterogeneous programming algorithms, tools, and models; it is probably the longest running of these workshops, dating from 2001. DHPCC++ differs in that it covers both distributed and heterogeneous computing (hence the DH) and is squarely focused on adding them to the C and C++ languages only. DHPCC++ stands for Distributed and Heterogeneous Programming in C and C++, where the embedded acronym HPC is deliberate, as the workshop is inclusive of high performance computing. We started the workshop as a way to continue the research based on our experience developing such languages and frameworks. The list of these is long, including:

  • Boost.Compute
  • Khronos SYCL
  • HPX
  • Kokkos
  • Raja
  • UPC++
  • HCC
  • CUDA
  • HSA

After a very successful workshop paired with IWOCL 2017 in Toronto, with 30 attendees and over 12 papers, as well as a keynote by Paul Chow on FPGA simulators, DHPCC++ is moving into its second year at IWOCL in Oxford. The workshop was expanded in 2017 to include many of the other C/C++ frameworks that are striving towards the same goal of establishing viable heterogeneous and distributed computing within the C and C++ programming languages. We chose these languages because of the widespread expertise in adapting them to off-node computing and the desire to pursue standardization within ISO C and C++.

As we continue, we plan to link many of these workshops together through a common journal that brings together the best papers from all the conferences. Current plans include linking P3MA, WACCPD, and DHPCC++ this year for a special edition published by Springer, named "Performance Portable and Productive Computing."

DHPCC++ covers research and development on C and C++ for heterogeneous and distributed computing that extends beyond current ISO C and C++. It also covers up-to-date ISO C and C++ development in this area. The ISO languages have only recently added parallelism in C++11 while C11 has only published a recent Parallel Technical Specification. While both standards are open to adding heterogeneous programming and distributed computing support, that work is likely to take several years, landing in 2020 or 2023. In the meantime, the demand to support off-node devices has driven a proliferation of frameworks for C and C++. Some even dream of replacing MPI and OpenMP, but that is unlikely, as these traditional frameworks still have long-standing use cases in non-C/C++ languages.

At DHPCC++, there is a combination of short and long papers as well as abstract-only presentations. At the inaugural DHPCC++ workshop at IWOCL 2017, there were 10 talks/papers and a keynote on FPGA simulators by Paul Chow of the University of Toronto. The workshop was well attended, including many FPGA participants from Altera, based in Toronto. The day ended with a stimulating panel discussion on the future direction of heterogeneous programming with C and C++.

At this year's DHPCC++, the keynote will be given by Dr. John Wickerson of Imperial College, discussing GPU memory models. As we move beyond the C++ memory model for parallel CPU systems, a heterogeneous memory model will be paramount for these languages as they move towards standardization. OpenCL and SYCL are examples of such heterogeneous references. Dr. Wickerson works with Professor Donaldson on formal verification models and has a distinguished list of publications in this area.

To further entice the mind, we will also have an invited talk by Dr. Hal Finkel of Argonne National Lab. Dr. Finkel is the Vice Chair of PL22.16, the US INCITS C++ Committee, and is the Conference Chair for the SC LLVM HPC Workshop. His talk will connect how US National Labs are responding to the Exascale challenge with heterogeneous computing and compiler optimization. He will look at programming models in HPC at US National Labs, today and in the future, and at how we can continue to achieve high performance while maintaining high productivity. He will also talk about how those changing trends may be influenced by trends in the C++ ecosystem and by parallelism-aware compiler-optimization technology.

There will also be six refereed, accepted talks and papers, with plans for a similar panel to stimulate discussion. The papers range from a discussion of how HPX, a popular C++ framework for distributed computing, handles computations in a large stellar astrophysics application, to a report from the High Performance Computing BoF discussion at SC17, and an update on the current status of heterogeneous and distributed computing in ISO C++.

SYCL itself has grown by releasing v1.2.1 in late 2017 after nearly two and a half years. The SYCL 1.2.1 specification improves on the existing 1.2 standard by introducing new features that allow better integration with existing machine learning and OpenCL-based frameworks such as TensorFlow, along with various other improvements based on user feedback. SYCL 1.2.1 includes new asynchronous data movement routines, which give users more control over when data is moved between devices; placeholder accessors, which make data access more flexible; a new extensible property-based system for customizing the behavior of SYCL objects; and much more. The Khronos Group has also been working closely with ISO C++ to ensure closer integration between SYCL and upcoming C++ standards.

In the latest development, SYCL has updated its Statement of Work to align closely with parallel ISO C++ development. We are using the experience gained from our implementation of SYCL to support heterogeneous computing in future ISO C++ by following the C++ Future Directions paper, co-authored by the chair of SYCL and other ISO C++ Senior Directors. The Khronos SYCL group has also gone through a period of open discussion between members to solidify the future direction of SYCL and will begin working on new features for the next version of SYCL. This will be accompanied by more regular releases of SYCL based on the bus-train model, with a release cadence of approximately 1.5 years. This cadence will enable our many users (downloads now number in the thousands) to use the latest features of modern ISO C++ as it evolves along with SYCL.

For people who are firmly interested in combining modern C++ with heterogeneous/distributed computing, this workshop's laser focus on these combined areas sets it apart from the IWOCL OpenCL content. Attendees will learn about the latest research directions in this domain from several popular frameworks, as well as the latest directions ISO C and C++ are taking in this area.

There is almost no doubt that some form of heterogeneous and distributed support will be added to C++. The latest C++ "Future Direction" paper identifies this as one of the aspects that is desired for ISO C++. The picture for C is less clear, as it will only be adding parallelism in the next C2x release, with heterogeneous and distributed support likely to be added further in the future. C tends to take some of its future directions from C++ after the design has been firmly established.

Even after this support appears in ISO standards, the conference and its group of languages/frameworks will likely continue as a reference for continued research, adapting as new hardware is delivered.

Register for DHPCC++ on the IWOCL website