To recap, unfortunately, ROOT PCH is not extendable; ROOTMAP requires a lot of
maintenance and exercises a poorly tested code path, while RDICT has a very limited
scope. The three features require a lot of mechanisms to work together, and there
are many corner cases. The interactions between some of the features often
break the design and introduce layering violations.
## From C++ Modules to Dictionaries
C++ Modules have native capabilities to avoid reparsing. They combine the
capabilities of all the home-grown solutions for avoiding this costly operation,
at industry quality.
Currently, when ROOT is built with `-Druntime_cxxmodules=On`, it gives priority to
C++ Module files (real *pcm* files). If such a file is present, ROOT reads all
necessary information from it. If no such file is present, ROOT falls back to the
standard information flow.
### Observable differences from 'standard' ROOT
As always, ROOT is (mostly) API and ABI compatible. C++ Modules-aware ROOT is no
different in that respect. Still, several differences can be noticed:
* \*modulemap files in $ROOTSYS/include -- those files are used by rootcling to
put a set of header files in a single pcm file. For example, all related
headers of *libGeom* are persisted in *Geom.pcm*. There are a few notable
examples, which are specific to the way we build ROOT. In certain cases we
want some header files to be compiled within C context or with RTTI on/off.
That's mostly for bootstrapping ROOT (aka rootcling stage1).
* modulemap.overlay.yaml -- automatically created virtual filesystem overlay
file. This file introduces C++ Modules for external dependencies.
For example, to 'modularize' glibc for ROOT we would need to place a modulemap
file in (usually) `/usr/include`. This folder is not writable on many
platforms. The vfs file tells the compiler to pretend there is a file at a
specific location. This way we 'mount' `/usr/include/module.modulemap`
non-invasively. The reasons why we need to extend the C++ Modules support
beyond ROOT are described below.
* rootcling creates a new binary artifact *Name.pcm*, named after the library --
this is a temporary solution for the current technology preview. Once the
implementation advances further, we will create only Name.pcm, without the
other two artifacts. At a final stage, ROOT might be able to integrate
Name.pcm into the shared library itself.
* Preloads all \*pcm files at startup -- this is currently the only
remaining bottleneck. It introduces a relatively small performance overhead
at startup and is described below. It will be negligible for third-party
code (dominated by header parsing).
* Improved correctness in a number of cases -- in a few cases ROOT is more
correct, in particular when resolving global variables and function
declarations which are not part of the ROOT PCH.
* Enhanced symbol resolution mechanisms, bloom filters -- standard ROOT relies
on information in ROOTMAP files to react when the llvm JIT issues an
unresolved symbol callback. C++ Modules-aware ROOT relies on a behavior much
closer to the standard linker behavior. In particular, we start searching on
the LD_LIBRARY_PATH descending to the system libraries. The algorithm is very
efficient because it uses bloom filters[[5]]. This in turn allows ROOT symbol
resolution to be extended to system libraries.
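
To make the role of the bloom filter in the last item more concrete, here is a minimal sketch of how such a filter can rule out libraries before an expensive symbol-table lookup. It is not ROOT's implementation: the class name `SymbolBloomFilter`, the filter size, the number of probes and the double-hashing scheme are all assumptions chosen for illustration.

```cpp
#include <bitset>
#include <cstddef>
#include <functional>
#include <string>

// Hypothetical illustration of a bloom filter over symbol names.
// This is a sketch, not the data structure ROOT actually uses.
class SymbolBloomFilter {
public:
  // Record a symbol exported by the library this filter describes.
  void insert(const std::string &symbol) {
    for (std::size_t i = 0; i < kNumHashes; ++i)
      bits_.set(index(symbol, i));
  }

  // false: the symbol is definitely absent, so the library can be skipped.
  // true:  the symbol may be present, so the real symbol table must be checked.
  bool mightContain(const std::string &symbol) const {
    for (std::size_t i = 0; i < kNumHashes; ++i)
      if (!bits_.test(index(symbol, i)))
        return false;
    return true;
  }

private:
  static constexpr std::size_t kNumBits = 1 << 16; // assumed filter size
  static constexpr std::size_t kNumHashes = 3;     // assumed number of probes

  // Derive the i-th probe position by double hashing: h1 + i * h2.
  static std::size_t index(const std::string &symbol, std::size_t i) {
    const std::size_t h1 = std::hash<std::string>{}(symbol);
    const std::size_t h2 = std::hash<std::string>{}(symbol + "#") | 1;
    return (h1 + i * h2) % kNumBits;
  }

  std::bitset<kNumBits> bits_;
};
```

In such a scheme one filter per library would be filled with that library's exported symbols. On an unresolved symbol, only the libraries whose filter answers `true` need to be opened. A `true` answer can be a false positive, while a `false` answer is always correct.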
### Supported Platforms
We support all platforms with glibc++ versions 5.2, 6.2, 7.2 and 8.
## Changes required by the users
* Self-contained header files -- every header file should compile
on its own. For instance, `gcc -fsyntax-only -xc++ header.h` should succeed.
* Enable it in `rootcling` -- rootcling can produce a C++ Modules-aware
dictionary when it is invoked with the `-cxxmodule` flag.
* Modularization of external dependencies -- if a header file is not explicitly
nominated as part of a module and it is transitively included in two modules,
both modules contain that header file content. In other words, the header is
duplicated. In turn, this leads to performance regressions. If a dictionary
depends on a header (directly or indirectly) from an external library (e.g.
libxml), that library needs to be modularized. As part of our ongoing efforts to move
CMSSW to use C++ Modules [[6]] we have implemented a helper tool [[7]]. The
tool detects (based on the include paths of the compiler) dependencies and
tries to generate the relevant vfs file.
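
As an illustration of the first requirement in the list above, the following sketch shows what a self-contained header might look like. The file name `Track.h`, the namespace `demo` and the class `Track` are hypothetical and exist only for this example; the point is that the header includes everything it uses rather than relying on the including file.

```cpp
// Track.h -- hypothetical header, used only to illustrate self-containment.
#ifndef TRACK_H
#define TRACK_H

// Include every header whose declarations are used below, instead of relying
// on the including file to have pulled them in first.
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

namespace demo { // hypothetical namespace

class Track {
public:
  explicit Track(std::string name) : name_(std::move(name)) {}

  // Store one hit position along the track.
  void addHit(double position) { hits_.push_back(position); }

  std::size_t numHits() const { return hits_.size(); }
  const std::string &name() const { return name_; }

private:
  std::string name_;
  std::vector<double> hits_;
};

} // namespace demo

#endif // TRACK_H
```

If the `#include <vector>` line were removed, the header would compile only when some earlier include happened to provide `std::vector`, which is exactly the situation the syntax-only check is meant to catch.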
## State of the union
C++ Modules-aware ROOT preloads all modules at start up time. Our motivating
example:
```cpp
// ROOT prompt
root [] S *s;           // #1: does not require a definition.
root [] foo::bar *baz1; // #2: does not require a definition.
root [] foo::bar baz2;  // #3: requires a definition.
```
becomes equivalent to
```cpp
// ROOT prompt
root [] import ROOT.*;
root [] import Foo.*;
root [] S *s;           // #1: does not require a definition.
root [] foo::bar *baz1; // #2: does not require a definition.
root [] foo::bar baz2;  // #3: requires a definition.
```
The implementation avoids recursive actions and relies on a well-defined (by
the C++ standard) behavior. Currently, this comes with a constant performance
overhead, which we discuss in detail below.
### Current limitations
* Incremental builds -- building ROOT, modifying the source code and rebuilding
might not work. To work around it, remove all pcm files in the $ROOTSYS/lib
folder.
* Relocatability issues -- we have fixed a few of the relocatability issues we
found. We are aware of an obscure relocatability issue when ROOT is copied to
another folder and rebuilt. ROOT picks up both modulemap files in
seemingly distinct locations.
* Building pcms with rootcling -- in rare cases there might be issues when
building pcm files with rootcling. The easiest course would be to open a bug
report against clang; however, reproducing a failure outside of rootcling is
very difficult at the moment.
### Performance
This section compares the ROOT PCH technology with C++ Modules, which is an
important but unfair comparison. As we noted earlier, PCH is very efficient, but
it cannot be extended to the experiments’ software stacks because of its design
constraints. In contrast, C++ Modules can be used in third-party code where the
PCH is not available.

The comparisons are meant to provide a good metric for deciding when we are ready
to switch ROOT to use C++ Modules by default. However, since it is essentially
the same technology, optimizations of C++ Modules also affect the PCH. We have a
few tricks up our sleeves, but they come with trade-offs. For example, we can
avoid preloading all modules at the cost of introducing recursive behavior in
loading. This requires building a global module index, which is an on-disk
hash table. It will contain the mapping between an identifier and a module
name. Upon a failed identifier lookup we will use the map to decide which set
of modules should be loaded. Another optimization is building some of the
modules without `-fmodules-local-submodule-visibility`. In turn, this would
flatten the C++ Modules structure and give us performance comparable to the
ROOT PCH. The trade-off is that we will decrease the encapsulation and leak
information about implementation-specific header files.
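
The lookup-driven loading enabled by a global module index can be sketched as follows. This is an in-memory stand-in for the on-disk hash table, written only to illustrate the idea; the class `GlobalModuleIndex` and its member names are assumptions and do not correspond to an existing ROOT or clang interface.

```cpp
#include <set>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical in-memory stand-in for the on-disk identifier-to-module table.
class GlobalModuleIndex {
public:
  // Record that `module` declares `identifier`.
  void add(const std::string &identifier, const std::string &module) {
    index_[identifier].insert(module);
  }

  // Called after a failed identifier lookup. Returns the modules that declare
  // the identifier and have not been loaded yet, and marks them as loaded so
  // that a later lookup does not request them again.
  std::vector<std::string> modulesToLoad(const std::string &identifier) {
    std::vector<std::string> result;
    const auto it = index_.find(identifier);
    if (it == index_.end())
      return result; // Unknown identifier: nothing to load.
    for (const std::string &module : it->second)
      if (loaded_.insert(module).second)
        result.push_back(module);
    return result;
  }

private:
  std::unordered_map<std::string, std::set<std::string>> index_;
  std::set<std::string> loaded_;
};
```

With such an index, a failed lookup of an identifier triggers loading of only the modules that declare it, instead of all modules being preloaded at startup. The cost is the recursive loading behavior mentioned above, since loading happens in the middle of a lookup.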
Due to time constraints, performance was not the main focus of this technology
preview. We have invested some resources in optimizations and
we would like to show you (probably outdated) preliminary performance
results:
* Memory footprint -- mostly due to importing all C++ Modules at startup,
we see an overhead which depends on the number of preloaded modules. For
ROOT it is between 40 and 60 MB, depending on the concrete configuration.
When the workload increases we notice that the overall memory performance
decreases in a number of cases.
* Execution times -- likewise, we have an execution overhead. For
workflows that take milliseconds the slowdown can be 2x. Increasing the work
to seconds shows 50-60% slowdowns.

The performance of the technology preview depends on many factors, such
as the configuration of ROOT and the workflow. You can read more in our Intel
IPCC-ROOT Showcase presentation (pp. 25-33)[[8]].
You can visit our continuous performance monitoring tool where we compare
the performance of the technology preview with respect to 'standard' ROOT[[9]].
*Note: if you get error 400, clean your cache or open a private browser session.*
## How to use
Compile ROOT with `-Druntime_cxxmodules=On`. Enjoy.
# Acknowledgement
We would like to thank the ROOT team.
We would like to thank Liz Sexton-Kennedy (FNAL) in particular for supporting
this project.
We would like to thank Axel Naumann for early feedback on this document.
This work has been supported by an Intel Parallel Computing Center grant, by U.S.
National Science Foundation grants PHY-1450377 and PHY-1624356, and by the U.S.
Department of Energy, Office of Science.
# References
(1): [Vassilev, V., 2017, October. Optimizing ROOT's Performance Using C++ Modules. In Journal of Physics: Conference Series (Vol. 898, No. 7, p. 072023). IOP Publishing.][1]
(2): [Clang Modules, Official Documentation][2]
(3): [Manuel Klimek, Deploying C++ Modules to 100s of Millions of Lines of Code, 2016, CppCon][3]
(4): [Precompiled Header and Modules Internals, Official Documentation][4]
(5): [Bloom Filter][5]
(6): [C++ Modules support (based on Clang), GitHub Repo][6]
[1]: https://www.researchgate.net/profile/Vassil_Vassilev3/publication/319717664_Optimizing_ROOT%27s_Performance_Using_C_Modules/links/59bad690aca272aff2d01c1c/Optimizing-ROOTs-Performance-Using-C-Modules.pdf "Vassilev, V., 2017, October. Optimizing ROOT’s Performance Using C++ Modules. In Journal of Physics: Conference Series (Vol. 898, No. 7, p. 072023). IOP Publishing."
[3]: https://cppcon2016.sched.com/event/7nM2/deploying-c-modules-to-100s-of-millions-of-lines-of-code "Deploying C++ Modules to 100s of Millions of Lines of Code"
[4]: https://clang.llvm.org/docs/PCHInternals.html "Precompiled Header and Modules Internals"