CallFlow¶
CallFlow is an interactive visual analysis tool that provides a high-level overview of calling context trees (CCTs), together with semantic refinement operations to progressively explore them.
You can get CallFlow from its GitHub repository:
$ git clone https://github.com/LLNL/CallFlow.git
If you are new to CallFlow and want to start using it, see Getting Started, or refer to the full User Guide below.
Getting Started¶
Prerequisites¶
The callflow Python package requires Python (>= 3.6) and pip (>= 20.1.1). Its other dependencies are checked and installed automatically when callflow is installed via pip. These dependencies are:
- hatchet
- pandas
- networkx (2.2)
- numpy
- flask
- statsmodels, and
- scikit-learn (imported as sklearn)
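A quick way to confirm that the interpreter and pip meet the minimum versions stated above:

```shell
# Check the installed versions against the requirements
$ python3 --version   # should report 3.6 or newer
$ pip --version       # should report 20.1.1 or newer
```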
The callflow app (visualization component) requires node.js (>= 13.7.0) and npm (>= 6.13.7). If an older version of node is installed, install nvm and use the following command to switch versions.
$ nvm use 13.7.0
Installation¶
You can clone CallFlow from its GitHub repository using this command:
$ git clone https://github.com/LLNL/CallFlow.git
Install callflow python package¶
To install the callflow Python package, run the following command with pip from the repository root.
$ pip install .
To install in development mode,
$ pip install -e . --prefix=/path/to/install
Check Installation of callflow python package¶
After installing callflow, make sure you update your PYTHONPATH environment variable to point to the directory where callflow was installed.
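For example, if callflow was installed in dev mode with --prefix=/path/to/install as above, the corresponding site-packages directory must be on PYTHONPATH. A sketch, where the python3.7 path component is an assumption that depends on your interpreter version:

```shell
# Adjust the python3.7 component to match your interpreter version
$ export PYTHONPATH=/path/to/install/lib/python3.7/site-packages:$PYTHONPATH
```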
$ python
Python 3.7.7 (default, Jul 11 2019, 01:08:00)
[Clang 11.0.0 (clang-1100.0.33.17)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>>
Typing import callflow at the prompt should succeed without any error messages:
>>> import callflow
>>>
Install the Visualization client¶
$ cd app
$ npm install
Supported data formats¶
Currently, hatchet supports the following data formats as input:
- HPCToolkit database: This is generated by using hpcprof-mpi to post-process the raw measurements directory output by HPCToolkit.
- Caliper Cali file: This is the format in which Caliper outputs raw performance data by default.
- Caliper JSON-split file: This is generated either by running cali-query on the raw Caliper data or by enabling the mpireport service when using Caliper.
For more details on the different input file formats, refer to the User Guide.
User Guide¶
CallFlow is structured as three components:
- A Python package, callflow, that provides functionality to load and manipulate call graphs.
- A D3-based web app for visualization.
- A Python server to support the visualization client.
Arguments¶
--verbose - Display debug output.
(optional, default: false)
--config - Config file to be processed.
(Either config file or data directory must be provided)
--data_dir - Input directory to be processed.
(Either config file or data directory must be provided)
--process - Enable process mode.
(default: false)
--profile_format - Profile format.
(required, either hpctoolkit | caliper | caliper_json)
--save_path - Save path for the processed files.
(optional, default: data_dir/.callflow)
--filter_by - Set the column to filter by.
(optional, e.g., "time" or "time (inc)")
--filter_perc - Set filter percentage.
(optional, e.g., 10, 20, 30)
--group_by - Set the semantic level for the super graph.
(optional, e.g., module for a super graph, name for a call graph; default: 'module')
--read_parameter - Enable parameter analysis.
(optional. This is an experimental feature)
Process datasets¶
The first step is to process the raw datasets for use with CallFlow. Processing can be done either by passing a data directory (using --data_dir) or a config.callflow.json file (using --config).
- Using the directory (i.e., --data_dir).
The user can input a directory of profiles, all in the same format, for processing.
$ python3 server/main.py --data_dir /path/to/dataset --profile_format hpctoolkit --process
Note: The processing step typically entails some filtering and aggregation of the data to produce reduced graphs at the desired granularity. To control this, the user can currently provide three arguments, namely --filter_by, --filter_perc, and --group_by.
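For instance, the three arguments can be combined with the processing command shown above. The values below are illustrative choices, not defaults:

```shell
# Keep only nodes above 10% of inclusive time, grouped at the module level
$ python3 server/main.py --data_dir /path/to/dataset \
    --profile_format hpctoolkit --process \
    --filter_by "time (inc)" --filter_perc 10 --group_by module
```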
- Using the config file (i.e., --config).
The user can process profiles from different formats using the config file. The parameters of the preprocessing are provided through the config file (see examples in the sample data directories), which can also specify additional arguments (e.g., save_path, filter_perc, etc.).
{
"experiment": "experiment_name",
"save_path": "/path/to/dir",
"read_parameter": false,
"runs": [{
"name": "run-1",
"path": "/path/to/run-1",
"profile_format": "hpctoolkit | caliper | caliper_json"
},
{
"name": "run-2",
"path": "/path/to/run-2",
"profile_format": "hpctoolkit | caliper | caliper_json"
},
{
"name": "run-3",
"path": "/path/to/run-3",
"profile_format": "hpctoolkit | caliper | caliper_json"
}
],
"schema": {
"filter_by": "time (inc)",
"filter_perc": 0,
"group_by": "name",
"module_map": {
"Lulesh": ["main", "lulesh.cycle"],
"LeapFrog": ["LagrangeNodal", "LagrangeLeapFrog"],
"CalcForce": ["CalcForceForNodes", "CalcVolumeForceForElems", "CalcHourglassControlForElems", "CalcFBHourglassForceForElems"],
"CalcLagrange": ["LagrangeElements", "UpdateVolumesForElems", "CalcLagrangeElements", "CalcKinematicsForElems", "CalcQForElems", "CalcMonotonicQGradientsForElems", "CalcMonotonicQRegionForElems", "ApplyMaterialPropertiesForElems", "EvalEOSForElems", "CalcEnergyForElems", "CalcPressureForElems", "CalcSoundSpeedForElems", "IntegrateStressForElems"],
"Timer": ["TimeIncrement"],
"CalcConstraint": ["CalcTimeConstraintsForElems", "CalcCourantConstraintForElems", "CalcHydroConstraintForElems"],
"NA": ["Unknown"],
"MPI": ["MPI_Barrier", "MPI_Reduce", "MPI_Allreduce", "MPI_Irecv", "MPI_Isend", "MPI_Wait", "MPI_Waitall", "MPI_Finalize"]
}
}
}
$ python3 server/main.py --config /path/to/config.callflow.json --process
Using CallFlow as a web app¶
To run CallFlow's web app, a Python-based WSGI server (which handles socket communication and data processing) and a Vue client server need to run simultaneously.
- Run the WSGI server.
Note: Similar to the processing step, the web server can be run using either --config or --data_dir.
$ python3 server/main.py --data_dir /path/to/dataset
or
$ python3 server/main.py --config /path/to/config.callflow.json
- Run the client server.
$ cd app
$ npm run dev
Using CallFlow inside Jupyter notebook environment¶
Use the %callflow magic to run CallFlow in a Jupyter notebook environment.
- To load callflow's magic extension, use the %load_ext command.
%load_ext callflow
- Now, %callflow can be used to trigger the user interface like the command line.
%callflow --data_dir /path/to/directory --profile_format format
or
%callflow --config /path/to/config/file
This feature spawns the server and client in the background as child processes of Jupyter. It will also detect any such processes that are already running and attach to them seamlessly.
For reference, see an example notebook.
If you encounter bugs while using CallFlow, you can report them by opening an issue on GitHub.
If you are referencing CallFlow in a publication, please cite the following paper:
- Huu Tan Nguyen, Abhinav Bhatele, Nikhil Jain, Suraj Kesavan, Harsh Bhatia, Todd Gamblin, Kwan-Liu Ma, and Peer-Timo Bremer. Visualizing Hierarchical Performance Profiles of Parallel Codes using CallFlow. In IEEE Transactions on Visualization and Computer Graphics, November 2019. DOI