Multi-programmed workloads

In addition to simulating parallel applications in a Pin-based, execution-driven mode, Sniper supports playing back instruction traces. One single-threaded trace can be fed into each simulated core, which allows multi-programmed workloads to be simulated. Support for collecting and playing back traces was added in Sniper version 2.0.

Collecting traces

Make sure you have Sniper 2.0 or higher installed and compiled. Then use the record-trace script to record a dynamic instruction trace of your single-threaded application:

~/sniper$ ./record-trace -o fft -- test/fft/fft -p1

This will create the file fft.sift. Using the tool sift/siftdump you can look at its contents:

./sift/siftdump fft.sift | less

If you want to skip a certain number of instructions before starting to record the trace, use the -f (instructions to fast-forward) switch. Similarly, the -d (instructions to simulate in detail) switch allows you to stop recording the trace once it contains a given number of instructions.
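
As a sketch of how the two switches combine, the snippet below assembles a record-trace invocation that skips a warm-up phase and then records a bounded window. The instruction counts are hypothetical, and it is assumed here that -f and -d each take the instruction count as their argument. The snippet only prints the command, so it can be checked before being run from the Sniper directory.

```shell
# Hypothetical instruction counts; pick values that match your benchmark.
FAST_FORWARD=100000000   # -f: skip the first 100 million instructions
DETAILED=10000000        # -d: then record the next 10 million instructions

# Assemble the record-trace command line from the values above.
# Assumption: -f and -d take the instruction count as their argument.
RECORD_CMD="./record-trace -o fft -f $FAST_FORWARD -d $DETAILED -- test/fft/fft -p1"

# Print the command for inspection instead of executing it.
echo "$RECORD_CMD"
```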

SimPoints

Using the -b (block size) parameter of record-trace, you can split the instruction trace into multiple files, each a fixed number of instructions long. For example, this command runs the FFT benchmark and creates the files fft.0.sift, fft.1.sift, and so on, each containing 1 million instructions:

~/sniper$ ./record-trace -o fft -b 1000000 -- test/fft/fft -p1 -m20

Additionally, a basic block vector is generated for each block and written to the corresponding fft.*.bbv file. These are text files: the first line gives the number of instructions in the block (which should match the block size parameter for every block except the last), and the following lines give the basic block vector components, scaled to the range 0 to 65535.
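
To make the file layout concrete, here is a minimal sketch of a shell function that summarizes one .bbv file. The function name bbv_summary is hypothetical and not part of Sniper. It assumes the components are whitespace-separated integers on the lines following the header, which matches the description above but has not been checked against the Sniper sources.

```shell
# bbv_summary FILE: report the instruction count, the number of vector
# components, the largest component, and how many components fall outside
# the documented 0..65535 scale.
# Assumption: components are whitespace-separated integers after line 1.
bbv_summary() {
    awk '
        BEGIN   { components = 0; max = 0; out_of_range = 0 }
        NR == 1 { instructions = $1; next }   # header: instructions in this block
        {
            for (i = 1; i <= NF; i++) {
                value = $i + 0                # force numeric interpretation
                components++
                if (value > max) max = value
                if (value < 0 || value > 65535) out_of_range++
            }
        }
        END {
            printf "instructions=%d components=%d max=%d out_of_range=%d\n",
                   instructions, components, max, out_of_range
        }
    ' "$1"
}

# Example use over all blocks of a trace (run where the .bbv files live):
#   for f in fft.*.bbv; do printf '%s: ' "$f"; bbv_summary "$f"; done
```

A block whose instruction count differs from the -b value, other than the last one, or a non-zero out_of_range count, would suggest that the file does not follow the layout assumed here.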

Playing back traces

Starting a multi-programmed simulation uses the regular run-sniper script, which allows you to specify all configuration parameters just like in the Pin-based multi-threaded mode. But rather than specifying the benchmark's command line, you now use the --traces= option to pass in the trace files to use for each core. Make sure to provision enough simulated cores (at least as many as there are trace files):

./run-sniper -c gainestown -n 4 --traces=swim.sift,gcc.sift,swim.sift,equake.sift

The simulation will end as soon as one of the trace files completes.
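
Since the core count has to cover every trace file, it can help to derive -n from the trace list instead of typing it by hand. The following is a sketch only: sniper_trace_cmd is a hypothetical helper, not a Sniper script, and it prints the run-sniper command line without executing it. It provisions exactly one core per trace, which is the minimum the text above requires.

```shell
# sniper_trace_cmd CONFIG TRACE...: print a run-sniper command line that
# provisions one simulated core per trace file.
# (Hypothetical helper; it only builds the command and does not run Sniper.)
sniper_trace_cmd() {
    config=$1
    shift
    ncores=$#                      # one core per remaining argument
    traces=$(IFS=,; echo "$*")     # join the trace file names with commas
    echo "./run-sniper -c $config -n $ncores --traces=$traces"
}

sniper_trace_cmd gainestown swim.sift gcc.sift swim.sift equake.sift
# prints: ./run-sniper -c gainestown -n 4 --traces=swim.sift,gcc.sift,swim.sift,equake.sift
```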

Collecting and playing back traces simultaneously

If trace files take up too much disk space, an alternative is to run the applications while you are simulating them, passing each trace through a named pipe instead of storing it in a file. In this scenario, there is no guarantee that the simulator will see exactly the same instruction trace on each run.

# Make a pipe for each application
cd ~/sniper
mknod fft1_pipe.sift p
mknod fft2_pipe.sift p
# Start recording traces for each application
./record-trace -o fft1_pipe -- ./test/fft/fft -p1 &
./record-trace -o fft2_pipe -- ./test/fft/fft -p1 &
# Run Sniper, pointing to the instruction trace pipe files
./run-sniper -c gainestown -n 2 --traces=fft1_pipe.sift,fft2_pipe.sift

Multiple multi-threaded workloads

It is also possible to run multiple multi-threaded benchmarks simultaneously. To do so, first install the integrated benchmarks suite. Then use the --benchmarks parameter to specify which benchmarks and configurations to run. For example, the two-FFT workload from the previous section can be run with the following command:

$BENCHMARKS_ROOT/run-sniper -c gainestown --benchmarks=splash2-fft-small-1,splash2-fft-small-1

There is no limit to the number of threads that can be run when running with the --benchmarks parameter, but note that ROI handling is currently not supported in this mode, even for multi-threaded workloads.