Add compare_bench.py documentation. Fixes #309 (#318)
diff --git a/README.md b/README.md
index 9109430..2cfb70b 100644
--- a/README.md
+++ b/README.md
@@ -11,6 +11,8 @@
[Known issues and common problems](#known-issues)
+[Additional Tooling Documentation](docs/tools.md)
+
## Example usage
### Basic usage
Define a function that executes the code to be measured.
diff --git a/docs/tools.md b/docs/tools.md
new file mode 100644
index 0000000..f176f74
--- /dev/null
+++ b/docs/tools.md
@@ -0,0 +1,59 @@
+# Benchmark Tools
+
+## compare_bench.py
+
+The `compare_bench.py` utility can be used to compare the results of two benchmark runs.
+The program is invoked as follows:
+
+``` bash
+$ compare_bench.py <old-benchmark> <new-benchmark> [benchmark options]...
+```
+
+Here `<old-benchmark>` and `<new-benchmark>` each specify either a benchmark executable or a JSON output file. The type of each input file is detected automatically. If a benchmark executable is specified, the benchmark is run to obtain the results; otherwise the results are loaded from the output file.
+
+Running the tool on the JSON test files under `Inputs/` gives the following sample output:
+
+``` bash
+$ ./compare_bench.py ./gbench/Inputs/test1_run1.json ./gbench/Inputs/test1_run2.json
+Comparing ./gbench/Inputs/test1_run1.json to ./gbench/Inputs/test1_run2.json
+Benchmark                   Time           CPU
+----------------------------------------------
+BM_SameTimes               +0.00         +0.00
+BM_2xFaster                -0.50         -0.50
+BM_2xSlower                +1.00         +1.00
+BM_10PercentFaster         -0.10         -0.10
+BM_10PercentSlower         +0.10         +0.10
+```
+
+When a benchmark executable is run, the raw output from the benchmark is printed to stdout in real time. The sample output using `test/basic_test` for both arguments looks like:
+
+``` bash
+$ ./compare_bench.py test/basic_test test/basic_test --benchmark_filter=BM_empty.*
+RUNNING: test/basic_test --benchmark_filter=BM_empty.*
+Run on (4 X 4228.32 MHz CPU s)
+2016-08-02 19:21:33
+Benchmark                              Time           CPU Iterations
+--------------------------------------------------------------------
+BM_empty                               9 ns          9 ns   79545455
+BM_empty/threads:4                     4 ns          9 ns   75268816
+BM_empty_stop_start                    8 ns          8 ns   83333333
+BM_empty_stop_start/threads:4          3 ns          8 ns   83333332
+RUNNING: test/basic_test --benchmark_filter=BM_empty.*
+Run on (4 X 4228.32 MHz CPU s)
+2016-08-02 19:21:35
+Benchmark                              Time           CPU Iterations
+--------------------------------------------------------------------
+BM_empty                               9 ns          9 ns   76086957
+BM_empty/threads:4                     4 ns          9 ns   76086956
+BM_empty_stop_start                    8 ns          8 ns   87500000
+BM_empty_stop_start/threads:4          3 ns          8 ns   88607596
+Comparing test/basic_test to test/basic_test
+Benchmark                              Time           CPU
+---------------------------------------------------------
+BM_empty                              +0.00         +0.00
+BM_empty/threads:4                    +0.00         +0.00
+BM_empty_stop_start                   +0.00         +0.00
+BM_empty_stop_start/threads:4         +0.00         +0.00
+```
+
+Comparing an executable against itself produces no meaningful differences; this example is only intended to show the output format when `compare_bench.py` needs to run the benchmarks itself.
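
Note on reading the comparison tables: the sample values (`-0.50` for `BM_2xFaster`, `+1.00` for `BM_2xSlower`) are consistent with each column being a fractional change, `(new - old) / old`. The sketch below illustrates that reading only; it is not the code of `compare_bench.py`. The function names `relative_change`, `compare`, and `format_rows` and the sample timings are hypothetical, and it assumes the benchmark JSON output carries a `benchmarks` array whose entries have `name`, `real_time`, and `cpu_time` fields.

```python
import json

# Hypothetical sample data in the shape of the benchmark JSON output.
OLD_JSON = """{"benchmarks": [
  {"name": "BM_2xFaster", "real_time": 100, "cpu_time": 100},
  {"name": "BM_2xSlower", "real_time": 100, "cpu_time": 100}
]}"""
NEW_JSON = """{"benchmarks": [
  {"name": "BM_2xFaster", "real_time": 50, "cpu_time": 50},
  {"name": "BM_2xSlower", "real_time": 200, "cpu_time": 200}
]}"""


def relative_change(old, new):
    """Fractional change from old to new (assumed meaning of the columns)."""
    if old == 0:
        return 0.0 if new == 0 else float("inf")
    return (new - old) / abs(old)


def compare(old_run, new_run):
    """Pair benchmarks by name; names missing from the new run are skipped."""
    new_by_name = {b["name"]: b for b in new_run["benchmarks"]}
    rows = []
    for old in old_run["benchmarks"]:
        new = new_by_name.get(old["name"])
        if new is None:
            continue
        rows.append((
            old["name"],
            relative_change(old["real_time"], new["real_time"]),
            relative_change(old["cpu_time"], new["cpu_time"]),
        ))
    return rows


def format_rows(rows):
    """Render rows with signed, two-decimal changes."""
    width = max(len(name) for name, _, _ in rows)
    return [f"{name:<{width}} {t:+8.2f} {c:+8.2f}" for name, t, c in rows]


if __name__ == "__main__":
    for line in format_rows(compare(json.loads(OLD_JSON), json.loads(NEW_JSON))):
        print(line)
```

Under this reading, a negative value means the new run took less time than the old one, and a positive value means it took more.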