From c5492e920aba6541a660c1077924f4b6eb294303 Mon Sep 17 00:00:00 2001
From: Avril
Date: Mon, 23 Nov 2020 13:16:55 +0000
Subject: [PATCH] update Makefile

---
 Makefile  |  2 +-
 README.md | 15 +++++++++++++--
 2 files changed, 14 insertions(+), 3 deletions(-)

diff --git a/Makefile b/Makefile
index 686fa56..f08a3db 100644
--- a/Makefile
+++ b/Makefile
@@ -3,7 +3,7 @@ INCLUDE=include
 
 PROJECT=fcmp
 
-OPT_FLAGS+= -fgraphite -fopenmp -floop-parallelize-all -ftree-parallelize-loops=4
+OPT_FLAGS?= -fgraphite -fopenmp -floop-parallelize-all -ftree-parallelize-loops=4
 
 FEAT_CFLAGS?= -D_RUN_THREADED=0
 FEAT_LDFLAGS?= -lpthread
diff --git a/README.md b/README.md
index 4909244..8311672 100644
--- a/README.md
+++ b/README.md
@@ -40,11 +40,22 @@ Build with default optimisations using `make release`, it will output a stripped
 * The Makefile uses variables `RELEASE_CFLAGS` and `RELEASE_LDFLAGS` to apply optimisations (and `DEBUG_CFLAGS` + `DEBUG_LDFLAGS` for extra compiler flags with the debug target). If needed you can set these yourself to prevent the defaults.
 * The default `RELEASE_CFLAGS` specify `-march=native` which may be undesireable for you. Set the variable or modify the Makefile if you need to remove this.
 
+## PGO
+Building with Profile Guided Optimisation is supported with the `pgo` Makefile target. It uses the same rules as the `release` target and outputs a binary to `fcmp-pgo`.
+
+There may be small performance improvements from using this target instead of `release`, but the difference is mostly negligable.
+
 ## Debug target
 Build with debugging information and no optimisations using `make debug`, it will output a binary at `fcmp-debug`.
 
-## Note
-Before switching between `release` and `debug` targets, make sure to run `make clean`.
+## Notes
+- Before switching between targets, make sure to run `make clean`.
+- GCC + Graphite compiler specific optimisation flags are added by default with the `OPT_FLAGS` variable. Override this variable if using another compiler that doesn't support these optimisations.
+
+### Multithreading
+- By default, parallel processing is enabled when building through `libpthread`, to build a single-threaded version override the variables `FEAT_CFLAGS` and `FEAT_LDFLAGS` to empty.
+- By default the program will decide at runtime whether or not to use parallelised processing. You can set `FEAT_CFLAGS="-D_RUN_THREADED=1"` to _force_ the use of a parallelised run every time in the binary, although this is not recommended.
+- Performance gains from parallelised runs mostly appear with a large number of files being compared at once, as the task delegation overhead is surpassed.
 
 # License
 GPL'd with <3