astro-group-bristol/profiling-with-flamegraphs

Profiling with flamegraphs

Plan

  • What's profiling?
  • Using flamegraphs
  • Hands-on (python)

What's profiling/why profile?

  • If you want code to run faster, you need to know which parts are slow
  • Guesswork doesn't always get the right answer
  • Profiling is analysing where the resources are being used in running code
    • Especially CPU time, but maybe memory and other things

Wise sayings:

Amdahl's Law:

"the overall performance improvement gained by optimizing a single part of a system is limited by the fraction of time that the improved part is actually used"

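As a rough illustration of Amdahl's Law, the sketch below computes the overall speedup when a given fraction of the runtime is made some factor faster. The function name and the example numbers are made up for illustration.

```python
def amdahl_speedup(fraction, factor):
    """Overall speedup when `fraction` of the runtime is made `factor` times faster."""
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be between 0 and 1")
    if factor <= 0:
        raise ValueError("factor must be positive")
    # The untouched part still takes (1 - fraction); the improved part shrinks.
    return 1.0 / ((1.0 - fraction) + fraction / factor)


if __name__ == "__main__":
    # Making a part that takes 10% of the runtime 100x faster barely helps...
    print(f"{amdahl_speedup(0.10, 100):.2f}x")
    # ...while merely doubling the speed of a part that takes 80% helps more.
    print(f"{amdahl_speedup(0.80, 2):.2f}x")
```

The first case gives about 1.11x overall and the second about 1.67x, which is why it pays to find out where the time actually goes before optimizing anything.
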
Donald Knuth:

"premature optimization is the root of all evil"

Profilers

Strategy

  • Deterministic
    • Record when certain things happen (e.g. enter/exit subroutines)
    • Using instrumentation or events
    • Pro: accurately records everything that happens
    • Con: can slow things down (sometimes a lot)
  • Statistical
    • Repeatedly ask "what's the status now?" and record the answers
    • Less intrusive, less likely to affect the timings it's trying to measure
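To make the two strategies concrete, here is a minimal sketch of each in pure Python. The helper names (busy, deterministic_call_counts, statistical_samples) are made up for illustration. sys.setprofile and sys._current_frames are standard-library hooks (the latter is a CPython implementation detail), but real profilers are more sophisticated than this; py-spy, for instance, samples from outside the process.

```python
import collections
import sys
import threading
import time


def busy(n):
    """Toy workload: sum of squares in a pure-Python loop."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def deterministic_call_counts(func, *args):
    """Deterministic: record every Python function call made while func runs."""
    counts = collections.Counter()

    def hook(frame, event, arg):
        if event == "call":  # fired on entry to every Python-level function
            counts[frame.f_code.co_name] += 1

    sys.setprofile(hook)
    try:
        func(*args)
    finally:
        sys.setprofile(None)  # always remove the hook again
    return counts


def statistical_samples(func, *args, interval=0.005):
    """Statistical: periodically ask what the calling thread is doing right now."""
    samples = collections.Counter()
    target_id = threading.get_ident()
    done = threading.Event()

    def sampler():
        while not done.is_set():
            frame = sys._current_frames().get(target_id)
            names = []
            while frame is not None:  # walk from innermost frame outwards
                names.append(frame.f_code.co_name)
                frame = frame.f_back
            if names:
                samples[tuple(reversed(names))] += 1  # outermost frame first
            time.sleep(interval)

    thread = threading.Thread(target=sampler, daemon=True)
    thread.start()
    try:
        func(*args)
    finally:
        done.set()
        thread.join()
    return samples


if __name__ == "__main__":
    print(deterministic_call_counts(busy, 100_000))
    for stack, count in statistical_samples(busy, 3_000_000).most_common(3):
        print(count, ";".join(stack))
```

The deterministic version sees every call exactly once but runs the hook on each one; the statistical version only costs something at each sample, and its counts are estimates that get better the longer the program runs.
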

Types

  • System level

    • Linux perf
    • macOS Instruments
    • May need privileged access
      • (sudo sysctl kernel.perf_event_paranoid=0)
    • May not understand/report stack frames in a language-friendly way
  • Language-specific (e.g. py-spy and cProfile for Python)

Output

For CPU profiling the output from the profiler is usually a list of function names or stack traces with timings.

These can be hard to read.

Which is why you need...

Flamegraphs!

Flamegraphs provide a visualisation of hierarchical data like stack traces. Invented by Brendan Gregg in 2011, they exist in various forms.

Output is interactive SVG (Scalable Vector Graphics) - click on an element to expand/contract it to the full page width.

Example outputs:

  • STILTS matching: before and after a 2-line change to matching code - 15% speedup (a561d815, replace TreeMap with HashMap)

Try it out!

Python example using py-spy:

  • Install py-spy; one of the following might work:
    • macOS: brew install py-spy
    • other: pip install py-spy
  • Run py-spy record on a python program you want to profile
    • from the start:
      py-spy record --native -o pyspy.svg -- python program.py
      
    • attach to a running process:
      py-spy record --native -o pyspy.svg --pid 12345
      
  • (py-spy has some other nice tricks too like py-spy top)
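If you don't have a program to hand, a toy like the one below gives the profiler something to find. It is a made-up example (the function names and sizes are arbitrary), saved as program.py so that it matches the commands above.

```python
"""program.py: a toy workload with one obvious hot spot (hypothetical example)."""
import math


def slow_part(n):
    """Deliberately wasteful: floating-point maths in a pure-Python loop."""
    return sum(math.sqrt(i) * math.sin(i) for i in range(n))


def fast_part(n):
    """Cheap by comparison: closed-form sum of 0..n-1."""
    return n * (n - 1) // 2


def main(repeats=20, n=200_000):
    """Call both parts repeatedly so the profiler has time to collect samples."""
    results = []
    for _ in range(repeats):
        results.append((slow_part(n), fast_part(n)))
    return results


if __name__ == "__main__":
    main()
```

In the resulting flamegraph, nearly all of the width under main should belong to slow_part, with fast_part barely visible.
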

Python example using cProfile:

  • cProfile is included with Python and can produce text summaries or binary cprof files.
  • You need flameprof to turn the cprof files into flamegraphs:
    pip install flameprof
    
  • Run a python program with cProfile enabled:
    python -m cProfile -o program.cprof program.py 
    
  • Convert the output to a flamegraph:
    flameprof program.cprof > cprof.svg
    
  • (The flamegraphs don't seem to be interactive for me, but the docs suggest they should be)
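cProfile can also be driven from inside a program, which is handy when you only care about one function. The sketch below uses the standard cProfile and pstats modules to produce the text summary mentioned above; the work and profile_to_text helpers are made up for illustration.

```python
import cProfile
import io
import pstats


def work(n):
    """Toy workload (hypothetical): build and sort a list of strings."""
    return sorted(str(i) for i in range(n))


def profile_to_text(func, *args, top=5):
    """Run func under cProfile; return the profiler and a text summary."""
    profiler = cProfile.Profile()
    profiler.runcall(func, *args)
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    # Sort by cumulative time so callers of expensive functions come first.
    stats.sort_stats("cumulative").print_stats(top)
    return profiler, buffer.getvalue()


if __name__ == "__main__":
    profiler, report = profile_to_text(work, 100_000)
    print(report)
```

Calling profiler.dump_stats("work.cprof") afterwards writes the same kind of binary file as the -o option, so it can be passed to flameprof in the same way.
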

Example using system-level profiling tools:

  • On Linux you can use perf, then pass the output to FlameGraph scripts
    • Clone the FlameGraph repo
      git clone https://github.com/brendangregg/FlameGraph
      
  • Run your program with perf record
    perf record -F 99 -a -g --call-graph dwarf,32768 -- my-program
    
    
  • Generate a flamegraph from the result:
    perf script \
       | FlameGraph/stackcollapse-perf.pl \
       | FlameGraph/flamegraph.pl > results/perf.svg
    
    
  • I think you can do something similar on macOS using Instruments (maybe here or here??)
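The stackcollapse step in the pipeline above turns raw stack samples into the "folded" text format that flamegraph.pl reads: one line per distinct stack, with frames joined by semicolons and followed by a count. A minimal Python sketch of that folding, using invented sample data and a made-up function name, might look like this:

```python
import collections


def fold_stacks(stacks):
    """Collapse an iterable of stacks (outermost frame first) into folded lines."""
    counts = collections.Counter(";".join(stack) for stack in stacks)
    return [f"{folded} {count}" for folded, count in sorted(counts.items())]


if __name__ == "__main__":
    # Hypothetical samples, as a sampling profiler might have captured them.
    samples = [
        ["main", "load"],
        ["main", "match", "lookup"],
        ["main", "match", "lookup"],
        ["main", "match"],
    ]
    print("\n".join(fold_stacks(samples)))
```

Anything you can express in this format, from any profiler or language, can be piped into FlameGraph/flamegraph.pl.
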

Other uses

  • You can use flamegraphs for other things too, e.g. memory usage, off-CPU time, special event categories (perf has lots of options).
  • And here's a neat trick to see what's taking up space on your disk:
    git clone https://github.com/brendangregg/FlameGraph
    FlameGraph/files.pl ~ | FlameGraph/flamegraph.pl > files.svg
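The same idea is easy to reproduce yourself: treat each directory level as a stack frame and the file size as the count. The sketch below is not how files.pl is implemented, just a rough Python equivalent under that assumption; the function name is made up, and file names containing a semicolon would be split into extra frames.

```python
import os
import sys


def folded_file_sizes(root):
    """Yield one 'dir;subdir;file size_in_bytes' line per regular file under root."""
    root = os.path.abspath(root)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path) or not os.path.isfile(path):
                continue  # skip symlinks and special files
            try:
                size = os.path.getsize(path)
            except OSError:
                continue  # file vanished or is unreadable
            frames = os.path.relpath(path, root).split(os.sep)
            yield f"{';'.join(frames)} {size}"


if __name__ == "__main__":
    start = sys.argv[1] if len(sys.argv) > 1 else "."
    for line in folded_file_sizes(start):
        print(line)
```

Saved under a name of your choosing (say disk_folded.py), its output can be piped into FlameGraph/flamegraph.pl --countname=bytes > files.svg.
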