Published on October 14, 2016
1. Varnish Forensics Turning Varnish debugging into a crime scene
2. AGENDA Hello, today we will have a look at a fake crime scene and a detective will show you how we conduct police work when someone reports a bug to our support team.
3. WHOAMI? Dridi Bo… Support D…
4. Why learn debugging?
5. Because problems are bound to occur Learning debugging – Detecting a problem in the first place? – Panicking doesn’t help anyone, except a potential attacker – It’s actually the second thing we learn in training – Not as hard as it may seem, but not easy either Outsourcing debugging – Get support included in Varnish Plus – Works best if you can also take part
6. “Everyone knows that debugging is twice as hard as writing a program in the first place. So if you're as clever as you can be when you write it, how will you ever debug it?”
7. The embarrassing truth about being clever Debugging clever software – Careful choice of wording: clever, not smart nor bright – Can you outsmart the Varnish core developers? (probably) – Can you outsmart yourself? (probably not) Down the rabbit hole – How far can you debug down the stack? – Is Varnish revealing a bug somewhere further up the stack?
8. Inspecting a problem
9. When somebody reports a crime (crash) Send a forensics team to gather information – Ask lots of routine questions to the Varnish instance – Neighbors are also primary suspects – Wait for the report Our team consists of – https://github.com/varnish/varnishgather/ – And all the underlying tools used by varnishgather
10. My scientific method Once I get a report – Make observations – Formulate a hypothesis ● Or maybe this is a known issue – Try reproducing, experiment ● Ask for help when clueless – Accept or reject the hypothesis ● Or even better, dump it on someone else’s desk – Rinse and repeat
11. Thinking outside of the box office If the scientific method doesn’t work – Pretend it’s not a bug ● You never know, the customer may believe you – Wait until your shift ends ● And make sure never to assign a bug to yourself – Be creative ● I can’t share all my trade secrets
12. Debugging Varnish
13. Thinking outside of the black box Varnish is very observable from the outside in – Shared memory log ● Structured logs ● Counters – Command-line interface – VCL ● Comprehensive state machine ● Can contribute custom logs ● Extensible with VMODs (shout out to rtstatus) – HTTP, obviously
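The counters mentioned on this slide can be read from a script. A minimal sketch, assuming `varnishstat -j` is available and that its JSON either nests counters under a "counters" key (newer releases) or lists them at the top level (older releases); the helper names `load_counters`, `hit_ratio` and `read_varnishstat` are made up for this example:

```python
import json
import subprocess


def load_counters(doc):
    """Return {counter_name: value} from a parsed `varnishstat -j` document.

    Assumption: newer releases nest counters under a "counters" key, while
    older ones put them at the top level next to "timestamp".
    """
    entries = doc.get("counters", doc)
    return {
        name: entry["value"]
        for name, entry in entries.items()
        if isinstance(entry, dict) and "value" in entry
    }


def hit_ratio(counters):
    """Cache hit ratio from the MAIN.cache_hit and MAIN.cache_miss counters."""
    hits = counters.get("MAIN.cache_hit", 0)
    misses = counters.get("MAIN.cache_miss", 0)
    total = hits + misses
    return hits / total if total else 0.0


def read_varnishstat():
    """Run `varnishstat -j` once and return its counters as a flat dict."""
    out = subprocess.run(
        ["varnishstat", "-j"], check=True, capture_output=True, text=True
    ).stdout
    return load_counters(json.loads(out))
```

Calling `hit_ratio(read_varnishstat())` on a machine running Varnish gives a single number to watch; the parsing is kept separate from the subprocess call so it can be checked against a saved report.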
14. Thinking outside of the black box People tend to overlook documentation – We have manuals for more than just programs ● vcl(7) ● varnish-cli(7) ● vsl(7) and vsl-query(7) ● varnish-counters(7) ● vmod_std(3) and vmod_directors(3) All manuals are also part of the online docs
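The query language documented in vsl-query(7) is what makes the structured logs searchable. As a rough illustration, the sketch below builds a `varnishlog` command line that groups records by request (`-g request`), filters them with a query (`-q`) and optionally reads the log from its head and exits (`-d`). The function name `varnishlog_command` and the example query are only illustrative:

```python
import shlex


def varnishlog_command(query, grouping="request", from_start=True):
    """Build the argv for a filtered varnishlog run.

    `query` is a vsl-query(7) expression, e.g. "RespStatus >= 500".
    """
    argv = ["varnishlog", "-g", grouping, "-q", query]
    if from_start:
        # -d: process records already in the log, then exit
        argv.append("-d")
    return argv


def as_shell(argv):
    """Render the argv as a copy-pasteable shell command."""
    return shlex.join(argv)
```

Passing the list to `subprocess.run` avoids shell quoting problems, while `as_shell` is handy when the command has to be pasted into a support ticket.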
15. Thinking outside of the black box The error may come from outside Varnish – Is there a resource hog in another process? (e.g. backup) – Is your kernel doing something dumb? (e.g. swapping the shmlog) – Can you correlate periodic failures to periodic tasks? (e.g. crontab) This is hard for us to tell – This is why varnishgather collects so many things – We also collect information about other Varnish Plus products
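Correlating periodic failures with periodic tasks can start very simply. The sketch below assumes the failure timestamps have already been extracted from logs or monitoring as `datetime` objects (how they are obtained is left out), and checks whether failures pile up at the same minute of every hour, as an hourly cron job would cause. The function names and the 50% threshold are arbitrary choices for this example:

```python
from collections import Counter
from datetime import datetime


def failures_by_minute_of_hour(timestamps):
    """Count failures per minute-of-hour (0-59) to reveal hourly patterns."""
    return Counter(ts.minute for ts in timestamps)


def suspicious_minutes(timestamps, share=0.5):
    """Minutes of the hour that hold at least `share` of all failures."""
    counts = failures_by_minute_of_hour(timestamps)
    total = sum(counts.values())
    return sorted(m for m, n in counts.items() if total and n / total >= share)


if __name__ == "__main__":
    failures = [
        datetime(2016, 10, 14, 1, 30),
        datetime(2016, 10, 14, 2, 30),
        datetime(2016, 10, 14, 3, 30),
        datetime(2016, 10, 14, 3, 7),
    ]
    # Compare the result with the minute fields of the crontab entries.
    print(suspicious_minutes(failures))
```

The same bucketing works for hour-of-day or day-of-week by swapping `ts.minute` for `ts.hour` or `ts.weekday()`.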
16. Preventing bugs before they happen Monitoring is key – Gathering more than just-in-time data – Ability to correlate events with trends – Anticipating failures from trends – Faster readings with good visualization We cannot do that for you (although we do to some extent) – It helps when citizens (customers) share graphs – Searching for clues can be easier for detectives
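Anticipating a failure from a trend can be as basic as fitting a straight line to a monitored metric and asking when it will cross a limit. This is a deliberately naive sketch, not something taken from the talk: the samples are hypothetical `(time, value)` pairs from whatever monitoring system is in place, and a linear trend is assumed.

```python
def linear_fit(samples):
    """Least-squares slope and intercept for (time, value) samples."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_t = sum(t for t, _ in samples) / n
    mean_v = sum(v for _, v in samples) / n
    var_t = sum((t - mean_t) ** 2 for t, _ in samples)
    if var_t == 0:
        raise ValueError("samples must span more than one point in time")
    cov = sum((t - mean_t) * (v - mean_v) for t, v in samples)
    slope = cov / var_t
    return slope, mean_v - slope * mean_t


def time_to_threshold(samples, threshold):
    """Estimated time at which the fitted line reaches `threshold`.

    Returns None when the trend is flat or the crossing is not in the future.
    """
    slope, intercept = linear_fit(samples)
    if slope == 0:
        return None
    t_cross = (threshold - intercept) / slope
    last_t = max(t for t, _ in samples)
    return t_cross if t_cross > last_t else None
```

With samples taken at regular intervals, `time_to_threshold(samples, limit)` gives a rough deadline; real metrics are rarely this linear, so a graph remains the better tool for reading the trend.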
17. Wrapping up Learn Varnish and its surroundings – Learn HTTP, know your backends – Get familiar with Varnish tools – Know the neighborhood When a problem occurs – Take a scientific approach – Or try thinking like a criminal ● By that I mean like a web developer
18. Contact info: Email: firstname.lastname@example.org IRC: dridi on irc.linpro.no Or just come and bug me :-)