Optimizing Erlang Code for Speed

Published on February 17, 2014

Author: vsovietov

Source: slideshare.net

Description

Covers optimizations that make it possible to reach microsecond latencies and GB/s throughput in an intelligent network management solution written in Erlang.

Optimizing Erlang code for speed
Revelations from a real-world project based on Erlang on Xen

Maxim Kharchenko
CTO, Cloudozer LLP
mk@cloudozer.com
ErlangDripro2014

The road map
● Erlang on Xen intro
● Speed-related notes
  – Arguments are registers
  – ETS tables are (mostly) ok
  – Do not overuse records
  – GC is key to speed
  – gen_server vs. barebone process
  – NIFs: more pain than gain
  – Fast counters
● Q&A

Erlang on Xen 101
● A new Erlang runtime that runs without an OS
● Conceived in 2009
● Highly compatible with Erlang/OTP
● Built from scratch, not a “port”
● Optimised for low startup latency
● Not open source (yet)
● The public build service is free

Go to erlangonxen.org

Zerg demo: zerg.erlangonxen.org

Arguments are registers

    animal(batman = Cat, Dog, Horse, Pig, Cow, State) ->
        feed(Cat, Dog, Horse, Pig, Cow, State);
    animal(Cat, deli = Dog, Horse, Pig, Cow, State) ->
        pet(Cat, Dog, Horse, Pig, Cow, State);
    ...

● Many arguments do not make a function any slower
● Do not reshuffle arguments:

    %% SLOW
    animal(Cat, Dog, Horse, Pig, Cow, State) ->
        feed(Goat, Cat, Dog, Horse, Pig, Cow, State);
    ...
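The "do not reshuffle" advice can be sketched with a smaller dispatcher. This is a minimal illustration, not code from the project: the module `arg_order` and the functions `dispatch/3`, `handle_ping/3` and `handle_other/3` are hypothetical names. Every clause forwards its arguments in the positions it received them, so the caller's argument registers can be reused as they are.

```erlang
-module(arg_order).
-export([dispatch/3]).

%% Hypothetical dispatcher: Kind, Payload and State keep their
%% positions in every call, so no argument has to change register.
dispatch(ping = Kind, Payload, State) ->
    handle_ping(Kind, Payload, State);
dispatch(Kind, Payload, State) ->
    handle_other(Kind, Payload, State).

%% Handlers take the arguments in the same order as dispatch/3.
handle_ping(_Kind, Payload, State) ->
    {pong, Payload, State}.

handle_other(Kind, Payload, State) ->
    {ignored, Kind, Payload, State}.
```

Inserting a new first argument into one of these calls would shift every other argument by one position, which is the pattern the slide marks as slow.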

ETS tables are (mostly) ok
● A lookup in a small ETS table costs about as much as 10 function activations
● Do not use ets:tab2list() inside tight loops
● Treat ETS as a database, not a pool of global variables
● 1-2 ETS lookups on the fast path are ok
● Beware that ets:lookup() and similar calls create a copy of the data on the heap of the caller, similarly to message passing
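A keyed lookup on the fast path might look like the sketch below. It assumes a hypothetical table named `routes` holding `{Key, Cost}` pairs; only the standard `ets` functions `new/2`, `insert/2`, `lookup/2`, `lookup_element/3` and `delete/1` are used.

```erlang
-module(ets_fast_path).
-export([demo/0]).

%% Builds a tiny table and reads one entry by key instead of
%% dumping the whole table with ets:tab2list/1.
demo() ->
    Tab = ets:new(routes, [set, private]),
    true = ets:insert(Tab, [{a, 1}, {b, 2}, {c, 3}]),
    %% One keyed lookup: only the matching object is copied
    %% to the heap of the calling process.
    [{b, Cost}] = ets:lookup(Tab, b),
    %% ets:lookup_element/3 returns just the element at position 2.
    Cost = ets:lookup_element(Tab, b, 2),
    true = ets:delete(Tab),
    Cost.
```

When only one field of a large object is needed, `ets:lookup_element/3` is usually the smaller read, since it returns that field rather than the whole object.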

Do not overuse records
● setelement() creates a copy of the tuple
● State#state{foo=Foo1,bar=Bar1,baz=Baz1} creates 3(?) copies of the tuple
● Use tuples explicitly in the performance-critical sections to see the heap footprint of the code

    %% from 9p.erl
    mixer({rauth,_,_}, {tauth,_,AFid,_,_}, _) -> {write_auth,AFid};
    mixer({rauth,_,_}, {tauth,_,AFid,_,_,_}, _) -> {write_auth,AFid};
    mixer({rwrite,_,_}, _, initial) -> start_attaching;
    mixer({rerror,_,_}, _, initial) -> auth_failed;
    mixer({rlerror,_,_}, _, initial) -> auth_failed;
    mixer({rattach,_,Qid}, {tattach,_,Fid,_,_,AName,_}, initial) ->
        {attach_more,Fid,AName,qid_type(Qid)};
    mixer({rclunk,_}, {tclunk,_,Fid}, initial) -> {forget,Fid};
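The difference between record syntax and explicit tuples can be sketched as follows. The module `record_update` and its `#state{}` record are hypothetical; the example relies only on the fact that a record is a tagged tuple, so `#state{foo, bar, baz}` is `{state, Foo, Bar, Baz}` at run time.

```erlang
-module(record_update).
-export([update_all/4, bump_foo/1]).

-record(state, {foo, bar, baz}).

%% Record syntax: the tuple construction is hidden behind the
%% update expression.
update_all(State, Foo, Bar, Baz) ->
    State#state{foo = Foo, bar = Bar, baz = Baz}.

%% Explicit tuple: the single 4-element tuple that gets built
%% is visible in the source.
bump_foo({state, Foo, Bar, Baz}) ->
    {state, Foo + 1, Bar, Baz}.
```

The explicit form is less readable, which is why the slide limits it to performance-critical sections.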

Garbage collection is key to speed
● Heap is a list of chunks
● 'new heap' is close to its head, 'old heap' is close to its tail
● A GC run takes 10μs on average
● GC may run 1000s of times per second
● How to tackle GC-related issues:
  – (Priority 1) Call erlang:garbage_collect() at strategic points
  – (Priority 2) For the fastest code avoid GC completely: restart the fast process regularly
  – (Priority 3) Use the fullsweep_after option
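Priorities 1 and 3 can be combined in a worker process, as in the sketch below. The module `gc_points` and its message shapes are hypothetical, and the value 0 for `fullsweep_after` is only an example, not a recommendation from the talk. The calls `spawn_opt/4` and `erlang:garbage_collect/0` are standard Erlang/OTP; their cost on Erlang on Xen may differ.

```erlang
-module(gc_points).
-export([start/0, loop/1]).

%% Spawns the worker with a per-process GC setting.
%% {fullsweep_after, 0} requests a full sweep on every collection.
start() ->
    spawn_opt(?MODULE, loop, [0], [{fullsweep_after, 0}]).

loop(Count) ->
    receive
        {work, From, Items} ->
            Sum = lists:sum(Items),
            From ! {done, Sum, Count + 1},
            %% Strategic point: the reply is already sent, so the
            %% collection does not add to the request latency.
            true = erlang:garbage_collect(),
            loop(Count + 1);
        stop ->
            ok
    end.
```

Collecting right after the reply keeps the heap small before the next request arrives, at the price of doing collections the runtime might otherwise have skipped.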

gen_server vs barebone process
● Message passing using gen_server:call() is 2x slower than Pid ! Msg
● For speedy code prefer barebone processes to gen_servers
● Design Principles are about high availability, not high performance
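A barebone process that answers synchronous requests with plain `Pid ! Msg` could be sketched like this. The module `bare_counter`, its message format and the 5000 ms timeout are assumptions made for illustration.

```erlang
-module(bare_counter).
-export([start/0, call/2]).

%% Starts a plain process; no gen_server behaviour involved.
start() ->
    spawn(fun() -> loop(0) end).

%% Request/reply with a unique reference so that the reply
%% cannot be confused with other messages in the mailbox.
call(Pid, Request) ->
    Ref = make_ref(),
    Pid ! {Ref, self(), Request},
    receive
        {Ref, Reply} -> Reply
    after 5000 ->
        exit(timeout)
    end.

loop(N) ->
    receive
        {Ref, From, increment} ->
            From ! {Ref, N + 1},
            loop(N + 1);
        {Ref, From, read} ->
            From ! {Ref, N},
            loop(N);
        {Ref, From, stop} ->
            From ! {Ref, stopped}
    end.
```

This version leaves out monitoring of the server, system messages and code-upgrade hooks, all of which gen_server provides. That is the availability-for-speed trade the slide describes.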

NIFs: more pain than gain
● A new principle of Erlang development: do not use NIFs
● For a small performance boost, NIFs undermine key properties of Erlang: reliability and soft-realtime guarantees
● Most of the time Erlang code can be made as fast as C
● Most performance problems of Erlang are traceable to NIFs, or to external C libraries, which are similar
● Erlang on Xen does not have NIFs and we do not plan to add them

Fast counters
● 32-bit or 64-bit unsigned integer counters with overflow: trivial in C, not easy in Erlang
● FIXNUMs are signed 29-bit integers; BIGNUMs consume heap and are 10-100x slower
● Use two variables for a counter?

    foo(C1, 16#ffffff, ...) -> foo(C1+1, 0, ...);
    foo(C1, C2, ...) -> foo(C1, C2+1, ...);
    ...

● Erlang on Xen has a new experimental feature, fast counters:

    erlang:new_counter(Bits) -> Ref
    erlang:increment_counter(Ref, Incr)
    erlang:read_counter(Ref)
    erlang:release_counter(Ref)
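The two-variable approach can be written out as a runnable sketch in standard Erlang. The module `split_counter` and its function names are hypothetical. It uses the same 24-bit low word as the slide and does not use the experimental `erlang:new_counter/1` family, which exists only in Erlang on Xen.

```erlang
-module(split_counter).
-export([count/1, count/3, value/2]).

%% Runs Steps increments starting from zero.
count(Steps) ->
    count(Steps, 0, 0).

%% High and Low are plain function arguments, so each half stays
%% a small integer and no tuple is built per increment.
count(0, High, Low) ->
    {High, Low};
count(Steps, High, 16#ffffff) ->
    %% Low word is full: carry into the high word.
    count(Steps - 1, High + 1, 0);
count(Steps, High, Low) ->
    count(Steps - 1, High, Low + 1).

%% Combines the halves when the full value is needed. The result
%% may be a bignum on runtimes with small fixnums.
value(High, Low) ->
    (High bsl 24) bor Low.
```

The sketch does not wrap the high word, so it models the carry step only. A fixed-width counter with overflow would need one more clause for the case where both halves are full.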

Questions?
