DjangoCon: High Performance Django

Published on September 6, 2008

Author: zeeg

Source: slideshare.net

High Performance Django

David Cramer

http://www.davidcramer.net/

http://www.ibegin.com/

Curse

Peak daily traffic of approx. 15m pages, 150m hits.

Average monthly traffic 120m pages, 6m uniques.

Python, MySQL, Squid, memcached, mod_python, lighty.

Most developers came strictly from PHP (myself included).

12 web servers, 4 database servers, 2 squid caches.

iBegin

Massive amounts of data, 100m+ rows.

Python, PHP, MySQL, mod_wsgi.

Small team of developers.

Complex database partitioning/synchronization tasks.

Attempting to not branch off of Django. 

Areas of Concern

Database (ORM)

Webserver (Resources, Handling Millions of Reqs)

Caching (Invalidation, Cache Dump)

Template Rendering (Logic Separation)

Profiling

Tools of the Trade

Webserver (Apache, Nginx, Lighttpd)

Object Cache (memcached)

Database (MySQL, PostgreSQL, …)

Page Cache (Squid, Nginx, Varnish)

Load Balancing (Nginx, Perlbal)

How We Did It

“Primary” web servers serving Django using mod_python.

Media servers using Django on lighttpd.

Static served using additional instances of lighttpd.

Load balancers passing requests to multiple Squids.

Squids passing requests to multiple web servers.

Lessons Learned

Don’t be afraid to experiment. You’re not limited to one piece of server software.

mod_wsgi is a huge step forward from mod_python.

Serving static files using different software can help.

Send proper HTTP headers where they are needed.

Use services like S3, Akamai, Limelight, etc.

Webserver Software

Python Scripts: Apache (wsgi, mod_py, fastcgi), Lighttpd (fastcgi), Nginx (fastcgi)

Reverse Proxies: Nginx, Squid, Varnish

Static Content: Apache, Lighttpd, Tinyhttpd, Nginx

Software Load Balancers: Nginx, Perlbal

Database (ORM)

Won’t make your queries efficient. Make your own indexes.

select_related() can be good, as well as bad.

Inherited ordering (Meta: ordering) will get you.

Hundreds of queries on a page is never a good thing.

Know when to not use the ORM.
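
The "make your own indexes" point can be checked without Django. A minimal sketch, assuming an in-memory SQLite database and a hypothetical poll table (the table, column and index names are illustrative, not from the talk): the same filter goes from a full table scan to an index search once the index exists.

```python
import sqlite3


def query_plan(conn, sql, params=()):
    """Return the detail column of EXPLAIN QUERY PLAN as one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The human-readable detail is the last column of every plan row.
    return " | ".join(row[-1] for row in rows)


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE poll (id INTEGER PRIMARY KEY, name TEXT, category_id INTEGER)"
)
conn.executemany(
    "INSERT INTO poll (name, category_id) VALUES (?, ?)",
    [("poll %d" % i, i % 50) for i in range(1000)],
)

sql = "SELECT name FROM poll WHERE category_id = ?"

# Without an index the database has to scan every row.
plan_before = query_plan(conn, sql, (7,))

# The index is created by hand; nothing adds it for this query automatically.
conn.execute("CREATE INDEX poll_category_idx ON poll (category_id)")
plan_after = query_plan(conn, sql, (7,))

print(plan_before)
print(plan_after)
```

The exact plan wording differs between SQLite versions and other databases have their own EXPLAIN output, but the habit is the same: look at the plan for the queries the ORM generates.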

Handling JOINs

from django.contrib.auth.models import User
from django.db import models
from django.shortcuts import render_to_response

class Category(models.Model):
    name = models.CharField()
    created_by = models.ForeignKey(User)

class Poll(models.Model):
    name = models.CharField()
    category = models.ForeignKey(Category)
    created_by = models.ForeignKey(User)

# We need to output a page listing all Polls with
# their name and category's name.

def a_bad_example(request):
    # select_related() with no arguments JOINs Poll with User and Category,
    # and Category then JOINs with User a second time.
    my_polls = Poll.objects.all().select_related()
    return render_to_response('polls.html', locals(), request)

def a_good_example(request):
    # Name the relations explicitly: only Category is JOINed here.
    my_polls = Poll.objects.all().select_related('category')
    return render_to_response('polls.html', locals(), request)
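
What the explicit JOIN saves can be counted. A rough sketch, assuming plain sqlite3 instead of the Django ORM and hypothetical poll and category tables: listing 20 polls with their category names costs 21 SELECTs when each category is fetched lazily, and one SELECT when the category is JOINed in.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
executed = []  # every SQL statement SQLite runs is recorded here
conn.set_trace_callback(executed.append)

conn.executescript("""
    CREATE TABLE category (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE poll (id INTEGER PRIMARY KEY, name TEXT,
                       category_id INTEGER REFERENCES category(id));
""")
conn.executemany(
    "INSERT INTO category (id, name) VALUES (?, ?)",
    [(i, "category %d" % i) for i in range(1, 6)],
)
conn.executemany(
    "INSERT INTO poll (name, category_id) VALUES (?, ?)",
    [("poll %d" % i, i % 5 + 1) for i in range(20)],
)
conn.commit()


def count_selects(fn):
    """Run fn and return (its result, number of SELECTs it issued)."""
    start = len(executed)
    result = fn()
    selects = [s for s in executed[start:]
               if s.lstrip().upper().startswith("SELECT")]
    return result, len(selects)


def lazy_listing():
    # One query for the polls, then one more per poll for its category.
    rows = []
    polls = conn.execute(
        "SELECT name, category_id FROM poll ORDER BY id").fetchall()
    for name, category_id in polls:
        category = conn.execute(
            "SELECT name FROM category WHERE id = ?", (category_id,)
        ).fetchone()[0]
        rows.append((name, category))
    return rows


def joined_listing():
    # A single query that JOINs only the table the page needs.
    return conn.execute(
        "SELECT poll.name, category.name FROM poll "
        "JOIN category ON category.id = poll.category_id ORDER BY poll.id"
    ).fetchall()


lazy_rows, lazy_queries = count_selects(lazy_listing)
joined_rows, joined_queries = count_selects(joined_listing)
print(lazy_queries, joined_queries)
```

Both listings return the same rows; only the number of round trips to the database changes.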

Template Rendering

Sandboxed engines are typically slower by nature.

Keep logic in views and template tags.

Be aware of performance in loops, and groupby (regroup).

Loaded templates can be cached to avoid disk reads.

Switching template engines is easy, but may not give you any worthwhile performance gain.
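
The point about caching loaded templates can be sketched in a few lines. This is an illustration only, assuming Python's string.Template in place of a real template engine and a hypothetical CachingLoader class: each file is read and parsed once, then reused for every render.

```python
import os
import tempfile
from string import Template


class CachingLoader(object):
    """Hypothetical loader: reads each template file once, then reuses it."""

    def __init__(self, template_dir):
        self.template_dir = template_dir
        self._cache = {}
        self.disk_reads = 0  # counts how often the disk is actually touched

    def get_template(self, name):
        template = self._cache.get(name)
        if template is None:
            path = os.path.join(self.template_dir, name)
            with open(path) as handle:
                template = Template(handle.read())
            self.disk_reads += 1
            self._cache[name] = template
        return template


# Write a throwaway template file so the sketch is self-contained.
template_dir = tempfile.mkdtemp()
with open(os.path.join(template_dir, "poll.txt"), "w") as handle:
    handle.write("$name ($category)")

loader = CachingLoader(template_dir)
lines = [
    loader.get_template("poll.txt").substitute(name="poll %d" % i,
                                               category="misc")
    for i in range(3)
]
print(lines, loader.disk_reads)
```

A cache like this never notices edits to the file, which is usually acceptable in production and unwanted in development.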

Template Engines

Using Django with Jinja from jinja.contrib.djangosupport import render_to_response from models import MyModel def myview(request): my_object_list = MyModel.objects.all() # Both the context, and request parameters are optional. # If you pass request it will execute your context processors. return render_to_response(‘template/name.html’, locals(), request) from jinja.contrib.djangosupport import register, convert_django_filter def truncatechars(length=30): def wrapped(env, context, value): if len(value) > length: value = value[0:length-3] + '...' return value return wrapped register.filter(truncatechars) from django.contrib.humanize.templatetags.humanize import intcomma register.filter(convert_django_filter(intcomma), 'intcomma')

from jinja.contrib.djangosupport import render_to_response

from models import MyModel

def myview(request):

my_object_list = MyModel.objects.all()

# Both the context, and request parameters are optional.

# If you pass request it will execute your context processors.

return render_to_response(‘template/name.html’, locals(), request)

from jinja.contrib.djangosupport import register, convert_django_filter

def truncatechars(length=30):

def wrapped(env, context, value):

if len(value) > length:

value = value[0:length-3] + '...'

return value

return wrapped

register.filter(truncatechars)

from django.contrib.humanize.templatetags.humanize import intcomma

register.filter(convert_django_filter(intcomma), 'intcomma')

Caching

Two flavors of caching: object cache and browser cache.

Django provides built-in support for both.

Invalidation is a headache without a well thought out plan.

Caching isn’t a solution for slow loading pages or improper indexes.

Use a reverse proxy in between the browser and your web servers: Squid, Varnish, Nginx, etc.

Cache With a Plan

Build your pages to use proper cache headers.

Create a plan for object cache expiration, and invalidation.

For typical web apps you can serve the same cached page for both anonymous and authenticated users.

Contain commonly used querysets in managers for transparent caching and invalidation.
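
"Proper cache headers" could look roughly like this. A sketch under assumptions: cache_headers and its parameters are hypothetical helpers, not Django API, and the lifetimes are made-up values. Pages marked public may be stored by a shared cache such as Squid; private pages only by the visitor's browser.

```python
import time
from email.utils import formatdate


def cache_headers(max_age, public=True, now=None):
    """Build response headers for a page that may be cached for max_age seconds.

    now is a Unix timestamp; it is a hypothetical parameter that makes the
    result reproducible in tests.
    """
    if now is None:
        now = time.time()
    if max_age <= 0:
        # Tell browsers and proxies not to keep a copy at all.
        return {"Cache-Control": "no-cache, no-store, must-revalidate"}
    scope = "public" if public else "private"
    return {
        "Cache-Control": "%s, max-age=%d" % (scope, max_age),
        # Expires carries the same deadline as an HTTP date.
        "Expires": formatdate(now + max_age, usegmt=True),
    }


anonymous_page = cache_headers(300, now=0)
account_page = cache_headers(60, public=False, now=0)
form_page = cache_headers(0)
print(anonymous_page)
```

In a Django view the resulting dictionary would be copied onto the response object header by header.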

Cache Commonly Used Items

from django.core.cache import cache
from django.db import models

def my_context_processor(request):
    # We access object_list every time we use our context processors, so
    # it makes sense to cache this, no?
    cache_key = 'mymodel:all'
    object_list = cache.get(cache_key)
    if object_list is None:
        object_list = MyModel.objects.all()
        cache.set(cache_key, object_list)
    return {'object_list': object_list}

# Now that we are caching the object list, we are going to want to invalidate it.

class MyModel(models.Model):
    name = models.CharField()

    def save(self, *args, **kwargs):
        # Save it before you update the cache.
        super(MyModel, self).save(*args, **kwargs)
        cache.set('mymodel:all', MyModel.objects.all())
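
The same idea can be pulled into a manager-like wrapper so callers never touch cache keys. A minimal sketch, assuming a plain dict in place of memcached and a hypothetical CachedListManager class (it is not a Django manager, only the shape of one): reads go through all(), writes call invalidate().

```python
class CachedListManager(object):
    """Hypothetical manager-like wrapper: callers ask for all(), caching stays hidden."""

    def __init__(self, load, cache, cache_key):
        self._load = load            # callable that runs the real query
        self._cache = cache          # any dict-like store; a dict stands in for memcached
        self._cache_key = cache_key

    def all(self):
        items = self._cache.get(self._cache_key)
        if items is None:
            items = list(self._load())   # evaluate once, store a plain list
            self._cache[self._cache_key] = items
        return items

    def invalidate(self):
        # Drop the stale entry; the next all() call reloads it.
        self._cache.pop(self._cache_key, None)


database = ["news", "sports"]   # stand-in for a table
loads = []                       # records each trip to the "database"


def load_categories():
    loads.append(1)
    return list(database)


categories = CachedListManager(load_categories, {}, "category:all")
first = categories.all()
second = categories.all()        # served from the cache
database.append("weather")       # a write happens...
categories.invalidate()          # ...so the cached list is dropped
third = categories.all()
print(len(loads), third)
```

Deleting the key on write, rather than rewriting it as the slide does, moves the reload cost to the next reader. Which is better depends on how often the list is written compared to how often it is read.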

Profiling Code

Finding the bottleneck can be time consuming.

Tools exist to help identify common problematic areas.

cProfile/Profile Python modules.

PDB (Python Debugger)
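
Outside of a request cycle, cProfile can be pointed at any callable. A small sketch using only the standard library; slow_sum and view are made-up stand-ins for real application code.

```python
import cProfile
import io
import pstats


def slow_sum(n):
    # Deliberately naive loop so it shows up in the profile.
    total = 0
    for i in range(n):
        total += i * i
    return total


def view():
    return [slow_sum(20000) for _ in range(5)]


profiler = cProfile.Profile()
result = profiler.runcall(view)

# Collect the report in a string instead of printing straight to stdout.
buffer = io.StringIO()
stats = pstats.Stats(profiler, stream=buffer)
stats.sort_stats("cumulative").print_stats(5)
report = buffer.getvalue()
print(report)
```

Sorting by cumulative time puts the entry point first and the functions it spends its time in right below it, which is usually where to start looking.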

Profiling Code With cProfile

import sys

try:
    import cProfile as profile
except ImportError:
    import profile

try:
    from cStringIO import StringIO
except ImportError:
    from StringIO import StringIO

from django.conf import settings

class ProfilerMiddleware(object):
    def can(self, request):
        # Profile only in DEBUG mode, when ?prof is in the URL, and for internal IPs.
        return (settings.DEBUG and 'prof' in request.GET and
                (not settings.INTERNAL_IPS or
                 request.META['REMOTE_ADDR'] in settings.INTERNAL_IPS))

    def process_view(self, request, callback, callback_args, callback_kwargs):
        if self.can(request):
            self.profiler = profile.Profile()
            args = (request,) + callback_args
            return self.profiler.runcall(callback, *args, **callback_kwargs)

    def process_response(self, request, response):
        if self.can(request):
            self.profiler.create_stats()
            out = StringIO()
            # print_stats() writes to stdout, so capture it temporarily.
            old_stdout, sys.stdout = sys.stdout, out
            self.profiler.print_stats(1)
            sys.stdout = old_stdout
            response.content = '<pre>%s</pre>' % out.getvalue()
        return response

http://localhost:8000/?prof

Profiling Database Queries

try:
    from cStringIO import StringIO
except ImportError:
    from StringIO import StringIO

from django.conf import settings
from django.db import connection

class DatabaseProfilerMiddleware(object):
    def can(self, request):
        return (settings.DEBUG and 'dbprof' in request.GET and
                (not settings.INTERNAL_IPS or
                 request.META['REMOTE_ADDR'] in settings.INTERNAL_IPS))

    def process_response(self, request, response):
        if self.can(request):
            out = StringIO()
            out.write('time\tsql\n')
            total_time = 0
            # query['time'] is a string, so sort on its float value, slowest first.
            for query in sorted(connection.queries, key=lambda x: float(x['time']), reverse=True):
                total_time += float(query['time'])*1000
                out.write('%s\t%s\n' % (query['time'], query['sql']))
            response.content = '<pre style="white-space:pre-wrap">%d queries executed in %.3f seconds\n\n%s</pre>' % (len(connection.queries), total_time/1000, out.getvalue())
        return response

http://localhost:8000/?dbprof
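
The report logic is easy to try without a running site. A sketch, assuming a hand-written list shaped like Django's connection.queries (dicts with 'sql' and 'time' keys, the time being a string of seconds); the sample queries and timings are invented.

```python
def summarize_queries(queries):
    """Return (header, lines) for a query log, slowest query first."""
    # Times are strings, so compare them as floats: "9.000" would otherwise
    # sort above "10.500".
    slowest_first = sorted(queries, key=lambda q: float(q["time"]), reverse=True)
    total = sum(float(q["time"]) for q in queries)
    lines = ["%s\t%s" % (q["time"], q["sql"]) for q in slowest_first]
    header = "%d queries executed in %.3f seconds" % (len(queries), total)
    return header, lines


# Invented sample data in the shape of connection.queries.
sample = [
    {"sql": "SELECT ... FROM poll", "time": "0.012"},
    {"sql": "SELECT ... FROM category", "time": "10.500"},
    {"sql": "SELECT ... FROM auth_user", "time": "9.000"},
]

header, lines = summarize_queries(sample)
print(header)
print("\n".join(lines))
```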

Summary

Database efficiency is the typical problem in web apps.

Develop and deploy a caching plan early on.

Use profiling tools to find your problematic areas.

Don’t pre-optimize unless there is good reason.

Find someone who knows more than me to configure your server software. 

Slides and code available online at: http://www.davidcramer.net/djangocon

Thanks!
