advertisement

How Many Slaves (Ukoug)

60 %
40 %
advertisement
Information about How Many Slaves (Ukoug)

Published on July 24, 2007

Author: dougburns

Source: slideshare.net

Description

UKOUG version of a presentation trying to establish the sensible limits of parallelism on a couple of hardware configurations. Detailed white paper is at http://oracledoug.com/px_slaves.pdf
advertisement

How Many Slaves? Parallel Execution and the Magic of 2 Doug Burns [email_address] http://oracledoug.com

Introduction Introduction What is the Magic of ‘2’? What Tests? Test Scripts and Tools Test Results When is a Conclusion …

Introduction

What is the Magic of ‘2’?

What Tests?

Test Scripts and Tools

Test Results

When is a Conclusion …

Introduction Who (or what) am I ? Scottish Predominantly a DBA Training and Consultancy Current Assignment BSkyB Very Cool Projects and Hardware Less Cool Release Management http://oracledoug.com Blog

Who (or what) am I ?

Scottish

Predominantly a DBA

Training and Consultancy

Current Assignment

BSkyB

Very Cool Projects and Hardware

Less Cool Release Management

http://oracledoug.com

Blog

Why Parallel Execution? Increasing Volumes of Data Increasing User Expectations More Powerful Hardware Parallel Execution (PX) splits a single large task into multiple smaller tasks which are handled by separate processes running concurrently. Full Table Scans Sorts Index Creation, Direct Path inserts etc … Introduction

Why Parallel Execution?

Increasing Volumes of Data

Increasing User Expectations

More Powerful Hardware

Parallel Execution (PX) splits a single large task into multiple smaller tasks which are handled by separate processes running concurrently.

Full Table Scans

Sorts

Index Creation, Direct Path inserts etc …

Previous Paper Suck It Dry – Tuning Parallel Execution http://oracledoug.com/px.html (.doc & .pdf) Reviewer comments on parallel_max_servers Debate about parallel_adaptive_multi_user Something about the Magic of ‘2’ Talked about Hardware, but nothing specific ‘ Sometimes when faced with a slow i/o subsystem you might find that higher degrees of parallelism are useful because the CPUs are spending more time waiting for i/o to complete’ Introduction

Previous Paper

Suck It Dry – Tuning Parallel Execution

http://oracledoug.com/px.html (.doc & .pdf)

Reviewer comments on parallel_max_servers

Debate about parallel_adaptive_multi_user

Something about the Magic of ‘2’

Talked about Hardware, but nothing specific

‘ Sometimes when faced with a slow i/o subsystem you might find that higher degrees of parallelism are useful because the CPUs are spending more time waiting for i/o to complete’

Always set customer expectation levels I hope you didn’t come here looking for answers! Or lots of detail http://oracledoug.com/px_slaves.pdf (or.doc) An interesting story, nonetheless A framework for your own tests A glance at some results Introduction

Always set customer expectation levels

I hope you didn’t come here looking for answers!

Or lots of detail

http://oracledoug.com/px_slaves.pdf (or.doc)

An interesting story, nonetheless

A framework for your own tests

A glance at some results

Introduction Introduction What is the Magic of ‘2’? What Tests? Test Scripts and Tools Test Results When is a Conclusion …

Introduction

What is the Magic of ‘2’?

What Tests?

Test Scripts and Tools

Test Results

When is a Conclusion …

Batch Queue Management and the Magic of ‘2’ Cary Millsap (2000) - available at hotsos.com How many batch processes to execute per CPU? 2. Well, a range of values really, between 1 and 1.8? Most recent work expands on this CPU-intensive batch jobs per CPU <2 (nearer to 1) I/O-intensive batch jobs per CPU >2 CPU and I/O request durations are exactly equal (rare) - CPU * 2 Misconfiguration could change everything What is a batch job anyway? What is The Magic of ‘2’?

Batch Queue Management and the Magic of ‘2’

Cary Millsap (2000) - available at hotsos.com

How many batch processes to execute per CPU?

2. Well, a range of values really, between 1 and 1.8?

Most recent work expands on this

CPU-intensive batch jobs per CPU <2 (nearer to 1)

I/O-intensive batch jobs per CPU >2

CPU and I/O request durations are exactly equal (rare) - CPU * 2

Misconfiguration could change everything

What is a batch job anyway?

What is The Magic of ‘2’? Oracle 10.2 Docs mention the Magic of ‘2’ PARALLEL_THREADS_PER_CPU enables you to adjust for hardware configurations with I/O subsystems that are slow relative to the CPU speed and for application workloads that perform few computations relative to the amount of data involved . If the system is neither CPU-bound nor I/O-bound, then the PARALLEL_THREADS_PER_CPU value should be increased. This increases the default DOP and allow better utilization of hardware resources. The default for PARALLEL_THREADS_PER_CPU on most platforms is two . However, the default for machines with relatively slow I/O subsystems can be as high as eight .

Oracle 10.2 Docs mention the Magic of ‘2’

PARALLEL_THREADS_PER_CPU enables you to adjust for hardware configurations with I/O subsystems that are slow relative to the CPU speed and for application workloads that perform few computations relative to the amount of data involved .

If the system is neither CPU-bound nor I/O-bound, then the PARALLEL_THREADS_PER_CPU value should be increased. This increases the default DOP and allow better utilization of hardware resources.

The default for PARALLEL_THREADS_PER_CPU on most platforms is two . However, the default for machines with relatively slow I/O subsystems can be as high as eight .

What Tests? Introduction What is the Magic of ‘2’? What Tests? Test Scripts and Tools Test Results When is a Conclusion …

Introduction

What is the Magic of ‘2’?

What Tests?

Test Scripts and Tools

Test Results

When is a Conclusion …

What should I test? Parallel operations (obviously) Multiple CPUs I/O infrastructure Operating System – Unix / Linux Free (as in beer) Cross-platform Tools and Utilities Oracle Version – 10.2 The latest and greatest, or common and well-known? Boy, that was a good choice. Workloads – Keep it simple Data! CPU vs I/O balance What Tests?

What should I test?

Parallel operations (obviously)

Multiple CPUs

I/O infrastructure

Operating System – Unix / Linux

Free (as in beer)

Cross-platform

Tools and Utilities

Oracle Version – 10.2

The latest and greatest, or common and well-known?

Boy, that was a good choice.

Workloads – Keep it simple

Data!

CPU vs I/O balance

First attempt Full Table scan of a 2 million row table PCTFREE 90 expanded it to 2.8Gb Small enough for all platforms Big enough to exercise the I/O subsystem properly NOT! EMC took 7 seconds. Second attempt Full Table scan of 8 million row table PCTFREE 90 expanded it to 10Gb Too big for the little PC now! (Used 1/8 of the data) Solved most problems But too I/O intensive (More on this later) What Tests?

First attempt

Full Table scan of a 2 million row table

PCTFREE 90 expanded it to 2.8Gb

Small enough for all platforms

Big enough to exercise the I/O subsystem properly

NOT! EMC took 7 seconds.

Second attempt

Full Table scan of 8 million row table

PCTFREE 90 expanded it to 10Gb

Too big for the little PC now! (Used 1/8 of the data)

Solved most problems

But too I/O intensive (More on this later)

Third attempt FTS plus a Hash Join and Sort of two 8 million row tables PCTFREE 90 expanded them to over 10Gb Unsuitable for the PC, used 1/8 data again Started to produce more interesting results Multi-user tests More on these later 8 new 1 million row tables PCTFREE 90 expanded them to 147Mb each What Tests?

Third attempt

FTS plus a Hash Join and Sort of two 8 million row tables

PCTFREE 90 expanded them to over 10Gb

Unsuitable for the PC, used 1/8 data again

Started to produce more interesting results

Multi-user tests

More on these later

8 new 1 million row tables

PCTFREE 90 expanded them to 147Mb each

What Tests? The Test Process will be much easier if you have Enough Time Appropriate Hardware A Dedicated Assistant A Pleasant Working Environment Two out of Four ain’t bad …

The Test Process will be much easier if you have

Enough Time

Appropriate Hardware

A Dedicated Assistant

A Pleasant Working Environment

Two out of Four ain’t bad …

Intel Single-CPU PC – Tulip PC White Box Linux – Kernel 2.6.9 1 x 550Mhz Pentium 3 768Mb RAM Single 20Gb IDE Intel SMP Server – Intel ISP4400 (SRKA4) White Box Linux – Kernel 2.6.9 4 x 700Mhz Pentium 3 Xeon 3.5Gb RAM 4 x Seagate Cheetah U-160 SCSI Software RAID-0 (256Kb stripe) Separate system/software disk Enable/Disable CPUs by editing grub.conf £300 on eBay including all HDD and shipping What Tests?

Intel Single-CPU PC – Tulip PC

White Box Linux – Kernel 2.6.9

1 x 550Mhz Pentium 3

768Mb RAM

Single 20Gb IDE

Intel SMP Server – Intel ISP4400 (SRKA4)

White Box Linux – Kernel 2.6.9

4 x 700Mhz Pentium 3 Xeon

3.5Gb RAM

4 x Seagate Cheetah U-160 SCSI

Software RAID-0 (256Kb stripe)

Separate system/software disk

Enable/Disable CPUs by editing grub.conf

£300 on eBay including all HDD and shipping

Enterprise SMP server – Sun E10K Solaris 8 12 x 400Mhz SPARC 12Gb RAM EMC Symmetrix 8730 via Brocade SAN 5 x Hard Disk Slices (Hypers) in RAID 1+0 (960Kb stripe) Enable/Disable CPUs using psradm Yes, really ! We had some spare kit kicking around. (Thanks, Mike) DBA Lessons #1 - Always be nice to System and Storage Administrators #2 – Work for companies with a lot of money What Tests?

Enterprise SMP server – Sun E10K

Solaris 8

12 x 400Mhz SPARC

12Gb RAM

EMC Symmetrix 8730 via Brocade SAN

5 x Hard Disk Slices (Hypers) in RAID 1+0 (960Kb stripe)

Enable/Disable CPUs using psradm

Yes, really !

We had some spare kit kicking around. (Thanks, Mike)

DBA Lessons

#1 - Always be nice to System and Storage Administrators

#2 – Work for companies with a lot of money

Test Scripts and Tools Introduction What is the Magic of ‘2’? What Tests? Test Scripts and Tools Test Results When is a Conclusion …

Introduction

What is the Magic of ‘2’?

What Tests?

Test Scripts and Tools

Test Results

When is a Conclusion …

Test Scripts and Tools init.ora Disabled parallel_adaptive_multi_user Set parallel_max_servers to 512 I forgot to increase this a couple of times A stupid mistake in the paper (and an important lesson) Parallel_max_servers=512 keeps defaulting to 385? processes=400 ! Setup scripts To be able to recreate environment easily setup1.sql – Tablespaces, user account and privs setup2.sql – Create two 8 million row / 11Gb tables setup3.sql – Create eight 1 million row / 147Mb tables.

init.ora

Disabled parallel_adaptive_multi_user

Set parallel_max_servers to 512

I forgot to increase this a couple of times

A stupid mistake in the paper (and an important lesson)

Parallel_max_servers=512 keeps defaulting to 385?

processes=400 !

Setup scripts

To be able to recreate environment easily

setup1.sql – Tablespaces, user account and privs

setup2.sql – Create two 8 million row / 11Gb tables

setup3.sql – Create eight 1 million row / 147Mb tables.

Test Scripts and Tools Test scripts To run selected SQL statements consistently across a range of DOPs, unattended. rolling.sh – FTS and HJ/Sort against the big tables session.sh – HJ/Sort of one big table and one of the smaller tables, accepting a session parameter so that multiple copies can run concurrently multi.sh – Harness script that runs session.sh for a given number of users

Test scripts

To run selected SQL statements consistently across a range of DOPs, unattended.

rolling.sh – FTS and HJ/Sort against the big tables

session.sh – HJ/Sort of one big table and one of the smaller tables, accepting a session parameter so that multiple copies can run concurrently

multi.sh – Harness script that runs session.sh for a given number of users

Test Scripts and Tools Information Collection Simple log file SQL statements Output Timings Autotrace v$pq_tqstat query after each statement 10046 Trace File Consolidated version, using client_id and trcsess tkprof output too Watch the overhead in disk space and trcsess run time! System Statistics

Information Collection

Simple log file

SQL statements

Output

Timings

Autotrace

v$pq_tqstat query after each statement

10046 Trace File

Consolidated version, using client_id and trcsess

tkprof output too

Watch the overhead in disk space and trcsess run time!

System Statistics

Test Scripts and Tools Operating System Statistics Resource Usage Bottlenecks Long-running tests – likely to be a lot of data! ORCA/orcallator http://www.orcaware.com/orca Go for the latest development tarball, which includes procallator for Linux statistics collection Easy configuration to generate HTML output Pretty graphs! Lots of them in the paper, but not here.

Operating System Statistics

Resource Usage

Bottlenecks

Long-running tests – likely to be a lot of data!

ORCA/orcallator

http://www.orcaware.com/orca

Go for the latest development tarball, which includes

procallator for Linux statistics collection

Easy configuration to generate HTML output

Pretty graphs!

Lots of them in the paper, but not here.

Test Results Introduction What is the Magic of ‘2’? What Tests? Test Scripts and Tools Test Results When is a Conclusion …

Introduction

What is the Magic of ‘2’?

What Tests?

Test Scripts and Tools

Test Results

When is a Conclusion …

PC – 1 CPU – 1.3Gb

ISP4400 – 1-4 CPUs – FTS 11Gb

ISP4400 – 1-4 CPUs – HJ 22Gb

E10K – 1-12 CPUs – FTS 11Gb

E10K – 1-12 CPUs – HJ 22Gb

Multi-user Tests First attempt Hash Join/Sort statement only 170Mb Tables – 128,000 rows (PCTFREE 90) Between 1 and 12 concurrent users, noparallel to DOP 4 Showed how quickly PX response drops off with multiple users Then I noticed something strange in the V$PQ_TQSTAT output Slaves weren’t doing much work. What’s that sound I can hear? PCTFREE 90 - lots of disk I/O (largely empty blocks) Very small data volumes feeding into later stages of the plan! Mmmmm …. Perhaps that doesn’t test the CPUs too well

First attempt

Hash Join/Sort statement only

170Mb Tables – 128,000 rows (PCTFREE 90)

Between 1 and 12 concurrent users, noparallel to DOP 4

Showed how quickly PX response drops off with multiple users

Then I noticed something strange in the V$PQ_TQSTAT output

Slaves weren’t doing much work.

What’s that sound I can hear?

PCTFREE 90 - lots of disk I/O (largely empty blocks)

Very small data volumes feeding into later stages of the plan!

Mmmmm …. Perhaps that doesn’t test the CPUs too well

Multi-user Tests

Doh! If the CPUs weren’t working hard enough on the multi-user tests, then … I should re-run the Single User/Volume Tests

If the CPUs weren’t working hard enough on the multi-user tests, then …

I should re-run the Single User/Volume Tests

Single User Volume Tests II

When is a Conclusion … Introduction What is the Magic of ‘2’? What Tests? Test Scripts and Tools Test Results When is a Conclusion …

Introduction

What is the Magic of ‘2’?

What Tests?

Test Scripts and Tools

Test Results

When is a Conclusion …

… not a Conclusion? When it contains lots of mights, maybes and coulds? When you’ve been testing the wrong thing? IF you’re the only user of the server and it has more than one CPU and enough disks then You should definitely give PX at a DOP of 2 a try Benefit from the direct path I/O, not the parallelism? _serial_direct_read=true Benefits diminish rapidly If using an unsuitable disk configuration, like these tests Then again, I think a lot of people are When is a Conclusion …

… not a Conclusion?

When it contains lots of mights, maybes and coulds?

When you’ve been testing the wrong thing?

IF you’re the only user of the server and it has more than one CPU and enough disks then

You should definitely give PX at a DOP of 2 a try

Benefit from the direct path I/O, not the parallelism?

_serial_direct_read=true

Benefits diminish rapidly

If using an unsuitable disk configuration, like these tests

Then again, I think a lot of people are

The only way to know for sure is to test your SQL, with your data with a range of DOPs Then choose something below the apparent optimum? Parallel Execution loves hardware But it’s not just about having loads of kit You need to have the right balance of CPU, Memory and I/O bandwidth Bottlenecks will become apparent more quickly Don’t use it for online Unless it’s a handful of users With a predictable maximum number of concurrent activities Set parallel_adaptive_multi_user to TRUE? (10g default) You must explain it to your users! When is a Conclusion …

The only way to know for sure is to test your SQL, with your data with a range of DOPs

Then choose something below the apparent optimum?

Parallel Execution loves hardware

But it’s not just about having loads of kit

You need to have the right balance of CPU, Memory and I/O bandwidth

Bottlenecks will become apparent more quickly

Don’t use it for online

Unless it’s a handful of users

With a predictable maximum number of concurrent activities

Set parallel_adaptive_multi_user to TRUE? (10g default)

You must explain it to your users!

More things to try Bigger stripe widths and filesystem options ( DONE ) Different extent and block sizes ( DONE ) Disk-separated data files and Hash Partitioned Tables Hardware RAID Different Automatic PGA Management Settings Oracle’s Default PX Parameter Values Different SQL What have I started ?!? What price an old EMC Symmetrix on eBay? Do you think Scottish Power do 3-phase power for domestic customers? How will I explain the noise to Housemates and Partner! When is a Conclusion …

More things to try

Bigger stripe widths and filesystem options ( DONE )

Different extent and block sizes ( DONE )

Disk-separated data files and Hash Partitioned Tables

Hardware RAID

Different Automatic PGA Management Settings

Oracle’s Default PX Parameter Values

Different SQL

What have I started ?!?

What price an old EMC Symmetrix on eBay?

Do you think Scottish Power do 3-phase power for domestic customers?

How will I explain the noise to Housemates and Partner!

The scripts are there http://oracledoug.com/px_slaves.doc Tailor them to your needs. Improve them! Let me know your results – I’m interested. Including details of your environment Data creation scripts Your SQL When is a Conclusion …

The scripts are there

http://oracledoug.com/px_slaves.doc

Tailor them to your needs. Improve them!

Let me know your results – I’m interested.

Including details of your environment

Data creation scripts

Your SQL

How Many Slaves? Parallel Execution and the Magic of 2 Doug Burns [email_address] http://oracledoug.com

Add a comment

Related pages

How Many Slaves? - Doug Burns' Oracle and ZX Spectrum ...

How Many Slaves? Parallel Execution and the 'Magic of 2' Abstract and Scope Several sources have suggested to me that the Magic of Two, as described by ...
Read more

November Updates from the OAUX Team - UKOUG

November Updates from the OAUX Team. ... To get a good look at the many ... We're No Longer Slaves To The System. UKOUG President Linda Barker ...
Read more

Slaves. How do we know these people are slaves? - Documents

Slaves How do we know these people are slaves? Spot the slaves Get used to it. p34 What were the 5 main ways people become slaves in Ancient Rome?
Read more

px_slaves - scribd.com

How Many Slaves?Parallel Execution and the 'Magic of 2' Abstract and Scope Several sources have suggested to me that the Magic of Tw...
Read more

Technical Papers - Doug Burns' Oracle and ZX Spectrum ...

Technical Papers. The links on the ... that I presented at the UKOUG 2004 annual conference in Birmingham and the UKOUG's Unix and ... How Many Slaves ...
Read more

How many - Documents

How many? Nobody knows How many? How many? Nobody knows Docslide.us. Upload Login / Signup. Leadership; Technology; Education; Marketing; Design; More Topics.
Read more