Information about Glaskowsky

Published on February 4, 2008

Author: Virginia



GPUs and CPUs: The Uneasy Alliance:
Panel Discussion

Panelists:
Neil Trevett, 3Dlabs
Michael Doggett, ATI
Adam Lake, Intel
David Kirk, NVIDIA
Bill Mark, University of Texas at Austin
Moderator: Peter N. Glaskowsky, MemoryLogix

Neil Trevett 3Dlabs:
Neil Trevett is Senior Vice President for Market Development at 3Dlabs, Inc. Trevett also serves as President of the Web3D Consortium and secretary of the Khronos Group, which is developing the OpenML and OpenGL ES standards for dynamic media processing and graphics APIs for embedded appliances and applications.

GP2 Musings:
Neil Trevett, Senior VP Market Development, 3Dlabs
President, Khronos Group
Los Angeles 2004

CPUs and GPUs – Dynamic Tension:
CPUs and GPUs exist because of their different design goals
CPUs – maximize performance and minimize cost of executing SCALAR code
GPUs – exploit parallelism to beat CPUs at executing VECTOR code
BUT - GPUs are rapidly integrating many CPU techniques
Learned and refined by the CPU community over decades
A message from your sponsor:
Advanced GPUs designed exclusively for PROFESSIONAL PRODUCTIVITY
If you would like to try a Wildcat Realizm board, email

CPUs and GPUs – Dynamic Tension:
Fundamentally different designs finding increasingly common ground
Increasing commonality creates possibilities for tighter integration
E.g. merge virtual address spaces with cache coherency
Would enable new CPU/GPU cooperative paradigms
Possibility of increased coprocessor linkage
Break the AGP/PCIe bottleneck
[Diagram: CPU and GPU with increasing areas of commonality; CPU subsystem and GPU subsystem sharing a cache-coherent unified virtual memory space]

GPUs – More Than Graphics Processors?:
The volume of graphics shipments has created the GPU phenomenon
Ingenious work ongoing to find alternative uses for these graphics machines
Can GPUs be modified to address non-graphics needs?
E.g. double precision, less SIMD more MIMD, more general data storage
Primarily an economic question
Not just technology
Does reaching for new markets decrease your graphics market share?
Increased costs bring no benefit for core market
[Diagram: design spectrum running from the graphics market through imaging to HPC, plotted against $]
Design shift will only occur if the “Integral of Achieved Profit” is increased
Shifting this far – decreases effectiveness in graphics market?
Probably a small stretch for increased volume

Programming GPUs – Industry Challenge:
GPU microarchitectures will not be exposed externally any time soon
Too much intellectual property would be exposed
Would create too much architectural inertia at a time of rapid innovation
Agree that Domain Specific Libraries are an effective, pragmatic approach
Good to start solving specific real problems now
But should we aim higher than just a library approach?
Feels like we need to expose the full flexibility of programmability
Creating effective industry programming infrastructure is a challenge
[Diagram: many domain languages on one side and many evolving GPU architectures on the other, separated by a firewall to GPU ISAs; labeled “Combinatorial Problem”]

Industry Standard Virtual Machine?:
Could a Virtual Machine standard avoid combinatorial explosion?
Uncouples multiple languages from multiple GPUs
Target for domain language architects AND enables innovation by GPU vendors
Create an open and cross-platform industry standard virtual machine?
Correct virtual machine could help and persuade GPUs to evolve into stream processors
What should that virtual machine be?
Can we work together to figure out this key question?
ARB Vertex and Fragment extensions? Too graphics-oriented, too low-level to track the capabilities of evolving GPU architecture?
OpenGL Shading Language? Too graphics-oriented? Effectively a graphics Domain Specific Library – with the flexibility of programmability? Can be extended for more generality? What direction should the OpenGL ARB take?
Brook or sh? The level of abstraction we need to break out of the graphics mind-set? TOO big a leap from graphics base? Too high-level to be a useful virtual machine?
[Diagram: candidate virtual machines placed between domain languages and GPUs]

Battery Powered GPUs!:
The Khronos Group is now defining OpenGL ES 2.0
The OpenGL Shading Language comes to cell phones!
Driven hard by the cell-phone industry for compelling hand-held gaming
Aggressive development to match the availability of GPUs in handsets
OpenGL ES 2.0 will not just be in phones – e.g. games consoles
Sony Playstation is a Khronos Member
[Roadmap diagram]
Mid-03: OpenGL ES 1.0 (based on OpenGL 1.3). Enabled software AND hardware 3D engines – including small-footprint, low-end fixed point platforms
Mid-04: OpenGL ES 1.1 (based on OpenGL 1.5). Increased emphasis on hardware acceleration and enhanced 3D pipeline
Mid-05: OpenGL ES 2.0 (based on OpenGL 2.0). GLSL-based shader programmability for embedded devices. Tackling issues such as remote compilation

Embedded Industry - GP2 Genetic Diversity:
Cell phones – 100Ms of units a year that will have GPUs
3D gaming now PLUS phones mutating to general-purpose personal compute devices
Size, power and cost - low-power design now getting a lot of attention
Interesting for building handhelds AND large arrays for HPC etc.
Embedded industry has fast innovation, flexible infrastructure
Tight CPU/GPU integration might happen here first – systems on a chip
Programmable acceleration avoids multiple media acceleration blocks
A programmable GPU can accelerate 3D, images, video, audio, speech and ….
OpenMAX – a new Khronos standard – domain specific primitive libraries
Uneasy alliance with DSPs too!!
Will GPUs even assume some baseband processing?
[Diagram: single chip with an ARM CPU core and a low-power GPU core sharing a cache-coherent unified virtual memory space; domain-specific primitive libraries – can be accelerated on GPUs]

Michael Doggett ATI:
Michael Doggett is an architect at ATI. He is working on upcoming graphics hardware for Microsoft and desktop PC graphics chips. Before joining ATI, Doggett was a post doc at the University of Tuebingen in Germany and completed his Ph.D. at the University of New South Wales in Sydney, Australia.

GPUs and CPUs: The Uneasy Alliance:
Mike Doggett, ATI

GPUs:
Not stream processors
Graphics black box
Deep pipeline
Arithmetic intensity

GPUs:
How to get new features into GPUs?
Get game developers to use them
Architectural Specs
API definition
GPUBench
Double precision
Performance tradeoff
Simulated double

GPU future:
Competitive market
More of the same

Adam Lake Intel:
Adam Lake is a Sr. Software Engineer at Intel specializing in 3D graphics. Previous areas of work include stream processing, compilers for high level shading languages, and non-photorealistic rendering. He holds an M.S. degree from the University of North Carolina at Chapel Hill.

A few alternatives…:

Intel IXP Network Processor Family:

IXP Perf. Characteristics:
IXP2800 [Intel02]
51 GB/s peak to RDRAM
3 RDRAM channels input and output, total aggregate @533 MHz
32 GB/s peak to SDRAM
4 QDR II SDRAM ports (2 read/2 write) @250 MHz
Example Application: 10GB/s Ethernet
1.4 GHz clock rate
IXP2400: 4,800 MIPS
IXP1200: 1,200 MIPS
Notes: NO FPU!!
Packet arrival rate determines # instructions executed per packet

Key takeaways for IXP:
Designed for Network processing workloads
Switch on event model for hardware resources
No FPU, nor plans for FPU
Improving software stack
Shangri-la project

MXP5800:

Specs of MXP5800:
Internal B/W 532 Mbytes/S/Connection
Theoretical External B/W 1 GByte/S
130 nm
256 MHz
35 mm x 35 mm die

Key takeaways from MXP:
Not a general purpose Microprocessor
Shipping today with software tools
One common ISA for all execution units

So what’s the point?:
Some alternatives for general purpose computing on special purpose hardware
Larger context of stream processing architectures

Programming Models:
Getting the programming model right is hard
Graphics architects got it right for graphics
Made harder if you try to be completely general
Reason: Increase generality, you lose performance
You can quickly lose any benefit of your stream programming model
Fully general streaming, in the limit, is multithreading

Call to Action:
For some applications in computational science and other domains, performance is the dominant factor, not cost
However, in other domains, cost is dominant: Purchase Price per MIP
Not just raw performance
Call to action
Consider chipset implementations:
Analysis of GPGPU taking raw $ cost into account
There are 3 options, not 2: CPU vs. CPU and chipset vs. GPU

The BIG Problems:
How do we program it? Programming Model
How do we feed it? Memory hierarchy and bandwidth
How do we keep it cool? Power and Thermal requirements provide significant challenges for ALL architectures

David Kirk NVIDIA:
David Kirk has been NVIDIA's Chief Scientist since January 1997. Prior to joining NVIDIA, Kirk held positions at Crystal Dynamics and the Apollo Systems Division of Hewlett-Packard Company. Kirk holds M.S. and Ph.D. degrees in Computer Science from the California Institute of Technology.

(Year 2000) The GeForce256 Graphics Pipeline:
[Pipeline diagram. Blocks: vertex, setup, rasterizer, pixel, texture, image, memory. Annotations: vertex transform & lighting; polygon; polygon setup & rasterization; per-pixel interpolation; per pixel texture filter & x8 blending; Z-buffer, x8 blending & anti-alias]

(Year 2004) The GeForce6 Graphics Pipeline:
[Pipeline diagram. Blocks: vertex, setup, rasterizer, pixel, texture, image, memory. Annotations: programmable vertex processing (fp32); polygon; polygon setup, culling, rasterization; programmable per-pixel math (fp32); per-pixel texture, fp16 blending; Z-buf, fp16 blending, anti-alias (MRT)]

(Year 2004) The GeForce6 NON-Graphics Pipeline:
[Pipeline diagram. Blocks: data, setup, rasterizer, data, data, data, memory. Annotations: programmable MIMD processing (fp32); lists; SIMD “rasterization”; programmable SIMD processing (fp32); data fetch, fp16 blending; predicated write, fp16 blend, multiple output]

“GP” Processors:
[Diagram labels: X; shared peak input bandwidth; shared peak output bandwidth; dedicated peak processing power; memory]

Bill Mark University of Texas at Austin:
Bill Mark is an assistant professor in the Department of Computer Sciences at the University of Texas at Austin. Mark was the lead architect of NVIDIA's Cg language and development system. He holds a Ph.D. from the University of North Carolina at Chapel Hill.

GP2 Panel Presentation:
William Mark, University of Texas at Austin

We’re entering an era of disruptive change:
Driven by VLSI technology
Too many transistors: CPU performance plateau
Heat/Power is now a first-class constraint
Possible to fit many processors on a single chip
Two kinds of change coming:
Technical – single-chip parallel computation
Industry structure – pressure for vertical re-integration

What do we mean by “CPU vs. GPU”?:
General HW vs. specialized HW
GPUs moving towards generality, but not fully there yet
Sequential vs. Parallel
Latency optimized vs. Throughput optimized
Two separate chips
Different sets of companies (exception: Intel)
Raw HW access vs. Managed code

Need at least two parallel programming models:
Stream model
Naturally exposes parallelism and communication
Easy to use, when problem maps well
Communicating sequential processes (e.g. pthreads)
Explicitly exposes spatial dimension of HW parallelism
Efficiently supports data-dependent communication patterns
Useful for creating/modifying large irregular data structures
Harder to use – e.g. race conditions
Hard to get performance portability

HW must satisfy mass-market needs:
Games will continue to dominate
Rendering
Simulation? – an opportunity
Maximize impact of research by meeting game needs
Chicken/Egg problem: Co-evolve algorithms and architectures
Different visibility algorithms – ray casting?
Global illumination – shadows, ambient occlusion, reflection, …
Parallelize model management, simulation, game behavior, …
Solving these problems will help other applications

2-year predictions:
CPUs: multi-core trend accelerates
Multicore used by games and HPC
GPUs: More powerful streaming model
Scatter, gather, conditional streams, reductions, etc.
Start to see more success stories for GPGPU
But limits of stream model become apparent
“Dark Horses” attract increasing attention
CELL and others

6-year predictions:
One processing chip for PCs
Who makes it?
Heterogeneous architecture for this chip:
Classical CPU
Parallel fine-grained shared memory (pthreads)
Parallel stream processor (Brook)
Supports ray-casting visibility
This architecture emerges in console space first
This architecture meets many HPC needs

Peter N. Glaskowsky MemoryLogix:
Peter Glaskowsky is Chief System Architect at MemoryLogix, a Silicon Valley microprocessor design startup. Formerly, Glaskowsky was editor in chief of Microprocessor Report and a principal analyst with In-Stat/MDR, a chief engineer at Integrated Device Technology, and a lead engineer at SuperMac and Telebit.

Some Panel Topics:
Which problems are the natural province of the CPU? …of the GPU?
Which CPU design elements will be borrowed by GPUs, and vice-versa?
Which problems support cooperation between the CPU and GPU?
How do we stimulate this cooperation?
Or will it be more like competition?

Panelists:
Neil Trevett, 3Dlabs
Michael Doggett, ATI
Adam Lake, Intel
David Kirk, NVIDIA
Bill Mark, University of Texas at Austin
Moderator: Peter N. Glaskowsky, MemoryLogix
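
Worked example (not part of the original slides): the IXP performance slide notes that the packet arrival rate determines the number of instructions executed per packet. The sketch below makes that relationship concrete. It is only an illustration under stated assumptions: the function name is hypothetical, the line rate is taken to be 10 Gb/s (10 Gigabit Ethernet, where the slide writes "10GB/s"), each frame is assumed to carry the standard 20 bytes of Ethernet preamble and inter-frame gap, and the MIPS figures are the ones quoted on the slide, paired with that line rate purely for illustration.

```python
# Hypothetical helper, not from the slides: estimates how many instructions
# a packet processor can spend on each packet while keeping up with line rate.

# Assumption: standard Ethernet framing overhead on the wire
# (8-byte preamble + 12-byte inter-frame gap).
ETHERNET_OVERHEAD_BYTES = 20


def instructions_per_packet(mips, line_rate_bps, frame_bytes):
    """Return the instruction budget per packet at full line rate.

    mips          -- aggregate processor throughput, millions of instructions/s
    line_rate_bps -- link speed in bits per second
    frame_bytes   -- Ethernet frame size in bytes (64 is the minimum)
    """
    bits_on_wire = (frame_bytes + ETHERNET_OVERHEAD_BYTES) * 8
    packets_per_second = line_rate_bps / bits_on_wire
    return (mips * 1e6) / packets_per_second


if __name__ == "__main__":
    # MIPS figures quoted on the slide; the 10 Gb/s line rate is an assumption.
    for name, mips in (("IXP1200", 1200), ("IXP2400", 4800)):
        for frame in (64, 1518):
            budget = instructions_per_packet(mips, 10e9, frame)
            print(f"{name}: {frame}-byte frames -> {budget:.0f} instructions/packet")
```

With minimum-size 64-byte frames, a 10 Gb/s link delivers roughly 14.88 million packets per second, so a 4,800 MIPS part would have about 323 instructions per packet; with 1518-byte frames the budget rises to roughly 5,900. Smaller packets arrive faster and leave less time for each one, which is the point the slide makes.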
