Published on February 27, 2014
Computing Platforms for the 21st Century
Abstract: Wikipedia defines Platform as "A raised level surface on which people or things can stand". A more familiar technical interpretation applies to the hardware and OS configuration on which software executes, most frequently the highly stable PC or Mainframe architectures. But the world has changed a lot in the 21st century as serious computing power moved into the hands of the consumer. Nowadays computers that don't look like computers, with production runs in the tens or hundreds of millions, totally eclipse traditional computing and thus the traditional computing platform. So does the ARM architecture define a new platform for this computing environment, or is it more complex than that?
One of our greatest forefathers, Isaac Newton, recognised the reality of platforms when he talked of standing on the shoulders of giants. A platform is a stable place where engineers and scientists can stand to achieve more than they would by their own efforts alone. Platforms are about re-using rather than re-inventing; about Productivity, Quality, TTM, ROI, etc. for the 21st century products we Engineers are now charged to deliver ... It's the economy, stupid!
Context: Seminar at Liverpool University http://www.liv.ac.uk/electrical-engineering-and-electronics/ 45min Keynote, 60min Slot. 25feb14
SlideCast and pdf available via http://ianp24.blogspot.co.uk/
Opinions expressed are those of the author alone
Prof. Ian Phillips
Principal Staff Eng’r, ARM Ltd
email@example.com
Visiting Prof. at ...
Contribution to Industry Award 2008
Seminar, Uo.Liverpool, 25feb14
SlideCast and pdf available via http://ianp24.blogspot.co.uk/
The Traditional Computing Platform
General Purpose Compute Platforms
PC: Dominated by x86 architecture (Intel + AMD + Windows)
DOS, Windows ‘N’, Linux, OpenBSD, FreeVMS
But also Apple ... MacOS ‘N’: Universal Binaries (PowerPC/x86)
Mainframe: IBM, EMC, Hitachi, Unisys, HP, NEC, Fujitsu
Fortran, C/C++, Cobol
Cobol: one of the first languages (1959). In 1997, 80% of the world's business ran on COBOL, with >200 billion lines of code in existence and >5 billion lines of new code annually (Gartner).
Portable Computing: Pocketable GP Compute Platforms
iOS (iPad/iPhone/iPod), Android, Windows 8
... We all have our personal favourites!
Markets provide the Product Opportunities
[Chart: Millions of Units against year, 1960 to 2020]
1st Era: Select work tasks
2nd Era: Broad-based computing for specific tasks
3rd Era: Computing as part of our lives
... Older Markets are still there; just not the Biggest!
The Face of Computing Today (two image-only slides)
The Computing Machine ...
Computing: A general term for algebraic manipulation of Data ...
Numerated Phenomena IN (x) → y = F(x, t, s) → Processed Data/Information OUT (y)
... State and Time are frequently factors in this.
It can include phenomena ranging from human thinking to calculations with a narrower meaning.
Usually used to exercise analogies (models) of real-world situations; frequently in real-time (fast enough to be a stabilising factor in a loop). (Source: Wikipedia)
... Not prescriptive about Implementation Technology!
... Not prescriptive about Programmability!
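
As an illustration of y = F(x, t, s), the sketch below models a machine whose output depends on the input x, the elapsed time and an internal state s. The smoothing filter, the class name `Smoother` and the time constant are assumptions chosen for the example; they do not come from the slides.

```python
from dataclasses import dataclass


@dataclass
class Smoother:
    """Hypothetical computing machine: y = F(x, t, s)."""
    tau: float      # time constant in seconds (assumed value, illustration only)
    s: float = 0.0  # internal state: the previous output

    def step(self, x: float, dt: float) -> float:
        # The blend factor depends on elapsed time dt: longer gaps
        # between samples give the new input more weight.
        alpha = dt / (self.tau + dt)
        # The new output depends on the input and on the stored state.
        self.s += alpha * (x - self.s)
        return self.s


machine = Smoother(tau=1.0)
outputs = [machine.step(x, dt=1.0) for x in (10.0, 10.0, 10.0)]
print(outputs)
```

The same input produces a different output on each call, which is the point of including state and time in F. Nothing here is prescriptive about the implementation technology: gears, valves or software could realise the same function.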
Antikythera c87BC ... Planet Motion Computer
Early-Mechanical Computation
• Inventor: Hipparchos (c.190 BC to c.120 BC).
• Ancient Greek Astronomer, Philosopher and Mathematician.
Single-Task, Continuous Time, Analogue Mechanical Computing (With backlash!)
See: http://www.youtube.com/watch?v=L1CuR29OajI
Babbage's Difference Engine 1837
Late-Mechanical Computation ((Re)construction c2000)
The difference engine consists of a number of columns, numbered from 1 to N. Each column is able to store one decimal number. The only operation the engine can do is add the value of column n + 1 to column n to produce the new value of column n. Column N can only store a constant; column 1 displays (and possibly prints) the value of the calculation on the current iteration.
Computer for Calculating Tables: A Basic ALU Engine
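
The column operation described above can be sketched in a few lines. This is a minimal software model, not a description of the mechanism itself; the function name `difference_engine` and the choice of tabulating x² are assumptions made for the example.

```python
def difference_engine(columns, iterations):
    """Tabulate a polynomial using only additions.

    columns[0] is column 1 (the displayed result); the last entry is
    column N, which holds a constant difference.
    """
    cols = list(columns)
    results = [cols[0]]
    for _ in range(iterations):
        # The only operation: add column n + 1 into column n.
        # Working from column 1 upwards uses the previous value of n + 1.
        for n in range(len(cols) - 1):
            cols[n] += cols[n + 1]
        results.append(cols[0])
    return results


# x^2 has first difference 1 at x = 0 and a constant second difference of 2.
squares = difference_engine([0, 1, 2], 5)
print(squares)
```

Seeding the columns with a value and its finite differences is all the setup needed; every further table entry costs N - 1 additions and no multiplication.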
“Baby” 1948 (Reconstruction)
Valve/Software Computation
General Purpose, Quantised Time and Data, (Digital) Electronic Computing
Electronic System 2014 (aka Cyber-Physical System, in Geek-Talk!)
Analogue Electronics, Digital Electronics, Software, Memory, Mechanics, Micro-Motors, Optics, Sensors, Displays, Discharge Tube
Incorporating DIGIC5+ (ARM)
Robotic Assembly; Plastic, Metal, Glass
... Technologies working seamlessly to deliver Functionality
... Enhanced Human Memory
Putting Technologies into Context
21c Businesses have to be ...
Selling things that Customers (esp. End-Customers) want to buy
Focusing on Their Core Competencies
Opportunities, Competition, Operations and Investors are Global by ...
Business ...
Product Differentiation (Functionality+)
Focusing on what End-Customers need
... Technologies enable Product Options
... Business-Models make the Money
..but.. New Products are ...
Design is a Cost (Risk) to be Minimised
Technology (HW, SW, Mechanics, Optics, Graphene, etc) just offers the potential to differentiate your Products!
The Value of New Technology may not exceed the Cost (Risk)!
... Successful End-Products fund their entire Value-Chains
Moore’s Law: A Technology Opportunity ...
[Chart: Transistors/Chip (M) and Transistor/PM (K) against Approximate Process Geometry, 100um down to 10nm. Source: ITRS’99]
http://en.wikipedia.org/wiki/Moore’s_law
... But an Increasing Design Problem!
[Same chart: Transistors/Chip (M) and Transistor/PM (K) against Approximate Process Geometry. Source: ITRS’99]
Reuse Closes the Productivity Gap!
Pre-1990, chip design was entire ...
Moore’s Law was handled by ever Bigger Teams and ever Faster Tools
With Improved Productivity through HDL and Synthesis
... I was a chip designer in 1978, and did it all myself in 3mth (~1k gates!)
Post-1995, reuse silently entered the picture ...
Circuit Blocks
CPUs (and Software)
... With Supporting External IP Methodology!
Up-Integration (Incl. Software)
Chip Reuse (ASSP)
... Delivering Productivity, Quality and Reliability
... Birth of IP and Know-How Companies (Like ARM c1991)
... Led to the Commoditisation of Silicon (and FABs)!
How Much Reuse Today?
Mobile Products have ~500m gate SoCs / ~500m lines of code
Doubling every 18mth
Designer Productivity is just 100-1000 Gates(Lines)/day
That is tested, verified, incorporated gates (lines)
That’s 2,500-25,000 p.yrs to clean-sheet design! (Un-Resourceable)
Typically ‘Product Designs’ have 50-200 p.yr available ...
That’s just ~0.5% New ... >99.5% Reuse already!
Not Viable to do clean-sheet product design ... nor has it been since ~1995
The core HW/SW is only a part of a Product ...
There’s all of the other Components and Sub-Systems
There’s the IO systems (RF, Audio, Optical, Geo-spatial, Temporal)
There’s the Mechanical
There’s the Reproduction (Factory)
There's the Business Model (Cash-flow, Distribution, Legal)
There’s the Support (Repair, Installation, Maintenance, Replacement)
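
The person-year figures above can be reproduced with a short calculation. This is a sketch under one stated assumption: the slide does not give the number of working days per year, and 200 is the value that matches its 2,500-25,000 p.yr range. The function names are illustrative.

```python
GATES_IN_SOC = 500e6          # ~500m gates (or lines of code), from the slide
WORKING_DAYS_PER_YEAR = 200   # assumption: not stated on the slide


def clean_sheet_person_years(gates, gates_per_day,
                             days_per_year=WORKING_DAYS_PER_YEAR):
    """Effort to design every gate from scratch, in person-years."""
    return gates / gates_per_day / days_per_year


def new_design_share(available_person_years, required_person_years):
    """Fraction of the product that the available effort could create new."""
    return available_person_years / required_person_years


best_case = clean_sheet_person_years(GATES_IN_SOC, 1000)   # fast designers
worst_case = clean_sheet_person_years(GATES_IN_SOC, 100)   # slow designers

print(f"Clean-sheet effort: {best_case:,.0f} to {worst_case:,.0f} p.yr")
print(f"New design share, 50 p.yr vs worst case: "
      f"{new_design_share(50, worst_case):.1%}")
print(f"New design share, 200 p.yr vs best case: "
      f"{new_design_share(200, best_case):.1%}")
```

Under these assumptions the new-design share spans roughly 0.2% to 8%, so the slide's ~0.5% corresponds to the lower-productivity end of the range; either way the overwhelming majority of a product has to be reused.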
How do we Reuse?
Design Tools (across all Product Disciplines) underpin this ...
Reuse of Modules and Components
Reuse of Existing Code and Circuits
Sharing Methodology
Sharing Architecture
Creating Tools to Accelerate Methodology and Repeatability
Design For “x” (DFx) is Design For Up-Stream (Re)Deployment
A significant part is (and will remain) Knowledge based ...
The Designer has done similar work before
The Team has Collective experience
The Company has experience and a customer base
The Design Engineer’s Role is ...
To create Order out of Chaos
Using Current-Technology and Knowledge, to create a Viable Product
Reuse Platform for Productivity
Disintegration of Value-Chains ...
Allows Componentisation of Product (Physical and Virtual)
Encourages Focus on Your Value-Add
Outsource other people’s expertise
Across all aspects of business (Technical, Business and Admin)
Created the opportunity for ...; and for many others.
∘ English as the lingua-franca
∘ Instant global telecoms (ICT)
∘ IT and the Internet
∘ International Contract Law
∘ The World-Trade Organisation (WTO)
∘ Standardisation of GP-Compute Architecture
Changed the meaning of Local ...
... This is a very different way of conducting business
... has never happened before in Human History
... And most people don’t see it today
All Exponentials Must End ...
[Chart: process nodes 130nm, 90nm, 30nm, 14nm, 7nm]
Growing opinion that 14 or 7nm will be the smallest yieldable node ... Ever!
Just 2-3 gen. (3-5yr) to the end of Planar Scaling
Only things on the drawing board today ... can get into the last of the planar chips!
It's also the end-of-the-road for ‘promising technologies’!
Clean-Sheet Synthesis
Scalable Processor Arrays
Formal Design
Top-Down Design
... And the end for Moore’s Law?
Packing Technology into an iCon
Analogue and Digital Design
Embedded Software
Mechanics, Plastics and Glass
Micro-Machines (MEMs)
Displays and Transducers
Robotics and Test
Knowledge and Know-How
Research, Education and Training
Components, Sub-Systems and Systems
Design, Assembly and Manufacture
Metrology, Methodology and Tools
... Involving Many Specialist Businesses
... Round and Round the World
... Not-Least from the UK
Inside The Control Board (a-side)
Level-2: Sub-Assemblies
Visible Computing Contributors ...
Samsung: Flash Memory - NV-MOS (ARM Partner)
Cirrus Logic: Audio Codec - Bi-CMOS (ARM Partner)
AKM: Magnetic Sensor - MEM-CMOS
Texas Instruments: Touch Screen Controller and mobile DDR - Analogue-CMOS (ARM Partner)
RF Filters - SAW Filter Technology
Invisible Computing Contributors ...
OS, Drivers, Stacks, Applications, GSM, Security, Graphics, Video, Sound, etc
Software Tools, Debug Tools, etc
Source: http://www.ifixit.com
Inside The Control Board (b-side)
Level-2: Sub-Assemblies
More Visible Computing Contributors ...
A4 Processor. Spec: Apple, Design & Mfr: Samsung. Digital-CMOS (nm)
... Provides the iPhone 4 with its GP computing power. (Said to contain ARM A8 600 MHz CPU and other ARM IP)
ST-Micro: 3 axis Gyroscope - MEM-CMOS (ARM Partner)
Broadcom: Wi-Fi, Bluetooth, and GPS - Analogue-CMOS (ARM Partner)
Skyworks: GSM - Analogue-Bipolar
Triquint: GSM PA - Analogue-GaAs
Infineon: GSM Transceiver - Analogue/Digital-CMOS (ARM Partner)
[Image labels: GPS; Bluetooth, EDR & FM]
Source: http://www.ifixit.com
The A4 SIP Package (Cross-section)
Down 3-Levels: IC Packaging
[Cross-section labels: Memory ‘Package’, 2 Memory Dies, Processor SOC Die, Glue, 4-Layer Platform ‘Package’]
The processor is the centre rectangle. The silver circles beneath it are solder balls. The two rectangles above are RAM dies, offset to make room for the wirebonds. Putting the RAM close to the processor reduces latency, which makes the RAM faster and cuts power.
Unknown Mfr (Memory); Samsung/ARM (Processor); Unknown (SIP Technology)
Source ... http://www.ifixit.com
The Processor Unit
(Nvidia Tegra 3, Around 1B transistors)
NB: The Tegra 3 is similar to the A4/5, but is not used in the iPhone
Lots and Lots of Designers ...
159 Tier-1 Suppliers ...
Thousands of Design Engineers
Tens of thousands of Engineers Globally
... Hundreds more Tier-2 suppliers (Including ARM)
… System-Packaging Maintains Momentum!
Interposer today; Die-Integration ..and.. Genuine 3D-Process very soon
[Figure labels, 13aug13: 24-Layers 3D NAND-Flash (Transfer to Production); 10 Layer Interposer Die-Stack (10 stack, 1.6 mm; 4x; 8x Sampling); Mixed-Technology Die-Stack on Active Carrier: PV - 500nm Ge, RF - 300nm GaAs, CPU - 90nm Si CMOS, DRAM - 20nm Si FIN-MOS, 300nm Si CMOS]
Moore's Real Law ...
x2 System Functionality every 18-24mth
A Cascade of Technologies over the ages
[Chart: Functional Density (units), 10^0 to 10^12, against year, 1960 to 2020. Electronic era: 1975-2005; System era: 2003-2030]
... A ‘Law’ that started: Stone ⇒ Wood ⇒ Bronze ⇒ Iron ⇒ ...
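
To give a feel for what doubling every 18-24 months means, the sketch below computes the cumulative growth factor over a period. The function name `growth_factor` and the ten-year horizon are assumptions chosen for illustration.

```python
def growth_factor(years, doubling_period_months):
    """Cumulative multiplier after `years` of steady doubling."""
    doublings = years * 12 / doubling_period_months
    return 2 ** doublings


for months in (18, 24):
    print(f"Doubling every {months}mth: "
          f"x{growth_factor(10, months):,.0f} in 10 years")
```

A decade of doubling every 24 months gives a 32-fold increase; at 18 months it is roughly 100-fold, which is why the choice of doubling period matters so much when extrapolating.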
ARM: A Platform for Electronic Systems?
“ARM designs processor technology that lies at the heart of advanced consumer products”
1991: ARM a RISC-Processor Core …
[Block diagram: Address Register and Address Incrementer (ADDR[31:0]); PC, Incrementer and PC Update; Register Bank; Instruction Decoder with Decode Stage, Instruction Decompression and Control Logic; Multiplier; Barrel Shifter; 32 Bit ALU; A Bus, B Bus and ALU Bus; Write Data Register (WDATA[31:0]); Read Data Register (RDATA[31:0]); Scan Debug Control. Signals: nIRQ, nFIQ, nRESET, ABORT, TRANS, PROT, CFGBIGEND, CLK, CLKEN, WRITE, SIZE[1:0], LOCK, CPnOPC, CPnCPI, CPA, CPB]
The ‘Lego-Brick’ Chip-Design Concept
[Block diagram: ARM7 Core, DMA, Par. Port, UART (2), PCMCIA, Timers, W’Dog, Arb’tr., Int’t. Contr., Memory Interface, Misc.]
Systems Get Ever-More Complex!
Today, users require a pocket ‘Super-Computer’ ...
Silicon Technology Provides a few-Billion transistors ...
ARM’s Technology (still) makes it Practical to utilise them ...
nVidia Tegra3: 10 Processors (items labelled ARM on the slide)
• 4 x A9 Processors (2x2)
• 4 x MALI 400 Fragment Proc.
• 1 x MALI 400 Vertex Proc.
• 1 x MALI Video CoDec
• Software Stacks, OS’s and Design Tools
ARM Technology gives chip/system designers ...
• Improved Productivity
• Improved TTM
• Improved Quality/Certainty
... So By Definition ARM is (≥1) Platform!
Systems using Billions of Transistors
ARM Technology drives efficient Electronic System solutions:
Software: increasing system efficiency with optimized software solutions
Diverse components, including CPU and GPU processors designed for specific tasks
Interconnect System IP delivering coherency and the quality of service required for lowest memory bandwidth
Physical IP for a highly optimized processor implementation
Backed by >900 Global Partners ... >800 Licences ... Millions of Developers
Methodology For Productivity
[Image labels: C/C++ Debug & Trace Development Energy Trace Modules Middleware]
The Right Horse for The Course ...
About 50MTr ... About 50KTr
... Delivering ~5x speed (Architecture + Process + Clock)
... Means 24 Processors in 6 Families
A Platform for Power Efficiency
Watts don’t just happen; they are caused!
In the Chip ...
Matching the processor to the application
Minimise voltage/frequency ($P = CV^2f$)
Variable/Gated clock domains
Variable/Switched voltage domains
Maximises Activity-Proportionality (Counter Intuitive)
In the Software ...
Give the OS and the Application SW Information and Controls
Methodology and Utilities
In the System ...
Architecture
Extend control beyond the chip
... HW Dissipates, but SW Makes It!
Parallel is More Power-Efficient
One processor (Input at f → Processor → Output at f):
Capacitance = C, Voltage = V, Frequency = f, Power = $CV^2f$
Two processors in parallel (each running at f/2; Input and Output still at f):
Capacitance = 2.2C, Voltage = 0.6V, Frequency = 0.5f, Power = $2.2C \cdot (0.6V)^2 \cdot 0.5f \approx 0.4\,CV^2f$
... By a factor determined by Amdahl or Gustafson?
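
The power comparison above, and the closing question about Amdahl and Gustafson, can be checked numerically. This is a sketch using normalised units (C = V = f = 1); the function names and the 90% parallel fraction in the example are assumptions, not figures from the slide.

```python
def dynamic_power(c, v, f):
    """Dynamic switching power, P = C * V^2 * f."""
    return c * v ** 2 * f


def amdahl_speedup(parallel_fraction, n):
    """Speed-up for a fixed-size workload on n processors."""
    return 1.0 / ((1.0 - parallel_fraction) + parallel_fraction / n)


def gustafson_speedup(parallel_fraction, n):
    """Scaled speed-up when the workload grows with n processors."""
    return (1.0 - parallel_fraction) + parallel_fraction * n


single = dynamic_power(1.0, 1.0, 1.0)      # one processor: C, V, f
parallel = dynamic_power(2.2, 0.6, 0.5)    # two processors: 2.2C, 0.6V, 0.5f
power_ratio = parallel / single

print(f"Parallel power relative to single: {power_ratio:.3f}")
print(f"Amdahl speed-up, 90% parallel, 2 cores: {amdahl_speedup(0.9, 2):.2f}")
print(f"Gustafson speed-up, 90% parallel, 2 cores: {gustafson_speedup(0.9, 2):.2f}")
```

The power saving of about 60% only holds if the two half-speed processors together sustain the original throughput, which requires the workload to parallelise fully. Any serial fraction reduces the achievable speed-up, and which law describes the loss depends on whether the problem size is fixed (Amdahl) or grows with the hardware (Gustafson).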
CoreLink Supports Multi-Processing
Heterogeneous processors: CPU, GPU, DSP and accelerators
Virtualized Interrupts
Up to 4 cores per cluster
Up to 4 coherent clusters
Up to 18 AMBA interfaces for I/O coherent accelerators and IO
[Block diagram: four Quad Cortex-A15 clusters, each with L2 cache, attached via ACE to the CoreLink™ CCN-504 Cache Coherent Network (Snoop Filter, integrated 8-16MB L3 cache); two CoreLink™ DMC-520 memory controllers (Dual channel DDR3/4 x72; x72 DDR4-3200 each); NIC-400 Network Interconnect; IO Virtualisation with System MMU; Interrupt Control; DSPs, PCIe, DPI, Crypto, USB, SATA, 10-40 GbE, PHY, Flash and GPIO on AHB/ACE; Uniform System memory and Peripheral address space]
big.LITTLE Processing
For High-Performance, Variable-Load systems...
Tightly coupled combination of two ARM CPU clusters:
Cortex-A15 (big Performance) and Cortex-A7 (LITTLE Power), functionally identical
Same programmer's view; looks the same to OS and applications
big.LITTLE combines high-performance and low power
Automatically selects the right processor for the right job
Redefines the efficiency/performance trade-off
big: “Demanding tasks”, >2x Performance
LITTLE: “Always on, always connected tasks”, 30% of the Power (select use cases)
[Both charts compare a current smartphone with big.LITTLE]
Fine-Tuned to Different Performance Points
LITTLE (Cortex-A7, Cortex-A53)
Most energy-efficient applications processor from ARM
Simple, in-order, 8 stage pipelines
Performance better than mainstream, high-volume smartphones (Cortex-A8 and Cortex-A9)
big (Cortex-A15, Cortex-A57)
Highest performance in mobile power envelope
Complex, out-of-order, multi-issue pipelines
Up to 2x the performance of today’s high-end smartphones
[Pipeline diagrams: Queue, Issue, Integer]
big.LITTLE Software Model
CPU Migration
Migrate a single processor workload to the appropriate CPU
Migration = save context then resume on another core
Also known as Linaro “In Kernel Switcher”
DVFS driver modifications and kernel modifications
Based on standard power management routines
Small modification to OS and DVFS, ~600 lines of code
big.LITTLE MP
OS scheduler moves threads/tasks to appropriate CPU
Based on CPU workload
Based on dynamic thread performance requirements
Enables highest peak performance by using all cores at once
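
The idea of moving threads to the appropriate CPU based on workload can be illustrated with a toy model. This is a conceptual sketch only: it is not the Linaro In Kernel Switcher or any real scheduler, and the thresholds, function names and thread names are all hypothetical.

```python
UP_THRESHOLD = 0.8    # hypothetical: load above this moves a thread to a big core
DOWN_THRESHOLD = 0.3  # hypothetical: load below this moves it back to LITTLE


def place(current_cluster, load):
    """Choose a cluster for one thread, with hysteresis to avoid ping-pong."""
    if current_cluster == "LITTLE" and load > UP_THRESHOLD:
        return "big"
    if current_cluster == "big" and load < DOWN_THRESHOLD:
        return "LITTLE"
    return current_cluster  # between thresholds: stay where it is


def schedule(placements, loads):
    """Re-evaluate every thread; both clusters may be in use at once."""
    return {thread: place(cluster, loads[thread])
            for thread, cluster in placements.items()}


placements = {"ui": "LITTLE", "sync": "big", "game": "big"}
loads = {"ui": 0.9, "sync": 0.1, "game": 0.5}
new_placements = schedule(placements, loads)
print(new_placements)
```

Because each thread is placed independently, heavy and light threads can run on big and LITTLE cores at the same time, which is the property the MP model relies on for peak performance. The gap between the two thresholds stops a thread with a borderline load from migrating back and forth.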
A Platform for Applications
BeagleBone Black (TI CPU)
Samsung
Raspberry-Pi (Broadcom CPU)
Xilinx Zynq
A Platform for Things (IoT)
Freescale, NXP, ST Micro
mbed: web-based IoT dev’t environment, www.mbed.org
A Platform for Society
Electronic Systems will underpin all aspects of our lives.
We depend on them today; we will do so ever more in the future
They are based on Electronic Technology, but integrate a mix of technologies to deliver Human-Level Functionality.
Economic Independence of supply is not an option, but Co-Dependence is!
The most important technology in a System is the one that doesn’t work!
... They will NOT Solve Society's Challenges, but will be fundamental to the solutions.
Conclusions ...
Business is about Making Money for Investors ...
Good enough is enough; perfection is for the gods.
Technology enables Product Options; not all of which are Valuable
Most Tech Enterprises provide ‘components’ into ES Products
Platforms are Productivity-Aids ...
A way of creating new Products as quickly and cheaply as possible
Sophisticated is not the same as Valuable
ARM is a Productivity-Aid to the biggest Computer Market today
Electronic Systems will underpin all of our futures ...
Society will create the 21C using the power of Electronic Systems
And will be increasingly unaware of them and their technologies!
Ever more Sophisticated Systems will require ever greater Reuse
... Platforms will make 21C Electronic-Systems Possible