======================================================================
=                               POWER9                               =
======================================================================

                            Introduction
======================================================================
POWER9 is a family of superscalar, multithreading, multi-core
microprocessors produced by IBM, based on the Power ISA.  It was
announced in August 2016. The POWER9-based processors are manufactured
using a 14 nm FinFET process, in 12- and 24-core versions, for
scale-out and scale-up applications. Other variations are possible,
since the POWER9 architecture is open for licensing and modification
by OpenPOWER Foundation members.

Summit, the seventh fastest supercomputer in the world (based on the
Top500 list as of November 2023), is based on POWER9, while also using
Nvidia Tesla GPUs as accelerators.


Core
======
The POWER9 core comes in two variants, a four-way multithreaded one
called 'SMT4' and an eight-way one called 'SMT8'. The SMT4- and
SMT8-cores are similar, in that they consist of a number of so-called
'slices' fed by common schedulers. A slice is a rudimentary 64-bit
single-threaded processing core with load store unit (LSU), integer
unit (ALU) and a vector scalar unit (VSU, doing SIMD and floating
point). A 'super-slice' is the combination of two slices. An SMT4-core
consists of a 32 KiB L1 instruction cache (1 KiB = 1024 bytes), a 32
KiB L1 data cache, an instruction fetch unit (IFU) and an instruction
sequencing unit (ISU) which feeds two super-slices. An SMT8-core has
two sets of L1 caches, IFUs and ISUs to feed four super-slices. The
result is that the 12-core and 24-core versions of POWER9 each consist
of the same number of slices (96 each) and the same amount of L1
cache.
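
The slice arithmetic above can be checked with a short Python sketch.
The per-core figures are taken from this section, and the pairing of
24 cores with SMT4 and 12 cores with SMT8 follows the chip table later
in the article. The function and variable names are illustrative only
and do not come from any IBM tool.

```python
# Totals for the two POWER9 die configurations described in the text.
SLICES_PER_SUPER_SLICE = 2
L1_SET_KIB = 32 + 32  # one set: two 32 KiB L1 caches

CORE_TYPES = {
    # core type: (super-slices, hardware threads, L1 cache sets) per core
    "SMT4": (2, 4, 1),
    "SMT8": (4, 8, 2),
}

def chip_totals(core_type, cores):
    """Return slice, thread and L1 totals for a chip with `cores` cores."""
    super_slices, threads, l1_sets = CORE_TYPES[core_type]
    return {
        "slices": cores * super_slices * SLICES_PER_SUPER_SLICE,
        "threads": cores * threads,
        "l1_kib": cores * l1_sets * L1_SET_KIB,
    }

smt4_chip = chip_totals("SMT4", 24)
smt8_chip = chip_totals("SMT8", 12)
print("24 x SMT4:", smt4_chip)
print("12 x SMT8:", smt8_chip)
```

Both calls return the same totals, which is the point the paragraph
makes: the two configurations differ in how slices are grouped into
cores, not in how many slices there are.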

A POWER9 core, whether SMT4 or SMT8, has a 12-stage pipeline (five
stages shorter than its predecessor, the POWER8), but aims to retain
the clock frequency of around 4 GHz. It is the first to incorporate
elements of the Power ISA v.3.0, released in December 2015, including
the VSX-3 instructions. The POWER9 design is modular, so that it can
be used in more processor variants and licensed for manufacture on
fabrication processes other than IBM's. On chip are co-processors for
compression and cryptography, as well as a large low-latency eDRAM L3
cache.

The POWER9 comes with a new interrupt controller architecture called
the 'eXternal Interrupt Virtualization Engine' (XIVE), which replaces
a much simpler architecture that was used in POWER4 through POWER8.
XIVE is also used in Power10.


Scale out / scale up
======================
* IBM POWER9 'SO' scale-out variant, optimized for dual socket
computers with up to 120 GB/s bandwidth (1 GB = 1 billion bytes) to
directly attached DDR4 memory (targeted for release in 2017)
* IBM POWER9 'SU' scale-up variant, optimized for four sockets or
more, for large NUMA machines with up to 230 GB/s bandwidth to
buffered memory (uses "25.6 GHz" signaling with the PowerAXON 25
GT/sec Link interface)
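
The bandwidth figures above use decimal gigabytes, whereas the cache
and RAM sizes in this article use binary units (KiB, GiB, TiB). As a
rough illustration of the difference, the following Python sketch
converts the quoted peak figures to GiB/s. The function and variable
names are made up for this example.

```python
GB = 10**9   # decimal gigabyte, as defined in the list above
GIB = 2**30  # binary gibibyte (1024**3 bytes)

def gb_to_gib(value_gb):
    """Convert a quantity in GB (or GB/s) to GiB (or GiB/s)."""
    return value_gb * GB / GIB

# Peak memory bandwidth figures quoted for the two variants, in GB/s.
PEAK_GB_PER_S = {"SO": 120, "SU": 230}

for variant, gb_per_s in PEAK_GB_PER_S.items():
    gib_per_s = gb_to_gib(gb_per_s)
    print(f"{variant}: {gb_per_s} GB/s is about {gib_per_s:.1f} GiB/s")
```

The two units differ by roughly 7% at this scale, which matters when
comparing these figures with memory sizes given in GiB or TiB.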

Both POWER9 variants can ship in versions with some cores disabled for
yield reasons. Raptor Computing Systems, for example, first sold
4-core chips, and even IBM initially sold its AC922 systems with no
more than 22-core chips, even though both types of chips have 24 cores
on their dies.


I/O
=====
Several on-chip facilities support high off-chip I/O performance:
* The 'SO' variant has integrated DDR4 controllers for directly
attached RAM, while the 'SU' variant uses the off-chip Centaur
architecture introduced with POWER8 to include high performance eDRAM
L4 cache and memory controllers for DDR4 RAM.
* The 'Bluelink' interconnects for close attachment of graphics
co-processors from Nvidia (over NVLink v.2) and OpenCAPI accelerators.
* General purpose PCIe v.4 connections for attaching regular ASICs,
FPGAs and other peripherals as well as CAPI 2.0 and CAPI 1.0 devices
designed for POWER8.
* Multiprocessor (symmetric multiprocessor system) links to connect
other POWER9 processors on the same motherboard, or in other closely
attached enclosures.


                             Chip types
======================================================================
POWER9 chips can be made with two types of cores, and in a Scale Out
or Scale Up configuration. POWER9 cores are either SMT4 or SMT8, with
SMT8 cores intended for PowerVM systems, while the SMT4 cores are
intended for PowerNV systems, which do not use PowerVM, and
predominantly run Linux. With POWER9, chips made for Scale Out can
support directly-attached memory, while Scale Up chips are intended
for use with machines with more than two CPU sockets, and use buffered
memory.

POWER9 chips   PowerNV          PowerVM
               24 × SMT4 core   12 × SMT8 core
Scale Out      Nimbus           unknown
Scale Up                        Cumulus
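
For readers who want the chip-type table in machine-readable form, it
can be written as a small Python lookup. This is only a sketch: the
dictionary layout and the function name are assumptions made for
illustration, and the entries simply mirror the table (including the
"unknown" name for the Scale Out PowerVM chip).

```python
# (platform, scaling) -> chip name, mirroring the table above.
CHIP_NAMES = {
    ("PowerNV", "Scale Out"): "Nimbus",
    ("PowerVM", "Scale Out"): "unknown",
    ("PowerVM", "Scale Up"): "Cumulus",
}

# Core configuration associated with each platform in the table.
CORE_CONFIG = {
    "PowerNV": "24 x SMT4",
    "PowerVM": "12 x SMT8",
}

def chip_name(platform, scaling):
    """Return the chip name, or None if the table has no such chip."""
    return CHIP_NAMES.get((platform, scaling))

print(chip_name("PowerNV", "Scale Out"), CORE_CONFIG["PowerNV"])
print(chip_name("PowerVM", "Scale Up"), CORE_CONFIG["PowerVM"])
print(chip_name("PowerNV", "Scale Up"))  # empty cell in the table
```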


Modules
=========
The IBM Portal for OpenPOWER lists the three available modules for the
Nimbus chip, although the Scale-Out SMT8 variant for PowerVM also uses
the LaGrange module/socket:
* 'Sforza' - 50 mm × 50 mm, 4 DDR4, 48 PCIe lanes, 1 XBus 4B
* 'Monza' - 68.5 mm × 68.5 mm, 8 DDR4, 34 PCIe lanes, 1 XBus 4B, 48
OpenCAPI lanes
* 'LaGrange' - 68.5 mm × 68.5 mm, 8 DDR4, 42 PCIe lanes, 2 XBus 4B, 16
OpenCAPI lanes

Sforza modules use a land grid array (LGA) 2601-pin socket.


Raptor Computing Systems / Raptor Engineering
===============================================
'Talos II' - two-socket workstation/server platform using POWER9 SMT4
Sforza processors; available as 2U server, 4U server, tower, or EATX
mainboard. Marketed as secure and owner-controllable with free and
open-source software and firmware. Initially shipping with 4-core,
8-core, 18-core, and 22-core chip options until chips with more cores
are available.

'Talos II Lite' - single-socket version of the Talos II mainboard,
made using the same PCB.

'Blackbird' - single-socket microATX platform using SMT4 Sforza
processors (up to 8-core 160 W variant), 4-8 cores, 2 RAM slots
(supporting up to 256 GiB total)


Google–Rackspace partnership
==============================
'Barreleye G2 / Zaius' - two-socket server platform using LaGrange
processors; both the Barreleye G2 and Zaius chassis use the Zaius
POWER9 motherboard


IBM
=====
'Power System AC922' - 2U, 2× POWER9 SMT4 Monza, with up to 6× Nvidia
Volta GPUs, 2× CAPI 2.0 attached accelerators and 1 TiB DDR4 RAM. AC
here is an abbreviation for Accelerated Computing; this system is also
known as "Witherspoon" or "Newell".

'Power System L922' - 2U, 1-2× POWER9 SMT8, 8-12 cores per processor,
up to 4 TiB DDR4 RAM (1 TiB = 1024 GiB), PowerVM running Linux.

'Power System S914' - 4U, 1× POWER9 SMT8, 4-8 cores, up to 1 TiB DDR4
RAM, PowerVM running AIX/IBM i/Linux.

'Power System S922' - 2U, 1-2× POWER9 SMT8, 4-11 cores per processor,
up to 4 TiB DDR4 RAM, PowerVM running AIX/IBM i/Linux.

'Power System S924' - 4U, 2× POWER9 SMT8, 8-12 cores per processor, up
to 4 TiB DDR4 RAM, PowerVM running AIX/IBM i/Linux.

'Power System H922' - 2U, 1-2× POWER9 SMT8, 4-10 cores per processor,
up to 4 TiB DDR4 RAM, PowerVM running SAP HANA (on Linux) with AIX/IBM
i on up to 25% of the system.

'Power System H924' - 4U, 2× POWER9 SMT8, 8-12 cores per processor, up
to 4 TiB DDR4 RAM, PowerVM running SAP HANA (on Linux) with AIX/IBM i
on up to 25% of the system.

'Power System E950' - 4U, 2-4× POWER9 SMT8, 8-12 cores per processor,
up to 16 TiB buffered DDR4 RAM

'Power System E980' - 1-4× 4U, 4-16× POWER9 SMT8, 8-12 cores per
processor, up to 64 TiB buffered DDR4 RAM

'Hardware Management Console 7063-CR2' - 1U, 1× POWER9 SMT8, 6 cores,
64-128 GB DDR4 RAM.


Penguin Computing
===================
'Magna PE2112GTX' - 2U, two-socket server for high performance
computing using LaGrange processors. Manufactured by Wistron.


IBM supercomputers
====================
'Summit' and 'Sierra' - The United States Department of Energy,
together with Oak Ridge National Laboratory and Lawrence Livermore
National Laboratory, contracted IBM and Nvidia to build two
supercomputers, Summit and Sierra, based on POWER9 processors coupled
with Nvidia's Volta GPUs. These systems were slated to go online in
2017. Sierra is based on IBM's Power Systems AC922 compute node. The
first racks of Summit were delivered to Oak Ridge National Laboratory
on 31 July 2017.

'MareNostrum 4' - One of the three clusters in the emerging
technologies block of the fourth MareNostrum supercomputer is a POWER9
cluster with Nvidia Volta GPUs. This cluster is expected to provide
more than 1.5 petaflops of computing capacity when installed. The
emerging technologies block of the MareNostrum 4 exists to test if new
developments might be "suitable for future versions of MareNostrum".


                      Operating system support
======================================================================
As with its predecessor, POWER9 is supported by FreeBSD, IBM AIX, IBM
i, Linux (running both with and without PowerVM), and OpenBSD.

Implementation of POWER9 support in the Linux kernel began with
version 4.6 in March 2016.
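
On a running Linux system, one way to check for a POWER9 processor is
to look at the 'cpu' field in /proc/cpuinfo. The Python sketch below
does this under stated assumptions: the exact wording of that field
varies between kernels and firmware, the SAMPLE text is illustrative
rather than captured from a real machine, and the helper name
is_power9 is hypothetical.

```python
import platform
from pathlib import Path

def is_power9(cpuinfo_text):
    """Return True if a 'cpu' field in cpuinfo-style text names POWER9."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        # Match the key exactly so fields such as 'cpu MHz' are ignored.
        if sep and key.strip() == "cpu" and "POWER9" in value.upper():
            return True
    return False

# Illustrative sample only (assumed format, not real captured output).
SAMPLE = (
    "processor\t: 0\n"
    "cpu\t\t: POWER9, altivec supported\n"
    "clock\t\t: 3800.000000MHz\n"
)

if __name__ == "__main__":
    print("machine:", platform.machine())
    path = Path("/proc/cpuinfo")
    text = path.read_text() if path.exists() else SAMPLE
    print("POWER9 detected:", is_power9(text))
```

The parsing function takes text rather than a file path so that it can
be tried on any machine, including one without /proc/cpuinfo.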

Red Hat Enterprise Linux (RHEL), SUSE Linux Enterprise (SLES), Debian
Linux, Ubuntu Linux, and CentOS are supported.

The GNU Guix package manager also supports POWER9, but currently only
when hosted on another operating system; the GNU Guix System itself is
not available.


                              See also
======================================================================
* IBM Power microprocessors
* OpenBMC


                           External links
======================================================================
* IBM Power9: https://www.ibm.com/it-infrastructure/power/power9
* IBM Portal for OpenPOWER:
https://www.ibm.com/systems/power/openpower/


License
=========
All content on Gopherpedia comes from Wikipedia, and is licensed under CC-BY-SA
License URL: http://creativecommons.org/licenses/by-sa/3.0/
Original Article: http://en.wikipedia.org/wiki/POWER9