author | Angelos Mouzakitis <a.mouzakitis@virtualopensystems.com> | 2023-10-10 14:33:42 +0000 |
---|---|---|
committer | Angelos Mouzakitis <a.mouzakitis@virtualopensystems.com> | 2023-10-10 14:33:42 +0000 |
commit | af1a266670d040d2f4083ff309d732d648afba2a (patch) | |
tree | 2fc46203448ddcc6f81546d379abfaeb323575e9 /roms/skiboot/doc/release-notes | |
parent | e02cda008591317b1625707ff8e115a4841aa889 (diff) |
Change-Id: Iaf8d18082d3991dec7c0ebbea540f092188eb4ec
Diffstat (limited to 'roms/skiboot/doc/release-notes')
164 files changed, 32728 insertions, 0 deletions
diff --git a/roms/skiboot/doc/release-notes/index.rst b/roms/skiboot/doc/release-notes/index.rst new file mode 100644 index 000000000..43919770c --- /dev/null +++ b/roms/skiboot/doc/release-notes/index.rst @@ -0,0 +1,10 @@ +============= +Release Notes +============= + +.. toctree:: + :maxdepth: 1 + :glob: + + * + diff --git a/roms/skiboot/doc/release-notes/skiboot-4.0.rst b/roms/skiboot/doc/release-notes/skiboot-4.0.rst new file mode 100644 index 000000000..5853f0f0f --- /dev/null +++ b/roms/skiboot/doc/release-notes/skiboot-4.0.rst @@ -0,0 +1,16 @@ +.. _skiboot-4.0: + +=========== +skiboot 4.0 +=========== + +Skiboot 4.0 was released 19th November 2014. It was the first release to obtain +an independent version number and numbering scheme. Previous releases were +identified either purely by a GIT SHA1 hash or the associated PowerKVM release +number. + +This release introduced the following OPAL calls: + + - :ref:`OPAL_IPMI_SEND` + - :ref:`OPAL_IPMI_RECV` + - :ref:`OPAL_I2C_REQUEST` diff --git a/roms/skiboot/doc/release-notes/skiboot-4.1.1.rst b/roms/skiboot/doc/release-notes/skiboot-4.1.1.rst new file mode 100644 index 000000000..a2aa8657a --- /dev/null +++ b/roms/skiboot/doc/release-notes/skiboot-4.1.1.rst @@ -0,0 +1,40 @@ +.. _skiboot-4.1.1: + +============= +skiboot 4.1.1 +============= + +Skiboot 4.1.1 was released 30th January 2015. + + * fsp: Avoid NULL dereference in case of invalid class_resp bits + CQ: SW288484 + * Makefile: Support CROSS_COMPILE as well as CROSS + * Additional unit testing: + + * Tiny hello_world kernel + * Will run boot tests with hello_world and (if present) petitboot + image in the POWER8 Functional simulator (mambo) (if present) + * Run CCAN unit tests as part of 'make check' + * Increased testing of PEL code + * unit test console-log + * skeleton libc unit tests + * Fix compatible match for palmetto & habanero + The strings should be "tyan,..." not "ibm,..." 
+ (N/A for IBM systems) + * i2c: Unify the frequencies to calculate bit rate divisor + * Unlock rtc cache lock when cache isn't valid + Could cause IPL crash on POWER7 + * Initial documentation for OPAL API, ABI and Specification + * Add Firestone platform + * Fix crash when one socket wasn't populated with a CPU + LTC-Bugzilla: 120562 + * Bug fix in RTC state machine which possibly led to RTC not working + * Makefile fixes for running with some GCC 4.9 compilers + * Add device tree properties for pstate vdd and vcs values + * cpuidle: Add validated metrics for idle states + Export residency times in device tree + * Revert "platforms/astbmc: Temporary reboot workaround" + (N/A for IBM systems) + * Fix buffer overrun in print_* functions. + This could cause IPL failures or conceivably other runtime problems + diff --git a/roms/skiboot/doc/release-notes/skiboot-4.1.rst b/roms/skiboot/doc/release-notes/skiboot-4.1.rst new file mode 100644 index 000000000..341c10f55 --- /dev/null +++ b/roms/skiboot/doc/release-notes/skiboot-4.1.rst @@ -0,0 +1,43 @@ +.. _skiboot-4.1: + +=========== +skiboot 4.1 +=========== + +Skiboot 4.1 was released 10th December 2014. It was a release where more +development transitioned to the open source mailing list rather than internal +mailing lists. + +Changes include: + + - We now build with -fstack-protector and -Werror + - Stack checking extensions when built with STACK_CHECK=1 + - Reduced stack usage in some areas, -Wstack-usage=1024 now. 
+ + - Some functions could use 2kb stack, now all are <1kb + - Unsafe libc functions such as sprintf() have been removed + - Symbolic backtraces + - expose skiboot symbol map to OS (via device-tree) + - removed machine check interrupt patching in OPAL + - occ/hbrt: Call stopOCC() for implementing reset OCC command from FSP + - occ: Fix the low level ACK message sent to FSP on receiving {RESET/LOAD}_OCC + - hardening to errors of various FSP code + + - fsp: Avoid NULL dereference in case of invalid class_resp bits- + abort if device tree parsing fails + - FSP: Validate fsp_msg in fsp_queue_msg + - fsp-elog: Add various NULL checks + - Finessing of when to use error log vs prerror() + - More i2c work + - Can now run under Mambo simulator (see external/mambo/skiboot.tcl) + (commonly known as "POWER8 Functional Simulator") + - Document skiboot versioning scheme + - opal: Handle more TFAC errors. + + - TB_RESIDUE_ERR, FW_CONTROL_ERR and CHIP_TOD_PARITY_ERR + - ipmi: populate FRU data + - rtc: Add a generic rtc cache + - ipmi/rtc: use generic cache + - Error Logging backend for bmc based machines + - PSI: Drive link down on HIR + - occ: Fix clearing of OCC interrupt on remote fix diff --git a/roms/skiboot/doc/release-notes/skiboot-5.0.rst b/roms/skiboot/doc/release-notes/skiboot-5.0.rst new file mode 100644 index 000000000..0a62546ef --- /dev/null +++ b/roms/skiboot/doc/release-notes/skiboot-5.0.rst @@ -0,0 +1,145 @@ +.. _skiboot-5.0: + +=========== +skiboot 5.0 +=========== + +Skiboot 5.0 was released Friday 10th April 2015. + +Changes in 5.0 (since rc3): + + - Fix chip id for nx coprocessors. 
+ - hw/ipmi: Fix FW Boot Progress sensor + - bt: Add a temporary workaround for bmc dropping messages + - FSP/CUPD: Fix lock issue + +Changes in rc3 (since rc2): + + - add support for cec_power_down on mambo + - external/opal-prd: Use link register for cross-endian branch + - opal header file rework, Linux and skiboot now very closely match (API + in opal-api.h) + - libflash: don't use the low level interface if it doesn't exist + - libflash/file: add file abstraction for libflash + - external: create a GUARD partition parsing utility + +Changes in rc2 (since rc1): + + - opal: Fix an issue where partial LID load causes opal to hang. + - nx: use proc_gen instead of param + - use chip id for NX engine Coproc Instance num + - Fix (hopefully) missing dot symbols in skiboot.map + - exceptions: Catch exceptions at boot time + - exceptions: Remove deprecated exception patching stuff + - mambo: Make mambo_utils.tcl optional + - mambo: Exit mambo when the simulation is stopped + - add NX register defines + - set NX crb input queues to 842 only + - core: Catch attempts to branch through a NULL pointer + - plat/firestone: Add missing platform hooks + - plat/firestone: Add missing platform hooks + - elog: Don't call uninitialized platform elog_commit + - external/opal-prd: Use "official" switch-endian syscall + - hw/ipmi: Rework sensors and fix boot count sensor + +Changes in rc1 (since 4.1.1): + +General: + + * big OPAL API documentation updates + We now document around 19 OPAL calls. 
There's still ~100 left to doc + though :) + * skiboot can load FreeBSD kernel payload (thanks to Nathan Whitehorn) + * You can now run sparse by setting C=1 when building + * PSI: Revert the timeout for PSI link recovery to architected value + now 30mins (prev 15) + * cpuidle: Add validated metrics for idle states + * core/flash: Add flash API + OPAL_FLASH_(READ|WRITE|ERASE) + * capi: Dynamically calculate which CAPP port to use + no longer hardwired to PHB0 + * vpd: Use slca parent-child relationship to create vpd tree + * opal: Do not overwrite same HMI event for multiple HMI errors. + Now Linux will get a HMI event for each HMI error + * HMI event v2 now includes information about checkstop + * HMI improvements, handle more conditions gracefully: + + * TB residue error + * TFMR firmware control error + * TFMR parity + * TFMR HDEC parity error + * TFMR DEC parity error + * TFMR SPURR/PURR parity error + * TB residue and HDEC parity HMI errors on split core + * hostservices: Cache lids prior to first load request + * Warn when pollers are called with a lock held + and keep track of lock depth. + + **NOTE:** This means we will get backtraces in skiboot msglog on FSP machines + This is a KNOWN ISSUE and is largely harmless. + There's still a couple that we haven't yet cleaned, these + messages can be thought of as a TODO list for developers. + + * Don't run pollers in time_wait if lock held + * pci: Don't hang if we have only one CPU + * Detect recursive poller entry + * General cleanup + * Cleanup of opal.h so that we can have Linux and skiboot match + * add sparse annotations to opal.h + * Platform hooks for loading and preloading resources (LIDs) + This lays the groundwork for cutting 4-20 seconds off boot in a + future skiboot release. 
+ * Fix potential race when clearing OCC interrupt status + * Add platform operation for reading sensors + + * add support to read core and memory buffer temperatures + +Mambo/POWER8 Functional Simulator: + + * Replace is_mambo_chip() with a better quirks mechanism. + * Don't hang if we only have one CPU and PCI. + +BMC systems: + + * BMC can load payload from flash + * IPMI on BMC systems: graceful poweroff and reboot + * IPMI on BMC systems: watchdog timer support + * IPMI on BMC systems: PNOR locking + * Support for IPMI progress sensor + * IPMI boot count sensor + * capi: Rework microcode flash download and CAPP upload + load microcode on non-fsp systems + * NEW opal-prd userspace tool that handles PRD on non-FSP systems. + and OPAL PRD calls to support it. + * Improvements to opal-prd, libflash, and ipmi + * ECC support in libflash + * Load CAPI micro code, enabling CAPI on OpenPower systems. + * Dynamically calculate which CAPP port to use, don't hardcode to PHB0 + * memboot flash backend + +POWER8 + + * add nx-842 coproc support + +FSP systems: + + * Make abort() update sp attn area (like assert does) + On FSP systems this gives better error logs/dumps when abort() is hit + * FSP/LEDS: Many improvements and bug fixes + * LED support for FSP machines + Adds OPAL_LEDS_(GET|SET)_INDICATOR and device-tree bindings + * Refactor of fsp-rtc + * OCC loading fixes, including possible race condition where we would + fail to IPL. + +POWER7 + + * Fix unsupported return code of OPAL_(UN)REGISTER_DUMP_REGION on P7 + * occ: Don't do bad XSCOMs on P7 + The OCC interrupt register only exists on P8, accessing it on P7 causes + not only error logs but also causes PRD to eventually gard chips. 
+ * cpu: Handle opal_reinit_cpus() more gracefully on P7 + no longer generate error logs + * libflash updates for openpower + * misc code cleanup + * add nx-842 coproc support diff --git a/roms/skiboot/doc/release-notes/skiboot-5.1.0-beta1.rst b/roms/skiboot/doc/release-notes/skiboot-5.1.0-beta1.rst new file mode 100644 index 000000000..6baaaf0ee --- /dev/null +++ b/roms/skiboot/doc/release-notes/skiboot-5.1.0-beta1.rst @@ -0,0 +1,290 @@ +skiboot-5.1.0-beta1 +=================== + +skiboot-5.1.0-beta1 was released on July 21st, 2015. + +skiboot-5.1.0-beta1 is the first beta release of skiboot 5.1, which will +become a new stable release, replacing skiboot-5.0 (released April 14th 2015) + +Skiboot 5.1-beta1 contains all fixes from skiboot-5.0 stable branch up to +skiboot-5.0.5. + +New features +^^^^^^^^^^^^ +Over skiboot-5.0, the following features have been added: + +* Centaur i2c support +* Add Naples chip (CPU, PHB, LPC serial interrupts) support +* Added qemu platform +* improvements to FSI error handling +* improvements in chip TOD failover (some only on FSP systems) +* Set Relative Priority Register (RPR) to recommended value + + * this affects thread priority in SMT modes +* greatly reduce memory consumption by CPU stacks for non-present CPUs + + * Previously we would reserve enough memory for max PIR for each CPU type. + * This fix frees up 77MB of RAM on a typical P8 system. +* increased OPAL API documentation +* Asynchronous preloading of resources from FSP/flash + + * improves boot time on some systems +* Basic Garrison platform support +* Add Mambo platform (P8 Functional Simulator, systemsim) + + * includes fake NVRAM, RTC +* Support building with GCOV, increasing memory for skiboot binary to 2MB + + * includes boot code coverage testing +* Increased skiboot HEAP size. + + * We are not aware of any system where you would run out, but on large + systems it was getting closer than we liked. 
+* add boot_tests.sh for helping automate boot testing on FSP and BMC machines +* Versioning of pflash and gard utilities to help Linux (or other OS) + distributions with packaging. +* OCC throttle status messages to host +* CAPP timebase sync ("ibm,capp-timebase-sync" in DT to indicate CAPP timebase + was synced by OPAL) + +New features for FSP based machines +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ +* in-band IPMI support +* ethernet adaptor location codes +* add DIMM frequency information to device tree +* improvements in FSP error log code paths +* fix some boot time memory leaks + + * harmless to end user + +New features for AMI BMC based machines +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ +* PCIe power workaround for K80 +* Added support for Macronix 128Mbit flash chips +* Initial PRD support for Firestone platform +* improved reliability when BMC reboots + +Bug Fixes +^^^^^^^^^ +The following bugs have been fixed: + +* Increase PHB3 timeout for electrical links coming up to 2 seconds. + + * fixes issues with some Mellanox cards +* Hang in opal_reinit_cpus() that could prevent kdump from functioning +* PHB3: fix crash in phb3_init +* PHB3: fix crash with fenced PHB in phb3_init_hw() +* Fix bugs in hw/bt.c (interface for IPMI on BMC machines) that could possibly + lead to a crash (dereferencing invalid address, deadlock) +* ipmi/sel: fix use-after-free +* Bug fixes in EEH handling + + * opal_pci_next_error() cleared OPAL_EVENT_PCI_ERROR unconditionally, possibly leading to missed errors. + +FSP-specific bugs fixed: +^^^^^^^^^^^^^^^^^^^^^^^^ +* (also fixed in skiboot-5.0.2) Fix race in firenze_get_slot_info() leading to + assert() with many PCI cards + + With many PCI cards, we'd hit a race where calls to + firenze_add_pcidev_to_fsp_inventory would step on each other leading to + memory corruption and finally an assert() in the allocator being hit + during boot. 
+* PCIe power workaround for K80 cards +* /ibm,opal/led renamed to /ibm,opal/leds in Device Tree + + * compatible change as no FSP based systems shipped with skiboot-5.0 + +General improvements: +^^^^^^^^^^^^^^^^^^^^^ +* don't run pollers on non-boot CPUs in time_wait +* improvements to opal-prd, pflash, libflash + + * including new blocklevel interface in libflash +* many minor fixes to issues found by static analysis +* improvements in FSP error log code paths +* code cleanup in memory allocator +* Don't expose individual nvram partitions in the device tree, just the whole + flash device. +* build improvements for building on ppc64el host +* improvements in cpu_relax() for idle threads, needed for GCOV on large + machines. +* Optimized memset() for POWER8, greatly reducing number of instructions + executed for boot, which helps boot time in simulators. +* Major improvements in hello_world kernel + + * Bloat of huge 17 instruction test case reduced to 10. +* Disable bust_locks for general calls of abort() + + * Should enable better error messages during abort() when other users of + LPC bus exist (e.g. flash) + +Contributors +------------ + +Thanks to everyone who has made skiboot-5.1.0-beta1 happen! 
+ + +Processed 321 csets from 25 developers +3 employers found +A total of 13696 lines added, 2754 removed (delta 10942) + +Developers with the most changesets + +========================== =========== +Developer Changesets +========================== =========== +Stewart Smith 101 (31.5%) +Benjamin Herrenschmidt 32 (10.0%) +Cyril Bur 31 (9.7%) +Vasant Hegde 28 (8.7%) +Jeremy Kerr 27 (8.4%) +Kamalesh Babulal 19 (5.9%) +Alistair Popple 12 (3.7%) +Mahesh Salgaonkar 12 (3.7%) +Neelesh Gupta 8 (2.5%) +Cédric Le Goater 8 (2.5%) +Joel Stanley 8 (2.5%) +Ananth N Mavinakayanahalli 8 (2.5%) +Gavin Shan 6 (1.9%) +Michael Neuling 6 (1.9%) +Frederic Bonnard 3 (0.9%) +Vipin K Parashar 2 (0.6%) +Vaidyanathan Srinivasan 2 (0.6%) +Philippe Bergheaud 1 (0.3%) +Shilpasri G Bhat 1 (0.3%) +Daniel Axtens 1 (0.3%) +Hari Bathini 1 (0.3%) +Michael Ellerman 1 (0.3%) +Andrei Warkentin 1 (0.3%) +Dan Horák 1 (0.3%) +Anton Blanchard 1 (0.3%) +========================== =========== + +Developers with the most changed lines + +========================== ============= +Developer Changed Lines +========================== ============= +Stewart Smith 3987 (27.9%) +Benjamin Herrenschmidt 3811 (26.6%) +Cyril Bur 1918 (13.4%) +Jeremy Kerr 1307 (9.1%) +Mahesh Salgaonkar 886 (6.2%) +Vasant Hegde 764 (5.3%) +Neelesh Gupta 473 (3.3%) +Vipin K Parashar 176 (1.2%) +Alistair Popple 175 (1.2%) +Philippe Bergheaud 171 (1.2%) +Shilpasri G Bhat 165 (1.2%) +Cédric Le Goater 89 (0.6%) +Frederic Bonnard 78 (0.5%) +Gavin Shan 73 (0.5%) +Joel Stanley 65 (0.5%) +Kamalesh Babulal 63 (0.4%) +Michael Neuling 47 (0.3%) +Daniel Axtens 31 (0.2%) +Ananth N Mavinakayanahalli 22 (0.2%) +Anton Blanchard 3 (0.0%) +Vaidyanathan Srinivasan 2 (0.0%) +Hari Bathini 2 (0.0%) +Michael Ellerman 1 (0.0%) +Andrei Warkentin 1 (0.0%) +Dan Horák 1 (0.0%) +========================== ============= + +Developers with the most lines removed: + +========================= ============== +========================= ============== +Vipin K Parashar 
105 (3.8%) +Michael Neuling 24 (0.9%) +Hari Bathini 1 (0.0%) +========================= ============== + +Developers with the most signoffs (total 214) + +========================= ============== +Stewart Smith 214 (100.0%) +========================= ============== + +Developers with the most reviews (total 21) + +========================== ============== +========================== ============== +Vasant Hegde 7 (33.3%) +Joel Stanley 3 (14.3%) +Gavin Shan 2 (9.5%) +Kamalesh Babulal 2 (9.5%) +Alistair Popple 2 (9.5%) +Stewart Smith 1 (4.8%) +Andrei Warkentin 1 (4.8%) +Preeti U Murthy 1 (4.8%) +Samuel Mendoza-Jonas 1 (4.8%) +Ananth N Mavinakayanahalli 1 (4.8%) +========================== ============== + +Developers with the most test credits (total 1) + +========================= ============== +========================= ============== +Chad Larson 1 (100.0%) +========================= ============== + +Developers who gave the most tested-by credits (total 1) + +========================= ============== +========================= ============== +Gavin Shan 1 (100.0%) +========================= ============== + +Developers with the most report credits (total 4) + +========================= ============== +========================= ============== +Benjamin Herrenschmidt 2 (50.0%) +Chad Larson 1 (25.0%) +Andrei Warkentin 1 (25.0%) +========================= ============== + +Developers who gave the most report credits (total 4) + +========================= ============== +========================= ============== +Stewart Smith 3 (75.0%) +Gavin Shan 1 (25.0%) +========================= ============== + +Top changeset contributors by employer + +========================== ============== +========================== ============== +IBM 319 (99.4%) +dan@danny.cz 1 (0.3%) +andrey.warkentin@gmail.com 1 (0.3%) +========================== ============== + +Top lines changed by employer + +========================== ============== +========================== ============== +IBM 
14309 (100.0%) +dan@danny.cz 1 (0.0%) +andrey.warkentin@gmail.com 1 (0.0%) +========================== ============== + +Employers with the most signoffs (total 214) + +========================= ============== +IBM 214 (100.0%) +========================= ============== + +Employers with the most hackers (total 25) + +========================== ============== +========================== ============== +IBM 23 (92.0%) +dan@danny.cz 1 (4.0%) +andrey.warkentin@gmail.com 1 (4.0%) +========================== ============== + diff --git a/roms/skiboot/doc/release-notes/skiboot-5.1.0-beta2.rst b/roms/skiboot/doc/release-notes/skiboot-5.1.0-beta2.rst new file mode 100644 index 000000000..9058f9569 --- /dev/null +++ b/roms/skiboot/doc/release-notes/skiboot-5.1.0-beta2.rst @@ -0,0 +1,69 @@ +skiboot-5.1.0-beta2 +=================== + +skiboot-5.1.0-beta2 was released on August 14th, 2015. + +skiboot-5.1.0-beta2 is the second beta release of skiboot 5.1, which will +become a new stable release, replacing skiboot-5.0 (released April 14th 2015) + +Skiboot 5.1.0-beta2 contains all fixes from skiboot-5.0 stable branch up to +skiboot-5.0.5 and everything from 5.1.0-beta1. + +New Features +^^^^^^^^^^^^ + +Over skiboot-5.1.0-beta1, the following features have been added: + +- opal-api: Add OPAL call to handle abnormal reboots. + OPAL_CEC_REBOOT2 + It currently supports two reboot types: (0) normal reboot, which + behaves like the opal_cec_reboot() call, and + (1) platform error reboot. + + Long term, this is designed to replace OPAL_CEC_REBOOT. + +Bug fixes +^^^^^^^^^ +Over skiboot-5.1.0-beta1, the following bugs have been fixed: + +- external/opal-prd: Only map each PRD range once + + - could eventually lead to failing to map PRD ranges +- On skiboot crash, don't try to print symbol when we didn't find one + + - makes backtrace prettier +- On skiboot crash, dump hsrr0 and hsrr1 registers correctly. 
+- Better support old and biarch compilers + + - test "new" compiler flags before using them + - Specify -mabi=elfv1 if supported (which means it's needed) +- fix boot-coverage-report makefile target +- ipmi: Fix the opal_ipmi_recv() call to handle the error path + + - Could make kernel a sad panda when it continues with other IPMI commands +- IPMI: truncate SELs at 2kb + + - it's the limit of the astbmc. We think. +- IPMI/SEL/PEL: + + - As per PEL spec, we should log events whose severity is >= 0x22 and whose + "service action flag" is "on". But in our case, all OPAL originated logs + are marked as report externally. + We now only report logs with severity >= 0x22 +- IPMI: fixes to eSEL logging +- hw/phb3: Change reserved PE to 255 + + - Currently, we have reserved PE#0 to which all RIDs are mapped prior + to PE assignment request from kernel. The last M64 BAR is configured + to have shared mode. So we have to cut off the first M64 segment, + which corresponds to reserved PE#0 in kernel. If the first BAR + (for example PF's IOV BAR) requires huge alignment in kernel, we + have to waste huge M64 space to accommodate the alignment. If we + have reserved PE#255, the waste of M64 space will be avoided. + +Other changes +^^^^^^^^^^^^^ +- unified version numbers for bundled utilities +- external/boot_test/boot_test.sh + + - more usable for automated boot testing diff --git a/roms/skiboot/doc/release-notes/skiboot-5.1.0.rst b/roms/skiboot/doc/release-notes/skiboot-5.1.0.rst new file mode 100644 index 000000000..d7e65792e --- /dev/null +++ b/roms/skiboot/doc/release-notes/skiboot-5.1.0.rst @@ -0,0 +1,367 @@ +.. _skiboot-5.1.0: + +skiboot-5.1.0 +============= + +skiboot-5.1.0 was released on August 17th, 2015. + +skiboot-5.1.0 is the first stable release of 5.1.0 following two beta releases. +This new stable release replaces skiboot-5.0 as the current stable skiboot +release (5.0 was released April 14th 2015). 
+ +Skiboot 5.1.0 contains all fixes from skiboot-5.0 stable branch up to +skiboot-5.0.5 and everything from 5.1.0-beta1 and 5.1.0-beta2. + +Over skiboot-5.1.0-beta2, we have the following changes: + +- opal_prd now supports multiple socket systems +- fix compiler warnings in gard and libflash + +Below are the changes introduced in previous skiboot-5.1.0 releases over +the previous stable release, skiboot-5.0: + +New features +^^^^^^^^^^^^ + +- Add Naples chip (CPU, PHB, LPC serial interrupts) support +- Added qemu platform +- improvements to FSI error handling +- improvements in chip TOD failover (some only on FSP systems) +- Set Relative Priority Register (RPR) to recommended value + + - this affects thread priority in SMT modes + +- greatly reduce memory consumption by CPU stacks for non-present CPUs + + - Previously we would reserve enough memory for max PIR for each CPU + type. + - This fix frees up 77MB of RAM on a typical P8 system. + +- increased OPAL API documentation +- Asynchronous preloading of resources from FSP/flash + + - improves boot time on some systems + +- Basic Garrison platform support +- Add Mambo platform (P8 Functional Simulator, systemsim) + + - includes fake NVRAM, RTC + +- Support building with GCOV, increasing memory for skiboot binary to 2MB + + - includes boot code coverage testing + +- Increased skiboot HEAP size. + + - We are not aware of any system where you would run out, but on large + systems it was getting closer than we liked. + +- add boot_tests.sh for helping automate boot testing on FSP and BMC machines +- Versioning of pflash and gard utilities to help Linux (or other OS) + distributions with packaging. +- OCC throttle status messages to host +- CAPP timebase sync ("ibm,capp-timebase-sync" in DT to indicate CAPP timebase + was synced by OPAL) +- opal-api: Add OPAL call to handle abnormal reboots. + +``OPAL_CEC_REBOOT2`` currently supports two reboot types: + + 0. 
normal reboot, which behaves like the opal_cec_reboot() call + 1. platform error reboot. + +Long term, this is designed to replace OPAL_CEC_REBOOT. + +New features for FSP based machines +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ +- in-band IPMI support +- ethernet adaptor location codes +- add DIMM frequency information to device tree +- improvements in FSP error log code paths +- fix some boot time memory leaks + + - harmless to end user + +New features for AMI BMC based machines +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ +- PCIe power workaround for K80 +- Added support for Macronix 128Mbit flash chips +- Initial PRD support for Firestone platform +- improved reliability when BMC reboots + +The following bugs have been fixed +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ +- Increase PHB3 timeout for electrical links coming up to 2 seconds. + + - fixes issues with some Mellanox cards + +- Hang in opal_reinit_cpus() that could prevent kdump from functioning +- PHB3: fix crash in phb3_init +- PHB3: fix crash with fenced PHB in phb3_init_hw() +- Fix bugs in hw/bt.c (interface for IPMI on BMC machines) that could possibly + lead to a crash (dereferencing invalid address, deadlock) +- ipmi/sel: fix use-after-free +- Bug fixes in EEH handling + + - opal_pci_next_error() cleared OPAL_EVENT_PCI_ERROR unconditionally, + possibly leading to missed errors. + +- external/opal-prd: Only map each PRD range once + + - could eventually lead to failing to map PRD ranges + +- On skiboot crash, don't try to print symbol when we didn't find one + + - makes backtrace prettier + +- On skiboot crash, dump hsrr0 and hsrr1 registers correctly. 
+- Better support old and biarch compilers + + - test "new" compiler flags before using them + - Specify -mabi=elfv1 if supported (which means it's needed) + +- fix boot-coverage-report makefile target +- ipmi: Fix the opal_ipmi_recv() call to handle the error path + + - Could make kernel a sad panda when it continues with other IPMI commands + +- IPMI: truncate SELs at 2kb + + - it's the limit of the astbmc. We think. + +- IPMI/SEL/PEL: + + - As per PEL spec, we should log events whose severity is >= 0x22 and whose + "service action flag" is "on". But in our case, all OPAL originated logs + are marked as report externally. + We now only report logs with severity >= 0x22 + +- IPMI: fixes to eSEL logging +- hw/phb3: Change reserved PE to 255 + + - Currently, we have reserved PE#0 to which all RIDs are mapped prior + to PE assignment request from kernel. The last M64 BAR is configured + to have shared mode. So we have to cut off the first M64 segment, + which corresponds to reserved PE#0 in kernel. If the first BAR + (for example PF's IOV BAR) requires huge alignment in kernel, we + have to waste huge M64 space to accommodate the alignment. If we + have reserved PE#255, the waste of M64 space will be avoided. + +FSP-specific bugs fixed +^^^^^^^^^^^^^^^^^^^^^^^ +- (also fixed in skiboot-5.0.2) Fix race in firenze_get_slot_info() leading to + assert() with many PCI cards + + With many PCI cards, we'd hit a race where calls to + firenze_add_pcidev_to_fsp_inventory would step on each other leading to + memory corruption and finally an assert() in the allocator being hit + during boot. 
+ +- PCIe power workaround for K80 cards +- /ibm,opal/led renamed to /ibm,opal/leds in Device Tree + + - compatible change as no FSP based systems shipped with skiboot-5.0 + +General improvements +^^^^^^^^^^^^^^^^^^^^ +- Preliminary Centaur i2c support + + - lays framework for supporting Centaur i2c + +- don't run pollers on non-boot CPUs in time_wait +- improvements to opal-prd, pflash, libflash + + - including new blocklevel interface in libflash + +- many minor fixes to issues found by static analysis +- improvements in FSP error log code paths +- code cleanup in memory allocator +- Don't expose individual nvram partitions in the device tree, just the whole + flash device. +- build improvements for building on ppc64el host +- improvements in cpu_relax() for idle threads, needed for GCOV on large + machines. +- Optimized memset() for POWER8, greatly reducing number of instructions + executed for boot, which helps boot time in simulators. +- Major improvements in hello_world kernel + + - Bloat of huge 17 instruction test case reduced to 10. + +- Disable bust_locks for general calls of abort() + + - Should enable better error messages during abort() when other users of + LPC bus exist (e.g. 
flash) + +- unified version numbers for bundled utilities +- external/boot_test/boot_test.sh + + - better usable for automated boot testing + +Contributors +------------ +Since skiboot-5.0, we've had the following changesets: + +Processed 372 csets from 27 developers +2 employers found +A total of 15868 lines added, 3359 removed (delta 12509) + +Developers with the most changesets + +========================== ============= +Developer Changesets +========================== ============= +Stewart Smith 117 (31.5%) +Jeremy Kerr 37 (9.9%) +Cyril Bur 33 (8.9%) +Vasant Hegde 32 (8.6%) +Benjamin Herrenschmidt 32 (8.6%) +Kamalesh Babulal 22 (5.9%) +Joel Stanley 12 (3.2%) +Mahesh Salgaonkar 12 (3.2%) +Alistair Popple 12 (3.2%) +Neelesh Gupta 9 (2.4%) +Gavin Shan 8 (2.2%) +Cédric Le Goater 8 (2.2%) +Ananth N Mavinakayanahalli 8 (2.2%) +Vipin K Parashar 6 (1.6%) +Michael Neuling 6 (1.6%) +Samuel Mendoza-Jonas 3 (0.8%) +Frederic Bonnard 3 (0.8%) +Andrew Donnellan 2 (0.5%) +Vaidyanathan Srinivasan 2 (0.5%) +Philippe Bergheaud 1 (0.3%) +Shilpasri G Bhat 1 (0.3%) +Daniel Axtens 1 (0.3%) +Hari Bathini 1 (0.3%) +Michael Ellerman 1 (0.3%) +Andrei Warkentin 1 (0.3%) +Dan Horák 1 (0.3%) +Anton Blanchard 1 (0.3%) +========================== ============= + + +Developers with the most changed lines + +========================== ============ +========================== ============ +Stewart Smith 4499 (27.3%) +Benjamin Herrenschmidt 3782 (22.9%) +Jeremy Kerr 1887 (11.4%) +Cyril Bur 1654 (10.0%) +Vasant Hegde 959 (5.8%) +Mahesh Salgaonkar 886 (5.4%) +Neelesh Gupta 473 (2.9%) +Samuel Mendoza-Jonas 387 (2.3%) +Vipin K Parashar 332 (2.0%) +Philippe Bergheaud 171 (1.0%) +Shilpasri G Bhat 165 (1.0%) +Alistair Popple 151 (0.9%) +Joel Stanley 105 (0.6%) +Cédric Le Goater 89 (0.5%) +Gavin Shan 83 (0.5%) +Frederic Bonnard 76 (0.5%) +Kamalesh Babulal 65 (0.4%) +Michael Neuling 46 (0.3%) +Daniel Axtens 31 (0.2%) +Andrew Donnellan 22 (0.1%) +Ananth N Mavinakayanahalli 20 (0.1%) +Anton Blanchard 3 
(0.0%) +Vaidyanathan Srinivasan 2 (0.0%) +Hari Bathini 2 (0.0%) +Michael Ellerman 1 (0.0%) +Andrei Warkentin 1 (0.0%) +Dan Horák 1 (0.0%) +========================== ============ + +Developers with the most lines removed + +=========================== ============ +=========================== ============ +Michael Neuling 24 (0.7%) +Hari Bathini 1 (0.0%) +=========================== ============ + +Developers with the most signoffs (total 253) + +=========================== ============ +=========================== ============ +Stewart Smith 249 (98.4%) +Mahesh Salgaonkar 4 (1.6%) +=========================== ============ + +Developers with the most reviews (total 24) + +=========================== ============ +=========================== ============ +Vasant Hegde 9 (37.5%) +Joel Stanley 3 (12.5%) +Gavin Shan 2 (8.3%) +Kamalesh Babulal 2 (8.3%) +Samuel Mendoza-Jonas 2 (8.3%) +Alistair Popple 2 (8.3%) +Stewart Smith 1 (4.2%) +Andrei Warkentin 1 (4.2%) +Preeti U Murthy 1 (4.2%) +Ananth N Mavinakayanahalli 1 (4.2%) +=========================== ============ + +Developers with the most test credits (total 1) + +=========================== ============ +=========================== ============ +Chad Larson 1 (100.0%) +=========================== ============ + +Developers who gave the most tested-by credits (total 1) + +=========================== ============ +=========================== ============ +Gavin Shan 1 (100.0%) +=========================== ============ + +Developers with the most report credits (total 4) + +=========================== ============ +=========================== ============ +Benjamin Herrenschmidt 2 (50.0%) +Chad Larson 1 (25.0%) +Andrei Warkentin 1 (25.0%) +=========================== ============ + +Developers who gave the most report credits (total 4) + +=========================== ============ +=========================== ============ +Stewart Smith 3 (75.0%) +Gavin Shan 1 (25.0%) +=========================== ============ + +Top 
changeset contributors by employer + +========================== ============ +========================== ============ +IBM 369 (99.2%) +(Unknown) 3 (0.8%) +========================== ============ + +Top lines changed by employer + +========================= ============== +========================= ============== +IBM 16497 (100.0%) +(Unknown) 3 (0.0%) +========================= ============== + +Employers with the most signoffs (total 253) + +========================= ============= +========================= ============= +IBM 253 (100.0%) +========================= ============= + +Employers with the most hackers (total 27) + +========================= ============ +========================= ============ +IBM 24 (88.9%) +(Unknown) 3 (11.1%) +========================= ============ + diff --git a/roms/skiboot/doc/release-notes/skiboot-5.1.1.rst b/roms/skiboot/doc/release-notes/skiboot-5.1.1.rst new file mode 100644 index 000000000..22873e45f --- /dev/null +++ b/roms/skiboot/doc/release-notes/skiboot-5.1.1.rst @@ -0,0 +1,44 @@ +skiboot-5.1.1 +------------- + +skiboot-5.1.1 was released on August 18th, 2015. + +skiboot-5.1.1 is the second stable release of 5.1, it follows skiboot-5.1.0. + +Skiboot 5.1.1 contains all fixes from skiboot-5.1.0 and is a minor bugfix +release. + +Changes +^^^^^^^ +Over skiboot-5.1.0, we have the following changes: + +- Fix detection of compiler options on ancient GCC (e.g. gcc 4.4, shipped with + RHEL6) +- ensure the GNUC version defines for GCOV are coming from target CC rather + than host CC for extract-gcov +- phb3: Continue CAPP setup even if PHB is already in CAPP mode + This fixes a critical bug in CAPI support. + + CAPI requires that all faults are escalated into a fence, not a + freeze. This is done by setting bits in a number of MMIO + registers. phb3_set_capi_mode() calls phb3_init_capp_errors() to do + this. 
However, if the PHB is already in CAPP mode - for example in the + recovery case - phb3_set_capi_mode() will bail out early, and those + registers will not be set. + + This is quite easy to verify. PCI config space access errors, for + example, normally cause a freeze. On a CAPI-mode PHB, they should + cause a fence. Say we have a CAPI card on PHB 0, and we inject a + PCI config space error: :: + + echo 0x8000000000000000 > /sys/kernel/debug/powerpc/PCI0000/err_injct_inboundA; + lspci; + + The first time we inject this, the PHB will fence and recover, but + won't reset the registers. Therefore, the second time we inject it, + we will incorrectly freeze, not fence. + + Worse, the recovery for the resultant EEH freeze event interacts + poorly with the CAPP, triggering an EEH recovery of the PHB. The + combination of the two attempted recoveries will get the PHB into + an inoperable state. diff --git a/roms/skiboot/doc/release-notes/skiboot-5.1.10.rst b/roms/skiboot/doc/release-notes/skiboot-5.1.10.rst new file mode 100644 index 000000000..442861be3 --- /dev/null +++ b/roms/skiboot/doc/release-notes/skiboot-5.1.10.rst @@ -0,0 +1,38 @@ +skiboot-5.1.10 +-------------- + +skiboot-5.1.10 was released on Friday November 13th, 2015. + +skiboot-5.1.10 is the 11th stable release of 5.1, it follows skiboot-5.1.9 +(which was released October 30th, 2015). + +Skiboot 5.1.10 contains all fixes from skiboot-5.1.9 and is a minor bug +fix release. + +Over skiboot-5.1.9, we have the following change: + +IBM FSP machines +^^^^^^^^^^^^^^^^ + +- FSP: Handle Delayed Power Off initiated CEC shutdown with FSP in Reset/Reload + + In a scenario where the DPO has been initiated, but the FSP then went into + reset before the CEC power down came in, OPAL may not give up the link since + it may never see the PSI interrupt. So, if we are in dpo_pending and an FSP + reset is detected via the DISR, give up the PSI link voluntarily. 
+ +Generic +^^^^^^^ + +- sensor: add a compatible property + OPAL needs an extra compatible property "ibm,opal-sensor" to make + module autoload work smoothly in Linux for ibmpowernv driver. +- console: Completely flush output buffer before power down and reboot + Completely flush the output buffer of the console driver before + power down and reboot. Implements the flushing function for uart + consoles, which includes the astbmc and rhesus platforms. + + This fixes an issue where some console output is sometimes lost before + power down or reboot in uart consoles. If this issue is also prevalent + in other console types then it can be fixed later by adding a .flush + to that driver's con_ops. diff --git a/roms/skiboot/doc/release-notes/skiboot-5.1.11.rst b/roms/skiboot/doc/release-notes/skiboot-5.1.11.rst new file mode 100644 index 000000000..54a2719b1 --- /dev/null +++ b/roms/skiboot/doc/release-notes/skiboot-5.1.11.rst @@ -0,0 +1,17 @@ +skiboot-5.1.11 +-------------- + +skiboot-5.1.11 was released on Friday November 13th, 2015. + +Since it was Friday 13th, we had to find a bug right after we tagged +and released skiboot-5.1.10. + +skiboot-5.1.11 is the 12th stable release of 5.1, it follows skiboot-5.1.10 +(which was released November 13th, 2015). + +Skiboot 5.1.11 contains one additional bug fix over skiboot-5.1.10. + +It is: + +- On IBM FSP machines, if IPMI/Serial console is not connected during shutdown + or reboot, machine would enter termination state rather than shut down. diff --git a/roms/skiboot/doc/release-notes/skiboot-5.1.12.rst b/roms/skiboot/doc/release-notes/skiboot-5.1.12.rst new file mode 100644 index 000000000..d2e2315fb --- /dev/null +++ b/roms/skiboot/doc/release-notes/skiboot-5.1.12.rst @@ -0,0 +1,54 @@ +skiboot-5.1.12 +-------------- + +skiboot-5.1.12 was released on Friday December 4th, 2015. + +skiboot-5.1.12 is the 13th stable release of 5.1, it follows skiboot-5.1.11 +(which was released November 13th, 2015). 
+ +Skiboot 5.1.12 contains bug fixes and a performance improvement. + +opal-prd +^^^^^^^^ + +- Display an explicit and obvious message if running on a system that does + not support opal-prd, such as an IBM FSP based POWER system, where the + FSP takes on the role of opal-prd. + +pflash +^^^^^^ + +- Fix a missing (C) header + - cherry-picked from master. + +General +^^^^^^^ + +- Don't link with libgcc + - On some toolchains, we don't have libgcc available. + +POWER8 PHB (PCIe) specific +^^^^^^^^^^^^^^^^^^^^^^^^^^ + +- hw/phb3: Flush cache line after updating P/Q bits + When doing an MSI EOI, we update the P and Q bits in the IVE. That causes + the corresponding cache line to be dirty in the L3 which will cause a + subsequent update by the PHB (upon receiving the next MSI) to get a few + retries until it gets flushed. + + We improve the situation (and thus performance) by doing a dcbf + instruction to force a flush of the update we do in SW. + + This improves interrupt performance, reducing latency per interrupt. + The improvement will vary by workload. + +IBM FSP based machines +^^^^^^^^^^^^^^^^^^^^^^ + +- FSP: Give up PSI link on shutdown + This clears up some erroneous SRCs (error logs) in some situations. +- Correctly report back Real Time Clock errors to host + Under certain rare error conditions, we could return an error code + to the host OS that would cause current Linux kernels to get stuck + in an infinite loop during boot. + This was introduced in skiboot-5.0-rc1. diff --git a/roms/skiboot/doc/release-notes/skiboot-5.1.13.rst b/roms/skiboot/doc/release-notes/skiboot-5.1.13.rst new file mode 100644 index 000000000..28dab2d57 --- /dev/null +++ b/roms/skiboot/doc/release-notes/skiboot-5.1.13.rst @@ -0,0 +1,55 @@ +.. _skiboot-5.1.13: + +skiboot-5.1.13 +-------------- + +skiboot-5.1.13 was released on Wed January 27th, 2016. + +skiboot-5.1.13 is the 14th stable release of 5.1, it follows skiboot-5.1.12 +(which was released December 4th, 2015). 
This release contains bug fixes. + +General +^^^^^^^ + +- core/device.c: Sort nodes with name@unit names by unit + + - This gives predictable device tree ordering to the payload + (usually petitboot) + - This means that utilities such as "lspci" will always return the same + ordering. + +- Add OPAL_CONSOLE_FLUSH to the OPAL API + uart consoles only flush output when polled. The Linux kernel calls + these pollers frequently, except when in a panic state. As such, panic + messages are not fully printed unless the system is configured to reboot + after panic. + + This patch adds a new call to the OPAL API to flush the buffer. If the + system has a uart console (i.e. BMC machines), it will incrementally + flush the buffer, returning whether there is more to be flushed. If + the system has a different console, the function will have no effect. + This will allow the Linux kernel to ensure that panic messages have been + fully printed out. + +CAPI +^^^^ + +- hmi: Identify the phb upon CAPI malfunction alert + Previously, any error on a CAPI adapter would assume PHB0. + This could cause issues on Firestone machines. + +gard utility +^^^^^^^^^^^^ + +- Fix displaying 'cleared' gard records + When a garded component is replaced hostboot detects this and updates the + gard partition. + + Previously, there was ambiguity over whether the gard record ID or the whole gard + record needed to be erased. This fix makes gard and hostboot agree. + +firestone platform +^^^^^^^^^^^^^^^^^^ + +- fix spacing in slot name + The other SlotN names have no space. diff --git a/roms/skiboot/doc/release-notes/skiboot-5.1.14.rst b/roms/skiboot/doc/release-notes/skiboot-5.1.14.rst new file mode 100644 index 000000000..68cf3f560 --- /dev/null +++ b/roms/skiboot/doc/release-notes/skiboot-5.1.14.rst @@ -0,0 +1,21 @@ +skiboot-5.1.14 +-------------- + +skiboot-5.1.14 was released on Wed March 9th, 2016. 
+ +skiboot-5.1.14 is the 15th stable release of 5.1, it follows skiboot-5.1.13 +(which was released January 27th, 2016). This release contains a spelling +fix in a log message and an added device tree property to enable older +kernels (with bootloader support) to use a framebuffer that is redirected +to the BMC VGA port. + +As such, skiboot-5.1.14 has no advantage over skiboot-5.1.13 unless you +are wanting the neat offb framebuffer trick. + +Changes are: + +- fsp: fix spelling of "advertise" in log message + See: https://www.youtube.com/watch?v=8Gv0H-vPoDc +- Explicit 1:1 mapping in ranges properties have been added to PCI + bridges. This allows a neat trick with offb and VGA ports that should + probably not be told to young children. diff --git a/roms/skiboot/doc/release-notes/skiboot-5.1.15.rst b/roms/skiboot/doc/release-notes/skiboot-5.1.15.rst new file mode 100644 index 000000000..05f3658bf --- /dev/null +++ b/roms/skiboot/doc/release-notes/skiboot-5.1.15.rst @@ -0,0 +1,9 @@ +skiboot-5.1.15 +-------------- + +skiboot-5.1.15 was released on Wed March 16th, 2016. + +skiboot-5.1.15 is the 16th stable release of 5.1, it follows skiboot-5.1.14 +(which was released March 9th, 2016). This release contains one bug fix, a +fix for a memory leak in an error path for AMI BMC based systems when +logging non-severe errors. As such, it is a minor bug fix update. diff --git a/roms/skiboot/doc/release-notes/skiboot-5.1.16.rst b/roms/skiboot/doc/release-notes/skiboot-5.1.16.rst new file mode 100644 index 000000000..fc7fab429 --- /dev/null +++ b/roms/skiboot/doc/release-notes/skiboot-5.1.16.rst @@ -0,0 +1,60 @@ +skiboot-5.1.16 +============== + +skiboot-5.1.16 was released on Friday April 29th, 2016. + +skiboot-5.1.16 is the 17th stable release of 5.1, it follows skiboot-5.1.15 +(which was released March 16th, 2016). + +This release contains a few bug fixes and is a recommended upgrade. 
+ +Changes +------- + +PHB3 (all POWER8 platforms) +^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +- hw/phb3: Ensure PQ bits are cleared in the IVC when masking IRQ + When we mask an interrupt, we may race with another interrupt coming + in from the hardware. If this occurs, the P and/or Q bit may end up + being set but we never EOI/clear them. This could result in a lost + interrupt or the next interrupt that comes in after re-enabling never + being presented. + + This fixes a bug seen with some CAPI workloads which have lots of + interrupt masking at the same time as high interrupt load. The fix is + not specific to CAPI though. +- hw/phb3: Fix potential race in EOI + When we EOI we need to clear the present (P) bit in the Interrupt + Vector Cache (IVC). We must clear P ensuring that any additional + interrupts that come in aren't lost while also maintaining coherency + with the Interrupt Vector Table (IVT). + + To do this, the hardware provides a conditional update bit in the + IVC. This bit ensures that generation counts between the IVT and the + IVC updates are synchronised. + + Unfortunately we never set this bit to conditionally update the P + bit in the IVC based on the generation count. Also, we didn't set + what we wanted the new generation count to be if the update was + successful. + +FSP platforms +^^^^^^^^^^^^^ + +- OPAL:Handle mbox response with bad status:0x24 during FSP termination + OPAL committed a predictive log with SRC BB822411 in some situations. + +Generic +^^^^^^^ + +- hmi: Fix a bug where partial hmi event was reported to host. 
+ This bug fix ensures the CPU PIR is reported correctly: :: + + [ 305.628283] Fatal Hypervisor Maintenance interrupt [Not recovered] + [ 305.628341] Error detail: Malfunction Alert + [ 305.628388] HMER: 8040000000000000 + - [ 305.628423] CPU PIR: 00000000 + + [ 200.123021] CPU PIR: 000008e8 + [ 305.628458] [Unit: VSU] Logic core check stop + diff --git a/roms/skiboot/doc/release-notes/skiboot-5.1.17.rst b/roms/skiboot/doc/release-notes/skiboot-5.1.17.rst new file mode 100644 index 000000000..d69cf7cfc --- /dev/null +++ b/roms/skiboot/doc/release-notes/skiboot-5.1.17.rst @@ -0,0 +1,21 @@ +skiboot-5.1.17 +-------------- + +skiboot-5.1.17 was released on Thursday 21st July 2016. + +skiboot-5.1.17 is the 18th stable release of 5.1, it follows skiboot-5.1.16 +(which was released April 29th, 2016). + +This release contains a few minor bug fixes. + +Changes +^^^^^^^ + +All platforms: + +- Fix a few typos in user visible (OPAL log) strings +- pci: Do a dummy config write to devices to establish bus number +- Make the XSCOM engine code more resilient to errors: + - hw/xscom: Reset XSCOM engine after querying sleeping core FIR + - hw/xscom: Reset XSCOM engine after finite number of retries when busy + - xscom: Return OPAL_WRONG_STATE on XSCOM ops if CPU is asleep diff --git a/roms/skiboot/doc/release-notes/skiboot-5.1.18.rst b/roms/skiboot/doc/release-notes/skiboot-5.1.18.rst new file mode 100644 index 000000000..03cd1db4e --- /dev/null +++ b/roms/skiboot/doc/release-notes/skiboot-5.1.18.rst @@ -0,0 +1,35 @@ +.. _skiboot-5.1.18: + +skiboot-5.1.18 +-------------- + +skiboot-5.1.18 was released on Friday 26th August 2016. + +skiboot-5.1.18 is the 19th stable release of 5.1, it follows skiboot-5.1.17 +(which was released July 21st, 2016). + +This release contains a few minor bug fixes. + +Changes are: + +All platforms: + +- opal/hmi: Fix a TOD HMI failure during a race condition. 
+ Rare race condition which meant we wouldn't recover from TOD error + +- hw/phb3: Update capi initialization sequence + The capi initialization sequence was revised in a circumvention + document when a 'link down' error was converted from fatal to Endpoint + Recoverable. Other, non-capi, register setup was corrected even before + the initial open-source release of skiboot, but a few capi-related + registers were not updated then, so this patch fixes it. + The point is that a link-down error detected by the UTL logic will + lead to an AIB fence, so that the CAPP unit can detect the error. + +FSP platforms: + +- FSP/ELOG: Fix OPAL generated elog resend logic +- FSP/ELOG: Fix possible event notifier hangs +- FSP/ELOG: Disable event notification if list is not consistent +- FSP/ELOG: Fix OPAL generated elog event notification +- FSP/ELOG: Disable event notification during kexec diff --git a/roms/skiboot/doc/release-notes/skiboot-5.1.19.rst b/roms/skiboot/doc/release-notes/skiboot-5.1.19.rst new file mode 100644 index 000000000..3779e8f21 --- /dev/null +++ b/roms/skiboot/doc/release-notes/skiboot-5.1.19.rst @@ -0,0 +1,51 @@ +.. _skiboot-5.1.19: + +skiboot-5.1.19 +-------------- + +skiboot-5.1.19 was released on Monday 16th January 2017. + +skiboot-5.1.19 is the 20th stable release of 5.1, it follows skiboot-5.1.18 +(which was released 26th August 2016). + +This release contains a few minor bug fixes. + +Changes are: + +Generic: + +- Makefile: Disable stack protector due to gcc problems +- stack: Don't recurse into __stack_chk_fail +- Makefile: Use -ffixed-r13 + We did not find evidence of this ever being a problem, but this fix + is good and preventative. +- Limit number of "Poller recursion detected" errors to display + In some error conditions, we could spiral out of control on this + and spend all of our time printing the exact same backtrace. + Limit it to 16 times, because 16 is a nice number. 
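The cap on repeated "Poller recursion detected" messages described above can be illustrated with a small counter. This is only a sketch under stated assumptions: the names `MAX_RECURSION_WARNINGS`, `recursion_warnings` and `warn_poller_recursion()` are hypothetical and are not taken from the skiboot source; only the limit of 16 comes from the note.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical cap, mirroring the "16 times" mentioned in the note. */
#define MAX_RECURSION_WARNINGS 16

/* Number of warnings printed so far (hypothetical bookkeeping). */
static unsigned int recursion_warnings;

/*
 * Report a detected poller recursion.
 * Returns true if a warning was printed, false once the cap has been
 * reached and further output is suppressed.
 */
static bool warn_poller_recursion(void)
{
	if (recursion_warnings >= MAX_RECURSION_WARNINGS)
		return false;

	recursion_warnings++;
	fprintf(stderr, "Poller recursion detected (%u/%u)\n",
		recursion_warnings, MAX_RECURSION_WARNINGS);
	return true;
}
```

A real implementation would presumably print a backtrace rather than a one-line message; the point here is only that the counter bounds how often the same diagnostic is emitted.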
+ +FSP based Systems: + +- fsp: Don't recurse pollers in ibm_fsp_terminate + If we were to terminate in a poller, we'd call op_display() which + called pollers which hit the recursive poller warning, which ended + in not much fun at all. + +PCI: + +- hw/phb3: set PHB retry state correctly when fresetting during a creset +- phb3: Lock the PHB on set_xive callbacks + Those are called by the interrupts core and thus skip the locking + implicit in the PCI opal calls. +- hw/{phb3, p7ioc}: Return success for freset on empty PHB + OPAL_CLOSED is returned when fundamental reset is issued on a + PHB that doesn't have subordinate devices (root port excluded). + The kernel raises an error message, which is unnecessary. This + returns OPAL_SUCCESS for this case to avoid the error message. +- hw/phb3: fix error handling in complete reset + During a complete reset, when we get a timeout waiting for pending + transaction in state PHB3_STATE_CRESET_WAIT_CQ, we mark the PHB as broken + and return OPAL_PARAMETER. + Change the return code to OPAL_HARDWARE which is way more sensible, and set + the state to PHB3_STATE_FENCED so that the kernel can retry the complete + reset. diff --git a/roms/skiboot/doc/release-notes/skiboot-5.1.2.rst b/roms/skiboot/doc/release-notes/skiboot-5.1.2.rst new file mode 100644 index 000000000..9af48022f --- /dev/null +++ b/roms/skiboot/doc/release-notes/skiboot-5.1.2.rst @@ -0,0 +1,182 @@ +skiboot-5.1.2 +------------- + +skiboot-5.1.2 was released on September 9th, 2015. + +skiboot-5.1.2 is the third stable release of 5.1, it follows skiboot-5.1.1 +(which was released August 18th, 2015). + +Skiboot 5.1.2 contains all fixes from skiboot-5.1.1 and is a minor bugfix +release. + +Changes +^^^^^^^ +Over skiboot-5.1.1, we have the following changes: + +- phb3: Handle fence in phb3_pci_msi_check_q to fix hang + + If the PHB is fenced during phb3_pci_msi_check_q, it can get stuck in an + infinite loop waiting to lock the FFI. 
Further, as the phb lock is held + during this function it will prevent any other CPUs from dealing with + the fence, leading to the entire system hanging. + + If the PHB_FFI_LOCK returns all Fs, return immediately to allow the + fence to be dealt with. +- phb3: Continue CAPP setup even if PHB is already in CAPP mode + This fixes a critical bug in CAPI support. +- Platform hook for terminate call + + - on assert() or other firmware failure, we will make a SEL callout + on ASTBMC platforms + - (slight) refactor of code for IBM-FSP platforms + +- refactor slot naming code +- Slot names for Habanero platform +- misc improvements in userspace utilities (incl pflash, gard) +- build improvements + + - fixes for two compiler warnings were squashed in 5.1.1 commit, + re-introduce the fixes. + - misc compiler/static analysis warning fixes + +- gard utility: + + - If gard tool detects the GUARD PNOR partition is corrupted, it will + pro-actively re-initialize it. + Modern Hostboot is more sensitive to the content of the GUARD partition + in order to boot. + - Update record clearing to match Hostboot's expectations + We now write ECC bytes throughout the whole partition. + Without this fix, hostboot may not bring up the machine. + - In the event of a corrupted GUARD partition so that even the first entry + cannot be read, the gard utility now provides the user with the option + to wipe the entirety of the GUARD partition to attempt recovery. + +- opal_prd utility: + + - Add run command to pass through commands to HostBoot RunTime (HBRT) + + - this is for OpenPower firmware developers only. + + - Add htmgt-passthru command. + + - this is for OpenPower firmware developers only. + + - Add override interface to pass attribute-override information to HBRT. 
+ - Server sends response in error path, so that client doesn't block forever + +- external/mambo tcl scripts + + - Running little-endian kernels in mambo requires HILE to be set properly, + which requires a bump in the machine's pvr value to a DD2.x chip. + +Stats +^^^^^ +For skiboot-5.1.0 to 5.1.2: +Processed 67 csets from 11 developers +1 employers found +A total of 2258 lines added, 784 removed (delta 1474) + +Developers with the most changesets + +=========================== ========== +=========================== ========== +Stewart Smith 24 (35.8%) +Cyril Bur 18 (26.9%) +Vasant Hegde 8 (11.9%) +Neelesh Gupta 5 (7.5%) +Benjamin Herrenschmidt 5 (7.5%) +Daniel Axtens 2 (3.0%) +Samuel Mendoza-Jonas 1 (1.5%) +Vaidyanathan Srinivasan 1 (1.5%) +Vipin K Parashar 1 (1.5%) +Ian Munsie 1 (1.5%) +Michael Neuling 1 (1.5%) +=========================== ========== + +Developers with the most changed lines + +========================== =========== +========================== =========== +Cyril Bur 969 (42.5%) +Neelesh Gupta 433 (19.0%) +Benjamin Herrenschmidt 304 (13.3%) +Vasant Hegde 236 (10.3%) +Stewart Smith 163 (7.1%) +Vaidyanathan Srinivasan 135 (5.9%) +Vipin K Parashar 8 (0.4%) +Ian Munsie 8 (0.4%) +Daniel Axtens 2 (0.1%) +Michael Neuling 2 (0.1%) +Samuel Mendoza-Jonas 1 (0.0%) +========================== =========== + +Developers with the most lines removed + +========================== ========== +========================== ========== +Daniel Axtens 2 (0.3%) +Michael Neuling 1 (0.1%) +========================== ========== + +Developers with the most signoffs (total 44) + +========================== ========== +========================== ========== +Stewart Smith 43 (97.7%) +Neelesh Gupta 1 (2.3%) +========================== ========== + +Developers with the most reviews (total 8) + +========================== ========== +========================== ========== +Patrick Williams 5 (62.5%) +Samuel Mendoza-Jonas 3 (37.5%) +========================== ========== + 
+Developers with the most test credits (total 0) + +Developers who gave the most tested-by credits (total 0) + +Developers with the most report credits (total 1) + +========================== ========== +========================== ========== +Benjamin Herrenschmidt 1 (100.0%) +========================== ========== + +Developers who gave the most report credits (total 1) + +========================== ========== +========================== ========== +Samuel Mendoza-Jonas 1 (100.0%) +========================== ========== + +Top changeset contributors by employer + +========================== ========== +========================== ========== +IBM 67 (100.0%) +========================== ========== + +Top lines changed by employer + +========================= ========== +========================= ========== +IBM 2281 (100.0%) +========================= ========== + +Employers with the most signoffs (total 44) + +========================== ========== +========================== ========== +IBM 44 (100.0%) +========================== ========== + +Employers with the most hackers (total 11) + +========================== ========== +========================== ========== +IBM 11 (100.0%) +========================== ========== diff --git a/roms/skiboot/doc/release-notes/skiboot-5.1.20.rst b/roms/skiboot/doc/release-notes/skiboot-5.1.20.rst new file mode 100644 index 000000000..b2d1fa688 --- /dev/null +++ b/roms/skiboot/doc/release-notes/skiboot-5.1.20.rst @@ -0,0 +1,175 @@ +.. _skiboot-5.1.20: + +skiboot-5.1.20 +-------------- + +skiboot-5.1.20 was released on Friday 18th August 2017. + +skiboot-5.1.20 is the 21st stable release of 5.1, it follows skiboot-5.1.19 +(which was released 16th January 2017). + +This release contains a few minor bug fixes backported to the 5.1.x series. +All of the fixes have previously appeared in the 5.4.x stable series. 
+ +Changes are: + +- FSP/CONSOLE: Workaround for unresponsive ipmi daemon + + In some corner cases, where FSP is active but not responding to + console MBOX message (due to buggy IPMI) and we have heavy console + write happening from kernel, then eventually our console buffer + becomes full. At this point OPAL starts sending OPAL_BUSY_EVENT to + kernel. Kernel will keep on retrying. This is creating kernel soft + lockups. In some extreme cases when every CPU is trying to write to + console, the user will not be able to ssh in and will think the system has hung. + + If we reset FSP or restart IPMI daemon on FSP, system recovers and + everything becomes normal. + + This patch adds a workaround for the above issue by returning OPAL_HARDWARE + when the console is full. A side effect of this patch is that we may end up dropping + the latest console data. But it is better to drop console data than to hang the system. + + An alternative approach is to drop old data from the console buffer to make space + for new data. But in normal conditions only FSP can update the 'next_out' + pointer and if we touch that pointer, it may introduce some other + race conditions. Hence we decided to just drop the new console write request. + +- FSP: Set status field in response message for timed out message + + For timed out FSP messages, we set message status as "fsp_msg_timeout". + But most FSP driver users (like surveillance) are ignoring this field. + They always look for FSP returned status value in callback function + (second byte in word1). So we end up treating a timed out message as a success + response from FSP. + + Sample output: :: + + [69902.432509048,7] SURV: Sending the heartbeat command to FSP + [70023.226860117,4] FSP: Response from FSP timed out, word0 = d66a00d7, word1 = 0 state: 3 + .... 
+ [70023.226901445,7] SURV: Received heartbeat acknowledge from FSP + [70023.226903251,3] FSP: fsp_trigger_reset() entry + + Here SURV code thought it got valid response from FSP. But actually we didn't + receive response from FSP. 
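The timed-out message problem above comes down to callers reading only a status byte inside word1, which is 0 (success) in the sample log. The following is a minimal sketch of the idea of writing a failure status into that byte on timeout. Everything here is an assumption made for illustration: `struct resp_msg`, `STATUS_TIMEOUT`, `resp_status()` and `mark_timed_out()` are hypothetical names, and "second byte in word1" is assumed to mean bits 8 to 15.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical status value used to mark a timed-out request. */
#define STATUS_TIMEOUT 0xfe

/* Hypothetical, simplified response message: only word1 is modelled. */
struct resp_msg {
	uint32_t word1;
};

/* Status byte as a callback would read it (assumed: bits 8..15). */
static uint8_t resp_status(const struct resp_msg *resp)
{
	return (resp->word1 >> 8) & 0xff;
}

/*
 * On timeout, overwrite the status byte so that callers which only
 * inspect word1 no longer mistake the message for a success.
 */
static void mark_timed_out(struct resp_msg *resp)
{
	resp->word1 &= ~(uint32_t)0xff00;
	resp->word1 |= (uint32_t)STATUS_TIMEOUT << 8;
}
```

With a zeroed response, `resp_status()` reads 0 before `mark_timed_out()` and the timeout value afterwards, while the other bytes of word1 are left untouched.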
+ +- FSP: Improve timeout message + + Presently we print word0 and word1 in error log. word0 contains + sequence number and command class. One has to understand word0 + format to identify command class. + + Let's explicitly print command class, sub command etc. + +- FSP/RTC: Remove local fsp_in_reset variable + + Now that we are using fsp_in_rr() to detect FSP reset/reload, fsp_in_reset + becomes redundant. Let's remove this local variable. + +- FSP/RTC: Fix possible FSP R/R issue in rtc write path + + fsp_opal_rtc_write() checks FSP status before queueing message to FSP. But if + FSP R/R starts before getting response to queued message then we will continue + to return OPAL_BUSY_EVENT to host. In some extreme condition host may + experience hang. Once FSP is back we will repost message, get response from FSP + and return OPAL_SUCCESS to host. + + This patch caches new values and returns OPAL_SUCCESS if FSP R/R is happening. + And once FSP is back we will send cached value to FSP. + +- hw/fsp/rtc: read/write cached rtc tod on fsp hir. + + Currently fsp-rtc reads/writes the cached RTC TOD on an fsp + reset. Use latest fsp_in_rr() function to properly read the cached rtc + value when fsp reset initiated by the hir. + + Below is the kernel trace when we set hw clock, when hir process starts. :: + + [ 1727.775824] NMI watchdog: BUG: soft lockup - CPU#57 stuck for 23s! 
[hwclock:7688] + [ 1727.775856] Modules linked in: vmx_crypto ibmpowernv ipmi_powernv uio_pdrv_genirq ipmi_devintf powernv_op_panel uio ipmi_msghandler powernv_rng leds_powernv ip_tables x_tables autofs4 ses enclosure scsi_transport_sas crc32c_vpmsum lpfc ipr tg3 scsi_transport_fc + [ 1727.775883] CPU: 57 PID: 7688 Comm: hwclock Not tainted 4.10.0-14-generic #16-Ubuntu + [ 1727.775883] task: c000000fdfdc8400 task.stack: c000000fdfef4000 + [ 1727.775884] NIP: c00000000090540c LR: c0000000000846f4 CTR: 000000003006dd70 + [ 1727.775885] REGS: c000000fdfef79a0 TRAP: 0901 Not tainted (4.10.0-14-generic) + [ 1727.775886] MSR: 9000000000009033 <SF,HV,EE,ME,IR,DR,RI,LE> + [ 1727.775889] CR: 28024442 XER: 20000000 + [ 1727.775890] CFAR: c00000000008472c SOFTE: 1 + GPR00: 0000000030005128 c000000fdfef7c20 c00000000144c900 fffffffffffffff4 + GPR04: 0000000028024442 c00000000090540c 9000000000009033 0000000000000000 + GPR08: 0000000000000000 0000000031fc4000 c000000000084710 9000000000001003 + GPR12: c0000000000846e8 c00000000fba0100 + [ 1727.775897] NIP [c00000000090540c] opal_set_rtc_time+0x4c/0xb0 + [ 1727.775899] LR [c0000000000846f4] opal_return+0xc/0x48 + [ 1727.775899] Call Trace: + [ 1727.775900] [c000000fdfef7c20] [c00000000090540c] opal_set_rtc_time+0x4c/0xb0 (unreliable) + [ 1727.775901] [c000000fdfef7c60] [c000000000900828] rtc_set_time+0xb8/0x1b0 + [ 1727.775903] [c000000fdfef7ca0] [c000000000902364] rtc_dev_ioctl+0x454/0x630 + [ 1727.775904] [c000000fdfef7d40] [c00000000035b1f4] do_vfs_ioctl+0xd4/0x8c0 + [ 1727.775906] [c000000fdfef7de0] [c00000000035bab4] SyS_ioctl+0xd4/0xf0 + [ 1727.775907] [c000000fdfef7e30] [c00000000000b184] system_call+0x38/0xe0 + [ 1727.775908] Instruction dump: + [ 1727.775909] f821ffc1 39200000 7c832378 91210028 38a10020 39200000 38810028 f9210020 + [ 1727.775911] 4bfffe6d e8810020 80610028 4b77f61d <60000000> 7c7f1b78 3860000a 2fbffff4 + + This is found when executing the `op-test-framework fspresetReload testcase 
<https://github.com/open-power/op-test-framework/blob/master/testcases/fspresetReload.py>`_ + + With this fix ran fsp hir torture testcase in the above test + which is working fine. + +- FSP/CHIPTOD: Return false in error path + +- On FSP platforms: notify FSP of Platform Log ID after Host Initiated Reset Reload + Triggering a Host Initiated Reset (when the host detects the FSP has gone + out to lunch and should be rebooted), would cause "Unknown Command" messages + to appear in the OPAL log. + + This patch implements those messages. + + Log showing unknown command: :: + + / # cat /sys/firmware/opal/msglog | grep -i ,3 + [ 110.232114723,3] FSP: fsp_trigger_reset() entry + [ 188.431793837,3] FSP #0: Link down, starting R&R + [ 464.109239162,3] FSP #0: Got XUP with no pending message ! + [ 466.340598554,3] FSP-DPO: Unknown command 0xce0900 + [ 466.340600126,3] FSP: Unhandled message ce0900 + +- hw/i2c: Fix early lock drop + + When interacting with an I2C master the p8-i2c driver (common to p9) + acquires a per-master lock which it holds for the duration of its + interaction with the master. Unfortunately, when + p8_i2c_check_initial_status() detects that the master is busy with + another transaction it drops the lock and returns OPAL_BUSY. This is + contrary to the driver's locking strategy which requires that the + caller acquire and drop the lock. This leads to a crash due to the + double unlock(), which skiboot treats as fatal. + +- head.S: store all of LR and CTR + + When saving the CTR and LR registers the skiboot exception handlers use the + 'stw' instruction which only saves the lower 32 bits of the register. Given + these are both 64 bit registers this leads to some strange register dumps, + for example: :: + + *********************************************** + Unexpected exception 200 ! 
+ SRR0 : 0000000030016968 SRR1 : 9000000000201000 + HSRR0: 0000000000000180 HSRR1: 9000000000001000 + LR : 3003438830823f50 CTR : 3003438800000018 + CFAR : 00000000300168fc + CR : 40004208 XER: 00000000 + + In this dump the upper 32 bits of LR and CTR are actually stack gunk + which obscures the underlying issue. + +- hw/fsp: Do not queue SP and SPCN class messages during reset/reload + In certain cases of communicating with the FSP (e.g. sensors), the OPAL FSP + driver returns a default code (async + completion) even though there is no known bound from the time of this error + return to the actual data being available. The kernel driver keeps waiting + leading to soft-lockup on the host side. + + Mitigate both these (known) cases by returning OPAL_BUSY so the host driver + knows to retry later. diff --git a/roms/skiboot/doc/release-notes/skiboot-5.1.21.rst b/roms/skiboot/doc/release-notes/skiboot-5.1.21.rst new file mode 100644 index 000000000..3aab05031 --- /dev/null +++ b/roms/skiboot/doc/release-notes/skiboot-5.1.21.rst @@ -0,0 +1,26 @@ +.. _skiboot-5.1.21: + +skiboot-5.1.21 +-------------- + +skiboot-5.1.21 was released on Tuesday 19th September 2017. + +skiboot-5.1.21 is the 22nd stable release of 5.1, it follows skiboot-5.1.20 +(which was released 18th August 2017). + +This release contains one backported bug fix to the 5.1.x series. + +Changes are: + +- FSP: Add check to detect FSP Reset/Reload inside fsp_sync_msg() + + During FSP Reset/Reload we move outstanding MBOX messages from msgq to + rr_queue including inflight message (fsp_reset_cmdclass()). But we are not + resetting inflight message state. + + In extreme corner case where we sent message to FSP via fsp_sync_msg() path + and FSP Reset/Reload happens before getting response from FSP, then we will + end up waiting in fsp_sync_msg() until everything becomes normal. + + This patch adds fsp_in_rr() check to fsp_sync_msg() and return error to caller + if FSP is in R/R. 
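The fsp_sync_msg() change above amounts to checking for Reset/Reload inside the wait loop instead of spinning until a reply arrives. The sketch below models that pattern only; it is not the skiboot implementation. `struct sp_link`, `sync_wait()` and the `SYNC_*` codes are hypothetical names, and the bounded poll count is an added assumption so the example terminates.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical return codes for the illustration. */
enum { SYNC_OK = 0, SYNC_ERR_RESET = -1, SYNC_ERR_TIMEOUT = -2 };

/* Hypothetical model of the link to the service processor. */
struct sp_link {
	bool in_reset;         /* true while a reset/reload is in progress */
	int polls_until_reply; /* polls left before the reply; < 0 = never */
};

/*
 * Wait synchronously for a reply, polling at most max_polls times.
 * The reset/reload check comes first in every iteration, so a reset
 * that begins mid-wait returns an error instead of blocking the caller.
 */
static int sync_wait(struct sp_link *link, int max_polls)
{
	for (int i = 0; i < max_polls; i++) {
		if (link->in_reset)
			return SYNC_ERR_RESET;
		if (link->polls_until_reply == 0)
			return SYNC_OK;
		if (link->polls_until_reply > 0)
			link->polls_until_reply--;
	}
	return SYNC_ERR_TIMEOUT;
}
```

Returning a distinct error for the reset case lets the caller decide whether to retry later, which is the behaviour the note describes.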
diff --git a/roms/skiboot/doc/release-notes/skiboot-5.1.3.rst b/roms/skiboot/doc/release-notes/skiboot-5.1.3.rst new file mode 100644 index 000000000..f80046e82 --- /dev/null +++ b/roms/skiboot/doc/release-notes/skiboot-5.1.3.rst @@ -0,0 +1,141 @@ +skiboot-5.1.3 +------------- + +skiboot-5.1.3 was released on September 15th, 2015. + +skiboot-5.1.3 is the 4th stable release of 5.1, it follows skiboot-5.1.2 +(which was released September 9th, 2015). + +Skiboot 5.1.3 contains all fixes from skiboot-5.1.2 and is a minor bugfix +release. + +Changes +^^^^^^^ +Over skiboot-5.1.2, we have the following changes: + +- slot names for firestone platform +- fix display of LPC errors +- SBE based timer support + + - on supported platforms limits reliance on Linux heartbeat +- fix use-after-free in fsp/ipmi +- fix hang on TOD/TB errors (time-of-day/timebase) on OpenPower systems + + - On getting a Hypervisor Maintenance Interrupt to get the timebase + back into a running state, we would call prlog which would use + the LPC UART console driver on OpenPower systems, which depends on + a working timebase, leading to a hang. + We now don't depend on a working timebase in this recovery codepath. +- enable prd for garrison platform +- PCI: Clear error bits after changing MPS + Changing MPS on PCI upstream bridge might cause error bits set on + downstream endpoints when system boots into Linux as below case + shows: :: + + host# lspci -vvs 0001:06:00.0 + 0001:06:00.0 Ethernet controller: Broadcom Corporation \ + NetXtreme II BCM57810 10 Gigabit Ethernet (rev 10) + DevSta: CorrErr+ UncorrErr- FatalErr- UnsuppReq+ AuxPwr- TransPend- + CESta: RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+ + + This clears those error bits in AER and PCIe capability after MPS + is changed. With the patch applied, no more error bits are seen. 
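To read the status lines quoted above, it may help to split them into set (`+`) and clear (`-`) bits. The following Python sketch does that; the helper names `parse_status_flags` and `set_flags` are hypothetical, and the two sample strings are copied from the lspci output in the note.

```python
def parse_status_flags(line):
    """Split a status line such as 'DevSta: CorrErr+ UncorrErr-' into
    (register_name, {flag: is_set}); a trailing '+' means the bit is set."""
    name, _, rest = line.partition(":")
    flags = {}
    for token in rest.split():
        if token[-1] in "+-":
            flags[token[:-1]] = token.endswith("+")
    return name.strip(), flags


def set_flags(line):
    """Return the sorted names of the bits that are set on a status line."""
    _, flags = parse_status_flags(line)
    return sorted(flag for flag, is_set in flags.items() if is_set)


# Sample lines taken from the lspci output quoted in the release note.
devsta = "DevSta: CorrErr+ UncorrErr- FatalErr- UnsuppReq+ AuxPwr- TransPend-"
cesta = "CESta: RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+"
```

Here `set_flags(devsta)` lists the error bits that the patch is meant to clear after the MPS change.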
+ +Contributors +^^^^^^^^^^^^ +Processed 14 csets from 6 developers +1 employers found +A total of 462 lines added, 163 removed (delta 299) + +Developers with the most changesets + +============================ ========= +============================ ========= +Benjamin Herrenschmidt 5 (35.7%) +Stewart Smith 4 (28.6%) +Mahesh Salgaonkar 2 (14.3%) +Gavin Shan 1 (7.1%) +Jeremy Kerr 1 (7.1%) +Neelesh Gupta 1 (7.1%) +============================ ========= + +Developers with the most changed lines + +========================== =========== +========================== =========== +Benjamin Herrenschmidt 407 (80.8%) +Mahesh Salgaonkar 23 (4.6%) +Gavin Shan 19 (3.8%) +Stewart Smith 18 (3.6%) +Jeremy Kerr 5 (1.0%) +Neelesh Gupta 2 (0.4%) +========================== =========== + +Developers with the most lines removed + +========================== =========== +========================== =========== +Stewart Smith 8 (4.9%) +Jeremy Kerr 3 (1.8%) +Neelesh Gupta 1 (0.6%) +========================== =========== + +Developers with the most signoffs (total 10) + +========================== =========== +========================== =========== +Stewart Smith 10 (100.0%) +========================== =========== + +Developers with the most reviews (total 1) + +========================== =========== +========================== =========== +Joel Stanley 1 (100.0%) +========================== =========== + +Developers with the most test credits (total 0) + +Developers who gave the most tested-by credits (total 0) + +Developers with the most report credits (total 1) + +========================== =========== +========================== =========== +John Walthour 1 (100.0%) +========================== =========== + +Developers who gave the most report credits (total 1) + +========================== =========== +========================== =========== +Gavin Shan 1 (100.0%) +========================== =========== + +Top changeset contributors by employer + +========================== =========== 
+========================== =========== +IBM 14 (100.0%) +========================== =========== + +Top lines changed by employer + +========================== =========== +========================== =========== +IBM 504 (100.0%) +========================== =========== + +Employers with the most signoffs (total 10) + +========================== =========== +========================== =========== +IBM 10 (100.0%) +========================== =========== + +Employers with the most hackers (total 6) + +========================== =========== +========================== =========== +IBM 6 (100.0%) +========================== =========== diff --git a/roms/skiboot/doc/release-notes/skiboot-5.1.4.rst b/roms/skiboot/doc/release-notes/skiboot-5.1.4.rst new file mode 100644 index 000000000..1d5ec7d04 --- /dev/null +++ b/roms/skiboot/doc/release-notes/skiboot-5.1.4.rst @@ -0,0 +1,34 @@ +skiboot-5.1.4 +------------- + +skiboot-5.1.4 was released on September 26th, 2015. + +skiboot-5.1.4 is the 5th stable release of 5.1, it follows skiboot-5.1.3 +(which was released September 15th, 2015). + +Skiboot 5.1.4 contains all fixes from skiboot-5.1.3 and is an important bug +fix release and a strongly recommended update from any prior skiboot-5.1.x +release. + +Changes +^^^^^^^ +Over skiboot-5.1.3, we have the following changes: + +- Rate limit OPAL_MSG_OCC to only one outstanding message to host + + In the event of a lot of OCC events (or many CPU cores), we could + send many OCC messages to the host, which if it wasn't calling + opal_get_msg really often, would cause skiboot to malloc() additional + messages until we ran out of skiboot heap and things didn't end up + being much fun. + + When running certain hardware exercisers, they seem to steal all time + from Linux being able to call opal_get_msg, causing these to queue up + and get "opalmsg: No available node in the free list, allocating" warnings + followed by tonnes of backtraces of failing memory allocations. 
+ +- Ensure reserved memory ranges are exposed correctly to host + (fix corrupted SLW image) + + We seem to have not hit this on ASTBMC based OpenPower machines, but it was + certainly hit on FSP based machines diff --git a/roms/skiboot/doc/release-notes/skiboot-5.1.5.rst b/roms/skiboot/doc/release-notes/skiboot-5.1.5.rst new file mode 100644 index 000000000..e619d4ef2 --- /dev/null +++ b/roms/skiboot/doc/release-notes/skiboot-5.1.5.rst @@ -0,0 +1,46 @@ +skiboot-5.1.5 +------------- + +skiboot-5.1.5 was released on October 1st, 2015. + +skiboot-5.1.5 is the 6th stable release of 5.1, it follows skiboot-5.1.4 +(which was released September 26th, 2015). + +Skiboot 5.1.5 contains all fixes from skiboot-5.1.4 and is a minor bug +fix release. + +Changes +^^^^^^^ +Over skiboot-5.1.4, we have the following changes: + +Generic +^^^^^^^ +- centaur: Add indirect XSCOM support + Fixes a bug where opal-prd would not be able to recover from a bunch + of errors as the indirect XSCOMs to centaurs would fail. +- xscom: Fix logging of indirect XSCOM errors + Better logging of error messages. +- PHB3: Fix wrong PE number in error injection +- Improvement in boot_test.sh utility to support copying a pflash binary + to BMCs. + +AST BMC machines +^^^^^^^^^^^^^^^^ + +- ipmi-sel: Run power action immediately if host not up + Our normal sequence for a soft power action (IPMI 'power soft' or + 'power cycle') involves receiving a SEL from the BMC, sending a message + to Linux's opal platform support which instructs the host OS to shut + down, and finally the host will request OPAL to cut power. + + When the host is not yet up we will send the message to /dev/null, and + no action will be taken. This patch changes that behaviour to perform + the action immediately if we know how. 
+ +OpenPower machines: +^^^^^^^^^^^^^^^^^^^ + +- opal-prd: Increase IPMI timeout to a slightly better value + Proactively bump the timeout to 5seconds to match current value in petitboot + Observed in the wild that this fixes bugs for petitboot. + diff --git a/roms/skiboot/doc/release-notes/skiboot-5.1.6.rst b/roms/skiboot/doc/release-notes/skiboot-5.1.6.rst new file mode 100644 index 000000000..ee7df7842 --- /dev/null +++ b/roms/skiboot/doc/release-notes/skiboot-5.1.6.rst @@ -0,0 +1,37 @@ +skiboot-5.1.6 +============= + +skiboot-5.1.6 was released on October 8th, 2015. + +skiboot-5.1.6 is the 7th stable release of 5.1, it follows skiboot-5.1.5 +(which was released October 1st, 2015). + +Skiboot 5.1.6 contains all fixes from skiboot-5.1.5 and is a minor bug +fix release. + +Changes +------- +Over skiboot-5.1.5, we have the following changes: + +Generic: +^^^^^^^^ + +- Ensure we run pollers in cpu_wait_job() + + In root causing a bug on AST BMC Alistair found that pollers weren't + being run for around 3800ms. + + This could show as not resetting the boot count sensor on successful + boot. + +AST BMC Machines +^^^^^^^^^^^^^^^^ + +- hw/bt.c: Check for timeout after checking for message response + + When deciding if a BT message has timed out we should first check for + a message response. This will ensure that messages will not time out + if there was a delay calling the pollers. + + This could show as not resetting the boot count sensor on successful + boot. diff --git a/roms/skiboot/doc/release-notes/skiboot-5.1.7.rst b/roms/skiboot/doc/release-notes/skiboot-5.1.7.rst new file mode 100644 index 000000000..b678b421e --- /dev/null +++ b/roms/skiboot/doc/release-notes/skiboot-5.1.7.rst @@ -0,0 +1,31 @@ +skiboot-5.1.7 +------------- + +skiboot-5.1.7 was released on October 13th, 2015. + +skiboot-5.1.7 is the 8th stable release of 5.1, it follows skiboot-5.1.6 +(which was released October 8th, 2015). 
+ +Skiboot 5.1.7 contains all fixes from skiboot-5.1.6 and is a minor bug +fix release with one important bug fix for FSP systems. + +Over skiboot-5.1.6, we have the following changes: + +Generic: + +- PHB3: Retry fundamental reset + This introduces another PHB3 state (PHB3_STATE_FRESET_START) + allowing to redo fundamental reset if the link doesn't come up + in time at the first attempt, to improve the robustness of PHB's + fundamental reset. If the link comes up after the first reset, + the 2nd reset won't be issued at all. + +FSP based systems: + +- hw/fsp/fsp-leds.c: use allocated buffer for FSP_CMD_GET_LED_LIST response + + This fixes a bug where we would overwrite roughly 4kb of memory belonging + to Linux when the FSP would ask firmware for a list of LEDs in the system. + This wouldn't happen often (once before Linux was running and possibly + only once during runtime, and *early* runtime at that) but it was possible + for this corruption to show up and be detected. diff --git a/roms/skiboot/doc/release-notes/skiboot-5.1.8.rst b/roms/skiboot/doc/release-notes/skiboot-5.1.8.rst new file mode 100644 index 000000000..c856ba902 --- /dev/null +++ b/roms/skiboot/doc/release-notes/skiboot-5.1.8.rst @@ -0,0 +1,20 @@ +skiboot-5.1.8 +------------- + +skiboot-5.1.8 was released on October 19th, 2015. + +skiboot-5.1.8 is the 9th stable release of 5.1, it follows skiboot-5.1.7 +(which was released October 13th, 2015). + +Skiboot 5.1.8 contains all fixes from skiboot-5.1.7 and is a minor bug +fix release, with a single fix for recovery from a (rare) error. + +Over skiboot-5.1.7, we have the following change: + +- opal/hmi: Fix a soft lockup issue on Hypervisor Maintenance Interrupt + for certain timebase errors. + + We also introduce a timeout to handle the worst situation where all other + threads are badly stuck without setting a cleanup done bit. Under such + situation timeout will help to avoid soft lockups and report failure to + kernel. 
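The timeout described above can be sketched as a bounded polling loop. This Python model only illustrates the technique (wait for every thread to report cleanup done, but give up after a deadline so the caller can report failure); the function name, parameters and polling interval are assumptions, not the firmware implementation.

```python
import time


def wait_for_cleanup(cleanup_done, timeout_s, poll_interval_s=0.001):
    """Poll until every thread reports cleanup done, or the timeout expires.

    cleanup_done(): hypothetical callable returning one boolean per thread.
    Returns True if all flags were set in time, False on timeout.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        if all(cleanup_done()):
            return True
        if time.monotonic() >= deadline:
            # Threads are stuck: stop waiting so the failure can be
            # reported instead of locking up.
            return False
        time.sleep(poll_interval_s)
```

A `False` return corresponds to the "report failure to kernel" path in the note.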
diff --git a/roms/skiboot/doc/release-notes/skiboot-5.1.9.rst b/roms/skiboot/doc/release-notes/skiboot-5.1.9.rst new file mode 100644 index 000000000..f5fa7d8d8 --- /dev/null +++ b/roms/skiboot/doc/release-notes/skiboot-5.1.9.rst @@ -0,0 +1,17 @@ +skiboot-5.1.9 +------------- + +skiboot-5.1.9 was released on October 30th, 2015. + +skiboot-5.1.9 is the 10th stable release of 5.1, it follows skiboot-5.1.8 +(which was released October 19th, 2015). + +Skiboot 5.1.9 contains all fixes from skiboot-5.1.8 and is a minor bug +fix release, with a single fix to help diagnosis after a rare error condition. + +Over skiboot-5.1.8, we have the following change: + +- opal/hmi: Signal PRD about NX unit checkstop. + We now signal Processor Recovery & Diagnostics (PRD) correctly following + an NX unit checkstop +- minor fix to the boot_test.sh test script diff --git a/roms/skiboot/doc/release-notes/skiboot-5.10-rc1.rst b/roms/skiboot/doc/release-notes/skiboot-5.10-rc1.rst new file mode 100644 index 000000000..90a5b3005 --- /dev/null +++ b/roms/skiboot/doc/release-notes/skiboot-5.10-rc1.rst @@ -0,0 +1,1560 @@ +.. _skiboot-5.10-rc1: + +skiboot-5.10-rc1 +================ + +skiboot v5.10-rc1 was released on Tuesday February 6th 2018. It is the first +release candidate of skiboot 5.10, which will become the new stable release +of skiboot following the 5.9 release, first released October 31st 2017. + +skiboot v5.10-rc1 contains all bug fixes as of :ref:`skiboot-5.9.8` +and :ref:`skiboot-5.4.9` (the currently maintained stable releases). There +may be more 5.9.x stable releases, it will depend on demand. + +For how the skiboot stable releases work, see :ref:`stable-rules` for details. + +The current plan is to cut the final 5.10 in February, with skiboot 5.10 +being for all POWER8 and POWER9 platforms in op-build v1.21. +This release will be targeted to early POWER9 systems. 
+ +Over skiboot-5.9, we have the following changes: + +New Features +------------ +- hdata: Parse IPL FW feature settings + + Add parsing for the firmware feature flags in the HDAT. This + indicates the settings of various parameters which are set at IPL time + by firmware. + +- opal/xstop: Use nvram option to enable/disable sw checkstop. + + Add a mechanism to enable/disable sw checkstop by looking at nvram option + opal-sw-xstop=<enable/disable>. + + For now this patch disables the sw checkstop trigger unless explicitly + enabled through nvram option 'opal-sw-xstop=enable' for p9. This will allow + an opportunity to get host kernel in panic path or xmon for unrecoverable + HMIs or MCE, to be able to debug the issue effectively. + + To enable sw checkstop in opal issue the following command: :: + + nvram -p ibm,skiboot --update-config opal-sw-xstop=enable + + **NOTE:** This is a workaround patch to disable sw checkstop by default to gain + control in host kernel for better checkstop debugging. Once we have most of + the checkstop issues stabilized/resolved, revisit this patch to enable sw + checkstop by default. + + For p8 platform it will remain enabled by default unless explicitly disabled. + + To disable sw checkstop on p8 issue the following command: :: + + nvram -p ibm,skiboot --update-config opal-sw-xstop=disable +- hdata: Parse SPD data + + Parse SPD data and populate device tree. + + List of properties parsed from SPD: :: + + [root@ltc-wspoon dimm@d00f]# lsprop . + memory-id 0000000c (12) # DIMM type + product-version 00000032 (50) # Module Revision Code + device_type "memory-dimm-ddr4" + serial-number 15d9acb6 (366587062) + status "okay" + size 00004000 (16384) + phandle 000000bd (189) + ibm,loc-code "UOPWR.0000000-Node0-DIMM7" + part-number "36ASF2G72PZ-2G6B2 " + reg 0000d007 (53255) + name "dimm" + manufacturer-id 0000802c (32812) # Vendor ID, we can get vendor name from this ID + + Also update documentation. 
+- hdata: Add memory hierarchy under xscom node + + We have memory to chip mapping but doesn't have complete memory hierarchy. + This patch adds memory hierarchy under xscom node. This is specific to + P9 system as these hierarchy may change between processor generation. + + It uses memory controller ID details and populates nodes like: + xscom@<addr>/mcbist@<mcbist_id>/mcs@<mcs_id>/mca@<mca_id>/dimm@<resource_id> + + Also this patch adds few properties under dimm node. + Finally make sure xscom nodes created before calling memory_parse(). + +Fast Reboot and Quiesce +^^^^^^^^^^^^^^^^^^^^^^^ +We have a preliminary fast reboot implementation for POWER9 systems, which +we look to enabling by default in the next release. + +The OPAL Quiesce calls are designed to improve reliability and debuggability +around reboot and error conditions. See the full API documentation for details: +:ref:`OPAL_QUIESCE`. + +- fast-reboot: bare bones fast reboot implementation for POWER9 + + This is an initial fast reboot implementation for p9 which has only been + tested on the Witherspoon platform, and without the use of NPUs, NX/VAS, + etc. + + This has worked reasonably well so far, with no failures in about 100 + reboots. It is hidden behind the traditional fast-reboot experimental + nvram option, until more platforms and configurations are tested. +- fast-reboot: move boot CPU clean-up logically together with secondaries + + Move the boot CPU clean-up and state transition to active, logically + together with secondaries. Don't release secondaries from fast reboot + hold until everyone has cleaned up and transitioned to active. + + This is cosmetic, but it is helpful to run the fast reboot state machine + the same way on all CPUs. +- fast-reboot: improve failure error messages + + Change existing failure error messages to PR_NOTICE so they get + printed to the console, and add some new ones. It's not a more + severe class because it falls back to IPL on failure. 
+- fast-reboot: quiesce opal before initiating a fast reboot + + Switch fast reboot to use quiescing rather than "wait for a while". + + If firmware cannot be quiesced, then fast reboot is skipped. This + significantly improves the robustness of fast reboot in the face of + bugs or unexpected latencies. + + Complexity of synchronization in fast-reboot is reduced, because we + are guaranteed to be single-threaded when quiesce succeeds, so locks + can be removed. + + In the case that firmware can be quiesced, then it will generally + reduce fast reboot times by nearly 200ms, because quiescing usually + takes very little time. +- core: Add support for quiescing OPAL + + Quiescing is ensuring all host controlled CPUs (except the current + one) are out of OPAL and prevented from entering. This can be used in + debug and shutdown paths, particularly with system reset sequences. + + This patch adds per-CPU entry and exit tracking for OPAL calls, and + adds logic to "hold" or "reject" at entry time, if OPAL is quiesced. + + An OPAL call is added, to expose the functionality to Linux, where it + can be used for shutdown, kexec, and before generating sreset IPIs for + debugging (so the debug code does not recurse into OPAL). +- dctl: p9 increase thread quiesce timeout + + We require all instructions to be completed before a thread is + considered stopped, by the dctl interface. Long running instructions + like cache misses and CI loads may take a significant amount of time + to complete, and timeouts have been observed in stress testing. + + Increase the timeout significantly, to cover this. The workbook + just says to poll, but we like to have timeouts to avoid getting + stuck in firmware. + + +POWER9 power saving +^^^^^^^^^^^^^^^^^^^ + +There is much improved support for deeper sleep/idle (stop) states on POWER9. + +- OCC: Increase max pstate check on P9 to 255 + + This has changed from P8, we can now have > 127 pstates. + + This was observed on Boston during WoF bring up. 
+- SLW: Add idle state stop5 for DD2.0 and above + + Adding stop5 idle state with rough residency and latency numbers. +- SLW: Add p9_stop_api calls for IMC + + Add p9_stop_api for EVENT_MASK and PDBAR scoms. These scoms are lost on + wakeup from stop11. + +- SCOM restore for DARN and XIVE + + While waking up from stop11, we want NCU_DARN_BAR to have enable bit set. + Without this stop_api call, the value restored is without enable bit set. + We lose NCU_SPEC_BAR when the quad goes into stop11; stop_api will + restore it while waking up from stop11. + +- SLW: Call p9_stop_api only if deep_states are enabled + + All init time p9_stop_api calls have been isolated to slw_late_init. If + p9_stop_api fails, then the deep states can be excluded from device tree. + + For p9_stop_api called after device-tree for cpuidle is created, + has_deep_states will be used to check if this call is even required. +- Better handle errors in setting up sleep states (p9_stop_api) + + We won't put affected stop states in the device tree if the wakeup + engine is not present or has failed. +- SCOM Restore: Increased the EQ SCOM restore limit. + + Commit increases the SCOM restore limit from 16 to 31. +- hw/dts: retry special wakeup operation if core still gated + + It has been observed that in some cases the special wakeup + operation can "succeed" but the core is still in a gated/offline + state. + + Check for this state after attempting to wakeup a core and retry + the wakeup if necessary. +- core/direct-controls: add function to read core gated state +- core/direct-controls: wait for core special wkup bit cleared + + When clearing special wakeup bit on a core, wait until the + bit is actually cleared by the hardware in the status register + before returning success. 
+ + This may help avoid issues with back-to-back reads where the + special wakeup request is cleared but the firmware is still + processing the request and the next attempt to set the bit + reads an immediate success from the previous operation. +- p9_stop_api: PM: Added support for version control in SCOM restore entries. + + - adds version info in SCOM restore entry header + - adds version specific details in SCOM restore entry header + - retains old behaviour of SGPE Hcode's base version +- p9_stop_api: EQ SCOM Restore: Introduced version control in SCOM restore entry. + + - introduces version control in header of SCOM restore entry + - ensures backward compatibility + - introduces flexibility to handle any number of SCOM restore entry. + +Secure and Trusted Boot for POWER9 +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +We introduce support for Secure and Trusted Boot for POWER9 systems, with equal +functionality that we have on POWER8 systems, that is, we have the mechanisms in +place to boot to petitboot (i.e. to BOOTKERNEL). + +See the :ref:`stb-overview` for full documentation of OPAL secure and trusted boot. + +- allow secure boot if not enforcing it + + We check the secure boot containers no matter what, only *enforcing* + secure boot if we're booting in secure mode. This gives us an extra + layer of checking firmware is legit even when secure mode isn't enabled, + as well as being really useful for testing. +- libstb/(create|print)-container: Sync with sb-signing-utils + + The sb-signing-utils project has improved upon the skeleton + create-container tool that existed in skiboot, including + being able to (quite easily) create *signed* images. + + This commit brings in that code (and makes it build in the + skiboot build environment) and updates our skiboot.*.stb + generating code to use the development keys. This means that by + default, skiboot build process will let you build firmware that can + do a secure boot with *development* keys. 
+ + See :ref:`signing-firmware-code` for details on firmware signing. + + We also update print-container as well, syncing it with the + upstream project. + + Derived from github.com:open-power/sb-signing-utils.git + at v0.3-5-gcb111c03ad7f + (Some discussion ongoing on the changes, another sync will come shortly) + +- doc: update libstb documentation with POWER9 changes. + See: :ref:`stb-overview`. + + POWER9 changes reflected in the libstb: + + - bumped ibm,secureboot node to v2 + - added ibm,cvc node + - hash-algo superseded by hw-key-hash-size + +- libstb/cvc: update memory-region to point to /reserved-memory + + The linux documentation, reserved-memory.txt, says that memory-region is + a phandle that pairs to a children of /reserved-memory. + + This updates /ibm,secureboot/ibm,cvc/memory-region to point to + /reserved-memory/secure-crypt-algo-code instead of + /ibm,hostboot/reserved-memory/secure-crypt-algo-code. +- libstb: add support for ibm,secureboot-v2 + + ibm,secureboot-v2 changes: + + - The Container Verification Code is represented by the ibm,cvc node. + - Each ibm,cvc child describes a CVC service. + - hash-algo is superseded by hw-key-hash-size. +- hdata/tpmrel.c: add ibm, cvc device tree node + + In P9, the Container Verification Code is stored in a hostboot reserved + memory and the list of provided CVC services is stored in the + TPMREL_IDATA_HASH_VERIF_OFFSETS idata array. Each CVC service has an + offset and version. + + This adds the ibm,cvc device tree node and its documentation. +- hdata/tpmrel.c: add firmware event log info to the tpm node + + This parses the firmware event log information from the + secureboot_tpm_info HDAT structure and add it to the tpm device tree + node. + + There can be multiple secureboot_tpm_info entries with each entry + corresponding to a master processor that has a tpm device, however, + multiple tpm is not supported. 
+- hdata/spira: add ibm,secureboot node in P9 + + In P9, skiboot builds the device tree from the HDAT. These are the + "ibm,secureboot" node changes compared to P8: + + - The Container-Verification-Code (CVC), a.k.a. ROM code, is no longer + stored in a secure ROM with static address. In P9, it is stored in a + hostboot reserved memory and each service provided also has a version, + not only an offset. + - The hash-algo property is not provided via HDAT, instead it provides + the hw-key-hash-size, which is indeed the information required by the + CVC to verify containers. + + This parses the iplparams_sysparams HDAT structure and creates the + "ibm,secureboot", which is bumped to "ibm,secureboot-v2". + + In "ibm,secureboot-v2": + + - hash-algo property is superseded by hw-key-hash-size. + - container verification code is explicitly described by a child node. + Added in a subsequent patch. + + See :ref:`device-tree/ibm,secureboot` for documentation. +- libstb/tpm_chip.c: define pr_fmt and fix messages logged + + This defines pr_fmt and also fix messages logged: + + - EV_SEPARATOR instead of 0xFFFFFFFF + - when an event is measured it also prints the tpm id, event type and + event log length + + Now we can filter the messages logged by libstb and its + sub-modules by running: :: + + grep STB /sys/firmware/opal/msglog +- libstb/tss: update the list of event types supported + + Skiboot, precisely the tpmLogMgr, initializes the firmware event log by + calculating its length so that a new event can be recorded without + exceeding the log size. In order to calculate the size, it walks through + the log until it finds a specific event type. However, if the log has + an unknown event type, the tpmLogMgr will not be able to reach the end + of the log. + + This updates the list of event types with all of those supported by + hostboot. Thus, skiboot can properly calculate the event log length. 
+- tpm_i2c_nuvoton: add nuvoton, npct601 to the compatible property + + The linux kernel doesn't have a driver compatible with + "nuvoton,npct650", but it does have for "nuvoton,npct601", which should + also be compatible with npct650. + + This adds "nuvoton,npct601" to the compatible devtree property. +- libstb/trustedboot.c: import stb_final() from stb.c + + The stb_final() primary goal is to measure the event EV_SEPARATOR + into PCR[0-7] when trusted boot is about to exit the boot services. + + This imports the stb_final() from stb.c into trustedboot.c, but making + the following changes: + + - Rename it to trustedboot_exit_boot_services(). + - As specified in the TCG PC Client specification, EV_SEPARATOR events must + be logged with the name 0xFFFFFF. + - Remove the ROM driver clean-up call. + - Don't allow code to be measured in skiboot after + trustedboot_exit_boot_services() is called. +- libstb/cvc.c: import softrom behaviour from drivers/sw_driver.c + + Softrom is used only for testing with mambo. By setting + compatible="ibm,secureboot-v1-softrom" in the "ibm,secureboot" node, + firmware images can be properly measured even if the + Container-Verification-Code (CVC) is not available. In this case, the + mbedtls_sha512() function is used to calculate the sha512 hash of the + firmware images. + + This imports the softrom behaviour from libstb/drivers/sw_driver.c code + into cvc.c, but now softrom is implemented as a flag. When the flag is + set, the wrappers for the CVC services work the same way as in + sw_driver.c. +- libstb/trustedboot.c: import tb_measure() from stb.c + + This imports tb_measure() from stb.c, but now it calls the CVC sha512 + wrapper to calculate the sha512 hash of the firmware image provided. + + In trustedboot.c, the tb_measure() is renamed to trustedboot_measure(). 
+ + The new function, trustedboot_measure(), no longer checks if the + container payload hash calculated at boot time matches with the hash + found in the container header. A few reasons: + + - If the system admin wants the container header to be + checked/validated, the secure boot jumper must be set. Otherwise, + the container header information may not be reliable. + - The container layout is expected to change over time. Skiboot + would need to maintain a parser for each container layout + change. + - Skiboot could be checking the hash against a container version that + is not supported by the Container-Verification-Code (CVC). + + The tb_measure() calls are updated to trustedboot_measure() in a + subsequent patch. +- libstb/secureboot.c: import sb_verify() from stb.c + + This imports the sb_verify() function from stb.c, but now it calls the + CVC verify wrapper in order to verify signed firmware images. The + hw-key-hash and hw-key-hash-size initialized in secureboot.c are passed + to the CVC verify function wrapper. + + In secureboot.c, the sb_verify() is renamed to secureboot_verify(). The + sb_verify() calls are updated in a subsequent patch. + +XIVE +---- +- xive: Don't bother cleaning up disabled EQs in reset + + Additionally, warn if we find an enabled one that isn't one + of the firmware built-in queues. +- xive: Warn on valid VPs found in abnormal cases + + If an allocated VP is left valid at xive_reset() or Linux tries + to free a valid (enabled) VP block, print errors. The former happens + occasionally if kdump'ing while KVM is running so keep it as a debug + message. The latter is a programming error in Linux so use an + error log level. +- xive: Properly reserve built-in VPs in non-group mode + + This is not normally used but if the #define is changed to + disable block group mode we would incorrectly clear the + buddy completely without marking the built-in VPs reserved. 
+- xive: Quieten debug messages in standard builds + + This makes a bunch of messages, especially the per-CPU ones, + only enabled in debug builds. This avoids clogging up the + OPAL logs with XIVE related messages that have proven not + to be particularly useful for field defects. +- xive: Implement "single escalation" feature + + This adds a new VP flag to control the new DD2.0 + "single escalation" feature. + + This feature allows us to have a single escalation + interrupt per VP instead of one per queue. + + It works by hijacking queue 7 (which is thus no longer + usable when that is enabled) and exploiting two new + hardware bits that will: + + - Make the normal queues (0..6) escalate unconditionally + thus ignoring the ESe bits. + - Route the above escalations to queue 7 + - Have queue 7 silently escalate without notification + + Thus the escalation of queue 7 becomes the one escalation + interrupt for all the other queues. +- xive: When disabling a VP, wipe all of its settings +- xive: Improve cleaning up of EQs + + Factors out the function that sets an EQ back to a clean + state and adds a cleaning pass for queues left enabled + when freeing a block of VPs. +- xive: When disabling an EQ, wipe all of its settings + + This avoids having configuration bits left over +- xive: Define API for single-escalation VP mode + + This mode allows all queues of a VP to use the same + escalation interrupt, at the cost of losing priority 7. + + This adds the definition and documentation of the API, + the implementation will come next. +- xive: Fix ability to clear some EQ flags + + We could never clear "unconditional notify" and "escalate" +- xive: Update inits for DD2.0 + + This updates some inits based on information from the HW + designers. This includes enabling some new DD2.0 features + that we don't yet exploit. 
+- xive: Ensure VC informational FIRs are masked + + Some HostBoot versions leave those as checkstop; they are harmless + and can sometimes occur during normal operations. +- xive: Fix occasional VC checkstops in xive_reset + + The current workaround for the scrub bug described in + __xive_cache_scrub() has an issue in that it can leave + dirty invalid entries in the cache. + + When cleaning up EQs or VPs during reset, if we then + remove the underlying indirect page for these entries, + the XIVE will checkstop when trying to flush them out + of the cache. + + This replaces the existing workaround with a new pair of + workarounds for VPs and EQs: + + - The VP one does the dummy watch on another entry than + the one we scrubbed (which does the job of pushing old + stores out) using an entry that is known to be backed by + a permanent indirect page. + - The EQ one switches to a more efficient workaround + which consists of doing a non-side-effect ESB load from + the EQ's ESe control bits. +- xive: Do not return a trigger page for an escalation interrupt + + This is bogus, we don't support them. (Thankfully the callers + didn't actually try to use this on escalation interrupts). +- xive: Mark a freed IRQ's IVE as valid and masked + + Removing the valid bit means a FIR will trip if it's accessed + inadvertently. Under some circumstances, the XIVE will speculatively + access an IVE for a masked interrupt and trip it. So make sure that + freed entries are still marked valid (but masked). + +PCI +--- + +- pci: Shared slot state synchronisation for hot reset + + When a device is shared between two PHBs, it doesn't get reset properly + unless both PHBs issue a hot reset at "the same time". Practically this + means a hot reset needs to be issued on both sides, and neither should + bring the link up until the reset on both has completed. +- pci: Track peers of slots + + Witherspoon introduced a new concept where one physical slot is shared + between two PHBs.
Making a slot aware of its peer enables syncing + between them where necessary. + +PHB4 +---- +- phb4: Change PCI MMIO timers + + Currently we have a mismatch between the NCU and PCI timers for MMIO + accesses. The PCI timers must be lower than the NCU timers otherwise + it may cause checkstops. + + This changes PCI timeouts controlled by skiboot to 33-50ms. It should + be forwards and backwards compatible with expected hostboot changes to + the NCU timer. +- phb4: Change default GEN3 lane equalisation setting to 0x54 + + Currently our GEN3 lane equalisation settings are set to 0x77. Change + this to 0x54. This change will allow us to train at GEN3 in a shorter + time and more consistently. + + This setting gives us a TX preset 0x4 and RX hint 0x5. This gives a + boost in gain for high frequency signalling. It allows the most optimal + continuous time linear equalizers (CTLE) for the remote receiver port + and de-emphasis and pre-shoot for the remote transmitter port. + + Machine Readable Workbooks (MRW) are moving to this new value also. +- phb4: Init changes + + These are init changes for phb4 from the HW team. + + Link down errors are now endpoint recoverable (ERC) rather than PHB fatal + errors. + + BLIF Completion Timeout Errors now generate an interrupt rather than + causing freeze events. +- phb4: Fix lane equalisation setting + + Fix cut and paste from phb3. The sizes have changed now that we have GEN4, + so the check here needs to change also. + + Without this we end up with the default settings (all '7') rather + than what's in HDAT. +- hdata: Fix copying GEN4 lane equalisation settings + + These aren't copied currently but should be. +- phb4: Fix PE mapping of M32 BAR + + The M32 BAR is the PHB4 region used to map all the non-prefetchable + or 32-bit device BARs. It's supposed to have its segments remapped + via the MDT and Linux relies on that to assign them individual PE#.
+ + However, we weren't configuring that properly and instead used the + mode where PE# == segment#, thus causing EEH to freeze the wrong + device or PE#. +- phb4: Fix lost bit in PE number on config accesses + + A PE number can be up to 9 bits, using a uint8_t won't fly. + + That was causing errors on config accesses to freeze the + wrong PE. +- phb4: Update inits + + New init value from HW folks for the fence enable register. + + This clears bit 17 (CFG Write Error CA or UR response) and bit 22 (MMIO Write + DAT_ERR Indication) and sets bit 21 (MMIO CFG Pending Error). + +CAPI +---- + +- capi: Disable CAPP virtual machines + + When exercising more than one CAPI accelerator simultaneously in + cache coherency mode, the verification team is seeing a deadlock. To + fix this a workaround of disabling CAPP virtual machines is + suggested. These 'virtual machines' let PSL queue multiple CAPP + commands for servicing by CAPP, thereby increasing + throughput. Below is the error scenario described by the h/w team: + + " With virtual machines enabled we had a deadlock scenario where with 2 + or more CAPI's in a system you could get in a deadlock scenario due to + cast-outs that are required break the deadlock (evict lines that + another CAPI is requesting) get stuck in the virtual machine queue by + a command ahead of it that is being retried by the same scenario in + the other CAPI. " + +- capi: Perform capp recovery sequence only when PBCQ is idle + + Presently during a CRESET the CAPP recovery sequence can be executed + multiple times in case PBCQ on the PEC is still busy processing in/out + bound in-flight transactions. +- xive: Mask MMIO load/store to bad location FIR + + For opencapi, the trigger page of an interrupt is mapped to user + space. The intent is to write the page to raise an interrupt but + there's nothing to prevent a user process from reading it, which has + the unfortunate consequence of checkstopping the system.
+ + Mask the FIR bit raised when an MMIO operation targets an invalid + location. It's the recommendation from recent documentation and + hostboot is expected to mask it at some point. In the meantime, let's + play it safe. +- phb4: Dump CAPP error registers when it asserts link down + + This patch introduces a new function phb4_dump_app_err_regs() that + dumps CAPP error registers in case the PEC nestfir register indicates + that the fence was due to a CAPP error (BIT-24). + + Contents of these registers are helpful in diagnosing CAPP + issues. Registers that are dumped in phb4_dump_app_err_regs() are: + + * CAPP FIR Register + * CAPP APC Master Error Report Register + * CAPP Snoop Error Report Register + * CAPP Transport Error Report Register + * CAPP TLBI Error Report Register + * CAPP Error Status and Control Register +- capi: move the acknowledge of the HMI interrupt + + We need to acknowledge an eventual HMI initiated by the previous forced + fence on the PHB to work around a non-existent PE in the phb4_creset() + function. + For this reason do_capp_recovery_scoms() is called now at the + beginning of the step: PHB4_SLOT_CRESET_WAIT_CQ +- capi: update ci store buffers and dma engines + + The number of read (APC type traffic) and mmio store (MSG type traffic) + resources assigned to the CAPP is controlled by the CAPP control + register. + + According to the type of CAPI cards present on the server, we have to + configure the CAPP messages and the DMA read engines given + to the CAPP for use differently. + +HMI +--- +- core/hmi: Display chip location code while displaying core FIR. +- core/hmi: Do not display FIR details if none of the bits are set. + + So that we don't flood OPAL console logs with information that is not + useful. +- opal/hmi: HMI logging with location code info. + + Add a few HMI debug prints with location code info and some additional info. + + No functionality change.
+ + With this patch the log messages will look like: :: + + [210612.175196744,7] HMI: Received HMI interrupt: HMER = 0x0840000000000000 + [210612.175200449,7] HMI: [Loc: UOPWR.1302LFA-Node0-Proc1]: P:8 C:16 T:1: TFMR(2d12000870e04020) Timer Facility Error + + [210660.259689526,7] HMI: Received HMI interrupt: HMER = 0x2040000000000000 + [210660.259695649,7] HMI: [Loc: UOPWR.1302LFA-Node0-Proc0]: P:0 C:16 T:1: Processor recovery Done. + +- core/hmi: Use pr_fmt macro for tagging log messages + + No functionality changes. +- opal: Get chip location code + + and store it under proc_chip for quick reference during HMI handling + code. + +Sensors +------- +- occ-sensors: Fix up quad/gpu location mix-up + + The GPU and QUAD sensor location types are swapped compared to what + exists in the OCC code base which is authoritative. Fix them up. +- sensors: occ: Skip counter type of sensors + + Don't add counter type of sensors to device-tree as they don't + fit into hwmon sensor interface. +- sensors: dts: Assert special wakeup on idle cores while reading temperature + + In P9, when a core enters a stop state, its clocks will be stopped + to save power and hence we will not be able to perform a SCOM + operation to read the DTS temperature sensor. Hence, assert + a special wakeup on cores that have entered a stop state in order to + successfully complete the SCOM operation. +- sensors: occ: Skip power sensors with zero sample value + + APSS is not available on platforms like Zaius, Romulus where OCC + can only measure Vdd (core) and Vdn (nest) power from the AVSbus + reading. So all the sensors for APSS channels will be populated + with 0. Different component power sensors like system, memory + which point to the APSS channels will also be 0. + + As per OCC team (Martha Broyles) zeroed power sensor means that the + system doesn't have it. So this patch filters out these sensors. 
+- sensors: occ: Skip GPU sensors for non-gpu systems +- sensors: Fix dtc warning for new occ in-band sensors. + + dtc complains about missing reg property when a DT node is having a + unit name or address but no reg property. :: + + /ibm,opal/sensors/vrm-in@c00004 has a unit name, but no reg property + /ibm,opal/sensors/gpu-in@c0001f has a unit name, but no reg property + /ibm,opal/sensor-groups/occ-js@1c00040 has a unit name, but no reg property + + This patch fixes these warnings for new occ in-band sensors and also for + sensor-groups by adding necessary properties. +- sensors: Fix dtc warning for dts sensors. + + dtc complains about missing reg property when a DT node is having a + unit name or address but no reg property. + + Example warning for core dts sensor: :: + + /ibm,opal/sensors/core-temp@5c has a unit name, but no reg property + /ibm,opal/sensors/core-temp@804 has a unit name, but no reg property + + This patch fixes this by adding necessary properties. +- hw/occ: Fix psr cpu-to-gpu sensors node dtc warning. + + dtc complains about missing reg property when a DT node is having a + unit name or address but no reg property. :: + + /ibm,opal/power-mgt/psr/cpu-to-gpu@0 has a unit name, but no reg property + /ibm,opal/power-mgt/psr/cpu-to-gpu@100 has a unit name, but no reg property + + This patch fixes this by adding necessary properties. + +General fixes +------------- +- lpc: Clear pending IRQs at boot + + When we come in from hostboot the LPC master has the bus reset indicator + set. This error isn't handled until the host kernel unmasks interrupts, + at which point we get the following spurious error: :: + + [ 20.053560375,3] LPC: Got LPC reset on chip 0x0 ! + [ 20.053564560,3] LPC[000]: Unknown LPC error Error address reg: 0x00000000 + + Fix this by clearing the various error bits in the LPC status register + before we initialise the skiboot LPC bus driver. 
+- hw/imc: Check ucode state before exposing units to Linux + + disable_unavailable_units() checks whether the ucode + is in the running state before enabling the nest units + in the device tree. From a recent debug, it is found + that on some system boots, ucode is not loaded and + running in all the chips in the system. And this + caused a failure in OPAL_IMC_COUNTERS_STOP call where + we check for ucode state on each chip. Bug here is + that disable_unavailable_units() checks the state + of the ucode only in boot cpu chip. Patch adds a + condition in disable_unavailable_units() to check + for the ucode state in all the chips before enabling + the nest units in the device tree node. + +- hdata/vpd: Add vendor property + + ibm,vpd blob contains VN field. Use that to populate vendor property + for various FRUs. +- hdata/vpd: Fix DTC warnings + + All the nodes under the vpd hierarchy have a unit address (their SLCA + index) but no reg properties. Add them and their size/address cells + to squash the warnings. +- HDAT/i2c: Fix SPD EEPROM compatible string + + Hostboot doesn't give us accurate information about the DIMM SPD + devices. Hack around by assuming any EEPROM we find on the SPD I2C + master is an SPD EEPROM. +- hdata/i2c: Fix 512Kb EEPROM size + + There's no such thing as a 412Kb EEPROM. +- libflash/mbox-flash: fall back to requesting lower MBOX versions from BMC + + Some BMC mbox implementations seem to sometimes mysteriously fail when trying + to negotiate v3 when they only support v2. To work around this, we + can fall back to requesting lower mbox protocol versions until we find + one that works. + + In theory, this should already "just work", but we have a counter example, + which this patch fixes. +- IPMI: Fix platform.cec_reboot() null ptr checks + + Kudos to Hugo Landau who reported this in: + https://github.com/open-power/skiboot/issues/142 +- hdata: Add location code property to xscom node + + This patch adds chip location code property to xscom node.
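The mbox-flash fallback described above (request v3, then progressively lower versions until one is accepted) can be sketched roughly as follows. This is an illustration only, not the libflash code: `negotiate_with_fallback()`, the `negotiate_fn` callback type and `fake_bmc_try_version()` are hypothetical names introduced here, and the fake BMC simply stands in for a real one that rejects versions it does not support.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical callback: returns true if the BMC accepts this version. */
typedef bool (*negotiate_fn)(int version, void *ctx);

/*
 * Try the highest version first and step down until one is accepted.
 * Returns the negotiated version, or -1 if none worked.
 */
static int negotiate_with_fallback(int highest, negotiate_fn try_version,
                                   void *ctx)
{
    for (int v = highest; v >= 1; v--) {
        if (try_version(v, ctx))
            return v;
    }
    return -1;
}

/*
 * Stand-in for a BMC (assumption for this sketch): ctx points at the
 * highest version it supports, and anything above that is rejected.
 */
static bool fake_bmc_try_version(int version, void *ctx)
{
    int max_supported = *(int *)ctx;

    return version <= max_supported;
}
```

With a fake BMC that only supports v2, a request starting at v3 settles on v2; if nothing is accepted the caller gets -1 and can report the failure.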
+- p8-i2c: Limit number of retry attempts + + Currently we will attempt to start an I2C transaction until it succeeds. + In the event that the OCC does not release the lock on an I2C bus this + results in an async token being held forever and the kernel thread that + started the transaction will block forever while waiting for an async + completion message. Fix this by limiting the number of attempts to + start the transaction. +- p8-i2c: Don't write the watermark register at init + + On P9 the I2C master is shared with the OCC. Currently the watermark + values are set once at init time which is bad for two reasons: + + a) We don't take the OCC master lock before setting it, which + may cause issues if the OCC is currently using the master. + b) The OCC might change the watermark levels and we need to reset + them. + + Change this so that we set the watermark value when a new transaction + is started rather than at init time. +- hdata: Rename 'fsp-ipl-side' as 'sp-ipl-side' + + as OPAL is building device tree for both FSP and BMC system. + Also I don't see anyone using this property today. Hence renaming + should be fine. +- hdata/vpd: add support for parsing CPU VRML records + + Allows skiboot to parse out the processor part/serial numbers + on OpenPOWER P9 machines. +- core/lock: Introduce atomic cmpxchg and implement try_lock with it + + cmpxchg will be used in a subsequent change, and this reduces the + amount of asm code. +- direct-controls: add xscom error handling for p8 + + Add xscom checks which will print something useful and return error + back to callers (which already have error handling plumbed in). +- direct-controls: p8 implementation of generic direct controls + + This reworks the sreset functionality that was brought over from + fast-reboot, and fits it under the generic direct controls APIs. + + The fast reboot APIs are implemented using generic direct controls, + which also makes them available on p9.
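As a rough illustration of the core/lock entry above, a try_lock built on a compare-and-exchange primitive might look like the sketch below. It uses C11 `<stdatomic.h>` rather than skiboot's own primitives, and `struct sketch_lock`, `sketch_cmpxchg64()`, `sketch_try_lock()` and `sketch_unlock()` are hypothetical names chosen for this example; the real lock layout is not described in these notes.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical lock word: 0 means unlocked, otherwise owner id + 1. */
struct sketch_lock {
    _Atomic uint64_t val;
};

/*
 * Atomically store new_val if the word currently equals old_val.
 * Returns the value that was observed in memory.
 */
static uint64_t sketch_cmpxchg64(_Atomic uint64_t *mem, uint64_t old_val,
                                 uint64_t new_val)
{
    uint64_t expected = old_val;

    /* On failure, 'expected' is overwritten with the current value. */
    atomic_compare_exchange_strong(mem, &expected, new_val);
    return expected;
}

/* Take the lock only if it is free; never spins. */
static bool sketch_try_lock(struct sketch_lock *l, uint64_t owner)
{
    return sketch_cmpxchg64(&l->val, 0, owner + 1) == 0;
}

static void sketch_unlock(struct sketch_lock *l)
{
    atomic_store(&l->val, 0);
}
```

The point of the pattern is that the "is it free?" check and the "take it" store happen in a single atomic step, so two CPUs cannot both see the lock as free and both claim it.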
+- fast-reboot: allow mambo fast reboot independent of CPU type + + Don't tie mambo fast reboot to POWER8 CPU type. +- fast-reboot: remove delay after sreset + + There is a 100ms delay when targets reach sreset which does not appear + to have a good purpose. Remove it and therefore reduce the sreset timeout + by the same amount. +- fast-reboot: add more barriers around cpu state changes + + This is a bit of paranoia, but when a CPU changes state to signal it + has reached a particular point, all previous stores should be visible. +- fast-reboot: add sreset timeout detection and handling + + Have the initiator wait for all its sreset targets to call in, and + time out after 200ms if they did not. Fail and revert to IPL reboot. + + Testing indicates that after successful sreset_all_others(), it + takes less than 102ms (in hundreds of fast reboots) for secondaries + to call in. 100 of that is due to an initial delay, but core + un-splitting was not measured. +- fast-reboot: make spin loops consistent and SMT friendly +- fast-reboot: add sreset_all_others error handling + + Pass back failures from sreset_all_others, also change return codes to + OPAL form in sreset_all_prepare to match. + + Errors will revert to the IPL path, so it's not critical to completely + clean up everything if that would complicate things. Detecting the + error and failing is the important thing. +- fast-reboot: restore SMT priority on spin loop exit +- Add documentation for ibm,firmware-versions device tree node +- NX: Print read xscom config failures. + + Currently in NX, only write xscom config failures are traced. + Add trace statements for read xscom config failures too. + No functional changes. +- hw/nx: Fix NX BAR assignments + + The NX rng BAR is used by each core to source random numbers for the + DARN instruction. Currently we configure each core to use the NX rng of + the chip that it exists on.
Unfortunately, the NX can be de-configured by + hostboot and in this case we need to use the NX of a different chip. + + This patch moves the BAR assignments for the NX into the normal nx-rng + init path. This lets us check if the normal (chip local) NX is active + when configuring which NX a core should use so that we can fall back + gracefully. +- FSP-elog: Reduce verbosity of elog messages + + These messages just fill up the opal console log with useless messages + resulting in us losing useful information. + + They have been like this since the first commit in skiboot. Make them + trace. +- core/bitmap: fix bitmap iteration limit corruption + + The bitmap iterators did not reduce the number of bits to scan + when searching for the next bit, which would result in them + overrunning their bitmap. + + These are only used in one place, in xive reset, and the effect + is that the xive reset code will keep zeroing memory until it + reaches a block of memory of MAX_EQ_COUNT >> 3 bits in length, + all zeroes. +- hw/imc: always enable "imc_nest_chip" exports property + + imc_dt_update_nest_node() adds a "imc_nest_chip" property + to the "exports" node (under opal_node) to view nest counter + region. This comes in handy when debugging ucode runtime + errors (like counter data update or control block update + and so on). The current code enables the property only if + the microcode is in running state at system boot. To aid + the debug of ucode not running/starting issues at boot, + enable the addition of "imc_nest_chip" property always. + +NVLINK2 +------- + +- npu2-hw-procedures.c: Correct phy lane mapping + + Each NVLINK2 device is associated with a particular group of OBUS lanes via + a lane mask which is read from HDAT via the device-tree. However Skiboot's + interpretation of lane mask was different to what is exported from the + HDAT. + + Specifically the lane mask bits in the HDAT are encoded in IBM bit ordering + for a 24-bit wide value.
So for example in normal bit ordering lane-0 is + represented by having lane-mask bit 23 set and lane-23 is represented by + lane-mask bit 0. This patch alters the Skiboot interpretation to match what + is passed from HDAT. + +- npu2-hw-procedures.c: Power up lanes during ntl reset + + Newer versions of Hostboot will not power up the NVLINK2 PHY lanes by + default. The phy_reset procedure already powers up the lanes but they also + need to be powered up in order to access the DL. + + The reset_ntl procedure is called by the device driver to bring the DL out + of reset and get it into a working state. Therefore we also need to add + lane and clock power up to the reset_ntl procedure. +- npu2.c: Add PE error detection + + Invalid accesses from the GPU can cause a specific PE to be frozen by the + NPU. Add an interrupt handler which reports the frozen PE to the operating + system as an EEH event. +- npu2.c: Fix XIVE IRQ alignment +- npu2: hw-procedures: Refactor reset_ntl procedure + + Change the implementation of reset_ntl to match the latest programming + guide documentation. +- npu2: hw-procedures: Add phy_rx_clock_sel() + + Change the RX clk mux control to be done by software instead of HW. This + avoids glitches caused by changing the mux setting. +- npu2: hw-procedures: Change phy_rx_clock_sel values + + The clock selection bits we set here are inputs to a state machine. + + DL clock select (bits 30-31) + + 0b00 + lane 0 clock + 0b01 + lane 7 clock + 0b10 + grid clock + 0b11 + invalid/no-op + + To recover from a potential glitch, we need to ensure that the value we + set forces a state change. Our current sequence is to set 0x3 followed + by 0x1. With the above now known, that is actually a no-op followed by + selection of lane 7. Depending on lane reversal, that selection is not a + state change for some bricks. + + The way to force a state change in all cases is to switch to the grid + clock, and then back to a lane.
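The lane-mask encoding from the phy lane mapping entry above (IBM bit ordering over a 24-bit value, so lane 0 sits at conventional bit 23 and lane 23 at bit 0) can be illustrated with a small sketch. The helper names `lane_to_mask()` and `lane_is_set()` and the `LANE_MASK_WIDTH` constant are hypothetical and are not taken from npu2-hw-procedures.c.

```c
#include <stdint.h>

/* Width of the lane mask, as stated in the release note. */
#define LANE_MASK_WIDTH 24

/*
 * In IBM (MSB-0) ordering, lane n of a 24-bit mask corresponds to
 * conventional (LSB-0) bit 23 - n. Valid for lane values 0..23.
 */
static uint32_t lane_to_mask(unsigned int lane)
{
    return (uint32_t)1 << (LANE_MASK_WIDTH - 1 - lane);
}

/* Test whether a given lane is selected by an HDAT-style lane mask. */
static int lane_is_set(uint32_t lane_mask, unsigned int lane)
{
    return (lane_mask & lane_to_mask(lane)) != 0;
}
```

For instance, a mask of 0x800001 selects lanes 0 and 23 under this interpretation, whereas reading it in conventional bit order would give the same two lanes only by coincidence of symmetry; any asymmetric mask would be reversed.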
+- npu2: hw-procedures: Manipulate IOVALID during training + + Ensure that the IOVALID bit for this brick is raised at the start of + link training, in the reset_ntl procedure. + + Then, to protect us from a glitch when the PHY clock turns off or gets + chopped, lower IOVALID for the duration of the phy_reset and + phy_rx_dccal procedures. +- npu2: hw-procedures: Add check_credits procedure + + As an immediate mitigation for a current hardware glitch, add a procedure + that can be used to validate NTL credit values. This will be called as a + safeguard to check that link training succeeded. + + Assert that things are exactly as we expect, because if they aren't, the + system will experience a catastrophic failure shortly after the start of + link traffic. +- npu2: Print bdfn in NPU2DEV* logging macros + + Revise the NPU2DEV{DBG,INF,ERR} logging macros to include the device's + bdfn. It's useful to know exactly which link we're referring to. + + For instance, instead of :: + + [ 234.044921238,6] NPU6: Starting procedure reset_ntl + [ 234.048578101,6] NPU6: Starting procedure reset_ntl + [ 234.051049676,6] NPU6: Starting procedure reset_ntl + [ 234.053503542,6] NPU6: Starting procedure reset_ntl + [ 234.057182864,6] NPU6: Starting procedure reset_ntl + [ 234.059666137,6] NPU6: Starting procedure reset_ntl + + we'll get :: + + [ 234.044921238,6] NPU6:0:0.0 Starting procedure reset_ntl + [ 234.048578101,6] NPU6:0:0.1 Starting procedure reset_ntl + [ 234.051049676,6] NPU6:0:0.2 Starting procedure reset_ntl + [ 234.053503542,6] NPU6:0:1.0 Starting procedure reset_ntl + [ 234.057182864,6] NPU6:0:1.1 Starting procedure reset_ntl + [ 234.059666137,6] NPU6:0:1.2 Starting procedure reset_ntl +- npu2: Move to new GPU memory map + + There are three different ways we configure the MCD and memory map. 
+ + 1) Old way (current way) + Skiboot configures the MCD and puts GPUs at 4TB and below + 2) New way with MCD + Hostboot configures the MCD and skiboot puts GPU at 4TB and above + 3) New way without MCD + No one configures the MCD and skiboot puts GPU at 4TB and below + + The patch keeps option 1 and adds options 2 and 3. + + The different configurations are detected using certain scoms (see + patch). + + Option 1 will go away eventually as it's a configuration that can + cause xstops or data integrity problems. We are keeping it around to + support existing hostboot. + + Option 2 supports only 4 GPUs and 512GB of memory per socket. + + Option 3 supports 6 GPUs and 4TB of memory but may have some + performance impact. +- phys-map: Rename GPU_MEM to GPU_MEM_4T_DOWN + + This map is soon to be replaced, but we are going to keep it around + for a little while so that we support older hostboot firmware. + +Platform Specific Fixes +----------------------- + +Witherspoon +^^^^^^^^^^^ +- Witherspoon: Remove old Witherspoon platform definition + + An old Witherspoon platform definition was added to aid the transition from + versions of Hostboot which didn't have the correct NVLINK2 HDAT information + available and/or planar VPD. These systems should now be updated so remove + the possibly incorrect default assumption. + + This may disable NVLINK2 on old out-dated systems but it can easily be + restored with the appropriate FW and/or VPD updates. In any case there is + a 50% chance the existing default behaviour was incorrect as it only + supports 6 GPU systems. Using an incorrect platform definition leads to + undefined behaviour which is more difficult to detect/debug than not + creating the NVLINK2 devices so remove the possibly incorrect default + behaviour. +- Witherspoon: Fix VPD EEPROM type + + There are user-space tools that update the planar VPD via the sysfs + interface.
Currently we do not get correct information from hostboot + about the exact type of the EEPROM so we need to manually fix it up + here. This needs to be done as a platform specific fix since there is + no standardised VPD EEPROM type. + +IBM FSP Systems +^^^^^^^^^^^^^^^ + +- nvram: Fix 'missing' nvram on FSP systems. + + commit ba4d46fdd9eb ("console: Set log level from nvram") wants to read + from NVRAM rather early. This works fine on BMC based systems as + nvram_init() is actually synchronous. This is not true for FSP systems + and it turns out that the query for the console log level simply + queries blank nvram. + + The simple fix is to wait for the NVRAM read to complete before + performing any query. Unfortunately it turns out that the fsp-nvram + code does not inform the generic NVRAM layer when the read is complete, + rather, it must be prompted to do so. + + This patch addresses both these problems. This patch adds a check before + the first read of the NVRAM (for the console log level) that the read + has completed. The fsp-nvram code has been updated to inform the generic + layer as soon as the read completes. + + The old prompt to the fsp-nvram code has been removed but a check to + ensure that the NVRAM has been loaded remains. It is conservative but + if the NVRAM is not done loading before the host is booted it will not + have an nvram device-tree node which means it won't be able to access + the NVRAM at all, ever, even after the NVRAM has loaded. + + +Utilities +---------- + +- Fix xscom-utils distclean target + + In Debian/Ubuntu, the packaging system likes to have a full clean-up that + restores the tree back to original one, so add some files to the distclean + target. +- Add man pages for xscom-utils and pflash + + For the need of Debian/Ubuntu packaging, I inferred some initial man + pages from their help output. + +gard +^^^^ +- gard: Add tests + + I hear Stewart likes these for some reason. Dunno why.
+- gard: Add OpenBMC vPNOR support + + A big-ol-hack to add some checking for OpenBMC's vPNOR GUARD files under + /media/pnor-prsv. This isn't ideal since it doesn't handle the create + case well, but it's better than nothing. +- gard: Always use MTD to access flash + + Direct mode is generally either unsafe or unsupported. We should always + access the PNOR via an MTD device so make that the default. If someone + really needs direct mode, then they can use pflash. +- gard: Fix up do_create return values + + The return value of a subcommand is interpreted as a libflash error code + when it's positive or some subcommand specific error when negative. + Currently the create subcommand always returns zero when exiting (even + for errors) so fix that. +- gard: Add usage message for -p + + The -p argument only really makes sense when -f is specified. Print an + actual error message rather than just the usage blob. +- gard: Fix max instance count + + There's an entire byte for the instance count rather than a nibble. Only + barf if the instance number is beyond 255 rather than 16. +- gard: Fix up path parsing + + Currently we assume that the Unit ID can be used as an array index into + the chip_units[] structure. There are holes in the ID space though, so + this doesn't actually work. Fix it up by walking the array looking for + the ID. +- gard: Set chip generation based on PVR + + Currently we assume that this tool is being used on a P8 system by + default and allow the user to override this behaviour using the -8 and + -9 command line arguments. When running on the host we can use the + PVR to guess the chip generation, so do that. + + This also changes the default behaviour to assume that the host is a P9 + when running on an ARM system. This tool didn't even work when compiled + for ARM until recently and the OpenBMC vPNOR hack that we have currently + is broken for P9 systems that don't use vPNOR (Zaius and Romulus).
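The path parsing and instance count fixes above boil down to two small checks, sketched here under stated assumptions. The table contents, `struct sketch_unit`, `sketch_units[]`, `find_unit()` and `instance_in_range()` are all made up for illustration; the real chip_units[] data comes from the hostboot source and is not reproduced in these notes.

```c
#include <stddef.h>

/* Hypothetical unit table entry; the IDs below are invented. */
struct sketch_unit {
    int id;
    const char *desc;
};

/* Note the hole between 0x02 and 0x05: id is not a valid array index. */
static const struct sketch_unit sketch_units[] = {
    { 0x01, "unit-a" },
    { 0x02, "unit-b" },
    { 0x05, "unit-c" },
};

/* Walk the table looking for the ID instead of indexing by it. */
static const struct sketch_unit *find_unit(int id)
{
    size_t n = sizeof(sketch_units) / sizeof(sketch_units[0]);

    for (size_t i = 0; i < n; i++) {
        if (sketch_units[i].id == id)
            return &sketch_units[i];
    }
    return NULL; /* unknown unit ID */
}

/* The instance field is a full byte, so anything up to 255 is acceptable. */
static int instance_in_range(long instance)
{
    return instance >= 0 && instance <= 255;
}
```

Indexing `sketch_units[0x05]` directly would read past the end of this three-entry table, which is the kind of failure the linear search avoids.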
+- gard: Allow records with an ID of 0xffffffff + + We currently assume that a record with an ID of 0xffffffff is invalid. + Apparently this is incorrect and we should display these records, so + expand the check to compare the entire record with 0xff rather than + just the ID. +- gard: create: Allow creating arbitrary GARD records + + Add a new sub-command that allows us to create GARD records for + arbitrary chip units. There aren't many constraints on this and + that limits how useful it can be, but it does allow a user to GARD out + individual DIMMs, chips or cores from the BMC (or host) if needed. + + There are a few caveats though: + + 1) Not everything can, or should, have a GARD record applied to it. + 2) There is no validation that the unit actually exists. Doing that + sort of validation requires something that understands the FAPI + targeting information (I think) and adding support for it here + would require some knowledge from the system XML file. + 3) There's no way to get a list of paths in the system. + 4) Although we can create a GARD record at runtime it won't be applied + until the next IPL. +- gard: Add path parsing support + + In order to support manual GARD records we need to be able to parse the + hardware unit path strings. This patch implements that. +- gard: list: Improve output + + Display the full path to the GARDed hardware unit in each record rather + than relying on the output of `gard show` and convert do_list() to use + the iterator while we're here. +- gard: {list, show}: Fix the Type field in the output + + The output of `gard list` has a field named "Type", however this + doesn't actually indicate the type of the record. Rather, it + shows the type of the path used to identify the hardware being + GARDed. This is of pretty dubious value considering the Physical + path seems to always be used when referring to GARDed hardware.
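For the 0xffffffff record ID entry above, the change amounts to treating a record as empty only when every byte of it is 0xff, rather than when the ID field alone is. A minimal sketch of that check follows; `record_is_blank()` is a hypothetical helper name and the 8-byte record used in the usage note is arbitrary, not the real GARD record layout.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * A record counts as blank (erased flash) only if all of its bytes
 * are 0xff, not merely the bytes that make up the ID field.
 */
static bool record_is_blank(const void *rec, size_t len)
{
    const uint8_t *p = rec;

    for (size_t i = 0; i < len; i++) {
        if (p[i] != 0xff)
            return false;
    }
    return true;
}
```

With this check, a record whose first four bytes are 0xff but which carries real data elsewhere is still displayed, while a fully erased slot is skipped.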
+- gard: Add P9 support +- gard: Update chip unit data + + Source the list of units from the hostboot source rather than the + previous hard coded list. The list of path element types changes + between generations so we need to add a level of indirection to + accommodate P9. This also changes the names used to match those + printed by Hostboot at IPL time and paves the way to adding support + for manual GARD record creation. +- gard: show: Remove "Res Recovery" field + + This field has never been populated by hostboot on OpenPower systems + so there's no real point in reporting its contents. + +libflash / pflash +^^^^^^^^^^^^^^^^^ + +Anybody shipping libflash or pflash to interact with POWER9 systems must +upgrade to this version. + +- pflash: Support for volatile flag + + The volatile flag was added to the PNOR image to + indicate partitions that are cleared during a host + power off. Display this flag from the pflash command. +- pflash: Support for clean_on_ecc_error flag + + Add the misc flag clear_on_ecc_error to libflash/pflash. This was + the only missing flag. The generator of the virtual PNOR image + relies on libflash/pflash to provide the partition information, + so all flags are needed to build an accurate virtual PNOR partition + table. +- pflash: Respect write(2) return values + + The write(2) system call returns the number of bytes written, this is + important since it is entitled to write less than what we requested. + Currently we ignore the return value and assume it wrote everything we + requested. While in practice this is likely to always be the case, it + isn't actually correct. +- external/pflash: Fix erasing within a single erase block + + It is possible to erase within a single erase block. Currently the + pflash code assumes that if the erase starts part way into an erase + block it is because it needs to be aligned up to the boundary with the + next erase block.
+ + Doing an erase smaller than a single erase block will cause underflows + and looping forever on erase. +- external/pflash: Fix non-zero return code for successful read when size%256 != 0 + + When performing a read the return value from pflash is non-zero, even for + a successful read, when the size being read is not a multiple of 256. + This is because do_read_file returns the value from the write system + call which is then returned by pflash. When the size is a multiple of + 256 we get lucky in that this wraps around back to zero. However for any + other value the return code is size % 256. This means even when the + operation is successful the return code will seem to reflect an error. + + Fix this by returning zero if the entire size was read correctly, + otherwise return the corresponding error code. +- libflash: Fix parity calculation on ARM + + To calculate the ECC syndrome we need to calculate the parity of a 64bit + number. On non-powerpc platforms we use the GCC builtin function + __builtin_parityl() to do this calculation. This is broken on 32bit ARM + where sizeof(unsigned long) is four bytes. Using __builtin_parityll() + instead cures this. +- libflash/mbox-flash: Add the ability to lock flash +- libflash/mbox-flash: Understand v3 +- libflash/mbox-flash: Use BMC suggested timeout value +- libflash/mbox-flash: Simplify message sending + + hw/lpc-mbox no longer requires that the memory associated with messages + exist for the lifetime of the message. Once it has been sent to the BMC, + that is bmc_mbox_enqueue() returns, lpc-mbox does not need the message + to continue to exist. On the receiving side, lpc-mbox will ensure that a + message exists for the receiving callback function. + + Remove all code to deal with allocating messages. +- hw/lpc-mbox: Simplify message bookkeeping and timeouts + + Currently the hw/lpc-mbox layer keeps a pointer for the currently + in-flight message for the duration of the mbox call. 
This creates + problems when messages timeout, is that pointer still valid, what can we + do with it. The memory is owned by the caller but if the caller has + declared a timeout, it may have freed that memory. + + Another problem is locking. This patch also locks around sending and + receiving to avoid races with timeouts and possible resends. There was + some locking previously which was likely insufficient - definitely too + hard to be sure is correct + + All this is made much easier with the previous rework which moves + sequence number allocation and verification into lpc-mbox rather than + the caller. +- libflash/mbox-flash: Allow mbox-flash to tell the driver msg timeouts + + Currently when mbox-flash decides that a message times out the driver + has no way of knowing to drop the message and will continue waiting for + a response indefinitely preventing more messages from ever being sent. + + This is a problem if the BMC crashes or has some other issue where it + won't ever respond to our outstanding message. + + This patch provides a method for mbox-flash to tell the driver how long + it should wait before it no longer needs to care about the response. +- libflash/mbox-flash: Move sequence handling to driver level +- libflash/mbox-flash: Always close windows before opening a new window + + The MBOX protocol states that if an open window command fails then all + open windows are closed. Currently, if an open window command fails + mbox-flash will erroneously assume that the previously open window is + still open. + + The solution to this is to mark all windows as closed before issuing an + open window command and then on success we'll mark the new window as + open. +- libflash/mbox-flash: Add v2 error codes + +opal-prd +^^^^^^^^ + +Anybody shipping `opal-prd` for POWER9 systems must upgrade `opal-prd` to +this new version. + +- prd: Log unsupported message type + + Useful for debugging. 
+ + Sample output: :: + + [29155.157050283,7] PRD: Unsupported prd message type : 0xc + +- opal-prd: occ: Add support for runtime OCC load/start in ZZ + + This patch adds support to handle OCC load/start event from FSP/PRD. + During IPL we send a success directly to FSP without invoking any HBRT + load routines on receiving OCC load mbox message from FSP. At runtime + we forward this event to host opal-prd. + + This patch provides support for invoking OCC load/start HBRT routines + like load_pm_complex() and start_pm_complex() from opal-prd. +- opal-prd: Add support for runtime OCC reset in ZZ + + This patch handles OCC_RESET runtime events in host opal-prd and also + provides support for calling 'hostinterface->wakeup()' which is + required for doing the reset operation. +- prd: Enable error logging via firmware_request interface + + In P9 HBRT sends error logs to FSP via firmware_request interface. + This patch adds support to parse error log and send it to FSP. +- prd: Add generic response structure inside prd_fw_msg + + This patch adds generic response structure. Also sync prd_fw_msg type + macros with hostboot. +- opal-prd: flush after logging to stdio in debug mode + + When in debug mode, flush after each log output. This makes it more + likely that we'll catch failure reasons on severe errors. + +Debugging and reliability improvements +-------------------------------------- + +- lock: Add additional lock auditing code + + Keep track of lock owner name and replace lock_depth counter + with a per-cpu list of locks held by the cpu. + + This allows us to print the actual locks held in case we hit + the (in)famous message about opal_pollers being run with a + lock held. + + It also allows us to warn (and drop them) if locks are still + held when returning to the OS or completing a scheduled job. +- Add support for new GCC 7 parametrized stack protector + + This gives us per-cpu guard values as well. For now I just + XOR a magic constant with the CPU PIR value. 
+- Mambo: run hello_world and sreset_world tests with Secure and Trusted Boot + + We *disable* the secure boot part, but we keep the verified boot + part as we don't currently have container verification code for Mambo. + + We can run a small part of the code currently though. + +- core/flash.c: extern function to get the name of a PNOR partition + + This adds the flash_map_resource_name() to allow skiboot subsystems to + lookup the name of a PNOR partition. Thus, we don't need to duplicate + the same information in other places (e.g. libstb). +- libflash/mbox-flash: only wait for MBOX_DEFAULT_POLL_MS if busy + + This makes the mbox unit test run 300x quicker and seems to + shave about 6 seconds from boot time on Witherspoon. +- make check: Make valgrind optional + + To (slightly) lower the barrier for contributions, we can make valgrind + optional with just a small amount of plumbing. + + This allows make check to run successfully without valgrind. +- libflash/test: Add tests for mbox-flash + + A first basic set of tests for mbox-flash. These tests do their testing + by stubbing out or otherwise replacing functions not in + libflash/mbox-flash.c. The stubbed out version of the function can then + be used to emulate a BMC mbox daemon talking to back to the code in + mbox-flash and it can ensure that there is some adherence to the + protocol and that from a block-level api point of view the world appears + sane. + + This makes these tests simple to run and they have been integrated into + `make check`. The down side is that these tests rely on duplicated + feature incomplete BMC daemon behaviour. Therefore these tests are a + strong indicator of broken behaviour but a very unreliable indicator of + correctness. + + Full integration tests with a 'real' BMC daemon are probably beyond the + scope of this repository. +- external/test/test.sh: fix VERSION substitution when no tags + + i.e. 
we get a hash rather than a version number + + This seems to be occurring in Travis if it doesn't pull a tag. +- external/test: make stripping out version number more robust + + For some bizarre reason, Travis started failing on this + substitution when there'd been zero code changes in this + area... This at least papers over whatever the problem is + for the time being. +- io: Add load_wait() helper + + This uses the standard form twi/isync pair to ensure a load + is consumed by the core before continuing. This can be necessary + under some circumstances for example when having the following + sequence: + + - Store reg A + - Load reg A (ensure above store pushed out) + - delay loop + - Store reg A + + I.E., a mandatory delay between 2 stores. In theory the first store + is only guaranteed to reach the device after the load from the same + location has completed. However the processor will start executing + the delay loop without waiting for the return value from the load. + + This construct enforces that the delay loop isn't executed until + the load value has been returned. +- chiptod: Keep boot timestamps contiguous + + Currently we reset the timebase value to (almost) zero when + synchronising the timebase of each chip to the Chip TOD network which + results in this: :: + + [ 42.374813167,5] CPU: All 80 processors called in... + [ 2.222791151,5] FLASH: Found system flash: Macronix MXxxL51235F id:0 + [ 2.222977933,5] BT: Interface initialized, IO 0x00e4 + + This patch modifies the chiptod_init() process to use the current + timebase value rather than resetting it to zero. This results in the + timestamps remaining contiguous from the start of hostboot until + the petikernel starts. e.g. :: + + [ 70.188811484,5] CPU: All 144 processors called in... + [ 72.458004252,5] FLASH: Found system flash: id:0 + [ 72.458147358,5] BT: Interface initialized, IO 0x00e4 + +- hdata/spira: Add missing newline to prlog() call + + We're missing a \n here. 
+- opal/xscom: Add recovery for lost core wakeup SCOM failures. + + Due to a hardware issue where core responding to SCOM was delayed due to + thread reconfiguration, leaves the SCOM logic in a state where the + subsequent SCOM to that core can get errors. This is affected for Core + PC SCOM registers in the range of 20010A80-20010ABF + + The solution is if a xscom timeout occurs to one of Core PC SCOM registers + in the range of 20010A80-20010ABF, a clearing SCOM write is done to + 0x20010800 with data of '0x00000000' which will also get a timeout but + clears the SCOM logic errors. After the clearing write is done the original + SCOM operation can be retried. + + The SCOM timeout is reported as status 0x4 (Invalid address) in HMER[21-23]. +- opal/xscom: Move the delay inside xscom_reset() function. + + So caller of xscom_reset() does not have to bother about adding a delay + separately. Instead caller can control whether to add a delay or not using + second argument to xscom_reset(). +- timer: Stop calling list_top() racily + + This will trip the debug checks in debug builds under some circumstances + and is actually a rather bad idea as we might look at a timer that is + concurrently being removed and modified, and thus incorrectly assume + there is no work to do. +- fsp: Bail out of HIR if FSP is resetting voluntarily + + a. Surveillance response times out and OPAL triggers a HIR + b. Before the HIR process kicks in, OPAL gets a PSI interrupt indicating link down + c. HIR process continues and OPAL tries to write to DRCR; PSI link inactive => xstop + + OPAL should confirm that the FSP is not already in reset in the HIR path. +- sreset_kernel: only run SMT tests due to not supporting re-entry +- Use systemsim-p9 v1.1 +- direct-controls: enable fast reboot direct controls for mambo + + Add mambo direct controls to stop threads, which is required for + reliable fast-reboot. Enable direct controls by default on mambo. 
+- core/opal: always verify cpu->pir on entry +- asm/head: add entry/exit calls + + Add entry and exit C functions that can do some more complex + checks before the opal proper call. This requires saving off + volatile registers that have arguments in them. +- core/lock: improve bust_locks + + Prevent try_lock from modifying the lock state when bust_locks is set. + unlock will not unlock it in that case, so locks will get taken and + never released while bust_locks is set. +- hw/occ: Log proper SCOM register names + + This patch fixes the logging of incorrect SCOM + register names. +- mambo: Add support for NUMA + + Currently the mambo scripts can do multiple chips, but only the first + one ever has memory. + + This patch adds support for having memory on each chip, with each + appearing as a separate NUMA node. Each node gets MEM_SIZE worth of + memory. + + It's opt-in, via ``export MAMBO_NUMA=1``. +- external/mambo: Switch qtrace command to use plug-ins + + The plug-in seems to be the preferred way to do this now, it works + better, and the qtracer emitter seems to generate invalid traces + in new mambo versions. +- asm/head: Loop after attn + + We use the attn instruction to raise an error in early boot if OPAL + doesn't recognise the PVR. It's possible for hostboot to disable the + attn instruction before entering OPAL so add an extra busy loop after + the attn to prevent attempting to boot on an unknown processor. diff --git a/roms/skiboot/doc/release-notes/skiboot-5.10-rc2.rst b/roms/skiboot/doc/release-notes/skiboot-5.10-rc2.rst new file mode 100644 index 000000000..b39ebb938 --- /dev/null +++ b/roms/skiboot/doc/release-notes/skiboot-5.10-rc2.rst @@ -0,0 +1,162 @@ +.. _skiboot-5.10-rc2: + +skiboot-5.10-rc2 +================ + +skiboot v5.10-rc2 was released on Friday February 9th 2018. It is the second +release candidate of skiboot 5.10, which will become the new stable release +of skiboot following the 5.9 release, first released October 31st 2017.
+ +skiboot v5.10-rc2 contains all bug fixes as of :ref:`skiboot-5.9.8` +and :ref:`skiboot-5.4.9` (the currently maintained stable releases). There +may be more 5.9.x stable releases, it will depend on demand. + +For how the skiboot stable releases work, see :ref:`stable-rules` for details. + +The current plan is to cut the final 5.10 in February, with skiboot 5.10 +being for all POWER8 and POWER9 platforms in op-build v1.21. +This release will be targeted to early POWER9 systems. + +Over skiboot-5.10-rc1, we have the following changes: + +- hw/npu2: Implement logging HMI actions +- opal-prd: Fix FTBFS with -Werror=format-overflow + + i2c.c fails to compile with gcc7 and -Werror=format-overflow used in + Debian Unstable and Ubuntu 18.04 : :: + + i2c.c: In function ‘i2c_init’: + i2c.c:211:15: error: ‘%s’ directive writing up to 255 bytes into a + region of size 236 [-Werror=format-overflow=] + +- core/exception: beautify exception handler, add MCE-involved registers + + Print DSISR and DAR, to help with deciphering machine check exceptions, + and improve the output a bit, decode NIP symbol, improve alignment, etc. + Also print a specific header for machine check, because we do expect to + see these if there is a hardware failure. + + Before: :: + + [ 0.005968779,3] *********************************************** + [ 0.005974102,3] Unexpected exception 200 ! + [ 0.005978696,3] SRR0 : 000000003002ad80 SRR1 : 9000000000001000 + [ 0.005985239,3] HSRR0: 00000000300027b4 HSRR1: 9000000030001000 + [ 0.005991782,3] LR : 000000003002ad80 CTR : 0000000000000000 + [ 0.005998130,3] CFAR : 00000000300b58bc + [ 0.006002769,3] CR : 40000004 XER: 20000000 + [ 0.006008069,3] GPR00: 000000003002ad80 GPR16: 0000000000000000 + [ 0.006015170,3] GPR01: 0000000031c03bd0 GPR17: 0000000000000000 + [...] 
+ + After: :: + + [ 0.003287941,3] *********************************************** + [ 0.003561769,3] Fatal MCE at 000000003002ad80 .nvram_init+0x24 + [ 0.003579628,3] CFAR : 00000000300b5964 + [ 0.003584268,3] SRR0 : 000000003002ad80 SRR1 : 9000000000001000 + [ 0.003590812,3] HSRR0: 00000000300027b4 HSRR1: 9000000030001000 + [ 0.003597355,3] DSISR: 00000000 DAR : 0000000000000000 + [ 0.003603480,3] LR : 000000003002ad68 CTR : 0000000030093d80 + [ 0.003609930,3] CR : 40000004 XER : 20000000 + [ 0.003615698,3] GPR00: 00000000300149e8 GPR16: 0000000000000000 + [ 0.003622799,3] GPR01: 0000000031c03bc0 GPR17: 0000000000000000 + [...] +- core/init: manage MSR[ME] explicitly, always enable + + The current boot sequence inherits MSR[ME] from the IPL firmware, and + never changes it. Some environments disable MSR[ME] (e.g., mambo), and + others can enable it (hostboot). + + This has two problems. First, MSR[ME] must be disabled while in the + process of taking over the interrupt vector from the previous + environment. Second, after installing our machine check handler, + MSR[ME] should be enabled to get some useful output rather than a + checkstop. +- fast-reboot: occ: Re-parse the pstate table during fast-reboot + + OCC shares the frequency list to host by copying the pstate table to + main memory in HOMER. This table is parsed during boot to create + device-tree properties for frequency and pstate IDs. OCC can update + the pstate table to present a new set of frequencies to the host. But + host will remain oblivious to these changes unless it is re-inited + with the updated device-tree CPU frequency properties. So this patch + allows to re-parse the pstate table and update the device-tree + properties during fast-reboot. + + OCC updates the pstate table when asked to do so using pstate-table + bias command. And this is mainly used by WOF team for + characterization purposes. 
+- fast-reboot: move pci_reset error handling into fast-reboot code + + pci_reset() currently does a platform reboot if it fails. It + should not know about fast-reboot at this level, so instead have + it return an error, and the fast reboot caller will do the + platform reboot. + + The code essentially does the same thing, but flexibility is + improved. Ideally the fast reboot code should perform pci_reset + and all such fail-able operations before the CPU resets itself + and destroys its own stack. That's not the case now, but that + should be the goal. +- capi: Fix the max tlbi divider and the directory size. + + Switch to 512KB mode (directory size) as we don’t use bit 48 of the tag + in addressing the array. This mode is controlled by the Snoop CAPI + Configuration Register. + Set the maximum number of data polls received before signaling + that the TLBI hang detect timer has expired. The value of '0000' is equal to 16. +- npu2/tce: Fix page size checking + + The page size is encoded in the TVT data [59:63] as @shift+11 but + the tce_kill handler does not do the math right; this fixes it. +- stb: Enforce secure boot if called before libstb initialized +- stb: Correctly error out when no PCR for resource +- core/init: move imc catalog preload init after the STB init. + + To be on the safe side, move the imc catalog preload after the STB init + to make sure the imc catalog resource gets verified and measured + properly during loading when both secure and trusted boot modes + are on. +- libstb: fix failure of calling trusted measure without STB initialization. + + When we load a flash resource during OPAL init, STB calls trusted measure + to measure the given resource. If a flash resource is loaded + before STB initialization, trusted measure cannot measure it properly. + + So this patch fixes this issue by calling trusted measure only if the + corresponding trusted init was done.
+ + The ideal fix is to make sure STB init is done first during init + and only then load the flash resources, so that STB can properly + verify and measure all resources. +- libstb: fix failure of calling cvc verify without STB initialization. + + Currently, at various stages of OPAL init, we load various + PNOR partition containers from the flash device. When we load a flash + resource STB calls the CVC verify and trusted measure (sha512) functions. + So when a flash resource gets loaded before STB initialization, + the cvc verify function fails to start the verify and enforce the boot. + + Below is an example failure where our VERSION partition gets + loaded early in the boot stage without STB initialization done. + + This is with secure mode off. + STB: VERSION NOT VERIFIED, invalid param. buf=0x305ed930, len=4096 key-hash=0x0 hash-size=0 + + In the same code path when secure mode is on, the boot process will abort. + + So this patch fixes this issue by calling cvc verify only if + STB init was done. + + We also need a permanent fix in the init path to ensure STB init gets + done first, before we start loading all other flash resources. +- libstb/tpm_chip: Add missing new line to print messages. +- libstb: increase the log level of verify/measure messages to PR_NOTICE. + + Currently libstb logs the verify and hash calculation messages at + PR_INFO level. So when secure boot enforcement happens + while loading the last flash resource (e.g. BOOTKERNEL), the previous verify + and measure messages are not logged to console, so it is not clear + to the end user which resources were verified and measured. + So this patch fixes this by increasing the log level to PR_NOTICE.
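The page size encoding mentioned in the npu2/tce item of these notes can be illustrated with a small C sketch. It is not the skiboot implementation: the helper name `sketch_tce_page_size()` and the mask macro are hypothetical, and it assumes that "encoded as @shift+11" means the page shift is the stored field value plus 11, with bits [59:63] in IBM bit numbering being the five least significant bits of the 64-bit TVT entry.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Assumption: TVT bits [59:63] use IBM numbering (bit 63 is the least
 * significant bit), so the page size field is the low 5 bits.
 */
#define SKETCH_TVT_PSIZE_MASK 0x1fULL

/* Hypothetical helper: decode the TCE page size in bytes from a TVT entry. */
uint64_t sketch_tce_page_size(uint64_t tvt)
{
	uint64_t field = tvt & SKETCH_TVT_PSIZE_MASK;

	/* Assumed reading of the note: page shift = stored field + 11. */
	uint64_t shift = field + 11;

	assert(shift < 64);
	return 1ULL << shift;
}
```

Under this reading, a field value of 1 decodes to 4 KiB pages and a field value of 5 to 64 KiB pages.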
diff --git a/roms/skiboot/doc/release-notes/skiboot-5.10-rc3.rst b/roms/skiboot/doc/release-notes/skiboot-5.10-rc3.rst new file mode 100644 index 000000000..03ceb0a02 --- /dev/null +++ b/roms/skiboot/doc/release-notes/skiboot-5.10-rc3.rst @@ -0,0 +1,148 @@ +.. _skiboot-5.10-rc3: + +skiboot-5.10-rc3 +================ + +skiboot v5.10-rc3 was released on Thursday February 15th 2018. It is the third +release candidate of skiboot 5.10, which will become the new stable release +of skiboot following the 5.9 release, first released October 31st 2017. + +skiboot v5.10-rc3 contains all bug fixes as of :ref:`skiboot-5.9.8` +and :ref:`skiboot-5.4.9` (the currently maintained stable releases). There +may be more 5.9.x stable releases, it will depend on demand. + +For how the skiboot stable releases work, see :ref:`stable-rules` for details. + +The current plan is to cut the final 5.10 in February, with skiboot 5.10 +being for all POWER8 and POWER9 platforms in op-build v1.21. +This release will be targeted to early POWER9 systems. + +Over skiboot-5.10-rc2, we have the following changes: + +- vas: Disable VAS/NX-842 on some P9 revisions + + VAS/NX-842 are not functional on some P9 revisions, so disable them + in hardware and skip creating their device tree nodes. + + Since the intent is to prevent OS from configuring VAS/NX, we remove + only the platform device nodes but leave the VAS/NX DT nodes under + xscom (i.e we don't skip add_vas_node() in hdata/spira.c) +- phb4: Only escalate freezes on MMIO load where necessary + + In order to work around a hardware issue, MMIO load freezes were + escalated to fences on every chip. Now that hardware no longer requires + this, restrict escalation to the chips that actually need it. +- pflash: Fix makefile dependency issue +- DT: Add "version" property under ibm, firmware-versions node + + First line of VERSION section in PNOR contains firmware version. + Use that to add "version" property under firmware versions dt node. 
+ + Sample output: + + .. code-block:: console + + root@xxx2:/proc/device-tree/ibm,firmware-versions# lsprop + version "witherspoon-ibm-OP9_v1.19_1.94" + +- npu2: Disable TVT range check when in bypass mode + + On POWER9 the GPUs need to be able to access the MMIO memory space. Therefore + the TVT range check needs to include the MMIO address space. As any possible + range check would cover all of memory anyway this patch just disables the TVT + range check all together when bypassing the TCE tables. +- hw/npu2: support creset of npu2 devices + + creset calls in the hw procedure that resets the PHY, we don't + take them out of reset, just put them in reset. + + this fixes a kexec issue. +- ATTN: Enable flush instruction cache bit in HID register + + In P9, we have to enable "flush the instruction cache" bit along with + "attn instruction support" bit to trigger attention. +- capi: Enable channel tag streaming for PHB in CAPP mode + + We re-enable channel tag streaming for PHB in CAPP mode as without it + PEC was waiting for cresp for each DMA write command before sending a + new DMA write command on the Powerbus. This resulted in much lower DMA + write performance than expected. + + The patch updates enable_capi_mode() to remove the masking of + channel_streaming_en bit in PBCQ Hardware Configuration Register. Also + does some re-factoring of the code that updates this register to use + xscom_write_mask instead of xscom_read followed by a xscom_write. +- core/device.c: Fix dt_find_compatible_node + + dt_find_compatible_node() and dt_find_compatible_node_on_chip() are used to + find device nodes under a parent/root node with a given compatible + property. + + dt_next(root, prev) is used to walk the child nodes of the given parent and + takes two arguments - root contains the parent node to walk whilst prev + contains the previous child to search from so that it can be used as an + iterator over all children nodes. 
+ + The first iteration of dt_find_compatible_node(root, prev) calls + dt_next(root, root) which is not a well-defined operation as prev is + assumed to be a child of the root node. The result is that when a node + contains no children it will start returning the parent node's siblings + until it hits the top of the tree at which point a NULL dereference is + attempted when looking for the root node's parent. + + Dereferencing NULL can result in undesirable data exceptions during system + boot and untimely non-hilarious system crashes. dt_next() should not be + called with prev == root. Instead we add a check to dt_next() such that + passing prev = NULL will cause it to start iterating from the first child + node (if any). +- stb: Put correct label (for skiboot) into container + + Hostboot will expect the label field of the stb header to contain + "PAYLOAD" for skiboot or it will fail to load and run skiboot. + + The failure looks something like this: :: + + 53.40896|ISTEP 20. 1 - host_load_payload + 53.65840|secure|Secureboot Failure plid = 0x90000755, rc = 0x1E07 + + 53.65881|System shutting down with error status 0x1E07 + 53.67547|================================================ + 53.67954|Error reported by secure (0x1E00) PLID 0x90000755 + 53.67560| Container's component ID does not match expected component ID + 53.67561| ModuleId 0x09 SECUREBOOT::MOD_SECURE_VERIFY_COMPONENT + 53.67845| ReasonCode 0x1e07 SECUREBOOT::RC_ROM_VERIFY + 53.67998| UserData1 : 0x0000000000000000 + 53.67999| UserData2 : 0x0000000000000000 + 53.67999|------------------------------------------------ + 53.68000| Callout type : Procedure Callout + 53.68000| Procedure : EPUB_PRC_HB_CODE + 53.68001| Priority : SRCI_PRIORITY_HIGH + 53.68001|------------------------------------------------ + 53.68002| Callout type : Procedure Callout + 53.68003| Procedure : EPUB_PRC_FW_VERIFICATION_ERR + 53.68003| Priority : SRCI_PRIORITY_HIGH + 53.68004|------------------------------------------------ +- hw/occ:
Fix fast-reboot crash in P8 platforms. + + commit 85a1de35cbe4 ("fast-boot: occ: Re-parse the pstate table during fast-boot" ) + breaks the fast-reboot on P8 platforms while reiniting the OCC pstates. On P8 + platforms OPAL adds additional two properties #address-cells and #size-cells + under ibm,opal/power-mgmt/ DT node. While in fast-reboot same properties adding + back to the same node results in Duplicate properties and hence fast-reboot fails + with below traces. :: + + [ 541.410373292,5] OCC: All Chip Rdy after 0 ms + [ 541.410488745,3] Duplicate property "#address-cells" in node /ibm,opal/power-mgt + [ 541.410694290,0] Aborting! + CPU 0058 Backtrace: + S: 0000000031d639d0 R: 000000003001367c .backtrace+0x48 + S: 0000000031d63a60 R: 000000003001a03c ._abort+0x4c + S: 0000000031d63ae0 R: 00000000300267d8 .new_property+0xd8 + S: 0000000031d63b70 R: 0000000030026a28 .__dt_add_property_cells+0x30 + S: 0000000031d63c10 R: 000000003003ea3c .occ_pstates_init+0x984 + S: 0000000031d63d90 R: 00000000300142d8 .load_and_boot_kernel+0x86c + S: 0000000031d63e70 R: 000000003002586c .fast_reboot_entry+0x358 + S: 0000000031d63f00 R: 00000000300029f4 fast_reset_entry+0x2c + + This patch fixes this issue by removing these two properties on P8 while doing + OCC pstates re-init in fast-reboot code path. diff --git a/roms/skiboot/doc/release-notes/skiboot-5.10-rc4.rst b/roms/skiboot/doc/release-notes/skiboot-5.10-rc4.rst new file mode 100644 index 000000000..c5e0b3c8b --- /dev/null +++ b/roms/skiboot/doc/release-notes/skiboot-5.10-rc4.rst @@ -0,0 +1,81 @@ +.. _skiboot-5.10-rc4: + +skiboot-5.10-rc4 +================ + +skiboot v5.10-rc4 was released on Wednesday February 21st 2018. It is the fourth +release candidate of skiboot 5.10, which will become the new stable release +of skiboot following the 5.9 release, first released October 31st 2017. 
+ +skiboot v5.10-rc4 contains all bug fixes as of :ref:`skiboot-5.9.8` +and :ref:`skiboot-5.4.9` (the currently maintained stable releases). There +may be more 5.9.x stable releases, it will depend on demand. + +For how the skiboot stable releases work, see :ref:`stable-rules` for details. + +The current plan is to cut the final 5.10 in February, with skiboot 5.10 +being for all POWER8 and POWER9 platforms in op-build v1.21. +This release will be targeted to early POWER9 systems. + +Over skiboot-5.10-rc3, we have the following changes: + +- core: Fix mismatched names between reserved memory nodes & properties + + OPAL exposes reserved memory regions through the device tree in both new + (nodes) and old (properties) formats. + + However, the names used for these don't match - we use a generated cell + address for the nodes, but the plain region name for the properties. + + This fixes a warning from FWTS. +- sensor-groups: occ: Add support to disable/enable sensor group + + This patch adds a new opal call to enable/disable a sensor group. This + call is used to select the sensor groups that need to be copied to + main memory by OCC at runtime. +- sensors: occ: Add energy counters + + Export the accumulated power values as energy sensors. The accumulator + field of power sensors is used for representing energy counters which + can be exported as energy counters in Linux hwmon interface. +- sensors: Support reading u64 sensor values + + This patch adds support to read u64 sensor values. This also adds + changes to the core and the backend implementation code to make this + API the base call. Host can use this new API to read sensors + up to 64 bits wide. + + This adds a list to store the pointer to the kernel u32 buffer, for + older kernels making async sensor u32 reads.
+- dt: add /cpus/ibm,powerpc-cpu-features device tree bindings + + This is a new CPU feature advertising interface that is fine-grained, + extensible, aware of privilege levels, and gives control of features + to all levels of the stack (firmware, hypervisor, and OS). + + The design and binding specification is described in detail in doc/. +- phb3/phb4/p7ioc: Document supported TCE sizes in DT + + Add a new property, "ibm,supported-tce-sizes", to advertise to Linux how + big the available TCE sizes are. Each value is a bit shift, from + smallest to largest. +- phb4: Fix TCE page size + + The page sizes for TCEs on P9 were inaccurate and just copied from PHB3, + so correct them. +- Revert "pci: Shared slot state synchronisation for hot reset" + + An issue was found in shared slot reset where the system can be stuck in + an infinite loop, pull the code out until there's a proper fix. + + This reverts commit 1172a6c57ff3c66f6361e572a1790cbcc0e5ff37. +- hdata/iohub: Use only wildcard slots for pluggables + + We don't want to cause a VID:DID check against pluggable devices, as + they may use multiple devids. + + Narrow the condition under which VID:DID is listed in the dt, so that + we'll end up creating a wildcard slot for these instead. +- increase log verbosity in debug builds +- Add -debug to version on DEBUG builds +- cpu_wait_job: Correctly report time spent waiting for job diff --git a/roms/skiboot/doc/release-notes/skiboot-5.10.1.rst b/roms/skiboot/doc/release-notes/skiboot-5.10.1.rst new file mode 100644 index 000000000..baee512ae --- /dev/null +++ b/roms/skiboot/doc/release-notes/skiboot-5.10.1.rst @@ -0,0 +1,23 @@ +.. _skiboot-5.10.1: + +============== +skiboot-5.10.1 +============== + +skiboot 5.10.1 was released on Thursday March 1st, 2018. It replaces +:ref:`skiboot-5.10` as the current stable release in the 5.10.x series. + +Over :ref:`skiboot-5.10`, we have an improvement for debugging NPU2/NVLink +problems and a bug fix. 
These changes are: + +- NPU2 HMIs: dump out a *LOT* of npu2 registers for debugging +- libflash/blocklevel: Correct miscalculation in blocklevel_smart_erase() + + This fixes a bug in pflash. + + If blocklevel_smart_erase() detects that the smart erase fits entirely in + one erase block, it has an early bail path. In this path it miscalculates + where in the buffer the backend needs to read from to perform the final + write. + + Fixes: https://github.com/open-power/skiboot/issues/151 diff --git a/roms/skiboot/doc/release-notes/skiboot-5.10.2.rst b/roms/skiboot/doc/release-notes/skiboot-5.10.2.rst new file mode 100644 index 000000000..9c828dfdd --- /dev/null +++ b/roms/skiboot/doc/release-notes/skiboot-5.10.2.rst @@ -0,0 +1,29 @@ +.. _skiboot-5.10.2: + +============== +skiboot-5.10.2 +============== + +skiboot 5.10.2 was released on Tuesday March 6th, 2018. It replaces +:ref:`skiboot-5.10.1` as the current stable release in the 5.10.x series. + +Over :ref:`skiboot-5.10.1`, we have one improvement: + +- Tie tm-suspend fw-feature and opal_reinit_cpus() together + + Currently opal_reinit_cpus(OPAL_REINIT_CPUS_TM_SUSPEND_DISABLED) + always returns OPAL_UNSUPPORTED. + + This ties the tm suspend fw-feature to the + opal_reinit_cpus(OPAL_REINIT_CPUS_TM_SUSPEND_DISABLED) so that when tm + suspend is disabled, we correctly report it to the kernel. For + backwards compatibility, it's assumed tm suspend is available if the + fw-feature is not present. + + Currently hostboot will clear fw-feature(TM_SUSPEND_ENABLED) on P9N + DD2.1. P9N DD2.2 will set fw-feature(TM_SUSPEND_ENABLED). DD2.0 and + below have TM disabled completely (not just suspend). + + We are using opal_reinit_cpus() to determine this setting (rather than + the device tree/HDAT) as some future firmware may let us change this + dynamically after boot. That is not the case currently though.
diff --git a/roms/skiboot/doc/release-notes/skiboot-5.10.3.rst b/roms/skiboot/doc/release-notes/skiboot-5.10.3.rst new file mode 100644 index 000000000..0dc87ed31 --- /dev/null +++ b/roms/skiboot/doc/release-notes/skiboot-5.10.3.rst @@ -0,0 +1,82 @@ +.. _skiboot-5.10.3: + +============== +skiboot-5.10.3 +============== + +skiboot 5.10.3 was released on Thursday March 28th, 2018. It replaces +:ref:`skiboot-5.10.2` as the current stable release in the 5.10.x series. + +It is recommended that 5.10.3 be used instead of any previous 5.10.x version +due to the bug fixes and debugging enhancements in it. + +Over :ref:`skiboot-5.10.2`, we have a few improvements and bug fixes: + +- NPU2: dump NPU2 registers on npu2 HMI + + Due to the nature of debugging npu2 issues, folk are wanting the + full list of NPU2 registers dumped when there's a problem. + + This is different than the solution introduced in 5.10.1 + as there we would dump the registers in a way that would trigger a FIR + bit that would confuse PRD. +- npu2: Add performance tuning SCOM inits + + Peer-to-peer GPU bandwidth latency testing has produced some tunable + values that improve performance. Add them to our device initialization. + + File these under things that need to be cleaned up with nice #defines + for the register names and bitfields when we get time. + + A few of the settings are dependent on the system's particular NVLink + topology, so introduce a helper to determine how many links go to a + single GPU. +- hw/npu2: Assign a unique LPARSHORTID per GPU + + This gets used elsewhere to index items in the XTS tables. +- occ: Set up OCC messaging even if we fail to setup pstates + + This means that we no longer hit this bug if we fail to get valid pstates + from the OCC. :: + + [console-pexpect]#echo 1 > //sys/firmware/opal/sensor_groups//occ-csm0/clear + echo 1 > //sys/firmware/opal/sensor_groups//occ-csm0/clear + [ 94.019971181,5] CPU ATTEMPT TO RE-ENTER FIRMWARE! 
PIR=083d cpu @0x33cf4000 -> pir=083d token=8 + [ 94.020098392,5] CPU ATTEMPT TO RE-ENTER FIRMWARE! PIR=083d cpu @0x33cf4000 -> pir=083d token=8 + [ 10.318805] Disabling lock debugging due to kernel taint + [ 10.318808] Severe Machine check interrupt [Not recovered] + [ 10.318812] NIP [000000003003e434]: 0x3003e434 + [ 10.318813] Initiator: CPU + [ 10.318815] Error type: Real address [Load/Store (foreign)] + [ 10.318817] opal: Hardware platform error: Unrecoverable Machine Check exception + [ 10.318821] CPU: 117 PID: 2745 Comm: sh Tainted: G M 4.15.9-openpower1 #3 + [ 10.318823] NIP: 000000003003e434 LR: 000000003003025c CTR: 0000000030030240 + [ 10.318825] REGS: c00000003fa7bd80 TRAP: 0200 Tainted: G M (4.15.9-openpower1) + [ 10.318826] MSR: 9000000000201002 <SF,HV,ME,RI> CR: 48002888 XER: 20040000 + [ 10.318831] CFAR: 0000000030030258 DAR: 394a00147d5a03a6 DSISR: 00000008 SOFTE: 1 +- core/fast-reboot: disable fast reboot upon fundamental entry/exit/locking errors + + This disables fast reboot in several more cases where serious errors + like lock corruption or call re-entrancy are detected. +- core/opal: allow some re-entrant calls + + This allows a small number of OPAL calls to succeed despite re-entering + the firmware, and rejects others rather than aborting. + + This allows a system reset interrupt that interrupts OPAL to do something + useful. Sreset other CPUs, use the console, which allows xmon to work or + stack traces to be printed, reboot the system. + + Use OPAL_INTERNAL_ERROR when rejecting, rather than OPAL_BUSY, which is + used for many other things that does not mean a serious permanent error. +- core/opal: abort in case of re-entrant OPAL call + + The stack is already destroyed by the time we get here, so there + is not much point continuing. +- npu2: Disable fast reboot + + Fast reboot does not yet work right with the NPU. It's been disabled on + NVLink and OpenCAPI machines. Do the same for NVLink2. 
+ + This amounts to a port of 3e4577939bbf ("npu: Fix broken fast reset") + from the npu code to npu2. diff --git a/roms/skiboot/doc/release-notes/skiboot-5.10.4.rst b/roms/skiboot/doc/release-notes/skiboot-5.10.4.rst new file mode 100644 index 000000000..a2c3b6b04 --- /dev/null +++ b/roms/skiboot/doc/release-notes/skiboot-5.10.4.rst @@ -0,0 +1,28 @@ +.. _skiboot-5.10.4: + +============== +skiboot-5.10.4 +============== + +skiboot 5.10.4 was released on Wednesday April 4th, 2018. It replaces +:ref:`skiboot-5.10.3` as the current stable release in the 5.10.x series. + +It is recommended that 5.10.4 be used instead of any previous 5.10.x version +due to the bug fixes and debugging enhancements in it. + +Over :ref:`skiboot-5.10.3`, we have one bug fix: + +- xive: disable store EOI support + + Hardware has limitations which would require putting a sync after each + store EOI to make sure the MMIO operations that change the ESB state + are ordered. This is a killer for performance and the PHBs do not + support the sync. So remove the store EOI for the moment, until + hardware is improved. + + Also, while we are changing the XIVE source flags, let's fix the + settings for the PHB4s which should follow these rules: + + - SHIFT_BUG for DD10 + - STORE_EOI for DD20 and if enabled + - TRIGGER_PAGE for DDx0 and if not STORE_EOI diff --git a/roms/skiboot/doc/release-notes/skiboot-5.10.5.rst b/roms/skiboot/doc/release-notes/skiboot-5.10.5.rst new file mode 100644 index 000000000..1cc16eaf5 --- /dev/null +++ b/roms/skiboot/doc/release-notes/skiboot-5.10.5.rst @@ -0,0 +1,61 @@ +.. _skiboot-5.10.5: + +============== +skiboot-5.10.5 +============== + +skiboot 5.10.5 was released on Tuesday April 24th, 2018. It replaces +:ref:`skiboot-5.10.4` as the current stable release in the 5.10.x series. + +It is recommended that 5.10.5 be used instead of any previous 5.10.x version +due to the bug fixes and debugging enhancements in it.
+ +Over :ref:`skiboot-5.10.4`, we have four bug fixes: + +- npu2/hw-procedures: fence bricks on GPU reset + + The NPU workbook defines a way of fencing a brick and + getting the brick out of fence state. We do have an implementation + of bringing the brick out of fenced/quiesced state. We do + the latter in our procedures, but to support run time reset + we need to do the former. + + The fencing ensures that access to memory behind the links + will not lead to HMIs, but instead SUEs will be populated + in cache (in the case of speculation). The expectation is then + that prior to and after reset, the operating system components + will flush the cache for the region of memory behind the GPU. + + This patch does the following: + + 1. Implements a npu2_dev_fence_brick() function to set/clear + fence state + 2. Clears FIR bits prior to clearing the fence status + 3. Clears the fence status + 4. Takes the powerbus out of CQ fence much later now, + in credits_check() which is the last hardware procedure + called after link training. + +- hdata/spira: parse vpd to add part-number and serial-number to xscom@ node + + Expected by FWTS and associates our processor with the part/serial + number, which is obviously a good thing for one's own sanity. +- hw/imc: Check for pause_microcode_at_boot() return status + + pause_microcode_at_boot() loops through each chip's ucode + control block and pauses the ucode if it is in the running state. + But it does not fail if any of the chip's ucode is not initialised. + + Add code to return a failure if ucode is not initialized in any + of the chips. Since pause_microcode_at_boot() is called just before + attaching the IMC device nodes in imc_init(), add code to check for + the function return.
+- core/cpufeatures: Fix setting DARN and SCV HWCAP feature bits + + DARN and SCV has been assigned AT_HWCAP2 (32-63) bits: :: + + #define PPC_FEATURE2_DARN 0x00200000 /* darn random number insn */ + #define PPC_FEATURE2_SCV 0x00100000 /* scv syscall */ + + A cpufeatures-aware OS will not advertise these to userspace without + this patch. diff --git a/roms/skiboot/doc/release-notes/skiboot-5.10.6.rst b/roms/skiboot/doc/release-notes/skiboot-5.10.6.rst new file mode 100644 index 000000000..be9ea4def --- /dev/null +++ b/roms/skiboot/doc/release-notes/skiboot-5.10.6.rst @@ -0,0 +1,48 @@ +.. _skiboot-5.10.6: + +============== +skiboot-5.10.6 +============== + +skiboot 5.10.6 was released on Monday May 28th, 2018. It replaces +:ref:`skiboot-5.10.5` as the current stable release in the 5.10.x series. + +It is recommended that 5.10.6 be used instead of any previous 5.10.x version, +especially due to the locking bug fixes. + +It is expected that this will be the final 5.10.x version, with 6.0.x taking +over as the main stable branch. + +Over :ref:`skiboot-5.10.5`, we have the following fixes: + +- opal-prd: Do not error out on first failure for soft/hard offline. + + The memory errors (CEs and UEs) that are detected as part of background + memory scrubbing are reported by PRD asynchronously to opal-prd along with + affected memory ranges. hservice_memory_error() converts these ranges into + page granularity before hooking up them to soft/hard offline-ing + infrastructure. + + But the current implementation of hservice_memory_error() does not hookup + all the pages to soft/hard offline-ing if any of the page offline action + fails. e.g hard offline can fail for: + + - Pages that are not part of buddy managed pool. + - Pages that are reserved by kernel using memblock_reserved() + - Pages that are in use by kernel. 
+ + But for the pages that are in use by user space application, the hard + offline marks the page as hwpoison, sends SIGBUS signal to kill the + affected application as recovery action and returns success. + + Hence, it is possible that some of the pages in that memory range are in + use by application or free. By stopping on first error we lose the + opportunity to hwpoison the subsequent pages which may be free or in use by + application. This patch fixes this issue. +- xive: fix missing unlock in error path + + Found with sparse and some added lock annotations. +- OPAL_PCI_SET_POWER_STATE: fix locking in error paths + + Otherwise we could exit OPAL holding locks, potentially leading + to all sorts of problems later on. diff --git a/roms/skiboot/doc/release-notes/skiboot-5.10.rst b/roms/skiboot/doc/release-notes/skiboot-5.10.rst new file mode 100644 index 000000000..02438ada3 --- /dev/null +++ b/roms/skiboot/doc/release-notes/skiboot-5.10.rst @@ -0,0 +1,2257 @@ +.. _skiboot-5.10: + +skiboot-5.10 +============ + +skiboot v5.10 was released on Friday February 23rd 2018. It is the first +release of skiboot 5.10, and becomes the new stable release +of skiboot following the 5.9 release, first released October 31st 2017. + +skiboot v5.10 contains all bug fixes as of :ref:`skiboot-5.9.8` +and :ref:`skiboot-5.4.9`. We do not foresee any further 5.9.x releases. + +For how the skiboot stable releases work, see :ref:`stable-rules` for details. + +Over skiboot-5.9, we have the following changes: + +New Features +------------ + +Since skiboot-5.10-rc3: + +- sensor-groups: occ: Add support to disable/enable sensor group + + This patch adds a new opal call to enable/disable a sensor group. This + call is used to select the sensor groups that need to be copied to + main memory by OCC at runtime. +- sensors: occ: Add energy counters + + Export the accumulated power values as energy sensors.
The accumulator + field of power sensors is used for representing energy counters which + can be exported as energy counters in Linux hwmon interface. +- sensors: Support reading u64 sensor values + + This patch adds support to read u64 sensor values. This also adds + changes to the core and the backend implementation code to make this + API as the base call. Host can use this new API to read sensors + up to 64 bits. + + This adds a list to store the pointer to the kernel u32 buffer, for + older kernels making async sensor u32 reads. +- dt: add /cpus/ibm,powerpc-cpu-features device tree bindings + + This is a new CPU feature advertising interface that is fine-grained, + extensible, aware of privilege levels, and gives control of features + to all levels of the stack (firmware, hypervisor, and OS). + + The design and binding specification is described in detail in doc/. + +Since skiboot-5.10-rc2: + +- DT: Add "version" property under ibm,firmware-versions node + + First line of VERSION section in PNOR contains firmware version. + Use that to add "version" property under firmware versions dt node. + + Sample output: + + .. code-block:: console + + root@xxx2:/proc/device-tree/ibm,firmware-versions# lsprop + version "witherspoon-ibm-OP9_v1.19_1.94" + +Since skiboot-5.10-rc1: + +- hw/npu2: Implement logging HMI actions + + +Since skiboot-5.9: + +- hdata: Parse IPL FW feature settings + + Add parsing for the firmware feature flags in the HDAT. This + indicates the settings of various parameters which are set at IPL time + by firmware. + +- opal/xstop: Use nvram option to enable/disable sw checkstop. + + Add a mechanism to enable/disable sw checkstop by looking at nvram option + opal-sw-xstop=<enable/disable>. + + For now this patch disables the sw checkstop trigger unless explicitly + enabled through nvram option 'opal-sw-xstop=enable' for p9.
This will allow + an opportunity to get host kernel in panic path or xmon for unrecoverable + HMIs or MCE, to be able to debug the issue effectively. + + To enable sw checkstop in opal issue following command: :: + + nvram -p ibm,skiboot --update-config opal-sw-xstop=enable + + **NOTE:** This is a workaround patch to disable sw checkstop by default to gain + control in host kernel for better checkstop debugging. Once we have most of + the checkstop issues stabilized/resolved, revisit this patch to enable sw + checkstop by default. + + For p8 platform it will remain enabled by default unless explicitly disabled. + + To disable sw checkstop on p8 issue following command: :: + + nvram -p ibm,skiboot --update-config opal-sw-xstop=disable +- hdata: Parse SPD data + + Parse SPD data and populate device tree. + + list of properties parsing from SPD: :: + + [root@ltc-wspoon dimm@d00f]# lsprop . + memory-id 0000000c (12) # DIMM type + product-version 00000032 (50) # Module Revision Code + device_type "memory-dimm-ddr4" + serial-number 15d9acb6 (366587062) + status "okay" + size 00004000 (16384) + phandle 000000bd (189) + ibm,loc-code "UOPWR.0000000-Node0-DIMM7" + part-number "36ASF2G72PZ-2G6B2 " + reg 0000d007 (53255) + name "dimm" + manufacturer-id 0000802c (32812) # Vendor ID, we can get vendor name from this ID + + Also update documentation. +- hdata: Add memory hierarchy under xscom node + + We have memory to chip mapping but doesn't have complete memory hierarchy. + This patch adds memory hierarchy under xscom node. This is specific to + P9 system as these hierarchy may change between processor generation. + + It uses memory controller ID details and populates nodes like: + xscom@<addr>/mcbist@<mcbist_id>/mcs@<mcs_id>/mca@<mca_id>/dimm@<resource_id> + + Also this patch adds few properties under dimm node. + Finally make sure xscom nodes created before calling memory_parse(). 
+ +Fast Reboot and Quiesce +^^^^^^^^^^^^^^^^^^^^^^^ +We have a preliminary fast reboot implementation for POWER9 systems, which +we look to enabling by default in the next release. + +The OPAL Quiesce calls are designed to improve reliability and debuggability +around reboot and error conditions. See the full API documentation for details: +:ref:`OPAL_QUIESCE`. + +- fast-reboot: bare bones fast reboot implementation for POWER9 + + This is an initial fast reboot implementation for p9 which has only been + tested on the Witherspoon platform, and without the use of NPUs, NX/VAS, + etc. + + This has worked reasonably well so far, with no failures in about 100 + reboots. It is hidden behind the traditional fast-reboot experimental + nvram option, until more platforms and configurations are tested. +- fast-reboot: move boot CPU clean-up logically together with secondaries + + Move the boot CPU clean-up and state transition to active, logically + together with secondaries. Don't release secondaries from fast reboot + hold until everyone has cleaned up and transitioned to active. + + This is cosmetic, but it is helpful to run the fast reboot state machine + the same way on all CPUs. +- fast-reboot: improve failure error messages + + Change existing failure error messages to PR_NOTICE so they get + printed to the console, and add some new ones. It's not a more + severe class because it falls back to IPL on failure. +- fast-reboot: quiesce opal before initiating a fast reboot + + Switch fast reboot to use quiescing rather than "wait for a while". + + If firmware can not be quiesced, then fast reboot is skipped. This + significantly improves the robustness of fast reboot in the face of + bugs or unexpected latencies. + + Complexity of synchronization in fast-reboot is reduced, because we + are guaranteed to be single-threaded when quiesce succeeds, so locks + can be removed. 
+ + In the case that firmware can be quiesced, then it will generally + reduce fast reboot times by nearly 200ms, because quiescing usually + takes very little time. +- core: Add support for quiescing OPAL + + Quiescing is ensuring all host controlled CPUs (except the current + one) are out of OPAL and prevented from entering. This can be use in + debug and shutdown paths, particularly with system reset sequences. + + This patch adds per-CPU entry and exit tracking for OPAL calls, and + adds logic to "hold" or "reject" at entry time, if OPAL is quiesced. + + An OPAL call is added, to expose the functionality to Linux, where it + can be used for shutdown, kexec, and before generating sreset IPIs for + debugging (so the debug code does not recurse into OPAL). +- dctl: p9 increase thread quiesce timeout + + We require all instructions to be completed before a thread is + considered stopped, by the dctl interface. Long running instructions + like cache misses and CI loads may take a significant amount of time + to complete, and timeouts have been observed in stress testing. + + Increase the timeout significantly, to cover this. The workbook + just says to poll, but we like to have timeouts to avoid getting + stuck in firmware. + + +POWER9 power saving +^^^^^^^^^^^^^^^^^^^ + +There is much improved support for deeper sleep/idle (stop) states on POWER9. + +- OCC: Increase max pstate check on P9 to 255 + + This has changed from P8, we can now have > 127 pstates. + + This was observed on Boston during WoF bring up. +- SLW: Add idle state stop5 for DD2.0 and above + + Adding stop5 idle state with rough residency and latency numbers. +- SLW: Add p9_stop_api calls for IMC + + Add p9_stop_api for EVENT_MASK and PDBAR scoms. These scoms are lost on + wakeup from stop11. + +- SCOM restore for DARN and XIVE + + While waking up from stop11, we want NCU_DARN_BAR to have enable bit set. + Without this stop_api call, the value restored is without enable bit set. 
+ We lose NCU_SPEC_BAR when the quad goes into stop11; stop_api will + restore it while waking up from stop11. + +- SLW: Call p9_stop_api only if deep_states are enabled + + All init time p9_stop_api calls have been isolated to slw_late_init. If + p9_stop_api fails, then the deep states can be excluded from device tree. + + For p9_stop_api called after device-tree for cpuidle is created, + has_deep_states will be used to check if this call is even required. +- Better handle errors in setting up sleep states (p9_stop_api) + + We won't put affected stop states in the device tree if the wakeup + engine is not present or has failed. +- SCOM Restore: Increased the EQ SCOM restore limit. + + Commit increases the SCOM restore limit from 16 to 31. +- hw/dts: retry special wakeup operation if core still gated + + It has been observed that in some cases the special wakeup + operation can "succeed" but the core is still in a gated/offline + state. + + Check for this state after attempting to wake up a core and retry + the wakeup if necessary. +- core/direct-controls: add function to read core gated state +- core/direct-controls: wait for core special wkup bit cleared + + When clearing special wakeup bit on a core, wait until the + bit is actually cleared by the hardware in the status register + before returning success. + + This may help avoid issues with back-to-back reads where the + special wakeup request is cleared but the firmware is still + processing the request and the next attempt to set the bit + reads an immediate success from the previous operation. +- p9_stop_api: PM: Added support for version control in SCOM restore entries. + + - adds version info in SCOM restore entry header + - adds version specific details in SCOM restore entry header + - retains old behaviour of SGPE Hcode's base version +- p9_stop_api: EQ SCOM Restore: Introduced version control in SCOM restore entry.
+ + - introduces version control in header of SCOM restore entry + - ensures backward compatibility + - introduces flexibility to handle any number of SCOM restore entry. + +Secure and Trusted Boot for POWER9 +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +We introduce support for Secure and Trusted Boot for POWER9 systems, with equal +functionality that we have on POWER8 systems, that is, we have the mechanisms in +place to boot to petitboot (i.e. to BOOTKERNEL). + +See the :ref:`stb-overview` for full documentation of OPAL secure and trusted boot. + +Since skiboot-5.10-rc2: + +- stb: Put correct label (for skiboot) into container + + Hostboot will expect the label field of the stb header to contain + "PAYLOAD" for skiboot or it will fail to load and run skiboot. + + The failure looks something like this: :: + + 53.40896|ISTEP 20. 1 - host_load_payload + 53.65840|secure|Secureboot Failure plid = 0x90000755, rc = 0x1E07 + + 53.65881|System shutting down with error status 0x1E07 + 53.67547|================================================ + 53.67954|Error reported by secure (0x1E00) PLID 0x90000755 + 53.67560| Container's component ID does not match expected component ID + 53.67561| ModuleId 0x09 SECUREBOOT::MOD_SECURE_VERIFY_COMPONENT + 53.67845| ReasonCode 0x1e07 SECUREBOOT::RC_ROM_VERIFY + 53.67998| UserData1 : 0x0000000000000000 + 53.67999| UserData2 : 0x0000000000000000 + 53.67999|------------------------------------------------ + 53.68000| Callout type : Procedure Callout + 53.68000| Procedure : EPUB_PRC_HB_CODE + 53.68001| Priority : SRCI_PRIORITY_HIGH + 53.68001|------------------------------------------------ + 53.68002| Callout type : Procedure Callout + 53.68003| Procedure : EPUB_PRC_FW_VERIFICATION_ERR + 53.68003| Priority : SRCI_PRIORITY_HIGH + 53.68004|------------------------------------------------ + +Since skiboot-5.10-rc1: + +- stb: Enforce secure boot if called before libstb initialized +- stb: Correctly error out when no PCR for resource +- core/init: move 
imc catalog preload init after the STB init. + + As a safer side move the imc catalog preload after the STB init + to make sure the imc catalog resource get's verified and measured + properly during loading when both secure and trusted boot modes + are on. +- libstb: fix failure of calling trusted measure without STB initialization. + + When we load a flash resource during OPAL init, STB calls trusted measure + to measure the given resource. There is a situation when a flash gets loaded + before STB initialization then trusted measure cannot measure properly. + + So this patch fixes this issue by calling trusted measure only if the + corresponding trusted init was done. + + The ideal fix is to make sure STB init done at the first place during init + and then do the loading of flash resources, by that way STB can properly + verify and measure the all resources. +- libstb: fix failure of calling cvc verify without STB initialization. + + Currently in OPAL init time at various stages we are loading various + PNOR partition containers from the flash device. When we load a flash + resource STB calls the CVC verify and trusted measure(sha512) functions. + So when we have a flash resource gets loaded before STB initialization, + then cvc verify function fails to start the verify and enforce the boot. + + Below is one of the example failure where our VERSION partition gets + loading early in the boot stage without STB initialization done. + + This is with secure mode off. + STB: VERSION NOT VERIFIED, invalid param. buf=0x305ed930, len=4096 key-hash=0x0 hash-size=0 + + In the same code path when secure mode is on, the boot process will abort. + + So this patch fixes this issue by calling cvc verify only if we have + STB init was done. + + And also we need a permanent fix in init path to ensure STB init gets + done at first place and then start loading all other flash resources. +- libstb/tpm_chip: Add missing new line to print messages. 
+- libstb: increase the log level of verify/measure messages to PR_NOTICE. + + Currently libstb logs the verify and hash calculation messages in + PR_INFO level. So when secure boot enforcement happens + while loading the last flash resource (e.g. BOOTKERNEL), the previous verify + and measure messages are not logged to console, so it is not clear + to the end user which resource is verified and measured. + So this patch fixes this by increasing the log level to PR_NOTICE. + +Since skiboot-5.9: + +- allow secure boot if not enforcing it + + We check the secure boot containers no matter what, only *enforcing* + secure boot if we're booting in secure mode. This gives us an extra + layer of checking firmware is legit even when secure mode isn't enabled, + as well as being really useful for testing. +- libstb/(create|print)-container: Sync with sb-signing-utils + + The sb-signing-utils project has improved upon the skeleton + create-container tool that existed in skiboot, including + being able to (quite easily) create *signed* images. + + This commit brings in that code (and makes it build in the + skiboot build environment) and updates our skiboot.*.stb + generating code to use the development keys. This means that by + default, skiboot build process will let you build firmware that can + do a secure boot with *development* keys. + + See :ref:`signing-firmware-code` for details on firmware signing. + + We also update print-container as well, syncing it with the + upstream project. + + Derived from github.com:open-power/sb-signing-utils.git + at v0.3-5-gcb111c03ad7f + (Some discussion ongoing on the changes, another sync will come shortly) + +- doc: update libstb documentation with POWER9 changes. + See: :ref:`stb-overview`.
+ + POWER9 changes reflected in the libstb: + + - bumped ibm,secureboot node to v2 + - added ibm,cvc node + - hash-algo superseded by hw-key-hash-size + +- libstb/cvc: update memory-region to point to /reserved-memory + + The linux documentation, reserved-memory.txt, says that memory-region is + a phandle that pairs to a children of /reserved-memory. + + This updates /ibm,secureboot/ibm,cvc/memory-region to point to + /reserved-memory/secure-crypt-algo-code instead of + /ibm,hostboot/reserved-memory/secure-crypt-algo-code. +- libstb: add support for ibm,secureboot-v2 + + ibm,secureboot-v2 changes: + + - The Container Verification Code is represented by the ibm,cvc node. + - Each ibm,cvc child describes a CVC service. + - hash-algo is superseded by hw-key-hash-size. +- hdata/tpmrel.c: add ibm, cvc device tree node + + In P9, the Container Verification Code is stored in a hostboot reserved + memory and the list of provided CVC services is stored in the + TPMREL_IDATA_HASH_VERIF_OFFSETS idata array. Each CVC service has an + offset and version. + + This adds the ibm,cvc device tree node and its documentation. +- hdata/tpmrel.c: add firmware event log info to the tpm node + + This parses the firmware event log information from the + secureboot_tpm_info HDAT structure and add it to the tpm device tree + node. + + There can be multiple secureboot_tpm_info entries with each entry + corresponding to a master processor that has a tpm device, however, + multiple tpm is not supported. +- hdata/spira: add ibm,secureboot node in P9 + + In P9, skiboot builds the device tree from the HDAT. These are the + "ibm,secureboot" node changes compared to P8: + + - The Container-Verification-Code (CVC), a.k.a. ROM code, is no longer + stored in a secure ROM with static address. In P9, it is stored in a + hostboot reserved memory and each service provided also has a version, + not only an offset. 
+ - The hash-algo property is not provided via HDAT, instead it provides + the hw-key-hash-size, which is indeed the information required by the + CVC to verify containers. + + This parses the iplparams_sysparams HDAT structure and creates the + "ibm,secureboot", which is bumped to "ibm,secureboot-v2". + + In "ibm,secureboot-v2": + + - hash-algo property is superseded by hw-key-hash-size. + - container verification code is explicitly described by a child node. + Added in a subsequent patch. + + See :ref:`device-tree/ibm,secureboot` for documentation. +- libstb/tpm_chip.c: define pr_fmt and fix messages logged + + This defines pr_fmt and also fix messages logged: + + - EV_SEPARATOR instead of 0xFFFFFFFF + - when an event is measured it also prints the tpm id, event type and + event log length + + Now we can filter the messages logged by libstb and its + sub-modules by running: :: + + grep STB /sys/firmware/opal/msglog +- libstb/tss: update the list of event types supported + + Skiboot, precisely the tpmLogMgr, initializes the firmware event log by + calculating its length so that a new event can be recorded without + exceeding the log size. In order to calculate the size, it walks through + the log until it finds a specific event type. However, if the log has + an unknown event type, the tpmLogMgr will not be able to reach the end + of the log. + + This updates the list of event types with all of those supported by + hostboot. Thus, skiboot can properly calculate the event log length. +- tpm_i2c_nuvoton: add nuvoton, npct601 to the compatible property + + The linux kernel doesn't have a driver compatible with + "nuvoton,npct650", but it does have for "nuvoton,npct601", which should + also be compatible with npct650. + + This adds "nuvoton,npct601" to the compatible devtree property. 
+- libstb/trustedboot.c: import stb_final() from stb.c + + The stb_final() primary goal is to measure the event EV_SEPARATOR + into PCR[0-7] when trusted boot is about to exit the boot services. + + This imports the stb_final() from stb.c into trustedboot.c, but making + the following changes: + + - Rename it to trustedboot_exit_boot_services(). + - As specified in the TCG PC Client specification, EV_SEPARATOR events must + be logged with the name 0xFFFFFF. + - Remove the ROM driver clean-up call. + - Don't allow code to be measured in skiboot after + trustedboot_exit_boot_services() is called. +- libstb/cvc.c: import softrom behaviour from drivers/sw_driver.c + + Softrom is used only for testing with mambo. By setting + compatible="ibm,secureboot-v1-softrom" in the "ibm,secureboot" node, + firmware images can be properly measured even if the + Container-Verification-Code (CVC) is not available. In this case, the + mbedtls_sha512() function is used to calculate the sha512 hash of the + firmware images. + + This imports the softrom behaviour from libstb/drivers/sw_driver.c code + into cvc.c, but now softrom is implemented as a flag. When the flag is + set, the wrappers for the CVC services work the same way as in + sw_driver.c. +- libstb/trustedboot.c: import tb_measure() from stb.c + + This imports tb_measure() from stb.c, but now it calls the CVC sha512 + wrapper to calculate the sha512 hash of the firmware image provided. + + In trustedboot.c, the tb_measure() is renamed to trustedboot_measure(). + + The new function, trustedboot_measure(), no longer checks if the + container payload hash calculated at boot time matches with the hash + found in the container header. A few reasons: + + - If the system admin wants the container header to be + checked/validated, the secure boot jumper must be set. Otherwise, + the container header information may not be reliable. + - The container layout is expected to change over time. 
Skiboot + would need to maintain a parser for each container layout + change. + - Skiboot could be checking the hash against a container version that + is not supported by the Container-Verification-Code (CVC). + + The tb_measure() calls are updated to trustedboot_measure() in a + subsequent patch. +- libstb/secureboot.c: import sb_verify() from stb.c + + This imports the sb_verify() function from stb.c, but now it calls the + CVC verify wrapper in order to verify signed firmware images. The + hw-key-hash and hw-key-hash-size initialized in secureboot.c are passed + to the CVC verify function wrapper. + + In secureboot.c, the sb_verify() is renamed to secureboot_verify(). The + sb_verify() calls are updated in a subsequent patch. + +XIVE +---- +- xive: Don't bother cleaning up disabled EQs in reset + + Additionally, warn if we find an enabled one that isn't one + of the firmware built-in queues. +- xive: Warn on valid VPs found in abnormal cases + + If an allocated VP is left valid at xive_reset() or Linux tries + to free a valid (enabled) VP block, print errors. The former happens + occasionally if kdump'ing while KVM is running so keep it as a debug + message. The latter is a programming error in Linux so use an + error log level. +- xive: Properly reserve built-in VPs in non-group mode + + This is not normally used but if the #define is changed to + disable block group mode we would incorrectly clear the + buddy completely without marking the built-in VPs reserved. +- xive: Quieten debug messages in standard builds + + This makes a bunch of messages, especially the per-CPU ones, + only enabled in debug builds. This avoids clogging up the + OPAL logs with XIVE related messages that have proven not + to be particularly useful for field defects. +- xive: Implement "single escalation" feature + + This adds a new VP flag to control the new DD2.0 + "single escalation" feature.
+ + This feature allows us to have a single escalation + interrupt per VP instead of one per queue. + + It works by hijacking queue 7 (which is thus no longer + usable when that is enabled) and exploiting two new + hardware bits that will: + + - Make the normal queues (0..6) escalate unconditionally + thus ignoring the ESe bits. + - Route the above escalations to queue 7 + - Have queue 7 silently escalate without notification + + Thus the escalation of queue 7 becomes the one escalation + interrupt for all the other queues. +- xive: When disabling a VP, wipe all of its settings +- xive: Improve cleaning up of EQs + + Factors out the function that sets an EQ back to a clean + state and adds a cleaning pass for queues left enabled + when freeing a block of VPs. +- xive: When disabling an EQ, wipe all of its settings + + This avoids having configuration bits left over +- xive: Define API for single-escalation VP mode + + This mode allows all queues of a VP to use the same + escalation interrupt, at the cost of losing priority 7. + + This adds the definition and documentation of the API, + the implementation will come next. +- xive: Fix ability to clear some EQ flags + + We could never clear "unconditional notify" and "escalate" +- xive: Update inits for DD2.0 + + This updates some inits based on information from the HW + designers. This includes enabling some new DD2.0 features + that we don't yet exploit. +- xive: Ensure VC informational FIRs are masked + + Some HostBoot versions leave those as checkstop, they are harmless + and can sometimes occur during normal operations. +- xive: Fix occasional VC checkstops in xive_reset + + The current workaround for the scrub bug described in + __xive_cache_scrub() has an issue in that it can leave + dirty invalid entries in the cache. + + When cleaning up EQs or VPs during reset, if we then + remove the underlying indirect page for these entries, + the XIVE will checkstop when trying to flush them out + of the cache.
+ + This replaces the existing workaround with a new pair of + workarounds for VPs and EQs: + + - The VP one does the dummy watch on another entry than + the one we scrubbed (which does the job of pushing old + stores out) using an entry that is known to be backed by + a permanent indirect page. + - The EQ one switches to a more efficient workaround + which consists of doing a non-side-effect ESB load from + the EQ's ESe control bits. +- xive: Do not return a trigger page for an escalation interrupt + + This is bogus, we don't support them. (Thankfully the callers + didn't actually try to use this on escalation interrupts). +- xive: Mark a freed IRQs IVE as valid and masked + + Removing the valid bit means a FIR will trip if it's accessed + inadvertently. Under some circumstances, the XIVE will speculatively + access an IVE for a masked interrupt and trip it. So make sure that + freed entries are still marked valid (but masked). + +PCI +--- + +Since skiboot-5.10-rc3: + +- phb3/phb4/p7ioc: Document supported TCE sizes in DT + + Add a new property, "ibm,supported-tce-sizes", to advertise to Linux how + big the available TCE sizes are. Each value is a bit shift, from + smallest to largest. +- phb4: Fix TCE page size + + The page sizes for TCEs on P9 were inaccurate and just copied from PHB3, + so correct them. +- Revert "pci: Shared slot state synchronisation for hot reset" + + An issue was found in shared slot reset where the system can be stuck in + an infinite loop, pull the code out until there's a proper fix. + + This reverts commit 1172a6c57ff3c66f6361e572a1790cbcc0e5ff37. +- hdata/iohub: Use only wildcard slots for pluggables + + We don't want to cause a VID:DID check against pluggable devices, as + they may use multiple devids. + + Narrow the condition under which VID:DID is listed in the dt, so that + we'll end up creating a wildcard slot for these instead. 
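As an illustration of the "ibm,supported-tce-sizes" entry above: because each value is a bit shift and the values are listed from smallest to largest, a consumer can turn a shift into a byte count with a left shift and stop scanning at the first size that is too large. The C sketch below assumes exactly that and nothing more. The helper names are hypothetical and are not taken from skiboot or Linux, and no particular set of shifts is implied for any PHB.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Convert a TCE page-size bit shift to a size in bytes. */
static uint64_t tce_shift_to_bytes(uint32_t shift)
{
	assert(shift < 64);
	return UINT64_C(1) << shift;
}

/*
 * Hypothetical helper: return the largest advertised shift whose page
 * size does not exceed max_bytes, or 0 if none fits. It relies on the
 * list being sorted from smallest to largest, as the property is
 * described in the release note.
 */
static uint32_t pick_tce_shift(const uint32_t *shifts, size_t count,
			       uint64_t max_bytes)
{
	uint32_t best = 0;

	for (size_t i = 0; i < count; i++) {
		if (tce_shift_to_bytes(shifts[i]) > max_bytes)
			break;	/* every later entry is larger still */
		best = shifts[i];
	}
	return best;
}
```

With an illustrative list of shifts {12, 16, 21, 30} (made up for this example), asking for pages of at most 1 MiB selects shift 16, that is 64 KiB pages.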
+ +Since skiboot-5.9: + +- pci: Shared slot state synchronisation for hot reset + + When a device is shared between two PHBs, it doesn't get reset properly + unless both PHBs issue a hot reset at "the same time". Practically this + means a hot reset needs to be issued on both sides, and neither should + bring the link up until the reset on both has completed. +- pci: Track peers of slots + + Witherspoon introduced a new concept where one physical slot is shared + between two PHBs. Making a slot aware of its peer enables syncing + between them where necessary. + +PHB4 +---- + +Since skiboot-5.10-rc4: + +- phb4: Disable lane eq when retrying some nvidia GEN3 devices + + This fixes these nvidia cards training at only GEN2 speeds rather than + GEN3 by disabling PCIe lane equalisation. + + Firstly we check if the card is in a whitelist. If it is and the link + has not trained optimally, retry with lane equalisation off. We do + this on all POWER9 chip revisions since this is a device issue, not + a POWER9 chip issue. + +Since skiboot-5.10-rc2: + +- phb4: Only escalate freezes on MMIO load where necessary + + In order to work around a hardware issue, MMIO load freezes were + escalated to fences on every chip. Now that hardware no longer requires + this, restrict escalation to the chips that actually need it. + +Since skiboot-5.9: + +- phb4: Change PCI MMIO timers + + Currently we have a mismatch between the NCU and PCI timers for MMIO + accesses. The PCI timers must be lower than the NCU timers otherwise + it may cause checkstops. + + This changes PCI timeouts controlled by skiboot to 33-50ms. It should + be forwards and backwards compatible with expected hostboot changes to + the NCU timer. +- phb4: Change default GEN3 lane equalisation setting to 0x54 + + Currently our GEN3 lane equalisation settings are set to 0x77. Change + this to 0x54. This change will allow us to train at GEN3 in a shorter + time and more consistently.
+ + This setting gives us a TX preset 0x4 and RX hint 0x5. This gives a + boost in gain for high frequency signalling. It allows the most optimal + continuous time linear equalizers (CTLE) for the remote receiver port + and de-emphasis and pre-shoot for the remote transmitter port. + + Machine Readable Workbooks (MRW) are moving to this new value also. +- phb4: Init changes + + These are init changes for phb4 from the HW team. + + Link down errors are now endpoint recoverable (ERC) rather than PHB fatal + errors. + + BLIF Completion Timeout Errors now generate an interrupt rather than + causing freeze events. +- phb4: Fix lane equalisation setting + + Fix cut and paste from phb3. The sizes have changed now that we have GEN4, + so the check here needs to change also. + + Without this we end up with the default settings (all '7') rather + than what's in HDAT. +- hdata: Fix copying GEN4 lane equalisation settings + + These aren't copied currently but should be. +- phb4: Fix PE mapping of M32 BAR + + The M32 BAR is the PHB4 region used to map all the non-prefetchable + or 32-bit device BARs. It's supposed to have its segments remapped + via the MDT and Linux relies on that to assign them individual PE#. + + However, we weren't configuring that properly and instead used the + mode where PE# == segment#, thus causing EEH to freeze the wrong + device or PE#. +- phb4: Fix lost bit in PE number on config accesses + + A PE number can be up to 9 bits, using a uint8_t won't fly. + + That was causing errors on config accesses to freeze the + wrong PE. +- phb4: Update inits + + New init value from HW folks for the fence enable register.
+ + This clears bit 17 (CFG Write Error CA or UR response) and bit 22 (MMIO Write + DAT_ERR Indication) and sets bit 21 (MMIO CFG Pending Error) + +CAPI +---- + +Since skiboot-5.10-rc2: + +- capi: Enable channel tag streaming for PHB in CAPP mode + + We re-enable channel tag streaming for PHB in CAPP mode as without it + PEC was waiting for cresp for each DMA write command before sending a + new DMA write command on the Powerbus. This resulted in much lower DMA + write performance than expected. + + The patch updates enable_capi_mode() to remove the masking of the + channel_streaming_en bit in the PBCQ Hardware Configuration Register. It also + refactors the code that updates this register to use + xscom_write_mask instead of xscom_read followed by an xscom_write. + +Since skiboot-5.10-rc1: + +- capi: Fix the max tlbi divider and the directory size. + + Switch to 512KB mode (directory size) as we don’t use bit 48 of the tag + in addressing the array. This mode is controlled by the Snoop CAPI + Configuration Register. + Set the maximum of the number of data polls received before signaling + TLBI hang detect timer expired. The value of '0000' is equal to 16. + +Since skiboot-5.9: + +- capi: Disable CAPP virtual machines + + When exercising more than one CAPI accelerator simultaneously in + cache coherency mode, the verification team is seeing a deadlock. To + fix this a workaround of disabling CAPP virtual machines is + suggested. These 'virtual machines' let PSL queue multiple CAPP + commands for servicing by CAPP thereby increasing + throughput.
Below is the error scenario described by the h/w team: + + " With virtual machines enabled we had a deadlock scenario where with 2 + or more CAPI's in a system you could get in a deadlock scenario due to + cast-outs that are required break the deadlock (evict lines that + another CAPI is requesting) get stuck in the virtual machine queue by + a command ahead of it that is being retried by the same scenario in + the other CAPI. " + +- capi: Perform capp recovery sequence only when PBCQ is idle + + Presently during a CRESET the CAPP recovery sequence can be executed + multiple times in case PBCQ on the PEC is still busy processing in/out + bound in-flight transactions. +- xive: Mask MMIO load/store to bad location FIR + + For opencapi, the trigger page of an interrupt is mapped to user + space. The intent is to write the page to raise an interrupt but + there's nothing to prevent a user process from reading it, which has + the unfortunate consequence of checkstopping the system. + + Mask the FIR bit raised when an MMIO operation targets an invalid + location. It's the recommendation from recent documentation and + hostboot is expected to mask it at some point. In the meantime, let's + play it safe. +- phb4: Dump CAPP error registers when it asserts link down + + This patch introduces a new function phb4_dump_app_err_regs() that + dumps CAPP error registers in case the PEC nestfir register indicates + that the fence was due to a CAPP error (BIT-24). + + Contents of these registers are helpful in diagnosing CAPP + issues. 
Registers that are dumped in phb4_dump_app_err_regs() are: + + * CAPP FIR Register + * CAPP APC Master Error Report Register + * CAPP Snoop Error Report Register + * CAPP Transport Error Report Register + * CAPP TLBI Error Report Register + * CAPP Error Status and Control Register +- capi: move the acknowledge of the HMI interrupt + + We need to acknowledge a possible HMI initiated by the previous forced + fence on the PHB to work around a non-existent PE in the phb4_creset() + function. + For this reason do_capp_recovery_scoms() is called now at the + beginning of the step: PHB4_SLOT_CRESET_WAIT_CQ +- capi: update ci store buffers and dma engines + + The number of read (APC type traffic) and mmio store (MSG type traffic) + resources assigned to the CAPP is controlled by the CAPP control + register. + + Depending on the type of CAPI cards present on the server, we have to + configure the CAPP messages and the DMA read engines given + to the CAPP differently. + +HMI +--- +- core/hmi: Display chip location code while displaying core FIR. +- core/hmi: Do not display FIR details if none of the bits are set. + + So that we don't flood OPAL console logs with information that is not + useful. +- opal/hmi: HMI logging with location code info. + + Add a few HMI debug prints with location code info and some additional info. + + No functionality change. + + With this patch the log messages will look like: :: + + [210612.175196744,7] HMI: Received HMI interrupt: HMER = 0x0840000000000000 + [210612.175200449,7] HMI: [Loc: UOPWR.1302LFA-Node0-Proc1]: P:8 C:16 T:1: TFMR(2d12000870e04020) Timer Facility Error + + [210660.259689526,7] HMI: Received HMI interrupt: HMER = 0x2040000000000000 + [210660.259695649,7] HMI: [Loc: UOPWR.1302LFA-Node0-Proc0]: P:0 C:16 T:1: Processor recovery Done. + +- core/hmi: Use pr_fmt macro for tagging log messages + + No functionality changes.
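To illustrate the pr_fmt tagging mentioned in the entry above: the general technique is that a source file defines a pr_fmt() macro which prepends a fixed tag to every format string through string literal concatenation, so all messages from that file can be filtered by tag. The C sketch below shows the idea under stated assumptions. log_to_buf(), format_hmi_received() and has_hmi_tag() are hypothetical stand-ins written for this example, not skiboot's logging API, and the timestamp and log level prefix seen in real OPAL logs is left out.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Per-file tag: string literal concatenation prepends it to each format. */
#define pr_fmt(fmt) "HMI: " fmt

/*
 * Hypothetical stand-in for a firmware log call; it formats into a
 * buffer so the result can be inspected. ##__VA_ARGS__ is a GNU
 * extension accepted by GCC and Clang.
 */
#define log_to_buf(buf, len, fmt, ...) \
	snprintf((buf), (len), pr_fmt(fmt), ##__VA_ARGS__)

/* Format the "Received HMI interrupt" message for a given HMER value. */
static int format_hmi_received(char *buf, size_t len, unsigned long long hmer)
{
	assert(buf != NULL && len > 0);
	return log_to_buf(buf, len,
			  "Received HMI interrupt: HMER = 0x%016llx", hmer);
}

/* Return 1 if a message starts with this file's tag, like a grep filter. */
static int has_hmi_tag(const char *line)
{
	return strncmp(line, pr_fmt(""), strlen(pr_fmt(""))) == 0;
}
```

Because the tag lives in one macro, changing it renames every message in the file at once, which is why such a change can be made without any functional impact.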
+- opal: Get chip location code + + and store it under proc_chip for quick reference during HMI handling + code. + +Sensors +------- +- occ-sensors: Fix up quad/gpu location mix-up + + The GPU and QUAD sensor location types are swapped compared to what + exists in the OCC code base which is authoritative. Fix them up. +- sensors: occ: Skip counter type of sensors + + Don't add counter type of sensors to device-tree as they don't + fit into hwmon sensor interface. +- sensors: dts: Assert special wakeup on idle cores while reading temperature + + In P9, when a core enters a stop state, its clocks will be stopped + to save power and hence we will not be able to perform a SCOM + operation to read the DTS temperature sensor. Hence, assert + a special wakeup on cores that have entered a stop state in order to + successfully complete the SCOM operation. +- sensors: occ: Skip power sensors with zero sample value + + APSS is not available on platforms like Zaius, Romulus where OCC + can only measure Vdd (core) and Vdn (nest) power from the AVSbus + reading. So all the sensors for APSS channels will be populated + with 0. Different component power sensors like system, memory + which point to the APSS channels will also be 0. + + As per OCC team (Martha Broyles) zeroed power sensor means that the + system doesn't have it. So this patch filters out these sensors. +- sensors: occ: Skip GPU sensors for non-gpu systems +- sensors: Fix dtc warning for new occ in-band sensors. + + dtc complains about missing reg property when a DT node is having a + unit name or address but no reg property. :: + + /ibm,opal/sensors/vrm-in@c00004 has a unit name, but no reg property + /ibm,opal/sensors/gpu-in@c0001f has a unit name, but no reg property + /ibm,opal/sensor-groups/occ-js@1c00040 has a unit name, but no reg property + + This patch fixes these warnings for new occ in-band sensors and also for + sensor-groups by adding necessary properties. +- sensors: Fix dtc warning for dts sensors. 
+ + dtc complains about missing reg property when a DT node is having a + unit name or address but no reg property. + + Example warning for core dts sensor: :: + + /ibm,opal/sensors/core-temp@5c has a unit name, but no reg property + /ibm,opal/sensors/core-temp@804 has a unit name, but no reg property + + This patch fixes this by adding necessary properties. +- hw/occ: Fix psr cpu-to-gpu sensors node dtc warning. + + dtc complains about missing reg property when a DT node is having a + unit name or address but no reg property. :: + + /ibm,opal/power-mgt/psr/cpu-to-gpu@0 has a unit name, but no reg property + /ibm,opal/power-mgt/psr/cpu-to-gpu@100 has a unit name, but no reg property + + This patch fixes this by adding necessary properties. + +General fixes +------------- + +Since skiboot-5.10-rc3: + +- core: Fix mismatched names between reserved memory nodes & properties + + OPAL exposes reserved memory regions through the device tree in both new + (nodes) and old (properties) formats. + + However, the names used for these don't match - we use a generated cell + address for the nodes, but the plain region name for the properties. + + This fixes a warning from FWTS + +Since skiboot-5.10-rc2: + +- vas: Disable VAS/NX-842 on some P9 revisions + + VAS/NX-842 are not functional on some P9 revisions, so disable them + in hardware and skip creating their device tree nodes. + + Since the intent is to prevent OS from configuring VAS/NX, we remove + only the platform device nodes but leave the VAS/NX DT nodes under + xscom (i.e we don't skip add_vas_node() in hdata/spira.c) +- core/device.c: Fix dt_find_compatible_node + + dt_find_compatible_node() and dt_find_compatible_node_on_chip() are used to + find device nodes under a parent/root node with a given compatible + property. 
+ + dt_next(root, prev) is used to walk the child nodes of the given parent and + takes two arguments - root contains the parent node to walk whilst prev + contains the previous child to search from so that it can be used as an + iterator over all child nodes. + + The first iteration of dt_find_compatible_node(root, prev) calls + dt_next(root, root) which is not a well defined operation as prev is + assumed to be a child of the root node. The result is that when a node + contains no children it will start returning the parent node's siblings + until it hits the top of the tree at which point a NULL dereference is + attempted when looking for the root node's parent. + + Dereferencing NULL can result in undesirable data exceptions during system + boot and untimely non-hilarious system crashes. dt_next() should not be + called with prev == root. Instead we add a check to dt_next() such that + passing prev = NULL will cause it to start iterating from the first child + node (if any). + + This manifested itself in a crash on boot on ZZ systems. +- hw/occ: Fix fast-reboot crash in P8 platforms. + + commit 85a1de35cbe4 ("fast-boot: occ: Re-parse the pstate table during fast-boot" ) + breaks fast-reboot on P8 platforms while re-initialising the OCC pstates. On P8 + platforms OPAL adds two additional properties, #address-cells and #size-cells, + under the ibm,opal/power-mgt/ DT node. During fast-reboot, adding the same properties + back to the same node results in duplicate properties, and hence fast-reboot fails + with the traces below. :: + + [ 541.410373292,5] OCC: All Chip Rdy after 0 ms + [ 541.410488745,3] Duplicate property "#address-cells" in node /ibm,opal/power-mgt + [ 541.410694290,0] Aborting!
+ CPU 0058 Backtrace: + S: 0000000031d639d0 R: 000000003001367c .backtrace+0x48 + S: 0000000031d63a60 R: 000000003001a03c ._abort+0x4c + S: 0000000031d63ae0 R: 00000000300267d8 .new_property+0xd8 + S: 0000000031d63b70 R: 0000000030026a28 .__dt_add_property_cells+0x30 + S: 0000000031d63c10 R: 000000003003ea3c .occ_pstates_init+0x984 + S: 0000000031d63d90 R: 00000000300142d8 .load_and_boot_kernel+0x86c + S: 0000000031d63e70 R: 000000003002586c .fast_reboot_entry+0x358 + S: 0000000031d63f00 R: 00000000300029f4 fast_reset_entry+0x2c + + This patch fixes this issue by removing these two properties on P8 while doing + OCC pstates re-init in fast-reboot code path. + +Since skiboot-5.10-rc1: + +- fast-reboot: occ: Re-parse the pstate table during fast-reboot + + OCC shares the frequency list to host by copying the pstate table to + main memory in HOMER. This table is parsed during boot to create + device-tree properties for frequency and pstate IDs. OCC can update + the pstate table to present a new set of frequencies to the host. But + host will remain oblivious to these changes unless it is re-inited + with the updated device-tree CPU frequency properties. So this patch + allows to re-parse the pstate table and update the device-tree + properties during fast-reboot. + + OCC updates the pstate table when asked to do so using pstate-table + bias command. And this is mainly used by WOF team for + characterization purposes. +- fast-reboot: move pci_reset error handling into fast-reboot code + + pci_reset() currently does a platform reboot if it fails. It + should not know about fast-reboot at this level, so instead have + it return an error, and the fast reboot caller will do the + platform reboot. + + The code essentially does the same thing, but flexibility is + improved. Ideally the fast reboot code should perform pci_reset + and all such fail-able operations before the CPU resets itself + and destroys its own stack. That's not the case now, but that + should be the goal. 
+ +Since skiboot-5.9: + +- lpc: Clear pending IRQs at boot + + When we come in from hostboot the LPC master has the bus reset indicator + set. This error isn't handled until the host kernel unmasks interrupts, + at which point we get the following spurious error: :: + + [ 20.053560375,3] LPC: Got LPC reset on chip 0x0 ! + [ 20.053564560,3] LPC[000]: Unknown LPC error Error address reg: 0x00000000 + + Fix this by clearing the various error bits in the LPC status register + before we initialise the skiboot LPC bus driver. +- hw/imc: Check ucode state before exposing units to Linux + + disable_unavailable_units() checks whether the ucode + is in the running state before enabling the nest units + in the device tree. From a recent debug, it is found + that on some system boot, ucode is not loaded and + running in all the chips in the system. And this + caused a fail in OPAL_IMC_COUNTERS_STOP call where + we check for ucode state on each chip. Bug here is + that disable_unavailable_units() checks the state + of the ucode only in boot cpu chip. Patch adds a + condition in disable_unavailable_units() to check + for the ucode state in all the chip before enabling + the nest units in the device tree node. + +- hdata/vpd: Add vendor property + + ibm,vpd blob contains VN field. Use that to populate vendor property + for various FRU's. +- hdata/vpd: Fix DTC warnings + + All the nodes under the vpd hierarchy have a unit address (their SLCA + index) but no reg properties. Add them and their size/address cells + to squash the warnings. +- HDAT/i2c: Fix SPD EEPROM compatible string + + Hostboot doesn't give us accurate information about the DIMM SPD + devices. Hack around by assuming any EEPROM we find on the SPD I2C + master is an SPD EEPROM. +- hdata/i2c: Fix 512Kb EEPROM size + + There's no such thing as a 412Kb EEPROM. 
+- libflash/mbox-flash: fall back to requesting lower MBOX versions from BMC + + Some BMC mbox implementations seem to sometimes mysteriously fail when trying + to negotiate v3 when they only support v2. To work around this, we + can fall back to requesting lower mbox protocol versions until we find + one that works. + + In theory, this should already "just work", but we have a counterexample, + which this patch fixes. +- IPMI: Fix platform.cec_reboot() null ptr checks + + Kudos to Hugo Landau who reported this in: + https://github.com/open-power/skiboot/issues/142 +- hdata: Add location code property to xscom node + + This patch adds a chip location code property to the xscom node. +- p8-i2c: Limit number of retry attempts + + Currently we will attempt to start an I2C transaction until it succeeds. + In the event that the OCC does not release the lock on an I2C bus this + results in an async token being held forever and the kernel thread that + started the transaction will block forever while waiting for an async + completion message. Fix this by limiting the number of attempts to + start the transaction. +- p8-i2c: Don't write the watermark register at init + + On P9 the I2C master is shared with the OCC. Currently the watermark + values are set once at init time which is bad for two reasons: + + a) We don't take the OCC master lock before setting it, which + may cause issues if the OCC is currently using the master. + b) The OCC might change the watermark levels and we need to reset + them. + + Change this so that we set the watermark value when a new transaction + is started rather than at init time. +- hdata: Rename 'fsp-ipl-side' as 'sp-ipl-side' + + as OPAL is building device tree for both FSP and BMC systems. + Also I don't see anyone using this property today. Hence renaming + should be fine. +- hdata/vpd: add support for parsing CPU VRML records + + Allows skiboot to parse out the processor part/serial numbers + on OpenPOWER P9 machines.
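The fallback described in the libflash/mbox-flash entry above (request the highest protocol version first, then step down until the BMC accepts one) can be sketched in C as follows. This is a minimal illustration of that control flow only. negotiate_fn, negotiate_with_fallback() and fake_bmc() are hypothetical names made up for the example, and the real mbox message exchange is not modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical callback: true if the BMC accepts the requested version. */
typedef bool (*negotiate_fn)(int version, void *ctx);

/*
 * Try max_version first and step down to min_version. Returns the first
 * version the BMC accepts, or -1 if none of them works.
 */
static int negotiate_with_fallback(int max_version, int min_version,
				   negotiate_fn try_version, void *ctx)
{
	assert(try_version != NULL);

	for (int v = max_version; v >= min_version; v--) {
		if (try_version(v, ctx))
			return v;
	}
	return -1;
}

/* Stand-in BMC for the example: rejects anything above *(int *)ctx. */
static bool fake_bmc(int version, void *ctx)
{
	int highest_supported = *(int *)ctx;

	return version <= highest_supported;
}
```

With a stand-in BMC that only supports v2, asking for v3 down to v1 fails at v3 and settles on v2. If no version is accepted, the caller gets -1 and can report the failure instead of retrying forever.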
+- core/lock: Introduce atomic cmpxchg and implement try_lock with it + + cmpxchg will be used in a subsequent change, and this reduces the + amount of asm code. +- direct-controls: add xscom error handling for p8 + + Add xscom checks which will print something useful and return error + back to callers (which already have error handling plumbed in). +- direct-controls: p8 implementation of generic direct controls + + This reworks the sreset functionality that was brought over from + fast-reboot, and fits it under the generic direct controls APIs. + + The fast reboot APIs are implemented using generic direct controls, + which also makes them available on p9. +- fast-reboot: allow mambo fast reboot independent of CPU type + + Don't tie mambo fast reboot to POWER8 CPU type. +- fast-reboot: remove delay after sreset + + There is a 100ms delay when targets reach sreset which does not appear + to have a good purpose. Remove it and therefore reduce the sreset timeout + by the same amount. +- fast-reboot: add more barriers around cpu state changes + + This is a bit of paranoia, but when a CPU changes state to signal it + has reached a particular point, all previous stores should be visible. +- fast-reboot: add sreset timeout detection and handling + + Have the initiator wait for all its sreset targets to call in, and + time out after 200ms if they did not. Fail and revert to IPL reboot. + + Testing indicates that after successful sreset_all_others(), it + takes less than 102ms (in hundreds of fast reboots) for secondaries + to call in. 100 of that is due to an initial delay, but core + un-splitting was not measured. +- fast-reboot: make spin loops consistent and SMT friendly +- fast-reboot: add sreset_all_others error handling + + Pass back failures from sreset_all_others, also change return codes to + OPAL form in sreset_all_prepare to match. + + Errors will revert to the IPL path, so it's not critical to completely + clean up everything if that would complicate things. 
Detecting the + error and failing is the important thing. +- fast-reboot: restore SMT priority on spin loop exit +- Add documentation for ibm,firmware-versions device tree node +- NX: Print read xscom config failures. + + Currently in NX, only write xscom config failures are traced. + Add trace statements for read xscom config failures too. + No functional changes. +- hw/nx: Fix NX BAR assignments + + The NX rng BAR is used by each core to source random numbers for the + DARN instruction. Currently we configure each core to use the NX rng of + the chip that it exists on. Unfortunately, the NX can be de-configured by + hostboot and in this case we need to use the NX of a different chip. + + This patch moves the BAR assignments for the NX into the normal nx-rng + init path. This lets us check if the normal (chip local) NX is active + when configuring which NX a core should use so that we can fall back + gracefully. +- FSP-elog: Reduce verbosity of elog messages + + These messages just fill up the opal console log with useless messages + resulting in us losing useful information. + + They have been like this since the first commit in skiboot. Make them + trace. +- core/bitmap: fix bitmap iteration limit corruption + + The bitmap iterators did not reduce the number of bits to scan + when searching for the next bit, which would result in them + overrunning their bitmap. + + These are only used in one place, in xive reset, and the effect + is that the xive reset code will keep zeroing memory until it + reaches a block of memory of MAX_EQ_COUNT >> 3 bits in length, + all zeroes. +- hw/imc: always enable "imc_nest_chip" exports property + + imc_dt_update_nest_node() adds an "imc_nest_chip" property + to the "exports" node (under opal_node) to view the nest counter + region. This comes in handy when debugging ucode runtime + errors (like counter data update or control block update and + so on...).
And current code enables the property only if + the microcode is in running state at system boot. To aid + the debug of ucode not running/starting issues at boot, + enable the addition of "imc_nest_chip" property always. + +NVLINK2 +------- + +Since skiboot-5.10-rc2: + +- npu2: Disable TVT range check when in bypass mode + + On POWER9 the GPUs need to be able to access the MMIO memory space. Therefore + the TVT range check needs to include the MMIO address space. As any possible + range check would cover all of memory anyway this patch just disables the TVT + range check all together when bypassing the TCE tables. +- hw/npu2: support creset of npu2 devices + + creset calls in the hw procedure that resets the PHY, we don't + take them out of reset, just put them in reset. + + this fixes a kexec issue. + +Since skiboot-5.10-rc1: + +- npu2/tce: Fix page size checking + + The page size is encoded in the TVT data [59:63] as @shift+11 but + the tce_kill handler does not do the math right; this fixes it. + +Since skiboot-5.9: + +- npu2-hw-procedures.c: Correct phy lane mapping + + Each NVLINK2 device is associated with a particular group of OBUS lanes via + a lane mask which is read from HDAT via the device-tree. However Skiboot's + interpretation of lane mask was different to what is exported from the + HDAT. + + Specifically the lane mask bits in the HDAT are encoded in IBM bit ordering + for a 24-bit wide value. So for example in normal bit ordering lane-0 is + represented by having lane-mask bit 23 set and lane-23 is represented by + lane-mask bit 0. This patch alters the Skiboot interpretation to match what + is passed from HDAT. + +- npu2-hw-procedures.c: Power up lanes during ntl reset + + Newer versions of Hostboot will not power up the NVLINK2 PHY lanes by + default. The phy_reset procedure already powers up the lanes but they also + need to be powered up in order to access the DL. 
+ + The reset_ntl procedure is called by the device driver to bring the DL out + of reset and get it into a working state. Therefore we also need to add + lane and clock power up to the reset_ntl procedure. +- npu2.c: Add PE error detection + + Invalid accesses from the GPU can cause a specific PE to be frozen by the + NPU. Add an interrupt handler which reports the frozen PE to the operating + system as an EEH event. +- npu2.c: Fix XIVE IRQ alignment +- npu2: hw-procedures: Refactor reset_ntl procedure + + Change the implementation of reset_ntl to match the latest programming + guide documentation. +- npu2: hw-procedures: Add phy_rx_clock_sel() + + Change the RX clk mux control to be done by software instead of HW. This + avoids glitches caused by changing the mux setting. +- npu2: hw-procedures: Change phy_rx_clock_sel values + + The clock selection bits we set here are inputs to a state machine. + + DL clock select (bits 30-31) + + 0b00 + lane 0 clock + 0b01 + lane 7 clock + 0b10 + grid clock + 0b11 + invalid/no-op + + To recover from a potential glitch, we need to ensure that the value we + set forces a state change. Our current sequence is to set 0x3 followed + by 0x1. With the above now known, that is actually a no-op followed by + selection of lane 7. Depending on lane reversal, that selection is not a + state change for some bricks. + + The way to force a state change in all cases is to switch to the grid + clock, and then back to a lane. +- npu2: hw-procedures: Manipulate IOVALID during training + + Ensure that the IOVALID bit for this brick is raised at the start of + link training, in the reset_ntl procedure. + + Then, to protect us from a glitch when the PHY clock turns off or gets + chopped, lower IOVALID for the duration of the phy_reset and + phy_rx_dccal procedures. +- npu2: hw-procedures: Add check_credits procedure + + As an immediate mitigation for a current hardware glitch, add a procedure + that can be used to validate NTL credit values.
This will be called as a + safeguard to check that link training succeeded. + + Assert that things are exactly as we expect, because if they aren't, the + system will experience a catastrophic failure shortly after the start of + link traffic. +- npu2: Print bdfn in NPU2DEV* logging macros + + Revise the NPU2DEV{DBG,INF,ERR} logging macros to include the device's + bdfn. It's useful to know exactly which link we're referring to. + + For instance, instead of :: + + [ 234.044921238,6] NPU6: Starting procedure reset_ntl + [ 234.048578101,6] NPU6: Starting procedure reset_ntl + [ 234.051049676,6] NPU6: Starting procedure reset_ntl + [ 234.053503542,6] NPU6: Starting procedure reset_ntl + [ 234.057182864,6] NPU6: Starting procedure reset_ntl + [ 234.059666137,6] NPU6: Starting procedure reset_ntl + + we'll get :: + + [ 234.044921238,6] NPU6:0:0.0 Starting procedure reset_ntl + [ 234.048578101,6] NPU6:0:0.1 Starting procedure reset_ntl + [ 234.051049676,6] NPU6:0:0.2 Starting procedure reset_ntl + [ 234.053503542,6] NPU6:0:1.0 Starting procedure reset_ntl + [ 234.057182864,6] NPU6:0:1.1 Starting procedure reset_ntl + [ 234.059666137,6] NPU6:0:1.2 Starting procedure reset_ntl +- npu2: Move to new GPU memory map + + There are three different ways we configure the MCD and memory map. + + 1) Old way (current way) + Skiboot configures the MCD and puts GPUs at 4TB and below + 2) New way with MCD + Hostboot configures the MCD and skiboot puts GPU at 4TB and above + 3) New way without MCD + No one configures the MCD and skiboot puts GPU at 4TB and below + + The patch keeps option 1 and adds options 2 and 3. + + The different configurations are detected using certain scoms (see + patch). + + Option 1 will go away eventually as it's a configuration that can + cause xstops or data integrity problems. We are keeping it around to + support existing hostboot. + + Option 2 supports only 4 GPUs and 512GB of memory per socket. 
+ + Option 3 supports 6 GPUs and 4TB of memory but may have some + performance impact. +- phys-map: Rename GPU_MEM to GPU_MEM_4T_DOWN + + This map is soon to be replaced, but we are going to keep it around + for a little while so that we support older hostboot firmware. + +Platform Specific Fixes +----------------------- + +Witherspoon +^^^^^^^^^^^ +- Witherspoon: Remove old Witherspoon platform definition + + An old Witherspoon platform definition was added to aid the transition from + versions of Hostboot which didn't have the correct NVLINK2 HDAT information + available and/or planar VPD. These systems should now be updated, so remove + the possibly incorrect default assumption. + + This may disable NVLINK2 on old, outdated systems but it can easily be + restored with the appropriate FW and/or VPD updates. In any case there is a + 50% chance the existing default behaviour was incorrect as it only + supports 6 GPU systems. Using an incorrect platform definition leads to + undefined behaviour which is more difficult to detect/debug than not + creating the NVLINK2 devices so remove the possibly incorrect default + behaviour. +- Witherspoon: Fix VPD EEPROM type + + There are user-space tools that update the planar VPD via the sysfs + interface. Currently we do not get correct information from hostboot + about the exact type of the EEPROM so we need to manually fix it up + here. This needs to be done as a platform specific fix since there is + no standardised VPD EEPROM type. + +IBM FSP Systems +^^^^^^^^^^^^^^^ + +- nvram: Fix 'missing' nvram on FSP systems. + + commit ba4d46fdd9eb ("console: Set log level from nvram") wants to read + from NVRAM rather early. This works fine on BMC based systems as + nvram_init() is actually synchronous. This is not true for FSP systems + and it turns out that the query for the console log level simply + queries blank nvram. + + The simple fix is to wait for the NVRAM read to complete before + performing any query.
Unfortunately it turns out that the fsp-nvram + code does not inform the generic NVRAM layer when the read is complete, + rather, it must be prompted to do so. + + This patch addresses both these problems. This patch adds a check before + the first read of the NVRAM (for the console log level) that the read + has completed. The fsp-nvram code has been updated to inform the generic + layer as soon as the read completes. + + The old prompt to the fsp-nvram code has been removed but a check to + ensure that the NVRAM has been loaded remains. It is conservative but + if the NVRAM is not done loading before the host is booted it will not + have an nvram device-tree node which means it won't be able to access + the NVRAM at all, ever, even after the NVRAM has loaded. + + +Utilities +---------- + +Since skiboot-5.10-rc1: + +- opal-prd: Fix FTBFS with -Werror=format-overflow + + i2c.c fails to compile with gcc7 and -Werror=format-overflow used in + Debian Unstable and Ubuntu 18.04 : :: + + i2c.c: In function ‘i2c_init’: + i2c.c:211:15: error: ‘%s’ directive writing up to 255 bytes into a + region of size 236 [-Werror=format-overflow=] + +Since skiboot-5.9: + +- Fix xscom-utils distclean target + + In Debian/Ubuntu, the packaging system likes to have a full clean-up that + restores the tree back to original one, so add some files to the distclean + target. +- Add man pages for xscom-utils and pflash + + For the need of Debian/Ubuntu packaging, I inferred some initial man + pages from their help output. + + +gard +^^^^ +- gard: Add tests + + I hear Stewart likes these for some reason. Dunno why. +- gard: Add OpenBMC vPNOR support + + A big-ol-hack to add some checking for OpenBMC's vPNOR GUARD files under + /media/pnor-prsv. This isn't ideal since it doesn't handle the create + case well, but it's better than nothing. +- gard: Always use MTD to access flash + + Direct mode is generally either unsafe or unsupported. 
We should always + access the PNOR via an MTD device so make that the default. If someone + really needs direct mode, then they can use pflash. +- gard: Fix up do_create return values + + The return value of a subcommand is interpreted as a libflash error code + when it's positive or some subcommand specific error when negative. + Currently the create subcommand always returns zero when exiting (even + for errors) so fix that. +- gard: Add usage message for -p + + The -p argument only really makes sense when -f is specified. Print an + actual error message rather than just the usage blob. +- gard: Fix max instance count + + There's an entire byte for the instance count rather than a nibble. Only + barf if the instance number is beyond 255 rather than 16. +- gard: Fix up path parsing + + Currently we assume that the Unit ID can be used as an array index into + the chip_units[] structure. There are holes in the ID space though, so + this doesn't actually work. Fix it up by walking the array looking for + the ID. +- gard: Set chip generation based on PVR + + Currently we assume that this tool is being used on a P8 system by + default and allow the user to override this behaviour using the -8 and + -9 command line arguments. When running on the host we can use the + PVR to guess what chip generation so do that. + + This also changes the default behaviour to assume that the host is a P9 + when running on an ARM system. This tool didn't even work when compiled + for ARM until recently and the OpenBMC vPNOR hack that we have currently + is broken for P9 systems that don't use vPNOR (Zaius and Romulus). +- gard: Allow records with an ID of 0xffffffff + + We currently assume that a record with an ID of 0xffffffff is invalid. + Apparently this is incorrect and we should display these records, so + expand the check to compare the entire record with 0xff rather than + just the ID. 
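The blank-record check described in the entry above ("gard: Allow records with an ID of 0xffffffff") can be sketched as follows. This is a minimal illustration and not the gard source: the helper name `record_is_blank()` is hypothetical, and the record is treated as an opaque byte buffer whose length the caller supplies.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: treat a record as unused only when every byte of it
 * is 0xff, rather than when just its ID field reads 0xffffffff.
 */
static bool record_is_blank(const void *record, size_t len)
{
	const uint8_t *bytes = record;
	size_t i;

	for (i = 0; i < len; i++) {
		if (bytes[i] != 0xff)
			return false;	/* some field carries real data */
	}
	return true;
}
```

With a check of this shape, a record whose ID happens to be 0xffffffff but whose other bytes hold data is still displayed.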
+- gard: create: Allow creating arbitrary GARD records + + Add a new sub-command that allows us to create GARD records for + arbitrary chip units. There aren't a whole lot of constraints on this, and + that limits how useful it can be, but it does allow a user to GARD out + individual DIMMs, chips or cores from the BMC (or host) if needed. + + There are a few caveats though: + + 1) Not everything can, or should, have a GARD record applied to it. + 2) There is no validation that the unit actually exists. Doing that + sort of validation requires something that understands the FAPI + targeting information (I think) and adding support for it here + would require some knowledge from the system XML file. + 3) There's no way to get a list of paths in the system. + 4) Although we can create a GARD record at runtime it won't be applied + until the next IPL. +- gard: Add path parsing support + + In order to support manual GARD records we need to be able to parse the + hardware unit path strings. This patch implements that. +- gard: list: Improve output + + Display the full path to the GARDed hardware unit in each record rather + than relying on the output of `gard show` and convert do_list() to use + the iterator while we're here. +- gard: {list, show}: Fix the Type field in the output + + The output of `gard list` has a field named "Type", however this + doesn't actually indicate the type of the record. Rather, it + shows the type of the path used to identify the hardware being + GARDed. This is of pretty dubious value considering the Physical + path seems to always be used when referring to GARDed hardware. +- gard: Add P9 support +- gard: Update chip unit data + + Source the list of units from the hostboot source rather than the + previous hard coded list. The list of path element types changes + between generations so we need to add a level of indirection to + accommodate P9.
This also changes the names used to match those + printed by Hostboot at IPL time and paves the way to adding support + for manual GARD record creation. +- gard: show: Remove "Res Recovery" field + + This field has never been populated by hostboot on OpenPower systems + so there's no real point in reporting its contents. + +libflash / pflash +^^^^^^^^^^^^^^^^^ + +Anybody shipping libflash or pflash to interact with POWER9 systems must +upgrade to this version. + +Since skiboot-5.10-rc2: + +- pflash: Fix makefile dependency issue + +Since skiboot-5.9: + +- pflash: Support for volatile flag + + The volatile flag was added to the PNOR image to + indicate partitions that are cleared during a host + power off. Display this flag from the pflash command. +- pflash: Support for clean_on_ecc_error flag + + Add the misc flag clear_on_ecc_error to libflash/pflash. This was + the only missing flag. The generator of the virtual PNOR image + relies on libflash/pflash to provide the partition information, + so all flags are needed to build an accurate virtual PNOR partition + table. +- pflash: Respect write(2) return values + + The write(2) system call returns the number of bytes written; this is + important since it is entitled to write less than what we requested. + Currently we ignore the return value and assume it wrote everything we + requested. While in practice this is likely to always be the case, it + isn't actually correct. +- external/pflash: Fix erasing within a single erase block + + It is possible to erase within a single erase block. Currently the + pflash code assumes that if the erase starts part way into an erase + block it is because it needs to be aligned up to the boundary with the + next erase block. + + Doing an erase smaller than a single erase block will cause underflows + and looping forever on erase.
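To illustrate the single-erase-block case described above, here is a small sketch of how an erase request might be split without underflowing. It is an assumption-laden illustration, not the pflash implementation: `struct erase_plan` and `plan_erase()` are hypothetical names, and `block` is assumed to be non-zero.

```c
#include <stdint.h>

/* Hypothetical split of an erase request into three parts. */
struct erase_plan {
	uint64_t head;   /* bytes before the first erase block boundary */
	uint64_t middle; /* bytes covered by whole erase blocks */
	uint64_t tail;   /* bytes after the last whole erase block */
};

static struct erase_plan plan_erase(uint64_t pos, uint64_t len, uint64_t block)
{
	struct erase_plan p = { 0, 0, 0 };
	/* Distance from pos up to the next block boundary (0 if aligned). */
	uint64_t to_boundary = (block - pos % block) % block;

	/* Clamp: a request that ends inside the first block is all "head". */
	p.head = to_boundary < len ? to_boundary : len;
	len -= p.head;		/* cannot underflow because of the clamp */
	p.middle = len - len % block;
	p.tail = len - p.middle;
	return p;
}
```

The clamp is the point of the sketch: without it, subtracting the distance to the next boundary from a shorter length would wrap around, which is the kind of underflow the entry describes.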
+- external/pflash: Fix non-zero return code for successful read when size%256 != 0 + + When performing a read the return value from pflash is non-zero, even for + a successful read, when the size being read is not a multiple of 256. + This is because do_read_file returns the value from the write system + call which is then returned by pflash. When the size is a multiple of + 256 we get lucky in that this wraps around back to zero. However for any + other value the return code is size % 256. This means even when the + operation is successful the return code will seem to reflect an error. + + Fix this by returning zero if the entire size was read correctly, + otherwise return the corresponding error code. +- libflash: Fix parity calculation on ARM + + To calculate the ECC syndrome we need to calculate the parity of a 64bit + number. On non-powerpc platforms we use the GCC builtin function + __builtin_parityl() to do this calculation. This is broken on 32bit ARM + where sizeof(unsigned long) is four bytes. Using __builtin_parityll() + instead cures this. +- libflash/mbox-flash: Add the ability to lock flash +- libflash/mbox-flash: Understand v3 +- libflash/mbox-flash: Use BMC suggested timeout value +- libflash/mbox-flash: Simplify message sending + + hw/lpc-mbox no longer requires that the memory associated with messages + exist for the lifetime of the message. Once it has been sent to the BMC, + that is bmc_mbox_enqueue() returns, lpc-mbox does not need the message + to continue to exist. On the receiving side, lpc-mbox will ensure that a + message exists for the receiving callback function. + + Remove all code to deal with allocating messages. +- hw/lpc-mbox: Simplify message bookkeeping and timeouts + + Currently the hw/lpc-mbox layer keeps a pointer for the currently + in-flight message for the duration of the mbox call. This creates + problems when messages timeout, is that pointer still valid, what can we + do with it. 
The memory is owned by the caller but if the caller has + declared a timeout, it may have freed that memory. + + Another problem is locking. This patch also locks around sending and + receiving to avoid races with timeouts and possible resends. There was + some locking previously which was likely insufficient - definitely too + hard to be sure is correct + + All this is made much easier with the previous rework which moves + sequence number allocation and verification into lpc-mbox rather than + the caller. +- libflash/mbox-flash: Allow mbox-flash to tell the driver msg timeouts + + Currently when mbox-flash decides that a message times out the driver + has no way of knowing to drop the message and will continue waiting for + a response indefinitely preventing more messages from ever being sent. + + This is a problem if the BMC crashes or has some other issue where it + won't ever respond to our outstanding message. + + This patch provides a method for mbox-flash to tell the driver how long + it should wait before it no longer needs to care about the response. +- libflash/mbox-flash: Move sequence handling to driver level +- libflash/mbox-flash: Always close windows before opening a new window + + The MBOX protocol states that if an open window command fails then all + open windows are closed. Currently, if an open window command fails + mbox-flash will erroneously assume that the previously open window is + still open. + + The solution to this is to mark all windows as closed before issuing an + open window command and then on success we'll mark the new window as + open. +- libflash/mbox-flash: Add v2 error codes + +opal-prd +^^^^^^^^ + +Anybody shipping `opal-prd` for POWER9 systems must upgrade `opal-prd` to +this new version. + +- prd: Log unsupported message type + + Useful for debugging. 
+ + Sample output: :: + + [29155.157050283,7] PRD: Unsupported prd message type : 0xc + +- opal-prd: occ: Add support for runtime OCC load/start in ZZ + + This patch adds support to handle OCC load/start event from FSP/PRD. + During IPL we send a success directly to FSP without invoking any HBRT + load routines on receiving OCC load mbox message from FSP. At runtime + we forward this event to host opal-prd. + + This patch provides support for invoking OCC load/start HBRT routines + like load_pm_complex() and start_pm_complex() from opal-prd. +- opal-prd: Add support for runtime OCC reset in ZZ + + This patch handles OCC_RESET runtime events in host opal-prd and also + provides support for calling 'hostinterface->wakeup()' which is + required for doing the reset operation. +- prd: Enable error logging via firmware_request interface + + In P9 HBRT sends error logs to FSP via firmware_request interface. + This patch adds support to parse error log and send it to FSP. +- prd: Add generic response structure inside prd_fw_msg + + This patch adds generic response structure. Also sync prd_fw_msg type + macros with hostboot. +- opal-prd: flush after logging to stdio in debug mode + + When in debug mode, flush after each log output. This makes it more + likely that we'll catch failure reasons on severe errors. + +Debugging and reliability improvements +-------------------------------------- + +Since skiboot-5.10-rc3: + +- increase log verbosity in debug builds +- Add -debug to version on DEBUG builds +- cpu_wait_job: Correctly report time spent waiting for job + +Since skiboot-5.10-rc2: + +- ATTN: Enable flush instruction cache bit in HID register + + In P9, we have to enable "flush the instruction cache" bit along with + "attn instruction support" bit to trigger attention. + +Since skiboot-5.10-rc1: + +- core/init: manage MSR[ME] explicitly, always enable + + The current boot sequence inherits MSR[ME] from the IPL firmware, and + never changes it. 
Some environments disable MSR[ME] (e.g., mambo), and + others can enable it (hostboot). + + This has two problems. First, MSR[ME] must be disabled while in the + process of taking over the interrupt vector from the previous + environment. Second, after installing our machine check handler, + MSR[ME] should be enabled to get some useful output rather than a + checkstop. +- core/exception: beautify exception handler, add MCE-involved registers + + Print DSISR and DAR, to help with deciphering machine check exceptions, + and improve the output a bit, decode NIP symbol, improve alignment, etc. + Also print a specific header for machine check, because we do expect to + see these if there is a hardware failure. + + Before: :: + + [ 0.005968779,3] *********************************************** + [ 0.005974102,3] Unexpected exception 200 ! + [ 0.005978696,3] SRR0 : 000000003002ad80 SRR1 : 9000000000001000 + [ 0.005985239,3] HSRR0: 00000000300027b4 HSRR1: 9000000030001000 + [ 0.005991782,3] LR : 000000003002ad80 CTR : 0000000000000000 + [ 0.005998130,3] CFAR : 00000000300b58bc + [ 0.006002769,3] CR : 40000004 XER: 20000000 + [ 0.006008069,3] GPR00: 000000003002ad80 GPR16: 0000000000000000 + [ 0.006015170,3] GPR01: 0000000031c03bd0 GPR17: 0000000000000000 + [...] + + After: :: + + [ 0.003287941,3] *********************************************** + [ 0.003561769,3] Fatal MCE at 000000003002ad80 .nvram_init+0x24 + [ 0.003579628,3] CFAR : 00000000300b5964 + [ 0.003584268,3] SRR0 : 000000003002ad80 SRR1 : 9000000000001000 + [ 0.003590812,3] HSRR0: 00000000300027b4 HSRR1: 9000000030001000 + [ 0.003597355,3] DSISR: 00000000 DAR : 0000000000000000 + [ 0.003603480,3] LR : 000000003002ad68 CTR : 0000000030093d80 + [ 0.003609930,3] CR : 40000004 XER : 20000000 + [ 0.003615698,3] GPR00: 00000000300149e8 GPR16: 0000000000000000 + [ 0.003622799,3] GPR01: 0000000031c03bc0 GPR17: 0000000000000000 + [...] 
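The MSR[ME] handling described in the core/init entry above comes down to clearing and setting one bit at two points in boot. The sketch below only models that bit arithmetic on a plain integer. The function names are hypothetical, and real firmware would read and write the MSR with privileged instructions, which are deliberately left out here.

```c
#include <stdint.h>

/* MSR[ME] is bit 51 in IBM bit numbering of the 64-bit MSR, i.e. 0x1000. */
#define MSR_ME 0x1000ULL

/* Hypothetical: MSR value to run with while taking over the vectors. */
static uint64_t msr_during_takeover(uint64_t inherited_msr)
{
	return inherited_msr & ~MSR_ME;
}

/* Hypothetical: MSR value once our machine check handler is installed. */
static uint64_t msr_after_handler_install(uint64_t msr)
{
	return msr | MSR_ME;
}
```

The SRR1 value 9000000000001000 in the dumps above has this 0x1000 bit set.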
+ + +Since skiboot-5.9: + +- lock: Add additional lock auditing code + + Keep track of lock owner name and replace lock_depth counter + with a per-cpu list of locks held by the cpu. + + This allows us to print the actual locks held in case we hit + the (in)famous message about opal_pollers being run with a + lock held. + + It also allows us to warn (and drop them) if locks are still + held when returning to the OS or completing a scheduled job. +- Add support for new GCC 7 parametrized stack protector + + This gives us per-cpu guard values as well. For now I just + XOR a magic constant with the CPU PIR value. +- Mambo: run hello_world and sreset_world tests with Secure and Trusted Boot + + We *disable* the secure boot part, but we keep the verified boot + part as we don't currently have container verification code for Mambo. + + We can run a small part of the code currently though. + +- core/flash.c: extern function to get the name of a PNOR partition + + This adds the flash_map_resource_name() to allow skiboot subsystems to + lookup the name of a PNOR partition. Thus, we don't need to duplicate + the same information in other places (e.g. libstb). +- libflash/mbox-flash: only wait for MBOX_DEFAULT_POLL_MS if busy + + This makes the mbox unit test run 300x quicker and seems to + shave about 6 seconds from boot time on Witherspoon. +- make check: Make valgrind optional + + To (slightly) lower the barrier for contributions, we can make valgrind + optional with just a small amount of plumbing. + + This allows make check to run successfully without valgrind. +- libflash/test: Add tests for mbox-flash + + A first basic set of tests for mbox-flash. These tests do their testing + by stubbing out or otherwise replacing functions not in + libflash/mbox-flash.c. 
The stubbed out version of the function can then + be used to emulate a BMC mbox daemon talking back to the code in + mbox-flash and it can ensure that there is some adherence to the + protocol and that from a block-level api point of view the world appears + sane. + + This makes these tests simple to run and they have been integrated into + `make check`. The down side is that these tests rely on duplicated, + feature-incomplete BMC daemon behaviour. Therefore these tests are a + strong indicator of broken behaviour but a very unreliable indicator of + correctness. + + Full integration tests with a 'real' BMC daemon are probably beyond the + scope of this repository. +- external/test/test.sh: fix VERSION substitution when no tags + + i.e. we get a hash rather than a version number + + This seems to be occurring in Travis if it doesn't pull a tag. +- external/test: make stripping out version number more robust + + For some bizarre reason, Travis started failing on this + substitution when there'd been zero code changes in this + area... This at least papers over whatever the problem is + for the time being. +- io: Add load_wait() helper + + This uses the standard form twi/isync pair to ensure a load + is consumed by the core before continuing. This can be necessary + under some circumstances, for example when having the following + sequence: + + - Store reg A + - Load reg A (ensure above store pushed out) + - delay loop + - Store reg A + + I.e., a mandatory delay between 2 stores. In theory the first store + is only guaranteed to reach the device after the load from the same + location has completed. However the processor will start executing + the delay loop without waiting for the return value from the load. + + This construct enforces that the delay loop isn't executed until + the load value has been returned.
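The store / load / delay / store sequence from the load_wait() entry above might look roughly like this. It is a sketch under assumptions: the twi/isync pair follows the description in the entry, the non-PowerPC branch is only a compiler barrier so the example builds elsewhere (it does not give the same hardware guarantee), and `store_delay_store()` with its plain `volatile` pointer is a hypothetical stand-in for real MMIO accessors.

```c
#include <stdint.h>

/*
 * Make later instructions depend on the loaded value having returned.
 * The fallback is NOT equivalent; it only stops compiler reordering.
 */
static inline void load_wait(uint64_t data)
{
#if defined(__powerpc__) || defined(__powerpc64__)
	__asm__ volatile("twi 0,%0,0; isync" : : "r"(data) : "memory");
#else
	(void)data;
	__asm__ volatile("" : : : "memory");
#endif
}

/* Hypothetical caller: two stores to one register with a delay between. */
static inline void store_delay_store(volatile uint64_t *reg, uint64_t first,
				     uint64_t second, unsigned long spins)
{
	volatile unsigned long i;

	*reg = first;		/* Store reg A */
	load_wait(*reg);	/* Load reg A and wait for the value */
	for (i = 0; i < spins; i++)
		;		/* delay loop */
	*reg = second;		/* Store reg A */
}
```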
+- chiptod: Keep boot timestamps contiguous + + Currently we reset the timebase value to (almost) zero when + synchronising the timebase of each chip to the Chip TOD network which + results in this: :: + + [ 42.374813167,5] CPU: All 80 processors called in... + [ 2.222791151,5] FLASH: Found system flash: Macronix MXxxL51235F id:0 + [ 2.222977933,5] BT: Interface initialized, IO 0x00e4 + + This patch modifies the chiptod_init() process to use the current + timebase value rather than resetting it to zero. This results in the + timestamps remaining contiguous from the start of hostboot until + the petikernel starts. e.g. :: + + [ 70.188811484,5] CPU: All 144 processors called in... + [ 72.458004252,5] FLASH: Found system flash: id:0 + [ 72.458147358,5] BT: Interface initialized, IO 0x00e4 + +- hdata/spira: Add missing newline to prlog() call + + We're missing a \n here. +- opal/xscom: Add recovery for lost core wakeup SCOM failures. + + Due to a hardware issue, a core's response to a SCOM can be delayed by + thread reconfiguration, which leaves the SCOM logic in a state where the + subsequent SCOM to that core can get errors. This affects Core + PC SCOM registers in the range of 20010A80-20010ABF. + + The solution is that if an xscom timeout occurs on one of the Core PC SCOM registers + in the range of 20010A80-20010ABF, a clearing SCOM write is done to + 0x20010800 with data of '0x00000000' which will also get a timeout but + clears the SCOM logic errors. After the clearing write is done the original + SCOM operation can be retried. + + The SCOM timeout is reported as status 0x4 (Invalid address) in HMER[21-23]. +- opal/xscom: Move the delay inside xscom_reset() function. + + This way callers of xscom_reset() do not have to add a delay + separately. Instead, the caller can control whether to add a delay using + the second argument to xscom_reset().
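The decision at the heart of the lost core wakeup recovery described above can be sketched as a simple predicate. This is illustrative only: the macro and function names are hypothetical, the address range and status value are taken literally from the entry, and how the status is extracted from HMER[21-23] is not modelled.

```c
#include <stdbool.h>
#include <stdint.h>

/* Values quoted in the release note; names are hypothetical. */
#define CORE_PC_SCOM_FIRST	0x20010A80u
#define CORE_PC_SCOM_LAST	0x20010ABFu
#define CORE_PC_CLEAR_REG	0x20010800u	/* target of the clearing write */
#define SCOM_STATUS_TIMEOUT	0x4u		/* reported as "Invalid address" */

/*
 * Return true when a failed access should be followed by a clearing write
 * of 0 to CORE_PC_CLEAR_REG and then a retry of the original operation.
 */
static bool needs_clearing_write(uint32_t addr, uint32_t status)
{
	return status == SCOM_STATUS_TIMEOUT &&
	       addr >= CORE_PC_SCOM_FIRST && addr <= CORE_PC_SCOM_LAST;
}
```

A caller would issue the clearing write (expecting it to time out as well) and then retry the original SCOM once.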
+- timer: Stop calling list_top() racily + + This will trip the debug checks in debug builds under some circumstances + and is actually a rather bad idea as we might look at a timer that is + concurrently being removed and modified, and thus incorrectly assume + there is no work to do. +- fsp: Bail out of HIR if FSP is resetting voluntarily + + a. Surveillance response times out and OPAL triggers a HIR + b. Before the HIR process kicks in, OPAL gets a PSI interrupt indicating link down + c. HIR process continues and OPAL tries to write to DRCR; PSI link inactive => xstop + + OPAL should confirm that the FSP is not already in reset in the HIR path. +- sreset_kernel: only run SMT tests due to not supporting re-entry +- Use systemsim-p9 v1.1 +- direct-controls: enable fast reboot direct controls for mambo + + Add mambo direct controls to stop threads, which is required for + reliable fast-reboot. Enable direct controls by default on mambo. +- core/opal: always verify cpu->pir on entry +- asm/head: add entry/exit calls + + Add entry and exit C functions that can do some more complex + checks before the opal proper call. This requires saving off + volatile registers that have arguments in them. +- core/lock: improve bust_locks + + Prevent try_lock from modifying the lock state when bust_locks is set. + unlock will not unlock it in that case, so locks will get taken and + never released while bust_locks is set. +- hw/occ: Log proper SCOM register names + + This patch fixes the logging of incorrect SCOM + register names. +- mambo: Add support for NUMA + + Currently the mambo scripts can do multiple chips, but only the first + ever has memory. + + This patch adds support for having memory on each chip, with each + appearing as a separate NUMA node. Each node gets MEM_SIZE worth of + memory. + + It's opt-in, via ``export MAMBO_NUMA=1``. 
+- external/mambo: Switch qtrace command to use plug-ins + + The plug-in seems to be the preferred way to do this now, it works + better, and the qtracer emitter seems to generate invalid traces + in new mambo versions. +- asm/head: Loop after attn + + We use the attn instruction to raise an error in early boot if OPAL + doesn't recognise the PVR. It's possible for hostboot to disable the + attn instruction before entering OPAL so add an extra busy loop after + the attn to prevent attempting to boot on an unknown processor. + +Contributors +------------ + +- 302 csets from 32 developers +- 3 employers found +- A total of 15919 lines added, 4786 removed (delta 11133) + +Extending the analysis done for some previous releases, we can see our trends +in code review across versions: + +======= ====== ======== ========= ========= =========== +Release csets Ack % Reviews % Tested % Reported % +======= ====== ======== ========= ========= =========== +5.0 329 15 (5%) 20 (6%) 1 (0%) 0 (0%) +5.1 372 13 (3%) 38 (10%) 1 (0%) 4 (1%) +5.2-rc1 334 20 (6%) 34 (10%) 6 (2%) 11 (3%) +5.3-rc1 302 36 (12%) 53 (18%) 4 (1%) 5 (2%) +5.4 361 16 (4%) 28 (8%) 1 (0%) 9 (2%) +5.5 408 11 (3%) 48 (12%) 14 (3%) 10 (2%) +5.6 87 12 (14%) 6 (7%) 5 (6%) 2 (2%) +5.7 232 30 (13%) 32 (14%) 5 (2%) 2 (1%) +5.8 157 13 (8%) 36 (23%) 2 (1%) 6 (4%) +5.9 209 15 (7%) 78 (37%) 3 (1%) 10 (5%) +5.10 302 20 (6%) 62 (21%) 24 (8%) 11 (4%) +======= ====== ======== ========= ========= =========== + +The review count for v5.9 is largely bogus; there was a series of 25 whitespace +patches that got "Reviewed-by" and if we exclude them, we're back to 14%, +which is more like what I'd expect. + +For 5.10, we've seen an increase in Reviewed-by from 5.9, back to closer to +5.8 levels. I'm hoping we can keep the ~20% up.
+ +Initially I was really pleased with the increase in Tested-by, but with closer +examination, 17 of those are actually from various automated testing on +commits to code we bring in from hostboot/other firmware components. When +you exclude them, we're back down to 2% getting Tested-by, which isn't great. + +Developers with the most changesets +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +========================== === ======= +Developer # % +========================== === ======= +Stewart Smith 40 (13.2%) +Nicholas Piggin 37 (12.3%) +Oliver O'Halloran 36 (11.9%) +Benjamin Herrenschmidt 23 (7.6%) +Claudio Carvalho 20 (6.6%) +Cyril Bur 19 (6.3%) +Michael Neuling 13 (4.3%) +Shilpasri G Bhat 12 (4.0%) +Reza Arbab 12 (4.0%) +Pridhiviraj Paidipeddi 11 (3.6%) +Vasant Hegde 10 (3.3%) +Akshay Adiga 10 (3.3%) +Mahesh Salgaonkar 8 (2.6%) +Russell Currey 7 (2.3%) +Alistair Popple 7 (2.3%) +Vaibhav Jain 5 (1.7%) +Prem Shanker Jha 4 (1.3%) +Robert Lippert 4 (1.3%) +Frédéric Bonnard 3 (1.0%) +Christophe Lombard 3 (1.0%) +Jeremy Kerr 2 (0.7%) +Michael Ellerman 2 (0.7%) +Balbir Singh 2 (0.7%) +Andrew Donnellan 2 (0.7%) +Madhavan Srinivasan 2 (0.7%) +Adriana Kobylak 2 (0.7%) +Sukadev Bhattiprolu 1 (0.3%) +Alexey Kardashevskiy 1 (0.3%) +Frederic Barrat 1 (0.3%) +Ananth N Mavinakayanahalli 1 (0.3%) +Suraj Jitindar Singh 1 (0.3%) +Guilherme G. 
Piccoli 1 (0.3%) +========================== === ======= + +Developers with the most changed lines +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +========================== ==== ======= +Developer # % +========================== ==== ======= +Stewart Smith 4284 (24.5%) +Nicholas Piggin 2924 (16.7%) +Claudio Carvalho 2476 (14.2%) +Shilpasri G Bhat 1490 (8.5%) +Cyril Bur 1475 (8.4%) +Oliver O'Halloran 1242 (7.1%) +Benjamin Herrenschmidt 736 (4.2%) +Alistair Popple 498 (2.8%) +Vasant Hegde 299 (1.7%) +Akshay Adiga 273 (1.6%) +Reza Arbab 231 (1.3%) +Mahesh Salgaonkar 225 (1.3%) +Balbir Singh 213 (1.2%) +Frédéric Bonnard 169 (1.0%) +Michael Neuling 142 (0.8%) +Robert Lippert 97 (0.6%) +Pridhiviraj Paidipeddi 93 (0.5%) +Prem Shanker Jha 92 (0.5%) +Christophe Lombard 80 (0.5%) +Russell Currey 78 (0.4%) +Michael Ellerman 72 (0.4%) +Adriana Kobylak 71 (0.4%) +Madhavan Srinivasan 61 (0.3%) +Sukadev Bhattiprolu 58 (0.3%) +Vaibhav Jain 52 (0.3%) +Jeremy Kerr 27 (0.2%) +Ananth N Mavinakayanahalli 16 (0.1%) +Frederic Barrat 9 (0.1%) +Andrew Donnellan 5 (0.0%) +Alexey Kardashevskiy 3 (0.0%) +Suraj Jitindar Singh 1 (0.0%) +Guilherme G. 
Piccoli 1 (0.0%) +========================== ==== ======= + +Developers with the most lines removed +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +========================= ==== ======= +Developer # % +========================= ==== ======= +Alistair Popple 304 (6.4%) +Andrew Donnellan 1 (0.0%) +========================= ==== ======= + +Developers with the most signoffs +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +========================== === ======= +Developer # % +========================== === ======= +Stewart Smith 262 (99.2%) +Reza Arbab 1 (0.4%) +Mahesh Salgaonkar 1 (0.4%) +========================== === ======= + +Developers with the most reviews +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +================================ ==== ======= +Developer # % +================================ ==== ======= +Andrew Donnellan 8 (13.6%) +Balbir Singh 5 (8.5%) +Vasant Hegde 5 (8.5%) +Gregory S. Still 4 (6.8%) +Nicholas Piggin 4 (6.8%) +Reza Arbab 3 (5.1%) +Alistair Popple 3 (5.1%) +RANGANATHPRASAD G. BRAHMASAMUDRA 3 (5.1%) +Jennifer A. Stofer 3 (5.1%) +Oliver O'Halloran 3 (5.1%) +Vaidyanathan Srinivasan 2 (3.4%) +Hostboot Team 2 (3.4%) +Christian R. Geddes 2 (3.4%) +Frederic Barrat 2 (3.4%) +Cyril Bur 2 (3.4%) +Stewart Smith 1 (1.7%) +Cédric Le Goater 1 (1.7%) +Samuel Mendoza-Jonas 1 (1.7%) +Daniel M. 
Crowell 1 (1.7%) +Vaibhav Jain 1 (1.7%) +Madhavan Srinivasan 1 (1.7%) +Michael Ellerman 1 (1.7%) +Shilpasri G Bhat 1 (1.7%) +**Total** 59 (100%) +================================ ==== ======= + +Developers with the most test credits +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +=========================== == ======= +Developer # % +=========================== == ======= +FSP CI Jenkins 4 (16.7%) +Jenkins Server 4 (16.7%) +Hostboot CI 4 (16.7%) +Oliver O'Halloran 3 (12.5%) +Jenkins OP Build CI 3 (12.5%) +Jenkins OP HW 2 (8.3%) +Pridhiviraj Paidipeddi 2 (8.3%) +Andrew Donnellan 1 (4.2%) +Vaidyanathan Srinivasan 1 (4.2%) +**Total** 24 (100%) +=========================== == ======= + +Developers who gave the most tested-by credits +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +=========================== == ======= +Developer # % +=========================== == ======= +Prem Shanker Jha 17 (70.8%) +Benjamin Herrenschmidt 3 (12.5%) +Stewart Smith 2 (8.3%) +Shilpasri G Bhat 1 (4.2%) +Ananth N Mavinakayanahalli 1 (4.2%) +**Total** 24 (100%) +=========================== == ======= + + +Developers with the most report credits +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +=========================== == ======= +Developer # % +=========================== == ======= +Pridhiviraj Paidipeddi 2 (18.2%) +Benjamin Herrenschmidt 1 (9.1%) +Andrew Donnellan 1 (9.1%) +Michael Ellerman 1 (9.1%) +Deb McLemore 1 (9.1%) +Brad Bishop 1 (9.1%) +Michel Normand 1 (9.1%) +Hugo Landau 1 (9.1%) +Minda Wei 1 (9.1%) +Francesco A Campisano 1 (9.1%) +**Total** 11 (100%) +=========================== == ======= + +Developers who gave the most report credits +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +=========================== == ======= +Developer # % +=========================== == ======= +Stewart Smith 7 (63.6%) +Suraj Jitindar Singh 1 (9.1%) +Jeremy Kerr 1 (9.1%) +Michael Neuling 1 (9.1%) +Frédéric Bonnard 1 (9.1%) +**Total** 11 (100%) + +=========================== == ======= + +Changesets and 
Employers +^^^^^^^^^^^^^^^^^^^^^^^^ + +Top changeset contributors by employer: + +========================== === ======= +Employer # % +========================== === ======= +IBM 298 (98.7%) +Google 3 (1.0%) +(Unknown) 1 (0.3%) +========================== === ======= + +Top lines changed by employer: + +======================== ===== ======= +Employer # % +======================== ===== ======= +IBM 17396 (99.4%) +Google 73 (0.4%) +(Unknown) 24 (0.1%) +======================== ===== ======= + +Employers with the most signoffs (total 264): + +======================== ===== ======= +Employer # % +======================== ===== ======= +IBM 264 (100.0%) +======================== ===== ======= + +Employers with the most hackers (total 33) + +========================== === ======= +Employer # % +========================== === ======= +IBM 31 (93.9%) +Google 1 (3.0%) +(Unknown) 1 (3.0%) +========================== === ======= diff --git a/roms/skiboot/doc/release-notes/skiboot-5.11-rc1.rst b/roms/skiboot/doc/release-notes/skiboot-5.11-rc1.rst new file mode 100644 index 000000000..89a3e6abc --- /dev/null +++ b/roms/skiboot/doc/release-notes/skiboot-5.11-rc1.rst @@ -0,0 +1,694 @@ +.. _skiboot-5.11-rc1: + +skiboot-5.11-rc1 +================ + +skiboot v5.11-rc1 was released on Wednesday March 28th 2018. It is the first +release candidate of skiboot 5.11, which will become the new stable release +of skiboot following the 5.10 release, first released February 23rd 2018. + +We do not expect to keep the 5.11 branch around for long, and instead expect +to move quickly on to 6.0, which will mark the basis for op-build v2.0 and +will be required for POWER9 systems. + +skiboot v5.11-rc1 contains all bug fixes as of :ref:`skiboot-5.10.3` +and :ref:`skiboot-5.4.9` (the currently maintained stable releases). There +may be more 5.10.x stable releases; it will depend on demand. + +For how the skiboot stable releases work, see :ref:`stable-rules`.
+ +The current plan is to cut the final 5.11 in March, with skiboot 5.11 +being for all POWER8 and POWER9 platforms in op-build v1.22. +This release is targeted to early POWER9 systems. + +Over skiboot-5.10, we have the following changes: + +New Platforms +------------- + +- Add VESNIN platform support + + The Vesnin platform from YADRO is a 4 socket POWER8 system with up to 8TB + of memory with 460GB/s of memory bandwidth in only 2U. Many kudos to the + team from Yadro for submitting their code upstream! + +New Features +------------ + +- fast-reboot: enable by default for POWER9 + + - Fast reboot is disabled if NPU2 is present or CAPI2/OpenCAPI is used + +- PCI tunneled operations on PHB4 + + - phb4: set PBCQ Tunnel BAR for tunneled operations + + P9 supports PCI tunneled operations (atomics and as_notify) that are + initiated by devices. + + A subset of the tunneled operations requires a response that must be + sent back from the host to the device. For example, an atomic compare + and swap will return the compare status, as the swap will only be performed + in case of success. Similarly, as_notify reports if the target thread + has been woken up or not, because the operation may fail. + + To enable tunneled operations, a device driver must tell the host where + it expects tunneled operation responses, by setting the PBCQ Tunnel BAR + Response register with a specific value within the range of its BARs. + + This register is currently initialized by enable_capi_mode(). But, as + tunneled operations may also operate in PCI mode, a new API is required + to set the PBCQ Tunnel BAR Response register, without switching to CAPI + mode. + + This patch provides two new OPAL calls to get/set the PBCQ Tunnel + BAR Response register. + + Note: as there is only one PBCQ Tunnel BAR register, shared between + all the devices connected to the same PHB, only one of these devices + will be able to use tunneled operations, at any time.
+ - phb4: set PHB CMPM registers for tunneled operations + + P9 supports PCI tunneled operations (atomics and as_notify) that require + setting the PHB ASN Compare/Mask register with a 16-bit indication. + + This register is currently initialized by enable_capi_mode(). But, as + tunneled operations may also work in PCI mode, the ASN Compare/Mask + register should rather be initialized in phb4_init_ioda3(). + + This patch also adds "ibm,phb-indications" to the device tree, to tell + Linux the values of CAPI, ASN, and NBW indications, when supported. + + Tunneled operations tested by IBM in CAPI mode, by Mellanox Technologies + in PCI mode. + +- Tie tm-suspend fw-feature and opal_reinit_cpus() together + + Currently opal_reinit_cpus(OPAL_REINIT_CPUS_TM_SUSPEND_DISABLED) + always returns OPAL_UNSUPPORTED. + + This ties the tm suspend fw-feature to the + opal_reinit_cpus(OPAL_REINIT_CPUS_TM_SUSPEND_DISABLED) so that when tm + suspend is disabled, we correctly report it to the kernel. For + backwards compatibility, it's assumed tm suspend is available if the + fw-feature is not present. + + Currently hostboot will clear fw-feature(TM_SUSPEND_ENABLED) on P9N + DD2.1. P9N DD2.2 will set fw-feature(TM_SUSPEND_ENABLED). DD2.0 and + below have TM disabled completely (not just suspend). + + We are using opal_reinit_cpus() to determine this setting (rather than + the device tree/HDAT) as some future firmware may let us change this + dynamically after boot. That is not the case currently though. + +Power Management +---------------- + +- SLW: Increase stop4-5 residency by 10x + + Using DGEMM benchmark we observed there was a drop of 5-9% throughput with + and without stop4/5. In this benchmark the GPU waits on the cpu to wake up + and provide the subsequent data block to compute. The wakeup latency + accumulates over the run and shows up as a performance drop. + + Linux enters stop4/5 more aggressively for its wakeup latency.
Increasing + the residency from 1ms to 10ms makes the performance drop <1%. +- occ: Set up OCC messaging even if we fail to set up pstates + + This means that we no longer hit this bug if we fail to get valid pstates + from the OCC. :: + + [console-pexpect]#echo 1 > //sys/firmware/opal/sensor_groups//occ-csm0/clear + echo 1 > //sys/firmware/opal/sensor_groups//occ-csm0/clear + [ 94.019971181,5] CPU ATTEMPT TO RE-ENTER FIRMWARE! PIR=083d cpu @0x33cf4000 -> pir=083d token=8 + [ 94.020098392,5] CPU ATTEMPT TO RE-ENTER FIRMWARE! PIR=083d cpu @0x33cf4000 -> pir=083d token=8 + [ 10.318805] Disabling lock debugging due to kernel taint + [ 10.318808] Severe Machine check interrupt [Not recovered] + [ 10.318812] NIP [000000003003e434]: 0x3003e434 + [ 10.318813] Initiator: CPU + [ 10.318815] Error type: Real address [Load/Store (foreign)] + [ 10.318817] opal: Hardware platform error: Unrecoverable Machine Check exception + [ 10.318821] CPU: 117 PID: 2745 Comm: sh Tainted: G M 4.15.9-openpower1 #3 + [ 10.318823] NIP: 000000003003e434 LR: 000000003003025c CTR: 0000000030030240 + [ 10.318825] REGS: c00000003fa7bd80 TRAP: 0200 Tainted: G M (4.15.9-openpower1) + [ 10.318826] MSR: 9000000000201002 <SF,HV,ME,RI> CR: 48002888 XER: 20040000 + [ 10.318831] CFAR: 0000000030030258 DAR: 394a00147d5a03a6 DSISR: 00000008 SOFTE: 1 + + +mbox based platforms +^^^^^^^^^^^^^^^^^^^^ + +For platforms using the mbox protocol for host flash access (all BMC based +OpenPOWER systems, most OpenBMC based systems) there have been some hardening +efforts in the event of the BMC being poorly behaved. + +- mbox: Reduce default BMC timeouts + + Rebooting a BMC can take 70 seconds. Skiboot cannot possibly spin for + 70 seconds waiting for a BMC to come back. This also makes the current + default of 30 seconds a bit pointless, as it is far too short to be a + worst case wait time but too long to avoid hitting hardlockup detectors + and wreaking havoc inside host Linux.
+ + Just change it to three seconds so that host Linux will survive; + reads and writes will fail but at least the host stays up. + + Also refactored the waiting loop just a bit so that it's easier to read. +- mbox: Harden against BMC daemon errors + + Bugs present in the BMC daemon mean that skiboot gets presented with + mbox windows of size zero. These windows cannot be valid and skiboot + already detects these conditions. + + Currently skiboot warns quite strongly about the occurrence of these + problems. The problem for skiboot is that it doesn't take any action. + Initially I wanted to avoid putting policy like this into skiboot but + since these bugs aren't going away and skiboot barfing is leading to + lockups and ultimately the host going down something needs to be done. + + I propose that when we detect the problem we fail the mbox call and punt + the problem back up to Linux. I don't like it but at least it will cause + errors to cascade and won't bring the host down. I'm not sure how Linux + is supposed to detect this or what it can even do but this is better + than a crash. + + Diagnosing a failure to boot if skiboot itself fails to read flash may + be marginally more difficult with this patch. This is because skiboot + will now only print one warning about the zero sized window rather than + continuously spitting it out. + +Fast Reboot Improvements +------------------------ + +Around fast-reboot we have made several improvements to harden the fast +reboot code paths and resort to a full IPL if something doesn't look right. + +- core/fast-reboot: zero memory after fast reboot + + This improves the security and predictability of the fast reboot + environment. + + There cannot be a secure fence between fast reboots, because a + malicious OS can modify the firmware itself. However a well-behaved + OS can have a reasonable expectation that OS memory regions it has + modified will be cleared upon fast reboot.
+ + The memory is zeroed after all other CPUs come up from fast reboot, + just before the new kernel is loaded and booted into. This allows + image preloading to run concurrently, and will allow parallelisation + of the clearing in future. +- core/fast-reboot: verify mem regions before fast reboot + + Run the mem_region sanity checkers before proceeding with fast + reboot. + + This is the beginning of proactive sanity checks on opal data + for fast reboot (which complements the reactive disable_fast_reboot + cases). Re-using and sharing any kind of debug code and unit test + code here is encouraged. +- fast-reboot: occ: Only delete /ibm,opal/power-mgt nodes if they exist +- core/fast-reboot: disable fast reboot upon fundamental entry/exit/locking errors + + This disables fast reboot in several more cases where serious errors + like lock corruption or call re-entrancy are detected. +- capp: Disable fast-reboot whenever enable_capi_mode() is called + + This patch updates phb4_set_capi_mode() to disable fast-reboot + whenever enable_capi_mode() is called, irrespective of its return + value. This should guard against the possibility of fast-reboot not + being disabled when a change to enable_capi_mode() causes it to return + an error while leaving CAPP in enabled mode. +- fast-reboot: occ: Delete OCC child nodes in /ibm,opal/power-mgt + + Fast-reboot in P8 fails to re-init OCC data as there are chipwise OCC + nodes which are already present in the /ibm,opal/power-mgt node. These + per-chip nodes hold the voltage IDs for each pstate and these can be + changed on OCC pstate table biasing. So delete these before calling + the re-init code to re-parse and populate the pstate data. + +Debugging/SRESET improvements +----------------------------- + +- core/opal: allow some re-entrant calls + + This allows a small number of OPAL calls to succeed despite re-entering + the firmware, and rejects others rather than aborting.
+ + This allows a system reset interrupt that interrupts OPAL to do something + useful: sreset other CPUs, use the console (which allows xmon to work or + stack traces to be printed), or reboot the system. + + Use OPAL_INTERNAL_ERROR when rejecting, rather than OPAL_BUSY, which is + used for many other things that do not mean a serious permanent error. +- core/opal: abort in case of re-entrant OPAL call + + The stack is already destroyed by the time we get here, so there + is not much point continuing. +- core/lock: Add lock timeout warnings + + There are currently no timeout warnings for locks in skiboot. We assume + that the lock will eventually become free, which may not always be the + case. + + This patch adds timeout warnings for locks. Any lock which spins for more + than 5 seconds will throw a warning and stacktrace for that thread. This is + useful for debugging situations where a thread hangs, waiting for a + lock to be freed. +- core/lock: Add deadlock detection + + This adds simple deadlock detection. The detection looks for circular + dependencies in the lock requests. It will abort and display a stack trace + when a deadlock occurs. + The detection is enabled by DEBUG_LOCKS (enabled by default). + While the detection may have a slight performance overhead, as there are + not a huge number of locks in skiboot this overhead isn't significant. +- core/hmi: report processor recovery reason from core FIR bits on P9 + + When an error is encountered that causes processor recovery, HMI is + generated if the recovery was successful. The reason is recorded in + the core FIR, which gets copied into the WOF. + + In this case dump the WOF register and an error string into the OPAL + msglog.
+ + A broken init setting led to HMIs reported in Linux as: :: + + [ 3.591547] Harmless Hypervisor Maintenance interrupt [Recovered] + [ 3.591648] Error detail: Processor Recovery done + [ 3.591714] HMER: 2040000000000000 + + This patch would have been useful because it tells us exactly that + the problem is in the d-side ERAT: :: + + [ 414.489690798,7] HMI: Received HMI interrupt: HMER = 0x2040000000000000 + [ 414.489693339,7] HMI: [Loc: UOPWR.0000000-Node0-Proc0]: P:0 C:1 T:1: Processor recovery occurred. + [ 414.489699837,7] HMI: Core WOF = 0x0000000410000000 recovered error: + [ 414.489701543,7] HMI: LSU - SRAM (DCACHE parity, etc) + [ 414.489702341,7] HMI: LSU - ERAT multi hit + + In future it will be good to unify this reporting, so Linux could + print something more useful. Until then, this gives some good data. + +NPU2/NVLink2 Fixes +------------------ +- npu2: Add performance tuning SCOM inits + + Peer-to-peer GPU bandwidth latency testing has produced some tunable + values that improve performance. Add them to our device initialization. + + File these under things that need to be cleaned up with nice #defines + for the register names and bitfields when we get time. + + A few of the settings are dependent on the system's particular NVLink + topology, so introduce a helper to determine how many links go to a + single GPU. +- hw/npu2: Assign a unique LPARSHORTID per GPU + + This gets used elsewhere to index items in the XTS tables. +- NPU2: dump NPU2 registers on npu2 HMI + + Due to the nature of debugging npu2 issues, folk are wanting the + full list of NPU2 registers dumped when there's a problem. +- npu2: Remove DD1 support + + Major changes in the NPU between DD1 and DD2 necessitated a fair bit of + revision-specific code. + + Now that all our lab machines are DD2, we no longer test anything on DD1 + and it's time to get rid of it. + + Remove DD1-specific code and abort probe if we're running on a DD1 machine. 
+- npu2: Disable fast reboot + + Fast reboot does not yet work right with the NPU. It's been disabled on + NVLink and OpenCAPI machines. Do the same for NVLink2. + + This amounts to a port of 3e4577939bbf ("npu: Fix broken fast reset") + from the npu code to npu2. +- npu2: Use unfiltered mode in XTS tables + + The XTS_PID context table is limited to 256 possible pids/contexts. To + relieve this limitation, make use of "unfiltered mode" instead. + + If an entry in the XTS_BDF table has the bit for unfiltered mode set, we + can just use one context for that entire bdf/lpar, regardless of pid. + Instead of searching the XTS_PID table, the NMMU checkout request + will simply use the entry indexed by lparshort id instead. + + Change opal_npu_init_context() to create these lparshort-indexed + wildcard entries (0-15) instead of allocating one for each pid. Check + that multiple calls for the same bdf all specify the same msr value. + + In opal_npu_destroy_context(), continue validating the bdf argument, + ensuring that it actually maps to an lpar, but no longer remove anything + from the XTS_PID table. If/when we start supporting virtualized GPUs, we + might consider actually removing these wildcard entries by keeping a + refcount, but keep things simple for now. + +CAPI/OpenCAPI +------------- +- npu2-opencapi: Add OpenCAPI OPAL API calls + + Add three OPAL API calls that are required by the ocxl driver. + + - OPAL_NPU_SPA_SETUP + + The Shared Process Area (SPA) is a table containing one entry (a + "Process Element") per memory context which can be accessed by the + OpenCAPI device. + + - OPAL_NPU_SPA_CLEAR_CACHE + + The NPU keeps a cache of recently accessed memory contexts. When a + Process Element is removed from the SPA, the cache for the link must be + cleared. + + - OPAL_NPU_TL_SET + + The Transaction Layer specification defines several templates for + messages to be exchanged on the link.
During link setup, the host and + device must negotiate what templates are supported on both sides and at + what rates those messages can be sent. +- npu2-opencapi: Train OpenCAPI links and setup devices + + Scan the OpenCAPI links under the NPU, and for each link, reset the card, + set up a device, train the link and register a PHB. + + Implement the necessary operations for the OpenCAPI PHB type. + + For bringup, test and debug purposes, we allow an NVRAM setting, + "opencapi-link-training" that can be set to either disable link training + completely or to use the prbs31 test pattern. + + To disable link training: :: + + nvram -p ibm,skiboot --update-config opencapi-link-training=none + + To use prbs31: :: + + nvram -p ibm,skiboot --update-config opencapi-link-training=prbs31 +- npu2-hw-procedures: Add support for OpenCAPI PHY link training + + Unlike NVLink, which uses the pci-virt framework to fake a PCI + configuration space for NVLink devices, the OpenCAPI device model presents + us with a real configuration space handled by the device over the OpenCAPI + link. + + As a result, we have to train the OpenCAPI link in skiboot before we do PCI + probing, so that config space can be accessed, rather than having link + training being triggered by the Linux driver. +- npu2-opencapi: Configure NPU for OpenCAPI + + Scan the device tree for NPUs with OpenCAPI links and configure the NPU per + the initialisation sequence in the NPU OpenCAPI workbook. +- capp: Make error in capp timebase sync a non-fatal error + + Presently when we encounter an error while synchronizing capp timebase + with chip-tod at the end of enable_capi_mode() we return an + error. This has two unintended consequences. First, this will prevent + disabling of fast-reboot even though CAPP is already enabled by this + point.
Secondly, failure during timebase sync is a non-fatal error for + capp initialization as CAPP/PSL can continue working after this and an + AFU will only see an error when it tries to read the timebase value + from PSL. + + So this patch updates enable_capi_mode() to not return an error in + case call to chiptod_capp_timebase_sync() fails. The function will now + just log an error and continue further with capp init sequence. This + makes the current implementation align with the one in kernel 'cxl' + driver which also treats PSL timebase sync errors as non-fatal + init errors. +- npu2-opencapi: Fix assert on link reset during init + + We don't support resetting an opencapi link yet. + + Commit fe6d86b9 ("pci: Make fast reboot creset PHBs in parallel") + tries resetting any PHB whose slot defines a 'run_sm' callback. It + raises an assert when applied to an opencapi PHB, as 'run_sm' calls + the 'freset' callback, which is not yet defined for opencapi. + + Fix it for now by removing the currently useless definition of + 'run_sm' on the opencapi slot. It will print a message in the skiboot + log because the PHB cannot be reset, which is correct. It will all go + away when we add support for resetting an opencapi link. +- capp: Add lid definition for P9 DD-2.2 + + Update fsp_lid_map to include CAPP ucode lid for phb4-chipid == + 0x202d1 that corresponds to P9 DD-2.2 chip. +- capp: Disable fast-reboot when capp is enabled + + +PCI +--- + +- pci: Reduce log level of error message + + If a link doesn't train, we can end up with error messages like this: :: + + [ 63.027261959,3] PHB#0032[8:2]: LINK: Timeout waiting for electrical link + [ 63.027265573,3] PHB#0032:00:00.0 Error -6 resetting + + The first message is useful but the second message is just debug from + the core PCI code and is confusing to print to the console. + + This reduces the second print to debug level so it's not seen by the + console by default.
+- Revert "platforms/astbmc/slots.c: Allow comparison of bus numbers when matching slots" + + This reverts commit bda7cc4d0354eb3f66629d410b2afc08c79f795f. + + Ben says: + It's on purpose that we do NOT compare the bus numbers, + they are always 0 in the slot table + we do a hierarchical walk of the tree, matching only the + devfn's along the way bcs the bus numbering isn't fixed + this breaks all slot naming etc... stuff on anything using + the "skiboot" slot tables (P8 opp typically) +- core/pci-dt-slot: Fix booting with no slot map + + Currently if you don't have a slot map in the device tree in + /ibm,pcie-slots, you can crash with a back trace like this: :: + + CPU 0034 Backtrace: + S: 0000000031cd3370 R: 000000003001362c .backtrace+0x48 + S: 0000000031cd3410 R: 0000000030019e38 ._abort+0x4c + S: 0000000031cd3490 R: 000000003002760c .exception_entry+0x180 + S: 0000000031cd3670 R: 0000000000001f10 * + S: 0000000031cd3850 R: 00000000300b4f3e * cpu_features_table+0x1d9e + S: 0000000031cd38e0 R: 000000003002682c .dt_node_is_compatible+0x20 + S: 0000000031cd3960 R: 0000000030030e08 .map_pci_dev_to_slot+0x16c + S: 0000000031cd3a30 R: 0000000030091054 .dt_slot_get_slot_info+0x28 + S: 0000000031cd3ac0 R: 000000003001e27c .pci_scan_one+0x2ac + S: 0000000031cd3ba0 R: 000000003001e588 .pci_scan_bus+0x70 + S: 0000000031cd3cb0 R: 000000003001ee74 .pci_scan_phb+0x100 + S: 0000000031cd3d40 R: 0000000030017ff0 .cpu_process_jobs+0xdc + S: 0000000031cd3e00 R: 0000000030014cb0 .__secondary_cpu_entry+0x44 + S: 0000000031cd3e80 R: 0000000030014d04 .secondary_cpu_entry+0x34 + S: 0000000031cd3f00 R: 0000000030002770 secondary_wait+0x8c + [ 73.016947149,3] Fatal MCE at 0000000030026054 .dt_find_property+0x30 + [ 73.017073254,3] CFAR : 0000000030026040 + [ 73.017138048,3] SRR0 : 0000000030026054 SRR1 : 9000000000201000 + [ 73.017198375,3] HSRR0: 0000000000000000 HSRR1: 0000000000000000 + [ 73.017263210,3] DSISR: 00000008 DAR : 7c7b1b7848002524 + [ 73.017352517,3] LR : 000000003002602c 
CTR : 000000003009102c + [ 73.017419778,3] CR : 20004204 XER : 20040000 + [ 73.017502425,3] GPR00: 000000003002682c GPR16: 0000000000000000 + [ 73.017586924,3] GPR01: 0000000031c23670 GPR17: 0000000000000000 + [ 73.017643873,3] GPR02: 00000000300fd500 GPR18: 0000000000000000 + [ 73.017767091,3] GPR03: fffffffffffffff8 GPR19: 0000000000000000 + [ 73.017855707,3] GPR04: 00000000300b3dc6 GPR20: 0000000000000000 + [ 73.017943944,3] GPR05: 0000000000000000 GPR21: 00000000300bb6d2 + [ 73.018024709,3] GPR06: 0000000031c23910 GPR22: 0000000000000000 + [ 73.018117716,3] GPR07: 0000000031c23930 GPR23: 0000000000000000 + [ 73.018195974,3] GPR08: 0000000000000000 GPR24: 0000000000000000 + [ 73.018278350,3] GPR09: 0000000000000000 GPR25: 0000000000000000 + [ 73.018353795,3] GPR10: 0000000000000028 GPR26: 00000000300be6fb + [ 73.018424362,3] GPR11: 0000000000000000 GPR27: 0000000000000000 + [ 73.018533159,3] GPR12: 0000000020004208 GPR28: 0000000030767d38 + [ 73.018642725,3] GPR13: 0000000031c20000 GPR29: 00000000300b3dc6 + [ 73.018737925,3] GPR14: 0000000000000000 GPR30: 0000000000000010 + [ 73.018794428,3] GPR15: 0000000000000000 GPR31: 7c7b1b7848002514 + + This has been seen in the lab on a witherspoon using the device tree + entry point (ie. no HDAT). + + This fixes the null pointer deref. + +Bugs Fixed +---------- +- xive: fix opal_xive_set_vp_info() error path + + In case of error, opal_xive_set_vp_info() will return without + unlocking the xive object. This is most certainly a typo. +- hw/imc: don't access homer memory if it was not initialised + + This can happen under mambo, at least. +- nvram: run nvram_validate() after nvram_reformat() + + nvram_reformat() sets nvram_valid = true, but it does not set + skiboot_part_hdr. Call nvram_validate() instead, which sets + everything up properly. 
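The ``opal_xive_set_vp_info()`` fix listed above is an instance of a common C bug: an early ``return`` on an error path that skips the unlock. A minimal sketch of the usual remedy follows, routing every exit through a single unlock label. All names here (``toy_lock``, ``toy_xive``, ``toy_set_vp_info``) are hypothetical stand-ins for illustration and are not skiboot's lock or xive code.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins; these are not skiboot's lock or xive types. */
struct toy_lock { bool held; };
struct toy_xive { struct toy_lock lock; int vp_flags; };

enum { TOY_SUCCESS = 0, TOY_PARAMETER = -1 };

static void toy_lock_take(struct toy_lock *l)
{
	assert(!l->held);	/* no recursive locking in this sketch */
	l->held = true;
}

static void toy_unlock(struct toy_lock *l)
{
	assert(l->held);
	l->held = false;
}

/* Every exit taken after toy_lock_take() goes through the 'out' label. */
static int toy_set_vp_info(struct toy_xive *x, int flags)
{
	int rc = TOY_SUCCESS;

	toy_lock_take(&x->lock);
	if (flags < 0) {
		rc = TOY_PARAMETER;	/* error: jump to the unlock, do not return here */
		goto out;
	}
	x->vp_flags = flags;
out:
	toy_unlock(&x->lock);
	return rc;
}
```

With a single exit label, adding a new error check later cannot leave the lock held by accident.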
+- dts: Zero struct to avoid using uninitialised value +- hw/imc: Don't dereference possible NULL +- libstb/create-container: munmap() signature file address +- npu2-opencapi: Fix memory leak +- npu2: Fix possible NULL dereference +- occ-sensors: Remove NULL checks after dereference +- core/ipmi-opal: Add interrupt-parent property for ipmi node on P9 and above. + + dtc emits the warning below with newer 4.2+ kernels. :: + + dts: Warning (interrupts_property): Missing interrupt-parent for /ibm,opal/ipmi + + This fix adds interrupt-parent property under /ibm,opal/ipmi DT node on P9 + and above, which allows ipmi-opal to properly use the OPAL irqchip. + +Other fixes and improvements +---------------------------- + +- core/cpu: discover stack region size before initialising memory regions + + Stack allocation first allocates a memory region sized to hold stacks + for all possible CPUs up to the maximum PIR of the architecture, zeros + the region, then initialises all stacks. Max PIR is 32768 on POWER9, + which is 512MB for stacks. + + The stack region is then shrunk after CPUs are discovered, but this is + a bit of a hack, and it leaves a hole in the memory allocation regions + as it's done after mem regions are initialised. :: + + 0x000000000000..00002fffffff : ibm,os-reserve - OS + 0x000030000000..0000303fffff : ibm,firmware-code - OPAL + 0x000030400000..000030ffffff : ibm,firmware-heap - OPAL + 0x000031000000..000031bfffff : ibm,firmware-data - OPAL + 0x000031c00000..000031c0ffff : ibm,firmware-stacks - OPAL + *** gap *** + 0x000051c00000..000051d01fff : ibm,firmware-allocs-memory@0 - OPAL + 0x000051d02000..00007fffffff : ibm,firmware-allocs-memory@0 - OS + 0x000080000000..000080b3cdff : initramfs - OPAL + 0x000080b3ce00..000080b7cdff : ibm,fake-nvram - OPAL + 0x000080b7ce00..0000ffffffff : ibm,firmware-allocs-memory@0 - OS + + This change moves zeroing into the per-cpu stack setup. The boot CPU + stack is set up based on the current PIR.
Then the size of the stack + region is set, by discovering the maximum PIR of the system from the + device tree, before mem regions are initialised. + + This results in all memory being accounted within memory regions, + and less memory fragmentation of OPAL allocations. +- Make gard display show that a record is cleared + + When clearing gard records, Hostboot only modifies the record_id + portion to be 0xFFFFFFFF. The remainder of the entry remains. + Without this change it can be confusing to users to know that + the record they are looking at is no longer valid. +- Reserve OPAL API number for opal_handle_hmi2 function. +- dts: spl_wakeup: Remove all workarounds in the spl wakeup logic + + We coded a few workarounds in special wakeup logic to handle the + buggy firmware. Now that it is fixed, remove them as they break the + special wakeup protocol. As per the spec we should not de-assert + before assert is complete. So follow this protocol. +- build: use thin archives rather than incremental linking + + This changes the build system to use thin archives rather than + incremental linking for built-in.o, similar to recent change to Linux. + built-in.o is renamed to built-in.a, and is created as a thin archive + with no index, for speed and size. All built-in.a are aggregated into + a skiboot.tmp.a which is a thin archive built with an index, making it + suitable for linking. This is input into the final link. + + The advantages of build size and linker code placement flexibility are + not as great with skiboot as a bigger project like Linux, but it's a + conceptually better way to build, and is more compatible with link + time optimisation in toolchains which might be interesting for skiboot + particularly for size reductions. + + Size of build tree before this patch is 34.4MB, afterwards 23.1MB.
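The stack-region figures in the core/cpu entry above can be checked with a little arithmetic. The sketch below assumes a per-CPU stack size of 16KB, which is derived here from the quoted numbers (512MB divided by 32768 PIRs) rather than read from the skiboot source; ``SKETCH_STACK_SIZE`` and ``stack_region_bytes()`` are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Assumed per-CPU stack size: 512MB / 32768 PIRs = 16KB. This is derived
 * from the figures in the release note, not taken from skiboot itself.
 */
#define SKETCH_STACK_SIZE	(16u * 1024u)

/* Bytes needed to hold one stack for each of nr_pirs possible PIRs. */
static uint64_t stack_region_bytes(uint32_t nr_pirs)
{
	assert(nr_pirs > 0);
	return (uint64_t)nr_pirs * SKETCH_STACK_SIZE;
}
```

Under that assumption, ``stack_region_bytes(32768)`` is 0x20000000 (512MB), which matches the hole in the memory map: the stack region starts at 0x31c00000 and allocations resume at 0x51c00000. The shrunk region 0x31c00000..0x31c0ffff is 64KB, or four such stacks.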
+- core/init: Assert when kernel not found + + If the kernel doesn't load out of flash or there is nothing at + KERNEL_LOAD_BASE, we end up with an esoteric message as we try to + branch out of skiboot into nothing :: + + [ 0.007197688,3] INIT: ELF header not found. Assuming raw binary. + [ 0.014035267,5] INIT: Starting kernel at 0x0, fdt at 0x3044ad90 13029 + [ 0.014042254,3] *********************************************** + [ 0.014069947,3] Fatal Exception 0xe40 at 0000000000000000 + [ 0.014085574,3] CFAR : 00000000300051c4 + [ 0.014090118,3] SRR0 : 0000000000000000 SRR1 : 0000000000000000 + [ 0.014096243,3] HSRR0: 0000000000000000 HSRR1: 9000000000001000 + [ 0.014102546,3] DSISR: 00000000 DAR : 0000000000000000 + [ 0.014108538,3] LR : 00000000300144c8 CTR : 0000000000000000 + [ 0.014114756,3] CR : 40002202 XER : 00000000 + [ 0.014120301,3] GPR00: 000000003001447c GPR16: 0000000000000000 + + This improves the message and asserts in this case: :: + + [ 0.014042685,5] INIT: Starting kernel at 0x0, fdt at 0x3044ad90 13049 bytes) + [ 0.014049556,0] FATAL: Kernel is zeros, can't execute! + [ 0.014054237,0] Assert fail: core/init.c:566:0 + [ 0.014060472,0] Aborting! +- core: Fix 'opal-runtime-size' property + + We are populating 'opal-runtime-size' before calculating the actual stack size. + Hence we end up with the wrong runtime size (e.g. on P9 it shows ~540MB while + the actual size is around ~40MB). Note that only the device tree property shows + the wrong value; reserved-memory reflects the correct size. + + init_all_cpus() calculates and updates the actual stack size. Hence move this + function call before add_opal_node(). + +- mambo: Add fw-feature flags for security related settings + + Newer firmwares report some feature flags related to security + settings via HDAT. On real hardware skiboot translates these into + device tree properties. For testing purposes just create the + properties manually in the tcl. 
+ + These values don't exactly match any actual chip revision, but the + code should not rely on any exact set of values anyway. We just define + the most interesting flags, that if toggled to "disable" will change + Linux behaviour. You can see the actual values in the hostboot source + in src/usr/hdat/hdatiplparms.H. + + Also add an environment variable for easily toggling the top-level + "security on" setting. +- direct-controls: mambo fix for multiple chips +- libflash/blocklevel: Correct miscalculation in blocklevel_smart_erase() + + If blocklevel_smart_erase() detects that the smart erase fits entirely in + one erase block, it has an early bail path. In this path it miscalculates + where in the buffer the backend needs to read from to perform the final + write. +- libstb/secureboot: Fix logging of secure verify messages. + + Currently we are logging secure verify/enforce messages at PR_EMERG + level even when there is no secureboot mode enabled. So reduce the + log level to PR_ERR when secureboot mode is OFF. + +Testing / Code coverage improvements +------------------------------------ + +Improvements in gcov support include support for newer GCCs as well +as easily exporting the area of memory you need to dump to feed to +`extract-gcov`. + +- cpu_idle_job: relax a bit + + This *dramatically* improves kernel boot time with GCOV builds + + from ~3 minutes between loading kernel and switching the HILE + bit down to around 10 seconds. +- gcov: Another GCC, another gcov tweak +- Keep constructors with priorities + + Fixes GCOV builds with gcc7, which uses this. +- gcov: Add gcov data struct to sysfs + + Extracting the skiboot gcov data is currently a tedious process which + involves taking a mem dump of skiboot and searching for the gcov_info + struct. + This patch adds the gcov struct to sysfs under /opal/exports, allowing the + data to be copied directly into userspace and processed. 
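With the gcov structure exported through sysfs, extraction reduces to a plain file copy. The following is a minimal sketch, not the project's own tooling. It assumes the export appears as a readable file named `gcov` under `/sys/firmware/opal/exports/`; both that path and the helper name `copy_export` are assumptions made for illustration and should be checked against the running system.

```python
import shutil
from pathlib import Path

# Assumed location of the exported gcov data (not confirmed by the note
# above, which only says "sysfs under /opal/exports"). Adjust as needed.
ASSUMED_EXPORT = Path("/sys/firmware/opal/exports/gcov")


def copy_export(src: Path, dst: Path, chunk_size: int = 64 * 1024) -> int:
    """Copy an exported binary blob to dst and return the bytes written."""
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        # Stream in chunks so the whole blob never has to sit in memory.
        shutil.copyfileobj(fin, fout, chunk_size)
    return dst.stat().st_size


# Example call on a machine that provides the export (hypothetical paths):
#   copy_export(ASSUMED_EXPORT, Path("/tmp/skiboot-gcov.bin"))
```

The copied file is what would then be fed to `extract-gcov`, replacing the manual memory dump and search for the gcov_info struct.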
+ diff --git a/roms/skiboot/doc/release-notes/skiboot-5.11.rst b/roms/skiboot/doc/release-notes/skiboot-5.11.rst new file mode 100644 index 000000000..53eb9bafc --- /dev/null +++ b/roms/skiboot/doc/release-notes/skiboot-5.11.rst @@ -0,0 +1,828 @@ +.. _skiboot-5.11: + +skiboot-5.11 +============ + +skiboot v5.11 was released on Friday April 6th 2018. It is the first +release of skiboot 5.11, which is now the new stable release +of skiboot following the 5.10 release, first released February 23rd 2018. + +We do *not* expect to keep the 5.11 branch around for long; instead we expect to +quickly move onto a 6.0, which will mark the basis for op-build v2.0 and +will be required for POWER9 systems. + +It is expected that skiboot 6.0 will follow very shortly. Consider 5.11 +more of a beta release to 6.0 than anything. For POWER9 systems it should +certainly be more solid than previous releases though. + +skiboot v5.11 contains all bug fixes as of :ref:`skiboot-5.10.4` +and :ref:`skiboot-5.4.9` (the currently maintained stable releases). There +may be more 5.10.x stable releases; it will depend on demand. + +For details on how the skiboot stable releases work, see :ref:`stable-rules`. + +Over skiboot-5.10, we have the following changes: + +New Platforms +------------- + +- Add VESNIN platform support + + The Vesnin platform from YADRO is a 4-socket POWER8 system with up to 8TB + of memory with 460GB/s of memory bandwidth in only 2U. Many kudos to the + team from Yadro for submitting their code upstream! + +New Features +------------ + +- fast-reboot: enable by default for POWER9 + + - Fast reboot is disabled if NPU2 is present or CAPI2/OpenCAPI is used + +- PCI tunneled operations on PHB4 + + - phb4: set PBCQ Tunnel BAR for tunneled operations + + P9 supports PCI tunneled operations (atomics and as_notify) that are + initiated by devices. + + A subset of the tunneled operations require a response that must be + sent back from the host to the device. 
For example, an atomic compare + and swap will return the compare status, as the swap will only be performed + in case of success. Similarly, as_notify reports if the target thread + has been woken up or not, because the operation may fail. + + To enable tunneled operations, a device driver must tell the host where + it expects tunneled operation responses, by setting the PBCQ Tunnel BAR + Response register with a specific value within the range of its BARs. + + This register is currently initialized by enable_capi_mode(). But, as + tunneled operations may also operate in PCI mode, a new API is required + to set the PBCQ Tunnel BAR Response register, without switching to CAPI + mode. + + This patch provides two new OPAL calls to get/set the PBCQ Tunnel + BAR Response register. + + Note: as there is only one PBCQ Tunnel BAR register, shared between + all the devices connected to the same PHB, only one of these devices + will be able to use tunneled operations, at any time. + - phb4: set PHB CMPM registers for tunneled operations + + P9 supports PCI tunneled operations (atomics and as_notify) that require + setting the PHB ASN Compare/Mask register with a 16-bit indication. + + This register is currently initialized by enable_capi_mode(). But, as + tunneled operations may also work in PCI mode, the ASN Compare/Mask + register should rather be initialized in phb4_init_ioda3(). + + This patch also adds "ibm,phb-indications" to the device tree, to tell + Linux the values of CAPI, ASN, and NBW indications, when supported. + + Tunneled operations tested by IBM in CAPI mode, by Mellanox Technologies + in PCI mode. + +- Tie tm-suspend fw-feature and opal_reinit_cpus() together + + Currently opal_reinit_cpus(OPAL_REINIT_CPUS_TM_SUSPEND_DISABLED) + always returns OPAL_UNSUPPORTED. + + This ties the tm suspend fw-feature to the + opal_reinit_cpus(OPAL_REINIT_CPUS_TM_SUSPEND_DISABLED) so that when tm + suspend is disabled, we correctly report it to the kernel. 
For + backwards compatibility, it's assumed tm suspend is available if the + fw-feature is not present. + + Currently hostboot will clear fw-feature(TM_SUSPEND_ENABLED) on P9N + DD2.1. P9N DD2.2 will set fw-feature(TM_SUSPEND_ENABLED). DD2.0 and + below have TM disabled completely (not just suspend). + + We are using opal_reinit_cpus() to determine this setting (rather than + the device tree/HDAT) as some future firmware may let us change this + dynamically after boot. That is not the case currently though. + +Power Management +---------------- + +- SLW: Increase stop4-5 residency by 10x + + Using the DGEMM benchmark we observed a 5-9% difference in throughput between + runs with and without stop4/5. In this benchmark the GPU waits on the cpu to wake up + and provide the subsequent data block to compute. The wakeup latency + accumulates over the run and shows up as a performance drop. + + Linux enters stop4/5 more aggressively for its wakeup latency. Increasing + the residency from 1ms to 10ms makes the performance drop <1% +- occ: Set up OCC messaging even if we fail to setup pstates + + This means that we no longer hit this bug if we fail to get valid pstates + from the OCC. :: + + [console-pexpect]#echo 1 > //sys/firmware/opal/sensor_groups//occ-csm0/clear + echo 1 > //sys/firmware/opal/sensor_groups//occ-csm0/clear + [ 94.019971181,5] CPU ATTEMPT TO RE-ENTER FIRMWARE! PIR=083d cpu @0x33cf4000 -> pir=083d token=8 + [ 94.020098392,5] CPU ATTEMPT TO RE-ENTER FIRMWARE! 
PIR=083d cpu @0x33cf4000 -> pir=083d token=8 + [ 10.318805] Disabling lock debugging due to kernel taint + [ 10.318808] Severe Machine check interrupt [Not recovered] + [ 10.318812] NIP [000000003003e434]: 0x3003e434 + [ 10.318813] Initiator: CPU + [ 10.318815] Error type: Real address [Load/Store (foreign)] + [ 10.318817] opal: Hardware platform error: Unrecoverable Machine Check exception + [ 10.318821] CPU: 117 PID: 2745 Comm: sh Tainted: G M 4.15.9-openpower1 #3 + [ 10.318823] NIP: 000000003003e434 LR: 000000003003025c CTR: 0000000030030240 + [ 10.318825] REGS: c00000003fa7bd80 TRAP: 0200 Tainted: G M (4.15.9-openpower1) + [ 10.318826] MSR: 9000000000201002 <SF,HV,ME,RI> CR: 48002888 XER: 20040000 + [ 10.318831] CFAR: 0000000030030258 DAR: 394a00147d5a03a6 DSISR: 00000008 SOFTE: 1 + + +mbox based platforms +^^^^^^^^^^^^^^^^^^^^ + +For platforms using the mbox protocol for host flash access (all BMC based +OpenPOWER systems, most OpenBMC based systems) there have been some hardening +efforts in the event of the BMC being poorly behaved. + +- mbox: Reduce default BMC timeouts + + Rebooting a BMC can take 70 seconds. Skiboot cannot possibly spin for + 70 seconds waiting for a BMC to come back. This also makes the current + default of 30 seconds a bit pointless: it is far too short to be a + worst-case wait time but too long to avoid hitting hardlockup detectors + and wreaking havoc inside host Linux. + + Just change it to three seconds so that host Linux will survive; + reads and writes will fail, but at least the host stays up. + + Also refactored the waiting loop just a bit so that it's easier to read. +- mbox: Harden against BMC daemon errors + + Bugs present in the BMC daemon mean that skiboot gets presented with + mbox windows of size zero. These windows cannot be valid and skiboot + already detects these conditions. + + Currently skiboot warns quite strongly about the occurrence of these + problems. 
The problem for skiboot is that it doesn't take any action. + Initially I wanted to avoid putting policy like this into skiboot but + since these bugs aren't going away and skiboot barfing is leading to + lockups and ultimately the host going down something needs to be done. + + I propose that when we detect the problem we fail the mbox call and punt + the problem back up to Linux. I don't like it but at least it will cause + errors to cascade and won't bring the host down. I'm not sure how Linux + is supposed to detect this or what it can even do but this is better + than a crash. + + Diagnosing a failure to boot if skiboot itself fails to read flash may + be marginally more difficult with this patch. This is because skiboot + will now only print one warning about the zero sized window rather than + continuously spitting it out. + +Fast Reboot Improvements +------------------------ + +Around fast-reboot we have made several improvements to harden the fast +reboot code paths and resort to a full IPL if something doesn't look right. + +- core/fast-reboot: zero memory after fast reboot + + This improves the security and predictability of the fast reboot + environment. + + There can not be a secure fence between fast reboots, because a + malicious OS can modify the firmware itself. However a well-behaved + OS can have a reasonable expectation that OS memory regions it has + modified will be cleared upon fast reboot. + + The memory is zeroed after all other CPUs come up from fast reboot, + just before the new kernel is loaded and booted into. This allows + image preloading to run concurrently, and will allow parallelisation + of the clearing in future. +- core/fast-reboot: verify mem regions before fast reboot + + Run the mem_region sanity checkers before proceeding with fast + reboot. + + This is the beginning of proactive sanity checks on opal data + for fast reboot (which complements the reactive disable_fast_reboot + cases). 
It is encouraged to re-use and share any kind of debug + code and unit test code. +- fast-reboot: occ: Only delete /ibm,opal/power-mgt nodes if they exist +- core/fast-reboot: disable fast reboot upon fundamental entry/exit/locking errors + + This disables fast reboot in several more cases where serious errors + like lock corruption or call re-entrancy are detected. +- capp: Disable fast-reboot whenever enable_capi_mode() is called + + This patch updates phb4_set_capi_mode() to disable fast-reboot + whenever enable_capi_mode() is called, irrespective of its return + value. This guards against the possibility of fast-reboot staying + enabled if a future change to enable_capi_mode() makes it return + an error while leaving CAPP in enabled mode. +- fast-reboot: occ: Delete OCC child nodes in /ibm,opal/power-mgt + + Fast-reboot in P8 fails to re-init OCC data as there are chipwise OCC + nodes which are already present in the /ibm,opal/power-mgt node. These + per-chip nodes hold the voltage IDs for each pstate and these can be + changed on OCC pstate table biasing. So delete these before calling + the re-init code to re-parse and populate the pstate data. + +Debugging/SRESET improvements +----------------------------- + +Since :ref:`skiboot-5.11-rc1`: + +- core/cpu: Prevent clobbering of stack guard for boot-cpu + + Commit 90d53934c2da ("core/cpu: discover stack region size before + initialising memory regions") introduced memzero for struct cpu_thread + in init_cpu_thread(). This has an unintended side effect of clobbering + the stack-guard canary of the boot_cpu stack. This results in opal + failing to init with this failure message: :: + + CPU: P9 generation processor (max 4 threads/core) + CPU: Boot CPU PIR is 0x0004 PVR is 0x004e1200 + Guard skip = 0 + Stack corruption detected ! + Aborting! 
+ CPU 0004 Backtrace: + S: 0000000031c13ab0 R: 0000000030013b0c .backtrace+0x5c + S: 0000000031c13b50 R: 000000003001bd18 ._abort+0x60 + S: 0000000031c13be0 R: 0000000030013bbc .__stack_chk_fail+0x54 + S: 0000000031c13c60 R: 00000000300c5b70 .memset+0x12c + S: 0000000031c13d00 R: 0000000030019aa8 .init_cpu_thread+0x40 + S: 0000000031c13d90 R: 000000003001b520 .init_boot_cpu+0x188 + S: 0000000031c13e30 R: 0000000030015050 .main_cpu_entry+0xd0 + S: 0000000031c13f00 R: 0000000030002700 boot_entry+0x1c0 + + So the patch provides a fix by tweaking the memset() call in + init_cpu_thread() to skip over the stack-guard canary. +- core/lock.c: ensure valid start value for lock spin duration warning + + The previous fix in a8e6cc3f4 only addressed half of the problem, as + we could also get an invalid value for start, causing us to fail + in a weird way. + + This was caught by the testcases.OpTestHMIHandling.HMI_TFMR_ERRORS + test in op-test-framework. + + You'd get to this part of the test and get the erroneous lock + spinning warnings: :: + + PATH=/usr/local/sbin:$PATH putscom -c 00000000 0x2b010a84 0003080000000000 + 0000080000000000 + [ 790.140976993,4] WARNING: Lock has been spinning for 790275ms + [ 790.140976993,4] WARNING: Lock has been spinning for 790275ms + [ 790.140976918,4] WARNING: Lock has been spinning for 790275ms + + This patch checks the validity of timebase before setting start, + and only checks the lock timeout if we got a valid start value. + + +Since :ref:`skiboot-5.10`: + +- core/opal: allow some re-entrant calls + + This allows a small number of OPAL calls to succeed despite re-entering + the firmware, and rejects others rather than aborting. + + This allows a system reset interrupt that interrupts OPAL to do something + useful: sreset other CPUs, use the console (which allows xmon to work or + stack traces to be printed), or reboot the system. 
+ + Use OPAL_INTERNAL_ERROR when rejecting, rather than OPAL_BUSY, which is + used for many other things that do not mean a serious permanent error. +- core/opal: abort in case of re-entrant OPAL call + + The stack is already destroyed by the time we get here, so there + is not much point continuing. +- core/lock: Add lock timeout warnings + + There are currently no timeout warnings for locks in skiboot. We assume + that the lock will eventually become free, which may not always be the + case. + + This patch adds timeout warnings for locks. Any lock which spins for more + than 5 seconds will throw a warning and stacktrace for that thread. This is + useful for debugging situations where a thread hangs, waiting for a + lock to be freed. +- core/lock: Add deadlock detection + + This adds simple deadlock detection. The detection looks for circular + dependencies in the lock requests. It will abort and display a stack trace + when a deadlock occurs. + The detection is enabled by DEBUG_LOCKS (enabled by default). + While the detection may have a slight performance overhead, as there are + not a huge number of locks in skiboot this overhead isn't significant. +- core/hmi: report processor recovery reason from core FIR bits on P9 + + When an error is encountered that causes processor recovery, an HMI is + generated if the recovery was successful. The reason is recorded in + the core FIR, which gets copied into the WOF. + + In this case dump the WOF register and an error string into the OPAL + msglog. 
+ + A broken init setting led to HMIs reported in Linux as: :: + + [ 3.591547] Harmless Hypervisor Maintenance interrupt [Recovered] + [ 3.591648] Error detail: Processor Recovery done + [ 3.591714] HMER: 2040000000000000 + + This patch would have been useful because it tells us exactly that + the problem is in the d-side ERAT: :: + + [ 414.489690798,7] HMI: Received HMI interrupt: HMER = 0x2040000000000000 + [ 414.489693339,7] HMI: [Loc: UOPWR.0000000-Node0-Proc0]: P:0 C:1 T:1: Processor recovery occurred. + [ 414.489699837,7] HMI: Core WOF = 0x0000000410000000 recovered error: + [ 414.489701543,7] HMI: LSU - SRAM (DCACHE parity, etc) + [ 414.489702341,7] HMI: LSU - ERAT multi hit + + In future it will be good to unify this reporting, so Linux could + print something more useful. Until then, this gives some good data. + +NPU2/NVLink2 Fixes +------------------ +- npu2: Add performance tuning SCOM inits + + Peer-to-peer GPU bandwidth latency testing has produced some tunable + values that improve performance. Add them to our device initialization. + + File these under things that need to be cleaned up with nice #defines + for the register names and bitfields when we get time. + + A few of the settings are dependent on the system's particular NVLink + topology, so introduce a helper to determine how many links go to a + single GPU. +- hw/npu2: Assign a unique LPARSHORTID per GPU + + This gets used elsewhere to index items in the XTS tables. +- NPU2: dump NPU2 registers on npu2 HMI + + Due to the nature of debugging npu2 issues, folk are wanting the + full list of NPU2 registers dumped when there's a problem. +- npu2: Remove DD1 support + + Major changes in the NPU between DD1 and DD2 necessitated a fair bit of + revision-specific code. + + Now that all our lab machines are DD2, we no longer test anything on DD1 + and it's time to get rid of it. + + Remove DD1-specific code and abort probe if we're running on a DD1 machine. 
+- npu2: Disable fast reboot + + Fast reboot does not yet work right with the NPU. It's been disabled on + NVLink and OpenCAPI machines. Do the same for NVLink2. + + This amounts to a port of 3e4577939bbf ("npu: Fix broken fast reset") + from the npu code to npu2. +- npu2: Use unfiltered mode in XTS tables + + The XTS_PID context table is limited to 256 possible pids/contexts. To + relieve this limitation, make use of "unfiltered mode" instead. + + If an entry in the XTS_BDF table has the bit for unfiltered mode set, we + can just use one context for that entire bdf/lpar, regardless of pid. + Rather than searching the XTS_PID table, the NMMU checkout request + will simply use the entry indexed by lparshort id. + + Change opal_npu_init_context() to create these lparshort-indexed + wildcard entries (0-15) instead of allocating one for each pid. Check + that multiple calls for the same bdf all specify the same msr value. + + In opal_npu_destroy_context(), continue validating the bdf argument, + ensuring that it actually maps to an lpar, but no longer remove anything + from the XTS_PID table. If/when we start supporting virtualized GPUs, we + might consider actually removing these wildcard entries by keeping a + refcount, but keep things simple for now. + +CAPI/OpenCAPI +------------- + +Since :ref:`skiboot-5.11-rc1`: + +- capi: Poll Err/Status register during CAPP recovery + + This patch updates do_capp_recovery_scoms() to poll the CAPP + Err/Status control register, check for CAPP-Recovery to complete/fail + based on indications of BITS-1,5,9 and then proceed with the + CAPP-Recovery scoms only if recovery completed successfully. This would + prevent cases where we bring up the PCIe link while the recovery sequencer + on CAPP is still busy casting out cache lines. + + In case CAPP-Recovery didn't complete successfully an error is returned + from do_capp_recovery_scoms() asking phb4_creset() to keep the phb4 + fenced and mark it as broken. 
+ + The loop that implements polling of Err/Status register will also log + an error on the PHB when it continues for more than 168ms which is the + max time to failure for CAPP-Recovery. + +Since :ref:`skiboot-5.10`: + +- npu2-opencapi: Add OpenCAPI OPAL API calls + + Add three OPAL API calls that are required by the ocxl driver. + + - OPAL_NPU_SPA_SETUP + + The Shared Process Area (SPA) is a table containing one entry (a + "Process Element") per memory context which can be accessed by the + OpenCAPI device. + + - OPAL_NPU_SPA_CLEAR_CACHE + + The NPU keeps a cache of recently accessed memory contexts. When a + Process Element is removed from the SPA, the cache for the link must be + cleared. + + - OPAL_NPU_TL_SET + + The Transaction Layer specification defines several templates for + messages to be exchanged on the link. During link setup, the host and + device must negotiate what templates are supported on both sides and at + what rates those messages can be sent. +- npu2-opencapi: Train OpenCAPI links and setup devices + + Scan the OpenCAPI links under the NPU, and for each link, reset the card, + set up a device, train the link and register a PHB. + + Implement the necessary operations for the OpenCAPI PHB type. + + For bringup, test and debug purposes, we allow an NVRAM setting, + "opencapi-link-training" that can be set to either disable link training + completely or to use the prbs31 test pattern. + + To disable link training: :: + + nvram -p ibm,skiboot --update-config opencapi-link-training=none + + To use prbs31: :: + + nvram -p ibm,skiboot --update-config opencapi-link-training=prbs31 +- npu2-hw-procedures: Add support for OpenCAPI PHY link training + + Unlike NVLink, which uses the pci-virt framework to fake a PCI + configuration space for NVLink devices, the OpenCAPI device model presents + us with a real configuration space handled by the device over the OpenCAPI + link. 
+ + As a result, we have to train the OpenCAPI link in skiboot before we do PCI + probing, so that config space can be accessed, rather than having link + training triggered by the Linux driver. +- npu2-opencapi: Configure NPU for OpenCAPI + + Scan the device tree for NPUs with OpenCAPI links and configure the NPU per + the initialisation sequence in the NPU OpenCAPI workbook. +- capp: Make error in capp timebase sync a non-fatal error + + Presently when we encounter an error while synchronizing capp timebase + with chip-tod at the end of enable_capi_mode() we return an + error. This has two unintended consequences. First, this will prevent + disabling of fast-reboot even though CAPP is already enabled by this + point. Secondly, failure during timebase sync is a non-fatal error for + capp initialization as CAPP/PSL can continue working after this and an + AFU will only see an error when it tries to read the timebase value + from PSL. + + So this patch updates enable_capi_mode() to not return an error in + case the call to chiptod_capp_timebase_sync() fails. The function will now + just log an error and continue further with the capp init sequence. This + makes the current implementation align with the one in the kernel 'cxl' + driver, which also treats PSL timebase sync errors as non-fatal + init errors. +- npu2-opencapi: Fix assert on link reset during init + + We don't support resetting an opencapi link yet. + + Commit fe6d86b9 ("pci: Make fast reboot creset PHBs in parallel") + tries resetting any PHB whose slot defines a 'run_sm' callback. It + raises an assert when applied to an opencapi PHB, as 'run_sm' calls + the 'freset' callback, which is not yet defined for opencapi. + + Fix it for now by removing the currently useless definition of + 'run_sm' on the opencapi slot. It will print a message in the skiboot + log because the PHB cannot be reset, which is correct. It will all go + away when we add support for resetting an opencapi link. 
+- capp: Add lid definition for P9 DD-2.2 + + Update fsp_lid_map to include CAPP ucode lid for phb4-chipid == + 0x202d1 that corresponds to P9 DD-2.2 chip. +- capp: Disable fast-reboot when capp is enabled + + +PCI +--- + +Since :ref:`skiboot-5.11-rc1`: + +- phb4: Reset FIR/NFIR registers before PHB4 probe + + The function phb4_probe_stack() resets "ETU Reset Register" to + unfreeze the PHB before it performs mmio access on the PHB. However in + case the FIR/NFIR registers are set while entering this function, + the reset of "ETU Reset Register" won't unfreeze the PHB and it will + remain fenced. This leads to failure during initial CRESET of the PHB + as mmio access is still not enabled and an error message of the form + below is logged: :: + + PHB#0000[0:0]: Initializing PHB4... + PHB#0000[0:0]: Default system config: 0xffffffffffffffff + PHB#0000[0:0]: New system config : 0xffffffffffffffff + PHB#0000[0:0]: Initial PHB CRESET is 0xffffffffffffffff + PHB#0000[0:0]: Waiting for DLP PG reset to complete... + <snip> + PHB#0000[0:0]: Timeout waiting for DLP PG reset ! + PHB#0000[0:0]: Initialization failed + + This is especially seen happening during the MPIPL flow where SBE + would quiesce and fence the PHB so that it doesn't stomp on the main + memory. However when skiboot enters phb4_probe_stack() after MPIPL, + the FIR/NFIR registers are set forcing PHB to re-enter fence after ETU + reset is done. + + So to fix this issue the patch introduces new xscom writes to + phb4_probe_stack() to reset the FIR/NFIR registers before performing + ETU reset to enable mmio access to the PHB. 
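The ordering problem in the phb4 entry above can be illustrated with a toy model. This is a sketch only: the `ToyPHB` class, its fields and its method names are hypothetical and do not correspond to skiboot's data structures or to real register layouts. The single behaviour modelled is the one stated in the note, namely that an ETU reset does not lift the fence while FIR/NFIR bits are still set.

```python
from dataclasses import dataclass

ALL_ONES = 0xFFFFFFFFFFFFFFFF


@dataclass
class ToyPHB:
    """Hypothetical stand-in for a PHB4 stack; not skiboot's real structure."""
    fir: int = 0
    nfir: int = 0
    fenced: bool = False

    def clear_firs(self) -> None:
        # Models the new xscom writes that reset FIR/NFIR.
        self.fir = 0
        self.nfir = 0

    def etu_reset(self) -> None:
        # Per the note: the reset only unfreezes the PHB when no FIR/NFIR
        # bits are pending; otherwise the PHB re-enters fence.
        self.fenced = bool(self.fir or self.nfir)

    def mmio_read(self) -> int:
        # A fenced PHB is modelled as reading all ones, matching the
        # 0xffffffffffffffff values in the log above.
        return ALL_ONES if self.fenced else 0x0


def probe(phb: ToyPHB, clear_first: bool) -> bool:
    """Return True when MMIO access works after the probe sequence."""
    if clear_first:
        phb.clear_firs()
    phb.etu_reset()
    return phb.mmio_read() != ALL_ONES
```

Starting from the post-MPIPL state (`ToyPHB(fir=1, fenced=True)`), `probe(..., clear_first=False)` leaves the model fenced, while `probe(..., clear_first=True)` restores MMIO access, which is the effect the patch aims for.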
+ +Since :ref:`skiboot-5.10`: + +- pci: Reduce log level of error message + + If a link doesn't train, we can end up with error messages like this: :: + + [ 63.027261959,3] PHB#0032[8:2]: LINK: Timeout waiting for electrical link + [ 63.027265573,3] PHB#0032:00:00.0 Error -6 resetting + + The first message is useful but the second message is just debug from + the core PCI code and is confusing to print to the console. + + This reduces the second print to debug level so it's not seen by the + console by default. +- Revert "platforms/astbmc/slots.c: Allow comparison of bus numbers when matching slots" + + This reverts commit bda7cc4d0354eb3f66629d410b2afc08c79f795f. + + Ben says: + It's on purpose that we do NOT compare the bus numbers, + they are always 0 in the slot table + we do a hierarchical walk of the tree, matching only the + devfn's along the way bcs the bus numbering isn't fixed + this breaks all slot naming etc... stuff on anything using + the "skiboot" slot tables (P8 opp typically) +- core/pci-dt-slot: Fix booting with no slot map + + Currently if you don't have a slot map in the device tree in + /ibm,pcie-slots, you can crash with a back trace like this: :: + + CPU 0034 Backtrace: + S: 0000000031cd3370 R: 000000003001362c .backtrace+0x48 + S: 0000000031cd3410 R: 0000000030019e38 ._abort+0x4c + S: 0000000031cd3490 R: 000000003002760c .exception_entry+0x180 + S: 0000000031cd3670 R: 0000000000001f10 * + S: 0000000031cd3850 R: 00000000300b4f3e * cpu_features_table+0x1d9e + S: 0000000031cd38e0 R: 000000003002682c .dt_node_is_compatible+0x20 + S: 0000000031cd3960 R: 0000000030030e08 .map_pci_dev_to_slot+0x16c + S: 0000000031cd3a30 R: 0000000030091054 .dt_slot_get_slot_info+0x28 + S: 0000000031cd3ac0 R: 000000003001e27c .pci_scan_one+0x2ac + S: 0000000031cd3ba0 R: 000000003001e588 .pci_scan_bus+0x70 + S: 0000000031cd3cb0 R: 000000003001ee74 .pci_scan_phb+0x100 + S: 0000000031cd3d40 R: 0000000030017ff0 .cpu_process_jobs+0xdc + S: 0000000031cd3e00 R: 
0000000030014cb0 .__secondary_cpu_entry+0x44 + S: 0000000031cd3e80 R: 0000000030014d04 .secondary_cpu_entry+0x34 + S: 0000000031cd3f00 R: 0000000030002770 secondary_wait+0x8c + [ 73.016947149,3] Fatal MCE at 0000000030026054 .dt_find_property+0x30 + [ 73.017073254,3] CFAR : 0000000030026040 + [ 73.017138048,3] SRR0 : 0000000030026054 SRR1 : 9000000000201000 + [ 73.017198375,3] HSRR0: 0000000000000000 HSRR1: 0000000000000000 + [ 73.017263210,3] DSISR: 00000008 DAR : 7c7b1b7848002524 + [ 73.017352517,3] LR : 000000003002602c CTR : 000000003009102c + [ 73.017419778,3] CR : 20004204 XER : 20040000 + [ 73.017502425,3] GPR00: 000000003002682c GPR16: 0000000000000000 + [ 73.017586924,3] GPR01: 0000000031c23670 GPR17: 0000000000000000 + [ 73.017643873,3] GPR02: 00000000300fd500 GPR18: 0000000000000000 + [ 73.017767091,3] GPR03: fffffffffffffff8 GPR19: 0000000000000000 + [ 73.017855707,3] GPR04: 00000000300b3dc6 GPR20: 0000000000000000 + [ 73.017943944,3] GPR05: 0000000000000000 GPR21: 00000000300bb6d2 + [ 73.018024709,3] GPR06: 0000000031c23910 GPR22: 0000000000000000 + [ 73.018117716,3] GPR07: 0000000031c23930 GPR23: 0000000000000000 + [ 73.018195974,3] GPR08: 0000000000000000 GPR24: 0000000000000000 + [ 73.018278350,3] GPR09: 0000000000000000 GPR25: 0000000000000000 + [ 73.018353795,3] GPR10: 0000000000000028 GPR26: 00000000300be6fb + [ 73.018424362,3] GPR11: 0000000000000000 GPR27: 0000000000000000 + [ 73.018533159,3] GPR12: 0000000020004208 GPR28: 0000000030767d38 + [ 73.018642725,3] GPR13: 0000000031c20000 GPR29: 00000000300b3dc6 + [ 73.018737925,3] GPR14: 0000000000000000 GPR30: 0000000000000010 + [ 73.018794428,3] GPR15: 0000000000000000 GPR31: 7c7b1b7848002514 + + This has been seen in the lab on a witherspoon using the device tree + entry point (ie. no HDAT). + + This fixes the null pointer deref. 
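The slot-table revert earlier in this PCI list rests on matching devfn values along a hierarchical walk and ignoring bus numbers. The sketch below illustrates that idea under stated assumptions: the nested-dict table layout, the slot names and the helpers `devfn` and `find_slot` are all hypothetical and are not skiboot's slot-table format. The function also returns `None` when no table is supplied, which is only loosely in the spirit of the pci-dt-slot fix; the real fix lives in different code.

```python
from typing import Dict, List, Optional


def devfn(dev: int, fn: int) -> int:
    """Pack a PCI device/function pair: device in bits 7:3, function in bits 2:0."""
    return ((dev & 0x1F) << 3) | (fn & 0x7)


def find_slot(table: Optional[Dict[int, dict]], path: List[int]) -> Optional[str]:
    """Walk the table along a root-to-device list of devfns.

    Bus numbers never appear: they are assigned during the scan and are
    not fixed, so only the devfn at each level is compared.
    """
    if not table or not path:
        return None
    entry = table.get(path[0])
    for step in path[1:]:
        if entry is None:
            return None
        entry = entry.get("children", {}).get(step)
    return entry.get("name") if entry else None


# Hypothetical table: one root port with two named slots behind it.
EXAMPLE_TABLE = {
    devfn(0, 0): {
        "name": None,
        "children": {
            devfn(1, 0): {"name": "Slot1"},
            devfn(2, 0): {"name": "Slot2"},
        },
    },
}
```

With this table, `find_slot(EXAMPLE_TABLE, [devfn(0, 0), devfn(2, 0)])` resolves to the second slot regardless of which bus number the device ended up on, and a lookup with a missing table or an unknown devfn yields `None` instead of dereferencing a missing entry.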
+ +Bugs Fixed +---------- +Since :ref:`skiboot-5.11-rc1`: + +- cpufeatures: Fix setting DARN and SCV HWCAP feature bits + + DARN and SCV have been assigned AT_HWCAP2 (32-63) bits: :: + + #define PPC_FEATURE2_DARN 0x00200000 /* darn random number insn */ + #define PPC_FEATURE2_SCV 0x00100000 /* scv syscall */ + + A cpufeatures-aware OS will not advertise these to userspace without + this patch. +- xive: disable store EOI support + + Hardware has limitations which would require putting a sync after each + store EOI to make sure the MMIO operations that change the ESB state + are ordered. This is a killer for performance and the PHBs do not + support the sync. So remove the store EOI for the moment, until + hardware is improved. + + Also, while we are changing the XIVE source flags, let's fix the + settings for the PHB4s which should follow these rules : + + - SHIFT_BUG for DD10 + - STORE_EOI for DD20 and if enabled + - TRIGGER_PAGE for DDx0 and if not STORE_EOI + +Since :ref:`skiboot-5.10`: + +- xive: fix opal_xive_set_vp_info() error path + + In case of error, opal_xive_set_vp_info() will return without + unlocking the xive object. This is most certainly a typo. +- hw/imc: don't access homer memory if it was not initialised + + This can happen under mambo, at least. +- nvram: run nvram_validate() after nvram_reformat() + + nvram_reformat() sets nvram_valid = true, but it does not set + skiboot_part_hdr. Call nvram_validate() instead, which sets + everything up properly. +- dts: Zero struct to avoid using uninitialised value +- hw/imc: Don't dereference possible NULL +- libstb/create-container: munmap() signature file address +- npu2-opencapi: Fix memory leak +- npu2: Fix possible NULL dereference +- occ-sensors: Remove NULL checks after dereference +- core/ipmi-opal: Add interrupt-parent property for ipmi node on P9 and above. + + dtc complains with the warning below on newer 4.2+ kernels. 
:: + + dts: Warning (interrupts_property): Missing interrupt-parent for /ibm,opal/ipmi + + This fix adds the interrupt-parent property under the /ibm,opal/ipmi DT node on P9 + and above, which allows ipmi-opal to properly use the OPAL irqchip. + +Other fixes and improvements +---------------------------- + +- core/cpu: discover stack region size before initialising memory regions + + Stack allocation first allocates a memory region sized to hold stacks + for all possible CPUs up to the maximum PIR of the architecture, zeros + the region, then initialises all stacks. Max PIR is 32768 on POWER9, + which is 512MB for stacks. + + The stack region is then shrunk after CPUs are discovered, but this is + a bit of a hack, and it leaves a hole in the memory allocation regions + as it's done after mem regions are initialised. :: + + 0x000000000000..00002fffffff : ibm,os-reserve - OS + 0x000030000000..0000303fffff : ibm,firmware-code - OPAL + 0x000030400000..000030ffffff : ibm,firmware-heap - OPAL + 0x000031000000..000031bfffff : ibm,firmware-data - OPAL + 0x000031c00000..000031c0ffff : ibm,firmware-stacks - OPAL + *** gap *** + 0x000051c00000..000051d01fff : ibm,firmware-allocs-memory@0 - OPAL + 0x000051d02000..00007fffffff : ibm,firmware-allocs-memory@0 - OS + 0x000080000000..000080b3cdff : initramfs - OPAL + 0x000080b3ce00..000080b7cdff : ibm,fake-nvram - OPAL + 0x000080b7ce00..0000ffffffff : ibm,firmware-allocs-memory@0 - OS + + This change moves zeroing into the per-cpu stack setup. The boot CPU + stack is set up based on the current PIR. Then the size of the stack + region is set, by discovering the maximum PIR of the system from the + device tree, before mem regions are initialised. + + This results in all memory being accounted for within memory regions, + and less memory fragmentation of OPAL allocations. +- Make gard display show that a record is cleared + + When clearing gard records, Hostboot only modifies the record_id
The remainder of the entry remains. + Without this change it can be hard for users to tell that + the record they are looking at is no longer valid. +- Reserve OPAL API number for opal_handle_hmi2 function. +- dts: spl_wakeup: Remove all workarounds in the spl wakeup logic + + We coded a few workarounds in the special wakeup logic to handle + buggy firmware. Now that the firmware is fixed, remove them as they break the + special wakeup protocol. As per the spec we should not de-assert + before assert is complete. So follow this protocol. +- build: use thin archives rather than incremental linking + + This changes the build system to use thin archives rather than + incremental linking for built-in.o, similar to a recent change to Linux. + built-in.o is renamed to built-in.a, and is created as a thin archive + with no index, for speed and size. All built-in.a are aggregated into + a skiboot.tmp.a which is a thin archive built with an index, making it + suitable for linking. This is input into the final link. + + The advantages of build size and linker code placement flexibility are + not as great with skiboot as with a bigger project like Linux, but it's a + conceptually better way to build, and is more compatible with link + time optimisation in toolchains which might be interesting for skiboot + particularly for size reductions. + + Size of build tree before this patch is 34.4MB, afterwards 23.1MB. +- core/init: Assert when kernel not found + + If the kernel doesn't load out of flash or there is nothing at + KERNEL_LOAD_BASE, we end up with an esoteric message as we try to + branch out of skiboot into nothing :: + + [ 0.007197688,3] INIT: ELF header not found. Assuming raw binary.
+ [ 0.014035267,5] INIT: Starting kernel at 0x0, fdt at 0x3044ad90 13029 + [ 0.014042254,3] *********************************************** + [ 0.014069947,3] Fatal Exception 0xe40 at 0000000000000000 + [ 0.014085574,3] CFAR : 00000000300051c4 + [ 0.014090118,3] SRR0 : 0000000000000000 SRR1 : 0000000000000000 + [ 0.014096243,3] HSRR0: 0000000000000000 HSRR1: 9000000000001000 + [ 0.014102546,3] DSISR: 00000000 DAR : 0000000000000000 + [ 0.014108538,3] LR : 00000000300144c8 CTR : 0000000000000000 + [ 0.014114756,3] CR : 40002202 XER : 00000000 + [ 0.014120301,3] GPR00: 000000003001447c GPR16: 0000000000000000 + + This improves the message and asserts in this case: :: + + [ 0.014042685,5] INIT: Starting kernel at 0x0, fdt at 0x3044ad90 13049 bytes) + [ 0.014049556,0] FATAL: Kernel is zeros, can't execute! + [ 0.014054237,0] Assert fail: core/init.c:566:0 + [ 0.014060472,0] Aborting! +- core: Fix 'opal-runtime-size' property + + We are populating 'opal-runtime-size' before calculating the actual stack size. + Hence we end up with the wrong runtime size (e.g. on P9 it shows ~540MB while + the actual size is around 40MB). Note that only the device tree property shows + the wrong value; reserved-memory reflects the correct size. + + init_all_cpus() calculates and updates the actual stack size. Hence move this + function call before add_opal_node(). + +- mambo: Add fw-feature flags for security related settings + + Newer firmwares report some feature flags related to security + settings via HDAT. On real hardware skiboot translates these into + device tree properties. For testing purposes just create the + properties manually in the tcl. + + These values don't exactly match any actual chip revision, but the + code should not rely on any exact set of values anyway. We just define + the most interesting flags, which if toggled to "disable" will change + Linux behaviour. You can see the actual values in the hostboot source + in src/usr/hdat/hdatiplparms.H.
+ + Also add an environment variable for easily toggling the top-level + "security on" setting. +- direct-controls: mambo fix for multiple chips +- libflash/blocklevel: Correct miscalculation in blocklevel_smart_erase() + + If blocklevel_smart_erase() detects that the smart erase fits entirely in + one erase block, it has an early bail path. In this path it miscalculates + where in the buffer the backend needs to read from to perform the final + write. +- libstb/secureboot: Fix logging of secure verify messages. + + Currently we are logging secure verify/enforce messages at PR_EMERG + level even when there is no secureboot mode enabled. So reduce the + log level to PR_ERR when secureboot mode is OFF. + +Testing / Code coverage improvements +------------------------------------ + +Improvements in gcov support include support for newer GCCs as well +as easily exporting the area of memory you need to dump to feed to +`extract-gcov`. + +- cpu_idle_job: relax a bit + + This *dramatically* improves kernel boot time with GCOV builds + + from ~3minutes between loading kernel and switching the HILE + bit down to around 10 seconds. +- gcov: Another GCC, another gcov tweak +- Keep constructors with priorities + + Fixes GCOV builds with gcc7, which uses this. +- gcov: Add gcov data struct to sysfs + + Extracting the skiboot gcov data is currently a tedious process which + involves taking a mem dump of skiboot and searching for the gcov_info + struct. + This patch adds the gcov struct to sysfs under /opal/exports, allowing the + data to be copied directly into userspace and processed. + diff --git a/roms/skiboot/doc/release-notes/skiboot-5.2.0-rc1.rst b/roms/skiboot/doc/release-notes/skiboot-5.2.0-rc1.rst new file mode 100644 index 000000000..eb50891bf --- /dev/null +++ b/roms/skiboot/doc/release-notes/skiboot-5.2.0-rc1.rst @@ -0,0 +1,265 @@ +skiboot-5.2.0-rc1 +================= + +skiboot-5.2.0-rc1 was released on Friday Feb 26th, 2016.
+ +skiboot-5.2.0-rc1 is the first release candidate of skiboot 5.2, which will +become the new stable release of skiboot following the 5.1 release, first +released August 17th, 2015. + +skiboot-5.2.0-rc1 contains all bug fixes as of skiboot-5.1.13. + +This is the second release that will follow the (now documented) Skiboot +stable rules - see :ref:`stable-rules`. + +The current plan is to release skiboot-5.2.0 mid-March 2016, with a focus on +bug fixing for future 5.2.0-rc releases. + +New Features +^^^^^^^^^^^^ + +Over skiboot-5.1, the following features have been added: + +- Naples (P8', i.e. P8 with NVLINK) processor support, including NVLINK. +- Improvements in gard, libflash/pflash and opal-prd utilities + + - increased testing + - increased usability + - systemd scripts for opal-prd + - pflash can now use the /dev/mtd device to access BMC flash rather than + accessing it directly. It is *important* that you use --mtd if your + BMC may otherwise know how to interact with its own flash. +- support for Micron N25Q256Ax and N25Qx256Ax NOR flash. +- support for Winbond W25Q256BV NOR flash +- support for an emulated ("fake") RTC clock, useful in simulators + and during bringup +- Explicit 1:1 mapping in ranges properties have been added to PCI + bridges. This allows a neat trick with offb and VGA ports that should + probably not be told to young children. +- Added support to read the V2 format of the OCC-OPAL memory region, + which supports Workload Optimized Frequency (WOF) + +Changes in behavior +^^^^^^^^^^^^^^^^^^^ + +- Assigning OPAL IDs to PHBs is now fixed and based on the chip id and PHB + index on that chip. On POWER7, we continue to use allocated numbers. +- We now query the BMC for BT capabilities rather than making assumptions + +Removed support +^^^^^^^^^^^^^^^ + +- p5ioc2 is no longer supported. + This affects a grand total of two POWER7 systems in the world. 
+ +**NOTE**: It is planned that skiboot-5.2 will be the last release supporting +POWER7 machines. + +Bugs fixed +^^^^^^^^^^ + +- PHB3: Fix unexpected ER (all) on errinjct by PCI config +- hw/bt: timeout messages when BT interface isn't functional +- On Habanero, Slot3 should have been "Slot 3". +- We now completely flush the console buffer before power down and reboot +- For chips with ibm,occ-functional-state set to false, we don't wait + for the OCC to start. This caused needless delay in booting on simulators + which did not simulate OCCs. +- Change OCC reset order to always reset slave OCCs first. +- slw: Remove overwrites for EX_PM_CORE_ECO_VRET and EX_PM_CORE_PFET_VRET + (these were already initialized in hostboot) +- p8-i2c: send stop bit on timeouts. + Some devices can otherwise leave the bus in a held state. + +Other improvements +^^^^^^^^^^^^^^^^^^ + +- many fixes of compiler and static analysis warnings +- increased unit test coverage +- Unit test of "boot debian jessie installer" +- ability to plug in other simulators to run existing tests (e.g. simulator for + non pegasus p8) +- Support using (patched) Qemu with PowerNV platform support for running + unit tests. +- increased support for running with sparse +- We now build with -fstack-protector-strong if supported by the compiler +- We now build with -Werror for -Wformat +- pflash is now built as part of travis-ci and for Coverity Scan. +- There is now a RPM SPEC file that can be used as the basis for packaging + skiboot and associated utilities. + +Contributors +------------ + +We have had a number of improvements in workflow over skiboot-5.1.0. Looking +back, we have roughly the same number of changesets (372 for 5.1.0, 334 for +5.2.0-rc1 - even closer for 5.1.0-beta1) which indicates a relatively stable +rate of development. 
+ +Complete statistics are included below (generated by gitdm), but I'd like to +draw attention to a couple of stats: + +======== ====== ======= ======= ====== ======== +Release csets Ack Reviews Tested Reported +======== ====== ======= ======= ====== ======== +5.0 329 15 20 1 0 +5.1 372 13 38 1 4 +5.2-rc1 334 20 34 6 11 +======== ====== ======= ======= ====== ======== + +Overall, it looks like we're on the right trajectory for increasing the number +of eyeballs looking at code before it heads in tree, especially around testing. +Largely, this increase in Tested-by can be attributed to encouraging the +existing test teams to start commenting on the patches themselves. + +Anyway, here's the full stats from skiboot 5.1.0 to 5.2.0-rc1: + +Processed 334 csets from 27 developers +2 employers found +A total of 46172 lines added, 23274 removed (delta 22898) + +Developers with the most changesets + +========================== =========== +========================== =========== +Stewart Smith 146 (43.7%) +Cyril Bur 52 (15.6%) +Benjamin Herrenschmidt 15 (4.5%) +Joel Stanley 12 (3.6%) +Gavin Shan 12 (3.6%) +Alistair Popple 10 (3.0%) +Vasant Hegde 10 (3.0%) +Michael Neuling 10 (3.0%) +Russell Currey 9 (2.7%) +Cédric Le Goater 8 (2.4%) +Jeremy Kerr 8 (2.4%) +Samuel Mendoza-Jonas 6 (1.8%) +Neelesh Gupta 6 (1.8%) +Shilpasri G Bhat 4 (1.2%) +Oliver O'Halloran 4 (1.2%) +Mahesh Salgaonkar 4 (1.2%) +Vipin K Parashar 3 (0.9%) +Daniel Axtens 3 (0.9%) +Andrew Donnellan 2 (0.6%) +Philippe Bergheaud 2 (0.6%) +Ananth N Mavinakayanahalli 2 (0.6%) +Vaibhav Jain 1 (0.3%) +Sam Mendoza-Jonas 1 (0.3%) +Adriana Kobylak 1 (0.3%) +Shreyas B. 
Prabhu 1 (0.3%) +Vaidyanathan Srinivasan 1 (0.3%) +Ian Munsie 1 (0.3%) +========================== =========== + +Developers with the most changed lines + + +========================== ============= +========================== ============= +Stewart Smith 19533 (39.4%) +Oliver O'Halloran 17920 (36.1%) +Alistair Popple 3285 (6.6%) +Daniel Axtens 2154 (4.3%) +Cyril Bur 2028 (4.1%) +Benjamin Herrenschmidt 941 (1.9%) +Neelesh Gupta 434 (0.9%) +Gavin Shan 294 (0.6%) +Russell Currey 261 (0.5%) +Vasant Hegde 245 (0.5%) +Cédric Le Goater 209 (0.4%) +Vipin K Parashar 155 (0.3%) +Shilpasri G Bhat 153 (0.3%) +Joel Stanley 140 (0.3%) +Vaidyanathan Srinivasan 135 (0.3%) +Michael Neuling 111 (0.2%) +Samuel Mendoza-Jonas 81 (0.2%) +Jeremy Kerr 60 (0.1%) +Mahesh Salgaonkar 58 (0.1%) +Vaibhav Jain 50 (0.1%) +Ananth N Mavinakayanahalli 43 (0.1%) +Shreyas B. Prabhu 17 (0.0%) +Sam Mendoza-Jonas 12 (0.0%) +Andrew Donnellan 10 (0.0%) +Ian Munsie 8 (0.0%) +Philippe Bergheaud 6 (0.0%) +Adriana Kobylak 6 (0.0%) +========================== ============= + +Developers with the most lines removed + +========================= ============= +========================= ============= +Daniel Axtens 2149 (9.2%) +Shreyas B. 
Prabhu 17 (0.1%) +Andrew Donnellan 9 (0.0%) +Vipin K Parashar 2 (0.0%) +========================= ============= + +Developers with the most signoffs (total 190) + +========================= ============= +========================= ============= +Stewart Smith 188 (98.9%) +Gavin Shan 1 (0.5%) +Neelesh Gupta 1 (0.5%) +========================= ============= + +Developers with the most reviews (total 34) + +========================= ============= +========================= ============= +Patrick Williams 5 (14.7%) +Joel Stanley 5 (14.7%) +Cédric Le Goater 5 (14.7%) +Vasant Hegde 4 (11.8%) +Alistair Popple 4 (11.8%) +Sam Mendoza-Jonas 3 (8.8%) +Samuel Mendoza-Jonas 3 (8.8%) +Andrew Donnellan 2 (5.9%) +Cyril Bur 2 (5.9%) +Vaibhav Jain 1 (2.9%) +========================= ============= + +Developers with the most test credits (total 6) + +========================= ============= +========================= ============= +Vipin K Parashar 3 (50.0%) +Vaibhav Jain 2 (33.3%) +Gajendra B Bandhu1 1 (16.7%) +========================= ============= + +Developers who gave the most tested-by credits (total 6) + +=========================== ============= +=========================== ============= +Gavin Shan 2 (33.3%) +Ananth N Mavinakayanahalli 2 (33.3%) +Alistair Popple 1 (16.7%) +Stewart Smith 1 (16.7%) +=========================== ============= + +Developers with the most report credits (total 11) + +========================= ============= +========================= ============= +Vaibhav Jain 2 (18.2%) +Paul Nguyen 2 (18.2%) +Alistair Popple 1 (9.1%) +Cédric Le Goater 1 (9.1%) +Aneesh Kumar K.V 1 (9.1%) +Dionysius d. 
Bell 1 (9.1%) +Pradeep Ramanna 1 (9.1%) +John Walthour 1 (9.1%) +Benjamin Herrenschmidt 1 (9.1%) +========================= ============= + +Developers who gave the most report credits (total 11) + +========================= ============= +========================= ============= +Gavin Shan 6 (54.5%) +Stewart Smith 3 (27.3%) +Samuel Mendoza-Jonas 1 (9.1%) +Shilpasri G Bhat 1 (9.1%) +========================= ============= diff --git a/roms/skiboot/doc/release-notes/skiboot-5.2.0-rc2.rst b/roms/skiboot/doc/release-notes/skiboot-5.2.0-rc2.rst new file mode 100644 index 000000000..5207522ee --- /dev/null +++ b/roms/skiboot/doc/release-notes/skiboot-5.2.0-rc2.rst @@ -0,0 +1,72 @@ +skiboot-5.2.0-rc2 +================= + +skiboot-5.2.0-rc2 was released on Wednesday March 9th, 2016. + +skiboot-5.2.0-rc2 is the second release candidate of skiboot 5.2, which will +become the new stable release of skiboot following the 5.1 release, first +released August 17th, 2015. + +skiboot-5.2.0-rc2 contains all bug fixes as of skiboot-5.1.14. + +This is the second release that will follow the (now documented) Skiboot +stable rules - see :ref:`stable-rules`. + +The current plan is to release skiboot-5.2.0 mid-March 2016, with a focus on +bug fixing for future 5.2.0-rc releases (if any - I hope this will be the last) + +Over skiboot-5.2.0-rc1, we have the following changes: + +New platform! 
+^^^^^^^^^^^^^ + +- Add Barreleye platform + +Generic +^^^^^^^ + +- hw/p8-i2c: Speed up SMBUS_WRITE +- Fix early backtraces + +FSP Platforms +^^^^^^^^^^^^^ + +- fsp-sensor: rework device tree for sensors +- platforms/firenze: Fix I2C clock source frequency + +Simics simulator +^^^^^^^^^^^^^^^^ + +- Enable Simics UART console + +Mambo simulator +^^^^^^^^^^^^^^^ + +- platforms/mambo: Add terminate callback + + - fix hang in multi-threaded mambo + - add multithreaded mambo tests + +IPMI +^^^^ + +- hw/ipmi: fix event data 1 for System Firmware Progress sensor +- ipmi: Log exact NetFn value in OPAL logs + +AST BMC based platforms +^^^^^^^^^^^^^^^^^^^^^^^ + +- hw/bt: allow BT driver to use different buffer size + +opal-prd utility +^^^^^^^^^^^^^^^^ + +- opal-prd: Add debug output for firmware-driven OCC events + We indicate when we have a user-driven event, so add corresponding + outputs for firmware-driven ones too. + +getscom utility +^^^^^^^^^^^^^^^ + +- Add Naples chip support + diff --git a/roms/skiboot/doc/release-notes/skiboot-5.2.0.rst b/roms/skiboot/doc/release-notes/skiboot-5.2.0.rst new file mode 100644 index 000000000..9ff6bcd94 --- /dev/null +++ b/roms/skiboot/doc/release-notes/skiboot-5.2.0.rst @@ -0,0 +1,327 @@ +.. _skiboot-5.2.0: + +skiboot-5.2.0 +============= + +skiboot-5.2.0 was released on Wednesday March 16th, 2016. + +skiboot-5.2.0 is the first stable release of skiboot 5.2, the new stable +release of skiboot, which will take over from the 5.1.x series which was +first released August 17th, 2015. + +skiboot-5.2.0 contains all bug fixes as of skiboot-5.1.15. + +This is the second release that will follow the (now documented) Skiboot +stable rules - see :ref:`stable-rules`. + +Changes since rc2 +----------------- +Over skiboot-5.2.0-rc2, the following fixes are included: + +- Include 'extract-gcov' in make clean. 
+- ipmi-sel: Fix esel event logger to handle early boot PANIC events +- IPMI: Enable synchronous eSEL logging option (for PANIC events) +- libflash/libffs: Reporting seeing all 0xFF bytes during init. +- ipmi-sel: Fix memory leak in error path + +Changes since rc1 +----------------- +Over skiboot-5.2.0-rc1, we have the following changes: + +- Add Barreleye platform + +Generic +^^^^^^^ + +- hw/p8-i2c: Speed up SMBUS_WRITE +- Fix early backtraces + +FSP Platforms +^^^^^^^^^^^^^ + +- fsp-sensor: rework device tree for sensors +- platforms/firenze: Fix I2C clock source frequency + +Simics simulator +^^^^^^^^^^^^^^^^ + +- Enable Simics UART console + +Mambo simulator +^^^^^^^^^^^^^^^ + +- platforms/mambo: Add terminate callback + + - fix hang in multi-threaded mambo + - add multithreaded mambo tests + +IPMI +^^^^ + +- hw/ipmi: fix event data 1 for System Firmware Progress sensor +- ipmi: Log exact NetFn value in OPAL logs + +AST BMC based platforms +^^^^^^^^^^^^^^^^^^^^^^^ + +- hw/bt: allow BT driver to use different buffer size + +opal-prd utility +^^^^^^^^^^^^^^^^ + +- opal-prd: Add debug output for firmware-driven OCC events + We indicate when we have a user-driven event, so add corresponding + outputs for firmware-driven ones too. + +getscom utility +^^^^^^^^^^^^^^^ + +- Add Naples chip support + +New Features +^^^^^^^^^^^^ +Over skiboot-5.1, the following features have been added: + +- Naples (P8', i.e. P8 with NVLINK) processor support, including NVLINK. +- Improvements in gard, libflash/pflash and opal-prd utilities + + - increased testing + - increased usability + - systemd scripts for opal-prd + - pflash can now use the /dev/mtd device to access BMC flash rather than + accessing it directly. It is *important* that you use --mtd if your + BMC may otherwise know how to interact with its own flash. +- support for Micron N25Q256Ax and N25Qx256Ax NOR flash. 
+- support for Winbond W25Q256BV NOR flash +- support for an emulated ("fake") RTC clock, useful in simulators + and during bringup +- Explicit 1:1 mapping in ranges properties have been added to PCI + bridges. This allows a neat trick with offb and VGA ports that should + probably not be told to young children. +- Added support to read the V2 format of the OCC-OPAL memory region, + which supports Workload Optimized Frequency (WOF) + +Changes in behavior +^^^^^^^^^^^^^^^^^^^ + +- Assigning OPAL IDs to PHBs is now fixed and based on the chip id and PHB + index on that chip. On POWER7, we continue to use allocated numbers. +- We now query the BMC for BT capabilities rather than making assumptions + +Removed support +^^^^^^^^^^^^^^^ + +- p5ioc2 is no longer supported. + This affects a grand total of two POWER7 systems in the world. + +**NOTE**: It is planned that skiboot-5.2 will be the last release supporting +POWER7 machines. + +Bugs fixed +^^^^^^^^^^ + +- PHB3: Fix unexpected ER (all) on errinjct by PCI config +- hw/bt: timeout messages when BT interface isn't functional +- On Habanero, Slot3 should have been "Slot 3". +- We now completely flush the console buffer before power down and reboot +- For chips with ibm,occ-functional-state set to false, we don't wait + for the OCC to start. This caused needless delay in booting on simulators + which did not simulate OCCs. +- Change OCC reset order to always reset slave OCCs first. +- slw: Remove overwrites for EX_PM_CORE_ECO_VRET and EX_PM_CORE_PFET_VRET + (these were already initialized in hostboot) +- p8-i2c: send stop bit on timeouts. + Some devices can otherwise leave the bus in a held state. + +Other improvements +^^^^^^^^^^^^^^^^^^ + +- many fixes of compiler and static analysis warnings +- increased unit test coverage +- Unit test of "boot debian jessie installer" +- ability to plug in other simulators to run existing tests (e.g. 
simulator for + non pegasus p8) +- Support using (patched) Qemu with PowerNV platform support for running + unit tests. +- increased support for running with sparse +- We now build with -fstack-protector-strong if supported by the compiler +- We now build with -Werror for -Wformat +- pflash is now built as part of travis-ci and for Coverity Scan. +- There is now a RPM SPEC file that can be used as the basis for packaging + skiboot and associated utilities. + +Contributors +------------ + +We have had a number of improvements in workflow over skiboot-5.1.0. Looking +back, we have roughly the same number of changesets (372 for 5.1.0, 334 for +5.2.0-rc1 - even closer for 5.1.0-beta1) which indicates a relatively stable +rate of development. + +Complete statistics are included below (generated by gitdm), but I'd like to +draw attention to a couple of stats: + +======== ====== ======= ======= ====== ======== +Release csets Ack Reviews Tested Reported +======== ====== ======= ======= ====== ======== +5.0 329 15 20 1 0 +5.1 372 13 38 1 4 +5.2-rc1 334 20 34 6 11 +======== ====== ======= ======= ====== ======== + +Overall, it looks like we're on the right trajectory for increasing the number +of eyeballs looking at code before it heads in tree, especially around testing. +Largely, this increase in Tested-by can be attributed to encouraging the +existing test teams to start commenting on the patches themselves. 
+ +Anyway, here's the full stats from skiboot 5.1.0 to 5.2.0-rc1: + +Processed 334 csets from 27 developers +2 employers found +A total of 46172 lines added, 23274 removed (delta 22898) + +Developers with the most changesets + +========================== =========== +========================== =========== +Stewart Smith 146 (43.7%) +Cyril Bur 52 (15.6%) +Benjamin Herrenschmidt 15 (4.5%) +Joel Stanley 12 (3.6%) +Gavin Shan 12 (3.6%) +Alistair Popple 10 (3.0%) +Vasant Hegde 10 (3.0%) +Michael Neuling 10 (3.0%) +Russell Currey 9 (2.7%) +Cédric Le Goater 8 (2.4%) +Jeremy Kerr 8 (2.4%) +Samuel Mendoza-Jonas 6 (1.8%) +Neelesh Gupta 6 (1.8%) +Shilpasri G Bhat 4 (1.2%) +Oliver O'Halloran 4 (1.2%) +Mahesh Salgaonkar 4 (1.2%) +Vipin K Parashar 3 (0.9%) +Daniel Axtens 3 (0.9%) +Andrew Donnellan 2 (0.6%) +Philippe Bergheaud 2 (0.6%) +Ananth N Mavinakayanahalli 2 (0.6%) +Vaibhav Jain 1 (0.3%) +Sam Mendoza-Jonas 1 (0.3%) +Adriana Kobylak 1 (0.3%) +Shreyas B. Prabhu 1 (0.3%) +Vaidyanathan Srinivasan 1 (0.3%) +Ian Munsie 1 (0.3%) +========================== =========== + +Developers with the most changed lines + + +========================== ============= +========================== ============= +Stewart Smith 19533 (39.4%) +Oliver O'Halloran 17920 (36.1%) +Alistair Popple 3285 (6.6%) +Daniel Axtens 2154 (4.3%) +Cyril Bur 2028 (4.1%) +Benjamin Herrenschmidt 941 (1.9%) +Neelesh Gupta 434 (0.9%) +Gavin Shan 294 (0.6%) +Russell Currey 261 (0.5%) +Vasant Hegde 245 (0.5%) +Cédric Le Goater 209 (0.4%) +Vipin K Parashar 155 (0.3%) +Shilpasri G Bhat 153 (0.3%) +Joel Stanley 140 (0.3%) +Vaidyanathan Srinivasan 135 (0.3%) +Michael Neuling 111 (0.2%) +Samuel Mendoza-Jonas 81 (0.2%) +Jeremy Kerr 60 (0.1%) +Mahesh Salgaonkar 58 (0.1%) +Vaibhav Jain 50 (0.1%) +Ananth N Mavinakayanahalli 43 (0.1%) +Shreyas B. 
Prabhu 17 (0.0%) +Sam Mendoza-Jonas 12 (0.0%) +Andrew Donnellan 10 (0.0%) +Ian Munsie 8 (0.0%) +Philippe Bergheaud 6 (0.0%) +Adriana Kobylak 6 (0.0%) +========================== ============= + +Developers with the most lines removed + +========================= ============= +========================= ============= +Daniel Axtens 2149 (9.2%) +Shreyas B. Prabhu 17 (0.1%) +Andrew Donnellan 9 (0.0%) +Vipin K Parashar 2 (0.0%) +========================= ============= + +Developers with the most signoffs (total 190) + +========================= ============= +========================= ============= +Stewart Smith 188 (98.9%) +Gavin Shan 1 (0.5%) +Neelesh Gupta 1 (0.5%) +========================= ============= + +Developers with the most reviews (total 34) + +========================= ============= +========================= ============= +Patrick Williams 5 (14.7%) +Joel Stanley 5 (14.7%) +Cédric Le Goater 5 (14.7%) +Vasant Hegde 4 (11.8%) +Alistair Popple 4 (11.8%) +Sam Mendoza-Jonas 3 (8.8%) +Samuel Mendoza-Jonas 3 (8.8%) +Andrew Donnellan 2 (5.9%) +Cyril Bur 2 (5.9%) +Vaibhav Jain 1 (2.9%) +========================= ============= + +Developers with the most test credits (total 6) + +========================= ============= +========================= ============= +Vipin K Parashar 3 (50.0%) +Vaibhav Jain 2 (33.3%) +Gajendra B Bandhu1 1 (16.7%) +========================= ============= + +Developers who gave the most tested-by credits (total 6) + +=========================== ============= +=========================== ============= +Gavin Shan 2 (33.3%) +Ananth N Mavinakayanahalli 2 (33.3%) +Alistair Popple 1 (16.7%) +Stewart Smith 1 (16.7%) +=========================== ============= + +Developers with the most report credits (total 11) + +========================= ============= +========================= ============= +Vaibhav Jain 2 (18.2%) +Paul Nguyen 2 (18.2%) +Alistair Popple 1 (9.1%) +Cédric Le Goater 1 (9.1%) +Aneesh Kumar K.V 1 (9.1%) +Dionysius d. 
Bell 1 (9.1%) +Pradeep Ramanna 1 (9.1%) +John Walthour 1 (9.1%) +Benjamin Herrenschmidt 1 (9.1%) +========================= ============= + +Developers who gave the most report credits (total 11) + +========================= ============= +========================= ============= +Gavin Shan 6 (54.5%) +Stewart Smith 3 (27.3%) +Samuel Mendoza-Jonas 1 (9.1%) +Shilpasri G Bhat 1 (9.1%) +========================= ============= diff --git a/roms/skiboot/doc/release-notes/skiboot-5.2.1.rst b/roms/skiboot/doc/release-notes/skiboot-5.2.1.rst new file mode 100644 index 000000000..877a2fd32 --- /dev/null +++ b/roms/skiboot/doc/release-notes/skiboot-5.2.1.rst @@ -0,0 +1,165 @@ +skiboot-5.2.1 +============= + +skiboot-5.2.1 was released on Wednesday April 27th, 2016. + +skiboot-5.2.1 is the second stable release of skiboot 5.2, the new stable +release of skiboot, which will take over from the 5.1.x series which was +first released August 17th, 2015. + +skiboot-5.2.1 contains all bug fixes as of skiboot-5.1.15. + +This is the second release that will follow the (now documented) Skiboot +stable rules - see :ref:`stable-rules`. + +Changes +------- +Over skiboot-5.2.0, the following fixes are included: + +pflash +^^^^^^ + +- Allow building under yocto. + Makefile fixes to enable building as part of an OpenBMC build. + +Garrison platform +^^^^^^^^^^^^^^^^^ + +- Add PCIe and NPU slot location names +- hw/npu.c: Add ibm, npu-index property to npu device tree +- hmi: Add handling for NPU checkstops + +PHB3 (all POWER8 platforms) +^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +- hw/phb3: Ensure PQ bits are cleared in the IVC when masking IRQ + When we mask an interrupt, we may race with another interrupt coming + in from the hardware. If this occurs, the P and/or Q bit may end up + being set but we never EOI/clear them. This could result in a lost + interrupt or the next interrupt that comes in after re-enabling never + being presented. 
+ + This fixes a bug seen with some CAPI workloads which have lots of + interrupt masking at the same time as high interrupt load. The fix is + not specific to CAPI though. +- hw/phb3: Fix potential race in EOI + When we EOI we need to clear the present (P) bit in the Interrupt + Vector Cache (IVC). We must clear P while ensuring that any additional + interrupts that come in aren't lost and also maintaining coherency + with the Interrupt Vector Table (IVT). + + To do this, the hardware provides a conditional update bit in the + IVC. This bit ensures that generation counts between the IVT and the + IVC updates are synchronised. + + Unfortunately we never set this bit to conditionally update the P + bit in the IVC based on the generation count. Also, we didn't set + what we wanted the new generation count to be if the update was + successful. + +FSP platforms +^^^^^^^^^^^^^ + +- OPAL:Handle mbox response with bad status:0x24 during FSP termination + OPAL committed a predictive log with SRC BB822411 in some situations. + +Generic +^^^^^^^ + +- hmi: Fix a bug where a partial hmi event was reported to host.
+  This bug fix ensures the CPU PIR is reported correctly: ::
+
+    [ 305.628283] Fatal Hypervisor Maintenance interrupt [Not recovered]
+    [ 305.628341] Error detail: Malfunction Alert
+    [ 305.628388] HMER: 8040000000000000
+    - [ 305.628423] CPU PIR: 00000000
+    + [ 200.123021] CPU PIR: 000008e8
+    [ 305.628458] [Unit: VSU] Logic core check stop
+
+- xscom: Return OPAL_WRONG_STATE on XSCOM ops if CPU is asleep
+
+
+Contributors
+------------
+
+Processed 15 csets from 7 developers
+A total of 436 lines added, 59 removed (delta 377)
+
+Developers with the most changesets
+
+============================ ==========
+============================ ==========
+Russell Currey               7 (46.7%)
+Alistair Popple              2 (13.3%)
+Michael Neuling              2 (13.3%)
+Patrick Williams             1 (6.7%)
+Stewart Smith                1 (6.7%)
+Mamatha                      1 (6.7%)
+Mahesh Salgaonkar            1 (6.7%)
+============================ ==========
+
+Developers with the most changed lines
+
+========================== ============
+========================== ============
+Alistair Popple            215 (48.3%)
+Russell Currey             140 (31.5%)
+Michael Neuling            55 (12.4%)
+Mamatha                    15 (3.4%)
+Patrick Williams           9 (2.0%)
+Mahesh Salgaonkar          8 (1.8%)
+Stewart Smith              3 (0.7%)
+========================== ============
+
+Developers with the most lines removed
+
+========================== ============
+========================== ============
+Patrick Williams           5 (8.5%)
+========================== ============
+
+Developers with the most signoffs (total 30)
+
+========================== ============
+========================== ============
+Stewart Smith              15 (50.0%)
+Russell Currey             7 (23.3%)
+Michael Neuling            2 (6.7%)
+Alistair Popple            2 (6.7%)
+Patrick Williams           1 (3.3%)
+Oliver O'Halloran          1 (3.3%)
+Mahesh Salgaonkar          1 (3.3%)
+Mamatha                    1 (3.3%)
+========================== ============
+
+Developers with the most reviews (total 11)
+
+========================== ============
+========================== ============
+Alistair Popple            5 (45.5%)
+Andrew Donnellan           3 (27.3%)
+Mahesh Salgaonkar          2 (18.2%)
+Joel Stanley               1 (9.1%)
+========================== ============
+
+Developers with the most Acked-by (total 1)
+
+========================== ============
+========================== ============
+Alistair Popple            1 (100.0%)
+========================== ============
+
+Developers with the most test credits (total 3)
+
+========================== ============
+========================== ============
+Andrew Donnellan           2 (66.7%)
+Vaibhav Jain               1 (33.3%)
+========================== ============
+
+Developers who received the most tested-by credits (total 3)
+
+========================== ============
+========================== ============
+Michael Neuling            3 (100.0%)
+========================== ============
diff --git a/roms/skiboot/doc/release-notes/skiboot-5.2.2.rst b/roms/skiboot/doc/release-notes/skiboot-5.2.2.rst
new file mode 100644
index 000000000..0de037269
--- /dev/null
+++ b/roms/skiboot/doc/release-notes/skiboot-5.2.2.rst
@@ -0,0 +1,34 @@
+skiboot-5.2.2
+=============
+
+skiboot-5.2.2 was released on Thursday May 5th, 2016.
+
+skiboot-5.2.2 is the third stable release of skiboot 5.2, the new stable
+release of skiboot, which will take over from the 5.1.x series which was
+first released August 17th, 2015.
+
+Skiboot 5.2.2 replaces skiboot-5.2.1 as the current stable version, which was
+released on April 27th, 2016. Over skiboot-5.2.1, skiboot 5.2.2 contains
+one bug fix targeted at P8NVL systems, notably the Garrison platform.
+
+skiboot-5.2.2 contains all bug fixes as of skiboot-5.1.16.
+
+This is the second release that will follow the (now documented) Skiboot
+stable rules - see :ref:`stable-rules`.
+
+Over skiboot-5.2.1, the following fixes are included:
+
+P8NVL/Garrison
+^^^^^^^^^^^^^^
+
+- PHB3: Fix corruption of pref window register
+  On the P8+ Garrison platform, the root port's pref window register might
+  not be writable and we have to emulate the window because of a hardware
+  defect. In order to detect that, we read the register content, write the
+  inverted value and read the register content again. The register is
+  regarded as read-only if the values from the two consecutive reads are
+  the same. However, the original register content isn't written back,
+  which corrupts the pref window register if it's writable.
+
+  This fixes the above issue by writing the original content back to
+  the register at the end.
diff --git a/roms/skiboot/doc/release-notes/skiboot-5.2.3.rst b/roms/skiboot/doc/release-notes/skiboot-5.2.3.rst
new file mode 100644
index 000000000..6ed30acd1
--- /dev/null
+++ b/roms/skiboot/doc/release-notes/skiboot-5.2.3.rst
@@ -0,0 +1,63 @@
+skiboot-5.2.3
+=============
+
+skiboot-5.2.3 was released on Thursday June 30th, 2016.
+
+skiboot-5.2.3 is the 4th stable release of skiboot 5.2, the new stable
+release of skiboot, which takes over from the 5.1.x series which was
+first released August 17th, 2015.
+
+Skiboot 5.2.3 replaces skiboot-5.2.2 as the current stable version, which was
+released on May 5th, 2016. Over skiboot-5.2.2, skiboot 5.2.3 contains
+one important bug fix for parsing data from the OCC regarding CPU
+frequency tables, which could lead to no CPU frequency scaling.
+
+skiboot-5.2.3 contains all bug fixes as of skiboot-5.1.16.
+
+This is the second release that will follow the (now documented) Skiboot
+stable rules - see :ref:`stable-rules`.
+
+Over skiboot-5.2.2, the following fixes are included:
+
+OpenPOWER platforms
+^^^^^^^^^^^^^^^^^^^
+
+- occ: Filter out entries from Pmin to Pmax in pstate table
+  (cherry picked from commit eca02ee2e62cee115d921a01cea061782ce47cc7)
+  Without this fix, with newer OCC firmware on some OpenPOWER machines,
+  we would fail to parse the table from the OCC, which meant the host OS
+  would not get a table of supported CPU frequencies.
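As a rough illustration of the occ fix above, the sketch below keeps only the table entries whose pstate ID lies between Pmin and Pmax. This is one reading of the commit title, not the skiboot implementation: the `pstate_entry` layout, the signed 8-bit ID, and the `filter_pstates` helper are all assumptions made for the example.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified pstate table entry; not the OCC's real layout. */
struct pstate_entry {
	int8_t   id;		/* pstate ID (assumed signed) */
	uint32_t freq_khz;	/* frequency for this pstate */
};

/*
 * Copy only the entries whose ID lies in [pmin, pmax] into 'out'.
 * Returns the number of entries kept (at most 'out_len').
 */
static size_t filter_pstates(const struct pstate_entry *in, size_t n,
			     int8_t pmin, int8_t pmax,
			     struct pstate_entry *out, size_t out_len)
{
	size_t kept = 0;

	for (size_t i = 0; i < n && kept < out_len; i++) {
		if (in[i].id < pmin || in[i].id > pmax)
			continue;	/* outside the advertised range: skip */
		out[kept++] = in[i];
	}
	return kept;
}
```

A caller would pass the raw table together with the Pmin and Pmax values it was given, then expose only the `kept` entries to the host.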
+
+General
+^^^^^^^
+
+- pci: Do a dummy config write to devices to establish bus number
+  (cherry picked from commit f46c1e506d199332b0f9741278c8ec35b3e39135)
+
+  On PCI Express, devices need to know their own bus number in order
+  to provide the correct source identification (aka RID) in upstream
+  packets they might send, such as error messages or DMAs.
+
+  However while devices know (and hard wire) their own device and
+  function number, they know nothing about bus numbers by default, those
+  are decoded by bridges for routing. All they know is that if their
+  parent bridge sends a "type 0" configuration access, they should decode
+  it provided the device and function numbers match.
+
+  The PCIe spec thus defines that when a device receives such a configuration
+  access and it's a write, it should "capture" the bus number in the source
+  field of the packet, and re-use it as the originator bus number of all
+  subsequent outgoing requests.
+
+  In order to ensure that a device has this bus number firmly established
+  before it's likely to send error packets upstream, we should thus do a
+  dummy configuration write to it as soon as possible after probing.
+- Fix GCC 6 warning in backtrace code
+  (cherry picked from commit 793f6f5b32c96f2774bd955b6062c74a672317ca)
+- Backport of user visible typo fixes
+  partial cherry picked from 4c95b5e04e3c4f72e4005574f67cd6e365d3276f
+
+Utilities
+^^^^^^^^^
+
+- Fix ARM build failure with parallel make
diff --git a/roms/skiboot/doc/release-notes/skiboot-5.2.4.rst b/roms/skiboot/doc/release-notes/skiboot-5.2.4.rst
new file mode 100644
index 000000000..43c470236
--- /dev/null
+++ b/roms/skiboot/doc/release-notes/skiboot-5.2.4.rst
@@ -0,0 +1,32 @@
+skiboot-5.2.4
+=============
+
+skiboot-5.2.4 was released on Tuesday July 12th, 2016.
+
+This is the 5th stable release of skiboot 5.2, the new stable release of
+skiboot (first released with 5.2.0 on March 16th 2016).
+
+Skiboot 5.2.4 replaces skiboot-5.2.3 as the current stable version, which was
+released on June 30th 2016. Over skiboot-5.2.3, skiboot 5.2.4 contains bug
+fixes to make skiboot more resilient to errors in the XSCOM engine and some
+build improvements for the pflash utility.
+
+skiboot-5.2.4 contains all bug fixes as of skiboot-5.1.16.
+
+This is the second release that will follow the (now documented) Skiboot
+stable rules - see :ref:`stable-rules`.
+
+Over skiboot-5.2.3, the following fixes are included:
+
+All platforms
+^^^^^^^^^^^^^
+
+- Make the XSCOM engine code more resilient to errors:
+
+  - hw/xscom: Reset XSCOM engine after querying sleeping core FIR
+  - hw/xscom: Reset XSCOM engine after finite number of retries when busy
+
+Userspace utilities
+^^^^^^^^^^^^^^^^^^^
+
+- pflash build improvements
diff --git a/roms/skiboot/doc/release-notes/skiboot-5.2.5.rst b/roms/skiboot/doc/release-notes/skiboot-5.2.5.rst
new file mode 100644
index 000000000..3c1e8c8e4
--- /dev/null
+++ b/roms/skiboot/doc/release-notes/skiboot-5.2.5.rst
@@ -0,0 +1,37 @@
+skiboot-5.2.5
+-------------
+
+skiboot-5.2.5 was released on Thursday July 28th, 2016.
+
+skiboot-5.2.5 contains all bug fixes as of skiboot-5.1.17.
+
+This is the second release that will follow the (now documented) Skiboot
+stable rules - see :ref:`stable-rules`.
+
+Over skiboot-5.2.4, the following fixes are included:
+
+- pflash: Fix the makefile
+  (cherry picked from commit fd599965f723330da5ec55519c20cdb6aa2b3a2d)
+- pflash: Clean up makefiles and resolve build race
+  (cherry picked from commit c327eddd9b291a0e6e54001fa3b1e547bad3fca2)
+- FSP/ELOG: Fix OPAL generated elog resend logic
+  (cherry picked from commit a6d4a7884e95cb9c918b8a217c11e46b01218358)
+- FSP/ELOG: Fix possible event notifier hangs
+  (cherry picked from commit e7c8cba4ad773055f390632c2996d3242b633bf4)
+- FSP/ELOG: Disable event notification if list is not consistent
+  (cherry picked from commit 1fb10de164d3ca034193df81c1f5d007aec37781)
+- FSP/ELOG: Improve elog event states
+  (cherry picked from commit cec5750a4a86ff3f69e1d8817eda023f4d40c492)
+- FSP/ELOG: Fix OPAL generated elog event notification
+  (cherry picked from commit ec366ad4e2e871096fa4c614ad7e89f5bb6f884f)
+- FSP/ELOG: Disable event notification during kexec
+  (cherry picked from commit d2ae07fd97bb9408456279cec799f72cb78680a6)
+- hw/xscom: Reset XSCOM engine after querying sleeping core FIR
+  (cherry picked from commit 15cec493804ff14e6246eb1b65e9d0c7cb469a81)
+- hw/xscom: Reset XSCOM engine after finite number of retries when busy
+  (cherry picked from commit e761222593a1ae932cddbc81239b6a7cd98ddb70)
+- xscom: Return OPAL_WRONG_STATE on XSCOM ops if CPU is asleep
+  (cherry picked from commit 9c2d82394fd2303847cac4a665dee62556ca528a)
+- fsp/console: Ignore data on unresponsive consoles
+  (cherry picked from commit fd6b71fcc6912611ce81f455b4805f0531699d5e)
+- SEL: Fix eSEL ID while logging eSEL event
diff --git a/roms/skiboot/doc/release-notes/skiboot-5.3.0-rc1.rst b/roms/skiboot/doc/release-notes/skiboot-5.3.0-rc1.rst
new file mode 100644
index 000000000..ca622b0bb
--- /dev/null
+++ b/roms/skiboot/doc/release-notes/skiboot-5.3.0-rc1.rst
@@ -0,0 +1,332 @@
+skiboot-5.3.0-rc1
+=================
+
+skiboot-5.3.0-rc1 was released on Monday July 25th, 2016.
+
+skiboot-5.3.0-rc1 is the first release candidate of skiboot 5.3, which will
+become the new stable release of skiboot following the 5.2 release, first
+released March 16th 2016.
+
+skiboot-5.3.0-rc1 contains all bug fixes as of skiboot-5.1.16
+and skiboot-5.2.4 (the existing stable releases).
+
+For how the skiboot stable releases work, see :ref:`stable-rules`.
+
+The current plan is to release skiboot-5.3.0 August 1st 2016.
+
+Over skiboot-5.2, we have the following changes:
+
+OPAL API/Device Tree
+--------------------
+
+- Reserve OPAL API numbers for XICS emulation for XIVE
+  Additionally, we put in some skeleton docs for what's coming,
+  key points being that this is for P9 and above, relies on a device
+  being present in the device tree and is modelled on the PAPR calls.
+- interrupts: Remove #interrupt-cells from ICP nodes
+- Stop adding legacy linux,phandle to device tree, just add phandle
+  No Linux kernel has ever existed for powernv that only knows linux,phandle.
+
+POWER9
+------
+
+- Add base POWER9 support
+  In *NO WAY* is this geared towards real POWER9 hardware.
+  Suitable for use in simulators *only*, and even then, only if you
+  intensely know what you're doing.
+- Document changes in OPAL API for POWER9
+  Some things are going to change, we start documenting them.
+- cpu: supply ibm,dec-bits via devicetree
+- power9: Add example device tree for phb4
+- device-tree: Only advertise ibm,opal-v3 (not v2) on POWER9 and above
+
+CAPI
+----
+
+- phb3: Test CAPI mode on both CAPP units on Naples
+- hmi: Recover both CAPP units on Naples after malfunction alert
+- chiptod: Sync timebase in both CAPP units on Naples
+- phb3: Set CAPI mode for both CAPP units on Naples
+- phb3: Load CAPP ucode to both CAPP units on Naples
+- phb3: Add support for CAPP DMA mode
+  The XSL used in the Mellanox CX4 card uses a DMA mode of CAPI, which
+  requires a few registers configured specially. This adds a new mode to
+  the OPAL_PCI_SET_PHB_CAPI_MODE API to enable CAPI in DMA mode.
+
+PCI
+---
+
+- pci: Do a dummy config write to devices to establish bus number
+- phb: Work around XSL bug sending PTE updates with wrong scope
+- Support for PCI hotplug (if a platform supports it)
+
+Garrison
+--------
+
+- NVLink/NPU support
+- Full garrison platform support.
+
+BMC based platforms
+-------------------
+
+- bt: use the maximum retry count returned by the BMC
+- SEL: Fix eSEL ID while logging eSEL event
+  Commit 127a7dac added eSEL ID to SEL event in reverse order (0700 instead
+  of 0007). This code fixes this issue by adding ID in proper order.
+
+Tests/Simulation
+----------------
+
+- test/hello_world: always use shutdown type zero
+- make check: make test runs less noisy
+- boot-tests: force booting from primary (non-golden) side
+- mambo: Enable multicore configurations
+- mambo: Flatten device tree at the end
+- mambo: Increase memory to 4GB and change memory map
+- Timebase quirk for slow simulators like AWAN and SIMICS
+- chip: Add simics specific quirks
+- mambo: Flash driver using bogus disk
+- platform/mambo: Add a heartbeat time, making console more responsive
+- mambo: Fix bt command and add little endian support
+
+FSP platforms
+-------------
+
+- beginnings of support for SPIRA-S structure
+- Handle mbox response with bad status:0x24 during FSP termination
+- FSP: Validate fsp_msg response memory allocation
+- FSP/ELOG: Fix OPAL generated elog event notification
+- FSP/ELOG: Disable event notification during kexec
+  Possible crash if error log timing around kexec is unfortunate
+- fsp/console: Ignore data on unresponsive consoles
+
+  Linux kernels from v4.1 onwards will try to request an irq for each hvc
+  console using OPAL_EVENT_CONSOLE_INPUT, however because the IRQF_SHARED
+  flag is not set any console after the first will fail. If there is data
+  on one of these failed consoles OPAL will set OPAL_EVENT_CONSOLE_INPUT
+  every time fsp_console_read is called, leading to RCU stalls in the
+  kernel.
+
+  As a workaround for unpatched kernels, cease setting
+  OPAL_EVENT_CONSOLE_INPUT for consoles that we have noticed are not being
+  read.
+
+HMI
+---
+
+- hmi: Fix a bug where partial hmi event was reported to host.
+- hmi: Add handling for NPU checkstops
+- hmi: Only raise a catchall HMI if no other components have
+- hmi: Rework HMI event handling of FIR read failure
+
+Tools
+-----
+
+- external: Add a getsram command
+  The getsram command reads the OCC SRAM. This is useful for debug.
+- bug fixes in flash utilities (pflash/gard)
+- pflash: Allow building under yocto.
+- external/opal-prd: Ensure that struct host_interfaces matches the thunk
+- external/pflash: Handle incorrect cmd-line options better
+- libflash: fix bug on reading truncated flash file
+- pflash: add support for manipulating file rather than flash
+- gard: fix compile error on ARM
+- libflash: Add sanity checks to ffs init code.
+- external: Add dynamically linked pflash
+
+Mambo
+-----
+
+- Test device tree for kernel location
+  This can reduce the boot time since the kernel no longer needs to
+  relocate itself when loaded directly at 0.
+
+Generic
+-------
+
+- hw/lpc: Log LPC SYNC errors as OPAL_PLATFORM_ERR_EVT errors
+- Explicitly disable the attn instruction on all CPUs on boot.
+- hw/xscom: Reset XSCOM engine after finite number of retries when busy
+- hw/xscom: Reset XSCOM engine after querying sleeping core FIR
+- core/timer: Add support for platform specific heartbeat
+- Fix GCOV_COUNTERS ifdef logic for GCC 6.0
+- core: Fix backtrace for gcc 6
+  fixes a compiler warning on GCC 6 and above
+- cpu: Don't call time_wait with lock held
+  Also make the locking around re-init safer, properly block the
+  OS from restarting a thread that was caught for re-init.
+- flash: Increase the maximum number of flash devices
+
+Contributors
+------------
+
+Extending the analysis done for the last few releases, we can see our trends
+in code review across versions:
+
+======== ====== ======= ======= ====== ========
+Release  csets  Ack     Reviews Tested Reported
+======== ====== ======= ======= ====== ========
+5.0      329    15      20      1      0
+5.1      372    13      38      1      4
+5.2-rc1  334    20      34      6      11
+5.3-rc1  302    36      53      4      5
+======== ====== ======= ======= ====== ========
+
+An increase in reviews this cycle is great!
+
+Detailed statistics for 5.3.0-rc1 are below:
+
+Processed 302 csets from 31 developers
+A total of 20887 lines added, 4540 removed (delta 16347)
+
+Developers with the most changesets
+
+=========================== ============
+=========================== ============
+Stewart Smith               82 (27.2%)
+Gavin Shan                  36 (11.9%)
+Benjamin Herrenschmidt      28 (9.3%)
+Michael Neuling             25 (8.3%)
+Vasant Hegde                24 (7.9%)
+Russell Currey              14 (4.6%)
+Brad Bishop                 12 (4.0%)
+Vipin K Parashar            10 (3.3%)
+Cédric Le Goater            9 (3.0%)
+Shreyas B. Prabhu           8 (2.6%)
+Jeremy Kerr                 7 (2.3%)
+Philippe Bergheaud          6 (2.0%)
+Cyril Bur                   5 (1.7%)
+Mukesh Ojha                 4 (1.3%)
+Alistair Popple             4 (1.3%)
+Ian Munsie                  4 (1.3%)
+Oliver O'Halloran           3 (1.0%)
+Chris Smart                 3 (1.0%)
+Sam Mendoza-Jonas           2 (0.7%)
+Joel Stanley                2 (0.7%)
+Dinar Valeev                2 (0.7%)
+Shilpasri G Bhat            2 (0.7%)
+Patrick Williams            2 (0.7%)
+Deb McLemore                1 (0.3%)
+Balbir Singh                1 (0.3%)
+Andrew Donnellan            1 (0.3%)
+Suraj Jitindar Singh        1 (0.3%)
+Frederic Bonnard            1 (0.3%)
+Kamalesh Babulal            1 (0.3%)
+Mamatha                     1 (0.3%)
+Mahesh Salgaonkar           1 (0.3%)
+=========================== ============
+
+Developers with the most changed lines
+
+========================= ============
+========================= ============
+Benjamin Herrenschmidt    7491 (34.4%)
+Gavin Shan                4821 (22.1%)
+Vasant Hegde              4740 (21.7%)
+Stewart Smith             1294 (5.9%)
+Michael Neuling           620 (2.8%)
+Cédric Le Goater          470 (2.2%)
+Jeremy Kerr               338 (1.6%)
+Shreyas B. Prabhu         330 (1.5%)
+Vipin K Parashar          305 (1.4%)
+Russell Currey            295 (1.4%)
+Alistair Popple           229 (1.1%)
+Philippe Bergheaud        170 (0.8%)
+Ian Munsie                133 (0.6%)
+Dinar Valeev              126 (0.6%)
+Brad Bishop               80 (0.4%)
+Oliver O'Halloran         80 (0.4%)
+Cyril Bur                 62 (0.3%)
+Frederic Bonnard          61 (0.3%)
+Sam Mendoza-Jonas         32 (0.1%)
+Chris Smart               27 (0.1%)
+Shilpasri G Bhat          20 (0.1%)
+Patrick Williams          18 (0.1%)
+Suraj Jitindar Singh      17 (0.1%)
+Mamatha                   15 (0.1%)
+Mukesh Ojha               8 (0.0%)
+Mahesh Salgaonkar         8 (0.0%)
+Joel Stanley              4 (0.0%)
+Balbir Singh              4 (0.0%)
+Kamalesh Babulal          2 (0.0%)
+Deb McLemore              1 (0.0%)
+Andrew Donnellan          1 (0.0%)
+========================= ============
+
+Developers with the most lines removed
+
+========================= ============
+========================= ============
+Dinar Valeev              68 (1.5%)
+Patrick Williams          10 (0.2%)
+Mukesh Ojha               4 (0.1%)
+Kamalesh Babulal          1 (0.0%)
+========================= ============
+
+Developers with the most signoffs (total 249)
+
+========================= ============
+========================= ============
+Stewart Smith             236 (94.8%)
+Vaidyanathan Srinivasan   6 (2.4%)
+Benjamin Herrenschmidt    3 (1.2%)
+Michael Neuling           2 (0.8%)
+Oliver O'Halloran         1 (0.4%)
+Vipin K Parashar          1 (0.4%)
+========================= ============
+
+Developers with the most reviews (total 53)
+
+========================= ============
+========================= ============
+Andrew Donnellan          11 (20.8%)
+Russell Currey            9 (17.0%)
+Joel Stanley              7 (13.2%)
+Alistair Popple           7 (13.2%)
+Mukesh Ojha               5 (9.4%)
+Cyril Bur                 3 (5.7%)
+Mahesh Salgaonkar         2 (3.8%)
+Gavin Shan                2 (3.8%)
+Vasant Hegde              2 (3.8%)
+Stewart Smith             1 (1.9%)
+Vaidyanathan Srinivasan   1 (1.9%)
+Vipin K Parashar          1 (1.9%)
+Frederic Barrat           1 (1.9%)
+Cédric Le Goater          1 (1.9%)
+========================= ============
+
+Developers with the most test credits (total 4)
+
+========================= ============
+========================= ============
+Andrew Donnellan          2 (50.0%)
+Russell Currey            1 (25.0%)
+Vaibhav Jain              1 (25.0%)
+========================= ============
+
+Developers who gave the most tested-by credits (total 4)
+
+========================= ============
+========================= ============
+Michael Neuling           3 (75.0%)
+Gavin Shan                1 (25.0%)
+========================= ============
+
+Developers with the most report credits (total 5)
+
+========================= ============
+========================= ============
+Mukesh Ojha               2 (40.0%)
+Russell Currey            1 (20.0%)
+Pridhiviraj Paidipeddi    1 (20.0%)
+Balbir Singh              1 (20.0%)
+========================= ============
+
+Developers who gave the most report credits (total 5)
+
+========================= ============
+========================= ============
+Gavin Shan                2 (40.0%)
+Stewart Smith             2 (40.0%)
+Vasant Hegde              1 (20.0%)
+========================= ============
diff --git a/roms/skiboot/doc/release-notes/skiboot-5.3.0-rc2.rst b/roms/skiboot/doc/release-notes/skiboot-5.3.0-rc2.rst
new file mode 100644
index 000000000..a0643acb8
--- /dev/null
+++ b/roms/skiboot/doc/release-notes/skiboot-5.3.0-rc2.rst
@@ -0,0 +1,40 @@
+skiboot-5.3.0-rc2
+=================
+
+skiboot-5.3.0-rc2 was released on Thursday July 28th, 2016.
+
+The current plan is to release skiboot-5.3.0 August 1st 2016.
+
+Over skiboot-5.3.0-rc1, we have the following changes:
+
+pflash
+------
+
+- pflash: Clean up makefiles and resolve build race
+- pflash: use atexit for musl compatibility
+
+General
+-------
+
+- core/flash: Fix passing pointer instead of value
+
+POWER9
+------
+
+- mambo: Update Radix Tree Size as per ISA 3.0
+  In Linux we recently changed to this encoding, so we no longer boot.
+  The associated Linux commit is b23d9c5b9c83c05e013aa52460f12a8365062cf4
+
+FSP Platforms
+-------------
+
+- platforms/ibm-fsp: Fix incorrect struct member access and comparison
+- FSP/MDST: Fix TCE alignment issue
+  In some corner cases (like source memory size = 4097) we may
+  end up doing the wrong mapping and corrupting part of SYSDUMP.
+- hdat/vpd: Add chip-id property to processor chip node under vpd
+
+CAPI
+----
+
+- hw/phb3: Increase AIB TX command credit for DMA read in CAPP DMA mode
diff --git a/roms/skiboot/doc/release-notes/skiboot-5.3.0.rst b/roms/skiboot/doc/release-notes/skiboot-5.3.0.rst
new file mode 100644
index 000000000..5e55c59d4
--- /dev/null
+++ b/roms/skiboot/doc/release-notes/skiboot-5.3.0.rst
@@ -0,0 +1,18 @@
+.. _skiboot-5.3.0:
+
+skiboot-5.3.0
+-------------
+
+skiboot-5.3.0 was released on Tuesday August 2nd, 2016.
+
+skiboot-5.3.0 is the first stable release of skiboot 5.3, the new stable
+release of skiboot, which will take over from the 5.2.x series which was
+first released Wednesday March 16th, 2016.
+
+skiboot-5.3.0 contains all bug fixes as of skiboot-5.1.17 and skiboot-5.2.5.
+
+Changes over skiboot-5.3.0-rc2:
+- Adopt libtool rules for soname versioning for libflash
+
+See skiboot-5.3.0-rc2 and skiboot-5.3.0-rc1 release notes for a complete
+list of changes from skiboot-5.3.0.
diff --git a/roms/skiboot/doc/release-notes/skiboot-5.3.1.rst b/roms/skiboot/doc/release-notes/skiboot-5.3.1.rst
new file mode 100644
index 000000000..7aecb8440
--- /dev/null
+++ b/roms/skiboot/doc/release-notes/skiboot-5.3.1.rst
@@ -0,0 +1,39 @@
+skiboot-5.3.1
+-------------
+
+skiboot-5.3.1 was released on Wednesday August 10th, 2016.
+
+This is the 2nd stable release of skiboot 5.3, the new stable release of
+skiboot (first released with 5.3.0 on August 2nd, 2016).
+
+Skiboot 5.3.1 replaces skiboot-5.3.0 as the current stable version. It contains
+a few minor bug fixes.
+
+This release follows the Skiboot stable rules, see :ref:`stable-rules`.
+
+Over skiboot-5.3.0, the following fixes are included:
+
+FSP systems:
+
+- FSP/ELOG: elog_enable flag should be false by default
+  This is a corner case, related to a recent change that went
+  upstream, and only observed at the petitboot prompt, where we see
+  only one error log instead of getting all error logs in
+  /sys/firmware/opal/elog.
+
+NVLink systems (i.e. Garrison):
+
+- npu: reword "error" to indicate it's actually a warning
+  Without this patch, you get spurious FirmWare Test Suite (FWTS) warnings
+  about NVLink not working on machines that aren't fully populated with
+  GPUs.
+- hmi: Clean up NPU FIR debug messages
+  With the skiboot log set to debug, the FIR (and related registers) were
+  logged all in the same message. It was too much for one line, didn't
+  clarify if the numbers were in hex, and didn't show leading zeroes.
+
+General:
+
+- asm: Fix backtrace for unexpected exception
+- correct the log level from PR_ERROR down to PR_INFO for some skiboot
+  log messages.
diff --git a/roms/skiboot/doc/release-notes/skiboot-5.3.2.rst b/roms/skiboot/doc/release-notes/skiboot-5.3.2.rst
new file mode 100644
index 000000000..e93c3cdf8
--- /dev/null
+++ b/roms/skiboot/doc/release-notes/skiboot-5.3.2.rst
@@ -0,0 +1,28 @@
+skiboot-5.3.2
+-------------
+
+skiboot-5.3.2 was released on Friday August 26th, 2016.
+
+This is the 3rd stable release of skiboot 5.3, the new stable release of
+skiboot (first released with 5.3.0 on August 2nd, 2016).
+
+Skiboot 5.3.2 replaces skiboot-5.3.1 as the current stable version. It contains
+a few minor bug fixes.
+
+Over skiboot-5.3.1, the following fixes are included:
+
+- opal/hmi: Fix a TOD HMI failure during a race condition.
+  Rare race condition which meant we wouldn't recover from TOD error
+
+- lpc: Log LPC SYNC errors as unrecoverable ones for manufacturing
+  Only affects systems in manufacturing mode.
+  No behaviour change when not in manufacturing mode.
+
+- hw/phb3: Update capi initialization sequence
+  The capi initialization sequence was revised in a circumvention
+  document when a 'link down' error was converted from fatal to Endpoint
+  Recoverable. Other, non-capi, register setup was corrected even before
+  the initial open-source release of skiboot, but a few capi-related
+  registers were not updated then, so this patch fixes it.
+  The point is that a link-down error detected by the UTL logic will
+  lead to an AIB fence, so that the CAPP unit can detect the error.
diff --git a/roms/skiboot/doc/release-notes/skiboot-5.3.3.rst b/roms/skiboot/doc/release-notes/skiboot-5.3.3.rst
new file mode 100644
index 000000000..880725807
--- /dev/null
+++ b/roms/skiboot/doc/release-notes/skiboot-5.3.3.rst
@@ -0,0 +1,19 @@
+skiboot-5.3.3
+-------------
+
+skiboot-5.3.3 was released on Friday September 2nd, 2016.
+
+This is the 4th stable release of skiboot 5.3, the new stable release of
+skiboot (first released with 5.3.0 on August 2nd, 2016).
+
+Skiboot 5.3.3 replaces skiboot-5.3.2 as the current stable version. It contains
+two bug fixes for machines utilizing the NPU (i.e. Garrison)
+
+Over skiboot-5.3.2, the following fixes are included:
+
+- hw/npu: assert the NPU irq min is aligned.
+- hw/npu: program NPU BUID reg properly
+  The NPU BUID register was incorrectly programmed resulting in npu
+  interrupt level 0 causing a PB_CENT_CRESP_ADDR_ERROR checkstop,
+  and irqs from npus in odd chips being aliased to and processed
+  as the interrupts from the corresponding npu on the even chips.
diff --git a/roms/skiboot/doc/release-notes/skiboot-5.3.4.rst b/roms/skiboot/doc/release-notes/skiboot-5.3.4.rst
new file mode 100644
index 000000000..63fca04f0
--- /dev/null
+++ b/roms/skiboot/doc/release-notes/skiboot-5.3.4.rst
@@ -0,0 +1,22 @@
+skiboot-5.3.4
+-------------
+
+skiboot-5.3.4 was released on Tuesday September 13th, 2016.
+
+This is the 5th stable release of skiboot 5.3, the new stable release of
+skiboot (first released with 5.3.0 on August 2nd, 2016).
+
+Skiboot 5.3.4 replaces skiboot-5.3.3 as the current stable version. It contains
+a couple of bug fixes, specifically around failing XSCOMs.
+
+Over skiboot-5.3.3, the following fixes are included:
+
+- xscom: Initialize the data to a known value in xscom_read
+  In case of error, don't leave the data random. It helps debugging when
+  the user fails to check the error code. This happens due to a bug in the
+  PRD wrapper app.
+- xscom: Map all HMER status codes to OPAL errors
+- centaur: Mark centaur offline after 10 consecutive access errors
+  This avoids spamming the logs when the centaur is dead and PRD
+  constantly tries to access it
+- nvlink: Fix bad PE number check in error inject code path (<= rather than <)
diff --git a/roms/skiboot/doc/release-notes/skiboot-5.3.5.rst b/roms/skiboot/doc/release-notes/skiboot-5.3.5.rst
new file mode 100644
index 000000000..c8084c1a1
--- /dev/null
+++ b/roms/skiboot/doc/release-notes/skiboot-5.3.5.rst
@@ -0,0 +1,16 @@
+skiboot-5.3.5
+-------------
+
+skiboot-5.3.5 was released on Wednesday September 14th, 2016.
+
+This is the 6th stable release of skiboot 5.3, the new stable release of
+skiboot (first released with 5.3.0 on August 2nd, 2016).
+
+Skiboot 5.3.5 replaces skiboot-5.3.4 as the current stable version. It contains
+a couple of minor bug fixes: simply clarifying two error messages.
+
+Over skiboot-5.3.4, the following fixes are included:
+
+- centaur: print message on disabling xscoms to centaur due to many errors
+- slw: improve error message for SLW timer stuck
+  We still dump the registers, but only to the in-memory console buffer by default.
diff --git a/roms/skiboot/doc/release-notes/skiboot-5.3.6.rst b/roms/skiboot/doc/release-notes/skiboot-5.3.6.rst
new file mode 100644
index 000000000..8960ebc68
--- /dev/null
+++ b/roms/skiboot/doc/release-notes/skiboot-5.3.6.rst
@@ -0,0 +1,16 @@
+skiboot-5.3.6
+-------------
+
+skiboot-5.3.6 was released on Saturday September 17th, 2016.
+
+This is the 7th stable release of skiboot 5.3, the new stable release of
+skiboot (first released with 5.3.0 on August 2nd, 2016).
+
+Skiboot 5.3.6 replaces skiboot-5.3.5 as the current stable version. It contains
+one minor bug fix.
+
+Over skiboot-5.3.5, the following fixes are included:
+
+- SLW: Actually print the register dump only to memory
+  A fix in 5.3.5 was only partially correct, we still had the log priority
+  incorrect for dumping of the SLW registers.
diff --git a/roms/skiboot/doc/release-notes/skiboot-5.3.7.rst b/roms/skiboot/doc/release-notes/skiboot-5.3.7.rst
new file mode 100644
index 000000000..5fba96038
--- /dev/null
+++ b/roms/skiboot/doc/release-notes/skiboot-5.3.7.rst
@@ -0,0 +1,81 @@
+.. _skiboot-5.3.7:
+
+skiboot-5.3.7
+-------------
+
+skiboot-5.3.7 was released on Wednesday October 12th, 2016.
+
+This is the 8th stable release of skiboot 5.3, the new stable release of
+skiboot (first released with 5.3.0 on August 2nd, 2016).
+
+Skiboot 5.3.7 replaces skiboot-5.3.6 as the current stable version. It contains
+a few bugfixes, including an important PCI bug fix that could cause some
+adapters to not be detected.
+
+Over skiboot-5.3.6, the following fixes are included:
+
+PCI:
+
+- pci: Avoid hot resets at boot time
+  In the PCI post-fundamental reset code, a hot reset is performed at the
+  end. This is causing issues at boot time as a reset signal is being sent
+  downstream before the links are up, which is causing issues on adapters
+  behind switches. No errors result in skiboot, but the adapters are not
+  usable in Linux as a result.
+
+  This patch fixes some adapters not being configurable in Linux on some
+  systems. The issue was not present in skiboot 5.2.x.
+
+- core/pci: Fix the power-off timeout in pci_slot_power_off()
+  The timeout should be 1000ms instead of 1000 ticks while powering
+  off PCI slot in pci_slot_power_off(). Otherwise, it's likely to
+  hit timeout powering off the PCI slot as below skiboot logs reveal: ::
+
+    [47912590456,5] SkiBoot skiboot-5.3.6 starting...
+    (snip)
+    [5399532365,7] PHB#0005:02:11.0 Bus 0f..ff scanning...
+    [5399540804,7] PHB#0005:02:11.0 No card in slot
+    [5399576870,5] PHB#0005:02:11.0 Timeout powering off slot
+    [5401431782,3] FIRENZE-PCI: Wrong state 00000000 on slot 8000000002880005
+
+PRD:
+
+- occ/prd/opal-prd: Queue OCC_RESET event message to host in OpenPOWER
+  During an OCC reset cycle the system is forced to Psafe pstate.
+  When OCC becomes active, the system has to be restored to its
+  last pstate as requested by host. So host needs to be notified
+  of OCC_RESET event or else system will continue to remain in
+  Psafe state until host requests a new pstate after the OCC
+  reset cycle.
+- opal-prd: Fix error code from scom_read & scom_write
+  Currently, we always return a zero value from scom_read & scom_write,
+  so the HBRT implementation has no way of detecting errors during scom
+  operations.
+  This change uses the actual return value from the scom operation from
+  the kernel instead.
+
+- opal-prd: Add get_interface_capabilities to host interfaces
+  We need a way to indicate behaviour changes & fixes in the prd
+  interface, without requiring a major version bump.
+
+  This change introduces the get_interface_capabilities callback,
+  returning a bitmask of capability flags, pertaining to 'sets' of
+  capabilities. We currently return 0 for all.
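The capability query described in the last PRD item can be sketched roughly as follows. Only the callback name and the "bitmask per set, currently 0 for all" behaviour come from the note above; the `uint64_t` signature, the `EXAMPLE_*` constants and the `has_capability` helper are hypothetical and exist purely to show how a caller might test a flag.

```c
#include <stdint.h>

/* Hypothetical capability set and flag, for illustration only. */
#define EXAMPLE_CAPS_SET_BUGFIXES	0
#define EXAMPLE_CAP_SCOM_ERRORS		(1ULL << 0)

/* Provider side: per the note above, every set currently reports 0. */
static uint64_t get_interface_capabilities(uint64_t set)
{
	(void)set;
	return 0;
}

/* Caller side: test one flag in one set before relying on a behaviour. */
static int has_capability(uint64_t set, uint64_t flag)
{
	return (get_interface_capabilities(set) & flag) != 0;
}
```

Because unknown sets and unset flags both read as 0, a caller written this way keeps working against older providers without a major version bump, which is the motivation the note gives.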
+
+IBM FSP Platforms:
+
+- platforms/firenze: Fix clock frequency dt property
+- platforms/firenze: HDAT: Fix typo in nest-frequency property
+
+NVLink:
+
+- hw/npu.c: Fix reserved PE#
+  Currently the reserved PE is set to NPU_NUM_OF_PES, which is one
+  greater than the maximum PE resulting in the following kernel errors
+  at boot:
+
+  [ 0.000000] pnv_ioda_reserve_pe: Invalid PE 4 on PHB#4
+  [ 0.000000] pnv_ioda_reserve_pe: Invalid PE 4 on PHB#5
+
+  Due to a HW erratum PE#0 is already reserved in the kernel, so update
+  the opal-reserved-pe device-tree property to match this.
diff --git a/roms/skiboot/doc/release-notes/skiboot-5.4.0-rc1.rst b/roms/skiboot/doc/release-notes/skiboot-5.4.0-rc1.rst
new file mode 100644
index 000000000..31ac6ac08
--- /dev/null
+++ b/roms/skiboot/doc/release-notes/skiboot-5.4.0-rc1.rst
@@ -0,0 +1,535 @@
+.. _skiboot-5.4.0-rc1:
+
+skiboot-5.4.0-rc1
+=================
+
+skiboot-5.4.0-rc1 was released on Monday October 17th 2016. It is the first
+release candidate of skiboot 5.4, which will become the new stable release
+of skiboot following the 5.3 release, first released August 2nd 2016.
+
+skiboot-5.4.0-rc1 contains all bug fixes as of :ref:`skiboot-5.3.7`
+and :ref:`skiboot-5.1.18` (the currently maintained stable releases).
+
+For how the skiboot stable releases work, see :ref:`stable-rules` for details.
+
+The current plan is to release a new release candidate every week until we
+feel good about it. The aim is for skiboot-5.4.x to be in op-build v1.13, which
+is due by November 23rd 2016.
+
+Over skiboot-5.3, we have the following changes:
+
+New Features
+------------
+- Initial Trusted Boot support (see :ref:`stb-overview`).
+  There are several limitations with this initial release:
+
+  - CAPP partition is not measured correctly
+  - Only Nuvoton TPM 2.0 is supported
+  - Requires hardware rework on late revision Habanero or Firestone boards
+    in order to install TPM.
+ + - Add i2c Nuvoton TPM 2.0 Driver + - romcode driver for POWER8 secure ROM + - See Device tree docs for tpm and ibm,secureboot nodes + - See main secure and trusted boot documentation. + + +- Fast reboot for P8 + + This makes reboot take an *awful* lot less time, somewhere between four + and ten times faster than a full IPL. It is currently experimental and not + enabled by default. + You can enable the experimental support via nvram option: :: + + # nvram -p ibm,skiboot --update-config experimental-fast-reset=feeling-lucky + + **WARNING**: This has *known* bugs. For example, if you have used a device + in CAPI mode, we will currently *NOT* reset it back to plain PCI. There + are also some known issues in most simulators. + +- Support ``ibm,skiboot`` NVRAM partition with skiboot configuration options. + + - These should generally only be used if you either completely know what + you are doing or need to work around a skiboot bug. They are **not** + intended for end users. + - Add support for supplying the kernel boot arguments from the ``bootargs`` + configuration string in the ``ibm,skiboot`` NVRAM partition. + - Enabling the experimental fast reset feature is done via this method. + +- Add support for nap mode on P8 while in skiboot + + - While nap has been exposed to the Operating System since day 1, we have + not utilized low power states when in skiboot itself, leading to higher + power consumption during boot. + We only enable the functionality after the 0x100 vector has been + patched, and we disable it before transferring control to Linux. + +- libflash: add 128MB MX66L1G45G part + +- Pointer validation of OPAL API call arguments. + + - If the kernel called an OPAL API with vmalloc'd address + or any other address range in real mode, we would hit + a problem with aliasing. Since the top 4 bits are ignored + in real mode, pointers from 0xc.. and 0xd.. (and other ranges) + could collide and lead to hard to solve bugs. 
This patch + adds the infrastructure for pointer validation and a simple + test case for testing the API + - The checks validate pointers sent in using ``opal_addr_valid()`` + +Documentation +------------- + +There have been a number of documentation fixes this release. Most prominent +is the switch to Sphinx (from the Python project) and ReStructured Text (RST) +as the documentation format. RST and Sphinx enable both production of pretty +documentation in HTML and PDF formats while remaining readable in their raw +form to those with no knowledge of RST. + +You can build a HTML site by doing the following: :: + + cd doc/ + make html + +As always, documentation patches are very, *very* welcome as we attempt to +document the OPAL API, the device tree bindings and important parts of +OPAL internals. + +We would like the Device Tree documentation to follow the style that can be +included in the Device Tree Specification. + + +General +------- +- Make console-log time more readable: seconds rather than timebase + Log format is now ``[SECONDS.(tb%512000000),LEVEL]`` + +- Flash (PNOR) code improvements + + - flash: Make size 64 bit safe + This makes the size of flash 64 bit safe so that we can have flash + devices greater than 4GB. This is especially useful for mambo disks + passed through to Linux. + - core/flash.c: load actual partition size + We are downloading 0x20000 bytes from PNOR for CAPP, but currently the + CAPP lid is only 40K. + - flash: Rework error paths and messages for multiple flash controllers + Now that we have mambo bogusdisk flash, we can have many flash chips. + This is resulting in some confusing output messages. + +- core/init: Fix "failure of getting node in the free list" warning on boot. 
+- slw: improve error message for SLW timer stuck + +- Centaur / XSCOM error handling + + - print message on disabling xscoms to centaur due to many errors + - Mark centaur offline after 10 consecutive access errors + +- XSCOM improvements + + - xscom: Map all HMER status codes to OPAL errors + - xscom: Initialize the data to a known value in ``xscom_read`` + In case of error, don't leave the data random. It helps debugging when + the user fails to check the error code. This happens due to a bug in the + PRD wrapper app. + - chip: Add a quirk for when core direct control XSCOMs are missing + +- p8-i2c: Don't crash if a centaur errored out + +- cpu: Make endian switch message more informative +- cpu: Display number of started CPUs during boot +- core/init: ensure that HRMOR is zero at boot +- asm: Fix backtrace for unexpected exception + +- cpu: Remove pollers calling heuristics from ``cpu_wait_job`` + This will be handled by ``time_wait_ms()``. Also remove a useless + ``smt_medium()``. + Note that this introduces a difference in behaviour: time_wait + will only call the pollers on the boot CPU while ``cpu_wait_job()`` + could call them on any. However, I can't think of a case where + this is a problem. + +- cpu: Remove global job queue + Instead, target a specific CPU for a global job at queuing time. + This will allow us to wake up the target using an interrupt when + implementing nap mode. + The algorithm used is to look for idle primary threads first, then + idle secondaries, and finally the least loaded thread. If nothing can + be found, we fall back to a synchronous call. +- lpc: Log LPC SYNC errors as unrecoverable ones for manufacturing +- lpc: Optimize SerIRQ dispatch based on which PSI IRQ fired +- interrupts: Add new source ``->attributes()`` callback + This allows a given source to provide per-interrupt attributes + such as whether it targets OPAL or Linux and its estimated + frequency. 
+ + The former allows us to get rid of the double set of ops used to + decide which interrupts go where on some modules like the PHBs + and the latter will be eventually used to implement smart + caching of the source lookups. +- opal/hmi: Fix a TOD HMI failure during a race condition. +- platform: Add BT to Generic platform + + +NVRAM +----- +- Support ``ibm,skiboot`` partition for skiboot specific configuration options +- flash: Size NVRAM based on ECC for OpenPOWER platforms + If NVRAM has ECC (as per the ffs header) then the actual size of the + partition is less than reported by the ffs header in the PNOR. + +NVLink/NPU +---------- + +- Fix reserved PE# +- NPU bdfn allocation bugfix +- Fix bad PE number check + NPUs have 4 PEs which are zero indexed, so {0, 1, 2, 3}. A bad PE number + check in npu_err_inject checks if the PE number is greater than 4 as a + fail case, so it would wrongly perform operations on a non-existent PE 4. +- Use PCI virtual device +- assert the NPU irq min is aligned. +- program NPU BUID reg properly +- npu: reword "error" to indicate it's actually a warning + Incorrect FWTS annotation. + Without this patch, you get spurious FirmWare Test Suite (FWTS) warnings + about NVLink not working on machines that aren't fully populated with + GPUs. +- external: NPU hardware procedure script + Performing NPU hardware procedures requires some config space magic. + Put all that magic into a script, so you can just specify the target + device and the procedure number. 
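The off-by-one in the bad PE number check listed above can be sketched as follows. This is a Python illustration, not the actual C code in `npu_err_inject`: the two helper names are hypothetical, and the "after" comparison is just one way to express a corrected check. Only `NPU_NUM_OF_PES` and the valid range {0, 1, 2, 3} come from these notes.

```python
NPU_NUM_OF_PES = 4  # PEs are zero indexed, so valid PE numbers are 0..3


def pe_rejected_before_fix(pe_num):
    # Flawed comparison: only values greater than 4 are treated as bad,
    # so the non-existent PE 4 slips through.
    return pe_num > NPU_NUM_OF_PES


def pe_rejected_after_fix(pe_num):
    # Corrected comparison: anything at or above the PE count is bad.
    return pe_num >= NPU_NUM_OF_PES
```

With these helpers, PE 4 is accepted by the first check and rejected by the second, while PEs 0 to 3 are accepted by both.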
+ +PCI +--- + +- Generic fixes + + - Claim surprise hotplug capability + - Reserve PCI buses for RC's slot + - Update PCI topology after power change + - Return slot cached power state + - Cache power state on slot without power control + - Avoid hot resets at boot time + - Fix initial PCIe slot power state + - Print CRS retry times + It's useful to know the CRS retry times before the PCI device is + detected successfully. In PCI hot add case, it usually indicates + time consumed for the adapter's firmware to be partially ready + (responsive PCI config space). + - core/pci: Fix the power-off timeout in ``pci_slot_power_off()`` + The timeout should be 1000ms instead of 1000 ticks while powering + off PCI slot in ``pci_slot_power_off()``. Otherwise, it's likely to + hit timeout powering off the PCI slot as below skiboot logs reveal: :: + + [5399576870,5] PHB#0005:02:11.0 Timeout powering off slot + +- PHB3 + + - Override root slot's ``prepare_link_change()`` with PHB's + - Disable surprise link down event on PCI slots + - Disable ECRC on Broadcom adapter behind PMC switch + +- astbmc platforms + + - Support dynamic PCI slot. We might insert a PCIe switch to PHB direct slot + and the downstream ports of the PCIe switch supports PCI hotplug. + + +CAPI +---- + +- hw/phb3: Update capi initialization sequence + The capi initialization sequence was revised in a circumvention + document when a 'link down' error was converted from fatal to Endpoint + Recoverable. Other, non-capi, register setup was corrected even before + the initial open-source release of skiboot, but a few capi-related + registers were not updated then, so this patch fixes it. + +IPMI +---- + +- core/ipmi: Set interrupt-parent property + This allows ipmi-opal to properly use the OPAL irqchip rather than + falling back to the event interface in Linux. + +Mambo Simulator +--------------- + +- Helpers for POWER9 Mambo. 
+- mambo: Advertise available RADIX page sizes +- mambo: Add section for kernel command line boot args + Users can set kernel command line boot arguments for Mambo in a tcl + script. +- mambo: add exception and qtrace helpers +- external/mambo: Update skiboot.tcl to add page-sizes nodes to device tree + +Simics Simulator +---------------- + +- chiptod: Enable ChipTOD in SIMICS + +Utilities +--------- + +- pflash + + - fix harmless buffer overflow: ``fl_total_size`` was ``uint32_t`` not ``uint64_t``. + - Don't try to write protect when writing to flash file + - Misc small improvements to code and code style + - makefile bug fixes + + +- external/boot_tests + + - remove lid from the BMC after flashing + - add the nobooting option -N + - add arbitrary lid option -F + +- ``getscom`` / ``getsram`` / ``putscom``: Parse chip-id as hex + We print the chip-id in hex (without a leading 0x), but we fail to + parse that same value correctly in ``getscom`` / ``getsram`` / ``putscom`` :: + + # getscom -l + ... + 80000000 | DD2.0 | Centaur memory buffer + # getscom -c 80000000 201140a + Error -19 reading XSCOM + + Fix this by assuming base 16 when parsing chip-id. + +PRD +--- + +- opal-prd: Fix error code from ``scom_read`` and ``scom_write`` +- opal-prd: Add get_interface_capabilities to host interfaces +- opal-prd: fix for 64-bit pnor sizes +- occ/prd/opal-prd: Queue OCC_RESET event message to host in OpenPOWER + During an OCC reset cycle the system is forced to Psafe pstate. + When OCC becomes active, the system has to be restored to its + last pstate as requested by host. So host needs to be notified + of OCC_RESET event or else system will continue to remain in + Psafe state until host requests a new pstate after the OCC + reset cycle. 
+ +IBM FSP Based Platforms +----------------------- + +- fsp/console: Allocate irq for each hvc console + Allocate an irq number for each hvc console and set its interrupt-parent + property so that Linux can use the opal irqchip instead of the + OPAL_EVENT_CONSOLE_INPUT interface. +- platforms/firenze: Fix clock frequency dt property: :: + + [ 1.212366090,3] DT: Unexpected property length /xscom@3fc0000000000/i2cm@a0020/clock-frequency + +- HDAT: Fix typo in nest-frequency property + nest-frquency -> nest-frequency +- platforms/ibm-fsp: Use power_ctl bit when determining slot reset method + The power_ctl bit is used to represent if power management is available. + If power_ctl is set to true, then the I2C based external power management + functionality will be populated on the PCI slot. Otherwise we will try to + use the inband PERST as the fundamental reset, as before. +- FSP/ELOG: Fix elog timeout issue + Presently we set timeout value as soon as we add elog to queue. If + we have multiple elogs to write, it doesn't consider queue wait time. + Instead set timeout value when we are actually sending elog to FSP. +- FSP/ELOG: elog_enable flag should be false by default + This issue is one of the corner case, which is related to recent change + went upstream and only observed in the petitboot prompt, where we see + only one error log instead of getting all error log in + ``/sys/firmware/opal/elog``. + + + +POWER9 +------ + +- mambo: Make POWER9 look like DD2 +- flash: Move flash node under ``ibm,opal/flash/`` + This changes the boot ABI, so it's only active for P9 and later systems, + even though it's unrelated to hardware changes. There is an associated + Linux change to properly search for this node as well. 
+- core/cpu.c: Add OPAL call to setup Nest MMU +- psi: On p9, create an interrupt-map for routing PSI interrupts +- lpc: Add P9 LPC interrupts support +- chiptod: Basic P9 support +- psi: Add P9 support + +Testing and Debugging +--------------------- + +- test/qemu: bump qemu version used in CI, adds IPMI support +- platform/qemu: add BT and IPMI support + Enables testing BT and IPMI functionality in the Qemu simulator +- init: In debug builds, enable debug output to console +- mem_region: Be a bit smarter about poisoning + Don't poison chunks that are already free and poison regions on + first allocation. This speeds things up dramatically. +- libc: Use 8-bytes stores for non-0 memset too + Memory poisoning hammers this, so let's be a bit smart about it and + avoid falling back to byte stores when the data is not 0 +- fwts: add annotation for manufacturing mode +- check: Fix bugs in mem region tests +- Don't set -fstack-protector-all unconditionally + We set it already in DEBUG builds and we use -fstack-protector-strong + in release builds which provides most of the benefits and is more + efficient. +- Build host programs (and checks) with debug enabled + This enables memory poisoning in allocations and list checking + among other things. +- Add global DEBUG make flag + + +Contributors +------------ + +Extending the analysis done for the last few releases, we can see our trends +in code review across versions: + +======== ====== ======= ======= ====== ======== +Release csets Ack Reviews Tested Reported +======== ====== ======= ======= ====== ======== +5.0 329 15 20 1 0 +5.1 372 13 38 1 4 +5.2-rc1 334 20 34 6 11 +5.3-rc1 302 36 53 4 5 +5.4-rc1 278 8 19 0 4 +======== ====== ======= ======= ====== ======== + +This release has fewer changesets over previous 5.x first release candidates, +but that is not indicative of the size or complexity of these changes. 
+ + +Processed 278 csets from 31 developers +A total of 17052 lines added, 4745 removed (delta 12307) + +Developers with the most changesets + +=========================== == ======= +=========================== == ======= +Stewart Smith 71 (25.5%) +Benjamin Herrenschmidt 50 (18.0%) +Claudio Carvalho 38 (13.7%) +Gavin Shan 20 (7.2%) +Oliver O'Halloran 18 (6.5%) +Mukesh Ojha 9 (3.2%) +Cyril Bur 7 (2.5%) +Russell Currey 7 (2.5%) +Vasant Hegde 7 (2.5%) +Pridhiviraj Paidipeddi 6 (2.2%) +Michael Neuling 6 (2.2%) +Alistair Popple 4 (1.4%) +Sam Mendoza-Jonas 3 (1.1%) +Vipin K Parashar 3 (1.1%) +Balbir Singh 3 (1.1%) +Mahesh Salgaonkar 3 (1.1%) +Frederic Barrat 3 (1.1%) +Chris Smart 2 (0.7%) +Jack Miller 2 (0.7%) +Patrick Williams 2 (0.7%) +Jeremy Kerr 2 (0.7%) +Suraj Jitindar Singh 2 (0.7%) +Milton Miller 2 (0.7%) +Shilpasri G Bhat 1 (0.4%) +Frederic Bonnard 1 (0.4%) +Joel Stanley 1 (0.4%) +Breno Leitao 1 (0.4%) +Anton Blanchard 1 (0.4%) +Nicholas Piggin 1 (0.4%) +Nageswara R Sastry 1 (0.4%) +Cédric Le Goater 1 (0.4%) +=========================== == ======= + +Developers with the most changed lines + +========================= ==== ======= +========================= ==== ======= +Claudio Carvalho 6817 (38.2%) +Stewart Smith 4677 (26.2%) +Benjamin Herrenschmidt 2586 (14.5%) +Gavin Shan 1005 (5.6%) +Cyril Bur 509 (2.9%) +Mukesh Ojha 361 (2.0%) +Oliver O'Halloran 343 (1.9%) +Russell Currey 343 (1.9%) +Balbir Singh 227 (1.3%) +Pridhiviraj Paidipeddi 194 (1.1%) +Michael Neuling 121 (0.7%) +Cédric Le Goater 115 (0.6%) +Vipin K Parashar 68 (0.4%) +Alistair Popple 66 (0.4%) +Vasant Hegde 65 (0.4%) +Shilpasri G Bhat 45 (0.3%) +Suraj Jitindar Singh 41 (0.2%) +Nicholas Piggin 34 (0.2%) +Sam Mendoza-Jonas 33 (0.2%) +Jack Miller 32 (0.2%) +Nageswara R Sastry 32 (0.2%) +Jeremy Kerr 23 (0.1%) +Mahesh Salgaonkar 21 (0.1%) +Chris Smart 20 (0.1%) +Milton Miller 19 (0.1%) +Patrick Williams 11 (0.1%) +Frederic Barrat 6 (0.0%) +Anton Blanchard 3 (0.0%) +Frederic Bonnard 2 (0.0%) +Joel Stanley 
2 (0.0%) +Breno Leitao 2 (0.0%) +========================= ==== ======= + +Developers with the most lines removed + +========================= ==== ======= +========================= ==== ======= +Cyril Bur 299 (6.3%) +========================= ==== ======= + +Developers with the most signoffs (total 226) + +========================= ==== ======= +========================= ==== ======= +Stewart Smith 219 (96.9%) +Alistair Popple 4 (1.8%) +Cyril Bur 1 (0.4%) +Jeremy Kerr 1 (0.4%) +Benjamin Herrenschmidt 1 (0.4%) +========================= ==== ======= + +Developers with the most reviews (total 19) + +========================= ==== ======= +========================= ==== ======= +Mukesh Ojha 5 (26.3%) +Andrew Donnellan 4 (21.1%) +Vasant Hegde 3 (15.8%) +Russell Currey 3 (15.8%) +Balbir Singh 2 (10.5%) +Cyril Bur 1 (5.3%) +Vaidyanathan Srinivasan 1 (5.3%) +========================= ==== ======= + +Developers with the most test credits (total 0) + +Developers who gave the most tested-by credits (total 0) + +Developers with the most report credits (total 4) + +========================= ==== ======= +========================= ==== ======= +Benjamin Herrenschmidt 1 (25.0%) +Li Meng 1 (25.0%) +Pridhiviraj Paidipeddi 1 (25.0%) +Gavin Shan 1 (25.0%) +========================= ==== ======= + +Developers who gave the most report credits (total 4) + +========================= ==== ======= +========================= ==== ======= +Gavin Shan 1 (25.0%) +Vasant Hegde 1 (25.0%) +Russell Currey 1 (25.0%) +Stewart Smith 1 (25.0%) +========================= ==== ======= diff --git a/roms/skiboot/doc/release-notes/skiboot-5.4.0-rc2.rst b/roms/skiboot/doc/release-notes/skiboot-5.4.0-rc2.rst new file mode 100644 index 000000000..7d8e61d47 --- /dev/null +++ b/roms/skiboot/doc/release-notes/skiboot-5.4.0-rc2.rst @@ -0,0 +1,272 @@ +.. _skiboot-5.4.0-rc2: + +================= +skiboot-5.4.0-rc2 +================= + +skiboot-5.4.0-rc2 was released on Wednesday October 26th 2016. 
It is the +second release candidate of skiboot 5.4, which will become the new stable +release of skiboot following the 5.3 release, first released August 2nd 2016. + +skiboot-5.4.0-rc2 contains all bug fixes as of :ref:`skiboot-5.3.7` +and :ref:`skiboot-5.1.18` (the currently maintained stable releases). + +For how the skiboot stable releases work, see :ref:`stable-rules` for details. + +Since this is a release candidate, it should *NOT* be put into production. + +The current plan is to release a new release candidate every week until we +feel good about it. The aim is for skiboot-5.4.x to be in op-build v1.13, which +is due by November 23rd 2016. + +Over :ref:`skiboot-5.4.0-rc1`, we have a few changes: + +Secure and Trusted Boot +======================= + +skiboot 5.4.0-rc2 improves upon the progress towards Secure and Trusted Boot +in rc1. It is important to note that this is *not* a complete, end-to-end +secure/trusted boot implementation. + +With the current code, it is now possible to verify and measure resources +loaded from PNOR by skiboot (namely the CAPP and BOOTKERNEL partitions). + +Note that this functionality is currently *only* available on systems that +use the libflash backend. It is *NOT* enabled on IBM FSP based systems. +There is some support for some simulators though. + +- libstb/stb.c: ignore the secure mode flag unless forced in NVRAM + + For this stage in Trusted Boot development, we are wishing to not + force Secure Mode through the whole firmware boot process, but we + are wanting to be able to test it (classic chicken and egg problem with + build infrastructure). + + We disabled secure mode if the secure-enabled devtree property is + read from the device tree *IF* we aren't overriding it through NVRAM. + Seeing as we can only increase (not decrease) what we're checking through + the NVRAM variable, it is safe. + + The NVRAM setting is force-secure-mode=true in the ibm,skiboot partition. 
+ + However, if you want to force secure mode even if Hostboot has *not* set + the secure-enabled property in the device tree, set force-secure-mode + to "always". + + There is also a force-trusted-mode NVRAM setting to force trusted mode + even if Hostboot has not enabled it in the device tree. + + To indicate to Linux that we haven't gone through the whole firmware + process in secure mode, we replace the 'secure-enabled' property with + 'partial-secure-enabled', to indicate that only part of the firmware + boot process has gone through secure mode. + + +Command line arguments to BOOTKERNEL +==================================== + +- core/init.c: Fix bootargs parsing + + Currently the bootargs are unconditionally deleted, which causes + a bug where the bootargs passed in by the device tree are lost. + + This patch deletes bootargs only if they need to be replaced by the NVRAM + entry. + + This patch also removes KERNEL_COMMAND_LINE config option in favour of + using the NVRAM or a device tree. + +pflash utility +============== + +- external/pflash: Make MTD accesses the default + + Now that BMC and host kernel mtd drivers exist and have matured we + should use them by default. + + This is especially important since we seem to be telling everyone to use + pflash (pflash world domination plans are continuing on schedule). +- external/pflash: Catch incompatible combination of flags +- external/common: arm: Don't error trying to wrprotect with MTD access +- libflash/libffs: Use blocklevel_smart_write() when updating partitions + +Other changes +============= +- extract-gcov: build with -m64 if compiler supports it. + + Fixes build break on 32bit ppc64 (e.g. PowerMac G5, where user space + is mostly 32bit). + +Fast Reset +========== + +- fast-reset: disable fast reboot in event of platform error + + Most of the time, if we're rebooting due to a platform error, we should + trigger a checkstop. 
However, if we haven't been told what we should do + to trigger a checkstop (e.g. on an FSP machine), then we should still + fail to fast-reboot. + + So, disable fast-reboot in the OPAL_CEC_REBOOT2 code path + for OPAL_REBOOT_PLATFORM_ERROR reboot type. +- fast-reboot: disable on FSP code update or unrecoverable HMI +- fast-reboot: abort fast reboot if CAPP attached + + If a PHB is in CAPI mode, we cannot safely fast reboot - the PHB will be + fenced during the reboot resulting in major problems when we load the new + kernel. + + In order to handle this safely, we need to disable CAPI mode before + resetting PHBs during the fast reboot. However, we don't currently support + this. + + In the meantime, when fast rebooting, check if there are any PHBs with a + CAPP attached, and if so, abort the fast reboot and revert to a normal + reboot instead. + +OpenPOWER Platforms +=================== + +For all hardware platforms that aren't IBM FSP machines: + +- Revert "flash: Move flash node under ibm,opal/flash/" + + This reverts commit e1e6d009860d0ef60f9daf7a0fbe15f869516bd0. + + Breaks DT enough that it makes people cranky, reverting for now. + This could break access to flash with existing kernels in POWER9 simulators + +- flash: rework flash_load_resource to correctly read FFS/STB + + This fixes the previous reverts of loading the CAPP partition with + STB headers (which broke CAPP partitions without STB headers). + + The new logic fixes both CAPP partition loading with STB headers *and* + addresses a long standing bug due to differing interpretations of FFS. + + The f_part utility that *constructs* PNOR files just sets actualSize=totalSize + no matter on what the size of the partition is. Prior to this patch, + skiboot would always load actualSize, leading to longer than needed IPL. 
+ + The pflash utility updates actualSize, so no developer has really ever + noticed this, apart from maybe an inkling that it's odd that a freshly + baked PNOR from op-build takes ever so slightly longer to boot than one + that has had individual partitions pflashed in. + + With this patch, we now compute actualSize. For partitions with a STB + header, we take the payload size from the STB header. For partitions + that don't have a STB header, we compute the size either by parsing + the ELF header or by looking at the subpartition header and computing it. + + We now need to read the entire partition for partitions with subpartitions + so that we pass consistent values to be measured as part of Trusted Boot. + + As of this patch, the actualSize field in FFS is *not* relied on for + partition size, we determine it from the content of the partition. + + However, this patch *will* break loading of partitions that are not ELF + and do not contain subpartitions. Luckily, nothing in-tree makes use of + that. + +PCI +=== +- pci: Check power state before powering off slot + + Prevents the erroneous "Error -1 powering off slot" error message. + +Contributors +============ +Since :ref:`skiboot-5.4.0-rc1`, we have 23 csets from 8 developers. 
+ +A total of 876 lines added, 621 removed (delta 255) + +Developers with the most changesets + +============================ = ======= +Developer # % +============================ = ======= +Stewart Smith 7 (30.4%) +Cyril Bur 5 (21.7%) +Mukesh Ojha 3 (13.0%) +Gavin Shan 3 (13.0%) +Claudio Carvalho 2 (8.7%) +Chris Smart 1 (4.3%) +Andrew Donnellan 1 (4.3%) +Nageswara R Sastry 1 (4.3%) +============================ = ======= + +Developers with the most changed lines + +========================== === ======= +Developer # % +========================== === ======= +Stewart Smith 424 (45.7%) +Mukesh Ojha 204 (22.0%) +Gavin Shan 173 (18.6%) +Cyril Bur 69 (7.4%) +Claudio Carvalho 35 (3.8%) +Andrew Donnellan 13 (1.4%) +Chris Smart 8 (0.9%) +Nageswara R Sastry 2 (0.2%) +========================== === ======= + +Developers with the most lines removed + +============================ = ======= +Developer # % +============================ = ======= +Gavin Shan 9 (1.4%) +Chris Smart 4 (0.6%) +============================ = ======= + +Developers with the most signoffs (total 16) + +=========================== == ======== +Developer # % +=========================== == ======== +Stewart Smith 16 (100.0%) +=========================== == ======== + +Developers with the most reviews (total 4) + +============================ = ======= +Developer # % +============================ = ======= +Vasant Hegde 2 (50.0%) +Andrew Donnellan 2 (50.0%) +============================ = ======= + +Developers with the most test credits (total 1) + +============================ = ======= +Developer # % +============================ = ======= +Pridhiviraj Paidipeddi 1 (100.0%) +============================ = ======= + +Developers who gave the most tested-by credits (total 1) + +============================ = ======= +Developer # % +============================ = ======= +Gavin Shan 1 (100.0%) +============================ = ======= + +Developers with the most report credits (total 3) + 
+============================ = ======= +Developer # % +============================ = ======= +Pridhiviraj Paidipeddi 1 (33.3%) +Andrei Warkenti 1 (33.3%) +Michael Neuling 1 (33.3%) +============================ = ======= + +Developers who gave the most report credits (total 3) + +============================ = ======= +Developer # % +============================ = ======= +Stewart Smith 2 (66.7%) +Gavin Shan 1 (33.3%) +============================ = ======= diff --git a/roms/skiboot/doc/release-notes/skiboot-5.4.0-rc3.rst b/roms/skiboot/doc/release-notes/skiboot-5.4.0-rc3.rst new file mode 100644 index 000000000..59a1fedd3 --- /dev/null +++ b/roms/skiboot/doc/release-notes/skiboot-5.4.0-rc3.rst @@ -0,0 +1,41 @@ +.. _skiboot-5.4.0-rc3: + +================= +skiboot-5.4.0-rc3 +================= + +skiboot-5.4.0-rc3 was released on Wednesday November 2nd 2016. It is the +third release candidate of skiboot 5.4, which will become the new stable +release of skiboot following the 5.3 release, first released August 2nd 2016. + +skiboot-5.4.0-rc3 contains all bug fixes as of :ref:`skiboot-5.3.7` +and :ref:`skiboot-5.1.18` (the currently maintained stable releases). + +For how the skiboot stable releases work, see :ref:`stable-rules` for details. + +Since this is a release candidate, it should *NOT* be put into production. + +The current plan is to release a new release candidate every week until we +feel good about it. The aim is for skiboot-5.4.x to be in op-build v1.13, which +is due by November 23rd 2016. + +Over :ref:`skiboot-5.4.0-rc2`, we have a few changes: + +- pflash: Fail when file is larger than partition + You can still shoot yourself in the foot by passing --force. +- core/flash: Don't do anything clever for OPAL_FLASH_{READ, WRITE, ERASE} + This fixes a bug where opal-prd and opal-gard could fail. 
+ Fixes: `<https://github.com/open-power/skiboot/issues/44>`_ +- boot-tests: force BMC to boot from non-golden side +- fast-reset: Send special reset sequence to operational CPUs only. + Fixes fast-reset for cases where there are garded CPUs +- Secure/Trusted boot: be much clearer about what is being measured where. +- Secure/Trusted boot: be more resilient to disabled TPM(s). +- Secure/Trusted boot: The ``force-secure-mode`` NVRAM setting introduced + temporarily in :ref:`skiboot-5.4.0-rc2` has changed behaviour. Now, by + default, the ``secure-mode`` flag in the device tree is obeyed. As always, + any skiboot NVRAM options are in no way ABI, API or supported and may cause + unfinished verbose analogies to appear in release notes relating to the + dangers of using developer only options. +- gard: Fix compiler warning on modern GCC targeting ARM 32-bit +- opal-prd: systemd scripts improvements, only run on supported systems diff --git a/roms/skiboot/doc/release-notes/skiboot-5.4.0-rc4.rst b/roms/skiboot/doc/release-notes/skiboot-5.4.0-rc4.rst new file mode 100644 index 000000000..3c79bcec4 --- /dev/null +++ b/roms/skiboot/doc/release-notes/skiboot-5.4.0-rc4.rst @@ -0,0 +1,50 @@ +.. _skiboot-5.4.0-rc4: + +================= +skiboot-5.4.0-rc4 +================= + +skiboot-5.4.0-rc4 was released on Tuesday November 8th 2016. It is the +fourth (and hopefully final) release candidate of skiboot 5.4, which will +become the new stable release of skiboot following the 5.3 release, first +released August 2nd 2016. + +skiboot-5.4.0-rc4 contains all bug fixes as of :ref:`skiboot-5.3.7` +and :ref:`skiboot-5.1.18` (the currently maintained stable releases). + +For how the skiboot stable releases work, see :ref:`stable-rules` for details. + +Since this is a release candidate, it should *NOT* be put into production. + +With this release candidate, I'm hoping that it's the last one, and that within +the week we're able to tag a final 5.4.0 release. 
There is one bit of code I'm +hoping to merge in before the final 5.4.0, and that's the p8dtu platform +definition. The aim is for skiboot-5.4.x to be in op-build v1.13, which is due +by November 23rd 2016. + +Over :ref:`skiboot-5.4.0-rc3`, we have a few changes: + +- Add BMC platform to enable correct OEM IPMI commands + + An out of tree platform (p8dtu) uses a different IPMI OEM command + for IPMI_PARTIAL_ADD_ESEL. This exposed some assumptions about the BMC + implementation in our core code. + + Now, with platform.bmc, each platform can dictate (or detect) the BMC + that is present. We allow it to be set at runtime rather than purely + statically in struct platform as it's possible to have differing BMC + implementations on the one machine (e.g. AMI BMC or OpenBMC). + +- hw/ipmi-sensor: Fix setting of firmware progress sensor properly. + + On FSP systems, OPAL was incorrectly setting firmware status + on a sensor id "00" which doesn't exist. + +- pflash: remove stray d in from info message +- libflash/pflash: support whole chip erase on mtd access +- boot_test: fix typo in console message +- core/pci: Fix criteria in pci_cfg_reg_filter(), i.e. NVLink didn't work. + +- Remove KERNEL_COMMAND_LINE mention from config.h + + We removed the functionality but not the define. diff --git a/roms/skiboot/doc/release-notes/skiboot-5.4.0.rst b/roms/skiboot/doc/release-notes/skiboot-5.4.0.rst new file mode 100644 index 000000000..eb33de55e --- /dev/null +++ b/roms/skiboot/doc/release-notes/skiboot-5.4.0.rst @@ -0,0 +1,690 @@ +.. _skiboot-5.4.0: + +============= +skiboot-5.4.0 +============= + +skiboot-5.4.0 was released on Friday November 11th 2016. It is the new stable +skiboot release, taking over from the 5.3.x series (first released August 2nd, +2016). It comes after four release candidates, which have helped to shake out +a few issues. + +skiboot-5.4.0 contains all bug fixes as of :ref:`skiboot-5.3.7` +and :ref:`skiboot-5.1.18` (the currently maintained stable releases). 
+ +Skiboot 5.4.x becomes the new stable release. For how the skiboot stable +releases work, see :ref:`stable-rules` for details. + +Over :ref:`skiboot-5.4.0-rc4`, we have a few changes: + +- libstb: bump up the byte timeout for tpm i2c requests + + This bumps up the byte timeout for tpm i2c requests from 10ms to 30ms. + Some p8dtu systems are getting i2c request timeout. + +- external/pflash: Perform the correct cleanup when -F is used to operate on + a file. + +- Add SuperMicro p8dtu1u and p8dtu2u platforms + +- Revert "core/ipmi: Set interrupt-parent property". + This reverts commit d997e482705d9fdff8e25fcbe07fb56008f96ae1 (introduced + in 5.4.0-rc1) + + A problem was found with pre 4.2 linux kernels where a spurious WARNING + would be emitted. This change doesn't matter enough to scare users + so we can just revert it. :: + + Warning was: + [ 0.947741] irq: irq-62==>hwirq-0x3e mapping failed: -22 + [ 0.947793] ------------[ cut here ]------------ + [ 0.947838] WARNING: at kernel/irq/irqdomain.c:485 + +- libflash/libffs: Fix possible NULL dereference + +Previous Release Candidates +--------------------------- + +There were four release candidates for skiboot 5.4.0: + +- :ref:`skiboot-5.4.0-rc4` +- :ref:`skiboot-5.4.0-rc3` +- :ref:`skiboot-5.4.0-rc2` +- :ref:`skiboot-5.4.0-rc1` + +Changes since skiboot 5.3 +========================= + +Over skiboot-5.3, we have the following changes: + +New Features +------------ + +- Add SuperMicro p8dtu1u and p8dtu2u platforms +- Initial Trusted Boot support (see :ref:`stb-overview`). + There are several limitations with this initial release: + + - Only Nuvoton TPM 2.0 is supported + - Requires hardware rework on late revision Habanero or Firestone boards + in order to install TPM. 
+ + - Add i2c Nuvoton TPM 2.0 Driver + - romcode driver for POWER8 secure ROM + - See Device tree docs: :ref:`device-tree/tpm` and :ref:`device-tree/ibm,secureboot` + - See :ref:`stb-overview` + +- Support ``ibm,skiboot`` NVRAM partition with skiboot configuration options. + + - These should generally only be used if you either completely know what + you are doing or need to work around a skiboot bug. They are **not** + intended for end users and are *explicitly* **NOT ABI**. + - Add support for supplying the kernel boot arguments from the ``bootargs`` + configuration string in the ``ibm,skiboot`` NVRAM partition. + - Enabling the experimental fast reset feature is done via this method. + +- Add support for nap mode on P8 while in skiboot + + - While nap has been exposed to the Operating System since day 1, we have + not utilized low power states when in skiboot itself, leading to higher + power consumption during boot. + We only enable the functionality after the 0x100 vector has been + patched, and we disable it before transferring control to Linux. + +- libflash: add 128MB MX66L1G45G part + +- Pointer validation of OPAL API call arguments. + + - If the kernel called an OPAL API with vmalloc'd address + or any other address range in real mode, we would hit + a problem with aliasing. Since the top 4 bits are ignored + in real mode, pointers from 0xc.. and 0xd.. (and other ranges) + could collide and lead to hard to solve bugs. This patch + adds the infrastructure for pointer validation and a simple + test case for testing the API + - The checks validate pointers sent in using ``opal_addr_valid()`` + +- Fast reboot for P8 + + This makes reboot take an *awful* lot less time, somewhere between four + and ten times faster than a full IPL. It is currently experimental and not + enabled by default. 
+ You can enable the experimental support via nvram option: :: + + # nvram -p ibm,skiboot --update-config experimental-fast-reset=feeling-lucky + + **WARNING**: While we *think* we've managed to work out or around most of + the kinks with fast-reset, we are *not* enabling it by default in 5.4. + + Notably, fast reset will *not* happen in the following scenarios: + + - platform error + + Most of the time, if we're rebooting due to a platform error, we should + trigger a checkstop. However, if we haven't been told what we should do + to trigger a checkstop (e.g. on an FSP machine), then we should still + fail to fast-reboot. + + So, fast-reboot is disabled in the OPAL_CEC_REBOOT2 code path + for the OPAL_REBOOT_PLATFORM_ERROR reboot type. + - FSP code update + - Unrecoverable HMI + - A PHB is in CAPI mode + + If a PHB is in CAPI mode, we cannot safely fast reboot - the PHB will be + fenced during the reboot resulting in major problems when we load the new + kernel. + + In order to handle this safely, we need to disable CAPI mode before + resetting PHBs during the fast reboot. However, we don't currently support + this. + + In the meantime, when fast rebooting, check if there are any PHBs with a + CAPP attached, and if so, abort the fast reboot and revert to a normal + reboot instead. + + +Documentation +------------- + +There have been a number of documentation fixes this release. Most prominent +is the switch to Sphinx (from the Python project) and ReStructured Text (RST) +as the documentation format. RST and Sphinx enable both production of pretty +documentation in HTML and PDF formats while remaining readable in their raw +form to those with no knowledge of RST. + +You can build a HTML site by doing the following: :: + + cd doc/ + make html + +As always, documentation patches are very, *very* welcome as we attempt to +document the OPAL API, the device tree bindings and important parts of +OPAL internals. 
+ +We would like the Device Tree documentation to follow the style that can be +included in the Device Tree Specification. + + +General +------- +- Make console-log time more readable: seconds rather than timebase + Log format is now ``[SECONDS.(tb%512000000),LEVEL]`` + +- Flash (PNOR) code improvements + + - flash: Make size 64 bit safe + This makes the size of flash 64 bit safe so that we can have flash + devices greater than 4GB. This is especially useful for mambo disks + passed through to Linux. + - core/flash.c: load actual partition size + We are downloading 0x20000 bytes from PNOR for CAPP, but currently the + CAPP lid is only 40K. + - flash: Rework error paths and messages for multiple flash controllers + Now that we have mambo bogusdisk flash, we can have many flash chips. + This is resulting in some confusing output messages. + +- core/init: Fix "failure of getting node in the free list" warning on boot. +- slw: improve error message for SLW timer stuck + +- Centaur / XSCOM error handling + + - print message on disabling xscoms to centaur due to many errors + - Mark centaur offline after 10 consecutive access errors + +- XSCOM improvements + + - xscom: Map all HMER status codes to OPAL errors + - xscom: Initialize the data to a known value in ``xscom_read`` + In case of error, don't leave the data random. It helps debugging when + the user fails to check the error code. This happens due to a bug in the + PRD wrapper app. + - chip: Add a quirk for when core direct control XSCOMs are missing + +- p8-i2c: Don't crash if a centaur errored out + +- cpu: Make endian switch message more informative +- cpu: Display number of started CPUs during boot +- core/init: ensure that HRMOR is zero at boot +- asm: Fix backtrace for unexpected exception + +- cpu: Remove pollers calling heuristics from ``cpu_wait_job`` + This will be handled by ``time_wait_ms()``. Also remove a useless + ``smt_medium()``. 
+ Note that this introduces a difference in behaviour: time_wait + will only call the pollers on the boot CPU while ``cpu_wait_job()`` + could call them on any. However, I can't think of a case where + this is a problem. + +- cpu: Remove global job queue + Instead, target a specific CPU for a global job at queuing time. + This will allow us to wake up the target using an interrupt when + implementing nap mode. + The algorithm used is to look for idle primary threads first, then + idle secondaries, and finally the least loaded thread. If nothing can + be found, we fall back to a synchronous call. +- lpc: Log LPC SYNC errors as unrecoverable ones for manufacturing +- lpc: Optimize SerIRQ dispatch based on which PSI IRQ fired +- interrupts: Add new source ``->attributes()`` callback + This allows a given source to provide per-interrupt attributes + such as whether it targets OPAL or Linux and its estimated + frequency. + + The former allows us to get rid of the double set of ops used to + decide which interrupts go where on some modules like the PHBs + and the latter will eventually be used to implement smart + caching of the source lookups. +- opal/hmi: Fix a TOD HMI failure during a race condition. +- platform: Add BT to Generic platform + + +NVRAM +----- +- Support ``ibm,skiboot`` partition for skiboot specific configuration options +- flash: Size NVRAM based on ECC for OpenPOWER platforms + If NVRAM has ECC (as per the ffs header in the PNOR) then the + actual size of the partition is less than reported by the ffs header. + +NVLink/NPU +---------- + +- Fix reserved PE# +- NPU bdfn allocation bugfix +- Fix bad PE number check + NPUs have 4 PEs which are zero indexed, so {0, 1, 2, 3}. A bad PE number + check in npu_err_inject checks if the PE number is greater than 4 as a + fail case, so it would wrongly perform operations on a non-existent PE 4. 
+- Use PCI virtual device +- assert the NPU irq min is aligned. +- program NPU BUID reg properly +- npu: reword "error" to indicate it's actually a warning + Incorrect FWTS annotation. + Without this patch, you get spurious FirmWare Test Suite (FWTS) warnings + about NVLink not working on machines that aren't fully populated with + GPUs. +- external: NPU hardware procedure script + Performing NPU hardware procedures requires some config space magic. + Put all that magic into a script, so you can just specify the target + device and the procedure number. + +PCI +--- + +- Generic fixes + + - Claim surprise hotplug capability + - Reserve PCI buses for RC's slot + - Update PCI topology after power change + - Return slot cached power state + - Cache power state on slot without power control + - Avoid hot resets at boot time + - Fix initial PCIe slot power state + - Print CRS retry times + It's useful to know the CRS retry times before the PCI device is + detected successfully. In PCI hot add case, it usually indicates + time consumed for the adapter's firmware to be partially ready + (responsive PCI config space). + - core/pci: Fix the power-off timeout in ``pci_slot_power_off()`` + The timeout should be 1000ms instead of 1000 ticks while powering + off PCI slot in ``pci_slot_power_off()``. Otherwise, it's likely to + hit timeout powering off the PCI slot as below skiboot logs reveal: :: + + [5399576870,5] PHB#0005:02:11.0 Timeout powering off slot + + - pci: Check power state before powering off slot. + Prevents the erroneous "Error -1 powering off slot" error message. + +- PHB3 + + - Override root slot's ``prepare_link_change()`` with PHB's + - Disable surprise link down event on PCI slots + - Disable ECRC on Broadcom adapter behind PMC switch + +- astbmc platforms + + - Support dynamic PCI slot. We might insert a PCIe switch to PHB direct slot + and the downstream ports of the PCIe switch supports PCI hotplug. 
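The ``pci_slot_power_off()`` fix above comes down to a unit mismatch between milliseconds and timebase ticks. The following is a minimal sketch of that arithmetic, not skiboot's actual code: the helper names ``ms_to_ticks`` and ``ticks_to_us`` are hypothetical, and the 512 MHz timebase is an assumption taken from the ``tb%512000000`` console-log format mentioned in these notes.

```c
#include <stdint.h>

/* Assumed timebase frequency (512 MHz), inferred from the
 * "[SECONDS.(tb%512000000),LEVEL]" console-log format. */
#define TB_HZ 512000000ULL

/* Hypothetical helper: convert a duration in milliseconds to timebase ticks. */
static inline uint64_t ms_to_ticks(uint64_t ms)
{
	return ms * (TB_HZ / 1000);
}

/* Hypothetical helper: convert timebase ticks to whole microseconds
 * (fractions are truncated). */
static inline uint64_t ticks_to_us(uint64_t ticks)
{
	return ticks / (TB_HZ / 1000000);
}
```

Under these assumptions, a 1000ms timeout is 512,000,000 ticks, while a raw value of 1000 ticks is under 2 microseconds, which would be consistent with the slot power-off timing out almost immediately.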
+ + +CAPI +---- + +- hw/phb3: Update capi initialization sequence + The capi initialization sequence was revised in a circumvention + document when a 'link down' error was converted from fatal to Endpoint + Recoverable. Other, non-capi, register setup was corrected even before + the initial open-source release of skiboot, but a few capi-related + registers were not updated then, so this patch fixes it. + + +Mambo Simulator +--------------- + +- Helpers for POWER9 Mambo. +- mambo: Advertise available RADIX page sizes +- mambo: Add section for kernel command line boot args + Users can set kernel command line boot arguments for Mambo in a tcl + script. +- mambo: add exception and qtrace helpers +- external/mambo: Update skiboot.tcl to add page-sizes nodes to device tree + +Simics Simulator +---------------- + +- chiptod: Enable ChipTOD in SIMICS + +Utilities +--------- + +- pflash + + - fix harmless buffer overflow: ``fl_total_size`` was ``uint32_t`` not ``uint64_t``. + - Don't try to write protect when writing to flash file + - Misc small improvements to code and code style + - makefile bug fixes + - external/pflash: Make MTD accesses the default + + Now that BMC and host kernel mtd drivers exist and have matured we + should use them by default. + + This is especially important since we seem to be telling everyone to use + pflash (pflash world domination plans are continuing on schedule). + - external/pflash: Catch incompatible combination of flags + - external/common: arm: Don't error trying to wrprotect with MTD access + - libflash/libffs: Use blocklevel_smart_write() when updating partitions + +- external/boot_tests + + - remove lid from the BMC after flashing + - add the nobooting option -N + - add arbitrary lid option -F + +- ``getscom`` / ``getsram`` / ``putscom``: Parse chip-id as hex + We print the chip-id in hex (without a leading 0x), but we fail to + parse that same value correctly in ``getscom`` / ``getsram`` / ``putscom`` :: + + # getscom -l + ... 
+ 80000000 | DD2.0 | Centaur memory buffer + # getscom -c 80000000 201140a + Error -19 reading XSCOM + + Fix this by assuming base 16 when parsing chip-id. + +PRD +--- + +- opal-prd: Fix error code from ``scom_read`` and ``scom_write`` +- opal-prd: Add get_interface_capabilities to host interfaces +- opal-prd: fix for 64-bit pnor sizes +- occ/prd/opal-prd: Queue OCC_RESET event message to host in OpenPOWER + During an OCC reset cycle the system is forced to Psafe pstate. + When OCC becomes active, the system has to be restored to its + last pstate as requested by host. So host needs to be notified + of OCC_RESET event or else system will continue to remain in + Psafe state until host requests a new pstate after the OCC + reset cycle. + +IBM FSP Based Platforms +----------------------- + +- fsp/console: Allocate irq for each hvc console + Allocate an irq number for each hvc console and set its interrupt-parent + property so that Linux can use the opal irqchip instead of the + OPAL_EVENT_CONSOLE_INPUT interface. +- platforms/firenze: Fix clock frequency dt property: :: + + [ 1.212366090,3] DT: Unexpected property length /xscom@3fc0000000000/i2cm@a0020/clock-frequency + +- HDAT: Fix typo in nest-frequency property + nest-frquency -> nest-frequency +- platforms/ibm-fsp: Use power_ctl bit when determining slot reset method + The power_ctl bit is used to represent if power management is available. + If power_ctl is set to true, then the I2C based external power management + functionality will be populated on the PCI slot. Otherwise we will try to + use the inband PERST as the fundamental reset, as before. +- FSP/ELOG: Fix elog timeout issue + Presently we set timeout value as soon as we add elog to queue. If + we have multiple elogs to write, it doesn't consider queue wait time. + Instead set timeout value when we are actually sending elog to FSP. 
+- FSP/ELOG: elog_enable flag should be false by default + This is a corner case related to a recent upstream change. + It is only observed at the petitboot prompt, where we see + only one error log instead of all error logs in + ``/sys/firmware/opal/elog``. + + + +POWER9 +------ + +Skiboot 5.4 contains only *preliminary* support for POWER9. It's suitable +only for use in simulators. If working on hardware, use more recent skiboot +or development branches. We will not be backporting POWER9 fixes to 5.4.x. + +- mambo: Make POWER9 look like DD2 +- core/cpu.c: Add OPAL call to setup Nest MMU +- psi: On p9, create an interrupt-map for routing PSI interrupts +- lpc: Add P9 LPC interrupts support +- chiptod: Basic P9 support +- psi: Add P9 support + +Testing and Debugging +--------------------- + +- test/qemu: bump qemu version used in CI, adds IPMI support +- platform/qemu: add BT and IPMI support + Enables testing BT and IPMI functionality in the Qemu simulator +- init: In debug builds, enable debug output to console +- mem_region: Be a bit smarter about poisoning + Don't poison chunks that are already free and poison regions on + first allocation. This speeds things up dramatically. +- libc: Use 8-byte stores for non-0 memset too + Memory poisoning hammers this, so let's be a bit smart about it and + avoid falling back to byte stores when the data is not 0 +- fwts: add annotation for manufacturing mode +- check: Fix bugs in mem region tests +- Don't set -fstack-protector-all unconditionally + We set it already in DEBUG builds and we use -fstack-protector-strong + in release builds which provides most of the benefits and is more + efficient. +- Build host programs (and checks) with debug enabled + This enables memory poisoning in allocations and list checking + among other things. 
+- Add global DEBUG make flag + + + +Command line arguments to BOOTKERNEL +==================================== + +- core/init.c: Fix bootargs parsing + + Currently the bootargs are unconditionally deleted, which causes + a bug where the bootargs passed in by the device tree are lost. + + This patch deletes bootargs only if they need to be replaced by the NVRAM + entry. + + This patch also removes KERNEL_COMMAND_LINE config option in favour of + using the NVRAM or a device tree. + + +Other changes +============= +- extract-gcov: build with -m64 if compiler supports it. + + Fixes build break on 32bit ppc64 (e.g. PowerMac G5, where user space + is mostly 32bit). + + +Flash on OpenPOWER platforms +============================ + +- flash: rework flash_load_resource to correctly read FFS/STB + + This fixes the previous reverts of loading the CAPP partition with + STB headers (which broke CAPP partitions without STB headers). + + The new logic fixes both CAPP partition loading with STB headers *and* + addresses a long standing bug due to differing interpretations of FFS. + + The f_part utility that *constructs* PNOR files just sets actualSize=totalSize + no matter what the size of the partition is. Prior to this patch, + skiboot would always load actualSize, leading to longer than needed IPL. + + The pflash utility updates actualSize, so no developer has really ever + noticed this, apart from maybe an inkling that it's odd that a freshly + baked PNOR from op-build takes ever so slightly longer to boot than one + that has had individual partitions pflashed in. + + With this patch, we now compute actualSize. For partitions with a STB + header, we take the payload size from the STB header. For partitions + that don't have a STB header, we compute the size either by parsing + the ELF header or by looking at the subpartition header and computing it. 
+ + We now need to read the entire partition for partitions with subpartitions + so that we pass consistent values to be measured as part of Trusted Boot. + + As of this patch, the actualSize field in FFS is *not* relied on for + partition size, we determine it from the content of the partition. + + However, this patch *will* break loading of partitions that are not ELF + and do not contain subpartitions. Luckily, nothing in-tree makes use of + that. + +Contributors +============ + +Extending the analysis done for the last few releases, we can see our trends +in code review across versions: + +======== ====== ======= ======= ====== ======== +Release csets Ack Reviews Tested Reported +======== ====== ======= ======= ====== ======== +5.0 329 15 20 1 0 +5.1 372 13 38 1 4 +5.2-rc1 334 20 34 6 11 +5.3-rc1 302 36 53 4 5 +5.4-rc1 278 8 19 0 4 +5.4.0 361 16 28 1 9 +======== ====== ======= ======= ====== ======== + +Interesting is the stats of 5.4.0-rc1 versus the final 5.4.0, there's been +a doubling of Acks, an increase in reviewed-by and reported-by. There's +nothing like an impending release to get people to look closer. 
+ +Processed 361 csets from 34 developers +A total of 20206 lines added, 5843 removed (delta 14363) + +Developers with the most changesets: + +========================== === ======= +Developer # % +========================== === ======= +Stewart Smith 105 (29.1%) +Benjamin Herrenschmidt 50 (13.9%) +Claudio Carvalho 47 (13.0%) +Gavin Shan 24 (6.6%) +Cyril Bur 20 (5.5%) +Oliver O'Halloran 18 (5.0%) +Michael Neuling 12 (3.3%) +Mukesh Ojha 12 (3.3%) +Pridhiviraj Paidipeddi 7 (1.9%) +Vasant Hegde 7 (1.9%) +Russell Currey 7 (1.9%) +Joel Stanley 4 (1.1%) +Alistair Popple 4 (1.1%) +Mahesh Salgaonkar 4 (1.1%) +Nageswara R Sastry 4 (1.1%) +Chris Smart 3 (0.8%) +Sam Mendoza-Jonas 3 (0.8%) +Vipin K Parashar 3 (0.8%) +Balbir Singh 3 (0.8%) +Frederic Barrat 3 (0.8%) +leoluo 2 (0.6%) +Rafael Fonseca 2 (0.6%) +Jack Miller 2 (0.6%) +Patrick Williams 2 (0.6%) +Jeremy Kerr 2 (0.6%) +Suraj Jitindar Singh 2 (0.6%) +Milton Miller 2 (0.6%) +Andrew Donnellan 1 (0.3%) +Shilpasri G Bhat 1 (0.3%) +Frederic Bonnard 1 (0.3%) +Breno Leitao 1 (0.3%) +Anton Blanchard 1 (0.3%) +Nicholas Piggin 1 (0.3%) +Cédric Le Goater 1 (0.3%) +========================== === ======= + +Developers with the most changed lines: + +========================= ==== ======= +Developer # % +========================= ==== ======= +Claudio Carvalho 6947 (32.9%) +Stewart Smith 6667 (31.6%) +Benjamin Herrenschmidt 2586 (12.3%) +Gavin Shan 1185 (5.6%) +Cyril Bur 692 (3.3%) +Mukesh Ojha 565 (2.7%) +Oliver O'Halloran 343 (1.6%) +Russell Currey 343 (1.6%) +leoluo 269 (1.3%) +Pridhiviraj Paidipeddi 236 (1.1%) +Balbir Singh 227 (1.1%) +Michael Neuling 211 (1.0%) +Nageswara R Sastry 132 (0.6%) +Cédric Le Goater 115 (0.5%) +Vipin K Parashar 68 (0.3%) +Alistair Popple 66 (0.3%) +Vasant Hegde 65 (0.3%) +Mahesh Salgaonkar 50 (0.2%) +Shilpasri G Bhat 45 (0.2%) +Suraj Jitindar Singh 41 (0.2%) +Nicholas Piggin 34 (0.2%) +Sam Mendoza-Jonas 33 (0.2%) +Jack Miller 32 (0.2%) +Chris Smart 28 (0.1%) +Jeremy Kerr 23 (0.1%) +Milton Miller 19 
(0.1%) +Joel Stanley 13 (0.1%) +Andrew Donnellan 13 (0.1%) +Rafael Fonseca 12 (0.1%) +Patrick Williams 11 (0.1%) +Frederic Barrat 6 (0.0%) +Anton Blanchard 3 (0.0%) +Frederic Bonnard 2 (0.0%) +Breno Leitao 2 (0.0%) +========================= ==== ======= + +Developers with the most lines removed: + +========================== === ====== +Developer # % +========================== === ====== +Cyril Bur 206 (3.5%) +Rafael Fonseca 8 (0.1%) +========================== === ====== + +Developers with the most signoffs (total 278): + +========================== === ======= +Developer # % +========================== === ======= +Stewart Smith 268 (96.4%) +Alistair Popple 4 (1.4%) +Jim Yuan 2 (0.7%) +Cyril Bur 1 (0.4%) +Michael Neuling 1 (0.4%) +Jeremy Kerr 1 (0.4%) +Benjamin Herrenschmidt 1 (0.4%) +========================== === ======= + +Developers with the most reviews (total 28): + +========================== === ======= +Developer # % +========================== === ======= +Andrew Donnellan 6 (21.4%) +Vasant Hegde 5 (17.9%) +Mukesh Ojha 5 (17.9%) +Joel Stanley 3 (10.7%) +Russell Currey 3 (10.7%) +Cyril Bur 2 (7.1%) +Balbir Singh 2 (7.1%) +Alistair Popple 1 (3.6%) +Vaidyanathan Srinivasan 1 (3.6%) +========================== === ======= + +Developers with the most test credits (total 1): + +========================== === ======== +Developer # % +========================== === ======== +Pridhiviraj Paidipeddi 1 (100.0%) +========================== === ======== + +Developers who gave the most tested-by credits (total 1): + +========================== === ======== +Developer # % +========================== === ======== +Gavin Shan 1 (100.0%) +========================== === ======== + + +Developers with the most report credits (total 9): + +========================== === ======== +Developer # % +========================== === ======== +Pridhiviraj Paidipeddi 3 (33.3%) +Gavin Shan 1 (11.1%) +Vasant Hegde 1 (11.1%) +Michael Neuling 1 (11.1%) +Benjamin Herrenschmidt 1 (11.1%) 
+Andrei Warkenti 1 (11.1%) +Li Meng 1 (11.1%) +========================== === ======== diff --git a/roms/skiboot/doc/release-notes/skiboot-5.4.1.rst b/roms/skiboot/doc/release-notes/skiboot-5.4.1.rst new file mode 100644 index 000000000..ffa164385 --- /dev/null +++ b/roms/skiboot/doc/release-notes/skiboot-5.4.1.rst @@ -0,0 +1,27 @@ +.. _skiboot-5.4.1: + +============= +skiboot-5.4.1 +============= + +skiboot-5.4.1 was released on Tuesday November 29th 2016. It replaces +:ref:`skiboot-5.4.0` as the current stable release. + +Over :ref:`skiboot-5.4.0`, we have a few changes: + +- Nuvoton i2c TPM driver: bug fixes and improvements, especially around + timeouts and error handling. +- Limit number of "Poller recursion detected" errors to display. + In some error conditions, we could spiral out of control on this + and spend all of our time printing the exact same backtrace. +- slw: do SLW timer testing while holding xscom lock. + In some situations without this, it could take long enough to get + the xscom lock that the 1ms timeout would expire and we'd falsely + think the SLW timer didn't work when in fact it did. +- p8i2c: Use calculated poll_interval when booting OPAL. + Otherwise we'd default to 2seconds (TIMER_POLL) during boot on + chips with a functional i2c interrupt, leading to slow i2c + during boot (or hitting timeouts instead). +- i2c: More efficiently run TPM I2C operations during boot, avoiding hitting + timeouts +- fsp: Don't recurse pollers in ibm_fsp_terminate diff --git a/roms/skiboot/doc/release-notes/skiboot-5.4.10.rst b/roms/skiboot/doc/release-notes/skiboot-5.4.10.rst new file mode 100644 index 000000000..97e0f5c1f --- /dev/null +++ b/roms/skiboot/doc/release-notes/skiboot-5.4.10.rst @@ -0,0 +1,58 @@ +.. _skiboot-5.4.10: + +============== +skiboot-5.4.10 +============== + +skiboot-5.4.10 was released on Monday May 28th, 2018. It replaces +:ref:`skiboot-5.4.9` as the current stable release in the 5.4.x series. 
+ +Over :ref:`skiboot-5.4.9`, we have a few bug fixes: + +- opal-prd: Do not error out on first failure for soft/hard offline. + + The memory errors (CEs and UEs) that are detected as part of background + memory scrubbing are reported by PRD asynchronously to opal-prd along with + affected memory ranges. hservice_memory_error() converts these ranges into + page granularity before hooking them up to soft/hard offline-ing + infrastructure. + + But the current implementation of hservice_memory_error() does not hook up + all the pages to soft/hard offline-ing if any of the page offline actions + fails. For example, hard offline can fail for: + + - Pages that are not part of buddy managed pool. + - Pages that are reserved by kernel using memblock_reserved() + - Pages that are in use by kernel. + + But for the pages that are in use by user space application, the hard + offline marks the page as hwpoison, sends SIGBUS signal to kill the + affected application as recovery action and returns success. + + Hence, it is possible that some of the pages in that memory range are in + use by application or free. By stopping on first error we lose the + opportunity to hwpoison the subsequent pages which may be free or in use by + application. This patch fixes this issue. +- OPAL_PCI_SET_POWER_STATE: fix locking in error paths + + Otherwise we could exit OPAL holding locks, potentially leading + to all sorts of problems later on. +- p8-i2c: Limit number of retry attempts + + Currently we will attempt to start an I2C transaction until it succeeds. + In the event that the OCC does not release the lock on an I2C bus this + results in an async token being held forever and the kernel thread that + started the transaction will block forever while waiting for an async + completion message. Fix this by limiting the number of attempts to + start the transaction. 
+- FSP/CONSOLE: Disable notification on unresponsive consoles + + Commit fd6b71fc fixed the situation where ipmi console was open (hvc0) but got + data on different console (hvc1). + + During FSP R/R OPAL closes all consoles. After R/R complete FSP requests to + open hvc1 and sends data on this. If hvc1 registration failed or not opened in + host kernel then it will not read data and results in RCU stalls. + + Note that this is a workaround for older kernels where we don't have separate irq + for each console. Latest kernel works fine without this patch. diff --git a/roms/skiboot/doc/release-notes/skiboot-5.4.11.rst b/roms/skiboot/doc/release-notes/skiboot-5.4.11.rst new file mode 100644 index 000000000..31b3f9c62 --- /dev/null +++ b/roms/skiboot/doc/release-notes/skiboot-5.4.11.rst @@ -0,0 +1,27 @@ +.. _skiboot-5.4.11: + +============== +skiboot-5.4.11 +============== + +skiboot-5.4.11 was released on Wednesday Dec 4th, 2019. It replaces +:ref:`skiboot-5.4.10` as the current stable release in the 5.4.x series. + +Over :ref:`skiboot-5.4.10`, we have the following bug fix to support inband ipmi +interface: + +- FSP/IPMI: Handle FSP reset reload + FSP IPMI driver serializes ipmi messages. It sends message to FSP and waits + for response before sending new message. It works fine as long as we get + response from FSP on time. + + If we have inflight ipmi message during FSP R/R, we will not get response + from FSP. So if we initiate inband FSP R/R then all subsequent inband ipmi + messages get blocked. + + Sequence: + - ipmitool mc reset cold + - <FSP R/R complete> + - ipmitool <any command> <-- gets blocked + + This patch clears inflight ipmi messages after FSP R/R complete. diff --git a/roms/skiboot/doc/release-notes/skiboot-5.4.12.rst b/roms/skiboot/doc/release-notes/skiboot-5.4.12.rst new file mode 100644 index 000000000..935e7281c --- /dev/null +++ b/roms/skiboot/doc/release-notes/skiboot-5.4.12.rst @@ -0,0 +1,14 @@ +.. 
_skiboot-5.4.12: + +============== +skiboot-5.4.12 +============== + +skiboot-5.4.12 was released on Thursday Oct 22nd, 2020. It replaces +:ref:`skiboot-5.4.11` as the current stable release in the 5.4.x series. + +Over :ref:`skiboot-5.4.11`, we have below bug fix to support FSP based +system : + + +- FSP/NVRAM: Do not assert in vNVRAM statistics call diff --git a/roms/skiboot/doc/release-notes/skiboot-5.4.2.rst b/roms/skiboot/doc/release-notes/skiboot-5.4.2.rst new file mode 100644 index 000000000..d8e353dd4 --- /dev/null +++ b/roms/skiboot/doc/release-notes/skiboot-5.4.2.rst @@ -0,0 +1,15 @@ +.. _skiboot-5.4.2: + +============= +skiboot-5.4.2 +============= + +skiboot-5.4.2 was released on Friday December 2nd 2016. It replaces +:ref:`skiboot-5.4.1` as the current stable release. + +Over :ref:`skiboot-5.4.1`, we have two bug fixes exclusively aimed at machines +with TPMs: + +- i2c: Add nuvoton TPM quirk, disallowing i2cdetect as it can hard lock the TPM +- p8-i2c improve I2C reset code path, solves getting stuck resetting i2c engine + diff --git a/roms/skiboot/doc/release-notes/skiboot-5.4.3.rst b/roms/skiboot/doc/release-notes/skiboot-5.4.3.rst new file mode 100644 index 000000000..0e9a116c9 --- /dev/null +++ b/roms/skiboot/doc/release-notes/skiboot-5.4.3.rst @@ -0,0 +1,18 @@ +.. _skiboot-5.4.3: + +============= +skiboot-5.4.3 +============= + +skiboot-5.4.3 was released on Monday January 16th, 2017. It replaces +:ref:`skiboot-5.4.2` as the current stable release. + +Over :ref:`skiboot-5.4.2`, we have a small number of bug fixes: + +- Makefile: Disable stack protector due to gcc problems +- Makefile: Use -ffixed-r13. 
+ We use r13 for our own stuff, make sure it's properly fixed +- phb3: Lock the PHB on set_xive callbacks +- arch_flash_arm: Don't assume mtd labels are short +- Stop using 3-operand cmp[l][i] for latest binutils +- hw/phb3: fix error handling in complete reset diff --git a/roms/skiboot/doc/release-notes/skiboot-5.4.4.rst b/roms/skiboot/doc/release-notes/skiboot-5.4.4.rst new file mode 100644 index 000000000..acd374705 --- /dev/null +++ b/roms/skiboot/doc/release-notes/skiboot-5.4.4.rst @@ -0,0 +1,76 @@ +.. _skiboot-5.4.4: + +============= +skiboot-5.4.4 +============= + +skiboot-5.4.4 was released on Wednesday May 3rd, 2017. It replaces +:ref:`skiboot-5.4.3` as the current stable release in the 5.4.x series. + +Over :ref:`skiboot-5.4.3`, we have a small number of bug fixes: + +- hw/fsp: Do not queue SP and SPCN class messages during reset/reload + In certain cases of communicating with the FSP (e.g. sensors), the OPAL FSP + driver returns a default code (async + completion) even though there is no known bound from the time of this error + return to the actual data being available. The kernel driver keeps waiting + leading to soft-lockup on the host side. + + Mitigate both these (known) cases by returning OPAL_BUSY so the host driver + knows to retry later. +- core/pci: Fix PCIe slot's presence + According to PCIe spec, the presence bit is hardcoded to 1 if PCIe + switch downstream port doesn't support slot capability. The register + used for the check in pcie_slot_get_presence_state() is wrong. It + should be PCIe capability register instead of PCIe slot capability + register. Otherwise, we always have present bit on the PCI topology. + + The issue is found on Supermicro's p8dtu2u machine: :: + + # lspci -t + -+-[0022:00]---00.0-[01-08]----00.0-[02-08]--+-01.0-[03]----00.0 + | \-02.0-[04-08]-- + # cat /sys/bus/pci/slots/S002204/adapter + 1 + # lspci -vvs 0022:02:02.0 + # lspci -vvs 0022:02:02.0 + 0022:02:02.0 PCI bridge: PLX Technology, Inc. 
PEX 8718 16-Lane, \ + 5-Port PCI Express Gen 3 (8.0 GT/s) Switch (rev ab) (prog-if 00 [Normal decode]) + : + Capabilities: [68] Express (v2) Downstream Port (Slot+), MSI 00 + : + SltSta: Status: AttnBtn- PowerFlt- MRL- CmdCplt- PresDet- Interlock- + Changed: MRL- PresDet- LinkState- + + This fixes the issue by checking the correct register (PCIe capability). + Also, the register's value is cached in advance as we did for slot and + link capability. +- core/pci: More reliable way to update PCI slot power state + + The power control bit (SLOT_CTL, offset: PCIe cap + 0x18) isn't + reliable enough to reflect the PCI slot's power state. Instead, + the power indication bits are more reliable comparatively. This + leads to mismatch between the cached power state and PCI slot's + presence state, resulting in the hotplug driver in kernel refuses + to unplug the devices properly on the request. The issue was + found on below NVMe card on "supermicro,p8dtu2u" machine. We don't + have this issue on the integrated PLX 8718 switch. :: + + # lspci + 0022:01:00.0 PCI bridge: PLX Technology, Inc. PEX 9733 33-lane, \ + 9-port PCI Express Gen 3 (8.0 GT/s) Switch (rev aa) + 0022:02:01.0 PCI bridge: PLX Technology, Inc. PEX 9733 33-lane, \ + 9-port PCI Express Gen 3 (8.0 GT/s) Switch (rev aa) + 0022:02:04.0 PCI bridge: PLX Technology, Inc. PEX 9733 33-lane, \ + 9-port PCI Express Gen 3 (8.0 GT/s) Switch (rev aa) + 0022:02:05.0 PCI bridge: PLX Technology, Inc. PEX 9733 33-lane, \ + 9-port PCI Express Gen 3 (8.0 GT/s) Switch (rev aa) + 0022:02:06.0 PCI bridge: PLX Technology, Inc. PEX 9733 33-lane, \ + 9-port PCI Express Gen 3 (8.0 GT/s) Switch (rev aa) + 0022:02:07.0 PCI bridge: PLX Technology, Inc. PEX 9733 33-lane, \ + 9-port PCI Express Gen 3 (8.0 GT/s) Switch (rev aa) + 0022:17:00.0 Non-Volatile memory controller: Device 19e5:0123 (rev 45) + + This updates the cached PCI slot's power state using the power + indication bits instead of power control bit, to fix above issue. 
+- core/pci: Avoid hreset after freset diff --git a/roms/skiboot/doc/release-notes/skiboot-5.4.5.rst b/roms/skiboot/doc/release-notes/skiboot-5.4.5.rst new file mode 100644 index 000000000..cbb5e7049 --- /dev/null +++ b/roms/skiboot/doc/release-notes/skiboot-5.4.5.rst @@ -0,0 +1,56 @@ +.. _skiboot-5.4.5: + +============= +skiboot-5.4.5 +============= + +skiboot-5.4.5 was released on Friday June 9th, 2017. It replaces +:ref:`skiboot-5.4.4` as the current stable release in the 5.4.x series. + +Over :ref:`skiboot-5.4.4`, we have a small number of bug fixes: + + +- On FSP platforms: notify FSP of Platform Log ID after Host Initiated Reset Reload + Triggering a Host Initiated Reset (when the host detects the FSP has gone + out to lunch and should be rebooted) would cause "Unknown Command" messages + to appear in the OPAL log. + + This patch implements those messages. + + Log showing unknown command: :: + + / # cat /sys/firmware/opal/msglog | grep -i ,3 + [ 110.232114723,3] FSP: fsp_trigger_reset() entry + [ 188.431793837,3] FSP #0: Link down, starting R&R + [ 464.109239162,3] FSP #0: Got XUP with no pending message ! + [ 466.340598554,3] FSP-DPO: Unknown command 0xce0900 + [ 466.340600126,3] FSP: Unhandled message ce0900 + +- hw/i2c: Fix early lock drop + + When interacting with an I2C master the p8-i2c driver (common to p9) + acquires a per-master lock which it holds for the duration of its + interaction with the master. Unfortunately, when + p8_i2c_check_initial_status() detects that the master is busy with + another transaction it drops the lock and returns OPAL_BUSY. This is + contrary to the driver's locking strategy which requires that the + caller acquire and drop the lock. This leads to a crash due to the + double unlock(), which skiboot treats as fatal. + +- head.S: store all of LR and CTR + + When saving the CTR and LR registers the skiboot exception handlers use the + 'stw' instruction which only saves the lower 32 bits of the register. 
Given + these are both 64-bit registers, this leads to some strange register dumps, + for example: :: + + *********************************************** + Unexpected exception 200 ! + SRR0 : 0000000030016968 SRR1 : 9000000000201000 + HSRR0: 0000000000000180 HSRR1: 9000000000001000 + LR : 3003438830823f50 CTR : 3003438800000018 + CFAR : 00000000300168fc + CR : 40004208 XER: 00000000 + + In this dump the upper 32 bits of LR and CTR are actually stack gunk + which obscures the underlying issue. diff --git a/roms/skiboot/doc/release-notes/skiboot-5.4.6.rst b/roms/skiboot/doc/release-notes/skiboot-5.4.6.rst new file mode 100644 index 000000000..20173a01f --- /dev/null +++ b/roms/skiboot/doc/release-notes/skiboot-5.4.6.rst @@ -0,0 +1,117 @@ +.. _skiboot-5.4.6: + +============= +skiboot-5.4.6 +============= + +skiboot-5.4.6 was released on Wednesday June 14th, 2017. It replaces +:ref:`skiboot-5.4.5` as the current stable release in the 5.4.x series. + +Over :ref:`skiboot-5.4.5`, we have a small number of bug fixes for +FSP based platforms: + +- FSP/CONSOLE: Workaround for unresponsive ipmi daemon + + In some corner cases, where FSP is active but not responding to + console MBOX message (due to buggy IPMI) and we have heavy console + write happening from kernel, then eventually our console buffer + becomes full. At this point OPAL starts sending OPAL_BUSY_EVENT to + kernel. Kernel will keep on retrying. This is creating kernel soft + lockups. In some extreme case when every CPU is trying to write to + console, user will not be able to ssh and may think the system has hung. + + If we reset FSP or restart IPMI daemon on FSP, system recovers and + everything becomes normal. + + This patch adds workaround to above issue by returning OPAL_HARDWARE + when console is full. Side effect of this patch is, we may end up dropping + latest console data. But better to drop console data than to hang the system. + + Alternative approach is to drop old data from console buffer, make space + for new data. 
But in normal condition only FSP can update 'next_out' + pointer and if we touch that pointer, it may introduce some other + race conditions. Hence we decided to just drop the new console write request. + +- FSP: Set status field in response message for timed out message + + For timed out FSP messages, we set message status as "fsp_msg_timeout". + But most FSP driver users (like surveillance) are ignoring this field. + They always look for FSP returned status value in callback function + (second byte in word1). So we end up treating timed out message as success + response from FSP. + + Sample output: :: + + [69902.432509048,7] SURV: Sending the heartbeat command to FSP + [70023.226860117,4] FSP: Response from FSP timed out, word0 = d66a00d7, word1 = 0 state: 3 + .... + [70023.226901445,7] SURV: Received heartbeat acknowledge from FSP + [70023.226903251,3] FSP: fsp_trigger_reset() entry + + Here SURV code thought it got valid response from FSP. But actually we didn't + receive response from FSP. + +- FSP: Improve timeout message + + Presently we print word0 and word1 in error log. word0 contains + sequence number and command class. One has to understand word0 + format to identify command class. + + Let's explicitly print command class, sub command etc. + +- FSP/RTC: Remove local fsp_in_reset variable + + Now that we are using fsp_in_rr() to detect FSP reset/reload, fsp_in_reset + becomes redundant. Let's remove this local variable. + +- FSP/RTC: Fix possible FSP R/R issue in rtc write path + + fsp_opal_rtc_write() checks FSP status before queueing message to FSP. But if + FSP R/R starts before getting response to queued message then we will continue + to return OPAL_BUSY_EVENT to host. In some extreme condition host may + experience hang. Once FSP is back we will repost message, get response from FSP + and return OPAL_SUCCESS to host. + + This patch caches new values and returns OPAL_SUCCESS if FSP R/R is happening. + And once FSP is back we will send cached value to FSP. 
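The "cache during R/R, flush afterwards" idea in the RTC write path entry above can be sketched as a small Python model. This is only an illustration under stated assumptions: the class, method and attribute names are hypothetical and are not skiboot's API, return codes are shown as plain strings, and the normal (non-R/R) path is collapsed into an immediate send even though the real path is asynchronous.

```python
# Toy model of caching an RTC write while the FSP is in Reset/Reload (R/R).
# All names are hypothetical; this is not skiboot code.
OPAL_SUCCESS = "OPAL_SUCCESS"


class ToyFspRtc:
    def __init__(self):
        self.in_rr = False        # stands in for what fsp_in_rr() would report
        self.cached_tod = None    # value waiting to be sent to the FSP
        self.sent = []            # values actually delivered to the FSP

    def rtc_write(self, tod):
        """Accept a new time-of-day value from the host."""
        if self.in_rr:
            # FSP is unreachable: remember the newest value and report
            # success so the host does not keep retrying on a busy code.
            self.cached_tod = tod
            return OPAL_SUCCESS
        # Simplification: deliver immediately instead of queueing a message.
        self.sent.append(tod)
        return OPAL_SUCCESS

    def rr_complete(self):
        """Called when the FSP is back: flush any cached value."""
        self.in_rr = False
        if self.cached_tod is not None:
            self.sent.append(self.cached_tod)
            self.cached_tod = None
```

Only the most recent value written during R/R is kept, which matches the notes' wording of sending the "cached value" once the FSP is back.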
+ +- hw/fsp/rtc: read/write cached rtc tod on fsp hir. + + Currently fsp-rtc reads/writes the cached RTC TOD on an fsp + reset. Use latest fsp_in_rr() function to properly read the cached rtc + value when fsp reset initiated by the hir. + + Below is the kernel trace when we set hw clock, when hir process starts. :: + + [ 1727.775824] NMI watchdog: BUG: soft lockup - CPU#57 stuck for 23s! [hwclock:7688] + [ 1727.775856] Modules linked in: vmx_crypto ibmpowernv ipmi_powernv uio_pdrv_genirq ipmi_devintf powernv_op_panel uio ipmi_msghandler powernv_rng leds_powernv ip_tables x_tables autofs4 ses enclosure scsi_transport_sas crc32c_vpmsum lpfc ipr tg3 scsi_transport_fc + [ 1727.775883] CPU: 57 PID: 7688 Comm: hwclock Not tainted 4.10.0-14-generic #16-Ubuntu + [ 1727.775883] task: c000000fdfdc8400 task.stack: c000000fdfef4000 + [ 1727.775884] NIP: c00000000090540c LR: c0000000000846f4 CTR: 000000003006dd70 + [ 1727.775885] REGS: c000000fdfef79a0 TRAP: 0901 Not tainted (4.10.0-14-generic) + [ 1727.775886] MSR: 9000000000009033 <SF,HV,EE,ME,IR,DR,RI,LE> + [ 1727.775889] CR: 28024442 XER: 20000000 + [ 1727.775890] CFAR: c00000000008472c SOFTE: 1 + GPR00: 0000000030005128 c000000fdfef7c20 c00000000144c900 fffffffffffffff4 + GPR04: 0000000028024442 c00000000090540c 9000000000009033 0000000000000000 + GPR08: 0000000000000000 0000000031fc4000 c000000000084710 9000000000001003 + GPR12: c0000000000846e8 c00000000fba0100 + [ 1727.775897] NIP [c00000000090540c] opal_set_rtc_time+0x4c/0xb0 + [ 1727.775899] LR [c0000000000846f4] opal_return+0xc/0x48 + [ 1727.775899] Call Trace: + [ 1727.775900] [c000000fdfef7c20] [c00000000090540c] opal_set_rtc_time+0x4c/0xb0 (unreliable) + [ 1727.775901] [c000000fdfef7c60] [c000000000900828] rtc_set_time+0xb8/0x1b0 + [ 1727.775903] [c000000fdfef7ca0] [c000000000902364] rtc_dev_ioctl+0x454/0x630 + [ 1727.775904] [c000000fdfef7d40] [c00000000035b1f4] do_vfs_ioctl+0xd4/0x8c0 + [ 1727.775906] [c000000fdfef7de0] [c00000000035bab4] SyS_ioctl+0xd4/0xf0 + [ 
1727.775907] [c000000fdfef7e30] [c00000000000b184] system_call+0x38/0xe0 + [ 1727.775908] Instruction dump: + [ 1727.775909] f821ffc1 39200000 7c832378 91210028 38a10020 39200000 38810028 f9210020 + [ 1727.775911] 4bfffe6d e8810020 80610028 4b77f61d <60000000> 7c7f1b78 3860000a 2fbffff4 + + This is found when executing the `op-test-framework fspresetReload testcase <https://github.com/open-power/op-test-framework/blob/master/testcases/fspresetReload.py>`_ + + With this fix ran fsp hir torture testcase in the above test + which is working fine. + +- FSP/CHIPTOD: Return false in error path diff --git a/roms/skiboot/doc/release-notes/skiboot-5.4.7.rst b/roms/skiboot/doc/release-notes/skiboot-5.4.7.rst new file mode 100644 index 000000000..b4a013e70 --- /dev/null +++ b/roms/skiboot/doc/release-notes/skiboot-5.4.7.rst @@ -0,0 +1,30 @@ +.. _skiboot-5.4.7: + +============= +skiboot-5.4.7 +============= + +skiboot-5.4.7 was released on Tuesday September 19th, 2017. It replaces +:ref:`skiboot-5.4.6` as the current stable release in the 5.4.x series. + +Over :ref:`skiboot-5.4.6`, we have two backported bug fixes for FSP platforms: + +- FSP: Add check to detect FSP Reset/Reload inside fsp_sync_msg() + + During FSP Reset/Reload we move outstanding MBOX messages from msgq to + rr_queue including inflight message (fsp_reset_cmdclass()). But we are not + resetting inflight message state. + + In extreme corner case where we sent message to FSP via fsp_sync_msg() path + and FSP Reset/Reload happens before getting respose from FSP, then we will + endup waiting in fsp_sync_msg() until everything becomes normal. + + This patch adds fsp_in_rr() check to fsp_sync_msg() and return error to + caller if FSP is in R/R. + +- platforms/ibm-fsp/firenze: Fix PCI slot power-off pattern + + When powering off the PCI slot, the corresponding bits should + be set to 0bxx00xx00 instead of 0bxx11xx11. Otherwise, the + specified PCI slot can't be put into power-off state. 
Fortunately, + it didn't introduce any side-effects so far. diff --git a/roms/skiboot/doc/release-notes/skiboot-5.4.8.rst b/roms/skiboot/doc/release-notes/skiboot-5.4.8.rst new file mode 100644 index 000000000..fa66085ca --- /dev/null +++ b/roms/skiboot/doc/release-notes/skiboot-5.4.8.rst @@ -0,0 +1,158 @@ +.. _skiboot-5.4.8: + +============= +skiboot-5.4.8 +============= + +skiboot-5.4.8 was released on Wednesday October 11th, 2017. It replaces +:ref:`skiboot-5.4.7` as the current stable release in the 5.4.x series. + +Over :ref:`skiboot-5.4.7`, we have a few bug fixes for FSP platforms: + +- libflash/file: Handle short read()s and write()s correctly + + Currently we don't move the buffer along for a short read() or write() + and nor do we request only the remaining amount. +- FSP/NVRAM: Handle "get vNVRAM statistics" command + + FSP sends MBOX command (cmd : 0xEB, subcmd : 0x05, mod : 0x00) to get vNVRAM + statistics. OPAL doesn't maintain any such statistics. Hence return + FSP_STATUS_INVALID_SUBCMD. + + Sample OPAL log: :: + + [16944.384670488,3] FSP: Unhandled message eb0500 + [16944.474110465,3] FSP: Unhandled message eb0500 + [16945.111280784,3] FSP: Unhandled message eb0500 + [16945.293393485,3] FSP: Unhandled message eb0500 +- FSP/CONSOLE: Limit number of error logging + + Commit c8a7535f (FSP/CONSOLE: Workaround for unresponsive ipmi daemon, added + in skiboot 5.4.6 and 5.7-rc1) added error logging when buffer is full. In some + corner cases kernel may call this function multiple time and we may endup logging + error again and again. + + This patch fixes it by generating error log only once. + +- FSP/CONSOLE: Fix fsp_console_write_buffer_space() call + + Kernel calls fsp_console_write_buffer_space() to check console buffer space + availability. If there is enough buffer space to write data, then kernel will + call fsp_console_write() to write actual data. 
+ + In some extreme corner cases (like one explained in commit c8a7535f) + console becomes full and this function returns 0 to kernel (or space available + in console buffer < next incoming data size). Kernel will continue retrying + until it gets enough space. So we will start seeing RCU stalls. + + This patch keeps track of previous available space. If previous space is same + as current means not enough space in console buffer to write incoming data. + It may be due to very high console write operation and slow response from FSP + -OR- FSP has stopped processing data (ex: because of ipmi daemon died). At this + point we will start timer with timeout of SER_BUFFER_OUT_TIMEOUT (10 secs). + If situation is not improved within 10 seconds means something went bad. Lets + return OPAL_RESOURCE so that kernel can drop console write and continue. +- FSP/CONSOLE: Close SOL session during R/R + + Presently we are not closing SOL and FW console sessions during R/R. Host will + continue to write to SOL buffer during FSP R/R. If there is heavy console write + operation happening during FSP R/R (like running `top` command inside console), + then at some point console buffer becomes full. fsp_console_write_buffer_space() + returns 0 (or less than required space to write data) to host. While one thread + is busy writing to console, if some other threads tries to write data to console + we may see RCU stalls (like below) in kernel. 
+ + kernel call trace: :: + + [ 2082.828363] INFO: rcu_sched detected stalls on CPUs/tasks: { 32} (detected by 16, t=6002 jiffies, g=23154, c=23153, q=254769) + [ 2082.828365] Task dump for CPU 32: + [ 2082.828368] kworker/32:3 R running task 0 4637 2 0x00000884 + [ 2082.828375] Workqueue: events dump_work_fn + [ 2082.828376] Call Trace: + [ 2082.828382] [c000000f1633fa00] [c00000000013b6b0] console_unlock+0x570/0x600 (unreliable) + [ 2082.828384] [c000000f1633fae0] [c00000000013ba34] vprintk_emit+0x2f4/0x5c0 + [ 2082.828389] [c000000f1633fb60] [c00000000099e644] printk+0x84/0x98 + [ 2082.828391] [c000000f1633fb90] [c0000000000851a8] dump_work_fn+0x238/0x250 + [ 2082.828394] [c000000f1633fc60] [c0000000000ecb98] process_one_work+0x198/0x4b0 + [ 2082.828396] [c000000f1633fcf0] [c0000000000ed3dc] worker_thread+0x18c/0x5a0 + [ 2082.828399] [c000000f1633fd80] [c0000000000f4650] kthread+0x110/0x130 + [ 2082.828403] [c000000f1633fe30] [c000000000009674] ret_from_kernel_thread+0x5c/0x68 + + Hence lets close SOL (and FW console) during FSP R/R. + +- FSP/CONSOLE: Do not associate unavailable console + + Presently OPAL sends associate/unassociate MBOX command for all + FSP serial console (like below OPAL message). We have to check + console is available or not before sending this message. + + OPAL log: :: + + [ 5013.227994012,7] FSP: Reassociating HVSI console 1 + [ 5013.227997540,7] FSP: Reassociating HVSI console 2 +- FSP: Disable PSI link whenever FSP tells OPAL about impending Reset/Reload + + Commit 42d5d047 fixed scenario where DPO has been initiated, but FSP went + into reset before the CEC power down came in. But this is generic issue + that can happen in normal shutdown path as well. + + Hence disable PSI link as soon as we detect FSP impending R/R. + + +- fsp: return OPAL_BUSY_EVENT on failure sending FSP_CMD_POWERDOWN_NORM + Also, return OPAL_BUSY_EVENT on failure sending FSP_CMD_REBOOT / DEEP_REBOOT. 
+ + We had a race condition between FSP Reset/Reload and powering down + the system from the host: + + Roughly: + + == ======================== ========================================================== + # FSP Host + == ======================== ========================================================== + 1 Power on + 2 Power on + 3 (inject EPOW) + 4 (trigger FSP R/R) + 5 Processes EPOW event, starts shutting down + 6 calls OPAL_CEC_POWER_DOWN + 7 (is still in R/R) + 8 gets OPAL_INTERNAL_ERROR, spins in opal_poll_events + 9 (FSP comes back) + 10 spinning in opal_poll_events + 11 (thinks host is running) + == ======================== ========================================================== + + The call to OPAL_CEC_POWER_DOWN is only made once as the reset/reload + error path for fsp_sync_msg() is to return -1, which means we give + the OS OPAL_INTERNAL_ERROR, which is fine, except that our own API + docs give us the opportunity to return OPAL_BUSY when trying again + later may be successful, and we're ambiguous as to if you should retry + on OPAL_INTERNAL_ERROR. + + For reference, the linux code looks like this: :: + + static void __noreturn pnv_power_off(void) + { + long rc = OPAL_BUSY; + + pnv_prepare_going_down(); + + while (rc == OPAL_BUSY || rc == OPAL_BUSY_EVENT) { + rc = opal_cec_power_down(0); + if (rc == OPAL_BUSY_EVENT) + opal_poll_events(NULL); + else + mdelay(10); + } + for (;;) + opal_poll_events(NULL); + } + + Which means that *practically* our only option is to return OPAL_BUSY + or OPAL_BUSY_EVENT. + + We choose OPAL_BUSY_EVENT for FSP systems as we do want to ensure we're + running pollers to communicate with the FSP and do the final bits of + Reset/Reload handling before we power off the system. + diff --git a/roms/skiboot/doc/release-notes/skiboot-5.4.9.rst b/roms/skiboot/doc/release-notes/skiboot-5.4.9.rst new file mode 100644 index 000000000..63a202e90 --- /dev/null +++ b/roms/skiboot/doc/release-notes/skiboot-5.4.9.rst @@ -0,0 +1,16 @@ +.. 
_skiboot-5.4.9: + +============= +skiboot-5.4.9 +============= + +skiboot-5.4.9 was released on Friday January 5th, 2018. It replaces +:ref:`skiboot-5.4.8` as the current stable release in the 5.4.x series. + +Over :ref:`skiboot-5.4.8`, we have one new feature: + +- Parse IPL FW feature settings + + Add parsing for the firmware feature flags in the HDAT. This + indicates the settings of various parameters which are set at IPL time + by firmware. diff --git a/roms/skiboot/doc/release-notes/skiboot-5.5.0-rc1.rst b/roms/skiboot/doc/release-notes/skiboot-5.5.0-rc1.rst new file mode 100644 index 000000000..fd6fd71bb --- /dev/null +++ b/roms/skiboot/doc/release-notes/skiboot-5.5.0-rc1.rst @@ -0,0 +1,861 @@ +.. _skiboot-5.5.0-rc1: + +skiboot-5.5.0-rc1 +================= + +skiboot-5.5.0-rc1 was released on Tuesday March 28th 2017. It is the first +release candidate of skiboot 5.5, which will become the new stable release +of skiboot following the 5.4 release, first released November 11th 2016. + +skiboot-5.5.0-rc1 contains all bug fixes as of :ref:`skiboot-5.4.3` +and :ref:`skiboot-5.1.19` (the currently maintained stable releases). + +For how the skiboot stable releases work, see :ref:`stable-rules` for details. + +The current plan is to cut the final 5.5.0 by April 8th, with skiboot 5.5.0 +being for all POWER8 and POWER9 platforms in op-build v1.16 (Due April 12th). +This is a short cycle as this release is mainly targetted towards POWER9 +bringup efforts. + +Following skiboot-5.5.0, we will move to a regular six week release cycle, +similar to op-build, but slightly offset to allow for a short stabilisation +period. Expected release dates and contents are tracked using GitHub milestone +and issues: https://github.com/open-power/skiboot/milestones + +Over skiboot-5.4, we have the following changes: + +New Platforms +------------- +- SuperMicro's (SMC) P8DNU: An astbmc based POWER8 platform +- Add a generic platform to help with bringup of new systems. 
+- Four POWER9 based systems (NOTE: All POWER9 systems should be considered + for bringup use only at this point): + + - Romulus + - Witherspoon (a POWER9 system with NVLink2 attached GPUs) + - Zaius (OpenCompute platform, also known as "Barreleye 2") + - ZZ (FSP based system) + +New features +------------ + +- System reset IPI facility and Mambo implementation + Add an opal call :ref:`OPAL_SIGNAL_SYSTEM_RESET` which allows system reset + exceptions to be raised on other CPUs and act as an NMI IPI. There + is an initial simple Mambo implementation, but allowances are made + for a more complex hardware implementation. + + The Mambo implementation is based on the RFC implementation for POWER8 + hardware (see https://patchwork.ozlabs.org/patch/694794/) which we hope + makes it into a future release. + + This implements an in-band NMI equivalent. +- add CONTRIBUTING.md, ensuring that people new to the project have a one-stop + place to find out how to get started. +- interrupts: Add optional name for OPAL interrupts + + This adds the infrastructure for an interrupt source to provide + a name for an interrupt directed toward OPAL. Those names will + be put into an "opal-interrupts-names" property which is a + standard DT string list corresponding 1:1 with the "opal-interrupts" + property. PSI interrupts get names, and this is visible in Linux + through /proc/interrupts +- platform: add OPAL_REBOOT_FULL_IPL reboot type + + There may be circumstances in which a user wants to force a full IPL reboot + rather than using fast reboot. Add a new reboot type, OPAL_REBOOT_FULL_IPL, + that disables fast reboot. On platforms which don't support fast reboot, + this will be equivalent to a normal reboot. +- phb3: Trick to allow control of the PCIe link width and speed + + This implements a hook inside OPAL that catches 16 and 32 bit writes + to the link status register of the PHB. 
+ + It allows you to write a new speed or a new width, and OPAL will then + cause the PHB to renegotiate. + + Example: + + First read the link status on PHB4: :: + + setpci -s 0004:00:00.0 0x5a.w + a103 + + It's at x16 Gen3 speed (8GT/s) + + bits 0x0ff0 are the width and 0x000f the speed. The width can be + 1 to 16 and the speed 1 to 3 (2.5, 5 and 8GT/s) + + Then try to bring it down to x1 Gen1: :: + + setpci -s 0004:00:00.0 0x5a.w=0xa011 + + Observe the result in the PHB: :: + + / # lspci -s 0004:00:00.0 -vv + 0004:00:00.0 PCI bridge: IBM Device 03dc (prog-if 00 [Normal decode]) + .../... + LnkSta: Speed 2.5GT/s, Width x1, TrErr- Train- SlotClk- DLActive+ BWMgmt- ABWMgmt+ + + And in the device: :: + + / # lspci -s 0004:01:00.0 -vv + .../... + LnkSta: Speed 2.5GT/s, Width x1, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt- + +- core/init: Add hdat-map property to OPAL node. + + Exports the HDAT heap to the OS. This allows the OS to view the HDAT heap + directly. This allows us to view the HDAT area without having to use + getmemproc. + +- Add a generic platform: If /bmc in device tree, attempt to init one + For the most part, this gets us somewhere on some OpenPOWER systems + before there's a platform file for that machine. + + Useful in bringup only, and marked as such with scary looking log + messages. + + +Core +---- + +- asm: Don't try to set LPCR:LPES1 on P8 and P9, the bit doesn't exist. + +- pci: Add a framework for quirks + + In future we may want to be able to do fixups for specific PCI devices in + skiboot, so add a small framework for doing this. + + This is not intended for the same purposes as quirks in the Linux kernel, + as the PCI devices that quirks can match for in skiboot are not properly + configured. This is intended to enable having a custom path to make + changes that don't directly interact with the PCI device, for example + adding device tree entries. 
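As a worked illustration of the link status encoding described in the phb3 link width and speed entry under New features, the following Python sketch decodes and re-encodes the two values quoted there (a103 and 0xa011). The function names and the speed table are hypothetical helpers, not part of skiboot or setpci; only the masks (0x0ff0 for width, 0x000f for speed) and the value ranges come from the notes.

```python
# Illustrative decoder/encoder for the 16-bit link status value discussed
# in the phb3 entry. Helper names are hypothetical; masks are from the notes.
WIDTH_MASK = 0x0FF0   # link width field
SPEED_MASK = 0x000F   # link speed field

# Speed codes 1..3 as listed in the notes.
SPEED_GTS = {1: "2.5GT/s", 2: "5GT/s", 3: "8GT/s"}


def decode_link_status(value):
    """Return (width, speed_string) for a 16-bit link status value."""
    width = (value & WIDTH_MASK) >> 4
    speed = value & SPEED_MASK
    return width, SPEED_GTS.get(speed, "unknown")


def encode_link_request(current, width, speed):
    """Replace the width and speed fields in `current`, keeping other bits."""
    if not 1 <= width <= 16:
        raise ValueError("width must be 1..16")
    if speed not in SPEED_GTS:
        raise ValueError("speed must be 1..3")
    keep = current & ~(WIDTH_MASK | SPEED_MASK) & 0xFFFF
    return keep | (width << 4) | speed


if __name__ == "__main__":
    print(decode_link_status(0xA103))
    print(hex(encode_link_request(0xA103, 1, 1)))
```

Starting from the value read back in the example and asking for width 1 and speed 1 reproduces the value written with setpci in the notes.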
+ +- hw/slw: fix possible NULL dereference +- slw: Print enabled stop states on boot +- uart: Fix Linux pass-through policy, provide NVRAM override option +- libc/stdio/vsnprintf.c: add explicit fallthrough, this silences a recent + (GCC 7.x) warning +- init: print the FDT blob size in decimal +- init: Print some more info before booting linux + + The kernel command line from nvram and the stdout-path are + useful to know when debugging console related problems. + +- Makefile: Disable stack protector due to gcc problems + + Depending on how it was built, gcc will use the canary from a global + (works for us) or from the TLS (doesn't work for us and accesses + random stuff instead). + + Fixing that would be tricky. There are talks of adding a gcc option + to force use of globals, but in the meantime, disable the stack + protector. +- Stop using 3-operand cmp[l][i] for latest binutils + Since a5721ba270, binutils does not support 3-operand cmp[l][i]. + This adds (previously optional) parameter L. +- buddy: Add a simple generic buddy allocator +- stack: Don't recurse into __stack_chk_fail +- Makefile: Use -ffixed-r13 + We use r13 for our own stuff, make sure it's properly fixed +- Always set ibm,occ-functional-state correctly +- psi: fix the xive registers initialization on P8, which seems to be fine + for real HW but causes a lot of pain under qemu +- slw: Set PSSCR value for idle states +- Limit number of "Poller recursion detected" errors to display + + In some error conditions, we could spiral out of control on this + and spend all of our time printing the exact same backtrace. + + Limit it to 16 times, because 16 is a nice number. +- slw: do SLW timer testing while holding xscom lock + + We add some routines that let a caller get the xscom lock once and + then do a bunch of xscoms while holding it. 
+ In some situations without this, it could take long enough to get + the xscom lock that the 1ms timeout would expire and we'd falsely + think the SLW timer didn't work when in fact it did. +- wait_for_resource_loaded: don't needlessly sleep for 5ms +- run pollers in cpu_process_local_jobs() if running job synchronously +- fsp: Don't recurse pollers in ibm_fsp_terminate +- chiptod: More hardening against -1 chip ID +- interrupts: Rewrite/correct doc for opal_set/get_xive +- cpu: Don't enable nap mode/PM mode on non-P8 +- platform: Call generic platform probe and init UART there +- psi: Don't register more interrupts than the HW supports +- psi: Add DT option to disable LPC interrupts + +I2C and TPM +----------- +- p8i2c: Use calculated poll_interval when booting OPAL + Otherwise we'd default to 2 seconds (TIMER_POLL) during boot on + chips with a functional i2c interrupt, leading to slow i2c + during boot (or hitting timeouts instead). +- i2c: Add i2c_run_req() to crank the state machine for a request +- tpm_i2c_nuvoton: work out the polling time using mftb() +- tpm_i2c_nuvoton: handle errors after reading the tpm fifo +- tpm_i2c_nuvoton: cleanup variables in tpm_read_fifo() +- tpm_i2c_nuvoton: handle errors after writing the tpm fifo +- tpm_i2c_nuvoton: cleanup variables in tpm_write_fifo() +- tpm_i2c_nuvoton: handle errors after writing sts.commandReady in step 5 +- tpm_i2c_nuvoton: handle errors after writing sts.go +- tpm_i2c_nuvoton: handle errors after checking the tpm fifo status +- tpm_i2c_nuvoton: return burst_count in tpm_read_burst_count() +- tpm_i2c_nuvoton: isolate the code that handles the TPM_TIMEOUT_D timeout +- tpm_i2c_nuvoton: handle errors after reading sts.commandReady +- tpm_i2c_nuvoton: add tpm_status_read_byte() +- tpm_i2c_nuvoton: add tpm_check_status() +- tpm_i2c_nuvoton: rename defines to shorter names +- tpm_i2c_interface: decouple rc from being done with i2c request +- tpm_i2c_interface: set timeout before each request +- i2c: Add 
nuvoton quirk, disallowing i2cdetect as it locks TPM + + p8-i2c reset things manually in some error conditions +- stb: create-container and wrap skiboot in Secure/Trusted Boot container + + We produce **UNSIGNED** skiboot.lid.stb and skiboot.lid.xz.stb as build + artifacts. + + These are suitable blobs for flashing onto Trusted Boot enabled op-build + builds *WITH* the secure boot jumpers *ON* (i.e. *NOT* in secure mode). + It's just enough of the Secure and Trusted Boot container format to + make Hostboot behave. + + +PCI +--- +- core/pci: Support SRIOV VFs + + Currently, skiboot can't see SRIOV VFs. It introduces some troubles + as I can see: The device initialization logic (phb->ops->device_init()) + isn't applied to VFs, meaning we have to maintain the same, duplicated + mechanism in kernel for VFs only. It makes the code harder to + maintain and prone to losing synchronization. + + This was motivated by a bug reported by Carol: The VF's Max Payload + Size (MPS) isn't matched with PF's on Mellanox's adapter even when kernel + tried to make them the same. It's caused by the read-only PCIECAP_EXP_DEVCTL + register on VFs. The skiboot would be the best place to emulate these bits + to eliminate the gap as I can see. + + This supports SRIOV VFs. When the PF's SRIOV capability is populated, + the number of maximal VFs (struct pci_device) are instantiated, but + not usable yet. In the meantime, PCI config register filter is + registered against PCIECAP_SRIOV_CTRL_VFE to capture the event of + enabling or disabling VFs. The VFs are initialized, put into the PF's + children list (pd->children), populate its PCI capabilities, and + register PCI config register filter against PCICAP_EXP_DEVCTL. The + filter's handler caches what is written to MPS field and returns + the cached value on read, to eliminate the gap mentioned as above. 
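The config register filter described in the SRIOV entry above (cache what is written to the MPS field, return the cached value on read) can be sketched as a toy Python model. The class and function names are hypothetical and do not reflect skiboot's filter API. The field position (bits 7:5 of the PCIe Device Control register, with size 128 bytes shifted left by the field value) is taken from the PCIe specification as an assumption of this sketch, not from the notes.

```python
# Toy model of a config-space filter that caches the Max Payload Size (MPS)
# field of a read-only Device Control register. Hypothetical names throughout.
MPS_MASK = 0x00E0   # assumed: MPS field is bits 7:5 of Device Control
MPS_SHIFT = 5


class ToyVfDevctlFilter:
    def __init__(self, hw_value):
        self.hw_value = hw_value  # what the read-only register reports
        self.cached_mps = None    # last MPS field written by the OS

    def write(self, value):
        # The hardware would ignore this write; remember only the MPS field.
        self.cached_mps = value & MPS_MASK

    def read(self):
        # Before any write, report the hardware value unchanged.
        if self.cached_mps is None:
            return self.hw_value
        # Afterwards, overlay the cached MPS field on the hardware value.
        return (self.hw_value & ~MPS_MASK) | self.cached_mps


def mps_bytes(devctl):
    """Decode the MPS field to a payload size in bytes (128 << code)."""
    return 128 << ((devctl & MPS_MASK) >> MPS_SHIFT)
```

With this overlay the OS reads back the MPS it wrote, so the PF and VF values appear consistent even though the VF register itself never changes.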
+ +- core/pci: Avoid hreset after freset + + Commit 5ac71c9 ("pci: Avoid hot resets at boot time") failed to + avoid hot reset after fundamental reset for PCIe common slots. + + This fixes it. +- core/pci: Enforce polling PCIe link in hot-add path + + In surprise hot-add path, the power state isn't changed on hardware. + Instead, we set the cached power state (@slot->power_state) and + return OPAL_SUCCESS. The upper layer starts the PCI probing immediately + when receiving OPAL_SUCCESS. However, the PCIe link behind the PCI + slot is likely down. Nothing will be probed from the PCI slot even + if we do have a PCI adapter connected to the slot. + + This fixes the issue by returning OPAL_ASYNC_COMPLETION to force + upper layer to poll the PCIe link before probing the PCI devices + behind the slot in surprise and managed hot-add paths. +- hw/phb3: fix error handling in complete reset + During a complete reset, when we get a timeout waiting for pending + transaction in state PHB3_STATE_CRESET_WAIT_CQ, we mark the PHB as + permanently broken. + + Set the state to PHB3_STATE_FENCED so that the kernel can retry the + complete reset. +- phb3: Lock the PHB on set_xive callbacks + +p8dnu platform +-------------- +- astbmc/p8dnu: Enable PCI slot's power supply on PEX9733 in hot-add path +- astbmc/p8dnu: Enable PCI slot's power supply on PEX8718 in hot-add path +- core/pci: Mark broken PDC on slots without surprise hotplug capability + + We have to support surprise hotplug on PCI slots that don't support + it on hardware. So we're fully utilizing the PCIe link state change + event to detect the events (hot-remove and hot-add). The PDC (Presence + Detection Change) event isn't reliable for the purpose. For example, + PEX8718 on Supermicro's machines. + + This adds another PCI slot property "ibm,slot-broken-pdc" in the + device-tree, to indicate the PDC isn't reliable on those (software + claimed) surprise pluggable slots. 
+- core/pci: Fix PCIe slot's presence + + According to PCIe spec, the presence bit is hardcoded to 1 if PCIe + switch downstream port doesn't support slot capability. The register + used for the check in pcie_slot_get_presence_state() is wrong. It + should be PCIe capability register instead of PCIe slot capability + register. Otherwise, we always have present bit on the PCI topology. + The issue is found on Supermicro's p8dtu2u machine: :: + + # lspci -t + -+-[0022:00]---00.0-[01-08]----00.0-[02-08]--+-01.0-[03]----00.0 + | \-02.0-[04-08]-- + # cat /sys/bus/pci/slots/S002204/adapter + 1 + # lspci -vvs 0022:02:02.0 + # lspci -vvs 0022:02:02.0 + 0022:02:02.0 PCI bridge: PLX Technology, Inc. PEX 8718 16-Lane, \ + 5-Port PCI Express Gen 3 (8.0 GT/s) Switch (rev ab) (prog-if 00 [Normal decode]) + : + Capabilities: [68] Express (v2) Downstream Port (Slot+), MSI 00 + : + SltSta: Status: AttnBtn- PowerFlt- MRL- CmdCplt- PresDet- Interlock- + Changed: MRL- PresDet- LinkState- + + This fixes the issue by checking the correct register (PCIe capability). + Also, the register's value is cached in advance as we did for slot and + link capability. +- core/pci: More reliable way to update PCI slot power state + + The power control bit (SLOT_CTL, offset: PCIe cap + 0x18) isn't + reliable enough to reflect the PCI slot's power state. Instead, + the power indication bits are more reliable comparatively. This + leads to mismatch between the cached power state and PCI slot's + presence state, resulting in the hotplug driver in kernel refuses + to unplug the devices properly on the request. The issue was + found on below NVMe card on "supermicro,p8dtu2u" machine. We don't + have this issue on the integrated PLX 8718 switch. :: + + # lspci + 0022:01:00.0 PCI bridge: PLX Technology, Inc. PEX 9733 33-lane, \ + 9-port PCI Express Gen 3 (8.0 GT/s) Switch (rev aa) + 0022:02:01.0 PCI bridge: PLX Technology, Inc. 
PEX 9733 33-lane, \ + 9-port PCI Express Gen 3 (8.0 GT/s) Switch (rev aa) + 0022:02:04.0 PCI bridge: PLX Technology, Inc. PEX 9733 33-lane, \ + 9-port PCI Express Gen 3 (8.0 GT/s) Switch (rev aa) + 0022:02:05.0 PCI bridge: PLX Technology, Inc. PEX 9733 33-lane, \ + 9-port PCI Express Gen 3 (8.0 GT/s) Switch (rev aa) + 0022:02:06.0 PCI bridge: PLX Technology, Inc. PEX 9733 33-lane, \ + 9-port PCI Express Gen 3 (8.0 GT/s) Switch (rev aa) + 0022:02:07.0 PCI bridge: PLX Technology, Inc. PEX 9733 33-lane, \ + 9-port PCI Express Gen 3 (8.0 GT/s) Switch (rev aa) + 0022:17:00.0 Non-Volatile memory controller: Device 19e5:0123 (rev 45) + + This updates the cached PCI slot's power state using the power + indication bits instead of power control bit, to fix above issue. + +Utilities +--------- + +- opal-prd: Direct systemd to always restart opal-prd + Always restart the opal-prd daemon, irrespective of why it stopped. +- external/ffspart: Simple C program to be able to make an FFS partition +- getscom: Add chip info for P9. +- gard: Fix make dist target +- pflash/libflash: arch_flash_arm: Don't assume mtd labels are short + +libffs +------ +- libffs: Understand how to create FFS partition TOCs and entries. + +BMC Based systems +----------------- +- platforms/astbmc: Support PCI slots for palmetto +- habanero/slottable: Remove Network Mezz(2, 0) from PHB1. +- BMC/PCI: Check slot tables against detected devices + On BMC machines, we have slot tables of built in PHBs, slots and devices + that are physically present in the system (such as the BMC itself). We + can use these tables to check what we *detected* against what *should* + be in the system and throw an error if they differ. + + We have seen this occur a couple of times while still booting, giving the + user just an empty petitboot screen and not much else to go on. 
This + patch helps in that we get a skiboot error message, and at some point + in the future when we pump them up to the OS we could get a big friendly + error message telling you you're having a bad day. +- pci/quirk: Populate device tree for AST2400 VGA + + Adding these properties enables the kernel to function in the same way + that it would if it could no longer access BMC configuration registers + through a backdoor, which may become the default in future. + + The comments describe how isolating the host from the BMC could be + achieved in skiboot, assuming all kernels that the system boots + support this. Isolating the BMC and the host from each other is + important if they are owned by different parties; for example, a cloud + provider renting machines "bare metal". + +- astbmc/pnor: Use mbox-flash for flash accesses + + If the BMC is MBOX protocol aware, request flash reads/writes over the + MBOX regs. This inits the blocklevel for pnor access with mbox-flash. +- ast: Account for differences between 2400 vs 2500 +- platform: set default bmc_platform + The bmc_platform pointer is set to NULL by default and on non-AMI BMC + platforms. As a result a few places in hw/ipmi/ipmi-sel.c will blindly + dereference a NULL pointer. + +POWER9 +------ + +- external: Update xscom utils for type 1 indirect accesses +- xscom: Harden indirect writes +- xscom: Add POWER9 scom reset +- homer : Enable HOMER region reservation for POWER9 +- slw: Define stop idle states for P9 DD1 +- slw: Fix parsing of supported STOP states +- slw: only enable supported STOP states +- dts: add support for p9 cores + +- asm: Add POWER9 case to init_shared_sprs + + For now, setup the HID and HMEER. We'll add more as we get + good default values from HW. +- xive/psi/lpc: Handle proper clearing of LPC SerIRQ latch on POWER9 DD1 +- lpc: Mark the power9 LPC bus as compatible with power8 +- Fix typo in PIR mask for POWER9. Fixes booting multi-chip. 
+- vpd: add vpd_valid() to check keyword VPD blobs + + Adds a function to check whether a blob is a valid IBM ASCII keyword + VPD blob. This allows us to recognise when we do and do not have a VPD + blob and act accordingly. +- core/cpu.c: Use a device-tree node to detect nest mmu presence + The nest mmu address scom was hardcoded which could lead to boot + failure on POWER9 systems without a nest mmu. For example Mambo + doesn't model the nest mmu which results in failure when + calling opal_nmmu_set_ptcr() during kernel load. +- psi: Fix P9 BAR setup on multi-chips + +PHB4: + + - phb4: Fix TVE encoding for start address + - phb4: Always assign powerbus BARs + + HostBoot configure them with weird values that confuse us, instead + let's just own the assignment. This is temporary, I will centralize + memory map management next but this gets us going. + - phb4: Fix endian issue with link control2/status2 registers + Fixes training at larger than PCIe Gen1 speeds. + - phb4: Add ability to log config space access + Useful for debugging + - phb4: Change debug prints + Currently we print "PHB4" and mean either "PHB version 4" or "PHB + number 4" which can be quite confusing. + - phb4: Fix config space enable bits on DD1 + - phb4: Fix location of EEH enable bits + - phb4: Fix setting of max link speed + - phb4: Updated inits as of PHB4 spec 0.52 + +HDAT fixes: + + - hdat: Parse BMC nodes much earlier + + This moves the parsing of the BMC and LPC details to the start of the + HDAT parsing. This allows us to enable the Skiboot log console earlier + so we can get debug output while parsing the rest of the HDAT. + - astbmc: Don't do P8 PSI or DT fixups on P9 + + Previously the HDAT format was only ever used with IBM hardware so it + would store vital product data (VPD) blobs in the IBM ASCII Keyword VPD + format. 
With P9 HDAT is used on OpenPower machines which use Industry + Standard DIMMs that provide their product data through a "Serial Presence + Detect" EEPROM mounted on the DIMM. + + The SPD blob has a different format and is exported in the device-tree + under the "spd" property rather than the "ibm,vpd" property. This patch + adds support for recognising these blobs and placing them in the + appropriate DT property. + - hdat: Add __packed to all HDAT structures and workaround HB reserve + + Some HDAT structures aren't properly aligned. We were using __packed + on some but not others and got at least one wrong (HB reserve). This + adds it everywhere to avoid such problems. + + However this then triggers another problem where HB gives us a + crazy range (0.256M) to reserve with no label, which triggers an + assertion failure later on in mem_regions.c. + + So also add a test to skip any region starting at 0 until we can + understand that better and have it fixed one way or another. + - hdat: Ignore broken memory reserves + + Ignore HDAT memory reserves > 512MB. These are considered bogus and + workaround known HDAT bugs. + - hdat: Add BMC device-tree node for P9 OpenPOWER systems + - hdat: Fix interrupt & device_type of UART node + + The interrupt should use a standard "interrupts" property. The UART + node also needs a device_type="serial" property for historical reasons + otherwise Linux won't pick it up. + - parse and export STOP levels + - add new sppcrd_chip_info fields + - add radix-AP-encodings + - stop using proc_int_line in favor of pir + - rename add_icp() to add_xics_icp() + - Add support for PHB4 + - create XIVE nodes under each xscom node + - Add P9 compatible property + - Parse hostboot memory reservations from HDAT + - Add new fields to IPL params structure and update sys family for p9. 
+ - Fix ibm,pa-features for all CPU types + - Fix XSCOM nodes for P9 + - Remove deprecated 'ibm, mem-interleave-scope' from DT on POWER9 + - Grab system model name from HDAT when available + - Grab vendor information from HDAT when available + - SPIRA-H/S changes for P9 + - Add BMC and LPC IOPATH support + - handle ISDIMM SPD blobs + - make HDIF_child() print more useful errors + - Add PSI HB xscom details + - Add new fields to proc_init_data structure + - Add processor version check for hs service ntuple + - add_iplparams_serial - Validate HDIF_get_iarray_size() return value + + +XIVE: + +The list of XIVE fixes and updates is extensive. Below is only a portion of +the changes that have gone into skiboot 5.5.0-rc1 for the new XIVE hardware +that is present in POWER9: + + - xive: Enable backlog on queues + - xive: Use for_each_present_cpu() for setting up XIVE + - xive: Fix logic in opal_xive_get_xirr() + - xive: Properly initialize new VP and EQ structures + - xive: Improve/fix EOI of LSIs + - xive: Add FIXME comments about mask/umask races + - xive: Fix memory barrier in opal_xive_get_xirr() + - xive: Don't try to find a target EQ for prio 0xff + - xive: Bump table sizes in direct mode + - xive: Properly register escalation interrupts + - xive: Split the OPAL irq flags from the internal ones + - xive: Don't touch ESB masks unless masking/unmasking + - xive: Fix xive_get_ir_targetting() + - xive: Cleanup escalation PQ on queue change + - xive: Add *any chip* for allocating interrupts + - xive: Add chip_id to get_vp_info + - xive: Add opal_xive_get/set_vp_info + - xive: Add VP alloc/free OPAL functions + - xive: Workaround for bad DD1 checker + - xive: Add more checks for exploitation mode + - xive: Add support for EOIs via OPAL + - xive/phb4: Work around broken LSI control on P9 DD1 + - xive: Forward interrupt names callback + - xive: Export opal_xive_reset() arguments in OPAL API + - xive: Add interrupt allocator + - xive: Implement xive_reset + - xive: Don't 
assert if xive_get_vp() fails + - xive: Expose exploitation mode DT properties + - xive: Use a constant for max# of chips + - xive: Keep track of which interrupts were ever enabled + In order to speed up xive reset + - xive: Implement internal VP allocator + - xive: Add xive_get/set_queue_info + - xive: Add helpers to encode and decode VP numbers + - xive: Add API to donate pages in indirect mode + - xive: Add asynchronous cache updates and update irq targetting + - xive: Split xive_provision_cpu() and use cache watch for VP + - xive: Add cache scrub to push watch updates to memory + - xive: Mark XIVE owned EQs with a specific flag + - xive: Use an allocator for EQDs + - xive: Break assumption that block ID == chip ID + - xive/phb4: Handle bad ESB offsets in PHB4 DD1 + - xive: Implement get/set_irq_config APIs + - xive: Rework xive_set_eq_info() to store all info even when masking + - xive: Implement cache watch and use it for EQs + - xive: Add locking to some API calls + - xive: Add opal_xive_get_irq_info() + - xive: Add CPU node "interrupts" properties representing the IPIs + - xive: Add basic opal_xive_reset() call and exploitation mode + - xive: Add support for escalation interrupts + - xive: OPAL API update + - xive: Add some dump facility for debugging + - xive: Document exploitation mode + (Pretty much work in progress) + - xive: Indirect table entries must have top bits "type" set + - xive: Remove unused field and clarify comment + - xive: Provide a way to override some IPI sources + - xive: Add helper to retrieve an IPI trigger port + - xive: Fix IPI EOI logic in opal_xive_eoi() + - xive: Don't try to EOI a masked source + - xive: Fix comments in xive_source_set_xive() + - xive: Fix comments in xive_get_ive() + - xive: Configure forwarding ports + - xive: Fix mangling of interrupt server# in opal_get/set_xive() + - xive: Fix interrupt number mangling + + +Fast-reboot +----------- +- fast-reboot: creset PHBs on fast reboot + On fast reboot, perform a creset 
of all PHBs. This ensures that any PHBs + that are fenced will be working after the reboot. +- fast-reboot: Enable fast reboot with CAPI adapters in CAPI mode + CAPI mode is disabled as part of OPAL_SYNC_HOST_REBOOT. +- opal/fast-reboot: set fw_progress sensor status with IPMI_FW_PCI_INIT. + +CAPI +---- + +- hmi: Print CAPP FIR information when handling CAPP malfunction alerts + +FSP based systems +----------------- + +- hw/fsp: Do not queue SP and SPCN class messages during reset/reload + This could cause soft lockups if FSP reset reload was done while in OPAL. + During FSP R/R, the FSP is inaccessible and will lose state. Messages to the + FSP are generally queued for sending later. + +Tests +----- +- core/test/run-trace: Reduce number of samples when running under valgrind + This reduces 'make check' run time by ~10 seconds on my laptop, + and just the run-trace test itself takes 15 seconds less (under valgrind). +- test/sreset_world: Kind of like Hello World, but from the SRESET vector. + A regression test for the mambo implementation of OPAL_SIGNAL_SYSTEM_RESET. +- nvram-format: Fix endian issues + NVRAM formats are always BE, so let's use the sparse annotation to catch + any issues (and correct said issues). + + On LE platforms, the test was erroneously passing as with building the + nvram-format code on LE we were producing an incorrect NVRAM image. + +- test/hello_world: use P9MAMBO to differentiate from P8 +- hdata_to_dt: Specify PVR on command line +- hdata/test: Add DTS output for the test cases +- hdata/test: strip blobs from the DT output +- mambo: add mprintf() + + mprintf() is printf(), but it goes straight to the mambo console. This + allows it to be independent of Skiboot's actual console infrastructure + so it can be used for debugging the console drivers and for debugging + code that runs before the console is setup. 
+- generate-fwts-olog: add support for parsing prerror() +- Add bitmap test + The worst test suite ever +- mambo_utils: add ascii output to hexdump +- mambo_utils: add p_str <addr> [limit] +- mambo_utils: make p return a value +- hello_world: print out full path of missing MAMBO_BINARY +- print-stb-container: Fix build on centos7 + +- Travis-ci improvements: + - install expect on ubuntu 12.04, disable qemu on 16.04/latest + - build and test more on centos7 + - hello_world: run p9 mambo tests + - install systemsim-p8 on centos7 + - install systemsim-p8 on centos6 + - install systemsim-p9 + - enable fedora25 + - always pull new docker image + - add fedora rawhide + +- Add fwts annotation for duplicate DT node entries. + + Reference bug: https://github.com/open-power/op-build/issues/751 +- external/fwts: Add 'last-tag' to FWTS olog output + This isn't so useful at the moment, but this will make cleaning out + crufty old error definitions much easier. +- external/fwts: Add FWTS olog merge script + A script to merge olog error definitions from multiple skiboot versions + into a single olog JSON file. Will prompt when conflicting patterns are + found to update the pattern, or add both. +- mambo: fake NVRAM support +- mambo: Add Fake NVRAM driver +- external/mambo: add shortcut to print all GPRs + + + +Contributors +------------ + +Processed 363 csets from 28 developers. 
+A total of 18105 lines added, 16499 removed (delta 1606) + +Developers with the most changesets +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +========================== === ======= +Developer # % +========================== === ======= +Benjamin Herrenschmidt 138 (38.0%) +Stewart Smith 56 (15.4%) +Oliver O'Halloran 47 (12.9%) +Michael Neuling 18 (5.0%) +Gavin Shan 15 (4.1%) +Claudio Carvalho 14 (3.9%) +Vasant Hegde 11 (3.0%) +Cyril Bur 11 (3.0%) +Andrew Donnellan 11 (3.0%) +Ananth N Mavinakayanahalli 5 (1.4%) +Cédric Le Goater 5 (1.4%) +Pridhiviraj Paidipeddi 5 (1.4%) +Shilpasri G Bhat 4 (1.1%) +Nicholas Piggin 4 (1.1%) +Russell Currey 3 (0.8%) +Alistair Popple 2 (0.6%) +Jack Miller 2 (0.6%) +Chris Smart 2 (0.6%) +Matt Brown 1 (0.3%) +Michael Ellerman 1 (0.3%) +Frederic Barrat 1 (0.3%) +Hank Chang 1 (0.3%) +Willie Liauw 1 (0.3%) +Werner Fischer 1 (0.3%) +Jeremy Kerr 1 (0.3%) +Patrick Williams 1 (0.3%) +Joel Stanley 1 (0.3%) +Alexey Kardashevskiy 1 (0.3%) +========================== === ======= + +Developers with the most changed lines +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +=========================== ===== ======= +Developer # % +=========================== ===== ======= +Oliver O'Halloran 17961 (56.7%) +Benjamin Herrenschmidt 5509 (17.4%) +Cyril Bur 2801 (8.8%) +Stewart Smith 1649 (5.2%) +Gavin Shan 653 (2.1%) +Claudio Carvalho 489 (1.5%) +Willie Liauw 361 (1.1%) +Ananth N Mavinakayanahalli 340 (1.1%) +Andrew Donnellan 315 (1.0%) +Michael Neuling 240 (0.8%) +Shilpasri G Bhat 228 (0.7%) +Nicholas Piggin 219 (0.7%) +Vasant Hegde 207 (0.7%) +Russell Currey 158 (0.5%) +Jack Miller 127 (0.4%) +Cédric Le Goater 126 (0.4%) +Chris Smart 95 (0.3%) +Hank Chang 56 (0.2%) +Pridhiviraj Paidipeddi 47 (0.1%) +Alistair Popple 39 (0.1%) +Matt Brown 29 (0.1%) +Michael Ellerman 3 (0.0%) +Alexey Kardashevskiy 2 (0.0%) +Frederic Barrat 1 (0.0%) +Werner Fischer 1 (0.0%) +Jeremy Kerr 1 (0.0%) +Patrick Williams 1 (0.0%) +Joel Stanley 1 (0.0%) +=========================== ===== ======= + 
+Developers with the most lines removed +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +=========================== ===== ======= +Developer # % +=========================== ===== ======= +Oliver O'Halloran 8810 (53.4%) +Ananth N Mavinakayanahalli 98 (0.6%) +Alistair Popple 9 (0.1%) +Michael Ellerman 3 (0.0%) +Werner Fischer 1 (0.0%) +=========================== ===== ======= + +Developers with the most signoffs +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +Total 322 + +======================== ===== ======= +Developer # % +======================== ===== ======= +Stewart Smith 307 (95.3%) +Michael Neuling 6 (1.9%) +Oliver O'Halloran 3 (0.9%) +Benjamin Herrenschmidt 2 (0.6%) +Vaidyanathan Srinivasan 1 (0.3%) +Hank Chang 1 (0.3%) +Jack Miller 1 (0.3%) +Gavin Shan 1 (0.3%) +======================== ===== ======= + +Developers with the most reviews +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +Total: 45 + +======================== ===== ======= +Developer # % +======================== ===== ======= +Vasant Hegde 10 (22.2%) +Andrew Donnellan 9 (20.0%) +Russell Currey 6 (13.3%) +Cédric Le Goater 5 (11.1%) +Oliver O'Halloran 4 (8.9%) +Gavin Shan 3 (6.7%) +Vaidyanathan Srinivasan 2 (4.4%) +Alistair Popple 2 (4.4%) +Frederic Barrat 2 (4.4%) +Mahesh Salgaonkar 1 (2.2%) +Cyril Bur 1 (2.2%) +======================== ===== ======= + +Developers with the most test credits +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +Total 11 + +======================== ===== ======= +Developer # % +======================== ===== ======= +Willie Liauw 4 (36.4%) +Claudio Carvalho 3 (27.3%) +Gavin Shan 1 (9.1%) +Michael Neuling 1 (9.1%) +Pridhiviraj Paidipeddi 1 (9.1%) +Chris Smart 1 (9.1%) +======================== ===== ======= + +Developers who gave the most tested-by credits +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +Total 11 + +========================== ===== ======= +Developer # % +========================== ===== ======= +Gavin Shan 4 (36.4%) +Stewart Smith 4 (36.4%) +Chris Smart 1 (9.1%) +Oliver O'Halloran 
1 (9.1%) +Ananth N Mavinakayanahalli 1 (9.1%) +========================== ===== ======= + +Developers with the most report credits +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +Total 7 + +========================== === ======= +Developer # % +========================== === ======= +Hank Chang 4 (57.1%) +Guilherme G. Piccoli 1 (14.3%) +Colin Ian King 1 (14.3%) +Pradipta Ghosh 1 (14.3%) +========================== === ======= + + +Developers who gave the most report credits +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +Total 7 + +========================== === ======= +Developer # % +========================== === ======= +Gavin Shan 5 (71.4%) +Andrew Donnellan 1 (14.3%) +Jeremy Kerr 1 (14.3%) +========================== === ======= diff --git a/roms/skiboot/doc/release-notes/skiboot-5.5.0-rc2.rst b/roms/skiboot/doc/release-notes/skiboot-5.5.0-rc2.rst new file mode 100644 index 000000000..30cb79b2b --- /dev/null +++ b/roms/skiboot/doc/release-notes/skiboot-5.5.0-rc2.rst @@ -0,0 +1,172 @@ +.. _skiboot-5.5.0-rc2: + +skiboot-5.5.0-rc2 +================= + +skiboot-5.5.0-rc2 was released on Monday April 3rd 2017. It is the second +release candidate of skiboot 5.5, which will become the new stable release +of skiboot following the 5.4 release, first released November 11th 2016. + +skiboot-5.5.0-rc2 contains all bug fixes as of :ref:`skiboot-5.4.3` +and :ref:`skiboot-5.1.19` (the currently maintained stable releases). + +For how the skiboot stable releases work, see :ref:`stable-rules` for details. + +The current plan is to cut the final 5.5.0 by April 8th, with skiboot 5.5.0 +being for all POWER8 and POWER9 platforms in op-build v1.16 (Due April 12th). +This is a short cycle as this release is mainly targeted towards POWER9 +bringup efforts. + +Following skiboot-5.5.0, we will move to a regular six week release cycle, +similar to op-build, but slightly offset to allow for a short stabilisation +period. 
Expected release dates and contents are tracked using GitHub milestone +and issues: https://github.com/open-power/skiboot/milestones + +Over :ref:`skiboot-5.5.0-rc1`, we have the following changes: + +NVLINK2 +------- + +- Introduce NPU2 support + + NVLink2 is a new feature introduced on POWER9 systems. It is an + evolution of the NVLink1 feature included in POWER8+ systems but + adds several new features including support for GPU address + translation using the Nest MMU and cache coherence. + + Similar to NVLink1 the functionality is exposed to the OS as a series + of virtual PCIe devices. However the actual hardware interfaces are + significantly different which limits the amount of common code that + can be shared between implementations in the firmware. + + This patch adds basic hardware initialisation and exposure of the + virtual NVLink2 PCIe devices to the running OS. + +- npu2: Add OPAL calls for nvlink2 address translation services (see :ref:`OPAL_NPU2`) + + Adds three OPAL calls for interacting with NPU2 devices: + :ref:`OPAL_NPU_INIT_CONTEXT`, :ref:`OPAL_NPU_DESTROY_CONTEXT` and + :ref:`OPAL_NPU_MAP_LPAR`. + + These are used to setup and configure address translation services + (ATS) for a process/partition on a given NVLink2 device. + + +POWER9 +------ +- hdata/memory: ignore homer and occ reserved ranges + + We populate these from the HOMER BARs in the PBA directly. There's no + need to take the hostboot supplied values so just ignore the + corresponding reserved ranges. + +- hdata/vpd: Parse the OpenPOWER OPFR record + + Parse the OpenPOWER FRU VPD (OPFR) record on OpenPOWER instead + of the VINI records. + +- hdata/vpd: Parse additional VINI records + + These records provide hardware version details, CCIN extension information, + card type details and hardware characteristics of the FRU + +- hdata/cpu: account for p9 shared caches + + On P9 the L2 and L3 caches are shared between pairs of SMT=4 cores. 
+ Currently this is not accounted for when creating cache nodes in + the device tree. This patch adds additional checking so that a + cache node is only created for the first core in the pair and + the second core will reference the cache correctly. + +- hdata: print backtraces on HDAT errors +- hdat: ignore zero length reserves + + Hostboot can export reserved regions with a length of zero and these + should be ignored rather than being turned into reserved range. While + we're here fix a memory leak by moving the "too large" region check + to before we allocate space for the label. + +- SLW: Add init for power9 power management + + This patch adds new function to init core for power9 power management. + SPECIAL_WKUP_* SCOM registers, if set, can hold the cores from going into + idle states. Hence, clear PPM_SPECIAL_WKUP_HYP_REG scom register for each + core during init. (This init is not required for MAMBO) + + +PCI +--- + +- hw/phb3: Adjust ECRC on root port dynamically + + The Samsung NVMe adapter is lost when it's connected to PMC 8546 PCIe + switch, until ECRC is disabled on the root port. We found similar issue + previously when Broadcom adapter is connected to same part of PCIe switch + and it was fixed by commit 60ce59ccd0e9 ("hw/phb3: Disable ECRC on Broadcom + adapter behind PMC switch"). Unfortunately, the commit doesn't fix + the Samsung NVMe adapter lost issue. + + This fixes the issues by disabling ECRC generation/check on root port + when PMC 8546 PCIe switch ports are found. This can be extended for + other PCIe switches or endpoints in future: Each PHB maintains the + count of PCI devices (PMC 8546 PCIe switch ports currently) which + require to disable ECRC on root port. The ECRC functionality is + enabled when first PMC 8546 switch port is probed and disabled when + last PMC 8546 switch port is destroyed (in PCI hot remove scenario). + Except PHB's reinitialization after complete reset, the ECRC on + root port is untouched. 
+ +- core/pci: Fix lost NVMe adapter behind PMC 8546 switch + + The NVMe adapter in below PCI topology is lost. The root cause is + the presence bit on its PCI slot is missed, but the PCIe link has + been up. The PCI core doesn't probe the adapter behind the slot, + leading to lost NVMe adapter in the particular case. + + - PHB3 root port + - PLX switch 8748 (10b5:8748) + - PLX switch 9733 (10b5:9733) + - PMC 8546 switch (11f8:8546) + - NVMe adapter (1c58:0023) + + This fixes the issue by overriding the PCI slot presence bit with + PCIe link state bit. +- hw/phb4: Locate AER capability position if necessary +- core/pci: Disable surprise hotplug on root port +- core/pci: Ignore PCI slot capability on root port + + We are creating PCI slot on root port, where the PCI slot isn't + supported from hardware. For this case, we shouldn't read the PCI + slot capability from hardware. When bogus data returned from the + hardware, we will attempt to the PCI slot's power state or enable + surprise hotplug functionality. All of them can't be accomplished + without hardware support. + + This leaves the PCI slot's capability list 0 if PCICAP_EXP_CAP_SLOT + isn't set in hardware (pcie_cap + 0x2). Otherwise, the PCI slot's + capability list is retrieved from hardware (pcie_cap + 0x14). + + +- phb4: Default to PCIe GEN2 on DD1 + + Default to PCIe GEN2 link speeds on DD1 for stability. + + Can be overridden using nvram pcie-max-link-speed=4 parameter. + +- phb3/4: Set max link speed via nvram + + This adds an nvram parameter pcie-max-link-speed to configure the max + speed of the pcie link. This can be set from the petitboot prompt + using: :: + + nvram -p ibm,skiboot --update-config pcie-max-link-speed=4 + + This takes preference over anything set in the device tree and is + global to all PHBs. 
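The precedence described for ``pcie-max-link-speed`` can be illustrated with a small sketch. It is written in Python rather than skiboot's C and is not skiboot code: the function name, the accepted range of 1 to 4, the handling of unparsable values, and the ``hw_default`` fallback are all assumptions made for illustration. Only the rule that the nvram setting wins over the device tree and applies to every PHB comes from the notes above.

```python
# Hypothetical helper, not skiboot code: picks the maximum PCIe link
# generation for a PHB using the precedence described in the notes above.

def max_link_speed(nvram_value, dt_value, hw_default):
    """Return the PCIe generation cap for a PHB.

    nvram_value: string from the pcie-max-link-speed nvram option, or None
    dt_value:    integer taken from the device tree, or None
    hw_default:  fallback generation when nothing else is set (assumed)
    """
    if nvram_value is not None:
        try:
            speed = int(nvram_value)
        except ValueError:
            speed = None              # assumption: ignore unparsable values
        if speed is not None and 1 <= speed <= 4:   # assumed valid range
            return speed              # nvram wins over the device tree
    if dt_value is not None:
        return dt_value
    return hw_default


# The same nvram value is consulted for every PHB, so the setting is global.
print(max_link_speed("4", 2, 2))
print(max_link_speed(None, 3, 2))
```

With a DD1 default of GEN2 passed as ``hw_default``, setting ``pcie-max-link-speed=4`` in nvram would lift the cap in this model, which mirrors the override described for the DD1 default.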
+ +Tests +----- + +- Mambo/Qemu boot tests: expect (and fail) on checkstop + + This allows us to fail a lot faster if we checkstop diff --git a/roms/skiboot/doc/release-notes/skiboot-5.5.0-rc3.rst b/roms/skiboot/doc/release-notes/skiboot-5.5.0-rc3.rst new file mode 100644 index 000000000..dbb588ceb --- /dev/null +++ b/roms/skiboot/doc/release-notes/skiboot-5.5.0-rc3.rst @@ -0,0 +1,51 @@ +.. _skiboot-5.5.0-rc3: + +skiboot-5.5.0-rc3 +================= + +skiboot-5.5.0-rc3 was released on Wednesday April 5th 2017. It is the third +release candidate of skiboot 5.5, which will become the new stable release +of skiboot following the 5.4 release, first released November 11th 2016. + +skiboot-5.5.0-rc3 contains all bug fixes as of :ref:`skiboot-5.4.3` +and :ref:`skiboot-5.1.19` (the currently maintained stable releases). + +For how the skiboot stable releases work, see :ref:`stable-rules` for details. + +The current plan is to cut the final 5.5.0 by April 8th, with skiboot 5.5.0 +being for all POWER8 and POWER9 platforms in op-build v1.16 (Due April 12th). +This is a short cycle as this release is mainly targeted towards POWER9 +bringup efforts. + +Following skiboot-5.5.0, we will move to a regular six week release cycle, +similar to op-build, but slightly offset to allow for a short stabilisation +period. Expected release dates and contents are tracked using GitHub milestone +and issues: https://github.com/open-power/skiboot/milestones + +Over :ref:`skiboot-5.5.0-rc2`, we have the following changes: + +- xive: Fix setting of remote NVT VSD + + This fixes a checkstop when using my XIVE exploitation mode on some multi-chip machines. + +- core/init: Use '_' as separator in names of "exports" properties + + The names of the properties under /ibm,opal/firmware/exports are used + directly by Linux to create files in sysfs. To remain consistent with + the existing naming of OPAL sysfs files, use '_' as the separator. 
+ + In particular for the symbol map which is already exported separately, + it's cleaner for the two files to have the same name, eg: :: + + /sys/firmware/opal/exports/symbol_map + /sys/firmware/opal/symbol_map + +- hdata: fix reservation size + + The hostboot reserved ranges are [start, end] pairs rather than + [start, end) so we need to stick a +1 in there to calculate the + size properly. + +- hdat: Add model-name property for OpenPower system +- hdat: Read description from ibm, vpd binary blob +- hdat: Populate model property with 'Unknown' in error path diff --git a/roms/skiboot/doc/release-notes/skiboot-5.5.0.rst b/roms/skiboot/doc/release-notes/skiboot-5.5.0.rst new file mode 100644 index 000000000..396ef9da8 --- /dev/null +++ b/roms/skiboot/doc/release-notes/skiboot-5.5.0.rst @@ -0,0 +1,326 @@ +.. _skiboot-5.5.0: + +skiboot-5.5.0 +============= + +skiboot-5.5.0 was released on Friday April 7th 2017. It is the new stable +release of skiboot, taking over from the 5.4 release, first released on +November 11th 2016. + +skiboot-5.5.0 contains all bug fixes as of :ref:`skiboot-5.4.3` +and :ref:`skiboot-5.1.19` (the currently maintained stable releases). + +For how the skiboot stable releases work, see :ref:`stable-rules` for details. + +This release is a good level set of POWER9 support for bringup activities. +If you are doing bringup, it is strongly suggested you continue to follow +skiboot master. + +After skiboot 5.5.0, we move to a regular six week release cycle, +similar to op-build, but slightly offset to allow for a short stabilisation +period. 
Expected release dates and contents are tracked using GitHub milestone +and issues: https://github.com/open-power/skiboot/milestones + +Changes in skiboot-5.5.0 +------------------------ + +See changes in the release candidates: + +- :ref:`skiboot-5.5.0-rc1` +- :ref:`skiboot-5.5.0-rc2` +- :ref:`skiboot-5.5.0-rc3` + +Changes since skiboot-5.5.0-rc3 +------------------------------- + +- hdat: parse processor attached i2c devices + + Adds basic parsing for i2c devices that are attached to the processor + I2C interfaces. This is mainly VPD SEEPROMs. +- libflash/blocklevel: Add blocklevel_smart_erase() + + With recent changes to flash drivers in linux not all erase blocks are + 4K anymore. While most levels of the pflash/gard tool stacks were written + to not mind, it turns out there are bugs which means not 4K erase block + backing stores aren't handled all that well. Part of the problem is the + FFS layout that is 4K aligned and with larger block sizes pflash and the + gard tool don't check if their erase commands are erase block aligned - + which they are usually not with 64K erase blocks. + + This patch aims to add common functionality to blocklevel so that (at + least) pflash and the gard tool don't need to worry about the problem + anymore. +- external/pflash: Use blocklevel_smart_erase() +- external/gard: Use blocklevel_smart_erase() +- libstb/create-container: Add full container build and sign with imprint keys + + This adds support for writing all the public key and signature fields to the + container header, and for dumping the prefix and software headers so they + may be signed, and for signing those headers with the imprint keys. +- asm: do not set SDR1 on POWER9. This register does not exist in ISAv3. + +Testing: + +- mambo: Allow setting the Linux command line from the environment + + For automated testing it's helpful to be able to set the Linux command + line via an environment variable. 
+- mambo: Add util function for breaking on console output + + +Contributors +------------ + +Processed 408 csets from 31 developers + +3 employers found + +A total of 24073 lines added, 16759 removed (delta 7314) + +Extending the analysis done for the last few releases, we can see our trends +in code review across versions: + +======== ====== ======= ======= ====== ======== +Release csets Ack Reviews Tested Reported +======== ====== ======= ======= ====== ======== +5.0 329 15 20 1 0 +5.1 372 13 38 1 4 +5.2-rc1 334 20 34 6 11 +5.3-rc1 302 36 53 4 5 +5.4.0 361 16 28 1 9 +5.5.0 408 11 48 14 10 +======== ====== ======= ======= ====== ======== + +I am absolutely *thrilled* as to the uptick of reviews and tested-by occuring +over our 5.4.0 release. Although we are not yet back up to 5.3 era levels for +review, we're much closer. For tested-by, we've set a new record, which is +excellent! + + +Developers with the most changesets +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ +========================== === ======= +Developer # % +========================== === ======= +Benjamin Herrenschmidt 139 (34.1%) +Stewart Smith 60 (14.7%) +Oliver O'Halloran 54 (13.2%) +Gavin Shan 23 (5.6%) +Michael Neuling 20 (4.9%) +Vasant Hegde 15 (3.7%) +Cyril Bur 15 (3.7%) +Claudio Carvalho 14 (3.4%) +Andrew Donnellan 11 (2.7%) +Ananth N Mavinakayanahalli 9 (2.2%) +Alistair Popple 6 (1.5%) +Nicholas Piggin 5 (1.2%) +Cédric Le Goater 5 (1.2%) +Pridhiviraj Paidipeddi 5 (1.2%) +Michael Ellerman 4 (1.0%) +Shilpasri G Bhat 4 (1.0%) +Russell Currey 3 (0.7%) +Jack Miller 2 (0.5%) +Chris Smart 2 (0.5%) +Dave Heller 1 (0.2%) +Akshay Adiga 1 (0.2%) +Reza Arbab 1 (0.2%) +Matt Brown 1 (0.2%) +Frederic Barrat 1 (0.2%) +Hank Chang 1 (0.2%) +Willie Liauw 1 (0.2%) +Werner Fischer 1 (0.2%) +Jeremy Kerr 1 (0.2%) +Patrick Williams 1 (0.2%) +Joel Stanley 1 (0.2%) +Alexey Kardashevskiy 1 (0.2%) +========================== === ======= + +Developers with the most changed lines +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + 
+========================== ===== ======= +Developer # % +========================== ===== ======= +Oliver O'Halloran 18278 (48.5%) +Benjamin Herrenschmidt 5512 (14.6%) +Cyril Bur 3184 (8.4%) +Alistair Popple 3102 (8.2%) +Stewart Smith 2757 (7.3%) +Gavin Shan 802 (2.1%) +Ananth N Mavinakayanahalli 544 (1.4%) +Claudio Carvalho 489 (1.3%) +Dave Heller 425 (1.1%) +Willie Liauw 361 (1.0%) +Andrew Donnellan 315 (0.8%) +Michael Neuling 290 (0.8%) +Vasant Hegde 253 (0.7%) +Shilpasri G Bhat 228 (0.6%) +Nicholas Piggin 222 (0.6%) +Reza Arbab 198 (0.5%) +Russell Currey 158 (0.4%) +Jack Miller 127 (0.3%) +Cédric Le Goater 126 (0.3%) +Chris Smart 95 (0.3%) +Akshay Adiga 57 (0.2%) +Hank Chang 56 (0.1%) +Pridhiviraj Paidipeddi 47 (0.1%) +Michael Ellerman 29 (0.1%) +Matt Brown 29 (0.1%) +Alexey Kardashevskiy 2 (0.0%) +Frederic Barrat 1 (0.0%) +Werner Fischer 1 (0.0%) +Jeremy Kerr 1 (0.0%) +Patrick Williams 1 (0.0%) +Joel Stanley 1 (0.0%) +========================== ===== ======= + +Developers with the most lines removed +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ +========================== ===== ======= +Developer # % +========================== ===== ======= +Oliver O'Halloran 8516 (50.8%) +Werner Fischer 1 (0.0%) +========================== ===== ======= + +Developers with the most signoffs +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ +Total: 364 + +======================== ===== ======= +Developer # % +======================== ===== ======= +Stewart Smith 348 (95.6%) +Michael Neuling 6 (1.6%) +Oliver O'Halloran 3 (0.8%) +Benjamin Herrenschmidt 2 (0.5%) +Vaidyanathan Srinivasan 1 (0.3%) +Hank Chang 1 (0.3%) +Jack Miller 1 (0.3%) +Gavin Shan 1 (0.3%) +Alistair Popple 1 (0.3%) +======================== ===== ======= + + +Developers with the most reviews +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ +Total 50 + +======================== ===== ======= +Developer # % +======================== ===== ======= +Vasant Hegde 14 (28.0%) +Andrew Donnellan 9 (18.0%) +Russell Currey 6 (12.0%) +Cédric Le Goater 5 
(10.0%) +Oliver O'Halloran 4 (8.0%) +Vaidyanathan Srinivasan 3 (6.0%) +Gavin Shan 3 (6.0%) +Alistair Popple 2 (4.0%) +Frederic Barrat 2 (4.0%) +Mahesh Salgaonkar 1 (2.0%) +Cyril Bur 1 (2.0%) +======================== ===== ======= + +Developers with the most test credits +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ +Total 14 + +======================== ===== ======= +Developer # % +======================== ===== ======= +Willie Liauw 4 (28.6%) +Mark E Schreiter 3 (21.4%) +Claudio Carvalho 3 (21.4%) +Gavin Shan 1 (7.1%) +Michael Neuling 1 (7.1%) +Pridhiviraj Paidipeddi 1 (7.1%) +Chris Smart 1 (7.1%) +======================== ===== ======= + +Developers who gave the most tested-by credits +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ +Total 14 + +========================== === ======= +Developer # % +========================== === ======= +Gavin Shan 7 (50.0%) +Stewart Smith 4 (28.6%) +Chris Smart 1 (7.1%) +Oliver O'Halloran 1 (7.1%) +Ananth N Mavinakayanahalli 1 (7.1%) +========================== === ======= + + +Developers with the most report credits +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ +Total 10 + +============================ = ======= +Developer # % +============================ = ======= +Hank Chang 4 (40.0%) +Mark E Schreiter 3 (30.0%) +Guilherme G. 
Piccoli 1 (10.0%) +Colin Ian King 1 (10.0%) +Pradipta Ghosh 1 (10.0%) +============================ = ======= + + +Developers who gave the most report credits +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ +Total 10 + +============================ = ======= +Developer # % +============================ = ======= +Gavin Shan 8 (80.0%) +Andrew Donnellan 1 (10.0%) +Jeremy Kerr 1 (10.0%) +============================ = ======= + +Top changeset contributors by employer +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +========================== === ======= +Employer # % +========================== === ======= +IBM 406 (99.5%) +SuperMicro 1 (0.2%) +Thomas-Krenn AG 1 (0.2%) +========================== === ======= + +Top lines changed by employer +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +========================= ===== ======= +Employer # % +========================= ===== ======= +IBM 37329 (99.0%) +SuperMicro 361 (1.0%) +Thomas-Krenn AG 1 (0.0%) +========================= ===== ======= + +Employers with the most signoffs +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ +Total 364 + +========================= ==== ======= +Employer # % +========================= ==== ======= +IBM 363 (99.7%) +(Unknown) 1 (0.3%) +========================= ==== ======= + +Employers with the most hackers +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ +Total 31 + +========================= ==== ======= +Employer # % +========================= ==== ======= +IBM 29 (93.5%) +Thomas-Krenn AG 1 (3.2%) +SuperMicro 1 (3.2%) +========================= ==== ======= diff --git a/roms/skiboot/doc/release-notes/skiboot-5.6.0-rc1.rst b/roms/skiboot/doc/release-notes/skiboot-5.6.0-rc1.rst new file mode 100644 index 000000000..5f9bf30d2 --- /dev/null +++ b/roms/skiboot/doc/release-notes/skiboot-5.6.0-rc1.rst @@ -0,0 +1,502 @@ +.. _skiboot-5.6.0-rc1: + +skiboot-5.6.0-rc1 +================= + +skiboot-5.6.0-rc1 was released on Tuesday May 16th 2017. 
It is the first +release candidate of skiboot 5.6, which will become the new stable release +of skiboot following the 5.5 release, first released April 7th 2017. + +skiboot-5.6.0-rc1 contains all bug fixes as of :ref:`skiboot-5.4.4` +and :ref:`skiboot-5.1.19` (the currently maintained stable releases). We +do not currently expect to do any 5.5.x stable releases. + +For how the skiboot stable releases work, see :ref:`stable-rules` for details. + +The current plan is to cut the final 5.6.0 by May 22nd, with skiboot 5.6.0 +being for all POWER8 and POWER9 platforms in op-build v1.17 (Due May 24th). +This is a short cycle as this release is mainly targeted towards POWER9 +bringup efforts. + +This is the first release using the new regular six week release cycle, +similar to op-build, but slightly offset to allow for a short stabilisation +period. Expected release dates and contents are tracked using GitHub milestones +and issues: https://github.com/open-power/skiboot/milestones + +Over skiboot-5.5, we have the following changes: + +New Platforms +------------- + +Thanks to SuperMicro for submitting support for the p9dsu platform, AKA Boston. + +POWER9 +------ + +XIVE: + + - xive: Clear emulation mode queue on reset + - xive: Fixes/improvements to xive reset for multi-chip systems + - xive: Synchronize after disable IRQs in opal_xive_reset() + - xive: Workaround a problem with indirect TM access + - hdata: Make FSPv1 work again + One less thing to work around for those crazy enough to try. + - xive: Log more information in opal_xive_dump() for emulation state + + Add a counter of total interrupts taken by a CPU, dump the + queue buffer both before and after the current pointer, + and also display the HW state of the queue descriptor and + the PQ state of the IPI. + - xive: Add a per-cpu logging mechanism to XICS emulation + + This is a small 32-entry rolling buffer that logs a few + operations. It's useful to debug odd problems. 
The output + is printed when opal_xive_dump() is called. + - xive: Check queues for duplicates in DEBUG builds. + + There should never be duplicate interrupts in a queue. + This adds code to check that when looking at the queue + content. Since it can be a performance loss, this is only + done for debug builds. + - xive+phb4: Fix exposing trigger page to Linux + +HDAT Parsing: + + - hdata/spira.c: Add device-tree bindings for nest mmu + - hdata/i2c: Workaround broken i2c devices + - hdata: indicate when booted with elevated risk level + + When the system is IPLed with an elevated risk level Hostboot will + set a flag in the IPL parameters structure. Parse and export this + in the device tree at: /ipl-params/sys-params/elevated-risk-level + - hdata: Respect OCC and HOMER reservations + + In the past we've ignored these since Hostboot insisted on exporting + broken reservations and the OCC was not being used yet. This situation + seems to have resolved itself so we should respect the reservations that + hostboot provides. + +I2C: + +- i2c: Add interrupts support on P9 + + Some older revisions of hostboot populate the host i2c device fields + with all-zero entries. Detect and ignore these so we don't crash on + boot. + + Without this we get: :: + + [ 151.251240444,3] DT: dt_attach_root failed, duplicate unknown@0 + [ 151.251300274,3] *********************************************** + [ 151.251339330,3] Unexpected exception 200 ! + [ 151.251363654,3] SRR0 : 0000000030090c28 SRR1 : 9000000000201000 + [ 151.251409207,3] HSRR0: 0000000000000010 HSRR1: 9000000000001000 + [ 151.251444114,3] LR : 30034018300c5ab0 CTR : 30034018300a343c + [ 151.251478314,3] CFAR : 0000000030024804 + [ 151.251500346,3] CR : 40004208 XER: 00000000 + <snip GPRS> + [ 151.252083372,0] Aborting! 
+ CPU 0034 Backtrace: + S: 0000000031cd36a0 R: 000000003001364c .backtrace+0x2c + S: 0000000031cd3730 R: 0000000030018db8 ._abort+0x4c + S: 0000000031cd37b0 R: 0000000030025c6c .exception_entry+0x114 + S: 0000000031cd3840 R: 0000000000001f00 * +0x1f00 + S: 0000000031cd3a10 R: 0000000031cd3ab0 * + S: 0000000031cd3aa0 R: 00000000300248b8 .new_property+0x90 + S: 0000000031cd3b30 R: 0000000030024b50 .__dt_add_property_cells+0x30 + S: 0000000031cd3bd0 R: 000000003009abec .parse_i2c_devs+0x350 + S: 0000000031cd3cf0 R: 0000000030093ffc .parse_hdat+0x11e4 + S: 0000000031cd3e30 R: 00000000300144c8 .main_cpu_entry+0x138 + S: 0000000031cd3f00 R: 0000000030002648 boot_entry+0x198 + +PHB4: + + - phb4: Enforce root complex config space size of 2048 + + The root complex config space size on PHB4 is 2048. This patch sets + that size and enforces it when trying to read/write the config space + in the root complex. + + Without this someone reading the config space via sysfs in Linux will + cause an EEH on the PHB. + + If the offset is too high, reads return 1s and writes are silently dropped. + - phb4: Add an option for disabling EEH MMIO in nvram + + Having the option to disable EEH for MMIO without rebuilding skiboot + could be useful for testing, so check for pci-eeh-mmio=disabled in nvram. + + This is not designed to be a supported option or configuration, just + an option that's useful in bringup and development of POWER9 systems. + - phb4: Fix slot presence detect + + This has the nice side effect of improving boot times since we no + longer waste time trying to train links that don't have anything + present. + - phb4: Enable EEH for MMIO + - phb4: Implement fence check + - phb4: Implement diag data + +OCC: + + - occ/irq: Fix SCOM address and irq reasons for P9 OCC + + This patch fixes the SCOM address for OCC_MISC register which is used + for OCC interrupts. In P9, OCC sends an interrupt to notify change in + the shared memory like throttle status. 
This patch handles this + interrupt reason. + +PRD: + + - prd: Fix PRD scoms for P9 + +NX/DARN: + + - nx: Add POWER9 DARN support + +NPU2: + + - npu2: Do not attempt to initialise non DD1 hardware + + There are significant changes to hardware register addresses and + meanings on newer chip revisions making them unlikely to work + correctly with the existing code. Better to fail clearly and early. + + - npu, npu2: Describe diag data size in device tree + +Memory Reservation: + + - mem_region: Add reserved regions after memory init + + When a new memory region is added (e.g. for memory reserved by firmware) + the list of existing memory regions is iterated through and a cut-out is + made in any existing region that overlaps with the new one. Prior to the + HDAT reservations being made the region init process was always: + + 1) Create regions from the memory@<addr> DT nodes. (mostly large) + 2) Create reserved regions from the device-tree. (mostly small) + + When adding new regions we have assumed that the new region will only + ever intersect with at most one existing region, which it will split. + Adding reservations inside the HDAT parser breaks this because when + adding the memory@<addr> node regions we can potentially overlap with + multiple reserved regions. This patch fixes this by maintaining a + separate list of memory reservations and delaying merging them until + after the normal memory init has finished, similar to how DT + reservations are handled. + +PCI +--- + +- pci: Describe PHB diag data size in device tree + + Linux hardcodes the PHB diag data buffer at (as of this commit) 8192 bytes. + This has been enough for P7IOC and PHB3, but the 512 PEs of PHB4 push + the diag data blob over this size. Rather than just increasing the + hardcoded size in Linux, provide the size of the diag data blob in the + device tree so that the OS can dynamically allocate as much as it needs. 
+ This both enables more space for PHB4 and less wasted memory for P7IOC + and PHB3. + + P7IOC communicates both hub and PHB data using this buffer, so when + setting the size, use whichever struct is largest. +- hdata/i2c: Fix bus and clock frequencies +- ibm-fsp: use opal-prd on p9 and above + + Previously the PRD tooling ran on the FSP, but it was moved into + userspace on the host for OpenPower systems. For P9 this system + was adopted for FSP systems too. + + +I2C +--- +- i2c: Remove old hack for bad clock frequency + + This hack dates back to ancient P8 hostboots. The value + it would use if it detected the "bad" value was incorrect + anyway. + +- i2c: Log the engine clock frequency at boot + +FSP Systems +----------- + +These include the Apollo, Firenze and ZZ platforms. + +- Remove multiple logging for un-handled fsp sub commands. + + If a new or unknown command needs to be handled, log the + unhandled message only from fsp; it is not required from fsp-dpo. :: + + cat /sys/firmware/opal/msglog | grep -i ,3 + [ 110.232114723,3] FSP: fsp_trigger_reset() entry + [ 188.431793837,3] FSP #0: Link down, starting R&R + [ 464.109239162,3] FSP #0: Got XUP with no pending message ! + [ 466.340598554,3] FSP-DPO: Unknown command 0xce0900 + [ 466.340600126,3] FSP: Unhandled message ce0900 + +- FSP: Notify FSP of Platform Log ID after Host Initiated Reset Reload + + Triggering a Host Initiated Reset (when the host detects the FSP has gone + out to lunch and should be rebooted), would cause "Unknown Command" messages + to appear in the OPAL log. + + This patch implements those messages. + + How to trigger FSP RR(HIR): :: + + $ putmemproc 300000f8 0x00000000deadbeef + s1 k0:n0:s0:p00 + ecmd_ppc putmemproc 300000f8 0x00000000deadbeef + + Log showing unknown command: + / # cat /sys/firmware/opal/msglog | grep -i ,3 + [ 110.232114723,3] FSP: fsp_trigger_reset() entry + [ 188.431793837,3] FSP #0: Link down, starting R&R + [ 464.109239162,3] FSP #0: Got XUP with no pending message ! 
+ [ 466.340598554,3] FSP-DPO: Unknown command 0xce0900 + [ 466.340600126,3] FSP: Unhandled message ce0900 + + The message we need to handle is "Get PLID after host initiated FipS + reset/reload". When the FSP comes back from HIR, it asks "hey, so, which + error log explains why you rebooted me?". So, we tell it. + +Misc +---- + +- hdata_to_dt: Misc improvements in the utility and unit test +- GCC7: fixes for -Wimplicit-fallthrough expected regexes + + It turns out GCC7 adds a useful warning and does fancy things like + parsing your comments to work out that you intended to do the fallthrough. + There's a few places where we don't match the regex. Fix them, as it's + harmless to do so. + + Found by building on Fedora Rawhide in Travis. + + While we do not have everything needed to start building successfully + with GCC7 (well, at least doing so warning clean), it's a start. +- hdata/i2c: avoid possible int32_t overflow + + We're safe up until engine number 524288. Found by static analysis (of course) +- tpm_i2c_nuvoton: fix use-after-free in tpm_register_chip failure path +- mambo: Fix reserved-ranges node +- external/mambo: add helper for machine checks +- console: Set log level from nvram + + This adds two new nvram options to set the console log level for the + driver/uart and in memory. These are called log-level-memory and + log-level-driver. + + These are only set once we have nvram inited. + + To set them you do: :: + + nvram -p ibm,skiboot --update-config log-level-memory=9 + nvram -p ibm,skiboot --update-config log-level-driver=9 + + You can also use the named versions of emerg, alert, crit, err, + warning, notice, printf, info, debug, trace or insane. ie. 
:: + + nvram -p ibm,skiboot --update-config log-level-driver=insane + +- npu: Implement Function Level Reset (FLR) +- mbox: Sanitize interrupts registers +- xive: Fix potential for lost IPIs when manipulating CPPR +- xive: Don't double EOI interrupts that have an EOI override +- libflash/file: Only use 64bit MTD erase ioctl() when needed + + We recently made MTD 64 bit safe in e5720d3fe94 which now requires the + 64 bit MTD erase ioctl. Unfortunately this ioctl is not present in + older kernels used by some BMC vendors that use pflash. + + This patch addresses this by only using the 64bit version of the erase + ioctl() if the parameters exceed 32bit in size. + + If an erase requires the 64bit ioctl() on a kernel which does not + support it, the code will still attempt it. There is no way of knowing + beforehand if the kernel supports it. The ioctl() will fail and an error + will be returned from the function. + +Contributors +------------ + +This release contains 81 csets from 15 developers, working at 2 employers. 
+A total of 2496 lines added, 641 removed (delta 1855) + +Developers with the most changesets +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +=========================== == ======= +Developer # % +=========================== == ======= +Oliver O'Halloran 17 (21.0%) +Benjamin Herrenschmidt 17 (21.0%) +Michael Neuling 16 (19.8%) +Stewart Smith 9 (11.1%) +Russell Currey 8 (9.9%) +Alistair Popple 5 (6.2%) +ppaidipe@linux.vnet.ibm.com 1 (1.2%) +Dave Heller 1 (1.2%) +Jeff Scheel 1 (1.2%) +Nicholas Piggin 1 (1.2%) +Ananth N Mavinakayanahalli 1 (1.2%) +Cyril Bur 1 (1.2%) +Alexey Kardashevskiy 1 (1.2%) +Jim Yuan 1 (1.2%) +Shilpasri G Bhat 1 (1.2%) +=========================== == ======= + +Developers with the most changed lines +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +=========================== === ======= +Developer # % +=========================== === ======= +Michael Neuling 748 (28.4%) +Benjamin Herrenschmidt 405 (15.4%) +Russell Currey 360 (13.7%) +Oliver O'Halloran 297 (11.3%) +Nicholas Piggin 187 (7.1%) +Alistair Popple 183 (7.0%) +Stewart Smith 175 (6.6%) +Shilpasri G Bhat 79 (3.0%) +Jim Yuan 56 (2.1%) +Ananth N Mavinakayanahalli 45 (1.7%) +Cyril Bur 38 (1.4%) +Alexey Kardashevskiy 37 (1.4%) +Jeff Scheel 19 (0.7%) +Dave Heller 2 (0.1%) +Pridhiviraj Paidipeddi 1 (0.0%) +=========================== === ======= + +Developers with the most lines removed +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +=========================== === ======= +Developer # % +=========================== === ======= +Pridhiviraj Paidipeddi 1 (0.2%) +=========================== === ======= + +Developers with the most signoffs +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +Total of 73. + +========================= === ======= +Developer # % +========================= === ======= +Stewart Smith 56 (76.7%) +Michael Neuling 16 (21.9%) +Oliver O'Halloran 1 (1.4%) +========================= === ======= + +Developers with the most reviews +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +Total of 6. 
+ +========================= === ======= +Developer # % +========================= === ======= +Oliver O'Halloran 3 (50.0%) +Andrew Donnellan 1 (16.7%) +Gavin Shan 1 (16.7%) +Cyril Bur 1 (16.7%) +========================= === ======= + +Developers with the most test credits +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +Total of 5. + +========================= === ======= +Developer # % +========================= === ======= +Oliver O'Halloran 2 (40.0%) +Vaidyanathan Srinivasan 1 (20.0%) +Vasant Hegde 1 (20.0%) +Michael Ellerman 1 (20.0%) +========================= === ======= + +Developers who gave the most tested-by credits +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +Total of 5. + +========================= === ======= +Developer # % +========================= === ======= +Oliver O'Halloran 2 (40.0%) +Benjamin Herrenschmidt 2 (40.0%) +Nicholas Piggin 1 (20.0%) +========================= === ======= + +Developers with the most report credits +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +Total of 2. + +=========================== === ======= +Developer # % +=========================== === ======= +Benjamin Herrenschmidt 1 (50.0%) +Pridhiviraj Paidipeddi 1 (50.0%) +=========================== === ======= + +Developers who gave the most report credits +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +Total of 2. + +========================= === ======= +Developer # % +========================= === ======= +Stewart Smith 2 (100.0%) +========================= === ======= + +Top changeset contributors by employer +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +Total of 2. 
+ +========================= === ======= +Employer # % +========================= === ======= +IBM 80 (98.8%) +SuperMicro 1 (1.2%) +========================= === ======= + +Top lines changed by employer +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +========================= ==== ======= +Employer # % +========================= ==== ======= +IBM 2576 (97.9%) +SuperMicro 56 (2.1%) +========================= ==== ======= + +Employers with the most signoffs +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +Total 73. + +========================= === ======= +Employer # % +========================= === ======= +IBM 73 (100.0%) +========================= === ======= + +Employers with the most hackers +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +Total 15. + +========================= === ======= +Employer # % +========================= === ======= +IBM 14 (93.3%) +SuperMicro 1 (6.7%) +========================= === ======= diff --git a/roms/skiboot/doc/release-notes/skiboot-5.6.0-rc2.rst b/roms/skiboot/doc/release-notes/skiboot-5.6.0-rc2.rst new file mode 100644 index 000000000..fd1a098e2 --- /dev/null +++ b/roms/skiboot/doc/release-notes/skiboot-5.6.0-rc2.rst @@ -0,0 +1,72 @@ +.. _skiboot-5.6.0-rc2: + +skiboot-5.6.0-rc2 +================= + +skiboot-5.6.0-rc2 was released on Friday May 19th 2017. It is the second +release candidate of skiboot 5.6, which will become the new stable release +of skiboot following the 5.5 release, first released April 7th 2017. + +skiboot-5.6.0-rc2 contains all bug fixes as of :ref:`skiboot-5.4.4` +and :ref:`skiboot-5.1.19` (the currently maintained stable releases). We +do not currently expect to do any 5.5.x stable releases. + +For how the skiboot stable releases work, see :ref:`stable-rules` for details. + +The current plan is to cut the final 5.6.0 by May 22nd, with skiboot 5.6.0 +being for all POWER8 and POWER9 platforms in op-build v1.17 (Due May 24th). +This is a short cycle as this release is mainly targeted towards POWER9 +bringup efforts. 
+ +With skiboot 5.6.0, we are moving to a regular six week release cycle, +similar to op-build, but slightly offset to allow for a short stabilisation +period. Expected release dates and contents are tracked using GitHub milestones +and issues: https://github.com/open-power/skiboot/milestones + +Over :ref:`skiboot-5.6.0-rc1`, we have the following changes: + +- hw/i2c: Fix early lock drop + + When interacting with an I2C master the p8-i2c driver (common to p9) + acquires a per-master lock which it holds for the duration of its + interaction with the master. Unfortunately, when + p8_i2c_check_initial_status() detects that the master is busy with + another transaction it drops the lock and returns OPAL_BUSY. This is + contrary to the driver's locking strategy which requires that the + caller acquire and drop the lock. This leads to a crash due to the + double unlock(), which skiboot treats as fatal. + +- mambo: Add skiboot/linux symbol lookup + + Adds the skisym and linsym commands which can be used to find the + address of a Linux or Skiboot symbol. To function this requires + the user to provide the SKIBOOT_MAP and VMLINUX_MAP environment + variables which indicate which skiboot.map and System.map files + should be used. 
+ + Examples: + + - Look up a symbol address: :: + + systemsim % skisym .load_and_boot_kernel + 0x0000000030013a08 + + - Set a breakpoint there: :: + + systemsim % b [skisym .load_and_boot_kernel] + breakpoint set at [0:0]: 0x0000000030013a08 (0x0000000030013A08) Enc:0x7D800026 : mfcr r12 + + +- libstb: Fix build in OpenSSL 1.1 + + The build failure was as follows: :: + + [ HOSTCC ] libstb/create-container.c + In file included from /usr/include/openssl/asn1.h:24:0, + from /usr/include/openssl/ec.h:30, + from libstb/create-container.c:36: + libstb/create-container.c: In function ‘getSigRaw’: + libstb/create-container.c:104:31: error: dereferencing pointer to incomplete + type ‘ECDSA_SIG {aka struct ECDSA_SIG_st}’ + rlen = BN_num_bytes(signature->r); + ^ diff --git a/roms/skiboot/doc/release-notes/skiboot-5.6.0.rst b/roms/skiboot/doc/release-notes/skiboot-5.6.0.rst new file mode 100644 index 000000000..c51b238ee --- /dev/null +++ b/roms/skiboot/doc/release-notes/skiboot-5.6.0.rst @@ -0,0 +1,30 @@ +.. _skiboot-5.6.0: + +skiboot-5.6.0 +============= + +skiboot-5.6.0 was released on Wednesday 24th May 2017. It is the new stable +release of skiboot, taking over from the 5.5 release, first released on +April 7th 2017. It is the first release done in a regular six week release +cycle, mirroring that of op-build. + +skiboot-5.6.0 contains all bug fixes as of :ref:`skiboot-5.4.4` +and :ref:`skiboot-5.1.19` (the currently maintained stable releases). We +do not currently expect to do any 5.5.x stable releases. + +For how the skiboot stable releases work, see :ref:`stable-rules` for details. + +This release is a good level set of POWER9 support for bringup activities. +If you are doing bringup, it is strongly suggested you continue to follow +skiboot master. 
+ +Changes in skiboot-5.6.0 +------------------------ + +See changes in the release candidates: + +- :ref:`skiboot-5.6.0-rc1` +- :ref:`skiboot-5.6.0-rc2` + +The final 5.6.0 release has no functional changes over the 5.6.0-rc2. + diff --git a/roms/skiboot/doc/release-notes/skiboot-5.7-rc1.rst b/roms/skiboot/doc/release-notes/skiboot-5.7-rc1.rst new file mode 100644 index 000000000..575f64f0b --- /dev/null +++ b/roms/skiboot/doc/release-notes/skiboot-5.7-rc1.rst @@ -0,0 +1,979 @@ +.. _skiboot-5.7-rc1: + +skiboot-5.7-rc1 +=============== + +skiboot v5.7-rc1 was released on Monday July 3rd 2017. It is the first +release candidate of skiboot 5.7, which will become the new stable release +of skiboot following the 5.6 release, first released 24th May 2017. + +skiboot v5.7-rc1 contains all bug fixes as of :ref:`skiboot-5.4.6` +and :ref:`skiboot-5.1.19` (the currently maintained stable releases). We +do not currently expect to do any 5.6.x stable releases. + +For how the skiboot stable releases work, see :ref:`stable-rules` for details. + +The current plan is to cut the final 5.7 by July 12th, with skiboot 5.7 +being for all POWER8 and POWER9 platforms in op-build v1.18 (Due July 12th). +This is a short cycle as this release is mainly targetted towards POWER9 +bringup efforts. + +This is the second release using the new regular six week release cycle, +similar to op-build, but slightly offset to allow for a short stabilisation +period. Expected release dates and contents are tracked using GitHub milestone +and issues: https://github.com/open-power/skiboot/milestones + +Over skiboot-5.6, we have the following changes: + +New Features +------------ + +New features in this release for POWER9 systems: + +- In Memory Counters (IMC) (See :ref:`imc` for details) +- phb4: Activate shared PCI slot on witherspoon (see :ref:`Shared Slot <shared-slot-5.7-rc1-rn>`) +- phb4 capi (i.e. 
CAPI2): Enable capi mode for PHB4 (see :ref:`CAPI on PHB4 <capi2-5.7-rc1-rn>`) + +New feature for IBM FSP based systems: + +- fsp/tpo: Provide support for disabling TPO alarm + + This patch adds support for disabling a preconfigured + Timed-Power-On(TPO) alarm on FSP based systems. Presently once a TPO alarm + is configured from the kernel it will be triggered even if its + subsequently disabled. + + With this patch a TPO alarm can be disabled by passing + y_m_d==hr_min==0 to fsp_opal_tpo_write(). A branch is added to the + function to handle this case by sending FSP_CMD_TPO_DISABLE message to + the FSP instead of usual FSP_CMD_TPO_WRITE message. The kernel is + expected to call opal_tpo_write() with y_m_d==hr_min==0 to request + opal to disable TPO alarm. + +POWER9 +------ + +Development on POWER9 systems continues in earnest. + +This release includes the first support for POWER9 DD2 chips. Future releases +will likely contain more bug fixes, this release has booted on real hardware. + +- hdata: Reserve Trace Areas + + When hostboot is configured to setup in memory tracing it will reserve + some memory for use by the hardware tracing facility. We need to mark + these areas as off limits to the operating system and firmware. +- hdata: Make out-of-range idata print at PR_DEBUG + + Some fields just aren't populated on some systems. + +- hdata: Ignore unnamed memory reservations. + + Hostboot should name any and all memory reservations that it provides. + Currently some hostboots export a broken reservation covering the first + 256MB of memory and this causes the system to crash at boot due to an + invalid free because this overlaps with the static "ibm,os-reserve" + region (which covers the first 768MB of memory). + + According to the hostboot team unnamed reservations are invalid and can + be ignored. 
+ +- hdata: Check the Host I2C devices array version + + Currently this is not populated on FSP machines which causes some + obnoxious errors to appear in the boot log. We also only want to + parse version 1 of this structure since future versions will completely + change the array item format. + +- Ensure P9 DD1 workarounds apply only to Nimbus + + The workarounds for P9 DD1 are only needed for Nimbus. P9 Cumulus will + be DD1 but don't need these same workarounds. + + This patch ensures the P9 DD1 workarounds only apply to Nimbus. It + also renames some things to make clear what's what. + +- cpu: Cleanup AMR and IAMR when re-initializing CPUs + + There's a bug in current Linux kernels leaving crap in those registers + accross kexec and not sanitizing them on boot. This breaks kexec under + some circumstances (such as booting a hash kernel from a radix one + on P9 DD2.0). + + The long term fix is in Linux, but this workaround is a reasonable + way of "sanitizing" those SPRs when Linux calls opal_reinit_cpus() + and shouldn't have adverse effects. + + We could also use that same mechanism to cleanup other things as + well such as restoring some other SPRs to their default value in + the future. + +- Set POWER9 RPR SPR to 0x00000103070F1F3F. Same value as P8. + + Without this, thread priorities inside a core don't work. + +- cpu: Support setting HID[RADIX] and set it by default on P9 + + This adds new opal_reinit_cpus() flags to setup radix or hash + mode in HID[8] on POWER9. + + By default HID[8] will be set. On P9 DD1.0, Linux will change + it as needed. On P9 DD2.0 hash works in radix mode (radix is + really "dual" mode) so KVM won't break and existing kernels + will work. + + Newer kernels built for hash will call this to clear the HID bit + and thus get the full size of the TLB as an optimization. 
+ +- Add "cleanup_global_tlb" for P9 and later + + Uses broadcast TLBIE's to cleanup the TLB on all cores and on + the nest MMU + +- xive: DD2.0 updates + + Add support for StoreEOI, fix StoreEOI MMIO offset in ESB page, + and other cleanups + +- Update default TSCR value for P9 as recommended by HW folk. + +- xive: Fix initialisation of xive_cpu_state struct + + When using XIVE emulation with DEBUG=1, we run into crashes in log_add() + due to the xive_cpu_state->log_pos being uninitialised (and thus, with + DEBUG enabled, initialised to the poison value of 0x99999999). + +OCC/Power Management +^^^^^^^^^^^^^^^^^^^^ + +With this release, it's possible to boot POWER9 systems with the OCC +enabled and change CPU frequencies. Doing so does require other firmware +components to also support this (otherwise the frequency will not be set). + +- occ: Skip setting cores to nominal frequency in P9 + + In P9, once OCC is up, it is supposed to setup the cores to nominal + frequency. So skip this step in OPAL. +- occ: Fix Pstate ordering for P9 + + In P9 the pstate values are positive. They are a continuous set of + unsigned integers [0 to +N] where Pmax is 0 and Pmin is N. The + linear ordering of pstates for P9 has changed compared to P8. + P8 has negative pstate values advertised as [0 to -N] where Pmax + is 0 and Pmin is -N. This patch adds helper routines to abstract + pstate comparison with pmax and adds sanity pstate limit checks. + This patch also fixes pstate arithmetic by using labs(). +- p8-i2c: occ: Add support for OCC to use I2C engines + + This patch adds support for sharing the I2C engines between the host and the OCC. + The OCC uses I2C engines to read DIMM temperatures and to communicate with + the GPU. The OCC Flag register is used for locking between host and OCC. The host + requests the bus by setting a bit in the OCC Flag register. The OCC sends + an interrupt to indicate the change in ownership.
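As an illustration of the pstate ordering change described in this section, the comparison can be written relative to Pmax so that one helper covers both sign conventions. This is a minimal Python sketch, not skiboot's implementation: the names `cmp_pstates` and `pstate_in_limits` are hypothetical, and only the ranges ([0 to -N] on P8, [0 to +N] on P9, with Pmax at 0) are taken from the notes above.

```python
def cmp_pstates(a, b, pmax):
    """Compare two pstates by their distance from pmax.

    Returns 1 if `a` is the higher-performance pstate, -1 if `b` is,
    and 0 if they are equal. Using abs() mirrors the labs()-based
    arithmetic mentioned in the notes (hypothetical helper).
    """
    dist_a = abs(a - pmax)
    dist_b = abs(b - pmax)
    # A smaller distance from pmax means a faster pstate.
    return (dist_b > dist_a) - (dist_b < dist_a)


def pstate_in_limits(pstate, pmax, pmin):
    """Sanity check: is `pstate` inside the advertised [pmax, pmin] range?

    Works whether pmin is numerically above pmax (P9 style) or
    below it (P8 style).
    """
    low, high = min(pmax, pmin), max(pmax, pmin)
    return low <= pstate <= high
```

With this shape, callers never compare raw pstate values directly, so the same code path can serve a platform that advertises 0 to -N and one that advertises 0 to +N.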
+ +opal-prd/PRD +^^^^^^^^^^^^ + +- opal-prd: Handle SBE passthrough message passing + + This patch adds support for sending the SBE passthrough command to HBRT. +- SBE: Add passthrough command support + + The SBE sends a passthrough command. We have to capture this interrupt and + send an event to HBRT via opal-prd (user space daemon). +- opal-prd: hook up reset_pm_complex + + This change provides the facility to invoke HBRT's reset_pm_complex, in + the same manner as was done with process_occ_reset previously. + + We add a control command for `opal-prd pm-complex reset`, which is just + an alias for occ_reset at this stage. + +- prd: Implement firmware side of opaque PRD channel + + This change introduces the firmware side of the opaque HBRT <--> OPAL + message channel. We define a base message format to be shared with HBRT + (in include/prd-fw-msg.h), and allow firmware requests and responses to + be sent over this channel. + + We don't currently have any notifications defined, so have nothing to do + for firmware_notify() at this stage. + +- opal-prd: Add firmware_request & firmware_notify implementations + + This change adds the implementation of firmware_request() and + firmware_notify(). To do this, we need to add a message queue, so that + we can properly handle out-of-order messages coming from firmware. + +- opal-prd: Add support for variable-sized messages + + With the introduction of the opaque firmware channel, we want to + support variable-sized messages. Rather than expecting to read an + entire 'struct opal_prd_msg' in one read() call, we can split this + over multiple reads, potentially expanding our message buffer. + +- opal-prd: Sync hostboot interfaces with HBRT + + This change adds new callbacks defined for p9, and the base thunks for + the added calls. + +- opal-prd: interpret log level prefixes from HBRT + + Interpret the (optional) \*_MRK log prefixes on HBRT messages, and set + the syslog log priority to suit.
+ +- opal-prd: Add occ reset to usage text +- opal-prd: allow different chips for occ control actions + + The `occ reset` and `occ error` actions can both take a chip id + argument, but we're currently just using zero. This change updates the + control message format to pass the chip ID from the control process to + the opal-prd daemon. + + +PCI/PHB4 +^^^^^^^^ + +- phb4: Fix number of index bits in IODA tables + + On PHB4 the number of index bits in the IODA table address register + was bumped to 10 bits to accommodate 1024 MSIs and 1024 TVEs (DD2). + + However our macro only defined the field to be 9 bits, thus causing + "interesting" behaviours on some systems. + +- phb4: Harden init with bad PHBs + + Currently if we read all 1's from the EEH or IRQ capabilities, we end + up train wrecking on some other random code (eg. an assert() in xive). + + This hardens the PHB4 code to look for these bad reads and more + gracefully fail the init for that PHB alone. This allows the rest of + the system to boot and ignore those bad PHBs. + +- phb4 capi (i.e. CAPI2): Handle HMI events + + Find the CAPP on the chip associated with the HMI event for PHB4. + The recovery mode (re-initialization of the capp, resume of functional + operations) is only available with P9 DD2. A new patch will be provided + to support this feature. + +.. _capi2-5.7-rc1-rn: + +- phb4 capi (i.e. CAPI2): Enable capi mode for PHB4 + + Enable the Coherently attached processor interface. The PHB is used as + a CAPI interface. + CAPI Adapters can be connected to either PEC0 or PEC2. Single port + CAPI adapter can be connected to either PEC0 or PEC2, but Dual-Port + Adapter can be only connected to PEC2 + * CAPP0 attached to PHB0(PEC0 - single port) + * CAPP1 attached to PHB3(PEC2 - single or dual port) + +- hw/phb4: Rework phb4_get_presence_state() + + There are two issues in the current implementation: it should return an error code + visible to Linux (one with the OPAL_* prefix), and the code isn't very obvious.
+ + This returns OPAL_HARDWARE when the PHB is broken. Otherwise, OPAL_SUCCESS + is always returned. In the mean while, It refactors the code to make it + obvious: OPAL_PCI_SLOT_PRESENT is returned when the presence signal (low active) + or PCIe link is active. Otherwise, OPAL_PCI_SLOT_EMPTY is returned. + +- phb4: Error injection for config space + + Implement CFG (config space) error injection. + + This works the same as PHB3. MMIO and DMA error injection require a + rewrite, so they're unsupported for now. + + While it's not feature complete, this at least provides an easy way to + inject an error that will trigger EEH. + +- phb4: Error clear implementation +- phb4: Mask link down errors during reset + + During a hot reset the PCI link will drop, so we need to mask link down + events to prevent unnecessary errors. +- phb4: Implement root port initialization + + phb4_root_port_init() was a NOP before, so fix that. +- phb4: Complete reset implementation + + This implements complete reset (creset) functionality for POWER9 DD1. + + Only partially tested and contends with some DD1 errata, but it's a start. + +.. _shared-slot-5.7-rc1-rn: + +- phb4: Activate shared PCI slot on witherspoon + + Witherspoon systems come with a 'shared' PCI slot: physically, it + looks like a x16 slot, but it's actually two x8 slots connected to two + PHBs of two different chips. Taking advantage of it requires some + logic on the PCI adapter. Only the Mellanox CX5 adapter is known to + support it at the time of this writing. + + This patch enables support for the shared slot on witherspoon if a x16 + adapter is detected. Each x8 slot has a presence bit, so both bits + need to be set for the activation to take place. Slot sharing is + activated through a gpio. + + Note that there's no easy way to be sure that the card is indeed a + shared-slot compatible PCI adapter and not a normal x16 card. 
Plugging + a normal x16 adapter on the shared slot should be avoided on + witherspoon, as the link won't train on the second slot, resulting in + a timeout and a longer boot time. Only the first slot is usable and + the x16 adapter will end up using only half the lines. + + If the PCI card plugged on the physical slot is only x8 (or less), + then the presence bit of the second slot is not set, so this patch + does nothing. The x8 (or less) adapter should work like on any other + physical slot. + +- phb4: Block D-state power management on direct slots + + As current revisions of PHB4 don't properly handle the resulting + L1 link transition. + +- phb4: Call pci config filters + +- phb4: Mask out write-1-to-clear registers in RC cfg + + The root complex config space only supports 4-byte accesses. Thus, when + the client requests a smaller size write, we do a read-modify-write to + the register. + + However, some register have bits defined as "write 1 to clear". + + If we do a RMW cycles on such a register and such bits are 1 in the + part that the client doesn't intend to modify, we will accidentally + write back those 1's and clear the corresponding bit. + + This avoids it by masking out those magic bits from the "old" value + read from the register. + +- phb4: Properly mask out link down errors during reset +- phb3/4: Silence a useless warning + + PHB's don't have base location codes on non-FSP systems and it's + normal. + +- phb4: Workaround bug in spec 053 + + Wait for DLP PGRESET to clear *after* lifting the PCIe core reset + +- phb4: DD2.0 updates + + Support StoreEOI, full complements of PEs (twice as big TVT) + and other updates. + + Also renumber init steps to match spec 063 + + +NPU2 +^^^^ + +Note that currently NPU2 support is limited to POWER9 DD1 hardware. + +- platforms/astbmc/witherspoon.c: Add NPU2 slot mappings + + For NVLink2 to function PCIe devices need to be associated with the right + NVLinks. 
This association is supposed to be passed down to Skiboot via HDAT but + those fields are still not correctly filled out. To work around this we add slot + tables for the NVLinks similar to what we have for P8+. + +- hw/npu2.c: Fix device aperture calculation + + The POWER9 NPU2 implements an address compression scheme to compress 56-bit P9 + physical addresses to 47-bit GPU addresses. System software needs to know both + addresses, unfortunately the calculation of the compressed address was + incorrect. Fix it here. + +- hw/npu2.c: Change MCD BAR allocation order + + MCD BARs need to be correctly aligned to the size of the region. As GPU + memory is allocated from the top of memory down we should start allocating + from the highest GPU memory address to the lowest to ensure correct + alignment. + +- NPU2: Add flag to nvlink config space indicating DL reset state + + Device drivers need to be able to determine if the DL is out of reset or + not so they can safely probe to see if links have already been trained. + This patch adds a flag to the vendor specific config space indicating if + the DL is out of reset. + +- hw/npu2.c: Hardcode MSR_SF when setting up npu XTS contexts + + We don't support anything other than 64-bit mode for address translations so we + can safely hardcode it. + +- hw/npu2-hw-procedures.c: Add nvram option to override zcal calculations + + In some rare cases the zcal state machine may fail and flag an error. According + to hardware designers it is sometimes ok to ignore this failure and use nominal + values for the calculations. In this case we add a nvram variable + (nv_zcal_override) which will cause skiboot to ignore the failure and use the + nominal value specified in nvram. +- npu2: Fix npu2_{read,write}_4b() + + When writing or reading 4-byte values, we need to use the upper half of + the 64-bit SCOM register. + + Fix npu2_{read,write}_4b() and their callers to use uint32_t, and + appropriately shift the value being written or returned. 
+ + +- hw/npu2.c: Fix opal_npu_map_lpar to search for existing BDF +- hw/npu2-hw-procedures.c: Fix running of zcal procedure + + The zcal procedure should only be run once per obus (ie. once per group of 3 + links). Clean up the code and fix the potential buffer overflow due to a typo. + Also updates the zcal settings to their proper values. +- hw/npu2.c: Add memory coherence directory programming + + The memory coherence directory (MCD) needs to know which system memory addresses + belong to the GPU. This amounts to setting a BAR and a size in the MCD to cover + the addresses assigned to each of the GPUs. To ease assignment we assume GPUs + are assigned memory in a contiguous block per chip. + + +pflash/libflash +--------------- + +- libflash/libffs: Zero checksum words + + On writing ffs entries to flash libffs doesn't zero checksum words + before calculating the checksum across the entire structure. This causes + an inaccurate calculation of the checksum as it may calculate a checksum + on non-zero checksum bytes. + +- libffs: Fix ffs_lookup_part() return value + + It would return success when the part wasn't found +- libflash/libffs: Correctly update the actual size of the partition + + libffs has been updating FFS partition information in the wrong place + which leads to incomplete erases and corruption. +- libflash: Initialise entries list earlier + + In the bail-out path we call ffs_close() to tear down the partially + initialised ffs_handle. ffs_close() expects the entries list to be + initialised so we need to do that earlier to prevent a null pointer + dereference. + +mbox-flash +---------- + +mbox-flash is the emerging standard way of talking to host PNOR flash +on POWER9 systems. + +- libflash/mbox-flash: Implement MARK_WRITE_ERASED mbox call + + Version two of the mbox-flash protocol defines a new command: + MARK_WRITE_ERASED. + + This command provides a simple way to mark a region of flash as all 0xff + without the need to go and write all 0xff. 
This is an optimisation as + there is no need for an erase before a write, it is the responsibility of + the BMC to deal with the flash correctly, however in v1 it was ambiguous + what a client should do if the flash should be erased but not actually + written to. This allows of a optimal path to resolve this problem. + +- libflash/mbox-flash: Update to V2 of the protocol + + Updated version 2 of the protocol can be found at: + https://github.com/openbmc/mboxbridge/blob/master/Documentation/mbox_protocol.md + + This commit changes mbox-flash such that it will preferentially talk + version 2 to any capable daemon but still remain capable of talking to + v1 daemons. + + Version two changes some of the command definitions for increased + consistency and usability. + Version two includes more attention bits - these are now dealt with at a + simple level.
+ +- hw/lpc-mbox: Use message registers for interrupts + + Currently the BMC raises the interrupt using the BMC control register. + It does so on all accesses to the 16 'data' registers meaning that when + the BMC only wants to set the ATTN (on which we have interrupts enabled) + bit we will also get a control register based interrupt. + + The solution here is to mask that interrupt permanently and enable + interrupts on the protocol defined 'response' data byte. + +General fixes +------------- + +- Reduce log level on non-error log messages + + 90% of what we print isn't useful to a normal user. This + dramatically reduces the number of messages printed by + OPAL in normal circumstances. + +- init: Silence messages and call ourselves "OPAL" +- psi: Switch to ESB mode later + + There's an erratum: if we switch to ESB mode before setting up + the various ESB mode related registers, pending interrupts + can go wrong. + +- lpc: Enable "new" SerIRQ mode +- hw/ipmi/ipmi-sel: missing newline in prlog warning + +- p8-i2c OCC lock: fix locking in p9_i2c_bus_owner_change +- Convert important polling loops to spin at lowest SMT priority + + The pattern of calling cpu_relax() inside a polling loop does + not suit the powerpc SMT priority instructions. It is preferred to + set a low priority, spin until the break condition is reached, + then restore priority. + +- Improve cpu_idle when PM is disabled + + Split cpu_idle() into cpu_idle_delay() and cpu_idle_job() rather than + requesting the idle type as a function argument. Have those functions + provide a default polling (non-PM) implementation which spins at the + lowest SMT priority. + +- core/fdt: Always add a reserve map + + Currently we skip adding the reserved ranges block to the generated + FDT blob if we are excluding the root node. This can result in a DTB + that dtc will barf on because the reserved memory ranges overlap with + the start of the dt_struct block.
As an example: :: + + $ fdtdump broken.dtb -d + /dts-v1/; + // magic: 0xd00dfeed + // totalsize: 0x7f3 (2035) + // off_dt_struct: 0x30 <----\ + // off_dt_strings: 0x7b8 | this is bad! + // off_mem_rsvmap: 0x30 <----/ + // version: 17 + // last_comp_version: 16 + // boot_cpuid_phys: 0x0 + // size_dt_strings: 0x3b + // size_dt_struct: 0x788 + + /memreserve/ 0x100000000 0x300000004; + /memreserve/ 0x3300000001 0x169626d2c; + /memreserve/ 0x706369652d736c6f 0x7473000000000003; + *continues* + + With this patch: :: + + $ fdtdump working.dtb -d + /dts-v1/; + // magic: 0xd00dfeed + // totalsize: 0x803 (2051) + // off_dt_struct: 0x40 + // off_dt_strings: 0x7c8 + // off_mem_rsvmap: 0x30 + // version: 17 + // last_comp_version: 16 + // boot_cpuid_phys: 0x0 + // size_dt_strings: 0x3b + // size_dt_struct: 0x788 + + // 0040: tag: 0x00000001 (FDT_BEGIN_NODE) + / { + // 0048: tag: 0x00000003 (FDT_PROP) + // 07fb: string: phandle + // 0054: value + phandle = <0x00000001>; + *continues* + +- hw/lpc-mbox: Use message registers for interrupts + + Currently the BMC raises the interrupt using the BMC control register. + It does so on all accesses to the 16 'data' registers meaning that when + the BMC only wants to set the ATTN (on which we have interrupts enabled) + bit we will also get a control register based interrupt. + + The solution here is to mask that interrupt permanantly and enable + interrupts on the protocol defined 'response' data byte. + + +PCI +--- +- pci: Wait 20ms before checking presence detect on PCIe + + As the PHB presence logic has a debounce timer that can take + a while to settle. 
+ +- phb3+iov: Fixup support for config space filters + + The filter should be called before the HW access and its + return value controls whether to perform the access or not. +- core/pci: Use PCI slot's power facility in pci_enable_bridge() + + The current implementation has incorrect assumptions: that there is + always a PCI slot associated with the root port and PCIe switch + downstream port, and that all of them can change their + power state through the PCICAP_EXP_SLOTCTL register. Firstly, there + might not be a PCI slot associated with the root port or PCIe + switch downstream port. Secondly, the power isn't always controlled + by the standard config register (PCICAP_EXP_SLOTCTL). There are + I2C slave devices used to control the power states on Tuleta. + + In order to use the PCI slot's methods to manage the power + states, this does: + + * Introduce PCI_SLOT_FLAG_ENFORCE, which indicates that the requested operation + must be applied. + * pci_enable_bridge() is split into 3 functions: pci_bridge_power_on() + to power it on; pci_enable_bridge() as a place holder and + pci_bridge_wait_link() to wait for the downstream link to come up. + * In pci_bridge_power_on(), the PCI slot's specific power management + methods are used if there is a PCI slot associated with the PCIe + switch downstream port or root port. +- platforms/astbmc/slots.c: Allow comparison of bus numbers when matching slots + + When matching devices on multiple downstream PLX busses we need to compare more + than just the device-id of the PCIe BDFN, so increase the mask to do so. + +Tests and simulators +-------------------- + +- boot-tests: add OpenBMC support +- boot_test.sh: Add SMC BMC support + + Your BMC needs a special debug image flashed to use this, the exact + image and methods aren't something I can publish here, but if you work + for IBM or SMC you can find out from the right sources. + + A few things are needed to move around to be able to flash to a SMC BMC.
+ + For a start, the SSH daemon will only accept connections after a special + incantation (which I also can't share), but you should put that in the + ~/.skiboot_boot_tests file along with some other default login information + we don't publicise too broadly (because Security Through Obscurity is + *obviously* a good idea....) + + We also can't just directly "ssh /bin/true", we need an expect script, + and we can't scp, but we can anonymous rsync! + + You also need a pflash binary to copy over. +- hdata_to_dt: Add PVR overrides to the usage text +- mambo: Add a reservation for the initramfs + + On most systems the initramfs is loaded inside the part of memory + reserved for the OS [0x0-0x30000000] and skiboot will never touch it. + On mambo it's loaded at 0x80000000 and if you're unlucky skiboot can + allocate over the top of it and corrupt the initramfs blob. + + There might be the downside that the kernel cannot re-use the initramfs + memory since it's marked as reserved, but the kernel might also free it + anyway. +- mambo: Update P9 PVR to reflect Scale out 24 core chips + + The P9 PVR bits 48:51 don't indicate a revision but instead different + configurations. From BookIV we have: + + ==== =================== + Bits Configuration + ==== =================== + 0 Scale out 12 cores + 1 Scale out 24 cores + 2 Scale up 12 cores + 3 Scale up 24 cores + ==== =================== + + Skiboot will mostly use the "Scale out 24 core" configuration + (ie. SMT4 not SMT8) so reflect this in mambo. +- core: Move enable_mambo_console() into chip initialisation + + Rather than having a wart in main_cpu_entry() that initialises the mambo + console, we can move it into init_chips() which is where we discover that we're + on mambo. + +- mambo: Create multiple chips when we have multiple CPUs + + Currently when we boot mambo with multiple CPUs, we create multiple CPU nodes in + the device tree, and each claims to be on a separate chip.
+ + However we don't create multiple xscom nodes, which means skiboot only knows + about a single chip, and all CPUs end up on it. At the moment mambo is not able + to create multiple xscom controllers. We can create fake ones, just by faking + the device tree up, but that seems uglier than this solution. + + So create a mambo-chip for each CPU other than 0, to tell skiboot we want a + separate chip created. This then enables Linux to see multiple chips: :: + + smp: Brought up 2 nodes, 2 CPUs + numa: Node 0 CPUs: 0 + numa: Node 1 CPUs: 1 + +- chip: Add support for discovering chips on mambo + + Currently the only way for skiboot to discover chips is by looking for xscom + nodes. But on mambo it's currently not possible to create multiple xscom nodes, + which means we can only simulate a single chip system. + + However it seems we can fairly cleanly add support for a special mambo chip + node, and use that to instantiate multiple chips. + + Add a check in init_chip() that we're not clobbering an already initialised + chip, now that we have two places that initialise chips. +- mambo: Make xscom claim to be DD 2.0 + + In the mambo tcl we set the CPU version to DD 2.0, because mambo is not + bug compatible with DD 1. + + But in xscom_read_cfam_chipid() we have a hard coded value, to work + around the lack of the f000f register, which claims to be P9 DD 1.0. + + This doesn't seem to cause crashes or anything, but at boot we do see: :: + + [ 0.003893084,5] XSCOM: chip 0x0 at 0x1a0000000000 [P9N DD1.0] + + So fix it to claim that the xscom is also DD 2.0 to match the CPU. 
+ +- mambo: Match whole string when looking up symbols with linsym/skisym + + linsym/skisym use a regex to match the symbol name, and accepts a + partial match against the entry in the symbol map, which can lead to + somewhat confusing results, eg: :: + + systemsim % linsym early_setup + 0xc000000000027890 + systemsim % linsym early_setup$ + 0xc000000000aa8054 + systemsim % linsym early_setup_secondary + 0xc000000000027890 + + I don't think that's the behaviour we want, so append a $ to the name so + that the symbol has to match against the whole entry, eg: :: + + systemsim % linsym early_setup + 0xc000000000aa8054 + +- Disable nap on P8 Mambo, public release has bugs +- mambo: Allow loading multiple CPIOs + + Currently we have support for loading a single CPIO and telling Linux to + use it as the initrd. But the Linux code actually supports having + multiple CPIOs contiguously in memory, between initrd-start and end, and + will unpack them all in order. That is a really nice feature as it means + you can have a base CPIO with your root filesystem, and then tack on + others as you need for various tests etc. + + So expand the logic to handle SKIBOOT_INITRD, and treat it as a comma + separated list of CPIOs to load. I chose comma as it's fairly rare in + filenames, but we could make it space, colon, whatever. Or we could add + a new environment variable entirely. The code also supports trimming + whitespace from the values, so you can have "cpio1, cpio2". +- hdata/test: Add memory reservations to hdata_to_dt + + Currently memory reservations are parsed, but since they are not + processed until mem_region_init() they don't appear in the output + device tree blob. Several bugs have been found with memory reservations + so we want them to be part of the test output. + + Add them and clean up several usages of printf() since we want only the + dtb to appear in standard out. 
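The comma-separated SKIBOOT_INITRD handling described in the mambo CPIO entry above can be sketched as follows. This is an illustrative Python sketch only: the mambo script itself is not Python, the names `parse_initrd_list` and `layout_cpios` are hypothetical, and the sizes passed in are made up. The 0x80000000 base is the mambo initramfs load address quoted in these notes, and placing the images back to back with no padding is an assumption.

```python
def parse_initrd_list(value):
    """Split a SKIBOOT_INITRD-style value into CPIO file names.

    Entries are comma separated and surrounding whitespace is trimmed,
    so "cpio1, cpio2" yields two names. Empty entries are dropped.
    """
    return [name.strip() for name in value.split(",") if name.strip()]


def layout_cpios(sizes, base=0x80000000):
    """Place images contiguously in memory, starting at `base`.

    Returns a list of (start, end) address pairs, one per image.
    The first start and the last end correspond to what would be
    advertised as initrd-start and initrd-end.
    """
    spans = []
    addr = base
    for size in sizes:
        spans.append((addr, addr + size))
        addr += size  # the next image begins where this one ends
    return spans
```

Comma was chosen in the original change because it is rare in file names; the same parsing shape would work for any other single-character separator.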
+ +IBM FSP systems +--------------- + +- FSP/CONSOLE: Fix possible NULL dereference +- platforms/ibm-fsp/firenze: Fix PCI slot power-off pattern + + When powering off the PCI slot, the corresponding bits should + be set to 0bxx00xx00 instead of 0bxx11xx11. Otherwise, the + specified PCI slot can't be put into power-off state. Fortunately, + it hasn't introduced any side-effects so far. +- FSP/CONSOLE: Workaround for unresponsive ipmi daemon + + We use a TCE mapped area to write data to the console. The console header + (fsp_serbuf_hdr) is modified by both FSP and OPAL (OPAL updates the + next_in pointer in fsp_serbuf_hdr and FSP updates the next_out pointer). + + The kernel makes the opal_console_write() OPAL call to write data to the console. + OPAL writes data to the TCE mapped area and sends an MBOX command to FSP. + If our console becomes full and we have data to write to the console, + we keep on waiting until FSP reads data. + + In some corner cases, where FSP is active but not responding to + console MBOX messages (due to buggy IPMI) and we have heavy console + writes happening from the kernel, our console buffer eventually + becomes full. At this point OPAL starts sending OPAL_BUSY_EVENT to + the kernel. The kernel will keep on retrying. This creates kernel soft + lockups. In some extreme cases, when every CPU is trying to write to the + console, the user will not be able to ssh and will think the system has hung. + + If we reset FSP or restart the IPMI daemon on FSP, the system recovers and + everything becomes normal. + + This patch adds a workaround for the above issue by returning OPAL_HARDWARE + when the console is full. A side effect of this patch is that we may end up dropping + the latest console data. But it is better to drop console data than to hang the system. + +- FSP: Set status field in response message for timed out message + + For timed out FSP messages, we set message status as "fsp_msg_timeout". + But most FSP driver users (like surveillance) are ignoring this field.
+ They always look for FSP returned status value in callback function + (second byte in word1). So we endup treating timed out message as success + response from FSP. + + Sample output: :: + + [69902.432509048,7] SURV: Sending the heartbeat command to FSP + [70023.226860117,4] FSP: Response from FSP timed out, word0 = d66a00d7, word1 = 0 state: 3 + .... + [70023.226901445,7] SURV: Received heartbeat acknowledge from FSP + [70023.226903251,3] FSP: fsp_trigger_reset() entry + + Here SURV code thought it got valid response from FSP. But actually we didn't + receive response from FSP. + + This patch fixes above issue by updating status field in response structure. + +- FSP: Improve timeout message + +- FSP/RTC: Fix possible FSP R/R issue in rtc write path +- hw/fsp/rtc: read/write cached rtc tod on fsp hir. + + Currently fsp-rtc reads/writes the cached RTC TOD on an fsp + reset. Use latest fsp_in_rr() function to properly read the cached rtc + value when fsp reset initiated by the hir. + + Below is the kernel trace when we set hw clock, when hir process starts. :: + + [ 1727.775824] NMI watchdog: BUG: soft lockup - CPU#57 stuck for 23s! 
[hwclock:7688] + [ 1727.775856] Modules linked in: vmx_crypto ibmpowernv ipmi_powernv uio_pdrv_genirq ipmi_devintf powernv_op_panel uio ipmi_msghandler powernv_rng leds_powernv ip_tables x_tables autofs4 ses enclosure scsi_transport_sas crc32c_vpmsum lpfc ipr tg3 scsi_transport_fc + [ 1727.775883] CPU: 57 PID: 7688 Comm: hwclock Not tainted 4.10.0-14-generic #16-Ubuntu + [ 1727.775883] task: c000000fdfdc8400 task.stack: c000000fdfef4000 + [ 1727.775884] NIP: c00000000090540c LR: c0000000000846f4 CTR: 000000003006dd70 + [ 1727.775885] REGS: c000000fdfef79a0 TRAP: 0901 Not tainted (4.10.0-14-generic) + [ 1727.775886] MSR: 9000000000009033 <SF,HV,EE,ME,IR,DR,RI,LE> + [ 1727.775889] CR: 28024442 XER: 20000000 + [ 1727.775890] CFAR: c00000000008472c SOFTE: 1 + GPR00: 0000000030005128 c000000fdfef7c20 c00000000144c900 fffffffffffffff4 + GPR04: 0000000028024442 c00000000090540c 9000000000009033 0000000000000000 + GPR08: 0000000000000000 0000000031fc4000 c000000000084710 9000000000001003 + GPR12: c0000000000846e8 c00000000fba0100 + [ 1727.775897] NIP [c00000000090540c] opal_set_rtc_time+0x4c/0xb0 + [ 1727.775899] LR [c0000000000846f4] opal_return+0xc/0x48 + [ 1727.775899] Call Trace: + [ 1727.775900] [c000000fdfef7c20] [c00000000090540c] opal_set_rtc_time+0x4c/0xb0 (unreliable) + [ 1727.775901] [c000000fdfef7c60] [c000000000900828] rtc_set_time+0xb8/0x1b0 + [ 1727.775903] [c000000fdfef7ca0] [c000000000902364] rtc_dev_ioctl+0x454/0x630 + [ 1727.775904] [c000000fdfef7d40] [c00000000035b1f4] do_vfs_ioctl+0xd4/0x8c0 + [ 1727.775906] [c000000fdfef7de0] [c00000000035bab4] SyS_ioctl+0xd4/0xf0 + [ 1727.775907] [c000000fdfef7e30] [c00000000000b184] system_call+0x38/0xe0 + [ 1727.775908] Instruction dump: + [ 1727.775909] f821ffc1 39200000 7c832378 91210028 38a10020 39200000 38810028 f9210020 + [ 1727.775911] 4bfffe6d e8810020 80610028 4b77f61d <60000000> 7c7f1b78 3860000a 2fbffff4 + + This is found when executing the testcase + 
https://github.com/open-power/op-test-framework/blob/master/testcases/fspresetReload.py + + With this fix ran fsp hir torture testcase in the above test + which is working fine. +- occ: Set return variable to correct value + + When entering this section of code rc will be zero. If fsp_mkmsg() fails + the code responsible for printing an error message won't be set. + Resetting rc should allow for the error case to trigger if fsp_mkmsg + fails. +- capp: Fix hang when CAPP microcode LID is missing on FSP machine + + When the LID is absent, we fail early with an error from + start_preload_resource. In that case, capp_ucode_info.load_result + isn't set properly causing a subsequent capp_lid_download() to + call wait_for_resource_loaded() on something that isn't being + loaded, thus hanging. + +- FSP: Add check to detect FSP R/R inside fsp_sync_msg() + + OPAL sends MBOX message to FSP and updates message state from fsp_msg_queued + -> fsp_msg_sent. fsp_sync_msg() queues message and waits until we get response + from FSP. During FSP R/R we move outstanding MBOX messages from msgq to rr_queue + including inflight message (fsp_reset_cmdclass()). But we are not resetting + inflight message state. + + In extreme croner case where we sent message to FSP via fsp_sync_msg() path + and FSP R/R happens before getting respose from FSP, then we will endup waiting + in fsp_sync_msg() until everything becomes normal. + + This patch adds fsp_in_rr() check to fsp_sync_msg() and return error to caller + if FSP is in R/R. +- FSP: Add check to detect FSP R/R inside fsp_sync_msg() + + OPAL sends MBOX message to FSP and updates message state from fsp_msg_queued + -> fsp_msg_sent. fsp_sync_msg() queues message and waits until we get response + from FSP. During FSP R/R we move outstanding MBOX messages from msgq to rr_queue + including inflight message (fsp_reset_cmdclass()). But we are not resetting + inflight message state. 
+ + In the extreme corner case where we sent a message to FSP via the fsp_sync_msg() path + and FSP R/R happens before getting a response from FSP, we will end up waiting + in fsp_sync_msg() until everything becomes normal. + + This patch adds an fsp_in_rr() check to fsp_sync_msg() and returns an error to the caller + if FSP is in R/R. +- FSP/CONSOLE: Do not free fsp_msg in error path + + as we reuse same msg to send next output message. + +- platform/zz: Acknowledge OCC_LOAD mbox message in ZZ + + In P9 FSP box, OCC image is pre-loaded. So do not handle the load + command and send SUCCESS to FSP on receiving OCC_LOAD mbox message. + +- FSP/RTC: Improve error log + +astbmc systems +-------------- + +- platforms/astbmc: Don't validate model on palmetto + + The platform isn't compatible with palmetto until the root device-tree + node's "model" property is NULL or "palmetto". However, we could have + "TN71-BP012" for the property on palmetto. :: + + linux# cat /proc/device-tree/model + TN71-BP012 + + This skips the validation on root device-tree node's "model" property + on palmetto, meaning we check the "compatible" property only. + + diff --git a/roms/skiboot/doc/release-notes/skiboot-5.7-rc2.rst b/roms/skiboot/doc/release-notes/skiboot-5.7-rc2.rst new file mode 100644 index 000000000..210c8ffae --- /dev/null +++ b/roms/skiboot/doc/release-notes/skiboot-5.7-rc2.rst @@ -0,0 +1,197 @@ +.. _skiboot-5.7-rc2: + +skiboot-5.7-rc2 +=============== + +skiboot v5.7-rc2 was released on Thursday July 13th 2017. 
It is the second +release candidate of skiboot 5.7, which will become the new stable release +of skiboot following the 5.6 release, first released 24th May 2017. + +skiboot v5.7-rc2 contains all bug fixes as of :ref:`skiboot-5.4.6` +and :ref:`skiboot-5.1.19` (the currently maintained stable releases). We +do not currently expect to do any 5.6.x stable releases. + +For how the skiboot stable releases work, see :ref:`stable-rules` for details. + +The current plan is to cut the final 5.7 in the next week or so, with skiboot +5.7 being for all POWER8 and POWER9 platforms in op-build v1.18 +(due July 12th, but will come *after* skiboot 5.7). + +This is the second release using the new regular six week release cycle, +similar to op-build, but slightly offset to allow for a short stabilisation +period. Expected release dates and contents are tracked using GitHub milestone +and issues: https://github.com/open-power/skiboot/milestones + +Over :ref:`skiboot-5.7-rc1`, we have the following changes: + +POWER9 +------ + +There are many important changes for POWER9 DD1 and DD2 systems. POWER9 support +should be considered in development and skiboot 5.7 is certainly **NOT** +suitable for POWER9 production environments. + +- HDAT: Add IPMI sensor data under /bmc node +- numa/associativity: Add a new level of NUMA for GPU's + + Today we have an issue where the NUMA nodes corresponding + to GPU's have the same affinity/distance as normal memory + nodes. Our reference-points today supports two levels + [0x4, 0x4] for normal systems and [0x4, 0x3] for Power8E + systems. This patch adds a new level [0x4, X, 0x2] and + uses node-id as at all levels for the GPU. +- xive: Enable memory backing of queues + + This dedicates 6x64k pages of memory permanently for the XIVE to + use for internal queue overflow. This allows the XIVE to deal with + some corner cases where the internal queues might prove insufficient. 
+ +- xive: Properly get rid of donated indirect pages during reset + + Otherwise they keep being used across kexec causing memory + corruption in subsequent kernels once KVM has been used. + +- cpu: Better handle unknown flags in opal_reinit_cpus() + + At the moment, if we get passed flags we don't know about, we + return OPAL_UNSUPPORTED but we still perform whatever actions + were required by the flags we do support. Additionally, on P8, + we attempt a SLW re-init which hasn't been supported since + Murano DD2.0 and will crash your system. + + It's too late to fix on existing systems so Linux will have to + be careful at least on P8, but to avoid future issues let's clean + that up, make sure we only use slw_reinit() when HILE isn't + supported. +- cpu: Unconditionally cleanup TLBs on P9 in opal_reinit_cpus() + + This can work around problems where Linux fails to properly + cleanup part or all of the TLB on kexec. + +- Fix scom addresses for power9 nx checkstop hmi handling. + + Scom addresses for NX status, DMA & ENGINE FIR and PBI FIR have changed + for Power9. Fixup those while handling nx checkstop for Power9. +- Fix scom addresses for power9 core checkstop hmi handling. + + Scom addresses for CORE FIR (Fault Isolation Register) and Malfunction + Alert Register have changed for Power9. Fixup those while handling core + checkstop for Power9. + + Without this change HMI handler fails to check for correct reason for + core checkstop on Power9. + +- core/mem_region: check return value of add_region + + The only sensible thing to do if this fails is to abort() as we've + likely just failed reserving reserved memory regions, and nothing + good comes from that. + +PHB4 +^^^^ +- phb4: Do more retries on link training failures + Currently we only retry once when we have a link training failure. + This changes this to be 3 retries as 1 retry is not giving us enough + reliability. 
+ + This will increase the boot time, especially on systems where we + incorrectly detect a link presence when there really is nothing + present. I'll post a followup patch to optimise our timings to help + mitigate this later. + +- phb4: Workaround phy lockup by doing full PHB reset on retry + + For PHB4 it's possible that the phy may end up in a bad state where it + can no longer receive data. This can manifest as the link not + retraining. A simple PERST will not clear this. The PHB must be + completely reset. + + This changes the retry state to CRESET to do this. + + This issue may also manifest itself as the link training in a degraded + state (lower speed or narrower width). This patch doesn't attempt to + fix that (will come later). +- pci: Add ability to trace timing + + PCI link training is responsible for a huge chunk of the skiboot boot + time, so add the ability to trace it waiting in the main state + machine. +- pci: Print resetting PHB notice at higher log level + + Currently during boot there is a long delay while we wait for the PHBs to + be reset and train. During this time, there is no output from skiboot + and the last message doesn't give an indication of what's happening. + + This boosts the PHB reset message from info to notice so users can see + what's happening during this long period of waiting. +- phb4: Only set one bit in nfir + + The MPIPL procedure says to only set bit 26 when forcing the PEC into + freeze mode. Currently we set bits 24-27. + + This changes the code to follow spec and only set bit 26. +- phb4: Fix order of pfir/nfir clearing in CRESET + + According to the workbook, pfir must be cleared before the nfir. + The way we have it now causes the nfir to not clear properly in some + error circumstances. + + This swaps the order to match the workbook. +- phb4: Remove incorrect state transition + + When waiting in PHB4_SLOT_CRESET_WAIT_CQ for transactions to end, we + incorrectly move onto the next state. 
Generally we don't hit this as + the transactions have ended already anyway. + + This removes the incorrect state transition. +- phb4: Set default lane equalisation + + Set default lane equalisation if there is nothing in the device-tree. + + Default value taken from hdat and confirmed by hardware team. Neatens + the code up a bit too. +- hdata: Fix phb4 lane-eq property generation + + The lane-eq data we get from hdat is all 7s but what we end up in the + device tree is: :: + + xscom@603fc00000000/pbcq@4010c00/stack@0/ibm,lane-eq + 00000000 31c339e0 00000000 0000000c + 00000000 00000000 00000000 00000000 + 00000000 31c30000 77777777 77777777 + 77777777 77777777 77777777 77777777 + + This fixes grabbing the properties from hdat and fixes the call to put + them in the device tree. +- phb4: Fix PHB4 fence recovery. + + We had a few problems: + + - We used the wrong register to trigger the reset (spec bug) + - We should clear the PFIR and NFIR while the reset is asserted + - ... and in the right order ! + - We should only apply the DD1 workaround after the reset has + been lifted. + - We should ensure we use ASB whenever we are fenced or doing a + CRESET + - Make config ops write with ASB +- phb4: Verbose EEH options + + Enabled via nvram pci-eeh-verbose=true. ie. :: + + nvram -p ibm,skiboot --update-config pci-eeh-verbose=true +- phb4: Print more info when PHB fences + + For now at PHBERR level. We don't have room in the diags data + passed to Linux for these unfortunately. 
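As a rough illustration of the "phb4: Only set one bit in nfir" entry above, the sketch below compares a single-bit mask with a four-bit range under IBM (big-endian) bit numbering, where bit 0 is the most significant bit. The helper names ``ppc_bit`` and ``ppc_bitmask`` are illustrative only, and treating the nfir as a 64-bit register numbered this way is an assumption, not something these notes state.

```python
REG_WIDTH = 64  # assumption: 64-bit register, IBM numbering (bit 0 = MSB)


def ppc_bit(bit: int) -> int:
    """Return a mask with only the IBM-numbered `bit` set."""
    if not 0 <= bit < REG_WIDTH:
        raise ValueError("bit out of range")
    return 1 << (REG_WIDTH - 1 - bit)


def ppc_bitmask(first: int, last: int) -> int:
    """Return a mask covering IBM-numbered bits first..last inclusive."""
    if first > last:
        raise ValueError("first must not exceed last")
    mask = 0
    for bit in range(first, last + 1):
        mask |= ppc_bit(bit)
    return mask


# Hypothetical values: the range the notes say was set before the fix,
# and the single bit the notes say is set afterwards.
old_value = ppc_bitmask(24, 27)
new_value = ppc_bit(26)

print(hex(old_value))  # 0xf000000000
print(hex(new_value))  # 0x2000000000
```

The new value is a strict subset of the old one, which is why the change narrows the set of bits written rather than moving them.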
+ + +Testing/development +------------------- +- lpc: remove double LPC prefix from messages +- opal-ci/fetch-debian-jessie-installer: follow redirects + Fixes some CI failures +- test/qemu-jessie: bail out fast on kernel panic +- test/qemu-jessie: dump boot log on failure +- travis: add fedora26 +- xz: add fallthrough annotations to silence GCC7 warning diff --git a/roms/skiboot/doc/release-notes/skiboot-5.7.rst b/roms/skiboot/doc/release-notes/skiboot-5.7.rst new file mode 100644 index 000000000..9a967c841 --- /dev/null +++ b/roms/skiboot/doc/release-notes/skiboot-5.7.rst @@ -0,0 +1,1508 @@ +.. _skiboot-5.7: + +skiboot-5.7 +=========== + +skiboot v5.7 was released on Tuesday July 25th 2017. It follows two +release candidates of skiboot 5.7, and is now the new stable release +of skiboot following the 5.6 release, first released 24th May 2017. + +skiboot v5.7 contains all bug fixes as of :ref:`skiboot-5.4.6` +and :ref:`skiboot-5.1.19` (the currently maintained stable releases). We +do not currently expect to do any 5.6.x stable releases. + +For how the skiboot stable releases work, see :ref:`stable-rules` for details. + +POWER9 is still in development, and thus all POWER9 users must upgrade +to skiboot v5.7. + +This is the second release using the new regular six week release cycle, +similar to op-build, but slightly offset to allow for a short stabilisation +period. Expected release dates and contents are tracked using GitHub milestone +and issues: https://github.com/open-power/skiboot/milestones + +New Features +------------ + +Since :ref:`skiboot-5.6.0`, we have a few new features: + +New features in this release for POWER9 systems: + +- In Memory Counters (IMC) (See :ref:`imc` for details) +- phb4: Activate shared PCI slot on witherspoon (see :ref:`Shared Slot <shared-slot-rn>`) +- phb4 capi (i.e. 
CAPI2): Enable capi mode for PHB4 (see :ref:`CAPI on PHB4 <capi2-rn>`) + +New feature for IBM FSP based systems: + +- fsp/tpo: Provide support for disabling TPO alarm + + This patch adds support for disabling a preconfigured + Timed-Power-On(TPO) alarm on FSP based systems. Presently once a TPO alarm + is configured from the kernel it will be triggered even if it's + subsequently disabled. + + With this patch a TPO alarm can be disabled by passing + y_m_d==hr_min==0 to fsp_opal_tpo_write(). A branch is added to the + function to handle this case by sending FSP_CMD_TPO_DISABLE message to + the FSP instead of usual FSP_CMD_TPO_WRITE message. The kernel is + expected to call opal_tpo_write() with y_m_d==hr_min==0 to request + opal to disable TPO alarm. + + +POWER9 +------ +There are many important changes for POWER9 DD1 and DD2 systems. POWER9 support +should be considered in development and skiboot 5.7 is certainly **NOT** +suitable for POWER9 production environments. + +Since :ref:`skiboot-5.7-rc2`: + +- platform/witherspoon: Enable eSEL logging + + OpenBMC stack added IPMI OEM extension to log eSEL events. + Let's enable eSEL logging from OPAL side. + + See: https://github.com/openbmc/openpower-host-ipmi-oem/blob/d9296050bcece5c2eca5ede0932d944b0ced66c9/oemhandler.cpp#L142 + (yes, that is the documentation) +- hdat/i2c: Fix array version check +- mem_region: Check for no-map in reserved nodes + + Regions with the no-map property should be handled separately to + "normal" firmware reservations. When creating mem_region regions + from a reserved-memory DT node use the no-map property to select + the right reservation type. + +- hdata/memory: Add memory reservations to the DT + + Currently we just add these to a list of pre-boot reserved regions + which is then converted into the contents of the /reserved-memory/ + node just before Skiboot jumps into the firmware kernel. 
+ + This approach is insufficient because we need to add the ibm,prd-instance + labels to the various hostboot reserved regions. To do this we want to + create these reservation nodes inside the HDAT parser rather than having + the mem_region flattening code handle it. On P8 systems Hostboot placed + its memory reservations under the /ibm,hostboot/ node and this patch + makes the HDAT parser do the same. + +Since :ref:`skiboot-5.7-rc1`: + +- HDAT: Add IPMI sensor data under /bmc node +- numa/associativity: Add a new level of NUMA for GPU's + + Today we have an issue where the NUMA nodes corresponding + to GPU's have the same affinity/distance as normal memory + nodes. Our reference-points today supports two levels + [0x4, 0x4] for normal systems and [0x4, 0x3] for Power8E + systems. This patch adds a new level [0x4, X, 0x2] and + uses node-id as at all levels for the GPU. +- xive: Enable memory backing of queues + + This dedicates 6x64k pages of memory permanently for the XIVE to + use for internal queue overflow. This allows the XIVE to deal with + some corner cases where the internal queues might prove insufficient. + +- xive: Properly get rid of donated indirect pages during reset + + Otherwise they keep being used across kexec causing memory + corruption in subsequent kernels once KVM has been used. + +- cpu: Better handle unknown flags in opal_reinit_cpus() + + At the moment, if we get passed flags we don't know about, we + return OPAL_UNSUPPORTED but we still perform whatever actions + were required by the flags we do support. Additionally, on P8, + we attempt a SLW re-init which hasn't been supported since + Murano DD2.0 and will crash your system. + + It's too late to fix on existing systems so Linux will have to + be careful at least on P8, but to avoid future issues let's clean + that up, make sure we only use slw_reinit() when HILE isn't + supported. 
+- cpu: Unconditionally cleanup TLBs on P9 in opal_reinit_cpus() + + This can work around problems where Linux fails to properly + cleanup part or all of the TLB on kexec. + +- Fix scom addresses for power9 nx checkstop hmi handling. + + Scom addresses for NX status, DMA & ENGINE FIR and PBI FIR have changed + for Power9. Fixup those while handling nx checkstop for Power9. +- Fix scom addresses for power9 core checkstop hmi handling. + + Scom addresses for CORE FIR (Fault Isolation Register) and Malfunction + Alert Register have changed for Power9. Fixup those while handling core + checkstop for Power9. + + Without this change HMI handler fails to check for correct reason for + core checkstop on Power9. + +- core/mem_region: check return value of add_region + + The only sensible thing to do if this fails is to abort() as we've + likely just failed reserving reserved memory regions, and nothing + good comes from that. + +Since :ref:`skiboot-5.6.0`: + +- hdata: Reserve Trace Areas + + When hostboot is configured to setup in memory tracing it will reserve + some memory for use by the hardware tracing facility. We need to mark + these areas as off limits to the operating system and firmware. +- hdata: Make out-of-range idata print at PR_DEBUG + + Some fields just aren't populated on some systems. + +- hdata: Ignore unnamed memory reservations. + + Hostboot should name any and all memory reservations that it provides. + Currently some hostboots export a broken reservation covering the first + 256MB of memory and this causes the system to crash at boot due to an + invalid free because this overlaps with the static "ibm,os-reserve" + region (which covers the first 768MB of memory). + + According to the hostboot team unnamed reservations are invalid and can + be ignored. + +- hdata: Check the Host I2C devices array version + + Currently this is not populated on FSP machines which causes some + obnoxious errors to appear in the boot log. 
We also only want to + parse version 1 of this structure since future versions will completely + change the array item format. + +- Ensure P9 DD1 workarounds apply only to Nimbus + + The workarounds for P9 DD1 are only needed for Nimbus. P9 Cumulus will + be DD1 but don't need these same workarounds. + + This patch ensures the P9 DD1 workarounds only apply to Nimbus. It + also renames some things to make clear what's what. + +- cpu: Cleanup AMR and IAMR when re-initializing CPUs + + There's a bug in current Linux kernels leaving crap in those registers + accross kexec and not sanitizing them on boot. This breaks kexec under + some circumstances (such as booting a hash kernel from a radix one + on P9 DD2.0). + + The long term fix is in Linux, but this workaround is a reasonable + way of "sanitizing" those SPRs when Linux calls opal_reinit_cpus() + and shouldn't have adverse effects. + + We could also use that same mechanism to cleanup other things as + well such as restoring some other SPRs to their default value in + the future. + +- Set POWER9 RPR SPR to 0x00000103070F1F3F. Same value as P8. + + Without this, thread priorities inside a core don't work. + +- cpu: Support setting HID[RADIX] and set it by default on P9 + + This adds new opal_reinit_cpus() flags to setup radix or hash + mode in HID[8] on POWER9. + + By default HID[8] will be set. On P9 DD1.0, Linux will change + it as needed. On P9 DD2.0 hash works in radix mode (radix is + really "dual" mode) so KVM won't break and existing kernels + will work. + + Newer kernels built for hash will call this to clear the HID bit + and thus get the full size of the TLB as an optimization. + +- Add "cleanup_global_tlb" for P9 and later + + Uses broadcast TLBIE's to cleanup the TLB on all cores and on + the nest MMU + +- xive: DD2.0 updates + + Add support for StoreEOI, fix StoreEOI MMIO offset in ESB page, + and other cleanups + +- Update default TSCR value for P9 as recommended by HW folk. 
+ +- xive: Fix initialisation of xive_cpu_state struct + + When using XIVE emulation with DEBUG=1, we run into crashes in log_add() + due to the xive_cpu_state->log_pos being uninitialised (and thus, with + DEBUG enabled, initialised to the poison value of 0x99999999). + + +PHB4 +^^^^ + +Since :ref:`skiboot-5.7-rc2`: + +- phb4: Add link training trace mode + + Add a mode to PHB4 to trace training process closely. This activates + as soon as PERST is deasserted and produces human readable output of + the process. + + This may increase training times since it duplicates some of the + training code. This code has its own simple checks for fence and + timeout but will fall through to the default training code once done. + + Output produced, looks like the "TRACE:" lines below: :: + + [ 3.410799664,7] PHB#0001[0:1]: FRESET: Starts + [ 3.410802000,7] PHB#0001[0:1]: FRESET: Prepare for link down + [ 3.410806624,7] PHB#0001[0:1]: FRESET: Assert skipped + [ 3.410808848,7] PHB#0001[0:1]: FRESET: Deassert + [ 3.410812176,3] PHB#0001[0:1]: TRACE: 0x0000000101000000 0ms + [ 3.417170176,3] PHB#0001[0:1]: TRACE: 0x0000100101000000 12ms presence + [ 3.436289104,3] PHB#0001[0:1]: TRACE: 0x0000180101000000 49ms training + [ 3.436373312,3] PHB#0001[0:1]: TRACE: 0x00001d0811000000 49ms trained + [ 3.436420752,3] PHB#0001[0:1]: TRACE: Link trained. + [ 3.436967856,7] PHB#0001[0:1]: LINK: Start polling + [ 3.437482240,7] PHB#0001[0:1]: LINK: Electrical link detected + [ 3.437996864,7] PHB#0001[0:1]: LINK: Link is up + [ 4.438000048,7] PHB#0001[0:1]: LINK: Link is stable + + Enabled via nvram using: :: + + nvram -p ibm,skiboot --update-config pci-tracing=true + +- phb4: Improve reset and link training timing + + This improves PHB reset and link training timing. + +- phb4: Add phb4_check_reg() to sanity check failures + + This adds a function phb4_check_reg() to sanity check when we do MMIO + reads from the PHB to make sure it's not fenced. 
+ +- phb4: Remove retry on electrical link timeout + + Currently we retry if we don't detect an electrical link. This is + pointless as all devices should respond in the given time. + + This patch removes this retry and just returns OPAL_HARDWARE if we + don't detect an electrical link. + + This has the additional benefit of improving boot times on machines + that have badly wired presence detect (ie. says a device is present + when there isn't). + +- phb4: Read PERST signal rather than assuming it's asserted + + Currently we assume on boot that PERST is asserted so that we can skip + having to assert it ourselves. + + This instead reads the PERST status and determines if we need to + assert it based on that. + +- phb4: Fix endian of TLP headers print + + Byte swap TLP headers so they are the same as the PCIe spec. +- phb4: Change timeouts prints to error level + + If the link doesn't have an electrical link or the link doesn't train + we should make that more obvious to the user. +- phb4: Better logs why the slot didn't work + + Better logs why the slot didn't work and make it a PR_ERR so users + see it by default. + +- phb4: Force verbose EEH logging + + Force verbose EEH. This is heavy handed and we should turn it off + later as things stabilise, but is useful for now. +- phb4: Initialization sequence updates + + Mostly errata workarounds, some DD1 specific. + + The step Init_5 was moved to Init_16, so the numbering was updated to + reflect this. + +Since :ref:`skiboot-5.7-rc1`: + +- phb4: Do more retries on link training failures + Currently we only retry once when we have a link training failure. + This changes this to be 3 retries as 1 retry is not giving us enough + reliability. + + This will increase the boot time, especially on systems where we + incorrectly detect a link presence when there really is nothing + present. I'll post a followup patch to optimise our timings to help + mitigate this later. 
+ +- phb4: Workaround phy lockup by doing full PHB reset on retry + + For PHB4 it's possible that the phy may end up in a bad state where it + can no longer receive data. This can manifest as the link not + retraining. A simple PERST will not clear this. The PHB must be + completely reset. + + This changes the retry state to CRESET to do this. + + This issue may also manifest itself as the link training in a degraded + state (lower speed or narrower width). This patch doesn't attempt to + fix that (will come later). +- pci: Add ability to trace timing + + PCI link training is responsible for a huge chunk of the skiboot boot + time, so add the ability to trace it waiting in the main state + machine. +- pci: Print resetting PHB notice at higher log level + + Currently during boot there is a long delay while we wait for the PHBs to + be reset and train. During this time, there is no output from skiboot + and the last message doesn't give an indication of what's happening. + + This boosts the PHB reset message from info to notice so users can see + what's happening during this long period of waiting. +- phb4: Only set one bit in nfir + + The MPIPL procedure says to only set bit 26 when forcing the PEC into + freeze mode. Currently we set bits 24-27. + + This changes the code to follow spec and only set bit 26. +- phb4: Fix order of pfir/nfir clearing in CRESET + + According to the workbook, pfir must be cleared before the nfir. + The way we have it now causes the nfir to not clear properly in some + error circumstances. + + This swaps the order to match the workbook. +- phb4: Remove incorrect state transition + + When waiting in PHB4_SLOT_CRESET_WAIT_CQ for transactions to end, we + incorrectly move onto the next state. Generally we don't hit this as + the transactions have ended already anyway. + + This removes the incorrect state transition. +- phb4: Set default lane equalisation + + Set default lane equalisation if there is nothing in the device-tree. 
+ + Default value taken from hdat and confirmed by hardware team. Neatens + the code up a bit too. +- hdata: Fix phb4 lane-eq property generation + + The lane-eq data we get from hdat is all 7s but what we end up with in the + device tree is: :: + + xscom@603fc00000000/pbcq@4010c00/stack@0/ibm,lane-eq + 00000000 31c339e0 00000000 0000000c + 00000000 00000000 00000000 00000000 + 00000000 31c30000 77777777 77777777 + 77777777 77777777 77777777 77777777 + + This fixes grabbing the properties from hdat and fixes the call to put + them in the device tree. +- phb4: Fix PHB4 fence recovery. + + We had a few problems: + + - We used the wrong register to trigger the reset (spec bug) + - We should clear the PFIR and NFIR while the reset is asserted + - ... and in the right order ! + - We should only apply the DD1 workaround after the reset has + been lifted. + - We should ensure we use ASB whenever we are fenced or doing a + CRESET + - Make config ops write with ASB +- phb4: Verbose EEH options + + Enabled via nvram pci-eeh-verbose=true. ie. :: + + nvram -p ibm,skiboot --update-config pci-eeh-verbose=true +- phb4: Print more info when PHB fences + + For now at PHBERR level. We don't have room in the diags data + passed to Linux for these unfortunately. + +Since :ref:`skiboot-5.6.0`: + +- phb4: Fix number of index bits in IODA tables + + On PHB4 the number of index bits in the IODA table address register + was bumped to 10 bits to accommodate 1024 MSIs and 1024 TVEs (DD2). + + However our macro only defined the field to be 9 bits, thus causing + "interesting" behaviours on some systems. + +- phb4: Harden init with bad PHBs + + Currently if we read all 1's from the EEH or IRQ capabilities, we end + up train wrecking on some other random code (eg. an assert() in xive). + + This hardens the PHB4 code to look for these bad reads and more + gracefully fails the init for that PHB alone. This allows the rest of + the system to boot and ignore those bad PHBs. + +- phb4 capi (i.e. 
CAPI2): Handle HMI events + + Find the CAPP on the chip associated with the HMI event for PHB4. + The recovery mode (re-initialization of the capp, resume of functional + operations) is only available with P9 DD2. A new patch will be provided + to support this feature. + +.. _capi2-rn: + +- phb4 capi (i.e. CAPI2): Enable capi mode for PHB4 + + Enable the Coherently attached processor interface. The PHB is used as + a CAPI interface. + CAPI Adapters can be connected to either PEC0 or PEC2. Single port + CAPI adapter can be connected to either PEC0 or PEC2, but Dual-Port + Adapter can be only connected to PEC2 + * CAPP0 attached to PHB0(PEC0 - single port) + * CAPP1 attached to PHB3(PEC2 - single or dual port) + +- hw/phb4: Rework phb4_get_presence_state() + + There are two issues in the current implementation: It should return an errcode + visible to Linux, which has prefix OPAL_*. The code isn't very obvious. + + This returns OPAL_HARDWARE when the PHB is broken. Otherwise, OPAL_SUCCESS + is always returned. In the meanwhile, it refactors the code to make it + obvious: OPAL_PCI_SLOT_PRESENT is returned when the presence signal (low active) + or PCIe link is active. Otherwise, OPAL_PCI_SLOT_EMPTY is returned. + +- phb4: Error injection for config space + + Implement CFG (config space) error injection. + + This works the same as PHB3. MMIO and DMA error injection require a + rewrite, so they're unsupported for now. + + While it's not feature complete, this at least provides an easy way to + inject an error that will trigger EEH. + +- phb4: Error clear implementation +- phb4: Mask link down errors during reset + + During a hot reset the PCI link will drop, so we need to mask link down + events to prevent unnecessary errors. +- phb4: Implement root port initialization + + phb4_root_port_init() was a NOP before, so fix that. +- phb4: Complete reset implementation + + This implements complete reset (creset) functionality for POWER9 DD1. 
+ + Only partially tested and contends with some DD1 errata, but it's a start. + +.. _shared-slot-rn: + +- phb4: Activate shared PCI slot on witherspoon + + Witherspoon systems come with a 'shared' PCI slot: physically, it + looks like a x16 slot, but it's actually two x8 slots connected to two + PHBs of two different chips. Taking advantage of it requires some + logic on the PCI adapter. Only the Mellanox CX5 adapter is known to + support it at the time of this writing. + + This patch enables support for the shared slot on witherspoon if a x16 + adapter is detected. Each x8 slot has a presence bit, so both bits + need to be set for the activation to take place. Slot sharing is + activated through a gpio. + + Note that there's no easy way to be sure that the card is indeed a + shared-slot compatible PCI adapter and not a normal x16 card. Plugging + a normal x16 adapter on the shared slot should be avoided on + witherspoon, as the link won't train on the second slot, resulting in + a timeout and a longer boot time. Only the first slot is usable and + the x16 adapter will end up using only half the lines. + + If the PCI card plugged on the physical slot is only x8 (or less), + then the presence bit of the second slot is not set, so this patch + does nothing. The x8 (or less) adapter should work like on any other + physical slot. + +- phb4: Block D-state power management on direct slots + + As current revisions of PHB4 don't properly handle the resulting + L1 link transition. + +- phb4: Call pci config filters + +- phb4: Mask out write-1-to-clear registers in RC cfg + + The root complex config space only supports 4-byte accesses. Thus, when + the client requests a smaller size write, we do a read-modify-write to + the register. + + However, some register have bits defined as "write 1 to clear". 
+ + If we do a RMW cycles on such a register and such bits are 1 in the + part that the client doesn't intend to modify, we will accidentally + write back those 1's and clear the corresponding bit. + + This avoids it by masking out those magic bits from the "old" value + read from the register. + +- phb4: Properly mask out link down errors during reset +- phb3/4: Silence a useless warning + + PHB's don't have base location codes on non-FSP systems and it's + normal. + +- phb4: Workaround bug in spec 053 + + Wait for DLP PGRESET to clear *after* lifting the PCIe core reset + +- phb4: DD2.0 updates + + Support StoreEOI, full complements of PEs (twice as big TVT) + and other updates. + + Also renumber init steps to match spec 063 + +NPU2 +^^^^ + +Note that currently NPU2 support is limited to POWER9 DD1 hardware. + +Since :ref:`skiboot-5.6.0`: + +- platforms/astbmc/witherspoon.c: Add NPU2 slot mappings + + For NVLink2 to function PCIe devices need to be associated with the right + NVLinks. This association is supposed to be passed down to Skiboot via HDAT but + those fields are still not correctly filled out. To work around this we add slot + tables for the NVLinks similar to what we have for P8+. + +- hw/npu2.c: Fix device aperture calculation + + The POWER9 NPU2 implements an address compression scheme to compress 56-bit P9 + physical addresses to 47-bit GPU addresses. System software needs to know both + addresses, unfortunately the calculation of the compressed address was + incorrect. Fix it here. + +- hw/npu2.c: Change MCD BAR allocation order + + MCD BARs need to be correctly aligned to the size of the region. As GPU + memory is allocated from the top of memory down we should start allocating + from the highest GPU memory address to the lowest to ensure correct + alignment. 
+ +- NPU2: Add flag to nvlink config space indicating DL reset state + + Device drivers need to be able to determine if the DL is out of reset or + not so they can safely probe to see if links have already been trained. + This patch adds a flag to the vendor specific config space indicating if + the DL is out of reset. + +- hw/npu2.c: Hardcode MSR_SF when setting up npu XTS contexts + + We don't support anything other than 64-bit mode for address translations so we + can safely hardcode it. + +- hw/npu2-hw-procedures.c: Add nvram option to override zcal calculations + + In some rare cases the zcal state machine may fail and flag an error. According + to hardware designers it is sometimes ok to ignore this failure and use nominal + values for the calculations. In this case we add a nvram variable + (nv_zcal_override) which will cause skiboot to ignore the failure and use the + nominal value specified in nvram. +- npu2: Fix npu2_{read,write}_4b() + + When writing or reading 4-byte values, we need to use the upper half of + the 64-bit SCOM register. + + Fix npu2_{read,write}_4b() and their callers to use uint32_t, and + appropriately shift the value being written or returned. + + +- hw/npu2.c: Fix opal_npu_map_lpar to search for existing BDF +- hw/npu2-hw-procedures.c: Fix running of zcal procedure + + The zcal procedure should only be run once per obus (ie. once per group of 3 + links). Clean up the code and fix the potential buffer overflow due to a typo. + Also updates the zcal settings to their proper values. +- hw/npu2.c: Add memory coherence directory programming + + The memory coherence directory (MCD) needs to know which system memory addresses + belong to the GPU. This amounts to setting a BAR and a size in the MCD to cover + the addresses assigned to each of the GPUs. To ease assignment we assume GPUs + are assigned memory in a contiguous block per chip. 
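The npu2_{read,write}_4b() entry above comes down to placing a 32-bit value in the upper half of a 64-bit SCOM register. A minimal Python model of that shift is sketched here. The names ``pack_4b`` and ``unpack_4b`` are hypothetical, and writing zeroes to the lower half is an assumption made for illustration; this is not the skiboot C implementation.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF


def pack_4b(value):
    """Build the 64-bit register image for a 4-byte write.

    The 32-bit value is placed in the upper half (bits 63..32).
    Assumption: the lower half is simply written as zero.
    """
    return ((value & MASK32) << 32) & MASK64


def unpack_4b(reg):
    """Extract the 4-byte value from the upper half of a 64-bit image."""
    return (reg >> 32) & MASK32


reg = pack_4b(0x12345678)
print(hex(reg))             # 0x1234567800000000
print(hex(unpack_4b(reg)))  # 0x12345678
```

Masking to 32 bits before and after the shift plays the role that the uint32_t types play in the C code.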
+ +OCC/Power Management +^^^^^^^^^^^^^^^^^^^^ + +With this release, it's possible to boot POWER9 systems with the OCC +enabled and change CPU frequencies. Doing so does require other firmware +components to also support this (otherwise the frequency will not be set). + +Since :ref:`skiboot-5.6.0`: + +- occ: Skip setting cores to nominal frequency in P9 + + In P9, once OCC is up, it is supposed to setup the cores to nominal + frequency. So skip this step in OPAL. +- occ: Fix Pstate ordering for P9 + + In P9 the pstate values are positive. They are a continuous set of + unsigned integers [0 to +N] where Pmax is 0 and Pmin is N. The + linear ordering of pstates for P9 has changed compared to P8. + P8 has negative pstate values advertised as [0 to -N] where Pmax + is 0 and Pmin is -N. This patch adds helper routines to abstract + pstate comparison with pmax and adds sanity pstate limit checks. + This patch also fixes pstate arithmetic by using labs(). +- p8-i2c: occ: Add support for OCC to use I2C engines + + This patch adds support to share the I2C engines with host and OCC. + OCC uses I2C engines to read DIMM temperatures and to communicate with + GPU. OCC Flag register is used for locking between host and OCC. Host + requests the bus by setting a bit in OCC Flag register. OCC sends + an interrupt to indicate the change in ownership. + +opal-prd/PRD +^^^^^^^^^^^^ + +Since :ref:`skiboot-5.6.0`: + +- opal-prd: Handle SBE passthrough message passing + + This patch adds support to send SBE pass through command to HBRT. +- SBE: Add passthrough command support + + SBE sends passthrough command. We have to capture this interrupt and + send event to HBRT via opal-prd (user space daemon). +- opal-prd: hook up reset_pm_complex + + This change provides the facility to invoke HBRT's reset_pm_complex, in + the same manner as is done with process_occ_reset previously. + + We add a control command for `opal-prd pm-complex reset`, which is just + an alias for occ_reset at this stage.
+ +- prd: Implement firmware side of opaque PRD channel + + This change introduces the firmware side of the opaque HBRT <--> OPAL + message channel. We define a base message format to be shared with HBRT + (in include/prd-fw-msg.h), and allow firmware requests and responses to + be sent over this channel. + + We don't currently have any notifications defined, so have nothing to do + for firmware_notify() at this stage. + +- opal-prd: Add firmware_request & firmware_notify implementations + + This change adds the implementation of firmware_request() and + firmware_notify(). To do this, we need to add a message queue, so that + we can properly handle out-of-order messages coming from firmware. + +- opal-prd: Add support for variable-sized messages + + With the introduction of the opaque firmware channel, we want to + support variable-sized messages. Rather than expecting to read an + entire 'struct opal_prd_msg' in one read() call, we can split this + over multiple reads, potentially expanding our message buffer. + +- opal-prd: Sync hostboot interfaces with HBRT + + This change adds new callbacks defined for p9, and the base thunks for + the added calls. + +- opal-prd: interpret log level prefixes from HBRT + + Interpret the (optional) \*_MRK log prefixes on HBRT messages, and set + the syslog log priority to suit. + +- opal-prd: Add occ reset to usage text +- opal-prd: allow different chips for occ control actions + + The `occ reset` and `occ error` actions can both take a chip id + argument, but we're currently just using zero. This change changes the + control message format to pass the chip ID from the control process to + the opal-prd daemon. + + +IBM FSP based platforms +----------------------- + +Since :ref:`skiboot-5.7-rc2`: + +- FSP/CONSOLE: Do not enable input irq in write path + + We use irq for reading input from console, but not in output path. + Hence do not enable input irq in write path.
+ + Fixes : 583c8203 (fsp/console: Allocate irq for each hvc console) + +Since :ref:`skiboot-5.6.0`: + +- FSP/CONSOLE: Fix possible NULL dereference +- platforms/ibm-fsp/firenze: Fix PCI slot power-off pattern + + When powering off the PCI slot, the corresponding bits should + be set to 0bxx00xx00 instead of 0bxx11xx11. Otherwise, the + specified PCI slot can't be put into power-off state. Fortunately, + it didn't introduce any side-effects so far. +- FSP/CONSOLE: Workaround for unresponsive ipmi daemon + + We use TCE mapped area to write data to console. Console header + (fsp_serbuf_hdr) is modified by both FSP and OPAL (OPAL updates + next_in pointer in fsp_serbuf_hdr and FSP updates next_out pointer). + + Kernel makes opal_console_write() OPAL call to write data to console. + OPAL writes data to TCE mapped area and sends MBOX command to FSP. + If our console becomes full and we have data to write to console, + we keep on waiting until FSP reads data. + + In some corner cases, where FSP is active but not responding to + console MBOX message (due to buggy IPMI) and we have heavy console + write happening from kernel, then eventually our console buffer + becomes full. At this point OPAL starts sending OPAL_BUSY_EVENT to + kernel. Kernel will keep on retrying. This is creating kernel soft + lockups. In some extreme cases, when every CPU is trying to write to + console, the user will not be able to ssh and will think the system is hung. + + If we reset FSP or restart IPMI daemon on FSP, system recovers and + everything becomes normal. + + This patch adds a workaround for the above issue by returning OPAL_HARDWARE + when the console is full. A side effect of this patch is that we may end up dropping + the latest console data, but it is better to drop console data than to hang the system. + +- FSP: Set status field in response message for timed out message + + For timed out FSP messages, we set message status as "fsp_msg_timeout". + But most FSP driver users (like surveillance) are ignoring this field.
+ They always look for FSP returned status value in callback function + (second byte in word1). So we end up treating a timed out message as a success + response from FSP. + + Sample output: :: + + [69902.432509048,7] SURV: Sending the heartbeat command to FSP + [70023.226860117,4] FSP: Response from FSP timed out, word0 = d66a00d7, word1 = 0 state: 3 + .... + [70023.226901445,7] SURV: Received heartbeat acknowledge from FSP + [70023.226903251,3] FSP: fsp_trigger_reset() entry + + Here SURV code thought it got valid response from FSP. But actually we didn't + receive response from FSP. + + This patch fixes above issue by updating status field in response structure. + +- FSP: Improve timeout message + +- FSP/RTC: Fix possible FSP R/R issue in rtc write path +- hw/fsp/rtc: read/write cached rtc tod on fsp hir. + + Currently fsp-rtc reads/writes the cached RTC TOD on an fsp + reset. Use latest fsp_in_rr() function to properly read the cached rtc + value when an fsp reset is initiated by the hir. + + Below is the kernel trace when we set hw clock, when hir process starts. :: + + [ 1727.775824] NMI watchdog: BUG: soft lockup - CPU#57 stuck for 23s! 
[hwclock:7688] + [ 1727.775856] Modules linked in: vmx_crypto ibmpowernv ipmi_powernv uio_pdrv_genirq ipmi_devintf powernv_op_panel uio ipmi_msghandler powernv_rng leds_powernv ip_tables x_tables autofs4 ses enclosure scsi_transport_sas crc32c_vpmsum lpfc ipr tg3 scsi_transport_fc + [ 1727.775883] CPU: 57 PID: 7688 Comm: hwclock Not tainted 4.10.0-14-generic #16-Ubuntu + [ 1727.775883] task: c000000fdfdc8400 task.stack: c000000fdfef4000 + [ 1727.775884] NIP: c00000000090540c LR: c0000000000846f4 CTR: 000000003006dd70 + [ 1727.775885] REGS: c000000fdfef79a0 TRAP: 0901 Not tainted (4.10.0-14-generic) + [ 1727.775886] MSR: 9000000000009033 <SF,HV,EE,ME,IR,DR,RI,LE> + [ 1727.775889] CR: 28024442 XER: 20000000 + [ 1727.775890] CFAR: c00000000008472c SOFTE: 1 + GPR00: 0000000030005128 c000000fdfef7c20 c00000000144c900 fffffffffffffff4 + GPR04: 0000000028024442 c00000000090540c 9000000000009033 0000000000000000 + GPR08: 0000000000000000 0000000031fc4000 c000000000084710 9000000000001003 + GPR12: c0000000000846e8 c00000000fba0100 + [ 1727.775897] NIP [c00000000090540c] opal_set_rtc_time+0x4c/0xb0 + [ 1727.775899] LR [c0000000000846f4] opal_return+0xc/0x48 + [ 1727.775899] Call Trace: + [ 1727.775900] [c000000fdfef7c20] [c00000000090540c] opal_set_rtc_time+0x4c/0xb0 (unreliable) + [ 1727.775901] [c000000fdfef7c60] [c000000000900828] rtc_set_time+0xb8/0x1b0 + [ 1727.775903] [c000000fdfef7ca0] [c000000000902364] rtc_dev_ioctl+0x454/0x630 + [ 1727.775904] [c000000fdfef7d40] [c00000000035b1f4] do_vfs_ioctl+0xd4/0x8c0 + [ 1727.775906] [c000000fdfef7de0] [c00000000035bab4] SyS_ioctl+0xd4/0xf0 + [ 1727.775907] [c000000fdfef7e30] [c00000000000b184] system_call+0x38/0xe0 + [ 1727.775908] Instruction dump: + [ 1727.775909] f821ffc1 39200000 7c832378 91210028 38a10020 39200000 38810028 f9210020 + [ 1727.775911] 4bfffe6d e8810020 80610028 4b77f61d <60000000> 7c7f1b78 3860000a 2fbffff4 + + This is found when executing the testcase + 
https://github.com/open-power/op-test-framework/blob/master/testcases/fspresetReload.py + + With this fix ran fsp hir torture testcase in the above test + which is working fine. +- occ: Set return variable to correct value + + When entering this section of code rc will be zero. If fsp_mkmsg() fails + the code responsible for printing an error message won't be set. + Resetting rc should allow for the error case to trigger if fsp_mkmsg + fails. +- capp: Fix hang when CAPP microcode LID is missing on FSP machine + + When the LID is absent, we fail early with an error from + start_preload_resource. In that case, capp_ucode_info.load_result + isn't set properly causing a subsequent capp_lid_download() to + call wait_for_resource_loaded() on something that isn't being + loaded, thus hanging. + +- FSP: Add check to detect FSP R/R inside fsp_sync_msg() + + OPAL sends MBOX message to FSP and updates message state from fsp_msg_queued + -> fsp_msg_sent. fsp_sync_msg() queues message and waits until we get response + from FSP. During FSP R/R we move outstanding MBOX messages from msgq to rr_queue + including inflight message (fsp_reset_cmdclass()). But we are not resetting + inflight message state. + + In the extreme corner case where we sent a message to FSP via fsp_sync_msg() path + and FSP R/R happens before getting a response from FSP, we will end up waiting + in fsp_sync_msg() until everything becomes normal. + + This patch adds fsp_in_rr() check to fsp_sync_msg() and return error to caller + if FSP is in R/R.
+- FSP/CONSOLE: Do not free fsp_msg in error path + + as we reuse same msg to send next output message. + +- platform/zz: Acknowledge OCC_LOAD mbox message in ZZ + + In P9 FSP box, OCC image is pre-loaded. So do not handle the load + command and send SUCCESS to FSP on receiving OCC_LOAD mbox message. + +- FSP/RTC: Improve error log + +astbmc systems +-------------- + +Since :ref:`skiboot-5.6.0`: + +- platforms/astbmc: Don't validate model on palmetto + + The platform isn't compatible with palmetto unless the root device-tree + node's "model" property is NULL or "palmetto". However, we could have + "TN71-BP012" for the property on palmetto. :: + + linux# cat /proc/device-tree/model + TN71-BP012 + + This skips the validation on root device-tree node's "model" property + on palmetto, meaning we check the "compatible" property only. + + +General +------- + +Since :ref:`skiboot-5.7-rc2`: + +- core/pci: Fix mem-leak on fast-reboot + + Fast-reboot has a memory leak which causes the system to crash after about + 250 fast-reboots. The patch fixes the memory leak. + The cause of the leak was the pci_device's being freed, without freeing + the pci_slot within it.
+ +- gcov: properly handle gard and pflash code coverage + +Since :ref:`skiboot-5.6.0`: + +- Reduce log level on non-error log messages + + 90% of what we print isn't useful to a normal user. This + dramatically reduces the amount of messages printed by + OPAL in normal circumstances. + +- init: Silence messages and call ourselves "OPAL" +- psi: Switch to ESB mode later + + There's an erratum: if we switch to ESB mode before setting up + the various ESB mode related registers, pending interrupts + can go wrong. + +- lpc: Enable "new" SerIRQ mode +- hw/ipmi/ipmi-sel: missing newline in prlog warning + +- p8-i2c OCC lock: fix locking in p9_i2c_bus_owner_change +- Convert important polling loops to spin at lowest SMT priority + + The pattern of calling cpu_relax() inside a polling loop does + not suit the powerpc SMT priority instructions. Preferred is to + set a low priority then spin until break condition is reached, + then restore priority. + +- Improve cpu_idle when PM is disabled + + Split cpu_idle() into cpu_idle_delay() and cpu_idle_job() rather than + requesting the idle type as a function argument. Have those functions + provide a default polling (non-PM) implementation which spins at the + lowest SMT priority. + +- core/fdt: Always add a reserve map + + Currently we skip adding the reserved ranges block to the generated + FDT blob if we are excluding the root node. This can result in a DTB + that dtc will barf on because the reserved memory ranges overlap with + the start of the dt_struct block. As an example: :: + + $ fdtdump broken.dtb -d + /dts-v1/; + // magic: 0xd00dfeed + // totalsize: 0x7f3 (2035) + // off_dt_struct: 0x30 <----\ + // off_dt_strings: 0x7b8 | this is bad! 
+ // off_mem_rsvmap: 0x30 <----/ + // version: 17 + // last_comp_version: 16 + // boot_cpuid_phys: 0x0 + // size_dt_strings: 0x3b + // size_dt_struct: 0x788 + + /memreserve/ 0x100000000 0x300000004; + /memreserve/ 0x3300000001 0x169626d2c; + /memreserve/ 0x706369652d736c6f 0x7473000000000003; + *continues* + + With this patch: :: + + $ fdtdump working.dtb -d + /dts-v1/; + // magic: 0xd00dfeed + // totalsize: 0x803 (2051) + // off_dt_struct: 0x40 + // off_dt_strings: 0x7c8 + // off_mem_rsvmap: 0x30 + // version: 17 + // last_comp_version: 16 + // boot_cpuid_phys: 0x0 + // size_dt_strings: 0x3b + // size_dt_struct: 0x788 + + // 0040: tag: 0x00000001 (FDT_BEGIN_NODE) + / { + // 0048: tag: 0x00000003 (FDT_PROP) + // 07fb: string: phandle + // 0054: value + phandle = <0x00000001>; + *continues* + +- hw/lpc-mbox: Use message registers for interrupts + + Currently the BMC raises the interrupt using the BMC control register. + It does so on all accesses to the 16 'data' registers meaning that when + the BMC only wants to set the ATTN (on which we have interrupts enabled) + bit we will also get a control register based interrupt. + + The solution here is to mask that interrupt permanently and enable + interrupts on the protocol defined 'response' data byte. + +PCI +--- + +Since :ref:`skiboot-5.6.0`: + +- pci: Wait 20ms before checking presence detect on PCIe + + As the PHB presence logic has a debounce timer that can take + a while to settle. + +- phb3+iov: Fixup support for config space filters + + The filter should be called before the HW access and its + return value controls whether to perform the access or not +- core/pci: Use PCI slot's power facility in pci_enable_bridge() + + The current implementation has incorrect assumptions: there is + always a PCI slot associated with root port and PCIe switch + downstream port and all of them are capable of changing their + power state through the PCICAP_EXP_SLOTCTL register. 
Firstly, there + might not be a PCI slot associated with the root port or PCIe + switch downstream port. Secondly, the power isn't controlled + by standard config register (PCICAP_EXP_SLOTCTL). There are + I2C slave devices used to control the power states on Tuleta. + + In order to use the PCI slot's methods to manage the power + states, this does: + + * Introduce PCI_SLOT_FLAG_ENFORCE, indicates the request operation + is enforced to be applied. + * pci_enable_bridge() is split into 3 functions: pci_bridge_power_on() + to power it on; pci_enable_bridge() as a place holder and + pci_bridge_wait_link() to wait for the downstream link to come up. + * In pci_bridge_power_on(), the PCI slot's specific power management + methods are used if there is a PCI slot associated with the PCIe + switch downstream port or root port. +- platforms/astbmc/slots.c: Allow comparison of bus numbers when matching slots + + When matching devices on multiple downstream PLX busses we need to compare more + than just the device-id of the PCIe BDFN, so increase the mask to do so. + +Debugging, Tests and simulators +------------------------------- + +Since :ref:`skiboot-5.7-rc2`: + +- boot_tests: add PFLASH_TO_COPY for OpenBMC +- travis: Add debian stretch and unstable + + At the moment, we mark them both as being able to fail, as we're + hitting an assert in one of the unit tests on debian stretch, and + that hasn't yet been chased down. + +- core/backtrace: Serialise printing backtraces + + Add a lock so that only one thread can print a backtrace at a time. + This should prevent multiple threads from garbling each other's + backtraces. 
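The core/backtrace entry above describes taking a lock around the whole backtrace so that output from concurrent threads cannot interleave. A minimal sketch of that idea, modelled in Python rather than skiboot's C, is given below. The names ``print_backtrace``, ``backtrace_lock`` and the frame strings are hypothetical and only illustrate the locking pattern.

```python
import threading

# One lock shared by every thread that wants to emit a backtrace.
backtrace_lock = threading.Lock()
output = []  # stands in for the console log


def print_backtrace(thread_name, frames):
    # Hold the lock for the entire backtrace, not per line, so that
    # the lines of one backtrace stay contiguous in the log.
    with backtrace_lock:
        for depth, frame in enumerate(frames):
            output.append(f"{thread_name} #{depth}: {frame}")


frames = ["frame_a", "frame_b", "frame_c"]
threads = [
    threading.Thread(target=print_backtrace, args=(f"cpu{i}", frames))
    for i in range(4)
]
for t in threads:
    t.start()
for t in threads:
    t.join()

print("\n".join(output))
```

The order in which the four backtraces appear is not deterministic, but each one occupies three consecutive lines.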
+ +Since :ref:`skiboot-5.7-rc1`: + +- lpc: remove double LPC prefix from messages +- opal-ci/fetch-debian-jessie-installer: follow redirects + Fixes some CI failures +- test/qemu-jessie: bail out fast on kernel panic +- test/qemu-jessie: dump boot log on failure +- travis: add fedora26 +- xz: add fallthrough annotations to silence GCC7 warning + +Since :ref:`skiboot-5.6.0`: + +- boot-tests: add OpenBMC support +- boot_test.sh: Add SMC BMC support + + Your BMC needs a special debug image flashed to use this, the exact + image and methods aren't something I can publish here, but if you work + for IBM or SMC you can find out from the right sources. + + A few things are needed to move around to be able to flash to a SMC BMC. + + For a start, the SSH daemon will only accept connections after a special + incantation (which I also can't share), but you should put that in the + ~/.skiboot_boot_tests file along with some other default login information + we don't publicise too broadly (because Security Through Obscurity is + *obviously* a good idea....) + + We also can't just directly "ssh /bin/true", we need an expect script, + and we can't scp, but we can anonymous rsync! + + You also need a pflash binary to copy over. +- hdata_to_dt: Add PVR overrides to the usage text +- mambo: Add a reservation for the initramfs + + On most systems the initramfs is loaded inside the part of memory + reserved for the OS [0x0-0x30000000] and skiboot will never touch it. + On mambo it's loaded at 0x80000000 and if you're unlucky skiboot can + allocate over the top of it and corrupt the initramfs blob. + + There might be the downside that the kernel cannot re-use the initramfs + memory since it's marked as reserved, but the kernel might also free it + anyway. +- mambo: Update P9 PVR to reflect Scale out 24 core chips + + The P9 PVR bits 48:51 don't indicate a revision but instead different + configurations. 
From BookIV we have: + + ==== =================== + Bits Configuration + ==== =================== + 0 Scale out 12 cores + 1 Scale out 24 cores + 2 Scale up 12 cores + 3 Scale up 24 cores + ==== =================== + + Skiboot will mostly use the "Scale out 24 core" configuration + (ie. SMT4 not SMT8) so reflect this in mambo. +- core: Move enable_mambo_console() into chip initialisation + + Rather than having a wart in main_cpu_entry() that initialises the mambo + console, we can move it into init_chips() which is where we discover that we're + on mambo. + +- mambo: Create multiple chips when we have multiple CPUs + + Currently when we boot mambo with multiple CPUs, we create multiple CPU nodes in + the device tree, and each claims to be on a separate chip. + + However we don't create multiple xscom nodes, which means skiboot only knows + about a single chip, and all CPUs end up on it. At the moment mambo is not able + to create multiple xscom controllers. We can create fake ones, just by faking + the device tree up, but that seems uglier than this solution. + + So create a mambo-chip for each CPU other than 0, to tell skiboot we want a + separate chip created. This then enables Linux to see multiple chips: :: + + smp: Brought up 2 nodes, 2 CPUs + numa: Node 0 CPUs: 0 + numa: Node 1 CPUs: 1 + +- chip: Add support for discovering chips on mambo + + Currently the only way for skiboot to discover chips is by looking for xscom + nodes. But on mambo it's currently not possible to create multiple xscom nodes, + which means we can only simulate a single chip system. + + However it seems we can fairly cleanly add support for a special mambo chip + node, and use that to instantiate multiple chips. + + Add a check in init_chip() that we're not clobbering an already initialised + chip, now that we have two places that initialise chips. 
+- mambo: Make xscom claim to be DD 2.0 + + In the mambo tcl we set the CPU version to DD 2.0, because mambo is not + bug compatible with DD 1. + + But in xscom_read_cfam_chipid() we have a hard coded value, to work + around the lack of the f000f register, which claims to be P9 DD 1.0. + + This doesn't seem to cause crashes or anything, but at boot we do see: :: + + [ 0.003893084,5] XSCOM: chip 0x0 at 0x1a0000000000 [P9N DD1.0] + + So fix it to claim that the xscom is also DD 2.0 to match the CPU. + +- mambo: Match whole string when looking up symbols with linsym/skisym + + linsym/skisym use a regex to match the symbol name, and accepts a + partial match against the entry in the symbol map, which can lead to + somewhat confusing results, eg: :: + + systemsim % linsym early_setup + 0xc000000000027890 + systemsim % linsym early_setup$ + 0xc000000000aa8054 + systemsim % linsym early_setup_secondary + 0xc000000000027890 + + I don't think that's the behaviour we want, so append a $ to the name so + that the symbol has to match against the whole entry, eg: :: + + systemsim % linsym early_setup + 0xc000000000aa8054 + +- Disable nap on P8 Mambo, public release has bugs +- mambo: Allow loading multiple CPIOs + + Currently we have support for loading a single CPIO and telling Linux to + use it as the initrd. But the Linux code actually supports having + multiple CPIOs contiguously in memory, between initrd-start and end, and + will unpack them all in order. That is a really nice feature as it means + you can have a base CPIO with your root filesystem, and then tack on + others as you need for various tests etc. + + So expand the logic to handle SKIBOOT_INITRD, and treat it as a comma + separated list of CPIOs to load. I chose comma as it's fairly rare in + filenames, but we could make it space, colon, whatever. Or we could add + a new environment variable entirely. The code also supports trimming + whitespace from the values, so you can have "cpio1, cpio2". 
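The SKIBOOT_INITRD handling described in the entry above (a comma separated list with whitespace trimmed from each value) can be modelled with a short Python sketch. The real logic lives in the mambo Tcl scripts; the function name ``parse_initrd_list`` is hypothetical, and silently dropping empty entries is an assumption made here, not something the release note states.

```python
def parse_initrd_list(value):
    """Split a comma separated SKIBOOT_INITRD value into CPIO file names.

    Whitespace around each name is trimmed, so "cpio1, cpio2" works.
    Assumption: empty entries (e.g. from a trailing comma) are ignored.
    """
    return [name.strip() for name in value.split(",") if name.strip()]


print(parse_initrd_list("cpio1, cpio2"))   # ['cpio1', 'cpio2']
print(parse_initrd_list("base.cpio"))      # ['base.cpio']
```

The resulting list keeps the order given by the user, which matters because the CPIOs are unpacked in order.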
+- hdata/test: Add memory reservations to hdata_to_dt + + Currently memory reservations are parsed, but since they are not + processed until mem_region_init() they don't appear in the output + device tree blob. Several bugs have been found with memory reservations + so we want them to be part of the test output. + + Add them and clean up several usages of printf() since we want only the + dtb to appear in standard out. + + +pflash/libffs +------------- + +Since :ref:`skiboot-5.7-rc2`: + +- pflash option to retrieve PNOR partition flags + + This commit extends pflash with an option to retrieve and print + information for a particular partition, including the content from + "pflash -i" and a verbose list of set miscellaneous flags. -i option + is also updated to print a short list of flags in addition to the + ECC flag, with one character per flag. A test of the new option is + included in libflash/test. + +Since :ref:`skiboot-5.6.0`: + +- libflash/libffs: Zero checksum words + + On writing ffs entries to flash libffs doesn't zero checksum words + before calculating the checksum across the entire structure. This causes + an inaccurate calculation of the checksum as it may calculate a checksum + on non-zero checksum bytes. + +- libffs: Fix ffs_lookup_part() return value + + It would return success when the part wasn't found +- libflash/libffs: Correctly update the actual size of the partition + + libffs has been updating FFS partition information in the wrong place + which leads to incomplete erases and corruption. +- libflash: Initialise entries list earlier + + In the bail-out path we call ffs_close() to tear down the partially + initialised ffs_handle. ffs_close() expects the entries list to be + initialised so we need to do that earlier to prevent a null pointer + dereference. + +mbox-flash +---------- + +mbox-flash is the emerging standard way of talking to host PNOR flash +on POWER9 systems. 
+ +- libflash/mbox-flash: Implement MARK_WRITE_ERASED mbox call + + Version two of the mbox-flash protocol defines a new command: + MARK_WRITE_ERASED. + + This command provides a simple way to mark a region of flash as all 0xff + without the need to go and write all 0xff. This is an optimisation as + there is no need for an erase before a write, it is the responsibility of + the BMC to deal with the flash correctly, however in v1 it was ambiguous + what a client should do if the flash should be erased but not actually + written to. This allows for an optimal path to resolve this problem. + +- libflash/mbox-flash: Update to V2 of the protocol + + Updated version 2 of the protocol can be found at: + https://github.com/openbmc/mboxbridge/blob/master/Documentation/mbox_protocol.md + + This commit changes mbox-flash such that it will preferentially talk + version 2 to any capable daemon but still remain capable of talking to + v1 daemons. + + Version two changes some of the command definitions for increased + consistency and usability. + Version two includes more attention bits - these are now dealt with at a + simple level.
+ +- hw/lpc-mbox: Use message registers for interrupts + + Currently the BMC raises the interrupt using the BMC control register. + It does so on all accesses to the 16 'data' registers meaning that when + the BMC only wants to set the ATTN (on which we have interrupts enabled) + bit we will also get a control register based interrupt. + + The solution here is to mask that interrupt permanently and enable + interrupts on the protocol defined 'response' data byte. + + +Contributors +------------ + +* Processed 232 csets from 29 developers. +* 1 employer found +* A total of 13043 lines added, 2517 removed (delta 10526) + +Extending the analysis done for some previous releases, we can see our trends +in code review across versions: + +======= ====== ======== ========= ========= =========== +Release csets Ack % Reviews % Tested % Reported % +======= ====== ======== ========= ========= =========== +5.0 329 15 (5%) 20 (6%) 1 (0%) 0 (0%) +5.1 372 13 (3%) 38 (10%) 1 (0%) 4 (1%) +5.2-rc1 334 20 (6%) 34 (10%) 6 (2%) 11 (3%) +5.3-rc1 302 36 (12%) 53 (18%) 4 (1%) 5 (2%) +5.4 361 16 (4%) 28 (8%) 1 (0%) 9 (2%) +5.5 408 11 (3%) 48 (12%) 14 (3%) 10 (2%) +5.6 87 12 (14%) 6 (7%) 5 (6%) 2 (2%) +5.7 232 30 (13%) 32 (14%) 5 (2%) 2 (1%) +======= ====== ======== ========= ========= =========== + +This cycle has been good for reviews/acks, scoring second highest percentage +ever on both, as well as being right up there on absolute numbers.
+ + +Developers with the most changesets +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +========================= ==== ======= +Developer # % +========================= ==== ======= +Benjamin Herrenschmidt 41 (17.7%) +Stewart Smith 31 (13.4%) +Michael Neuling 28 (12.1%) +Oliver O'Halloran 18 (7.8%) +Vasant Hegde 18 (7.8%) +Jeremy Kerr 12 (5.2%) +Alistair Popple 11 (4.7%) +Gavin Shan 10 (4.3%) +Russell Currey 9 (3.9%) +Michael Ellerman 9 (3.9%) +Madhavan Srinivasan 7 (3.0%) +Cyril Bur 6 (2.6%) +Christophe Lombard 5 (2.2%) +Shilpasri G Bhat 5 (2.2%) +Andrew Donnellan 3 (1.3%) +Nicholas Piggin 3 (1.3%) +Mahesh Salgaonkar 2 (0.9%) +Anju T Sudhakar 2 (0.9%) +Hemant Kumar 2 (0.9%) +Matt Brown 1 (0.4%) +Michael Tritz 1 (0.4%) +Joel Stanley 1 (0.4%) +Balbir Singh 1 (0.4%) +Frederic Barrat 1 (0.4%) +Andrew Jeffery 1 (0.4%) +Pridhiviraj Paidipeddi 1 (0.4%) +Reza Arbab 1 (0.4%) +Suraj Jitindar Singh 1 (0.4%) +Vaibhav Jain 1 (0.4%) +========================= ==== ======= + + +Developers with the most changed lines +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +========================= ==== ======= +Developer # % +========================= ==== ======= +Hemant Kumar 3056 (23.0%) +Stewart Smith 1826 (13.7%) +Benjamin Herrenschmidt 1348 (10.1%) +Christophe Lombard 937 (7.0%) +Shilpasri G Bhat 770 (5.8%) +Madhavan Srinivasan 755 (5.7%) +Jeremy Kerr 731 (5.5%) +Cyril Bur 674 (5.1%) +Alistair Popple 477 (3.6%) +Gavin Shan 414 (3.1%) +Russell Currey 396 (3.0%) +Michael Neuling 336 (2.5%) +Vasant Hegde 308 (2.3%) +Oliver O'Halloran 300 (2.3%) +Anju T Sudhakar 300 (2.3%) +Michael Tritz 167 (1.3%) +Frederic Barrat 113 (0.8%) +Nicholas Piggin 93 (0.7%) +Mahesh Salgaonkar 76 (0.6%) +Michael Ellerman 66 (0.5%) +Suraj Jitindar Singh 59 (0.4%) +Andrew Donnellan 53 (0.4%) +Joel Stanley 20 (0.2%) +Balbir Singh 12 (0.1%) +Reza Arbab 10 (0.1%) +Vaibhav Jain 9 (0.1%) +Pridhiviraj Paidipeddi 2 (0.0%) +Matt Brown 1 (0.0%) +Andrew Jeffery 1 (0.0%) +========================= ==== ======= + +Developers with the 
most signoffs +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ +(total 242) + +========================= ==== ======= +Developer # % +========================= ==== ======= +Stewart Smith 201 (83.1%) +Michael Neuling 29 (12.0%) +Madhavan Srinivasan 4 (1.7%) +Suraj Jitindar Singh 3 (1.2%) +Anju T Sudhakar 2 (0.8%) +Hemant Kumar 2 (0.8%) +Cyril Bur 1 (0.4%) +========================= ==== ======= + +Developers with the most reviews +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ +(total 32) + +========================= ==== ======= +Developer # % +========================= ==== ======= +Vasant Hegde 8 (25.0%) +Cyril Bur 7 (21.9%) +Andrew Donnellan 5 (15.6%) +Frederic Barrat 5 (15.6%) +Andrew Jeffery 2 (6.2%) +Gavin Shan 2 (6.2%) +Joel Stanley 1 (3.1%) +Oliver O'Halloran 1 (3.1%) +Alistair Popple 1 (3.1%) +========================= ==== ======= + +Developers with the most test credits +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ +(total 5) + +========================== ==== ======= +Developer # % +========================== ==== ======= +Vasant Hegde 2 (40.0%) +Oliver O'Halloran 1 (20.0%) +Ananth N Mavinakayanahalli 1 (20.0%) +Michael Ellerman 1 (20.0%) +========================== ==== ======= + +Developers who gave the most tested-by credits +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ +(total 5) + +========================= ==== ======= +Developer # % +========================= ==== ======= +Jeremy Kerr 2 (40.0%) +Vasant Hegde 1 (20.0%) +Oliver O'Halloran 1 (20.0%) +Michael Ellerman 1 (20.0%) +========================= ==== ======= + +Developers with the most report credits +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ +(total 2) + +========================= ==== ======= +Developer # % +========================= ==== ======= +Oliver O'Halloran 1 (50.0%) +Alastair D'Silva 1 (50.0%) +========================= ==== ======= + +Developers who gave the most report credits +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ +(total 2) + +========================= ==== ======= +Developer # % 
+========================= ==== ======= +Andrew Donnellan 1 (50.0%) +Stewart Smith 1 (50.0%) +========================= ==== ======= diff --git a/roms/skiboot/doc/release-notes/skiboot-5.8-rc1.rst b/roms/skiboot/doc/release-notes/skiboot-5.8-rc1.rst new file mode 100644 index 000000000..cd559c8f1 --- /dev/null +++ b/roms/skiboot/doc/release-notes/skiboot-5.8-rc1.rst @@ -0,0 +1,480 @@ +.. _skiboot-5.8-rc1: + +skiboot-5.8-rc1 +=============== + +skiboot v5.8-rc1 was released on Tuesday August 22nd 2017. It is the first +release candidate of skiboot 5.8, which will become the new stable release +of skiboot following the 5.7 release, first released 25th July 2017. + +skiboot v5.8-rc1 contains all bug fixes as of :ref:`skiboot-5.4.6` +and :ref:`skiboot-5.1.20` (the currently maintained stable releases). We +do not currently expect to do any 5.7.x stable releases. + +For how the skiboot stable releases work, see :ref:`stable-rules` for details. + +The current plan is to cut the final 5.8 by August 25th, with skiboot 5.8 +being for all POWER8 and POWER9 platforms in op-build v1.19 (Due August 25th). +This is a short cycle as this release is mainly targeted towards POWER9 +bringup efforts. + +Over skiboot-5.7, we have the following changes: + +New Features +------------ +- sensors: occ: Add support to clear sensor groups + + Adds a generic API to clear sensor groups. OCC inband sensor groups + such as CSM, Profiler and Job Scheduler can be cleared using this API. + It will clear the min/max of all sensors belonging to OCC sensor + groups. + +- sensors: occ: Add CSM_{min/max} sensors + + HWMON's lowest/highest attribute is used by CSM agent, so map min/max + device-tree properties "sensor-data-min" and "sensor-data-max" to + the min/max of CSM. + +- sensors: occ: Add support for OCC inband sensors + + Add support to parse and export OCC inband sensors which are copied + by OCC to main memory in P9.
Each OCC writes three buffers which + includes one names buffer for sensor meta data and two buffers for + sensor readings. While OCC writes to one buffer the sensor values + can be read from the other buffer. The sensors are updated every + 100ms. + + This patch adds power, temperature, current and voltage sensors to + ``/ibm,opal/sensors`` device-tree node which can be exported by the + ibmpowernv-hwmon driver in Linux. + +- psr: occ: Add support to change power-shifting-ratio + + Add support to set the CPU-GPU power shifting ratio which is used by + the OCC power capping algorithm. PSR value of 100 takes all power away + from CPU first and a PSR value of 0 caps GPU first. + +- powercap: occ: Add a generic powercap framework + + This patch adds a generic powercap framework and exports OCC powercap + sensors using which system powercap can be set inband through OPAL-OCC + command-response interface. +- phb4: Enable PCI peer-to-peer + + P9 supports PCI peer-to-peer: a PCI device can write directly to the + mmio space of another PCI device. It completely by-passes the CPU. + + It requires some configuration on the PHBs involved: + + 1. on the initiating side, the address for the read/write operation is + in the mmio space of the target, i.e. well outside the range normally + allowed. So we disable range-checking on the TVT entry in bypass mode. + + 2. on the target side, we need to explicitly enable p2p by setting a + bit in a configuration register. It has the side-effect of reserving + an outbound (as seen from the CPU) store queue for p2p. Therefore we + only enable p2p on the PHBs using it, as we don't want to waste the + resource if we don't have to. + + P9 supports p2p mmio writes. Reads are currently only supported if the + two devices are under the same PHB but that is expected to change in + the future, and it raises questions about intermediate switches + configuration, so we report an error for the time being. 
+ + The patch adds a new OPAL call to allow the OS to declare a p2p + (initiator, target) pair. + +- NX 842 and GZIP support on POWER9 + + +POWER9 DD2 +---------- + +Further support for POWER9 DD2 revision chips. Notable changes include: + +- xscom: Grab P9 DD2 revision level +- vas: Set mmio enable bits in DD2 + + POWER9 DD2 added some new "enable" bits that must be set for VAS to + work. These bits were unused in DD1. +- hdat: Add POWER9 DD2.0 specific pa_features + + Same as the default but with TM off. + +POWER9 +------ +- Base NPU2 support on POWER9 DD2 +- hdata/i2c: Work around broken I2C array version + + Work around a bug in the I2C devices array that shows the + array version as being v2 when only the v1 data is populated. +- Recognize the 2s2u zz platform + + OPAL currently doesn't know about the 2s2u zz. It recognizes such a + box as a generic BMC machine and fails to boot. Add the 2s2u as a + supported platform. + + There will subsequently be a 2s2u-L system which may have a different + compatible property, which will need to be handled later. +- hdata/spira: POWER9 NX isn't software compatible with P7/P8 NX, don't claim so +- NX: Add P9 NX support for gzip compression engine + + Power 9 introduces NX gzip compression engine. This patch adds gzip + compression support in NX. Virtual Accelerator Switch (VAS) is used to + access NX gzip engine and the channel configuration will be done with + the receive FIFO. So RxFIFO address, logical partition ID (lpid), + process ID (pid) and thread ID (tid) are used to configure RxFIFO. + P9 NX supports high and normal priority FIFOs. Skiboot configures User + Mode Access Control (UMAC) notify match register with these values and + also enables other registers to enable / disable the engine. + + Creates the following device-tree entries to provide RxFIFO address, + RxFIFO size, Fifo priority, lpid, pid and tid values so that kernel + can drive P9 NX gzip engine.
+ + The following nodes are located under an xscom node: :: + /xscom@<xscom_addr>/nx@<nx_addr> + + /ibm,gzip-high-fifo : High priority gzip RxFIFO + /ibm,gzip-normal-fifo : Normal priority gzip RxFIFO + + Each RxFIFO node contains: + + ``compatible`` + ``ibm,p9-nx-gzip`` + ``priority`` + High or Normal + ``rx-fifo-address`` + RxFIFO address + ``rx-fifo-size`` + RxFIFO size + ``lpid`` + 0xfff (1's for 12 bits in UMAC notify match register) + ``pid`` + gzip coprocessor type + ``tid`` + counter for gzip + +- NX: Add P9 NX support for 842 compression engine + + This patch adds changes needed for 842 compression engine on power 9. + Virtual Accelerator Switch (VAS) is used to access NX 842 engine on P9 + and the channel setup will be done with receive FIFO. So RxFIFO + address, logical partition ID (lpid), process ID (pid) and thread ID + (tid) are used for this setup. p9 NX supports high and normal priority + FIFOs. skiboot is not involved to process data with 842 engine, but + configures User Mode Access Control (UMAC) notify match register with + these values and export them to kernel with device-tree entries. + + Also configure registers to setup and enable / disable the engine with + the appropriate registers. Creates the following device-tree entries to + provide RxFIFO address, RxFIFO size, Fifo priority, lpid, pid and tid + values so that kernel can drive P9 NX 842 engine.
+ + The following nodes are located under an xscom node: + ``/xscom@<xscom_addr>/nx@<nx_addr>`` + + ``/ibm,842-high-fifo`` + High priority 842 RxFIFO + ``/ibm,842-normal-fifo`` + Normal priority 842 RxFIFO + + Each RxFIFO node contains: + + ``compatible`` + ibm,p9-nx-842 + ``priority`` + High or Normal + ``rx-fifo-address`` + RxFIFO address + ``rx-fifo-size`` + RXFIFO size + ``lpid`` + 0xfff (1's for 12 bits set in UMAC notify match register) + ``pid`` + 842 coprocessor type + ``tid`` + Counter for 842 +- vas: Create MMIO device tree node + + Create a device tree node for VAS and add properties that Linux + will need to configure/use VAS. +- opal: Extract sw checkstop fir address from HDAT. + + Extract sw checkstop fir address info from HDAT and populate device tree + node ibm,sw-checkstop-fir. + + This patch is required for OPAL_CEC_REBOOT2 OPAL call to work as expected + on p9. + + With this patch a device property 'ibm,sw-checkstop-fir' is now properly + populated: :: + + # lsprop ibm,sw-checkstop-fir + ibm,sw-checkstop-fir + 05012000 0000001f + +PHB4 +---- +- hdat: Fix PCIe GEN4 lane-eq setting for DD2 + + For PCIe GEN4, DD2 uses only 1 byte per PCIe lane for the lane-eq + settings (DD1 uses 2 bytes) +- pci: Wait for CRS and switch link when restoring bus numbers + + When a complete reset occurs, after the PHB recovers it propagates a + reset down the wire to every device. At the same time, skiboot talks to + every device in order to restore the state of devices to what they were + before the reset. + + In some situations, such as devices that recovered slowly and/or were + behind a switch, skiboot attempted to access config space of the device + before the link was up and the device could respond. + + Fix this by retrying CRS until the device responds correctly, and for + devices behind a switch, making sure the switch has its link up first. 
+- pci: Track whether a PCI device is a virtual function + + This can be checked from config space, but we will need to know this when + restoring the PCI topology, and it is not always safe to access config + space during this period. +- phb4: Enhanced PCIe training tracing + + This adds more details to the PCI training tracing (aka Rick Mata + mode). It enables the PCIe Link Training and Status State + Machine (LTSSM) tracing and details on speed and link width. + + Output now looks like this when enabled (via nvram): :: + + [ 1.096995141,3] PHB#0000[0:0]: TRACE:0x0000001101000000 0ms GEN1:x16:detect + [ 1.102849137,3] PHB#0000[0:0]: TRACE:0x0000102101000000 11ms presence GEN1:x16:polling + [ 1.104341838,3] PHB#0000[0:0]: TRACE:0x0000182101000000 14ms training GEN1:x16:polling + [ 1.104357444,3] PHB#0000[0:0]: TRACE:0x00001c5101000000 14ms training GEN1:x16:recovery + [ 1.104580394,3] PHB#0000[0:0]: TRACE:0x00001c5103000000 14ms training GEN3:x16:recovery + [ 1.123259359,3] PHB#0000[0:0]: TRACE:0x00001c5104000000 51ms training GEN4:x16:recovery + [ 1.141737656,3] PHB#0000[0:0]: TRACE:0x0000144104000000 87ms presence GEN4:x16:L0 + [ 1.141752318,3] PHB#0000[0:0]: TRACE:0x0000154904000000 87ms trained GEN4:x16:L0 + [ 1.141757964,3] PHB#0000[0:0]: TRACE: Link trained. + [ 1.096834019,3] PHB#0001[0:1]: TRACE:0x0000001101000000 0ms GEN1:x16:detect + [ 1.105578525,3] PHB#0001[0:1]: TRACE:0x0000102101000000 17ms presence GEN1:x16:polling + [ 1.112763075,3] PHB#0001[0:1]: TRACE:0x0000183101000000 31ms training GEN1:x16:config + [ 1.112778956,3] PHB#0001[0:1]: TRACE:0x00001c5081000000 31ms training GEN1:x08:recovery + [ 1.113002083,3] PHB#0001[0:1]: TRACE:0x00001c5083000000 31ms training GEN3:x08:recovery + [ 1.114833873,3] PHB#0001[0:1]: TRACE:0x0000144083000000 35ms presence GEN3:x08:L0 + [ 1.114848832,3] PHB#0001[0:1]: TRACE:0x0000154883000000 35ms trained GEN3:x08:L0 + [ 1.114854650,3] PHB#0001[0:1]: TRACE: Link trained.
+ +- phb4: Fix reading wrong size registers in EEH dump + + These registers are supposed to be 16bit, and it makes part of the + register dump misleading. +- phb4: Ignore slot state if performing complete reset + + If a PHB is being completely reset, its state is about to be blown away + anyway, so if it's not in an appropriate state, creset it regardless. +- phb4: Prepare for link down when creset called from kernel + + phb4_creset() is typically called by functions that prepare the link + to go down. In cases where creset() is called directly by the kernel, + this isn't the case and it can cause issues. Prepare for link down in + creset, just like we do in freset and hreset. +- phb4: Skip attempting to fix PHBs broken on boot + + If a PHB is marked broken it didn't work on boot, and if it didn't work + on boot then there's no point trying to recover it later +- phb4: Fix duplicate in EEH register dump +- phb4: Be more conservative on link presence timeout + + In this patch we tuned our link timing to be more aggressive: + ``cf960e2884 phb4: Improve reset and link training timing`` + + Cards should take only 32ms but unfortunately we've seen some take + up to 440ms. Hence bump our timer up to 1000ms. + + This can hurt boot times on systems where slots indicate a hotplug + status but no electrical link is present (which we've seen). Since we + have to wait 1 second between PERST and touching config space anyway, + it shouldn't hurt too much. +- phb4: Assert PERST before PHB reset + + Currently we don't assert PERST before issuing a PHB reset. This means + any link issues while resetting the PHB will be logged as errors. + + This asserts PERST before we start resetting the PHB to avoid this. +- Revert "phb4: Read PERST signal rather than assuming it's asserted" + + This reverts commit b42ff2b904165addf32e77679cebb94a08086966 + + The original patch assumes that PERST has been asserted well before (> + 250ms) we hit here (ie. during hostboot).
+ + In a subsequent patch this will no longer be the case as we need to + assert PERST during PHB reset, which may only be a few milliseconds + before we hit this code. + + Hence revert this patch. Go back to the software mechanism using + skip_perst to determine if PERST should be asserted or not. This + allows us to keep the speed optimisation on boot. +- phb4: Set REGB error enables based on link state + + Currently we always set these enables when initing the PHB. If the + link is already down, we shouldn't set them as it may cause spurious + errors. + + This changes the code to only set them if the link is up. +- phb4: Mark PHB as fenced on creset + + If we have to inject an error to trigger recovery, we end up not + marking the PHB as fenced in the PHB struct. This fixes that. +- phb4: Clear errors before deasserting reset + + During reset we may have logged some errors (eg. due to the link going + down). + + Hence before we deassert PERST or Hot Reset, we need to clear these + errors. This ensures that once link training starts, only new errors + are logged. +- phb4: Disable device config space access when fenced + + On DD2 you can't access device config space when fenced, so just + disable access whenever we are fenced. +- phb4: Dump devctl and devstat registers + + Dump devctl and devstat registers. These would have been useful when + debugging the MPS issue. +- phb4: Only clear some PHB config space registers on errors + + Currently on error we clear the entire PHB config space. This is a + problem as the PCIe Maximum Payload Size (MPS) negotiation may have + already occurred. Clearing MPS in the PHB back to a default of 128 + bytes will result in an error for a device which already has a larger MPS + configured. + + This will manifest itself as an error due to a malformed TLP packet. ie. + ``phbPblErrorStatus bit 41 = "Malformed TLP error"`` + + This has been seen after kexec with some adapters.
+ + This fixes the problem by only clearing a subset of registers on a phb + error. + +Utilities +--------- +- external/xscom-utils: Add ``--list-bits`` + + When using getscom/putscom it's helpful to know what bits are set in the + register. This patch adds an option to print out which bits are set + along with the value that was read/written to the register. Note that + this output indicates which bits are set using the IBM bit ordering + since that's what the XSCOM documentation uses. + + +opal-prd +-------- + +- opal-prd: Do not pass pnor file while starting daemon. + + This change to the included systemd init file means opal-prd can + start and run on IBM FSP based systems. + + We do not have pnor support on all systems. Also we have logic to + autodetect PNOR. Hence do not pass ``--pnor`` by default. + +- opal-prd: Disable pnor access interface on FSP system + + On FSP systems the host does not have access to PNOR. Hence disable PNOR + access interfaces. + +OPAL Sensors +------------ +- sensor-groups : occ: Add 'ops' DT property + + Add new device-tree property 'ops' to define different operations + supported on each sensor-group. + +- OCC: Map OCC sensor to a chip-id + + Parse device tree to get chip-id for OCC sensor. + +- HDAT: Add chip-id property to ipmi sensors + + Presently we do not have a way to map sensor to chip id. Hence we are + always passing chip id 0 for occ_reset request (see occ_sensor_id_to_chip()). + + This patch adds chip-id property to sensors (whenever it's available) so that + we can map occ sensor to chip-id and pass valid chip-id to occ_reset request. + +- xive: Check for valid PIR index when decoding + + This fixes an unlikely but possible assert() fail on kdump. + +- sensors: occ: Skip the deconfigured core sensors + + This patch skips the deconfigured cores from the core sensors while + parsing the sensor names in the main memory as these sensor values are + not updated by OCC.
+ +Tests +----- +- hdata_to_dt: use a realistic PVR and chip revision + +- nx: PR_INFO that NX RNG and Crypto not yet supported on POWER9 + +- external/pflash: Add tests +- external/pflash: Reinstate the progress bars + + Recent work did some optimising which unfortunately removed some of the + progress bars in pflash. + + It turns out that there's only one thing people prefer to correctly + programmed flash chips, it is the ability to watch little equals + characters go across their screens for potentially minutes. +- external/pflash: Correct erase alignment checks + + pflash should check the alignment of addresses and sizes when asked to + erase. There are two possibilities: + + 1. The user has specified sizes manually in which case pflash should + be as flexible as possible, blocklevel_smart_erase() permits this. To + prevent possible mistakes pflash will require --force to perform a + manual erase of unaligned sizes. + 2. The user used -P to specify a partition, partitions aren't + necessarily erase granule aligned anymore, blocklevel_smart_erase() can + handle this. In this case it doesn't make sense to warn/error about misalignment + since the misalignment is inherent to the FFS partition and not really + user input. + +- external/pflash: Check the result of strtoul + + Also add 0x in front of --info output to avoid a copy and paste mistake. + +- libflash/file: Break up MTD erase ioctl() calls + + Unfortunately not all drivers are created equal and several drivers on + which pflash relies block in the kernel for quite some time and ignore + signals. + + This is really only a problem if pflash is to perform large erases. So + don't; perform these ops in small chunks. + + An in kernel fix is possible in most cases but it takes time and systems + will be running older drivers for quite some time. Since sector erases + aren't significantly slower than whole chip erases there isn't much of a + performance penalty to breaking up the erase ioctl()s.
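The "Break up MTD erase ioctl() calls" entry above describes splitting one large erase into many small ones so that no single kernel call blocks for long. A minimal sketch of that idea follows. It is not the libflash implementation: ``erase_fn``, ``erase_in_chunks``, ``struct erase_log`` and ``log_erase`` are hypothetical names introduced only for illustration, and the callback stands in for whatever actually issues the erase (for example an MTD erase ioctl()), which the sketch does not call.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical callback: erase [offset, offset + len), return 0 on success. */
typedef int (*erase_fn)(void *priv, uint64_t offset, uint64_t len);

/*
 * Erase [offset, offset + len) by issuing one callback per chunk of at
 * most `chunk` bytes, instead of a single call covering the whole range.
 * Stops at the first failing chunk and returns its error code.
 */
static int erase_in_chunks(void *priv, erase_fn erase,
                           uint64_t offset, uint64_t len, uint64_t chunk)
{
    assert(erase != NULL && chunk > 0);

    while (len > 0) {
        uint64_t this_len = len < chunk ? len : chunk;
        int rc = erase(priv, offset, this_len);

        if (rc != 0)
            return rc;

        offset += this_len;
        len -= this_len;
    }
    return 0;
}

/* Stand-in backend for demonstration: records requests, erases nothing. */
struct erase_log {
    unsigned int calls;   /* number of erase requests issued */
    uint64_t bytes;       /* total bytes requested */
    uint64_t largest;     /* largest single request */
};

static int log_erase(void *priv, uint64_t offset, uint64_t len)
{
    struct erase_log *log = priv;

    (void)offset;
    log->calls++;
    log->bytes += len;
    if (len > log->largest)
        log->largest = len;
    return 0;
}
```

Calling ``erase_in_chunks(&log, log_erase, 0, total, 4096)`` with a zeroed ``struct erase_log`` issues requests of at most 4096 bytes each, so a caller gets a chance to handle signals or update a progress bar between requests. The chunk size here is an arbitrary illustrative value, not the one pflash uses.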
+ +General +------- +- opal-msg: Increase the max-async completion count by max chips possible + +- occ: Add support for OPAL-OCC command/response interface + + This patch adds support for a shared memory based command/response + interface between OCC and OPAL. In HOMER, there is an OPAL command + buffer and an OCC response buffer which is used to send inband + commands to OCC. + +- HDAT/device-tree: only add lid-type on pre-POWER9 systems + + Largely a relic of back when we had multiple entry points into OPAL depending + on which mechanism on an FSP we were using to get loaded, this isn't needed + on modern P9 as we only have one entry point (we don't do the PHYP LID hack). diff --git a/roms/skiboot/doc/release-notes/skiboot-5.8.rst b/roms/skiboot/doc/release-notes/skiboot-5.8.rst new file mode 100644 index 000000000..739585de5 --- /dev/null +++ b/roms/skiboot/doc/release-notes/skiboot-5.8.rst @@ -0,0 +1,709 @@ +.. _skiboot-5.8: + +skiboot-5.8 +=========== + +skiboot v5.8 was released on Thursday August 31st 2017. It is the first +release of skiboot 5.8, which becomes the new stable release. +It follows the 5.7 release, first released 25th July 2017. + +skiboot v5.8 contains all bug fixes as of :ref:`skiboot-5.4.6` +and :ref:`skiboot-5.1.20` (the currently maintained stable releases). We +do not currently expect to do any 5.7.x stable releases. + +For how the skiboot stable releases work, see :ref:`stable-rules` for details. + +Over :ref:`skiboot-5.7`, we have the following changes: + +New Features +------------ +- sensors: occ: Add support to clear sensor groups + + Adds a generic API to clear sensor groups. OCC inband sensor groups + such as CSM, Profiler and Job Scheduler can be cleared using this API. + It will clear the min/max of all sensors belonging to OCC sensor + groups. 
+ +- sensors: occ: Add CSM_{min/max} sensors + + HWMON's lowest/highest attribute is used by CSM agent, so map min/max + device-tree properties "sensor-data-min" and "sensor-data-max" to + the min/max of CSM. + +- sensors: occ: Add support for OCC inband sensors + + Add support to parse and export OCC inband sensors which are copied + by OCC to main memory in P9. Each OCC writes three buffers which + includes one names buffer for sensor meta data and two buffers for + sensor readings. While OCC writes to one buffer the sensor values + can be read from the other buffer. The sensors are updated every + 100ms. + + This patch adds power, temperature, current and voltage sensors to + ``/ibm,opal/sensors`` device-tree node which can be exported by the + ibmpowernv-hwmon driver in Linux. + +- psr: occ: Add support to change power-shifting-ratio + + Add support to set the CPU-GPU power shifting ratio which is used by + the OCC power capping algorithm. PSR value of 100 takes all power away + from CPU first and a PSR value of 0 caps GPU first. + +- powercap: occ: Add a generic powercap framework + + This patch adds a generic powercap framework and exports OCC powercap + sensors using which system powercap can be set inband through OPAL-OCC + command-response interface. +- phb4: Enable PCI peer-to-peer + + P9 supports PCI peer-to-peer: a PCI device can write directly to the + mmio space of another PCI device. It completely by-passes the CPU. + + It requires some configuration on the PHBs involved: + + 1. on the initiating side, the address for the read/write operation is + in the mmio space of the target, i.e. well outside the range normally + allowed. So we disable range-checking on the TVT entry in bypass mode. + + 2. on the target side, we need to explicitly enable p2p by setting a + bit in a configuration register. It has the side-effect of reserving + an outbound (as seen from the CPU) store queue for p2p. 
Therefore we + only enable p2p on the PHBs using it, as we don't want to waste the + resource if we don't have to. + + P9 supports p2p mmio writes. Reads are currently only supported if the + two devices are under the same PHB but that is expected to change in + the future, and it raises questions about intermediate switches + configuration, so we report an error for the time being. + + The patch adds a new OPAL call to allow the OS to declare a p2p + (initiator, target) pair. + +- NX 842 and GZIP support on POWER9 + + +POWER9 DD2 +---------- + +Further support for POWER9 DD2 revision chips. Notable changes include: + +- xscom: Grab P9 DD2 revision level +- vas: Set mmio enable bits in DD2 + + POWER9 DD2 added some new "enable" bits that must be set for VAS to + work. These bits were unused in DD1. +- hdat: Add POWER9 DD2.0 specific pa_features + + Same as the default but with TM off. + +POWER9 +------ + +Since :ref:`skiboot-5.8-rc1`: + +- hw/npu2.c: Add ibm,nvlink-speed device-tree property + + NVLink2 links can support multiple different speeds. However the device driver + has no way of determining which speed was programmed so pass it down as a device + tree property. +- hw/npu2-hw-procedures.c: Update PHY_RESET procedure + + Newer versions of Hostboot will have various clocks powered down by default + to save power. Therefore we need to power them up before accessing the OBUS + PHY. +- p8-i2c: Fix random data corruption (POWER9 specific) + While waiting for the OCC to signal that it has finished using the I2C + master we put the master into the, poorly named, occache_dis state. + While in this state the transaction hasn't been started, but + p8_i2c_check_status() will only skip its checks when the master is in + the idle state.
Any action that cranks the I2C state machine + (interrupt, poll, etc) will call p8_i2c_check_status() and since the + master is not idle, it will check the status register, see the + transaction complete flag set and complete the i2c request without + actually doing anything. + + If the transaction was a I2C read, the resulting output will be a + zeroed data buffer. + +- hw/p8-i2c: Fix OCC locking (POWER9 specific) + + There are a few issues with the Host<->OCC I2C bus handshaking. First up, + skiboot is currently examining the wrong bit when checking if the OCC + is currently using the bus. Secondly, when we need to wait for the OCC + to release the bus we are scheduling a recovery timer to run zero + timebase ticks after the current moment so the recovery timeout handler + will run immediately after the bus was requested, which will in turn + re-schedule itself, etc, etc. There's also a race between the OCC + interrupt and the recovery handler which can result in an assertion + failure in the recovery thread. All of this is bad. + + This patch addresses all these issues and sets the recovery timeout to + 10ms. +- vas: export chip-id to vas platform device + This is needed so VAS in the kernel can perform cpu to vas id mapping. +- slw: Modify the power9 stop0_lite latency & residency + + Currently skiboot exposes the exit-latency for stop0_lite as 200ns and + the target-residency to be 2us. + + However, the kernel cpu-idle infrastructure rounds the latency down to + whole microseconds and lists the stop0_lite latency as 0us, putting it on + par with snooze state. As a result, when the predicted latency is + small (< 1us), cpuidle will select stop0_lite instead of snooze. The + difference between these states is that snooze doesn't require an + interrupt to exit from the state, but stop0_lite does. And the value + 200ns doesn't include the interrupt latency.
+ + This shows up in the context_switch2 benchmark + (http://ozlabs.org/~anton/junkcode/context_switch2.c) where the number + of context switches per second with the stop0_lite disabled is found + to be roughly 30% more than with stop0_lite enabled. + This can be correlated with the number of times cpuidle enters + stop0_lite compared to snooze. + + Hence, bump up the exit latency of stop0_lite to 1us. Since the target + residency is chosen to be 10 times the exit latency, set the target + residency to 10us. + + With these values, we see a 50% improvement in the number of context + switches. + +Since :ref:`skiboot-5.7`: + +- Base NPU2 support on POWER9 DD2 +- hdata/i2c: Work around broken I2C array version + + Work around a bug in the I2C devices array that shows the + array version as being v2 when only the v1 data is populated. +- Recognize the 2s2u zz platform + + OPAL currently doesn't know about the 2s2u zz. It recognizes such a + box as a generic BMC machine and fails to boot. Add the 2s2u as a + supported platform. + + There will subsequently be a 2s2u-L system which may have a different + compatible property, which will need to be handled later. +- hdata/spira: POWER9 NX isn't software compatible with P7/P8 NX, don't claim so +- NX: Add P9 NX support for gzip compression engine + + Power 9 introduces NX gzip compression engine. This patch adds gzip + compression support in NX. Virtual Accelerator Switch (VAS) is used to + access NX gzip engine and the channel configuration will be done with + the receive FIFO. So RxFIFO address, logical partition ID (lpid), + process ID (pid) and thread ID (tid) are used to configure RxFIFO. + P9 NX supports high and normal priority FIFOs. Skiboot configures User + Mode Access Control (UMAC) notify match register with these values and + also enables other registers to enable / disable the engine.
+ + Creates the following device-tree entries to provide RxFIFO address, + RxFIFO size, Fifo priority, lpid, pid and tid values so that kernel + can drive P9 NX gzip engine. + + The following nodes are located under an xscom node: :: + /xscom@<xscom_addr>/nx@<nx_addr> + + /ibm,gzip-high-fifo : High priority gzip RxFIFO + /ibm,gzip-normal-fifo : Normal priority gzip RxFIFO + + Each RxFIFO node contains: + + ``compatible`` + ``ibm,p9-nx-gzip`` + ``priority`` + High or Normal + ``rx-fifo-address`` + RxFIFO address + ``rx-fifo-size`` + RxFIFO size + ``lpid`` + 0xfff (1's for 12 bits in UMAC notify match register) + ``pid`` + gzip coprocessor type + ``tid`` + counter for gzip + +- NX: Add P9 NX support for 842 compression engine + + This patch adds changes needed for 842 compression engine on power 9. + Virtual Accelerator Switch (VAS) is used to access NX 842 engine on P9 + and the channel setup will be done with receive FIFO. So RxFIFO + address, logical partition ID (lpid), process ID (pid) and thread ID + (tid) are used for this setup. p9 NX supports high and normal priority + FIFOs. skiboot is not involved to process data with 842 engine, but + configures User Mode Access Control (UMAC) notify match register with + these values and export them to kernel with device-tree entries. + + Also configure registers to setup and enable / disable the engine with + the appropriate registers. Creates the following device-tree entries to + provide RxFIFO address, RxFIFO size, Fifo priority, lpid, pid and tid + values so that kernel can drive P9 NX 842 engine.
+ + The following nodes are located under an xscom node: + ``/xscom@<xscom_addr>/nx@<nx_addr>`` + + ``/ibm,842-high-fifo`` + High priority 842 RxFIFO + ``/ibm,842-normal-fifo`` + Normal priority 842 RxFIFO + + Each RxFIFO node contains: + + ``compatible`` + ibm,p9-nx-842 + ``priority`` + High or Normal + ``rx-fifo-address`` + RxFIFO address + ``rx-fifo-size`` + RXFIFO size + ``lpid`` + 0xfff (1's for 12 bits set in UMAC notify match register) + ``pid`` + 842 coprocessor type + ``tid`` + Counter for 842 +- vas: Create MMIO device tree node + + Create a device tree node for VAS and add properties that Linux + will need to configure/use VAS. +- opal: Extract sw checkstop fir address from HDAT. + + Extract sw checkstop fir address info from HDAT and populate device tree + node ibm,sw-checkstop-fir. + + This patch is required for OPAL_CEC_REBOOT2 OPAL call to work as expected + on p9. + + With this patch a device property 'ibm,sw-checkstop-fir' is now properly + populated: :: + + # lsprop ibm,sw-checkstop-fir + ibm,sw-checkstop-fir + 05012000 0000001f + +PHB4 +---- +- hdat: Fix PCIe GEN4 lane-eq setting for DD2 + + For PCIe GEN4, DD2 uses only 1 byte per PCIe lane for the lane-eq + settings (DD1 uses 2 bytes) +- pci: Wait for CRS and switch link when restoring bus numbers + + When a complete reset occurs, after the PHB recovers it propagates a + reset down the wire to every device. At the same time, skiboot talks to + every device in order to restore the state of devices to what they were + before the reset. + + In some situations, such as devices that recovered slowly and/or were + behind a switch, skiboot attempted to access config space of the device + before the link was up and the device could respond. + + Fix this by retrying CRS until the device responds correctly, and for + devices behind a switch, making sure the switch has its link up first. 
+- pci: Track whether a PCI device is a virtual function + + This can be checked from config space, but we will need to know this when + restoring the PCI topology, and it is not always safe to access config + space during this period. +- phb4: Enhanced PCIe training tracing + + This adds more details to the PCI training tracing (aka Rick Mata + mode). It enables the PCIe Link Training and Status State + Machine (LTSSM) tracing and details on speed and link width. + + Output now looks like this when enabled (via nvram): :: + + [ 1.096995141,3] PHB#0000[0:0]: TRACE:0x0000001101000000 0ms GEN1:x16:detect + [ 1.102849137,3] PHB#0000[0:0]: TRACE:0x0000102101000000 11ms presence GEN1:x16:polling + [ 1.104341838,3] PHB#0000[0:0]: TRACE:0x0000182101000000 14ms training GEN1:x16:polling + [ 1.104357444,3] PHB#0000[0:0]: TRACE:0x00001c5101000000 14ms training GEN1:x16:recovery + [ 1.104580394,3] PHB#0000[0:0]: TRACE:0x00001c5103000000 14ms training GEN3:x16:recovery + [ 1.123259359,3] PHB#0000[0:0]: TRACE:0x00001c5104000000 51ms training GEN4:x16:recovery + [ 1.141737656,3] PHB#0000[0:0]: TRACE:0x0000144104000000 87ms presence GEN4:x16:L0 + [ 1.141752318,3] PHB#0000[0:0]: TRACE:0x0000154904000000 87ms trained GEN4:x16:L0 + [ 1.141757964,3] PHB#0000[0:0]: TRACE: Link trained. + [ 1.096834019,3] PHB#0001[0:1]: TRACE:0x0000001101000000 0ms GEN1:x16:detect + [ 1.105578525,3] PHB#0001[0:1]: TRACE:0x0000102101000000 17ms presence GEN1:x16:polling + [ 1.112763075,3] PHB#0001[0:1]: TRACE:0x0000183101000000 31ms training GEN1:x16:config + [ 1.112778956,3] PHB#0001[0:1]: TRACE:0x00001c5081000000 31ms training GEN1:x08:recovery + [ 1.113002083,3] PHB#0001[0:1]: TRACE:0x00001c5083000000 31ms training GEN3:x08:recovery + [ 1.114833873,3] PHB#0001[0:1]: TRACE:0x0000144083000000 35ms presence GEN3:x08:L0 + [ 1.114848832,3] PHB#0001[0:1]: TRACE:0x0000154883000000 35ms trained GEN3:x08:L0 + [ 1.114854650,3] PHB#0001[0:1]: TRACE: Link trained.
+ +- phb4: Fix reading wrong size registers in EEH dump + + These registers are supposed to be 16bit, and it makes part of the + register dump misleading. +- phb4: Ignore slot state if performing complete reset + + If a PHB is being completely reset, its state is about to be blown away + anyway, so if it's not in an appropriate state, creset it regardless. +- phb4: Prepare for link down when creset called from kernel + + phb4_creset() is typically called by functions that prepare the link + to go down. In cases where creset() is called directly by the kernel, + this isn't the case and it can cause issues. Prepare for link down in + creset, just like we do in freset and hreset. +- phb4: Skip attempting to fix PHBs broken on boot + + If a PHB is marked broken it didn't work on boot, and if it didn't work + on boot then there's no point trying to recover it later. +- phb4: Fix duplicate in EEH register dump +- phb4: Be more conservative on link presence timeout + + In this patch we tuned our link timing to be more aggressive: + ``cf960e2884 phb4: Improve reset and link training timing`` + + Cards should take only 32ms but unfortunately we've seen some take + up to 440ms. Hence bump our timer up to 1000ms. + + This can hurt boot times on systems where slots indicate a hotplug + status but no electrical link is present (which we've seen). Since we + have to wait 1 second between PERST and touching config space anyway, + it shouldn't hurt too much. +- phb4: Assert PERST before PHB reset + + Currently we don't assert PERST before issuing a PHB reset. This means + any link issues while resetting the PHB will be logged as errors. + + This asserts PERST before we start resetting the PHB to avoid this. +- Revert "phb4: Read PERST signal rather than assuming it's asserted" + + This reverts commit b42ff2b904165addf32e77679cebb94a08086966 + + The original patch assumes that PERST has been asserted well before (> + 250ms) we hit here (ie. during hostboot).
+ + In a subsequent patch this will no longer be the case as we need to + assert PERST during PHB reset, which may only be a few milliseconds + before we hit this code. + + Hence revert this patch. Go back to the software mechanism using + skip_perst to determine if PERST should be asserted or not. This + allows us to keep the speed optimisation on boot. +- phb4: Set REGB error enables based on link state + + Currently we always set these enables when initing the PHB. If the + link is already down, we shouldn't set them as it may cause spurious + errors. + + This changes the code to only set them if the link is up. +- phb4: Mark PHB as fenced on creset + + If we have to inject an error to trigger recovery, we end up not + marking the PHB as fenced in the PHB struct. This fixes that. +- phb4: Clear errors before deasserting reset + + During reset we may have logged some errors (eg. due to the link going + down). + + Hence before we deassert PERST or Hot Reset, we need to clear these + errors. This ensures that once link training starts, only new errors + are logged. +- phb4: Disable device config space access when fenced + + On DD2 you can't access device config space when fenced, so just + disable access whenever we are fenced. +- phb4: Dump devctl and devstat registers + + Dump devctl and devstat registers. These would have been useful when + debugging the MPS issue. +- phb4: Only clear some PHB config space registers on errors + + Currently on error we clear the entire PHB config space. This is a + problem as the PCIe Maximum Payload Size (MPS) negotiation may have + already occurred. Clearing MPS in the PHB back to a default of 128 + bytes will result in an error for a device which already has a larger MPS + configured. + + This will manifest itself as an error due to a malformed TLP packet. ie. + ``phbPblErrorStatus bit 41 = "Malformed TLP error"`` + + This has been seen after kexec with some adapters.
+ + This fixes the problem by only clearing a subset of registers on a phb + error. + +Utilities +--------- +- external/xscom-utils: Add ``--list-bits`` + + When using getscom/putscom it's helpful to know what bits are set in the + register. This patch adds an option to print out which bits are set + along with the value that was read/written to the register. Note that + this output indicates which bits are set using the IBM bit ordering + since that's what the XSCOM documentation uses. + + +opal-prd +-------- + +- opal-prd: Do not pass pnor file while starting daemon. + + This change to the included systemd init file means opal-prd can + start and run on IBM FSP based systems. + + We do not have pnor support on all systems. Also we have logic to + autodetect PNOR. Hence do not pass ``--pnor`` by default. + +- opal-prd: Disable pnor access interface on FSP system + + On FSP systems the host does not have access to PNOR. Hence disable PNOR + access interfaces. + +OPAL Sensors +------------ +- sensor-groups : occ: Add 'ops' DT property + + Add new device-tree property 'ops' to define different operations + supported on each sensor-group. + +- OCC: Map OCC sensor to a chip-id + + Parse device tree to get chip-id for OCC sensor. + +- HDAT: Add chip-id property to ipmi sensors + + Presently we do not have a way to map a sensor to a chip id. Hence we are + always passing chip id 0 for occ_reset request (see occ_sensor_id_to_chip()). + + This patch adds a chip-id property to sensors (whenever it's available) so that + we can map an occ sensor to a chip-id and pass a valid chip-id to occ_reset request. + +- xive: Check for valid PIR index when decoding + + This fixes an unlikely but possible assert() fail on kdump. + +- sensors: occ: Skip the deconfigured core sensors + + This patch skips the deconfigured cores from the core sensors while + parsing the sensor names in the main memory as these sensor values are + not updated by OCC.
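The ``--list-bits`` entry under Utilities above relies on IBM bit ordering, in which bit 0 is the most significant bit of the register. The following is a minimal sketch of that numbering only; the helper name ``ibm_bits_set`` and the sample value are assumptions made for illustration, and the sketch does not reproduce the actual getscom/putscom output format.

```python
def ibm_bits_set(value, width=64):
    """Return the IBM-numbered positions of the bits set in value.

    IBM bit ordering numbers bit 0 as the most significant bit, so for a
    64-bit register bit 63 is the least significant bit.
    """
    if value < 0 or value >> width:
        raise ValueError("value does not fit in %d bits" % width)
    # Bit n (IBM numbering) corresponds to the mask 1 << (width - 1 - n).
    return [bit for bit in range(width) if value & (1 << (width - 1 - bit))]


if __name__ == "__main__":
    # Hypothetical register value, chosen only for illustration.
    print(ibm_bits_set(0x8000000000000001))
```

With this numbering a value with only the top bit set reports bit 0, which is the opposite of the little-endian bit numbering most C code uses.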
+ +IBM FSP systems +--------------- +Since :ref:`skiboot-5.8-rc1`: + +- mktime: fix off-by-one error calling days_in_month + + From auditing all the mktime() users, there seems to be only a *very* + small window around new years day where we could possibly return + incorrect data to the OS, and even then, there would have to be FSP + reset/reload on FSP machines. I don't *think* there's an opportunity + on other machines. + +Tests +----- +Since :ref:`skiboot-5.8-rc1`: + +- travis: Debian Stretch must pass +- test kernels: link with -N +- core/test/run-msg: don't depend on unittest mem layout + +Since :ref:`skiboot-5.7`: + +- hdata_to_dt: use a realistic PVR and chip revision + +- nx: PR_INFO that NX RNG and Crypto not yet supported on POWER9 + +- external/pflash: Add tests +- external/pflash: Reinstate the progress bars + + Recent work did some optimising which unfortunately removed some of the + progress bars in pflash. + + It turns out that there's only one thing people prefer to correctly + programmed flash chips: the ability to watch little equals + characters go across their screens for potentially minutes. +- external/pflash: Correct erase alignment checks + + pflash should check the alignment of addresses and sizes when asked to + erase. There are two possibilities: + + 1. The user has specified sizes manually, in which case pflash should + be as flexible as possible, blocklevel_smart_erase() permits this. To + prevent possible mistakes pflash will require --force to perform a + manual erase of unaligned sizes. + 2. The user used -P to specify a partition, partitions aren't + necessarily erase granule aligned anymore, which blocklevel_smart_erase() can + handle. In this case it doesn't make sense to warn/error about misalignment + since the misalignment is inherent to the FFS partition and not really + user input. + +- external/pflash: Check the result of strtoul + + Also add 0x in front of --info output to avoid a copy and paste mistake.
+ +- libflash/file: Break up MTD erase ioctl() calls + + Unfortunately not all drivers are created equal and several drivers on + which pflash relies block in the kernel for quite some time and ignore + signals. + + This is really only a problem if pflash is to perform large erases. So + don't; perform these ops in small chunks. + + An in-kernel fix is possible in most cases but it takes time and systems + will be running older drivers for quite some time. Since sector erases + aren't significantly slower than whole chip erases there isn't much of a + performance penalty to breaking up the erase ioctl()s. + +General +------- + +Since :ref:`skiboot-5.8-rc1`: + +- gcov: support GCC 7.1+ +- Tests build and pass on Debian + A few things related to the Debian toolchain. + +Since :ref:`skiboot-5.7`: + +- opal-msg: Increase the max-async completion count by max chips possible + +- occ: Add support for OPAL-OCC command/response interface + + This patch adds support for a shared memory based command/response + interface between OCC and OPAL. In HOMER, there is an OPAL command + buffer and an OCC response buffer which is used to send inband + commands to OCC. + +- HDAT/device-tree: only add lid-type on pre-POWER9 systems + + Largely a relic of back when we had multiple entry points into OPAL depending + on which mechanism on an FSP we were using to get loaded, this isn't needed + on modern P9 as we only have one entry point (we don't do the PHYP LID hack).
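The "Break up MTD erase ioctl() calls" entry above describes replacing one large erase with a series of small ones so that no single kernel call blocks for long. A minimal sketch of that idea follows, assuming a caller-supplied ``erase(offset, size)`` callable that stands in for the real erase ioctl; the function name ``erase_in_chunks`` and its parameters are hypothetical and are not taken from libflash.

```python
def erase_in_chunks(erase, start, length, chunk_size):
    """Erase [start, start + length) by calling erase(offset, size) per chunk.

    `erase` is a stand-in for the real per-range erase operation; each call
    covers at most chunk_size bytes, so no single call runs for long.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    offset = start
    end = start + length
    while offset < end:
        # The final chunk may be shorter than chunk_size.
        size = min(chunk_size, end - offset)
        erase(offset, size)
        offset += size


if __name__ == "__main__":
    calls = []
    erase_in_chunks(lambda off, size: calls.append((off, size)), 0, 10, 4)
    print(calls)
```

Returning to the caller between chunks is what gives a user-space tool the chance to react to signals, which is the problem the entry describes.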
+ +Contributors +------------ + +- Processed 156 csets from 17 developers +- 1 employers found +- A total of 6888 lines added, 1089 removed (delta 5799) + +Developers with the most changesets +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +========================== === ======= +Developer # % +========================== === ======= +Cyril Bur 35 (22.4%) +Stewart Smith 32 (20.5%) +Michael Neuling 23 (14.7%) +Sukadev Bhattiprolu 11 (7.1%) +Reza Arbab 10 (6.4%) +Russell Currey 9 (5.8%) +Shilpasri G Bhat 9 (5.8%) +Oliver O'Halloran 5 (3.2%) +Haren Myneni 5 (3.2%) +Alistair Popple 4 (2.6%) +Vasant Hegde 4 (2.6%) +Nicholas Piggin 3 (1.9%) +Andrew Donnellan 2 (1.3%) +Gautham R. Shenoy 1 (0.6%) +Mahesh Salgaonkar 1 (0.6%) +Ananth N Mavinakayanahalli 1 (0.6%) +Frederic Barrat 1 (0.6%) +========================== === ======= + +Developers with the most changed lines +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +========================== ==== ======= +Developer # % +========================== ==== ======= +Shilpasri G Bhat 1935 (27.9%) +Cyril Bur 1868 (26.9%) +Stewart Smith 866 (12.5%) +Sukadev Bhattiprolu 663 (9.5%) +Haren Myneni 584 (8.4%) +Michael Neuling 384 (5.5%) +Frederic Barrat 168 (2.4%) +Reza Arbab 98 (1.4%) +Oliver O'Halloran 98 (1.4%) +Vasant Hegde 93 (1.3%) +Alistair Popple 77 (1.1%) +Russell Currey 60 (0.9%) +Mahesh Salgaonkar 28 (0.4%) +Andrew Donnellan 11 (0.2%) +Gautham R. 
Shenoy 6 (0.1%) +Nicholas Piggin 4 (0.1%) +Ananth N Mavinakayanahalli 1 (0.0%) +========================== ==== ======= + +Developers with the most signoffs +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +========================== === ======= +Developer # % +========================== === ======= +Stewart Smith 124 (97.6%) +Benjamin Herrenschmidt 2 (1.6%) +Vaidyanathan Srinivasan 1 (0.8%) +Total 127 (100%) +========================== === ======= + +Developers with the most reviews +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +=========================== == ======= +Developer # % +=========================== == ======= +Samuel Mendoza-Jonas 19 (52.8%) +Andrew Donnellan 11 (30.6%) +Vasant Hegde 2 (5.6%) +Cédric Le Goater 1 (2.8%) +Russell Currey 1 (2.8%) +Reza Arbab 1 (2.8%) +Cyril Bur 1 (2.8%) +Total 36 (100%) +=========================== == ======= + +Developers with the most test credits +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +=========================== == ======= +Developer # % +=========================== == ======= +Vasant Hegde 1 (50.0%) +Hari Bathini 1 (50.0%) +=========================== == ======= + + +Developers who gave the most tested-by credits +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +=========================== == ======= +Developer # % +=========================== == ======= +Russell Currey 1 (50.0%) +Mahesh Salgaonkar 1 (50.0%) +=========================== == ======= + +Developers with the most report credits +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +=========================== == ======= +Developer # % +=========================== == ======= +Anton Blanchard 1 (16.7%) +Mark Linimon 1 (16.7%) +Pavaman Subramaniyam 1 (16.7%) +Pridhiviraj Paidipeddi 1 (16.7%) +Rob Lippert 1 (16.7%) +Michael Neuling 1 (16.7%) +=========================== == ======= + +Developers who gave the most report credits +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +=========================== == ======= +Developer # % +=========================== == ======= +Stewart Smith 2 
(33.3%) +Michael Neuling 1 (16.7%) +Andrew Donnellan 1 (16.7%) +Cyril Bur 1 (16.7%) +Gautham R. Shenoy 1 (16.7%) +=========================== == ======= diff --git a/roms/skiboot/doc/release-notes/skiboot-5.9-rc1.rst b/roms/skiboot/doc/release-notes/skiboot-5.9-rc1.rst new file mode 100644 index 000000000..7822015bb --- /dev/null +++ b/roms/skiboot/doc/release-notes/skiboot-5.9-rc1.rst @@ -0,0 +1,529 @@ +.. _skiboot-5.9-rc1: + +skiboot-5.9-rc1 +=============== + +skiboot v5.9-rc1 was released on Wednesday October 11th 2017. It is the first +release candidate of skiboot 5.9, which will become the new stable release +of skiboot following the 5.8 release, first released August 31st 2017. + +skiboot v5.9-rc1 contains all bug fixes as of :ref:`skiboot-5.4.7` +and :ref:`skiboot-5.1.21` (the currently maintained stable releases). We +do not currently expect to do any 5.8.x stable releases. + +For how the skiboot stable releases work, see :ref:`stable-rules`. + +The current plan is to cut the final 5.9 by October 17th, with skiboot 5.9 +being for all POWER8 and POWER9 platforms in op-build v1.20 (Due October 18th). +This release will be targeted to early POWER9 systems. + +Over skiboot-5.8, we have the following changes: + +New Features +------------ + +POWER8 +^^^^^^ +- fast-reset by default (if possible) + + Currently, this is limited to POWER8 systems. + + A normal reboot will, rather than doing a full IPL, go through a + fast reboot procedure. This reduces the "reboot to petitboot" time + from minutes to a handful of seconds. + +POWER9 +^^^^^^ +- POWER9 power management during boot + + Less power should be consumed during boot. +- OPAL_SIGNAL_SYSTEM_RESET for POWER9 + + This implements OPAL_SIGNAL_SYSTEM_RESET, using scom registers to + quiesce the target thread and raise a system reset exception on it. + It has been tested on DD2 with stop0 ESL=0 and ESL=1 shallow power + saving modes.
+ + DD1 is not implemented because it is sufficiently different as to + make support difficult. +- Enable deep idle states for POWER9 + + - SLW: Add support for p9_stop_api + + p9_stop_api's are used to set SPR state on a core wakeup from a deeper + low power state. p9_stop_api uses low level platform firmware and + self-restore microcode to restore the SPRs to requested values. + + Code is taken from : + https://github.com/open-power/hostboot/tree/master/src/import/chips/p9/procedures/utils/stopreg + - SLW: Removing timebase related flags for stop4 + + When a core enters stop4, it does not lose decrementer and time base. + Hence removing flags OPAL_PM_DEC_STOP and OPAL_PM_TIMEBASE_STOP. + - SLW: Allow deep states if homer address is known + + Use a common variable has_wakeup_engine instead of has_slw to tell if + the: + - SLW image is populated in case of power8 + - CME image is populated in case of power9 + + Currently we expect CME to be loaded if homer address is known (except + for simulators) + - SLW: Configure self-restore for HRMOR + + Make a stop api call using libpore to restore HRMOR register. HRMOR needs + to be cleared so that when threads exit stop, they arrive at the Linux + system_reset vector (0x100). + - SLW: Add opal_slw_set_reg support for power9 + + This OPAL call is made from Linux to OPAL to configure values in + various SPRs after wakeup from a deep idle state. +- PHB4: CAPP recovery + + CAPP recovery is initiated when a CAPP Machine Check is detected. + The capp recovery procedure is initiated via a Hypervisor Maintenance + interrupt (HMI). + + CAPP Machine Check may arise from either an error that results in a PHB + freeze or from an internal CAPP error with CAPP checkstop FIR action. + An error that causes a PHB freeze will result in the link down signal + being asserted. The system continues running and the CAPP and PSL will + be re-initialized.
+ + This implements CAPP recovery for POWER9 systems. +- Add ``wafer-location`` property for POWER9 + + Extract wafer-location from ECID and add property under xscom node. + - bits 64:71 are the chip x location (7:0) + - bits 72:79 are the chip y location (7:0) + + Sample output: :: + + [root@wsp xscom@623fc00000000]# lsprop ecid + ecid 019a00d4 03100718 852c0000 00fd7911 + [root@wsp xscom@623fc00000000]# lsprop wafer-location + wafer-location 00000085 0000002c +- Add ``wafer-id`` property for POWER9 + + Wafer id is derived from ECID data. + - bits 4:63 are the wafer id (ten 6-bit fields, each containing a code) + + Sample output: :: + + [root@wsp xscom@623fc00000000]# lsprop ecid + ecid 019a00d4 03100718 852c0000 00fd7911 + [root@wsp xscom@623fc00000000]# lsprop wafer-id + wafer-id "6Q0DG340SO" +- Add ``ecid`` property under ``xscom`` node for POWER9. + Sample output: :: + + [root@wsp xscom@623fc00000000]# lsprop ecid + ecid 019a00d4 03100718 852c0000 00fd7911 +- Add ibm,firmware-versions device tree node + + In P8, hostboot provides a mini device tree. It contains the ``/ibm,firmware-versions`` + node which has various firmware component version details. + + In P9, OPAL builds the device tree. This patch adds support to parse the VERSION + section of PNOR and create the ``/ibm,firmware-versions`` device tree node. + + Sample output: :: + + /sys/firmware/devicetree/base/ibm,firmware-versions # lsprop . + occ "6a00709" + skiboot "v5.7-rc1-p344fb62" + buildroot "2017.02.2-7-g23118ce" + capp-ucode "9c73e9f" + petitboot "v1.4.3-p98b6d83" + sbe "02021c6" + open-power "witherspoon-v1.17-128-gf1b53c7-dirty" + .... + .... + +POWER9 +------ + +- Disable Transactional Memory on Power9 DD 2.1 + + Update pa_features_p9[] to disable TM (Transactional Memory). On DD 2.1 + TM is not usable by Linux without other workarounds, so skiboot must + disable it.
+- xscom: Do not print error message for 'chiplet offline' return values + + xscom_read/write operations return CHIPLET_OFFLINE when a chiplet is offline. + Some multicast xscom_read/write requests from HBRT result in xscom operations + on offline chiplet(s) and print the warnings below in the OPAL console: :: + + [ 135.036327572,3] XSCOM: Read failed, ret = -14 + [ 135.092689829,3] XSCOM: Read failed, ret = -14 + + Some SCOM users can deal correctly with this error code (notably opal-prd), + so the error message is (in practice) erroneous. +- IMC: Fix the core_imc_event_mask + + CORE_IMC_EVENT_MASK is a scom that contains bits to control event sampling for + different machine states for core imc. The current event-mask setting samples + events only on host kernel (hypervisor) and host userspace. + + This patch enables the sampling of events in other machine states (like guest + kernel and guest userspace). +- IMC: Update the nest_pmus array with occ/gpe microcode uav updates + + OCC/gpe nest microcode maintains the list of individual nest units + supported. Sync the recent updates to the UAV with nest_pmus array. + + For reference occ/gpe microcode link for the UAV: + https://github.com/open-power/occ/blob/master/src/occ_gpe1/gpe1_24x7.h +- Parse IOSLOT information from HDAT + + Add structure definitions that describe the physical PCIe topology of + a system and parse them into the device-tree based PCIe slot + description. +- idle: user context state loss flags fix for stop states + + The "lite" stop variants with PSSCR[ESL]=PSSCR[EC]=1 do not lose user + context, while the non-lite variants do (ESL: enable state loss). + + Some of the POWER9 idle states had these wrong. + +CAPI +^^^^ +- POWER9 DD2 update + + The CAPI initialization sequence has been updated in DD2. + This patch adapts to the changes, retaining compatibility with DD1. + The patch includes some changes to DD1 fix-ups as well.
+- Load CAPP microcode for POWER9 DD2.0 and DD2.1 +- capi: Mask Psl Credit timeout error for POWER9 + + Mask the PSL credit timeout error in CAPP FIR Mask register + bit(46). As per the h/w team this error is now deprecated and shouldn't + cause any fir-action for P9. + +NVLINK2 +^^^^^^^ + +A notable change is that we now generate the device tree description of +NVLINK based on the HDAT we get from hostboot. Since Hostboot will generate +HDAT based on VPD, you now *MUST* have correct VPD programmed or we will +*default* to a Sequoia layout, which will lead to random problems if you +are not booting a Sequoia Witherspoon planar. In the case of booting with +old VPD and/or Hostboot, we print a **giant scary warning** in order to scare you. + +- npu2: Read slot label from the HDAT link node + + Binding GPU to emulated NPU PCI devices is done using the slot labels. + Since the NPU devices do not have a patching slot node, we need to + copy the label in here. + +- npu2: Copy link speed from the npu HDAT node + + This needs to be in the PCI device node so the speed of the NVLink + can be passed to the GPU driver. +- npu2: hw-procedures: Add settings to PHY_RESET + + Set a few new values in the PHY_RESET procedure, as specified by our + updated programming guide documentation. +- Parse NVLink information from HDAT + + Add the per-chip structures that describe how the A-Bus/NVLink/OpenCAPI + phy is configured. This generates the npu@xyz nodes for each chip on + systems that support it. +- npu2: Add vendor cap for IRQ testing + + Provide a way to test recoverable data link interrupts via a new + vendor capability byte. +- npu2: Enable recoverable data link (no-stall) interrupts + + Allow the NPU2 to trigger "recoverable data link" interrupts. + +- npu2: Implement basic FLR (Function Level Reset) +- npu2: hw-procedures: Update PHY DC calibration procedure +- npu2: hw-procedures: Change rx_pr_phase_step value + +XIVE +^^^^ +- xive: Fix opal_xive_dump_tm() to access W2 properly.
+ The HW only supported limited access sizes. +- xive: Make opal_xive_allocate_irq() properly try all chips + + When requested via OPAL_XIVE_ANY_CHIP, we need to try all + chips. We first try the current one (on which the caller + sits) and if that fails, we iterate all chips until the + allocation succeeds. +- xive: Fix initialization & cleanup of HW thread contexts + + Instead of trying to "pull" everything and clear VT (which didn't + work and caused some FIRs to be set), just clear and then + set the PTER thread enable bit. This has the side effect of + completely resetting the corresponding thread context. + + This fixes the spurious XIVE FIRs reported by PRD and fircheck. +- xive: Add debug option for detecting misrouted IPI in emulation + + This is high overhead so we don't enable it by default even + in debug builds, it's also a bit messy, but it allowed me to + detect and debug a locking issue earlier so it can be useful. +- xive: Increase the interrupt "gap" on debug builds + + We normally allocate IPIs from 0x10. Make that 0x1000 on debug + builds to limit the chances of overlapping with Linux interrupt + numbers which makes debugging code that confuses them easier. + + Also add a warning in emulation if we get an interrupt in the + queue whose number is below the gap. +- xive: Fix locking around cache scrub & watch + + Thankfully the missing locking only affects debug code and + init code that doesn't run concurrently. Also adds a DEBUG + option that checks the lock is properly held. +- xive: Workaround HW issue with scrub facility + + Without this, we sometimes don't observe from a CPU the + values written to the ENDs or NVTs via the cache watch.
+- xive: Add exerciser for cache watch/scrub facility in DEBUG builds +- xive: Make assertion in xive_eq_for_target() more informative +- xive: Add debug code to check initial cache updates +- xive: Ensure pressure relief interrupts are disabled + + We don't use them and we hijack the VP field with their + configuration to store the EQ reference, so make sure the + kernel or guest can't turn them back on by doing MMIO + writes to ACK# +- xive: Don't try setting the reserved ACK# field in VPs + + That doesn't work, the HW doesn't implement it in the cache + watch facility anyway. +- xive: Remove useless memory barriers in VP/EQ inits + + We no longer update "live" memory structures, we use a temporary + copy on the stack and update the actual memory structure using + the cache watch, so those barriers are pointless. + +PHB4 +^^^^ +- phb4: Mask RXE_ARB: DEC Stage Valid Error + + Change the inits to mask out the RXE ARB: DEC Stage Valid Error (bit + 370). This has been a fatal error but should be informational only. + + This update will be in the next version of the phb4 workbook. +- phb4: Add additional adapter to retrain whitelist + + The single port version of the ConnectX-5 has a different device ID 0x1017. + Updated descriptions to match pciutils database. +- PHB4: Default to PCIe GEN3 on POWER9 DD2.00 + + You can use the NVRAM override for DD2.00 screened parts. +- phb4: Retrain link if degraded + + On P9 Scale Out (Nimbus) DD2.0 and Scale Up (Cumulus) DD1.0 (and + below) the PCIe PHY can lockup causing training issues. This can cause + a degradation in speed or width in ~5% of training cases (depending on + the card). This is fixed in later chip revisions. This issue can also + cause PCIe links to not train at all, but this case is already + handled. + + This patch checks if the PCIe link has trained optimally and if not, + does a full PHB reset (to fix the PHY lockup) and retrain.
+ + One complication is some devices are known to train degraded unless + device specific configuration is performed. Because of this, we only + retrain when the device is in a whitelist. All devices in the current + whitelist have been tested on a P9DSU/Boston, ZZ and Witherspoon. + + We always gather information on the link and print it in the logs even + if the card is not in the whitelist. + + For testing purposes, there's an nvram option to retry all PCIe cards and all + P9 chips when a degraded link is detected. The new option is + 'pci-retry-all=true' which can be set using: + ``nvram -p ibm,skiboot --update-config pci-retry-all=true``. + This option may increase the boot time if used on a badly behaving + card. + + +IBM FSP platforms +----------------- + +- FSP/NVRAM: Handle "get vNVRAM statistics" command + + FSP sends MBOX command (cmd : 0xEB, subcmd : 0x05, mod : 0x00) to get vNVRAM + statistics. OPAL doesn't maintain any such statistics. Hence return + FSP_STATUS_INVALID_SUBCMD. + + Fixes these messages appearing in the OPAL log: :: + + [16944.384670488,3] FSP: Unhandled message eb0500 + [16944.474110465,3] FSP: Unhandled message eb0500 + [16945.111280784,3] FSP: Unhandled message eb0500 + [16945.293393485,3] FSP: Unhandled message eb0500 +- fsp: Move common prints to trace + + These two prints just end up filling the skiboot logs on any machine + that's been booted for more than a few hours. + + They have never been useful, so make them trace level. They were: :: + SURV: Received heartbeat acknowledge from FSP + SURV: Sending the heartbeat command to FSP + +BMC based systems +----------------- +- hw/lpc-uart: read from RBR to clear character timeout interrupts + + When using the aspeed SUART, we see a condition where the UART sends + continuous character timeout interrupts. This change adds a (heavily + commented) dummy read from the RBR to clear the interrupt condition on + init.
+ + This was observed on p9dsu systems, but likely applies to other systems + using the SUART. +- astbmc: Add methods for handling Device Tree based slots + e.g. ones from HDAT on POWER9. + +General +------- +- ipmi: Convert common debug prints to trace + + OPAL logs messages for every IPMI request from host. Sometimes the OPAL console + is filled with only these messages. This path is pretty stable now and + we have enough logs to cover the bad path. Hence let's convert these debug + messages to trace/info messages. Examples are: :: + + [ 1356.423958816,7] opal_ipmi_recv(cmd: 0xf0 netfn: 0x3b resp_size: 0x02) + [ 1356.430774496,7] opal_ipmi_send(cmd: 0xf0 netfn: 0x3a len: 0x3b) + [ 1356.430797392,7] BT: seq 0x20 netfn 0x3a cmd 0xf0: Message sent to host + [ 1356.431668496,7] BT: seq 0x20 netfn 0x3a cmd 0xf0: IPMI MSG done +- libflash/file: Handle short read()s and write()s correctly + + Currently we don't move the buffer along for a short read() or write(), + nor do we request only the remaining amount. + +- hw/p8-i2c: Rework timeout handling + + Currently we treat a timeout as a hard failure and will automatically + fail any transactions that hit their timeout. This results in + unnecessarily failing I2C requests if interrupts are dropped, etc. + Although these are bad things that we should log we can handle them + better by checking the actual hardware status and completing the + transaction if there are no real errors. This patch reworks the timeout + handling to check the status and continue the transaction if it can, + while logging an error if it detects a timeout due to a + dropped interrupt. +- core/flash: Only expect ELF header for BOOTKERNEL partition flash resource + + When loading a flash resource which isn't signed (secure and trusted + boot) and which doesn't have a subpartition, we assume it's the + BOOTKERNEL since previously this was the only such resource.
Thus we + also assumed it had an ELF header which we parsed to get the size of the + partition rather than trusting the actual_size field in the FFS header. + A previous commit (9727fe3 DT: Add ibm,firmware-versions node) added the + version resource which isn't signed and also doesn't have a subpartition, + thus we expect it to have an ELF header. It doesn't so we print the + error message "FLASH: Invalid ELF header part VERSION". + + It is a fluke that this works currently since we load the secure boot + header unconditionally and this happens to be the same size as the + version partition. We also don't update the return code on error so + happen to return OPAL_SUCCESS. + + To make this explicitly correct, only check for an ELF header if we are + loading the BOOTKERNEL resource, otherwise use the partition size from + the FFS header. Also set the return code on error so we don't + erroneously return OPAL_SUCCESS. Add a check that the resource will fit + in the supplied buffer to prevent buffer overrun. +- flash: Support adding the no-erase property to flash + + The mbox protocol explicitly states that an erase is not required + before a write. This means that issuing an erase from userspace, + through the mtd device, and back returns a successful operation + that does nothing. Unfortunately, this makes userspace tools unhappy. + Linux MTD devices support the MTD_NO_ERASE flag which conveys that + writes do not require erases on the underlying flash devices. We + should set this property on all of our + devices which do not require erases to be performed. + + NOTE: This still requires a Linux kernel component to set the + MTD_NO_ERASE flag from the device tree property. + +Utilities +--------- +- external/gard: Clear entire guard partition instead of entry by entry + + When using the current implementation of the gard tool to ecc clear the + entire GUARD partition it is done one gard record at a time.
While this + may be ok when accessing the actual flash this is very slow when done + from the host over the mbox protocol (on the order of 4 minutes) because + the bmc side is required to do many read, erase, writes under the hood. + + Fix this by rewriting the gard tool reset_partition() function. Now we + allocate all the erased guard entries and (if required) apply ecc to the + entire buffer. Then we can do one big erase and write of the entire + partition. This reduces the time to clear the guard partition to on the + order of 4 seconds. +- opal-prd: Fix opal-prd command line options + + HBRT OCC reset interface depends on service processor type. + + - FSP: reset_pm_complex() + - BMC: process_occ_reset() + + We have both `occ` and `pm-complex` command line interfaces. + This patch adds support to display an appropriate message depending + on system type. + + === ==================== ============================ + SP Command Action + === ==================== ============================ + FSP opal-prd occ display error message + FSP opal-prd pm-complex Call pm_complex_reset() + BMC opal-prd occ Call process_occ_reset() + BMC opal-prd pm-complex display error message + === ==================== ============================ + +- opal-prd: detect service processor type and + then make appropriate occ reset call. +- pflash: Fix erase command for unaligned start address + + The erase_range() function handles erasing the flash for a given start + address and length, and can handle an unaligned start address and + length. However in the unaligned start address case we are incorrectly + calculating the remaining size which can lead to incomplete erases. + + If we're going to update the remaining size based on what the start + address was then we probably want to do that before we override the + original start address. So rearrange the code so that this is indeed the + case.
+- external/gard: Print an error if run on an FSP system + +Simulators +---------- + +- mambo: Add mambo socket program + + This adds a program that can be run inside a mambo simulator in linux + userspace which enables TCP sockets to be proxied in and out of the + simulator to the host. + + Unlike mambo bogusnet, it requires no linux or skiboot specific + drivers/infrastructure to run. + + Run inside the simulator: + + - to forward host ssh connections to sim ssh server: + ``./mambo-socket-proxy -h 10022 -s 22``, then connect to port 10022 + on your host with ``ssh -p 10022 localhost`` + - to allow http proxy access from inside the sim to local http proxy: + ``./mambo-socket-proxy -b proxy.mynetwork -h 3128 -s 3128`` + + Multiple connections are supported. +- idle: disable stop*_lite POWER9 idle states for Mambo platform + + Mambo prior to Mambo.7.8.21 had a bug where the stop idle instruction + with PSSCR[ESL]=PSSCR[EC]=0 would resume with MSR set as though it had + taken a system reset interrupt. + + Linux currently executes this instruction with MSR already set that + way, so the problem went unnoticed. A proposed patch to Linux changes + that, and causes the idle code to crash. Work around this by disabling + lite stop states for the mambo platform for now. diff --git a/roms/skiboot/doc/release-notes/skiboot-5.9-rc2.rst b/roms/skiboot/doc/release-notes/skiboot-5.9-rc2.rst new file mode 100644 index 000000000..ab97b087b --- /dev/null +++ b/roms/skiboot/doc/release-notes/skiboot-5.9-rc2.rst @@ -0,0 +1,246 @@ +.. _skiboot-5.9-rc2: + +skiboot-5.9-rc2 +=============== + +skiboot v5.9-rc2 was released on Monday October 16th 2017. It is the second +release candidate of skiboot 5.9, which will become the new stable release +of skiboot following the 5.8 release, first released August 31st 2017. + +skiboot v5.9-rc2 contains all bug fixes as of :ref:`skiboot-5.4.8` +and :ref:`skiboot-5.1.21` (the currently maintained stable releases).
We +do not currently expect to do any 5.8.x stable releases. + +For how the skiboot stable releases work, see :ref:`stable-rules` for details. + +The current plan is to cut the final 5.9 by October 17th, with skiboot 5.9 +being for all POWER8 and POWER9 platforms in op-build v1.20 (Due October 18th). +This release will be targetted to early POWER9 systems. + +Over :ref:`skiboot-5.9-rc1`, we have the following changes: + +- opal-prd: Fix memory leak +- hdata/i2c: update the list of known i2c devs + + This updates the list of known i2c devices - as of HDAT spec v10.5e - so + that they can be properly identified during the hdat parsing. +- hdata/i2c: log unknown i2c devices + + An i2c device is unknown if either the i2c device list is outdated or + the device is marked as unknown (0xFF) in the hdat. + +- opal/cpu: Mark the core as bad while disabling threads of the core. + + If any of the core fails to sync its TB during chipTOD initialization, + all the threads of that core are disabled. But this does not make + linux kernel to ignore the core/cpus. It crashes while bringing them up + with below backtrace: :: + + [ 38.883898] kexec_core: Starting new kernel + cpu 0x0: Vector: 300 (Data Access) at [c0000003f277b730] + pc: c0000000001b9890: internal_create_group+0x30/0x304 + lr: c0000000001b9880: internal_create_group+0x20/0x304 + sp: c0000003f277b9b0 + msr: 900000000280b033 + dar: 40 + dsisr: 40000000 + current = 0xc0000003f9f41000 + paca = 0xc00000000fe00000 softe: 0 irq_happened: 0x01 + pid = 2572, comm = kexec + Linux version 4.13.2-openpower1 (jenkins@p89) (gcc version 6.4.0 (Buildroot 2017.08-00006-g319c6e1)) #1 SMP Wed Sep 20 05:42:11 UTC 2017 + enter ? 
for help + [c0000003f277b9b0] c0000000008a8780 (unreliable) + [c0000003f277ba50] c00000000041c3ac topology_add_dev+0x2c/0x40 + [c0000003f277ba70] c00000000006b078 cpuhp_invoke_callback+0x88/0x170 + [c0000003f277bac0] c00000000006b22c cpuhp_up_callbacks+0x54/0xb8 + [c0000003f277bb10] c00000000006bc68 cpu_up+0x11c/0x168 + [c0000003f277bbc0] c00000000002f0e0 default_machine_kexec+0x1fc/0x274 + [c0000003f277bc50] c00000000002e2d8 machine_kexec+0x50/0x58 + [c0000003f277bc70] c0000000000de4e8 kernel_kexec+0x98/0xb4 + [c0000003f277bce0] c00000000008b0f0 SyS_reboot+0x1c8/0x1f4 + [c0000003f277be30] c00000000000b118 system_call+0x58/0x6c + +- hw/imc: pause microcode at boot + + IMC nest counters have both in-band (ucode) and out-of-band + access. Since not all nest counter configurations + are supported by ucode, out-of-band tools are used to characterize + other configurations. + + So it is preferable to pause the nest microcode at boot to aid the + nest out-of-band tools. If the ucode is not paused and the OS does not + have IMC driver support, then out-of-band tools will race with + ucode and end up getting undesirable values. This patch checks and + pauses the ucode at boot. + + OPAL provides APIs to control IMC counters. OPAL_IMC_COUNTERS_INIT + is used to initialize these counters at boot. OPAL_IMC_COUNTERS_START + and OPAL_IMC_COUNTERS_STOP API calls should be used to start and pause + these IMC engines. `doc/opal-api/opal-imc-counters.rst` details the + OPAL APIs and their usage. +- xive: Fix VP free block group mode false-positive parameter check + + The check to ensure the buddy allocation idx is aligned to its + allocation order was not taking into account the allocation split. + This would result in opal_xive_free_vp_block failures despite + giving the same value as returned by opal_xive_alloc_vp_block.
+ + E.g., starting then stopping 4 KVM guests gives the following pattern + in the host: :: + + opal_xive_alloc_vp_block(5)=0x45000020 + opal_xive_alloc_vp_block(5)=0x45000040 + opal_xive_alloc_vp_block(5)=0x45000060 + opal_xive_alloc_vp_block(5)=0x45000080 + opal_xive_free_vp_block(0x45000020)=-1 + opal_xive_free_vp_block(0x45000040)=0 + opal_xive_free_vp_block(0x45000060)=-1 + opal_xive_free_vp_block(0x45000080)=0 +- hw/p8-i2c: Fix deadlock in p9_i2c_bus_owner_change + + When debugging a system where Linux was taking soft lockup errors with + two CPUs stuck in OPAL: + + ======================= ============== + CPU0 CPU1 + ======================= ============== + lock + p8_i2c_recover + opal_handle_interrupt + sync_timer + cancel_timer + p9_i2c_bus_owner_change + occ_p9_interrupt + xive_source_interrupt + opal_handle_interrupt + ======================= ============== + + p8_i2c_recover() is a timer, and is stuck trying to take master->lock. + p9_i2c_bus_owner_change() has taken master->lock, but then is stuck waiting + for all timers to complete. We deadlock. + + Fix this by using cancel_timer_async(). + +- FSP/CONSOLE: Limit number of error logging + + Commit c8a7535f (FSP/CONSOLE: Workaround for unresponsive ipmi daemon) added + error logging when buffer is full. In some corner cases kernel may call this + function multiple time and we may endup logging error again and again. + + This patch fixes it by generating error log only once. + +- FSP/CONSOLE: Fix fsp_console_write_buffer_space() call + + Kernel calls fsp_console_write_buffer_space() to check console buffer space + availability. If there is enough buffer space to write data, then kernel will + call fsp_console_write() to write actual data. + + In some extreme corner cases (like one explained in commit c8a7535f) + console becomes full and this function returns 0 to kernel (or space available + in console buffer < next incoming data size). Kernel will continue retrying + until it gets enough space. 
So we will start seeing RCU stalls. + + This patch keeps track of the previously available space. If the previous space is the same + as the current space, there is not enough space in the console buffer to write incoming data. + This may be due to a very high rate of console writes and slow response from the FSP + -OR- the FSP has stopped processing data (e.g. because the ipmi daemon died). At this + point we start a timer with a timeout of SER_BUFFER_OUT_TIMEOUT (10 secs). + If the situation has not improved within 10 seconds, something has gone wrong. Let's + return OPAL_RESOURCE so that the kernel can drop the console write and continue. +- FSP/CONSOLE: Close SOL session during R/R + + Presently we are not closing SOL and FW console sessions during R/R. The host will + continue to write to the SOL buffer during FSP R/R. If there is a heavy console write + load during FSP R/R (like running the `top` command inside the console), + then at some point the console buffer becomes full. fsp_console_write_buffer_space() + returns 0 (or less than the space required to write data) to the host. While one thread + is busy writing to the console, if other threads try to write data to the console + we may see RCU stalls (like below) in the kernel.
:: + + [ 2082.828363] INFO: rcu_sched detected stalls on CPUs/tasks: { 32} (detected by 16, t=6002 jiffies, g=23154, c=23153, q=254769) + [ 2082.828365] Task dump for CPU 32: + [ 2082.828368] kworker/32:3 R running task 0 4637 2 0x00000884 + [ 2082.828375] Workqueue: events dump_work_fn + [ 2082.828376] Call Trace: + [ 2082.828382] [c000000f1633fa00] [c00000000013b6b0] console_unlock+0x570/0x600 (unreliable) + [ 2082.828384] [c000000f1633fae0] [c00000000013ba34] vprintk_emit+0x2f4/0x5c0 + [ 2082.828389] [c000000f1633fb60] [c00000000099e644] printk+0x84/0x98 + [ 2082.828391] [c000000f1633fb90] [c0000000000851a8] dump_work_fn+0x238/0x250 + [ 2082.828394] [c000000f1633fc60] [c0000000000ecb98] process_one_work+0x198/0x4b0 + [ 2082.828396] [c000000f1633fcf0] [c0000000000ed3dc] worker_thread+0x18c/0x5a0 + [ 2082.828399] [c000000f1633fd80] [c0000000000f4650] kthread+0x110/0x130 + [ 2082.828403] [c000000f1633fe30] [c000000000009674] ret_from_kernel_thread+0x5c/0x68 + + Hence lets close SOL (and FW console) during FSP R/R. +- FSP/CONSOLE: Do not associate unavailable console + + Presently OPAL sends associate/unassociate MBOX command for all + FSP serial console (like below OPAL message). We have to check + console is available or not before sending this message. :: + + [ 5013.227994012,7] FSP: Reassociating HVSI console 1 + [ 5013.227997540,7] FSP: Reassociating HVSI console 2 +- FSP: Disable PSI link whenever FSP tells OPAL about impending R/R + + Commit 42d5d047 fixed scenario where DPO has been initiated, but FSP went + into reset before the CEC power down came in. But this is generic issue + that can happen in normal shutdown path as well. + + Hence disable PSI link as soon as we detect FSP impending R/R. + +- fsp: return OPAL_BUSY_EVENT on failure sending FSP_CMD_POWERDOWN_NORM + Also, return OPAL_BUSY_EVENT on failure sending FSP_CMD_REBOOT / DEEP_REBOOT. 
+ + We had a race condition between FSP Reset/Reload and powering down + the system from the host: + + Roughly: + + == ======================== ========================================================== + # FSP Host + == ======================== ========================================================== + 1 Power on + 2 Power on + 3 (inject EPOW) + 4 (trigger FSP R/R) + 5 Processes EPOW event, starts shutting down + 6 calls OPAL_CEC_POWER_DOWN + 7 (is still in R/R) + 8 gets OPAL_INTERNAL_ERROR, spins in opal_poll_events + 9 (FSP comes back) + 10 spinning in opal_poll_events + 11 (thinks host is running) + == ======================== ========================================================== + + The call to OPAL_CEC_POWER_DOWN is only made once as the reset/reload + error path for fsp_sync_msg() is to return -1, which means we give + the OS OPAL_INTERNAL_ERROR, which is fine, except that our own API + docs give us the opportunity to return OPAL_BUSY when trying again + later may be successful, and we're ambiguous as to if you should retry + on OPAL_INTERNAL_ERROR. + + For reference, the linux code looks like this: :: + + static void __noreturn pnv_power_off(void) + { + long rc = OPAL_BUSY; + + pnv_prepare_going_down(); + + while (rc == OPAL_BUSY || rc == OPAL_BUSY_EVENT) { + rc = opal_cec_power_down(0); + if (rc == OPAL_BUSY_EVENT) + opal_poll_events(NULL); + else + mdelay(10); + } + for (;;) + opal_poll_events(NULL); + } + + Which means that *practically* our only option is to return OPAL_BUSY + or OPAL_BUSY_EVENT. + + We choose OPAL_BUSY_EVENT for FSP systems as we do want to ensure we're + running pollers to communicate with the FSP and do the final bits of + Reset/Reload handling before we power off the system. 
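The retry behaviour described in the last item can be modelled in a few lines of C. This is only a sketch, not skiboot or Linux code: `mock_cec_power_down()`, `mock_poll_events()` and `host_power_off()` are hypothetical stand-ins, the FSP Reset/Reload is reduced to a counter of polls still needed, and the numeric return codes are written from memory of the OPAL API headers and should be treated as illustrative.

```c
#include <assert.h>

/* OPAL return codes; values are illustrative, check opal-api.h. */
#define OPAL_SUCCESS          0
#define OPAL_BUSY            -2
#define OPAL_INTERNAL_ERROR -11
#define OPAL_BUSY_EVENT     -12

/* Hypothetical model state: polls still needed before the FSP
 * finishes Reset/Reload, and how many times the host polled. */
static int fsp_rr_polls_left;
static int poll_count;

/* Hypothetical model of OPAL_CEC_POWER_DOWN on an FSP system:
 * while the FSP is in R/R the message cannot be sent, so ask the
 * host to run pollers and retry instead of failing outright. */
static long mock_cec_power_down(void)
{
	if (fsp_rr_polls_left > 0)
		return OPAL_BUSY_EVENT;
	return OPAL_SUCCESS;
}

/* Hypothetical model of opal_poll_events(): each poll lets the
 * Reset/Reload handling make progress. */
static void mock_poll_events(void)
{
	poll_count++;
	if (fsp_rr_polls_left > 0)
		fsp_rr_polls_left--;
}

/* Same shape as the host loop quoted above, except that it returns
 * the final code instead of spinning forever afterwards. */
static long host_power_off(int rr_polls)
{
	long rc = OPAL_BUSY;

	assert(rr_polls >= 0);
	fsp_rr_polls_left = rr_polls;
	poll_count = 0;

	while (rc == OPAL_BUSY || rc == OPAL_BUSY_EVENT) {
		rc = mock_cec_power_down();
		if (rc == OPAL_BUSY_EVENT)
			mock_poll_events();
		/* the real host code delays here when it sees OPAL_BUSY */
	}
	return rc;
}
```

In this model, returning OPAL_INTERNAL_ERROR from `mock_cec_power_down()` would end the loop after a single attempt, which is the hang the release note describes; OPAL_BUSY_EVENT keeps the host polling until the power down request can be delivered.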
diff --git a/roms/skiboot/doc/release-notes/skiboot-5.9-rc3.rst b/roms/skiboot/doc/release-notes/skiboot-5.9-rc3.rst new file mode 100644 index 000000000..f5095fc5a --- /dev/null +++ b/roms/skiboot/doc/release-notes/skiboot-5.9-rc3.rst @@ -0,0 +1,42 @@ +.. _skiboot-5.9-rc3: + +skiboot-5.9-rc3 +=============== + +skiboot v5.9-rc3 was released on Wednesday October 18th 2017. It is the third +release candidate of skiboot 5.9, which will become the new stable release +of skiboot following the 5.8 release, first released August 31st 2017. + +skiboot v5.9-rc3 contains all bug fixes as of :ref:`skiboot-5.4.8` +and :ref:`skiboot-5.1.21` (the currently maintained stable releases). We +do not currently expect to do any 5.8.x stable releases. + +For how the skiboot stable releases work, see :ref:`stable-rules`. + +The current plan is to cut the final 5.9 by October 20th, with skiboot 5.9 +being for all POWER8 and POWER9 platforms in op-build v1.20 (Due October 18th). +This release will be targeted at early POWER9 systems. + +Over :ref:`skiboot-5.9-rc2`, we have the following changes: + + +- Improvements to vpd device tree entries + + Previously we would miss some properties. +- Revert "npu2: Add vendor cap for IRQ testing" + + This reverts commit 9817c9e29b6fe00daa3a0e4420e69a97c90eb373 which seems to + break setting the PCI dev flag and the link number in the PCIe vendor + specific config space. This leads to the device driver attempting to + re-init the DL when it shouldn't, which can cause HMIs. + +- hw/imc: Fix IMC Catalog load for DD2.X processors +- cpu: Add OPAL_REINIT_CPUS_TM_SUSPEND_DISABLED + + Add a new CPU reinit flag, "TM Suspend Disabled", which requests that + CPUs be configured so that TM (Transactional Memory) suspend mode is + disabled. + + Currently this always fails, because skiboot has no way to query the + state. A future hostboot change will add a mechanism for skiboot to + determine the status and return an appropriate error code.
diff --git a/roms/skiboot/doc/release-notes/skiboot-5.9-rc4.rst b/roms/skiboot/doc/release-notes/skiboot-5.9-rc4.rst new file mode 100644 index 000000000..609d1b671 --- /dev/null +++ b/roms/skiboot/doc/release-notes/skiboot-5.9-rc4.rst @@ -0,0 +1,47 @@ +.. _skiboot-5.9-rc4: + +skiboot-5.9-rc4 +=============== + +skiboot v5.9-rc4 was released on Thursday October 19th 2017. It is the fourth +release candidate of skiboot 5.9, which will become the new stable release +of skiboot following the 5.8 release, first released August 31st 2017. + +skiboot v5.9-rc4 contains all bug fixes as of :ref:`skiboot-5.4.8` +and :ref:`skiboot-5.1.21` (the currently maintained stable releases). We +do not currently expect to do any 5.8.x stable releases. + +For how the skiboot stable releases work, see :ref:`stable-rules`. + +The current plan is to cut the final 5.9 by October 20th, with skiboot 5.9 +being for all POWER8 and POWER9 platforms in op-build v1.20 (Due October 18th, +so we're running a bit behind there). +This release will be targeted at early POWER9 systems. + +Over :ref:`skiboot-5.9-rc3`, we have the following changes: + +- phb4: Fix PCIe GEN4 on DD2.1 and above + + In this change: + eef0e197ab PHB4: Default to PCIe GEN3 on POWER9 DD2.00 + + We clamped DD2.00 parts to GEN3 but unfortunately this change also + applied to DD2.1 and above. + + This restricts the clamp to DD2.00 only. +- occ-sensors : Add OCC inband sensor region to exports + (useful for debugging) + +Two SRESET fixes: + +- core: direct-controls: Fix clearing of special wakeup + + 'special_wakeup_count' is incremented on successfully asserting + special wakeup. So we will never clear the special wakeup if we + check for 'special_wakeup_count' being zero. Fix this issue by checking + for 'special_wakeup_count' being 1 in dctl_clear_special_wakeup(). +- core/direct-controls: increase special wakeup timeout on POWER9 + + Some instances have been observed where the special wakeup assert + times out.
The current timeout is too short for deeper sleep states. + Hostboot uses 100ms, so match that. diff --git a/roms/skiboot/doc/release-notes/skiboot-5.9-rc5.rst b/roms/skiboot/doc/release-notes/skiboot-5.9-rc5.rst new file mode 100644 index 000000000..a2beb615a --- /dev/null +++ b/roms/skiboot/doc/release-notes/skiboot-5.9-rc5.rst @@ -0,0 +1,74 @@ +.. _skiboot-5.9-rc5: + +skiboot-5.9-rc5 +=============== + +skiboot v5.9-rc5 was released on Monday October 23rd 2017 approximately +32,000ft above somewhere north of Tucson, Arizona. It is the fifth +release candidate of skiboot 5.9, which will become the new stable release +of skiboot following the 5.8 release, first released August 31st 2017. + +skiboot v5.9-rc5 contains all bug fixes as of :ref:`skiboot-5.4.8` +and :ref:`skiboot-5.1.21` (the currently maintained stable releases). We +do not currently expect to do any 5.8.x stable releases. + +For how the skiboot stable releases work, see :ref:`stable-rules`. + +The current plan is to cut the final 5.9 very shortly, with skiboot 5.9 +being for all POWER8 and POWER9 platforms in op-build v1.20 (Due October 18th, +so we're running a bit behind there). +This release will be targeted at early POWER9 systems. + +Over :ref:`skiboot-5.9-rc3`, we have the following changes: + +- opal/hmi: Workaround Power9 hw logic bug for a couple of TFMR TB errors. +- opal/hmi: Fix TB residue and HDEC parity error recovery for power9 +- phb4: Escalate freeze to fence to avoid checkstop + + Freeze events such as MMIO loads can cause the PHB to lose its + limited powerbus credits. If all credits are used, a further MMIO + will cause a checkstop. + + To work around this, we escalate the troublesome freeze events to a + fence. The fence will cause a full PHB reset which resets the powerbus + credits and avoids the checkstop. +- phb4: Update some init registers + + New inits based on next PHB4 workbook. Increases some timeouts to + avoid some spurious error conditions.
+- phb4: Enable PHB MMIO in phb4_root_port_init() + + Linux EEH flow is somewhat broken. It saves the PCIe config space of + the PHB on boot, which it then uses to restore on EEH recovery. It + does this to restore MMIO BARs and some other pieces. + + Unfortunately this save is done before any drivers are bound to + devices under the PHB. A number of other things are configured in the + PHB after drivers start, hence some configuration space settings + aren't saved correctly. These include bus master and MMIO bits in the + command register. + + Linux tried to hack around this in this linux commit + ``bf898ec5cb`` powerpc/eeh: Enable PCI_COMMAND_MASTER for PCI bridges + This sets the bus master bit but ignores the MMIO bit. + + Hence we lose MMIO after a full PHB reset. This causes the next MMIO + access to the device to fail and for us to perform a PE freeze + recovery, which still doesn't set the MMIO bit and hence we still + fail. + + This patch works around the problem by forcing MMIO on during + phb4_root_port_init(). + + With this we can recover from a PHB fence event on POWER9. +- phb4: Reduce link degraded message log level to debug + + If we hit this message we'll retry and fix the problem. If we run out + of retries and can't fix the problem, we'll still print a log message + at error level indicating a problem. +- phb4: Fix GEN3 for DD2.00 + + In this fix: ``62ac7631ae`` "phb4: Fix PCIe GEN4 on DD2.1 and above", + we fixed DD2.1 GEN4 but broke DD2.00 as GEN3. + + This fixes DD2.00 back to GEN3. This time for sure! diff --git a/roms/skiboot/doc/release-notes/skiboot-5.9.1.rst b/roms/skiboot/doc/release-notes/skiboot-5.9.1.rst new file mode 100644 index 000000000..9aec74e88 --- /dev/null +++ b/roms/skiboot/doc/release-notes/skiboot-5.9.1.rst @@ -0,0 +1,29 @@ +.. _skiboot-5.9.1: + +============= +skiboot-5.9.1 +============= + +skiboot 5.9.1 was released on Tuesday November 14th, 2017. It replaces +:ref:`skiboot-5.9` as the current stable release in the 5.9.x series.
+ +Over :ref:`skiboot-5.9`, we have two NPU2 (NVLink2) fixes and two XIVE +bug fixes: + +- npu2: hw-procedures: Refactor reset_ntl procedure + + Change the implementation of reset_ntl to match the latest programming + guide documentation. +- npu2: hw-procedures: Add phy_rx_clock_sel() + + Change the RX clk mux control to be done by software instead of HW. This + avoids glitches caused by changing the mux setting. + +- xive: Fix ability to clear some EQ flags + + We could never clear "unconditional notify" and "escalate". +- xive: Update inits for DD2.0 + + This updates some inits based on information from the HW + designers. This includes enabling some new DD2.0 features + that we don't yet exploit. diff --git a/roms/skiboot/doc/release-notes/skiboot-5.9.2.rst b/roms/skiboot/doc/release-notes/skiboot-5.9.2.rst new file mode 100644 index 000000000..134fc999d --- /dev/null +++ b/roms/skiboot/doc/release-notes/skiboot-5.9.2.rst @@ -0,0 +1,80 @@ +.. _skiboot-5.9.2: + +============= +skiboot-5.9.2 +============= + +skiboot 5.9.2 was released on Thursday November 16th, 2017. It replaces +:ref:`skiboot-5.9.1` as the current stable release in the 5.9.x series. + +Over :ref:`skiboot-5.9.1`, we have a few PHB4 (PCI) fixes, an i2c fix for +POWER9 platforms to avoid conflicting with the OCC use and an important +NPU2 (NVLink2) fix. + +- phb4: Fix lane equalisation setting + + Fix cut and paste from phb3. The sizes have changed now that we have GEN4, + so the check here needs to change as well. + + Without this we end up with the default settings (all '7') rather + than what's in HDAT. + +- phb4: Fix PE mapping of M32 BAR + + The M32 BAR is the PHB4 region used to map all the non-prefetchable + or 32-bit device BARs. It's supposed to have its segments remapped + via the MDT and Linux relies on that to assign them individual PE#. + + However, we weren't configuring that properly and instead used the + mode where PE# == segment#, thus causing EEH to freeze the wrong + device or PE#.
+- phb4: Fix lost bit in PE number on config accesses + + A PE number can be up to 9 bits, using a uint8_t won't fly.. + + That was causing error on config accesses to freeze the + wrong PE. +- phb4: Update inits + + New init value from HW folks for the fence enable register. + + This clears bit 17 (CFG Write Error CA or UR response) and bit 22 (MMIO Write + DAT_ERR Indication) and sets bit 21 (MMIO CFG Pending Error) +- npu2: Move to new GPU memory map + + There are three different ways we configure the MCD and memory map. + + 1) Old way (current way) + Skiboot configures the MCD and puts GPUs at 4TB and below + 2) New way with MCD + Hostboot configures the MCD and skiboot puts GPU at 4TB and above + 3) New way without MCD + No one configures the MCD and skiboot puts GPU at 4TB and below + + The change keeps option 1 and adds options 2 and 3. + + The different configurations are detected using certain scoms (see + patch). + + Option 1 will go away eventually as it's a configuration that can + cause xstops or data integrity problems. We are keeping it around to + support existing hostboot. + + Option 2 supports only 4 GPUs and 512GB of memory per socket. + + Option 3 supports 6 GPUs and 4TB of memory but may have some + performance impact. + +- p8-i2c: Don't write the watermark register at init + + On P9 the I2C master is shared with the OCC. Currently the watermark + values are set once at init time which is bad for two reasons: + + a) We don't take the OCC master lock before setting it. Which + may cause issues if the OCC is currently using the master. + b) The OCC might change the watermark levels and we need to reset + them. + + Change this so that we set the watermark value when a new transaction + is started rather than at init time. 
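The "lost bit in PE number" fix listed above comes down to plain C integer narrowing. The sketch below is not the skiboot code; `pe_via_u8()` and `pe_via_u16()` are hypothetical helpers that only show what happens when a 9-bit value passes through a `uint8_t`.

```c
#include <assert.h>
#include <stdint.h>

/* Per the release note, a PE number can be up to 9 bits (0..511). */
#define PE_NUM_BITS 9
#define PE_NUM_MAX  ((1u << PE_NUM_BITS) - 1)

/* Buggy pattern: passing the PE number through a uint8_t keeps
 * only the low 8 bits, so bit 8 is silently dropped. */
static uint16_t pe_via_u8(uint16_t pe)
{
	uint8_t narrow = (uint8_t)pe;

	return narrow;
}

/* Fixed pattern: use a type wide enough for all 9 bits. */
static uint16_t pe_via_u16(uint16_t pe)
{
	uint16_t wide = pe;

	assert(pe <= PE_NUM_MAX);
	return wide;
}
```

With PE number 0x105, the narrow path yields 0x05, so an error would be attributed to PE 5 rather than PE 261, which matches the "freeze the wrong PE" symptom described in the note.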
+ diff --git a/roms/skiboot/doc/release-notes/skiboot-5.9.3.rst b/roms/skiboot/doc/release-notes/skiboot-5.9.3.rst new file mode 100644 index 000000000..af5149a1a --- /dev/null +++ b/roms/skiboot/doc/release-notes/skiboot-5.9.3.rst @@ -0,0 +1,24 @@ +.. _skiboot-5.9.3: + +============= +skiboot-5.9.3 +============= + +skiboot 5.9.3 was released on Wednesday November 22nd, 2017. It replaces +:ref:`skiboot-5.9.2` as the current stable release in the 5.9.x series. + +Over :ref:`skiboot-5.9.2`, we have one NPU2/NVLink2 fix that causes the +machine to crash hard in the event of hardware error rather than crash +mysteriously later on whenever the NVLink2 links are used. + +That fix is: + +- npu2: hw-procedures: Add check_credits procedure + + As an immediate mitigator for a current hardware glitch, add a procedure + that can be used to validate NTL credit values. This will be called as a + safeguard to check that link training succeeded. + + Assert that things are exactly as we expect, because if they aren't, the + system will experience a catastrophic failure shortly after the start of + link traffic. diff --git a/roms/skiboot/doc/release-notes/skiboot-5.9.4.rst b/roms/skiboot/doc/release-notes/skiboot-5.9.4.rst new file mode 100644 index 000000000..a222f215e --- /dev/null +++ b/roms/skiboot/doc/release-notes/skiboot-5.9.4.rst @@ -0,0 +1,26 @@ +.. _skiboot-5.9.4: + +============= +skiboot-5.9.4 +============= + +skiboot 5.9.4 was released on Wednesday November 29th, 2017. It replaces +:ref:`skiboot-5.9.3` as the current stable release in the 5.9.x series. + +Over :ref:`skiboot-5.9.3`, we have one NPU2/NVLink2 fix that works around +a potential glitch (the one :ref:`skiboot-5.9.3` would hard crash on rather +than let a system continue to run until it mysteriously crashed later on). + +That fix is in two parts: + +- npu2: hw-procedures: Change phy_rx_clock_sel values to recover from a + potential glitch. 
+ +- npu2: hw-procedures: Manipulate IOVALID during training + + Ensure that the IOVALID bit for this brick is raised at the start of + link training, in the reset_ntl procedure. + + Then, to protect us from a glitch when the PHY clock turns off or gets + chopped, lower IOVALID for the duration of the phy_reset and + phy_rx_dccal procedures. diff --git a/roms/skiboot/doc/release-notes/skiboot-5.9.5.rst b/roms/skiboot/doc/release-notes/skiboot-5.9.5.rst new file mode 100644 index 000000000..5167367c5 --- /dev/null +++ b/roms/skiboot/doc/release-notes/skiboot-5.9.5.rst @@ -0,0 +1,77 @@ +.. _skiboot-5.9.5: + +============= +skiboot-5.9.5 +============= + +skiboot 5.9.5 was released on Wednesday December 13th, 2017. It replaces +:ref:`skiboot-5.9.4` as the current stable release in the 5.9.x series. + +Over :ref:`skiboot-5.9.4`, we have a few bug fixes, they are: + +- Fix *extremely* rare race in timer code. +- xive: Ensure VC informational FIRs are masked + + Some HostBoot versions leave those as checkstop, they are harmless + and can sometimes occur during normal operations. +- xive: Fix occasional VC checkstops in xive_reset + + The current workaround for the scrub bug described in + __xive_cache_scrub() has an issue in that it can leave + dirty invalid entries in the cache. + + When cleaning up EQs or VPs during reset, if we then + remove the underlying indirect page for these entries, + the XIVE will checkstop when trying to flush them out + of the cache. + + This replaces the existing workaround with a new pair of + workarounds for VPs and EQs: + + - The VP one does the dummy watch on another entry than + the one we scrubbed (which does the job of pushing old + stores out) using an entry that is known to be backed by + a permanent indirect page. + - The EQ one switches to a more efficient workaround + which consists of doing a non-side-effect ESB load from + the EQ's ESe control bits. 
+- io: Add load_wait() helper + + This uses the standard form twi/isync pair to ensure a load + is consumed by the core before continuing. This can be necessary + under some circumstances, for example when having the following + sequence: + + - Store reg A + - Load reg A (ensure above store pushed out) + - delay loop + - Store reg A + + I.e., a mandatory delay between 2 stores. In theory the first store + is only guaranteed to reach the device after the load from the same + location has completed. However the processor will start executing + the delay loop without waiting for the return value from the load. + + This construct enforces that the delay loop isn't executed until + the load value has been returned. +- xive: Do not return a trigger page for an escalation interrupt + + This is bogus, we don't support them. (Thankfully the callers + didn't actually try to use this on escalation interrupts). +- xive: Mark a freed IRQ's IVE as valid and masked + + Removing the valid bit means a FIR will trip if it's accessed + inadvertently. Under some circumstances, the XIVE will speculatively + access an IVE for a masked interrupt and trip it. So make sure that + freed entries are still marked valid (but masked). +- hw/nx: Fix NX BAR assignments + + The NX rng BAR is used by each core to source random numbers for the + DARN instruction. Currently we configure each core to use the NX rng of + the chip that it exists on. Unfortunately, the NX can be deconfigured by + hostboot and in this case we need to use the NX of a different chip. + + This patch moves the BAR assignments for the NX into the normal nx-rng + init path. This lets us check if the normal (chip local) NX is active + when configuring which NX a core should use so that we can fall back + gracefully.
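The fallback described in the hw/nx item can be sketched as a small selection function. This is a model only: `struct chip_model` and `pick_nx_chip()` are hypothetical names, and choosing the first chip with an active NX as the fallback is an assumption, since the note does not say how the alternative chip is chosen.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified view of a chip for this sketch. */
struct chip_model {
	int id;
	bool nx_active;	/* false if hostboot deconfigured the NX */
};

/* Return the id of the chip whose NX rng a core on chip local_id
 * should use, or -1 if no chip has an active NX. */
static int pick_nx_chip(const struct chip_model *chips, size_t n,
			int local_id)
{
	size_t i;

	assert(chips != NULL || n == 0);

	/* Prefer the chip-local NX when it is active. */
	for (i = 0; i < n; i++)
		if (chips[i].id == local_id && chips[i].nx_active)
			return chips[i].id;

	/* Assumed policy: otherwise take the first active NX. */
	for (i = 0; i < n; i++)
		if (chips[i].nx_active)
			return chips[i].id;

	return -1;
}
```

Doing this selection in the init path, after the set of active NX units is known, is what allows the chip-local check to happen before any BAR is assigned.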
diff --git a/roms/skiboot/doc/release-notes/skiboot-5.9.6.rst b/roms/skiboot/doc/release-notes/skiboot-5.9.6.rst new file mode 100644 index 000000000..0be7c5300 --- /dev/null +++ b/roms/skiboot/doc/release-notes/skiboot-5.9.6.rst @@ -0,0 +1,30 @@ +.. _skiboot-5.9.6: + +============= +skiboot-5.9.6 +============= + +skiboot 5.9.6 was released on Friday December 15th, 2017. It replaces +:ref:`skiboot-5.9.5` as the current stable release in the 5.9.x series. + +Over :ref:`skiboot-5.9.5`, we have a few bug fixes, they are: + +- sensors: occ: Skip counter type of sensors + + Don't add counter type of sensors to device-tree as they don't + fit into hwmon sensor interface. +- p9_stop_api updates to support IMC across deep stop states. +- opal/xscom: Add recovery for lost core wakeup scom failures. + + A hardware issue, where a core's response to a scom was delayed due to + thread reconfiguration, leaves the SCOM logic in a state where the + subsequent scom to that core can get errors. This affects Core + PC scom registers in the range of 20010A80-20010ABF. + + The solution is that if an xscom timeout occurs on one of the Core PC scom registers + in the range of 20010A80-20010ABF, a clearing scom write is done to + 0x20010800 with data of '0x00000000', which will also get a timeout but + clears the scom logic errors. After the clearing write is done the original + scom operation can be retried. + + The scom timeout is reported as status 0x4 (Invalid address) in HMER[21-23]. diff --git a/roms/skiboot/doc/release-notes/skiboot-5.9.7.rst b/roms/skiboot/doc/release-notes/skiboot-5.9.7.rst new file mode 100644 index 000000000..398ad2d9f --- /dev/null +++ b/roms/skiboot/doc/release-notes/skiboot-5.9.7.rst @@ -0,0 +1,28 @@ +.. _skiboot-5.9.7: + +============= +skiboot-5.9.7 +============= + +skiboot 5.9.7 was released on Friday December 22nd, 2017. It replaces +:ref:`skiboot-5.9.6` as the current stable release in the 5.9.x series.
+ +Over :ref:`skiboot-5.9.6`, we have two bug fixes, they are: + +- phb4: Change PCI MMIO timers + + Currently we have a mismatch between the NCU and PCI timers for MMIO + accesses. The PCI timers must be lower than the NCU timers otherwise + it may cause checkstops. + + This changes PCI timeouts controlled by skiboot to 33-50ms. It should + be forwards and backwards compatible with expected hostboot changes to + the NCU timer. +- p8-i2c: Limit number of retry attempts + + Currently we will attempt to start an I2C transaction until it succeeds. + In the event that the OCC does not release the lock on an I2C bus this + results in an async token being held forever and the kernel thread that + started the transaction will block forever while waiting for an async + completion message. Fix this by limiting the number of attempts to + start the transaction. diff --git a/roms/skiboot/doc/release-notes/skiboot-5.9.8.rst b/roms/skiboot/doc/release-notes/skiboot-5.9.8.rst new file mode 100644 index 000000000..d8bc966f9 --- /dev/null +++ b/roms/skiboot/doc/release-notes/skiboot-5.9.8.rst @@ -0,0 +1,16 @@ +.. _skiboot-5.9.8: + +============= +skiboot-5.9.8 +============= + +skiboot-5.9.8 was released on Friday January 5th, 2018. It replaces +:ref:`skiboot-5.9.7` as the current stable release in the 5.9.x series. + +Over :ref:`skiboot-5.9.7`, we have one new feature: + +- Parse IPL FW feature settings + + Add parsing for the firmware feature flags in the HDAT. This + indicates the settings of various parameters which are set at IPL time + by firmware. diff --git a/roms/skiboot/doc/release-notes/skiboot-5.9.9.rst b/roms/skiboot/doc/release-notes/skiboot-5.9.9.rst new file mode 100644 index 000000000..81674f835 --- /dev/null +++ b/roms/skiboot/doc/release-notes/skiboot-5.9.9.rst @@ -0,0 +1,27 @@ +.. _skiboot-5.9.9: + +============= +skiboot-5.9.9 +============= + +skiboot 5.9.9 was released on Monday May 28th, 2018. 
It replaces +:ref:`skiboot-5.9.8` as the current stable release in the 5.9.x series. + +Over :ref:`skiboot-5.9.8`, we have two bug fixes and a build fix, they are: + +- OPAL_PCI_SET_POWER_STATE: fix locking in error paths + + Otherwise we could exit OPAL holding locks, potentially leading + to all sorts of problems later on. +- lpc: Clear pending IRQs at boot + + When we come in from hostboot the LPC master has the bus reset indicator + set. This error isn't handled until the host kernel unmasks interrupts, + at which point we get the following spurious error: :: + + [ 20.053560375,3] LPC: Got LPC reset on chip 0x0 ! + [ 20.053564560,3] LPC[000]: Unknown LPC error Error address reg: 0x00000000 + + Fix this by clearing the various error bits in the LPC status register + before we initialise the skiboot LPC bus driver. +- stb: Build fixes in constructing secure and trusted boot header diff --git a/roms/skiboot/doc/release-notes/skiboot-5.9.rst b/roms/skiboot/doc/release-notes/skiboot-5.9.rst new file mode 100644 index 000000000..f06aeef80 --- /dev/null +++ b/roms/skiboot/doc/release-notes/skiboot-5.9.rst @@ -0,0 +1,1181 @@ +.. _skiboot-5.9: + +skiboot-5.9 +=========== + +skiboot v5.9 was released on Tuesday October 31st 2017. It is the first +release of skiboot 5.9 and becomes the new stable release +of skiboot following the 5.8 release, first released August 31st 2017. +In this cycle we have had five release candidate releases, mostly centered +around bug fixing for POWER9 platforms. + +This release should be considered suitable for early-access POWER9 systems. + +skiboot v5.9 contains all bug fixes as of :ref:`skiboot-5.4.8` +and :ref:`skiboot-5.1.21` (the currently maintained stable releases). +There may be some 5.9.x stable releases, depending on what issues are found. + +For how the skiboot stable releases work, see :ref:`stable-rules`. 
+ +Over :ref:`skiboot-5.8`, we have the following changes: + +New Features +------------ + +POWER8 +^^^^^^ +- fast-reset by default (if possible) + + Currently, this is limited to POWER8 systems. + + A normal reboot will, rather than doing a full IPL, go through a + fast reboot procedure. This reduces the "reboot to petitboot" time + from minutes to a handful of seconds. + +POWER9 +^^^^^^ + +Since :ref:`skiboot-5.9-rc3`: + +- occ-sensors : Add OCC inband sensor region to exports + (useful for debugging) + +Two SRESET fixes (see below for feature description): + +- core: direct-controls: Fix clearing of special wakeup + + 'special_wakeup_count' is incremented on successfully asserting + special wakeup. So we will never clear the special wakeup if we + check 'special_wakeup_count' to be zero. Fix this issue by checking + the 'special_wakeup_count' to 1 in dctl_clear_special_wakeup(). +- core/direct-controls: increase special wakeup timeout on POWER9 + + Some instances have been observed where the special wakeup assert + times out. The current timeout is too short for deeper sleep states. + Hostboot uses 100ms, so match that. + + +Since :ref:`skiboot-5.9-rc2`: +- cpu: Add OPAL_REINIT_CPUS_TM_SUSPEND_DISABLED + + Add a new CPU reinit flag, "TM Suspend Disabled", which requests that + CPUs be configured so that TM (Transactional Memory) suspend mode is + disabled. + + Currently this always fails, because skiboot has no way to query the + state. A future hostboot change will add a mechanism for skiboot to + determine the status and return an appropriate error code. + +Since :ref:`skiboot-5.8`: + +- POWER9 power management during boot + + Less power should be consumed during boot. +- OPAL_SIGNAL_SYSTEM_RESET for POWER9 + + This implements OPAL_SIGNAL_SYSTEM_RESET, using scom registers to + quiesce the target thread and raise a system reset exception on it. + It has been tested on DD2 with stop0 ESL=0 and ESL=1 shallow power + saving modes. 
+ + DD1 is not implemented because it is sufficiently different as to + make support difficult. +- Enable deep idle states for POWER9 + + - SLW: Add support for p9_stop_api + + p9_stop_api's are used to set SPR state on a core wakeup from a deeper + low power state. p9_stop_api uses low level platform firmware and + self-restore microcode to restore the sprs to requested values. + + Code is taken from : + https://github.com/open-power/hostboot/tree/master/src/import/chips/p9/procedures/utils/stopreg + - SLW: Removing timebase related flags for stop4 + + When a core enters stop4, it does not lose decrementer and time base. + Hence removing flags OPAL_PM_DEC_STOP and OPAL_PM_TIMEBASE_STOP. + - SLW: Allow deep states if homer address is known + + Use a common variable has_wakeup_engine instead of has_slw to tell if + the: + - SLW image is populated in case of power8 + - CME image is populated in case of power9 + + Currently we expect CME to be loaded if homer address is known (except + for simulators) + - SLW: Configure self-restore for HRMOR + + Make a stop api call using libpore to restore HRMOR register. HRMOR needs + to be cleared so that when threads exit stop, they arrive at the Linux + system_reset vector (0x100). + - SLW: Add opal_slw_set_reg support for power9 + + This OPAL call is made from Linux to OPAL to configure values in + various SPRs after wakeup from a deep idle state. +- PHB4: CAPP recovery + + CAPP recovery is initiated when a CAPP Machine Check is detected. + The capp recovery procedure is initiated via a Hypervisor Maintenance + interrupt (HMI). + + CAPP Machine Check may arise from either an error that results in a PHB + freeze or from an internal CAPP error with CAPP checkstop FIR action. + An error that causes a PHB freeze will result in the link down signal + being asserted. The system continues running and the CAPP and PSL will + be re-initialized.
+ + This implements CAPP recovery for POWER9 systems +- Add ``wafer-location`` property for POWER9 + + Extract wafer-location from ECID and add property under xscom node. + - bits 64:71 are the chip x location (7:0) + - bits 72:79 are the chip y location (7:0) + + Sample output: :: + + [root@wsp xscom@623fc00000000]# lsprop ecid + ecid 019a00d4 03100718 852c0000 00fd7911 + [root@wsp xscom@623fc00000000]# lsprop wafer-location + wafer-location 00000085 0000002c +- Add ``wafer-id`` property for POWER9 + + Wafer id is derived from ECID data. + - bits 4:63 are the wafer id ( ten 6 bit fields each containing a code) + + Sample output: :: + + [root@wsp xscom@623fc00000000]# lsprop ecid + ecid 019a00d4 03100718 852c0000 00fd7911 + [root@wsp xscom@623fc00000000]# lsprop wafer-id + wafer-id "6Q0DG340SO" +- Add ``ecid`` property under ``xscom`` node for POWER9. + Sample output: :: + + [root@wsp xscom@623fc00000000]# lsprop ecid + ecid 019a00d4 03100718 852c0000 00fd7911 +- Add ibm,firmware-versions device tree node + + In P8, hostboot provides mini device tree. It contains ``/ibm,firmware-versions`` + node which has various firmware component version details. + + In P9, OPAL is building device tree. This patch adds support to parse VERSION + section of PNOR and create ``/ibm,firmware-versions`` device tree node. + + Sample output: :: + + /sys/firmware/devicetree/base/ibm,firmware-versions # lsprop . + occ "6a00709" + skiboot "v5.7-rc1-p344fb62" + buildroot "2017.02.2-7-g23118ce" + capp-ucode "9c73e9f" + petitboot "v1.4.3-p98b6d83" + sbe "02021c6" + open-power "witherspoon-v1.17-128-gf1b53c7-dirty" + .... + .... + +POWER9 +------ +Since :ref:`skiboot-5.9-rc5`: + +- Suppress XSCOM chiplet-offline errors on P9 + + Workaround on P9: PRD does operations it *knows* will fail with this + error to work around a hardware issue where accesses via the PIB + (FSI or OCC) work as expected, accesses via the ADU (what xscom goes + through) do not. 
The chip logic will always return all FFs if there + is any error on the scom. +- asm/head: initialize preferred DSCR value + + POWER7/8 use DSCR=0. POWER9 preferred value has "stride-N" enabled. + +Since :ref:`skiboot-5.9-rc4`: +- opal/hmi: Workaround Power9 hw logic bug for couple of TFMR TB errors. +- opal/hmi: Fix TB reside and HDEC parity error recovery for power9 + +Since :ref:`skiboot-5.9-rc2`: +- hw/imc: Fix IMC Catalog load for DD2.X processors + +Since :ref:`skiboot-5.9-rc1`: +- xive: Fix VP free block group mode false-positive parameter check + + The check to ensure the buddy allocation idx is aligned to its + allocation order was not taking into account the allocation split. + This would result in opal_xive_free_vp_block failures despite + giving the same value as returned by opal_xive_alloc_vp_block. + + E.g., starting then stopping 4 KVM guests gives the following pattern + in the host: :: + + opal_xive_alloc_vp_block(5)=0x45000020 + opal_xive_alloc_vp_block(5)=0x45000040 + opal_xive_alloc_vp_block(5)=0x45000060 + opal_xive_alloc_vp_block(5)=0x45000080 + opal_xive_free_vp_block(0x45000020)=-1 + opal_xive_free_vp_block(0x45000040)=0 + opal_xive_free_vp_block(0x45000060)=-1 + opal_xive_free_vp_block(0x45000080)=0 + +- hw/imc: pause microcode at boot + + IMC nest counters have both in-band (ucode access) and out of + band access. Since not all nest counter configurations + are supported by ucode, out of band tools are used to characterize + other configurations. + + So it is preferable to pause the nest microcode at boot to aid the + nest out of band tools. If the ucode is not paused and the OS does not + have IMC driver support, then out of band tools will race with + ucode and end up getting undesirable values. Patch to check and + pause the ucode at boot. + + OPAL provides APIs to control IMC counters. OPAL_IMC_COUNTERS_INIT + is used to initialize these counters at boot.
OPAL_IMC_COUNTERS_START + and OPAL_IMC_COUNTERS_STOP API calls should be used to start and pause + these IMC engines. `doc/opal-api/opal-imc-counters.rst` details the + OPAL APIs and their usage. +- hdata/i2c: update the list of known i2c devs + + This updates the list of known i2c devices - as of HDAT spec v10.5e - so + that they can be properly identified during the hdat parsing. +- hdata/i2c: log unknown i2c devices + + An i2c device is unknown if either the i2c device list is outdated or + the device is marked as unknown (0xFF) in the hdat. + +Since :ref:`skiboot-5.8`: + +- Disable Transactional Memory on Power9 DD 2.1 + + Update pa_features_p9[] to disable TM (Transactional Memory). On DD 2.1 + TM is not usable by Linux without other workarounds, so skiboot must + disable it. +- xscom: Do not print error message for 'chiplet offline' return values + + xscom_read/write operations returns CHIPLET_OFFLINE when chiplet is offline. + Some multicast xscom_read/write requests from HBRT results in xscom operation + on offline chiplet(s) and printing below warnings in OPAL console: :: + + [ 135.036327572,3] XSCOM: Read failed, ret = -14 + [ 135.092689829,3] XSCOM: Read failed, ret = -14 + + Some SCOM users can deal correctly with this error code (notably opal-prd), + so the error message is (in practice) erroneous. +- IMC: Fix the core_imc_event_mask + + CORE_IMC_EVENT_MASK is a scom that contains bits to control event sampling for + different machine state for core imc. The current event-mask setting sample + events only on host kernel (hypervisor) and host userspace. + + Patch to enable the sampling of events in other machine states (like guest + kernel and guest userspace). +- IMC: Update the nest_pmus array with occ/gpe microcode uav updates + + OOC/gpe nest microcode maintains the list of individual nest units + supported. Sync the recent updates to the UAV with nest_pmus array. 
+ + For reference, the occ/gpe microcode link for the UAV: + https://github.com/open-power/occ/blob/master/src/occ_gpe1/gpe1_24x7.h +- Parse IOSLOT information from HDAT + + Add structure definitions that describe the physical PCIe topology of + a system and parse them into the device-tree based PCIe slot + description. +- idle: user context state loss flags fix for stop states + + The "lite" stop variants with PSSCR[ESL]=PSSCR[EC]=1 do not lose user + context, while the non-lite variants do (ESL: enable state loss). + + Some of the POWER9 idle states had these wrong. + +CAPI +^^^^ +- POWER9 DD2 update + + The CAPI initialization sequence has been updated in DD2. + This patch adapts to the changes, retaining compatibility with DD1. + The patch includes some changes to DD1 fix-ups as well. +- Load CAPP microcode for POWER9 DD2.0 and DD2.1 +- capi: Mask Psl Credit timeout error for POWER9 + + Mask the PSL credit timeout error in CAPP FIR Mask register + bit(46). As per the h/w team this error is now deprecated and shouldn't + cause any fir-action for P9. + +NVLINK2 +^^^^^^^ + +A notable change is that we now generate the device tree description of +NVLINK based on the HDAT we get from hostboot. Since Hostboot will generate +HDAT based on VPD, you now *MUST* have correct VPD programmed or we will +*default* to a Sequoia layout, which will lead to random problems if you +are not booting a Sequoia Witherspoon planar. In the case of booting with +old VPD and/or Hostboot, we print a **giant scary warning** in order to scare you. + +Since :ref:`skiboot-5.9-rc2`: +- Revert "npu2: Add vendor cap for IRQ testing" + + This reverts commit 9817c9e29b6fe00daa3a0e4420e69a97c90eb373 which seems to + break setting the PCI dev flag and the link number in the PCIe vendor + specific config space. This leads to the device driver attempting to + re-init the DL when it shouldn't, which can cause HMIs. 
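As an aside on the CAPI item above that masks the PSL credit timeout error in bit 46 of the CAPP FIR Mask register: the sketch below shows how such a bit is commonly expressed in IBM (MSB-first) bit numbering. It assumes, without confirmation from these notes, that "bit(46)" uses that numbering and that writing a 1 to a FIR mask bit masks the error. `CAPP_FIR_MASK_PSL_CREDIT_TIMEOUT` and the helper function are hypothetical names, and `PPC_BIT` is redefined locally so the snippet stands alone.

```c
#include <stdint.h>

/*
 * IBM bit numbering: bit 0 is the most significant bit of a 64-bit
 * register. Defined locally so this example is self-contained.
 */
#define PPC_BIT(bit)	(0x8000000000000000ULL >> (bit))

/* Hypothetical name; the bit position (46) comes from the release note. */
#define CAPP_FIR_MASK_PSL_CREDIT_TIMEOUT	PPC_BIT(46)

/*
 * Return the new mask register value with the PSL credit timeout error
 * masked. Assumes a set bit in the mask register means "masked".
 */
uint64_t capp_fir_mask_psl_credit_timeout(uint64_t fir_mask)
{
	return fir_mask | CAPP_FIR_MASK_PSL_CREDIT_TIMEOUT;
}
```

In real firmware the value would be read from and written back to the register over XSCOM; that access is deliberately omitted here.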
+ +Since :ref:`skiboot-5.8`: + +- npu2: Read slot label from the HDAT link node + + Binding GPU to emulated NPU PCI devices is done using the slot labels. + Since the NPU devices do not have a matching slot node, we need to + copy the label in here. + +- npu2: Copy link speed from the npu HDAT node + + This needs to be in the PCI device node so the speed of the NVLink + can be passed to the GPU driver. +- npu2: hw-procedures: Add settings to PHY_RESET + + Set a few new values in the PHY_RESET procedure, as specified by our + updated programming guide documentation. +- Parse NVLink information from HDAT + + Add the per-chip structures that describe how the A-Bus/NVLink/OpenCAPI + phy is configured. This generates the npu@xyz nodes for each chip on + systems that support it. +- npu2: Add vendor cap for IRQ testing + + Provide a way to test recoverable data link interrupts via a new + vendor capability byte. +- npu2: Enable recoverable data link (no-stall) interrupts + + Allow the NPU2 to trigger "recoverable data link" interrupts. + +- npu2: Implement basic FLR (Function Level Reset) +- npu2: hw-procedures: Update PHY DC calibration procedure +- npu2: hw-procedures: Change rx_pr_phase_step value + +XIVE +^^^^ +- xive: Fix opal_xive_dump_tm() to access W2 properly. + The HW only supported limited access sizes. +- xive: Make opal_xive_allocate_irq() properly try all chips + + When requested via OPAL_XIVE_ANY_CHIP, we need to try all + chips. We first try the current one (on which the caller + sits) and if that fails, we iterate all chips until the + allocation succeeds. +- xive: Fix initialization & cleanup of HW thread contexts + + Instead of trying to "pull" everything and clear VT (which didn't + work and caused some FIRs to be set), just clear and then + set the PTER thread enable bit. This has the side effect of + completely resetting the corresponding thread context. 
+ + This fixes the spurious XIVE FIRs reported by PRD and fircheck. +- xive: Add debug option for detecting misrouted IPI in emulation + + This is high overhead so we don't enable it by default even + in debug builds, it's also a bit messy, but it allowed me to + detect and debug a locking issue earlier so it can be useful. +- xive: Increase the interrupt "gap" on debug builds + + We normally allocate IPIs from 0x10. Make that 0x1000 on debug + builds to limit the chances of overlapping with Linux interrupt + numbers which makes debugging code that confuses them easier. + + Also add a warning in emulation if we get an interrupt in the + queue whose number is below the gap. +- xive: Fix locking around cache scrub & watch + + Thankfully the missing locking only affects debug code and + init code that doesn't run concurrently. Also adds a DEBUG + option that checks the lock is properly held. +- xive: Workaround HW issue with scrub facility + + Without this, we sometimes don't observe from a CPU the + values written to the ENDs or NVTs via the cache watch. +- xive: Add exerciser for cache watch/scrub facility in DEBUG builds +- xive: Make assertion in xive_eq_for_target() more informative +- xive: Add debug code to check initial cache updates +- xive: Ensure pressure relief interrupts are disabled + + We don't use them and we hijack the VP field with their + configuration to store the EQ reference, so make sure the + kernel or guest can't turn them back on by doing MMIO + writes to ACK# +- xive: Don't try setting the reserved ACK# field in VPs + + That doesn't work, the HW doesn't implement it in the cache + watch facility anyway. +- xive: Remove useless memory barriers in VP/EQ inits + + We no longer update "live" memory structures, we use a temporary + copy on the stack and update the actual memory structure using + the cache watch, so those barriers are pointless. 
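The "current chip first, then every other chip" order described for opal_xive_allocate_irq() earlier in this section can be sketched as below. This is only an illustration of the search order: `struct chip_irqs`, `IRQ_NONE`, `alloc_irq_on_chip()` and `alloc_irq_any_chip()` are hypothetical and do not reflect how skiboot actually tracks interrupt numbers.

```c
#include <stddef.h>
#include <stdint.h>

#define IRQ_NONE	0u	/* hypothetical "allocation failed" value */

/* Hypothetical per-chip allocator state: a simple bump allocator. */
struct chip_irqs {
	uint32_t next;		/* next free interrupt number */
	uint32_t limit;		/* one past the last number owned by the chip */
};

/* Try to allocate one interrupt on a single chip. */
uint32_t alloc_irq_on_chip(struct chip_irqs *c)
{
	if (c->next >= c->limit)
		return IRQ_NONE;
	return c->next++;
}

/*
 * "Any chip" allocation: try the chip the caller is running on first,
 * then iterate over all remaining chips until one succeeds.
 */
uint32_t alloc_irq_any_chip(struct chip_irqs *chips, size_t nr_chips,
			    size_t current)
{
	uint32_t irq;
	size_t i;

	if (current < nr_chips) {
		irq = alloc_irq_on_chip(&chips[current]);
		if (irq != IRQ_NONE)
			return irq;
	}

	for (i = 0; i < nr_chips; i++) {
		if (i == current)
			continue;
		irq = alloc_irq_on_chip(&chips[i]);
		if (irq != IRQ_NONE)
			return irq;
	}
	return IRQ_NONE;
}
```

With two chips where the caller's chip has a single free interrupt, the first call is served locally and later calls spill over to the other chip until it is exhausted too.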
+ +PHB4 +^^^^ +Since :ref:`skiboot-5.9-rc4`: + +- phb4: Escalate freeze to fence to avoid checkstop + + Freeze events such as MMIO loads can cause the PHB to lose its + limited powerbus credits. If all credits are used, a further MMIO + will cause a checkstop. + + To work around this, we escalate the troublesome freeze events to a + fence. The fence will cause a full PHB reset which resets the powerbus + credits and avoids the checkstop. +- phb4: Update some init registers + + New inits based on next PHB4 workbook. Increases some timeouts to + avoid some spurious error conditions. +- phb4: Enable PHB MMIO in phb4_root_port_init() + + Linux EEH flow is somewhat broken. It saves the PCIe config space of + the PHB on boot, which it then uses to restore on EEH recovery. It + does this to restore MMIO bars and some other pieces. + + Unfortunately this save is done before any drivers are bound to + devices under the PHB. A number of other things are configured in the + PHB after drivers start, hence some configuration space settings + aren't saved correctly. These include bus master and MMIO bits in the + command register. + + Linux tried to hack around this in this linux commit + ``bf898ec5cb`` powerpc/eeh: Enable PCI_COMMAND_MASTER for PCI bridges + This sets the bus master bit but ignores the MMIO bit. + + Hence we lose MMIO after a full PHB reset. This causes the next MMIO + access to the device to fail and for us to perform a PE freeze + recovery, which still doesn't set the MMIO bit and hence we still + fail. + + This works around this by forcing MMIO on during + phb4_root_port_init(). + + With this we can recover from a PHB fence event on POWER9. +- phb4: Reduce link degraded message log level to debug + + If we hit this message we'll retry and fix the problem. If we run out + of retries and can't fix the problem, we'll still print a log message + at error level indicating a problem. 
+- phb4: Fix GEN3 for DD2.00 + + In this fix: ``62ac7631ae phb4: Fix PCIe GEN4 on DD2.1 and above`` + We fixed DD2.1 GEN4 but broke DD2.00 as GEN3. + + This fixes DD2.00 back to GEN3. This time for sure! + +Since :ref:`skiboot-5.9-rc3`: +- phb4: Fix PCIe GEN4 on DD2.1 and above + + In this change: + eef0e197ab PHB4: Default to PCIe GEN3 on POWER9 DD2.00 + + We clamped DD2.00 parts to GEN3 but unfortunately this change also + applies to DD2.1 and above. + + This fixes this to only apply to DD2.00. + +Since :ref:`skiboot-5.8`: + +- phb4: Mask RXE_ARB: DEC Stage Valid Error + + Change the inits to mask out the RXE ARB: DEC Stage Valid Error (bit + 370. This has been a fatal error but should be informational only. + + This update will be in the next version of the phb4 workbook. +- phb4: Add additional adapter to retrain whitelist + + The single port version of the ConnectX-5 has a different device ID 0x1017. + Updated descriptions to match pciutils database. +- PHB4: Default to PCIe GEN3 on POWER9 DD2.00 + + You can use the NVRAM override for DD2.00 screened parts. +- phb4: Retrain link if degraded + + On P9 Scale Out (Nimbus) DD2.0 and Scale Up (Cumulus) DD1.0 (and + below) the PCIe PHY can lock up, causing training issues. This can cause + a degradation in speed or width in ~5% of training cases (depending on + the card). This is fixed in later chip revisions. This issue can also + cause PCIe links to not train at all, but this case is already + handled. + + This patch checks if the PCIe link has trained optimally and if not, + does a full PHB reset (to fix the PHY lockup) and retrains. + + One complication is some devices are known to train degraded unless + device specific configuration is performed. Because of this, we only + retrain when the device is in a whitelist. All devices in the current + whitelist have been tested on a P9DSU/Boston, ZZ and Witherspoon. 
+ + We always gather information on the link and print it in the logs even + if the card is not in the whitelist. + + For testing purposes, there's an nvram to retry all PCIe cards and all + P9 chips when a degraded link is detected. The new option is + 'pci-retry-all=true' which can be set using: + `nvram -p ibm,skiboot --update-config pci-retry-all=true`. + This option may increase the boot time if used on a badly behaving + card. + + +IBM FSP platforms +----------------- + +Since :ref:`skiboot-5.9-rc5`: +- FSP/CONSOLE: Disable notification on unresponsive consoles + + Commit fd6b71fc fixed the situation where ipmi console was open (hvc0) but got + data on different console (hvc1). + + During FSP Reset/Reload OPAL closes all consoles. After Reset/Reload + complete FSP requests to open hvc1 and sends data on this. If hvc1 registration failed or not opened in host kernel then it will not read data and results in RCU stalls. + + Note that this is workaround for older kernel where we don't have separate irq + for each console. Latest kernel works fine without this patch. + +Since :ref:`skiboot-5.9-rc1`: + +- FSP/CONSOLE: Limit number of error logging + + Commit c8a7535f (FSP/CONSOLE: Workaround for unresponsive ipmi daemon) added + error logging when buffer is full. In some corner cases kernel may call this + function multiple time and we may endup logging error again and again. + + This patch fixes it by generating error log only once. + +- FSP/CONSOLE: Fix fsp_console_write_buffer_space() call + + Kernel calls fsp_console_write_buffer_space() to check console buffer space + availability. If there is enough buffer space to write data, then kernel will + call fsp_console_write() to write actual data. + + In some extreme corner cases (like one explained in commit c8a7535f) + console becomes full and this function returns 0 to kernel (or space available + in console buffer < next incoming data size). Kernel will continue retrying + until it gets enough space. 
So we will start seeing RCU stalls. + + This patch keeps track of previous available space. If previous space is same + as current means not enough space in console buffer to write incoming data. + It may be due to very high console write operation and slow response from FSP + -OR- FSP has stopped processing data (ex: because of ipmi daemon died). At this + point we will start timer with timeout of SER_BUFFER_OUT_TIMEOUT (10 secs). + If situation is not improved within 10 seconds means something went bad. Lets + return OPAL_RESOURCE so that kernel can drop console write and continue. +- FSP/CONSOLE: Close SOL session during R/R + + Presently we are not closing SOL and FW console sessions during R/R. Host will + continue to write to SOL buffer during FSP R/R. If there is heavy console write + operation happening during FSP R/R (like running `top` command inside console), + then at some point console buffer becomes full. fsp_console_write_buffer_space() + returns 0 (or less than required space to write data) to host. While one thread + is busy writing to console, if some other threads tries to write data to console + we may see RCU stalls (like below) in kernel. 
:: + + [ 2082.828363] INFO: rcu_sched detected stalls on CPUs/tasks: { 32} (detected by 16, t=6002 jiffies, g=23154, c=23153, q=254769) + [ 2082.828365] Task dump for CPU 32: + [ 2082.828368] kworker/32:3 R running task 0 4637 2 0x00000884 + [ 2082.828375] Workqueue: events dump_work_fn + [ 2082.828376] Call Trace: + [ 2082.828382] [c000000f1633fa00] [c00000000013b6b0] console_unlock+0x570/0x600 (unreliable) + [ 2082.828384] [c000000f1633fae0] [c00000000013ba34] vprintk_emit+0x2f4/0x5c0 + [ 2082.828389] [c000000f1633fb60] [c00000000099e644] printk+0x84/0x98 + [ 2082.828391] [c000000f1633fb90] [c0000000000851a8] dump_work_fn+0x238/0x250 + [ 2082.828394] [c000000f1633fc60] [c0000000000ecb98] process_one_work+0x198/0x4b0 + [ 2082.828396] [c000000f1633fcf0] [c0000000000ed3dc] worker_thread+0x18c/0x5a0 + [ 2082.828399] [c000000f1633fd80] [c0000000000f4650] kthread+0x110/0x130 + [ 2082.828403] [c000000f1633fe30] [c000000000009674] ret_from_kernel_thread+0x5c/0x68 + + Hence lets close SOL (and FW console) during FSP R/R. +- FSP/CONSOLE: Do not associate unavailable console + + Presently OPAL sends associate/unassociate MBOX command for all + FSP serial console (like below OPAL message). We have to check + console is available or not before sending this message. :: + + [ 5013.227994012,7] FSP: Reassociating HVSI console 1 + [ 5013.227997540,7] FSP: Reassociating HVSI console 2 +- FSP: Disable PSI link whenever FSP tells OPAL about impending R/R + + Commit 42d5d047 fixed scenario where DPO has been initiated, but FSP went + into reset before the CEC power down came in. But this is generic issue + that can happen in normal shutdown path as well. + + Hence disable PSI link as soon as we detect FSP impending R/R. + +- fsp: return OPAL_BUSY_EVENT on failure sending FSP_CMD_POWERDOWN_NORM + Also, return OPAL_BUSY_EVENT on failure sending FSP_CMD_REBOOT / DEEP_REBOOT. 
+ + We had a race condition between FSP Reset/Reload and powering down + the system from the host: + + Roughly: + + == ======================== ========================================================== + # FSP Host + == ======================== ========================================================== + 1 Power on + 2 Power on + 3 (inject EPOW) + 4 (trigger FSP R/R) + 5 Processes EPOW event, starts shutting down + 6 calls OPAL_CEC_POWER_DOWN + 7 (is still in R/R) + 8 gets OPAL_INTERNAL_ERROR, spins in opal_poll_events + 9 (FSP comes back) + 10 spinning in opal_poll_events + 11 (thinks host is running) + == ======================== ========================================================== + + The call to OPAL_CEC_POWER_DOWN is only made once as the reset/reload + error path for fsp_sync_msg() is to return -1, which means we give + the OS OPAL_INTERNAL_ERROR, which is fine, except that our own API + docs give us the opportunity to return OPAL_BUSY when trying again + later may be successful, and we're ambiguous as to if you should retry + on OPAL_INTERNAL_ERROR. + + For reference, the linux code looks like this: :: + + static void __noreturn pnv_power_off(void) + { + long rc = OPAL_BUSY; + + pnv_prepare_going_down(); + + while (rc == OPAL_BUSY || rc == OPAL_BUSY_EVENT) { + rc = opal_cec_power_down(0); + if (rc == OPAL_BUSY_EVENT) + opal_poll_events(NULL); + else + mdelay(10); + } + for (;;) + opal_poll_events(NULL); + } + + Which means that *practically* our only option is to return OPAL_BUSY + or OPAL_BUSY_EVENT. + + We choose OPAL_BUSY_EVENT for FSP systems as we do want to ensure we're + running pollers to communicate with the FSP and do the final bits of + Reset/Reload handling before we power off the system. + + +Since :ref:`skiboot-5.8`: + +- FSP/NVRAM: Handle "get vNVRAM statistics" command + + FSP sends MBOX command (cmd : 0xEB, subcmd : 0x05, mod : 0x00) to get vNVRAM + statistics. OPAL doesn't maintain any such statistics. 
Hence return + FSP_STATUS_INVALID_SUBCMD. + + Fixes these messages appearing in the OPAL log: :: + + [16944.384670488,3] FSP: Unhandled message eb0500 + [16944.474110465,3] FSP: Unhandled message eb0500 + [16945.111280784,3] FSP: Unhandled message eb0500 + [16945.293393485,3] FSP: Unhandled message eb0500 +- fsp: Move common prints to trace + + These two prints just end up filling the skiboot logs on any machine + that's been booted for more than a few hours. + + They have never been useful, so make them trace level. They were: :: + SURV: Received heartbeat acknowledge from FSP + SURV: Sending the heartbeat command to FSP + +BMC based systems +----------------- +- hw/lpc-uart: read from RBR to clear character timeout interrupts + + When using the aspeed SUART, we see a condition where the UART sends + continuous character timeout interrupts. This change adds a (heavily + commented) dummy read from the RBR to clear the interrupt condition on + init. + + This was observed on p9dsu systems, but likely applies to other systems + using the SUART. +- astbmc: Add methods for handing Device Tree based slots + e.g. ones from HDAT on POWER9. + +General +------- + +Since :ref:`skiboot-5.9-rc5`: + +- p8-i2c: Further timeout reworks + + This patch reworks the way timeouts are set so that rather than imposing + a hard deadline based on the transaction length it uses a + kick-the-can-down-the-road approach where the timeout will be reset each + time data is written to or received from the master. This fits better + with the actual failure modes that timeouts are designed to handle, such + as unusually slow or broken devices. + + Additionally this patch moves all the special case detection out of the + timeout handler. This is help to improve the robustness of the driver and + prepare for a more substantial rework of the driver as a whole later on. 
+- npu: Fix broken fast reset + + 0679f61244b "fast-reset: by default (if possible)" broke NPU - now + the NV links does not get enabled after reboot. + + This disables fast reboot for NPU machines till a better solution is found. + +Since :ref:`skiboot-5.9-rc2`: + +- Improvements to vpd device tree entries + + Previously we would miss some properties + +Since :ref:`skiboot-5.9-rc1`: + +- hw/p8-i2c: Fix deadlock in p9_i2c_bus_owner_change + + When debugging a system where Linux was taking soft lockup errors with + two CPUs stuck in OPAL: + + ======================= ============== + CPU0 CPU1 + ======================= ============== + lock + p8_i2c_recover + opal_handle_interrupt + sync_timer + cancel_timer + p9_i2c_bus_owner_change + occ_p9_interrupt + xive_source_interrupt + opal_handle_interrupt + ======================= ============== + + p8_i2c_recover() is a timer, and is stuck trying to take master->lock. + p9_i2c_bus_owner_change() has taken master->lock, but then is stuck waiting + for all timers to complete. We deadlock. + + Fix this by using cancel_timer_async(). +- opal/cpu: Mark the core as bad while disabling threads of the core. + + If any of the core fails to sync its TB during chipTOD initialization, + all the threads of that core are disabled. But this does not make + linux kernel to ignore the core/cpus. It crashes while bringing them up + with below backtrace: :: + + [ 38.883898] kexec_core: Starting new kernel + cpu 0x0: Vector: 300 (Data Access) at [c0000003f277b730] + pc: c0000000001b9890: internal_create_group+0x30/0x304 + lr: c0000000001b9880: internal_create_group+0x20/0x304 + sp: c0000003f277b9b0 + msr: 900000000280b033 + dar: 40 + dsisr: 40000000 + current = 0xc0000003f9f41000 + paca = 0xc00000000fe00000 softe: 0 irq_happened: 0x01 + pid = 2572, comm = kexec + Linux version 4.13.2-openpower1 (jenkins@p89) (gcc version 6.4.0 (Buildroot 2017.08-00006-g319c6e1)) #1 SMP Wed Sep 20 05:42:11 UTC 2017 + enter ? 
for help + [c0000003f277b9b0] c0000000008a8780 (unreliable) + [c0000003f277ba50] c00000000041c3ac topology_add_dev+0x2c/0x40 + [c0000003f277ba70] c00000000006b078 cpuhp_invoke_callback+0x88/0x170 + [c0000003f277bac0] c00000000006b22c cpuhp_up_callbacks+0x54/0xb8 + [c0000003f277bb10] c00000000006bc68 cpu_up+0x11c/0x168 + [c0000003f277bbc0] c00000000002f0e0 default_machine_kexec+0x1fc/0x274 + [c0000003f277bc50] c00000000002e2d8 machine_kexec+0x50/0x58 + [c0000003f277bc70] c0000000000de4e8 kernel_kexec+0x98/0xb4 + [c0000003f277bce0] c00000000008b0f0 SyS_reboot+0x1c8/0x1f4 + [c0000003f277be30] c00000000000b118 system_call+0x58/0x6c + +Since :ref:`skiboot-5.8`: + +- ipmi: Convert common debug prints to trace + + OPAL logs messages for every IPMI request from host. Sometimes the OPAL console + is filled with only these messages. This path is pretty stable now and + we have enough logs to cover the bad path. Hence let's convert these debug + messages to trace/info messages. Examples are: :: + + [ 1356.423958816,7] opal_ipmi_recv(cmd: 0xf0 netfn: 0x3b resp_size: 0x02) + [ 1356.430774496,7] opal_ipmi_send(cmd: 0xf0 netfn: 0x3a len: 0x3b) + [ 1356.430797392,7] BT: seq 0x20 netfn 0x3a cmd 0xf0: Message sent to host + [ 1356.431668496,7] BT: seq 0x20 netfn 0x3a cmd 0xf0: IPMI MSG done +- libflash/file: Handle short read()s and write()s correctly + + Currently we don't move the buffer along for a short read() or write(), + nor do we request only the remaining amount. + +- hw/p8-i2c: Rework timeout handling + + Currently we treat a timeout as a hard failure and will automatically + fail any transactions that hit their timeout. This results in + unnecessarily failing I2C requests if interrupts are dropped, etc. + Although these are bad things that we should log we can handle them + better by checking the actual hardware status and completing the + transaction if there are no real errors. This patch reworks the timeout + handling to check the status and continue the transaction
+ if it can while logging an error if it detects a timeout due to a + dropped interrupt. +- core/flash: Only expect ELF header for BOOTKERNEL partition flash resource + + When loading a flash resource which isn't signed (secure and trusted + boot) and which doesn't have a subpartition, we assume it's the + BOOTKERNEL since previously this was the only such resource. Thus we + also assumed it had an ELF header which we parsed to get the size of the + partition rather than trusting the actual_size field in the FFS header. + A previous commit (9727fe3 DT: Add ibm,firmware-versions node) added the + version resource which isn't signed and also doesn't have a subpartition, + thus we expect it to have an ELF header. It doesn't so we print the + error message "FLASH: Invalid ELF header part VERSION". + + It is a fluke that this works currently since we load the secure boot + header unconditionally and this happens to be the same size as the + version partition. We also don't update the return code on error so we + happen to return OPAL_SUCCESS. + + To make this explicitly correct, only check for an ELF header if we are + loading the BOOTKERNEL resource, otherwise use the partition size from + the FFS header. Also set the return code on error so we don't + erroneously return OPAL_SUCCESS. Add a check that the resource will fit + in the supplied buffer to prevent buffer overrun. +- flash: Support adding the no-erase property to flash + + The mbox protocol explicitly states that an erase is not required + before a write. This means that issuing an erase from userspace, + through the mtd device, and back returns a successful operation + that does nothing. Unfortunately, this makes userspace tools unhappy. + Linux MTD devices support the MTD_NO_ERASE flag which conveys that + writes do not require erases on the underlying flash devices. We + should set this property on all of our + devices which do not require erases to be performed.
+ + NOTE: This still requires a linux kernel component to set the + MTD_NO_ERASE flag from the device tree property. + +Utilities +--------- + +Since :ref:`skiboot-5.9-rc1`: +- opal-prd: Fix memory leak + +Since :ref:`skiboot-5.8`: + +- external/gard: Clear entire guard partition instead of entry by entry + + When using the current implementation of the gard tool to ecc clear the + entire GUARD partition it is done one gard record at a time. While this + may be ok when accessing the actual flash this is very slow when done + from the host over the mbox protocol (on the order of 4 minutes) because + the bmc side is required to do many read, erase, writes under the hood. + + Fix this by rewriting the gard tool reset_partition() function. Now we + allocate all the erased guard entries and (if required) apply ecc to the + entire buffer. Then we can do one big erase and write of the entire + partition. This reduces the time to clear the guard partition to on the + order of 4 seconds. +- opal-prd: Fix opal-prd command line options + + HBRT OCC reset interface depends on service processor type. + + - FSP: reset_pm_complex() + - BMC: process_occ_reset() + + We have both `occ` and `pm-complex` command line interfaces. + This patch adds support to display the appropriate message depending + on system type. + + === ==================== ============================ + SP Command Action + === ==================== ============================ + FSP opal-prd occ display error message + FSP opal-prd pm-complex Call pm_complex_reset() + BMC opal-prd occ Call process_occ_reset() + BMC opal-prd pm-complex display error message + === ==================== ============================ + +- opal-prd: detect service processor type and + then make appropriate occ reset call. +- pflash: Fix erase command for unaligned start address + + The erase_range() function handles erasing the flash for a given start + address and length, and can handle an unaligned start address and + length.
However in the unaligned start address case we are incorrectly + calculating the remaining size which can lead to incomplete erases. + + If we're going to update the remaining size based on what the start + address was then we probably want to do that before we override the + original start address. So rearrange the code so that this is indeed the + case. +- external/gard: Print an error if run on an FSP system + +Simulators +---------- + +- mambo: Add mambo socket program + + This adds a program that can be run inside a mambo simulator in linux + userspace which enables TCP sockets to be proxied in and out of the + simulator to the host. + + Unlike mambo bogusnet, it requires no linux or skiboot specific + drivers/infrastructure to run. + + Run inside the simulator: + + - to forward host ssh connections to sim ssh server: + ``./mambo-socket-proxy -h 10022 -s 22``, then connect to port 10022 + on your host with ``ssh -p 10022 localhost`` + - to allow http proxy access from inside the sim to local http proxy: + ``./mambo-socket-proxy -b proxy.mynetwork -h 3128 -s 3128`` + + Multiple connections are supported. +- idle: disable stop*_lite POWER9 idle states for Mambo platform + + Mambo prior to Mambo.7.8.21 had a bug where the stop idle instruction + with PSSCR[ESL]=PSSCR[EC]=0 would resume with MSR set as though it had + taken a system reset interrupt. + + Linux currently executes this instruction with MSR already set that + way, so the problem went unnoticed. A proposed patch to Linux changes + that, and causes the idle code to crash. Work around this by disabling + lite stop states for the mambo platform for now.
+ +Contributors +------------ + +- 209 csets from 32 developers +- 2 employers found +- A total of 9619 lines added, 1612 removed (delta 8007) + +Extending the analysis done for some previous releases, we can see our trends +in code review across versions: + +======= ====== ======== ========= ========= =========== +Release csets Ack % Reviews % Tested % Reported % +======= ====== ======== ========= ========= =========== +5.0 329 15 (5%) 20 (6%) 1 (0%) 0 (0%) +5.1 372 13 (3%) 38 (10%) 1 (0%) 4 (1%) +5.2-rc1 334 20 (6%) 34 (10%) 6 (2%) 11 (3%) +5.3-rc1 302 36 (12%) 53 (18%) 4 (1%) 5 (2%) +5.4 361 16 (4%) 28 (8%) 1 (0%) 9 (2%) +5.5 408 11 (3%) 48 (12%) 14 (3%) 10 (2%) +5.6 87 12 (14%) 6 (7%) 5 (6%) 2 (2%) +5.7 232 30 (13%) 32 (14%) 5 (2%) 2 (1%) +5.8 157 13 (8%) 36 (23%) 2 (1%) 6 (4%) +5.9 209 15 (7%) 78 (37%) 3 (1%) 10 (5%) +======= ====== ======== ========= ========= =========== + +The review count here is largely bogus, there was a series of 25 whitespace +patches that got "Reviewed-by" and if we exclude them, we're back to 14%, +which is more like what I'd expect. + + +Developers with the most changesets +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +========================== === ======= +Developer # % +========================== === ======= +Stewart Smith 28 (13.4%) +Vasant Hegde 25 (12.0%) +Joel Stanley 25 (12.0%) +Michael Neuling 24 (11.5%) +Oliver O'Halloran 20 (9.6%) +Benjamin Herrenschmidt 16 (7.7%) +Nicholas Piggin 12 (5.7%) +Akshay Adiga 8 (3.8%) +Madhavan Srinivasan 7 (3.3%) +Reza Arbab 6 (2.9%) +Mahesh Salgaonkar 3 (1.4%) +Claudio Carvalho 3 (1.4%) +Suraj Jitindar Singh 3 (1.4%) +Sam Bobroff 3 (1.4%) +Shilpasri G Bhat 2 (1.0%) +Michael Ellerman 2 (1.0%) +Andrew Donnellan 2 (1.0%) +Vaibhav Jain 2 (1.0%) +Jeremy Kerr 2 (1.0%) +Cyril Bur 2 (1.0%) +Christophe Lombard 2 (1.0%) +Daniel Black 2 (1.0%) +Alexey Kardashevskiy 1 (0.5%) +Alistair Popple 1 (0.5%) +Anton Blanchard 1 (0.5%) +Guilherme G. 
Piccoli 1 (0.5%) +John W Walthour 1 (0.5%) +Anju T Sudhakar 1 (0.5%) +Balbir Singh 1 (0.5%) +Russell Currey 1 (0.5%) +William A. Kennington III 1 (0.5%) +Sukadev Bhattiprolu 1 (0.5%) +========================== === ======= + +Developers with the most changed lines +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +========================== ==== ======= +Developer # % +========================== ==== ======= +Akshay Adiga 2731 (27.9%) +Oliver O'Halloran 1512 (15.5%) +Stewart Smith 1355 (13.9%) +Nicholas Piggin 929 (9.5%) +Vasant Hegde 827 (8.5%) +Michael Neuling 719 (7.4%) +Benjamin Herrenschmidt 522 (5.3%) +Madhavan Srinivasan 180 (1.8%) +Sam Bobroff 172 (1.8%) +Christophe Lombard 170 (1.7%) +Mahesh Salgaonkar 166 (1.7%) +Andrew Donnellan 125 (1.3%) +Joel Stanley 70 (0.7%) +Reza Arbab 64 (0.7%) +Claudio Carvalho 51 (0.5%) +Suraj Jitindar Singh 42 (0.4%) +Alistair Popple 28 (0.3%) +Jeremy Kerr 25 (0.3%) +Michael Ellerman 21 (0.2%) +Cyril Bur 18 (0.2%) +Shilpasri G Bhat 17 (0.2%) +Vaibhav Jain 8 (0.1%) +Daniel Black 6 (0.1%) +William A. Kennington III 4 (0.0%) +Sukadev Bhattiprolu 4 (0.0%) +Alexey Kardashevskiy 3 (0.0%) +John W Walthour 3 (0.0%) +Balbir Singh 3 (0.0%) +Guilherme G. 
Piccoli 2 (0.0%) +Anton Blanchard 1 (0.0%) +Anju T Sudhakar 1 (0.0%) +Russell Currey 1 (0.0%) +========================== ==== ======= + +Developers with the most lines removed +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +========================== ==== ======= +Developer # % +========================== ==== ======= +Alistair Popple 28 (1.7%) +========================== ==== ======= + +Developers with the most signoffs +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +========================== === ======= +Developer # % +========================== === ======= +Stewart Smith 180 (97.8%) +Shilpasri G Bhat 2 (1.1%) +Mukesh Ojha 1 (0.5%) +Michael Neuling 1 (0.5%) +Total 184 (100%) +========================== === ======= + +Developers with the most reviews +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +=========================== == ======= +Developer # % +=========================== == ======= +Michael Neuling 25 (32.5%) +Russell Currey 25 (32.5%) +Vaidyanathan Srinivasan 9 (11.7%) +Oliver O'Halloran 4 (5.2%) +Andrew Donnellan 3 (3.9%) +Frederic Barrat 2 (2.6%) +Suraj Jitindar Singh 2 (2.6%) +Vasant Hegde 2 (2.6%) +Andrew Jeffery 1 (1.3%) +Samuel Mendoza-Jonas 1 (1.3%) +Alexey Kardashevskiy 1 (1.3%) +Cyril Bur 1 (1.3%) +Akshay Adiga 1 (1.3%) +Total 77 (100%) +=========================== == ======= + + +Developers with the most test credits +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +=========================== == ======= +Developer # % +=========================== == ======= +Pridhiviraj Paidipeddi 3 (100.0%) +=========================== == ======= + +Developers who gave the most tested-by credits +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +=========================== == ======= +Developer # % +=========================== == ======= +Vasant Hegde 2 (66.7%) +Michael Neuling 1 (33.3%) +=========================== == ======= + + +Developers with the most report credits +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +=========================== == ======= +Developer # % 
+=========================== == ======= +Pridhiviraj Paidipeddi 6 (60.0%) +Andrew Donnellan 1 (10.0%) +Stewart Smith 1 (10.0%) +Shriya 1 (10.0%) +Robert Lippert 1 (10.0%) +Total 10 (100%) +=========================== == ======= + +Developers who gave the most report credits +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +=========================== == ======= +Developer # % +=========================== == ======= +Stewart Smith 3 (30.0%) +Suraj Jitindar Singh 3 (30.0%) +Vasant Hegde 2 (20.0%) +Michael Neuling 1 (10.0%) +Madhavan Srinivasan 1 (10.0%) +Total 10 (100%) +=========================== == ======= + +Changesets and Employers +^^^^^^^^^^^^^^^^^^^^^^^^ + +Top changeset contributors by employer: + +=========================== === ======= +Employer # % +=========================== === ======= +IBM 208 (99.5%) +Google 1 (0.5%) +=========================== === ======= + +Top lines changed by employer: + +=========================== ==== ======= +Employer # % +=========================== ==== ======= +IBM 9776 (100.0%) +Google 4 (0.0%) +=========================== ==== ======= + +Employers with the most signoffs (total 184): + +=========================== === ======= +Employer # % +=========================== === ======= +IBM 184 (100.0%) +=========================== === ======= + +Employers with the most hackers (total 32): + +=========================== === ======= +Employer # % +=========================== === ======= +IBM 31 (96.9%) +Google 1 (3.1%) +=========================== === ======= diff --git a/roms/skiboot/doc/release-notes/skiboot-6.0-rc1.rst b/roms/skiboot/doc/release-notes/skiboot-6.0-rc1.rst new file mode 100644 index 000000000..d40875745 --- /dev/null +++ b/roms/skiboot/doc/release-notes/skiboot-6.0-rc1.rst @@ -0,0 +1,869 @@ +.. _skiboot-6.0-rc1: + +skiboot-6.0-rc1 +================ + +skiboot v6.0-rc1 was released on Tuesday May 1st 2018. 
It is the first +release candidate of skiboot 6.0, which will become the new stable release +of skiboot following the 5.11 release, first released April 6th 2018. + +Skiboot 6.0 will mark the basis for op-build v2.0 and will be required for +POWER9 systems. + +skiboot v6.0-rc1 contains all bug fixes as of :ref:`skiboot-5.11`, +:ref:`skiboot-5.10.5`, and :ref:`skiboot-5.4.9` (the currently maintained +stable releases). Once 6.0 is released, we do *not* expect any further +stable releases in the 5.10.x series, nor in the 5.11.x series. + +For how the skiboot stable releases work, see :ref:`stable-rules`. + +The current plan is to cut the final 6.0 in early May, with skiboot 6.0 +being for all POWER8 and POWER9 platforms in op-build v2.0. + +Over skiboot-5.11, we have the following changes: + +New Features +------------ +- Disable stop states from OPAL + + On ZZ, stop4,5,11 are enabled for PowerVM, even though doing + so may cause problems with OPAL due to bugs in hcode. + + For other platforms, this isn't so much of an issue as + we can just control stop states by the MRW. However the + rebuild-the-world approach to changing values there is a bit + annoying if you just want to rule out a specific stop state + from being problematic. + + Provide an nvram option to override what's disabled in OPAL. + + The OPAL mask is currently ~0xE0000000 (i.e. all but stop 0,1,2) + + You can set an NVRAM override with: :: + + nvram -p ibm,skiboot --update-config opal-stop-state-disable-mask=0xFFFFFFF + + This nvram override will disable *all* stop states. +- interrupts: Create an "interrupts" property in the OPAL node + + Deprecate the old "opal-interrupts". It's still there, but the new + property follows the standard and allows us to specify whether an + interrupt is level or edge sensitive. + + Similarly create "interrupt-names" whose content is identical to + "opal-interrupts-names".
+- SBE: Add timer support on POWER9 + + SBE on P9 provides a one-shot programmable timer facility. We can use this + to implement OPAL timers and hence limit the reliance on the Linux + heartbeat (similar to the HW timer facility provided by SLW on P8). +- Add SBE driver support + + SBE (Self Boot Engine) on P9 has two different jobs: + - Boot the chip up to the point the core is functional + - Provide various services like timer, scom, stash MPIPL, etc., at runtime + + We will use SBE for various purposes like timer, MPIPL, etc. + +- opal:hmi: Add missing processor recovery reason string. + + With this patch we now see the reason string printed for the CORE_WOF[43] bit. :: + + [ 477.352234986,7] HMI: [Loc: U78D3.001.WZS004A-P1-C48]: P:8 C:22 T:3: Processor recovery occurred. + [ 477.352240742,7] HMI: Core WOF = 0x0000000000100000 recovered error: + [ 477.352242181,7] HMI: PC - Thread hang recovery +- Add DIMM actual speed to device tree + + Recent HDAT provides the actual DIMM speed. Let's add this to the device tree. +- Fix DIMM size property + + Today we parse the vpd blob to get DIMM size information. This is limited + to FSP based systems. HDAT provides the DIMM size value. Let's use that to + populate the device tree, so that we can get size information on BMC based + systems as well. + +- PCI: Set slot power limit when supported + + The PCIe slot capability can be implemented in a root or switch + downstream port to set the maximum power a card is allowed to draw + from the system. This patch adds support for setting the power limit + when the platform has defined one. +- hdata/spira: parse vpd to add part-number and serial-number to xscom@ node + + Expected by FWTS and associates our processor with the part/serial + number, which is obviously a good thing for one's own sanity. + + +Improved HMI Handling +^^^^^^^^^^^^^^^^^^^^^ + +- opal/hmi: Add documentation for opal_handle_hmi2 call +- opal/hmi: Generate hmi event for recovered HDEC parity error.
+- opal/hmi: check thread 0 tfmr to validate latched tfmr errors. + + Due to P9 errata, HDEC parity and TB residue errors are latched for + non-zero threads 1-3 even if they are cleared. But these are not + latched on thread 0. Hence, use xscom SCOMC/SCOMD to read thread 0 tfmr + value and ignore them on non-zero threads if they are not present on + thread 0. +- opal/hmi: Print additional debug information in rendezvous. +- opal/hmi: Fix handling of TFMR parity/corrupt error. + + While testing TFMR parity/corrupt error it has been observed that HMIs are + delivered twice for this error: + + - First time HMI is delivered with HMER[4,5]=1 and TFMR[60]=1. + - Second time HMI is delivered with HMER[4,5]=1 and TFMR[60]=0 with valid TB. + + On second HMI we end up throwing "HMI: TB invalid without core error + reported" even though TB is in a valid state. +- opal/hmi: Stop flooding HMI event for TOD errors. + + Fix the issue where every thread on the chip sends an HMI event to the host for + TOD errors. TOD errors are reported to all the core/threads on the chip. + Any one thread can fix the error and send the event. The rest of the threads don't + need to send an HMI event unnecessarily. +- opal/hmi: Fix soft lockups during TOD errors + + There are some TOD errors which do not affect working of TOD and TB. They + stay in valid state. Hence we don't need rendez-vous for TOD errors that + do not affect TB working. + + TOD errors that affect TOD/TB will report a global error on TFMR[44] + along with bit 51, and they will go in the rendez-vous path as expected. + + But the TOD errors that do not affect the TB register set only TFMR bit 51. + The TFMR bit 51 is cleared when any single thread clears the TOD error. + Once cleared, the bit 51 is reflected to all the cores on that chip. Any + thread that reads the TFMR register after the error is cleared will see + TFMR bit 51 reset.
Hence the threads that see TFMR[51]=1 fall through the + rendez-vous path and threads that see TFMR[51]=0 return doing + nothing. This ends up in soft lockups in the host kernel. + + This patch fixes this issue by not considering TOD interrupt (TFMR[51]) + as a core-global error and hence avoiding the rendez-vous path completely. + Instead threads that see TFMR[51]=1 will now take a different path that + just does the TOD error recovery. +- opal/hmi: Do not send HMI event if no errors are found. + + For TOD errors, all the cores in the chip get HMIs. Any one thread from any + core can fix the issue and TFMR will have error conditions cleared. The rest of + the threads need not take any action if TOD errors are already cleared. Hence + thread 0 of every core should get a fresh copy of TFMR before going ahead with + the recovery path. Initialize recover = -1, so that if no errors are found that + thread need not send an HMI event to linux. This helps stop every thread + flooding the host with HMI events even when no errors are found. +- opal/hmi: Initialize the hmi event with old value of HMER. + + Do this before we check for TFAC errors. Otherwise the event at host console + shows no error reported in HMER register.
+ + Without this patch the console event shows HMER with all zeros :: + + [ 216.753417] Severe Hypervisor Maintenance interrupt [Recovered] + [ 216.753498] Error detail: Timer facility experienced an error + [ 216.753509] HMER: 0000000000000000 + [ 216.753518] TFMR: 3c12000870e04000 + + After this patch it shows old HMER values on host console: :: + + [ 2237.652533] Severe Hypervisor Maintenance interrupt [Recovered] + [ 2237.652651] Error detail: Timer facility experienced an error + [ 2237.652766] HMER: 0840000000000000 + [ 2237.652837] TFMR: 3c12000870e04000 +- opal/hmi: Rework HMI handling of TFAC errors + + This patch reworks the HMI handling for TFAC errors by introducing + 4 rendez-vous points to improve the thread synchronization while handling + timebase errors that require all threads to clear dirty data from the TB/HDEC + register before clearing the errors. +- opal/hmi: Don't bother passing HMER to pre-recovery cleanup + + The test for TFAC error is now redundant so we remove it and + remove the HMER argument. +- opal/hmi: Move timer related error handling to a separate function + + Currently no functional change. This is a first step to completely + rewriting how these things are handled. +- opal/hmi: Add a new opal_handle_hmi2 that returns direct info to Linux + + It returns a 64-bit flags mask currently set to provide info + about which timer facilities were lost, and whether an event + was generated. +- opal/hmi: Remove races in clearing HMER + + Writing to HMER acts as an "AND". The current code writes back the + value we originally read with the bits we handled cleared. This is + racy: if a new bit gets set in HW after the original read, we'll end + up clearing it without handling it. + + Instead, use an all 1's mask with only the bit handled cleared. +- opal/hmi: Don't re-read HMER multiple times + + We want to make sure all reporting and actions are based + upon the same snapshot of HMER in case bits get added + by HW while we are in OPAL.
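The HMER clearing race described in the "Remove races in clearing HMER" entry above can be illustrated with a small model. This is only a sketch under stated assumptions, not skiboot code: the `hmer` variable stands in for the real register, and `hmer_write()`, `clear_handled_racy()` and `clear_handled_safe()` are hypothetical names invented for the example. The only behaviour taken from the release note is that a write acts as an AND with the current contents.

```c
#include <stdint.h>

/* Simulated register (assumption: stands in for the real HMER). */
static uint64_t hmer;

/* Model of the write semantics: the written value is ANDed into the
 * register, so a 0 bit clears and a 1 bit leaves the bit unchanged. */
void hmer_write(uint64_t val)
{
	hmer &= val;
}

/* Racy variant: write back an earlier snapshot with the handled bits
 * cleared. Any bit hardware set after the snapshot was taken is 0 in
 * the written value, so it is cleared without ever being handled. */
void clear_handled_racy(uint64_t snapshot, uint64_t handled)
{
	hmer_write(snapshot & ~handled);
}

/* Safe variant: an all-ones mask with only the handled bits cleared.
 * Bits that were not handled, including late arrivals, survive. */
void clear_handled_safe(uint64_t handled)
{
	hmer_write(~handled);
}
```

For example, if the register holds bit `0x1` when it is read, and hardware then sets bit `0x2` before the write, the racy variant leaves the register at zero while the safe variant leaves `0x2` pending for the next pass.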
+ +libflash and ffspart +^^^^^^^^^^^^^^^^^^^^ + +Many improvements to the `ffspart` utility and `libflash` have come +in this release, making `ffspart` suitable for building bit-identical +PNOR images as the existing tooling used by `op-build`. The plan is to +switch `op-build` to use this infrastructure in the not too distant +future. + +- libflash/blocklevel: Make read/write be ECC agnostic for callers + + The blocklevel abstraction allows for regions of the backing store to be + marked as ECC protected so that blocklevel can decode/encode the ECC + bytes into the buffer automatically without the caller having to be ECC + aware. + + Unfortunately this abstraction is far from perfect: it is only useful + if reads and writes are performed at the start of the ECC region or in + some circumstances at an ECC aligned position - which requires the + caller be aware of the ECC regions. + + The problem that has arisen is that the blocklevel abstraction is + initialised somewhere but when it is later called the caller is unaware + if ECC exists in the region it wants to arbitrarily read and write to. + This should not have been a problem since blocklevel knows. Currently + misaligned reads will fail ECC checks and misaligned writes will + overwrite ECC bytes and the backing store will become corrupted. + + This patch adds the smarts to blocklevel_read() and blocklevel_write() to + cope with the problem. Note that ECC can always be bypassed by calling + blocklevel_raw_() functions. + + All this work means that the gard tool can safely call + blocklevel_read() and blocklevel_write() and as long as the blocklevel + knows of the presence of ECC then it will deal with all cases. + + This commit also removes code in the gard tool which compensated for + inadequacies no longer present in blocklevel. +- libflash/blocklevel: Return region start from ecc_protected() + + Currently all ecc_protected() does is say if a region is ECC protected + or not.
Knowing a region is ECC protected is one thing but there isn't + much that can be done afterwards if this is the only known fact. A lot + more can be done if the caller is told where the ECC region begins. + + Knowing where the ECC region starts allows the caller to align its + reads and writes. This allows for more flexibility calling read and write + without knowing exactly how the backing store is organised. +- libflash/ecc: Add helpers to align a position within an ecc buffer + + As part of ongoing work to make ECC invisible to higher levels up the + stack this function converts a 'position' which should be ECC agnostic + to the equivalent position within an ECC region starting at a specified + location. +- libflash/ecc: Add functions to deal with unaligned ECC memcpy +- external/ffspart: Improve error output +- libffs: Fix bad checks for partition overlap + + Not all TOCs are written at zero +- libflash/libffs: Allow caller to specify header partition + + An FFS TOC is comprised of two parts. A small header which has a magic + and very minimal information about the TOC which will be common to all + partitions, things like number of partitions, block sizes and the like. + Following this small header are a series of entries. Importantly there + is always an entry which encompasses the TOC itself, this is usually + called the 'part' partition. + + Currently libffs always assumes that the 'part' partition is at zero. + While there is always a TOC at zero there doesn't actually have to be. + PNORs may have multiple TOCs within them, therefore libffs needs to be + flexible enough to allow callers to specify TOCs not at zero. + + The 'part' partition is otherwise a regular partition which may have + flags associated with it. libffs should allow the user to set the flags + for the 'part' partition. + + This patch achieves both by allowing the caller to specify the 'part' + partition. The caller may choose not to, in which case libffs will provide a sensible + default.
+- libflash/libffs: Refcount ffs entries + + Currently consumers can add a new ffs entry to multiple headers, this + is fine but freeing any of the headers will cause the entry to be freed, + this causes double free problems. + + Even if only one header is used, the consumer of the library still has a + reference to the entry, which they may well reuse at some other point. + + libffs will now refcount entries and only free when there are no more + references. + + This patch also removes the pointless return value of ffs_hdr_free() +- libflash/libffs: Switch to storing header entries in an array + + Since the libffs no longer needs to sort the entries as they get added + it makes little sense to have the complexity of a linked list when an + array will suffice. +- libflash/libffs: Remove backup partition from TOC generation code + + It turns out this code was messy and not all that reliable. Doing it at + the library level adds complexity to the library and restrictions to the + caller. + + A simpler approach can be achieved by just instantiating multiple + ffs_header structures pointing to different parts of the same file. +- libflash/libffs: Remove the 'sides' from the FFS TOC generation code + + It turns out this code was messy and not all that reliable. Doing it at + the library level adds complexity to the library and restrictions to the + caller. + + A simpler approach can be achieved by just instantiating multiple + ffs_header structures pointing to different parts of the same file. +- libflash/libffs: Always add entries to the end of the TOC + + It turns out that sorted order isn't the best idea. This removes + flexibility from the caller. If the user wants their partitions in + sorted order, they should insert them in sorted order. +- external/ffspart: Remove side, order and backup options + + These options are currently flaky in libflash/libffs so there isn't + much point to being able to use them in ffspart.
+ + Future reworks planned for libflash/libffs will render these options + redundant anyway. +- libflash/libffs: ffs_close() should use ffs_hdr_free() +- libflash/libffs: Add setter for a partition's actual size +- pflash: Use ffs_entry_user_to_string() to standardise flag strings +- libffs: Standardise ffs partition flags + + It seems we've developed a character representation for ffs partition + flags. Currently only pflash really prints them so it hasn't been a + problem but now ffspart wants to read them in from user input. + + It is important that what libffs reads and what pflash prints remain + consistent, we should move the code into libffs to avoid problems. +- external/ffspart: Allow # comments in input file + +p9dsu Platform changes +---------------------- + +The p9dsu platform from SuperMicro (also known as 'Boston') has received +a number of updates, and the patches once carried by SuperMicro are now +upstream. + +- p9dsu: detect p9dsu variant even when hostboot doesn't tell us + + The SuperMicro BMC can tell us what riser type we have, which dictates + the PCI slot tables. Usually, in an environment that a customer would + experience, Hostboot will do the query with an SMC specific patch + (not upstream as there's no platform specific code in hostboot) + and skiboot knows what variant it is based on the compatible string. + + However, if you're using upstream hostboot, you only get the bare + 'p9dsu' compatible type. We can work around this by asking the BMC + ourselves and setting the slot table appropriately. We do this + synchronously in platform init so that we don't start probing + PCI before we set up the slot table. +- p9dsu: add slot power limit. +- p9dsu: add pci slot table for Boston LC 1U/2U and Boston LA/ESS. +- p9dsu HACK: fix system-vpd eeprom +- p9dsu: change esel command from AMI to IBM 0x3a.
+ +ZZ Platform Changes +------------------- + +- hdata/i2c: Fix up pci hotplug labels + + These labels are used on the devices used to do PCIe slot power control + for implementing PCIe hotplug. I'm not sure how they ended up as + "eeprom-pgood" and "eeprom-controller" since that doesn't make any sense. +- hdata/i2c: Ignore multi-port I2C devices + + Recent FSP firmware builds add support for multi-port I2C devices such + as the GPIO expanders used for the presence detect of OpenCAPI devices + and the PCIe hotplug controllers used to power cycle PCIe slots on ZZ. + + The OpenCAPI driver inside of skiboot currently uses a platform-specific + method to talk to the relevant I2C device rather than relying on HDAT + since not all platforms correctly report the I2C devices (hello Zaius). + Additionally the nature of multi-port devices requires that we have a device + specific handler so that we generate the correct DT bindings. Currently + we don't and there is no immediate need for this support so just ignore + the multi-port devices for now. +- hdata/i2c: Replace `i2c_` prefix with `dev_` + + The current naming scheme makes it easy to conflate "i2cm_port" and + "i2c_port." The latter is used to describe multi-port I2C devices such + as GPIO expanders and multi-channel PCIe hotplug controllers. Rename + i2c_port to dev_port to make the two a bit more distinct. + + Also rename i2c_addr to dev_addr for consistency. +- hdata/i2c: Ignore CFAM I2C master + + Recent FSP firmware builds put in information about the CFAM I2C master + in addition to the host I2C masters accessible via XSCOM. Odds are this + information should not be there since there's no handshaking between the + FSP/BMC and the host over who controls that I2C master, but it is so + we need to deal with it. + + This patch adds filtering to the HDAT parser so it ignores the CFAM I2C + master. Without this it will create a bogus i2cm@<addr> which might cause + issues.
+- ZZ: hw/imc: Add support to load imc catalog lid file + + Add support to load the imc catalog from a lid file packaged + as part of the system firmware. Lid number allocated + is 0x80f00103.lid. + + +Bugs Fixed +---------- +- core: Fix iteration condition to skip garded cpu +- uart: fix uart_opal_flush to take console lock over uart_con_flush + This bug meant that OPAL_CONSOLE_FLUSH didn't take the appropriate locks. + Luckily, this call is currently only used in the crash path. +- xive: fix missing unlock in error path +- OPAL_PCI_SET_POWER_STATE: fix locking in error paths + + Otherwise we could exit OPAL holding locks, potentially leading + to all sorts of problems later on. +- hw/slw: Don't assert on a unknown chip + + For some reason skiboot populates nodes in /cpus/ for the cores on + chips that are deconfigured. As a result Linux includes the threads + of those cores in its set of possible CPUs in the system and attempts + to set the SPR values that should be used when waking a thread from + a deep sleep state. + + However, in the case where we have a deconfigured chip we don't create + a xscom node for that chip and as a result we don't have a proc_chip + structure for that chip either. In turn, this results in an assertion + failure when calling opal_slw_set_reg() since it expects the chip + structure to exist. Fix this up and print an error instead. +- opal/hmi: Generate one event per core for processor recovery. + + Processor recovery is a per-core error. All threads on that core receive + an HMI. Not all threads need to generate an HMI event for the same error. + + Let only thread 0 generate the event. +- sensors: Dont add DTS sensors when OCC inband sensors are available + + There are two sets of core temperature sensors today. One is DTS scom + based core temperature sensors and the second group is the sensors + provided by OCC.
DTS is the highest temperature among the different + temperature zones in the core while OCC core temperature sensors are + the average temperature of the core. DTS sensors are read directly by + the host by SCOMing the DTS sensors while OCC sensors are read and + updated by OCC to main memory. + + Reading DTS sensors by SCOMing is a heavy and slower operation as + compared to reading OCC sensors which is as good as reading memory. + So don't add DTS sensors when OCC sensors are available. +- core/fast-reboot: Increase timeout for dctl sreset to 1sec + + Direct control xscom can take more time to complete. We seem to + wait too little on Boston failing fast-reboot for no good reason. + + Increase timeout to 1 sec as a reasonable value for sreset to be delivered + and core to start executing instructions. +- occ: sensors-groups: Add DT properties to mark HWMON sensor groups + + Fix the sensor type to match HWMON sensor types. Add compatible flag + to indicate the environmental sensor groups so that operations on + these groups can be handled by HWMON Linux interface. +- core: Correctly load initramfs in stb container + + Skiboot does not calculate the actual size and start location of the + initramfs if it is wrapped by an STB container (for example if loading + an initramfs from the ROOTFS partition). + + Check if the initramfs is in an STB container and determine the size and + location correctly in the same manner as the kernel. Since + load_initramfs() is called after load_kernel() move the call to + trustedboot_exit_boot_services() into load_and_boot_kernel() so it is + called after both of these. +- hdat/i2c.c: quieten "v2 found, parsing as v1" +- hw/imc: Check for pause_microcode_at_boot() return status + + pause_microcode_at_boot() loops through all the chip's ucode + control block and pauses the ucode if it is in the running state. + But it does not fail if any of the chip's ucode is not initialised.
+ + Add code to return a failure if ucode is not initialized in any + of the chips. Since pause_microcode_at_boot() is called just before + attaching the IMC device nodes in imc_init(), add code to check for + the function return. + + +Slot location code fixes: + +- npu2: Use ibm,loc-code rather than ibm,slot-label + + The ibm,slot-label property is to name the slot that appears under a + PCIe bridge. In the past we (ab)used the slot tables to attach names + to GPU devices and their corresponding NVLinks which resulted in npu2.c + using slot-label as a location code rather than as a way to name slots. + + Fix this up since it's confusing. +- hdata/slots: Apply slot label to the parent slot + + Slot names only really make sense when applied to an actual slot rather + than a device. On witherspoon the GPU devices have a name associated with + the device rather than the slot for the GPUs. Add a hack that moves the + slot label to the parent slot rather than on the device itself. +- pci-dt-slot: Big ol' cleanup + + The underlying data that we get from HDAT can only really describe a + PCIe system. As such we can simplify the devicetree slot lookup code + by only caring about the important cases, namely, root ports and switch + downstream ports. + + This also fixes a bug where root port didn't get a Slot label applied + which results in devices under that port not having ibm,loc-code set. + This results in the EEH core being unable to report the location of + EEHed devices under that port. + +opal-prd +^^^^^^^^ +- opal-prd: Insert powernv_flash module + + Explicitly load the powernv_flash module on BMC based systems so that we are sure + that the flash device is created before starting the opal-prd daemon. + + Note that I have replaced the pnor_available() check with is_fsp_system(), as we + want to load the module on BMC systems only. Also pnor_init has enough logic to + detect the flash device. Hence pnor_available() becomes a redundant check.
+ +NPU2/NVLINK2 +^^^^^^^^^^^^ +- npu2/hw-procedures: fence bricks on GPU reset + + The NPU workbook defines a way of fencing a brick and + getting the brick out of fence state. We do have an implementation + of bringing the brick out of fenced/quiesced state. We do + the latter in our procedures, but to support run time reset + we need to do the former. + + The fencing ensures that access to memory behind the links + will not lead to HMIs, but instead SUEs will be populated + in cache (in the case of speculation). The expectation is then + that prior to and after reset, the operating system components + will flush the cache for the region of memory behind the GPU. + + This patch does the following: + + 1. Implements a npu2_dev_fence_brick() function to set/clear + fence state + 2. Clear FIR bits prior to clearing the fence status + 3. Clears the fence status + 4. We take the powerbus out of CQ fence much later now, + in credits_check() which is the last hardware procedure + called after link training. +- hw/npu2.c: Remove static configuration of NPU2 register + + The NPU_SM_CONFIG0 register currently needs to be configured in Skiboot to + select NVLink mode, however Hostboot should configure other bits in this + register. + + For some reason Skiboot was explicitly clearing bit-6 + (CONFIG_DISABLE_VG_NOT_SYS). It is unclear why this bit was getting cleared + as recent Hostboot versions explicitly set it to the correct value based on + the specific system configuration. Therefore Skiboot should not alter it. + + Bit-58 (CONFIG_NVLINK_MODE) selects if NVLink mode should be enabled or + not. Hostboot does not configure this bit so Skiboot should continue to + configure it. +- npu2: Improve log output of GPU-to-link mapping + + Debugging issues related to unconnected NVLinks can be a little less + irritating if we use the NPU2DEV{DBG,INF}() macros instead of prlog().
+ + In short, change this: :: + + NPU2: comparing GPU 'GPU2' and NPU2 'GPU1' + NPU2: comparing GPU 'GPU3' and NPU2 'GPU1' + NPU2: comparing GPU 'GPU4' and NPU2 'GPU1' + NPU2: comparing GPU 'GPU5' and NPU2 'GPU1' + : + npu2_dev_bind_pci_dev: No PCI device for NPU2 device 0006:00:01.0 to bind to. If you expect a GPU to be there, this is a problem. + + to this: :: + + NPU6:0:1.0 Comparing GPU 'GPU2' and NPU2 'GPU1' + NPU6:0:1.0 Comparing GPU 'GPU3' and NPU2 'GPU1' + NPU6:0:1.0 Comparing GPU 'GPU4' and NPU2 'GPU1' + NPU6:0:1.0 Comparing GPU 'GPU5' and NPU2 'GPU1' + : + NPU6:0:1.0 No PCI device found for slot 'GPU1' +- npu2: Move NPU2_XTS_BDF_MAP_VALID assignment to context init + + A bad GPU or other condition may leave us with a subset of links that + never get initialized. If an ATSD is sent to one of those bricks, it + will never complete, leaving us waiting forever for a response: :: + + watchdog: BUG: soft lockup - CPU#23 stuck for 23s! [acos:2050] + ... + Modules linked in: nvidia_uvm(O) nvidia(O) + CPU: 23 PID: 2050 Comm: acos Tainted: G W O 4.14.0 #2 + task: c0000000285cfc00 task.stack: c000001fea860000 + NIP: c0000000000abdf0 LR: c0000000000acc48 CTR: c0000000000ace60 + REGS: c000001fea863550 TRAP: 0901 Tainted: G W O (4.14.0) + MSR: 9000000000009033 <SF,HV,EE,ME,IR,DR,RI,LE> CR: 28004484 XER: 20040000 + CFAR: c0000000000abdf4 SOFTE: 1 + GPR00: c0000000000acc48 c000001fea8637d0 c0000000011f7c00 c000001fea863820 + GPR04: 0000000002000000 0004100026000000 c0000000012778c8 c00000000127a560 + GPR08: 0000000000000001 0000000000000080 c000201cc7cb7750 ffffffffffffffff + GPR12: 0000000000008000 c000000003167e80 + NIP [c0000000000abdf0] mmio_invalidate_wait+0x90/0xc0 + LR [c0000000000acc48] mmio_invalidate.isra.11+0x158/0x370 + + + ATSDs are only sent to bricks which have a valid entry in the XTS_BDF + table. So to prevent the hang, don't set NPU2_XTS_BDF_MAP_VALID unless + we make it all the way to creating a context for the BDF. 
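The reasoning behind the NPU2_XTS_BDF_MAP_VALID change can be illustrated with a small toy model. This is only a sketch under stated assumptions: the class and method names are hypothetical and do not correspond to skiboot functions, and the ``valid`` flag merely stands in for the NPU2_XTS_BDF_MAP_VALID bit.

```python
# Toy model only: class and method names are hypothetical, not skiboot's.
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class BdfMapEntry:
    bdf: int
    valid: bool = False  # stands in for NPU2_XTS_BDF_MAP_VALID


@dataclass
class ToyNpu:
    entries: Dict[int, BdfMapEntry] = field(default_factory=dict)

    def bind_device(self, brick: int, bdf: int) -> None:
        # Early setup records the BDF but leaves the entry invalid.
        self.entries[brick] = BdfMapEntry(bdf)

    def init_context(self, brick: int) -> None:
        # Only reached when the brick made it all the way to context
        # creation; this is the single place the valid flag is set.
        self.entries[brick].valid = True

    def atsd_targets(self) -> List[int]:
        # ATSDs go only to bricks with a valid entry, so a brick that
        # never finished initialising is never waited on.
        return sorted(b for b, e in self.entries.items() if e.valid)
```

In this model a brick that is bound but never reaches ``init_context()`` is simply absent from ``atsd_targets()``, which mirrors how deferring the valid bit avoids waiting forever on an uninitialised link.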
+ +Secure and Trusted Boot +^^^^^^^^^^^^^^^^^^^^^^^ +- hdata/tpmrel: detect tpm not present by looking up the stinfo->status + + Skiboot detects if tpm is present by checking if a secureboot_tpm_info + entry exists. However, if a tpm is not present, hostboot also creates a + secureboot_tpm_info entry. In this case, hostboot creates an empty + entry, but sets the field tpm_status to TPM_NOT_PRESENT. + + This detects if tpm is not present by looking up the stinfo->status. + + This fixes the "TPMREL: TPM node not found for chip_id=0 (HB bug)" + issue, reproduced when skiboot is running on a system that has no tpm. + +PCI +^^^ +- phb4: Restore bus numbers after CRS + + Currently we restore PCIe bus numbers right after the link is + up. Unfortunately at this point we haven't done CRS so config space + may not be accessible. + + This moves the bus number restore till after CRS has happened. +- romulus: Add a barebones slot table +- phb4: Quieten and improve "Timeout waiting for electrical link" + + This happens normally if a slot doesn't have a working HW presence + detect and relies instead on inband presence detect. + + The message we display is scary and not very useful unless you + are debugging, so quieten it and change it to something more + meaningful. +- pcie-slot: Don't fail powering on an already on switch + + If the power state is already the required value, return + OPAL_SUCCESS rather than OPAL_PARAMETER to avoid spurious + errors during boot. + +CAPI/OpenCAPI +^^^^^^^^^^^^^ +- capi: Keep the current mmio windows in the mbt cache table. + + When the phb is used as a CAPI interface, the current mmio windows list + is cleaned before adding the capi and the prefetchable memory (M64) + windows, which implies that the non-prefetchable BAR is no longer + configured. + This patch sets only the mbt bar needed to pass the capi mmio window and + keeps, as defined, the other mmio values (M32 and M64).
+- npu2-opencapi: Fix 'link internal error' FIR, take 2 + + When setting up an opencapi link, we set the transport muxes first, + then set the PHY training config register, which includes disabling + nvlink mode for the bricks. That's the order of the init sequence, as + found in the NPU workbook. + + In reality, doing so works, but it raises 2 FIR bits in the PowerBus + OLL FIR Register for the 2 links when we configure the transport + muxes. Presumably because nvlink is not disabled yet and we are + configuring the transport muxes for opencapi. + + bit 60: + link0 internal error + bit 61: + link1 internal error + + Overall the current setup ends up being correct and everything works, + but we raise 2 FIR bits. + + So tweak the order of operations to disable nvlink before configuring + the transport muxes. Incidentally, this is what the scripts from the + opencapi enablement team were doing all along. +- npu2-opencapi: Fix 'link internal error' FIR, take 1 + + When we setup a link, we always enable ODL0 and ODL1 at the same time + in the PHY training config register, even though we are setting up + only one OTL/ODL, so it raises a "link internal error" FIR bit in the + PowerBus OLL FIR Register for the second link. The error is harmless, + as we'll eventually setup the second link, but there's no reason to + raise that FIR bit. + + The fix is simply to only enable the ODL we are using for the link. +- phb4: Do not set the PBCQ Tunnel BAR register when enabling capi mode. + + The cxl driver will set the capi value, like other drivers already do. +- phb4: set TVT1 for tunneled operations in capi mode + + The ASN indication is used for tunneled operations (as_notify and + atomics). Tunneled operation messages can be sent in PCI mode as + well as CAPI mode. + + The address field of as_notify messages is hijacked to encode the + LPID/PID/TID of the target thread, so those messages should not go + through address translation. 
Therefore bit 59 is part of the ASN + indication. + + This patch sets TVT#1 in bypass mode when capi mode is enabled, + to prevent as_notify messages from being dropped. + +Debugging/Testing improvements +------------------------------ +- core/stack: backtrace unwind basic OPAL call details + + Put OPAL callers' r1 into the stack back chain, and then use that to + unwind back to the OPAL entry frame (as opposed to boot entry, which + has a 0 back chain). + + From there, dump the OPAL call token and the caller's r1. A backtrace + looks like this: :: + + CPU 0000 Backtrace: + S: 0000000031c03ba0 R: 000000003001a548 ._abort+0x4c + S: 0000000031c03c20 R: 000000003001baac .opal_run_pollers+0x3c + S: 0000000031c03ca0 R: 000000003001bcbc .opal_poll_events+0xc4 + S: 0000000031c03d20 R: 00000000300051dc opal_entry+0x12c + --- OPAL call entry token: 0xa caller R1: 0xc0000000006d3b90 --- + + This is pretty basic for the moment, but it does give you the bottom + of the Linux stack. It will allow some interesting improvements in + future. + + First, with the eframe, all the call's parameters can be printed out + as well. The ___backtrace / ___print_backtrace API needs to be + reworked in order to support this, but it's otherwise very simple + (see opal_trace_entry()). + + Second, it will allow Linux's stack to be passed back to Linux via + a debugging opal call. This will allow Linux's BUG() or xmon to + also print the Linux back trace in case of a NMI or MCE or watchdog + lockup that hits in OPAL. +- asm/head: implement quiescing without stack or clobbering regs + + Quiescing currently is implemented in C in opal_entry before the + opal call handler is called. This works well enough for simple + cases like fast reset when one CPU wants all others out of the way. + + Linux would like to use it to prevent an sreset IPI from + interrupting firmware, which could lead to deadlocks when crash + dumping or entering the debugger.
Linux interrupts do not recover + well when returning back to general OPAL code, due to r13 not being + restored. OPAL also can't be re-entered, which may happen e.g., + from the debugger. + + So move the quiesce hold/reject to entry code, before the stack or + r1 or r13 registers are switched. OPAL can be interrupted and + returned to or re-entered during this period. + + This does not completely solve all such problems. OPAL will be + interrupted with sreset if the quiesce times out, and it can be + interrupted by MCEs as well. These still have the issues above. +- core/opal: Allow poller re-entry if OPAL was re-entered + + If an NMI interrupts the middle of running pollers and the OS + invokes pollers again (e.g., for console output), the poller + re-entrancy check will prevent it from running and spam the + console. + + That check was designed to catch a poller calling opal_run_pollers, + OPAL re-entrancy is something different and is detected elsewhere. + Avoid the poller recursion check if OPAL has been re-entered. This + is a best-effort attempt to cope with errors. +- core/opal: Emergency stack for re-entry + + This detects OPAL being re-entered by the OS, and switches to an + emergency stack if it was. This protects the firmware's main stack + from re-entrancy and allows the OS to use NMI facilities for crash + / debug functionality. + + Further nested re-entry will destroy the previous emergency stack + and prevent returning, but those should be rare cases. + + This stack is sized at 16kB, which doubles the size of CPU stacks, + so as not to introduce a regression in primary stack size. The 16kB + stack originally had a 4kB machine check stack at the top, which was + removed by 80eee1946 ("opal: Remove machine check interrupt patching + in OPAL."). So it is possible the size could be tightened again, but + that would require further analysis. + +- hdat_to_dt: hash_prop the same on all platforms + Fixes this unit test on ppc64le hosts.
+- mambo: Add persistent memory disk support + + This adds support for mapping disk images using persistent + memory. Disks can be added by setting this ENV variable: + + PMEM_DISK="/mydisks/disk1.img,/mydisks/disk2.img" + + These will show up in Linux as /dev/pmem0 and /dev/pmem1. + + This uses a new feature in mambo "mysim memory mmap .." which is only + available since mambo commit 0131f0fc08 (from 24/4/2018). + + This also needs the of_pmem.c driver in Linux which is only available + since v4.17. It works with powernv_defconfig + CONFIG_OF_PMEM. +- external/mambo: Add di command to decode instructions + + By default you get 16 instructions but you can specify the number you + want, e.g. :: + + systemsim % di 0x100 4 + 0x0000000000000100: Enc:0xA64BB17D : mtspr HSPRG1,r13 + 0x0000000000000104: Enc:0xA64AB07D : mfspr r13,HSPRG0 + 0x0000000000000108: Enc:0xF0092DF9 : std r9,0x9F0(r13) + 0x000000000000010C: Enc:0xA6E2207D : mfspr r9,PPR + + Using di since it's what xmon uses. +- mambo/mambo_utils.tcl: Inject an MCE at a specified address + + Currently we don't support injecting an MCE on a specific address. + This is useful for testing functionality like memcpy_mcsafe() + (see https://patchwork.ozlabs.org/cover/893339/) + + The core of the functionality is a routine called + inject_mce_ue_on_addr, which takes an addr argument and injects + an MCE (load/store with UE) when the specified address is accessed + by code. This functionality can easily be enhanced to cover + instruction UE's as well. + + A sample use case to create an MCE on stack access would be :: + + set addr [mysim display gpr 1] + inject_mce_ue_on_addr $addr + + This would cause an mce on any r1 or r1 based access +- external/mambo: improve helper for machine checks + + Improve workarounds for stop injection, because mambo often will + trigger on 0x104/204 when injecting sreset/mces.
+ + This also adds a workaround to skip injecting on reservations to + avoid infinite loops when doing inject_mce_step. +- travis: Enable ppc64le builds + + At least on the IBM Travis Enterprise instance, we can now do + ppc64le builds! + + We can only build a subset of our matrix due to availability of + ppc64le distros. The Dockerfiles need some tweaking to only + attempt to install (x86_64 only) Mambo binaries, as well as the + build scripts. +- external: Add "lpc" tool + + This is a little front-end to the lpc debugfs files to access + the LPC bus from userspace on the host. +- core/test/run-trace: fix on ppc64el + + diff --git a/roms/skiboot/doc/release-notes/skiboot-6.0-rc2.rst b/roms/skiboot/doc/release-notes/skiboot-6.0-rc2.rst new file mode 100644 index 000000000..0fefe7168 --- /dev/null +++ b/roms/skiboot/doc/release-notes/skiboot-6.0-rc2.rst @@ -0,0 +1,154 @@ +.. _skiboot-6.0-rc2: + +skiboot-6.0-rc2 +=============== + +skiboot v6.0-rc2 was released on Wednesday May 9th 2018. It is the second +release candidate of skiboot 6.0, which will become the new stable release +of skiboot following the 5.11 release, first released April 6th 2018. + +Skiboot 6.0 will mark the basis for op-build v2.0 and will be required for +POWER9 systems. + +skiboot v6.0-rc2 contains all bug fixes as of :ref:`skiboot-5.11`, +:ref:`skiboot-5.10.5`, and :ref:`skiboot-5.4.9` (the currently maintained +stable releases). Once 6.0 is released, we do *not* expect any further +stable releases in the 5.10.x series, nor in the 5.11.x series. + +For how the skiboot stable releases work, see :ref:`stable-rules` for details. + +The current plan is to cut the final 6.0 in early May (maybe in a day or two +after this -rc if things look okay), with skiboot 6.0 +being for all POWER8 and POWER9 platforms in op-build v2.0. 
+ +Over skiboot-6.0-rc1, we have the following changes: + +- Update default stop-state-disable mask to cut only stop11 + + Stability improvements in microcode for stop4/stop5 are + available in upstream hcode images. Stop4 and stop5 can + be safely enabled by default. + + Use ~0xE0000000 to cut all but stop0,1,2 in case there + are any issues with stop4/5. + + example: :: + + nvram -p ibm,skiboot --update-config opal-stop-state-disable-mask=0x1FFFFFFF + + **Note**: DD2.1 chips that have a frequency <1867MHz possibly *need* to + run an hcode image *different* from the default in op-build (set + `BR2_HCODE_LATEST_VERSION=y` in your config) + +- ibm,firmware-versions: add hcode to device tree + + op-build commit 736a08b996e292a449c4996edb264011dfe56a40 + added hcode to the VERSION partition, let's parse it out + and let the user know. + +- ipmi: Add BMC firmware version to device tree + + BMC Get device ID command gives BMC firmware version details. Let's add this + to device tree. User space tools will use this information to display BMC + version details. + +- mambo: Enable XER CA32 and OV32 bits on P9 + + POWER9 adds 32 bit carry and overflow bits to the XER, but we need to + set the relevant CTRL1 bit to enable them. + +- Makefile: Fix building natively on ppc64le + + When on ppc64le and CROSS is not set by the environment, make assumes + ppc64 and sets a default CROSS. Check for ppc64le as well, so that + 'make' works out of the box on ppc64le. +- p9dsu: timeout for variant detection, default to 2uess + +- core/direct-controls: improve p9_stop_thread error handling + + p9_stop_thread should fail the operation if it finds the thread was + already quiesced. This implies something else is doing direct controls + on the thread (e.g., pdbg) or there is some exceptional condition we + don't know how to deal with.
Proceeding here would cause things to + trample on each other, for example the hard lockup watchdog trying to + send a sreset to the core while it is stopped for debugging with pdbg + will end in tears. + + If p9_stop_thread times out waiting for the thread to quiesce, do + not hit it with a core_start direct control, because we don't know + what state things are in and doing more things at this point is worse + than doing nothing. There is no good recipe described in the workbook + to de-assert the core_stop control if it fails to quiesce the thread. + After timing out here, the thread may eventually quiesce and get + stuck, but that's simpler to debug than undefined behaviour. + +- core/direct-controls: fix p9_cont_thread for stopped/inactive threads + + Firstly, p9_cont_thread should check that the thread actually was + quiesced before it tries to resume it. Anything could happen if we + try this from an arbitrary thread state. + + Then when resuming a quiesced thread that is inactive or stopped (in + a stop idle state), we must not send a core_start direct control, + clear_maint must be used in these cases. +- occ: Use major version number while checking the pstate table format + + The minor version increments of the pstate table are backward + compatible. The minor version is changed when the pstate table + remains the same and the existing reserved bytes are used for pointing + new data. So use only major version number while parsing the pstate + table. This will allow old skiboot to parse the pstate table and + handle minor version updates.
+ +- hmi: Clear unknown debug trigger + + On some systems, seeing hangs like this when Linux starts: :: + + [ 170.027252763,5] OCC: All Chip Rdy after 0 ms + [ 170.062930145,5] INIT: Starting kernel at 0x20011000, fdt at 0x30ae0530 366247 bytes) + [ 171.238270428,5] OPAL: Switch to little-endian OS + + If you look at the in memory skiboot console (or do `nvram -p + ibm,skiboot --update-config log-level-driver=7`) we see the console get + spammed with: :: + + [ 5209.109790675,7] HMI: Received HMI interrupt: HMER = 0x0000400000000000 + [ 5209.109792716,7] HMI: Received HMI interrupt: HMER = 0x0000400000000000 + [ 5209.109794695,7] HMI: Received HMI interrupt: HMER = 0x0000400000000000 + [ 5209.109796689,7] HMI: Received HMI interrupt: HMER = 0x0000400000000000 + + We're taking the debug trigger (bit 17) early on, before the + hmi_debug_trigger function in the kernel is set up. + + This clears the HMI in Skiboot and reports to the kernel instead of + bringing down the machine. + +- core/hmi: assign flags=0 in case nothing set by handle_hmi_exception + + Theoretically we could have returned junk to the OS in this parameter. + +- SLW: Fix mambo boot to use stop states + + After commit 35c66b8ce5a2 ("SLW: Move MAMBO simulator checks to + slw_init"), mambo boot no longer calls add_cpu_idle_state_properties() + and as such we never enable stop states. + + After adding the call back, we get more testing coverage as well + as faster mambo SMT boots. + +- phb4: Hardware init updates + + CFG Write Request Timeout was incorrectly set to informational and not + fatal for both non-CAPI and CAPI, so set it to fatal. This was a + mistake in the specification. Correcting this fixes a niche bug in + escalation (which is necessary on pre-DD2.2) that can cause a checkstop + due to a NCU timeout. + + In addition, set the values in the timeout control registers to match. 
+ This fixes an extremely rare and unreproducible bug, though the current + timings don't make sense since they're higher than the NCU timeout (16) + which will checkstop the machine anyway. + +- SLW: quieten 'Configuring self-restore' for DARN,NCU_SPEC_BAR and HRMOR +- Experimental support for building with Clang +- Improvements to testing and Travis CI diff --git a/roms/skiboot/doc/release-notes/skiboot-6.0.1.rst b/roms/skiboot/doc/release-notes/skiboot-6.0.1.rst new file mode 100644 index 000000000..7a6254f40 --- /dev/null +++ b/roms/skiboot/doc/release-notes/skiboot-6.0.1.rst @@ -0,0 +1,29 @@ +.. _skiboot-6.0.1: + +============= +skiboot-6.0.1 +============= + +skiboot 6.0.1 was released on Wednesday May 16th, 2018. It replaces +:ref:`skiboot-6.0` as the current stable release in the 6.0.x series. + +It is recommended that 6.0.1 be used instead of any previous 6.0.x version +due to the bug fixes and debugging enhancements in it. + +Over :ref:`skiboot-6.0`, we have two bug fixes: + +- OpenBMC: use 0x3a as OEM command for partial add esel. + + This fixes the bug where skiboot would never send an eSEL to the BMC. +- Add location code to NPU2 HMI logging + + The current HMI error message does not specify where the HMI + error occurred. + + The original error message was :: + + NPU: FIR#0 FIR 0x0080100000000000 mask 0x009a48180f01ffff + + The enhanced error message is :: + + NPU2: [Loc: UOPWR.0000000-Node0-Proc0] P:0 FIR#0 FIR 0x0000100000000000 mask 0x009a48180f03ffff diff --git a/roms/skiboot/doc/release-notes/skiboot-6.0.10.rst b/roms/skiboot/doc/release-notes/skiboot-6.0.10.rst new file mode 100644 index 000000000..9a6fd8db3 --- /dev/null +++ b/roms/skiboot/doc/release-notes/skiboot-6.0.10.rst @@ -0,0 +1,81 @@ +.. _skiboot-6.0.10: + +============== +skiboot-6.0.10 +============== + +skiboot 6.0.10 was released on Wednesday October 31st, 2018. It replaces +:ref:`skiboot-6.0.9` as the current stable release in the 6.0.x series.
+ +It is recommended that 6.0.10 be used instead of any previous 6.0.x version +due to the bug fixes it contains. + +The bug fixes are: + +- Recognise signed VERSION partition +- hdata/i2c: Skip unknown device type + + Do not add unknown I2C devices to device tree. +- hdata/i2c: Make SPD workaround more workaroundy + + We have a hack in the I2C device parser to fix up entries generated by + hostboot for the DIMM SPD devices. For some reason they get reported as + 128Kbit EEPROMs which is bad since those have a different I2C interface + to an actual SPD device. + + Oddly enough, the FSP also gets this wrong in a slightly different way. + In the FSP case they are reported as a at24c04 (4Kbit) EEPROM, which + also has a different I2C interface. + + To fix both these problems, force any eeprom we find on that bus to have + the compatible string of "spd". + +- hdata/i2c: Add whitelisting for Host I2C devices + + Many of the devices that we get information about through HDAT are for + use by firmware rather than the host operating system. This patch adds + a boolean flag to hdat_i2c_info structure that indicates whether devices + with a given purpose should be reserved for use inside of OPAL (or some + other firmware component, such as the OCC). +- Add fast-reboot property to /ibm,opal DT node + + This means that if it's permanently disabled on boot, the test suite can + pick that up and not try a fast reboot test. +- libflash: Add ipmi-hiomap (currently for Witherspoon only) + + ipmi-hiomap implements the PNOR access control protocol formerly known + as "the mbox protocol" but uses IPMI instead of the AST LPC mailbox as a + transport. As there is no longer any mailbox involved in this alternate + implementation the old protocol name is quite misleading, and so it has + been renamed to "the hiomap protocol" (Host I/O Mapping protocol). The + same commands and events are used though this client-side implementation + assumes v2 of the protocol is supported by the BMC.
+- AMI BMC: use 0x3a as OEM command + + The 0x3a OEM command is for IBM commands, while 0x32 was for AMI ones. + Sometime in the P8 timeframe, AMI BMCs were changed to listen for our + commands on either 0x32 or 0x3a. Since 0x3a is the direction forward, + we'll use that, as P9 machines with AMI BMCs probably also want these + to work, and let's not bet that 0x32 will continue to be okay. +- astbmc: Set romulus BMC type to OpenBMC +- Fixes to build with GCC8 +- phb4/capp: Use link width to allocate STQ engines to CAPP + + Update phb4_init_capp_regs() to allocate STQ Engines to CAPP/PEC2 + based on link width instead of always assuming it to be x8. + + Also re-factor the function slightly to evaluate the link-width only + once and cache it so that it can also be used to allocate DMA read + engines. +- phb4/capp: Update the expected Eye-catcher for CAPP ucode lid + + Currently on a FSP based P9 system load_capp_code() expects CAPP ucode + lid header to have eye-catcher magic of 'CAPPPSLL'. However skiboot + currently supports CAPP ucode only lids that have a eye-catcher magic + of 'CAPPLIDH'. This prevents skiboot from loading the ucode with this + error message: :: + + CAPP: ucode header invalid + + We fix this issue by updating load_capp_ucode() to use the eye-catcher + value of 'CAPPLIDH' instead of 'CAPPPSLL'. diff --git a/roms/skiboot/doc/release-notes/skiboot-6.0.11.rst b/roms/skiboot/doc/release-notes/skiboot-6.0.11.rst new file mode 100644 index 000000000..a09f38122 --- /dev/null +++ b/roms/skiboot/doc/release-notes/skiboot-6.0.11.rst @@ -0,0 +1,57 @@ +.. _skiboot-6.0.11: + +============== +skiboot-6.0.11 +============== + +skiboot 6.0.11 was released on Friday November 2nd, 2018. It replaces +:ref:`skiboot-6.0.10` as the current stable release in the 6.0.x series. + +It is recommended that 6.0.11 be used instead of any previous 6.0.x version +due to the bug fixes it contains.
+ +The bug fixes are: + +- phb4/capp: Only reset FIR bits that cause capp machine check + + During CAPP recovery do_capp_recovery_scoms() will reset the CAPP Fir + register just after CAPP recovery is completed. This has an + unintentional side effect of preventing PRD from analyzing and + reporting this error. If PRD tries to read the CAPP FIR after opal has + already reset it, then it logs a critical error complaining "No active + error bits found". + + To prevent this from happening we update do_capp_recovery_scoms() to + only reset fir bits that cause CAPP machine check (local xstop). This + is done by reading the CAPP Fir Action0/1 & Mask registers and + generating a mask which is then written on CAPP_FIR_CLEAR register. +- phb4: Check for RX errors after link training + + Some PHB4 PHYs can get stuck in a bad state where they are constantly + retraining the link. This happens transparently to skiboot and Linux + but will cause PCIe to be slow. Resetting the PHB4 clears the + problem. + + We can detect this case by looking at the RX errors count where we + check for link stability. This patch does this by modifying the link + optimal code to check for RX errors. If errors are occurring we + retrain the link irrespective of the chip rev or card. + + Normally when this problem occurs, the RX error count is maxed out at + 255. When there is no problem, the count is 0. We chose 8 as the max + rx errors value to give us some margin for a few errors. There is also + a knob that can be used to set the error threshold for when we should + retrain the link. i.e.
:: + + nvram -p ibm,skiboot --update-config phb-rx-err-max=8 + +- core/flash: Log return code when ffs_init() fails +- libflash/ipmi-hiomap: Use error codes rather than abort() +- libflash/ipmi-hiomap: Restore window state on window/protocol reset +- libflash/ipmi-hiomap: Improve event handling +- p9dsu: Describe platform BMC register configuration + + Provide the p9dsu-specific BMC configuration values required for the + host kernel to drive the VGA display correctly. +- p9dsu: Add HIOMAP-over-IPMI support +- libflash/ipmi-hiomap: Cleanup allocation on init failure diff --git a/roms/skiboot/doc/release-notes/skiboot-6.0.12.rst b/roms/skiboot/doc/release-notes/skiboot-6.0.12.rst new file mode 100644 index 000000000..f647be2f4 --- /dev/null +++ b/roms/skiboot/doc/release-notes/skiboot-6.0.12.rst @@ -0,0 +1,24 @@ +.. _skiboot-6.0.12: + +============== +skiboot-6.0.12 +============== + +skiboot 6.0.12 was released on Monday November 12th, 2018. It replaces +:ref:`skiboot-6.0.11` as the current stable release in the 6.0.x series. + +It is recommended that 6.0.12 be used instead of any previous 6.0.x version +due to the bug fixes it contains. + +The bug fixes are: + +- hiomap: quieten warning on failing to move a window + + This isn't *necessarily* an error that we should complain loudly about. + If, for example, the BMC enforces the Read Only flag on a FFS partition, + opening a write window *should* fail, and we do indeed test this in + op-test. + + Thus we deal with the error in a well known path: returning an error + code and then it's eventually a userspace problem. +- libflash/ipmi-hiomap: Respect daemon presence and flash control diff --git a/roms/skiboot/doc/release-notes/skiboot-6.0.13.rst b/roms/skiboot/doc/release-notes/skiboot-6.0.13.rst new file mode 100644 index 000000000..9b66e92a7 --- /dev/null +++ b/roms/skiboot/doc/release-notes/skiboot-6.0.13.rst @@ -0,0 +1,22 @@ +.. 
_skiboot-6.0.13: + +============== +skiboot-6.0.13 +============== + +skiboot 6.0.13 was released on Wednesday November 14th, 2018. It replaces +:ref:`skiboot-6.0.12` as the current stable release in the 6.0.x series. + +This release includes one pflash change. This release does not modify skiboot +itself, so there is no reason to upgrade to this version if you're on 6.0.12 +already. This release is made exclusively so OpenBMC can ship an updated pflash +from a tagged release. + +The pflash change is: + +- pflash: Add --skip option for reading + + Add a --skip=N option to pflash to skip N number of bytes when reading. + This would allow users to print the VERSION partition without the STB + header by specifying the --skip=4096 argument, and it's a more generic + solution rather than making pflash depend on secure/trusted boot code. diff --git a/roms/skiboot/doc/release-notes/skiboot-6.0.14.rst b/roms/skiboot/doc/release-notes/skiboot-6.0.14.rst new file mode 100644 index 000000000..8d3265ccc --- /dev/null +++ b/roms/skiboot/doc/release-notes/skiboot-6.0.14.rst @@ -0,0 +1,50 @@ +.. _skiboot-6.0.14: + +============== +skiboot-6.0.14 +============== + +skiboot 6.0.14 was released on Tuesday November 27th, 2018. It replaces +:ref:`skiboot-6.0.13` as the current stable release in the 6.0.x series. + +It is recommended that 6.0.14 be used instead of any previous 6.0.x version +due to the bug fixes it contains. + +Bug fixes included in this release are: + +- libflash: Don't merge ECC-protected ranges + + Libflash currently merges contiguous ECC-protected ranges, but doesn't + check that the ECC bytes at the end of the first and start of the second + range actually match sanely. More importantly, if blocklevel_read() is + called with a position at the start of a partition that is contained + somewhere within a region that has been merged it will update the + position assuming ECC wasn't being accounted for. 
This results in the + position being somewhere well after the actual start of the partition + which is incorrect. + + For now, remove the code merging ranges. This means more ranges must be + held and checked however it prevents incorrectly reading ECC-correct + regions like below: :: + + [ 174.334119453,7] FLASH: CAPP partition has ECC + [ 174.437349574,3] ECC: uncorrectable error: ffffffffffffffff ff + [ 174.437426306,3] FLASH: failed to read the first 0x1000 from CAPP partition, rc 14 + [ 174.439919343,3] CAPP: Error loading ucode lid. index=201d1 + +- ipmi: Reduce ipmi_queue_msg_sync() polling loop time to 10ms. + + On a plain boot with hiomap, this reduces the time spent in OPAL + by ~170ms on p9dsu. This is due to hiomap (currently) using + synchronous IPMI messages. + + It will also *significantly* reduce latency on runtime flash + operations with hiomap, as we'll spend typically 10-20ms in OPAL + rather than 100-200ms. It's not an ideal solution to that, but + it's a quick and obvious win for jitter. + +- opal-prd: Fix opal-prd crash + + Crash log without this patch: :: + + opal-prd[2864]: unhandled signal 11 at 0000000000029320 nip 00000 00102012830 lr 0000000102016890 code 1 diff --git a/roms/skiboot/doc/release-notes/skiboot-6.0.15.rst b/roms/skiboot/doc/release-notes/skiboot-6.0.15.rst new file mode 100644 index 000000000..5e4d37549 --- /dev/null +++ b/roms/skiboot/doc/release-notes/skiboot-6.0.15.rst @@ -0,0 +1,45 @@ +.. _skiboot-6.0.15: + +============== +skiboot-6.0.15 +============== + +skiboot 6.0.15 was released on Monday December 17th, 2018. It replaces +:ref:`skiboot-6.0.14` as the current stable release in the 6.0.x series. + +It is recommended that 6.0.15 be used instead of any previous 6.0.x version +due to the bug fixes it contains. 
+ +Bug fixes included in this release are: + +- i2c: Fix i2c request hang during opal init if timers are not checked + + If an i2c request cannot go through the first time, because the bus is + found in error and needs a reset or it's locked by the OCC for example, + the underlying i2c implementation is using timers to manage the + request. However during opal init, opal pollers may not be called, it + depends on the context in which the i2c request is made. If the + pollers are not called, the timers are not checked and we can end up + with an i2c request which will not move forward and skiboot hangs. + + Fix it by explicitly checking the timers if we are waiting for an i2c + request to complete and it seems to be taking a while. + +- opal-prd: hservice: Enable hservice->wakeup() in BMC + + This patch enables HBRT to use HYP special wakeup register in openBMC + which until now was only used in FSP based machines. + + This patch also adds a capability check for opal-prd so that HBRT can + decide if the host special wakeup register can be used. + +- npu2: Advertise correct TCE page size + + The P9 NPU workbook says that only 4K/64K/16M/256M page size are supported + and in fact npu2_map_pe_dma_window() supports just these but in absence of + the "ibm,supported-tce-sizes" property Linux assumes the default P9 PHB4 + page sizes - 4K/64K/2M/1G - so when Linux tries 2M/1G TCEs, we get lots of + "Unexpected TCE size" from npu2_tce_kill(). + + This advertises TCE page sizes so Linux could handle it correctly, i.e. + fall back to 4K/64K TCEs. diff --git a/roms/skiboot/doc/release-notes/skiboot-6.0.16.rst b/roms/skiboot/doc/release-notes/skiboot-6.0.16.rst new file mode 100644 index 000000000..00ff91ba9 --- /dev/null +++ b/roms/skiboot/doc/release-notes/skiboot-6.0.16.rst @@ -0,0 +1,53 @@ +.. _skiboot-6.0.16: + +============== +skiboot-6.0.16 +============== + +skiboot 6.0.16 was released on Tuesday February 5th, 2019.
It replaces +:ref:`skiboot-6.0.15` as the current stable release in the 6.0.x series. + +It is recommended that 6.0.16 be used instead of any previous 6.0.x version +due to the bug fixes it contains. + +Bug fixes included in this release are: + +- p9dsu: Fix p9dsu default variant + + Add the default when no riser_id is returned from the ipmi query. + + This addresses: https://github.com/open-power/boston-openpower/issues/1369 + + Allow a little more time for BMC reply and cleanup some label strings. + +- p9dsu: Fix p9dsu slot tables + + Set the attributes on the slot tables to account for + builtin or pluggable etypes, this will allow pci + enumeration to calculate subordinate buses. + + Update some slot label strings. + + Add WIO Slot5 which is standard on the ESS config. + +- phb4: Generate checkstop on AIB ECC corr/uncorr for DD2.0 parts + + On DD2.0 parts, PCIe ECC protection is not warranted in the response + data path. Thus, for these parts, we need to flag any ECC errors + detected from the adjacent AIB RX Data path so the part can be + replaced. + + This patch configures the FIRs so that we escalate these AIB ECC + errors to a checkstop so the parts can be replaced. + +- core/lock: Stop drop_my_locks() from always causing abort + + Fix an erroneous failure in an error path that looked like this: :: + + LOCK ERROR: Releasing lock we don't hold depth @0x30493d20 (state: 0x0000000000000001) + [13836.000173140,0] Aborting! + CPU 0000 Backtrace: + S: 0000000031c03930 R: 000000003001d840 ._abort+0x60 + S: 0000000031c039c0 R: 000000003001a0c4 .lock_error+0x64 + S: 0000000031c03a50 R: 0000000030019c70 .unlock+0x54 + S: 0000000031c03af0 R: 000000003001a040 .drop_my_locks+0xf4 diff --git a/roms/skiboot/doc/release-notes/skiboot-6.0.17.rst b/roms/skiboot/doc/release-notes/skiboot-6.0.17.rst new file mode 100644 index 000000000..9bf7d4e33 --- /dev/null +++ b/roms/skiboot/doc/release-notes/skiboot-6.0.17.rst @@ -0,0 +1,66 @@ +.. 
_skiboot-6.0.17: + +============== +skiboot-6.0.17 +============== + +skiboot 6.0.17 was released on Wednesday February 20th, 2019. It replaces +:ref:`skiboot-6.0.16` as the current stable release in the 6.0.x series. + +It is recommended that 6.0.17 be used instead of any previous 6.0.x version +due to the bug fixes it contains. + +Bug fixes included in this release are: + +- core/opal: Print PIR value in exit path, which is useful for debugging. +- core/ipmi: Improve error message +- hdata: Fix dtc warnings + + Fix dtc warnings related to mcbist node :: + + Warning (reg_format): "reg" property in /xscom@623fc00000000/mcbist@1 has invalid length (4 bytes) (#address-cells == 1, #size-cells == 1) + Warning (reg_format): "reg" property in /xscom@623fc00000000/mcbist@2 has invalid length (4 bytes) (#address-cells == 1, #size-cells == 1) + Warning (reg_format): "reg" property in /xscom@603fc00000000/mcbist@1 has invalid length (4 bytes) (#address-cells == 1, #size-cells == 1) + Warning (reg_format): "reg" property in /xscom@603fc00000000/mcbist@2 has invalid length (4 bytes) (#address-cells == 1, #size-cells == 1) + + Ideally we should add proper xscom range here... but we are not getting that + information in HDAT today. Lets fix warning until we get proper data in HDAT. +- hdata/test: workaround dtc bugs + + In dtc v1.4.5 to at least v1.4.7 there have been a few bugs introduced + that change the layout of what's produced in the dts. In order to be + immune from them, we should use the (provided) dtdiff utility, but we + also need to run the dts we're diffing against through a dtb cycle in + order to ensure we get the same format as what the hdat_to_dt to dts + conversion will. + + This fixes a bunch of unit test failures on the version of dtc shipped + with recent Linux distros such as Fedora 29. +- firmware-versions: Add test case for parsing VERSION + + Also make it possible to use with afl-lop/afl-fuzz just to help make + *sure* we're all good. 
+ + Additionally, if we hit an entry in VERSION that is larger than our + buffer size, we skip over it gracefully rather than overwriting the + stack. This is only a problem if VERSION isn't trusted, which as of + 4b8cc05a94513816d43fb8bd6178896b430af08f it is verified as part of + Secure Boot. +- core/cpu: HID update race + + If the per-core HID register is updated concurrently by multiple + threads, updates can get lost. This has been observed during fast + reboot where the HILE bit does not get cleared on all cores, which + can cause machine check exception interrupts to crash. + + Fix this by only updating HID on thread0. +- cpufeatures: Always advertise POWER8NVL as DD2 + + Despite the major version of PVR being 1 (0x004c0100) for POWER8NVL, + these chips are functionally equivalent to P8/P8E DD2 levels. + + This advertises POWER8NVL as DD2. As the result, skiboot adds + ibm,powerpc-cpu-features/processor-control-facility for such CPUs and + the linux kernel can use hypervisor doorbell messages to wake secondary + threads; otherwise "KVM: CPU %d seems to be stuck" would appear because + of missing LPCR_PECEDH. diff --git a/roms/skiboot/doc/release-notes/skiboot-6.0.18.rst b/roms/skiboot/doc/release-notes/skiboot-6.0.18.rst new file mode 100644 index 000000000..8011d46a1 --- /dev/null +++ b/roms/skiboot/doc/release-notes/skiboot-6.0.18.rst @@ -0,0 +1,185 @@ +.. _skiboot-6.0.18: + +============== +skiboot-6.0.18 +============== + +skiboot 6.0.18 was released on Wednesday March 6th, 2019. It replaces +:ref:`skiboot-6.0.17` as the current stable release in the 6.0.x series. + +It is recommended that 6.0.18 be used instead of any previous 6.0.x version +due to the bug fixes it contains. + +Over :ref:`skiboot-6.0.17` we have several bug fixes, including important ones +for powercap, ipmi-hiomap and BMC communication driver. + +powercap +======== +- powercap: occ: Fix the powercapping range allowed for user + + OCC provides two limits for minimum powercap.
One being hard powercap + minimum which is guaranteed by OCC and the other one is a soft + powercap minimum which is lower than hard-min and may or may not be + asserted due to various power-thermal reasons. So to allow the users + to access the entire powercap range, this patch exports soft powercap + minimum as the "powercap-min" DT property. And it also adds a new + DT property called "powercap-hard-min" to export the hard-min powercap + limit. + +IPMI-HIOMAP +=========== +- ipmi-hiomap test case enhancements/fixes. + +- libflash/ipmi-hiomap: Enforce message size for empty response + + The protocol defines the response to the associated messages as empty + except for the command ID and sequence fields. If the BMC is returning + extra data consider the message malformed. + +- libflash/ipmi-hiomap: Remove unused close handling + + Issuing a HIOMAP_C_CLOSE is not required by the protocol specification, + rather a close can be implicit in a subsequent + CREATE_{READ,WRITE}_WINDOW request. The implicit close provides an + opportunity to reduce LPC traffic and the implementation takes up that + optimisation, so remove the case from the IPMI callback handler. + +- libflash/ipmi-hiomap: Overhaul event handling + + Reworking the event handling was inspired by a bug report by Vasant + where the host would get wedged on multiple flash access attempts in the + face of a persistent error state on the BMC-side. The cause of this bug + was the early-exit based on ctx->update, which erroneously assumed that + all events had been completely handled in prior calls to + ipmi_hiomap_handle_events(). This is not true if e.g. + HIOMAP_E_DAEMON_READY is clear in the prior calls.
+ + Regardless, there were other correctness and efficiency problems with + the handling strategy: + + * Ack-able event state was not restored in the face of errors in the + process of re-establishing protocol state + + * It forced needless window restoration with respect to the context in + which ipmi_hiomap_handle_events() was called. + + * Tests for HIOMAP_E_DAEMON_READY and HIOMAP_E_FLASH_LOST were redundant + with the overhauled error handling introduced in the previous patch + + Fix all of the above issues and add comments to explain the event + handling flow. + + Tests for correctness follow later in the series. + +- libflash/ipmi-hiomap: Overhaul error handling + + The aim is to improve the robustness with respect to absence of the + BMC-side daemon. The current error handling roughly mirrors what was + done for the mailbox implementation, but there's room for improvement. + + Errors are split into two classes, those that affect the transport state + and those that affect the window validity. From here, we push the + transport state error checks right to the bottom of the stack, to ensure + the link is known to be in a good state before any message is sent. + Window validity tests remain as they were in the hiomap_window_move() + and ipmi_hiomap_read() functions. Validity tests are not necessary in + the write and erase paths as we will receive an error response from the + BMC when performing a dirty or flush on an invalid window. + + Recovery also remains as it was, done on entry to the blocklevel + callbacks. If an error state is encountered in the middle of an + operation no attempt is made to recover it on the spot, instead the + error is returned up the stack and the caller can choose how it wishes + to respond. + +- libflash/ipmi-hiomap: Fix leak of msg in callback + +BMC communication +================= +- core/ipmi: Add ipmi sync messages to top of the list + + In ipmi_queue_msg_sync() path OPAL will wait until it gets response from + BMC. 
If we do not get response on time we may end up in kernel hardlockups. + Hence let's add sync messages to top of the queue. This will reduce the + chance of hardlockups. + +- hw/bt: Introduce separate list for synchronous messages + + BT send logic always sends top of bt message list to BMC. Once BMC reads the + message, it clears the interrupt and bt_idle() becomes true. + + bt_add_ipmi_msg_head() adds message to top of the list. If bt message list + is not empty then: + + - if bt_idle() is true then we will end up sending message to BMC before + getting response from BMC for inflight message. Looks like on some + BMC implementation this results in message timeout. + - else we end up starting message timer without actually sending message + to BMC, which is not correct. + + This patch introduces separate list to track synchronous messages. + bt_add_ipmi_msg_head() will add messages to tail of this new list. We + will always process this queue before processing normal queue. + + Finally this patch introduces new variable (inflight_bt_msg) to track + inflight message. This will point to current inflight message. + +- hw/bt: Fix message retry handler + + In some corner cases (like BMC reboot), bt_send_and_unlock() starts + message timer, but won't send message to BMC as driver is not free to + send message. bt_expire_old_msg() function enables H2B interrupt without + actually sending message. + + This patch fixes above issue. + +- ipmi/power: Fix system reboot issue + + Kernel makes reboot/shutdown OPAL call for reboot/shutdown. Once kernel + gets response from OPAL it runs opal_poll_events() until firmware + handles the request. + + On BMC based system, OPAL makes IPMI call (IPMI_CHASSIS_CONTROL) to + initiate system reboot/shutdown. At present OPAL queues IPMI messages + and returns SUCCESS to Host. If BMC is not ready to accept command (like + BMC reboot), then these messages will fail. We have to manually + reboot/shutdown the system using BMC interface.
+ + This patch adds logic to validate message return value. If message failed, + then it will resend the message. At some stage BMC will be ready to accept + message and handles IPMI message. + +- hw/bt: Add backend interface to disable ipmi message retry option + + During boot OPAL makes IPMI_GET_BT_CAPS call to BMC to get BT interface + capabilities which includes IPMI message max resend count, message + timeout, etc. Most of the time OPAL gets response from BMC within + specified timeout. In some corner cases (like mboxd daemon reset in BMC, + BMC reboot, etc) OPAL may not get response within timeout period. In + such scenarios, OPAL resends message until max resend count reaches. + + OPAL uses synchronous IPMI message (ipmi_queue_msg_sync()) for few + operations like flash read, write, etc. Thread will wait in OPAL until + it gets response from BMC. In some corner cases like BMC reboot, thread + may wait in OPAL for long time (more than 20 seconds) and results in + kernel hardlockup. + + This patch introduces new interface to disable message resend option. We + will disable message resend option for synchronous message. This will + greatly reduce kernel hardlockup issues. + + This is short term fix. Long term solution is to convert all synchronous + messages to asynchronous ones. + +PHB3 +==== +- hw/phb3/naples: Disable D-states + + Putting "Mellanox Technologies MT27700 Family [ConnectX-4] [15b3:1013]" + (more precisely, the second of 2 its PCI functions, no matter in what + order) into the D3 state causes EEH with the "PCT timeout" error. + This has been noticed on garrison machines only and firestones do not + seem to have this issue. + + This disables D-states changing for devices on root buses on Naples by + installing a config space access filter (copied from PHB4).
diff --git a/roms/skiboot/doc/release-notes/skiboot-6.0.19.rst b/roms/skiboot/doc/release-notes/skiboot-6.0.19.rst new file mode 100644 index 000000000..bdd1ac7c1 --- /dev/null +++ b/roms/skiboot/doc/release-notes/skiboot-6.0.19.rst @@ -0,0 +1,37 @@ +.. _skiboot-6.0.19: + +============== +skiboot-6.0.19 +============== + +skiboot 6.0.19 was released on Tuesday March 19th, 2019. It replaces +:ref:`skiboot-6.0.18` as the current stable release in the 6.0.x series. + +It is recommended that 6.0.19 be used instead of any previous 6.0.x version +due to the bug fixes it contains. + +Bug fixes included in this release are: + +- p9dsu: Undo slot label name changes + + During some code updates the slot labels were updated to reflect + the phb layout, however expectations were that the slot labels be + aligned with the riser card slots and not the system planar slots. + + [stewart: The tale of how we got here is long and varied and not at + all clear. The first ESS systems went out with a skiboot v5.9.8 with + additional SuperMicro patches. It was probably a slot table, but who knows, + we don't have the code so can't check. It's possible it was all coming + in through HDAT instead). The op-build tree (thus the exact patches) + shipped on systems that work correct seems to not be around anywhere anymore + (if it ever was). It was only in skiboot v6.0 that a slot table made + it in, and, of course, only having remote machines in random configs, + including possibly with riser cards from Briggs&Stratton rather than + the ones destined for this system, doesn't make for verifying this + at all. It also doesn't help that *consistently* there is *never* + any review on slot tables, and we've had things be wrong in the past. + Combine this with not upstream Hostboot patches.] + +- p9dsu: Fix slot labels for p9dsu2u + + Update the slot labels for the p9dsu2u tables. 
diff --git a/roms/skiboot/doc/release-notes/skiboot-6.0.2.rst b/roms/skiboot/doc/release-notes/skiboot-6.0.2.rst new file mode 100644 index 000000000..4012bc17e --- /dev/null +++ b/roms/skiboot/doc/release-notes/skiboot-6.0.2.rst @@ -0,0 +1,23 @@ +.. _skiboot-6.0.2: + +============= +skiboot-6.0.2 +============= + +skiboot 6.0.2 was released on Friday May 18th, 2018. It replaces +:ref:`skiboot-6.0.1` as the current stable release in the 6.0.x series. + +It is recommended that 6.0.2 be used instead of any previous 6.0.x version. + +Over :ref:`skiboot-6.0.1`, we have one bug fix: + +- cpu: Clear PCR SPR in opal_reinit_cpus() + + Currently if Linux boots with a non-zero PCR, things can go bad where + some early userspace programs can take illegal instructions. This is + being fixed in Linux, but in the meantime, we should clean up in + skiboot also. + + This could exhibit itself as petitboot getting killed with SIGILL and + no boot devices showing up, but only in a situation where you've done + a kdump from a kernel running a p8 compat guest diff --git a/roms/skiboot/doc/release-notes/skiboot-6.0.20.rst b/roms/skiboot/doc/release-notes/skiboot-6.0.20.rst new file mode 100644 index 000000000..6542dece1 --- /dev/null +++ b/roms/skiboot/doc/release-notes/skiboot-6.0.20.rst @@ -0,0 +1,207 @@ +.. _skiboot-6.0.20: + +============== +skiboot-6.0.20 +============== + +skiboot 6.0.20 was released on Thursday May 9th, 2019. It replaces +:ref:`skiboot-6.0.19` as the current stable release in the 6.0.x series. + +It is recommended that 6.0.20 be used instead of any previous 6.0.x version +due to the bug fixes it contains. + +Bug fixes included in this release are: + +- core/flash: Retry requests as necessary in flash_load_resource() + + We would like to successfully boot if we have a dependency on the BMC + for flash even if the BMC is not currently ready to service flash + requests.
On the assumption that it will become ready, retry for several + minutes to cover a BMC reboot cycle and *eventually* rather than + *immediately* crash out with: :: + + [ 269.549748] reboot: Restarting system + [ 390.297462587,5] OPAL: Reboot request... + [ 390.297737995,5] RESET: Initiating fast reboot 1... + [ 391.074707590,5] Clearing unused memory: + [ 391.075198880,5] PCI: Clearing all devices... + [ 391.075201618,7] Clearing region 201ffe000000-201fff800000 + [ 391.086235699,5] PCI: Resetting PHBs and training links... + [ 391.254089525,3] FFS: Error 17 reading flash header + [ 391.254159668,3] FLASH: Can't open ffs handle: 17 + [ 392.307245135,5] PCI: Probing slots... + [ 392.363723191,5] PCI Summary: + ... + [ 393.423255262,5] OCC: All Chip Rdy after 0 ms + [ 393.453092828,5] INIT: Starting kernel at 0x20000000, fdt at + 0x30800a88 390645 bytes + [ 393.453202605,0] FATAL: Kernel is zeros, can't execute! + [ 393.453247064,0] Assert fail: core/init.c:593:0 + [ 393.453289682,0] Aborting! + CPU 0040 Backtrace: + S: 0000000031e03ca0 R: 000000003001af60 ._abort+0x4c + S: 0000000031e03d20 R: 000000003001afdc .assert_fail+0x34 + S: 0000000031e03da0 R: 00000000300146d8 .load_and_boot_kernel+0xb30 + S: 0000000031e03e70 R: 0000000030026cf0 .fast_reboot_entry+0x39c + S: 0000000031e03f00 R: 0000000030002a4c fast_reset_entry+0x2c + --- OPAL boot --- + + The OPAL flash API hooks directly into the blocklevel layer, so there's + no delay for e.g. the host kernel, just for asynchronously loaded + resources during boot. + +- pci/iov: Remove skiboot VF tracking + + This feature was added a few years ago in response to a request to make + the MaxPayloadSize (MPS) field of a Virtual Function match the MPS of the + Physical Function that hosts it. + + The SR-IOV specification states that the MPS field of the VF is "ResvP".
+ This indicates the VF will use whatever MPS is configured on the PF and + that the field should be treated as a reserved field in the config space + of the VF. In other words, a SR-IOV spec compliant VF should always return + zero in the MPS field. Adding hacks in OPAL to make it non-zero is... + misguided at best. + + Additionally, there is a bug in the way pci_device structures are handled + by VFs that results in a crash on fast-reboot that occurs if VFs are + enabled and then disabled prior to rebooting. This patch fixes the bug by + removing the code entirely. This patch has no impact on SR-IOV support on + the host operating system. + +- hw/xscom: Enable sw xstop by default on p9 + + This was disabled at some point during bringup to make life easier for + the lab folks trying to debug NVLink issues. This hack really should + have never made it out into the wild though, so we now have the + following situation occurring in the field: + + 1) A bad happens + 2) The host kernel receives an unrecoverable HMI and calls into OPAL to + request a platform reboot. + 3) OPAL rejects the reboot attempt and returns to the kernel with + OPAL_PARAMETER. + 4) Kernel panics and attempts to kexec into a kdump kernel. + + A side effect of the HMI seems to be CPUs becoming stuck which results + in the initialisation of the kdump kernel taking an extremely long time + (6+ hours). It's also been observed that after performing a dump the + kdump kernel then crashes itself because OPAL has ended up in a bad + state as a side effect of the HMI. + + All up, it's not very good so re-enable the software checkstop by + default. If people still want to turn it off they can using the nvram + override. + +- opal/hmi: Initialize the hmi event with old value of TFMR. + + Do this before we fix TFAC errors. Otherwise the event at host console + shows no thread error reported in TFMR register.
+ + Without this patch the console event shows TFMR with no thread error: + (DEC parity error TFMR[59] injection) :: + + [ 53.737572] Severe Hypervisor Maintenance interrupt [Recovered] + [ 53.737596] Error detail: Timer facility experienced an error + [ 53.737611] HMER: 0840000000000000 + [ 53.737621] TFMR: 3212000870e04000 + + After this patch it shows old TFMR value on host console: :: + + [ 2302.267271] Severe Hypervisor Maintenance interrupt [Recovered] + [ 2302.267305] Error detail: Timer facility experienced an error + [ 2302.267320] HMER: 0840000000000000 + [ 2302.267330] TFMR: 3212000870e14010 + +- libflash/ipmi-hiomap: Fix blocks count issue + + We convert data size to block count and pass block count to BMC. + If data size is not block aligned then we end up sending block count + less than actual data. BMC will write partial data to flash memory. + + Sample log :: + + [ 594.388458416,7] HIOMAP: Marked flash dirty at 0x42010 for 8 + [ 594.398756487,7] HIOMAP: Flushed writes + [ 594.409596439,7] HIOMAP: Marked flash dirty at 0x42018 for 3970 + [ 594.419897507,7] HIOMAP: Flushed writes + + In this case HIOMAP sent data with block count=0 and hence BMC didn't + flush data to flash. + + Let's fix this issue by adjusting block count before sending it to BMC. + +- Fix hang in pnv_platform_error_reboot path due to TOD failure. + + On TOD failure, with TB stuck, when linux heads down to + pnv_platform_error_reboot() path due to unrecoverable hmi event, the panic + cpu gets stuck in OPAL inside ipmi_queue_msg_sync(). At this time, all + other cpus are in smp_handle_nmi_ipi() waiting for panic cpu to proceed. + But with panic cpu stuck inside OPAL, linux never recovers or reboots.
:: + + p0 c1 t0 + NIA : 0x000000003001dd3c <.time_wait+0x64> + CFAR : 0x000000003001dce4 <.time_wait+0xc> + MSR : 0x9000000002803002 + LR : 0x000000003002ecf8 <.ipmi_queue_msg_sync+0xec> + + STACK: SP NIA + 0x0000000031c236e0 0x0000000031c23760 (big-endian) + 0x0000000031c23760 0x000000003002ecf8 <.ipmi_queue_msg_sync+0xec> + 0x0000000031c237f0 0x00000000300aa5f8 <.hiomap_queue_msg_sync+0x7c> + 0x0000000031c23880 0x00000000300aaadc <.hiomap_window_move+0x150> + 0x0000000031c23950 0x00000000300ab1d8 <.ipmi_hiomap_write+0xcc> + 0x0000000031c23a90 0x00000000300a7b18 <.blocklevel_raw_write+0xbc> + 0x0000000031c23b30 0x00000000300a7c34 <.blocklevel_write+0xfc> + 0x0000000031c23bf0 0x0000000030030be0 <.flash_nvram_write+0xd4> + 0x0000000031c23c90 0x000000003002c128 <.opal_write_nvram+0xd0> + 0x0000000031c23d20 0x00000000300051e4 <opal_entry+0x134> + 0xc000001fea6e7870 0xc0000000000a9060 <opal_nvram_write+0x80> + 0xc000001fea6e78c0 0xc000000000030b84 <nvram_write_os_partition+0x94> + 0xc000001fea6e7960 0xc0000000000310b0 <nvram_pstore_write+0xb0> + 0xc000001fea6e7990 0xc0000000004792d4 <pstore_dump+0x1d4> + 0xc000001fea6e7ad0 0xc00000000018a570 <kmsg_dump+0x140> + 0xc000001fea6e7b40 0xc000000000028e5c <panic_flush_kmsg_end+0x2c> + 0xc000001fea6e7b60 0xc0000000000a7168 <pnv_platform_error_reboot+0x68> + 0xc000001fea6e7bd0 0xc0000000000ac9b8 <hmi_event_handler+0x1d8> + 0xc000001fea6e7c80 0xc00000000012d6c8 <process_one_work+0x1b8> + 0xc000001fea6e7d20 0xc00000000012da28 <worker_thread+0x88> + 0xc000001fea6e7db0 0xc0000000001366f4 <kthread+0x164> + 0xc000001fea6e7e20 0xc00000000000b65c <ret_from_kernel_thread+0x5c> + + This is because, there is a while loop towards the end of + ipmi_queue_msg_sync() which keeps looping until "sync_msg" does not match + with "msg". It loops over time_wait_ms() until exit condition is met. In + normal scenario time_wait_ms() calls run pollers so that ipmi backend gets + a chance to check ipmi response and set sync_msg to NULL. + + .. 
code-block:: c + + while (sync_msg == msg) + time_wait_ms(10); + + But in the event when TB is in failed state time_wait_ms()->time_wait_poll() + returns immediately without calling pollers and hence we end up looping + forever. This patch fixes this hang by calling opal_run_pollers() in TB + failed state as well. + +- core/ipmi: Print correct netfn value + +- core/lock: don't set bust_locks on lock error + + bust_locks is a big hammer that guarantees a mess if it's set while + all other threads are not stopped. + + I propose removing this in the lock error paths. In debugging the + previous deadlock false positive, none of the error messages printed, + and the in-memory console was totally garbled due to lack of locking. + + I think it's generally better for debugging and system integrity to + keep locks held when lock errors occur. Lock busting should be used + carefully, just to allow messages to be printed out or machine to be + restarted, probably when the whole system is single-threaded. + + Skiboot is slowly working toward that being feasible with co-operative + debug APIs between firmware and host, but for the time being, + difficult lock crashes are better not to corrupt everything by + busting locks. diff --git a/roms/skiboot/doc/release-notes/skiboot-6.0.21.rst b/roms/skiboot/doc/release-notes/skiboot-6.0.21.rst new file mode 100644 index 000000000..1473b0db0 --- /dev/null +++ b/roms/skiboot/doc/release-notes/skiboot-6.0.21.rst @@ -0,0 +1,15 @@ +.. _skiboot-6.0.21: + +============== +skiboot-6.0.21 +============== + +skiboot 6.0.21 was released on Tuesday Jan 7th, 2020. It replaces +:ref:`skiboot-6.0.20` as the current stable release in the 6.0.x series. + +It is recommended that 6.0.21 be used instead of any previous 6.0.x version +due to the bug fixes it contains. 
+ +Bug fixes included in this release are: + +- npu2/hw-procedures: Remove assertion from check_credits() diff --git a/roms/skiboot/doc/release-notes/skiboot-6.0.22.rst b/roms/skiboot/doc/release-notes/skiboot-6.0.22.rst new file mode 100644 index 000000000..ce1e7af01 --- /dev/null +++ b/roms/skiboot/doc/release-notes/skiboot-6.0.22.rst @@ -0,0 +1,21 @@ +.. _skiboot-6.0.22: + +============== +skiboot-6.0.22 +============== + +skiboot 6.0.22 was released on Friday March 27th, 2020. It replaces +:ref:`skiboot-6.0.21` as the current stable release in the 6.0.x series. + +It is recommended that 6.0.22 be used instead of any previous 6.0.x version +due to the bug fixes it contains. + +Bug fixes included in this release are: + +- errorlog: Increase the severity of abnormal reboot events + +- eSEL: Make sure PANIC logs are sent to BMC before calling assert + +- core/ipmi: Fix use-after-free + +- ipmi: ensure forward progress on ipmi_queue_msg_sync() diff --git a/roms/skiboot/doc/release-notes/skiboot-6.0.23.rst b/roms/skiboot/doc/release-notes/skiboot-6.0.23.rst new file mode 100644 index 000000000..41d1961d2 --- /dev/null +++ b/roms/skiboot/doc/release-notes/skiboot-6.0.23.rst @@ -0,0 +1,17 @@ +.. _skiboot-6.0.23: + +============== +skiboot-6.0.23 +============== + +skiboot 6.0.23 was released on Tuesday April 7th, 2020. It replaces +:ref:`skiboot-6.0.22` as the current stable release in the 6.0.x series. + +It is recommended that 6.0.23 be used instead of any previous 6.0.x version +due to the bug fixes it contains. + +Bug fixes included in this release are: + +- npu2: Clear fence on all bricks + +- npu2: Clear fence state for a brick being reset diff --git a/roms/skiboot/doc/release-notes/skiboot-6.0.3.rst b/roms/skiboot/doc/release-notes/skiboot-6.0.3.rst new file mode 100644 index 000000000..3f127da61 --- /dev/null +++ b/roms/skiboot/doc/release-notes/skiboot-6.0.3.rst @@ -0,0 +1,53 @@ +.. 
_skiboot-6.0.3: + +============= +skiboot-6.0.3 +============= + +skiboot 6.0.3 was released on Wednesday May 23rd, 2018. It replaces +:ref:`skiboot-6.0.2` as the current stable release in the 6.0.x series. + +It is recommended that 6.0.3 be used instead of any previous 6.0.x version. + +Over :ref:`skiboot-6.0.2`, we have bug fixes related to i2c booting in +secure mode, and general functionality with a TPM present. These changes are: + +- p8-i2c: Remove force reset + + Force reset was added as an attempt to work around some issues with TPM + devices locking up their I2C bus. In that particular case the problem + was that the device would hold the SCL line down permanently due to a + device firmware bug. The force reset doesn't actually do anything to + alleviate the situation here; it just happens to reset the internal + master state enough to make the I2C driver appear to work until + something tries to access the bus again. + + On P9 systems with secure boot enabled there is the added problem + of the "diagnostic mode" not being supported on I2C masters A, B, C and + D. Diagnostic mode allows the SCL and SDA lines to be driven directly + by software. Without this, force reset is impossible to implement. + + This patch removes the force reset functionality entirely since: + + a) it doesn't do what it's supposed to, and + b) it's butt ugly code + + Additionally, turn p8_i2c_reset_engine() into p8_i2c_reset_port(). + There's no need to reset every port on a master in response to an + error that occurred on a specific port. + +- libstb/i2c-driver: Bump max timeout + + We have observed some TPMs clock stretching the I2C bus for significant + amounts of time when processing commands. The same TPMs also have + errata that can result in permanently locking up a bus in response to + an I2C transaction they don't understand. Use an excessively long + timeout to prevent this in the field.
+- Add TPM timeout workaround + + Set the default timeout for any bus containing a TPM to one second. This + is needed to work around a bug in the firmware of certain TPMs that will + clock stretch the I2C port for up to a second. Additionally, when the + TPM is clock stretching it responds to a STOP condition on the bus by + bricking itself. Clearing this error requires a hard power cycle of the + system since the TPM is powered by standby power. diff --git a/roms/skiboot/doc/release-notes/skiboot-6.0.4.rst b/roms/skiboot/doc/release-notes/skiboot-6.0.4.rst new file mode 100644 index 000000000..0db6aac32 --- /dev/null +++ b/roms/skiboot/doc/release-notes/skiboot-6.0.4.rst @@ -0,0 +1,55 @@ +.. _skiboot-6.0.4: + +============= +skiboot-6.0.4 +============= + +skiboot 6.0.4 was released on Monday May 28th, 2018. It replaces +:ref:`skiboot-6.0.3` as the current stable release in the 6.0.x series. + +It is recommended that 6.0.4 be used instead of any previous 6.0.x version. + +Over :ref:`skiboot-6.0.3`, we have two bug fixes: one helps with performance +(especially in HPC environments), and one is an opal-prd fix. + +Changes are: + +- SLW: Remove stop1_lite and stop2_lite + + stop1_lite has been removed since it adds no additional benefit + over stop0_lite. stop2_lite has been removed since currently it adds + minimal benefit over stop2. However, the benefit is eclipsed by the time + required to ungate the clocks. + + Moreover, Lite states don't give up the SMT resources and can potentially + have a performance impact on sibling threads. + + Since current OSs (Linux) aren't smart enough to make good decisions + with these stop states, we're (temporarily) removing them from what + we expose to the OS, the idea being to bring them back in a new + DT representation so that only an OS that knows what to do will + do things with them. +- opal-prd: Do not error out on first failure for soft/hard offline.
+ + The memory errors (CEs and UEs) that are detected as part of background + memory scrubbing are reported by PRD asynchronously to opal-prd along with + affected memory ranges. hservice_memory_error() converts these ranges into + page granularity before hooking them up to the soft/hard offline-ing + infrastructure. + + But the current implementation of hservice_memory_error() does not hook up + all the pages to soft/hard offline-ing if any page offline action + fails. E.g. hard offline can fail for: + + - Pages that are not part of buddy managed pool. + - Pages that are reserved by kernel using memblock_reserved() + - Pages that are in use by kernel. + + But for the pages that are in use by a user space application, the hard + offline marks the page as hwpoison, sends a SIGBUS signal to kill the + affected application as recovery action and returns success. + + Hence, it is possible that some of the pages in that memory range are in + use by an application or free. By stopping on the first error we lose the + opportunity to hwpoison the subsequent pages which may be free or in use by + an application. This patch fixes this issue. diff --git a/roms/skiboot/doc/release-notes/skiboot-6.0.5.rst b/roms/skiboot/doc/release-notes/skiboot-6.0.5.rst new file mode 100644 index 000000000..c185015d1 --- /dev/null +++ b/roms/skiboot/doc/release-notes/skiboot-6.0.5.rst @@ -0,0 +1,118 @@ +.. _skiboot-6.0.5: + +============= +skiboot-6.0.5 +============= + +skiboot 6.0.5 was released on Wednesday July 11th, 2018. It replaces +:ref:`skiboot-6.0.4` as the current stable release in the 6.0.x series. + +It is recommended that 6.0.5 be used instead of any previous 6.0.x version. + +Over :ref:`skiboot-6.0.4` we have several bug fixes, including important ones +for NVLINK2 and NX. + +PCI/PHB4 +======== + +- phb4: Delay training till after PERST is deasserted + + This helps some cards train on the second PERST (ie fast-reboot). The + reason is not clear why but it helps, so YOLO!
+- pci: Fix PCI_DEVICE_ID() + + The vendor ID is 16 bits, not 8. This error leaves the top of the vendor + ID in the bottom bits of the device ID, which resulted in e.g. a failure + to run the PCI quirk for the AST VGA device. + + Fixes: 2b841bf0ef1b (present in v5.7-rc1) + +PHB4/CAPI +========= +- phb4/capp: Calculate STQ/DMA read engines based on link-width for PEC + + Presently in CAPI mode the number of STQ/DMA-read engines allocated on + PEC2 for CAPP is fixed to 6 and 0-30 respectively irrespective of the + PCI link width. These values are only suitable for x8 cards and + quickly run out if a x16 card is plugged into a PEC2 attached slot. This + usually manifests as CAPP reporting TLBI timeout because these messages + stall when there are insufficient STQs. + + To fix this we update enable_capi_mode() to check if PEC2 chiplet is + in x16 mode and if yes then we allocate 4/0-47 STQ/DMA-read engines + for the CAPP traffic. +- capi: Select the correct IODA table entry for the mbt cache. + + With the current code, the capi mmio window is not correctly configured + in the IODA table entry. The first entry (generally the non-prefetchable + BAR) is overwritten. + This patch sets the capi window bar at the right place. + +Sensors +======= + +- occ: sensors: Fix the size of the phandle array 'sensors' in DT + + Fixes: 99505c03f493 (present in v5.10-rc4) + +NPU2/NVLINK2 +============ + +- npu2/hw-procedures: Fence bricks via NTL instead of MISC + + There are a couple of places we can set/unset fence for a brick: + + 1. MISC register: NPU2_MISC_FENCE_STATE + 2. NTL register for the brick: NPU2_NTL_MISC_CFG1(ndev) + + Recent testing of ATS in combination with GPU reset has exposed a side + effect of using (1); if fence is set for all six bricks, it triggers a + sticky nmmu latch which prevents the NPU from getting ATR responses. + This manifests as a hang in the tests. + + We have npu2_dev_fence_brick() which uses (1), and only two calls to it.
+ Replace the call which sets fence with a write to (2). Remove the + corresponding unset call entirely. It's unneeded because the procedures + already do a progression from full fence to half to idle using (2). +- opal/hmi: Display correct chip id while printing NPU FIRs. + + HMIs for NPU xstops are broadcasted to all chips. All cores on all the + chips receive HMI. HMI handler correctly identifies and extracts the + NPU FIR details from affected chip, but while printing FIR data it + prints chip id and location code details of this_cpu()->chip_id which + may not be correct. This patch fixes this issue. + + Fixes: 7bcbc78c (present in v6.0.1) + +VPD +=== + +- vpd: Add vendor property to processor node + + Processor FRU vpd doesn't contain vendor detail. We have to parse + module VPD to get vendor detail. +- vpd: Sanitize VPD data + + On OpenPower system, VPD keyword size tells us the maximum size of the data. + But they fill trailing end with space (0x20) instead of NULL. Also spec + doesn't stop user to have space (0x20) within actual data. + + This patch discards trailing spaces before populating device tree. + +NX/VAS for POWER9 +================= + +- NX: Add NX coprocessor init opal call + + The read offset (4:11) in Receive FIFO control register is incremented + by FIFO size whenever CRB read by NX. But the index in RxFIFO has to + match with the corresponding entry in FIFO maintained by VAS in kernel. + VAS entry is reset to 0 when opening the receive window during driver + initialization. So when NX842 is reloaded or in kexec boot, possibility + of mismatch between RxFIFO control register and VAS entries in kernel. + It could cause CRB failure / timeout from NX. + + This patch adds nx_coproc_init opal call for kernel to initialize + readOffset (4:11) and Queued (15:23) in RxFIFO control register. 
+ + Fixes: 3b3c5962f432 (present in v5.8-rc1) diff --git a/roms/skiboot/doc/release-notes/skiboot-6.0.6.rst b/roms/skiboot/doc/release-notes/skiboot-6.0.6.rst new file mode 100644 index 000000000..311586a5c --- /dev/null +++ b/roms/skiboot/doc/release-notes/skiboot-6.0.6.rst @@ -0,0 +1,51 @@ +.. _skiboot-6.0.6: + +============= +skiboot-6.0.6 +============= + +skiboot 6.0.6 was released on Thursday July 19th, 2018. It replaces +:ref:`skiboot-6.0.5` as the current stable release in the 6.0.x series. + +It is recommended that 6.0.6 be used instead of any previous 6.0.x version, +especially in the case where NVLINK2 GPUs and/or Mellanox CX5 adapters are +being used. + +Over :ref:`skiboot-6.0.5` we have several important performance related bug +fixes and one stability bug fix: + +- phb4/CAPI: Reallocate PEC2 DMA-Read engines to improve GPU-Direct bandwidth + + We reallocate additional 16/8 DMA-Read engines allocated to stack0/1 + on PEC2 respectively. This is needed to improve bandwidth available to + the Mellanox CX5 adapter when trying to read GPU memory (GPU-Direct). + + If the kernel cxl driver indicates a request to allocate maximum possible + DMA read engines when calling enable_capi_mode() and the card is attached + to a PEC2/stack0 slot then we assume it's a Mellanox CX5 adapter. We then + allocate 16/8 extra DMA read engines to stack0 and stack1 + respectively on PEC2. This is done by populating the + XPEC_PCI_PRDSTKOVR and XPEC_NEST_READ_STACK_OVERRIDE as suggested by + the h/w team. +- phb4: Disable nodal scoped DMA accesses when PB pump mode is enabled + + By default when a PCIe device issues a read request via the PHB it is first + issued with nodal scope. When accessing GPU memory the NPU does not know at the + time of response if the requested memory page is off node or not. Therefore + every read of GPU memory by a PHB is retried with larger scope which introduces + bandwidth and latency issues.
+ + On smaller boxes which have pump mode enabled nodal and group scoped reads are + treated the same and both types of request are broadcast to one chip. Therefore + we can avoid the retry by disabling nodal scope on the PHB for these boxes. On + larger boxes nodal (single chip) and group (multiple chip) scoped reads are + treated differently. Therefore we avoid disabling nodal scope on large boxes + which have pump mode disabled to avoid all PHB requests being broadcast to + multiple chips. +- npu2/hw-procedures: Enable parity and credit overflow checks + + Enable these error checking features by setting the appropriate bits in + our one-off initialization of each "NTL Misc Config 2" register. + + The exception is NDL RX parity checking, which should be disabled during + the link training procedures. diff --git a/roms/skiboot/doc/release-notes/skiboot-6.0.7.rst b/roms/skiboot/doc/release-notes/skiboot-6.0.7.rst new file mode 100644 index 000000000..0f4dd9923 --- /dev/null +++ b/roms/skiboot/doc/release-notes/skiboot-6.0.7.rst @@ -0,0 +1,20 @@ +.. _skiboot-6.0.7: + +============= +skiboot-6.0.7 +============= + +skiboot 6.0.7 was released on Friday August 3rd, 2018. It replaces +:ref:`skiboot-6.0.6` as the current stable release in the 6.0.x series. + +It is recommended that 6.0.7 be used instead of any previous 6.0.x version +due to it containing a workaround for hardware errata in the XIVE interrupt +controller (present on POWER9 systems). + +The bug fix is: + +- xive: Disable block tracker + + Due to some HW errata, the block tracking facility (performance optimisation + for large systems) should be disabled on Nimbus chips. Disable it unconditionally + for now. diff --git a/roms/skiboot/doc/release-notes/skiboot-6.0.8.rst b/roms/skiboot/doc/release-notes/skiboot-6.0.8.rst new file mode 100644 index 000000000..73a9698d4 --- /dev/null +++ b/roms/skiboot/doc/release-notes/skiboot-6.0.8.rst @@ -0,0 +1,67 @@ +.. 
_skiboot-6.0.8: + +============= +skiboot-6.0.8 +============= + +skiboot 6.0.8 was released on Thursday August 16th, 2018. It replaces +:ref:`skiboot-6.0.7` as the current stable release in the 6.0.x series. + +It is recommended that 6.0.8 be used instead of any previous 6.0.x version +due to the bug fixes it contains. + +The bug fixes are: + +- i2c: Ensure ordering between i2c_request_send() and completion + + i2c_request_send loops waiting for a flag "uc.done" set by + the completion routine, and then looks for a result code + also set by that same completion. + + There is no synchronization and the completion can happen on another + processor, so we need to order the stores to uc and the reads + from uc so that uc.done is stored last and tested first using + memory barriers. +- i2c: Fix multiple-enqueue of the same request on NACK + + i2c_request_send() will retry the request if the error is a NAK, + however it forgets to clear the "ud.done" flag. It will thus + loop again and try to re-enqueue the same request causing internal + request list corruption. +- phb4: Disable 32-bit MSI in capi mode + + If a capi device does a DMA write targeting an address lower than 4GB, + it does so through a 32-bit operation, per the PCI spec. In capi mode, + the first TVE entry is configured in bypass mode, so the address is + valid. But with any (bad) luck, the address could be 0xFFFFxxxx, thus + looking like a 32-bit MSI. + + We currently enable both 32-bit and 64-bit MSIs, so the PHB will + interpret the DMA write as an MSI, which very likely results in an EEH + (MSI with a bad payload size). + + We can fix it by disabling 32-bit MSI when switching the PHB to capi + mode. Capi devices are 64-bit. + +- capp: Fix the capp recovery timeout comparison + + The current capp recovery timeout control loop in + do_capp_recovery_scoms() uses a wrong comparison for the return value of + tb_compare().
This may cause do_capp_recovery_scoms() to report a + timeout earlier than the stipulated 168ms. + + The patch fixes this by updating the loop timeout control branch in + do_capp_recovery_scoms() to use the correct enum tb_cmpval. +- phb4/capp: Update DMA read engines set in APC_FSM_READ_MASK based on link-width + + Commit 47c09cdfe7a3("phb4/capp: Calculate STQ/DMA read engines based + on link-width for PEC") updated the CAPP init sequence by calculating + the needed STQ/DMA-read engines based on link width and populating it + in XPEC_NEST_CAPP_CNTL register. This however needs to be synchronized + with the value set in CAPP APC FSM Read Machine Mask Register. + + Hence this patch updates phb4_init_capp_regs() to calculate the link + width of the stack on PEC2 and populate the same values as previously + populated in PEC CAPP_CNTL register. + +- core/cpu: Call memset with proper cpu_thread offset diff --git a/roms/skiboot/doc/release-notes/skiboot-6.0.9.rst b/roms/skiboot/doc/release-notes/skiboot-6.0.9.rst new file mode 100644 index 000000000..87eb79784 --- /dev/null +++ b/roms/skiboot/doc/release-notes/skiboot-6.0.9.rst @@ -0,0 +1,139 @@ +.. _skiboot-6.0.9: + +============= +skiboot-6.0.9 +============= + +skiboot 6.0.9 was released on Friday October 12th, 2018. It replaces +:ref:`skiboot-6.0.8` as the current stable release in the 6.0.x series. + +It is recommended that 6.0.9 be used instead of any previous 6.0.x version +due to the bug fixes it contains. + +The bug fixes are: + +- opal/hmi: Ignore debug trigger inject core FIR. + + Core FIR[60] is a side effect of the workaround for the CI Vector Load + issue in DD2.1. Usually this gets delivered as HMI with HMER[17] where + Linux already ignores it. But it looks like in some cases we may happen + to see CORE_FIR[60] while we are already in Malfunction Alert HMI + (HMER[0]) due to other reasons e.g. CAPI recovery or NPU xstop. If that
If that + happens then just ignore it instead of crashing kernel as not recoverable. + +- opal/hmi: Handle early HMIs on thread0 when secondaries are still in OPAL. + + When primary thread receives a CORE level HMI for timer facility errors + while secondaries are still in OPAL, thread 0 ends up in rendez-vous + waiting for secondaries to get into hmi handling. This is because OPAL + runs with MSR(EE=0) and hence HMIs are delayed on secondary threads until + they are given to Linux OS. Fix this by adding a check for secondary + state and force them in hmi handling by queuing job on secondary threads. + + I have tested this by injecting HDEC parity error very early during Linux + kernel boot. Recovery works fine for non-TB errors. But if TB is bad at + this very eary stage we already doomed. + + Without this patch we see: :: + + [ 285.046347408,7] OPAL: Start CPU 0x0843 (PIR 0x0843) -> 0x000000000000a83c + [ 285.051160609,7] OPAL: Start CPU 0x0844 (PIR 0x0844) -> 0x000000000000a83c + [ 285.055359021,7] HMI: Received HMI interrupt: HMER = 0x0840000000000000 + [ 285.055361439,7] HMI: [Loc: U78D3.ND1.WZS004A-P1-C48]: P:8 C:17 T:0: TFMR(2e12002870e14000) Timer Facility Error + [ 286.232183823,3] HMI: Rendez-vous stage 1 timeout, CPU 0x844 waiting for thread 1 (sptr=0000ccc1) + [ 287.409002056,3] HMI: Rendez-vous stage 1 timeout, CPU 0x844 waiting for thread 2 (sptr=0000ccc1) + [ 289.073820164,3] HMI: Rendez-vous stage 1 timeout, CPU 0x844 waiting for thread 3 (sptr=0000ccc1) + [ 290.250638683,3] HMI: Rendez-vous stage 1 timeout, CPU 0x844 waiting for thread 1 (sptr=0000ccc2) + [ 291.427456821,3] HMI: Rendez-vous stage 1 timeout, CPU 0x844 waiting for thread 2 (sptr=0000ccc2) + [ 293.092274807,3] HMI: Rendez-vous stage 1 timeout, CPU 0x844 waiting for thread 3 (sptr=0000ccc2) + [ 294.269092904,3] HMI: Rendez-vous stage 1 timeout, CPU 0x844 waiting for thread 1 (sptr=0000ccc3) + [ 295.445910944,3] HMI: Rendez-vous stage 1 timeout, CPU 0x844 waiting for thread 2 
(sptr=0000ccc3) + [ 297.110728970,3] HMI: Rendez-vous stage 1 timeout, CPU 0x844 waiting for thread 3 (sptr=0000ccc3) + + After this patch: :: + + [ 259.401719351,7] OPAL: Start CPU 0x0841 (PIR 0x0841) -> 0x000000000000a83c + [ 259.406259572,7] OPAL: Start CPU 0x0842 (PIR 0x0842) -> 0x000000000000a83c + [ 259.410615534,7] OPAL: Start CPU 0x0843 (PIR 0x0843) -> 0x000000000000a83c + [ 259.415444519,7] OPAL: Start CPU 0x0844 (PIR 0x0844) -> 0x000000000000a83c + [ 259.419641401,7] HMI: Received HMI interrupt: HMER = 0x0840000000000000 + [ 259.419644124,7] HMI: [Loc: U78D3.ND1.WZS004A-P1-C48]: P:8 C:17 T:0: TFMR(2e12002870e04000) Timer Facility Error + [ 259.419650678,7] HMI: Sending hmi job to thread 1 + [ 259.419652744,7] HMI: Sending hmi job to thread 2 + [ 259.419653051,7] HMI: Received HMI interrupt: HMER = 0x0840000000000000 + [ 259.419654725,7] HMI: Sending hmi job to thread 3 + [ 259.419654916,7] HMI: Received HMI interrupt: HMER = 0x0840000000000000 + [ 259.419658025,7] HMI: Received HMI interrupt: HMER = 0x0840000000000000 + [ 259.419658406,7] HMI: [Loc: U78D3.ND1.WZS004A-P1-C48]: P:8 C:17 T:2: TFMR(2e12002870e04000) Timer Facility Error + [ 259.419663095,7] HMI: [Loc: U78D3.ND1.WZS004A-P1-C48]: P:8 C:17 T:3: TFMR(2e12002870e04000) Timer Facility Error + [ 259.419655234,7] HMI: [Loc: U78D3.ND1.WZS004A-P1-C48]: P:8 C:17 T:1: TFMR(2e12002870e04000) Timer Facility Error + [ 259.425109779,7] OPAL: Start CPU 0x0845 (PIR 0x0845) -> 0x000000000000a83c + [ 259.429870681,7] OPAL: Start CPU 0x0846 (PIR 0x0846) -> 0x000000000000a83c + [ 259.434549250,7] OPAL: Start CPU 0x0847 (PIR 0x0847) -> 0x000000000000a83c + +- hw/bt.c: quieten all the noisy BT/IPMI messages +- npu2: Use correct kill type for TCE invalidation + + kill_type is enum of OPAL_PCI_TCE_KILL_PAGES, OPAL_PCI_TCE_KILL_PE, + OPAL_PCI_TCE_KILL_ALL and phb4_tce_kill() gets it right but + npu2_tce_kill() uses OPAL_PCI_TCE_KILL which is an OPAL API token. 
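The kill-type mix-up described above can be sketched in C. This is only an illustration under stated assumptions, not skiboot's code: `tce_kill_type`, `API_TOKEN_TCE_KILL`, `tce_kill_buggy()` and `tce_kill_fixed()` are hypothetical stand-ins, the numeric values are assumed, and the buggy switch shows just one plausible shape of the error.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for the kill-type enum; the values are assumed. */
enum tce_kill_type {
	KILL_PAGES = 0,
	KILL_PE    = 1,
	KILL_ALL   = 2,
};

/* Hypothetical stand-in for an API call token, which lives in a
 * different namespace than the kill types (value assumed). */
#define API_TOKEN_TCE_KILL 126

#define RC_SUCCESS     0
#define RC_PARAMETER (-1)

/* One plausible shape of the bug: a case label uses the call token,
 * so a valid KILL_ALL request ends up on the error path. */
static int tce_kill_buggy(uint32_t kill_type)
{
	switch (kill_type) {
	case KILL_PAGES:
	case KILL_PE:
		return RC_SUCCESS;
	case API_TOKEN_TCE_KILL:	/* wrong namespace */
		return RC_SUCCESS;
	default:
		return RC_PARAMETER;
	}
}

/* Fixed shape: every case label is a kill type. */
static int tce_kill_fixed(uint32_t kill_type)
{
	switch (kill_type) {
	case KILL_PAGES:
	case KILL_PE:
	case KILL_ALL:
		return RC_SUCCESS;
	default:
		return RC_PARAMETER;
	}
}

/* Small self-check showing the difference between the two shapes. */
static void check_kill_types(void)
{
	assert(tce_kill_buggy(KILL_ALL) == RC_PARAMETER);
	assert(tce_kill_fixed(KILL_ALL) == RC_SUCCESS);
	assert(tce_kill_fixed(API_TOKEN_TCE_KILL) == RC_PARAMETER);
}
```

Because both constants are plain integers in C, the compiler accepts either in a case label; keeping kill types in their own enum and switching only on its members is what makes the mistake visible in review.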
+ +- hw/npu2-opencapi: Fix setting of supported OpenCAPI templates + + In opal_npu_tl_set(), we made a typo that means the OPAL_NPU_TL_SET call + may not clear the enable bits for templates that were previously enabled + but are now disabled. + + Fix the typo so we clear NPU2_OTL_CONFIG1_TX_TEMP2_EN as well as + TEMP{1,3}_EN. + +- phb4: Workaround PHB errata with CFG write UR/CA errors + + If the PHB encounters a UR or CA status on a CFG write, it will + freeze the wrong PE. Instead of using the PE# specified + in the CONFIG_ADDRESS register, it will use the PE# of whatever + MMIO occurred last. + + Work around this by disabling freeze on such errors. + +- phb4: Handle allocation errors in phb4_eeh_dump_regs() + + If the zalloc fails (and it can be a rather large allocation), + we will overwrite memory at 0 instead of failing. + +- phb4: Don't try to access non-existent PEST entries + + In a POWER9 chip, some PHB4s have 256 PEs, some have 512. + + Currently, the diagnostics code retrieves 512 unconditionally, + which is wrong and causes us to report bogus values + for the "high" PEs on the small PHBs. + + Use the actual number of implemented PEs instead. + +- phb4: Don't probe a PHB if its garded + + Presently phb4_probe_stack() causes an exception while trying to probe + a PHB if it's garded. This causes skiboot to go into a reboot loop with + the following exception log: :: + + *********************************************** + Fatal MCE at 000000003006ecd4 .probe_phb4+0x570 + CFAR : 00000000300b98a0 + <snip> + Aborting!
+ CPU 0018 Backtrace: + S: 0000000031cc37e0 R: 000000003001a51c ._abort+0x4c + S: 0000000031cc3860 R: 0000000030028170 .exception_entry+0x180 + S: 0000000031cc3a40 R: 0000000000001f10 * + S: 0000000031cc3c20 R: 000000003006ecb0 .probe_phb4+0x54c + S: 0000000031cc3e30 R: 0000000030014ca4 .main_cpu_entry+0x5b0 + S: 0000000031cc3f00 R: 0000000030002700 boot_entry+0x1b8 + + This happens because phb4_probe_stack() ignores all xscom read/write + errors while enabling the PHB BARs and then tries to perform an mmio read of the + PHB Version registers, which causes the fatal MCE. + + We fix this by skipping the PHB probe if the first xscom_write() to + populate the PHB Bar register fails, which indicates that there is + something wrong with the PHB. diff --git a/roms/skiboot/doc/release-notes/skiboot-6.0.rst b/roms/skiboot/doc/release-notes/skiboot-6.0.rst new file mode 100644 index 000000000..6dae77d9f --- /dev/null +++ b/roms/skiboot/doc/release-notes/skiboot-6.0.rst @@ -0,0 +1,1028 @@ +.. _skiboot-6.0: + +skiboot-6.0 +=========== + +skiboot v6.0 was released on Friday May 11th 2018. It is the first +release of skiboot 6.0, which is the new stable release of skiboot +following the 5.11 release, first released April 6th 2018. + +Skiboot 6.0 is the basis for op-build v2.0 and is *required* for +POWER9 systems. + +skiboot v6.0 contains all bug fixes as of :ref:`skiboot-5.11`, +:ref:`skiboot-5.10.5`, and :ref:`skiboot-5.4.9` (the currently maintained +stable releases). We do *not* expect any further stable releases in the +5.10.x series, nor in the 5.11.x series. + +For how the skiboot stable releases work, see :ref:`stable-rules`. + +Over skiboot-5.11, we have the following changes: + + +New Features +------------ + +Since 6.0-rc1: + +- Update default stop-state-disable mask to cut only stop11 + + Stability improvements in microcode for stop4/stop5 are + available in upstream hcode images. Stop4 and stop5 can + be safely enabled by default.
+ + Use ~0xE0000000 to cut all but stop0,1,2 in case there + are any issues with stop4/5. + + example: :: + + nvram -p ibm,skiboot --update-config opal-stop-state-disable-mask=0x1FFFFFFF + + **Note**: DD2.1 chips that have a frequency <1867MHz possibly *need* to + run a hcode image *different* than the default in op-build (set + `BR2_HCODE_LATEST_VERSION=y` in your config) +- ibm,firmware-versions: add hcode to device tree + + op-build commit 736a08b996e292a449c4996edb264011dfe56a40 + added hcode to the VERSION partition, let's parse it out + and let the user know. +- ipmi: Add BMC firmware version to device tree + + BMC Get device ID command gives BMC firmware version details. Let's add this + to device tree. User space tools will use this information to display BMC + version details. + +Since 5.11: + +- Disable stop states from OPAL + + On ZZ, stop4,5,11 are enabled for PowerVM, even though doing + so may cause problems with OPAL due to bugs in hcode. + + For other platforms, this isn't so much of an issue as + we can just control stop states by the MRW. However the + rebuild-the-world approach to changing values there is a bit + annoying if you just want to rule out a specific stop state + from being problematic. + + Provide an nvram option to override what's disabled in OPAL. + + The OPAL mask is currently ~0xE0000000 (i.e. all but stop 0,1,2) + + You can set an NVRAM override with: :: + + nvram -p ibm,skiboot --update-config opal-stop-state-disable-mask=0xFFFFFFF + + This nvram override will disable *all* stop states. +- interrupts: Create an "interrupts" property in the OPAL node + + Deprecate the old "opal-interrupts"; it's still there, but the new + property follows the standard and allows us to specify whether an + interrupt is level or edge sensitive. + + Similarly create "interrupt-names" whose content is identical to + "opal-interrupts-names". +- SBE: Add timer support on POWER9 + + SBE on P9 provides one shot programmable timer facility.
We can use this + to implement OPAL timers and hence limit the reliance on the Linux + heartbeat (similar to HW timer facility provided by SLW on P8). +- Add SBE driver support + + SBE (Self Boot Engine) on P9 has two different jobs: + + - Boot the chip up to the point the core is functional + - Provide various services like timer, scom, stash MPIPL, etc., at runtime + + We will use SBE for various purposes like timer, MPIPL, etc. + +- opal:hmi: Add missing processor recovery reason string. + + With this patch we now see the reason string printed for CORE_WOF[43] bit. :: + + [ 477.352234986,7] HMI: [Loc: U78D3.001.WZS004A-P1-C48]: P:8 C:22 T:3: Processor recovery occurred. + [ 477.352240742,7] HMI: Core WOF = 0x0000000000100000 recovered error: + [ 477.352242181,7] HMI: PC - Thread hang recovery +- Add DIMM actual speed to device tree + + Recent HDAT provides DIMM actual speed. Let's add this to device tree. +- Fix DIMM size property + + Today we parse vpd blob to get DIMM size information. This is limited + to FSP based system. HDAT provides DIMM size value. Let's use that to + populate the device tree, so that we can get size information on BMC based + system as well. + +- PCI: Set slot power limit when supported + + The PCIe slot capability can be implemented in a root or switch + downstream port to set the maximum power a card is allowed to draw + from the system. This patch adds support for setting the power limit + when the platform has defined one. +- hdata/spira: parse vpd to add part-number and serial-number to xscom@ node + + Expected by FWTS and associates our processor with the part/serial + number, which is obviously a good thing for one's own sanity. + + +Improved HMI Handling +^^^^^^^^^^^^^^^^^^^^^ + +- opal/hmi: Add documentation for opal_handle_hmi2 call +- opal/hmi: Generate hmi event for recovered HDEC parity error. +- opal/hmi: check thread 0 tfmr to validate latched tfmr errors.
+ + Due to P9 errata, HDEC parity and TB residue errors are latched for + non-zero threads 1-3 even if they are cleared. But these are not + latched on thread 0. Hence, use xscom SCOMC/SCOMD to read the thread 0 tfmr + value and ignore them on non-zero threads if they are not present on + thread 0. +- opal/hmi: Print additional debug information in rendezvous. +- opal/hmi: Fix handling of TFMR parity/corrupt error. + + While testing the TFMR parity/corrupt error it has been observed that HMIs are + delivered twice for this error: + + - First time HMI is delivered with HMER[4,5]=1 and TFMR[60]=1. + - Second time HMI is delivered with HMER[4,5]=1 and TFMR[60]=0 with valid TB. + + On the second HMI we end up throwing "HMI: TB invalid without core error + reported" even though TB is in a valid state. +- opal/hmi: Stop flooding HMI event for TOD errors. + + Fix the issue where every thread on the chip sends an HMI event to the host for + TOD errors. TOD errors are reported to all the cores/threads on the chip. + Any one thread can fix the error and send the event. The rest of the threads don't + need to send an HMI event unnecessarily. +- opal/hmi: Fix soft lockups during TOD errors + + There are some TOD errors which do not affect the working of TOD and TB. They + stay in a valid state. Hence we don't need a rendez-vous for TOD errors that + do not affect TB working. + + TOD errors that affect TOD/TB will report a global error on TFMR[44] + along with bit 51, and they will go into the rendez-vous path as expected. + + But the TOD errors that do not affect the TB register set only TFMR bit 51. + The TFMR bit 51 is cleared when any single thread clears the TOD error. + Once cleared, bit 51 is reflected to all the cores on that chip. Any + thread that reads the TFMR register after the error is cleared will see + TFMR bit 51 reset. Hence the threads that see TFMR[51]=1 fall through the + rendez-vous path and the threads that see TFMR[51]=0 return doing + nothing. This ends up in soft lockups in the host kernel. 
+ + This patch fixes this issue by not considering the TOD interrupt (TFMR[51]) + as a core-global error and hence avoiding the rendez-vous path completely. + Instead, threads that see TFMR[51]=1 will now take a different path that + just does the TOD error recovery. +- opal/hmi: Do not send HMI event if no errors are found. + + For TOD errors, all the cores in the chip get HMIs. Any one thread from any + core can fix the issue and TFMR will have error conditions cleared. The rest of + the threads need not take any action if TOD errors are already cleared. Hence + thread 0 of every core should get a fresh copy of TFMR before going ahead with the + recovery path. Initialize recover = -1, so that if no errors are found the + thread need not send an HMI event to Linux. This helps stop flooding the host + with HMI events from every thread even when no errors are found. +- opal/hmi: Initialize the hmi event with old value of HMER. + + Do this before we check for TFAC errors. Otherwise the event at the host console + shows no error reported in the HMER register. + + Without this patch the console event shows HMER with all zeros :: + + [ 216.753417] Severe Hypervisor Maintenance interrupt [Recovered] + [ 216.753498] Error detail: Timer facility experienced an error + [ 216.753509] HMER: 0000000000000000 + [ 216.753518] TFMR: 3c12000870e04000 + + After this patch it shows the old HMER values on the host console: :: + + [ 2237.652533] Severe Hypervisor Maintenance interrupt [Recovered] + [ 2237.652651] Error detail: Timer facility experienced an error + [ 2237.652766] HMER: 0840000000000000 + [ 2237.652837] TFMR: 3c12000870e04000 +- opal/hmi: Rework HMI handling of TFAC errors + + This patch reworks the HMI handling for TFAC errors by introducing + 4 rendez-vous points to improve the thread synchronization while handling + timebase errors that require all threads to clear dirty data from the TB/HDEC + registers before clearing the errors. 
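The rendez-vous idea in the TFAC rework above can be illustrated with a small sketch. This is not skiboot code: it uses Python's `threading.Barrier` as a stand-in for firmware threads spinning on shared state, it shows two rendez-vous points rather than the four the patch introduces, and the names `recover_timebase`, `dirty` and `state` are hypothetical.

```python
import threading


def recover_timebase(n_threads: int) -> dict:
    """Each worker clears its own stale data; one worker then clears the error."""
    dirty = [True] * n_threads            # stand-in for per-thread TB/HDEC data
    state = {"error": True, "saw_dirty": None}
    rendezvous = threading.Barrier(n_threads)

    def worker(tid: int) -> None:
        dirty[tid] = False                # step 1: clear this thread's dirty data
        arrival = rendezvous.wait()       # rendez-vous 1: all threads have cleaned up
        if arrival == 0:                  # exactly one thread is handed index 0
            state["saw_dirty"] = any(dirty)
            state["error"] = False        # step 2: clear the shared error once
        rendezvous.wait()                 # rendez-vous 2: nobody leaves before the clear

    threads = [threading.Thread(target=worker, args=(t,)) for t in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return state
```

Because the error is cleared only after the first rendez-vous, the clearing thread can never observe another thread's data still marked dirty, which is the ordering guarantee the rework is after.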
+- opal/hmi: Don't bother passing HMER to pre-recovery cleanup + + The test for TFAC error is now redundant so we remove it and + remove the HMER argument. +- opal/hmi: Move timer related error handling to a separate function + + Currently no functional change. This is a first step to completely + rewriting how these things are handled. +- opal/hmi: Add a new opal_handle_hmi2 that returns direct info to Linux + + It returns a 64-bit flags mask currently set to provide info + about which timer facilities were lost, and whether an event + was generated. +- opal/hmi: Remove races in clearing HMER + + Writing to HMER acts as an "AND". The current code writes back the + value we originally read with the bits we handled cleared. This is + racy, if a new bit gets set in HW after the original read, we'll end + up clearing it without handling it. + + Instead, use an all 1's mask with only the bit handled cleared. +- opal/hmi: Don't re-read HMER multiple times + + We want to make sure all reporting and actions are based + upon the same snapshot of HMER in case bits get added + by HW while we are in OPAL. + +libflash and ffspart +^^^^^^^^^^^^^^^^^^^^ + +Many improvements to the `ffspart` utility and `libflash` have come +in this release, making `ffspart` suitable for building bit-identical +PNOR images as the existing tooling used by `op-build`. The plan is to +switch `op-build` to use this infrastructure in the not too distant +future. + +- libflash/blocklevel: Make read/write be ECC agnostic for callers + + The blocklevel abstraction allows for regions of the backing store to be + marked as ECC protected so that blocklevel can decode/encode the ECC + bytes into the buffer automatically without the caller having to be ECC + aware. 
+ + Unfortunately this abstraction is far from perfect: it is only useful + if reads and writes are performed at the start of the ECC region or in + some circumstances at an ECC aligned position - which requires the + caller to be aware of the ECC regions. + + The problem that has arisen is that the blocklevel abstraction is + initialised somewhere but when it is later called the caller is unaware + if ECC exists in the region it wants to arbitrarily read and write to. + This should not have been a problem since blocklevel knows. Currently + misaligned reads will fail ECC checks and misaligned writes will + overwrite ECC bytes and the backing store will become corrupted. + + This patch adds the smarts to blocklevel_read() and blocklevel_write() to + cope with the problem. Note that ECC can always be bypassed by calling + blocklevel_raw_() functions. + + All this work means that the gard tool can safely call + blocklevel_read() and blocklevel_write() and as long as the blocklevel + knows of the presence of ECC then it will deal with all cases. + + This commit also removes code in the gard tool which compensated for + inadequacies no longer present in blocklevel. +- libflash/blocklevel: Return region start from ecc_protected() + + Currently all ecc_protected() does is say if a region is ECC protected + or not. Knowing a region is ECC protected is one thing but there isn't + much that can be done afterwards if this is the only known fact. A lot + more can be done if the caller is told where the ECC region begins. + + Knowing where the ECC region starts allows the caller to align its + reads and writes. This allows for more flexibility calling read and write + without knowing exactly how the backing store is organised. 
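To make the alignment point concrete, here is a minimal sketch of how a caller could align a raw flash position once it knows where the ECC region starts. It assumes a layout of 8 data bytes followed by 1 ECC byte, which these notes do not state, and the function names are hypothetical rather than the libflash API.

```python
# Assumed layout (not stated in these notes): every 8 data bytes are
# followed by 1 ECC byte, so one ECC word occupies 9 bytes on flash.
BYTES_PER_ECC = 8
ECC_WORD = BYTES_PER_ECC + 1


def align_down_to_ecc_word(region_start: int, raw_pos: int) -> int:
    """Round a raw flash position down to the start of its ECC word."""
    if raw_pos < region_start:
        raise ValueError("position lies before the ECC region")
    return raw_pos - ((raw_pos - region_start) % ECC_WORD)


def align_up_to_ecc_word(region_start: int, raw_pos: int) -> int:
    """Round a raw flash position up to the next ECC word boundary."""
    down = align_down_to_ecc_word(region_start, raw_pos)
    return down if down == raw_pos else down + ECC_WORD
```

With a region starting at 0x100, a position of 0x10C sits 3 bytes into the second ECC word, so it rounds down to 0x109 and up to 0x112. Without the region start, neither boundary could be computed.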
+- libflash/ecc: Add helpers to align a position within an ecc buffer + + As part of ongoing work to make ECC invisible to higher levels up the + stack this function converts a 'position' which should be ECC agnostic + to the equivalent position within an ECC region starting at a specified + location. +- libflash/ecc: Add functions to deal with unaligned ECC memcpy +- external/ffspart: Improve error output +- libffs: Fix bad checks for partition overlap + + Not all TOCs are written at zero +- libflash/libffs: Allow caller to specifiy header partition + + An FFS TOC is comprised of two parts. The first is a small header which has a magic + and very minimal information about the TOC which will be common to all + partitions, things like number of partitions, block sizes and the like. + Following this small header are a series of entries. Importantly there + is always an entry which encompasses the TOC itself; this is usually + called the 'part' partition. + + Currently libffs always assumes that the 'part' partition is at zero. + While in practice there is always a TOC at zero, there doesn't actually have to be one. + PNORs may have multiple TOCs within them, therefore libffs needs to be + flexible enough to allow callers to specify TOCs not at zero. + + The 'part' partition is otherwise a regular partition which may have + flags associated with it. libffs should allow the user to set the flags + for the 'part' partition. + + This patch achieves both by allowing the caller to specify the 'part' + partition. The caller can choose not to, in which case libffs will provide a sensible + default. +- libflash/libffs: Refcount ffs entries + + Currently consumers can add a new ffs entry to multiple headers. This + is fine, but freeing any of the headers will cause the entry to be freed, + which causes double free problems. + + Even if only one header is used, the consumer of the library still has a + reference to the entry, which they may well reuse at some other point. 
+ + libffs will now refcount entries and only free when there are no more + references. + + This patch also removes the pointless return value of ffs_hdr_free() +- libflash/libffs: Switch to storing header entries in an array + + Since libffs no longer needs to sort the entries as they get added + it makes little sense to have the complexity of a linked list when an + array will suffice. +- libflash/libffs: Remove backup partition from TOC generation code + + It turns out this code was messy and not all that reliable. Doing it at + the library level adds complexity to the library and restrictions to the + caller. + + A simpler approach can be achieved by just instantiating multiple + ffs_header structures pointing to different parts of the same file. +- libflash/libffs: Remove the 'sides' from the FFS TOC generation code + + It turns out this code was messy and not all that reliable. Doing it at + the library level adds complexity to the library and restrictions to the + caller. + + A simpler approach can be achieved by just instantiating multiple + ffs_header structures pointing to different parts of the same file. +- libflash/libffs: Always add entries to the end of the TOC + + It turns out that sorted order isn't the best idea. This removes + flexibility from the caller. If the user wants their partitions in + sorted order, they should insert them in sorted order. +- external/ffspart: Remove side, order and backup options + + These options are currently flaky in libflash/libffs so there isn't + much point to being able to use them in ffspart. + + Future reworks planned for libflash/libffs will render these options + redundant anyway. +- libflash/libffs: ffs_close() should use ffs_hdr_free() +- libflash/libffs: Add setter for a partitions actual size +- pflash: Use ffs_entry_user_to_string() to standardise flag strings +- libffs: Standardise ffs partition flags + + It seems we've developed a character representation for ffs partition + flags. 
Currently only pflash really prints them so it hasn't been a + problem but now ffspart wants to read them in from user input. + + It is important that what libffs reads and what pflash prints remain + consistent, so we should move the code into libffs to avoid problems. +- external/ffspart: Allow # comments in input file + +p9dsu Platform changes +---------------------- + +The p9dsu platform from SuperMicro (also known as 'Boston') has received +a number of updates, and the patches once carried by SuperMicro are now +upstream. + +Since 6.0-rc1: + +- p9dsu: timeout for variant detection, default to 2uess + + +Since 5.11: + +- p9dsu: detect p9dsu variant even when hostboot doesn't tell us + + The SuperMicro BMC can tell us what riser type we have, which dictates + the PCI slot tables. Usually, in an environment that a customer would + experience, Hostboot will do the query with an SMC specific patch + (not upstream as there's no platform specific code in hostboot) + and skiboot knows what variant it is based on the compatible string. + + However, if you're using upstream hostboot, you only get the bare + 'p9dsu' compatible type. We can work around this by asking the BMC + ourselves and setting the slot table appropriately. We do this + synchronously in platform init so that we don't start probing + PCI before we set up the slot table. +- p9dsu: add slot power limit. +- p9dsu: add pci slot table for Boston LC 1U/2U and Boston LA/ESS. +- p9dsu HACK: fix system-vpd eeprom +- p9dsu: change esel command from AMI to IBM 0x3a. + +ZZ Platform Changes +------------------- + +- hdata/i2c: Fix up pci hotplug labels + + These labels are used on the devices used to do PCIe slot power control + for implementing PCIe hotplug. I'm not sure how they ended up as + "eeprom-pgood" and "eeprom-controller" since that doesn't make any sense. 
+- hdata/i2c: Ignore multi-port I2C devices + + Recent FSP firmware builds add support for multi-port I2C devices such + as the GPIO expanders used for the presence detect of OpenCAPI devices + and the PCIe hotplug controllers used to power cycle PCIe slots on ZZ. + + The OpenCAPI driver inside of skiboot currently uses a platform-specific + method to talk to the relevant I2C device rather than relying on HDAT + since not all platforms correctly report the I2C devices (hello Zaius). + Additionally the nature of multi-port devices requires that we have a device + specific handler so that we generate the correct DT bindings. Currently + we don't and there is no immediate need for this support so just ignore + the multi-port devices for now. +- hdata/i2c: Replace `i2c_` prefix with `dev_` + + The current naming scheme makes it easy to conflate "i2cm_port" and + "i2c_port." The latter is used to describe multi-port I2C devices such + as GPIO expanders and multi-channel PCIe hotplug controllers. Rename + i2c_port to dev_port to make the two a bit more distinct. + + Also rename i2c_addr to dev_addr for consistency. +- hdata/i2c: Ignore CFAM I2C master + + Recent FSP firmware builds put in information about the CFAM I2C master + in addition to the host I2C masters accessible via XSCOM. Odds are this + information should not be there since there's no handshaking between the + FSP/BMC and the host over who controls that I2C master, but it is so + we need to deal with it. + + This patch adds filtering to the HDAT parser so it ignores the CFAM I2C + master. Without this it will create a bogus i2cm@<addr> which might cause + issues. +- ZZ: hw/imc: Add support to load imc catalog lid file + + Add support to load the imc catalog from a lid file packaged + as part of the system firmware. Lid number allocated + is 0x80f00103.lid. 
+ + +Bugs Fixed +---------- + +Since 6.0-rc2: + +- core/opal: Fix recursion check in opal_run_pollers() + + An earlier commit introduced a counter variable poller_recursion to + limit the number of error messages shown when opal_pollers + are run recursively. However the check for the counter value was + placed in a way that the poller recursion was only detected the first 16 + times and then allowed afterwards. + + This patch fixes this by moving the check for the counter value inside + the conditional branch with some re-factoring so that opal_poller + recursion is not erroneously allowed after poll_recursion is detected + the first 16 times. +- phb4: Print WOF registers on fence detect + + Without the WOF registers it's hard to figure out what went wrong first, + so print those when we print the FIRs when a fence is detected. +- p9dsu: detect variant in init only if probe fails to found. + + Currently the slot table init happens twice, in both the probe and init + functions, due to the variant detection logic being called with an incorrect + condition check. + +Since 6.0-rc1: + +- core/direct-controls: improve p9_stop_thread error handling + + p9_stop_thread should fail the operation if it finds the thread was + already quiesced. This implies something else is doing direct controls + on the thread (e.g., pdbg) or there is some exceptional condition we + don't know how to deal with. Proceeding here would cause things to + trample on each other, for example the hard lockup watchdog trying to + send a sreset to the core while it is stopped for debugging with pdbg + will end in tears. + + If p9_stop_thread times out waiting for the thread to quiesce, do + not hit it with a core_start direct control, because we don't know + what state things are in and doing more things at this point is worse + than doing nothing. There is no good recipe described in the workbook + to de-assert the core_stop control if it fails to quiesce the thread. 
+ After timing out here, the thread may eventually quiesce and get + stuck, but that's simpler to debug than undefied behaviour. + +- core/direct-controls: fix p9_cont_thread for stopped/inactive threads + + Firstly, p9_cont_thread should check that the thread actually was + quiesced before it tries to resume it. Anything could happen if we + try this from an arbitrary thread state. + + Then when resuming a quiesced thread that is inactive or stopped (in + a stop idle state), we must not send a core_start direct control, + clear_maint must be used in these cases. +- hmi: Clear unknown debug trigger + + On some systems, seeing hangs like this when Linux starts: :: + + [ 170.027252763,5] OCC: All Chip Rdy after 0 ms + [ 170.062930145,5] INIT: Starting kernel at 0x20011000, fdt at 0x30ae0530 366247 bytes) + [ 171.238270428,5] OPAL: Switch to little-endian OS + + If you look at the in memory skiboot console (or do `nvram -p + ibm,skiboot --update-config log-level-driver=7`) we see the console get + spammed with: :: + + [ 5209.109790675,7] HMI: Received HMI interrupt: HMER = 0x0000400000000000 + [ 5209.109792716,7] HMI: Received HMI interrupt: HMER = 0x0000400000000000 + [ 5209.109794695,7] HMI: Received HMI interrupt: HMER = 0x0000400000000000 + [ 5209.109796689,7] HMI: Received HMI interrupt: HMER = 0x0000400000000000 + + We're taking the debug trigger (bit 17) early on, before the + hmi_debug_trigger function in the kernel is set up. + + This clears the HMI in Skiboot and reports to the kernel instead of + bringing down the machine. + +- core/hmi: assign flags=0 in case nothing set by handle_hmi_exception + + Theoretically we could have returned junk to the OS in this parameter. + +- SLW: Fix mambo boot to use stop states + + After commit 35c66b8ce5a2 ("SLW: Move MAMBO simulator checks to + slw_init"), mambo boot no longer calls add_cpu_idle_state_properties() + and as such we never enable stop states. 
+ + After adding the call back, we get more testing coverage as well + as faster mambo SMT boots. + +- phb4: Hardware init updates + + CFG Write Request Timeout was incorrectly set to informational and not + fatal for both non-CAPI and CAPI, so set it to fatal. This was a + mistake in the specification. Correcting this fixes a niche bug in + escalation (which is necessary on pre-DD2.2) that can cause a checkstop + due to an NCU timeout. + + In addition, set the values in the timeout control registers to match. + This fixes an extremely rare and unreproducible bug, though the current + timings don't make sense since they're higher than the NCU timeout (16) + which will checkstop the machine anyway. + +- SLW: quieten 'Configuring self-restore' for DARN,NCU_SPEC_BAR and HRMOR + +Since 5.11: + +- core: Fix iteration condition to skip garded cpu +- uart: fix uart_opal_flush to take console lock over uart_con_flush + This bug meant that OPAL_CONSOLE_FLUSH didn't take the appropriate locks. + Luckily, this call is currently only used in the crash path. +- xive: fix missing unlock in error path +- OPAL_PCI_SET_POWER_STATE: fix locking in error paths + + Otherwise we could exit OPAL holding locks, potentially leading + to all sorts of problems later on. +- hw/slw: Don't assert on a unknown chip + + For some reason skiboot populates nodes in /cpus/ for the cores on + chips that are deconfigured. As a result Linux includes the threads + of those cores in its set of possible CPUs in the system and attempts + to set the SPR values that should be used when waking a thread from + a deep sleep state. + + However, in the case where we have a deconfigured chip we don't create + an xscom node for that chip and as a result we don't have a proc_chip + structure for that chip either. In turn, this results in an assertion + failure when calling opal_slw_set_reg() since it expects the chip + structure to exist. Fix this up and print an error instead. 
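The change from asserting to reporting an error can be sketched as follows. This is an illustration only, not the skiboot implementation: `set_wakeup_reg` and the `chips` dictionary are hypothetical stand-ins for opal_slw_set_reg() and the proc_chip lookup, and returning OPAL_PARAMETER for the missing chip is an assumption about which error code is used.

```python
import logging

OPAL_SUCCESS = 0
OPAL_PARAMETER = -1   # assumed error code for this sketch


def set_wakeup_reg(chips: dict, chip_id: int, reg: str, value: int) -> int:
    """Record an SPR wakeup value, tolerating chips that were deconfigured."""
    chip = chips.get(chip_id)
    if chip is None:
        # A deconfigured chip has no chip structure: report it and carry
        # on instead of asserting and bringing the firmware down.
        logging.error("SLW: no chip structure for chip 0x%x", chip_id)
        return OPAL_PARAMETER
    chip[reg] = value
    return OPAL_SUCCESS
```

The caller sees a failed call for the deconfigured chip, while calls for configured chips keep working as before.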
+- opal/hmi: Generate one event per core for processor recovery. + + Processor recovery is per core error. All threads on that core receive + HMI. All threads don't need to generate HMI event for same error. + + Let thread 0 only generate the event. +- sensors: Dont add DTS sensors when OCC inband sensors are available + + There are two sets of core temperature sensors today. One is DTS scom + based core temperature sensors and the second group is the sensors + provided by OCC. DTS is the highest temperature among the different + temperature zones in the core while OCC core temperature sensors are + the average temperature of the core. DTS sensors are read directly by + the host by SCOMing the DTS sensors while OCC sensors are read and + updated by OCC to main memory. + + Reading DTS sensors by SCOMing is a heavy and slower operation as + compared to reading OCC sensors which is as good as reading memory. + So dont add DTS sensors when OCC sensors are available. +- core/fast-reboot: Increase timeout for dctl sreset to 1sec + + Direct control xscom can take more time to complete. We seem to + wait too little on Boston failing fast-reboot for no good reason. + + Increase timeout to 1 sec as a reasonable value for sreset to be delivered + and core to start executing instructions. +- occ: sensors-groups: Add DT properties to mark HWMON sensor groups + + Fix the sensor type to match HWMON sensor types. Add compatible flag + to indicate the environmental sensor groups so that operations on + these groups can be handled by HWMON linux interface. +- core: Correctly load initramfs in stb container + + Skiboot does not calculate the actual size and start location of the + initramfs if it is wrapped by an STB container (for example if loading + an initramfs from the ROOTFS partition). + + Check if the initramfs is in an STB container and determine the size and + location correctly in the same manner as the kernel. 
Since + load_initramfs() is called after load_kernel() move the call to + trustedboot_exit_boot_services() into load_and_boot_kernel() so it is + called after both of these. +- hdat/i2c.c: quieten "v2 found, parsing as v1" +- hw/imc: Check for pause_microcode_at_boot() return status + + pause_microcode_at_boot() loops through each chip's ucode + control block and pauses the ucode if it is in the running state. + But it does not fail if any of the chips' ucode is not initialised. + + Add code to return a failure if ucode is not initialized in any + of the chips. Since pause_microcode_at_boot() is called just before + attaching the IMC device nodes in imc_init(), add code to check for + the function return. + + +Slot location code fixes: + +- npu2: Use ibm,loc-code rather than ibm,slot-label + + The ibm,slot-label property is to name the slot that appears under a + PCIe bridge. In the past we (ab)used the slot tables to attach names + to GPU devices and their corresponding NVLinks which resulted in npu2.c + using slot-label as a location code rather than as a way to name slots. + + Fix this up since it's confusing. +- hdata/slots: Apply slot label to the parent slot + + Slot names only really make sense when applied to an actual slot rather + than a device. On witherspoon the GPU devices have a name associated with + the device rather than the slot for the GPUs. Add a hack that moves the + slot label to the parent slot rather than on the device itself. +- pci-dt-slot: Big ol' cleanup + + The underlying data that we get from HDAT can only really describe a + PCIe system. As such we can simplify the devicetree slot lookup code + by only caring about the important cases, namely, root ports and switch + downstream ports. + + This also fixes a bug where the root port didn't get a slot label applied + which results in devices under that port not having ibm,loc-code set. + This results in the EEH core being unable to report the location of + EEHed devices under that port. 
+ +opal-prd +^^^^^^^^ +- opal-prd: Insert powernv_flash module + + Explicitly load the powernv_flash module on BMC based systems so that we are sure + that the flash device is created before starting the opal-prd daemon. + + Note that I have replaced the pnor_available() check with is_fsp_system(), as we + want to load the module on BMC systems only. Also pnor_init has enough logic to + detect the flash device. Hence pnor_available() becomes a redundant check. + +NPU2/NVLINK2 +^^^^^^^^^^^^ +- npu2/hw-procedures: fence bricks on GPU reset + + The NPU workbook defines a way of fencing a brick and + getting the brick out of fence state. We do have an implementation + of bringing the brick out of fenced/quiesced state. We do + the latter in our procedures, but to support run time reset + we need to do the former. + + The fencing ensures that access to memory behind the links + will not lead to HMIs, but instead SUEs will be populated + in cache (in the case of speculation). The expectation is then + that prior to and after reset, the operating system components + will flush the cache for the region of memory behind the GPU. + + This patch does the following: + + 1. Implements a npu2_dev_fence_brick() function to set/clear + fence state + 2. Clears FIR bits prior to clearing the fence status + 3. Clears the fence status + 4. We take the powerbus out of CQ fence much later now, + in credits_check() which is the last hardware procedure + called after link training. +- hw/npu2.c: Remove static configuration of NPU2 register + + The NPU_SM_CONFIG0 register currently needs to be configured in Skiboot to + select NVLink mode, however Hostboot should configure other bits in this + register. + + For some reason Skiboot was explicitly clearing bit-6 + (CONFIG_DISABLE_VG_NOT_SYS). It is unclear why this bit was getting cleared + as recent Hostboot versions explicitly set it to the correct value based on + the specific system configuration. Therefore Skiboot should not alter it. 
+ + Bit-58 (CONFIG_NVLINK_MODE) selects if NVLink mode should be enabled or + not. Hostboot does not configure this bit so Skiboot should continue to + configure it. +- npu2: Improve log output of GPU-to-link mapping + + Debugging issues related to unconnected NVLinks can be a little less + irritating if we use the NPU2DEV{DBG,INF}() macros instead of prlog(). + + In short, change this: :: + + NPU2: comparing GPU 'GPU2' and NPU2 'GPU1' + NPU2: comparing GPU 'GPU3' and NPU2 'GPU1' + NPU2: comparing GPU 'GPU4' and NPU2 'GPU1' + NPU2: comparing GPU 'GPU5' and NPU2 'GPU1' + : + npu2_dev_bind_pci_dev: No PCI device for NPU2 device 0006:00:01.0 to bind to. If you expect a GPU to be there, this is a problem. + + to this: :: + + NPU6:0:1.0 Comparing GPU 'GPU2' and NPU2 'GPU1' + NPU6:0:1.0 Comparing GPU 'GPU3' and NPU2 'GPU1' + NPU6:0:1.0 Comparing GPU 'GPU4' and NPU2 'GPU1' + NPU6:0:1.0 Comparing GPU 'GPU5' and NPU2 'GPU1' + : + NPU6:0:1.0 No PCI device found for slot 'GPU1' +- npu2: Move NPU2_XTS_BDF_MAP_VALID assignment to context init + + A bad GPU or other condition may leave us with a subset of links that + never get initialized. If an ATSD is sent to one of those bricks, it + will never complete, leaving us waiting forever for a response: :: + + watchdog: BUG: soft lockup - CPU#23 stuck for 23s! [acos:2050] + ... 
+ Modules linked in: nvidia_uvm(O) nvidia(O) + CPU: 23 PID: 2050 Comm: acos Tainted: G W O 4.14.0 #2 + task: c0000000285cfc00 task.stack: c000001fea860000 + NIP: c0000000000abdf0 LR: c0000000000acc48 CTR: c0000000000ace60 + REGS: c000001fea863550 TRAP: 0901 Tainted: G W O (4.14.0) + MSR: 9000000000009033 <SF,HV,EE,ME,IR,DR,RI,LE> CR: 28004484 XER: 20040000 + CFAR: c0000000000abdf4 SOFTE: 1 + GPR00: c0000000000acc48 c000001fea8637d0 c0000000011f7c00 c000001fea863820 + GPR04: 0000000002000000 0004100026000000 c0000000012778c8 c00000000127a560 + GPR08: 0000000000000001 0000000000000080 c000201cc7cb7750 ffffffffffffffff + GPR12: 0000000000008000 c000000003167e80 + NIP [c0000000000abdf0] mmio_invalidate_wait+0x90/0xc0 + LR [c0000000000acc48] mmio_invalidate.isra.11+0x158/0x370 + + + ATSDs are only sent to bricks which have a valid entry in the XTS_BDF + table. So to prevent the hang, don't set NPU2_XTS_BDF_MAP_VALID unless + we make it all the way to creating a context for the BDF. + +Secure and Trusted Boot +^^^^^^^^^^^^^^^^^^^^^^^ +- hdata/tpmrel: detect tpm not present by looking up the stinfo->status + + Skiboot detects if tpm is present by checking if a secureboot_tpm_info + entry exists. However, if a tpm is not present, hostboot also creates a + secureboot_tpm_info entry. In this case, hostboot creates an empty + entry, but setting the field tpm_status to TPM_NOT_PRESENT. + + This detects if tpm is not present by looking up the stinfo->status. + + This fixes the "TPMREL: TPM node not found for chip_id=0 (HB bug)" + issue, reproduced when skiboot is running on a system that has no tpm. + +PCI +^^^ +- phb4: Restore bus numbers after CRS + + Currently we restore PCIe bus numbers right after the link is + up. Unfortunately as this point we haven't done CRS so config space + may not be accessible. + + This moves the bus number restore till after CRS has happened. 
+- romulus: Add a barebones slot table +- phb4: Quieten and improve "Timeout waiting for electrical link" + + This happens normally if a slot doesn't have a working HW presence + detect and relies instead on inband presence detect. + + The message we display is scary and not very useful unless you + are debugging, so quieten it and change it to something more + meaningful. +- pcie-slot: Don't fail powering on an already on switch + + If the power state is already the required value, return + OPAL_SUCCESS rather than OPAL_PARAMETER to avoid spurious + errors during boot. + +CAPI/OpenCAPI +^^^^^^^^^^^^^ +- capi: Keep the current mmio windows in the mbt cache table. + + When the phb is used as a CAPI interface, the current mmio windows list + is cleaned before adding the capi and the prefetchable memory (M64) + windows, which implies that the non-prefetchable BAR is no longer + configured. + This patch sets only the mbt bar needed to pass the capi mmio window and + keeps, as defined, the other mmio values (M32 and M64). +- npu2-opencapi: Fix 'link internal error' FIR, take 2 + + When setting up an opencapi link, we set the transport muxes first, + then set the PHY training config register, which includes disabling + nvlink mode for the bricks. That's the order of the init sequence, as + found in the NPU workbook. + + In reality, doing so works, but it raises 2 FIR bits in the PowerBus + OLL FIR Register for the 2 links when we configure the transport + muxes. Presumably this is because nvlink is not disabled yet and we are + configuring the transport muxes for opencapi. + + bit 60: + link0 internal error + bit 61: + link1 internal error + + Overall the current setup ends up being correct and everything works, + but we raise 2 FIR bits. + + So tweak the order of operations to disable nvlink before configuring + the transport muxes. Incidentally, this is what the scripts from the + opencapi enablement team were doing all along. 
+- npu2-opencapi: Fix 'link internal error' FIR, take 1 + + When we setup a link, we always enable ODL0 and ODL1 at the same time + in the PHY training config register, even though we are setting up + only one OTL/ODL, so it raises a "link internal error" FIR bit in the + PowerBus OLL FIR Register for the second link. The error is harmless, + as we'll eventually setup the second link, but there's no reason to + raise that FIR bit. + + The fix is simply to only enable the ODL we are using for the link. +- phb4: Do not set the PBCQ Tunnel BAR register when enabling capi mode. + + The cxl driver will set the capi value, like other drivers already do. +- phb4: set TVT1 for tunneled operations in capi mode + + The ASN indication is used for tunneled operations (as_notify and + atomics). Tunneled operation messages can be sent in PCI mode as + well as CAPI mode. + + The address field of as_notify messages is hijacked to encode the + LPID/PID/TID of the target thread, so those messages should not go + through address translation. Therefore bit 59 is part of the ASN + indication. + + This patch sets TVT#1 in bypass mode when capi mode is enabled, + to prevent as_notify messages from being dropped. + +Debugging/Testing improvements +------------------------------ + +Since 6.0-rc1: + +- mambo: Enable XER CA32 and OV32 bits on P9 + + POWER9 adds 32 bit carry and overflow bits to the XER, but we need to + set the relevant CTRL1 bit to enable them. +- Makefile: Fix building natively on ppc64le + + When on ppc64le and CROSS is not set by the environment, make assumes + ppc64 and sets a default CROSS. Check for ppc64le as well, so that + 'make' works out of the box on ppc64le. 
+- Experimental support for building with Clang +- Improvements to testing and Travis CI + +Since 5.11: + +- core/stack: backtrace unwind basic OPAL call details + + Put OPAL callers' r1 into the stack back chain, and then use that to + unwind back to the OPAL entry frame (as opposed to boot entry, which + has a 0 back chain). + + From there, dump the OPAL call token and the caller's r1. A backtrace + looks like this: :: + + CPU 0000 Backtrace: + S: 0000000031c03ba0 R: 000000003001a548 ._abort+0x4c + S: 0000000031c03c20 R: 000000003001baac .opal_run_pollers+0x3c + S: 0000000031c03ca0 R: 000000003001bcbc .opal_poll_events+0xc4 + S: 0000000031c03d20 R: 00000000300051dc opal_entry+0x12c + --- OPAL call entry token: 0xa caller R1: 0xc0000000006d3b90 --- + + This is pretty basic for the moment, but it does give you the bottom + of the Linux stack. It will allow some interesting improvements in + future. + + First, with the eframe, all the call's parameters can be printed out + as well. The ___backtrace / ___print_backtrace API needs to be + reworked in order to support this, but it's otherwise very simple + (see opal_trace_entry()). + + Second, it will allow Linux's stack to be passed back to Linux via + a debugging opal call. This will allow Linux's BUG() or xmon to + also print the Linux back trace in case of an NMI or MCE or watchdog + lockup that hits in OPAL. +- asm/head: implement quiescing without stack or clobbering regs + + Quiescing currently is implemented in C in opal_entry before the + opal call handler is called. This works well enough for simple + cases like fast reset when one CPU wants all others out of the way. + + Linux would like to use it to prevent an sreset IPI from + interrupting firmware, which could lead to deadlocks when crash + dumping or entering the debugger. Linux interrupts do not recover + well when returning back to general OPAL code, due to r13 not being + restored. 
OPAL also can't be re-entered, which may happen e.g., + from the debugger. + + So move the quiesce hold/reject to entry code, before the stack or + r1 or r13 registers are switched. OPAL can be interrupted and + returned to or re-entered during this period. + + This does not completely solve all such problems. OPAL will be + interrupted with sreset if the quiesce times out, and it can be + interrupted by MCEs as well. These still have the issues above. +- core/opal: Allow poller re-entry if OPAL was re-entered + + If an NMI interrupts the middle of running pollers and the OS + invokes pollers again (e.g., for console output), the poller + re-entrancy check will prevent it from running and spam the + console. + + That check was designed to catch a poller calling opal_run_pollers; + OPAL re-entrancy is something different and is detected elsewhere. + Avoid the poller recursion check if OPAL has been re-entered. This + is a best-effort attempt to cope with errors. +- core/opal: Emergency stack for re-entry + + This detects OPAL being re-entered by the OS, and switches to an + emergency stack if it was. This protects the firmware's main stack + from re-entrancy and allows the OS to use NMI facilities for crash + / debug functionality. + + Further nested re-entry will destroy the previous emergency stack + and prevent returning, but those should be rare cases. + + This stack is sized at 16kB, which doubles the size of CPU stacks, + so as not to introduce a regression in primary stack size. The 16kB + stack originally had a 4kB machine check stack at the top, which was + removed by 80eee1946 ("opal: Remove machine check interrupt patching + in OPAL."). So it is possible the size could be tightened again, but + that would require further analysis. + +- hdat_to_dt: hash_prop the same on all platforms + Fixes this unit test on ppc64le hosts. +- mambo: Add persistent memory disk support + + This adds support for mapping disk images using persistent + memory. 
Disks can be added by setting this ENV variable: + + PMEM_DISK="/mydisks/disk1.img,/mydisks/disk2.img" + + These will show up in Linux as /dev/pmem0 and /dev/pmem1. + + This uses a new feature in mambo "mysim memory mmap .." which is only + available since mambo commit 0131f0fc08 (from 24/4/2018). + + This also needs the of_pmem.c driver in Linux which is only available + since v4.17. It works with powernv_defconfig + CONFIG_OF_PMEM. +- external/mambo: Add di command to decode instructions + + By default you get 16 instructions but you can specify the number you + want. i.e. :: + + systemsim % di 0x100 4 + 0x0000000000000100: Enc:0xA64BB17D : mtspr HSPRG1,r13 + 0x0000000000000104: Enc:0xA64AB07D : mfspr r13,HSPRG0 + 0x0000000000000108: Enc:0xF0092DF9 : std r9,0x9F0(r13) + 0x000000000000010C: Enc:0xA6E2207D : mfspr r9,PPR + + Using di since it's what xmon uses. +- mambo/mambo_utils.tcl: Inject an MCE at a specified address + + Currently we don't support injecting an MCE on a specific address. + This is useful for testing functionality like memcpy_mcsafe() + (see https://patchwork.ozlabs.org/cover/893339/) + + The core of the functionality is a routine called + inject_mce_ue_on_addr, which takes an addr argument and injects + an MCE (load/store with UE) when the specified address is accessed + by code. This functionality can easily be enhanced to cover + instruction UE's as well. + + A sample use case to create an MCE on stack access would be :: + + set addr [mysim display gpr 1] + inject_mce_ue_on_addr $addr + + This would cause an mce on any r1 or r1 based access +- external/mambo: improve helper for machine checks + + Improve workarounds for stop injection, because mambo often will + trigger on 0x104/204 when injecting sreset/mces. + + This also adds a workaround to skip injecting on reservations to + avoid infinite loops when doing inject_mce_step. +- travis: Enable ppc64le builds + + At least on the IBM Travis Enterprise instance, we can now do + ppc64le builds! 
+ + We can only build a subset of our matrix due to availability of + ppc64le distros. The Dockerfiles need some tweaking to only + attempt to install (x86_64 only) Mambo binaries, as well as the + build scripts. +- external: Add "lpc" tool + + This is a little front-end to the lpc debugfs files to access + the LPC bus from userspace on the host. +- core/test/run-trace: fix on ppc64el diff --git a/roms/skiboot/doc/release-notes/skiboot-6.1-rc1.rst b/roms/skiboot/doc/release-notes/skiboot-6.1-rc1.rst new file mode 100644 index 000000000..3ae436d9e --- /dev/null +++ b/roms/skiboot/doc/release-notes/skiboot-6.1-rc1.rst @@ -0,0 +1,466 @@ +.. _skiboot-6.1-rc1: + +skiboot-6.1-rc1 +=============== + +skiboot v6.1-rc1 was released on Friday June 22nd 2018. It is the first +release candidate of skiboot 6.1, which will become the new stable release +of skiboot following the 6.0 release, first released May 11th 2018. + +Skiboot 6.1 will mark the basis for op-build v2.1. + +skiboot v6.1-rc1 contains all bug fixes as of :ref:`skiboot-6.0.4`, +and :ref:`skiboot-5.4.9` (the currently maintained +stable releases). + +For how the skiboot stable releases work, see :ref:`stable-rules` for details. + +This release contains a lot of small cleanups and fixes all over the place, +which is possibly a sign that we've shipped our big POWER9 GA release and +now get to breathe for a moment to look at what we ended up with. +Since this is a really small incremental release, there will unlikely be +many release candidates. + +Over skiboot 6.0, we have the following changes: + +General changes and bug fixes +----------------------------- + +- GCC8 build fixes +- Add prepare_hbrt_update to hbrt interfaces + + Add placeholder support for prepare_hbrt_update call into + hostboot runtime (opal-prd) code. This interface is only + called as part of a concurrent code update on a FSP based + system. 
+- cpu: Clear PCR SPR in opal_reinit_cpus() + + Currently if Linux boots with a non-zero PCR, things can go bad where + some early userspace programs can take illegal instructions. This is + being fixed in Linux, but in the mean time, we should cleanup in + skiboot also. +- pci: Fix PCI_DEVICE_ID() + + The vendor ID is 16 bits not 8. This error leaves the top of the vendor + ID in the bottom bits of the device ID, which resulted in e.g. a failure + to run the PCI quirk for the AST VGA device. +- Quieten console output on boot + + We print out a whole bunch of things on boot, most of which aren't + interesting, so we should *not* print them instead. + + Printing things like what CPUs we found and what PCI devices we found + *are* useful, so continue to do that. But we don't need to splat out + a bunch of things that are always going to be true. +- core/console: fix deadlock when printing with console lock held + + Some debugging options will print while the console lock is held, + which is why the console lock is taken as a recursive lock. + However console_write calls __flush_console, which will drop and + re-take the lock non-recursively in some cases. + + Just set con_need_flush and return from __flush_console if we are + holding the console lock already. 
+ + This stack usage message (taken with this patch applied) could lead + to a deadlock without this: :: + + CPU 0000 lowest stack mark 11768 bytes left pc=300cb808 token=0 + CPU 0000 Backtrace: + S: 0000000031c03370 R: 00000000300cb808 .list_check_node+0x1c + S: 0000000031c03410 R: 00000000300cb910 .list_check+0x38 + S: 0000000031c034b0 R: 00000000300190ac .try_lock_caller+0xb8 + S: 0000000031c03540 R: 00000000300192e0 .lock_caller+0x80 + S: 0000000031c03600 R: 0000000030012c70 .__flush_console+0x134 + S: 0000000031c036d0 R: 00000000300130cc .console_write+0x68 + S: 0000000031c03780 R: 00000000300347bc .vprlog+0xc8 + S: 0000000031c03970 R: 0000000030034844 ._prlog+0x50 + S: 0000000031c03a00 R: 00000000300364a4 .log_simple_error+0x74 + S: 0000000031c03b90 R: 000000003004ab48 .occ_pstates_init+0x184 + S: 0000000031c03d50 R: 000000003001480c .load_and_boot_kernel+0x38c + S: 0000000031c03e30 R: 000000003001571c .main_cpu_entry+0x62c + S: 0000000031c03f00 R: 0000000030002700 boot_entry+0x1c0 +- opal-prd: Do not error out on first failure for soft/hard offline. + + The memory errors (CEs and UEs) that are detected as part of background + memory scrubbing are reported by PRD asynchronously to opal-prd along with + affected memory ranges. hservice_memory_error() converts these ranges into + page granularity before hooking them up to the soft/hard offline-ing + infrastructure. + + But the current implementation of hservice_memory_error() does not hook up + all the pages to soft/hard offline-ing if any of the page offline actions + fail. For example, hard offline can fail for: + + - Pages that are not part of buddy managed pool. + - Pages that are reserved by kernel using memblock_reserved() + - Pages that are in use by kernel. + + But for the pages that are in use by a user space application, the hard + offline marks the page as hwpoison, sends a SIGBUS signal to kill the + affected application as recovery action and returns success. 
+ + Hence, it is possible that some of the pages in that memory range are in + use by an application or free. By stopping on the first error we lose the + opportunity to hwpoison the subsequent pages which may be free or in use by + an application. This patch fixes this issue. +- libflash/blocklevel_write: Fix missing error handling + + Caught by scan-build, we seem to trap the errors in rc, but + not take any recovery action during blocklevel_write. + +I2C +^^^ +- p8-i2c: fix wrong request status when a reset is needed + + If the bus is found in error state when starting a new request, the + engine is reset and we enter recovery. However, once complete, the + reset operation shows a status of complete in the status register. So + any badly-timed call to check_status() will think the current top + request is complete, even though it hasn't run yet. + + So don't update any request status while we are in recovery, as + nothing useful for the request is supposed to happen in that state. +- p8-i2c: Remove force reset + + Force reset was added as an attempt to work around some issues with TPM + devices locking up their I2C bus. In that particular case the problem + was that the device would hold the SCL line down permanently due to a + device firmware bug. The force reset doesn't actually do anything to + alleviate the situation here, it just happens to reset the internal + master state enough to make the I2C driver appear to work until + something tries to access the bus again. + + On P9 systems with secure boot enabled there is the added problem + of the "diagnostic mode" not being supported on I2C masters A,B,C and + D. Diagnostic mode allows the SCL and SDA lines to be driven directly + by software. Without this, force reset is impossible to implement. + + This patch removes the force reset functionality entirely since: + + a) it doesn't do what it's supposed to, and + b) it's butt ugly code + + Additionally, turn p8_i2c_reset_engine() into p8_i2c_reset_port(). 
+ There's no need to reset every port on a master in response to an + error that occurred on a specific port. +- libstb/i2c-driver: Bump max timeout + + We have observed some TPMs clock stretching the I2C bus for significant + amounts of time when processing commands. The same TPMs also have + errata that can result in permanently locking up a bus in response to + an I2C transaction they don't understand. Use an excessively long + timeout to prevent this in the field. +- hdata: Add TPM timeout workaround + + Set the default timeout for any bus containing a TPM to one second. This + is needed to work around a bug in the firmware of certain TPMs that will + clock stretch the I2C port for up to a second. Additionally, when the + TPM is clock stretching it responds to a STOP condition on the bus by + bricking itself. Clearing this error requires a hard power cycle of the + system since the TPM is powered by standby power. +- p8-i2c: Allow a per-port default timeout + + Add support for setting a default timeout for the I2C port to the + device-tree. This is consumed by skiboot. + +IPMI Watchdog +^^^^^^^^^^^^^ +- ipmi-watchdog: Support handling re-initialization + + Watchdog resets can return an error code from the BMC indicating that + the BMC watchdog was not initialized. Currently we abort skiboot due to + a missing error handler. This patch implements handling + re-initialization for the watchdog, automatically saving the last + watchdog set values and re-issuing them if needed. +- ipmi-watchdog: The stop action should disable reset + + Otherwise it is possible for the reset timer to elapse and trigger the + watchdog to wake back up. This doesn't affect the behavior of the + system since we are providing a NONE action to the BMC. However we would + like to avoid the action from taking place if possible. 
+- ipmi-watchdog: Add a flag to determine if we are still ticking + + This makes it easier for future changes to ensure that the watchdog + stops ticking and doesn't requeue itself for execution in the + background. This way it is safe for resets to be performed after the + ticks are assumed to be stopped and it won't start the timer again. +- ipmi-watchdog: (prepare for) not disabling at shutdown + + The op-build linux kernel has been configured to support the ipmi + watchdog. This driver will always handle the watchdog by either leaving + it enabled if configured, or by disabling it during module load if no + configuration is provided. This increases the coverage of the watchdog + during the boot process. The watchdog should no longer be disabled at + any point during skiboot execution. + + We're not enabling this by default yet as people can (and do, at least in + development) mix and match old BOOTKERNEL with new skiboot and we don't + want to break that too obviously. +- ipmi-watchdog: Don't reset the watchdog twice + + There is no clarification for why this change was needed, but presumably + this is due to a buggy BMC implementation where the Watchdog Set command + was processed concurrently or after the initial Watchdog Reset. This + inversion would cause the watchdog to stop since the DONT_STOP bit was + not set. Since we are now using the DONT_STOP bit during initialization, + the watchdog should not be stopped even if an inversion occurs. +- ipmi-watchdog: Make it possible to set DONT_STOP + + The IPMI standard supports setting a DONT_STOP bit during an Watchdog + Set operation. Most of the time we don't want to stop the Watchdog when + updating the settings so we should be using this bit. This patch makes + it possible for callers of set_wdt to prevent the watchdog from being + stopped. This only changes the behavior of the watchdog during the + initial settings update when initializing skiboot. 
The watchdog is no + longer disabled and then immediately re-enabled. +- ipmi-watchdog: WD_POWER_CYCLE_ACTION -> WD_RESET_ACTION + + The IPMI specification denotes that action 0x1 is Host Reset and 0x3 is + Host Power Cycle. Use the correct name for Reset in our watchdog code. + + +POWER8 platforms +---------------- + +- astbmc: Enable mbox depending on scratch reg + + P8 boxes can opt in for mbox pnor support if they set the scratch + register bit to indicate it is supported. + +Simulator platforms +------------------- +- plat/qemu: add PNOR support + + To access the PNOR, OPAL/skiboot drives the BMC SPI controller using + the iLPC2AHB device of the BMC SuperIO controller and accesses the + flash contents using the LPC FW address space on which the PNOR is + remapped. + + The QEMU PowerNV machine now integrates such models (SuperIO + controller, iLPC2AHB device) and also a pseudo Aspeed SoC AHB memory + space populated with the SPI controller registers (same model as for + ARM). The AHB window giving access to the contents of the BMC SPI + controller flash modules is mapped on the LPC FW address space. + + The change should be compatible for machine without PNOR support. +- external/mambo: Add support for readline if it exists + + Add support for tclreadline package if it is present. + This patch loads the package and uses it when the + simulation stops for any reason. + + +FSP based platforms +------------------- + +- Disable fast reboot on FSP IPL side change + + If FSP changes next IPL side, then disable fast reboot. + + sample output: :: + + [ 620.196442259,5] FSP: Got sysparam update, param ID 0xf0000007 + [ 620.196444501,5] CUPD: FW IPL side changed. Disable fast reboot + [ 620.196445389,5] CUPD: Next IPL side : perm +- fsp/console: Always establish OPAL console API backend + + Currently we only call set_opal_console() to establish the backend + used by the OPAL console API if we find at least one FSP serial + port in HDAT. 
+ + On systems where there is none (IPMI only), we fail to set it, + causing the console code to try to use the dummy console causing + an assertion failure during boot due to clashing on the device-tree + node names. + + So always set it if an FSP is present + +AST BMC based platforms +----------------------- + +- AMI BMC: use 0x3a as OEM command + + The 0x3a OEM command is for IBM commands, while 0x32 was for AMI ones. + Sometime in the P8 timeframe, AMI BMCs were changed to listen for our + commands on either 0x32 or 0x3a. Since 0x3a is the direction forward, + we'll use that, as P9 machines with AMI BMCs probably also want these + to work, and let's not bet that 0x32 will continue to be okay. +- astbmc: Set romulus BMC type to OpenBMC +- platform/astbmc: Do not delete compatible property + + P9 onwards OPAL is building device tree for BMC based system using + HDAT. We are populating bmc/compatible node with bmc version. Hence + do not delete this property. + +Utilities +--------- +- external/xscom-utils: Add python library for xscom access + + Patch adds a simple python library module for xscom access. + It directly manipulate the '/access' file for scom read + and write from debugfs 'scom' directory. + + Example on how to generate a getscom using this module: + + .. code-block:: python + + from adu_scoms import * + getscom = GetSCom() + getscom.parse_args() + getscom.run_command() + + Sample output for above getscom.py: + + .. code-block:: console + + # ./getscom.py -l + Chip ID | Rev | Chip type + ---------|-------|----------- + 00000008 | DD2.0 | P9 (Nimbus) processor + 00000000 | DD2.0 | P9 (Nimbus) processor +- ffspart: Don't require user to create blank partitions manually + + Add '--allow-empty' which allows the filename for a given partition to + be blank. If set ffspart will set that part of the PNOR file 'blank' and + set ECC bits if required. 
+ Without this option behaviour is unchanged and ffspart will return an + error if it cannot find the partition file. +- pflash: Use correct prefix when installing + + pflash uses lowercase prefix when running make install in its + directory, but uppercase PREFIX when running it in shared. Use + lowercase everywhere. + + With this the OpenBMC bitbake recipe can drop an out of tree patch it's + been carrying for years. + + +POWER9 +------ + +- occ-sensor: Avoid using uninitialised struct cpu_thread + + When adding the sensors in occ_sensors_init, if the type is not + OCC_SENSOR_LOC_CORE, then the loop to find 'c' will not be executed. + Then c->pir is used for both of the add_sensor_node calls below. + + This provides a default value of 0 instead. +- NX: Add NX coprocessor init opal call + + The read offset (4:11) in Receive FIFO control register is incremented + by FIFO size whenever a CRB is read by NX. But the index in RxFIFO has to + match with the corresponding entry in FIFO maintained by VAS in kernel. + VAS entry is reset to 0 when opening the receive window during driver + initialization. So when NX842 is reloaded or in kexec boot, there is a + possibility of a mismatch between RxFIFO control register and VAS entries + in kernel. It could cause CRB failure / timeout from NX. + + This patch adds nx_coproc_init opal call for kernel to initialize + readOffset (4:11) and Queued (15:23) in RxFIFO control register. +- SLW: Remove stop1_lite and stop2_lite + + stop1_lite has been removed since it adds no additional benefit + over stop0_lite. stop2_lite has been removed since currently it adds + minimal benefit over stop2. However, the benefit is eclipsed by the time + required to ungate the clocks. + + Moreover, Lite states don't give up the SMT resources, and can potentially + have a performance impact on sibling threads. 
+ + Since current OSs (Linux) aren't smart enough to make good decisions + with these stop states, we're (temporarily) removing them from what + we expose to the OS, the idea being to bring them back in a new + DT representation so that only an OS that knows what to do will + do things with them. +- cpu: Use STOP1 on POWER9 for idle/sleep inside OPAL + + The current code requests STOP3, which means it gets STOP2 in practice. + + STOP2 has proven to occasionally be unreliable depending on FW + version and chip revision, it also requires a functional CME, + so instead, let's use STOP1. The difference is rather minimal + for something that is only used a few seconds during boot. + +NPU2 (NVLink2 and OpenCAPI) +^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +- npu2: Reset NVLinks on hot reset + + This effectively fences GPU RAM on GPU reset so the host system + does not have to crash every time we stop a KVM guest with a GPU + passed through. +- npu2-opencapi: reduce number of retries to train the link + + We've been reliably training the opencapi link on the first attempt + for quite a while. Furthermore, if it doesn't train on the first + attempt, retries haven't been that useful. So let's reduce the number + of attempts we do to train the link. + + 2 retries = 3 attempts to train. + + Each (failed) training sequence costs about 3 seconds. +- opal/hmi: Display correct chip id while printing NPU FIRs. + + HMIs for NPU xstops are broadcast to all chips. All cores on all the + chips receive the HMI. The HMI handler correctly identifies and extracts the + NPU FIR details from the affected chip, but while printing FIR data it + prints chip id and location code details of this_cpu()->chip_id which + may not be correct. This patch fixes this issue. +- npu2-opencapi: Fix link state to report link down + + The PHB callback 'get_link_state' is always reporting the link width, + irrespective of the link status and even when the link is down. 
It is + causing too much work (and failures) when the PHB is probed during pci + init. + The fix is to look at the link status first and report the link as + down when appropriate. +- npu2-opencapi: Cleanup traces printed during link training + + Now that links may train in parallel, traces shown during training can + be all mixed up. So add a prefix to all the traces to clearly identify + the chip and link the trace refers to: :: + + OCAPI[<chip id>:<link id>]: this is a very useful message + + The lower-level hardware procedures (npu2-hw-procedures.c) also print + traces which would need work. But that code is being reworked to be + better integrated with opencapi and nvidia, so leave it alone for now. +- npu2-opencapi: Train links on fundamental reset + + Reorder our link training steps so that they are executed on + fundamental reset instead of during the initial setup. Skiboot always + call a fundamental reset on all the PHBs during pci init. + + It is done through a state machine, similarly to what is done for + 'real' PHBs. + + This is the first step for a longer term goal to be able to trigger an + adapter reset from linux. We'll need the reset callbacks of the PHB to + be defined. We have to handle the various delays differently, since a + linux thread shouldn't stay stuck waiting in opal for too long. +- npu2-opencapi: Rework adapter reset + + Rework a bit the code to reset the opencapi adapter: + + - make clearer which i2c pin is resetting which device + - break the reset operation in smaller chunks. This is really to + prepare for a future patch. + + No functional changes. +- npu2-opencapi: Use presence detection + + Presence detection is not part of the opencapi specification. So each + platform may choose to implement it the way it wants. + + All current platforms implement it through an i2c device where we can + query a pin to know if a device is connected or not. 
ZZ and Zaius have + a similar design and even use the same i2c information and pin + numbers. + However, presence detection on older ZZ planar (older than v4) doesn't + work, so we don't activate it for now, until our lab systems are + upgraded and it's better tested. + + Presence detection on witherspoon is still being worked on. It's + shaping up to be quite different, so we may have to revisit the topic + in a later patch. diff --git a/roms/skiboot/doc/release-notes/skiboot-6.1.rst b/roms/skiboot/doc/release-notes/skiboot-6.1.rst new file mode 100644 index 000000000..45d87e4aa --- /dev/null +++ b/roms/skiboot/doc/release-notes/skiboot-6.1.rst @@ -0,0 +1,651 @@ +.. _skiboot-6.1: + +skiboot-6.1 +=========== + +skiboot v6.1 was released on Wednesday July 11th 2018. It is the first +release of skiboot 6.1, which is the new stable release of skiboot +following the 6.0 release, first released May 11th 2018. + +Skiboot 6.1 is the basis for op-build v2.1 and contains all bug fixes as +of :ref:`skiboot-6.0.5`, and :ref:`skiboot-5.4.9` (the currently maintained +stable releases). We expect further stable releases in the 6.0.x and 5.4.x +series, while we do not expect to do any stable releases of 6.1.x. + +This final 6.1 release follows a single release candidate release, as this +cycle we have been rather quiet, with mainly cleanup and bug fix patches +going in. + +For how the skiboot stable releases work, see :ref:`stable-rules` for details. + +Over skiboot-6.0, we have the following changes: + +General changes and bug fixes +----------------------------- + +Since :ref:`skiboot-6.1-rc1`: + +- slw: Fix trivial typo in debug message +- vpd: Add vendor property to processor node + + Processor FRU vpd doesn't contain vendor detail. We have to parse + module VPD to get vendor detail. + +- vpd: Sanitize VPD data + + On OpenPower system, VPD keyword size tells us the maximum size of the data. + But they fill trailing end with space (0x20) instead of NULL. 
Also spec + doesn't stop user to have space (0x20) within actual data. + + This patch discards trailing spaces before populating device tree. +- core: always flush console before stopping + + This catches a few cases (e.g., fast reboot failure messages) that + don't always make it to the console before the machine is rebooted. +- core/cpu: parallelise global CPU register setting jobs + + On a 176 thread system, before: :: + + [ 122.319923233,5] OPAL: Switch to big-endian OS + [ 126.317897467,5] OPAL: Switch to little-endian OS + + after: :: + + [ 212.439299889,5] OPAL: Switch to big-endian OS + [ 212.469323643,5] OPAL: Switch to little-endian OS +- init, occ: Initialise OCC earlier on BMC systems + + We need to use the OCC to obtain presence data for the SXM2 slots on + Witherspoon systems. This is needed to determine device type for NVLink + GPUs and OpenCAPI devices which can be plugged into the same slot. Support + for this will be implemented in a future patch. + + Currently, OCC initialisation is done just before handing over to Linux, + which is well after NPU probe. On FSP systems, OCC boot starts very late, + so we wait until the last possible moment to initialise the skiboot side in + order to give it the maximum time to boot. On BMC systems, OCC boot starts + earlier, so there aren't any issues in moving it earlier in the skiboot + init sequence. + + When running on a BMC machine, call occ_pstates_init() as early as + possible in the init sequence. On FSP machines, continue to call it from + its current location. + +Since :ref:`skiboot-6.0`: + +- GCC8 build fixes +- Add prepare_hbrt_update to hbrt interfaces + + Add placeholder support for prepare_hbrt_update call into + hostboot runtime (opal-prd) code. This interface is only + called as part of a concurrent code update on a FSP based + system. 
+- cpu: Clear PCR SPR in opal_reinit_cpus() + + Currently if Linux boots with a non-zero PCR, things can go bad where + some early userspace programs can take illegal instructions. This is + being fixed in Linux, but in the mean time, we should cleanup in + skiboot also. +- pci: Fix PCI_DEVICE_ID() + + The vendor ID is 16 bits not 8. This error leaves the top of the vendor + ID in the bottom bits of the device ID, which resulted in e.g. a failure + to run the PCI quirk for the AST VGA device. +- Quieten console output on boot + + We print out a whole bunch of things on boot, most of which aren't + interesting, so we should *not* print them instead. + + Printing things like what CPUs we found and what PCI devices we found + *are* useful, so continue to do that. But we don't need to splat out + a bunch of things that are always going to be true. +- core/console: fix deadlock when printing with console lock held + + Some debugging options will print while the console lock is held, + which is why the console lock is taken as a recursive lock. + However console_write calls __flush_console, which will drop and + re-take the lock non-recursively in some cases. + + Just set con_need_flush and return from __flush_console if we are + holding the console lock already. 
+ + This stack usage message (taken with this patch applied) could lead + to a deadlock without this: :: + + CPU 0000 lowest stack mark 11768 bytes left pc=300cb808 token=0 + CPU 0000 Backtrace: + S: 0000000031c03370 R: 00000000300cb808 .list_check_node+0x1c + S: 0000000031c03410 R: 00000000300cb910 .list_check+0x38 + S: 0000000031c034b0 R: 00000000300190ac .try_lock_caller+0xb8 + S: 0000000031c03540 R: 00000000300192e0 .lock_caller+0x80 + S: 0000000031c03600 R: 0000000030012c70 .__flush_console+0x134 + S: 0000000031c036d0 R: 00000000300130cc .console_write+0x68 + S: 0000000031c03780 R: 00000000300347bc .vprlog+0xc8 + S: 0000000031c03970 R: 0000000030034844 ._prlog+0x50 + S: 0000000031c03a00 R: 00000000300364a4 .log_simple_error+0x74 + S: 0000000031c03b90 R: 000000003004ab48 .occ_pstates_init+0x184 + S: 0000000031c03d50 R: 000000003001480c .load_and_boot_kernel+0x38c + S: 0000000031c03e30 R: 000000003001571c .main_cpu_entry+0x62c + S: 0000000031c03f00 R: 0000000030002700 boot_entry+0x1c0 +- opal-prd: Do not error out on first failure for soft/hard offline. + + The memory errors (CEs and UEs) that are detected as part of background + memory scrubbing are reported by PRD asynchronously to opal-prd along with + affected memory ranges. hservice_memory_error() converts these ranges into + page granularity before hooking them up to the soft/hard offline-ing + infrastructure. + + But the current implementation of hservice_memory_error() does not hook up + all the pages to soft/hard offline-ing if any of the page offline actions + fail. For example, hard offline can fail for: + + - Pages that are not part of buddy managed pool. + - Pages that are reserved by kernel using memblock_reserved() + - Pages that are in use by kernel. + + But for the pages that are in use by a user space application, the hard + offline marks the page as hwpoison, sends a SIGBUS signal to kill the + affected application as recovery action and returns success. 
+ + Hence, it is possible that some of the pages in that memory range are in + use by an application or free. By stopping on the first error we lose the + opportunity to hwpoison the subsequent pages which may be free or in use by + an application. This patch fixes this issue. +- libflash/blocklevel_write: Fix missing error handling + + Caught by scan-build, we seem to trap the errors in rc, but + not take any recovery action during blocklevel_write. + +I2C +^^^ +- p8-i2c: fix wrong request status when a reset is needed + + If the bus is found in error state when starting a new request, the + engine is reset and we enter recovery. However, once complete, the + reset operation shows a status of complete in the status register. So + any badly-timed call to check_status() will think the current top + request is complete, even though it hasn't run yet. + + So don't update any request status while we are in recovery, as + nothing useful for the request is supposed to happen in that state. +- p8-i2c: Remove force reset + + Force reset was added as an attempt to work around some issues with TPM + devices locking up their I2C bus. In that particular case the problem + was that the device would hold the SCL line down permanently due to a + device firmware bug. The force reset doesn't actually do anything to + alleviate the situation here, it just happens to reset the internal + master state enough to make the I2C driver appear to work until + something tries to access the bus again. + + On P9 systems with secure boot enabled there is the added problem + of the "diagnostic mode" not being supported on I2C masters A,B,C and + D. Diagnostic mode allows the SCL and SDA lines to be driven directly + by software. Without this, force reset is impossible to implement. + + This patch removes the force reset functionality entirely since: + + a) it doesn't do what it's supposed to, and + b) it's butt ugly code + + Additionally, turn p8_i2c_reset_engine() into p8_i2c_reset_port(). 
+ There's no need to reset every port on a master in response to an + error that occurred on a specific port. +- libstb/i2c-driver: Bump max timeout + + We have observed some TPMs clock stretching the I2C bus for significant + amounts of time when processing commands. The same TPMs also have + errata that can result in permanently locking up a bus in response to + an I2C transaction they don't understand. Use an excessively long + timeout to prevent this in the field. +- hdata: Add TPM timeout workaround + + Set the default timeout for any bus containing a TPM to one second. This + is needed to work around a bug in the firmware of certain TPMs that will + clock stretch the I2C port for up to a second. Additionally, when the + TPM is clock stretching it responds to a STOP condition on the bus by + bricking itself. Clearing this error requires a hard power cycle of the + system since the TPM is powered by standby power. +- p8-i2c: Allow a per-port default timeout + + Add support for setting a default timeout for the I2C port to the + device-tree. This is consumed by skiboot. + +IPMI Watchdog +^^^^^^^^^^^^^ +- ipmi-watchdog: Support handling re-initialization + + Watchdog resets can return an error code from the BMC indicating that + the BMC watchdog was not initialized. Currently we abort skiboot due to + a missing error handler. This patch implements handling + re-initialization for the watchdog, automatically saving the last + watchdog set values and re-issuing them if needed. +- ipmi-watchdog: The stop action should disable reset + + Otherwise it is possible for the reset timer to elapse and trigger the + watchdog to wake back up. This doesn't affect the behavior of the + system since we are providing a NONE action to the BMC. However we would + like to prevent the action from taking place if possible. 
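The re-initialization handling in the ipmi-watchdog entries above can be modelled roughly as follows. This is a Python sketch rather than skiboot's C, and every name in it (`FakeBmc`, `Watchdog`, `NOT_INITIALIZED`) is hypothetical; the 0x80 completion code is an assumption about how the BMC signals an uninitialized watchdog.

```python
# Hypothetical model of "save the last set values, re-issue them if needed".
OK = 0x00
NOT_INITIALIZED = 0x80  # assumed completion code for an uninitialized watchdog


class FakeBmc:
    """Stand-in BMC that can lose its watchdog configuration."""

    def __init__(self):
        self.settings = None

    def set_watchdog(self, action, timeout_s):
        self.settings = (action, timeout_s)
        return OK

    def reset_watchdog(self):
        # A reset only succeeds once the watchdog has been configured.
        return OK if self.settings is not None else NOT_INITIALIZED

    def forget(self):
        # Simulate the BMC dropping its watchdog state (e.g. a BMC reboot).
        self.settings = None


class Watchdog:
    def __init__(self, bmc):
        self.bmc = bmc
        self.last_set = None  # remembered so it can be re-issued later

    def set(self, action, timeout_s):
        self.last_set = (action, timeout_s)
        return self.bmc.set_watchdog(action, timeout_s)

    def reset(self):
        rc = self.bmc.reset_watchdog()
        if rc == NOT_INITIALIZED and self.last_set is not None:
            # Re-issue the saved settings, then retry the reset once.
            self.bmc.set_watchdog(*self.last_set)
            rc = self.bmc.reset_watchdog()
        return rc
```

The point of the design is that a "not initialized" reply becomes a recoverable condition instead of a fatal one, provided a previous set operation was recorded.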
+- ipmi-watchdog: Add a flag to determine if we are still ticking + + This makes it easier for future changes to ensure that the watchdog + stops ticking and doesn't requeue itself for execution in the + background. This way it is safe for resets to be performed after the + ticks are assumed to be stopped and it won't start the timer again. +- ipmi-watchdog: (prepare for) not disabling at shutdown + + The op-build linux kernel has been configured to support the ipmi + watchdog. This driver will always handle the watchdog by either leaving + it enabled if configured, or by disabling it during module load if no + configuration is provided. This increases the coverage of the watchdog + during the boot process. The watchdog should no longer be disabled at + any point during skiboot execution. + + We're not enabling this by default yet as people can (and do, at least in + development) mix and match old BOOTKERNEL with new skiboot and we don't + want to break that too obviously. +- ipmi-watchdog: Don't reset the watchdog twice + + There is no clarification for why this change was needed, but presumably + this is due to a buggy BMC implementation where the Watchdog Set command + was processed concurrently or after the initial Watchdog Reset. This + inversion would cause the watchdog to stop since the DONT_STOP bit was + not set. Since we are now using the DONT_STOP bit during initialization, + the watchdog should not be stopped even if an inversion occurs. +- ipmi-watchdog: Make it possible to set DONT_STOP + + The IPMI standard supports setting a DONT_STOP bit during a Watchdog + Set operation. Most of the time we don't want to stop the Watchdog when + updating the settings so we should be using this bit. This patch makes + it possible for callers of set_wdt to prevent the watchdog from being + stopped. This only changes the behavior of the watchdog during the + initial settings update when initializing skiboot. 
The watchdog is no + longer disabled and then immediately re-enabled. +- ipmi-watchdog: WD_POWER_CYCLE_ACTION -> WD_RESET_ACTION + + The IPMI specification denotes that action 0x1 is Host Reset and 0x3 is + Host Power Cycle. Use the correct name for Reset in our watchdog code. + + +POWER8 platforms +---------------- + +- astbmc: Enable mbox depending on scratch reg + + P8 boxes can opt in for mbox pnor support if they set the scratch + register bit to indicate it is supported. + +Simulator platforms +------------------- + +Since :ref:`skiboot-6.1-rc1`: + +- pmem: volatile bindings for the poorly enabled + + PMEM_DISK bindings were added, but they rely on a rather + recent mmap feature. This patch steals from those bindings + to add volatile bindings. I've used these bindings with + PMEM_VOLATILE to launch an instance with the publicly + available systemsim-p9. The bindings are volatile and one + should not expect any data to be saved/retrieved. + +Since :ref:`skiboot-6.0`: + +- plat/qemu: add PNOR support + + To access the PNOR, OPAL/skiboot drives the BMC SPI controller using + the iLPC2AHB device of the BMC SuperIO controller and accesses the + flash contents using the LPC FW address space on which the PNOR is + remapped. + + The QEMU PowerNV machine now integrates such models (SuperIO + controller, iLPC2AHB device) and also a pseudo Aspeed SoC AHB memory + space populated with the SPI controller registers (same model as for + ARM). The AHB window giving access to the contents of the BMC SPI + controller flash modules is mapped on the LPC FW address space. + + The change should be compatible for machine without PNOR support. +- external/mambo: Add support for readline if it exists + + Add support for tclreadline package if it is present. + This patch loads the package and uses it when the + simulation stops for any reason. 
+ + +FSP based platforms +------------------- + +- Disable fast reboot on FSP IPL side change + + If the FSP changes the next IPL side, then disable fast reboot. + + sample output: :: + + [ 620.196442259,5] FSP: Got sysparam update, param ID 0xf0000007 + [ 620.196444501,5] CUPD: FW IPL side changed. Disable fast reboot + [ 620.196445389,5] CUPD: Next IPL side : perm +- fsp/console: Always establish OPAL console API backend + + Currently we only call set_opal_console() to establish the backend + used by the OPAL console API if we find at least one FSP serial + port in HDAT. + + On systems where there is none (IPMI only), we fail to set it, + causing the console code to try to use the dummy console, which causes + an assertion failure during boot due to clashing on the device-tree + node names. + + So always set it if an FSP is present. + +AST BMC based platforms +----------------------- + +- AMI BMC: use 0x3a as OEM command + + The 0x3a OEM command is for IBM commands, while 0x32 was for AMI ones. + Sometime in the P8 timeframe, AMI BMCs were changed to listen for our + commands on either 0x32 or 0x3a. Since 0x3a is the direction forward, + we'll use that, as P9 machines with AMI BMCs probably also want these + to work, and let's not bet that 0x32 will continue to be okay. +- astbmc: Set romulus BMC type to OpenBMC +- platform/astbmc: Do not delete compatible property + + From P9 onwards, OPAL builds the device tree for BMC based systems using + HDAT. We are populating the bmc/compatible node with the bmc version. Hence + do not delete this property. + +Utilities +--------- +- external/xscom-utils: Add python library for xscom access + + Patch adds a simple python library module for xscom access. + It directly manipulates the '/access' file for scom read + and write from the debugfs 'scom' directory. + + Example on how to generate a getscom using this module: + + ..
code-block:: python + + from adu_scoms import * + getscom = GetSCom() + getscom.parse_args() + getscom.run_command() + + Sample output for the above getscom.py: + + .. code-block:: console + + # ./getscom.py -l + Chip ID | Rev | Chip type + ---------|-------|----------- + 00000008 | DD2.0 | P9 (Nimbus) processor + 00000000 | DD2.0 | P9 (Nimbus) processor +- ffspart: Don't require user to create blank partitions manually + + Add '--allow-empty' which allows the filename for a given partition to + be blank. If set, ffspart will set that part of the PNOR file 'blank' and + set ECC bits if required. + Without this option behaviour is unchanged and ffspart will return an + error if it cannot find the partition file. +- pflash: Use correct prefix when installing + + pflash uses lowercase prefix when running make install in its + directory, but uppercase PREFIX when running it in shared. Use + lowercase everywhere. + + With this the OpenBMC bitbake recipe can drop an out of tree patch it's + been carrying for years. + + +POWER9 +------ + +Since :ref:`skiboot-6.1-rc1`: + +- occ: sensors: Fix the size of the phandle array 'sensors' in DT + + Fixes: 99505c03f493 (present in v5.10-rc4) +- phb4: Delay training till after PERST is deasserted + + This helps some cards train on the second PERST (i.e. fast-reboot). The + reason why is not clear, but it helps, so YOLO! + +Since :ref:`skiboot-6.0`: + +- occ-sensor: Avoid using uninitialised struct cpu_thread + + When adding the sensors in occ_sensors_init, if the type is not + OCC_SENSOR_LOC_CORE, then the loop to find 'c' will not be executed. + Then c->pir is used for both of the add_sensor_node calls below. + + This provides a default value of 0 instead. +- NX: Add NX coprocessor init opal call + + The read offset (4:11) in Receive FIFO control register is incremented + by FIFO size whenever a CRB is read by NX. But the index in RxFIFO has to + match with the corresponding entry in FIFO maintained by VAS in kernel. 
+ VAS entry is reset to 0 when opening the receive window during driver + initialization. So when NX842 is reloaded or in kexec boot, there is a + possibility of a mismatch between the RxFIFO control register and the VAS + entries in the kernel. It could cause CRB failure / timeout from NX. + + This patch adds nx_coproc_init opal call for kernel to initialize + readOffset (4:11) and Queued (15:23) in RxFIFO control register. +- SLW: Remove stop1_lite and stop2_lite + + stop1_lite has been removed since it adds no additional benefit + over stop0_lite. stop2_lite has been removed since currently it adds + minimal benefit over stop2. However, the benefit is eclipsed by the time + required to ungate the clocks. + + Moreover, Lite states don't give up the SMT resources and can potentially + have a performance impact on sibling threads. + + Since current OSs (Linux) aren't smart enough to make good decisions + with these stop states, we're (temporarily) removing them from what + we expose to the OS, the idea being to bring them back in a new + DT representation so that only an OS that knows what to do will + do things with them. +- cpu: Use STOP1 on POWER9 for idle/sleep inside OPAL + + The current code requests STOP3, which means it gets STOP2 in practice. + + STOP2 has proven to occasionally be unreliable depending on FW + version and chip revision, it also requires a functional CME, + so instead, let's use STOP1. The difference is rather minimal + for something that is only used a few seconds during boot. + +NPU2 (NVLink2 and OpenCAPI) +^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +Since :ref:`skiboot-6.1-rc1`: + +- capi: Select the correct IODA table entry for the mbt cache. + + With the current code, the capi mmio window is not correctly configured + in the IODA table entry. The first entry (generally the non-prefetchable + BAR) is overwritten. + This patch sets the capi window bar at the right place. 
+- npu2/hw-procedures: Fence bricks via NTL instead of MISC + + There are a couple of places we can set/unset fence for a brick: + + 1. MISC register: NPU2_MISC_FENCE_STATE + 2. NTL register for the brick: NPU2_NTL_MISC_CFG1(ndev) + + Recent testing of ATS in combination with GPU reset has exposed a side + effect of using (1); if fence is set for all six bricks, it triggers a + sticky nmmu latch which prevents the NPU from getting ATR responses. + This manifests as a hang in the tests. + + We have npu2_dev_fence_brick() which uses (1), and only two calls to it. + Replace the call which sets fence with a write to (2). Remove the + corresponding unset call entirely. It's unneeded because the procedures + already do a progression from full fence to half to idle using (2). + +- phb4/capp: Calculate STQ/DMA read engines based on link-width for PEC + + Presently in CAPI mode the number of STQ/DMA-read engines allocated on + PEC2 for CAPP is fixed to 6 and 0-30 respectively irrespective of the + PCI link width. These values are only suitable for x8 cards and + quickly run out if a x16 card is plugged to a PEC2 attached slot. This + usually manifests as CAPP reporting TLBI timeout due to these messages + getting stalled due to insufficient STQs. + + To fix this we update enable_capi_mode() to check if PEC2 chiplet is + in x16 mode and if yes then we allocate 4/0-47 STQ/DMA-read engines + for the CAPP traffic. + + Fixes: 37ea3cfdc852 (present in v5.7-rc1) +- npu2: Use same compatible string for NVLink and OpenCAPI link nodes in device tree + + Currently, we distinguish between NPU links for NVLink devices and OpenCAPI + devices through the use of two different compatible strings - ibm,npu-link + and ibm,npu-link-opencapi. 
+ + As we move towards supporting configurations with both NVLink and OpenCAPI + devices behind a single NPU, we need to detect the device type as part of + presence detection, which can't happen until well after the point where the + HDAT or platform code has created the NPU device tree nodes. Changing a + node's compatible string after it's been created is a bit ugly, so instead + we should move the device type to a new property which we can add to the + node later on. + + Get rid of the ibm,npu-link-opencapi compatible string, add a new + ibm,npu-link-type property, and a helper function to check the link type. + Add an "unknown" device type in preparation for later patches to detect + device type dynamically. + + These device tree bindings are entirely internal to skiboot and are not + consumed directly by Linux, so this shouldn't break anything (other than + internal BML lab environments). +- occ: Add support for GPU presence detection + + On the Witherspoon platform, we need to distinguish between NVLink GPUs and + OpenCAPI accelerators. In order to do this, we first need to find out + whether the SXM2 socket is populated. + + On Witherspoon, the SXM2 socket's presence detection pin is only visible + via I2C from the APSS, and thus can only be exposed to the host via the + OCC. The OCC, per OCC Firmware Interface Specification for POWER9 version + 0.22, now exposes this to skiboot through a field in the dynamic data + shared memory. + + Add the necessary dynamic data changes required to read the version and + GPU presence fields. Add a function, occ_get_gpu_presence(), that can be + used to check GPU presence. + + If the OCC isn't reporting presence (old OCC firmware, or some other + reason), we default to assuming there is a device present and wait until + link training to fail. + + This will be used in later patches to fix up the NPU2 probe path for + OpenCAPI support on Witherspoon. 
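As a rough illustration of the fallback described in the GPU presence detection entry above, the sketch below models the decision in Python rather than skiboot's C. The function name `gpu_present`, its parameters, and the one-bit-per-socket layout are all assumptions made for illustration; they are not the real `occ_get_gpu_presence()` interface.

```python
from typing import Optional


def gpu_present(presence_bits: Optional[int], gpu_index: int) -> bool:
    """Hypothetical model: trust the OCC when it reports presence,
    otherwise assume a device is there and let link training decide."""
    if presence_bits is None:
        # OCC is not reporting presence (old OCC firmware, or some other reason).
        return True
    # Assumed layout: one bit per socket, bit N set means socket N is populated.
    return bool((presence_bits >> gpu_index) & 1)
```

Defaulting to "present" costs a failed link training attempt on an empty socket, but avoids hiding a real device behind old OCC firmware.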
+- hw/npu2, core/hmi: Use NPU instead of NPU2 as log message prefix + + The NPU2{DBG,INF,ERR} macros use "NPU%d" as a prefix to identify messages + relating to a particular NPU. + + It's slightly confusing to have per-NPU messages prefixed with "NPU0" or + "NPU1" and NPU-generic messages prefixed with "NPU2". On some future system + we could potentially have a NPU #2 in which case it'd be really confusing. + + Use NPU rather than NPU2 for NPU-generic log messages. There's no risk of + confusion with the original npu.c code since that's only for P8. + +Since :ref:`skiboot-6.0`: + +- npu2: Reset NVLinks on hot reset + + This effectively fences GPU RAM on GPU reset so the host system + does not have to crash every time we stop a KVM guest with a GPU + passed through. +- npu2-opencapi: reduce number of retries to train the link + + We've been reliably training the opencapi link on the first attempt + for quite a while. Furthermore, if it doesn't train on the first + attempt, retries haven't been that useful. So let's reduce the number + of attempts we do to train the link. + + 2 retries = 3 attempts to train. + + Each (failed) training sequence costs about 3 seconds. +- opal/hmi: Display correct chip id while printing NPU FIRs. + + HMIs for NPU xstops are broadcast to all chips. All cores on all the + chips receive the HMI. The HMI handler correctly identifies and extracts the + NPU FIR details from the affected chip, but while printing FIR data it + prints chip id and location code details of this_cpu()->chip_id which + may not be correct. This patch fixes this issue. +- npu2-opencapi: Fix link state to report link down + + The PHB callback 'get_link_state' is always reporting the link width, + irrespective of the link status and even when the link is down. It is + causing too much work (and failures) when the PHB is probed during pci + init. + The fix is to look at the link status first and report the link as + down when appropriate. 
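The ordering in the link state fix above (check the link status first, only then report a width) can be sketched as below. This is an illustrative Python model, not the skiboot callback: the function signature and the `LINK_DOWN` value of 0 are assumptions.

```python
LINK_DOWN = 0  # assumed value used to report "link down"


def get_link_state(link_up: bool, link_width: int) -> int:
    """Hypothetical model of the fixed callback: a width is only
    meaningful once the link is known to be up."""
    if not link_up:
        # Before the fix, the width was returned even in this case.
        return LINK_DOWN
    return link_width
```

With this ordering, a caller probing the PHB during pci init can stop early on a down link instead of scanning behind it.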
+- npu2-opencapi: Cleanup traces printed during link training + + Now that links may train in parallel, traces shown during training can + be all mixed up. So add a prefix to all the traces to clearly identify + the chip and link the trace refers to: :: + + OCAPI[<chip id>:<link id>]: this is a very useful message + + The lower-level hardware procedures (npu2-hw-procedures.c) also print + traces which would need work. But that code is being reworked to be + better integrated with opencapi and nvidia, so leave it alone for now. +- npu2-opencapi: Train links on fundamental reset + + Reorder our link training steps so that they are executed on + fundamental reset instead of during the initial setup. Skiboot always + calls a fundamental reset on all the PHBs during pci init. + + It is done through a state machine, similarly to what is done for + 'real' PHBs. + + This is the first step for a longer term goal to be able to trigger an + adapter reset from linux. We'll need the reset callbacks of the PHB to + be defined. We have to handle the various delays differently, since a + linux thread shouldn't stay stuck waiting in opal for too long. +- npu2-opencapi: Rework adapter reset + + Rework the code that resets the opencapi adapter a bit: + + - make it clearer which i2c pin is resetting which device + - break the reset operation into smaller chunks. This is really to + prepare for a future patch. + + No functional changes. +- npu2-opencapi: Use presence detection + + Presence detection is not part of the opencapi specification. So each + platform may choose to implement it the way it wants. + + All current platforms implement it through an i2c device where we can + query a pin to know if a device is connected or not. ZZ and Zaius have + a similar design and even use the same i2c information and pin + numbers. + However, presence detection on older ZZ planar (older than v4) doesn't + work, so we don't activate it for now, until our lab systems are + upgraded and it's better tested. 
+ + Presence detection on witherspoon is still being worked on. It's + shaping up to be quite different, so we may have to revisit the topic + in a later patch. + +Testing and CI +-------------- + +Since :ref:`skiboot-6.1-rc1`: + +- test/qemu: start building qemu again, and use our built qemu for tests + + We need to use QEMU_BIN rather than QEMU as the makefiles define + QEMU already. +- opal-ci: qemu: Use the powernv-3.0 branch + + This is based off the current development version of Qemu, and + importantly it contains the patch that allows skiboot and Linux to clear + the PCR that we require to boot. diff --git a/roms/skiboot/doc/release-notes/skiboot-6.2-rc1.rst b/roms/skiboot/doc/release-notes/skiboot-6.2-rc1.rst new file mode 100644 index 000000000..e930a4ca0 --- /dev/null +++ b/roms/skiboot/doc/release-notes/skiboot-6.2-rc1.rst @@ -0,0 +1,893 @@ +.. _skiboot-6.2-rc1: + +skiboot-6.2-rc1 +=============== + +skiboot v6.2-rc1 was released on Monday November 19th 2018. It is the first +release candidate of skiboot 6.2, which will become the new stable release +of skiboot following the 6.1 release, first released July 11th 2018. + +Skiboot 6.2 will mark the basis for op-build v2.2. + +skiboot v6.2-rc1 contains all bug fixes as of :ref:`skiboot-6.0.13`, +and :ref:`skiboot-5.4.10` (the currently maintained +stable releases). + +For how the skiboot stable releases work, see :ref:`stable-rules` for details. + +This release has been a longer cycle than typical for a variety of reasons. It +also contains a lot of cleanup work and minor bug fixes (much like skiboot 6.1 +did). + +Over skiboot 6.1, we have the following changes: + +General +------- + +- cpu: Quieten OS endian switch messages + + Users see these when loading an OS from Petitboot: :: + + [ 119.486794100,5] OPAL: Switch to big-endian OS + [ 120.022302604,5] OPAL: Switch to little-endian OS + + Which is expected and doesn't provide any information the user can act + on. 
Switch them to PR_INFO so they still appear in the log, but not on + the serial console. +- Recognise signed VERSION partition + + A few things need to change to support a signed VERSION partition: + + - A signed VERSION partition will be 4K + SECURE_BOOT_HEADERS_SIZE (4K). + - The VERSION partition needs to be loaded after secure/trusted boot is + set up, and therefore after nvram_init(). + - Added to the trustedboot resources array. + + This also moves the ipmi_dt_add_bmc_info() call to after + flash_dt_add_fw_version() since it adds info to ibm,firmware-versions. +- Run pollers in time_wait() when not booting + + This only bit us hard with hiomap in one scenario. + + Our OPAL API has been OPAL_POLL_EVENTS may be needed to make forward + progress on ongoing operations, and the internal to skiboot API has been + that time_wait() of a suitable time will run pollers (on at least one + CPU) to help ensure forward progress can be made. + + In a perfect world, interrupts are used but they may: a) be disabled, or + b) the thing we're doing can't use interrupts because computers are + generally terrible. + + Back in 3db397ea5892a (circa 2015), we changed skiboot so that we'd run + pollers only on the boot CPU, and not if we held any locks. This was to + reduce the chance of programming code that could deadlock, as well as to + ensure that we didn't just thrash all the cachelines for running pollers + all over a large system during boot, or hard spin on the same locks on + all secondary CPUs. + + The problem arises if the OS we're booting makes an OPAL call early on, + with interrupts disabled, that requires a poller to run to make forward + progress. An example of this would be OPAL_WRITE_NVRAM early in Linux + boot (where Linux sets up the partitions it wants) - something that + occurs iff we've had to reformat NVRAM this boot (i.e. first boot or + corrupted NVRAM). 
+ + The hiomap implementation should arguably *not* rely on synchronous IPMI + messages, but this is a future improvement (as was for mbox before it). + The mbox-flash code solved this problem by spinning on check_timers(). + + More generically though, the approach of running the pollers when no + longer booting means we behave more in line with what the API is meant + to be, rather than have this odd case of "time_wait() for a condition + that could also be tripped by an interrupt works fine unless the OS is + up and running but hasn't set interrupts up yet". +- ipmi: Reduce ipmi_queue_msg_sync() polling loop time to 10ms + + On a plain boot, this reduces the time spent in OPAL by ~170ms on + p9dsu. This is due to hiomap (currently) using synchronous IPMI + messages. + + It will also *significantly* reduce latency on runtime flash + operations for hiomap, as we'll spend typically 10-20ms in OPAL + rather than 100-200ms. It's not an ideal solution to that, but + it's a quick and obvious win for jitter. +- core/device: NULL pointer dereference fix +- core/flash: NULL pointer dereference fixes +- core/cpu: Call memset with proper cpu_thread offset +- libflash: Add ipmi-hiomap, and prefer it for PNOR access + + ipmi-hiomap implements the PNOR access control protocol formerly known + as "the mbox protocol" but uses IPMI instead of the AST LPC mailbox as a + transport. As there is no longer any mailbox involved in this alternate + implementation the old protocol name is quite misleading, and so it has + been renamed to "the hiomap protocol" (Host I/O Mapping protocol). The + same commands and events are used though this client-side implementation + assumes v2 of the protocol is supported by the BMC. + + The code is a heavily-reworked copy of the mbox-flash source and is + introduced this way to allow for the mbox implementation's eventual + removal. 
+ + mbox-flash should in theory be renamed to mbox-hiomap for consistency, + but as it is on life-support effective immediately we may as well just + remove it entirely when the time is right. +- opal/hmi: Handle early HMIs on thread0 when secondaries are still in OPAL. + + When the primary thread receives a CORE level HMI for timer facility errors + while secondaries are still in OPAL, thread 0 ends up in rendez-vous + waiting for secondaries to get into hmi handling. This is because OPAL + runs with MSR(EE=0) and hence HMIs are delayed on secondary threads until + they are given to Linux OS. Fix this by adding a check for secondary + state and force them into hmi handling by queuing a job on secondary threads. + + I have tested this by injecting HDEC parity error very early during Linux + kernel boot. Recovery works fine for non-TB errors. But if TB is bad at + this very early stage we are already doomed. + + Without this patch we see: :: + + [ 285.046347408,7] OPAL: Start CPU 0x0843 (PIR 0x0843) -> 0x000000000000a83c + [ 285.051160609,7] OPAL: Start CPU 0x0844 (PIR 0x0844) -> 0x000000000000a83c + [ 285.055359021,7] HMI: Received HMI interrupt: HMER = 0x0840000000000000 + [ 285.055361439,7] HMI: [Loc: U78D3.ND1.WZS004A-P1-C48]: P:8 C:17 T:0: TFMR(2e12002870e14000) Timer Facility Error + [ 286.232183823,3] HMI: Rendez-vous stage 1 timeout, CPU 0x844 waiting for thread 1 (sptr=0000ccc1) + [ 287.409002056,3] HMI: Rendez-vous stage 1 timeout, CPU 0x844 waiting for thread 2 (sptr=0000ccc1) + [ 289.073820164,3] HMI: Rendez-vous stage 1 timeout, CPU 0x844 waiting for thread 3 (sptr=0000ccc1) + [ 290.250638683,3] HMI: Rendez-vous stage 1 timeout, CPU 0x844 waiting for thread 1 (sptr=0000ccc2) + [ 291.427456821,3] HMI: Rendez-vous stage 1 timeout, CPU 0x844 waiting for thread 2 (sptr=0000ccc2) + [ 293.092274807,3] HMI: Rendez-vous stage 1 timeout, CPU 0x844 waiting for thread 3 (sptr=0000ccc2) + [ 294.269092904,3] HMI: Rendez-vous stage 1 timeout, CPU 0x844 waiting for thread 1
(sptr=0000ccc3) + [ 295.445910944,3] HMI: Rendez-vous stage 1 timeout, CPU 0x844 waiting for thread 2 (sptr=0000ccc3) + [ 297.110728970,3] HMI: Rendez-vous stage 1 timeout, CPU 0x844 waiting for thread 3 (sptr=0000ccc3) + + After this patch: :: + + [ 259.401719351,7] OPAL: Start CPU 0x0841 (PIR 0x0841) -> 0x000000000000a83c + [ 259.406259572,7] OPAL: Start CPU 0x0842 (PIR 0x0842) -> 0x000000000000a83c + [ 259.410615534,7] OPAL: Start CPU 0x0843 (PIR 0x0843) -> 0x000000000000a83c + [ 259.415444519,7] OPAL: Start CPU 0x0844 (PIR 0x0844) -> 0x000000000000a83c + [ 259.419641401,7] HMI: Received HMI interrupt: HMER = 0x0840000000000000 + [ 259.419644124,7] HMI: [Loc: U78D3.ND1.WZS004A-P1-C48]: P:8 C:17 T:0: TFMR(2e12002870e04000) Timer Facility Error + [ 259.419650678,7] HMI: Sending hmi job to thread 1 + [ 259.419652744,7] HMI: Sending hmi job to thread 2 + [ 259.419653051,7] HMI: Received HMI interrupt: HMER = 0x0840000000000000 + [ 259.419654725,7] HMI: Sending hmi job to thread 3 + [ 259.419654916,7] HMI: Received HMI interrupt: HMER = 0x0840000000000000 + [ 259.419658025,7] HMI: Received HMI interrupt: HMER = 0x0840000000000000 + [ 259.419658406,7] HMI: [Loc: U78D3.ND1.WZS004A-P1-C48]: P:8 C:17 T:2: TFMR(2e12002870e04000) Timer Facility Error + [ 259.419663095,7] HMI: [Loc: U78D3.ND1.WZS004A-P1-C48]: P:8 C:17 T:3: TFMR(2e12002870e04000) Timer Facility Error + [ 259.419655234,7] HMI: [Loc: U78D3.ND1.WZS004A-P1-C48]: P:8 C:17 T:1: TFMR(2e12002870e04000) Timer Facility Error + [ 259.425109779,7] OPAL: Start CPU 0x0845 (PIR 0x0845) -> 0x000000000000a83c + [ 259.429870681,7] OPAL: Start CPU 0x0846 (PIR 0x0846) -> 0x000000000000a83c + [ 259.434549250,7] OPAL: Start CPU 0x0847 (PIR 0x0847) -> 0x000000000000a83c + +- core/cpu: Fix memory allocation for job array + + fixes: 7a3f307e core/cpu: parallelise global CPU register setting jobs + + This bug would result in boot-hang on some configurations due to + cpu_wait_job() endlessly waiting for the last bogus jobs[cpu->pir] 
pointer. +- i2c: Fix multiple-enqueue of the same request on NACK + + i2c_request_send() will retry the request if the error is a NAK, + however it forgets to clear the "ud.done" flag. It will thus + loop again and try to re-enqueue the same request causing internal + request list corruption. +- i2c: Ensure ordering between i2c_request_send() and completion + + i2c_request_send loops waiting for a flag "uc.done" set by + the completion routine, and then looks for a result code + also set by that same completion. + + There is no synchronization, the completion can happen on another + processor, so we need to order the stores to uc and the reads + from uc so that uc.done is stored last and tested first using + memory barriers. +- pci: Clarify power down logic + + Currently pci_scan_bus() unconditionally calls pci_slot_set_power_state() + when it's finished scanning a bus. This is one of those things that + makes you go "WHAT?" when you first see it and frankly the skiboot PCI + code could do with less of that. + +Fast Reboot +^^^^^^^^^^^ + +- fast-reboot: parallel memory clearing + + Arbitrarily pick 16GB as the unit of parallelism, and + split up clearing memory into jobs and schedule them + node-local to the memory (or on node 0 if we can't + work that out because it's the memory up to SKIBOOT_BASE) + + This seems to cut at least ~40% of the time from memory zeroing on + fast-reboot on a 256GB Boston system. + + For many systems, scanning PCI takes about as much time as + zeroing all of RAM, so we may as well do them at the same time + and cut a few seconds off the total fast reboot time. +- fast-reboot: verify firmware "romem" checksum + + This takes a checksum of skiboot memory after boot that should be + unchanged during OS operation, and verifies it before allowing a + fast reboot. + + This is not read-only memory from skiboot's point of view, because + it includes things like the opal branch table that gets populated + during boot. 
+ + This helps to improve the integrity of firmware against host and + runtime firmware memory scribble bugs. + +- core/fast-reboot: print the fast reboot disable reason + + Once things start to go wrong, disable_fast_reboot can be called a + number of times, so make the first reason sticky, and also print it + to the console at disable time. This helps with making sense of + fast reboot disables. +- Add fast-reboot property to /ibm,opal DT node + + This means that if it's permanently disabled on boot, the test suite can + pick that up and not try a fast reboot test. + +Utilities +--------- + +- pflash: Add --skip option for reading + + Add a --skip=N option to pflash to skip N bytes when reading. + This would allow users to print the VERSION partition without the STB + header by specifying the --skip=4096 argument, and it's a more generic + solution than making pflash depend on secure/trusted boot code. +- xscom-utils: Rework getsram + + Allow specifying a file on the command line to read OCC SRAM data into. + If no file is specified then we print it to stdout as text. This is a + bit inconsistent, but it retains compatibility with the existing tool. +- xscom-utils/getsram: Make it work on P9 + + The XSCOM base address of the OCC control registers changed slightly + between P8 and P9. Fix this up and add a bit of PVR checking so we look + in the right place. +- opal-prd: Fix opal-prd crash + + Presently, the callback function from HBRT uses r11 to point to the target function + pointer, and r12 is garbage. This works fine when we compile with the "-no-pie" option + (as we don't use r12 to calculate the TOC). + + As per ABIv2 : "r12 : Function entry address at global entry point" + + With the "-pie" compilation option, we have to set r12 to point to the global function + entry point so that we can calculate the TOC properly. 
+ + Crash log without this patch: :: + + opal-prd[2864]: unhandled signal 11 at 0000000000029320 nip 00000 00102012830 lr 0000000102016890 code 1 + + +Development and Debugging +------------------------- + +- core/lock: Use try_lock_caller() in lock_caller() to capture owner + + Otherwise we can get reports of core/lock.c owning the lock, which is + not helpful when tracking down ownership issues. +- core/flash: Emit a warning if Skiboot version doesn't match + + This means you'll get a warning that you've modified skiboot separately + to the rest of the PNOR image, which can be useful in determining what + firmware is actually running on a machine. +- gcov: link in ctors* as newer GCC doesn't group them all + + It seems that newer toolchains get us multiple ctors sections to link in + rather than just one. If we discard them (as we were doing), then we + don't have a working gcov build (and we get the "doesn't look sane" + warning on boot). +- core/flash: Log return code when ffs_init() fails + + Knowing the return code is at least better than not knowing the return + code. +- gcov: Fix building with GCC8 +- travis/ci: rework Dockerfiles to produce build artifacts + + ubuntu-latest was also missing clang, as ubuntu-latest is closer to + ubuntu 18.04 than 16.04 +- cpu: add cpu_queue_job_on_node() + + Add a job scheduling API which will run the job on the requested + chip_id (or return failure). +- opal-ci: Build old dtc version for fedora 28 + + There are patches that will go into dtc to fix the issues we hit, but + for the moment let's just build and use a slightly older version. +- mem_region: Merge similar allocations when dumping + + Currently we print one line for each allocation done at runtime when + dumping the memory allocations. We do a few thousand allocations at + boot so this can result in a huge amount of text being printed which + is a) slow to print, and b) Can result in the log buffer overflowing + which destroys otherwise useful information. 
+ + This patch adds a de-duplication to this memory allocation dump by + merging "similar" allocations (same location, same size) into one. + + Unfortunately, the algorithm used to do the de-duplication is quadratic, + but considering we only dump the allocations in the event of a fatal + error I think this is acceptable. I also did some benchmarking and found + that on a ZZ it takes ~3ms to do a dump with 12k allocations. On a Zaius + it's slightly longer at about ~10ms for 10k allocs. However, the + difference there was due to the output being written to the UART. + + This patch also bumps the log level to PR_NOTICE. PR_INFO messages are + suppressed at the default log level, which probably isn't something you + want considering we only dump the allocations when we run out of skiboot + heap space. +- core/lock: fix timeout warning causing a deadlock false positive + + If a lock waiter exceeds the warning timeout, it prints a message + while still registered as requesting the lock. Printing the message + can take locks, so if one is held when the owner of the original + lock tries to print a message, it will get a false positive deadlock + detection, which brings down the system. + + This can easily be hit when there is a lot of HMI activity from a + KVM guest, where the timebase was not returned to host timebase + before calling the HMI handler. +- hw/p8-i2c: Print the set error bits + + This is purely to save me from having to look it up every time someone + gets an I2C error. +- init: Fix starting stripped kernel + + Currently if we try to run a raw/stripped binary kernel (ie. without + the elf header) we crash with: :: + + [ 0.008757768,5] INIT: Waiting for kernel... + [ 0.008762937,5] INIT: platform wait for kernel load failed + [ 0.008768171,5] INIT: Assuming kernel at 0x20000000 + [ 0.008779241,3] INIT: ELF header not found. Assuming raw binary. 
+ [ 0.017047348,5] INIT: Starting kernel at 0x0, fdt at 0x3044b230 14339 bytes + [ 0.017054251,0] FATAL: Kernel is zeros, can't execute! + [ 0.017059054,0] Assert fail: core/init.c:590:0 + [ 0.017065371,0] Aborting! + + This is because we haven't set kernel_entry correctly in this path. + This fixes it. +- cpu: Better output when waiting for a very long job + + Instead of printing at the end if the job took more than 1s, + print in the loop every 30s along with a backtrace. This will + give us some output if the job is deadlocked. +- lock: Fix interactions between lock dependency checker and stack checker + + The lock dependency checker does a few nasty things that can cause + re-entrancy deadlocks in conjunction with the stack checker or + in fact other debug tests. + + A lot of it revolves around taking a new lock (dl_lock) as part + of the locking process. + + This tries to fix it by making sure we do not hit the stack + checker while holding dl_lock. + + We achieve that in part by directly using the low-level __try_lock + and manually unlocking on the dl_lock, and making some functions + "nomcount". + + In addition, we mark the dl_lock as being in the console path to + avoid deadlocks with the UART driver. + + We move the enabling of the deadlock checker to a separate config + option from DEBUG_LOCKS as well, in case we choose to disable it + by default later on. +- xscom-utils/adu_scoms.py: run 2to3 over it +- clang: -Wno-error=ignored-attributes + +Mambo Platform +^^^^^^^^^^^^^^ + +- mambo: Merge PMEM_DISK and PMEM_VOLATILE code + + PMEM_VOLATILE and PMEM_DISK can't be used together and are basically + copies of the same code. + + This merges the two and allows them to be used together. The same API is kept. +- hw/chiptod: test QUIRK_NO_CHIPTOD in opal_resync_timebase + + This allows some test coverage of deep stop states in Linux with + Mambo. 
+- core/mem_region: mambo reserve kernel payload areas + + Mambo image payloads get overwritten by the OS and by + fast reboot memory clearing because they have no region + defined. Add them, which allows fast reboot to work. + +Qemu platform +^^^^^^^^^^^^^ + +- nx: Don't abort on missing NX when using a QEMU machine + + These don't have an NX node (and probably never will) as they + don't provide any coprocessor. However, the DARN instruction + works so this abort is unnecessary. + +POWER8 Platforms +---------------- +- SBE-p8: Do all sbe timer update with xscom lock held + + Without this, on some P8 platforms, we could (falsely) think the SBE timer + had stalled getting the dreaded "timer stuck" message. + + The code was doing the mftb() to set the start of the timeout period while + *not* holding the lock, so the 1ms timeout started sometime when somebody + else had the xscom lock. + + The simple solution is to just do the whole routine holding the xscom lock, + so do it that way. + +Vesnin Platform +^^^^^^^^^^^^^^^ +- platforms/astbmc/vesnin: Send list of PCI devices to BMC through IPMI + + Implements sending a list of installed PCI devices through IPMI protocol. + Each PCI device description is sent as a standalone IPMI message. + A list of devices can be gathered from separate messages using the + session identifier. The session Id is an incremental counter that is + updated at the start of synchronization session. + + +POWER9 Platforms +---------------- + +- STOP API: API conditionally supports 255 SCOM restore entries for each quad. +- hdata/i2c: Skip unknown device type + + Do not add unknown I2C devices to device tree. +- hdata/i2c: Add whitelisting for Host I2C devices + + Many of the devices that we get information about through HDAT are for + use by firmware rather than the host operating system. 
This patch adds + a boolean flag to the hdat_i2c_info structure that indicates whether devices + with a given purpose should be reserved for use inside of OPAL (or some + other firmware component, such as the OCC). +- hdata/iohub: Fix Cumulus Hub ID number +- opal/hmi: Wakeup the cpu before reading core_fir + + When stop state 5 is enabled, reading the core_fir during an HMI can + result in an xscom read error with xscom_read() returning an + OPAL_XSCOM_PARTIAL_GOOD error code and core_fir value of all FFs. At + present this return error code is not handled in decode_core_fir() + hence the invalid core_fir value is sent to the kernel where it + interprets it as a FATAL hmi causing a system check-stop. + + This can be prevented by forcing the core to wake up before + reading the core_fir. Hence this patch wraps the call to + read_core_fir() within calls to dctl_set_special_wakeup() and + dctl_clear_special_wakeup(). +- xive: Disable block tracker + + Due to some HW errata, the block tracking facility (performance optimisation + for large systems) should be disabled on Nimbus chips. Disable it unconditionally + for now. +- opal/hmi: Ignore debug trigger inject core FIR. + + Core FIR[60] is a side effect of the work around for the CI Vector Load + issue in DD2.1. Usually this gets delivered as HMI with HMER[17] where + Linux already ignores it. But it looks like in some cases we may happen + to see CORE_FIR[60] while we are already in Malfunction Alert HMI + (HMER[0]) due to other reasons e.g. CAPI recovery or NPU xstop. If that + happens then just ignore it instead of crashing the kernel as not recoverable. +- hdata: Make sure reserved node name starts with "ibm, " + + HDAT does not provide a consistent format for reserved memory labels. + Some start with "ibm," while others start with a component name. +- hdata: Fix dtc warnings + + Fix dtc warnings related to mcbist node. 
:: + + Warning (reg_format): "reg" property in /xscom@623fc00000000/mcbist@1 has invalid length (4 bytes) (#address-cells == 1, #size-cells == 1) + Warning (reg_format): "reg" property in /xscom@623fc00000000/mcbist@2 has invalid length (4 bytes) (#address-cells == 1, #size-cells == 1) + Warning (reg_format): "reg" property in /xscom@603fc00000000/mcbist@1 has invalid length (4 bytes) (#address-cells == 1, #size-cells == 1) + Warning (reg_format): "reg" property in /xscom@603fc00000000/mcbist@2 has invalid length (4 bytes) (#address-cells == 1, #size-cells == 1) + + Ideally we should add a proper xscom range here... but we are not getting that + information in HDAT today. Let's fix the warnings until we get proper data in HDAT. + +PHB4 +^^^^ + +- phb4: Generate checkstop on AIB ECC corr/uncorr for DD2.0 parts + + On DD2.0 parts, PCIe ECC protection is not warranted in the response + data path. Thus, for these parts, we need to flag any ECC errors + detected from the adjacent AIB RX Data path so the part can be + replaced. + + This patch configures the FIRs so that we escalate these AIB ECC + errors to a checkstop so the parts can be replaced. +- phb4: Reset pfir and nfir if new errors reported during ETU reset + + During fast-reboot new PEC errors can be latched even after ETU-Reset + is asserted. This will result in the values of the variables nfir_cache and + pfir_cache being out of sync. + + During step-2 of CRESET nfir_cache and pfir_cache values are used to + bring the PHB out of reset state. However, if these variables are out + of date as noted above, the nfir/pfir registers are never reset + completely and the ETU still remains frozen. + + Hence this patch updates step-2 of phb4_creset to re-read the values of + nfir/pfir registers to check if any new errors were reported after + ETU-reset was asserted, report these new errors and reset the + nfir/pfir registers. This should bring the ETU out of reset + successfully. 
+- phb4: Disable nodal scoped DMA accesses when PB pump mode is enabled + + By default when a PCIe device issues a read request via the PHB it is first + issued with nodal scope. When accessing GPU memory the NPU does not know at the + time of response if the requested memory page is off node or not. Therefore + every read of GPU memory by a PHB is retried with larger scope which introduces + bandwidth and latency issues. + + On smaller boxes which have pump mode enabled nodal and group scoped reads are + treated the same and both types of request are broadcast to one chip. Therefore + we can avoid the retry by disabling nodal scope on the PHB for these boxes. On + larger boxes nodal (single chip) and group (multiple chip) scoped reads are + treated differently. Therefore we avoid disabling nodal scope on large boxes + which have pump mode disabled to avoid all PHB requests being broadcast to + multiple chips. +- phb4/capp: Only reset FIR bits that cause capp machine check + + During CAPP recovery do_capp_recovery_scoms() will reset the CAPP Fir + register just after CAPP recovery is completed. This has an + unintentional side effect of preventing PRD from analyzing and + reporting this error. If PRD tries to read the CAPP FIR after opal has + already reset it, then it logs a critical error complaining "No active + error bits found". + + To prevent this from happening we update do_capp_recovery_scoms() to + only reset fir bits that cause CAPP machine check (local xstop). This + is done by reading the CAPP Fir Action0/1 & Mask registers and + generating a mask which is then written to the CAPP_FIR_CLEAR register. + +- phb4: Check for RX errors after link training + + Some PHB4 PHYs can get stuck in a bad state where they are constantly + retraining the link. This happens transparently to skiboot and Linux + but causes PCIe to be slow. Resetting the PHB4 clears the + problem. 
+ + We can detect this case by looking at the RX errors count where we + check for link stability. This patch does this by modifying the link + optimal code to check for RX errors. If errors are occurring we + retrain the link irrespective of the chip rev or card. + + Normally when this problem occurs, the RX error count is maxed out at + 255. When there is no problem, the count is 0. We chose 8 as the max + rx errors value to give us some margin for a few errors. There is also + a knob that can be used to set the error threshold for when we should + retrain the link, i.e. :: + + nvram -p ibm,skiboot --update-config phb-rx-err-max=8 + +- hw/phb4: Add a helper to dump the PELT-V + + The "Partitionable Endpoint Lookup Table (Vector)" is used by the PHB + when processing EEH events. The PELT-V defines which PEs should be + additionally frozen in the event of an error being flagged on a + given PE. Knowing the state of the PELT-V is sometimes useful for + debugging PHB issues so this patch adds a helper to dump it. + +- hw/phb4: Print the PEs in the EEH dump in hex + + Linux always displays the PE number in hexadecimal while skiboot + displays the PEST index (PE number) in decimal. This makes correlating + errors between Skiboot and Linux more annoying than it should be so + this patch makes Skiboot print the PEST number in hex. + +- phb4: Reallocate PEC2 DMA-Read engines to improve GPU-Direct bandwidth + + We reallocate additional 16/8 DMA-Read engines allocated to stack0/1 + on PEC2 respectively. This is needed to improve bandwidth available to + the Mellanox CX5 adapter when trying to read GPU memory (GPU-Direct). + + If the kernel cxl driver indicates a request to allocate the maximum possible + DMA read engines when calling enable_capi_mode() and the card is attached + to the PEC2/stack0 slot, then we assume it's a Mellanox CX5 adapter. We then + allocate an additional 16/8 DMA read engines to stack0 and stack1 + respectively on PEC2. 
This is done by populating the + XPEC_PCI_PRDSTKOVR and XPEC_NEST_READ_STACK_OVERRIDE as suggested by + the h/w team. +- phb4: Enable PHB MMIO-0/1 Bars only when mmio window exists + + Presently phb4_probe_stack() will always enable PHB MMIO0/1 windows + even if they don't exist in phy_map. Hence we do some minor shuffling + in the phb4_probe_stack() so that MMIO-0/1 Bars are only enabled if + their corresponding MMIO window exists in the phy_map. In case phy_map + for an mmio window is '0' we set the corresponding BAR register to + '0'. +- hw/phb4: Use local_alloc for phb4 structures + + Struct phb4 is fairly heavyweight at 283664 bytes. On systems with + 6x PHBs per socket this results in using 3.2MB of heap space for the PHB + structures alone. This is a fairly large chunk of our 12MB heap and + on systems with particularly large PCIe topologies, or additional + PHBs we can fail to boot because we cannot allocate space for the + FDT blob. + + This patch switches to using local_alloc() for the PHB structures + so they don't consume too large a portion of our 12MB heap space. +- phb4: Fix typo in disable lane eq code + + In this commit :: + + commit 737c0ba3d72b8aab05a765a9fc111a48faac0f75 + Author: Michael Neuling <mikey@neuling.org> + Date: Thu Feb 22 10:52:18 2018 +1100 + phb4: Disable lane eq when retrying some nvidia GEN3 devices + + We made a typo and set PH2 twice. This fixes it. + + It worked previously because, if only phase 2 (PH2) is set, it skips phase 2 + and phase 3 (PH3). +- phb4: Don't probe a PHB if its garded + + Presently phb4_probe_stack() causes an exception while trying to probe + a PHB if it's garded. This causes skiboot to go into a reboot loop with + the following exception log: :: + + *********************************************** + Fatal MCE at 000000003006ecd4 .probe_phb4+0x570 + CFAR : 00000000300b98a0 + <snip> + Aborting! 
+ CPU 0018 Backtrace: + S: 0000000031cc37e0 R: 000000003001a51c ._abort+0x4c + S: 0000000031cc3860 R: 0000000030028170 .exception_entry+0x180 + S: 0000000031cc3a40 R: 0000000000001f10 * + S: 0000000031cc3c20 R: 000000003006ecb0 .probe_phb4+0x54c + S: 0000000031cc3e30 R: 0000000030014ca4 .main_cpu_entry+0x5b0 + S: 0000000031cc3f00 R: 0000000030002700 boot_entry+0x1b8 + + This happens because phb4_probe_stack() ignores all xscom read/write + errors when enabling the PHB Bars and then tries to perform an mmio to read + the PHB Version registers, which causes the fatal MCE. + + We fix this by skipping the PHB probe if the first xscom_write() to + populate the PHB Bar register fails, which indicates that there is + something wrong with the PHB. +- phb4: Workaround PHB errata with CFG write UR/CA errors + + If the PHB encounters a UR or CA status on a CFG write, it will + incorrectly freeze the wrong PE. Instead of using the PE# specified + in the CONFIG_ADDRESS register, it will use the PE# of whatever + MMIO occurred last. + + Work around this by disabling freeze on such errors. +- phb4: Handle allocation errors in phb4_eeh_dump_regs() + + If the zalloc fails (and it can be a rather large allocation), + we will overwrite memory at 0 instead of failing. +- phb4: Don't try to access non-existent PEST entries + + In a POWER9 chip, some PHB4s have 256 PEs, some have 512. + + Currently, the diagnostics code retrieves 512 unconditionally, + which is wrong and causes us to incorrectly report bogus values + for the "high" PEs on the small PHBs. + + Use the actual number of implemented PEs instead. + +CAPI2 +^^^^^ + +- phb4/capp: Use link width to allocate STQ engines to CAPP + + Update phb4_init_capp_regs() to allocate STQ Engines to CAPP/PEC2 + based on link width instead of always assuming it to be x8. + + Also re-factor the function slightly to evaluate the link-width only + once and cache it so that it can also be used to allocate DMA read + engines. 
+- phb4/capp: Update DMA read engines set in APC_FSM_READ_MASK based on link-width + + Commit 47c09cdfe7a3 ("phb4/capp: Calculate STQ/DMA read engines based + on link-width for PEC") updated the CAPP init sequence by calculating + the needed STQ/DMA-read engines based on link width and populating it + in the XPEC_NEST_CAPP_CNTL register. This however needs to be synchronized + with the value set in the CAPP APC FSM Read Machine Mask Register. + + Hence this patch updates phb4_init_capp_regs() to calculate the link + width of the stack on PEC2 and populate the same values as previously + populated in the PEC CAPP_CNTL register. +- capp: Fix the capp recovery timeout comparison + + The current capp recovery timeout control loop in + do_capp_recovery_scoms() uses a wrong comparison for the return value of + tb_compare(). This may cause do_capp_recovery_scoms() to report a + timeout earlier than the stipulated 168ms. + + The patch fixes this by updating the loop timeout control branch in + do_capp_recovery_scoms() to use the correct enum tb_cmpval. +- phb4: Disable 32-bit MSI in capi mode + + If a capi device does a DMA write targeting an address lower than 4GB, + it does so through a 32-bit operation, per the PCI spec. In capi mode, + the first TVE entry is configured in bypass mode, so the address is + valid. But with any (bad) luck, the address could be 0xFFFFxxxx, thus + looking like a 32-bit MSI. + + We currently enable both 32-bit and 64-bit MSIs, so the PHB will + interpret the DMA write as an MSI, which very likely results in an EEH + (MSI with a bad payload size). + + We can fix it by disabling 32-bit MSI when switching the PHB to capi + mode. Capi devices are 64-bit. + +NVLINK2 +^^^^^^^ +- npu2: Add support for relaxed-ordering mode + + Some device drivers support out of order access to GPU memory. This does + not affect the CPU view of memory but it does affect the GPU view of + memory. It should only be enabled if the GPU driver has requested it. 
+ + Add OPAL APIs allowing the driver to query relaxed ordering state or + request it to be set for a device. Current hardware only allows relaxed + ordering to be enabled per PCIe root port. So the code here doesn't + enable relaxed ordering until it has been explicitly requested for every + device on the port. +- Add the other 7 ATSD registers to the device tree. +- npu2/hw-procedures: Don't open code NPU2_NTL_MISC_CFG2_BRICK_ENABLE + + Name this bit properly. There's a lot more cleanup like this to be done, + but I'm catching this one now as part of some related changes. +- npu2/hw-procedures: Enable parity and credit overflow checks + + Enable these error checking features by setting the appropriate bits in + our one-off initialization of each "NTL Misc Config 2" register. + + The exception is NDL RX parity checking, which should be disabled during + the link training procedures. +- npu2: Use correct kill type for TCE invalidation + + kill_type is an enum of OPAL_PCI_TCE_KILL_PAGES, OPAL_PCI_TCE_KILL_PE, + OPAL_PCI_TCE_KILL_ALL and phb4_tce_kill() gets it right but + npu2_tce_kill() uses OPAL_PCI_TCE_KILL which is an OPAL API token. + + This fixes an obvious mistype. + +OpenCAPI +^^^^^^^^ + +- Support OpenCAPI on Witherspoon platform +- npu2-opencapi: Enable presence detection on ZZ + + Presence detection for opencapi adapters was broken for ZZ planars v3 + and below. All ZZ systems currently used in the lab have had their + planar upgraded, so we can now remove the override we had to force + presence and activate presence detection, which should improve boot + time. + + Considering the state of opal support on ZZ, this is really only for + lab usage on BML. The opencapi enablement team has okay'd the + change. In the unlikely case somebody tries opencapi on an old ZZ, the + presence detection through i2c will show that no adapter is present + and skiboot won't try to access or train the link. 
+- npu2-opencapi: Don't send commands to NPU when link is down + + Even if an opencapi link is down, we currently always try to issue a + config read operation when probing for PCI devices, because of the + default scan map used for an opencapi PHB. The config operation fails, + as expected, but it can also raise a FIR bit and trigger an HMI. + + For opencapi, there's no root device like for a "normal" PCI PHB, so + there's no reason to do the config operation. To fix it, we keep the + scan map blank by default, and only add a device once the link is + trained. +- opal/hmi: Catch NPU2 HMIs for opencapi + + HMIs for NPU2 are filtered with the 'compatible' string of the PHB, so + add opencapi to the mix. +- occ: Wait if OCC GPU presence status not immediately available + + It takes a few seconds for the OCC to set everything up in order to read + GPU presence. At present, we try to kick off OCC initialisation as early as + possible to maximise the time it has to read GPU presence. + + Unfortunately sometimes that's not enough, so add a loop in + occ_get_gpu_presence() so that on the first time we try to get GPU presence + we keep trying for up to 2 seconds. Experimentally this seems to be + adequate. +- hw/npu2-hw-procedures: Enable RX auto recal on OpenCAPI links + + The RX_RC_ENABLE_AUTO_RECAL flag is required on OpenCAPI but not NVLink. + + Traditionally, Hostboot sets this value according to the machine type. + However, now that Witherspoon supports both NVLink and OpenCAPI, it can't + tell whether or not a link is OpenCAPI. + + So instead, set it in skiboot, where it will only be triggered after we've + done device detection and found an OpenCAPI device. +- hw/npu2-opencapi: Fix setting of supported OpenCAPI templates + + In opal_npu_tl_set(), we made a typo that means the OPAL_NPU_TL_SET call + may not clear the enable bits for templates that were previously enabled + but are now disabled. 
+ + Fix the typo so we clear NPU2_OTL_CONFIG1_TX_TEMP2_EN as well as + TEMP{1,3}_EN. + +Barreleye G2 and Zaius platforms +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +- zaius: Add a slot table +- zaius: Add slots for the Barreleye G2 HDD rack + + The Barreleye G2 is distinct from the Zaius in that it features a 24 + Bay NVMe/SATA HDD rack. To provide meaningful slot names for each NVMe + device we need to define a slot table for the NVMe capable HDD bays. + + Unfortunately this isn't straightforward because the PCIe path to the + NVMe devices isn't fixed. The PCIe topology is something like: + P9 -> HBA card -> 9797 switch -> 20x NVMe HDD slots + + The 9797 switch is partitioned into two (or four) virtual switches which + allow multiple HBA cards to be used (e.g. one per socket). As a result + the exact BDFN of the ports will vary depending on how the system is + configured. + + That said, the virtual switch configuration of the 9797 does not change + the device and function numbers of the switch downports. This means that + we can define a single slot table that maps switch ports to the NVMe bay + names. + + Unfortunately we still need to guess which bus to use this table on, so + we assume that any switch downport we find with the PEX9797 VDID is part + of the 9797 that supports the HDD rack. + +FSP based platforms (firenze and ZZ) +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +- phb4/capp: Update the expected Eye-catcher for CAPP ucode lid + + Currently on an FSP based P9 system load_capp_code() expects the CAPP ucode + lid header to have an eye-catcher magic of 'CAPPPSLL'. However skiboot + currently supports only CAPP ucode lids that have an eye-catcher magic + of 'CAPPLIDH'. This prevents skiboot from loading the ucode with this + error message: :: + + CAPP: ucode header invalid + + We fix this issue by updating load_capp_ucode() to use the eye-catcher + value of 'CAPPLIDH' instead of 'CAPPPSLL'. + +- FSP: Improve Reset/Reload log message + + The message below is confusing. 
Let's make it clear. + + FSP sends "R/R complete notification" whenever there is a dump. We use `flag` + to identify whether it's an R/R completion or just a new dump notification. :: + + [ 483.406351956,6] FSP: SP says Reset/Reload complete + [ 483.406354278,5] DUMP: FipS dump available. ID = 0x1a00001f [size: 6367640 bytes] + [ 483.406355968,7] A Reset/Reload was NOT done + +Witherspoon platform +^^^^^^^^^^^^^^^^^^^^ + +- platforms/astbmc/witherspoon: Implement OpenCAPI support + + OpenCAPI on Witherspoon is slightly more involved than on Zaius and ZZ, due + to the OpenCAPI links using the SXM2 connectors that are used for NVLink + GPUs. + + This patch adds the regular OpenCAPI platform information, and also a + Witherspoon-specific presence detection callback that uses the previously + added OCC GPU presence detection to figure out the device types plugged + into each SXM2 socket. + + The SXM2 connectors are capable of carrying 2 OpenCAPI links, and future + OpenCAPI devices are expected to make use of this. However, we don't yet + support ganged links and the various implications that has for handling + things like device reset, so for now, we only enable 1 brick per device. diff --git a/roms/skiboot/doc/release-notes/skiboot-6.2-rc2.rst b/roms/skiboot/doc/release-notes/skiboot-6.2-rc2.rst new file mode 100644 index 000000000..1a3ff6363 --- /dev/null +++ b/roms/skiboot/doc/release-notes/skiboot-6.2-rc2.rst @@ -0,0 +1,76 @@ +.. _skiboot-6.2-rc2: + +skiboot-6.2-rc2 +=============== + +skiboot v6.2-rc2 was released on Thursday November 29th 2018. It is the second +release candidate of skiboot 6.2, which will become the new stable release +of skiboot following the 6.1 release, first released July 11th 2018. + +Skiboot 6.2 will mark the basis for op-build v2.2. + +skiboot v6.2-rc2 contains all bug fixes as of :ref:`skiboot-6.0.14`, +and :ref:`skiboot-5.4.10` (the currently maintained +stable releases). 
+ +For details on how the skiboot stable releases work, see :ref:`stable-rules`. + +Over :ref:`skiboot-6.2-rc1`, we have the following changes: + +- npu2-opencapi: Log extra information on link training failure +- npu2-opencapi: Detect if link trained in degraded mode +- platform/firenze: Fix branch-to-null crash + + When the bus alloc and free methods were removed we missed a case in the + Firenze platform slot code that relied on the bus-specific method to + set the bus pointer in the request structure. This results in a + branch-to-null during boot and a crash. This patch fixes it by + initialising it manually here. +- libflash: Don't merge ECC-protected ranges + + Libflash currently merges contiguous ECC-protected ranges, but doesn't + check that the ECC bytes at the end of the first and start of the second + range actually match sanely. More importantly, if blocklevel_read() is + called with a position at the start of a partition that is contained + somewhere within a region that has been merged it will update the + position assuming ECC wasn't being accounted for. This results in the + position being somewhere well after the actual start of the partition + which is incorrect. + + For now, remove the code merging ranges. This means more ranges must be + held and checked; however, it prevents incorrectly reading ECC-correct + regions like below: :: + + [ 174.334119453,7] FLASH: CAPP partition has ECC + [ 174.437349574,3] ECC: uncorrectable error: ffffffffffffffff ff + [ 174.437426306,3] FLASH: failed to read the first 0x1000 from CAPP partition, rc 14 + [ 174.439919343,3] CAPP: Error loading ucode lid. index=201d1 + +- libflash: Restore blocklevel tests + + This fell out in f58be46 "libflash/test: Rewrite Makefile.check to + improve scalability". Add it back in as test-blocklevel. +- Warn on long OPAL calls + + Measure entry/exit time for OPAL calls and warn appropriately if the + calls take too long (>100ms gets us a DEBUG log, > 1000ms gets us a + warning). 
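The two thresholds in the "Warn on long OPAL calls" entry can be sketched as a small classification helper. This is an illustrative sketch only: `classify_call_duration()`, the `call_log_level` enum and the millisecond arguments are hypothetical names chosen for this example, and how skiboot itself measures time and emits the log is not shown here.

```c
#include <stdint.h>

/* Hypothetical log levels for this sketch; not skiboot's own constants. */
enum call_log_level {
	CALL_LOG_NONE,
	CALL_LOG_DEBUG,
	CALL_LOG_WARNING,
};

/* Thresholds taken from the release note, in milliseconds. */
#define CALL_DEBUG_THRESHOLD_MS   100
#define CALL_WARNING_THRESHOLD_MS 1000

/*
 * Decide what to log for a call that was entered at entry_ms and
 * returned at exit_ms. Both comparisons are strict, matching the
 * ">100ms" and ">1000ms" wording of the note.
 */
enum call_log_level classify_call_duration(uint64_t entry_ms, uint64_t exit_ms)
{
	uint64_t elapsed = exit_ms - entry_ms;

	if (elapsed > CALL_WARNING_THRESHOLD_MS)
		return CALL_LOG_WARNING;
	if (elapsed > CALL_DEBUG_THRESHOLD_MS)
		return CALL_LOG_DEBUG;
	return CALL_LOG_NONE;
}
```

A caller would record a timestamp on entry, another on exit, and log at the returned level only when it is not `CALL_LOG_NONE`.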
+ +CI, testing, and utilities +-------------------------- + +- travis: Coverity fixed their SSL cert +- opal-ci: Use ubuntu:rolling for Ubuntu latest image +- ffspart: Add test for eraseblock size +- ffspart: Add toc test +- hdata/test: workaround dtc bugs + + In dtc v1.4.5 to at least v1.4.7 there have been a few bugs introduced + that change the layout of what's produced in the dts. In order to be + immune from them, we should use the (provided) dtdiff utility, but we + also need to run the dts we're diffing against through a dtb cycle in + order to ensure we get the same format as what the hdat_to_dt to dts + conversion will. + + This fixes a bunch of unit test failures on the version of dtc shipped + with recent Linux distros such as Fedora 29. diff --git a/roms/skiboot/doc/release-notes/skiboot-6.2.1.rst b/roms/skiboot/doc/release-notes/skiboot-6.2.1.rst new file mode 100644 index 000000000..8a475d2bc --- /dev/null +++ b/roms/skiboot/doc/release-notes/skiboot-6.2.1.rst @@ -0,0 +1,83 @@ +.. _skiboot-6.2.1: + +============= +skiboot-6.2.1 +============= + +skiboot 6.2.1 was released on Wednesday February 20th, 2019. It replaces +:ref:`skiboot-6.2` as the current stable release in the 6.2.x series. + +It is recommended that 6.2.1 be used instead of any previous 6.2.x version +due to the bug fixes it contains. + +Bug fixes included in this release are: + +- libflash/ecc: Fix compilation warning with gcc9 + + Fixes: https://github.com/open-power/skiboot/issues/218 + +- core/opal: Print PIR value in exit path, useful for debugging +- core/ipmi: Improve error message +- firmware-versions: Add test case for parsing VERSION + + If we hit an entry in VERSION that is larger than our + buffer size, we skip over it gracefully rather than overwriting the + stack. This is only a problem if VERSION isn't trusted, which as of + 4b8cc05a94513816d43fb8bd6178896b430af08f it is verified as part of + Secure Boot. 
+- core/cpu: HID update race + + If the per-core HID register is updated concurrently by multiple + threads, updates can get lost. This has been observed during fast + reboot where the HILE bit does not get cleared on all cores, which + can cause machine check exception interrupts to crash. + + Fix this by only updating HID on thread0. +- cpufeatures: Always advertise POWER8NVL as DD2 + + Despite the major version of PVR being 1 (0x004c0100) for POWER8NVL, + these chips are functionally equivalent to P8/P8E DD2 levels. + + This advertises POWER8NVL as DD2. As a result, skiboot adds + ibm,powerpc-cpu-features/processor-control-facility for such CPUs and + the linux kernel can use hypervisor doorbell messages to wake secondary + threads; otherwise "KVM: CPU %d seems to be stuck" would appear because + of missing LPCR_PECEDH. +- p9dsu: Fix p9dsu slot tables + + Set the attributes on the slot tables to account for + builtin or pluggable etypes; this will allow pci + enumeration to calculate subordinate buses. + + Update some slot label strings. + + Add WIO Slot5 which is standard on the ESS config. +- core/lock: Stop drop_my_locks() from always causing abort + + The loop in drop_my_locks() looks like this: :: + + while((l = list_pop(&this_cpu()->locks_held, struct lock, list)) != NULL) { + if (warn) + prlog(PR_ERR, " %s\n", l->owner); + unlock(l); + } + + Both list_pop() and unlock() call list_del(). This means that on the + last iteration of the loop, the list will be empty when we get to + unlock_check(), causing this: :: + + LOCK ERROR: Releasing lock we don't hold depth @0x30493d20 (state: 0x0000000000000001) + [13836.000173140,0] Aborting! + CPU 0000 Backtrace: + S: 0000000031c03930 R: 000000003001d840 ._abort+0x60 + S: 0000000031c039c0 R: 000000003001a0c4 .lock_error+0x64 + S: 0000000031c03a50 R: 0000000030019c70 .unlock+0x54 + S: 0000000031c03af0 R: 000000003001a040 .drop_my_locks+0xf4 + + To fix this, change list_pop() to list_top(). 
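The drop_my_locks() fix above comes down to peeking at the head of the held-lock list instead of unlinking it, so that unlock() performs the only removal. A minimal sketch of the two loop shapes follows. It uses a hand-rolled singly linked list rather than skiboot's CCAN list, and every name in it (``hold``, ``held_top``, ``held_pop``, ``lock_errors``, ``drop_locks_pop``, ``drop_locks_top``) is hypothetical; its ``unlock()`` is a stand-in that only models the "list is empty" check described in the note.

```c
#include <stddef.h>

/* Hypothetical stand-in for a lock linked into a per-CPU held list. */
struct lock {
	struct lock *next;
};

static struct lock *locks_held; /* head of the held list */
static int lock_errors;         /* counts "releasing lock we don't hold" */

static void hold(struct lock *l)
{
	l->next = locks_held;
	locks_held = l;
}

/* Peek at the first held lock without unlinking it (the list_top() role). */
static struct lock *held_top(void)
{
	return locks_held;
}

/* Unlink and return the first held lock (the list_pop() role). */
static struct lock *held_pop(void)
{
	struct lock *l = locks_held;

	if (l)
		locks_held = l->next;
	return l;
}

/* Stand-in unlock: complain if nothing is held, otherwise unlink l. */
static void unlock(struct lock *l)
{
	struct lock **pp;

	if (!locks_held) {
		lock_errors++;
		return;
	}
	for (pp = &locks_held; *pp; pp = &(*pp)->next) {
		if (*pp == l) {
			*pp = l->next;
			break;
		}
	}
}

/* Buggy shape: pop unlinks first, so the last unlock() sees an empty list. */
static void drop_locks_pop(void)
{
	struct lock *l;

	while ((l = held_pop()) != NULL)
		unlock(l);
}

/* Fixed shape: peek only, and let unlock() do the single unlink. */
static void drop_locks_top(void)
{
	struct lock *l;

	while ((l = held_top()) != NULL)
		unlock(l);
}
```

With two locks held, the pop variant records one error on the final iteration, while the peek variant drains the list cleanly.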
+- p9dsu: Fix p9dsu default variant + + Add the default when no riser_id is returned from the ipmi query. + + Allow a little more time for BMC reply and cleanup some label strings. + diff --git a/roms/skiboot/doc/release-notes/skiboot-6.2.2.rst b/roms/skiboot/doc/release-notes/skiboot-6.2.2.rst new file mode 100644 index 000000000..e34841ab7 --- /dev/null +++ b/roms/skiboot/doc/release-notes/skiboot-6.2.2.rst @@ -0,0 +1,228 @@ +.. _skiboot-6.2.2: + +============= +skiboot-6.2.2 +============= + +skiboot 6.2.2 was released on Wednesday March 6th, 2019. It replaces +:ref:`skiboot-6.2.1` as the current stable release in the 6.2.x series. + +It is recommended that 6.2.2 be used instead of any previous 6.2.x version +due to the bug fixes it contains. + +Over :ref:`skiboot-6.2.1` we have several bug fixes, including important ones +for powercap, ipmi-hiomap, astbmc and BMC communication driver. + +powercap +======== +- powercap: occ: Fix the powercapping range allowed for user + + OCC provides two limits for minimum powercap. One being hard powercap + minimum which is guaranteed by OCC and the other one is a soft + powercap minimum which is lesser than hard-min and may or may not be + asserted due to various power-thermal reasons. So to allow the users + to access the entire powercap range, this patch exports soft powercap + minimum as the "powercap-min" DT property. And it also adds a new + DT property called "powercap-hard-min" to export the hard-min powercap + limit. + +ASTBMC +====== +- astbmc: Enable IPMI HIOMAP for AMI platforms + + Required for Habanero, Palmetto and Romulus. + +- astbmc: Try IPMI HIOMAP for P8 (again) + + The HIOMAP protocol was developed after the release of P8 in preparation + for P9. As a consequence P9 always uses it, but it has rarely been + enabled for P8. P8DTU has recently added IPMI HIOMAP support to its BMC + firmware, so enable its use in skiboot with P8 machines. 
Doing so + requires some rework to ensure fallback works correctly as in the past + the fallback was to mbox, which will only work for P9. + + Tested on Garrison, Palmetto without HIOMAP, Palmetto with HIOMAP, and + Witherspoon. + +- ast-io: Rework ast_sio_is_enabled() test sequence + + The postcondition of probing with a lock sequence is easier to make + correct than with unlock. The original implementation left SuperIO + locked after execution which broke an assumption of some callers. + + Tested on Garrison, Palmetto without HIOMAP, Palmetto with HIOMAP and + Witherspoon. + +P8DTU +===== +- p8dtu: Enable HIOMAP support + +- p8dtu: Configure BMC graphics + + We can no longer read the values from the BMC in the way we have in the + past. Values were provided by Eric Chen of SMC. + +IPMI-HIOMAP +=========== +- ipmi-hiomap test case enhancements/fixes. + +- libflash/ipmi-hiomap: Enforce message size for empty response + + The protocol defines the response to the associated messages as empty + except for the command ID and sequence fields. If the BMC is returning + extra data consider the message malformed. + +- libflash/ipmi-hiomap: Remove unused close handling + + Issuing a HIOMAP_C_CLOSE is not required by the protocol specification, + rather a close can be implicit in a subsequent + CREATE_{READ,WRITE}_WINDOW request. The implicit close provides an + opportunity to reduce LPC traffic and the implementation takes up that + optimisation, so remove the case from the IPMI callback handler. + +- libflash/ipmi-hiomap: Overhaul event handling + + Reworking the event handling was inspired by a bug report by Vasant + where the host would get wedged on multiple flash access attempts in the + face of a persistent error state on the BMC-side. The cause of this bug + was the early-exit based on ctx->update, which erroneously assumed that + all events had been completely handled in prior calls to + ipmi_hiomap_handle_events(). This is not true if e.g. 
+ HIOMAP_E_DAEMON_READY is clear in the prior calls. + + Regardless, there were other correctness and efficiency problems with + the handling strategy: + + * Ack-able event state was not restored in the face of errors in the + process of re-establishing protocol state + + * It forced needless window restoration with respect to the context in + which ipmi_hiomap_handle_events() was called. + + * Tests for HIOMAP_E_DAEMON_READY and HIOMAP_E_FLASH_LOST were redundant + with the overhauled error handling introduced in the previous patch + + Fix all of the above issues and add comments to explain the event + handling flow. + + Tests for correctness follow later in the series. + +- libflash/ipmi-hiomap: Overhaul error handling + + The aim is to improve the robustness with respect to absence of the + BMC-side daemon. The current error handling roughly mirrors what was + done for the mailbox implementation, but there's room for improvement. + + Errors are split into two classes, those that affect the transport state + and those that affect the window validity. From here, we push the + transport state error checks right to the bottom of the stack, to ensure + the link is known to be in a good state before any message is sent. + Window validity tests remain as they were in the hiomap_window_move() + and ipmi_hiomap_read() functions. Validity tests are not necessary in + the write and erase paths as we will receive an error response from the + BMC when performing a dirty or flush on an invalid window. + + Recovery also remains as it was, done on entry to the blocklevel + callbacks. If an error state is encountered in the middle of an + operation no attempt is made to recover it on the spot, instead the + error is returned up the stack and the caller can choose how it wishes + to respond. 
+ +- libflash/ipmi-hiomap: Fix leak of msg in callback + +BMC communication +================= +- core/ipmi: Add ipmi sync messages to top of the list + + In ipmi_queue_msg_sync() path OPAL will wait until it gets response from + BMC. If we do not get response in time we may end up with kernel hard lockups. + Hence, let's add sync messages to top of the queue. This reduces the + chance of hard lockups. + +- hw/bt: Introduce separate list for synchronous messages + + BT send logic always sends top of bt message list to BMC. Once BMC reads the + message, it clears the interrupt and bt_idle() becomes true. + + bt_add_ipmi_msg_head() adds message to top of the list. If bt message list + is not empty then: + + - if bt_idle() is true then we will end up sending message to BMC before + getting response from BMC for inflight message. Looks like on some + BMC implementation this results in message timeout. + - else we end up starting message timer without actually sending message + to BMC, which is not correct. + + This patch introduces separate list to track synchronous messages. + bt_add_ipmi_msg_head() will add messages to tail of this new list. We + will always process this queue before processing normal queue. + + Finally this patch introduces new variable (inflight_bt_msg) to track + inflight message. This will point to current inflight message. + +- hw/bt: Fix message retry handler + + In some corner cases (like BMC reboot), bt_send_and_unlock() starts + message timer, but won't send message to BMC as driver is not free to + send message. bt_expire_old_msg() function enables H2B interrupt without + actually sending message. + + This patch fixes above issue. + +- ipmi/power: Fix system reboot issue + + Kernel makes reboot/shutdown OPAL call for reboot/shutdown. Once kernel + gets response from OPAL it runs opal_poll_events() until firmware + handles the request. + + On BMC based system, OPAL makes IPMI call (IPMI_CHASSIS_CONTROL) to + initiate system reboot/shutdown. 
At present OPAL queues IPMI messages + and returns SUCCESS to Host. If BMC is not ready to accept command (like + BMC reboot), then these messages will fail. We have to manually + reboot/shutdown the system using BMC interface. + + This patch adds logic to validate message return value. If message failed, + then it will resend the message. At some stage BMC will be ready to accept + message and handles IPMI message. + +- hw/bt: Add backend interface to disable ipmi message retry option + + During boot OPAL makes IPMI_GET_BT_CAPS call to BMC to get BT interface + capabilities which includes IPMI message max resend count, message + timeout, etc. Most of the time OPAL gets response from BMC within + specified timeout. In some corner cases (like mboxd daemon reset in BMC, + BMC reboot, etc) OPAL may not get response within timeout period. In + such scenarios, OPAL resends message until the max resend count is reached. + + OPAL uses synchronous IPMI message (ipmi_queue_msg_sync()) for few + operations like flash read, write, etc. Thread will wait in OPAL until + it gets response from BMC. In some corner cases like BMC reboot, thread + may wait in OPAL for long time (more than 20 seconds) and results in + kernel hard lockup. + + This patch introduces new interface to disable message resend option. We + will disable message resend option for synchronous messages. This will + greatly reduce kernel hard lockup issues. + + This is short term fix. Long term solution is to convert all synchronous + messages to asynchronous ones. + +- qemu: bt device isn't always hanging off / + + Just use the normal for_each_compatible instead. + + Otherwise in the qemu model as executed by op-test, + we wouldn't go down the astbmc_init() path, thus not having flash. 
+ +PHB3 +==== +- hw/phb3/naples: Disable D-states + + Putting "Mellanox Technologies MT27700 Family [ConnectX-4] [15b3:1013]" + (more precisely, the second of 2 its PCI functions, no matter in what + order) into the D3 state causes EEH with the "PCT timeout" error. + This has been noticed on garrison machines only and firestones do not + seem to have this issue. + + This disables D-states changing for devices on root buses on Naples by + installing a config space access filter (copied from PHB4). diff --git a/roms/skiboot/doc/release-notes/skiboot-6.2.3.rst b/roms/skiboot/doc/release-notes/skiboot-6.2.3.rst new file mode 100644 index 000000000..2f153e9bd --- /dev/null +++ b/roms/skiboot/doc/release-notes/skiboot-6.2.3.rst @@ -0,0 +1,45 @@ +.. _skiboot-6.2.3: + +============= +skiboot-6.2.3 +============= + +skiboot 6.2.3 was released on Tuesday March 19th, 2019. It replaces +:ref:`skiboot-6.2.2` as the current stable release in the 6.2.x series. + +It is recommended that 6.2.3 be used instead of any previous 6.2.x version +due to the bug fixes it contains. + +Bug fixes included in this release are: + +- p9dsu: Undo slot label name changes + + During some code updates the slot labels were updated to reflect + the phb layout, however expectations were that the slot labels be + aligned with the riser card slots and not the system planar slots. + + [stewart: The tale of how we got here is long and varied and not at + all clear. The first ESS systems went out with a skiboot v5.9.8 with + additional SuperMicro patches. It was probably a slot table, but who knows, + we don't have the code so can't check. It's possible it was all coming + in through HDAT instead). The op-build tree (thus the exact patches) + shipped on systems that work correct seems to not be around anywhere anymore + (if it ever was). 
It was only in skiboot v6.0 that a slot table made + it in, and, of course, only having remote machines in random configs, + including possibly with riser cards from Briggs&Stratton rather than + the ones destined for this system, doesn't make for verifying this + at all. It also doesn't help that *consistently* there is *never* + any review on slot tables, and we've had things be wrong in the past. + Combine this with not upstream Hostboot patches.] + +- p9dsu: Fix slot labels for p9dsu2u + + Update the slot labels for the p9dsu2u tables. + +- fast-reboot: occ: Call occ_pstates_init() on fast-reset on all machines + + Commit 815417dcda2e ("init, occ: Initialise OCC earlier on BMC systems") + conditionally invoked occ_pstates_init() only on FSP based systems in + load_and_boot_kernel(). Due to this pstate table is re-parsed on FSP + system and skipped on BMC system during fast-reboot. So this patch fixes + this by invoking occ_pstates_init() on all boxes during fast-reboot. diff --git a/roms/skiboot/doc/release-notes/skiboot-6.2.4.rst b/roms/skiboot/doc/release-notes/skiboot-6.2.4.rst new file mode 100644 index 000000000..bba9ebb5e --- /dev/null +++ b/roms/skiboot/doc/release-notes/skiboot-6.2.4.rst @@ -0,0 +1,236 @@ +.. _skiboot-6.2.4: + +============= +skiboot-6.2.4 +============= + +skiboot 6.2.4 was released on Thursday May 9th, 2019. It replaces +:ref:`skiboot-6.2.3` as the current stable release in the 6.2.x series. + +It is recommended that 6.2.4 be used instead of any previous 6.2.x version +due to the bug fixes it contains. + +Bug fixes included in this release are: + +- core/flash: Retry requests as necessary in flash_load_resource() + + We would like to successfully boot if we have a dependency on the BMC + for flash even if the BMC is not current ready to service flash + requests. 
On the assumption that it will become ready, retry for several + minutes to cover a BMC reboot cycle and *eventually* rather than + *immediately* crash out with: :: + + [ 269.549748] reboot: Restarting system + [ 390.297462587,5] OPAL: Reboot request... + [ 390.297737995,5] RESET: Initiating fast reboot 1... + [ 391.074707590,5] Clearing unused memory: + [ 391.075198880,5] PCI: Clearing all devices... + [ 391.075201618,7] Clearing region 201ffe000000-201fff800000 + [ 391.086235699,5] PCI: Resetting PHBs and training links... + [ 391.254089525,3] FFS: Error 17 reading flash header + [ 391.254159668,3] FLASH: Can't open ffs handle: 17 + [ 392.307245135,5] PCI: Probing slots... + [ 392.363723191,5] PCI Summary: + ... + [ 393.423255262,5] OCC: All Chip Rdy after 0 ms + [ 393.453092828,5] INIT: Starting kernel at 0x20000000, fdt at + 0x30800a88 390645 bytes + [ 393.453202605,0] FATAL: Kernel is zeros, can't execute! + [ 393.453247064,0] Assert fail: core/init.c:593:0 + [ 393.453289682,0] Aborting! + CPU 0040 Backtrace: + S: 0000000031e03ca0 R: 000000003001af60 ._abort+0x4c + S: 0000000031e03d20 R: 000000003001afdc .assert_fail+0x34 + S: 0000000031e03da0 R: 00000000300146d8 .load_and_boot_kernel+0xb30 + S: 0000000031e03e70 R: 0000000030026cf0 .fast_reboot_entry+0x39c + S: 0000000031e03f00 R: 0000000030002a4c fast_reset_entry+0x2c + --- OPAL boot --- + + The OPAL flash API hooks directly into the blocklevel layer, so there's + no delay for e.g. the host kernel, just for asynchronously loaded + resources during boot. + +- pci/iov: Remove skiboot VF tracking + + This feature was added a few years ago in response to a request to make + the MaxPayloadSize (MPS) field of a Virtual Function match the MPS of the + Physical Function that hosts it. + + The SR-IOV specification states the the MPS field of the VF is "ResvP". 
+ This indicates the VF will use whatever MPS is configured on the PF and + that the field should be treated as a reserved field in the config space + of the VF. In other words, a SR-IOV spec compliant VF should always return + zero in the MPS field. Adding hacks in OPAL to make it non-zero is... + misguided at best. + + Additionally, there is a bug in the way pci_device structures are handled + by VFs that results in a crash on fast-reboot that occurs if VFs are + enabled and then disabled prior to rebooting. This patch fixes the bug by + removing the code entirely. This patch has no impact on SR-IOV support on + the host operating system. + +- astbmc: Handle failure to initialise raw flash + + Initialising raw flash lead to a dead assignment to rc. Check the return + code and take the failure path as necessary. Both before and after the + fix we see output along the lines of the following when flash_init() + fails: :: + + [ 53.283182881,7] IRQ: Registering 0800..0ff7 ops @0x300d4b98 (data 0x3052b9d8) + [ 53.283184335,7] IRQ: Registering 0ff8..0fff ops @0x300d4bc8 (data 0x3052b9d8) + [ 53.283185513,7] PHB#0000: Initializing PHB... + [ 53.288260827,4] FLASH: Can't load resource id:0. No system flash found + [ 53.288354442,4] FLASH: Can't load resource id:1. No system flash found + [ 53.342933439,3] CAPP: Error loading ucode lid. index=200ea + [ 53.462749486,2] NVRAM: Failed to load + [ 53.462819095,2] NVRAM: Failed to load + [ 53.462894236,2] NVRAM: Failed to load + [ 53.462967071,2] NVRAM: Failed to load + [ 53.463033077,2] NVRAM: Failed to load + [ 53.463144847,2] NVRAM: Failed to load + + Eventually followed by: :: + + [ 57.216942479,5] INIT: platform wait for kernel load failed + [ 57.217051132,5] INIT: Assuming kernel at 0x20000000 + [ 57.217127508,3] INIT: ELF header not found. Assuming raw binary. + [ 57.217249886,2] NVRAM: Failed to load + [ 57.221294487,0] FATAL: Kernel is zeros, can't execute! 
+ [ 57.221397429,0] Assert fail: core/init.c:615:0 + [ 57.221471414,0] Aborting! + CPU 0028 Backtrace: + S: 0000000031d43c60 R: 000000003001b274 ._abort+0x4c + S: 0000000031d43ce0 R: 000000003001b2f0 .assert_fail+0x34 + S: 0000000031d43d60 R: 0000000030014814 .load_and_boot_kernel+0xae4 + S: 0000000031d43e30 R: 0000000030015164 .main_cpu_entry+0x680 + S: 0000000031d43f00 R: 0000000030002718 boot_entry+0x1c0 + --- OPAL boot --- + + Analysis of the execution paths suggests we'll always "safely" end this + way due to the setup sequence for the blocklevel callbacks in flash_init() + and error handling in blocklevel_get_info(), and there's no current risk + of executing from unexpected memory locations. As such the issue is + reduced down to a fix for poor error hygiene in the original change + and a resolution for a Coverity warning (famous last words etc). + +- hw/xscom: Enable sw xstop by default on p9 + + This was disabled at some point during bringup to make life easier for + the lab folks trying to debug NVLink issues. This hack really should + have never made it out into the wild though, so we now have the + following situation occurring in the field: + + 1) A bad happens + 2) The host kernel receives an unrecoverable HMI and calls into OPAL to + request a platform reboot. + 3) OPAL rejects the reboot attempt and returns to the kernel with + OPAL_PARAMETER. + 4) Kernel panics and attempts to kexec into a kdump kernel. + + A side effect of the HMI seems to be CPUs becoming stuck which results + in the initialisation of the kdump kernel taking an extremely long time + (6+ hours). It's also been observed that after performing a dump the + kdump kernel then crashes itself because OPAL has ended up in a bad + state as a side effect of the HMI. + + All up, it's not very good so re-enable the software checkstop by + default. If people still want to turn it off they can using the nvram + override. + +- opal/hmi: Initialize the hmi event with old value of TFMR. 
+ + Do this before we fix TFAC errors. Otherwise the event at host console + shows no thread error reported in TFMR register. + + Without this patch the console event shows TFMR with no thread error: + (DEC parity error TFMR[59] injection) :: + + [ 53.737572] Severe Hypervisor Maintenance interrupt [Recovered] + [ 53.737596] Error detail: Timer facility experienced an error + [ 53.737611] HMER: 0840000000000000 + [ 53.737621] TFMR: 3212000870e04000 + + After this patch it shows old TFMR value on host console: :: + + [ 2302.267271] Severe Hypervisor Maintenance interrupt [Recovered] + [ 2302.267305] Error detail: Timer facility experienced an error + [ 2302.267320] HMER: 0840000000000000 + [ 2302.267330] TFMR: 3212000870e14010 + +- libflash/ipmi-hiomap: Fix blocks count issue + + We convert data size to block count and pass block count to BMC. + If data size is not block aligned then we end up sending block count + less than actual data. BMC will write partial data to flash memory. + + Sample log :: + + [ 594.388458416,7] HIOMAP: Marked flash dirty at 0x42010 for 8 + [ 594.398756487,7] HIOMAP: Flushed writes + [ 594.409596439,7] HIOMAP: Marked flash dirty at 0x42018 for 3970 + [ 594.419897507,7] HIOMAP: Flushed writes + + In this case HIOMAP sent data with block count=0 and hence BMC didn't + flush data to flash. + + Let's fix this issue by adjusting block count before sending it to BMC. + +- Fix hang in pnv_platform_error_reboot path due to TOD failure. + + On TOD failure, with TB stuck, when linux heads down to + pnv_platform_error_reboot() path due to unrecoverable hmi event, the panic + cpu gets stuck in OPAL inside ipmi_queue_msg_sync(). At this time, all + the other cpus are in smp_handle_nmi_ipi() waiting for panic cpu to proceed. + But with panic cpu stuck inside OPAL, linux never recovers/reboots. 
:: + + p0 c1 t0 + NIA : 0x000000003001dd3c <.time_wait+0x64> + CFAR : 0x000000003001dce4 <.time_wait+0xc> + MSR : 0x9000000002803002 + LR : 0x000000003002ecf8 <.ipmi_queue_msg_sync+0xec> + + STACK: SP NIA + 0x0000000031c236e0 0x0000000031c23760 (big-endian) + 0x0000000031c23760 0x000000003002ecf8 <.ipmi_queue_msg_sync+0xec> + 0x0000000031c237f0 0x00000000300aa5f8 <.hiomap_queue_msg_sync+0x7c> + 0x0000000031c23880 0x00000000300aaadc <.hiomap_window_move+0x150> + 0x0000000031c23950 0x00000000300ab1d8 <.ipmi_hiomap_write+0xcc> + 0x0000000031c23a90 0x00000000300a7b18 <.blocklevel_raw_write+0xbc> + 0x0000000031c23b30 0x00000000300a7c34 <.blocklevel_write+0xfc> + 0x0000000031c23bf0 0x0000000030030be0 <.flash_nvram_write+0xd4> + 0x0000000031c23c90 0x000000003002c128 <.opal_write_nvram+0xd0> + 0x0000000031c23d20 0x00000000300051e4 <opal_entry+0x134> + 0xc000001fea6e7870 0xc0000000000a9060 <opal_nvram_write+0x80> + 0xc000001fea6e78c0 0xc000000000030b84 <nvram_write_os_partition+0x94> + 0xc000001fea6e7960 0xc0000000000310b0 <nvram_pstore_write+0xb0> + 0xc000001fea6e7990 0xc0000000004792d4 <pstore_dump+0x1d4> + 0xc000001fea6e7ad0 0xc00000000018a570 <kmsg_dump+0x140> + 0xc000001fea6e7b40 0xc000000000028e5c <panic_flush_kmsg_end+0x2c> + 0xc000001fea6e7b60 0xc0000000000a7168 <pnv_platform_error_reboot+0x68> + 0xc000001fea6e7bd0 0xc0000000000ac9b8 <hmi_event_handler+0x1d8> + 0xc000001fea6e7c80 0xc00000000012d6c8 <process_one_work+0x1b8> + 0xc000001fea6e7d20 0xc00000000012da28 <worker_thread+0x88> + 0xc000001fea6e7db0 0xc0000000001366f4 <kthread+0x164> + 0xc000001fea6e7e20 0xc00000000000b65c <ret_from_kernel_thread+0x5c> + + This is because, there is a while loop towards the end of + ipmi_queue_msg_sync() which keeps looping until "sync_msg" does not match + with "msg". It loops over time_wait_ms() until exit condition is met. In + normal scenario time_wait_ms() calls run pollers so that ipmi backend gets + a chance to check ipmi response and set sync_msg to NULL. + + .. 
code-block:: c + + while (sync_msg == msg) + time_wait_ms(10); + + But in the event when TB is in failed state time_wait_ms()->time_wait_poll() + returns immediately without calling pollers and hence we end up looping + forever. This patch fixes this hang by calling opal_run_pollers() in TB + failed state as well. + +- core/ipmi: Print correct netfn value + +- libffs: Fix string truncation gcc warning. + + Use memcpy as other libffs functions do. diff --git a/roms/skiboot/doc/release-notes/skiboot-6.2.rst b/roms/skiboot/doc/release-notes/skiboot-6.2.rst new file mode 100644 index 000000000..cbb5fab32 --- /dev/null +++ b/roms/skiboot/doc/release-notes/skiboot-6.2.rst @@ -0,0 +1,1375 @@ +.. _skiboot-6.2: + +skiboot-6.2 +=============== + +skiboot v6.2 was released on Friday December 14th 2018. It is the first +release of skiboot 6.2, which becomes the new stable release +of skiboot following the 6.1 release, first released July 11th 2018. + +Skiboot 6.2 will mark the basis for op-build v2.2. + +skiboot v6.2 contains all bug fixes as of :ref:`skiboot-6.0.14`, +and :ref:`skiboot-5.4.10` (the currently maintained +stable releases). + +For how the skiboot stable releases work, see :ref:`stable-rules` for details. + +This release has been a longer cycle than typical for a variety of reasons. It +also contains a lot of cleanup work and minor bug fixes (much like skiboot 6.1 +did). + +Over skiboot 6.1, we have the following changes: + +General +------- + +Since v6.2-rc2: + +- i2c: Fix i2c request hang during opal init if timers are not checked + + If an i2c request cannot go through the first time, because the bus is + found in error and need a reset or it's locked by the OCC for example, + the underlying i2c implementation is using timers to manage the + request. However during opal init, opal pollers may not be called, it + depends in the context in which the i2c request is made. 
If the + pollers are not called, the timers are not checked and we can end up + with an i2c request which will not move forward and skiboot hangs. + + Fix it by explicitly checking the timers if we are waiting for an i2c + request to complete and it seems to be taking a while. + +Since v6.1: + +- cpu: Quieten OS endian switch messages + + Users see these when loading an OS from Petitboot: :: + + [ 119.486794100,5] OPAL: Switch to big-endian OS + [ 120.022302604,5] OPAL: Switch to little-endian OS + + Which is expected and doesn't provide any information the user can act + on. Switch them to PR_INFO so they still appear in the log, but not on + the serial console. +- Recognise signed VERSION partition + + A few things need to change to support a signed VERSION partition: + + - A signed VERSION partition will be 4K + SECURE_BOOT_HEADERS_SIZE (4K). + - The VERSION partition needs to be loaded after secure/trusted boot is + set up, and therefore after nvram_init(). + - Added to the trustedboot resources array. + + This also moves the ipmi_dt_add_bmc_info() call to after + flash_dt_add_fw_version() since it adds info to ibm,firmware-versions. +- Run pollers in time_wait() when not booting + + This only bit us hard with hiomap in one scenario. + + Our OPAL API has been that OPAL_POLL_EVENTS may be needed to make forward + progress on ongoing operations, and the internal to skiboot API has been + that time_wait() of a suitable time will run pollers (on at least one + CPU) to help ensure forward progress can be made. + + In a perfect world, interrupts are used but they may: a) be disabled, or + b) the thing we're doing can't use interrupts because computers are + generally terrible. + + Back in 3db397ea5892a (circa 2015), we changed skiboot so that we'd run + pollers only on the boot CPU, and not if we held any locks. 
This was to + reduce the chance of programming code that could deadlock, as well as to + ensure that we didn't just thrash all the cachelines for running pollers + all over a large system during boot, or hard spin on the same locks on + all secondary CPUs. + + The problem arises if the OS we're booting makes an OPAL call early on, + with interrupts disabled, that requires a poller to run to make forward + progress. An example of this would be OPAL_WRITE_NVRAM early in Linux + boot (where Linux sets up the partitions it wants) - something that + occurs iff we've had to reformat NVRAM this boot (i.e. first boot or + corrupted NVRAM). + + The hiomap implementation should arguably *not* rely on synchronous IPMI + messages, but this is a future improvement (as was for mbox before it). + The mbox-flash code solved this problem by spinning on check_timers(). + + More generically though, the approach of running the pollers when no + longer booting means we behave more in line with what the API is meant + to be, rather than have this odd case of "time_wait() for a condition + that could also be tripped by an interrupt works fine unless the OS is + up and running but hasn't set interrupts up yet". +- ipmi: Reduce ipmi_queue_msg_sync() polling loop time to 10ms + + On a plain boot, this reduces the time spent in OPAL by ~170ms on + p9dsu. This is due to hiomap (currently) using synchronous IPMI + messages. + + It will also *significantly* reduce latency on runtime flash + operations for hiomap, as we'll spend typically 10-20ms in OPAL + rather than 100-200ms. It's not an ideal solution to that, but + it's a quick and obvious win for jitter. 
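To see why a shorter polling interval in ipmi_queue_msg_sync() lowers latency, consider a simplified model in which the loop checks for the response, sleeps for a fixed interval, and checks again. The sketch below is an illustration under that assumption only: ``noticed_at_ms()`` is a hypothetical helper, not a skiboot function, and the previous interval is not stated in the note, so any larger value used for comparison is a guess.

```c
#include <stdint.h>

/*
 * Time at which a check-then-sleep loop notices a response that arrived
 * response_ms after the request was queued, when it sleeps poll_ms
 * between checks. Assumes poll_ms > 0.
 */
static uint64_t noticed_at_ms(uint64_t response_ms, uint64_t poll_ms)
{
	uint64_t waited = 0;

	while (waited < response_ms)
		waited += poll_ms;
	return waited;
}
```

In this model a response arriving after 12ms is noticed at 20ms with a 10ms interval, whereas with a 100ms interval (an assumed figure for comparison) it would only be noticed at 100ms.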
+- core/device: NULL pointer dereference fix +- core/flash: NULL pointer dereference fixes +- core/cpu: Call memset with proper cpu_thread offset +- libflash: Add ipmi-hiomap, and prefer it for PNOR access + + ipmi-hiomap implements the PNOR access control protocol formerly known + as "the mbox protocol" but uses IPMI instead of the AST LPC mailbox as a + transport. As there is no longer any mailbox involved in this alternate + implementation the old protocol name is quite misleading, and so it has + been renamed to "the hiomap protocol" (Host I/O Mapping protocol). The + same commands and events are used, though this client-side implementation + assumes v2 of the protocol is supported by the BMC. + + The code is a heavily-reworked copy of the mbox-flash source and is + introduced this way to allow for the mbox implementation's eventual + removal. + + mbox-flash should in theory be renamed to mbox-hiomap for consistency, + but as it is on life-support effective immediately we may as well just + remove it entirely when the time is right. +- opal/hmi: Handle early HMIs on thread0 when secondaries are still in OPAL. + + When the primary thread receives a CORE level HMI for timer facility errors + while secondaries are still in OPAL, thread 0 ends up in rendez-vous + waiting for secondaries to get into hmi handling. This is because OPAL + runs with MSR(EE=0) and hence HMIs are delayed on secondary threads until + they are given to the Linux OS. Fix this by adding a check for secondary + state and forcing them into hmi handling by queuing a job on secondary threads. + + I have tested this by injecting an HDEC parity error very early during Linux + kernel boot. Recovery works fine for non-TB errors. But if TB is bad at + this very early stage we are already doomed.
+ + Without this patch we see: :: + + [ 285.046347408,7] OPAL: Start CPU 0x0843 (PIR 0x0843) -> 0x000000000000a83c + [ 285.051160609,7] OPAL: Start CPU 0x0844 (PIR 0x0844) -> 0x000000000000a83c + [ 285.055359021,7] HMI: Received HMI interrupt: HMER = 0x0840000000000000 + [ 285.055361439,7] HMI: [Loc: U78D3.ND1.WZS004A-P1-C48]: P:8 C:17 T:0: TFMR(2e12002870e14000) Timer Facility Error + [ 286.232183823,3] HMI: Rendez-vous stage 1 timeout, CPU 0x844 waiting for thread 1 (sptr=0000ccc1) + [ 287.409002056,3] HMI: Rendez-vous stage 1 timeout, CPU 0x844 waiting for thread 2 (sptr=0000ccc1) + [ 289.073820164,3] HMI: Rendez-vous stage 1 timeout, CPU 0x844 waiting for thread 3 (sptr=0000ccc1) + [ 290.250638683,3] HMI: Rendez-vous stage 1 timeout, CPU 0x844 waiting for thread 1 (sptr=0000ccc2) + [ 291.427456821,3] HMI: Rendez-vous stage 1 timeout, CPU 0x844 waiting for thread 2 (sptr=0000ccc2) + [ 293.092274807,3] HMI: Rendez-vous stage 1 timeout, CPU 0x844 waiting for thread 3 (sptr=0000ccc2) + [ 294.269092904,3] HMI: Rendez-vous stage 1 timeout, CPU 0x844 waiting for thread 1 (sptr=0000ccc3) + [ 295.445910944,3] HMI: Rendez-vous stage 1 timeout, CPU 0x844 waiting for thread 2 (sptr=0000ccc3) + [ 297.110728970,3] HMI: Rendez-vous stage 1 timeout, CPU 0x844 waiting for thread 3 (sptr=0000ccc3) + + After this patch: :: + + [ 259.401719351,7] OPAL: Start CPU 0x0841 (PIR 0x0841) -> 0x000000000000a83c + [ 259.406259572,7] OPAL: Start CPU 0x0842 (PIR 0x0842) -> 0x000000000000a83c + [ 259.410615534,7] OPAL: Start CPU 0x0843 (PIR 0x0843) -> 0x000000000000a83c + [ 259.415444519,7] OPAL: Start CPU 0x0844 (PIR 0x0844) -> 0x000000000000a83c + [ 259.419641401,7] HMI: Received HMI interrupt: HMER = 0x0840000000000000 + [ 259.419644124,7] HMI: [Loc: U78D3.ND1.WZS004A-P1-C48]: P:8 C:17 T:0: TFMR(2e12002870e04000) Timer Facility Error + [ 259.419650678,7] HMI: Sending hmi job to thread 1 + [ 259.419652744,7] HMI: Sending hmi job to thread 2 + [ 259.419653051,7] HMI: Received HMI interrupt: 
HMER = 0x0840000000000000 + [ 259.419654725,7] HMI: Sending hmi job to thread 3 + [ 259.419654916,7] HMI: Received HMI interrupt: HMER = 0x0840000000000000 + [ 259.419658025,7] HMI: Received HMI interrupt: HMER = 0x0840000000000000 + [ 259.419658406,7] HMI: [Loc: U78D3.ND1.WZS004A-P1-C48]: P:8 C:17 T:2: TFMR(2e12002870e04000) Timer Facility Error + [ 259.419663095,7] HMI: [Loc: U78D3.ND1.WZS004A-P1-C48]: P:8 C:17 T:3: TFMR(2e12002870e04000) Timer Facility Error + [ 259.419655234,7] HMI: [Loc: U78D3.ND1.WZS004A-P1-C48]: P:8 C:17 T:1: TFMR(2e12002870e04000) Timer Facility Error + [ 259.425109779,7] OPAL: Start CPU 0x0845 (PIR 0x0845) -> 0x000000000000a83c + [ 259.429870681,7] OPAL: Start CPU 0x0846 (PIR 0x0846) -> 0x000000000000a83c + [ 259.434549250,7] OPAL: Start CPU 0x0847 (PIR 0x0847) -> 0x000000000000a83c + +- core/cpu: Fix memory allocation for job array + + fixes: 7a3f307e core/cpu: parallelise global CPU register setting jobs + + This bug would result in boot-hang on some configurations due to + cpu_wait_job() endlessly waiting for the last bogus jobs[cpu->pir] pointer. +- i2c: Fix multiple-enqueue of the same request on NACK + + i2c_request_send() will retry the request if the error is a NAK, + however it forgets to clear the "ud.done" flag. It will thus + loop again and try to re-enqueue the same request causing internal + request list corruption. +- i2c: Ensure ordering between i2c_request_send() and completion + + i2c_request_send loops waiting for a flag "uc.done" set by + the completion routine, and then look for a result code + also set by that same completion. + + There is no synchronization, the completion can happen on another + processor, so we need to order the stores to uc and the reads + from uc so that uc.done is stored last and tested first using + memory barriers. +- pci: Clarify power down logic + + Currently pci_scan_bus() unconditionally calls pci_slot_set_power_state() + when it's finished scanning a bus. 
This is one of those things that + makes you go "WHAT?" when you first see it and frankly the skiboot PCI + code could do with less of that. + +Fast Reboot +^^^^^^^^^^^ + +- fast-reboot: parallel memory clearing + + Arbitrarily pick 16GB as the unit of parallelism, and + split up clearing memory into jobs and schedule them + node-local to the memory (or on node 0 if we can't + work that out because it's the memory up to SKIBOOT_BASE). + + This seems to cut at least ~40% of the time from memory zeroing on + fast-reboot on a 256GB Boston system. + + For many systems, scanning PCI takes about as much time as + zeroing all of RAM, so we may as well do them at the same time + and cut a few seconds off the total fast reboot time. +- fast-reboot: verify firmware "romem" checksum + + This takes a checksum of skiboot memory after boot that should be + unchanged during OS operation, and verifies it before allowing a + fast reboot. + + This is not read-only memory from skiboot's point of view, because + it includes things like the opal branch table that gets populated + during boot. + + This helps to improve the integrity of firmware against host and + runtime firmware memory scribble bugs. + +- core/fast-reboot: print the fast reboot disable reason + + Once things start to go wrong, disable_fast_reboot can be called a + number of times, so make the first reason sticky, and also print it + to the console at disable time. This helps with making sense of + fast reboot disables. +- Add fast-reboot property to /ibm,opal DT node + + This means that if it's permanently disabled on boot, the test suite can + pick that up and not try a fast reboot test. + +Utilities +--------- + +Since v6.2-rc2: + +- opal-prd: hservice: Enable hservice->wakeup() in BMC + + This patch enables HBRT to use HYP special wakeup register in openBMC + which until now was only used in FSP based machines.
+ + This patch also adds a capability check for opal-prd so that HBRT can + decide if the host special wakeup register can be used. +- ffspart: Support flashing already ECC protected images + + We do this by assuming filenames with '.ecc' in them are already ECC + protected. + + This solves a practical problem in transitioning op-build to use ffspart + for pnor assembly rather than three perl scripts and a lot of XML. + + We also update the ffspart tests to take into account ECC requirements. +- ffspart: Increase MAX_LINE to above PATH_MAX + + Otherwise we saw failures in CI with the ~221 character paths Jenkins + likes to have. +- libflash/file: greatly increase perf of file_erase() + + Do 4096 byte chunks not 8 byte chunks. A ffspart invocation constructing + a 64MB PNOR goes from a couple of seconds to ~0.1 seconds with this + patch. + +Since v6.2-rc1: +- libflash: Don't merge ECC-protected ranges + + Libflash currently merges contiguous ECC-protected ranges, but doesn't + check that the ECC bytes at the end of the first and start of the second + range actually match sanely. More importantly, if blocklevel_read() is + called with a position at the start of a partition that is contained + somewhere within a region that has been merged it will update the + position assuming ECC wasn't being accounted for. This results in the + position being somewhere well after the actual start of the partition + which is incorrect. + + For now, remove the code merging ranges. This means more ranges must be + held and checked, however it prevents incorrectly reading ECC-correct + regions like below: :: + + [ 174.334119453,7] FLASH: CAPP partition has ECC + [ 174.437349574,3] ECC: uncorrectable error: ffffffffffffffff ff + [ 174.437426306,3] FLASH: failed to read the first 0x1000 from CAPP partition, rc 14 + [ 174.439919343,3] CAPP: Error loading ucode lid.
index=201d1 + +- libflash: Restore blocklevel tests + + This fell out in f58be46 "libflash/test: Rewrite Makefile.check to + improve scalability". Add it back in as test-blocklevel. + +Since v6.1: + +- pflash: Add --skip option for reading + + Add a --skip=N option to pflash to skip N number of bytes when reading. + This would allow users to print the VERSION partition without the STB + header by specifying the --skip=4096 argument, and it's a more generic + solution rather than making pflash depend on secure/trusted boot code. +- xscom-utils: Rework getsram + + Allow specifying a file on the command line to read OCC SRAM data into. + If no file is specified then we print it to stdout as text. This is a + bit inconsistent, but it retains compatibility with the existing tool. +- xscom-utils/getsram: Make it work on P9 + + The XSCOM base address of the OCC control registers changed slightly + between P8 and P9. Fix this up and add a bit of PVR checking so we look + in the right place. +- opal-prd: Fix opal-prd crash + + Presently callback function from HBRT uses r11 to point to target function + pointer. r12 is garbage. This works fine when we compile with "-no-pie" option + (as we don't use r12 to calculate TOC). + + As per ABIv2 : "r12 : Function entry address at global entry point" + + With "-pie" compilation option, we have to set r12 to point to global function + entry point. So that we can calculate TOC properly. + + Crash log without this patch: :: + + opal-prd[2864]: unhandled signal 11 at 0000000000029320 nip 00000 00102012830 lr 0000000102016890 code 1 + + +Development and Debugging +------------------------- + +Since v6.1-rc1: +- Warn on long OPAL calls + + Measure entry/exit time for OPAL calls and warn appropriately if the + calls take too long (>100ms gets us a DEBUG log, > 1000ms gets us a + warning). 
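The thresholds in the "Warn on long OPAL calls" entry above can be sketched as a small classifier. This is only an illustration of the stated limits (more than 100ms gives a DEBUG log, more than 1000ms gives a warning); `enum call_log` and `classify_opal_call()` are hypothetical names, and how skiboot actually measures entry and exit time is not shown here.

```c
#include <stdint.h>

/* Hypothetical log classes for a completed OPAL call. */
enum call_log {
	CALL_LOG_NONE,    /* call was fast enough, nothing is logged */
	CALL_LOG_DEBUG,   /* call took more than 100ms */
	CALL_LOG_WARNING, /* call took more than 1000ms */
};

/* Map the elapsed time of a call, in milliseconds, to a log class. */
static enum call_log classify_opal_call(uint64_t elapsed_ms)
{
	if (elapsed_ms > 1000)
		return CALL_LOG_WARNING;
	if (elapsed_ms > 100)
		return CALL_LOG_DEBUG;
	return CALL_LOG_NONE;
}
```

Both limits are strict in this sketch, so a call of exactly 100ms logs nothing and a call of exactly 1000ms is still only a DEBUG message.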
+ +Since v6.1: + +- core/lock: Use try_lock_caller() in lock_caller() to capture owner + + Otherwise we can get reports of core/lock.c owning the lock, which is + not helpful when tracking down ownership issues. +- core/flash: Emit a warning if Skiboot version doesn't match + + This means you'll get a warning that you've modified skiboot separately + to the rest of the PNOR image, which can be useful in determining what + firmware is actually running on a machine. +- gcov: link in ctors* as newer GCC doesn't group them all + + It seems that newer toolchains get us multiple ctors sections to link in + rather than just one. If we discard them (as we were doing), then we + don't have a working gcov build (and we get the "doesn't look sane" + warning on boot). +- core/flash: Log return code when ffs_init() fails + + Knowing the return code is at least better than not knowing the return + code. +- gcov: Fix building with GCC8 +- travis/ci: rework Dockerfiles to produce build artifacts + + ubuntu-latest was also missing clang, as ubuntu-latest is closer to + ubuntu 18.04 than 16.04 +- cpu: add cpu_queue_job_on_node() + + Add a job scheduling API which will run the job on the requested + chip_id (or return failure). +- opal-ci: Build old dtc version for fedora 28 + + There are patches that will go into dtc to fix the issues we hit, but + for the moment let's just build and use a slightly older version. +- mem_region: Merge similar allocations when dumping + + Currently we print one line for each allocation done at runtime when + dumping the memory allocations. We do a few thousand allocations at + boot so this can result in a huge amount of text being printed which + is a) slow to print, and b) Can result in the log buffer overflowing + which destroys otherwise useful information. + + This patch adds a de-duplication to this memory allocation dump by + merging "similar" allocations (same location, same size) into one. 
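The merging of "similar" allocations (same location, same size) described above can be sketched as a simple pairwise scan, which is quadratic in the number of allocations. This is a hedged sketch rather than the mem_region code: `struct alloc_rec`, `struct alloc_summary` and `merge_similar()` are hypothetical names, and the caller is assumed to provide an output array with room for `n` entries.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical record of one runtime allocation. */
struct alloc_rec {
	const char *location; /* e.g. "file.c:123" of the allocation site */
	size_t size;          /* allocation size in bytes */
};

/* One output line: a group of identical (location, size) allocations. */
struct alloc_summary {
	const char *location;
	size_t size;
	unsigned int count;   /* how many allocations were merged */
};

/*
 * Merge allocations sharing location and size into one summary each.
 * Every record is compared against the summaries produced so far,
 * giving O(n^2) comparisons in the worst case.
 * Returns the number of summaries written to out.
 */
static size_t merge_similar(const struct alloc_rec *recs, size_t n,
			    struct alloc_summary *out)
{
	size_t n_out = 0;

	assert(recs != NULL && out != NULL);

	for (size_t i = 0; i < n; i++) {
		size_t j;

		for (j = 0; j < n_out; j++) {
			if (out[j].size == recs[i].size &&
			    strcmp(out[j].location, recs[i].location) == 0) {
				out[j].count++;
				break;
			}
		}
		if (j == n_out) {
			/* First time this (location, size) pair is seen. */
			out[n_out].location = recs[i].location;
			out[n_out].size = recs[i].size;
			out[n_out].count = 1;
			n_out++;
		}
	}
	return n_out;
}
```

A dump would then print one line per summary with its count, instead of one line per allocation.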
+ + Unfortunately, the algorithm used to do the de-duplication is quadratic, + but considering we only dump the allocations in the event of a fatal + error I think this is acceptable. I also did some benchmarking and found + that on a ZZ it takes ~3ms to do a dump with 12k allocations. On a Zaius + it's slightly longer at about ~10ms for 10k allocs. However, the + difference there was due to the output being written to the UART. + + This patch also bumps the log level to PR_NOTICE. PR_INFO messages are + suppressed at the default log level, which probably isn't something you + want considering we only dump the allocations when we run out of skiboot + heap space. +- core/lock: fix timeout warning causing a deadlock false positive + + If a lock waiter exceeds the warning timeout, it prints a message + while still registered as requesting the lock. Printing the message + can take locks, so if one is held when the owner of the original + lock tries to print a message, it will get a false positive deadlock + detection, which brings down the system. + + This can easily be hit when there is a lot of HMI activity from a + KVM guest, where the timebase was not returned to host timebase + before calling the HMI handler. +- hw/p8-i2c: Print the set error bits + + This is purely to save me from having to look it up every time someone + gets an I2C error. +- init: Fix starting stripped kernel + + Currently if we try to run a raw/stripped binary kernel (ie. without + the elf header) we crash with: :: + + [ 0.008757768,5] INIT: Waiting for kernel... + [ 0.008762937,5] INIT: platform wait for kernel load failed + [ 0.008768171,5] INIT: Assuming kernel at 0x20000000 + [ 0.008779241,3] INIT: ELF header not found. Assuming raw binary. + [ 0.017047348,5] INIT: Starting kernel at 0x0, fdt at 0x3044b230 14339 bytes + [ 0.017054251,0] FATAL: Kernel is zeros, can't execute! + [ 0.017059054,0] Assert fail: core/init.c:590:0 + [ 0.017065371,0] Aborting! 
+ + This is because we haven't set kernel_entry correctly in this path. + This fixes it. +- cpu: Better output when waiting for a very long job + + Instead of printing at the end if the job took more than 1s, + print in the loop every 30s along with a backtrace. This will + give us some output if the job is deadlocked. +- lock: Fix interactions between lock dependency checker and stack checker + + The lock dependency checker does a few nasty things that can cause + re-entrancy deadlocks in conjunction with the stack checker or + in fact other debug tests. + + A lot of it revolves around taking a new lock (dl_lock) as part + of the locking process. + + This tries to fix it by making sure we do not hit the stack + checker while holding dl_lock. + + We achieve that in part by directly using the low-level __try_lock + and manually unlocking on the dl_lock, and making some functions + "nomcount". + + In addition, we mark the dl_lock as being in the console path to + avoid deadlocks with the UART driver. + + We move the enabling of the deadlock checker to a separate config + option from DEBUG_LOCKS as well, in case we choose to disable it + by default later on. +- xscom-utils/adu_scoms.py: run 2to3 over it +- clang: -Wno-error=ignored-attributes + +CI, testing, and utilities +-------------------------- + +Since v6.1-rc2: + +- opal-ci: Drop fedora27, add fedora29 +- ci: Bump Qemu version + + This moves the qemu version to qemu-powernv-for-skiboot-7 which is based + on upstream's 3.1.0, and supports a Power9 machine. + + It also includes a fix for the skiboot XSCOM errors: :: + + XSCOM: read error gcid=0x0 pcb_addr=0x1020013 stat=0x0 + + There is no modelling of the xscom behaviour but the reads/writes + now succeed which is enough for skiboot to not error out. +- test: Update qemu arguments to use bmc simulator + + The qemu skiboot platform as of 8340a9642bba ("plat/qemu: use the common + OpenPOWER routines to initialize") uses the common aspeed BMC setup + routines.
This means a BT interface is always set up, and if the + corresponding Qemu model is not present the timeout is 30 seconds. + + It looks like this every time an IPMI message is sent: :: + + BT: seq 0x9e netfn 0x06 cmd 0x31: Maximum queue length exceeded + BT: seq 0x9d netfn 0x06 cmd 0x31: Removed from queue + BT: seq 0x9f netfn 0x06 cmd 0x31: Maximum queue length exceeded + BT: seq 0x9e netfn 0x06 cmd 0x31: Removed from queue + BT: seq 0xa0 netfn 0x06 cmd 0x31: Maximum queue length exceeded + BT: seq 0x9f netfn 0x06 cmd 0x31: Removed from queue + + Avoid this by adding the bmc simulator model to the Qemu powernv + machine. +- ci: Add opal-utils to Debian unstable + + This puts a 'pflash' in the users PATH, allowing more test coverage of + ffspart. +- ci: Drop P8 mambo from Debian unstable + + Debian Unstable has removed OpenSSL 1.0.0 from the repository so mambo + no longer runs: :: + + /opt/ibm/systemsim-p8/bin/systemsim-pegasus: error while loading shared + libraries: libcrypto.so.1.0.0: cannot open shared object file: No such + file or directory + + By removing it from the container these tests will be automatically + skipped. + + Tracked in https://github.com/open-power/op-build/issues/2519 +- ci: Add dtc dependencies for rawhide + + Both F28 and Rawhide build their own dtc version. Rawhide was missing + the required build deps. +- ci: Update Debian unstable packages + + This syncs Debian unstable with Ubuntu 18.04 in order to get the clang + package. It also adds qemu to the Debian install, which makes sense + Debian also has 2.12. +- ci: Use Ubuntu latest config for Debian unstable + + Debian unstable has the same GCOV issue with 8.2 as Ubuntu latest so it + makes sense to share configurations there. 
+- ci: Disable GCOV builds in ubuntu-latest + + They are known to be broken with GCC 8.2: + https://github.com/open-power/skiboot/issues/206 +- ci: Update gcov comment in Fedora 28 +- plat/qemu: fix platform initialization when the BT device is not present + + A QEMU PowerNV machine does not necessarily have a BT device. It needs + to be defined on the command line with : :: + + -device ipmi-bmc-sim,id=bmc0 -device isa-ipmi-bt,bmc=bmc0,irq=10 + + When the QEMU platform is initialized by skiboot, we need to check + that such a device is present and if not, skip the AST initialization. + +Since v6.1-rc1: + +- travis: Coverity fixed their SSL cert +- opal-ci: Use ubuntu:rolling for Ubuntu latest image +- ffspart: Add test for eraseblock size +- ffspart: Add toc test +- hdata/test: workaround dtc bugs + + In dtc v1.4.5 to at least v1.4.7 there have been a few bugs introduced + that change the layout of what's produced in the dts. In order to be + immune from them, we should use the (provided) dtdiff utility, but we + also need to run the dts we're diffing against through a dtb cycle in + order to ensure we get the same format as what the hdat_to_dt to dts + conversion will. + + This fixes a bunch of unit test failures on the version of dtc shipped + with recent Linux distros such as Fedora 29. + + +Mambo Platform +^^^^^^^^^^^^^^ + +- mambo: Merge PMEM_DISK and PMEM_VOLATILE code + + PMEM_VOLATILE and PMEM_DISK can't be used together and are basically + copies of the same code. + + This merges the two and allows them used together. Same API is kept. +- hw/chiptod: test QUIRK_NO_CHIPTOD in opal_resync_timebase + + This allows some test coverage of deep stop states in Linux with + Mambo. +- core/mem_region: mambo reserve kernel payload areas + + Mambo image payloads get overwritten by the OS and by + fast reboot memory clearing because they have no region + defined. Add them, which allows fast reboot to work. 
+ +Qemu platform +^^^^^^^^^^^^^ + +Since v6.2-rc2: +- plat/qemu: use the common OpenPOWER routines to initialize + + Back in 2016, we did not have a large support of the PowerNV devices + under QEMU and we were using our own custom ones. This has changed and + we can now use all the common init routines of the OpenPOWER + platforms. + +Since v6.1: + +- nx: Don't abort on missing NX when using a QEMU machine + + These don't have an NX node (and probably never will) as they + don't provide any coprocessor. However, the DARN instruction + works so this abort is unnecessary. + +POWER8 Platforms +---------------- +- SBE-p8: Do all sbe timer update with xscom lock held + + Without this, on some P8 platforms, we could (falsely) think the SBE timer + had stalled getting the dreaded "timer stuck" message. + + The code was doing the mftb() to set the start of the timeout period while + *not* holding the lock, so the 1ms timeout started sometime when somebody + else had the xscom lock. + + The simple solution is to just do the whole routine holding the xscom lock, + so do it that way. + +Vesnin Platform +^^^^^^^^^^^^^^^ +- platforms/astbmc/vesnin: Send list of PCI devices to BMC through IPMI + + Implements sending a list of installed PCI devices through IPMI protocol. + Each PCI device description is sent as a standalone IPMI message. + A list of devices can be gathered from separate messages using the + session identifier. The session Id is an incremental counter that is + updated at the start of synchronization session. + + +POWER9 Platforms +---------------- + +- STOP API: API conditionally supports 255 SCOM restore entries for each quad. +- hdata/i2c: Skip unknown device type + + Do not add unknown I2C devices to device tree. +- hdata/i2c: Add whitelisting for Host I2C devices + + Many of the devices that we get information about through HDAT are for + use by firmware rather than the host operating system. 
This patch adds + a boolean flag to hdat_i2c_info structure that indicates whether devices + with a given purpose should be reserved for use inside of OPAL (or some + other firmware component, such as the OCC). +- hdata/iohub: Fix Cumulus Hub ID number +- opal/hmi: Wakeup the cpu before reading core_fir + + When stop state 5 is enabled, reading the core_fir during an HMI can + result in an xscom read error with xscom_read() returning an + OPAL_XSCOM_PARTIAL_GOOD error code and core_fir value of all FFs. At + present this return error code is not handled in decode_core_fir() + hence the invalid core_fir value is sent to the kernel where it + interprets it as a FATAL hmi causing a system check-stop. + + This can be prevented by forcing the core to wake up using special wakeup before + reading the core_fir. Hence this patch wraps the call to + read_core_fir() within calls to dctl_set_special_wakeup() and + dctl_clear_special_wakeup(). +- xive: Disable block tracker + + Due to some HW errata, the block tracking facility (performance optimisation + for large systems) should be disabled on Nimbus chips. Disable it unconditionally + for now. +- opal/hmi: Ignore debug trigger inject core FIR. + + Core FIR[60] is a side effect of the work around for the CI Vector Load + issue in DD2.1. Usually this gets delivered as HMI with HMER[17] where + Linux already ignores it. But it looks like in some cases we may happen + to see CORE_FIR[60] while we are already in Malfunction Alert HMI + (HMER[0]) due to other reasons e.g. CAPI recovery or NPU xstop. If that + happens then just ignore it instead of crashing the kernel as not recoverable. +- hdata: Make sure reserved node name starts with "ibm," + + HDAT does not provide a consistent label format for reserved memory labels. + Some start with "ibm," while others start with a component name. +- hdata: Fix dtc warnings + + Fix dtc warnings related to mcbist node.
:: + + Warning (reg_format): "reg" property in /xscom@623fc00000000/mcbist@1 has invalid length (4 bytes) (#address-cells == 1, #size-cells == 1) + Warning (reg_format): "reg" property in /xscom@623fc00000000/mcbist@2 has invalid length (4 bytes) (#address-cells == 1, #size-cells == 1) + Warning (reg_format): "reg" property in /xscom@603fc00000000/mcbist@1 has invalid length (4 bytes) (#address-cells == 1, #size-cells == 1) + Warning (reg_format): "reg" property in /xscom@603fc00000000/mcbist@2 has invalid length (4 bytes) (#address-cells == 1, #size-cells == 1) + + Ideally we should add proper xscom range here... but we are not getting that + information in HDAT today. Let's fix the warnings until we get proper data in HDAT. + +PHB4 +^^^^ + +- phb4: Generate checkstop on AIB ECC corr/uncorr for DD2.0 parts + + On DD2.0 parts, PCIe ECC protection is not warranted in the response + data path. Thus, for these parts, we need to flag any ECC errors + detected from the adjacent AIB RX Data path so the part can be + replaced. + + This patch configures the FIRs so that we escalate these AIB ECC + errors to a checkstop so the parts can be replaced. +- phb4: Reset pfir and nfir if new errors reported during ETU reset + + During fast-reboot new PEC errors can be latched even after ETU-Reset + is asserted. This will result in the values of the variables nfir_cache and + pfir_cache being out of sync. + + During step-2 of CRESET nfir_cache and pfir_cache values are used to + bring the PHB out of reset state. However, if these variables are out + of date as noted above, the nfir/pfir registers are never reset + completely and the ETU still remains frozen. + + Hence this patch updates step-2 of phb4_creset to re-read the values of + nfir/pfir registers to check if any new errors were reported after + ETU-reset was asserted, report these new errors and reset the + nfir/pfir registers. This should bring the ETU out of reset + successfully.
+- phb4: Disable nodal scoped DMA accesses when PB pump mode is enabled + + By default when a PCIe device issues a read request via the PHB it is first + issued with nodal scope. When accessing GPU memory the NPU does not know at the + time of response if the requested memory page is off node or not. Therefore + every read of GPU memory by a PHB is retried with larger scope which introduces + bandwidth and latency issues. + + On smaller boxes which have pump mode enabled nodal and group scoped reads are + treated the same and both types of request are broadcast to one chip. Therefore + we can avoid the retry by disabling nodal scope on the PHB for these boxes. On + larger boxes nodal (single chip) and group (multiple chip) scoped reads are + treated differently. Therefore we avoid disabling nodal scope on large boxes + which have pump mode disabled to avoid all PHB requests being broadcast to + multiple chips. +- phb4/capp: Only reset FIR bits that cause capp machine check + + During CAPP recovery do_capp_recovery_scoms() will reset the CAPP Fir + register just after CAPP recovery is completed. This has an + unintentional side effect of preventing PRD from analyzing and + reporting this error. If PRD tries to read the CAPP FIR after opal has + already reset it, then it logs a critical error complaining "No active + error bits found". + + To prevent this from happening we update do_capp_recovery_scoms() to + only reset fir bits that cause CAPP machine check (local xstop). This + is done by reading the CAPP Fir Action0/1 & Mask registers and + generating a mask which is then written to the CAPP_FIR_CLEAR register. + +- phb4: Check for RX errors after link training + + Some PHB4 PHYs can get stuck in a bad state where they are constantly + retraining the link. This happens transparently to skiboot and Linux + but will cause PCIe to be slow. Resetting the PHB4 clears the + problem.
+ + We can detect this case by looking at the RX errors count where we + check for link stability. This patch does this by modifying the link + optimal code to check for RX errors. If errors are occurring we + retrain the link irrespective of the chip rev or card. + + Normally when this problem occurs, the RX error count is maxed out at + 255. When there is no problem, the count is 0. We chose 8 as the max + rx errors value to give us some margin for a few errors. There is also + a knob that can be used to set the error threshold for when we should + retrain the link, i.e. :: + + nvram -p ibm,skiboot --update-config phb-rx-err-max=8 + +- hw/phb4: Add a helper to dump the PELT-V + + The "Partitionable Endpoint Lookup Table (Vector)" is used by the PHB + when processing EEH events. The PELT-V defines which PEs should be + additionally frozen in the event of an error being flagged on a + given PE. Knowing the state of the PELT-V is sometimes useful for + debugging PHB issues so this patch adds a helper to dump it. + +- hw/phb4: Print the PEs in the EEH dump in hex + + Linux always displays the PE number in hexadecimal while skiboot + displays the PEST index (PE number) in decimal. This makes correlating + errors between Skiboot and Linux more annoying than it should be so + this patch makes Skiboot print the PEST number in hex. + +- phb4: Reallocate PEC2 DMA-Read engines to improve GPU-Direct bandwidth + + We reallocate additional 16/8 DMA-Read engines allocated to stack0/1 + on PEC2 respectively. This is needed to improve bandwidth available to + the Mellanox CX5 adapter when trying to read GPU memory (GPU-Direct). + + If the kernel cxl driver indicates a request to allocate maximum possible + DMA read engines when calling enable_capi_mode() and the card is attached + to the PEC2/stack0 slot then we assume it's a Mellanox CX5 adapter. We then + allocate additional 16/8 extra DMA read engines to stack0 and stack1 + respectively on PEC2.
This is done by populating the + XPEC_PCI_PRDSTKOVR and XPEC_NEST_READ_STACK_OVERRIDE as suggested by + the h/w team. +- phb4: Enable PHB MMIO-0/1 Bars only when mmio window exists + + Presently phb4_probe_stack() will always enable PHB MMIO0/1 windows + even if they don't exist in phy_map. Hence we do some minor shuffling + in the phb4_probe_stack() so that MMIO-0/1 Bars are only enabled if + their corresponding MMIO window exists in the phy_map. In case phy_map + for an mmio window is '0' we set the corresponding BAR register to + '0'. +- hw/phb4: Use local_alloc for phb4 structures + + Struct phb4 is fairly heavyweight at 283664 bytes. On systems with + 6x PHBs per socket this results in using 3.2MB of heap space for the PHB + structures alone. This is a fairly large chunk of our 12MB heap and + on systems with particularly large PCIe topologies, or additional + PHBs we can fail to boot because we cannot allocate space for the + FDT blob. + + This patch switches to using local_alloc() for the PHB structures + so they don't consume too large a portion of our 12MB heap space. +- phb4: Fix typo in disable lane eq code + + In this commit :: + + commit 737c0ba3d72b8aab05a765a9fc111a48faac0f75 + Author: Michael Neuling <mikey@neuling.org> + Date: Thu Feb 22 10:52:18 2018 +1100 + phb4: Disable lane eq when retrying some nvidia GEN3 devices + + We made a typo and set PH2 twice. This fixes it. + + It worked previously as if only phase 2 (PH2) is set, it skips phase 2 + and phase 3 (PH3). +- phb4: Don't probe a PHB if it's garded + + Presently phb4_probe_stack() causes an exception while trying to probe + a PHB if it's garded. This causes skiboot to go into a reboot loop with + the following exception log: :: + + *********************************************** + Fatal MCE at 000000003006ecd4 .probe_phb4+0x570 + CFAR : 00000000300b98a0 + <snip> + Aborting!
+ CPU 0018 Backtrace: + S: 0000000031cc37e0 R: 000000003001a51c ._abort+0x4c + S: 0000000031cc3860 R: 0000000030028170 .exception_entry+0x180 + S: 0000000031cc3a40 R: 0000000000001f10 * + S: 0000000031cc3c20 R: 000000003006ecb0 .probe_phb4+0x54c + S: 0000000031cc3e30 R: 0000000030014ca4 .main_cpu_entry+0x5b0 + S: 0000000031cc3f00 R: 0000000030002700 boot_entry+0x1b8 + + This happens because phb4_probe_stack() ignores all xscom read/write + errors when enabling the PHB Bars and then tries to perform an mmio read of the + PHB Version registers, which causes the fatal MCE. + + We fix this by skipping the PHB probe if the first xscom_write() to + populate the PHB Bar register fails, which indicates that there is + something wrong with the PHB. +- phb4: Workaround PHB errata with CFG write UR/CA errors + + If the PHB encounters a UR or CA status on a CFG write, it will + freeze the wrong PE. Instead of using the PE# specified + in the CONFIG_ADDRESS register, it will use the PE# of whatever + MMIO occurred last. + + Work around this by disabling freeze on such errors. +- phb4: Handle allocation errors in phb4_eeh_dump_regs() + + If the zalloc fails (and it can be a rather large allocation), + we will overwrite memory at 0 instead of failing. +- phb4: Don't try to access non-existent PEST entries + + In a POWER9 chip, some PHB4s have 256 PEs, some have 512. + + Currently, the diagnostics code retrieves 512 unconditionally, + which is wrong and causes us to report bogus values + for the "high" PEs on the small PHBs. + + Use the actual number of implemented PEs instead. + +CAPI2 +^^^^^ + +- phb4/capp: Use link width to allocate STQ engines to CAPP + + Update phb4_init_capp_regs() to allocate STQ Engines to CAPP/PEC2 + based on link width instead of always assuming it to be x8. + + Also re-factor the function slightly to evaluate the link-width only + once and cache it so that it can also be used to allocate DMA read + engines.
+- phb4/capp: Update DMA read engines set in APC_FSM_READ_MASK based on link-width + + Commit 47c09cdfe7a3 ("phb4/capp: Calculate STQ/DMA read engines based + on link-width for PEC") updated the CAPP init sequence by calculating + the needed STQ/DMA-read engines based on link width and populating it + in the XPEC_NEST_CAPP_CNTL register. This however needs to be synchronized + with the value set in the CAPP APC FSM Read Machine Mask Register. + + Hence this patch updates phb4_init_capp_regs() to calculate the link + width of the stack on PEC2 and populate the same values as previously + populated in the PEC CAPP_CNTL register. +- capp: Fix the capp recovery timeout comparison + + The current capp recovery timeout control loop in + do_capp_recovery_scoms() uses a wrong comparison for the return value of + tb_compare(). This may cause do_capp_recovery_scoms() to report a + timeout earlier than the 168ms stipulated time. + + The patch fixes this by updating the loop timeout control branch in + do_capp_recovery_scoms() to use the correct enum tb_cmpval. +- phb4: Disable 32-bit MSI in capi mode + + If a capi device does a DMA write targeting an address lower than 4GB, + it does so through a 32-bit operation, per the PCI spec. In capi mode, + the first TVE entry is configured in bypass mode, so the address is + valid. But with any (bad) luck, the address could be 0xFFFFxxxx, thus + looking like a 32-bit MSI. + + We currently enable both 32-bit and 64-bit MSIs, so the PHB will + interpret the DMA write as an MSI, which very likely results in an EEH + (MSI with a bad payload size). + + We can fix it by disabling 32-bit MSI when switching the PHB to capi + mode. Capi devices are 64-bit. + +NVLINK2 +^^^^^^^ + +Since v6.2-rc2: +- Add purging CPU L2 and L3 caches into NPU hreset. + + If a GPU is passed through to a guest and the guest unexpectedly terminates, + there can be cache lines in CPUs that belong to the GPU. So purge the caches + as part of the reset sequence.
L1 is write-through, so it doesn't need to be purged. + + The sequence to purge the L2 and L3 caches from the hw team: + + L2 purge: + 1. initiate purge :: + + putspy pu.ex EXP.L2.L2MISC.L2CERRS.PRD_PURGE_CMD_TYPE L2CAC_FLUSH -all + putspy pu.ex EXP.L2.L2MISC.L2CERRS.PRD_PURGE_CMD_TRIGGER ON -all + + 2. check this is off in all caches to know purge completed :: + + getspy pu.ex EXP.L2.L2MISC.L2CERRS.PRD_PURGE_CMD_REG_BUSY -all + + 3. :: + + putspy pu.ex EXP.L2.L2MISC.L2CERRS.PRD_PURGE_CMD_TRIGGER OFF -all + + L3 purge: + 1. Start the purge: :: + + putspy pu.ex EXP.L3.L3_MISC.L3CERRS.L3_PRD_PURGE_TTYPE FULL_PURGE -all + putspy pu.ex EXP.L3.L3_MISC.L3CERRS.L3_PRD_PURGE_REQ ON -all + + 2. Ensure that the purge has completed by checking the status bit: :: + + getspy pu.ex EXP.L3.L3_MISC.L3CERRS.L3_PRD_PURGE_REQ -all + + You should see it say OFF if it's done: :: + + p9n.ex k0:n0:s0:p00:c0 + EXP.L3.L3_MISC.L3CERRS.L3_PRD_PURGE_REQ + OFF + +- npu2: Return sensible PCI error when not frozen + + The current kernel calls OPAL_PCI_EEH_FREEZE_STATUS with an uninitialized + @pci_error_type parameter and then analyzes it even if the OPAL call + returned OPAL_SUCCESS. This results in unexpected EEH events and NPU + freezes. + + This initializes @pci_error_type and @severity to known safe values. + +- npu2: Advertise correct TCE page size + + The P9 NPU workbook says that only 4K/64K/16M/256M page sizes are supported + and in fact npu2_map_pe_dma_window() supports just these but in the absence of + the "ibm,supported-tce-sizes" property Linux assumes the default P9 PHB4 + page sizes - 4K/64K/2M/1G - so when Linux tries 2M/1G TCEs, we get lots of + "Unexpected TCE size" from npu2_tce_kill(). + + This advertises TCE page sizes so Linux can handle them correctly, i.e. + fall back to 4K/64K TCEs. + +Since v6.1: + +- npu2: Add support for relaxed-ordering mode + + Some device drivers support out of order access to GPU memory.
This does + not affect the CPU view of memory but it does affect the GPU view of + memory. It should only be enabled if the GPU driver has requested it. + + Add OPAL APIs allowing the driver to query relaxed ordering state or + request it to be set for a device. Current hardware only allows relaxed + ordering to be enabled per PCIe root port. So the code here doesn't + enable relaxed ordering until it has been explicitly requested for every + device on the port. +- Add the other 7 ATSD registers to the device tree. +- npu2/hw-procedures: Don't open code NPU2_NTL_MISC_CFG2_BRICK_ENABLE + + Name this bit properly. There's a lot more cleanup like this to be done, + but I'm catching this one now as part of some related changes. +- npu2/hw-procedures: Enable parity and credit overflow checks + + Enable these error checking features by setting the appropriate bits in + our one-off initialization of each "NTL Misc Config 2" register. + + The exception is NDL RX parity checking, which should be disabled during + the link training procedures. +- npu2: Use correct kill type for TCE invalidation + + kill_type is enum of OPAL_PCI_TCE_KILL_PAGES, OPAL_PCI_TCE_KILL_PE, + OPAL_PCI_TCE_KILL_ALL and phb4_tce_kill() gets it right but + npu2_tce_kill() uses OPAL_PCI_TCE_KILL which is an OPAL API token. + + This fixes an obvious mistype. + +OpenCAPI +^^^^^^^^ + +Since v6.2-rc1: + +- npu2-opencapi: Log extra information on link training failure +- npu2-opencapi: Detect if link trained in degraded mode + +Since v6.1: + +- Support OpenCAPI on Witherspoon platform +- npu2-opencapi: Enable presence detection on ZZ + + Presence detection for opencapi adapters was broken for ZZ planars v3 + and below. All ZZ systems currently used in the lab have had their + planar upgraded, so we can now remove the override we had to force + presence and activate presence detection. Which should improve boot + time. + + Considering the state of opal support on ZZ, this is really only for + lab usage on BML. 
The opencapi enablement team has okay'd the + change. In the unlikely case somebody tries opencapi on an old ZZ, the + presence detection through i2c will show that no adapter is present + and skiboot won't try to access or train the link. +- npu2-opencapi: Don't send commands to NPU when link is down + + Even if an opencapi link is down, we currently always try to issue a + config read operation when probing for PCI devices, because of the + default scan map used for an opencapi PHB. The config operation fails, + as expected, but it can also raise a FIR bit and trigger an HMI. + + For opencapi, there's no root device like for a "normal" PCI PHB, so + there's no reason to do the config operation. To fix it, we keep the + scan map blank by default, and only add a device once the link is + trained. +- opal/hmi: Catch NPU2 HMIs for opencapi + + HMIs for NPU2 are filtered with the 'compatible' string of the PHB, so + add opencapi to the mix. +- occ: Wait if OCC GPU presence status not immediately available + + It takes a few seconds for the OCC to set everything up in order to read + GPU presence. At present, we try to kick off OCC initialisation as early as + possible to maximise the time it has to read GPU presence. + + Unfortunately sometimes that's not enough, so add a loop in + occ_get_gpu_presence() so that on the first time we try to get GPU presence + we keep trying for up to 2 seconds. Experimentally this seems to be + adequate. +- hw/npu2-hw-procedures: Enable RX auto recal on OpenCAPI links + + The RX_RC_ENABLE_AUTO_RECAL flag is required on OpenCAPI but not NVLink. + + Traditionally, Hostboot sets this value according to the machine type. + However, now that Witherspoon supports both NVLink and OpenCAPI, it can't + tell whether or not a link is OpenCAPI. + + So instead, set it in skiboot, where it will only be triggered after we've + done device detection and found an OpenCAPI device. 
+- hw/npu2-opencapi: Fix setting of supported OpenCAPI templates + + In opal_npu_tl_set(), we made a typo that means the OPAL_NPU_TL_SET call + may not clear the enable bits for templates that were previously enabled + but are now disabled. + + Fix the typo so we clear NPU2_OTL_CONFIG1_TX_TEMP2_EN as well as + TEMP{1,3}_EN. + +Barreleye G2 and Zaius platforms +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +- zaius: Add a slot table +- zaius: Add slots for the Barreleye G2 HDD rack + + The Barreleye G2 is distinct from the Zaius in that it features a 24 + Bay NVMe/SATA HDD rack. To provide meaningful slot names for each NVMe + device we need to define a slot table for the NVMe capable HDD bays. + + Unfortunately this isn't straightforward because the PCIe path to the + NVMe devices isn't fixed. The PCIe topology is something like: + P9 -> HBA card -> 9797 switch -> 20x NVMe HDD slots + + The 9797 switch is partitioned into two (or four) virtual switches which + allow multiple HBA cards to be used (e.g. one per socket). As a result + the exact BDFN of the ports will vary depending on how the system is + configured. + + That said, the virtual switch configuration of the 9797 does not change + the device and function numbers of the switch downports. This means that + we can define a single slot table that maps switch ports to the NVMe bay + names. + + Unfortunately we still need to guess which bus to use this table on, so + we assume that any switch downport we find with the PEX9797 VDID is part + of the 9797 that supports the HDD rack. + +FSP based platforms (firenze and ZZ) +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +Since v6.2-rc1: +- platform/firenze: Fix branch-to-null crash + + When the bus alloc and free methods were removed we missed a case in the + Firenze platform slot code that relied on the bus-specific method to + set the bus pointer in the request structure. This results in a + branch-to-null during boot and a crash.
This patch fixes it by + initialising it manually here. + + +Since v6.1: + +- phb4/capp: Update the expected Eye-catcher for CAPP ucode lid + + Currently on an FSP based P9 system load_capp_ucode() expects the CAPP ucode + lid header to have an eye-catcher magic of 'CAPPPSLL'. However skiboot + currently supports CAPP ucode only lids that have an eye-catcher magic + of 'CAPPLIDH'. This prevents skiboot from loading the ucode with this + error message: :: + + CAPP: ucode header invalid + + We fix this issue by updating load_capp_ucode() to use the eye-catcher + value of 'CAPPLIDH' instead of 'CAPPPSLL'. + +- FSP: Improve Reset/Reload log message + + The message below is confusing. Let's make it clear. + + FSP sends "R/R complete notification" whenever there is a dump. We use `flag` + to identify whether it's an R/R completion -OR- just a new dump notification. :: + + [ 483.406351956,6] FSP: SP says Reset/Reload complete + [ 483.406354278,5] DUMP: FipS dump available. ID = 0x1a00001f [size: 6367640 bytes] + [ 483.406355968,7] A Reset/Reload was NOT done + +Witherspoon platform +^^^^^^^^^^^^^^^^^^^^ + +- platforms/astbmc/witherspoon: Implement OpenCAPI support + + OpenCAPI on Witherspoon is slightly more involved than on Zaius and ZZ, due + to the OpenCAPI links using the SXM2 connectors that are used for NVLink + GPUs. + + This patch adds the regular OpenCAPI platform information, and also a + Witherspoon-specific presence detection callback that uses the previously + added OCC GPU presence detection to figure out the device types plugged + into each SXM2 socket. + + The SXM2 connectors are capable of carrying 2 OpenCAPI links, and future + OpenCAPI devices are expected to make use of this. However, we don't yet + support ganged links and the various implications that has for handling + things like device reset, so for now, we only enable 1 brick per device.
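As a rough illustration of the CAPP ucode eye-catcher check described for FSP based platforms above, the Python sketch below validates a lid by its leading eight bytes. The helper names and the assumption that the eye-catcher sits at offset 0 of the lid are hypothetical; only the two magic strings and the error message are taken from the notes.

```python
# Sketch only: the header layout (eye-catcher at offset 0) is an assumption,
# not the actual skiboot lid format.
EXPECTED_EYECATCHER = b"CAPPLIDH"  # value the notes say skiboot now checks for


def capp_header_valid(lid: bytes) -> bool:
    """Return True if the lid starts with the expected 8-byte eye-catcher."""
    return lid[:len(EXPECTED_EYECATCHER)] == EXPECTED_EYECATCHER


def load_status(lid: bytes) -> str:
    """Return "ok", or the error string quoted in the notes on a mismatch."""
    return "ok" if capp_header_valid(lid) else "CAPP: ucode header invalid"


# Hypothetical lids: eight bytes of magic followed by padding.
new_style = b"CAPPLIDH" + bytes(8)
old_style = b"CAPPPSLL" + bytes(8)
print(load_status(new_style))
print(load_status(old_style))
```

A lid carrying the older 'CAPPPSLL' magic is rejected by this check, which mirrors the mismatch the fix removes.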
+ +Contributors +------------ + +The v6.2 release of skiboot contains 240 changesets from 28 developers, working for 2 employers. +A total of 9146 lines were added, and 2610 removed (delta 6536). + +Developers with the most changesets +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ +=========================== == ======= +Developer # % +=========================== == ======= +Stewart Smith 58 (24.2%) +Andrew Jeffery 30 (12.5%) +Oliver O'Halloran 27 (11.2%) +Joel Stanley 17 (7.1%) +Vaibhav Jain 14 (5.8%) +Benjamin Herrenschmidt 12 (5.0%) +Frederic Barrat 11 (4.6%) +Nicholas Piggin 11 (4.6%) +Andrew Donnellan 10 (4.2%) +Vasant Hegde 9 (3.8%) +Reza Arbab 8 (3.3%) +Samuel Mendoza-Jonas 5 (2.1%) +Alexey Kardashevskiy 4 (1.7%) +Michael Neuling 4 (1.7%) +Prem Shanker Jha 3 (1.2%) +Cédric Le Goater 2 (0.8%) +Rashmica Gupta 2 (0.8%) +Mahesh J Salgaonkar 2 (0.8%) +Alistair Popple 2 (0.8%) +Shilpasri G Bhat 1 (0.4%) +Adriana Kobylak 1 (0.4%) +Madhavan Srinivasan 1 (0.4%) +Artem Senichev 1 (0.4%) +Russell Currey 1 (0.4%) +Vaidyanathan Srinivasan 1 (0.4%) +Cyril Bur 1 (0.4%) +Jeremy Kerr 1 (0.4%) +Michael Ellerman 1 (0.4%) +=========================== == ======= + + +Developers with the most changed lines +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +========================= ==== ======= +Developer # % +========================= ==== ======= +Andrew Jeffery 2861 (29.3%) +Stewart Smith 1891 (19.4%) +Prem Shanker Jha 1046 (10.7%) +Andrew Donnellan 799 (8.2%) +Oliver O'Halloran 649 (6.6%) +Reza Arbab 441 (4.5%) +Nicholas Piggin 412 (4.2%) +Vaibhav Jain 278 (2.8%) +Cédric Le Goater 250 (2.6%) +Frederic Barrat 168 (1.7%) +Rashmica Gupta 161 (1.6%) +Joel Stanley 152 (1.6%) +Benjamin Herrenschmidt 138 (1.4%) +Artem Senichev 101 (1.0%) +Samuel Mendoza-Jonas 83 (0.9%) +Michael Neuling 82 (0.8%) +Michael Ellerman 61 (0.6%) +Mahesh J Salgaonkar 50 (0.5%) +Vasant Hegde 44 (0.5%) +Alexey Kardashevskiy 32 (0.3%) +Adriana Kobylak 29 (0.3%) +Alistair Popple 18 (0.2%) +Shilpasri G Bhat 4 (0.0%) +Madhavan 
Srinivasan 3 (0.0%) +Cyril Bur 3 (0.0%) +Jeremy Kerr 3 (0.0%) +Russell Currey 2 (0.0%) +Vaidyanathan Srinivasan 2 (0.0%) +========================= ==== ======= + + +Developers with the most lines removed +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +========================= ==== ======= +Developer # % +========================= ==== ======= +Cédric Le Goater 205 (7.9%) +Samuel Mendoza-Jonas 8 (0.3%) +Shilpasri G Bhat 1 (0.0%) +========================= ==== ======= + +Developers with the most signoffs +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +========================= ==== ======= +Developer # % +========================= ==== ======= +Stewart Smith 182 (95.3%) +Alistair Popple 3 (1.6%) +Akshay Adiga 2 (1.0%) +Christophe Lombard 1 (0.5%) +Ryan Grimm 1 (0.5%) +Michael Neuling 1 (0.5%) +Mahesh J Salgaonkar 1 (0.5%) +Total 191 +========================= ==== ======= + +Developers with the most reviews +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +================================ ==== ======= +Developer # % +================================ ==== ======= +Andrew Donnellan 15 (19.7%) +Frederic Barrat 11 (14.5%) +Oliver O'Halloran 9 (11.8%) +Alistair Popple 8 (10.5%) +Vasant Hegde 5 (6.6%) +Samuel Mendoza-Jonas 4 (5.3%) +Christophe Lombard 3 (3.9%) +Gregory S. Still 3 (3.9%) +Mahesh J Salgaonkar 2 (2.6%) +RANGANATHPRASAD G. BRAHMASAMUDRA 2 (2.6%) +Jennifer A. Stofer 2 (2.6%) +AMIT J. TENDOLKAR 2 (2.6%) +Christian R. Geddes 2 (2.6%) +Cédric Le Goater 1 (1.3%) +Shilpasri G Bhat 1 (1.3%) +Daniel M. 
Crowell 1 (1.3%) +Alexey Kardashevskiy 1 (1.3%) +Joel Stanley 1 (1.3%) +Vaibhav Jain 1 (1.3%) +Nicholas Piggin 1 (1.3%) +Andrew Jeffery 1 (1.3%) +Total 76 +================================ ==== ======= + +Developers with the most test credits +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +========================= ==== ======= +Developer # % +========================= ==== ======= +Jenkins Server 3 (12.0%) +Cronus HW CI 3 (12.0%) +Hostboot CI 3 (12.0%) +Jenkins OP Build CI 3 (12.0%) +FSP CI Jenkins 3 (12.0%) +Jenkins OP HW 3 (12.0%) +Vasant Hegde 2 (8.0%) +Andrew Donnellan 1 (4.0%) +Oliver O'Halloran 1 (4.0%) +Andrew Jeffery 1 (4.0%) +HWSV CI 1 (4.0%) +Artem Senichev 1 (4.0%) +Total 25 +========================= ==== ======= + +Developers who gave the most tested-by credits +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +========================= ==== ======= +Developer # % +========================= ==== ======= +Prem Shanker Jha 19 (76.0%) +Frederic Barrat 2 (8.0%) +Andrew Jeffery 1 (4.0%) +Vaibhav Jain 1 (4.0%) +Stewart Smith 1 (4.0%) +Benjamin Herrenschmidt 1 (4.0%) +========================= ==== ======= + +Developers with the most report credits +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +========================= ==== ======= +Developer # % +========================= ==== ======= +Vasant Hegde 2 (25.0%) +Frederic Barrat 1 (12.5%) +Dawn Sylvia 1 (12.5%) +Meng Li 1 (12.5%) +Tyler Seredynski 1 (12.5%) +Pridhiviraj Paidipeddi 1 (12.5%) +Stephanie Swanson 1 (12.5%) +========================= ==== ======= + +Developers who gave the most report credits +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +========================= ==== ======= +Developer # % +========================= ==== ======= +Stewart Smith 2 (25.0%) +Vaidyanathan Srinivasan 2 (25.0%) +Vasant Hegde 1 (12.5%) +Vaibhav Jain 1 (12.5%) +Andrew Donnellan 1 (12.5%) +Michael Neuling 1 (12.5%) +========================= ==== ======= + +Employers with the most hackers +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + 
+========================= ==== ======= +Developer # % +========================= ==== ======= +IBM 27 (96.4%) +YADRO 1 (3.6%) +========================= ==== ======= diff --git a/roms/skiboot/doc/release-notes/skiboot-6.3-rc1.rst b/roms/skiboot/doc/release-notes/skiboot-6.3-rc1.rst new file mode 100644 index 000000000..93ca5fc3c --- /dev/null +++ b/roms/skiboot/doc/release-notes/skiboot-6.3-rc1.rst @@ -0,0 +1,930 @@ +.. _skiboot-6.3-rc1: + +skiboot-6.3-rc1 +=============== + +skiboot v6.3-rc1 was released on Friday March 29th 2019. It is the first +release candidate of skiboot 6.3, which will become the new stable release +of skiboot following the 6.2 release, first released December 14th 2018. + +Skiboot 6.3 will mark the basis for op-build v2.3. I expect to tag the final +skiboot 6.3 in the next week. + +skiboot v6.3-rc1 contains all bug fixes as of :ref:`skiboot-6.0.19`, +and :ref:`skiboot-6.2.3` (the currently maintained +stable releases). + +For how the skiboot stable releases work, see :ref:`stable-rules` for details. + +This release has been a longer cycle than typical for a variety of reasons. It +also contains a lot of cleanup work and minor bug fixes (much like skiboot 6.2 +did). + +Over skiboot 6.2, we have the following changes: + +.. _skiboot-6.3-rc1-new-features: + +New Features +------------ + +- hw/imc: Enable opal calls to init/start/stop IMC Trace mode + + New OPAL APIs for In-Memory Collection Counter infrastructure(IMC), + including a new device type called OPAL_IMC_COUNTERS_TRACE. 
+- xive: Add calls to save/restore the queues and VPs HW state + + To be able to support migration of guests using the XIVE native + exploitation mode, (where the queue is effectively owned by the + guest), KVM needs to be able to save and restore the HW-modified + fields of the queue, such as the current queue producer pointer and + generation bit, and to retrieve the modified thread context registers + of the VP from the NVT structure : the VP interrupt pending bits. + + However, there is no need to set back the NVT structure on P9. P10 + should be the same. +- witherspoon: Add nvlink2 interconnect information + + GPUs on Redbud and Sequoia platforms are interconnected in groups of + 2 or 3 GPUs. The problem with that is if the user decides to pass a single + GPU from a group to the userspace, we need to ensure that links between + GPUs do not get enabled. + + A V100 GPU provides a way to disable selected links. In order to only + disable links to peer GPUs, we need a topology map. + + This adds an "ibm,nvlink-peers" property to a GPU DT node with phandles + of peer GPUs and NVLink2 bridges. The index in the property is a GPU link + number. +- platforms/romulus: Also support talos + + The two are similar enough and I'd like to have a slot table for our + Talos. +- OpenCAPI support! (see :ref:`skiboot-6.3-rc1-OpenCAPI` section) +- opal/hmi: set a flag to inform OS that TOD/TB has failed. + + Set a flag to indicate OS about TOD/TB failure as part of new + opal_handle_hmi2 handler. This flag then can be used by OS to make sure + functions depending on TB value (e.g. udelay()) are aware of TB not + ticking. +- astbmc: Enable IPMI HIOMAP for AMI platforms + + Required for Habanero, Palmetto and Romulus. +- power-mgmt : occ : Add 'freq-domain-mask' DT property + + Add a new device-tree property freq-domain-indicator to define group of + CPUs which would share same frequency. This property has been added under + power-mgmt node. It is a bitmask. 
+ + A bitwise AND is taken between this bitmask value and the PIR of each CPU. All the + CPUs lying in the same frequency domain will have the same result for the AND. + + For example, for POWER9, 0xFFF0 indicates a quad wide frequency domain. + Taking the AND with the PIR of a CPU yields the frequency domain, which is a + quad wise distribution because the last 4 bits, which represent the + cores, have been masked. + + Similarly, 0xFFF8 will represent a core wide frequency domain for P8. + + Also, add a new device-tree property domain-runs-at which will denote the + strategy the OCC is using to change the frequency of a frequency-domain. There + can be two strategies - FREQ_MOST_RECENTLY_SET and FREQ_MAX_IN_DOMAIN. + + FREQ_MOST_RECENTLY_SET : the OCC sets the frequency of the quad to the most + recent frequency value requested by the CPUs in the quad. + + FREQ_MAX_IN_DOMAIN : the OCC sets the frequency of the CPUs in + the Quad to the maximum of the latest frequency requested by each of + the component cores. +- powercap: occ: Fix the powercapping range allowed for user + + OCC provides two limits for minimum powercap. One is a hard powercap + minimum which is guaranteed by OCC and the other one is a soft + powercap minimum which is lower than hard-min and may or may not be + asserted due to various power-thermal reasons. So to allow the users + to access the entire powercap range, this patch exports the soft powercap + minimum as the "powercap-min" DT property. And it also adds a new + DT property called "powercap-hard-min" to export the hard-min powercap + limit. +- Add NVDIMM support + + NVDIMMs are memory modules that use a battery backup system to allow the + contents of RAM to be saved to non-volatile storage if system power goes + away unexpectedly. This allows them to be used as a high-performance + storage device, suitable for serving as a cache for SSDs and the like. + + Configuration of NVDIMMs is handled by hostboot and communicated to OPAL + via the HDAT.
We need to parse out the NVDIMM memory ranges and create + memory regions with the "pmem-region" compatible label to make them + available to the host. +- core/exceptions: implement support for MCE interrupts in powersave + + The ISA specifies that MCE interrupts in power saving modes will enter + at 0x200 with powersave bits in SRR1 set. This is not currently + supported properly, the MCE will just happen like a normal interrupt, + but GPRs could be lost, which would lead to crashes (e.g., r1, r2, r13 + etc). + + So check the power save bits similarly to the sreset vector, and + handle this properly. +- core/exceptions: allow recoverable sreset exceptions + + This requires implementing the MSR[RI] bit. Then just allow all + non-fatal sreset exceptions to recover. +- core/exceptions: implement an exception handler for non-powersave sresets + + Detect non-powersave sresets and send them to the normal exception + handler which prints registers and stack. +- Add PVR_TYPE_P9P + + Enable a new PVR to get us running on another p9 variant. + +Deprecated/Removed Features +--------------------------- + +- opal: Deprecate reading the PHB status + + The OPAL_PCI_EEH_FREEZE_STATUS call takes a bunch of parameters, one of + them is @phb_status. It is defined as __be64* and always NULL in + the current Linux upstream but if anyone ever decides to read that status, + then the PHB3's handler will assume it is struct OpalIoPhb3ErrorData* + (which is a lot bigger than 8 bytes) and zero it causing the stack + corruption; p7ioc-phb has the same issue. + + This removes @phb_status from all eeh_freeze_status() hooks and moves + the error message from PHB4 to the affected OPAL handlers. + + As far as we can tell, nobody has ever used this and thus it's safe to remove. +- Remove POWER9N DD1 support + + This is not a shipping product and is no longer supported by Linux + or other firmware components. 
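The frequency-domain bitmask described under New Features can be pictured with a few lines of Python. This is a sketch under stated assumptions: the PIR values are made up for illustration, and `group_by_freq_domain` is a hypothetical helper, not a skiboot or Linux function. Only the masks 0xFFF0 and 0xFFF8 and the "AND with the PIR" rule come from the notes.

```python
from collections import defaultdict


def group_by_freq_domain(pirs, mask):
    """Group PIR values by (pir & mask); equal results share a frequency domain."""
    domains = defaultdict(list)
    for pir in pirs:
        domains[pir & mask].append(pir)
    return dict(domains)


# Made-up PIR values, purely for illustration.
pirs = [0x00, 0x04, 0x0C, 0x10, 0x14]

quad_wide = group_by_freq_domain(pirs, 0xFFF0)  # low 4 bits masked off
core_wide = group_by_freq_domain(pirs, 0xFFF8)  # low 3 bits masked off

for domain, members in sorted(quad_wide.items()):
    print(hex(domain), [hex(p) for p in members])
```

With the wider mask, 0x0C falls into its own domain (0x08) while 0x00 and 0x04 stay together, which shows how the mask alone decides the granularity of a domain.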
+ +General +------- + +- core/i2c: Various bits of refactoring +- refactor backtrace generation infrastructure +- astbmc: Handle failure to initialise raw flash + + Initialising raw flash lead to a dead assignment to rc. Check the return + code and take the failure path as necessary. Both before and after the + fix we see output along the lines of the following when flash_init() + fails: :: + + [ 53.283182881,7] IRQ: Registering 0800..0ff7 ops @0x300d4b98 (data 0x3052b9d8) + [ 53.283184335,7] IRQ: Registering 0ff8..0fff ops @0x300d4bc8 (data 0x3052b9d8) + [ 53.283185513,7] PHB#0000: Initializing PHB... + [ 53.288260827,4] FLASH: Can't load resource id:0. No system flash found + [ 53.288354442,4] FLASH: Can't load resource id:1. No system flash found + [ 53.342933439,3] CAPP: Error loading ucode lid. index=200ea + [ 53.462749486,2] NVRAM: Failed to load + [ 53.462819095,2] NVRAM: Failed to load + [ 53.462894236,2] NVRAM: Failed to load + [ 53.462967071,2] NVRAM: Failed to load + [ 53.463033077,2] NVRAM: Failed to load + [ 53.463144847,2] NVRAM: Failed to load + + Eventually followed by: :: + + [ 57.216942479,5] INIT: platform wait for kernel load failed + [ 57.217051132,5] INIT: Assuming kernel at 0x20000000 + [ 57.217127508,3] INIT: ELF header not found. Assuming raw binary. + [ 57.217249886,2] NVRAM: Failed to load + [ 57.221294487,0] FATAL: Kernel is zeros, can't execute! + [ 57.221397429,0] Assert fail: core/init.c:615:0 + [ 57.221471414,0] Aborting! 
+ CPU 0028 Backtrace: + S: 0000000031d43c60 R: 000000003001b274 ._abort+0x4c + S: 0000000031d43ce0 R: 000000003001b2f0 .assert_fail+0x34 + S: 0000000031d43d60 R: 0000000030014814 .load_and_boot_kernel+0xae4 + S: 0000000031d43e30 R: 0000000030015164 .main_cpu_entry+0x680 + S: 0000000031d43f00 R: 0000000030002718 boot_entry+0x1c0 + --- OPAL boot --- + + Analysis of the execution paths suggests we'll always "safely" end this + way due to the setup sequence for the blocklevel callbacks in flash_init() + and error handling in blocklevel_get_info(), and there's no current risk + of executing from unexpected memory locations. As such the issue is + reduced down to a fix for poor error hygiene in the original change + and a resolution for a Coverity warning (famous last words etc). +- core/flash: Retry requests as necessary in flash_load_resource() + + We would like to successfully boot if we have a dependency on the BMC + for flash even if the BMC is not currently ready to service flash + requests. On the assumption that it will become ready, retry for several + minutes to cover a BMC reboot cycle and *eventually* rather than + *immediately* crash out with: :: + + [ 269.549748] reboot: Restarting system + [ 390.297462587,5] OPAL: Reboot request... + [ 390.297737995,5] RESET: Initiating fast reboot 1... + [ 391.074707590,5] Clearing unused memory: + [ 391.075198880,5] PCI: Clearing all devices... + [ 391.075201618,7] Clearing region 201ffe000000-201fff800000 + [ 391.086235699,5] PCI: Resetting PHBs and training links... + [ 391.254089525,3] FFS: Error 17 reading flash header + [ 391.254159668,3] FLASH: Can't open ffs handle: 17 + [ 392.307245135,5] PCI: Probing slots... + [ 392.363723191,5] PCI Summary: + ... + [ 393.423255262,5] OCC: All Chip Rdy after 0 ms + [ 393.453092828,5] INIT: Starting kernel at 0x20000000, fdt at + 0x30800a88 390645 bytes + [ 393.453202605,0] FATAL: Kernel is zeros, can't execute!
+ [ 393.453247064,0] Assert fail: core/init.c:593:0 + [ 393.453289682,0] Aborting! + CPU 0040 Backtrace: + S: 0000000031e03ca0 R: 000000003001af60 ._abort+0x4c + S: 0000000031e03d20 R: 000000003001afdc .assert_fail+0x34 + S: 0000000031e03da0 R: 00000000300146d8 .load_and_boot_kernel+0xb30 + S: 0000000031e03e70 R: 0000000030026cf0 .fast_reboot_entry+0x39c + S: 0000000031e03f00 R: 0000000030002a4c fast_reset_entry+0x2c + --- OPAL boot --- + + The OPAL flash API hooks directly into the blocklevel layer, so there's + no delay for e.g. the host kernel, just for asynchronously loaded + resources during boot. +- fast-reboot: occ: Call occ_pstates_init() on fast-reset on all machines + + Commit 815417dcda2e ("init, occ: Initialise OCC earlier on BMC systems") + conditionally invoked occ_pstates_init() only on FSP based systems in + load_and_boot_kernel(). Because of this, the pstate table is re-parsed on FSP + systems and skipped on BMC systems during fast-reboot. So this patch fixes + this by invoking occ_pstates_init() on all boxes during fast-reboot. +- opal/hmi: Don't retry TOD recovery if it is already in failed state. + + On TOD failure, all cores/threads receive an HMI and the very first thread that + gets the interrupt fixes the TOD, whereas the others just reset the respective + HMER error bit and return. But when the TOD is unrecoverable, all the threads + try to do TOD recovery one by one, causing threads to spend more time inside + opal. Set a global flag when the TOD is unrecoverable so that the rest of the + threads go back to linux immediately, avoiding lock ups in the system + reboot/panic path. +- hw/bt: Do not disable ipmi message retry during OPAL boot + + Currently OPAL doesn't know whether the BMC is functioning or not. If the BMC is + down (like BMC reboot), then we keep on retrying to send the message to the BMC. So + in some corner cases we may hit a hard lockup issue in the kernel. + + Ideally we should avoid using the synchronous path as much as possible.
But + for now commit 01f977c3 added an option to disable message retry in the synchronous path. + But this fix is not required during boot. Hence let's not disable IPMI message + retry during OPAL boot. +- hdata/memory: Fix warning message + + Even though we added memory to the device tree, we are getting the warning below. :: + + [ 57.136949696,3] Unable to use memory range 0 from MSAREA 0 + [ 57.137049753,3] Unable to use memory range 0 from MSAREA 1 + [ 57.137152335,3] Unable to use memory range 0 from MSAREA 2 + [ 57.137251218,3] Unable to use memory range 0 from MSAREA 3 +- hw/bt: Add backend interface to disable ipmi message retry option + + During boot OPAL makes an IPMI_GET_BT_CAPS call to the BMC to get the BT interface + capabilities, which include the IPMI message max resend count, message + timeout, etc. Most of the time OPAL gets a response from the BMC within the + specified timeout. In some corner cases (like mboxd daemon reset in BMC, + BMC reboot, etc) OPAL may not get a response within the timeout period. In + such scenarios, OPAL resends the message until the max resend count is reached. + + OPAL uses synchronous IPMI messages (ipmi_queue_msg_sync()) for a few + operations like flash read, write, etc. The thread will wait in OPAL until + it gets a response from the BMC. In some corner cases like BMC reboot, the thread + may wait in OPAL for a long time (more than 20 seconds), which results in a + kernel hard lockup. + + This patch introduces a new interface to disable the message resend option. We + will disable the message resend option for synchronous messages. This will + greatly reduce kernel hard lockup issues. + + This is a short term fix. The long term solution is to convert all synchronous + messages to asynchronous ones. +- ipmi/power: Fix system reboot issue + + The kernel makes a reboot/shutdown OPAL call for reboot/shutdown. Once the kernel + gets a response from OPAL it runs opal_poll_events() until firmware + handles the request. + + On a BMC based system, OPAL makes an IPMI call (IPMI_CHASSIS_CONTROL) to + initiate system reboot/shutdown.
At present OPAL queues IPMI messages + and returns SUCCESS to Host. If BMC is not ready to accept command (like + BMC reboot), then these messages will fail. We have to manually + reboot/shutdown the system using BMC interface. + + This patch adds logic to validate message return value. If message failed, + then it will resend the message. At some stage BMC will be ready to accept + message and handle IPMI message. +- firmware-versions: Add test case for parsing VERSION + + Also make it possible to use with afl-lop/afl-fuzz just to help make + *sure* we're all good. + + Additionally, if we hit an entry in VERSION that is larger than our + buffer size, we skip over it gracefully rather than overwriting the + stack. This is only a problem if VERSION isn't trusted, which as of + 4b8cc05a94513816d43fb8bd6178896b430af08f it is verified as part of + Secure Boot. +- core/fast-reboot: improve NMI handling during fast reset + + Improve sreset and MCE handling in fast reboot. Switch the HILE bit + off before copying OPAL's exception vectors, so NMIs can be handled + properly. Also disable MSR[ME] while the vectors are being overwritten. +- core/cpu: HID update race + + If the per-core HID register is updated concurrently by multiple + threads, updates can get lost. This has been observed during fast + reboot where the HILE bit does not get cleared on all cores, which + can cause machine check exception interrupts to crash. + + Fix this by only updating HID on thread0. +- SLW: Print verbose info on errors only + + Change print level from debug to warning for reporting + bad EC_PPM_SPECIAL_WKUP_* scom values. To reduce cluttering + in the log print only on error. + +IBM FSP based platforms +----------------------- + +- platforms/firenze: Rework I2C controller fixups +- platforms/zz: Re-enable LXVPD slot information parsing + + From memory this was disabled in the distant past since we were waiting + for an update to the LXVPD format. 
It looks like that never happened + so re-enable it for the ZZ platform so that we can get PCI slot location + codes on ZZ. + +HIOMAP +------ +- astbmc: Try IPMI HIOMAP for P8 + + The HIOMAP protocol was developed after the release of P8 in preparation + for P9. As a consequence P9 always uses it, but it has rarely been + enabled for P8. P8DTU has recently added IPMI HIOMAP support to its BMC + firmware, so enable its use in skiboot with P8 machines. Doing so + requires some rework to ensure fallback works correctly as in the past + the fallback was to mbox, which will only work for P9. +- libflash/ipmi-hiomap: Enforce message size for empty response + + The protocol defines the response to the associated messages as empty + except for the command ID and sequence fields. If the BMC is returning + extra data consider the message malformed. +- libflash/ipmi-hiomap: Remove unused close handling + + Issuing a HIOMAP_C_CLOSE is not required by the protocol specification, + rather a close can be implicit in a subsequent + CREATE_{READ,WRITE}_WINDOW request. The implicit close provides an + opportunity to reduce LPC traffic and the implementation takes up that + optimisation, so remove the case from the IPMI callback handler. +- libflash/ipmi-hiomap: Overhaul event handling + + Reworking the event handling was inspired by a bug report by Vasant + where the host would get wedged on multiple flash access attempts in the + face of a persistent error state on the BMC-side. The cause of this bug + was the early-exit based on ctx->update, which erroneously assumed that + all events had been completely handled in prior calls to + ipmi_hiomap_handle_events(). This is not true if e.g. + HIOMAP_E_DAEMON_READY is clear in the prior calls. 
+ + Regardless, there were other correctness and efficiency problems with + the handling strategy: + + * Ack-able event state was not restored in the face of errors in the + process of re-establishing protocol state + * It forced needless window restoration with respect to the context in + which ipmi_hiomap_handle_events() was called. + * Tests for HIOMAP_E_DAEMON_READY and HIOMAP_E_FLASH_LOST were redundant + with the overhauled error handling introduced in the previous patch + + Fix all of the above issues and add comments to explain the event + handling flow. +- libflash/ipmi-hiomap: Overhaul error handling + + The aim is to improve the robustness with respect to absence of the + BMC-side daemon. The current error handling roughly mirrors what was + done for the mailbox implementation, but there's room for improvement. + + Errors are split into two classes, those that affect the transport state + and those that affect the window validity. From here, we push the + transport state error checks right to the bottom of the stack, to ensure + the link is known to be in a good state before any message is sent. + Window validity tests remain as they were in the hiomap_window_move() + and ipmi_hiomap_read() functions. Validity tests are not necessary in + the write and erase paths as we will receive an error response from the + BMC when performing a dirty or flush on an invalid window. + + Recovery also remains as it was, done on entry to the blocklevel + callbacks. If an error state is encountered in the middle of an + operation no attempt is made to recover it on the spot, instead the + error is returned up the stack and the caller can choose how it wishes + to respond. 
+- libflash/ipmi-hiomap: Fix leak of msg in callback + +POWER8 +------ +- hw/phb3/naples: Disable D-states + + Putting "Mellanox Technologies MT27700 Family [ConnectX-4] [15b3:1013]" + (more precisely, the second of its 2 PCI functions, no matter in what + order) into the D3 state causes EEH with the "PCT timeout" error. + This has been noticed on garrison machines only and firestones do not + seem to have this issue. + + This disables D-states changing for devices on root buses on Naples by + installing a config space access filter (copied from PHB4). +- cpufeatures: Always advertise POWER8NVL as DD2 + + Despite the major version of PVR being 1 (0x004c0100) for POWER8NVL, + these chips are functionally equivalent to P8/P8E DD2 levels. + + This advertises POWER8NVL as DD2. As a result, skiboot adds + ibm,powerpc-cpu-features/processor-control-facility for such CPUs and + the linux kernel can use hypervisor doorbell messages to wake secondary + threads; otherwise "KVM: CPU %d seems to be stuck" would appear because + of missing LPCR_PECEDH. + +p8dtu Platform +^^^^^^^^^^^^^^ +- p8dtu: Configure BMC graphics + + We can no longer read the values from the BMC in the way we have in the + past. Values were provided by Eric Chen of SMC. +- p8dtu: Enable HIOMAP support + +Vesnin Platform +^^^^^^^^^^^^^^^ +- platforms/vesnin: Disable PCIe port bifurcation + + PCIe ports connected to CPU1 and CPU3 now work as x16 instead of x8x8. + +- Fix hang in pnv_platform_error_reboot path due to TOD failure. + + On TOD failure, with TB stuck, when linux heads down to + pnv_platform_error_reboot() path due to unrecoverable hmi event, the panic + cpu gets stuck in OPAL inside ipmi_queue_msg_sync(). At this time, + all other cpus are in smp_handle_nmi_ipi() waiting for panic cpu to proceed. + But with panic cpu stuck inside OPAL, linux never recovers/reboots. 
:: + + p0 c1 t0 + NIA : 0x000000003001dd3c <.time_wait+0x64> + CFAR : 0x000000003001dce4 <.time_wait+0xc> + MSR : 0x9000000002803002 + LR : 0x000000003002ecf8 <.ipmi_queue_msg_sync+0xec> + + STACK: SP NIA + 0x0000000031c236e0 0x0000000031c23760 (big-endian) + 0x0000000031c23760 0x000000003002ecf8 <.ipmi_queue_msg_sync+0xec> + 0x0000000031c237f0 0x00000000300aa5f8 <.hiomap_queue_msg_sync+0x7c> + 0x0000000031c23880 0x00000000300aaadc <.hiomap_window_move+0x150> + 0x0000000031c23950 0x00000000300ab1d8 <.ipmi_hiomap_write+0xcc> + 0x0000000031c23a90 0x00000000300a7b18 <.blocklevel_raw_write+0xbc> + 0x0000000031c23b30 0x00000000300a7c34 <.blocklevel_write+0xfc> + 0x0000000031c23bf0 0x0000000030030be0 <.flash_nvram_write+0xd4> + 0x0000000031c23c90 0x000000003002c128 <.opal_write_nvram+0xd0> + 0x0000000031c23d20 0x00000000300051e4 <opal_entry+0x134> + 0xc000001fea6e7870 0xc0000000000a9060 <opal_nvram_write+0x80> + 0xc000001fea6e78c0 0xc000000000030b84 <nvram_write_os_partition+0x94> + 0xc000001fea6e7960 0xc0000000000310b0 <nvram_pstore_write+0xb0> + 0xc000001fea6e7990 0xc0000000004792d4 <pstore_dump+0x1d4> + 0xc000001fea6e7ad0 0xc00000000018a570 <kmsg_dump+0x140> + 0xc000001fea6e7b40 0xc000000000028e5c <panic_flush_kmsg_end+0x2c> + 0xc000001fea6e7b60 0xc0000000000a7168 <pnv_platform_error_reboot+0x68> + 0xc000001fea6e7bd0 0xc0000000000ac9b8 <hmi_event_handler+0x1d8> + 0xc000001fea6e7c80 0xc00000000012d6c8 <process_one_work+0x1b8> + 0xc000001fea6e7d20 0xc00000000012da28 <worker_thread+0x88> + 0xc000001fea6e7db0 0xc0000000001366f4 <kthread+0x164> + 0xc000001fea6e7e20 0xc00000000000b65c <ret_from_kernel_thread+0x5c> + + This is because, there is a while loop towards the end of + ipmi_queue_msg_sync() which keeps looping until "sync_msg" does not match + with "msg". It loops over time_wait_ms() until exit condition is met. In + normal scenario time_wait_ms() calls run pollers so that ipmi backend gets + a chance to check ipmi response and set sync_msg to NULL. 
:: + + while (sync_msg == msg) + time_wait_ms(10); + + But in the event when TB is in failed state time_wait_ms()->time_wait_poll() + returns immediately without calling pollers and hence we end up looping + forever. This patch fixes this hang by calling opal_run_pollers() in TB + failed state as well. + + +.. _skiboot-6.3-rc1-power9: + +POWER9 +------ + +- Retry link training at PCIe GEN1 if presence detected but training repeatedly failed + + Certain older PCIe 1.0 devices will not train unless the training process starts at GEN1 speeds. + As a last resort when a device will not train, fall back to GEN1 speed for the last training attempt. + + This is verified to fix devices based on the Conexant CX23888 on the Talos II platform. +- hw/phb4: Drop FRESET_DEASSERT_DELAY state + + The delay between the ASSERT_DELAY and DEASSERT_DELAY states is set to + one timebase tick. This state seems to have been a hold over from PHB3 + where it was used to add a 1s delay between de-asserting PERST and + polling the link for the CAPI FPGA. There's no requirement for that here + since the link polling on PHB4 is a bit smarter so we should be fine. +- hw/phb4: Factor out PERST control + + Some time ago Mikey added some code to work around a bug we found where a + certain RAID card wouldn't come back again after a fast-reboot. The + workaround is setting the Link Disable bit before asserting PERST and + clearing it after de-asserting PERST. + + Currently we do this in the FRESET path, but not in the CRESET path. + This patch moves the PERST control into its own function to reduce + duplication and so the workaround is applied in all circumstances. +- hw/phb4: Remove FRESET presence check + + When we do an freset the first step is to check if a card is present in + the slot. However, this only occurs when we enter phb4_freset() with the + slot state set to SLOT_NORMAL. This occurs in: + + a) The creset path, and + b) When the OS manually requests an FRESET via an OPAL call. 
+ + (a) is problematic because in the boot path the generic code will put the + slot into FRESET_START manually before calling into phb4_freset(). This + can result in a situation where a device is detected on boot, but not + after a CRESET. + + I've noticed this occurring on systems where the PHB's slot presence + detect signal is not wired to an adapter. In this situation we can rely + on the in-band presence mechanism, but the presence check will make + us exit before that has a chance to work. + + Additionally, if we enter from the CRESET path this early exit leaves + the slot's PERST signal asserted. This isn't currently an issue, + but if we want to support hotplug of devices into the root port it will + be. +- hw/phb4: Skip FRESET PERST when coming from CRESET + + PERST is asserted at the beginning of the CRESET process to prevent + the downstream device from interacting with the host while the PHB logic + is being reset and re-initialised. There is at least a 100ms wait during + the CRESET processing so it's not necessary to wait this time again + in the FRESET handler. + + This patch extends the delay after re-setting the PHB logic + to the 250ms PERST wait period that we typically use and sets the + skip_perst flag so that we don't wait this time again in the FRESET + handler. +- hw/phb4: Look for the hub-id in the PBCQ node + + The hub-id is stored in the PBCQ node rather than the stack node so we + never add it to the PHB node. This breaks the lxvpd slot lookup code + since the hub-id is encoded in the VPD record that we need to find the + slot information. +- hdata/iohub: Look for IOVPD on P9 + + P8 and P9 use the same IO VPD setup, so we need to load the IOHUB VPD on + P9 systems too. + +CAPI2 +^^^^^ +- capp/phb4: Prevent HMI from getting triggered when disabling CAPP + + While disabling CAPP an HMI gets triggered as soon as ETU is put in + reset mode. 
This happens because, before we can disable CAPP, it detects + PHB link going down and triggers an HMI requesting Opal to perform + CAPP recovery. This has an un-intended side effect of spamming the + Opal logs with malfunction alert messages and may also confuse the + user. + + To prevent this we mask the CAPP FIR error 'PHB Link Down' Bit(31) + when we are disabling CAPP just before we put ETU in reset in + phb4_creset(). Also, since bringing down the PHB link now won't + trigger an HMI and CAPP recovery, we manually set the + PHB4_CAPP_RECOVERY flag on the phb to force recovery during creset. + +- phb4/capp: Implement sequence to disable CAPP and enable fast-reset + + We implement h/w sequence to disable CAPP in disable_capi_mode() and + with it also enable fast-reset for CAPI mode in phb4_set_capi_mode(). + + Sequence to disable CAPP is executed in three phases. The first two + phases are implemented in disable_capi_mode() where we reset the CAPP + registers followed by PEC registers to their init values. The + third and final phase is to reset the PHB CAPI Compare/Mask Register and + is done in phb4_init_ioda3(). The reason to move the PHB reset to + phb4_init_ioda3() is because by the time Opal PCI reset state machine + reaches this function the PHB is already un-fenced and its + configuration registers accessible via mmio. +- capp/phb4: Force CAPP to PCIe mode during kernel shutdown + + This patch introduces a new opal syncer for PHB4 named + phb4_host_sync_reset(). We register this opal syncer when CAPP is + activated successfully in phb4_set_capi_mode() so that it will be + called at kernel shutdown during fast-reset. + + During kernel shutdown the function will then repeatedly call + phb->ops->set_capi_mode() to switch CAPP to PCIe mode. 
In case + set_capi_mode() indicates OPAL_BUSY, which means that CAPP is + still transitioning to a new state, it calls slot->ops.run_sm() to + ensure that Opal slot reset state machine makes forward progress. + + +Witherspoon Platform +^^^^^^^^^^^^^^^^^^^^ +- platforms/witherspoon: Make PCIe shared slot error message more informative + + If we're missing chips for some reason, we print a warning when configuring + the PCIe shared slot. + + The warning doesn't really make it clear what "shared slot" is, and if it's + printed, it'll come right after a bunch of messages about NPU setup, so + let's clarify the message to explicitly mention PCI. +- witherspoon: Add nvlink2 interconnect information + + See :ref:`skiboot-6.3-rc1-new-features` for details. + +Zaius Platform +^^^^^^^^^^^^^^ + +- zaius: Add BMC description + + Frederic reported that Zaius was failing with a NULL dereference when + trying to initialise IPMI HIOMAP. It turns out that the BMC wasn't + described at all, so add a description. + +p9dsu platform +^^^^^^^^^^^^^^ +- p9dsu: Fix p9dsu default variant + + Add the default when no riser_id is returned from the ipmi query. + + Allow a little more time for BMC reply and clean up some label strings. + + +PCIe +---- + +See :ref:`skiboot-6.3-rc1-power9` for POWER9 specific PCIe changes. + +- core/pcie-slot: Don't bail early in the power on case + + Exiting early in the power off case makes sense since we can't disable + slot power (or assert PERST) for surprise hotplug slots. However, we + should not exit early in the power-on case since it's possible slot + power may have been disabled (or just not enabled at boot time). +- firenze-pci: Always init slot info from LXVPD + + We can get slot information from the LXVPD without having power control + information about that slot. This patch changes the init path so that + we always override the add_properties() call rather than only when we + have power control information about the slot. 
+- fsp/lxvpd: Print more LXVPD slot information + + Useful to know since it changes the behaviour of the slot core. +- core/pcie-slot: Set power state from the PWRCTL flag + + For some reason we look at the power control indicator and use that to + determine if the slot is "off" rather than the power control flag that + is used to power down the slot. + + While we're here change the default behaviour so that the slot is + assumed to be powered on if there's no slot capability, or if there's + no power control available. +- core/pci: Increase the max slot string size + + The maximum string length for the slot label / device location code in + the PCI summary is currently 32 characters. This results in some IBM + location codes being truncated due to their length, e.g. :: + + PHB#0001:02:11.0 [SWDN] SLOT=C11 x8 + PHB#0001:13:00.0 [EP ] *snip* LOC_CODE=U78D3.ND1.WZS004A-P1-C + PHB#0001:13:00.1 [EP ] *snip* LOC_CODE=U78D3.ND1.WZS004A-P1-C + PHB#0001:13:00.2 [EP ] *snip* LOC_CODE=U78D3.ND1.WZS004A-P1-C + PHB#0001:13:00.3 [EP ] *snip* LOC_CODE=U78D3.ND1.WZS004A-P1-C + + This obscures the actual location of the card, and it looks bad. This + patch increases the maximum length of the label string to 80 characters + since that's the maximum length for a location code. + + + +.. _skiboot-6.3-rc1-OpenCAPI: + +OpenCAPI +-------- +- npu2/hw-procedures: Fix parallel zcal for opencapi + + For opencapi, we currently do impedance calibration when initializing + the PHY for the device, which could run in parallel if we have + multiple opencapi devices. But if 2 devices are on the same + obus, the 2 calibration sequences could overlap, which likely yields + bad results and is useless anyway since it only needs to be done once + per obus. + + This patch splits the opencapi PHY reset in 2 parts: + + - an 'init' part called serially at boot. That's when zcal is done. 
If + we have 2 devices on the same socket, the zcal won't be redone, + since we're called serially and we'll see it has already been done for + the obus + - a 'reset' part called during fundamental reset as a prereq for link + training. It does the PHY setup for a set of lanes and the dccal. + + The PHY team confirmed there's no dependency between zcal and the + other reset steps and it can be moved earlier. +- npu2-hw-procedures: Fix zcal in mixed opencapi and nvlink mode + + The zcal procedure needs to be run once per obus. We keep track of + which obus is already calibrated in an array indexed by the obus + number. However, the obus number is inferred from the brick index, + which works well for nvlink but not for opencapi. + + Create an obus_index() function, which, from a device, returns the + correct obus index, irrespective of the device type. +- npu2-opencapi: Fix adapter reset when using 2 adapters + + If two opencapi adapters are on the same obus, we may try to train the + two links in parallel at boot time, when all the PCI links are being + trained. Both links use the same i2c controller to handle the reset + signal, so some care is needed to make sure resetting one doesn't + interfere with the reset of the other. We need to keep track of the + current state of the i2c controller (and use locking). + + This went mostly unnoticed as you need to have 2 opencapi cards on the + same socket and links tended to train anyway because of the retries. +- npu2-opencapi: Extend delay after releasing reset on adapter + + Give more time to the FPGA to process the reset signal. The previous + delay, 5ms, is too short for newer adapters with bigger FPGAs. Extend + it to 250ms. + Ultimately, that delay will likely end up being added to the opencapi + specification, but we are not there yet. +- npu2-opencapi: ODL should be in reset when enabled + + We haven't hit any problem so far, but from the ODL designer, the ODL + should be in reset when it is enabled. 
+ + The ODL remains in reset until we start a fundamental reset to + initiate link training. We still assert and deassert the ODL reset + signal as part of the normal procedure just before training the + link. Asserting is therefore useless at boot, since the ODL is already + in reset, but we keep it as it's only a scom write and it's needed + when we reset/retrain from the OS. +- npu2-opencapi: Keep ODL and adapter in reset at the same time + + Split the function to assert and deassert the reset signal on the ODL, + so that we can keep the ODL in reset while we reset the adapter, + therefore having a window where both sides are in reset. + + It is actually not required with our current DLx at boot time, but I + need to split the ODL reset function for the following patch and it + will become useful/required later when we introduce resetting an + opencapi link from the OS. +- npu2-opencapi: Setup perf counters to detect CRC errors + + It's possible to set up performance counters for the PLL to detect + various conditions for the links in nvlink or opencapi mode. Since + those counters are currently unused, let's configure them when an obus + is in opencapi mode to detect CRC errors on the link. Each link has + two counters: + - CRC error detected by the host + - CRC error detected by the DLx (NAK received by the host) + + We also dump the counters shortly after the link trains, but they can + be read multiple times through cronus, pdbg or linux. The counters are + configured to be reset after each read. + +NVLINK2 +------- +- npu2: Allow ATSD for LPAR other than 0 + + Each XTS MMIO ATSD# register is accompanied by another register - + XTS MMIO ATSD0 LPARID# - which controls LPID filtering for ATSD + transactions. + + When a host system passes a GPU through to a guest, we need to enable + some ATSD for an LPAR. At the moment the host assigns one ATSD to + a NVLink bridge and this maps it to an LPAR when GPU is assigned to + the LPAR. 
The link number is used for an ATSD index. + + ATSD6&7 stay mapped to the host (LPAR=0) all the time which seems to be + acceptable price for the simplicity. +- npu2: Add XTS_BDF_MAP wildcard refcount + + Currently PID wildcard is programmed into the NPU once and never cleared + up. This works for the bare metal as MSR does not change while the host + OS is running. + + However with the device virtualization, we need to keep track of wildcard + entries use and clear them up before switching a GPU from a host to + a guest or vice versa. + + This adds refcount to a NPU2, one counter per wildcard entry. The index + is a short lparid (4 bits long) which is allocated in opal_npu_map_lpar() + and should be smaller than NPU2_XTS_BDF_MAP_SIZE (defined as 16). + + + +Debugging and simulation +------------------------ + +- external/mambo: Error out if kernel is too large + + If you're trying to boot a gigantic kernel in mambo (which you can + reproduce by building a kernel with CONFIG_MODULES=n) you'll get + misleading errors like: :: + + WARNING: 0: (0): [0:0]: Invalid/unsupported instr 0x00000000[INVALID] + WARNING: 0: (0): PC(EA): 0x0000000030000010 PC(RA):0x0000000030000010 MSR: 0x9000000000000000 LR: 0x0000000000000000 + WARNING: 0: (0): numInstructions = 0 + WARNING: 1: (1): [0:0]: Invalid/unsupported instr 0x00000000[INVALID] + WARNING: 1: (1): PC(EA): 0x0000000000000E40 PC(RA):0x0000000000000E40 MSR: 0x9000000000000000 LR: 0x0000000000000000 + WARNING: 1: (1): numInstructions = 1 + WARNING: 1: (1): Interrupt to 0x0000000000000E40 from 0x0000000000000E40 + INFO: 1: (2): ** Execution stopped: Continuous Interrupt, Instruction caused exception, ** + + So add an error to skiboot.tcl to warn the user before this happens. + Making PAYLOAD_ADDR further back is one way to do this but if there's a + less gross way to generally work around this very niche problem, I can + suggest that instead. 
+- external/mambo: Populate kernel-base-address in the DT + + skiboot.tcl defines PAYLOAD_ADDR as 0x20000000, which is the default in + skiboot. This is also the default in skiboot unless kernel-base-address + is set in the device tree. + + If you change PAYLOAD_ADDR to something else for mambo, skiboot won't + see it because it doesn't set that DT property, so fix it so that it does. +- external/mambo: allow CPU targeting for most debug utils + + Debug util functions target CPU 0:0:0 by default. Some can be + overridden explicitly per invocation, and others can't at all. + Even for those that can be overridden, it is a pain to type + them out when you're debugging a particular thread. + + Provide a new 'target' function that allows the default CPU + target to be changed. Wire up that default to all other utils. + Provide a new 'S' step command which only steps the target CPU. +- qemu: bt device isn't always hanging off / + + Just use the normal for_each_compatible instead. + + Otherwise in the qemu model as executed by op-test, + we wouldn't go down the astbmc_init() path, thus not having flash. +- devicetree: Add p9-simics.dts + + Add a p9-based devicetree that's suitable for use with Simics. +- devicetree: Move power9-phb4.dts + + Clean up the formatting of power9-phb4.dts and move it to + external/devicetree/p9.dts. This sets us up to include it as the basis + for other trees. +- devicetree: Add nx node to power9-phb4.dts + + A (non-qemu) p9 without an nx node will assert in p9_darn_init(): :: + + dt_for_each_compatible(dt_root, nx, "ibm,power9-nx") + break; + if (!nx) { + if (!dt_node_is_compatible(dt_root, "qemu,powernv")) + assert(nx); + return; + } + + Since NX is this essential, add it to the device tree. +- devicetree: Fix typo in power9-phb4.dts + + Change "impi" to "ipmi". 
+- devicetree: Fix syntax error in power9-phb4.dts + + Remove the extra space causing this: :: + + Error: power9-phb4.dts:156.15-16 syntax error + FATAL ERROR: Unable to parse input tree +- core/init: enable machine check on secondaries + + Secondary CPUs currently run with MSR[ME]=0 during boot, which means + if they take a machine check, the system will checkstop. + + Enable ME where possible and allow them to print registers. + +Utilities +--------- +- pflash: Don't try update RO ToC + + In the future it's likely the ToC will be marked as read-only. Don't + error out by assuming it's writable. +- pflash: Support encoding/decoding ECC'd partitions + + With the new --ecc option, pflash can add/remove ECC when + reading/writing flash partitions protected by ECC. + + This is *not* flawless with current PNORs out in the wild though, as + they do not typically fill the whole partition with valid ECC data, so + you have to know how big the valid ECC'd data is and specify the size + manually. Note that for some partitions this is practically impossible + without knowing the details of the content of the partition. + + A future patch is likely to introduce an option to "stop reading data + when ECC starts failing and assume everything is okay rather than error + out" to support reading the "valid" data from existing PNOR images. + diff --git a/roms/skiboot/doc/release-notes/skiboot-6.3-rc2.rst b/roms/skiboot/doc/release-notes/skiboot-6.3-rc2.rst new file mode 100644 index 000000000..fe44f667e --- /dev/null +++ b/roms/skiboot/doc/release-notes/skiboot-6.3-rc2.rst @@ -0,0 +1,96 @@ +.. _skiboot-6.3-rc2: + +skiboot-6.3-rc2 +=============== + +skiboot v6.3-rc2 was released on Thursday April 11th 2019. It is the second +release candidate of skiboot 6.3, which will become the new stable release +of skiboot following the 6.2 release, first released December 14th 2018. + +Skiboot 6.3 will mark the basis for op-build v2.3. I expect to tag the final +skiboot 6.3 in the next week. 
+ +skiboot v6.3-rc2 contains all bug fixes as of :ref:`skiboot-6.0.19`, +and :ref:`skiboot-6.2.3` (the currently maintained +stable releases). + +For how the skiboot stable releases work, see :ref:`stable-rules` for details. + +Over :ref:`skiboot-6.3-rc1`, we have the following changes: + +- libflash/ipmi-hiomap: Fix blocks count issue + + We convert data size to block count and pass block count to BMC. + If data size is not block aligned then we end up sending block count + less than actual data. BMC will write partial data to flash memory. + + Sample log :: + + [ 594.388458416,7] HIOMAP: Marked flash dirty at 0x42010 for 8 + [ 594.398756487,7] HIOMAP: Flushed writes + [ 594.409596439,7] HIOMAP: Marked flash dirty at 0x42018 for 3970 + [ 594.419897507,7] HIOMAP: Flushed writes + + In this case HIOMAP sent data with block count=0 and hence BMC didn't + flush data to flash. + +- opal/hmi: Never trust a cow! + + With opencapi, it's fairly common to trigger HMIs during AFU + development on the FPGA, by not replying in time to an NPU command, + for example. So shift the blame reported by that cow to avoid crowding + my mailbox. +- hw/npu2: Dump (more) npu2 registers on link error and HMIs + + We were already logging some NPU registers during an HMI. This patch + cleans up a bit how it is done and separates what is global from what + is specific to nvlink or opencapi. + + Since we can now receive an error interrupt when an opencapi link goes + down unexpectedly, we also dump the NPU state but we limit it to the + registers of the brick which hit the error. + + The list of registers to dump was worked out with the hw team to + allow for proper debugging. For each register, we print the name as + found in the NPU workbook, the scom address and the register value. 
+- hw/npu2: Report errors to the OS if an OpenCAPI brick is fenced + + Now that the NPU may report interrupts due to the link going down + unexpectedly, report those errors to the OS when queried by the + 'next_error' PHB callback. + + The hardware doesn't support recovery of the link when it goes down + unexpectedly. So we report the PHB as dead, so that the OS can log the + proper message, notify the drivers and take the devices down. +- hw/npu2: Fix OpenCAPI PE assignment + + When we support mixing NVLink and OpenCAPI devices on the same NPU, we're + going to have to share the same range of 16 PE numbers between NVLink and + OpenCAPI PHBs. + + For OpenCAPI devices, PE assignment is only significant for determining + which System Interrupt Log register is used for a particular brick - unlike + NVLink, it doesn't play any role in determining how links are fenced. + + Split the PE range into a lower half which is used for NVLink, and an upper + half that is used for OpenCAPI, with a fixed PE number assigned per brick. + + As the PE assignment for OpenCAPI devices is fixed, set the PE once + during device init and then ignore calls to the set_pe() operation. + +- opal-api: Reserve 2 OPAL API calls for future OpenCAPI LPC use + + OpenCAPI Lowest Point of Coherency (LPC) memory is going to require + some extra OPAL calls to set up NPU BARs. These calls will most likely be + called OPAL_NPU_LPC_ALLOC and OPAL_NPU_LPC_RELEASE, we're not quite ready + to upstream that code yet though. + +- cpufeatures: Add tm-suspend-hypervisor-assist and tm-suspend-xer-so-bug node + + tm-suspend-hypervisor-assist for P9 >=DD2.2 + And a tm-suspend-xer-so-bug node for P9 DD2.2 only. + + I also treat P9P as P9 DD2.3 and add a unit test for the cpufeatures + infrastructure. 
+ + Fixes: https://github.com/open-power/skiboot/issues/233 diff --git a/roms/skiboot/doc/release-notes/skiboot-6.3-rc3.rst b/roms/skiboot/doc/release-notes/skiboot-6.3-rc3.rst new file mode 100644 index 000000000..6591e27d5 --- /dev/null +++ b/roms/skiboot/doc/release-notes/skiboot-6.3-rc3.rst @@ -0,0 +1,228 @@ +.. _skiboot-6.3-rc3: + +skiboot-6.3-rc3 +=============== + +skiboot v6.3-rc3 was released on Thursday May 2nd 2019. It is the third +release candidate of skiboot 6.3, which will become the new stable release +of skiboot following the 6.2 release, first released December 14th 2018. + +Skiboot 6.3 will mark the basis for op-build v2.3. I expect to tag the final +skiboot 6.3 in the next week (I also predicted this last time, so take my +predictions with a large amount of sodium). + +skiboot v6.3-rc3 contains all bug fixes as of :ref:`skiboot-6.0.19`, +and :ref:`skiboot-6.2.3` (the currently maintained +stable releases). + +For how the skiboot stable releases work, see :ref:`stable-rules` for details. + +Over :ref:`skiboot-6.3-rc2`, we have the following changes: + + +- Expose PNOR Flash partitions to host MTD driver via devicetree + + This makes it possible for the host to directly address each + partition without requiring each application to directly parse + the FFS headers. This has been in use for some time already to + allow BOOTKERNFW partition updates from the host. + + All partitions except BOOTKERNFW are marked readonly. + + The BOOTKERNFW partition is currently exclusively used by the TalosII platform + +- Write boot progress to LPC port 80h + + This is an adaptation of what we currently do for op_display() on FSP + machines, inventing an encoding for what we can write into the single + byte at LPC port 80h. + + Port 80h is often used on x86 systems to indicate boot progress/status + and dates back a decent amount of time. Since a byte isn't exactly very + expressive for everything that can go on (and wrong) during boot, it's + all about compromise. 
+ + Some systems (such as Zaius/Barreleye G2) have a physical dual 7 segment + display that displays these codes. So far, this has only been driven by + hostboot (see hostboot commit 90ec2e65314c). + +- Write boot progress to LPC ports 81 and 82 + + There's a thought to write more extensive boot progress codes to LPC + ports 81 and 82 to supplement/replace any reliance on port 80. + + We want to still emit port 80 for platforms like Zaius and Barreleye + that have the physical display. Ports 81 and 82 can be monitored by a + BMC though. + +- Copy and convert Romulus descriptors to Talos + + Talos II has some hardware differences from Romulus, therefore + we cannot guarantee Talos II == Romulus in skiboot. Copy and + slightly modify the Romulus files for Talos II. + +- npu2: Disable Probe-to-Invalid-Return-Modified-or-Owned snarfing by default + + V100 GPUs are known to violate the NVLink2 protocol in some cases (one is when + memory was accessed by the CPU and then by the GPU using so-called block + linear mapping) and issue double probes to the NPU, which can cope with this + problem only if CONFIG_ENABLE_SNARF_CPM ("disable/enable Probe.I.MO + snarfing a cp_m") is not set in the CQ_SM Misc Config register #0. + If the bit is set (which is the case today), the NPU issues the machine + check stop. + + The snarfing feature is designed to detect 2 probes in flight and combine + them into one. + + This adds a new "opal-npu2-snarf-cpm" nvram variable which controls + CONFIG_ENABLE_SNARF_CPM for all NVLinks to prevent the machine check + stop from happening. + + This disables snarfing by default as otherwise a broken GPU driver can + crash the entire box even when a GPU is passed through to a guest. + This provides a dial to allow regression tests (might be useful for + a bare metal). To enable snarfing, the user needs to run: :: + + sudo nvram -p ibm,skiboot --update-config opal-npu2-snarf-cpm=enable + + and reboot the host system. 
+ +- hw/npu2: Show name of opencapi error interrupts +- core/pci: Use PHB io-base-location by default for PHB slots + + On witherspoon only the GPU slots and the three pluggable PCI slots + (SLOT0, 1, 2) have platform defined slot names. For builtin devices such + as the SATA controller or the PLX switch that fans out to the GPU slots + we have no location codes which some people consider an issue. + + This patch addresses the problem by making the ibm,slot-location-code for + the root port device default to the ibm,io-base-location-code which is + typically the location code for the system itself. + + e.g. :: + + pciex@600c3c0100000/ibm,loc-code + "UOPWR.0000000-Node0-Proc0" + + pciex@600c3c0100000/pci@0/ibm,loc-code + "UOPWR.0000000-Node0-Proc0" + + pciex@600c3c0100000/pci@0/usb-xhci@0/ibm,loc-code + "UOPWR.0000000-Node0" + + The PHB node and the root complex nodes have a loc code of the + processor they are attached to, while the usb-xhci device under the + root port has a location code of the system itself. + +- hw/phb4: Read ibm,loc-code from PBCQ node + + On P9 the PBCQs are subdivided by stacks which implement the PCI Express + logic. When phb4 was forked from phb3 most of the properties that were + in the pbcq node moved into the stack node, but ibm,loc-code was not one + of them. This patch fixes the phb4 init sequence to read the base + location code from the PBCQ node (parent of the stack node) rather than + the stack node itself. +- hw/xscom: add missing P9P chip name +- asm/head: balance branches to avoid link stack predictor mispredicts + + The Linux wrapper for OPAL call and return is arranged like this: :: + + __opal_call: + mflr r0 + std r0,PPC_STK_LROFF(r1) + LOAD_REG_ADDR(r11, opal_return) + mtlr r11 + hrfid -> OPAL + + opal_return: + ld r0,PPC_STK_LROFF(r1) + mtlr r0 + blr + + When skiboot returns to Linux, it branches to LR (i.e., opal_return) + with a blr. 
This unbalances the link stack predictor and will cause + mispredicts back up the return stack. +- external/mambo: also invoke readline for the non-autorun case +- asm/head.S: set POWER9 radix HID bit at entry + + When running in virtual memory mode, the radix MMU hid bit should not + be changed, so set this in the initial boot SPR setup. + + As a side effect, fast reboot also has HID0:RADIX bit set by the + shared spr init, so no need for an explicit call. +- opal-prd: Fix memory leak in is-fsp-system check +- opal-prd: Check malloc return value +- hw/phb4: Squash the IO bridge window + + The PCI-PCI bridge spec says that bridges that implement an IO window + should hardcode the IO base and limit registers to zero. + Unfortunately, these registers only define the upper bits of the IO + window and the low bits are assumed to be 0 for the base and 1 for the + limit address. As a result, setting both to zero can be mis-interpreted + as a 4K IO window. + + This patch fixes the problem the same way PHB3 does. It sets the IO base + and limit values to 0xf000 and 0x1000 respectively which most software + interprets as a disabled window. + + lspci before patch: :: + + 0000:00:00.0 PCI bridge: IBM Device 04c1 (prog-if 00 [Normal decode]) + I/O behind bridge: 00000000-00000fff + + lspci after patch: :: + + 0000:00:00.0 PCI bridge: IBM Device 04c1 (prog-if 00 [Normal decode]) + I/O behind bridge: None + +- build: link with --orphan-handling=warn + + The linker can warn when the linker script does not explicitly place + all sections. These orphan sections are placed according to + heuristics, which may not always be desirable. Enable this warning. +- build: -fno-asynchronous-unwind-tables + + skiboot does not use unwind tables, this option saves about 100kB, + mostly from .text. +- hw/xscom: Enable sw xstop by default on p9 + + This was disabled at some point during bringup to make life easier for + the lab folks trying to debug NVLink issues. 
This hack really should + have never made it out into the wild though, so we now have the + following situation occurring in the field: + + 1) A bad happens + 2) The host kernel receives an unrecoverable HMI and calls into OPAL to + request a platform reboot. + 3) OPAL rejects the reboot attempt and returns to the kernel with + OPAL_PARAMETER. + 4) Kernel panics and attempts to kexec into a kdump kernel. + + A side effect of the HMI seems to be CPUs becoming stuck which results + in the initialisation of the kdump kernel taking an extremely long time + (6+ hours). It's also been observed that after performing a dump the + kdump kernel then crashes itself because OPAL has ended up in a bad + state as a side effect of the HMI. + + All up, it's not very good so re-enable the software checkstop by + default. If people still want to turn it off they can using the nvram + override. +- opal/hmi: Initialize the hmi event with old value of TFMR. + + Do this before we fix TFAC errors. Otherwise the event at host console + shows no thread error reported in TFMR register. + + Without this patch the console event shows TFMR with no thread error: + (DEC parity error TFMR[59] injection) :: + + [ 53.737572] Severe Hypervisor Maintenance interrupt [Recovered] + [ 53.737596] Error detail: Timer facility experienced an error + [ 53.737611] HMER: 0840000000000000 + [ 53.737621] TFMR: 3212000870e04000 + + After this patch it shows old TFMR value on host console: :: + + [ 2302.267271] Severe Hypervisor Maintenance interrupt [Recovered] + [ 2302.267305] Error detail: Timer facility experienced an error + [ 2302.267320] HMER: 0840000000000000 + [ 2302.267330] TFMR: 3212000870e14010 diff --git a/roms/skiboot/doc/release-notes/skiboot-6.3.1.rst b/roms/skiboot/doc/release-notes/skiboot-6.3.1.rst new file mode 100644 index 000000000..201cb1ff4 --- /dev/null +++ b/roms/skiboot/doc/release-notes/skiboot-6.3.1.rst @@ -0,0 +1,60 @@ +.. 
_skiboot-6.3.1: + +============== +skiboot-6.3.1 +============== + +skiboot 6.3.1 was released on Friday May 10th, 2019. It replaces +:ref:`skiboot-6.3` as the current stable release in the 6.3.x series. + +It is recommended that 6.3.1 be used instead of version 6.3 +due to the bug fixes it contains. + +Bug fixes included in this release are: + +- platforms/astbmc: Check for SBE validation step + + On some POWER8 astbmc systems an update to the SBE requires pausing at + runtime to ensure integrity of the SBE. If this is required the BMC will + set a chassis boot option IPMI flag using the OEM parameter 0x62. If + Skiboot sees this flag is set it waits until the SBE update is complete + and the flag is cleared. + Unfortunately the mystery operation that validates the SBE also leaves + it in a bad state and unable to be used for timer operations. To + work around this the flag is checked as soon as possible (i.e. when IPMI + and the console are set up), and once complete the system is rebooted. + +- ipmi: ensure forward progress on ipmi_queue_msg_sync() + + BT responses are handled using a timer doing the polling. To hope to + get an answer to an IPMI synchronous message, the timer needs to run. + + We can't just check all timers though as there may be a timer that + wants a lock that's held by a code path calling ipmi_queue_msg_sync(), + and if we did enforce that as a requirement, it's a pretty subtle + API that is asking to be broken. + + So, if we just run a poll function to crank anything that the IPMI + backend needs, then we should be fine. + + This issue shows up very quickly under QEMU when loading the first + flash resource with the IPMI HIOMAP backend. + +- pci/iov: Remove skiboot VF tracking + + This feature was added a few years ago in response to a request to make + the MaxPayloadSize (MPS) field of a Virtual Function match the MPS of the + Physical Function that hosts it. + + The SR-IOV specification states that the MPS field of the VF is "ResvP". 
+ This indicates the VF will use whatever MPS is configured on the PF and + that the field should be treated as a reserved field in the config space + of the VF. In other words, a SR-IOV spec compliant VF should always return + zero in the MPS field. Adding hacks in OPAL to make it non-zero is... + misguided at best. + + Additionally, there is a bug in the way pci_device structures are handled + by VFs that results in a crash on fast-reboot that occurs if VFs are + enabled and then disabled prior to rebooting. This patch fixes the bug by + removing the code entirely. This patch has no impact on SR-IOV support on + the host operating system. diff --git a/roms/skiboot/doc/release-notes/skiboot-6.3.2.rst b/roms/skiboot/doc/release-notes/skiboot-6.3.2.rst new file mode 100644 index 000000000..e8a38e200 --- /dev/null +++ b/roms/skiboot/doc/release-notes/skiboot-6.3.2.rst @@ -0,0 +1,218 @@ +.. _skiboot-6.3.2: + +============== +skiboot-6.3.2 +============== + +skiboot 6.3.2 was released on Monday July 1st, 2019. It replaces +:ref:`skiboot-6.3.1` as the current stable release in the 6.3.x series. + +It is recommended that 6.3.2 be used instead of 6.3.1 version due to the +bug fixes it contains. + +Bug fixes included in this release are: + +- npu2: Purge cache when resetting a GPU + + After putting all a GPU's links in reset, do a cache purge in case we + have CPU cache lines belonging to the now-unaccessible GPU memory. + +- npu2: Reset NVLinks when resetting a GPU + + Resetting a V100 GPU brings its NVLinks down and if an NPU tries using + those, an HMI occurs. We were lucky not to observe this as the bare metal + does not normally reset a GPU and when passed through, GPUs are usually + before NPUs in QEMU command line or Libvirt XML and because of that NPUs + are naturally reset first. However simple change of the device order + brings HMIs. 
+ + This defines a bus control filter for a PCI slot with a GPU with NVLinks + so when the host system issues a secondary bus reset to the slot, it resets + associated NVLinks. + +- hw/phb4: Assert Link Disable bit after ETU init + + The cursed RAID card in ozrom1 has a bug where it ignores PERST being + asserted. The PCIe Base spec is a little vague about what happens + while PERST is asserted, but it does clearly specify that when + PERST is de-asserted the Link Training and Status State Machine + (LTSSM) of a device should return to the initial state (Detect) + defined in the spec and the link training process should restart. + + This bug was worked around in 9078f8268922 ("phb4: Delay training till + after PERST is deasserted") by setting the link disable bit at the + start of the FRESET process and clearing it after PERST was + de-asserted. Although this fixed the bug, the patch offered no + explanation of why the fix worked. + + In b8b4c79d4419 ("hw/phb4: Factor out PERST control") the link disable + workaround was moved into phb4_assert_perst(). This is always + called in the CRESET case, but a following patch resulted in + assert_perst() not being called if phb4_freset() was entered following a + CRESET since p->skip_perst was set in the CRESET handler. This is bad + since a side-effect of the CRESET is that the Link Disable bit is + cleared. + + This, combined with the RAID card ignoring PERST, results in the PCIe + link being trained by the PHB while we're waiting out the 100ms + ETU reset time. 
If we hack skiboot to print a DLP trace after returning + from phb4_hw_init() we get: :: + + PHB#0001[0:1]: Initialization complete + PHB#0001[0:1]: TRACE:0x0000102101000000 0ms presence GEN1:x16:polling + PHB#0001[0:1]: TRACE:0x0000001101000000 23ms GEN1:x16:detect + PHB#0001[0:1]: TRACE:0x0000102101000000 23ms presence GEN1:x16:polling + PHB#0001[0:1]: TRACE:0x0000183101000000 29ms training GEN1:x16:config + PHB#0001[0:1]: TRACE:0x00001c5881000000 30ms training GEN1:x08:recovery + PHB#0001[0:1]: TRACE:0x00001c5883000000 30ms training GEN3:x08:recovery + PHB#0001[0:1]: TRACE:0x0000144883000000 33ms presence GEN3:x08:L0 + PHB#0001[0:1]: TRACE:0x0000154883000000 33ms trained GEN3:x08:L0 + PHB#0001[0:1]: CRESET: wait_time = 100 + PHB#0001[0:1]: FRESET: Starts + PHB#0001[0:1]: FRESET: Prepare for link down + PHB#0001[0:1]: FRESET: Assert skipped + PHB#0001[0:1]: FRESET: Deassert + PHB#0001[0:1]: TRACE:0x0000154883000000 0ms trained GEN3:x08:L0 + PHB#0001[0:1]: TRACE: Reached target state + PHB#0001[0:1]: LINK: Start polling + PHB#0001[0:1]: LINK: Electrical link detected + PHB#0001[0:1]: LINK: Link is up + PHB#0001[0:1]: LINK: Went down waiting for stabilty + PHB#0001[0:1]: LINK: DLP train control: 0x0000105101000000 + PHB#0001[0:1]: CRESET: Starts + + What has happened here is that the link is trained to 8x Gen3 33ms after + we return from phb4_init_hw(), and before we've waited the 100ms + that we normally wait after re-initialising the ETU. When we "deassert" + PERST later on in the FRESET handler the link is in L0 (normal) state. At + this point we try to read from the Vendor/Device ID register to verify + that the link is stable and immediately get a PHB fence due to a PCIe + Completion Timeout. Skiboot attempts to recover by doing another CRESET, + but this will encounter the same issue. + + This patch fixes the problem by setting the Link Disable bit (by calling + phb4_assert_perst()) immediately after we return from phb4_init_hw(). 
+ This prevents the link from being trained while PERST is asserted which + seems to avoid the Completion Timeout. With the patch applied we get: :: + + PHB#0001[0:1]: Initialization complete + PHB#0001[0:1]: TRACE:0x0000102101000000 0ms presence GEN1:x16:polling + PHB#0001[0:1]: TRACE:0x0000001101000000 23ms GEN1:x16:detect + PHB#0001[0:1]: TRACE:0x0000102101000000 23ms presence GEN1:x16:polling + PHB#0001[0:1]: TRACE:0x0000909101000000 29ms presence GEN1:x16:disabled + PHB#0001[0:1]: CRESET: wait_time = 100 + PHB#0001[0:1]: FRESET: Starts + PHB#0001[0:1]: FRESET: Prepare for link down + PHB#0001[0:1]: FRESET: Assert skipped + PHB#0001[0:1]: FRESET: Deassert + PHB#0001[0:1]: TRACE:0x0000001101000000 0ms GEN1:x16:detect + PHB#0001[0:1]: TRACE:0x0000102101000000 0ms presence GEN1:x16:polling + PHB#0001[0:1]: TRACE:0x0000001101000000 24ms GEN1:x16:detect + PHB#0001[0:1]: TRACE:0x0000102101000000 36ms presence GEN1:x16:polling + PHB#0001[0:1]: TRACE:0x0000183101000000 97ms training GEN1:x16:config + PHB#0001[0:1]: TRACE:0x00001c5881000000 97ms training GEN1:x08:recovery + PHB#0001[0:1]: TRACE:0x00001c5883000000 97ms training GEN3:x08:recovery + PHB#0001[0:1]: TRACE:0x0000144883000000 99ms presence GEN3:x08:L0 + PHB#0001[0:1]: TRACE: Reached target state + PHB#0001[0:1]: LINK: Start polling + PHB#0001[0:1]: LINK: Electrical link detected + PHB#0001[0:1]: LINK: Link is up + PHB#0001[0:1]: LINK: Link is stable + PHB#0001[0:1]: LINK: Card [9005:028c] Optimal Retry:disabled + PHB#0001[0:1]: LINK: Speed Train:GEN3 PHB:GEN4 DEV:GEN3 + PHB#0001[0:1]: LINK: Width Train:x08 PHB:x08 DEV:x08 + PHB#0001[0:1]: LINK: RX Errors Now:0 Max:8 Lane:0x0000 + +- npu2: Reset PID wildcard and refcounter when mapped to LPID + + Since 105d80f85b "npu2: Use unfiltered mode in XTS tables" we do not + register every PID in the XTS table so the table has one entry per LPID. 
+ Then we added a reference counter to keep track of the entry use when + switching GPU between the host and guest systems (the "Fixes:" tag below). + + The POWERNV platform setup creates such entries and references them + at boot time when initializing IOMMUs and only removes them when + a GPU is passed through to a guest. This creates a problem as POWERNV + boots via kexec and no dereferencing happens; the XTS table state remains + undefined. So when the host kernel boots, skiboot thinks there are valid + XTS entries and does not update the XTS table which breaks ATS. + + This adds the reference counter and the XTS entry reset when a GPU is + assigned to LPID and we cannot rely on the kernel to clean that up. + +- hw/phb4: Use read/write_reg in assert_perst + + While the PHB is fenced we can't use the MMIO interface to access PHB + registers. While processing a complete reset we inject a PHB fence to + isolate the PHB from the rest of the system because the PHB won't + respond to MMIOs from the rest of the system while being reset. + + We assert PERST after the fence has been erected which requires us to + use the XSCOM indirect interface to access the PHB registers rather than + the MMIO interface. Previously we did that when asserting PERST in the + CRESET path. However, in b8b4c79d4419 ("hw/phb4: Factor out PERST + control") this was re-written to use the raw in_be64() accessor. This + means that CRESET would not be asserted in the reset path. On some + Mellanox cards this would prevent them from re-loading their firmware + when the system was fast-reset. + + This patch fixes the problem by replacing the raw {in|out}_be64() + accessors with the phb4_{read|write}_reg() functions. + +- opal-prd: Fix prd message size issue + + If the prd message buffer size is insufficient then the read_prd_msg() call fails with + the error below, and the caller does not reallocate a sufficient buffer. It is also + hard to guess the size. 
+ + sample log:: + + Mar 28 03:31:43 zz24p1 opal-prd: FW: error reading from firmware: alloc 32 rc -1: Invalid argument + Mar 28 03:31:43 zz24p1 opal-prd: FW: error reading from firmware: alloc 32 rc -1: Invalid argument + Mar 28 03:31:43 zz24p1 opal-prd: FW: error reading from firmware: alloc 32 rc -1: Invalid argument + + Let's use the opal-msg-size device tree property to allocate memory + for the prd message. + +- npu2: Fix clearing the FIR bits + + FIR registers are SCOM-only so they cannot be accessed with the indirect + write, and yet we use SCOM-based addresses for these; fix this. + +- opal-gard: Account for ECC size when clearing partition + + When 'opal-gard clear all' is run, it works by erasing the GUARD then + using blocklevel_smart_write() to write nothing to the partition. This + second write call is needed because we rely on libflash to set the ECC + bits appropriately when the partition contained ECCed data. + + The API for this is a little odd with the caller specifying how much + actual data to write, and libflash writing size + size/8 bytes + since there is one additional ECC byte for every eight bytes of data. + + We currently do not account for the extra space consumed by the ECC data + in reset_partition() which is used to handle the 'clear all' command, + which results in the partition following the GUARD partition being + partially overwritten when the command is used. This patch fixes the + problem by reducing the length we would normally write by the number + of ECC bytes required. + +- nvram: Flag dangerous NVRAM options + + Most nvram options used by skiboot are just for debug or testing for + regressions. They should never be used long term. + + We've hit a number of issues in testing and the field where nvram + options have been set "temporarily" but haven't been properly cleared + after, resulting in crashes or real bugs being masked. 
+ + This patch marks most nvram options used by skiboot as dangerous and + prints a chicken to remind users of the problem. + +- devicetree: Don't set path to dtc in makefile + + By setting the path we fail to build under buildroot which has its own + set of host tools in PATH, but not at /usr/bin. + + Keep the variable so it can be set if need be but default to whatever + 'dtc' is in the user's path. diff --git a/roms/skiboot/doc/release-notes/skiboot-6.3.3.rst b/roms/skiboot/doc/release-notes/skiboot-6.3.3.rst new file mode 100644 index 000000000..f27d38f81 --- /dev/null +++ b/roms/skiboot/doc/release-notes/skiboot-6.3.3.rst @@ -0,0 +1,76 @@ +.. _skiboot-6.3.3: + +============== +skiboot-6.3.3 +============== + +skiboot 6.3.3 was released on Wednesday Aug 6th, 2019. It replaces +:ref:`skiboot-6.3.2` as the current stable release in the 6.3.x series. + +It is recommended that 6.3.3 be used instead of any previous 6.3.x version +due to the bug fixes it contains. + +Bug fixes included in this release are: + +- struct p9_sbe_msg doesn't need to be packed + + Only the reg member is sent anywhere (via xscom_write), so the structure + does not need to be packed. + +.. code-block:: text + + Fixes GCC9 build problem: + hw/sbe-p9.c: In function ‘p9_sbe_msg_send’: + hw/sbe-p9.c:270:9: error: taking address of packed member of ‘struct p9_sbe_msg’ may result in an unaligned pointer value [-Werror=address-of-packed-member] + 270 | data = &msg->reg[0]; + | ^~~~~~~~~~~~ + +- hdata/vpd: fix printing (char*)0x00 + GCC9 now catches this bug: + +.. code-block:: text + + In file included from hdata/vpd.c:17: + In function ‘vpd_vini_parse’, + inlined from ‘vpd_data_parse’ at hdata/vpd.c:416:3: + /skiboot/include/skiboot.h:93:31: error: ‘%s’ directive argument is null [-Werror=format-overflow=] + 93 | #define prlog(l, f, ...) 
do { _prlog(l, pr_fmt(f), ##__VA_ARGS__); } while(0) + | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + hdata/vpd.c:390:5: note: in expansion of macro ‘prlog’ + 390 | prlog(PR_WARNING, + | ^~~~~ + hdata/vpd.c: In function ‘vpd_data_parse’: + hdata/vpd.c:391:46: note: format string is defined here + 391 | "VPD: CCIN desc not available for: %s\n", + | ^~ + cc1: all warnings being treated as errors + +- errorlog: Prevent alignment error building with gcc9. + +.. code-block:: text + + Fixes this build error: + [ 52s] hw/fsp/fsp-elog-write.c: In function 'opal_elog_read': + [ 52s] hw/fsp/fsp-elog-write.c:213:12: error: taking address of packed member of 'struct errorlog' may result + in an unaligned pointer value [-Werror=address-of-packed-member] + [ 52s] 213 | list_del(&log_data->link); + [ 52s] | ^~~~~~~~~~~~~~~ + +- Support BMC IPMI heartbeat command + + A few years ago, the OpenBMC code added support for a "heartbeat" + command to send to the host. This command is used after the BMC is reset + to check if the host is running. Support was never added to the host + side however so currently when the BMC sends this command, this appears + in the host console: + IPMI: unknown OEM SEL command ff received + + There is no response needed by the host (other than the low level + acknowledge of the command which already occurs). This commit + handles the command so the error is no longer printed (does nothing with + the command though since no action is needed). Here's the tested output + of this patch in the host console (with debug enabled): + IPMI: BMC issued heartbeat command: 00 + +- Add: add mihawk platform file diff --git a/roms/skiboot/doc/release-notes/skiboot-6.3.4.rst b/roms/skiboot/doc/release-notes/skiboot-6.3.4.rst new file mode 100644 index 000000000..bb879c0a3 --- /dev/null +++ b/roms/skiboot/doc/release-notes/skiboot-6.3.4.rst @@ -0,0 +1,29 @@ +.. _skiboot-6.3.4: + +============== +skiboot-6.3.4 +============== + +skiboot 6.3.4 was released on Thursday Oct 3rd, 2019. 
It replaces +:ref:`skiboot-6.3.3` as the current stable release in the 6.3.x series. + +It is recommended that 6.3.4 be used instead of any previous 6.3.x version +due to the bug fixes it contains. + +Bug fixes included in this release are: + +- hw/phb4: Prevent register accesses when in reset + +- core/platform: Actually disable fast-reboot on P8 + +- xive: fix return value of opal_xive_allocate_irq() + +- hw/phb4: Use standard MIN/MAX macro definitions + + The max() macro definition incorrectly returns the minimum value. The + max() macro is used to ensure that PERST has been asserted for 250ms and + that we wait 100ms for the ETU logic in the CRESET_START PHB4 + PCI slot state. However, by returning the minimum value there is no + guarantee that either of these requirements is met. + +- doc/requirements.txt: pin docutils at 0.14 diff --git a/roms/skiboot/doc/release-notes/skiboot-6.3.5.rst b/roms/skiboot/doc/release-notes/skiboot-6.3.5.rst new file mode 100644 index 000000000..9c6d8bc60 --- /dev/null +++ b/roms/skiboot/doc/release-notes/skiboot-6.3.5.rst @@ -0,0 +1,17 @@ +.. _skiboot-6.3.5: + +============== +skiboot-6.3.5 +============== + +skiboot v6.3.5 was released on Thursday June 4th, 2020. It replaces +:ref:`skiboot-6.3.4` as the current stable release in the 6.3.x series. + +It is recommended that v6.3.5 be used instead of any previous 6.3.x version +due to the bug fixes it contains. + +Bug fixes included in this release are: + +- uart: Drop console write data if BMC becomes unresponsive + +- core/ipmi: Fix use-after-free diff --git a/roms/skiboot/doc/release-notes/skiboot-6.3.rst b/roms/skiboot/doc/release-notes/skiboot-6.3.rst new file mode 100644 index 000000000..3b1fba397 --- /dev/null +++ b/roms/skiboot/doc/release-notes/skiboot-6.3.rst @@ -0,0 +1,1275 @@ +.. _skiboot-6.3: + +skiboot-6.3 +=========== + +skiboot v6.3 was released on Friday May 3rd 2019. 
It is the first +release of skiboot 6.3, which becomes the new stable release +of skiboot following the 6.2 release, first released December 14th 2018. + +Skiboot 6.3 will mark the basis for op-build v2.3. + +skiboot v6.3 contains all bug fixes as of :ref:`skiboot-6.0.20`, +and :ref:`skiboot-6.2.3` (the currently maintained +stable releases). + +For how the skiboot stable releases work, see :ref:`stable-rules` for details. + +Over skiboot 6.2, we have the following changes: + +.. _skiboot-6.3-new-features: + +New Features +------------ + +- hw/imc: Enable opal calls to init/start/stop IMC Trace mode + + New OPAL APIs for In-Memory Collection Counter infrastructure(IMC), + including a new device type called OPAL_IMC_COUNTERS_TRACE. +- xive: Add calls to save/restore the queues and VPs HW state + + To be able to support migration of guests using the XIVE native + exploitation mode, (where the queue is effectively owned by the + guest), KVM needs to be able to save and restore the HW-modified + fields of the queue, such as the current queue producer pointer and + generation bit, and to retrieve the modified thread context registers + of the VP from the NVT structure : the VP interrupt pending bits. + + However, there is no need to set back the NVT structure on P9. P10 + should be the same. +- witherspoon: Add nvlink2 interconnect information + + GPUs on Redbud and Sequoia platforms are interconnected in groups of + 2 or 3 GPUs. The problem with that is if the user decides to pass a single + GPU from a group to the userspace, we need to ensure that links between + GPUs do not get enabled. + + A V100 GPU provides a way to disable selected links. In order to only + disable links to peer GPUs, we need a topology map. + + This adds an "ibm,nvlink-peers" property to a GPU DT node with phandles + of peer GPUs and NVLink2 bridges. The index in the property is a GPU link + number. 
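As an illustration of how the "ibm,nvlink-peers" property described above could be consumed, here is a minimal sketch in Python. It is not the kernel or skiboot code: the function name and all phandle values are hypothetical, and it assumes only what the text states, namely that the property is indexed by GPU link number and holds phandles of peer GPUs and NVLink2 bridges.

```python
def links_to_peer_gpus(nvlink_peers, gpu_phandles):
    """Return the GPU link numbers whose peer is another GPU.

    nvlink_peers: list of phandles, indexed by GPU link number.
    gpu_phandles: set of phandles that belong to GPU nodes; any other
                  phandle is treated as an NVLink2 bridge.
    """
    return [link for link, peer in enumerate(nvlink_peers) if peer in gpu_phandles]


# Hypothetical topology: GPUs have phandles 0x101..0x103, 0x201 is a bridge.
GPU_PHANDLES = {0x101, 0x102, 0x103}
GPU0_PEERS = [0x102, 0x102, 0x201, 0x201, 0x103, 0x103]
```

With this made-up topology, passing a single GPU through would mean disabling links 0, 1, 4 and 5 (GPU to GPU) while leaving links 2 and 3 (GPU to bridge) enabled.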
+- platforms/romulus: Also support talos + + The two are similar enough and I'd like to have a slot table for our + Talos. +- OpenCAPI support! (see :ref:`skiboot-6.3-OpenCAPI` section) +- opal/hmi: set a flag to inform OS that TOD/TB has failed. + + Set a flag to inform the OS about TOD/TB failure as part of the new + opal_handle_hmi2 handler. This flag can then be used by the OS to make sure + functions depending on the TB value (e.g. udelay()) are aware of TB not + ticking. +- astbmc: Enable IPMI HIOMAP for AMI platforms + + Required for Habanero, Palmetto and Romulus. +- power-mgmt : occ : Add 'freq-domain-mask' DT property + + Add a new device-tree property freq-domain-indicator to define a group of + CPUs which share the same frequency. This property has been added under + the power-mgmt node. It is a bitmask. + + A bitwise AND is taken between this bitmask value and the PIR of the CPU. All the + CPUs lying in the same frequency domain will have the same result for the AND. + + For example, for POWER9, 0xFFF0 indicates a quad-wide frequency domain. + Taking the AND with the PIR of the CPUs yields a frequency domain with a + quad-wise distribution, as the last 4 bits, which represent the + cores, have been masked. + + Similarly, 0xFFF8 will represent a core-wide frequency domain for P8. + + Also, add a new device-tree property domain-runs-at which denotes the + strategy the OCC is using to change the frequency of a frequency-domain. There + are two possible strategies: FREQ_MOST_RECENTLY_SET and FREQ_MAX_IN_DOMAIN. + + FREQ_MOST_RECENTLY_SET : the OCC sets the frequency of the quad to the most + recent frequency value requested by the CPUs in the quad. + + FREQ_MAX_IN_DOMAIN : the OCC sets the frequency of the CPUs in + the Quad to the maximum of the latest frequency requested by each of + the component cores. +- powercap: occ: Fix the powercapping range allowed for user + + OCC provides two limits for minimum powercap. 
One is the hard powercap + minimum, which is guaranteed by OCC, and the other is a soft + powercap minimum, which is lower than hard-min and may or may not be + asserted due to various power-thermal reasons. So to allow users + to access the entire powercap range, this patch exports the soft powercap + minimum as the "powercap-min" DT property. It also adds a new + DT property called "powercap-hard-min" to export the hard-min powercap + limit. +- Add NVDIMM support + + NVDIMMs are memory modules that use a battery backup system to allow the + contents of RAM to be saved to non-volatile storage if system power goes + away unexpectedly. This allows them to be used as a high-performance + storage device, suitable for serving as a cache for SSDs and the like. + + Configuration of NVDIMMs is handled by hostboot and communicated to OPAL + via the HDAT. We need to parse out the NVDIMM memory ranges and create + memory regions with the "pmem-region" compatible label to make them + available to the host. +- core/exceptions: implement support for MCE interrupts in powersave + + The ISA specifies that MCE interrupts in power saving modes will enter + at 0x200 with powersave bits in SRR1 set. This is not currently + supported properly; the MCE will just happen like a normal interrupt, + but GPRs could be lost, which would lead to crashes (e.g., r1, r2, r13 + etc). + + So check the power save bits similarly to the sreset vector, and + handle this properly. +- core/exceptions: allow recoverable sreset exceptions + + This requires implementing the MSR[RI] bit. Then just allow all + non-fatal sreset exceptions to recover. +- core/exceptions: implement an exception handler for non-powersave sresets + + Detect non-powersave sresets and send them to the normal exception + handler which prints registers and stack. +- Add PVR_TYPE_P9P + + Enable a new PVR to get us running on another p9 variant. 
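The frequency-domain grouping described in the power-mgmt entry above can be illustrated with a small C sketch. It only shows the bitwise AND between a CPU's PIR and the mask; the helper names and the PIR values used in the checks are hypothetical and chosen for illustration, and the two mask constants are the ones quoted in the entry.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Mask values quoted in the release note: quad-wide on POWER9, core-wide on P8. */
#define FREQ_DOMAIN_MASK_P9_QUAD 0xFFF0u
#define FREQ_DOMAIN_MASK_P8_CORE 0xFFF8u

/* Hypothetical helper: the frequency domain of a CPU is its PIR ANDed with the mask. */
static inline uint32_t freq_domain_of(uint32_t pir, uint32_t mask)
{
	return pir & mask;
}

/* Two CPUs share a frequency domain when the AND results are equal. */
static inline bool same_freq_domain(uint32_t pir_a, uint32_t pir_b, uint32_t mask)
{
	return freq_domain_of(pir_a, mask) == freq_domain_of(pir_b, mask);
}

/* Illustrative checks with made-up PIR values. */
static inline void freq_domain_example(void)
{
	/* PIRs 0x10 and 0x1F differ only in the low 4 bits: same quad on POWER9. */
	assert(same_freq_domain(0x10, 0x1F, FREQ_DOMAIN_MASK_P9_QUAD));
	/* PIRs 0x0F and 0x10 differ above bit 3: different quads. */
	assert(!same_freq_domain(0x0F, 0x10, FREQ_DOMAIN_MASK_P9_QUAD));
}
```

A consumer of the property would read the mask from the device tree rather than hard-coding it; the constants here only mirror the two examples given in the text.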
+ +Since v6.3-rc2: + +- Expose PNOR Flash partitions to host MTD driver via devicetree + + This makes it possible for the host to directly address each + partition without requiring each application to directly parse + the FFS headers. This has been in use for some time already to + allow BOOTKERNFW partition updates from the host. + + All partitions except BOOTKERNFW are marked readonly. + + The BOOTKERNFW partition is currently exclusively used by the TalosII platform + +- Write boot progress to LPC port 80h + + This is an adaptation of what we currently do for op_display() on FSP + machines, inventing an encoding for what we can write into the single + byte at LPC port 80h. + + Port 80h is often used on x86 systems to indicate boot progress/status + and dates back a decent amount of time. Since a byte isn't exactly very + expressive for everything that can go on (and wrong) during boot, it's + all about compromise. + + Some systems (such as Zaius/Barreleye G2) have a physical dual 7 segment + display that display these codes. So far, this has only been driven by + hostboot (see hostboot commit 90ec2e65314c). + +- Write boot progress to LPC ports 81 and 82 + + There's a thought to write more extensive boot progress codes to LPC + ports 81 and 82 to supplement/replace any reliance on port 80. + + We want to still emit port 80 for platforms like Zaius and Barreleye + that have the physical display. Ports 81 and 82 can be monitored by a + BMC though. + +- Add Talos II platform + + Talos II has some hardware differences from Romulus, therefore + we cannot guarantee Talos II == Romulus in skiboot. Copy and + slightly modify the Romulus files for Talos II. + +Since v6.3-rc1: + +- cpufeatures: Add tm-suspend-hypervisor-assist and tm-suspend-xer-so-bug node + + tm-suspend-hypervisor-assist for P9 >=DD2.2 + And a tm-suspend-xer-so-bug node for P9 DD2.2 only. + + I also treat P9P as P9 DD2.3 and add a unit test for the cpufeatures + infrastructure. 
+ + Fixes: https://github.com/open-power/skiboot/issues/233 + + +Deprecated/Removed Features +--------------------------- + +- opal: Deprecate reading the PHB status + + The OPAL_PCI_EEH_FREEZE_STATUS call takes a bunch of parameters, one of + which is @phb_status. It is defined as __be64* and is always NULL in + the current Linux upstream, but if anyone ever decides to read that status, + then the PHB3's handler will assume it is struct OpalIoPhb3ErrorData* + (which is a lot bigger than 8 bytes) and zero it, causing stack + corruption; p7ioc-phb has the same issue. + + This removes @phb_status from all eeh_freeze_status() hooks and moves + the error message from PHB4 to the affected OPAL handlers. + + As far as we can tell, nobody has ever used this and thus it's safe to remove. +- Remove POWER9N DD1 support + + This is not a shipping product and is no longer supported by Linux + or other firmware components. + +Since v6.3-rc3: + +- Disable fast-reset for POWER8 + + There is a bug with fast-reset when CPU cores are busy, which can be + reproduced by running `stress` and then trying `reboot -ff` (this is + what the op-test test cases FastRebootHostStress and + FastRebootHostStressTorture do). What happens is the cores lock up, + which isn't the best thing in the world when you want them to start + executing instructions again. + + A workaround is to use instruction ramming, which, while greatly + increasing the reliability of fast-reset on p8, doesn't make it perfect. + + Instruction ramming is what pdbg was modified to do in order to have the + sreset functionality work reliably on p8. + pdbg patches: https://patchwork.ozlabs.org/project/pdbg/list/?series=96593&state=* + + Fixes: https://github.com/open-power/skiboot/issues/185 + +General +------- + +- core/i2c: Various bits of refactoring +- refactor backtrace generation infrastructure +- astbmc: Handle failure to initialise raw flash + + Initialising raw flash led to a dead assignment to rc. 
Check the return + code and take the failure path as necessary. Both before and after the + fix we see output along the lines of the following when flash_init() + fails: :: + + [ 53.283182881,7] IRQ: Registering 0800..0ff7 ops @0x300d4b98 (data 0x3052b9d8) + [ 53.283184335,7] IRQ: Registering 0ff8..0fff ops @0x300d4bc8 (data 0x3052b9d8) + [ 53.283185513,7] PHB#0000: Initializing PHB... + [ 53.288260827,4] FLASH: Can't load resource id:0. No system flash found + [ 53.288354442,4] FLASH: Can't load resource id:1. No system flash found + [ 53.342933439,3] CAPP: Error loading ucode lid. index=200ea + [ 53.462749486,2] NVRAM: Failed to load + [ 53.462819095,2] NVRAM: Failed to load + [ 53.462894236,2] NVRAM: Failed to load + [ 53.462967071,2] NVRAM: Failed to load + [ 53.463033077,2] NVRAM: Failed to load + [ 53.463144847,2] NVRAM: Failed to load + + Eventually followed by: :: + + [ 57.216942479,5] INIT: platform wait for kernel load failed + [ 57.217051132,5] INIT: Assuming kernel at 0x20000000 + [ 57.217127508,3] INIT: ELF header not found. Assuming raw binary. + [ 57.217249886,2] NVRAM: Failed to load + [ 57.221294487,0] FATAL: Kernel is zeros, can't execute! + [ 57.221397429,0] Assert fail: core/init.c:615:0 + [ 57.221471414,0] Aborting! + CPU 0028 Backtrace: + S: 0000000031d43c60 R: 000000003001b274 ._abort+0x4c + S: 0000000031d43ce0 R: 000000003001b2f0 .assert_fail+0x34 + S: 0000000031d43d60 R: 0000000030014814 .load_and_boot_kernel+0xae4 + S: 0000000031d43e30 R: 0000000030015164 .main_cpu_entry+0x680 + S: 0000000031d43f00 R: 0000000030002718 boot_entry+0x1c0 + --- OPAL boot --- + + Analysis of the execution paths suggests we'll always "safely" end this + way due the setup sequence for the blocklevel callbacks in flash_init() + and error handling in blocklevel_get_info(), and there's no current risk + of executing from unexpected memory locations. 
As such the issue is + reduced to a fix for poor error hygiene in the original change + and a resolution for a Coverity warning (famous last words etc). +- core/flash: Retry requests as necessary in flash_load_resource() + + We would like to successfully boot if we have a dependency on the BMC + for flash even if the BMC is not currently ready to service flash + requests. On the assumption that it will become ready, retry for several + minutes to cover a BMC reboot cycle and *eventually* rather than + *immediately* crash out with: :: + + [ 269.549748] reboot: Restarting system + [ 390.297462587,5] OPAL: Reboot request... + [ 390.297737995,5] RESET: Initiating fast reboot 1... + [ 391.074707590,5] Clearing unused memory: + [ 391.075198880,5] PCI: Clearing all devices... + [ 391.075201618,7] Clearing region 201ffe000000-201fff800000 + [ 391.086235699,5] PCI: Resetting PHBs and training links... + [ 391.254089525,3] FFS: Error 17 reading flash header + [ 391.254159668,3] FLASH: Can't open ffs handle: 17 + [ 392.307245135,5] PCI: Probing slots... + [ 392.363723191,5] PCI Summary: + ... + [ 393.423255262,5] OCC: All Chip Rdy after 0 ms + [ 393.453092828,5] INIT: Starting kernel at 0x20000000, fdt at + 0x30800a88 390645 bytes + [ 393.453202605,0] FATAL: Kernel is zeros, can't execute! + [ 393.453247064,0] Assert fail: core/init.c:593:0 + [ 393.453289682,0] Aborting! + CPU 0040 Backtrace: + S: 0000000031e03ca0 R: 000000003001af60 ._abort+0x4c + S: 0000000031e03d20 R: 000000003001afdc .assert_fail+0x34 + S: 0000000031e03da0 R: 00000000300146d8 .load_and_boot_kernel+0xb30 + S: 0000000031e03e70 R: 0000000030026cf0 .fast_reboot_entry+0x39c + S: 0000000031e03f00 R: 0000000030002a4c fast_reset_entry+0x2c + --- OPAL boot --- + + The OPAL flash API hooks directly into the blocklevel layer, so there's + no delay for e.g. the host kernel, just for asynchronously loaded + resources during boot. 
+- fast-reboot: occ: Call occ_pstates_init() on fast-reset on all machines + + Commit 815417dcda2e ("init, occ: Initialise OCC earlier on BMC systems") + conditionally invoked occ_pstates_init() only on FSP based systems in + load_and_boot_kernel(). Due to this, the pstate table is re-parsed on FSP + systems and skipped on BMC systems during fast-reboot. This patch fixes + this by invoking occ_pstates_init() on all boxes during fast-reboot. +- opal/hmi: Don't retry TOD recovery if it is already in failed state. + + On TOD failure, all cores/threads receive an HMI; the very first thread that + gets the interrupt fixes the TOD, whereas the others just reset the respective + HMER error bit and return. But when the TOD is unrecoverable, all the threads + try to do TOD recovery one by one, causing threads to spend more time inside + OPAL. Set a global flag when the TOD is unrecoverable so that the rest of the + threads go back to Linux immediately, avoiding lockups in the system + reboot/panic path. +- hw/bt: Do not disable ipmi message retry during OPAL boot + + Currently OPAL doesn't know whether the BMC is functioning or not. If the BMC is + down (like during a BMC reboot), then we keep on retrying to send the message to the BMC. So + in some corner cases we may hit a hard lockup issue in the kernel. + + Ideally we should avoid using the synchronous path as much as possible. But + for now commit 01f977c3 added an option to disable message retry in the synchronous path. + This fix is not required during boot, however. Hence, do not disable IPMI message + retry during OPAL boot. +- hdata/memory: Fix warning message + + Even though we added memory to the device tree, we get the warning below. 
:: + + [ 57.136949696,3] Unable to use memory range 0 from MSAREA 0 + [ 57.137049753,3] Unable to use memory range 0 from MSAREA 1 + [ 57.137152335,3] Unable to use memory range 0 from MSAREA 2 + [ 57.137251218,3] Unable to use memory range 0 from MSAREA 3 +- hw/bt: Add backend interface to disable ipmi message retry option + + During boot OPAL makes an IPMI_GET_BT_CAPS call to the BMC to get the BT interface + capabilities, which include the IPMI message max resend count, message + timeout, etc. Most of the time OPAL gets a response from the BMC within the + specified timeout. In some corner cases (like an mboxd daemon reset in the BMC, + a BMC reboot, etc.) OPAL may not get a response within the timeout period. In + such scenarios, OPAL resends the message until the max resend count is reached. + + OPAL uses synchronous IPMI messages (ipmi_queue_msg_sync()) for a few + operations like flash read, write, etc. The thread will wait in OPAL until + it gets a response from the BMC. In some corner cases like a BMC reboot, the thread + may wait in OPAL for a long time (more than 20 seconds), which results in a + kernel hard lockup. + + This patch introduces a new interface to disable the message resend option. We + will disable the message resend option for synchronous messages. This will + greatly reduce kernel hard lockup issues. + + This is a short term fix. The long term solution is to convert all synchronous + messages to asynchronous ones. +- ipmi/power: Fix system reboot issue + + The kernel makes a reboot/shutdown OPAL call for reboot/shutdown. Once the kernel + gets a response from OPAL it runs opal_poll_events() until firmware + handles the request. + + On BMC based systems, OPAL makes an IPMI call (IPMI_CHASSIS_CONTROL) to + initiate system reboot/shutdown. At present OPAL queues IPMI messages + and returns success to the host. If the BMC is not ready to accept the command (like + during a BMC reboot), then these messages will fail. We have to manually + reboot/shutdown the system using the BMC interface. + + This patch adds logic to validate the message return value. 
If the message failed, + it is resent. At some stage the BMC will be ready to accept + the message and handle it. +- firmware-versions: Add test case for parsing VERSION + + Also make it possible to use with afl-lop/afl-fuzz just to help make + *sure* we're all good. + + Additionally, if we hit an entry in VERSION that is larger than our + buffer size, we skip over it gracefully rather than overwriting the + stack. This is only a problem if VERSION isn't trusted, which as of + 4b8cc05a94513816d43fb8bd6178896b430af08f it is verified as part of + Secure Boot. +- core/fast-reboot: improve NMI handling during fast reset + + Improve sreset and MCE handling in fast reboot. Switch the HILE bit + off before copying OPAL's exception vectors, so NMIs can be handled + properly. Also disable MSR[ME] while the vectors are being overwritten. +- core/cpu: HID update race + + If the per-core HID register is updated concurrently by multiple + threads, updates can get lost. This has been observed during fast + reboot where the HILE bit does not get cleared on all cores, which + can cause machine check exception interrupts to crash. + + Fix this by only updating HID on thread0. +- SLW: Print verbose info on errors only + + Change print level from debug to warning for reporting + bad EC_PPM_SPECIAL_WKUP_* scom values. To reduce clutter + in the log, print only on error. + +Since v6.3-rc2: + +- hw/xscom: add missing P9P chip name +- asm/head: balance branches to avoid link stack predictor mispredicts + + The Linux wrapper for OPAL call and return is arranged like this: :: + + __opal_call: + mflr r0 + std r0,PPC_STK_LROFF(r1) + LOAD_REG_ADDR(r11, opal_return) + mtlr r11 + hrfid -> OPAL + + opal_return: + ld r0,PPC_STK_LROFF(r1) + mtlr r0 + blr + + When skiboot returns to Linux, it branches to LR (i.e., opal_return) + with a blr. This unbalances the link stack predictor and will cause + mispredicts back up the return stack. 
+- external/mambo: also invoke readline for the non-autorun case +- asm/head.S: set POWER9 radix HID bit at entry + + When running in virtual memory mode, the radix MMU hid bit should not + be changed, so set this in the initial boot SPR setup. + + As a side effect, fast reboot also has HID0:RADIX bit set by the + shared spr init, so no need for an explicit call. +- build: link with --orphan-handling=warn + + The linker can warn when the linker script does not explicitly place + all sections. These orphan sections are placed according to + heuristics, which may not always be desirable. Enable this warning. +- build: -fno-asynchronous-unwind-tables + + skiboot does not use unwind tables; this option saves about 100kB, + mostly from .text. +- opal/hmi: Initialize the hmi event with old value of TFMR. + + Do this before we fix TFAC errors. Otherwise the event at the host console + shows no thread error reported in the TFMR register. + + Without this patch the console event shows TFMR with no thread error: + (DEC parity error TFMR[59] injection) :: + + [ 53.737572] Severe Hypervisor Maintenance interrupt [Recovered] + [ 53.737596] Error detail: Timer facility experienced an error + [ 53.737611] HMER: 0840000000000000 + [ 53.737621] TFMR: 3212000870e04000 + + After this patch it shows the old TFMR value on the host console: :: + + [ 2302.267271] Severe Hypervisor Maintenance interrupt [Recovered] + [ 2302.267305] Error detail: Timer facility experienced an error + [ 2302.267320] HMER: 0840000000000000 + [ 2302.267330] TFMR: 3212000870e14010 + + +IBM FSP based platforms +----------------------- + +- platforms/firenze: Rework I2C controller fixups +- platforms/zz: Re-enable LXVPD slot information parsing + + From memory this was disabled in the distant past since we were waiting + for an update to the LXVPD format. It looks like that never happened + so re-enable it for the ZZ platform so that we can get PCI slot location + codes on ZZ. 
+ +HIOMAP +------ +- astbmc: Try IPMI HIOMAP for P8 + + The HIOMAP protocol was developed after the release of P8 in preparation + for P9. As a consequence P9 always uses it, but it has rarely been + enabled for P8. P8DTU has recently added IPMI HIOMAP support to its BMC + firmware, so enable its use in skiboot with P8 machines. Doing so + requires some rework to ensure fallback works correctly as in the past + the fallback was to mbox, which will only work for P9. +- libflash/ipmi-hiomap: Enforce message size for empty response + + The protocol defines the response to the associated messages as empty + except for the command ID and sequence fields. If the BMC is returning + extra data, consider the message malformed. +- libflash/ipmi-hiomap: Remove unused close handling + + Issuing a HIOMAP_C_CLOSE is not required by the protocol specification, + rather a close can be implicit in a subsequent + CREATE_{READ,WRITE}_WINDOW request. The implicit close provides an + opportunity to reduce LPC traffic and the implementation takes up that + optimisation, so remove the case from the IPMI callback handler. +- libflash/ipmi-hiomap: Overhaul event handling + + Reworking the event handling was inspired by a bug report by Vasant + where the host would get wedged on multiple flash access attempts in the + face of a persistent error state on the BMC-side. The cause of this bug + was the early-exit based on ctx->update, which erroneously assumed that + all events had been completely handled in prior calls to + ipmi_hiomap_handle_events(). This is not true if e.g. + HIOMAP_E_DAEMON_READY is clear in the prior calls. + + Regardless, there were other correctness and efficiency problems with + the handling strategy: + + * Ack-able event state was not restored in the face of errors in the + process of re-establishing protocol state + * It forced needless window restoration with respect to the context in + which ipmi_hiomap_handle_events() was called. 
+ * Tests for HIOMAP_E_DAEMON_READY and HIOMAP_E_FLASH_LOST were redundant + with the overhauled error handling introduced in the previous patch + + Fix all of the above issues and add comments to explain the event + handling flow. +- libflash/ipmi-hiomap: Overhaul error handling + + The aim is to improve the robustness with respect to absence of the + BMC-side daemon. The current error handling roughly mirrors what was + done for the mailbox implementation, but there's room for improvement. + + Errors are split into two classes, those that affect the transport state + and those that affect the window validity. From here, we push the + transport state error checks right to the bottom of the stack, to ensure + the link is known to be in a good state before any message is sent. + Window validity tests remain as they were in the hiomap_window_move() + and ipmi_hiomap_read() functions. Validity tests are not necessary in + the write and erase paths as we will receive an error response from the + BMC when performing a dirty or flush on an invalid window. + + Recovery also remains as it was, done on entry to the blocklevel + callbacks. If an error state is encountered in the middle of an + operation no attempt is made to recover it on the spot; instead the + error is returned up the stack and the caller can choose how it wishes + to respond. +- libflash/ipmi-hiomap: Fix leak of msg in callback + +Since v6.3-rc1: + +- libflash/ipmi-hiomap: Fix blocks count issue + + We convert the data size to a block count and pass the block count to the BMC. + If the data size is not block aligned then we end up sending a block count + that covers less than the actual data. The BMC will then write only part of the data to flash memory. 
+ + Sample log :: + + [ 594.388458416,7] HIOMAP: Marked flash dirty at 0x42010 for 8 + [ 594.398756487,7] HIOMAP: Flushed writes + [ 594.409596439,7] HIOMAP: Marked flash dirty at 0x42018 for 3970 + [ 594.419897507,7] HIOMAP: Flushed writes + + In this case HIOMAP sent data with block count=0 and hence the BMC didn't + flush data to flash. + + + +POWER8 +------ +- hw/phb3/naples: Disable D-states + + Putting "Mellanox Technologies MT27700 Family [ConnectX-4] [15b3:1013]" + (more precisely, the second of its 2 PCI functions, no matter in what + order) into the D3 state causes EEH with the "PCT timeout" error. + This has been noticed on garrison machines only and firestones do not + seem to have this issue. + + This disables D-state changes for devices on root buses on Naples by + installing a config space access filter (copied from PHB4). +- cpufeatures: Always advertise POWER8NVL as DD2 + + Despite the major version of PVR being 1 (0x004c0100) for POWER8NVL, + these chips are functionally equivalent to P8/P8E DD2 levels. + + This advertises POWER8NVL as DD2. As a result, skiboot adds + ibm,powerpc-cpu-features/processor-control-facility for such CPUs and + the Linux kernel can use hypervisor doorbell messages to wake secondary + threads; otherwise "KVM: CPU %d seems to be stuck" would appear because + of missing LPCR_PECEDH. + +p8dtu Platform +^^^^^^^^^^^^^^ +- p8dtu: Configure BMC graphics + + We can no longer read the values from the BMC in the way we have in the + past. Values were provided by Eric Chen of SMC. +- p8dtu: Enable HIOMAP support + +Vesnin Platform +^^^^^^^^^^^^^^^ +- platforms/vesnin: Disable PCIe port bifurcation + + PCIe ports connected to CPU1 and CPU3 now work as x16 instead of x8x8. + +- Fix hang in pnv_platform_error_reboot path due to TOD failure. + + On TOD failure, with the TB stuck, when Linux heads down the + pnv_platform_error_reboot() path due to an unrecoverable HMI event, the panic + cpu gets stuck in OPAL inside ipmi_queue_msg_sync(). 
At this time, rest + all other cpus are in smp_handle_nmi_ipi() waiting for panic cpu to proceed. + But with panic cpu stuck inside OPAL, linux never recovers/reboot. :: + + p0 c1 t0 + NIA : 0x000000003001dd3c <.time_wait+0x64> + CFAR : 0x000000003001dce4 <.time_wait+0xc> + MSR : 0x9000000002803002 + LR : 0x000000003002ecf8 <.ipmi_queue_msg_sync+0xec> + + STACK: SP NIA + 0x0000000031c236e0 0x0000000031c23760 (big-endian) + 0x0000000031c23760 0x000000003002ecf8 <.ipmi_queue_msg_sync+0xec> + 0x0000000031c237f0 0x00000000300aa5f8 <.hiomap_queue_msg_sync+0x7c> + 0x0000000031c23880 0x00000000300aaadc <.hiomap_window_move+0x150> + 0x0000000031c23950 0x00000000300ab1d8 <.ipmi_hiomap_write+0xcc> + 0x0000000031c23a90 0x00000000300a7b18 <.blocklevel_raw_write+0xbc> + 0x0000000031c23b30 0x00000000300a7c34 <.blocklevel_write+0xfc> + 0x0000000031c23bf0 0x0000000030030be0 <.flash_nvram_write+0xd4> + 0x0000000031c23c90 0x000000003002c128 <.opal_write_nvram+0xd0> + 0x0000000031c23d20 0x00000000300051e4 <opal_entry+0x134> + 0xc000001fea6e7870 0xc0000000000a9060 <opal_nvram_write+0x80> + 0xc000001fea6e78c0 0xc000000000030b84 <nvram_write_os_partition+0x94> + 0xc000001fea6e7960 0xc0000000000310b0 <nvram_pstore_write+0xb0> + 0xc000001fea6e7990 0xc0000000004792d4 <pstore_dump+0x1d4> + 0xc000001fea6e7ad0 0xc00000000018a570 <kmsg_dump+0x140> + 0xc000001fea6e7b40 0xc000000000028e5c <panic_flush_kmsg_end+0x2c> + 0xc000001fea6e7b60 0xc0000000000a7168 <pnv_platform_error_reboot+0x68> + 0xc000001fea6e7bd0 0xc0000000000ac9b8 <hmi_event_handler+0x1d8> + 0xc000001fea6e7c80 0xc00000000012d6c8 <process_one_work+0x1b8> + 0xc000001fea6e7d20 0xc00000000012da28 <worker_thread+0x88> + 0xc000001fea6e7db0 0xc0000000001366f4 <kthread+0x164> + 0xc000001fea6e7e20 0xc00000000000b65c <ret_from_kernel_thread+0x5c> + + This is because, there is a while loop towards the end of + ipmi_queue_msg_sync() which keeps looping until "sync_msg" does not match + with "msg". 
It loops over time_wait_ms() until the exit condition is met. In the + normal scenario time_wait_ms() runs the pollers so that the ipmi backend gets + a chance to check the ipmi response and set sync_msg to NULL. :: + + while (sync_msg == msg) + time_wait_ms(10); + + But when the TB is in a failed state, time_wait_ms()->time_wait_poll() + returns immediately without calling pollers and hence we end up looping + forever. This patch fixes this hang by calling opal_run_pollers() in the TB + failed state as well. + + +.. _skiboot-6.3-power9: + +POWER9 +------ + +- Retry link training at PCIe GEN1 if presence detected but training repeatedly failed + + Certain older PCIe 1.0 devices will not train unless the training process starts at GEN1 speeds. + As a last resort when a device will not train, fall back to GEN1 speed for the last training attempt. + + This is verified to fix devices based on the Conexant CX23888 on the Talos II platform. +- hw/phb4: Drop FRESET_DEASSERT_DELAY state + + The delay between the ASSERT_DELAY and DEASSERT_DELAY states is set to + one timebase tick. This state seems to have been a holdover from PHB3 + where it was used to add a 1s delay between de-asserting PERST and + polling the link for the CAPI FPGA. There's no requirement for that here + since the link polling on PHB4 is a bit smarter so we should be fine. +- hw/phb4: Factor out PERST control + + Some time ago Mikey added some code to work around a bug we found where a + certain RAID card wouldn't come back again after a fast-reboot. The + workaround is to set the Link Disable bit before asserting PERST and + clear it after de-asserting PERST. + + Currently we do this in the FRESET path, but not in the CRESET path. + This patch moves the PERST control into its own function to reduce + duplication and so that the workaround is applied in all circumstances. +- hw/phb4: Remove FRESET presence check + + When we do an freset the first step is to check if a card is present in + the slot. 
However, this only occurs when we enter phb4_freset() with the + slot state set to SLOT_NORMAL. This occurs in: + + a) The creset path, and + b) When the OS manually requests an FRESET via an OPAL call. + + (a) is problematic because in the boot path the generic code will put the + slot into FRESET_START manually before calling into phb4_freset(). This + can result in a situation where a device is detected on boot, but not + after a CRESET. + + I've noticed this occurring on systems where the PHB's slot presence + detect signal is not wired to an adapter. In this situation we can rely + on the in-band presence mechanism, but the presence check will make + us exit before that has a chance to work. + + Additionally, if we enter from the CRESET path this early exit leaves + the slot's PERST signal asserted. This isn't currently an issue, + but if we want to support hotplug of devices into the root port it will + be. +- hw/phb4: Skip FRESET PERST when coming from CRESET + + PERST is asserted at the beginning of the CRESET process to prevent + the downstream device from interacting with the host while the PHB logic + is being reset and re-initialised. There is at least a 100ms wait during + the CRESET processing so it's not necessary to wait this time again + in the FRESET handler. + + This patch extends the delay after resetting the PHB logic + to the 250ms PERST wait period that we typically use and sets the + skip_perst flag so that we don't wait this time again in the FRESET + handler. +- hw/phb4: Look for the hub-id in the PBCQ node + + The hub-id is stored in the PBCQ node rather than the stack node so we + never add it to the PHB node. This breaks the lxvpd slot lookup code + since the hub-id is encoded in the VPD record that we need to find the + slot information. +- hdata/iohub: Look for IOVPD on P9 + + P8 and P9 use the same IO VPD setup, so we need to load the IOHUB VPD on + P9 systems too. 
+ +Since v6.3-rc2: + +- hw/phb4: Squash the IO bridge window + + The PCI-PCI bridge spec says that bridges that implement an IO window + should hardcode the IO base and limit registers to zero. + Unfortunately, these registers only define the upper bits of the IO + window and the low bits are assumed to be 0 for the base and 1 for the + limit address. As a result, setting both to zero can be misinterpreted + as a 4K IO window. + + This patch fixes the problem the same way PHB3 does. It sets the IO base + and limit values to 0xf000 and 0x1000 respectively which most software + interprets as a disabled window. + + lspci before patch: :: + + 0000:00:00.0 PCI bridge: IBM Device 04c1 (prog-if 00 [Normal decode]) + I/O behind bridge: 00000000-00000fff + + lspci after patch: :: + + 0000:00:00.0 PCI bridge: IBM Device 04c1 (prog-if 00 [Normal decode]) + I/O behind bridge: None + +- hw/xscom: Enable sw xstop by default on p9 + + This was disabled at some point during bringup to make life easier for + the lab folks trying to debug NVLink issues. This hack really should + have never made it out into the wild though, so we now have the + following situation occurring in the field: + + 1) Something bad happens + 2) The host kernel receives an unrecoverable HMI and calls into OPAL to + request a platform reboot. + 3) OPAL rejects the reboot attempt and returns to the kernel with + OPAL_PARAMETER. + 4) Kernel panics and attempts to kexec into a kdump kernel. + + A side effect of the HMI seems to be CPUs becoming stuck which results + in the initialisation of the kdump kernel taking an extremely long time + (6+ hours). It's also been observed that after performing a dump the + kdump kernel then crashes itself because OPAL has ended up in a bad + state as a side effect of the HMI. + + All up, it's not very good so re-enable the software checkstop by + default. If people still want to turn it off they can using the nvram + override. 
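The IO bridge window encoding described in the hw/phb4 entry above can be sketched in C. This is a minimal illustration, not skiboot code: it assumes the standard PCI-to-PCI bridge layout in which the upper nibble of the 8-bit I/O Base and I/O Limit registers holds address bits 15:12, it ignores the optional upper 16-bit registers, and it treats "base greater than limit" as a disabled window, which is the interpretation the entry attributes to most software. The struct and function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical decoded view of a bridge IO window (16-bit decode only). */
struct io_window {
	uint32_t base;   /* first address forwarded downstream */
	uint32_t limit;  /* last address forwarded downstream */
	bool enabled;    /* false when base > limit */
};

/*
 * Decode the 8-bit I/O Base and I/O Limit registers. Only address bits
 * 15:12 are stored (in the upper nibble); the low 12 bits are implied to
 * be all zeros for the base and all ones for the limit.
 */
static inline struct io_window decode_io_window(uint8_t io_base, uint8_t io_limit)
{
	struct io_window w;

	w.base = (uint32_t)(io_base & 0xF0) << 8;
	w.limit = ((uint32_t)(io_limit & 0xF0) << 8) | 0xFFF;
	w.enabled = w.base <= w.limit;
	return w;
}

/* Illustrative checks for the two register settings discussed above. */
static inline void io_window_example(void)
{
	/* Both registers zero: decodes as the 4K window 0x0000-0x0fff. */
	assert(decode_io_window(0x00, 0x00).enabled);
	/* Base 0xf000, limit 0x1000: base is above limit, so no window. */
	assert(!decode_io_window(0xF0, 0x10).enabled);
}
```

With both registers at zero the decode gives 0x0000-0x0fff, matching the "before" lspci output; with the base above the limit there is no valid range, matching "I/O behind bridge: None".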
+ + +CAPI2 +^^^^^ +- capp/phb4: Prevent HMI from getting triggered when disabling CAPP + + While disabling CAPP an HMI gets triggered as soon as the ETU is put in + reset mode. This happens because, before we can disable CAPP, it detects + the PHB link going down and triggers an HMI requesting OPAL to perform + CAPP recovery. This has the unintended side effect of spamming the + OPAL logs with malfunction alert messages and may also confuse the + user. + + To prevent this we mask the CAPP FIR error 'PHB Link Down' Bit(31) + when we are disabling CAPP just before we put the ETU in reset in + phb4_creset(). Since bringing down the PHB link now won't + trigger an HMI and CAPP recovery, we manually set the + PHB4_CAPP_RECOVERY flag on the phb to force recovery during creset. + +- phb4/capp: Implement sequence to disable CAPP and enable fast-reset + + We implement the h/w sequence to disable CAPP in disable_capi_mode() and + with it also enable fast-reset for CAPI mode in phb4_set_capi_mode(). + + The sequence to disable CAPP is executed in three phases. The first two + phases are implemented in disable_capi_mode() where we reset the CAPP + registers followed by the PEC registers to their init values. The + third and final phase is to reset the PHB CAPI Compare/Mask Register and + is done in phb4_init_ioda3(). The reason to move the PHB reset to + phb4_init_ioda3() is that by the time the OPAL PCI reset state machine + reaches this function the PHB is already un-fenced and its + configuration registers are accessible via mmio. +- capp/phb4: Force CAPP to PCIe mode during kernel shutdown + + This patch introduces a new opal syncer for PHB4 named + phb4_host_sync_reset(). We register this opal syncer when CAPP is + activated successfully in phb4_set_capi_mode() so that it will be + called at kernel shutdown during fast-reset. + + During kernel shutdown the function will then repeatedly call + phb->ops->set_capi_mode() to switch CAPP to PCIe mode. 
In case + set_capi_mode() returns OPAL_BUSY, which indicates that CAPP is + still transitioning to a new state, it calls slot->ops.run_sm() to + ensure that the Opal slot reset state machine makes forward progress. + + +Witherspoon Platform +^^^^^^^^^^^^^^^^^^^^ +- platforms/witherspoon: Make PCIe shared slot error message more informative + + If we're missing chips for some reason, we print a warning when configuring + the PCIe shared slot. + + The warning doesn't really make it clear what "shared slot" is, and if it's + printed, it'll come right after a bunch of messages about NPU setup, so + let's clarify the message to explicitly mention PCI. +- witherspoon: Add nvlink2 interconnect information + + See :ref:`skiboot-6.3-new-features` for details. + +Zaius Platform +^^^^^^^^^^^^^^ + +- zaius: Add BMC description + + Frederic reported that Zaius was failing with a NULL dereference when + trying to initialise IPMI HIOMAP. It turns out that the BMC wasn't + described at all, so add a description. + +p9dsu platform +^^^^^^^^^^^^^^ +- p9dsu: Fix p9dsu default variant + + Add the default when no riser_id is returned from the ipmi query. + + Allow a little more time for BMC reply and clean up some label strings. + + +PCIe +---- + +See :ref:`skiboot-6.3-power9` for POWER9 specific PCIe changes. + +- core/pcie-slot: Don't bail early in the power on case + + Exiting early in the power off case makes sense since we can't disable + slot power (or assert PERST) for surprise hotplug slots. However, we + should not exit early in the power-on case since it's possible slot + power may have been disabled (or just not enabled at boot time). +- firenze-pci: Always init slot info from LXVPD + + We can get slot information from the LXVPD without having power control + information about that slot. This patch changes the init path so that + we always override the add_properties() call rather than only when we + have power control information about the slot. 
+- fsp/lxvpd: Print more LXVPD slot information + + Useful to know since it changes the behaviour of the slot core. +- core/pcie-slot: Set power state from the PWRCTL flag + + For some reason we look at the power control indicator and use that to + determine if the slot is "off" rather than the power control flag that + is used to power down the slot. + + While we're here change the default behaviour so that the slot is + assumed to be powered on if there's no slot capability, or if there's + no power control available. +- core/pci: Increase the max slot string size + + The maximum string length for the slot label / device location code in + the PCI summary is currently 32 characters. This results in some IBM + location codes being truncated due to their length, e.g. :: + + PHB#0001:02:11.0 [SWDN] SLOT=C11 x8 + PHB#0001:13:00.0 [EP ] *snip* LOC_CODE=U78D3.ND1.WZS004A-P1-C + PHB#0001:13:00.1 [EP ] *snip* LOC_CODE=U78D3.ND1.WZS004A-P1-C + PHB#0001:13:00.2 [EP ] *snip* LOC_CODE=U78D3.ND1.WZS004A-P1-C + PHB#0001:13:00.3 [EP ] *snip* LOC_CODE=U78D3.ND1.WZS004A-P1-C + + This truncation obscures the actual location of the card, and it looks bad. This + patch increases the maximum length of the label string to 80 characters + since that's the maximum length for a location code. + + +Since v6.3-rc3: + +- pci: Try harder to add meaningful ibm,loc-code + + We keep the existing logic of looking to the parent for the slot-label or + slot-location-code, but we add logic so that (if all that fails) we look + directly for the slot-location-code (as this should give us the correct + loc code for things directly under the PHB), and otherwise we just look + for a loc-code. + + The applicable bit of PAPR here is: + + R1–12.1–1. Each instance of a hardware entity (FRU) has a platform + unique location code and any node in the OF + device tree that describes a part of a hardware entity must include the + “ibm,loc-code” property with a + value that represents the location code for that hardware entity. 
+ + which we weren't really fully obeying at any recent (ever?) point in + time. Now we should do okay, at least for PCI. + +Since v6.3-rc2: +- core/pci: Use PHB io-base-location by default for PHB slots + + On witherspoon only the GPU slots and the three pluggable PCI slots + (SLOT0, 1, 2) have platform defined slot names. For builtin devices such + as the SATA controller or the PLX switch that fans out to the GPU slots + we have no location codes, which some people consider an issue. + + This patch addresses the problem by making the ibm,slot-location-code for + the root port device default to the ibm,io-base-location-code which is + typically the location code for the system itself. + + e.g. :: + + pciex@600c3c0100000/ibm,loc-code + "UOPWR.0000000-Node0-Proc0" + + pciex@600c3c0100000/pci@0/ibm,loc-code + "UOPWR.0000000-Node0-Proc0" + + pciex@600c3c0100000/pci@0/usb-xhci@0/ibm,loc-code + "UOPWR.0000000-Node0" + + The PHB node, and the root complex nodes have a loc code of the + processor they are attached to, while the usb-xhci device under the + root port has a location code of the system itself. + +- hw/phb4: Read ibm,loc-code from PBCQ node + + On P9 the PBCQs are subdivided by stacks which implement the PCI Express + logic. When phb4 was forked from phb3 most of the properties that were + in the pbcq node moved into the stack node, but ibm,loc-code was not one + of them. This patch fixes the phb4 init sequence to read the base + location code from the PBCQ node (parent of the stack node) rather than + the stack node itself. + + +.. _skiboot-6.3-OpenCAPI: + +OpenCAPI +-------- +- npu2/hw-procedures: Fix parallel zcal for opencapi + + For opencapi, we currently do impedance calibration when initializing + the PHY for the device, which could run in parallel if we have + multiple opencapi devices. 
But if 2 devices are on the same + obus, the 2 calibration sequences could overlap, which likely yields + bad results and is useless anyway since it only needs to be done once + per obus. + + This patch splits the opencapi PHY reset in 2 parts: + + - an 'init' part called serially at boot. That's when zcal is done. If + we have 2 devices on the same socket, the zcal won't be redone, + since we're called serially and we'll see it has already been done for + the obus + - a 'reset' part called during fundamental reset as a prereq for link + training. It does the PHY setup for a set of lanes and the dccal. + + The PHY team confirmed there's no dependency between zcal and the + other reset steps and it can be moved earlier. +- npu2-hw-procedures: Fix zcal in mixed opencapi and nvlink mode + + The zcal procedure needs to be run once per obus. We keep track of + which obus is already calibrated in an array indexed by the obus + number. However, the obus number is inferred from the brick index, + which works well for nvlink but not for opencapi. + + Create an obus_index() function, which, from a device, returns the + correct obus index, irrespective of the device type. +- npu2-opencapi: Fix adapter reset when using 2 adapters + + If two opencapi adapters are on the same obus, we may try to train the + two links in parallel at boot time, when all the PCI links are being + trained. Both links use the same i2c controller to handle the reset + signal, so some care is needed to make sure resetting one doesn't + interfere with the reset of the other. We need to keep track of the + current state of the i2c controller (and use locking). + + This went mostly unnoticed as you need to have 2 opencapi cards on the + same socket and links tended to train anyway because of the retries. +- npu2-opencapi: Extend delay after releasing reset on adapter + + Give more time to the FPGA to process the reset signal. The previous + delay, 5ms, is too short for newer adapters with bigger FPGAs. 
Extend + it to 250ms. + Ultimately, that delay will likely end up being added to the opencapi + specification, but we are not there yet. +- npu2-opencapi: ODL should be in reset when enabled + + We haven't hit any problem so far, but from the ODL designer, the ODL + should be in reset when it is enabled. + + The ODL remains in reset until we start a fundamental reset to + initiate link training. We still assert and deassert the ODL reset + signal as part of the normal procedure just before training the + link. Asserting is therefore useless at boot, since the ODL is already + in reset, but we keep it as it's only a scom write and it's needed + when we reset/retrain from the OS. +- npu2-opencapi: Keep ODL and adapter in reset at the same time + + Split the function to assert and deassert the reset signal on the ODL, + so that we can keep the ODL in reset while we reset the adapter, + therefore having a window where both sides are in reset. + + It is actually not required with our current DLx at boot time, but I + need to split the ODL reset function for the following patch and it + will become useful/required later when we introduce resetting an + opencapi link from the OS. +- npu2-opencapi: Setup perf counters to detect CRC errors + + It's possible to set up performance counters for the PLL to detect + various conditions for the links in nvlink or opencapi mode. Since + those counters are currently unused, let's configure them when an obus + is in opencapi mode to detect CRC errors on the link. Each link has + two counters: + - CRC error detected by the host + - CRC error detected by the DLx (NAK received by the host) + + We also dump the counters shortly after the link trains, but they can + be read multiple times through cronus, pdbg or linux. The counters are + configured to be reset after each read. + +Since v6.3-rc1: + +- opal/hmi: Never trust a cow! 
+ + With opencapi, it's fairly common to trigger HMIs during AFU + development on the FPGA, by not replying in time to an NPU command, + for example. So shift the blame reported by that cow to avoid crowding + my mailbox. +- hw/npu2: Dump (more) npu2 registers on link error and HMIs + + We were already logging some NPU registers during an HMI. This patch + cleans up a bit how it is done and separates what is global from what + is specific to nvlink or opencapi. + + Since we can now receive an error interrupt when an opencapi link goes + down unexpectedly, we also dump the NPU state but we limit it to the + registers of the brick which hit the error. + + The list of registers to dump was worked out with the hw team to + allow for proper debugging. For each register, we print the name as + found in the NPU workbook, the scom address and the register value. +- hw/npu2: Report errors to the OS if an OpenCAPI brick is fenced + + Now that the NPU may report interrupts due to the link going down + unexpectedly, report those errors to the OS when queried by the + 'next_error' PHB callback. + + The hardware doesn't support recovery of the link when it goes down + unexpectedly. So we report the PHB as dead, so that the OS can log the + proper message, notify the drivers and take the devices down. +- hw/npu2: Fix OpenCAPI PE assignment + + When we support mixing NVLink and OpenCAPI devices on the same NPU, we're + going to have to share the same range of 16 PE numbers between NVLink and + OpenCAPI PHBs. + + For OpenCAPI devices, PE assignment is only significant for determining + which System Interrupt Log register is used for a particular brick - unlike + NVLink, it doesn't play any role in determining how links are fenced. + + Split the PE range into a lower half which is used for NVLink, and an upper + half that is used for OpenCAPI, with a fixed PE number assigned per brick. 
+ + As the PE assignment for OpenCAPI devices is fixed, set the PE once + during device init and then ignore calls to the set_pe() operation. + +- opal-api: Reserve 2 OPAL API calls for future OpenCAPI LPC use + + OpenCAPI Lowest Point of Coherency (LPC) memory is going to require + some extra OPAL calls to set up NPU BARs. These calls will most likely be + called OPAL_NPU_LPC_ALLOC and OPAL_NPU_LPC_RELEASE, but we're not quite ready + to upstream that code yet. + + + +NVLINK2 +------- +- npu2: Allow ATSD for LPAR other than 0 + + Each XTS MMIO ATSD# register is accompanied by another register - + XTS MMIO ATSD0 LPARID# - which controls LPID filtering for ATSD + transactions. + + When a host system passes a GPU through to a guest, we need to enable + some ATSD for an LPAR. At the moment the host assigns one ATSD to + an NVLink bridge and this maps it to an LPAR when a GPU is assigned to + the LPAR. The link number is used for an ATSD index. + + ATSD6&7 stay mapped to the host (LPAR=0) all the time, which seems to be + an acceptable price for the simplicity. +- npu2: Add XTS_BDF_MAP wildcard refcount + + Currently the PID wildcard is programmed into the NPU once and never cleared + up. This works for the bare metal as MSR does not change while the host + OS is running. + + However with the device virtualization, we need to keep track of wildcard + entry use and clear the entries up before switching a GPU from a host to + a guest or vice versa. + + This adds a refcount to an NPU2, one counter per wildcard entry. The index + is a short lparid (4 bits long) which is allocated in opal_npu_map_lpar() + and should be smaller than NPU2_XTS_BDF_MAP_SIZE (defined as 16). 
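The per-entry reference counting described in the XTS_BDF_MAP entry above could look roughly like the sketch below. It is not the skiboot implementation: `struct wildcard_refs`, `wildcard_get()` and `wildcard_put()` are hypothetical names, and `XTS_BDF_MAP_SIZE` simply mirrors the value of 16 quoted for NPU2_XTS_BDF_MAP_SIZE. The idea is that the first user of an entry programs the wildcard and the last user clears it.

```c
#include <stdbool.h>

/* Mirrors the size quoted in the text for NPU2_XTS_BDF_MAP_SIZE. */
#define XTS_BDF_MAP_SIZE 16

/* Hypothetical container: one counter per wildcard entry. */
struct wildcard_refs {
	unsigned int count[XTS_BDF_MAP_SIZE];
};

/*
 * Take a reference on the entry for a short lparid.
 * Returns true when the caller is the first user and therefore
 * has to program the wildcard entry into the hardware.
 */
static bool wildcard_get(struct wildcard_refs *r, unsigned int lparid)
{
	if (lparid >= XTS_BDF_MAP_SIZE)
		return false;
	return r->count[lparid]++ == 0;
}

/*
 * Drop a reference on the entry for a short lparid.
 * Returns true when the last user is gone and the caller
 * has to clear the wildcard entry.
 */
static bool wildcard_put(struct wildcard_refs *r, unsigned int lparid)
{
	if (lparid >= XTS_BDF_MAP_SIZE || r->count[lparid] == 0)
		return false;
	return --r->count[lparid] == 0;
}
```

In this sketch an out-of-range lparid or an unbalanced put is ignored rather than treated as an error. That is a simplification made for the example, not a statement about how skiboot handles those cases.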
+ +Since v6.3-rc2: +- npu2: Disable Probe-to-Invalid-Return-Modified-or-Owned snarfing by default + + V100 GPUs are known to violate the NVLink2 protocol in some cases (one is when + memory was accessed by the CPU and then by the GPU using so-called block + linear mapping) and issue double probes to the NPU, which can cope with this + problem only if CONFIG_ENABLE_SNARF_CPM ("disable/enable Probe.I.MO + snarfing a cp_m") is not set in the CQ_SM Misc Config register #0. + If the bit is set (which is the case today), the NPU issues the machine + check stop. + + The snarfing feature is designed to detect 2 probes in flight and combine + them into one. + + This adds a new "opal-npu2-snarf-cpm" nvram variable which controls + CONFIG_ENABLE_SNARF_CPM for all NVLinks to prevent the machine check + stop from happening. + + This disables snarfing by default as otherwise a broken GPU driver can + crash the entire box even when a GPU is passed through to a guest. + This provides a dial to allow regression tests (might be useful for + a bare metal). To enable snarfing, the user needs to run: :: + + sudo nvram -p ibm,skiboot --update-config opal-npu2-snarf-cpm=enable + + and reboot the host system. 
+ +- hw/npu2: Show name of opencapi error interrupts + + +Debugging and simulation +------------------------ + +- external/mambo: Error out if kernel is too large + + If you're trying to boot a gigantic kernel in mambo (which you can + reproduce by building a kernel with CONFIG_MODULES=n) you'll get + misleading errors like: :: + + WARNING: 0: (0): [0:0]: Invalid/unsupported instr 0x00000000[INVALID] + WARNING: 0: (0): PC(EA): 0x0000000030000010 PC(RA):0x0000000030000010 MSR: 0x9000000000000000 LR: 0x0000000000000000 + WARNING: 0: (0): numInstructions = 0 + WARNING: 1: (1): [0:0]: Invalid/unsupported instr 0x00000000[INVALID] + WARNING: 1: (1): PC(EA): 0x0000000000000E40 PC(RA):0x0000000000000E40 MSR: 0x9000000000000000 LR: 0x0000000000000000 + WARNING: 1: (1): numInstructions = 1 + WARNING: 1: (1): Interrupt to 0x0000000000000E40 from 0x0000000000000E40 + INFO: 1: (2): ** Execution stopped: Continuous Interrupt, Instruction caused exception, ** + + So add an error to skiboot.tcl to warn the user before this happens. + Making PAYLOAD_ADDR further back is one way to do this but if there's a + less gross way to generally work around this very niche problem, I can + suggest that instead. +- external/mambo: Populate kernel-base-address in the DT + + skiboot.tcl defines PAYLOAD_ADDR as 0x20000000, which is the default in + skiboot. This is also the default in skiboot unless kernel-base-address + is set in the device tree. + + If you change PAYLOAD_ADDR to something else for mambo, skiboot won't + see it because it doesn't set that DT property, so fix it so that it does. +- external/mambo: allow CPU targeting for most debug utils + + Debug util functions target CPU 0:0:0 by default. Some can be + overridden explicitly per invocation, and others can't at all. + Even for those that can be overridden, it is a pain to type + them out when you're debugging a particular thread. + + Provide a new 'target' function that allows the default CPU + target to be changed. 
Wire up that default to all other utils. + Provide a new 'S' step command which only steps the target CPU. +- qemu: bt device isn't always hanging off / + + Just use the normal for_each_compatible instead. + + Otherwise in the qemu model as executed by op-test, + we wouldn't go down the astbmc_init() path, thus not having flash. +- devicetree: Add p9-simics.dts + + Add a p9-based devicetree that's suitable for use with Simics. +- devicetree: Move power9-phb4.dts + + Clean up the formatting of power9-phb4.dts and move it to + external/devicetree/p9.dts. This sets us up to include it as the basis + for other trees. +- devicetree: Add nx node to power9-phb4.dts + + A (non-qemu) p9 without an nx node will assert in p9_darn_init(): :: + + dt_for_each_compatible(dt_root, nx, "ibm,power9-nx") + break; + if (!nx) { + if (!dt_node_is_compatible(dt_root, "qemu,powernv")) + assert(nx); + return; + } + + Since NX is this essential, add it to the device tree. +- devicetree: Fix typo in power9-phb4.dts + + Change "impi" to "ipmi". +- devicetree: Fix syntax error in power9-phb4.dts + + Remove the extra space causing this: :: + + Error: power9-phb4.dts:156.15-16 syntax error + FATAL ERROR: Unable to parse input tree +- core/init: enable machine check on secondaries + + Secondary CPUs currently run with MSR[ME]=0 during boot, which means + if they take a machine check, the system will checkstop. + + Enable ME where possible and allow them to print registers. + +Utilities +--------- +- pflash: Don't try update RO ToC + + In the future it's likely the ToC will be marked as read-only. Don't + error out by assuming it's writable. +- pflash: Support encoding/decoding ECC'd partitions + + With the new --ecc option, pflash can add/remove ECC when + reading/writing flash partitions protected by ECC. 
+ + This is *not* flawless with current PNORs out in the wild though, as + they do not typically fill the whole partition with valid ECC data, so + you have to know how big the valid ECC'd data is and specify the size + manually. Note that for some partitions this is practically impossible + without knowing the details of the content of the partition. + + A future patch is likely to introduce an option to "stop reading data + when ECC starts failing and assume everything is okay rather than error + out" to support reading the "valid" data from existing PNOR images. + +Since v6.3-rc2: + +- opal-prd: Fix memory leak in is-fsp-system check +- opal-prd: Check malloc return value diff --git a/roms/skiboot/doc/release-notes/skiboot-6.4-rc1.rst b/roms/skiboot/doc/release-notes/skiboot-6.4-rc1.rst new file mode 100644 index 000000000..910656f04 --- /dev/null +++ b/roms/skiboot/doc/release-notes/skiboot-6.4-rc1.rst @@ -0,0 +1,788 @@ +.. _skiboot-6.4-rc1: + +skiboot-6.4-rc1 +=============== + +skiboot v6.4-rc1 was released on Monday July 8th 2019. It is the first +release candidate of skiboot 6.4, which will become the new stable release +of skiboot following the 6.3 release, first released May 3rd 2019. + +Skiboot 6.4 will mark the basis for op-build v2.4. I expect this to be a +relatively short -rc cycle. + +skiboot v6.4-rc1 contains all bug fixes as of :ref:`skiboot-6.0.20`, +and :ref:`skiboot-6.3.2` (the currently maintained +stable releases). + +For how the skiboot stable releases work, see :ref:`stable-rules` for details. + +Over skiboot 6.3, we have the following changes: + +.. _skiboot-6.4-rc1-new-features: + +New features +------------ + +- platforms/nicole: Add new platform + + This is a new platform from YADRO: a storage controller for the + TATLIN server. It's based on the IBM Romulus reference design (POWER9). + +- platform/zz: Add new platform type + + We have a new platform type under ZZ. Let's add it. 
With this fix +- nvram: Flag dangerous NVRAM options + + Most nvram options used by skiboot are just for debug or testing for + regressions. They should never be used long term. + + We've hit a number of issues in testing and the field where nvram + options have been set "temporarily" but haven't been properly cleared + after, resulting in crashes or real bugs being masked. + + This patch marks most nvram options used by skiboot as dangerous and + prints a chicken to remind users of the problem. + +- hw/phb3: Add verbose EEH output + + Add support for the pci-eeh-verbose NVRAM flag on PHB3. We've had this + on PHB4 since forever and it has proven very useful when debugging EEH + issues. When testing changes to the Linux kernel's EEH implementation + it's fairly common for the kernel to crash before printing the EEH log + so it's helpful to have it in the OPAL log where it can be dumped from + XMON. + + Note that unlike PHB4 we do not enable verbose mode by default. The + nvram option must be used to explicitly enable it. + +- Experimental support for building without FSP code + + Now, with CONFIG_FSP=0/1 we have: + + - 1.6M/1.4M skiboot.lid + - 323K/375K skiboot.lid.xz + +- doc: travis-ci deploy docs! + + Documentation is now automatically deployed if you configure Travis CI + appropriately (we have done this for the open-power branch of skiboot) + +- Big OPAL API Documentation improvement + + A lot more OPAL API calls are now (at least somewhat) documented. +- opal/hmi: Report NPU2 checkstop reason + + The NPU2 is currently not passing any information to linux to explain + the cause of an HMI. NPU2 has three Fault Isolation Registers and over + 30 of those FIR bits are configured to raise an HMI by default. We + won't be able to fit all possible state in the 32-bit xstop_reason + field of the HMI event, but we can still try to encode up to 4 HMI + reasons. +- opal-msg: Enhance opal-get-msg API + + Linux uses :ref:`OPAL_GET_MSG` API to get OPAL messages. 
This interface + supports up to 8 params (64 bytes). We have a requirement to send bigger data to + Linux. This patch enhances OPAL to send bigger data to Linux. + + - Linux will use the "opal-msg-size" device tree property to allocate memory for + OPAL messages (previous patch increased "opal-msg-size" to 64K). + - Replaced the `reserved` field in "struct opal_msg" with `size`, so that the Linux + side opal_get_msg user can detect the actual data size. + - If buffer size < actual message size, then opal_get_msg will copy partial + data and return OPAL_PARTIAL to Linux. + - Add new variable "extended" to "opal_msg_entry" structure to keep track + of messages that have more than 64 bytes of data. We will allocate separate + memory for these messages and once the kernel consumes the message we will + release that memory. +- core/opal: Increase opal-msg-size size + + Kernel will use `opal-msg-size` property to allocate memory for opal_msg. + We want to send bigger data from OPAL to kernel. Hence increase + opal-msg-size to 64K. +- hw/npu2-opencapi: Add initial support for allocating OpenCAPI LPC memory + + Lowest Point of Coherency (LPC) memory allows the host to access memory on + an OpenCAPI device. + + Define 2 OPAL calls, :ref:`OPAL_NPU_MEM_ALLOC` and :ref:`OPAL_NPU_MEM_RELEASE`, for + assigning and clearing the memory BAR. (We try to avoid using the term + "LPC" to avoid confusion with Low Pin Count.) + + At present, we use a fixed location in the address space, which means we + are restricted to a single range of 4TB, on a single OpenCAPI device per + chip. In future, we'll use some chip ID extension magic to give us more + space, and some sort of allocator to assign ranges to more than one device. +- core/fast-reboot: Add im-feeling-lucky option + + Fast reboot gets disabled for a number of reasons e.g. the availability + of nvlink. However this doesn't actually affect the ability to perform fast + reboot if no nvlink device is actually present. 
+ + Add an nvram option for fast-reset where if it's set to + "im-feeling-lucky" then perform the fast-reboot irrespective of whether it has + previously been disabled. + +- platforms/astbmc: Check for SBE validation step + + On some POWER8 astbmc systems an update to the SBE requires pausing at + runtime to ensure integrity of the SBE. If this is required the BMC will + set a chassis boot option IPMI flag using the OEM parameter 0x62. If + Skiboot sees this flag is set it waits until the SBE update is complete + and the flag is cleared. + + Unfortunately the mystery operation that validates the SBE also leaves + it in a bad state and unable to be used for timer operations. To + work around this the flag is checked as soon as possible (ie. when IPMI + and the console are set up), and once complete the system is rebooted. +- Add P9 DIO interrupt support + + On P9 there are GPIO port 0, 1, 2 for GPIO interrupt, and DIO interrupt + is used to handle the interrupts. + + Add support for the DIO interrupts: + + 1. Add dio_interrupt_register(chip, port, callback) to register the + interrupt + 2. Add dio_interrupt_deregister(chip, port, callback) to deregister it + 3. When interrupt on the port occurs, callback is invoked, and the + interrupt status is cleared. + + +Removed features +---------------- + +- pci/iov: Remove skiboot VF tracking + + This feature was added a few years ago in response to a request to make + the MaxPayloadSize (MPS) field of a Virtual Function match the MPS of the + Physical Function that hosts it. + + The SR-IOV specification states that the MPS field of the VF is "ResvP". + This indicates the VF will use whatever MPS is configured on the PF and + that the field should be treated as a reserved field in the config space + of the VF. In other words, an SR-IOV spec compliant VF should always return + zero in the MPS field. Adding hacks in OPAL to make it non-zero is... + misguided at best. 
+ + Additionally, there is a bug in the way pci_device structures are handled + by VFs that results in a crash on fast-reboot that occurs if VFs are + enabled and then disabled prior to rebooting. This patch fixes the bug by + removing the code entirely. This patch has no impact on SR-IOV support on + the host operating system. +- Remove POWER7 and POWER7+ support + + It's been a good long while since either OPAL POWER7 user touched a + machine, and even longer since they'd have been okay using an old + version rather than tracking master. + + There's also been no testing of OPAL on POWER7 systems for an awfully + long time, so it's pretty safe to assume that it's very much bitrotted. + + It also saves a whole 14kb of xz compressed payload space. +- Remove remnants of :ref:`OPAL_PCI_GET_PHB_DIAG_DATA` + + Never present in a public OPAL release, and only kernels prior to 3.11 + would ever attempt to call it. +- Remove unused :ref:`OPAL_GET_XIVE_SOURCE` + + While this call was technically implemented by skiboot, no code has ever called + it, and it was only ever implemented for the p7ioc-phb back-end (i.e. POWER7). + Since this call was unused in Linux, and POWER7 with OPAL was only ever + available internally, it should be safe to remove the call. +- Remove unused :ref:`OPAL_PCI_GET_XIVE_REISSUE` and :ref:`OPAL_PCI_SET_XIVE_REISSUE` + + These seem to be remnants of one of the OPAL incarnations prior to + OPALv3. These calls have never been implemented in skiboot, and never + used by an upstream kernel (nor a PowerKVM kernel). + + It's rather safe to just document them as never existing. +- Remove never implemented :ref:`OPAL_PCI_SET_PHB_TABLE_MEMORY` and document why + + Not ever used by upstream linux or PowerKVM tree. Never implemented in + skiboot (not even in ancient internal only tree). + + So, it's incredibly safe to remove. 
+- Remove unused :ref:`OPAL_PCI_EEH_FREEZE_STATUS2` + + This call was introduced all the way back at the end of 2012, before + OPAL was public. The #define for the OPAL call was introduced to the + Linux kernel in June 2013, and the call was never used in any kernel + tree ever (as far as we can find). + + Thus, it's quite safe to remove this completely unused and completely + untested OPAL call. +- Document the long removed :ref:`OPAL_REGISTER_OPAL_EXCEPTION_HANDLER` call + + I'm pretty sure this was removed in one of our first ever service packs. + + Fixes: https://github.com/open-power/skiboot/issues/98 +- Remove last remnants of :ref:`OPAL_PCI_SET_PHB_TCE_MEMORY` and :ref:`OPAL_PCI_SET_HUB_TCE_MEMORY` + + Since we have not supported p5ioc systems since skiboot 5.2, it's pretty + safe to just wholesale remove these OPAL calls now. +- Remove remnants of :ref:`OPAL_PCI_SET_PHB_TCE_MEMORY` + + There's no reason we need remnants hanging around that aren't used, so + remove them and save a handful of bytes at runtime. + + Simultaneously, document the OPAL call removal. + + +Secure and Trusted Boot +----------------------- + +- trustedboot: Change PCR and event_type for the skiboot events + + The existing skiboot events are being logged as EV_ACTION, however, the + TCG PC Client spec says that EV_ACTION events should have one of the + pre-defined strings in the event field recorded in the event log. For + instance: + + - "Calling Ready to Boot", + - "Entering ROM Based Setup", + - "User Password Entered", and + - "Start Option ROM Scan". + + None of the EV_ACTION pre-defined strings are applicable to the existing + skiboot events. Based on recent discussions with other POWER teams, this + patch proposes a convention on what PCR and event types should be used + for skiboot events. This also changes the skiboot source code to follow + the convention. + + The TCG PC Client spec defines several event types, other than + EV_ACTION. 
However, many of them are specific to UEFI events and some + others are related to platform or CRTM events, which is more applicable + to hostboot events. + + Currently, most of the hostboot events are extended to PCR[0,1] and + logged as either EV_PLATFORM_CONFIG_FLAGS, EV_S_CRTM_CONTENTS or + EV_POST_CODE. The "Node Id" and "PAYLOAD" events, though, are extended + to PCR[4,5,6] and logged as EV_COMPACT_HASH. + + For the lack of an event type that fits the specific purpose, + EV_COMPACT_HASH seems to be the most adequate one due to its + flexibility. According to the TCG PC Client spec: + + - May be used for any PCR except 0, 1, 2 and 3. + - The event field may be informative or may be hashed to generate the + digest field, depending on the component recording the event. + + Additionally, the PCR[4,5] seem to be the most adequate PCRs. They would + be used for skiboot and some skiroot events. According to the TCG PC + Client spec, PCR[4] is intended to represent the entity that manages the + transition between the pre-OS and OS-present state of the platform. + PCR[4], along with PCR[5], identifies the initial OS loader. + + In summary, for skiboot events: + + - Events that represent data should be extended to PCR 4. + - Events that represent config should be extended to PCR 5. + - For the lack of an event type that fits the specific purpose, + both data and config events should be logged as EV_COMPACT_HASH. + +Sensors +------- + +- occ-sensors: Check if OCC is reset while reading inband sensors + + OCC may not be able to mark the sensor buffer as invalid while going + down RESET. If OCC never comes back we will continue to read the stale + sensor data. So verify if OCC is reset while reading the sensor values + and propagate the appropriate error. + +IPMI +---- + +- ipmi: ensure forward progress on ipmi_queue_msg_sync() + + BT responses are handled using a timer doing the polling. To hope to + get an answer to an IPMI synchronous message, the timer needs to run. 
+ + We can't just check all timers though as there may be a timer that + wants a lock that's held by a code path calling ipmi_queue_msg_sync(), + and if we did enforce that as a requirement, it's a pretty subtle + API that is asking to be broken. + + So, if we just run a poll function to crank anything that the IPMI + backend needs, then we should be fine. + + This issue shows up very quickly under QEMU when loading the first + flash resource with the IPMI HIOMAP backend. + +NPU2 +---- + +- npu2: Increase timeout for L2/L3 cache purging + + On NVLink2 bridge reset, we purge all L2/L3 caches in the system. + This is an asynchronous operation, we have a 2ms timeout here. There are + reports that this is not enough and "PURGE L3 on core xxx timed out" + messages appear (for the reference: on the test setup this takes + 280us..780us). + + This defines the timeout as a macro and changes this from 2ms to 20ms. + + This adds a tracepoint to tell how long it took to purge all the caches. +- npu2: Purge cache when resetting a GPU + + After putting all a GPU's links in reset, do a cache purge in case we + have CPU cache lines belonging to the now-unaccessible GPU memory. +- npu2-opencapi: Mask 2 XSL errors + + Commit f8dfd699f584 ("hw/npu2: Setup an error interrupt on some + opencapi FIRs") converted some FIR bits default action from system + checkstop to raising an error interrupt. For 2 XSL error events that + can be triggered by a misbehaving AFU, the error interrupt is raised + twice, once for each link (the XSL logic in the NPU is shared between + 2 links). So a badly behaving AFU could impact another, unsuspecting + opencapi adapter. + + It doesn't look good and it turns out we can do better. We can mask + those 2 XSL errors. The error will also be picked up by the OTL logic, + which is per link. So we'll still get an error interrupt, but only on + the relevant link, and the other opencapi adapter can stay functional. 
+- npu2: Clear fence state for a brick being reset + + Resetting a GPU before resetting an NVLink leads to occasional HMIs + which fence some bricks and prevent the "reset_ntl" procedure from + succeeding at the "reset_ntl_release" step - the host system requires + a reboot; there may be other cases like this as well. + + This adds clearing of the fence bit in NPU.MISC.FENCE_STATE for + the NVLink which we are about to reset. +- npu2: Fix clearing the FIR bits + + FIR registers are SCOM-only so they cannot be accessed with the indirect + write, and yet we use SCOM-based addresses for these; fix this. + +- npu2: Reset NVLinks when resetting a GPU + + Resetting a V100 GPU brings its NVLinks down and if an NPU tries using + those, an HMI occurs. We were lucky not to observe this as the bare metal + does not normally reset a GPU and when passed through, GPUs are usually + before NPUs in QEMU command line or Libvirt XML and because of that NPUs + are naturally reset first. However, a simple change of the device order + brings HMIs. + + This defines a bus control filter for a PCI slot with a GPU with NVLinks + so when the host system issues secondary bus reset to the slot, it resets + associated NVLinks. +- npu2: Reset PID wildcard and refcounter when mapped to LPID + + Since 105d80f85b "npu2: Use unfiltered mode in XTS tables" we do not + register every PID in the XTS table so the table has one entry per LPID. + Then we added a reference counter to keep track of the entry use when + switching GPU between the host and guest systems (the "Fixes:" tag below). + + The POWERNV platform setup creates such entries and references them + at the boot time when initializing IOMMUs and only removes them when + a GPU is passed through to a guest. This creates a problem as POWERNV + boots via kexec and no dereferencing happens; the XTS table state remains + undefined.
So when the host kernel boots, skiboot thinks there are valid + XTS entries and does not update the XTS table which breaks ATS. + + This adds the reference counter and the XTS entry reset when a GPU is + assigned to LPID and we cannot rely on the kernel to clean that up. + +PHB4 +---- +- hw/phb4: Make phb4_training_trace() more general + + phb4_training_trace() is used to monitor the Link Training Status + State Machine (LTSSM) of the PHB's data link layer. Currently it is only + used to observe the LTSSM while bringing up the link, but sometimes it's + useful to see what's occurring in other situations (e.g. link disable, or + secondary bus reset). This patch renames it to phb4_link_trace() and + allows a target LTSSM state and a flexible timeout to be specified to help in these + situations. +- hw/phb4: Make pci-tracing print at PR_NOTICE + + When pci-tracing is enabled we print each trace status message and the + final trace status at PR_ERROR. The final status messages are similar to + those printed when we fail to train in the non-pci-tracing path and this + has resulted in spurious op-test failures. + + This patch reduces the log-level of the tracing messages to PR_NOTICE so + they're not accidentally interpreted as actual error messages. PR_NOTICE + messages are still printed to the console during boot. +- hw/phb4: Use read/write_reg in assert_perst + + While the PHB is fenced we can't use the MMIO interface to access PHB + registers. While processing a complete reset we inject a PHB fence to + isolate the PHB from the rest of the system because the PHB won't + respond to MMIOs from the rest of the system while being reset. + + We assert PERST after the fence has been erected which requires us to + use the XSCOM indirect interface to access the PHB registers rather than + the MMIO interface. Previously we did that when asserting PERST in the + CRESET path. However, in b8b4c79d4419 ("hw/phb4: Factor out PERST + control") this was re-written to use the raw in_be64() accessor.
This + means that CRESET would not be asserted in the reset path. On some + Mellanox cards this would prevent them from re-loading their firmware + when the system was fast-reset. + + This patch fixes the problem by replacing the raw {in|out}_be64() + accessors with the phb4_{read|write}_reg() functions. + +- hw/phb4: Assert Link Disable bit after ETU init + + The cursed RAID card in ozrom1 has a bug where it ignores PERST being + asserted. The PCIe Base spec is a little vague about what happens + while PERST is asserted, but it does clearly specify that when + PERST is de-asserted the Link Training and Status State Machine + (LTSSM) of a device should return to the initial state (Detect) + defined in the spec and the link training process should restart. + + This bug was worked around in 9078f8268922 ("phb4: Delay training till + after PERST is deasserted") by setting the link disable bit at the + start of the FRESET process and clearing it after PERST was + de-asserted. Although this fixed the bug, the patch offered no + explanation of why the fix worked. + + In b8b4c79d4419 ("hw/phb4: Factor out PERST control") the link disable + workaround was moved into phb4_assert_perst(). This is always + called in the CRESET case, but a following patch resulted in + assert_perst() not being called if phb4_freset() was entered following a + CRESET since p->skip_perst was set in the CRESET handler. This is bad + since a side-effect of the CRESET is that the Link Disable bit is + cleared. + + This, combined with the RAID card ignoring PERST, results in the PCIe + link being trained by the PHB while we're waiting out the 100ms + ETU reset time.
If we hack skiboot to print a DLP trace after returning + from phb4_init_hw() we get: :: + + PHB#0001[0:1]: Initialization complete + PHB#0001[0:1]: TRACE:0x0000102101000000 0ms presence GEN1:x16:polling + PHB#0001[0:1]: TRACE:0x0000001101000000 23ms GEN1:x16:detect + PHB#0001[0:1]: TRACE:0x0000102101000000 23ms presence GEN1:x16:polling + PHB#0001[0:1]: TRACE:0x0000183101000000 29ms training GEN1:x16:config + PHB#0001[0:1]: TRACE:0x00001c5881000000 30ms training GEN1:x08:recovery + PHB#0001[0:1]: TRACE:0x00001c5883000000 30ms training GEN3:x08:recovery + PHB#0001[0:1]: TRACE:0x0000144883000000 33ms presence GEN3:x08:L0 + PHB#0001[0:1]: TRACE:0x0000154883000000 33ms trained GEN3:x08:L0 + PHB#0001[0:1]: CRESET: wait_time = 100 + PHB#0001[0:1]: FRESET: Starts + PHB#0001[0:1]: FRESET: Prepare for link down + PHB#0001[0:1]: FRESET: Assert skipped + PHB#0001[0:1]: FRESET: Deassert + PHB#0001[0:1]: TRACE:0x0000154883000000 0ms trained GEN3:x08:L0 + PHB#0001[0:1]: TRACE: Reached target state + PHB#0001[0:1]: LINK: Start polling + PHB#0001[0:1]: LINK: Electrical link detected + PHB#0001[0:1]: LINK: Link is up + PHB#0001[0:1]: LINK: Went down waiting for stabilty + PHB#0001[0:1]: LINK: DLP train control: 0x0000105101000000 + PHB#0001[0:1]: CRESET: Starts + + What has happened here is that the link is trained to 8x Gen3 33ms after + we return from phb4_init_hw(), and before we've waited the 100ms + that we normally wait after re-initialising the ETU. When we "deassert" + PERST later on in the FRESET handler the link is in L0 (normal) state. At + this point we try to read from the Vendor/Device ID register to verify + that the link is stable and immediately get a PHB fence due to a PCIe + Completion Timeout. Skiboot attempts to recover by doing another CRESET, + but this will encounter the same issue. + + This patch fixes the problem by setting the Link Disable bit (by calling + phb4_assert_perst()) immediately after we return from phb4_init_hw().
+ This prevents the link from being trained while PERST is asserted which + seems to avoid the Completion Timeout. With the patch applied we get: :: + + PHB#0001[0:1]: Initialization complete + PHB#0001[0:1]: TRACE:0x0000102101000000 0ms presence GEN1:x16:polling + PHB#0001[0:1]: TRACE:0x0000001101000000 23ms GEN1:x16:detect + PHB#0001[0:1]: TRACE:0x0000102101000000 23ms presence GEN1:x16:polling + PHB#0001[0:1]: TRACE:0x0000909101000000 29ms presence GEN1:x16:disabled + PHB#0001[0:1]: CRESET: wait_time = 100 + PHB#0001[0:1]: FRESET: Starts + PHB#0001[0:1]: FRESET: Prepare for link down + PHB#0001[0:1]: FRESET: Assert skipped + PHB#0001[0:1]: FRESET: Deassert + PHB#0001[0:1]: TRACE:0x0000001101000000 0ms GEN1:x16:detect + PHB#0001[0:1]: TRACE:0x0000102101000000 0ms presence GEN1:x16:polling + PHB#0001[0:1]: TRACE:0x0000001101000000 24ms GEN1:x16:detect + PHB#0001[0:1]: TRACE:0x0000102101000000 36ms presence GEN1:x16:polling + PHB#0001[0:1]: TRACE:0x0000183101000000 97ms training GEN1:x16:config + PHB#0001[0:1]: TRACE:0x00001c5881000000 97ms training GEN1:x08:recovery + PHB#0001[0:1]: TRACE:0x00001c5883000000 97ms training GEN3:x08:recovery + PHB#0001[0:1]: TRACE:0x0000144883000000 99ms presence GEN3:x08:L0 + PHB#0001[0:1]: TRACE: Reached target state + PHB#0001[0:1]: LINK: Start polling + PHB#0001[0:1]: LINK: Electrical link detected + PHB#0001[0:1]: LINK: Link is up + PHB#0001[0:1]: LINK: Link is stable + PHB#0001[0:1]: LINK: Card [9005:028c] Optimal Retry:disabled + PHB#0001[0:1]: LINK: Speed Train:GEN3 PHB:GEN4 DEV:GEN3 + PHB#0001[0:1]: LINK: Width Train:x08 PHB:x08 DEV:x08 + PHB#0001[0:1]: LINK: RX Errors Now:0 Max:8 Lane:0x0000 + + +Simulators +---------- + +- external/mambo: Bump default POWER9 to Nimbus DD2.3 +- external/mambo: fix tcl startup code for mambo bogus net (repost) + + This fixes a couple issues with external/mambo/skiboot.tcl so I can use the + mambo bogus net. 
+ + * newer distros (ubuntu 18.04) allow tap device to have a user specified + name instead of just tapN so we need to pass in a name not a number. + * need some kind of default for net_mac, and need the mconfig for it + to be set from an env var. +- skiboot.tcl: Add option to wait for GDB server connection + + Add an environment variable which makes Mambo wait for a connection + from gdb prior to starting simulation. +- mambo: Integrate addr2line into backtrace command + + Gives nice output like this: :: + + systemsim % bt + pc: 0xC0000000002BF3D4 _savegpr0_28+0x0 + lr: 0xC00000000004E0F4 opal_call+0x10 + stack:0x000000000041FAE0 0xC00000000004F054 opal_check_token+0x20 + stack:0x000000000041FB50 0xC0000000000500CC __opal_flush_console+0x88 + stack:0x000000000041FBD0 0xC000000000050BF8 opal_flush_console+0x24 + stack:0x000000000041FC00 0xC0000000001F9510 udbg_opal_putc+0x88 + stack:0x000000000041FC40 0xC000000000020E78 udbg_write+0x7c + stack:0x000000000041FC80 0xC0000000000B1C44 console_unlock+0x47c + stack:0x000000000041FD80 0xC0000000000B2424 register_console+0x320 + stack:0x000000000041FE10 0xC0000000003A5328 register_early_udbg_console+0x98 + stack:0x000000000041FE80 0xC0000000003A4F14 setup_arch+0x68 + stack:0x000000000041FEF0 0xC0000000003A0880 start_kernel+0x74 + stack:0x000000000041FF90 0xC00000000000AC60 start_here_common+0x1c + +- mambo: Add addr2func for symbol resolution + + If you supply a VMLINUX_MAP/SKIBOOT_MAP/USER_MAP addr2func can guess + at your symbol name. i.e. :: + + systemsim % p pc + 0xC0000000002A68F8 + systemsim % addr2func [p pc] + fdt_offset_ptr+0x78 + +- lpc-port80h: Don't write port 80h when running under Simics + + Simics doesn't model LPC port 80h. Writing to it terminates the + simulation due to an invalid LPC memory access. This patch adds a + check to ensure port 80h isn't accessed if we are running under + Simics. 
+- device-tree: speed up fdt building on slow simulators + + Trade size for speed and avoid de-duplicating strings in the fdt. + This costs about 2kB in fdt size, and saves about 8 million instructions + (almost half of all instructions) booting skiboot in mambo. +- fast-reboot:: skip read-only memory checksum for slow simulators + + Skip the fast reboot checksum, which costs about 4 million cycles + booting skiboot in mambo. +- nx: remove check on the "qemu, powernv" property + + commit 95f7b3b9698b ("nx: Don't abort on missing NX when using a QEMU + machine") introduced a check on the property "qemu,powernv" to skip NX + initialization when running under a QEMU machine. + + The QEMU platforms now expose a QUIRK_NO_RNG in the chip. Testing the + "qemu,powernv" property is not necessary anymore. +- plat/qemu: add a POWER8 and POWER9 platform + + These new QEMU platforms have characteristics closer to real OpenPOWER + systems that we use today and define a different BMC depending on the + CPU type. New platform properties are introduced for each, + "qemu,powernv8", "qemu,powernv9" and these should be compatible with + existing QEMUs which only expose the "qemu,powernv" property +- libc/string: speed up common string functions + + Use compiler builtins for the string functions, and compile the + libc/string/ directory with -O2. + + This reduces instructions booting skiboot in mambo by 2.9 million in + slow-sim mode, or 3.8 in normal mode, for less than 1kB image size + increase. + + This can result in the compiler warning more cases of string function + problems. +- external/mambo: Add an option to exit Mambo when the system is shutdown + + Automatically exiting can be convenient for scripting. Will also exit + due to a HW crash (eg. unhandled exception). + +VESNIN platform +--------------- + +- platforms/vesnin: PCI inventory via IPMI OEM + + Replace raw protocol with OEM message supported by OpenBMC's IPMI + plugins. 
+ + BMC-side implementation (IPMI plug-in): + https://github.com/YADRO-KNS/phosphor-pci-inventory + +Utilities +--------- + +- opal-gard: Account for ECC size when clearing partition + + When 'opal-gard clear all' is run, it works by erasing the GUARD partition, then + using blocklevel_smart_write() to write nothing to the partition. This + second write call is needed because we rely on libflash to set the ECC + bits appropriately when the partition contained ECCed data. + + The API for this is a little odd with the caller specifying how much + actual data to write, and libflash writing size + size/8 bytes + since there is one additional ECC byte for every eight bytes of data. + + We currently do not account for the extra space consumed by the ECC data + in reset_partition() which is used to handle the 'clear all' command, + which results in the partition following the GUARD partition being + partially overwritten when the command is used. This patch fixes the + problem by reducing the length we would normally write by the number + of ECC bytes required. + + +Build and debugging +------------------- + +- Disable -Waddress-of-packed-member for GCC9 + + We throw a bunch of errors in errorlog code otherwise, which we should + fix, but we don't *have* to yet. + +- Fix a lot of sparse warnings +- With new GCC comes larger GCOV binaries + + So we need to change our heap size to make more room for data/bss + without having to change where the console is or have more fun moving + things about. +- Intentionally discard fini_array sections + + Produced in a SKIBOOT_GCOV=1 build, and never called by skiboot. +- external/trace: Add follow option to dump_trace + + When monitoring traces, an option like the tail command's '-f' (follow) + is very useful. This option continues to append to the output as more + data arrives. Add an '-f' option to allow dump_trace to operate + similarly. + + Tail also provides a '-s' (sleep time) option that + accompanies '-f'.
This controls how often new input will be polled. Add + a '-s' option that will make dump_trace sleep for N milliseconds before + checking for new input. +- external/trace: Add support for dumping multiple buffers + + dump_trace only can dump one trace buffer at a time. It would be handy + to be able to dump multiple buffers and to see the entries from these + buffers displayed in correct timestamp order. Each trace buffer is + already sorted by timestamp so use a heap to implement an efficient + k-way merge. Use the CCAN heap to implement this sort. However the CCAN + heap does not have a 'heap_replace' operation. We need to 'heap_pop' + then 'heap_push' to replace the root which means rebalancing twice + instead of once. +- external/trace: mmap trace buffers in dump_trace + + The current lseek/read approach used in dump_trace does not correctly + handle certain aspects of the buffers. It does not use the start and end + position that is part of the buffer so it will not begin from the + correct location. It does not move back to the beginning of the trace + buffer file as the buffer wraps around. It also does not handle the + overflow case of the writer overwriting when the reader is up to. + + Mmap the trace buffer file so that the existing reading functions in + extra/trace.c can be used. These functions already handle the cases of + wrapping and overflow. This reduces code duplication and uses functions + that are already unit tested. However this requires a kernel where the + trace buffer sysfs nodes are able to be mmaped (see + https://patchwork.ozlabs.org/patch/1056786/) +- core/trace: Export trace buffers to sysfs + + Every property in the device-tree under /ibm,opal/firmware/exports has a + sysfs node created in /firmware/opal/exports. Add properties with the + physical address and size for each trace buffer so they are exported. 
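The k-way merge that dump_trace uses to interleave several timestamp-sorted trace buffers (see the "Add support for dumping multiple buffers" entry above) can be sketched as follows. This is only an illustration of the technique, not the dump_trace implementation: the `(timestamp, cpu, message)` entry shape and the name `merge_trace_buffers` are assumptions for the example, and it uses Python's `heapq` rather than the CCAN heap.

```python
import heapq
from typing import Iterable, Iterator, List, Tuple

# Hypothetical entry shape for the example: (timestamp, cpu, message).
Entry = Tuple[int, int, str]


def merge_trace_buffers(buffers: List[Iterable[Entry]]) -> Iterator[Entry]:
    """Yield entries from several timestamp-sorted buffers in global order."""
    iterators = [iter(buf) for buf in buffers]
    heap = []
    for index, it in enumerate(iterators):
        first = next(it, None)
        if first is not None:
            # The buffer index breaks timestamp ties, so the entries
            # themselves never need to be compared.
            heap.append((first[0], index, first))
    heapq.heapify(heap)

    while heap:
        _, index, entry = heap[0]
        yield entry
        following = next(iterators[index], None)
        if following is None:
            # This buffer is exhausted; drop it from the heap.
            heapq.heappop(heap)
        else:
            # Replace the root in one step instead of pop followed by push.
            heapq.heapreplace(heap, (following[0], index, following))
```

`heapq.heapreplace` is the single-step root replacement that the entry above notes is missing from the CCAN heap, which is why dump_trace has to pop and then push, rebalancing twice.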
+- core/trace: Add pir number to debug_descriptor + + The names given to the trace buffers when exported to sysfs should show + what cpu they are associated with to make it easier to understand their + output. The debug_descriptor currently stores the address and length of + each trace buffer and this is used for adding properties to the device + tree. Extend debug_descriptor to include a cpu associated with each + trace. This will be used for creating properties in the device-tree + under /ibm,opal/firmware/exports/. +- core/trace: Change trace buffer size + + We want to be able to mmap the trace buffers to be used by the + dump_trace tool. As mmapping is done in terms of pages it makes sense + that the size of the trace buffers should be page aligned. This is + slightly complicated by the space taken up by the header at the + beginning of the trace and the room left for an extra trace entry at the + end of the buffer. Change the size of the buffer itself so that the + entire trace buffer size will be page aligned. +- core/trace: Change buffer alignment from 4K to 64K + + We want to be able to mmap the trace buffers to be used by the + dump_trace tool. This means that the trace buffers must be page + aligned. Currently they are aligned to 4K. Most POWER systems have a + 64K page size. On systems with a 4K page size, 64K aligned will still be + page aligned. Change the allocation of the trace buffers to be 64K + aligned. + + The trace_info struct that contains the trace buffer is actually what is + allocated aligned memory. This means the trace buffer itself is not + actually aligned and this is the address that is currently exposed + through sysfs. To get around this, change the address that is exposed to + sysfs to be the trace_info struct. This means the lock in trace_info is + now visible too. +- external/trace: Use correct width integer byte swapping + + The trace_repeat struct uses be16 for storing the number of repeats.
+ Currently be32_to_cpu conversion is used to display this member. This + produces an incorrect value. Use be16_to_cpu instead. +- core/trace: Put boot_tracebuf in correct location. + + A position for the boot_tracebuf is allocated in skiboot.lds.S. + However, without a __section attribute the boot trace buffer is not + placed in the correct location, meaning that it also will not be + correctly aligned. Add the __section attribute to ensure it will be + placed in its allocated position. +- core/lock: Add debug options to store backtrace of where lock was taken + + Contrary to popular belief, skiboot developers are imperfect and + occasionally write locking bugs. When we exit skiboot, we check if we're + still holding any locks, and if so, we print an error with a list of the + locks currently held and the locations where they were taken. + + However, this only tells us the location where lock() was called, which may + not be enough to work out what's going on. To give us more to go on with, + we can store backtrace data in the lock and print that out when we + unexpectedly still hold locks. + + Because the backtrace data is rather big, we only enable this if + DEBUG_LOCKS_BACKTRACE is defined, which in turn is switched on when + DEBUG=1. + + (We disable DEBUG_LOCKS_BACKTRACE in some of the memory allocation tests + because the locks used by the memory allocator take up too much room in the + fake skiboot heap.) +- libfdt: upgrade to upstream dtc.git 243176c + + Upgrade libfdt/ to github.com/dgibson/dtc.git 243176c ("Fix bogus + error on rebuild") + + This copies dtc/libfdt/ to skiboot/libfdt/, with the only change in + that directory being the addition of README.skiboot and Makefile.inc. + + This adds about 14kB text, 2.5kB compressed xz. This could be reduced + or mostly eliminated by cutting out fdt version checks and unused + code, but tracking upstream is a bigger benefit at the moment. 
+ + This loses commits: + + - 14ed2b842f61 ("libfdt: add basic sanity check to fdt_open_into") + - bc7bb3d12bc1 ("sparse: fix declaration of fdt_strerror") + + As well as some prehistoric similar kinds of things, which is the + punishment for us not being good downstream citizens and sending + things upstream! Syncing to upstream will make that effort simpler + in future. diff --git a/roms/skiboot/doc/release-notes/skiboot-6.4.rst b/roms/skiboot/doc/release-notes/skiboot-6.4.rst new file mode 100644 index 000000000..f6f632aa9 --- /dev/null +++ b/roms/skiboot/doc/release-notes/skiboot-6.4.rst @@ -0,0 +1,850 @@ +.. _skiboot-6.4: + +skiboot-6.4 +=========== + +skiboot v6.4 was released on Tuesday July 16th 2019. It is the first +release of skiboot 6.4, which becomes the new stable release +of skiboot following the 6.3 release, first released May 3rd 2019. + +Skiboot 6.4 will mark the basis for op-build v2.4. + +skiboot v6.4 contains all bug fixes as of :ref:`skiboot-6.0.20`, +and :ref:`skiboot-6.3.2` (the currently maintained stable releases). + +For how the skiboot stable releases work, see :ref:`stable-rules` for details. + +Over skiboot 6.3, we have the following changes: + +.. _skiboot-6.4-new-features: + +New features +------------ + +Since skiboot v6.4-rc1: + +- npu2-opencapi: Add opencapi support on ZZ + + This patch adds opencapi support on ZZ. It hard-codes the required + device tree entries for the NPU and links. The alternative was to use + HDAT, but it somehow proved too painful to do. + + The new device tree entries activate the npu2 init code on ZZ. On + systems with no opencapi adapters, it should go unnoticed, as presence + detection will skip link training. + +Since skiboot v6.3: + +- platforms/nicole: Add new platform + + The platform is a new platform from YADRO, it's a storage controller for + TATLIN server. It's Based on IBM Romulus reference design (POWER9). + +- platform/zz: Add new platform type + + We have new platform type under ZZ. 
Lets add them. With this fix +- nvram: Flag dangerous NVRAM options + + Most nvram options used by skiboot are just for debug or testing for + regressions. They should never be used long term. + + We've hit a number of issues in testing and the field where nvram + options have been set "temporarily" but haven't been properly cleared + after, resulting in crashes or real bugs being masked. + + This patch marks most nvram options used by skiboot as dangerous and + prints a chicken to remind users of the problem. + +- hw/phb3: Add verbose EEH output + + Add support for the pci-eeh-verbose NVRAM flag on PHB3. We've had this + on PHB4 since forever and it has proven very useful when debugging EEH + issues. When testing changes to the Linux kernel's EEH implementation + it's fairly common for the kernel to crash before printing the EEH log + so it's helpful to have it in the OPAL log where it can be dumped from + XMON. + + Note that unlike PHB4 we do not enable verbose mode by default. The + nvram option must be used to explicitly enable it. + +- Experimental support for building without FSP code + + Now, with CONFIG_FSP=0/1 we have: + + - 1.6M/1.4M skiboot.lid + - 323K/375K skiboot.lid.xz + +- doc: travis-ci deploy docs! + + Documentation is now automatically deployed if you configure Travis CI + appropriately (we have done this for the open-power branch of skiboot) + +- Big OPAL API Documentation improvement + + A lot more OPAL API calls are now (at least somewhat) documented. +- opal/hmi: Report NPU2 checkstop reason + + The NPU2 is currently not passing any information to linux to explain + the cause of an HMI. NPU2 has three Fault Isolation Registers and over + 30 of those FIR bits are configured to raise an HMI by default. We + won't be able to fit all possible state in the 32-bit xstop_reason + field of the HMI event, but we can still try to encode up to 4 HMI + reasons. 
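To make the "up to 4 HMI reasons in a 32-bit field" idea from the opal/hmi entry above concrete, here is a minimal sketch that packs small reason codes into one 32-bit word. The layout is an assumption for illustration only: the note does not say how the bits are divided, so the even split into four 8-bit slots, the most-significant-first ordering, the use of zero as "empty slot", and the function names are all hypothetical.

```python
from typing import List

MAX_REASONS = 4      # the release note says up to 4 reasons can be encoded
BITS_PER_REASON = 8  # assumption: 32 bits split evenly into four 8-bit codes


def pack_xstop_reason(reasons: List[int]) -> int:
    """Pack up to four non-zero 8-bit reason codes into a 32-bit word.

    Reasons beyond the fourth are dropped, reflecting that not every
    possible state fits in the field.
    """
    word = 0
    for slot, code in enumerate(reasons[:MAX_REASONS]):
        if not 0 < code < (1 << BITS_PER_REASON):
            raise ValueError(f"reason code {code} must be in 1..255")
        # Assumed ordering: the first reason lands in the most significant byte.
        shift = (MAX_REASONS - 1 - slot) * BITS_PER_REASON
        word |= code << shift
    return word


def unpack_xstop_reason(word: int) -> List[int]:
    """Recover the non-zero reason codes from a packed 32-bit word."""
    reasons = []
    for slot in range(MAX_REASONS):
        shift = (MAX_REASONS - 1 - slot) * BITS_PER_REASON
        code = (word >> shift) & ((1 << BITS_PER_REASON) - 1)
        if code:
            reasons.append(code)
    return reasons
```

With this assumed layout a consumer such as Linux could decode each slot independently, at the cost of losing any reasons past the fourth.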
+- opal-msg: Enhance opal-get-msg API + + Linux uses the :ref:`OPAL_GET_MSG` API to get OPAL messages. This interface + supports up to 8 params (64 bytes). We have a requirement to send bigger data to + Linux. This patch enhances OPAL to send bigger data to Linux. + + - Linux will use "opal-msg-size" device tree property to allocate memory for + OPAL messages (previous patch increased "opal-msg-size" to 64K). + - Replaced `reserved` field in "struct opal_msg" with `size`, so that the Linux + side opal_get_msg user can detect the actual data size. + - If buffer size < actual message size, then opal_get_msg will copy partial + data and return OPAL_PARTIAL to Linux. + - Add new variable "extended" to "opal_msg_entry" structure to keep track + of messages that have more than 64 bytes of data. We will allocate separate + memory for these messages and once the kernel consumes the message we will + release that memory. +- core/opal: Increase opal-msg-size size + + Kernel will use `opal-msg-size` property to allocate memory for opal_msg. + We want to send bigger data from OPAL to kernel. Hence increase + opal-msg-size to 64K. +- hw/npu2-opencapi: Add initial support for allocating OpenCAPI LPC memory + + Lowest Point of Coherency (LPC) memory allows the host to access memory on + an OpenCAPI device. + + Define 2 OPAL calls, :ref:`OPAL_NPU_MEM_ALLOC` and :ref:`OPAL_NPU_MEM_RELEASE`, for + assigning and clearing the memory BAR. (We try to avoid using the term + "LPC" to avoid confusion with Low Pin Count.) + + At present, we use a fixed location in the address space, which means we + are restricted to a single range of 4TB, on a single OpenCAPI device per + chip. In future, we'll use some chip ID extension magic to give us more + space, and some sort of allocator to assign ranges to more than one device. +- core/fast-reboot: Add im-feeling-lucky option + + Fast reboot gets disabled for a number of reasons e.g. the availability + of nvlink.
However, this doesn't actually affect the ability to perform fast + reboot if no nvlink device is actually present. + + Add an nvram option for fast-reset where if it's set to + "im-feeling-lucky" then perform the fast-reboot irrespective of whether it has + previously been disabled. + +- platforms/astbmc: Check for SBE validation step + + On some POWER8 astbmc systems an update to the SBE requires pausing at + runtime to ensure integrity of the SBE. If this is required the BMC will + set a chassis boot option IPMI flag using the OEM parameter 0x62. If + Skiboot sees this flag is set it waits until the SBE update is complete + and the flag is cleared. + + Unfortunately the mystery operation that validates the SBE also leaves + it in a bad state and unable to be used for timer operations. To + work around this, the flag is checked as soon as possible (i.e. when IPMI + and the console are set up), and once complete the system is rebooted. +- Add P9 DIO interrupt support + + On P9 there are GPIO port 0, 1, 2 for GPIO interrupt, and DIO interrupt + is used to handle the interrupts. + + Add support for the DIO interrupts: + + 1. Add dio_interrupt_register(chip, port, callback) to register the + interrupt. + 2. Add dio_interrupt_deregister(chip, port, callback) to deregister it. + 3. When an interrupt on the port occurs, the callback is invoked and the + interrupt status is cleared. + + +Removed features +---------------- + +Since skiboot v6.3: + +- pci/iov: Remove skiboot VF tracking + + This feature was added a few years ago in response to a request to make + the MaxPayloadSize (MPS) field of a Virtual Function match the MPS of the + Physical Function that hosts it. + + The SR-IOV specification states that the MPS field of the VF is "ResvP". + This indicates the VF will use whatever MPS is configured on the PF and + that the field should be treated as a reserved field in the config space + of the VF. In other words, a SR-IOV spec compliant VF should always return + zero in the MPS field.
Adding hacks in OPAL to make it non-zero is... + misguided at best. + + Additionally, there is a bug in the way pci_device structures are handled + by VFs that results in a crash on fast-reboot that occurs if VFs are + enabled and then disabled prior to rebooting. This patch fixes the bug by + removing the code entirely. This patch has no impact on SR-IOV support on + the host operating system. +- Remove POWER7 and POWER7+ support + + It's been a good long while since either OPAL POWER7 user touched a + machine, and even longer since they'd have been okay using an old + version rather than tracking master. + + There's also been no testing of OPAL on POWER7 systems for an awfully + long time, so it's pretty safe to assume that it's very much bitrotted. + + It also saves a whole 14kb of xz compressed payload space. +- Remove remnants of :ref:`OPAL_PCI_GET_PHB_DIAG_DATA` + + Never present in a public OPAL release, and only kernels prior to 3.11 + would ever attempt to call it. +- Remove unused :ref:`OPAL_GET_XIVE_SOURCE` + + While this call was technically implemented by skiboot, no code has ever called + it, and it was only ever implemented for the p7ioc-phb back-end (i.e. POWER7). + Since this call was unused in Linux, and that POWER7 with OPAL was only ever + available internally, so it should be safe to remove the call. +- Remove unused :ref:`OPAL_PCI_GET_XIVE_REISSUE` and :ref:`OPAL_PCI_SET_XIVE_REISSUE` + + These seem to be remnants of one of the OPAL incarnations prior to + OPALv3. These calls have never been implemented in skiboot, and never + used by an upstream kernel (nor a PowerKVM kernel). + + It's rather safe to just document them as never existing. +- Remove never implemented :ref:`OPAL_PCI_SET_PHB_TABLE_MEMORY` and document why + + Not ever used by upstream linux or PowerKVM tree. Never implemented in + skiboot (not even in ancient internal only tree). + + So, it's incredibly safe to remove. 
+- Remove unused :ref:`OPAL_PCI_EEH_FREEZE_STATUS2` + + This call was introduced all the way back at the end of 2012, before + OPAL was public. The #define for the OPAL call was introduced to the + Linux kernel in June 2013, and the call was never used in any kernel + tree ever (as far as we can find). + + Thus, it's quite safe to remove this completely unused and completely + untested OPAL call. +- Document the long-removed :ref:`OPAL_REGISTER_OPAL_EXCEPTION_HANDLER` call + + I'm pretty sure this was removed in one of our first ever service packs. + + Fixes: https://github.com/open-power/skiboot/issues/98 +- Remove last remnants of :ref:`OPAL_PCI_SET_PHB_TCE_MEMORY` and :ref:`OPAL_PCI_SET_HUB_TCE_MEMORY` + + Since we have not supported p5ioc systems since skiboot 5.2, it's pretty + safe to just wholesale remove these OPAL calls now. +- Remove remnants of :ref:`OPAL_PCI_SET_PHB_TCE_MEMORY` + + There's no reason we need remnants hanging around that aren't used, so + remove them and save a handful of bytes at runtime. + + Simultaneously, document the OPAL call removal. + + +Secure and Trusted Boot +----------------------- + +Since skiboot v6.3: + +- trustedboot: Change PCR and event_type for the skiboot events + + The existing skiboot events are being logged as EV_ACTION; however, the + TCG PC Client spec says that EV_ACTION events should have one of the + pre-defined strings in the event field recorded in the event log. For + instance: + + - "Calling Ready to Boot", + - "Entering ROM Based Setup", + - "User Password Entered", and + - "Start Option ROM Scan". + + None of the EV_ACTION pre-defined strings are applicable to the existing + skiboot events. Based on recent discussions with other POWER teams, this + patch proposes a convention on what PCR and event types should be used + for skiboot events. This also changes the skiboot source code to follow + the convention. + + The TCG PC Client spec defines several event types, other than + EV_ACTION.
However, many of them are specific to UEFI events and some + others are related to platform or CRTM events, which is more applicable + to hostboot events. + + Currently, most of the hostboot events are extended to PCR[0,1] and + logged as either EV_PLATFORM_CONFIG_FLAGS, EV_S_CRTM_CONTENTS or + EV_POST_CODE. The "Node Id" and "PAYLOAD" events, though, are extended + to PCR[4,5,6] and logged as EV_COMPACT_HASH. + + For the lack of an event type that fits the specific purpose, + EV_COMPACT_HASH seems to be the most adequate one due to its + flexibility. According to the TCG PC Client spec: + + - May be used for any PCR except 0, 1, 2 and 3. + - The event field may be informative or may be hashed to generate the + digest field, depending on the component recording the event. + + Additionally, the PCR[4,5] seem to be the most adequate PCRs. They would + be used for skiboot and some skiroot events. According to the TCG PC + Client, PCR[4] is intended to represent the entity that manages the + transition between the pre-OS and OS-present state of the platform. + PCR[4], along with PCR[5], identifies the initial OS loader. + + In summary, for skiboot events: + + - Events that represents data should be extended to PCR 4. + - Events that represents config should be extended to PCR 5. + - For the lack of an event type that fits the specific purpose, + both data and config events should be logged as EV_COMPACT_HASH. + +Sensors +------- + +Since skiboot v6.3: + +- occ-sensors: Check if OCC is reset while reading inband sensors + + OCC may not be able to mark the sensor buffer as invalid while going + down RESET. If OCC never comes back we will continue to read the stale + sensor data. So verify if OCC is reset while reading the sensor values + and propagate the appropriate error. + +IPMI +---- + +Since skiboot v6.3: + +- ipmi: ensure forward progress on ipmi_queue_msg_sync() + + BT responses are handled using a timer doing the polling. 
To hope to + get an answer to an IPMI synchronous message, the timer needs to run. + + We can't just check all timers though as there may be a timer that + wants a lock that's held by a code path calling ipmi_queue_msg_sync(), + and if we did enforce that as a requirement, it's a pretty subtle + API that is asking to be broken. + + So, if we just run a poll function to crank anything that the IPMI + backend needs, then we should be fine. + + This issue shows up very quickly under QEMU when loading the first + flash resource with the IPMI HIOMAP backend. + +NPU2 +---- + +Since skiboot v6.4-rc1: + +- witherspoon: Add nvlink peers in finalise_dt() + + This information is consumed by Linux so it needs to be in the DT. Move + it to finalise_dt(). + +Since skiboot v6.3: + +- npu2: Increase timeout for L2/L3 cache purging + + On NVLink2 bridge reset, we purge all L2/L3 caches in the system. + This is an asynchronous operation, we have a 2ms timeout here. There are + reports that this is not enough and "PURGE L3 on core xxx timed out" + messages appear (for the reference: on the test setup this takes + 280us..780us). + + This defines the timeout as a macro and changes this from 2ms to 20ms. + + This adds a tracepoint to tell how long it took to purge all the caches. +- npu2: Purge cache when resetting a GPU + + After putting all a GPU's links in reset, do a cache purge in case we + have CPU cache lines belonging to the now-unaccessible GPU memory. +- npu2-opencapi: Mask 2 XSL errors + + Commit f8dfd699f584 ("hw/npu2: Setup an error interrupt on some + opencapi FIRs") converted some FIR bits default action from system + checkstop to raising an error interrupt. For 2 XSL error events that + can be triggered by a misbehaving AFU, the error interrupt is raised + twice, once for each link (the XSL logic in the NPU is shared between + 2 links). So a badly behaving AFU could impact another, unsuspecting + opencapi adapter. 
+ + It doesn't look good and it turns out we can do better. We can mask + those 2 XSL errors. The error will also be picked up by the OTL logic, + which is per link. So we'll still get an error interrupt, but only on + the relevant link, and the other opencapi adapter can stay functional. +- npu2: Clear fence state for a brick being reset + + Resetting a GPU before resetting an NVLink leads to occasional HMIs + which fence some bricks and prevent the "reset_ntl" procedure from + succeeding at the "reset_ntl_release" step - the host system requires + reboot; there may be other cases like this as well. + + This adds clearing of the fence bit in NPU.MISC.FENCE_STATE for + the NVLink which we are about to reset. +- npu2: Fix clearing the FIR bits + + FIR registers are SCOM-only so they cannot be accessed with the indirect + write, and yet we use SCOM-based addresses for these; fix this. + +- npu2: Reset NVLinks when resetting a GPU + + Resetting a V100 GPU brings its NVLinks down and if an NPU tries using + those, an HMI occurs. We were lucky not to observe this as the bare metal + does not normally reset a GPU and when passed through, GPUs are usually + before NPUs in QEMU command line or Libvirt XML and because of that NPUs + are naturally reset first. However, a simple change of the device order + brings HMIs. + + This defines a bus control filter for a PCI slot with a GPU with NVLinks + so when the host system issues secondary bus reset to the slot, it resets + associated NVLinks. +- npu2: Reset PID wildcard and refcounter when mapped to LPID + + Since 105d80f85b "npu2: Use unfiltered mode in XTS tables" we do not + register every PID in the XTS table so the table has one entry per LPID. + Then we added a reference counter to keep track of the entry use when + switching GPU between the host and guest systems (the "Fixes:" tag below).
+ + The POWERNV platform setup creates such entries and references them + at boot time when initializing IOMMUs and only removes them when + a GPU is passed through to a guest. This creates a problem as POWERNV + boots via kexec and no dereferencing happens; the XTS table state remains + undefined. So when the host kernel boots, skiboot thinks there are valid + XTS entries and does not update the XTS table which breaks ATS. + + This adds the reference counter and the XTS entry reset when a GPU is + assigned to LPID and we cannot rely on the kernel to clean that up. + +PHB4 +---- + +Since skiboot v6.3: + +- hw/phb4: Make phb4_training_trace() more general + + phb4_training_trace() is used to monitor the Link Training Status + State Machine (LTSSM) of the PHB's data link layer. Currently it is only + used to observe the LTSSM while bringing up the link, but sometimes it's + useful to see what's occurring in other situations (e.g. link disable, or + secondary bus reset). This patch renames it to phb4_link_trace() and + allows the target LTSSM state and a flexible timeout to help in these + situations. +- hw/phb4: Make pci-tracing print at PR_NOTICE + + When pci-tracing is enabled we print each trace status message and the + final trace status at PR_ERROR. The final status messages are similar to + those printed when we fail to train in the non-pci-tracing path and this + has resulted in spurious op-test failures. + + This patch reduces the log-level of the tracing message to PR_NOTICE so + they're not accidentally interpreted as actual error messages. PR_NOTICE + messages are still printed to the console during boot. +- hw/phb4: Use read/write_reg in assert_perst + + While the PHB is fenced we can't use the MMIO interface to access PHB + registers. While processing a complete reset we inject a PHB fence to + isolate the PHB from the rest of the system because the PHB won't + respond to MMIOs from the rest of the system while being reset.
+ + We assert PERST after the fence has been erected which requires us to + use the XSCOM indirect interface to access the PHB registers rather than + the MMIO interface. Previously we did that when asserting PERST in the + CRESET path. However, in b8b4c79d4419 ("hw/phb4: Factor out PERST + control") this was re-written to use the raw in_be64() accessor. This + means that CRESET would not be asserted in the reset path. On some + Mellanox cards this would prevent them from re-loading their firmware + when the system was fast-reset. + + This patch fixes the problem by replacing the raw {in|out}_be64() + accessors with the phb4_{read|write}_reg() functions. + +- hw/phb4: Assert Link Disable bit after ETU init + + The cursed RAID card in ozrom1 has a bug where it ignores PERST being + asserted. The PCIe Base spec is a little vague about what happens + while PERST is asserted, but it does clearly specify that when + PERST is de-asserted the Link Training and Status State Machine + (LTSSM) of a device should return to the initial state (Detect) + defined in the spec and the link training process should restart. + + This bug was worked around in 9078f8268922 ("phb4: Delay training till + after PERST is deasserted") by setting the link disable bit at the + start of the FRESET process and clearing it after PERST was + de-asserted. Although this fixed the bug, the patch offered no + explanation of why the fix worked. + + In b8b4c79d4419 ("hw/phb4: Factor out PERST control") the link disable + workaround was moved into phb4_assert_perst(). This is always + called in the CRESET case, but a following patch resulted in + assert_perst() not being called if phb4_freset() was entered following a + CRESET since p->skip_perst was set in the CRESET handler. This is bad + since a side-effect of the CRESET is that the Link Disable bit is + cleared.
+ + This, combined with the RAID card ignoring PERST results in the PCIe + link being trained by the PHB while we're waiting out the 100ms + ETU reset time. If we hack skiboot to print a DLP trace after returning + from phb4_hw_init() we get: :: + + PHB#0001[0:1]: Initialization complete + PHB#0001[0:1]: TRACE:0x0000102101000000 0ms presence GEN1:x16:polling + PHB#0001[0:1]: TRACE:0x0000001101000000 23ms GEN1:x16:detect + PHB#0001[0:1]: TRACE:0x0000102101000000 23ms presence GEN1:x16:polling + PHB#0001[0:1]: TRACE:0x0000183101000000 29ms training GEN1:x16:config + PHB#0001[0:1]: TRACE:0x00001c5881000000 30ms training GEN1:x08:recovery + PHB#0001[0:1]: TRACE:0x00001c5883000000 30ms training GEN3:x08:recovery + PHB#0001[0:1]: TRACE:0x0000144883000000 33ms presence GEN3:x08:L0 + PHB#0001[0:1]: TRACE:0x0000154883000000 33ms trained GEN3:x08:L0 + PHB#0001[0:1]: CRESET: wait_time = 100 + PHB#0001[0:1]: FRESET: Starts + PHB#0001[0:1]: FRESET: Prepare for link down + PHB#0001[0:1]: FRESET: Assert skipped + PHB#0001[0:1]: FRESET: Deassert + PHB#0001[0:1]: TRACE:0x0000154883000000 0ms trained GEN3:x08:L0 + PHB#0001[0:1]: TRACE: Reached target state + PHB#0001[0:1]: LINK: Start polling + PHB#0001[0:1]: LINK: Electrical link detected + PHB#0001[0:1]: LINK: Link is up + PHB#0001[0:1]: LINK: Went down waiting for stabilty + PHB#0001[0:1]: LINK: DLP train control: 0x0000105101000000 + PHB#0001[0:1]: CRESET: Starts + + What has happened here is that the link is trained to 8x Gen3 33ms after + we return from phb4_init_hw(), and before we've waited out the 100ms + that we normally wait after re-initialising the ETU. When we "deassert" + PERST later on in the FRESET handler the link is in L0 (normal) state. At + this point we try to read from the Vendor/Device ID register to verify + that the link is stable and immediately get a PHB fence due to a PCIe + Completion Timeout. Skiboot attempts to recover by doing another CRESET, + but this will encounter the same issue.
+ + This patch fixes the problem by setting the Link Disable bit (by calling + phb4_assert_perst()) immediately after we return from phb4_init_hw(). + This prevents the link from being trained while PERST is asserted which + seems to avoid the Completion Timeout. With the patch applied we get: :: + + PHB#0001[0:1]: Initialization complete + PHB#0001[0:1]: TRACE:0x0000102101000000 0ms presence GEN1:x16:polling + PHB#0001[0:1]: TRACE:0x0000001101000000 23ms GEN1:x16:detect + PHB#0001[0:1]: TRACE:0x0000102101000000 23ms presence GEN1:x16:polling + PHB#0001[0:1]: TRACE:0x0000909101000000 29ms presence GEN1:x16:disabled + PHB#0001[0:1]: CRESET: wait_time = 100 + PHB#0001[0:1]: FRESET: Starts + PHB#0001[0:1]: FRESET: Prepare for link down + PHB#0001[0:1]: FRESET: Assert skipped + PHB#0001[0:1]: FRESET: Deassert + PHB#0001[0:1]: TRACE:0x0000001101000000 0ms GEN1:x16:detect + PHB#0001[0:1]: TRACE:0x0000102101000000 0ms presence GEN1:x16:polling + PHB#0001[0:1]: TRACE:0x0000001101000000 24ms GEN1:x16:detect + PHB#0001[0:1]: TRACE:0x0000102101000000 36ms presence GEN1:x16:polling + PHB#0001[0:1]: TRACE:0x0000183101000000 97ms training GEN1:x16:config + PHB#0001[0:1]: TRACE:0x00001c5881000000 97ms training GEN1:x08:recovery + PHB#0001[0:1]: TRACE:0x00001c5883000000 97ms training GEN3:x08:recovery + PHB#0001[0:1]: TRACE:0x0000144883000000 99ms presence GEN3:x08:L0 + PHB#0001[0:1]: TRACE: Reached target state + PHB#0001[0:1]: LINK: Start polling + PHB#0001[0:1]: LINK: Electrical link detected + PHB#0001[0:1]: LINK: Link is up + PHB#0001[0:1]: LINK: Link is stable + PHB#0001[0:1]: LINK: Card [9005:028c] Optimal Retry:disabled + PHB#0001[0:1]: LINK: Speed Train:GEN3 PHB:GEN4 DEV:GEN3 + PHB#0001[0:1]: LINK: Width Train:x08 PHB:x08 DEV:x08 + PHB#0001[0:1]: LINK: RX Errors Now:0 Max:8 Lane:0x0000 + + +Simulators +---------- + +Since skiboot v6.3: + +- external/mambo: Bump default POWER9 to Nimbus DD2.3 +- external/mambo: fix tcl startup code for mambo bogus net (repost) + + This 
fixes a couple issues with external/mambo/skiboot.tcl so I can use the + mambo bogus net. + + * newer distros (ubuntu 18.04) allow tap device to have a user specified + name instead of just tapN so we need to pass in a name not a number. + * need some kind of default for net_mac, and need the mconfig for it + to be set from an env var. +- skiboot.tcl: Add option to wait for GDB server connection + + Add an environment variable which makes Mambo wait for a connection + from gdb prior to starting simulation. +- mambo: Integrate addr2line into backtrace command + + Gives nice output like this: :: + + systemsim % bt + pc: 0xC0000000002BF3D4 _savegpr0_28+0x0 + lr: 0xC00000000004E0F4 opal_call+0x10 + stack:0x000000000041FAE0 0xC00000000004F054 opal_check_token+0x20 + stack:0x000000000041FB50 0xC0000000000500CC __opal_flush_console+0x88 + stack:0x000000000041FBD0 0xC000000000050BF8 opal_flush_console+0x24 + stack:0x000000000041FC00 0xC0000000001F9510 udbg_opal_putc+0x88 + stack:0x000000000041FC40 0xC000000000020E78 udbg_write+0x7c + stack:0x000000000041FC80 0xC0000000000B1C44 console_unlock+0x47c + stack:0x000000000041FD80 0xC0000000000B2424 register_console+0x320 + stack:0x000000000041FE10 0xC0000000003A5328 register_early_udbg_console+0x98 + stack:0x000000000041FE80 0xC0000000003A4F14 setup_arch+0x68 + stack:0x000000000041FEF0 0xC0000000003A0880 start_kernel+0x74 + stack:0x000000000041FF90 0xC00000000000AC60 start_here_common+0x1c + +- mambo: Add addr2func for symbol resolution + + If you supply a VMLINUX_MAP/SKIBOOT_MAP/USER_MAP addr2func can guess + at your symbol name. i.e. :: + + systemsim % p pc + 0xC0000000002A68F8 + systemsim % addr2func [p pc] + fdt_offset_ptr+0x78 + +- lpc-port80h: Don't write port 80h when running under Simics + + Simics doesn't model LPC port 80h. Writing to it terminates the + simulation due to an invalid LPC memory access. This patch adds a + check to ensure port 80h isn't accessed if we are running under + Simics. 
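The addr2func lookup described earlier in this section amounts to finding the nearest symbol at or below an address and printing the offset from it. A minimal Python sketch of that idea follows. It is not mambo's Tcl implementation: the in-memory symbol table, its base addresses, and the second symbol entry are hypothetical stand-ins for a parsed map file, chosen so the result matches the sample output above.

```python
import bisect


def addr2func(symbols, addr):
    """Return 'name+0xOFF' for the closest symbol at or below addr.

    `symbols` is a list of (address, name) tuples sorted by address.
    It is a hypothetical stand-in for a parsed VMLINUX_MAP/SKIBOOT_MAP.
    """
    addrs = [a for a, _ in symbols]
    # Index of the last symbol whose address is <= addr.
    i = bisect.bisect_right(addrs, addr) - 1
    if i < 0:
        return None  # addr is below every known symbol
    base, name = symbols[i]
    return f"{name}+0x{addr - base:x}"


# Hypothetical symbol addresses, for illustration only.
symbols = [
    (0xC0000000002A6880, "fdt_offset_ptr"),
    (0xC0000000002A6A00, "fdt_next_tag"),
]
print(addr2func(symbols, 0xC0000000002A68F8))
```

The lookup is only a guess in the same sense as the original command: an address past the end of the last function still resolves to that function plus a large offset.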
+- device-tree: speed up fdt building on slow simulators + + Trade size for speed and avoid de-duplicating strings in the fdt. + This costs about 2kB in fdt size, and saves about 8 million instructions + (almost half of all instructions) booting skiboot in mambo. +- fast-reboot:: skip read-only memory checksum for slow simulators + + Skip the fast reboot checksum, which costs about 4 million cycles + booting skiboot in mambo. +- nx: remove check on the "qemu, powernv" property + + commit 95f7b3b9698b ("nx: Don't abort on missing NX when using a QEMU + machine") introduced a check on the property "qemu,powernv" to skip NX + initialization when running under a QEMU machine. + + The QEMU platforms now expose a QUIRK_NO_RNG in the chip. Testing the + "qemu,powernv" property is not necessary anymore. +- plat/qemu: add a POWER8 and POWER9 platform + + These new QEMU platforms have characteristics closer to real OpenPOWER + systems that we use today and define a different BMC depending on the + CPU type. New platform properties are introduced for each, + "qemu,powernv8", "qemu,powernv9" and these should be compatible with + existing QEMUs which only expose the "qemu,powernv" property +- libc/string: speed up common string functions + + Use compiler builtins for the string functions, and compile the + libc/string/ directory with -O2. + + This reduces instructions booting skiboot in mambo by 2.9 million in + slow-sim mode, or 3.8 in normal mode, for less than 1kB image size + increase. + + This can result in the compiler warning more cases of string function + problems. +- external/mambo: Add an option to exit Mambo when the system is shutdown + + Automatically exiting can be convenient for scripting. Will also exit + due to a HW crash (eg. unhandled exception). + +VESNIN platform +--------------- + +Since skiboot v6.3: + +- platforms/vesnin: PCI inventory via IPMI OEM + + Replace raw protocol with OEM message supported by OpenBMC's IPMI + plugins. 
+ + BMC-side implementation (IPMI plug-in): + https://github.com/YADRO-KNS/phosphor-pci-inventory + +Utilities +--------- + +Since skiboot v6.3: + +- opal-gard: Account for ECC size when clearing partition + + When 'opal-gard clear all' is run, it works by erasing the GUARD then + using blocklevel_smart_write() to write nothing to the partition. This + second write call is needed because we rely on libflash to set the ECC + bits appropriately when the partition contained ECCed data. + + The API for this is a little odd with the caller specifying how much + actual data to write, and libflash writing size + size/8 bytes + since there is one additional ECC byte for every eight bytes of data. + + We currently do not account for the extra space consumed by the ECC data + in reset_partition() which is used to handle the 'clear all' command, + which results in the partition following the GUARD partition being + partially overwritten when the command is used. This patch fixes the + problem by reducing the length we would normally write by the number + of ECC bytes required. + + +Build and debugging +------------------- + +Since skiboot v6.3: + +- Disable -Waddress-of-packed-member for GCC9 + + We throw a bunch of errors in errorlog code otherwise, which we should + fix, but we don't *have* to yet. + +- Fix a lot of sparse warnings +- With new GCC comes larger GCOV binaries + + So we need to change our heap size to make more room for data/bss + without having to change where the console is or have more fun moving + things about. +- Intentionally discard fini_array sections + + Produced in a SKIBOOT_GCOV=1 build, and never called by skiboot. +- external/trace: Add follow option to dump_trace + + When monitoring traces, an option like the tail command's '-f' (follow) + is very useful. This option continues to append to the output as more + data arrives. Add an '-f' option to allow dump_trace to operate + similarly.
+ + Tail also provides a '-s' (sleep time) option that + accompanies '-f'. This controls how often new input will be polled. Add + a '-s' option that will make dump_trace sleep for N milliseconds before + checking for new input. +- external/trace: Add support for dumping multiple buffers + + dump_trace only can dump one trace buffer at a time. It would be handy + to be able to dump multiple buffers and to see the entries from these + buffers displayed in correct timestamp order. Each trace buffer is + already sorted by timestamp so use a heap to implement an efficient + k-way merge. Use the CCAN heap to implement this sort. However the CCAN + heap does not have a 'heap_replace' operation. We need to 'heap_pop' + then 'heap_push' to replace the root which means rebalancing twice + instead of once. +- external/trace: mmap trace buffers in dump_trace + + The current lseek/read approach used in dump_trace does not correctly + handle certain aspects of the buffers. It does not use the start and end + position that is part of the buffer so it will not begin from the + correct location. It does not move back to the beginning of the trace + buffer file as the buffer wraps around. It also does not handle the + overflow case of the writer overwriting when the reader is up to. + + Mmap the trace buffer file so that the existing reading functions in + extra/trace.c can be used. These functions already handle the cases of + wrapping and overflow. This reduces code duplication and uses functions + that are already unit tested. However this requires a kernel where the + trace buffer sysfs nodes are able to be mmaped (see + https://patchwork.ozlabs.org/patch/1056786/) +- core/trace: Export trace buffers to sysfs + + Every property in the device-tree under /ibm,opal/firmware/exports has a + sysfs node created in /firmware/opal/exports. Add properties with the + physical address and size for each trace buffer so they are exported. 
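The k-way merge described in the "Add support for dumping multiple buffers" item above can be sketched in a few lines of Python. This illustrates the technique only and is not the dump_trace code: the `merge_traces` name and the `(timestamp, entry)` tuples are assumptions made for the example. Python's `heapq.heapreplace` pops the root and pushes its successor in a single step, the kind of 'heap_replace' operation the note says the CCAN heap lacks.

```python
import heapq


def merge_traces(buffers):
    """Yield (timestamp, cpu, entry) tuples in global timestamp order.

    `buffers` maps a cpu id to a list of (timestamp, entry) pairs that is
    already sorted by timestamp. Both names are hypothetical.
    """
    heap = []
    iters = {}
    for cpu, entries in buffers.items():
        it = iter(entries)
        first = next(it, None)
        if first is not None:
            heap.append((first[0], cpu, first[1]))
            iters[cpu] = it
    heapq.heapify(heap)

    while heap:
        ts, cpu, entry = heap[0]  # smallest timestamp across all buffers
        yield ts, cpu, entry
        nxt = next(iters[cpu], None)
        if nxt is None:
            heapq.heappop(heap)  # this buffer is exhausted
        else:
            # Replace the root with the next entry from the same buffer.
            heapq.heapreplace(heap, (nxt[0], cpu, nxt[1]))


buffers = {
    0: [(1, "a"), (4, "d")],
    1: [(2, "b"), (3, "c")],
}
merged = list(merge_traces(buffers))
```

Only one entry per buffer sits in the heap at any time, so the heap never holds more than k items for k buffers.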
+- core/trace: Add pir number to debug_descriptor + + The names given to the trace buffers when exported to sysfs should show + what cpu they are associated with to make it easier to understand their + output. The debug_descriptor currently stores the address and length of + each trace buffer and this is used for adding properties to the device + tree. Extend debug_descriptor to include a cpu associated with each + trace. This will be used for creating properties in the device-tree + under /ibm,opal/firmware/exports/. +- core/trace: Change trace buffer size + + We want to be able to mmap the trace buffers to be used by the + dump_trace tool. As mmaping is done in terms of pages it makes sense + that the size of the trace buffers should be page aligned. This is + slightly complicated by the space taken up by the header at the + beginning of the trace and the room left for an extra trace entry at the + end of the buffer. Change the size of the buffer itself so that the + entire trace buffer size will be page aligned. +- core/trace: Change buffer alignment from 4K to 64K + + We want to be able to mmap the trace buffers to be used by the + dump_trace tool. This means that the trace buffers must be page + aligned. Currently they are aligned to 4K. Most power systems have a + 64K page size. On systems with a 4K page size, 64K aligned will still be + page aligned. Change the allocation of the trace buffers to be 64K + aligned. + + The trace_info struct that contains the trace buffer is actually what is + allocated aligned memory. This means the trace buffer itself is not + actually aligned and this is the address that is currently exposed + through sysfs. To get around this, change the address that is exposed to + sysfs to be the trace_info struct. This means the lock in trace_info is + now visible too. +- external/trace: Use correct width integer byte swapping + + The trace_repeat struct uses be16 for storing the number of repeats.
+ Currently be32_to_cpu conversion is used to display this member. This + produces an incorrect value. Use be16_to_cpu instead. +- core/trace: Put boot_tracebuf in correct location. + + A position for the boot_tracebuf is allocated in skiboot.lds.S. + However, without a __section attribute the boot trace buffer is not + placed in the correct location, meaning that it also will not be + correctly aligned. Add the __section attribute to ensure it will be + placed in its allocated position. +- core/lock: Add debug options to store backtrace of where lock was taken + + Contrary to popular belief, skiboot developers are imperfect and + occasionally write locking bugs. When we exit skiboot, we check if we're + still holding any locks, and if so, we print an error with a list of the + locks currently held and the locations where they were taken. + + However, this only tells us the location where lock() was called, which may + not be enough to work out what's going on. To give us more to go on with, + we can store backtrace data in the lock and print that out when we + unexpectedly still hold locks. + + Because the backtrace data is rather big, we only enable this if + DEBUG_LOCKS_BACKTRACE is defined, which in turn is switched on when + DEBUG=1. + + (We disable DEBUG_LOCKS_BACKTRACE in some of the memory allocation tests + because the locks used by the memory allocator take up too much room in the + fake skiboot heap.) +- libfdt: upgrade to upstream dtc.git 243176c + + Upgrade libfdt/ to github.com/dgibson/dtc.git 243176c ("Fix bogus + error on rebuild") + + This copies dtc/libfdt/ to skiboot/libfdt/, with the only change in + that directory being the addition of README.skiboot and Makefile.inc. + + This adds about 14kB text, 2.5kB compressed xz. This could be reduced + or mostly eliminated by cutting out fdt version checks and unused + code, but tracking upstream is a bigger benefit at the moment. 
+ + This loses commits: + + - 14ed2b842f61 ("libfdt: add basic sanity check to fdt_open_into") + - bc7bb3d12bc1 ("sparse: fix declaration of fdt_strerror") + + As well as some prehistoric similar kinds of things, which is the + punishment for us not being good downstream citizens and sending + things upstream! Syncing to upstream will make that effort simpler + in future. + +General Fixes +------------- + +Since skiboot v6.4-rc1: + +- libflash: Fix broken continuations + + Some of the libflash debug messages don't print a newline at the end of + the line and assume that the next print will be contiguous with the + last. This isn't true in skiboot since log messages are prefixed with a + timestamp. This results in funny looking output such as: :: + + LIBFLASH: Verifying... + LIBFLASH: reading page 0x01963000..0x01964000...[3.084846885,7] same ! + LIBFLASH: reading page 0x01964000..0x01965000...[3.086164489,7] same ! + + Fix this by moving the "same !" debug message to a new line with the + prefix "LIBFLASH: ..." to indicate it's a continuation of the last + statement. + + First reported in https://github.com/open-power/skiboot/issues/51 diff --git a/roms/skiboot/doc/release-notes/skiboot-6.5.1.rst b/roms/skiboot/doc/release-notes/skiboot-6.5.1.rst new file mode 100644 index 000000000..d1a52b34c --- /dev/null +++ b/roms/skiboot/doc/release-notes/skiboot-6.5.1.rst @@ -0,0 +1,27 @@ +.. _skiboot-6.5.1: + +============== +skiboot-6.5.1 +============== + +skiboot 6.5.1 was released on Thursday October 24th, 2019. It replaces +:ref:`skiboot-6.5` as the current stable release in the 6.5.x series. + +It is recommended that 6.5.1 be used instead of 6.5 version due to the +bug fixes it contains.
+ +Bug fixes included in this release are: + +- core/ipmi: Fix use-after-free + +- blocklevel: smart_write: Fix unaligned writes to ECC partitions + +- gard: Fix data corruption when clearing single records + +- core/platform: Actually disable fast-reboot on P8 + +- xive: fix return value of opal_xive_allocate_irq() + +- MPIPL: struct opal_mpipl_fadump doesn't needs to be packed + +- core/flash: Validate secure boot content size diff --git a/roms/skiboot/doc/release-notes/skiboot-6.5.2.rst b/roms/skiboot/doc/release-notes/skiboot-6.5.2.rst new file mode 100644 index 000000000..07cf9425c --- /dev/null +++ b/roms/skiboot/doc/release-notes/skiboot-6.5.2.rst @@ -0,0 +1,28 @@ +.. _skiboot-6.5.2: + +============== +skiboot-6.5.2 +============== + +skiboot 6.5.2 was released on Monday December 9th, 2019. It replaces +:ref:`skiboot-6.5.1` as the current stable release in the 6.5.x series. + +It is recommended that 6.5.2 be used instead of 6.5.1 version due to the +bug fixes it contains. + +Bug fixes included in this release are: +- libstb/tpm: block access to unknown i2c devs on the tpm bus + +- slw: slw_reinit fix array overrun + +- IPMI: Trigger OPAL TI in abort path. + +- platform/mihawk: Add system VPD EEPROM to I2C bus + +- platform/mihawk: Detect old system compatible string + +- npu2/hw-procedures: Remove assertion from check_credits() + +- npu2-opencapi: Fix integer promotion bug in LPC allocation + +- hw/port80: Squash No SYNC error diff --git a/roms/skiboot/doc/release-notes/skiboot-6.5.3.rst b/roms/skiboot/doc/release-notes/skiboot-6.5.3.rst new file mode 100644 index 000000000..3e6eb1fc5 --- /dev/null +++ b/roms/skiboot/doc/release-notes/skiboot-6.5.3.rst @@ -0,0 +1,24 @@ +.. _skiboot-6.5.3: + +============== +skiboot-6.5.3 +============== + +skiboot 6.5.3 was released on Tuesday March 10th, 2020. It replaces +:ref:`skiboot-6.5.2` as the current stable release in the 6.5.x series. 
+ +It is recommended that 6.5.3 be used instead of 6.5.2 version due to the +bug fixes it contains. + +Bug fixes included in this release are: +- npu2-opencapi: Don't drive reset signal permanently + +- mpipl: Rework memory reservation for OPAL dump + +- mpipl: Disable fast-reboot during post MPIPL boot + +- hdata: Update MPIPL support IPL parameter + +- xscom: Don't log xscom errors caused by OPAL calls + +- npu2: Clear fence on all bricks diff --git a/roms/skiboot/doc/release-notes/skiboot-6.5.4.rst b/roms/skiboot/doc/release-notes/skiboot-6.5.4.rst new file mode 100644 index 000000000..b91914edc --- /dev/null +++ b/roms/skiboot/doc/release-notes/skiboot-6.5.4.rst @@ -0,0 +1,16 @@ +.. _skiboot-6.5.4: + +============== +skiboot-6.5.4 +============== + +skiboot 6.5.4 was released on Friday March 20th, 2020. It replaces +:ref:`skiboot-6.5.3` as the current stable release in the 6.5.x series. + +It is recommended that 6.5.4 be used instead of 6.5.3 version due to the +bug fixes it contains. + +Bug fixes included in this release are: +- errorlog: Increase the severity of abnormal reboot events + +- eSEL: Make sure PANIC logs are sent to BMC before calling assert diff --git a/roms/skiboot/doc/release-notes/skiboot-6.5.rst b/roms/skiboot/doc/release-notes/skiboot-6.5.rst new file mode 100644 index 000000000..1c16e950e --- /dev/null +++ b/roms/skiboot/doc/release-notes/skiboot-6.5.rst @@ -0,0 +1,20 @@ +.. _skiboot-6.5: + +skiboot-6.5 +=========== + +skiboot v6.5 was released on Friday August 16th 2019. It is the first +release of skiboot 6.5, which becomes the new stable release +of skiboot following the :ref:`skiboot-6.4` release, first released May 3rd 2019. + +.. _skiboot-6.5-new-features: + +New features +------------ + +Support for Memory-preserving IPL (MPIPL) which will be the basis for more +reliable creation of crash dumps, and for crash dumps from OPAL. + +Support for the Swift platform and NPU3 hardware. + +Support for the Mihawk platform.
diff --git a/roms/skiboot/doc/release-notes/skiboot-6.6.1.rst b/roms/skiboot/doc/release-notes/skiboot-6.6.1.rst new file mode 100644 index 000000000..dc1c66369 --- /dev/null +++ b/roms/skiboot/doc/release-notes/skiboot-6.6.1.rst @@ -0,0 +1,31 @@ +.. _skiboot-6.6.1: + +============== +skiboot-6.6.1 +============== + +skiboot 6.6.1 was released on Saturday June 06, 2020. It replaces +:ref:`skiboot-6.6` as the current stable release in the 6.6.x series. + +It is recommended that 6.6.1 be used instead of 6.6 version due to the +bug fixes it contains. + +Bug fixes included in this release are: + +- occ: Fix false negatives in wait_for_all_occ_init() + +- uart: Drop console write data if BMC becomes unresponsive + +- hw/phys-map: Fix OCAPI_MEM BAR values + +- Detect fused core mode and bail out + +- platform/mihawk: Tune equalization settings for opencapi + +- hdata/memory.c: Fix "Inconsistent MSAREA" warnings + +- PSI: Convert prerror to PR_NOTICE + +- sensors: occ: Fix a bug when sensor values are zero + +- sensors: occ: Fix the GPU detection code diff --git a/roms/skiboot/doc/release-notes/skiboot-6.6.2.rst b/roms/skiboot/doc/release-notes/skiboot-6.6.2.rst new file mode 100644 index 000000000..69a344ec2 --- /dev/null +++ b/roms/skiboot/doc/release-notes/skiboot-6.6.2.rst @@ -0,0 +1,17 @@ +.. _skiboot-6.6.2: + +============== +skiboot-6.6.2 +============== + +skiboot 6.6.2 was released on Friday July 03, 2020. It replaces +:ref:`skiboot-6.6.1` as the current stable release in the 6.6.x series. + +It is recommended that 6.6.2 be used instead of 6.6.1 version due to the +bug fixes it contains. 
+ +Bug fixes included in this release are: + +- fsp: Skip sysdump retrieval only in MPIPL boot + +- platform/mihawk: Fix IPMI double-free diff --git a/roms/skiboot/doc/release-notes/skiboot-6.6.3.rst b/roms/skiboot/doc/release-notes/skiboot-6.6.3.rst new file mode 100644 index 000000000..9130513a3 --- /dev/null +++ b/roms/skiboot/doc/release-notes/skiboot-6.6.3.rst @@ -0,0 +1,21 @@ +.. _skiboot-6.6.3: + +============== +skiboot-6.6.3 +============== + +skiboot 6.6.3 was released on Thursday Sep 10, 2020. It replaces +:ref:`skiboot-6.6.2` as the current stable release in the 6.6.x series. + +It is recommended that 6.6.3 be used instead of 6.6.2 version due to the +bug fixes it contains. + +Bug fixes included in this release are: + +- fsp/dump: Handle non-MPIPL scenario + +- hw/phb4: Verify AER support before initialising AER regs + +- hw/phb4: Actually enable error reporting + +- hdata: Add new "smp-cable-connector" VPD keyword diff --git a/roms/skiboot/doc/release-notes/skiboot-6.6.4.rst b/roms/skiboot/doc/release-notes/skiboot-6.6.4.rst new file mode 100644 index 000000000..84016bce3 --- /dev/null +++ b/roms/skiboot/doc/release-notes/skiboot-6.6.4.rst @@ -0,0 +1,18 @@ +.. _skiboot-6.6.4: + +============== +skiboot-6.6.4 +============== + +skiboot 6.6.4 was released on Thursday Oct 22nd, 2020. It replaces +:ref:`skiboot-6.6.3` as the current stable release in the 6.6.x series. + +It is recommended that 6.6.4 be used instead of 6.6.3 version due to the +bug fixes it contains. + + +Bug fixes included in this release are: + +- asm/head: fix power save wakeup register corruption + +- FSP/NVRAM: Do not assert in vNVRAM statistics call diff --git a/roms/skiboot/doc/release-notes/skiboot-6.6.rst b/roms/skiboot/doc/release-notes/skiboot-6.6.rst new file mode 100644 index 000000000..b463fed06 --- /dev/null +++ b/roms/skiboot/doc/release-notes/skiboot-6.6.rst @@ -0,0 +1,65 @@ +.. 
_skiboot-6.6: + +skiboot-6.6 +=========== + +skiboot v6.6 was released on Wednesday April 22nd 2020. It is the first release +of skiboot 6.6 series, which becomes the new stable release following the +:ref:`skiboot-6.5` release, first released August 16th 2019. + +There hasn't been a skiboot release in a while and this release doesn't contain +a huge number of new features for users, just a lot of bug fixes, and additional +platform support. The next release should be a little more lively with a number +of internal refactorings and new features on the way. + +.. _skiboot-6.6-new-features: + +New features +------------ + +- Skiboot is now dual licensed as Apache 2.0 -OR- GPLv2+ + + There are some files still licensed Apache 2.0 only due to contributions + that we are unable to change the license of, but they are the minority. + +- Skiboot can now be built as little endian, thanks to Team Nick. + + Doing so requires building with: make LITTLE_ENDIAN=1 + +- OpenCAPI reset support + + This is to allow FPGA-based OpenCAPI devices to be re-flashed with a new + device image, then reset to activate the new image. Although it is based + on top of the existing PCI Hotplug support it does require some OS changes + to function. + +- The :ref:`OPAL_PHB_SET_OPTION` and :ref:`OPAL_PHB_GET_OPTION` OPAL calls + + These OPAL calls provide the OS with a means for controlling per-PHB + settings. Currently this allows the OS to enable or disable the "Global + MMIO EEH Disable" and "4GTE" settings which are available on Power9 / PHB4. + See the PHB specification for more details. + +Removed features +---------------- + +- Fast-reboot is now disabled by default. + + Fast-reboot will continue to be supported, but as an opt-in feature rather + than the default. From the commit (ee07f2c68160) message:: + + This has two user visible changes: + + 1. Full reboot is now the default. 
In order to get fast-reboot as the + default the nvram option needs to be set: + + nvram -p ibm,skiboot --update-config fast-reset=1 + + 2. The nvram option to force a fast-reboot even when some part of + skiboot has called disable_fast_reboot() has changed from + 'fast-reset=im-feeling-lucky' to 'force-fast-reset=1' because + it's impossible to actually use that 'feature' if fast-reboot is + off by default. + + nvram -p ibm,skiboot --update-config force-fast-reset=1 + diff --git a/roms/skiboot/doc/release-notes/skiboot-6.7.1.rst b/roms/skiboot/doc/release-notes/skiboot-6.7.1.rst new file mode 100644 index 000000000..7937885ea --- /dev/null +++ b/roms/skiboot/doc/release-notes/skiboot-6.7.1.rst @@ -0,0 +1,33 @@ +.. _skiboot-6.7.1: + +============== +skiboot-6.7.1 +============== + +skiboot 6.7.1 was released on Wednesday January 06, 2021. It replaces +:ref:`skiboot-6.7` as the current stable release in the 6.7.x series. + +It is recommended that 6.7.1 be used instead of 6.7 version due to the +bug fixes it contains. + +Bug fixes included in this release are: + +- SBE: Account cancelled timer request + +- SBE: Rate limit timer requests + +- SBE: Check timer state before scheduling timer + +- platform/mowgli: Limit PHB0/(pec0) to gen3 speed + +- Revert "mowgli: Limit slot1 to Gen3 by default" + +- xscom: Fix xscom error logging caused due to xscom OPAL call + +- xive/p9: Remove assert from xive_eq_for_target() + +- Fix possible deadlock with DEBUG build + +- core/platform: Fallback to full_reboot if fast-reboot fails + +- core/cpu: fix next_ungarded_primary diff --git a/roms/skiboot/doc/release-notes/skiboot-6.7.2.rst b/roms/skiboot/doc/release-notes/skiboot-6.7.2.rst new file mode 100644 index 000000000..05d0c8e14 --- /dev/null +++ b/roms/skiboot/doc/release-notes/skiboot-6.7.2.rst @@ -0,0 +1,29 @@ +.. _skiboot-6.7.2: + +============== +skiboot-6.7.2 +============== + +skiboot 6.7.2 was released on Wednesday June 30, 2021. 
It replaces +:ref:`skiboot-6.7.1` as the current stable release in the 6.7.x series. + +It is recommended that 6.7.2 be used instead of 6.7.1 version due to the +bug fixes it contains. + +Bug fixes included in this release are: + +- secvar: fix endian conversion + +- secvar/secvar_util: Properly free memory on zalloc fail + +- edk2-compat-process.c: Remove repetitive debug print statements + +- phb4: Avoid MMIO load freeze escalation on every chip + +- phb4: Disable TCE cache line buffer + +- hw/imc: Disable only nest_imc devices if pause_microcode() fail + +- hw/imc: move imc_init() towards end main_cpu_entry() + +- Fix lock error when BT IRQ preempt BT timer diff --git a/roms/skiboot/doc/release-notes/skiboot-6.7.rst b/roms/skiboot/doc/release-notes/skiboot-6.7.rst new file mode 100644 index 000000000..4e2db7311 --- /dev/null +++ b/roms/skiboot/doc/release-notes/skiboot-6.7.rst @@ -0,0 +1,37 @@ +.. _skiboot-6.7: + +skiboot-6.7 +=========== + +skiboot v6.7 was released on Tuesday November 3rd 2020. It is the first release +of skiboot 6.7 series, which becomes the new stable release following the +:ref:`skiboot-6.6` release, first released Wednesday April 22nd 2020. + +The main reason for this release is the addition of secure variable support and +the Mowgli platform. Aside from these feature, this release is largely bug-fixes. +However, this is expected since we're approaching the end of the P9 product cycle +and development has largely shifted towards enabling a future processor with a +difficult-to-guess name. + +.. _skiboot-6.7-new-features: + +New features +------------ + +- Secure Variable support + + The secure variable API provides the host operating system with space to + store cryptographic keys for OS secure boot. The security comes from the + requirement that all secure variable updates be cryptographically signed + so the keys used to verify the secure boot chain can only be updated by + a user authorized to do so. 
+ +- Fleetwood platform support + + Support was added for the multi-node IBM Fleetwood systems. This support + was largely for internal IBM testing purposes and is not, and will not, ever + be officially supported. + +- Mowgli platform support + + Support was added for the Mowgli platform built by Wistron. diff --git a/roms/skiboot/doc/release-notes/skiboot-6.8.rst b/roms/skiboot/doc/release-notes/skiboot-6.8.rst new file mode 100644 index 000000000..41c1c06dc --- /dev/null +++ b/roms/skiboot/doc/release-notes/skiboot-6.8.rst @@ -0,0 +1,12 @@ +.. _skiboot-6.8: + +skiboot-6.8 +=========== + +skiboot v6.8 was released on Friday May 28th 2021. It is the first release +of skiboot 6.8 series, which becomes the new stable release following the +:ref:`skiboot-6.7` release, first released Tuesday November 3rd 2020. + +This release is entirely focused on bug-fixes and probably the first time we have +not added any new feature. However, this is expected as development has +largely shifted towards enabling a future processor. |