aboutsummaryrefslogtreecommitdiffstats
path: root/roms/skiboot/doc/release-notes/skiboot-5.7.rst
blob: 9a967c841a16507e952e9bda712dd29653e03582 (plain)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
596
597
598
599
600
601
602
603
604
605
606
607
608
609
610
611
612
613
614
615
616
617
618
619
620
621
622
623
624
625
626
627
628
629
630
631
632
633
634
635
636
637
638
639
640
641
642
643
644
645
646
647
648
649
650
651
652
653
654
655
656
657
658
659
660
661
662
663
664
665
666
667
668
669
670
671
672
673
674
675
676
677
678
679
680
681
682
683
684
685
686
687
688
689
690
691
692
693
694
695
696
697
698
699
700
701
702
703
704
705
706
707
708
709
710
711
712
713
714
715
716
717
718
719
720
721
722
723
724
725
726
727
728
729
730
731
732
733
734
735
736
737
738
739
740
741
742
743
744
745
746
747
748
749
750
751
752
753
754
755
756
757
758
759
760
761
762
763
764
765
766
767
768
769
770
771
772
773
774
775
776
777
778
779
780
781
782
783
784
785
786
787
788
789
790
791
792
793
794
795
796
797
798
799
800
801
802
803
804
805
806
807
808
809
810
811
812
813
814
815
816
817
818
819
820
821
822
823
824
825
826
827
828
829
830
831
832
833
834
835
836
837
838
839
840
841
842
843
844
845
846
847
848
849
850
851
852
853
854
855
856
857
858
859
860
861
862
863
864
865
866
867
868
869
870
871
872
873
874
875
876
877
878
879
880
881
882
883
884
885
886
887
888
889
890
891
892
893
894
895
896
897
898
899
900
901
902
903
904
905
906
907
908
909
910
911
912
913
914
915
916
917
918
919
920
921
922
923
924
925
926
927
928
929
930
931
932
933
934
935
936
937
938
939
940
941
942
943
944
945
946
947
948
949
950
951
952
953
954
955
956
957
958
959
960
961
962
963
964
965
966
967
968
969
970
971
972
973
974
975
976
977
978
979
980
981
982
983
984
985
986
987
988
989
990
991
992
993
994
995
996
997
998
999
1000
1001
1002
1003
1004
1005
1006
1007
1008
1009
1010
1011
1012
1013
1014
1015
1016
1017
1018
1019
1020
1021
1022
1023
1024
1025
1026
1027
1028
1029
1030
1031
1032
1033
1034
1035
1036
1037
1038
1039
1040
1041
1042
1043
1044
1045
1046
1047
1048
1049
1050
1051
1052
1053
1054
1055
1056
1057
1058
1059
1060
1061
1062
1063
1064
1065
1066
1067
1068
1069
1070
1071
1072
1073
1074
1075
1076
1077
1078
1079
1080
1081
1082
1083
1084
1085
1086
1087
1088
1089
1090
1091
1092
1093
1094
1095
1096
1097
1098
1099
1100
1101
1102
1103
1104
1105
1106
1107
1108
1109
1110
1111
1112
1113
1114
1115
1116
1117
1118
1119
1120
1121
1122
1123
1124
1125
1126
1127
1128
1129
1130
1131
1132
1133
1134
1135
1136
1137
1138
1139
1140
1141
1142
1143
1144
1145
1146
1147
1148
1149
1150
1151
1152
1153
1154
1155
1156
1157
1158
1159
1160
1161
1162
1163
1164
1165
1166
1167
1168
1169
1170
1171
1172
1173
1174
1175
1176
1177
1178
1179
1180
1181
1182
1183
1184
1185
1186
1187
1188
1189
1190
1191
1192
1193
1194
1195
1196
1197
1198
1199
1200
1201
1202
1203
1204
1205
1206
1207
1208
1209
1210
1211
1212
1213
1214
1215
1216
1217
1218
1219
1220
1221
1222
1223
1224
1225
1226
1227
1228
1229
1230
1231
1232
1233
1234
1235
1236
1237
1238
1239
1240
1241
1242
1243
1244
1245
1246
1247
1248
1249
1250
1251
1252
1253
1254
1255
1256
1257
1258
1259
1260
1261
1262
1263
1264
1265
1266
1267
1268
1269
1270
1271
1272
1273
1274
1275
1276
1277
1278
1279
1280
1281
1282
1283
1284
1285
1286
1287
1288
1289
1290
1291
1292
1293
1294
1295
1296
1297
1298
1299
1300
1301
1302
1303
1304
1305
1306
1307
1308
1309
1310
1311
1312
1313
1314
1315
1316
1317
1318
1319
1320
1321
1322
1323
1324
1325
1326
1327
1328
1329
1330
1331
1332
1333
1334
1335
1336
1337
1338
1339
1340
1341
1342
1343
1344
1345
1346
1347
1348
1349
1350
1351
1352
1353
1354
1355
1356
1357
1358
1359
1360
1361
1362
1363
1364
1365
1366
1367
1368
1369
1370
1371
1372
1373
1374
1375
1376
1377
1378
1379
1380
1381
1382
1383
1384
1385
1386
1387
1388
1389
1390
1391
1392
1393
1394
1395
1396
1397
1398
1399
1400
1401
1402
1403
1404
1405
1406
1407
1408
1409
1410
1411
1412
1413
1414
1415
1416
1417
1418
1419
1420
1421
1422
1423
1424
1425
1426
1427
1428
1429
1430
1431
1432
1433
1434
1435
1436
1437
1438
1439
1440
1441
1442
1443
1444
1445
1446
1447
1448
1449
1450
1451
1452
1453
1454
1455
1456
1457
1458
1459
1460
1461
1462
1463
1464
1465
1466
1467
1468
1469
1470
1471
1472
1473
1474
1475
1476
1477
1478
1479
1480
1481
1482
1483
1484
1485
1486
1487
1488
1489
1490
1491
1492
1493
1494
1495
1496
1497
1498
1499
1500
1501
1502
1503
1504
1505
1506
1507
1508
.. _skiboot-5.7:

skiboot-5.7
===========

skiboot v5.7 was released on Tuesday July 25th 2017. It follows two
release candidates of skiboot 5.7, and is now the new stable release
of skiboot following the 5.6 release, first released 24th May 2017.

skiboot v5.7 contains all bug fixes as of :ref:`skiboot-5.4.6`
and :ref:`skiboot-5.1.19` (the currently maintained stable releases). We
do not currently expect to do any 5.6.x stable releases.

For how the skiboot stable releases work, see :ref:`stable-rules` for details.

POWER9 is still in development, and thus all POWER9 users must upgrade
to skiboot v5.7.

This is the second release using the new regular six week release cycle,
similar to op-build, but slightly offset to allow for a short stabilisation
period. Expected release dates and contents are tracked using GitHub milestone
and issues: https://github.com/open-power/skiboot/milestones

New Features
------------

Since :ref:`skiboot-5.6.0`, we have a few new features:

New features in this release for POWER9 systems:

- In Memory Counters (IMC) (See :ref:`imc` for details)
- phb4: Activate shared PCI slot on witherspoon (see :ref:`Shared Slot <shared-slot-rn>`)
- phb4 capi (i.e. CAPI2): Enable capi mode for PHB4 (see :ref:`CAPI on PHB4 <capi2-rn>`)

New feature for IBM FSP based systems:

- fsp/tpo: Provide support for disabling TPO alarm

  This patch adds support for disabling a preconfigured
  Timed-Power-On(TPO) alarm on FSP based systems. Presently once a TPO alarm
  is configured from the kernel it will be triggered even if its
  subsequently disabled.

  With this patch a TPO alarm can be disabled by passing
  y_m_d==hr_min==0 to fsp_opal_tpo_write(). A branch is added to the
  function to handle this case by sending FSP_CMD_TPO_DISABLE message to
  the FSP instead of usual FSP_CMD_TPO_WRITE message. The kernel is
  expected to call opal_tpo_write() with y_m_d==hr_min==0 to request
  opal to disable TPO alarm.


POWER9
------
There are many important changes for POWER9 DD1 and DD2 systems. POWER9 support
should be considered in development and skiboot 5.7 is certainly **NOT**
suitable for POWER9 production environments.

Since :ref:`skiboot-5.7-rc2`:

- platform/witherspoon: Enable eSEL logging

  OpenBMC stack added IPMI OEM extension to log eSEL events.
  Lets enable eSEL logging from OPAL side.

  See: https://github.com/openbmc/openpower-host-ipmi-oem/blob/d9296050bcece5c2eca5ede0932d944b0ced66c9/oemhandler.cpp#L142
  (yes, that is the documentation)
- hdat/i2c: Fix array version check
- mem_region: Check for no-map in reserved nodes

  Regions with the no-map property should be handled seperately to
  "normal" firmware reservations. When creating mem_region regions
  from a reserved-memory DT node use the no-map property to select
  the right reservation type.

- hdata/memory: Add memory reservations to the DT

  Currently we just add these to a list of pre-boot reserved regions
  which is then converted into a the contents of the /reserved-memory/
  node just before Skiboot jumps into the firmware kernel.

  This approach is insufficent because we need to add the ibm,prd-instance
  labels to the various hostboot reserved regions. To do this we want to
  create these resevation nodes inside the HDAT parser rather than having
  the mem_region flattening code handle it. On P8 systems Hostboot placed
  its memory reservations under the /ibm,hostboot/ node and this patch
  makes the HDAT parser do the same.

Since Since :ref:`skiboot-5.7-rc1`:

- HDAT: Add IPMI sensor data under /bmc node
- numa/associativity: Add a new level of NUMA for GPU's

  Today we have an issue where the NUMA nodes corresponding
  to GPU's have the same affinity/distance as normal memory
  nodes. Our reference-points today supports two levels
  [0x4, 0x4] for normal systems and [0x4, 0x3] for Power8E
  systems. This patch adds a new level [0x4, X, 0x2] and
  uses node-id as at all levels for the GPU.
- xive: Enable memory backing of queues

  This dedicates 6x64k pages of memory permanently for the XIVE to
  use for internal queue overflow. This allows the XIVE to deal with
  some corner cases where the internal queues might prove insufficient.

- xive: Properly get rid of donated indirect pages during reset

  Otherwise they keep being used accross kexec causing memory
  corruption in subsequent kernels once KVM has been used.

- cpu: Better handle unknown flags in opal_reinit_cpus()

  At the moment, if we get passed flags we don't know about, we
  return OPAL_UNSUPPORTED but we still perform whatever actions
  was requied by the flags we do support. Additionally, on P8,
  we attempt a SLW re-init which hasn't been supported since
  Murano DD2.0 and will crash your system.

  It's too late to fix on existing systems so Linux will have to
  be careful at least on P8, but to avoid future issues let's clean
  that up, make sure we only use slw_reinit() when HILE isn't
  supported.
- cpu: Unconditionally cleanup TLBs on P9 in opal_reinit_cpus()

  This can work around problems where Linux fails to properly
  cleanup part or all of the TLB on kexec.

- Fix scom addresses for power9 nx checkstop hmi handling.

  Scom addresses for NX status, DMA & ENGINE FIR and PBI FIR has changed
  for Power9. Fixup thoes while handling nx checkstop for Power9.
- Fix scom addresses for power9 core checkstop hmi handling.

  Scom addresses for CORE FIR (Fault Isolation Register) and Malfunction
  Alert Register has changed for Power9. Fixup those while handling core
  checkstop for Power9.

  Without this change HMI handler fails to check for correct reason for
  core checkstop on Power9.

- core/mem_region: check return value of add_region

  The only sensible thing to do if this fails is to abort() as we've
  likely just failed reserving reserved memory regions, and nothing
  good comes from that.

Since Since :ref:`skiboot-5.6.0`:

- hdata: Reserve Trace Areas

  When hostboot is configured to setup in memory tracing it will reserve
  some memory for use by the hardware tracing facility. We need to mark
  these areas as off limits to the operating system and firmware.
- hdata: Make out-of-range idata print at PR_DEBUG

  Some fields just aren't populated on some systems.

- hdata: Ignore unnamed memory reservations.

  Hostboot should name any and all memory reservations that it provides.
  Currently some hostboots export a broken reservation covering the first
  256MB of memory and this causes the system to crash at boot due to an
  invalid free because this overlaps with the static "ibm,os-reserve"
  region (which covers the first 768MB of memory).

  According to the hostboot team unnamed reservations are invalid and can
  be ignored.

- hdata: Check the Host I2C devices array version

  Currently this is not populated on FSP machines which causes some
  obnoxious errors to appear in the boot log. We also only want to
  parse version 1 of this structure since future versions will completely
  change the array item format.

- Ensure P9 DD1 workarounds apply only to Nimbus

  The workarounds for P9 DD1 are only needed for Nimbus. P9 Cumulus will
  be DD1 but don't need these same workarounds.

  This patch ensures the P9 DD1 workarounds only apply to Nimbus. It
  also renames some things to make clear what's what.

- cpu: Cleanup AMR and IAMR when re-initializing CPUs

  There's a bug in current Linux kernels leaving crap in those registers
  accross kexec and not sanitizing them on boot. This breaks kexec under
  some circumstances (such as booting a hash kernel from a radix one
  on P9 DD2.0).

  The long term fix is in Linux, but this workaround is a reasonable
  way of "sanitizing" those SPRs when Linux calls opal_reinit_cpus()
  and shouldn't have adverse effects.

  We could also use that same mechanism to cleanup other things as
  well such as restoring some other SPRs to their default value in
  the future.

- Set POWER9 RPR SPR to 0x00000103070F1F3F.  Same value as P8.

  Without this, thread priorities inside a core don't work.

- cpu: Support setting HID[RADIX] and set it by default on P9

  This adds new opal_reinit_cpus() flags to setup radix or hash
  mode in HID[8] on POWER9.

  By default HID[8] will be set. On P9 DD1.0, Linux will change
  it as needed. On P9 DD2.0 hash works in radix mode (radix is
  really "dual" mode) so KVM won't break and existing kernels
  will work.

  Newer kernels built for hash will call this to clear the HID bit
  and thus get the full size of the TLB as an optimization.

- Add "cleanup_global_tlb" for P9 and later

  Uses broadcast TLBIE's to cleanup the TLB on all cores and on
  the nest MMU

- xive: DD2.0 updates

  Add support for StoreEOI, fix StoreEOI MMIO offset in ESB page,
  and other cleanups

- Update default TSCR value for P9 as recommended by HW folk.

- xive: Fix initialisation of xive_cpu_state struct

  When using XIVE emulation with DEBUG=1, we run into crashes in log_add()
  due to the xive_cpu_state->log_pos being uninitialised (and thus, with
  DEBUG enabled, initialised to the poison value of 0x99999999).


PHB4
^^^^

Since :ref:`skiboot-5.7-rc2`:

- phb4: Add link training trace mode

  Add a mode to PHB4 to trace training process closely. This activates
  as soon as PERST is deasserted and produces human readable output of
  the process.

  This may increase training times since it duplicates some of the
  training code.  This code has it's own simple checks for fence and
  timeout but will fall through to the default training code once done.

  Output produced, looks like the "TRACE:" lines below: ::

      [    3.410799664,7] PHB#0001[0:1]: FRESET: Starts
      [    3.410802000,7] PHB#0001[0:1]: FRESET: Prepare for link down
      [    3.410806624,7] PHB#0001[0:1]: FRESET: Assert skipped
      [    3.410808848,7] PHB#0001[0:1]: FRESET: Deassert
      [    3.410812176,3] PHB#0001[0:1]: TRACE: 0x0000000101000000  0ms
      [    3.417170176,3] PHB#0001[0:1]: TRACE: 0x0000100101000000 12ms presence
      [    3.436289104,3] PHB#0001[0:1]: TRACE: 0x0000180101000000 49ms training
      [    3.436373312,3] PHB#0001[0:1]: TRACE: 0x00001d0811000000 49ms trained
      [    3.436420752,3] PHB#0001[0:1]: TRACE: Link trained.
      [    3.436967856,7] PHB#0001[0:1]: LINK: Start polling
      [    3.437482240,7] PHB#0001[0:1]: LINK: Electrical link detected
      [    3.437996864,7] PHB#0001[0:1]: LINK: Link is up
      [    4.438000048,7] PHB#0001[0:1]: LINK: Link is stable

  Enabled via nvram using: ::

      nvram -p ibm,skiboot --update-config pci-tracing=true

- phb4: Improve reset and link training timing

  This improves PHB reset and link training timing.

- phb4: Add phb4_check_reg() to sanity check failures

  This adds a function phb4_check_reg() to sanity check when we do MMIO
  reads from the PHB to make sure it's not fenced.

- phb4: Remove retry on electrical link timeout

  Currently we retry if we don't detect an electrical link. This is
  pointless as all devices should respond in the given time.

  This patches removes this retry and just returns OPAL_HARDWARE if we
  don't detect an electrical link.

  This has the additional benefit of improving boot times on machines
  that have badly wired presence detect (ie. says a device is present
  when there isn't).

- phb4: Read PERST signal rather than assuming it's asserted

  Currently we assume on boot that PERST is asserted so that we can skip
  having to assert it ourselves.

  This instead reads the PERST status and determines if we need to
  assert it based on that.

- phb4: Fix endian of TLP headers print

  Byte swap TLP headers so they are the same as the PCIe spec.
- phb4: Change timeouts prints to error level

  If the link doesn't have a electrical link or the link doesn't train
  we should make that more obvious to the user.
- phb4: Better logs why the slot didn't work

  Better logs why the slot didn't work and make it a PR_ERR so users
  see it by default.

- phb4: Force verbose EEH logging

  Force verbose EEH. This is a heavy handed and we should turn if off
  later as things stabilise, but is useful for now.
- phb4: Initialization sequence updates

  Mostly errata workarounds, some DD1 specific.

  The step Init_5 was moved to Init_16, so the numbering was updated to
  reflect this.

Since :ref:`skiboot-5.7-rc1`:

- phb4: Do more retries on link training failures
  Currently we only retry once when we have a link training failure.
  This changes this to be 3 retries as 1 retry is not giving us enough
  reliablity.

  This will increase the boot time, especially on systems where we
  incorrectly detect a link presence when there really is nothing
  present. I'll post a followup patch to optimise our timings to help
  mitigate this later.

- phb4: Workaround phy lockup by doing full PHB reset on retry

  For PHB4 it's possible that the phy may end up in a bad state where it
  can no longer recieve data. This can manifest as the link not
  retraining. A simple PERST will not clear this. The PHB must be
  completely reset.

  This changes the retry state to CRESET to do this.

  This issue may also manifest itself as the link training in a degraded
  state (lower speed or narrower width). This patch doesn't attempt to
  fix that (will come later).
- pci: Add ability to trace timing

  PCI link training is responsible for a huge chunk of the skiboot boot
  time, so add the ability to trace it waiting in the main state
  machine.
- pci: Print resetting PHB notice at higher log level

  Currently during boot there a long delay while we wait for the PHBs to
  be reset and train. During this time, there is no output from skiboot
  and the last message doesn't give an indication of what's happening.

  This boosts the PHB reset message from info to notice so users can see
  what's happening during this long period of waiting.
- phb4: Only set one bit in nfir

  The MPIPL procedure says to only set bit 26 when forcing the PEC into
  freeze mode. Currently we set bits 24-27.

  This changes the code to follow spec and only set bit 26.
- phb4: Fix order of pfir/nfir clearing in CRESET

  According to the workbook, pfir must be cleared before the nfir.
  The way we have it now causes the nfir to not clear properly in some
  error circumstances.

  This swaps the order to match the workbook.
- phb4: Remove incorrect state transition

  When waiting in PHB4_SLOT_CRESET_WAIT_CQ for transations to end, we
  incorrectly move onto the next state.  Generally we don't hit this as
  the transactions have ended already anyway.

  This removes the incorrect state transition.
- phb4: Set default lane equalisation

  Set default lane equalisation if there is nothing in the device-tree.

  Default value taken from hdat and confirmed by hardware team. Neatens
  the code up a bit too.
- hdata: Fix phb4 lane-eq property generation

  The lane-eq data we get from hdat is all 7s but what we end up in the
  device tree is: ::

    xscom@603fc00000000/pbcq@4010c00/stack@0/ibm,lane-eq
                     00000000 31c339e0 00000000 0000000c
                     00000000 00000000 00000000 00000000
                     00000000 31c30000 77777777 77777777
                     77777777 77777777 77777777 77777777

  This fixes grabbing the properties from hdat and fixes the call to put
  them in the device tree.
- phb4: Fix PHB4 fence recovery.

  We had a few problems:

  - We used the wrong register to trigger the reset (spec bug)
  - We should clear the PFIR and NFIR while the reset is asserted
  - ... and in the right order !
  - We should only apply the DD1 workaround after the reset has
    been lifted.
  - We should ensure we use ASB whenever we are fenced or doing a
    CRESET
  - Make config ops write with ASB
- phb4: Verbose EEH options

  Enabled via nvram pci-eeh-verbose=true. ie. ::

    nvram -p ibm,skiboot --update-config pci-eeh-verbose=true
- phb4: Print more info when PHB fences

  For now at PHBERR level. We don't have room in the diags data
  passed to Linux for these unfortunately.

Since :ref:`skiboot-5.6.0`:

- phb4: Fix number of index bits in IODA tables

  On PHB4 the number of index bits in the IODA table address register
  was bumped to 10 bits to accomodate for 1024 MSIs and 1024 TVEs (DD2).

  However our macro only defined the field to be 9 bits, thus causing
  "interesting" behaviours on some systems.

- phb4: Harden init with bad PHBs

  Currently if we read all 1's from the EEH or IRQ capabilities, we end
  up train wrecking on some other random code (eg. an assert() in xive).

  This hardens the PHB4 code to look for these bad reads and more
  gracefully fails the init for that PHB alone.  This allows the rest of
  the system to boot and ignore those bad PHBs.

- phb4 capi (i.e. CAPI2): Handle HMI events

  Find the CAPP on the chip associated with the HMI event for PHB4.
  The recovery mode (re-initialization of the capp, resume of functional
  operations) is only available with P9 DD2. A new patch will be provided
  to support this feature.

.. _capi2-rn:

- phb4 capi (i.e. CAPI2): Enable capi mode for PHB4

  Enable the Coherently attached processor interface. The PHB is used as
  a CAPI interface.
  CAPI Adapters can be connected to either PEC0 or PEC2. Single port
  CAPI adapter can be connected to either PEC0 or PEC2, but Dual-Port
  Adapter can be only connected to PEC2
  * CAPP0 attached to PHB0(PEC0 - single port)
  * CAPP1 attached to PHB3(PEC2 - single or dual port)

- hw/phb4: Rework phb4_get_presence_state()

  There are two issues in current implementation: It should return errcode
  visibile to Linux, which has prefix OPAL_*. The code isn't very obvious.

  This returns OPAL_HARDWARE when the PHB is broken. Otherwise, OPAL_SUCCESS
  is always returned. In the mean while, It refactors the code to make it
  obvious: OPAL_PCI_SLOT_PRESENT is returned when the presence signal (low active)
  or PCIe link is active. Otherwise, OPAL_PCI_SLOT_EMPTY is returned.

- phb4: Error injection for config space

  Implement CFG (config space) error injection.

  This works the same as PHB3.  MMIO and DMA error injection require a
  rewrite, so they're unsupported for now.

  While it's not feature complete, this at least provides an easy way to
  inject an error that will trigger EEH.

- phb4: Error clear implementation
- phb4: Mask link down errors during reset

  During a hot reset the PCI link will drop, so we need to mask link down
  events to prevent unnecessary errors.
- phb4: Implement root port initialization

  phb4_root_port_init() was a NOP before, so fix that.
- phb4: Complete reset implementation

  This implements complete reset (creset) functionality for POWER9 DD1.

  Only partially tested and contends with some DD1 errata, but it's a start.

.. _shared-slot-rn:

- phb4: Activate shared PCI slot on witherspoon

  Witherspoon systems come with a 'shared' PCI slot: physically, it
  looks like a x16 slot, but it's actually two x8 slots connected to two
  PHBs of two different chips. Taking advantage of it requires some
  logic on the PCI adapter. Only the Mellanox CX5 adapter is known to
  support it at the time of this writing.

  This patch enables support for the shared slot on witherspoon if a x16
  adapter is detected. Each x8 slot has a presence bit, so both bits
  need to be set for the activation to take place. Slot sharing is
  activated through a gpio.

  Note that there's no easy way to be sure that the card is indeed a
  shared-slot compatible PCI adapter and not a normal x16 card. Plugging
  a normal x16 adapter on the shared slot should be avoided on
  witherspoon, as the link won't train on the second slot, resulting in
  a timeout and a longer boot time. Only the first slot is usable and
  the x16 adapter will end up using only half the lines.

  If the PCI card plugged on the physical slot is only x8 (or less),
  then the presence bit of the second slot is not set, so this patch
  does nothing. The x8 (or less) adapter should work like on any other
  physical slot.

- phb4: Block D-state power management on direct slots

  As current revisions of PHB4 don't properly handle the resulting
  L1 link transition.

- phb4: Call pci config filters

- phb4: Mask out write-1-to-clear registers in RC cfg

  The root complex config space only supports 4-byte accesses. Thus, when
  the client requests a smaller size write, we do a read-modify-write to
  the register.

  However, some register have bits defined as "write 1 to clear".

  If we do a RMW cycles on such a register and such bits are 1 in the
  part that the client doesn't intend to modify, we will accidentally
  write back those 1's and clear the corresponding bit.

  This avoids it by masking out those magic bits from the "old" value
  read from the register.

- phb4: Properly mask out link down errors during reset
- phb3/4: Silence a useless warning

  PHB's don't have base location codes on non-FSP systems and it's
  normal.

- phb4: Workaround bug in spec 053

  Wait for DLP PGRESET to clear *after* lifting the PCIe core reset

- phb4: DD2.0 updates

  Support StoreEOI, full complements of PEs (twice as big TVT)
  and other updates.

  Also renumber init steps to match spec 063

NPU2
^^^^

Note that currently NPU2 support is limited to POWER9 DD1 hardware.

Since :ref:`skiboot-5.6.0`:

- platforms/astbmc/witherspoon.c: Add NPU2 slot mappings

  For NVLink2 to function PCIe devices need to be associated with the right
  NVLinks. This association is supposed to be passed down to Skiboot via HDAT but
  those fields are still not correctly filled out. To work around this we add slot
  tables for the NVLinks similar to what we have for P8+.

- hw/npu2.c: Fix device aperture calculation

  The POWER9 NPU2 implements an address compression scheme to compress 56-bit P9
  physical addresses to 47-bit GPU addresses. System software needs to know both
  addresses, unfortunately the calculation of the compressed address was
  incorrect. Fix it here.

- hw/npu2.c: Change MCD BAR allocation order

  MCD BARs need to be correctly aligned to the size of the region. As GPU
  memory is allocated from the top of memory down we should start allocating
  from the highest GPU memory address to the lowest to ensure correct
  alignment.

- NPU2: Add flag to nvlink config space indicating DL reset state

  Device drivers need to be able to determine if the DL is out of reset or
  not so they can safely probe to see if links have already been trained.
  This patch adds a flag to the vendor specific config space indicating if
  the DL is out of reset.

- hw/npu2.c: Hardcode MSR_SF when setting up npu XTS contexts

  We don't support anything other than 64-bit mode for address translations so we
  can safely hardcode it.

- hw/npu2-hw-procedures.c: Add nvram option to override zcal calculations

  In some rare cases the zcal state machine may fail and flag an error. According
  to hardware designers it is sometimes ok to ignore this failure and use nominal
  values for the calculations. In this case we add a nvram variable
  (nv_zcal_override) which will cause skiboot to ignore the failure and use the
  nominal value specified in nvram.
- npu2: Fix npu2_{read,write}_4b()

  When writing or reading 4-byte values, we need to use the upper half of
  the 64-bit SCOM register.

  Fix npu2_{read,write}_4b() and their callers to use uint32_t, and
  appropriately shift the value being written or returned.


- hw/npu2.c: Fix opal_npu_map_lpar to search for existing BDF
- hw/npu2-hw-procedures.c: Fix running of zcal procedure

    The zcal procedure should only be run once per obus (ie. once per group of 3
    links). Clean up the code and fix the potential buffer overflow due to a typo.
    Also updates the zcal settings to their proper values.
- hw/npu2.c: Add memory coherence directory programming

  The memory coherence directory (MCD) needs to know which system memory addresses
  belong to the GPU. This amounts to setting a BAR and a size in the MCD to cover
  the addresses assigned to each of the GPUs. To ease assignment we assume GPUs
  are assigned memory in a contiguous block per chip.

OCC/Power Management
^^^^^^^^^^^^^^^^^^^^

With this release, it's possible to boot POWER9 systems with the OCC
enabled and change CPU frequencies. Doing so does require other firmware
components to also support this (otherwise the frequency will not be set).

Since :ref:`skiboot-5.6.0`:

- occ: Skip setting cores to nominal frequency in P9

  In P9, once OCC is up, it is supposed to setup the cores to nominal
  frequency. So skip this step in OPAL.
- occ: Fix Pstate ordering for P9

  In P9 the pstate values are positive. They are continuous set of
  unsigned integers [0 to +N] where Pmax is 0 and Pmin is N. The
  linear ordering of pstates for P9 has changed compared to P8.
  P8 has neagtive pstate values advertised as [0 to -N] where Pmax
  is 0 and Pmin is -N. This patch adds helper routines to abstract
  pstate comparison with pmax and adds sanity pstate limit checks.
  This patch also fixes pstate arithmetic by using labs().
- p8-i2c: occ: Add support for OCC to use I2C engines

  This patch adds support to share the I2C engines with host and OCC.
  OCC uses I2C engines to read DIMM temperatures and to communicate with
  GPU. OCC Flag register is used for locking between host and OCC. Host
  requests for the bus by setting a bit in OCC Flag register. OCC sends
  an interrupt to indicate the change in ownership.

opal-prd/PRD
^^^^^^^^^^^^

Since :ref:`skiboot-5.6.0`:

- opal-prd: Handle SBE passthrough message passing

  This patch adds support to send SBE pass through command to HBRT.
- SBE: Add passthrough command support

  SBE sends passthrough command. We have to capture this interrupt and
  send event to HBRT via opal-prd (user space daemon).
- opal-prd: hook up reset_pm_complex

  This change provides the facility to invoke HBRT's reset_pm_complex, in
  the same manner is done with process_occ_reset previously.

  We add a control command for `opal-prd pm-complex reset`, which is just
  an alias for occ_reset at this stage.

- prd: Implement firmware side of opaque PRD channel

  This change introduces the firmware side of the opaque HBRT <--> OPAL
  message channel. We define a base message format to be shared with HBRT
  (in include/prd-fw-msg.h), and allow firmware requests and responses to
  be sent over this channel.

  We don't currently have any notifications defined, so have nothing to do
  for firmware_notify() at this stage.

- opal-prd: Add firmware_request & firmware_notify implementations

  This change adds the implementation of firmware_request() and
  firmware_notify(). To do this, we need to add a message queue, so that
  we can properly handle out-of-order messages coming from firmware.

- opal-prd: Add support for variable-sized messages

  With the introductuion of the opaque firmware channel, we want to
  support variable-sized messages. Rather than expecting to read an
  entire 'struct opal_prd_msg' in one read() call, we can split this
  over mutiple reads, potentially expanding our message buffer.

- opal-prd: Sync hostboot interfaces with HBRT

  This change adds new callbacks defined for p9, and the base thunks for
  the added calls.

- opal-prd: interpret log level prefixes from HBRT

  Interpret the (optional) \*_MRK log prefixes on HBRT messages, and set
  the syslog log priority to suit.

- opal-prd: Add occ reset to usage text
- opal-prd: allow different chips for occ control actions

  The `occ reset` and `occ error` actions can both take a chip id
  argument, but we're currently just using zero. This change changes the
  control message format to pass the chip ID from the control process to
  the opal-prd daemon.


IBM FSP based platforms
-----------------------

Since :ref:`skiboot-5.7-rc2`:

- FSP/CONSOLE: Do not enable input irq in write path

  We use irq for reading input from console, but not in output path.
  Hence do not enable input irq in write path.

  Fixes : 583c8203 (fsp/console: Allocate irq for each hvc console)

Since :ref:`skiboot-5.6.0`:

- FSP/CONSOLE: Fix possible NULL dereference
- platforms/ibm-fsp/firenze: Fix PCI slot power-off pattern

  When powering off the PCI slot, the corresponding bits should
  be set to 0bxx00xx00 instead of 0bxx11xx11. Otherwise, the
  specified PCI slot can't be put into power-off state. Fortunately,
  it didn't introduce any side-effects so far.
- FSP/CONSOLE: Workaround for unresponsive ipmi daemon

  We use TCE mapped area to write data to console. Console header
  (fsp_serbuf_hdr) is modified by both FSP and OPAL (OPAL updates
  next_in pointer in fsp_serbuf_hdr and FSP updates next_out pointer).

  Kernel makes opal_console_write() OPAL call to write data to console.
  OPAL write data to TCE mapped area and sends MBOX command to FSP.
  If our console becomes full and we have data to write to console,
  we keep on waiting until FSP reads data.

  In some corner cases, where FSP is active but not responding to
  console MBOX message (due to buggy IPMI) and we have heavy console
  write happening from kernel, then eventually our console buffer
  becomes full. At this point OPAL starts sending OPAL_BUSY_EVENT to
  kernel. Kernel will keep on retrying. This is creating kernel soft
  lockups. In some extreme case when every CPU is trying to write to
  console, user will not be able to ssh and thinks system is hang.

  If we reset FSP or restart IPMI daemon on FSP, system recovers and
  everything becomes normal.

  This patch adds workaround to above issue by returning OPAL_HARDWARE
  when cosole is full. Side effect of this patch is, we may endup dropping
  latest console data. But better to drop console data than system hang.

- FSP: Set status field in response message for timed out message

  For timed out FSP messages, we set message status as "fsp_msg_timeout".
  But most FSP driver users (like surviellance) are ignoring this field.
  They always look for FSP returned status value in callback function
  (second byte in word1). So we endup treating timed out message as success
  response from FSP.

  Sample output: ::

    [69902.432509048,7] SURV: Sending the heartbeat command to FSP
    [70023.226860117,4] FSP: Response from FSP timed out, word0 = d66a00d7, word1 = 0 state: 3
    ....
    [70023.226901445,7] SURV: Received heartbeat acknowledge from FSP
    [70023.226903251,3] FSP: fsp_trigger_reset() entry

  Here SURV code thought it got valid response from FSP. But actually we didn't
  receive response from FSP.

  This patch fixes above issue by updating status field in response structure.

- FSP: Improve timeout message

- FSP/RTC: Fix possible FSP R/R issue in rtc write path
- hw/fsp/rtc: read/write cached rtc tod on fsp hir.

  Currently fsp-rtc reads/writes the cached RTC TOD on an fsp
  reset. Use latest fsp_in_rr() function to properly read the cached rtc
  value when fsp reset initiated by the hir.

  Below is the kernel trace when we set hw clock, when hir process starts. ::

    [ 1727.775824] NMI watchdog: BUG: soft lockup - CPU#57 stuck for 23s! [hwclock:7688]
    [ 1727.775856] Modules linked in: vmx_crypto ibmpowernv ipmi_powernv uio_pdrv_genirq ipmi_devintf powernv_op_panel uio ipmi_msghandler powernv_rng leds_powernv ip_tables x_tables autofs4 ses enclosure scsi_transport_sas crc32c_vpmsum lpfc ipr tg3 scsi_transport_fc
    [ 1727.775883] CPU: 57 PID: 7688 Comm: hwclock Not tainted 4.10.0-14-generic #16-Ubuntu
    [ 1727.775883] task: c000000fdfdc8400 task.stack: c000000fdfef4000
    [ 1727.775884] NIP: c00000000090540c LR: c0000000000846f4 CTR: 000000003006dd70
    [ 1727.775885] REGS: c000000fdfef79a0 TRAP: 0901   Not tainted  (4.10.0-14-generic)
    [ 1727.775886] MSR: 9000000000009033 <SF,HV,EE,ME,IR,DR,RI,LE>
    [ 1727.775889]   CR: 28024442  XER: 20000000
    [ 1727.775890] CFAR: c00000000008472c SOFTE: 1
                   GPR00: 0000000030005128 c000000fdfef7c20 c00000000144c900 fffffffffffffff4
                   GPR04: 0000000028024442 c00000000090540c 9000000000009033 0000000000000000
                   GPR08: 0000000000000000 0000000031fc4000 c000000000084710 9000000000001003
                   GPR12: c0000000000846e8 c00000000fba0100
    [ 1727.775897] NIP [c00000000090540c] opal_set_rtc_time+0x4c/0xb0
    [ 1727.775899] LR [c0000000000846f4] opal_return+0xc/0x48
    [ 1727.775899] Call Trace:
    [ 1727.775900] [c000000fdfef7c20] [c00000000090540c] opal_set_rtc_time+0x4c/0xb0 (unreliable)
    [ 1727.775901] [c000000fdfef7c60] [c000000000900828] rtc_set_time+0xb8/0x1b0
    [ 1727.775903] [c000000fdfef7ca0] [c000000000902364] rtc_dev_ioctl+0x454/0x630
    [ 1727.775904] [c000000fdfef7d40] [c00000000035b1f4] do_vfs_ioctl+0xd4/0x8c0
    [ 1727.775906] [c000000fdfef7de0] [c00000000035bab4] SyS_ioctl+0xd4/0xf0
    [ 1727.775907] [c000000fdfef7e30] [c00000000000b184] system_call+0x38/0xe0
    [ 1727.775908] Instruction dump:
    [ 1727.775909] f821ffc1 39200000 7c832378 91210028 38a10020 39200000 38810028 f9210020
    [ 1727.775911] 4bfffe6d e8810020 80610028 4b77f61d <60000000> 7c7f1b78 3860000a 2fbffff4

  This is found when executing the testcase
  https://github.com/open-power/op-test-framework/blob/master/testcases/fspresetReload.py

  With this fix ran fsp hir torture testcase in the above test
  which is working fine.
- occ: Set return variable to correct value

  When entering this section of code rc will be zero. If fsp_mkmsg() fails
  the code responsible for printing an error message won't be set.
  Resetting rc should allow for the error case to trigger if fsp_mkmsg
  fails.
- capp: Fix hang when CAPP microcode LID is missing on FSP machine

  When the LID is absent, we fail early with an error from
  start_preload_resource. In that case, capp_ucode_info.load_result
  isn't set properly causing a subsequent capp_lid_download() to
  call wait_for_resource_loaded() on something that isn't being
  loaded, thus hanging.

- FSP: Add check to detect FSP R/R inside fsp_sync_msg()

  OPAL sends MBOX message to FSP and updates message state from fsp_msg_queued
  -> fsp_msg_sent. fsp_sync_msg() queues message and waits until we get response
  from FSP. During FSP R/R we move outstanding MBOX messages from msgq to rr_queue
  including inflight message (fsp_reset_cmdclass()). But we are not resetting
  inflight message state.

  In extreme croner case where we sent message to FSP via fsp_sync_msg() path
  and FSP R/R happens before getting respose from FSP, then we will endup waiting
  in fsp_sync_msg() until everything becomes normal.

  This patch adds fsp_in_rr() check to fsp_sync_msg() and return error to caller
    if FSP is in R/R.
- FSP: Add check to detect FSP R/R inside fsp_sync_msg()

  OPAL sends MBOX message to FSP and updates message state from fsp_msg_queued
  -> fsp_msg_sent. fsp_sync_msg() queues message and waits until we get response
  from FSP. During FSP R/R we move outstanding MBOX messages from msgq to rr_queue
  including inflight message (fsp_reset_cmdclass()). But we are not resetting
  inflight message state.

  In extreme croner case where we sent message to FSP via fsp_sync_msg() path
  and FSP R/R happens before getting respose from FSP, then we will endup waiting
  in fsp_sync_msg() until everything becomes normal.

  This patch adds fsp_in_rr() check to fsp_sync_msg() and return error to caller
    if FSP is in R/R.
- capp: Fix hang when CAPP microcode LID is missing on FSP machine

  When the LID is absent, we fail early with an error from
  start_preload_resource. In that case, capp_ucode_info.load_result
  isn't set properly causing a subsequent capp_lid_download() to
  call wait_for_resource_loaded() on something that isn't being
  loaded, thus hanging.
- FSP/CONSOLE: Do not free fsp_msg in error path

  as we reuse same msg to send next output message.

- platform/zz: Acknowledge OCC_LOAD mbox message in ZZ

  In P9 FSP box, OCC image is pre-loaded. So do not handle the load
  command and send SUCCESS to FSP on recieving OCC_LOAD mbox message.

- FSP/RTC: Improve error log

astbmc systems
--------------

Since :ref:`skiboot-5.6.0`:

- platforms/astbmc: Don't validate model on palmetto

  The platform isn't compatible with palmetto until the root device-tree
  node's "model" property is NULL or "palmetto". However, we could have
  "TN71-BP012" for the property on palmetto. ::

       linux# cat /proc/device-tree/model
       TN71-BP012

  This skips the validation on root device-tree node's "model" property
  on palmetto, meaning we check the "compatible" property only.


General
-------

Since :ref:`skiboot-5.7-rc2`:

- core/pci: Fix mem-leak on fast-reboot

  Fast-reboot has a memory leak which causes the system to crash after about
  250 fast-reboots. The patch fixes the memory leak.
  The cause of the leak was the pci_device's being freed, without freeing
  the pci_slot within it.

- gcov: properly handle gard and pflash code coverage

Since :ref:`skiboot-5.6.0`:

- Reduce log level on non-error log messages

  90% of what we print isn't useful to a normal user. This
  dramatically reduces the amount of messages printed by
  OPAL in normal circumstances.

- init: Silence messages and call ourselves "OPAL"
- psi: Switch to ESB mode later

  There's an errata, if we switch to ESB mode before setting up
  the various ESB mode related registers, a pending interrupts
  can go wrong.

- lpc: Enable "new" SerIRQ mode
- hw/ipmi/ipmi-sel: missing newline in prlog warning

- p8-i2c OCC lock: fix locking in p9_i2c_bus_owner_change
- Convert important polling loops to spin at lowest SMT priority

  The pattern of calling cpu_relax() inside a polling loop does
  not suit the powerpc SMT priority instructions. Prefrred is to
  set a low priority then spin until break condition is reached,
  then restore priority.

- Improve cpu_idle when PM is disabled

  Split cpu_idle() into cpu_idle_delay() and cpu_idle_job() rather than
  requesting the idle type as a function argument. Have those functions
  provide a default polling (non-PM) implentation which spin at the
  lowest SMT priority.

- core/fdt: Always add a reserve map

  Currently we skip adding the reserved ranges block to the generated
  FDT blob if we are excluding the root node. This can result in a DTB
  that dtc will barf on because the reserved memory ranges overlap with
  the start of the dt_struct block. As an example: ::

    $ fdtdump broken.dtb -d
    /dts-v1/;
    // magic:               0xd00dfeed
    // totalsize:           0x7f3 (2035)
    // off_dt_struct:       0x30  <----\
    // off_dt_strings:      0x7b8       | this is bad!
    // off_mem_rsvmap:      0x30  <----/
    // version:             17
    // last_comp_version:   16
    // boot_cpuid_phys:     0x0
    // size_dt_strings:     0x3b
    // size_dt_struct:      0x788

    /memreserve/ 0x100000000 0x300000004;
    /memreserve/ 0x3300000001 0x169626d2c;
    /memreserve/ 0x706369652d736c6f 0x7473000000000003;
            *continues*

  With this patch: ::

    $ fdtdump working.dtb -d
    /dts-v1/;
    // magic:               0xd00dfeed
    // totalsize:           0x803 (2051)
    // off_dt_struct:       0x40
    // off_dt_strings:      0x7c8
    // off_mem_rsvmap:      0x30
    // version:             17
    // last_comp_version:   16
    // boot_cpuid_phys:     0x0
    // size_dt_strings:     0x3b
    // size_dt_struct:      0x788

    // 0040: tag: 0x00000001 (FDT_BEGIN_NODE)
    / {
    // 0048: tag: 0x00000003 (FDT_PROP)
    // 07fb: string: phandle
    // 0054: value
        phandle = <0x00000001>;
            *continues*

- hw/lpc-mbox: Use message registers for interrupts

  Currently the BMC raises the interrupt using the BMC control register.
  It does so on all accesses to the 16 'data' registers meaning that when
  the BMC only wants to set the ATTN (on which we have interrupts enabled)
  bit we will also get a control register based interrupt.

  The solution here is to mask that interrupt permanantly and enable
  interrupts on the protocol defined 'response' data byte.

PCI
---

Since :ref:`skiboot-5.6.0`:

- pci: Wait 20ms before checking presence detect on PCIe

  As the PHB presence logic has a debounce timer that can take
  a while to settle.

- phb3+iov: Fixup support for config space filters

  The filter should be called before the HW access and its
  return value control whether to perform the access or not
- core/pci: Use PCI slot's power facality in pci_enable_bridge()

  The current implmentation has incorrect assumptions: there is
  always a PCI slot associated with root port and PCIe switch
  downstream port and all of them are capable to change its
  power state by register PCICAP_EXP_SLOTCTL. Firstly, there
  might not a PCI slot associated with the root port or PCIe
  switch downstream port. Secondly, the power isn't controlled
  by standard config register (PCICAP_EXP_SLOTCTL). There are
  I2C slave devices used to control the power states on Tuleta.

  In order to use the PCI slot's methods to manage the power
  states, this does:

  * Introduce PCI_SLOT_FLAG_ENFORCE, indicates the request operation
    is enforced to be applied.
  * pci_enable_bridge() is split into 3 functions: pci_bridge_power_on()
    to power it on; pci_enable_bridge() as a place holder and
    pci_bridge_wait_link() to wait the downstream link to come up.
  * In pci_bridge_power_on(), the PCI slot's specific power management
    methods are used if there is a PCI slot associated with the PCIe
    switch downstream port or root port.
- platforms/astbmc/slots.c: Allow comparison of bus numbers when matching slots

  When matching devices on multiple down stream PLX busses we need to compare more
  than just the device-id of the PCIe BDFN, so increase the mask to do so.

Debugging, Tests and simulators
-------------------------------

Since :ref:`skiboot-5.7-rc2`:

- boot_tests: add PFLASH_TO_COPY for OpenBMC
- travis: Add debian stretch and unstable

  At the moment, we mark them both as being able to fail, as we're
  hitting an assert in one of the unit tests on debian stretch, and
  that hasn't yet been chased down.

- core/backtrace: Serialise printing backtraces

  Add a lock so that only one thread can print a backtrace at a time.
  This should prevent multiple threads from garbaling each other's
  backtraces.

Since :ref:`skiboot-5.7-rc1`:

- lpc: remove double LPC prefix from messages
- opal-ci/fetch-debian-jessie-installer: follow redirects
  Fixes some CI failures
- test/qemu-jessie: bail out fast on kernel panic
- test/qemu-jessie: dump boot log on failure
- travis: add fedora26
- xz: add fallthrough annotations to silence GCC7 warning

Since :ref:`skiboot-5.6.0`:

- boot-tests: add OpenBMC support
- boot_test.sh: Add SMC BMC support

  Your BMC needs a special debug image flashed to use this, the exact
  image and methods aren't something I can publish here, but if you work
  for IBM or SMC you can find out from the right sources.

  A few things are needed to move around to be able to flash to a SMC BMC.

  For a start, the SSH daemon will only accept connections after a special
  incantation (which I also can't share), but you should put that in the
  ~/.skiboot_boot_tests file along with some other default login information
  we don't publicise too broadly (because Security Through Obscurity is
  *obviously* a good idea....)

  We also can't just directly "ssh /bin/true", we need an expect script,
  and we can't scp, but we can anonymous rsync!

  You also need a pflash binary to copy over.
- hdata_to_dt: Add PVR overrides to the usage text
- mambo: Add a reservation for the initramfs

  On most systems the initramfs is loaded inside the part of memory
  reserved for the OS [0x0-0x30000000] and skiboot will never touch it.
  On mambo it's loaded at 0x80000000 and if you're unlucky skiboot can
  allocate over the top of it and corrupt the initramfs blob.

  There might be the downside that the kernel cannot re-use the initramfs
  memory since it's marked as reserved, but the kernel might also free it
  anyway.
- mambo: Update P9 PVR to reflect Scale out 24 core chips

  The P9 PVR bits 48:51 don't indicate a revision but instead different
  configurations.  From BookIV we have:

  ==== ===================
  Bits Configuration
  ==== ===================
     0 Scale out 12 cores
     1 Scale out 24 cores
     2 Scale up 12 cores
     3 Scale up 24 cores
  ==== ===================

  Skiboot will mostly the use "Scale out 24 core" configuration
  (ie. SMT4 not SMT8) so reflect this in mambo.
- core: Move enable_mambo_console() into chip initialisation

  Rather than having a wart in main_cpu_entry() that initialises the mambo
  console, we can move it into init_chips() which is where we discover that we're
  on mambo.

- mambo: Create multiple chips when we have multiple CPUs

  Currently when we boot mambo with multiple CPUs, we create multiple CPU nodes in
  the device tree, and each claims to be on a separate chip.

  However we don't create multiple xscom nodes, which means skiboot only knows
  about a single chip, and all CPUs end up on it. At the moment mambo is not able
  to create multiple xscom controllers. We can create fake ones, just by faking
  the device tree up, but that seems uglier than this solution.

  So create a mambo-chip for each CPU other than 0, to tell skiboot we want a
  separate chip created. This then enables Linux to see multiple chips: ::

      smp: Brought up 2 nodes, 2 CPUs
      numa: Node 0 CPUs: 0
      numa: Node 1 CPUs: 1

- chip: Add support for discovering chips on mambo

  Currently the only way for skiboot to discover chips is by looking for xscom
  nodes. But on mambo it's currently not possible to create multiple xscom nodes,
  which means we can only simulate a single chip system.

  However it seems we can fairly cleanly add support for a special mambo chip
  node, and use that to instantiate multiple chips.

  Add a check in init_chip() that we're not clobbering an already initialised
  chip, now that we have two places that initialise chips.
- mambo: Make xscom claim to be DD 2.0

  In the mambo tcl we set the CPU version to DD 2.0, because mambo is not
  bug compatible with DD 1.

  But in xscom_read_cfam_chipid() we have a hard coded value, to work
  around the lack of the f000f register, which claims to be P9 DD 1.0.

  This doesn't seem to cause crashes or anything, but at boot we do see: ::

      [    0.003893084,5] XSCOM: chip 0x0 at 0x1a0000000000 [P9N DD1.0]

  So fix it to claim that the xscom is also DD 2.0 to match the CPU.

- mambo: Match whole string when looking up symbols with linsym/skisym

  linsym/skisym use a regex to match the symbol name, and accepts a
  partial match against the entry in the symbol map, which can lead to
  somewhat confusing results, eg: ::

      systemsim % linsym early_setup
      0xc000000000027890
      systemsim % linsym early_setup$
      0xc000000000aa8054
      systemsim % linsym early_setup_secondary
      0xc000000000027890

  I don't think that's the behaviour we want, so append a $ to the name so
  that the symbol has to match against the whole entry, eg: ::

      systemsim % linsym early_setup
      0xc000000000aa8054

- Disable nap on P8 Mambo, public release has bugs
- mambo: Allow loading multiple CPIOs

  Currently we have support for loading a single CPIO and telling Linux to
  use it as the initrd. But the Linux code actually supports having
  multiple CPIOs contiguously in memory, between initrd-start and end, and
  will unpack them all in order. That is a really nice feature as it means
  you can have a base CPIO with your root filesystem, and then tack on
  others as you need for various tests etc.

  So expand the logic to handle SKIBOOT_INITRD, and treat it as a comma
  separated list of CPIOs to load. I chose comma as it's fairly rare in
  filenames, but we could make it space, colon, whatever. Or we could add
  a new environment variable entirely. The code also supports trimming
  whitespace from the values, so you can have "cpio1, cpio2".
- hdata/test: Add memory reservations to hdata_to_dt

  Currently memory reservations are parsed, but since they are not
  processed until mem_region_init() they don't appear in the output
  device tree blob. Several bugs have been found with memory reservations
  so we want them to be part of the test output.

  Add them and clean up several usages of printf() since we want only the
  dtb to appear in standard out.


pflash/libffs
-------------

Since :ref:`skiboot-5.7-rc2`:

- pflash option to retrieve PNOR partition flags

  This commit extends pflash with an option to retrieve and print
  information for a particular partition, including the content from
  "pflash -i" and a verbose list of set miscellaneous flags. -i option
  is also updated to print a short list of flags in addition to the
  ECC flag, with one character per flag. A test of the new option is
  included in libflash/test.

Since :ref:`skiboot-5.6.0`:

- libflash/libffs: Zero checksum words

  On writing ffs entries to flash libffs doesn't zero checksum words
  before calculating the checksum across the entire structure. This causes
  an inaccurate calculation of the checksum as it may calculate a checksum
  on non-zero checksum bytes.

- libffs: Fix ffs_lookup_part() return value

  It would return success when the part wasn't found
- libflash/libffs: Correctly update the actual size of the partition

  libffs has been updating FFS partition information in the wrong place
  which leads to incomplete erases and corruption.
- libflash: Initialise entries list earlier

  In the bail-out path we call ffs_close() to tear down the partially
  initialised ffs_handle. ffs_close() expects the entries list to be
  initialised so we need to do that earlier to prevent a null pointer
  dereference.

mbox-flash
----------

mbox-flash is the emerging standard way of talking to host PNOR flash
on POWER9 systems.

- libflash/mbox-flash: Implement MARK_WRITE_ERASED mbox call

  Version two of the mbox-flash protocol defines a new command:
  MARK_WRITE_ERASED.

  This command provides a simple way to mark a region of flash as all 0xff
  without the need to go and write all 0xff. This is an optimisation as
  there is no need for an erase before a write, it is the responsibility of
  the BMC to deal with the flash correctly, however in v1 it was ambiguous
  what a client should do if the flash should be erased but not actually
  written to. This allows of a optimal path to resolve this problem.

- libflash/mbox-flash: Update to V2 of the protocol

  Updated version 2 of the protocol can be found at:
  https://github.com/openbmc/mboxbridge/blob/master/Documentation/mbox_protocol.md

  This commit changes mbox-flash such that it will preferentially talk
  version 2 to any capable daemon but still remain capable of talking to
  v1 daemons.

  Version two changes some of the command definitions for increased
  consistency and usability.
  Version two includes more attention bits - these are now dealt with at a
  simple level.
- libflash/mbox-flash: Implement MARK_WRITE_ERASED mbox call

  Version two of the mbox-flash protocol defines a new command:
  MARK_WRITE_ERASED.

  This command provides a simple way to mark a region of flash as all 0xff
  without the need to go and write all 0xff. This is an optimisation as
  there is no need for an erase before a write, it is the responsibility of
  the BMC to deal with the flash correctly, however in v1 it was ambiguous
  what a client should do if the flash should be erased but not actually
  written to. This allows of a optimal path to resolve this problem.

- libflash/mbox-flash: Update to V2 of the protocol

  Updated version 2 of the protocol can be found at:
  https://github.com/openbmc/mboxbridge/blob/master/Documentation/mbox_protocol.md

  This commit changes mbox-flash such that it will preferentially talk
  version 2 to any capable daemon but still remain capable of talking to
  v1 daemons.

  Version two changes some of the command definitions for increased
  consistency and usability.
  Version two includes more attention bits - these are now dealt with at a
  simple level.

- hw/lpc-mbox: Use message registers for interrupts

  Currently the BMC raises the interrupt using the BMC control register.
  It does so on all accesses to the 16 'data' registers meaning that when
  the BMC only wants to set the ATTN (on which we have interrupts enabled)
  bit we will also get a control register based interrupt.

  The solution here is to mask that interrupt permanantly and enable
  interrupts on the protocol defined 'response' data byte.


Contributors
------------

* Processed 232 csets from 29 developers.
* 1 employer found
* A total of 13043 lines added, 2517 removed (delta 10526)

Extending the analysis done for some previous releases, we can see our trends
in code review across versions:

======= ====== ======== ========= ========= ===========
Release	csets  Ack %    Reviews % Tested %  Reported %
======= ====== ======== ========= ========= ===========
5.0	329    15 (5%)  20 (6%)   1 (0%)    0 (0%)
5.1	372    13 (3%)  38 (10%)  1 (0%)    4 (1%)
5.2-rc1	334    20 (6%)  34 (10%)  6 (2%)    11 (3%)
5.3-rc1	302    36 (12%) 53 (18%)  4 (1%)    5 (2%)
5.4	361    16 (4%)  28 (8%)   1 (0%)    9 (2%)
5.5	408    11 (3%)  48 (12%)  14 (3%)   10 (2%)
5.6	87     12 (14%)  6 (7%)   5 (6%)    2 (2%)
5.7	232    30 (13%) 32 (14%)  5 (2%)    2 (1%)
======= ====== ======== ========= ========= ===========

This cycle has been good for reviews/acks, scoring second highest percentage
ever on both, as well as being right up there on absolute numbers.


Developers with the most changesets
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

========================= ==== =======
Developer                    #       %
========================= ==== =======
Benjamin Herrenschmidt      41 (17.7%)
Stewart Smith               31 (13.4%)
Michael Neuling             28 (12.1%)
Oliver O'Halloran           18 (7.8%)
Vasant Hegde                18 (7.8%)
Jeremy Kerr                 12 (5.2%)
Alistair Popple             11 (4.7%)
Gavin Shan                  10 (4.3%)
Russell Currey               9 (3.9%)
Michael Ellerman             9 (3.9%)
Madhavan Srinivasan          7 (3.0%)
Cyril Bur                    6 (2.6%)
Christophe Lombard           5 (2.2%)
Shilpasri G Bhat             5 (2.2%)
Andrew Donnellan             3 (1.3%)
Nicholas Piggin              3 (1.3%)
Mahesh Salgaonkar            2 (0.9%)
Anju T Sudhakar              2 (0.9%)
Hemant Kumar                 2 (0.9%)
Matt Brown                   1 (0.4%)
Michael Tritz                1 (0.4%)
Joel Stanley                 1 (0.4%)
Balbir Singh                 1 (0.4%)
Frederic Barrat              1 (0.4%)
Andrew Jeffery               1 (0.4%)
Pridhiviraj Paidipeddi       1 (0.4%)
Reza Arbab                   1 (0.4%)
Suraj Jitindar Singh         1 (0.4%)
Vaibhav Jain                 1 (0.4%)
========================= ==== =======


Developers with the most changed lines
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

========================= ==== =======
Developer                    #       %
========================= ==== =======
Hemant Kumar              3056 (23.0%)
Stewart Smith             1826 (13.7%)
Benjamin Herrenschmidt    1348 (10.1%)
Christophe Lombard         937 (7.0%)
Shilpasri G Bhat           770 (5.8%)
Madhavan Srinivasan        755 (5.7%)
Jeremy Kerr                731 (5.5%)
Cyril Bur                  674 (5.1%)
Alistair Popple            477 (3.6%)
Gavin Shan                 414 (3.1%)
Russell Currey             396 (3.0%)
Michael Neuling            336 (2.5%)
Vasant Hegde               308 (2.3%)
Oliver O'Halloran          300 (2.3%)
Anju T Sudhakar            300 (2.3%)
Michael Tritz              167 (1.3%)
Frederic Barrat            113 (0.8%)
Nicholas Piggin             93 (0.7%)
Mahesh Salgaonkar           76 (0.6%)
Michael Ellerman            66 (0.5%)
Suraj Jitindar Singh        59 (0.4%)
Andrew Donnellan            53 (0.4%)
Joel Stanley                20 (0.2%)
Balbir Singh                12 (0.1%)
Reza Arbab                  10 (0.1%)
Vaibhav Jain                 9 (0.1%)
Pridhiviraj Paidipeddi       2 (0.0%)
Matt Brown                   1 (0.0%)
Andrew Jeffery               1 (0.0%)
========================= ==== =======

Developers with the most signoffs
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(total 242)

========================= ==== =======
Developer                    #       %
========================= ==== =======
Stewart Smith              201 (83.1%)
Michael Neuling             29 (12.0%)
Madhavan Srinivasan          4 (1.7%)
Suraj Jitindar Singh         3 (1.2%)
Anju T Sudhakar              2 (0.8%)
Hemant Kumar                 2 (0.8%)
Cyril Bur                    1 (0.4%)
========================= ==== =======

Developers with the most reviews
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(total 32)

========================= ==== =======
Developer                    #       %
========================= ==== =======
Vasant Hegde                 8 (25.0%)
Cyril Bur                    7 (21.9%)
Andrew Donnellan             5 (15.6%)
Frederic Barrat              5 (15.6%)
Andrew Jeffery               2 (6.2%)
Gavin Shan                   2 (6.2%)
Joel Stanley                 1 (3.1%)
Oliver O'Halloran            1 (3.1%)
Alistair Popple              1 (3.1%)
========================= ==== =======

Developers with the most test credits
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(total 5)

========================== ==== =======
Developer                    #       %
========================== ==== =======
Vasant Hegde                  2 (40.0%)
Oliver O'Halloran             1 (20.0%)
Ananth N Mavinakayanahalli    1 (20.0%)
Michael Ellerman              1 (20.0%)
========================== ==== =======

Developers who gave the most tested-by credits
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(total 5)

========================= ==== =======
Developer                    #       %
========================= ==== =======
Jeremy Kerr                  2 (40.0%)
Vasant Hegde                 1 (20.0%)
Oliver O'Halloran            1 (20.0%)
Michael Ellerman             1 (20.0%)
========================= ==== =======

Developers with the most report credits
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(total 2)

========================= ==== =======
Developer                    #       %
========================= ==== =======
Oliver O'Halloran            1 (50.0%)
Alastair D'Silva             1 (50.0%)
========================= ==== =======

Developers who gave the most report credits
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(total 2)

========================= ==== =======
Developer                    #       %
========================= ==== =======
Andrew Donnellan             1 (50.0%)
Stewart Smith                1 (50.0%)
========================= ==== =======