aboutsummaryrefslogtreecommitdiffstats
path: root/docs/devel/tcg-plugins.rst
diff options
context:
space:
mode:
Diffstat (limited to 'docs/devel/tcg-plugins.rst')
-rw-r--r--docs/devel/tcg-plugins.rst438
1 files changed, 438 insertions, 0 deletions
diff --git a/docs/devel/tcg-plugins.rst b/docs/devel/tcg-plugins.rst
new file mode 100644
index 000000000..f93ef4fe5
--- /dev/null
+++ b/docs/devel/tcg-plugins.rst
@@ -0,0 +1,438 @@
+..
+ Copyright (C) 2017, Emilio G. Cota <cota@braap.org>
+ Copyright (c) 2019, Linaro Limited
+ Written by Emilio Cota and Alex Bennée
+
+QEMU TCG Plugins
+================
+
+QEMU TCG plugins provide a way for users to run experiments taking
+advantage of the total system control emulation can have over a guest.
+It provides a mechanism for plugins to subscribe to events during
+translation and execution and optionally callback into the plugin
+during these events. TCG plugins are unable to change the system state
+only monitor it passively. However they can do this down to an
+individual instruction granularity including potentially subscribing
+to all load and store operations.
+
+Usage
+-----
+
+Any QEMU binary with TCG support has plugins enabled by default.
+Earlier releases needed to be explicitly enabled with::
+
+ configure --enable-plugins
+
+Once built a program can be run with multiple plugins loaded each with
+their own arguments::
+
+ $QEMU $OTHER_QEMU_ARGS \
+ -plugin tests/plugin/libhowvec.so,inline=on,count=hint \
+ -plugin tests/plugin/libhotblocks.so
+
+Arguments are plugin specific and can be used to modify their
+behaviour. In this case the howvec plugin is being asked to use inline
+ops to count and break down the hint instructions by type.
+
+Writing plugins
+---------------
+
+API versioning
+~~~~~~~~~~~~~~
+
+This is a new feature for QEMU and it does allow people to develop
+out-of-tree plugins that can be dynamically linked into a running QEMU
+process. However the project reserves the right to change or break the
+API should it need to do so. The best way to avoid this is to submit
+your plugin upstream so they can be updated if/when the API changes.
+
+All plugins need to declare a symbol which exports the plugin API
+version they were built against. This can be done simply by::
+
+ QEMU_PLUGIN_EXPORT int qemu_plugin_version = QEMU_PLUGIN_VERSION;
+
+The core code will refuse to load a plugin that doesn't export a
+``qemu_plugin_version`` symbol or if plugin version is outside of QEMU's
+supported range of API versions.
+
+Additionally the ``qemu_info_t`` structure which is passed to the
+``qemu_plugin_install`` method of a plugin will detail the minimum and
+current API versions supported by QEMU. The API version will be
+incremented if new APIs are added. The minimum API version will be
+incremented if existing APIs are changed or removed.
+
+Lifetime of the query handle
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Each callback provides an opaque anonymous information handle which
+can usually be further queried to find out information about a
+translation, instruction or operation. The handles themselves are only
+valid during the lifetime of the callback so it is important that any
+information that is needed is extracted during the callback and saved
+by the plugin.
+
+Plugin life cycle
+~~~~~~~~~~~~~~~~~
+
+First the plugin is loaded and the public qemu_plugin_install function
+is called. The plugin will then register callbacks for various plugin
+events. Generally plugins will register a handler for the *atexit*
+if they want to dump a summary of collected information once the
+program/system has finished running.
+
+When a registered event occurs the plugin callback is invoked. The
+callbacks may provide additional information. In the case of a
+translation event the plugin has an option to enumerate the
+instructions in a block of instructions and optionally register
+callbacks to some or all instructions when they are executed.
+
+There is also a facility to add an inline event where code to
+increment a counter can be directly inlined with the translation.
+Currently only a simple increment is supported. This is not atomic so
+can miss counts. If you want absolute precision you should use a
+callback which can then ensure atomicity itself.
+
+Finally when QEMU exits all the registered *atexit* callbacks are
+invoked.
+
+Exposure of QEMU internals
+~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+The plugin architecture actively avoids leaking implementation details
+about how QEMU's translation works to the plugins. While there are
+conceptions such as translation time and translation blocks the
+details are opaque to plugins. The plugin is able to query select
+details of instructions and system configuration only through the
+exported *qemu_plugin* functions.
+
+API
+~~~
+
+.. kernel-doc:: include/qemu/qemu-plugin.h
+
+Internals
+---------
+
+Locking
+~~~~~~~
+
+We have to ensure we cannot deadlock, particularly under MTTCG. For
+this we acquire a lock when called from plugin code. We also keep the
+list of callbacks under RCU so that we do not have to hold the lock
+when calling the callbacks. This is also for performance, since some
+callbacks (e.g. memory access callbacks) might be called very
+frequently.
+
+ * A consequence of this is that we keep our own list of CPUs, so that
+ we do not have to worry about locking order wrt cpu_list_lock.
+ * Use a recursive lock, since we can get registration calls from
+ callbacks.
+
+As a result registering/unregistering callbacks is "slow", since it
+takes a lock. But this is very infrequent; we want performance when
+calling (or not calling) callbacks, not when registering them. Using
+RCU is great for this.
+
+We support the uninstallation of a plugin at any time (e.g. from
+plugin callbacks). This allows plugins to remove themselves if they no
+longer want to instrument the code. This operation is asynchronous
+which means callbacks may still occur after the uninstall operation is
+requested. The plugin isn't completely uninstalled until the safe work
+has executed while all vCPUs are quiescent.
+
+Example Plugins
+---------------
+
+There are a number of plugins included with QEMU and you are
+encouraged to contribute your own plugins plugins upstream. There is a
+``contrib/plugins`` directory where they can go.
+
+- tests/plugins
+
+These are some basic plugins that are used to test and exercise the
+API during the ``make check-tcg`` target.
+
+- contrib/plugins/hotblocks.c
+
+The hotblocks plugin allows you to examine the where hot paths of
+execution are in your program. Once the program has finished you will
+get a sorted list of blocks reporting the starting PC, translation
+count, number of instructions and execution count. This will work best
+with linux-user execution as system emulation tends to generate
+re-translations as blocks from different programs get swapped in and
+out of system memory.
+
+If your program is single-threaded you can use the ``inline`` option for
+slightly faster (but not thread safe) counters.
+
+Example::
+
+ ./aarch64-linux-user/qemu-aarch64 \
+ -plugin contrib/plugins/libhotblocks.so -d plugin \
+ ./tests/tcg/aarch64-linux-user/sha1
+ SHA1=15dd99a1991e0b3826fede3deffc1feba42278e6
+ collected 903 entries in the hash table
+ pc, tcount, icount, ecount
+ 0x0000000041ed10, 1, 5, 66087
+ 0x000000004002b0, 1, 4, 66087
+ ...
+
+- contrib/plugins/hotpages.c
+
+Similar to hotblocks but this time tracks memory accesses::
+
+ ./aarch64-linux-user/qemu-aarch64 \
+ -plugin contrib/plugins/libhotpages.so -d plugin \
+ ./tests/tcg/aarch64-linux-user/sha1
+ SHA1=15dd99a1991e0b3826fede3deffc1feba42278e6
+ Addr, RCPUs, Reads, WCPUs, Writes
+ 0x000055007fe000, 0x0001, 31747952, 0x0001, 8835161
+ 0x000055007ff000, 0x0001, 29001054, 0x0001, 8780625
+ 0x00005500800000, 0x0001, 687465, 0x0001, 335857
+ 0x0000000048b000, 0x0001, 130594, 0x0001, 355
+ 0x0000000048a000, 0x0001, 1826, 0x0001, 11
+
+The hotpages plugin can be configured using the following arguments:
+
+ * sortby=reads|writes|address
+
+ Log the data sorted by either the number of reads, the number of writes, or
+ memory address. (Default: entries are sorted by the sum of reads and writes)
+
+ * io=on
+
+ Track IO addresses. Only relevant to full system emulation. (Default: off)
+
+ * pagesize=N
+
+ The page size used. (Default: N = 4096)
+
+- contrib/plugins/howvec.c
+
+This is an instruction classifier so can be used to count different
+types of instructions. It has a number of options to refine which get
+counted. You can give a value to the ``count`` argument for a class of
+instructions to break it down fully, so for example to see all the system
+registers accesses::
+
+ ./aarch64-softmmu/qemu-system-aarch64 $(QEMU_ARGS) \
+ -append "root=/dev/sda2 systemd.unit=benchmark.service" \
+ -smp 4 -plugin ./contrib/plugins/libhowvec.so,count=sreg -d plugin
+
+which will lead to a sorted list after the class breakdown::
+
+ Instruction Classes:
+ Class: UDEF not counted
+ Class: SVE (68 hits)
+ Class: PCrel addr (47789483 hits)
+ Class: Add/Sub (imm) (192817388 hits)
+ Class: Logical (imm) (93852565 hits)
+ Class: Move Wide (imm) (76398116 hits)
+ Class: Bitfield (44706084 hits)
+ Class: Extract (5499257 hits)
+ Class: Cond Branch (imm) (147202932 hits)
+ Class: Exception Gen (193581 hits)
+ Class: NOP not counted
+ Class: Hints (6652291 hits)
+ Class: Barriers (8001661 hits)
+ Class: PSTATE (1801695 hits)
+ Class: System Insn (6385349 hits)
+ Class: System Reg counted individually
+ Class: Branch (reg) (69497127 hits)
+ Class: Branch (imm) (84393665 hits)
+ Class: Cmp & Branch (110929659 hits)
+ Class: Tst & Branch (44681442 hits)
+ Class: AdvSimd ldstmult (736 hits)
+ Class: ldst excl (9098783 hits)
+ Class: Load Reg (lit) (87189424 hits)
+ Class: ldst noalloc pair (3264433 hits)
+ Class: ldst pair (412526434 hits)
+ Class: ldst reg (imm) (314734576 hits)
+ Class: Loads & Stores (2117774 hits)
+ Class: Data Proc Reg (223519077 hits)
+ Class: Scalar FP (31657954 hits)
+ Individual Instructions:
+ Instr: mrs x0, sp_el0 (2682661 hits) (op=0xd5384100/ System Reg)
+ Instr: mrs x1, tpidr_el2 (1789339 hits) (op=0xd53cd041/ System Reg)
+ Instr: mrs x2, tpidr_el2 (1513494 hits) (op=0xd53cd042/ System Reg)
+ Instr: mrs x0, tpidr_el2 (1490823 hits) (op=0xd53cd040/ System Reg)
+ Instr: mrs x1, sp_el0 (933793 hits) (op=0xd5384101/ System Reg)
+ Instr: mrs x2, sp_el0 (699516 hits) (op=0xd5384102/ System Reg)
+ Instr: mrs x4, tpidr_el2 (528437 hits) (op=0xd53cd044/ System Reg)
+ Instr: mrs x30, ttbr1_el1 (480776 hits) (op=0xd538203e/ System Reg)
+ Instr: msr ttbr1_el1, x30 (480713 hits) (op=0xd518203e/ System Reg)
+ Instr: msr vbar_el1, x30 (480671 hits) (op=0xd518c01e/ System Reg)
+ ...
+
+To find the argument shorthand for the class you need to examine the
+source code of the plugin at the moment, specifically the ``*opt``
+argument in the InsnClassExecCount tables.
+
+- contrib/plugins/lockstep.c
+
+This is a debugging tool for developers who want to find out when and
+where execution diverges after a subtle change to TCG code generation.
+It is not an exact science and results are likely to be mixed once
+asynchronous events are introduced. While the use of -icount can
+introduce determinism to the execution flow it doesn't always follow
+the translation sequence will be exactly the same. Typically this is
+caused by a timer firing to service the GUI causing a block to end
+early. However in some cases it has proved to be useful in pointing
+people at roughly where execution diverges. The only argument you need
+for the plugin is a path for the socket the two instances will
+communicate over::
+
+
+ ./sparc-softmmu/qemu-system-sparc -monitor none -parallel none \
+ -net none -M SS-20 -m 256 -kernel day11/zImage.elf \
+ -plugin ./contrib/plugins/liblockstep.so,sockpath=lockstep-sparc.sock \
+ -d plugin,nochain
+
+which will eventually report::
+
+ qemu-system-sparc: warning: nic lance.0 has no peer
+ @ 0x000000ffd06678 vs 0x000000ffd001e0 (2/1 since last)
+ @ 0x000000ffd07d9c vs 0x000000ffd06678 (3/1 since last)
+ Δ insn_count @ 0x000000ffd07d9c (809900609) vs 0x000000ffd06678 (809900612)
+ previously @ 0x000000ffd06678/10 (809900609 insns)
+ previously @ 0x000000ffd001e0/4 (809900599 insns)
+ previously @ 0x000000ffd080ac/2 (809900595 insns)
+ previously @ 0x000000ffd08098/5 (809900593 insns)
+ previously @ 0x000000ffd080c0/1 (809900588 insns)
+
+- contrib/plugins/hwprofile.c
+
+The hwprofile tool can only be used with system emulation and allows
+the user to see what hardware is accessed how often. It has a number of options:
+
+ * track=read or track=write
+
+ By default the plugin tracks both reads and writes. You can use one
+ of these options to limit the tracking to just one class of accesses.
+
+ * source
+
+ Will include a detailed break down of what the guest PC that made the
+ access was. Not compatible with the pattern option. Example output::
+
+ cirrus-low-memory @ 0xfffffd00000a0000
+ pc:fffffc0000005cdc, 1, 256
+ pc:fffffc0000005ce8, 1, 256
+ pc:fffffc0000005cec, 1, 256
+
+ * pattern
+
+ Instead break down the accesses based on the offset into the HW
+ region. This can be useful for seeing the most used registers of a
+ device. Example output::
+
+ pci0-conf @ 0xfffffd01fe000000
+ off:00000004, 1, 1
+ off:00000010, 1, 3
+ off:00000014, 1, 3
+ off:00000018, 1, 2
+ off:0000001c, 1, 2
+ off:00000020, 1, 2
+ ...
+
+- contrib/plugins/execlog.c
+
+The execlog tool traces executed instructions with memory access. It can be used
+for debugging and security analysis purposes.
+Please be aware that this will generate a lot of output.
+
+The plugin takes no argument::
+
+ qemu-system-arm $(QEMU_ARGS) \
+ -plugin ./contrib/plugins/libexeclog.so -d plugin
+
+which will output an execution trace following this structure::
+
+ # vCPU, vAddr, opcode, disassembly[, load/store, memory addr, device]...
+ 0, 0xa12, 0xf8012400, "movs r4, #0"
+ 0, 0xa14, 0xf87f42b4, "cmp r4, r6"
+ 0, 0xa16, 0xd206, "bhs #0xa26"
+ 0, 0xa18, 0xfff94803, "ldr r0, [pc, #0xc]", load, 0x00010a28, RAM
+ 0, 0xa1a, 0xf989f000, "bl #0xd30"
+ 0, 0xd30, 0xfff9b510, "push {r4, lr}", store, 0x20003ee0, RAM, store, 0x20003ee4, RAM
+ 0, 0xd32, 0xf9893014, "adds r0, #0x14"
+ 0, 0xd34, 0xf9c8f000, "bl #0x10c8"
+ 0, 0x10c8, 0xfff96c43, "ldr r3, [r0, #0x44]", load, 0x200000e4, RAM
+
+- contrib/plugins/cache.c
+
+Cache modelling plugin that measures the performance of a given L1 cache
+configuration, and optionally a unified L2 per-core cache when a given working
+set is run::
+
+ qemu-x86_64 -plugin ./contrib/plugins/libcache.so \
+ -d plugin -D cache.log ./tests/tcg/x86_64-linux-user/float_convs
+
+will report the following::
+
+ core #, data accesses, data misses, dmiss rate, insn accesses, insn misses, imiss rate
+ 0 996695 508 0.0510% 2642799 18617 0.7044%
+
+ address, data misses, instruction
+ 0x424f1e (_int_malloc), 109, movq %rax, 8(%rcx)
+ 0x41f395 (_IO_default_xsputn), 49, movb %dl, (%rdi, %rax)
+ 0x42584d (ptmalloc_init.part.0), 33, movaps %xmm0, (%rax)
+ 0x454d48 (__tunables_init), 20, cmpb $0, (%r8)
+ ...
+
+ address, fetch misses, instruction
+ 0x4160a0 (__vfprintf_internal), 744, movl $1, %ebx
+ 0x41f0a0 (_IO_setb), 744, endbr64
+ 0x415882 (__vfprintf_internal), 744, movq %r12, %rdi
+ 0x4268a0 (__malloc), 696, andq $0xfffffffffffffff0, %rax
+ ...
+
+The plugin has a number of arguments, all of them are optional:
+
+ * limit=N
+
+ Print top N icache and dcache thrashing instructions along with their
+ address, number of misses, and its disassembly. (default: 32)
+
+ * icachesize=N
+ * iblksize=B
+ * iassoc=A
+
+ Instruction cache configuration arguments. They specify the cache size, block
+ size, and associativity of the instruction cache, respectively.
+ (default: N = 16384, B = 64, A = 8)
+
+ * dcachesize=N
+ * dblksize=B
+ * dassoc=A
+
+ Data cache configuration arguments. They specify the cache size, block size,
+ and associativity of the data cache, respectively.
+ (default: N = 16384, B = 64, A = 8)
+
+ * evict=POLICY
+
+ Sets the eviction policy to POLICY. Available policies are: :code:`lru`,
+ :code:`fifo`, and :code:`rand`. The plugin will use the specified policy for
+ both instruction and data caches. (default: POLICY = :code:`lru`)
+
+ * cores=N
+
+ Sets the number of cores for which we maintain separate icache and dcache.
+ (default: for linux-user, N = 1, for full system emulation: N = cores
+ available to guest)
+
+ * l2=on
+
+ Simulates a unified L2 cache (stores blocks for both instructions and data)
+ using the default L2 configuration (cache size = 2MB, associativity = 16-way,
+ block size = 64B).
+
+ * l2cachesize=N
+ * l2blksize=B
+ * l2assoc=A
+
+ L2 cache configuration arguments. They specify the cache size, block size, and
+ associativity of the L2 cache, respectively. Setting any of the L2
+ configuration arguments implies ``l2=on``.
+ (default: N = 2097152 (2MB), B = 64, A = 16)