If we only have 8 or fewer spans at once (as is the normal case), don't
do allocation, which might interfere with tracing.
This doesn't change our test_generate_coinmoves() benchmark.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
If we have USDT compiled in, scanning the array of spans becomes
prohibitive if we have really large numbers of requests. In the
bookkeeper code, when catching up with 1.6M channel events, this
became clear in profiling.
Use a hash table instead.
Before:
tests/test_coinmoves.py::test_generate_coinmoves (100,000, sqlite3):
Time (from start to end of l2 node): 269 seconds (vs 14 with HAVE_USDT=0)
Worst latency: 4.0 seconds
After:
tests/test_coinmoves.py::test_generate_coinmoves (100,000, sqlite3):
Time (from start to end of l2 node): 14 seconds
Worst latency: 4.3 seconds
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
It now simply renames tal names, so it's harmless to do even if we're
not going to do memleak detection.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Basically, `devtools/reduce-includes.sh */*.c`.
Build time from make clean (RUST=0) (includes building external libs):
Before:
real 0m38.944000-40.416000(40.1131+/-0.4)s
user 3m6.790000-17.159000(15.0571+/-2.8)s
sys 0m35.304000-37.336000(36.8942+/-0.57)s
After:
real 0m37.872000-39.974000(39.5466+/-0.59)s
user 3m1.211000-14.968000(12.4556+/-3.9)s
sys 0m35.008000-36.830000(36.4143+/-0.5)s
Build time after touch config.vars (RUST=0):
Before:
real 0m19.831000-21.862000(21.5528+/-0.58)s
user 2m15.361000-30.731000(28.4798+/-4.4)s
sys 0m21.056000-22.339000(22.0346+/-0.35)s
After:
real 0m18.384000-21.307000(20.8605+/-0.92)s
user 2m5.585000-26.843000(23.6017+/-6.7)s
sys 0m19.650000-22.003000(21.4943+/-0.69)s
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Each header should only include the other headers it needs to compile;
`devtools/reduce-includes.sh */*.h` does this. The C files then need
additional includes if they don't compile.
And remove the entirely useless wire/onion_wire.h, which only serves to include wire/onion_wiregen.h.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
notleak() doesn't work for lightningd since the first span is created before
memleak (or anything else!) is initialized, so we have to mark it manually.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This doesn't happen very often, but can with autoclean.
However, we rarely traverse to the end, since we always expect to find
what we're looking for, and we fill from the front. So even a large
array (unless it's used) is fine.
Subtle: when we injected a parent, we used "active_spans" as the (arbitrary)
key. That can now change with reallocation, and so if that memory were reused
we could have a key clash. So we use "&active_spans" which doesn't change.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We don't ever actually close the remote span (we don't have its key,
after all), and we keep a pointer to the parent.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This is much faster to give 64 bits of data, and we don't need
cryptographic randomness.
This brings us back to 413ns per trace.
Before:
real 0m5.819000-6.472000(6.2064+/-0.26)s
user 0m3.779000-4.101000(3.956+/-0.12)s
sys 0m2.040000-2.431000(2.2496+/-0.15)s
After:
real 0m3.981000-4.247000(4.1276+/-0.11)s
user 0m3.979000-4.245000(4.126+/-0.11)s
sys 0m0.000000-0.002000(0.001+/-0.00063)s
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Changelog-Fixed: lightingd: trimmed overhead of tracing infrastructure.
There's an EBPF limit anyway, so stick with a 512-byte buffer.
This brings us back to 621ns per trace:
Before:
real 0m13.441000-14.592000(14.2686+/-0.43)s
user 0m11.265000-12.289000(11.9626+/-0.37)s
sys 0m2.175000-2.381000(2.3048+/-0.072)s
After:
real 0m5.819000-6.472000(6.2064+/-0.26)s
user 0m3.779000-4.101000(3.956+/-0.12)s
sys 0m2.040000-2.431000(2.2496+/-0.15)s
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Avoids allocations. Also assume that name and value parameters
outlive the trace span, so don't copy.
Before:
real 0m16.421000-18.407000(17.8128+/-0.72)s
user 0m14.242000-16.041000(15.5382+/-0.67)s
sys 0m2.179000-2.363000(2.273+/-0.061)s
After:
real 0m13.441000-14.592000(14.2686+/-0.43)s
user 0m11.265000-12.289000(11.9626+/-0.37)s
sys 0m2.175000-2.381000(2.3048+/-0.072)s
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
1. trace_span_start() is always called with a string literal, so
no copy needed (and we can use a macro to enforce this).
2. trace_span_tag() name and value are always longer-lived than
the span, so no need to copy these either.
Before:
real 0m18.524000-19.100000(18.7674+/-0.21)s
user 0m16.171000-16.833000(16.424+/-0.26)s
sys 0m2.259000-2.400000(2.337+/-0.059)s
After:
real 0m16.421000-18.407000(17.8128+/-0.72)s
user 0m14.242000-16.041000(15.5382+/-0.67)s
sys 0m2.179000-2.363000(2.273+/-0.061)s
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Testing parenting handling revealed several issues:
1. By calling "trace_span_start" when CLN_TRACEPARENT is set produces a bogus
entry, for which the span_id is overwritten so we never end it.
2. We don't need to close the remote parent when we close the first child: in
fact, this causes the remaining traces to be detached from the parent!
3. Suspension should return current to the parent, not to NULL.
Now the traces balance as we expect.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
I added this debugging because the next test revealed a mismatch, so
I wanted to see where it was happening.
The comment in lightningd suggests it's possible, but I can't see any
code which suspends in the lightningd io_loop, so I cannot see how
this is triggered.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This happens with autoclean, which does a datastore request then frees
the parent command without waiting for a response (see clean_finished).
This leaks a trace, and causes a crash if the pointer is later reused.
My solution is to create a trace variant which declares the trace key
to be a tal ptr and then we can clean up in the destructor if this happens.
This fixes the issue for me.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Changelog-Fixed: autoclean: fixed occasional crash when tracepoints compiled in.
chanbackup with many peers can do more than 128 concurrent rpc commands.
autoclean is the other plugin which can do many requests at once, so I
expect a similar issue there.
I tested this by rebuilding with `MAX_ACTIVE_SPANS` 1, which autoclean
tests triggered immediately.
The real fix is probably to use a hash table with a large initial size.
```
Mar 24 06:30:45 mlbb2 sh[28000]: chanbackup: common/trace.c:190: trace_span_slot: Assertion `s' failed.
Mar 24 06:30:45 mlbb2 sh[28000]: chanbackup: FATAL SIGNAL 6 (version v25.02)
Mar 24 06:30:45 mlbb2 sh[28000]: 0x5575232bac4f send_backtrace
Mar 24 06:30:45 mlbb2 sh[28000]: common/daemon.c:33
Mar 24 06:30:45 mlbb2 sh[28000]: 0x5575232baceb crashdump
Mar 24 06:30:45 mlbb2 sh[28000]: common/daemon.c:78
Mar 24 06:30:45 mlbb2 sh[28000]: 0x7f2958cd851f ???
Mar 24 06:30:45 mlbb2 sh[28000]: ./signal/../sysdeps/unix/sysv/linux/x86_64/libc_sigaction.c:0
Mar 24 06:30:45 mlbb2 sh[28000]: 0x7f2958d2c9fc __pthread_kill_implementation
Mar 24 06:30:45 mlbb2 sh[28000]: ./nptl/pthread_kill.c:44
Mar 24 06:30:45 mlbb2 sh[28000]: 0x7f2958d2c9fc __pthread_kill_internal
Mar 24 06:30:45 mlbb2 sh[28000]: ./nptl/pthread_kill.c:78
Mar 24 06:30:45 mlbb2 sh[28000]: 0x7f2958d2c9fc __GI___pthread_kill
Mar 24 06:30:45 mlbb2 sh[28000]: ./nptl/pthread_kill.c:89
Mar 24 06:30:45 mlbb2 sh[28000]: 0x7f2958cd8475 __GI_raise
Mar 24 06:30:45 mlbb2 sh[28000]: ../sysdeps/posix/raise.c:26
Mar 24 06:30:45 mlbb2 sh[28000]: 0x7f2958cbe7f2 __GI_abort
Mar 24 06:30:45 mlbb2 sh[28000]: ./stdlib/abort.c:79
Mar 24 06:30:45 mlbb2 sh[28000]: 0x7f2958cbe71a __assert_fail_base
Mar 24 06:30:45 mlbb2 sh[28000]: ./assert/assert.c:94
Mar 24 06:30:45 mlbb2 sh[28000]: 0x7f2958ccfe95 __GI___assert_fail
Mar 24 06:30:45 mlbb2 sh[28000]: ./assert/assert.c:103
Mar 24 06:30:45 mlbb2 sh[28000]: 0x5575232ab7fa trace_span_slot
Mar 24 06:30:45 mlbb2 sh[28000]: common/trace.c:190
Mar 24 06:30:45 mlbb2 sh[28000]: 0x5575232abc9f trace_span_start
Mar 24 06:30:45 mlbb2 sh[28000]: common/trace.c:267
Mar 24 06:30:45 mlbb2 sh[28000]: 0x5575232a7c34 send_outreq
Mar 24 06:30:45 mlbb2 sh[28000]: plugins/libplugin.c:1112
```
Changelog-Fixed: autoclean/chanbackup: fixed tracepoint crash on large number of requests.
Fixes: https://github.com/ElementsProject/lightning/issues/8177
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
It turns out that under some circumstances we end up clearing the
pointee of `current` but not the pointer. Thus when we select the next
slot we can end up reusing the same slot, making it its own parent.
We forcefull break these cycles by enforcing that `current` should
never be returned and be set as its own parent.
Changelog-None
Trace spans form a tree, but we don't actually check that the
structure doesn't break. Breakage can for example come if we use the
same key accidentally, making a new span its own ancestor.
We have the space in memory set aside anyway, so let's just copy the
`trace_id` into the span itself, rather than resolving the `root` at
time of emission.
After adding the DB query instrumentation we ran into a couple of
issues, with spans not being resumed correctly, and it was rather hard
to identify the problem. This adds debug statements so we can trace
the tracing (traception if you will).
Changelog-None
The integration with opentelemetry was sub-optimal: it was generating
jaeger-style traces, with short traceIds and we were considering the
entire lifetime as a single trace. This PR changes that to a trace for
startup and then a trace for any event that doesn't already have a
parent.
We also allow using the `CLN_TRACEPARENT` envvar to attach the startup
to a remote / external trace, potentially by whatever started the main
process. This is useful to see the startup trace in the wider context
of whatever tooling is built around it.
Changelog-Added: tracing: It is now possible to inject a parent for the startup trace by setting the `CLN_TRACEPARENT` envvar