Re: failing T810-tsan on ppc64el

Subject: Re: failing T810-tsan on ppc64el

Date: Wed, 30 Aug 2023 15:07:23 +0200

To: David Bremner

Cc: notmuch@notmuchmail.org

From: Kevin Boulain


On 2023-08-25 at 08:07 -03, David Bremner <david@tethera.net> wrote:
> I can just disable tsan tests on ppc64el for Debian, but I wondered if
> there is an underlying bug that only shows up on ppc64el

I see you've skipped T810-tsan in 90c61828; I think that's fair for the
time being. I did some digging and sent a patch to silence the races
reported by TSan in
https://buildd.debian.org/status/fetch.php?pkg=notmuch&arch=ppc64el&ver=0.38%7Erc0-1&stamp=1692959868&raw=0

However, the races are just a red herring and the diff actually shows
the key problem (easier to see once the suppressions have been updated):
  line 12: 19
You're surely more familiar with the output than I am, so you may have
noticed this already, but it means this line fails:
  https://git.notmuchmail.org/git?p=notmuch;a=blob;f=test/T810-tsan.sh;h=4071e2968f2ad62eb6642c68b39f2750327682d0;hb=HEAD#l68

The call stack is as follows (I'm linking to the latest versions for
simplicity, not the ones packaged by Debian):
  _load_key_file https://git.notmuchmail.org/git?p=notmuch;a=blob;f=lib/open.cc;h=54d1faf30127c8a72f3c96e838cf9d4edba7e70a;hb=HEAD#l155
  g_key_file_load_from_file https://gitlab.gnome.org/GNOME/glib/-/blob/197e6d6f5df6bc809bd3deaadec9519af2951cd2/glib/gkeyfile.c#L937
  g_key_file_load_from_fd https://gitlab.gnome.org/GNOME/glib/-/blob/197e6d6f5df6bc809bd3deaadec9519af2951cd2/glib/gkeyfile.c#L829

The rest is available at
https://gitlab.gnome.org/GNOME/glib/-/issues/1672#note_1831968
but doesn't matter in this case.

So, fstat fails but the syscall is never actually performed: strace only
reports an open followed nearly immediately by a close for this fd. That
means glibc itself must be returning the EINVAL that we're seeing. There's
a case where ___fxstat64 can set errno to EINVAL without making a syscall:
https://sourceware.org/git/?p=glibc.git;a=blob;f=sysdeps/unix/sysv/linux/fxstat64.c;h=b8c4c0a13c7c9d5a426a9c17a96d5db8fd08455e;hb=HEAD#l52

So, something is feeding an invalid 'vers' to glibc (inspecting the
assembly in gdb reveals it's set to 0 and it's checked against a 1).
TSan actually installs interceptors for fstat:
  __interceptor_fstat64 (fd=3, buf=0x7fffee63d418) at ../../../../src/libsanitizer/tsan/tsan_rtl.h:242
  [...]
  ___fxstat64 (vers=0, fd=3, buf=0x7fffee63d418) at ../sysdeps/unix/sysv/linux/fxstat64.c:50
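
To make the suspected mechanism concrete, here is a simplified model of
it. This is a sketch and not glibc's or TSan's actual code: the names
model_fxstat, model_intercepted_fstat and MODEL_EXPECTED_VERS are made
up for illustration, and the expected value of 1 is only what the
comparison in gdb suggested.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/stat.h>

/* Hypothetical stand-in for the version the wrapper accepts; gdb showed
 * the argument being compared against 1 on ppc64el. */
#define MODEL_EXPECTED_VERS 1

/* Simplified model of a versioned fstat wrapper, NOT glibc's code:
 * an unexpected 'vers' is rejected in userspace with EINVAL, so no
 * syscall is made and strace has nothing to show. */
int model_fxstat(int vers, int fd, struct stat *buf)
{
    assert(buf != NULL);
    if (vers != MODEL_EXPECTED_VERS) {
        errno = EINVAL;
        return -1;
    }
    return fstat(fd, buf); /* only this path reaches the kernel */
}

/* What an interceptor that hardcodes 'vers' to 0 effectively does. */
int model_intercepted_fstat(int fd, struct stat *buf)
{
    return model_fxstat(0, fd, buf);
}
```

In this model the intercepted call fails with EINVAL for any fd, valid
or not, which would match an open followed directly by a close in
strace.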

Because it's called __interceptor_fstat64 I believe it's
https://github.com/llvm/llvm-project/blob/04b1276ad3b8976241228be8a966b1557f63492f/compiler-rt/lib/tsan/rtl/tsan_interceptors_posix.cpp#L1634
(not sure how to get the sources for TSan under Debian), which does
hardcode 'vers' to 0. Commenting out
  TEST_CFLAGS="${TEST_CFLAGS:-} -fsanitize=thread"
to eliminate the interceptor in the test suffices to make fstat succeed.
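
For anyone who wants to check this outside the notmuch test suite, a
minimal sketch of a standalone check could look like the following. The
helper name fstat_errno is hypothetical, and this is not the code from
T810-tsan; it just does the same open, fstat, close sequence.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/stat.h>
#include <unistd.h>

/* Open 'path', fstat the descriptor, close it again.
 * Returns 0 on success, otherwise the errno of the call that failed. */
int fstat_errno(const char *path)
{
    assert(path != NULL);

    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return errno;

    struct stat st;
    int err = (fstat(fd, &st) == 0) ? 0 : errno;
    close(fd);
    return err;
}
```

Calling it from a small main() and building that twice, once with
-fsanitize=thread and once without, should show whether the interceptor
is what makes the difference. If the analysis above is right, I'd expect
the TSan build on an affected ppc64el system to return EINVAL, but I
haven't verified that with this exact snippet.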

I'll see if I can find something more, otherwise I'll probably just
report this problem upstream.