* 2020-08-16 19:28:51+03, Tomi Ollila wrote:
> Good stuff -- implementation looks like port of the php code in
>
> https://www.iamcal.com/understanding-bidirectional-text
>
> to emacs lisp... anyway, nice implementation; it took me a bit of
> time to understand it...
I don't read PHP and didn't try to read that code at all, but the idea
is simple enough.
> thoughts
>
> - is it slow to execute it always, pure lisp implementation;
> (string-match "[\u202a-\u202e]") could be done before that.
> (if it were executed often could loop with `looking-at`
> (and then moving point based on match-end) be faster...
I don't see any speed issues, but if we wanted to optimize I would
create a new sanitize function which walks across the characters just
once, without using regular expressions. Currently, though, I think
that would be unnecessary micro-optimization.
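For illustration, a single-pass version could look roughly like this (a
sketch only; the function and variable names are mine, not notmuch's).
It counts unmatched LRE/RLE/LRO/RLO embedding characters (U+202A..U+202E,
where U+202C pops) in one walk and appends the missing pop characters:

```elisp
;; Hypothetical single-pass balancer, no regexps involved.
(defun my-balance-bidi-ctrl-chars (string)
  "Return STRING with unterminated bidi control characters closed.
Counts U+202A, U+202B, U+202D and U+202E as opening an embedding
level and U+202C (PDF, pop directional formatting) as closing one,
then appends as many U+202C characters as are left unmatched."
  (let ((depth 0))
    (dolist (char (string-to-list string))
      (cond ((memq char '(?\u202a ?\u202b ?\u202d ?\u202e))
             (setq depth (1+ depth)))
            ((and (eq char ?\u202c) (> depth 0))
             (setq depth (1- depth)))))
    (concat string (make-string depth ?\u202c))))
```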
> - *but* adding U+202C's in `notmuch-sanitize` is doing it too early, as
> some functions truncate the strings afterwards if those are too long
> (e.g. `notmuch-search-insert-authors`) so those get lost..
Good point. This means that we shouldn't do the "bidi ctrl char
balancing" in notmuch-sanitize. Instead we should call the new
notmuch-balance-bidi-ctrl-chars function in various places: just before
inserting arbitrary strings into a buffer, and before combining such
strings with other strings.
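In other words, the intended call sites would look something like this
(illustrative only; my-insert-author is a made-up helper, and the
balancing function is the one proposed above, applied after any
truncation has already happened):

```elisp
;; Sketch of a call site: truncate first, balance last, so the
;; appended U+202C characters cannot be cut off afterwards.
(defun my-insert-author (author width)
  "Insert AUTHOR truncated to WIDTH columns, with bidi chars balanced."
  (insert (notmuch-balance-bidi-ctrl-chars
           (truncate-string-to-width author width))))
```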
> (what I noticed when looking `notmuch-search-insert-authors` that it uses
> `length` to check the length of a string -- but that also counts these bidi
> mode changing "characters" (as one char). `string-width` would be better
> there -- and probably in many other places.)
Yes, definitely string-width when truncation is based on display width
and when using a tabular format in buffers. With that function
zero-width characters really have no width.
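A quick comparison of the two (the length result is certain; the
string-width result depends on how Emacs's char-width-table treats
these format control characters):

```elisp
;; `length' counts every character, including the invisible bidi
;; controls, so this string counts as 5 characters:
(length "\u202bfoo\u202c")  ;; → 5
;; `string-width' measures display columns instead, which is what
;; column-aligned truncation in a tabular buffer should use:
(string-width "\u202bfoo\u202c")
```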
--
/// Teemu Likonen - .-.. http://www.iki.fi/tlikonen/
// OpenPGP: 4E1055DC84E9DFF613D78557719D69D324539450