On Sun Jul 21 15:23 -0500 2013, Tomi Ollila <tomi.ollila@iki.fi> wrote:
> On Tue, Jul 02 2013, John Lenz <lenz@math.uic.edu> wrote:
>
> > For my client, the largest bottleneck for displaying large threads is
> > exporting each html part individually since by default notmuch will not
> > show the json parts. For large threads there can be quite a few parts and
> > each must be exported and decoded one by one. Also, I then have to deal
> > with all the crazy charsets which I can do through a library but is a
> > pain.
>
> This looks like a useful option. I just wonder what effect does different
> charsets do to the output (is text/html content output verbatim (with just
> json/sexp escaping of '"' -characters).
>
> If you added test(s) showing what happens with different charsets
> (like one message having 3 text/html parts, one us-ascii, one iso-8859-1
> and one utf-8) that would make things clearer and (also) protect us from
> regressions.
>

Here is a test I wrote. I tried to follow the formatting of the other
tests. Let me know if you want this as a single patch combined with the
code to enable the option; I can resend it.


#!/usr/bin/env bash

test_description="include html parts when showing message"
. ./test-lib.sh

cat <<EOF > ${MAIL_DIR}/msg
From: A <a@example.com>
To: B <b@example.com>
Subject: html message
Date: Sat, 01 January 2000 00:00:00 +0000
Message-ID: <htmlmessage>
MIME-Version: 1.0
Content-Type: multipart/alternative; boundary="==-=="

--==-==
Content-Type: text/html; charset=UTF-8

EOF

# The Unicode fraction symbol 1/2 is U+00BD and is encoded
# in UTF-8 as two bytes: octal 302 275
echo $'<p>0.5 equals \302\275</p>' >> ${MAIL_DIR}/msg

cat <<EOF >> ${MAIL_DIR}/msg
--==-==
Content-Type: text/html; charset=ISO-8859-1

EOF

# The ISO-8859-1 encoding of U+00BD is a single byte: octal 275
echo $'<p>0.5 equals \275</p>' >> ${MAIL_DIR}/msg

cat <<EOF >> ${MAIL_DIR}/msg
--==-==
Content-Type: text/plain; charset=UTF-8

0.5 equals 1/2
--==-==--
EOF

notmuch new > /dev/null

cat <<EOF > EXPECTED.head
[[[{"id": "htmlmessage", "match":true, "excluded": false, "date_relative":"2000-01-01",
    "timestamp": 946684800,
    "filename": "${MAIL_DIR}/msg",
    "tags": ["inbox", "unread"],
    "headers": { "Date": "Sat, 01 Jan 2000 00:00:00 +0000", "From": "A <a@example.com>",
                 "Subject": "html message", "To": "B <b@example.com>"},
    "body": [{
      "content-type": "multipart/alternative", "id": 1,
EOF

cat EXPECTED.head > EXPECTED.nohtml
cat <<EOF >> EXPECTED.nohtml
      "content": [
        { "id": 2, "content-charset": "UTF-8", "content-length": 21, "content-type": "text/html"},
        { "id": 3, "content-charset": "ISO-8859-1", "content-length": 20, "content-type": "text/html"},
        { "id": 4, "content-type": "text/plain", "content": "0.5 equals 1/2\\n"}
      ]}]},[]]]]
EOF

# Both the UTF-8 and ISO-8859-1 parts should have U+00BD
cat EXPECTED.head > EXPECTED.withhtml
cat <<EOF >> EXPECTED.withhtml
      "content": [
        { "id": 2, "content-type": "text/html", "content": "<p>0.5 equals \\u00bd</p>\\n"},
        { "id": 3, "content-type": "text/html", "content": "<p>0.5 equals \\u00bd</p>\\n"},
        { "id": 4, "content-type": "text/plain", "content": "0.5 equals 1/2\\n"}
      ]}]},[]]]]
EOF

test_begin_subtest "html parts excluded by default"
notmuch show --format=json id:htmlmessage > OUTPUT
test_expect_equal_json "$(cat OUTPUT)" "$(cat EXPECTED.nohtml)"

test_begin_subtest "html parts included"
notmuch show --format=json --include-html id:htmlmessage > OUTPUT
test_expect_equal_json "$(cat OUTPUT)" "$(cat EXPECTED.withhtml)"

test_done
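

For reference, the content-length values in EXPECTED.nohtml (21 and 20)
can be sanity-checked with a small sketch like the one below. It is not
part of the patch. It assumes bash (for the $'...' quoting) and that the
reported length is the byte count of each part's body, which is what the
expected values in the test suggest. The variable names utf8_len and
latin1_len are made up for illustration.

```shell
#!/usr/bin/env bash
# Count the bytes of each html body exactly as the test writes them.
# echo appends one trailing newline, which is part of the body.
# wc -c counts bytes, not characters, so the locale does not matter.
utf8_len=$(echo $'<p>0.5 equals \302\275</p>' | wc -c)
latin1_len=$(echo $'<p>0.5 equals \275</p>' | wc -c)

# $(( ... )) strips any padding some wc implementations add.
echo "utf-8: $((utf8_len)) bytes, iso-8859-1: $((latin1_len)) bytes"
```

The one-byte difference is the extra byte that UTF-8 needs for U+00BD
(octal 302 275) compared with the single ISO-8859-1 byte (octal 275).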