On Thu, Jul 25 2013, John Lenz <lenz@math.uic.edu> wrote:

> On Sun Jul 21 15:23 -0500 2013, Tomi Ollila <tomi.ollila@iki.fi> wrote:
>> On Tue, Jul 02 2013, John Lenz <lenz@math.uic.edu> wrote:
>>
>> > For my client, the largest bottleneck for displaying large threads is
>> > exporting each html part individually since by default notmuch will not
>> > show the json parts. For large threads there can be quite a few parts and
>> > each must be exported and decoded one by one. Also, I then have to deal
>> > with all the crazy charsets which I can do through a library but is a
>> > pain.
>>
>> This looks like a useful option. I just wonder what effect does different
>> charsets do to the output (is text/html content output verbatim (with just
>> json/sexp escaping of '"' -characters).
>>
>> If you added test(s) showing what happens with different charsets
>> (like one message having 3 text/html parts, one us-ascii, one iso-8859-1
>> and one utf-8) that would make things clearer and (also) protect us from
>> regressions.
>>
>
> Here is a test I wrote. I tried to follow the other tests in formatting.
> Let me know if you want this as a single patch combined with the code
> to enable the option, I can resend it.

I took your patch, modified it a bit and put it at the end of the
'multipart' test. The diff for viewing is attached at the end.

The next question is whether the new option should be spelled

  --include-html, or --include-html=(true|false), or even
  --body=(true|false|text-and-html)

See the --exclude option in http://notmuchmail.org/manpages/notmuch-search-1/
and the --body option in http://notmuchmail.org/manpages/notmuch-show-1/
for comparison...
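
As a side note, the byte-level assumptions behind the expected output in
the diff below can be checked outside the test suite. This is only a
sketch, not part of the patch: the helper name body_length is made up for
illustration, iconv is assumed to be installed, and printf octal escapes
are used instead of $'...' so that it also runs on dash. The two lengths
should correspond to the content-length values of the two text/html parts.

```shell
#!/bin/sh
# Sketch only: checks the charset assumptions used by the test message.
u_00bd_latin1=$(printf '\275')      # U+00BD in ISO-8859-1: one byte
u_00bd_utf8=$(printf '\302\275')    # U+00BD in UTF-8: two bytes

# Hypothetical helper: byte count of an html part body, newline included.
body_length () {
    printf '<p>0.5 equals %s</p>\n' "$1" | wc -c | tr -d ' '
}

len_utf8=$(body_length "$u_00bd_utf8")
len_latin1=$(body_length "$u_00bd_latin1")

# Converting the ISO-8859-1 byte to UTF-8 should yield the same two bytes,
# which is why both html parts are expected to show the same character.
converted=$(printf '%s' "$u_00bd_latin1" | iconv -f ISO-8859-1 -t UTF-8)

echo "utf-8 part: $len_utf8 bytes, iso-8859-1 part: $len_latin1 bytes"
if [ "$converted" = "$u_00bd_utf8" ]; then
    echo "both parts decode to the same character"
fi
```

If the conversion check fails on some system, the difference is more likely
in the local iconv than in notmuch.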

Tomi

--8<----8<----8<----8<----8<--

diff --git a/test/multipart b/test/multipart
index c974226..11f10bd 100755
--- a/test/multipart
+++ b/test/multipart
@@ -647,4 +647,84 @@
 notmuch show --format=raw --part=3 id:base64-part-with-crlf > crlf.out
 echo -n -e "\xEF\x0D\x0A" > crlf.expected
 test_expect_equal_file crlf.out crlf.expected
-test_done
\ No newline at end of file
+
+# The ISO-8859-1 encoding of U+00BD is a single byte: octal 275
+# (Portability note: Dollar-Single ($'...', ANSI C-style escape sequences)
+# quoting works on bash, ksh, zsh, *BSD sh but not on dash, ash nor busybox sh)
+readonly u_00bd_latin1=$'\275'
+
+# The Unicode fraction symbol 1/2 is U+00BD and is encoded
+# in UTF-8 as two bytes: octal 302 275
+readonly u_00bd_utf8=$'\302\275'
+
+cat <<EOF > ${MAIL_DIR}/include-html
+From: A <a@example.com>
+To: B <b@example.com>
+Subject: html message
+Date: Sat, 01 January 2000 00:00:00 +0000
+Message-ID: <htmlmessage>
+MIME-Version: 1.0
+Content-Type: multipart/alternative; boundary="==-=="
+
+--==-==
+Content-Type: text/html; charset=UTF-8
+
+<p>0.5 equals ${u_00bd_utf8}</p>
+
+--==-==
+Content-Type: text/html; charset=ISO-8859-1
+
+<p>0.5 equals ${u_00bd_latin1}</p>
+
+--==-==
+Content-Type: text/plain; charset=UTF-8
+
+0.5 equals ${u_00bd_utf8}
+
+--==-==--
+EOF
+
+notmuch new > /dev/null
+
+cat_expected_head ()
+{
+    cat <<EOF
+[[[{"id": "htmlmessage", "match":true, "excluded": false, "date_relative":"2000-01-01",
+ "timestamp": 946684800,
+ "filename": "${MAIL_DIR}/include-html",
+ "tags": ["inbox", "unread"],
+ "headers": { "Date": "Sat, 01 Jan 2000 00:00:00 +0000", "From": "A <a@example.com>",
+ "Subject": "html message", "To": "B <b@example.com>"},
+ "body": [{
+ "content-type": "multipart/alternative", "id": 1,
+EOF
+}
+
+cat_expected_head > EXPECTED.nohtml
+cat <<EOF >> EXPECTED.nohtml
+"content": [
+ { "id": 2, "content-charset": "UTF-8", "content-length": 21, "content-type": "text/html"},
+ { "id": 3, "content-charset": "ISO-8859-1", "content-length": 20, "content-type": "text/html"},
+ { "id": 4, "content-type": "text/plain", "content": "0.5 equals \\u00bd\\n"}
+]}]},[]]]]
+EOF
+
+# Both the UTF-8 and ISO-8859-1 part should have U+00BD
+cat_expected_head > EXPECTED.withhtml
+cat <<EOF >> EXPECTED.withhtml
+"content": [
+ { "id": 2, "content-type": "text/html", "content": "<p>0.5 equals \\u00bd</p>\\n"},
+ { "id": 3, "content-type": "text/html", "content": "<p>0.5 equals \\u00bd</p>\\n"},
+ { "id": 4, "content-type": "text/plain", "content": "0.5 equals \\u00bd\\n"}
+]}]},[]]]]
+EOF
+
+test_begin_subtest "html parts excluded by default"
+notmuch show --format=json id:htmlmessage > OUTPUT
+test_expect_equal_json "$(cat OUTPUT)" "$(cat EXPECTED.nohtml)"
+
+test_begin_subtest "html parts included"
+notmuch show --format=json --include-html id:htmlmessage > OUTPUT
+test_expect_equal_json "$(cat OUTPUT)" "$(cat EXPECTED.withhtml)"
+
+test_done