When Hastymail receives an email in, for instance, ISO-2022-JP with 8-bit encoding, which is very common in Japan, it converts the message to UTF-8 by default for display.
I have dug through all the PHP scripts of Hastymail and everything is fixed to UTF-8.
When replying to an email in a different encoding, the reply is still encoded in UTF-8.
It is quite possible that the receiver cannot read such an email, so it is good practice to reply in the same character encoding as the original. This is not possible with Hastymail at the moment. There is no option to set the character encoding; the only choices are 8-bit, Base64 and Quoted-Printable.
Other open source webmail clients such as Horde follow the sender's encoding correctly.
I think Hastymail would be much better if the character encoding were not fixed.
The PHP Manual is also quite clear on multi-byte encoding here:
http://jp.php.net/manual/en/mbstring.ja-basic.php
Hastymail uses mb_convert_encoding of the PHP Multibyte String Functions to convert other encodings into UTF-8.
Since Hastymail already depends on PHP4, and users who need other (DB) features may even be on PHP5, maybe it would be a good idea to use more mb_ functions, as in this example?
- assume Encoding is derived like this $enc = mb_detect_encoding($received_text)
- mb_language("japanese");
- mb_internal_encoding("EUC-JP");
- // Send a Japanese email
- $to = "katou@example.com";
- $subject = "例の件について";
- $body = "どうでしょう?";
- $from = mb_encode_mimeheader(mb_convert_encoding("山本 正喜","JIS","EUC-JP"))."<masaki@example.com>";
- // The Japanese email is sent correctly
- mb_send_mail($to,$subject,$body,"From:".$from);
No further comments. I like this light-weight HastyMail very much after a few days of playing with it!
Cheers!
Fred
The problem as I see it is that in order to display different character sets on the same HTML page we have no choice but to convert them all to UTF-8. This is obvious when considering the mailbox view. Imagine subjects in different encodings and languages. Once converted to UTF-8 they can all be properly displayed on the same page under a single character set. HTML only allows a web page to have one character set. So while it is feasible to enable outgoing messages to use a different character set, we cannot correctly display that on the compose page, because the interface and the contents of the textarea would be in different encodings. The only thing I can think of would be to convert to UTF-8 as we do now while the reply is being composed, then after hitting send we convert back to the original character set.
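A minimal sketch of that round trip, assuming PHP's mbstring extension is available. The function names `to_display_utf8` and `to_outgoing_charset` are made up for illustration and are not part of Hastymail. The fallback to UTF-8 when the reply contains characters the original character set cannot represent is also just one possible policy.

```php
<?php
// Hypothetical helpers for illustration only; not Hastymail code.

// Convert incoming text from the sender's charset to UTF-8 so it can be
// shown and edited on a UTF-8 page.
function to_display_utf8($text, $source_charset) {
    return mb_convert_encoding($text, 'UTF-8', $source_charset);
}

// Convert the composed UTF-8 reply back to the sender's charset just
// before sending. Returns array(charset, bytes). If converting there and
// back does not reproduce the text, some character was not representable,
// so the reply is kept in UTF-8 instead.
function to_outgoing_charset($utf8_text, $target_charset) {
    $converted  = mb_convert_encoding($utf8_text, $target_charset, 'UTF-8');
    $round_trip = mb_convert_encoding($converted, 'UTF-8', $target_charset);
    if ($round_trip !== $utf8_text) {
        return array('UTF-8', $utf8_text);
    }
    return array($target_charset, $converted);
}

// Usage: reply to an ISO-2022-JP message in the same charset.
list($charset, $body) = to_outgoing_charset("例の件について", 'ISO-2022-JP');
```

The returned charset would then go into the outgoing Content-Type header, so the header always matches the bytes actually sent.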
I already assigned the bug you created for this to myself in the tracker. I appreciate the feedback. I will think about it some more and let you know what I come up with.
Take care,
Jason
Hi Jason,
Thanks for considering this! I also think converting back to the original character set just before sending is the best idea, since it has the least impact on your great mailer package. On the other hand, I would introduce a parameter for the preferred local character set, e.g. replace every hard-coded UTF-8 with a presettable variable $local_character_set. I might give it a try, play around with your application and gain some programming experience in PHP.
FYI, this is a sample list of the character set and encoding combinations used by various senders:
| COUNTRY OF ORIGIN | CharacterSet | Encoding |
| Taiwan | big5 | base64 |
| China | gb2312 | base64 |
| Singapore | us-ascii | 8-Bit |
| Japan | gb2312 | quoted-printable |
| Netherlands | iso-8859-1 | quoted-printable |
| Japan | iso-2022-jp | 8-Bit |
| USA | utf-8 | 7-Bit |
| Brazil | utf-8 | quoted-printable |
I found a rare exception where Japanese is not shown correctly in HastyMail's UTF-8 view.
This was copied from the email's header details in Hastymail after pressing "Full-headers":
| Content-Type: | text/plain; charset="iso-2022-jp" |
|---|---|
| Content-Transfer-Encoding: | 7bit |
However, the character set is not shown at the bottom of the page under "Message Parts"; only the 7bit encoding and the MIME type text/plain appear, as below:
Message Parts
| | Mime-type | Filename | Description | Charset | Encoding | Size |
|---|---|---|---|---|---|---|
| -> View / Download | text/plain | message_1 | | | 7bit | 7.7 KB |
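For comparison, the charset parameter in a Content-Type value like the one quoted above can be read with a small regular expression. This is only a sketch: `charset_from_content_type` is a hypothetical name, not a Hastymail function, and real MIME parsing has more cases (comments, folded headers, per-part headers) than this handles. The `us-ascii` default follows RFC 2045 for text parts with no charset parameter.

```php
<?php
// Hypothetical helper for illustration only; not Hastymail code.
// Returns the lower-cased charset parameter of a Content-Type value,
// or $default when the parameter is absent.
function charset_from_content_type($header_value, $default = 'us-ascii') {
    // Accepts both charset=iso-2022-jp and charset="iso-2022-jp".
    if (preg_match('/;\s*charset\s*=\s*"?([^";\s]+)"?/i', $header_value, $m)) {
        return strtolower($m[1]);
    }
    return $default;
}

// Usage with the header value from the failing message.
$charset = charset_from_content_type('text/plain; charset="iso-2022-jp"');
```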
Other observations:
With "Full-headers" the email is displayed in readable form. With "Small-headers" the mail content is not readable. I am not sure whether this should be considered a bug or whether the sender did not comply with the RFCs.
thanks and best regards,
Fred
| Haste is waste | Hastymail does prevail |
B.T.W.
Only Internet Explorer (7.0) can show the content of that exceptional email.
Opera 9.6 shows this error:
XML parsing failed: syntax error (Line: 1, Character: 23546)
Reparse document as HTML
Error:invalid character
Specification:http://www.w3.org/TR/REC-xml/#NT-Char
FireFox 3.0:
XML Parsing Error: not well-formed
Location: https://tyo.pilship.com/ilohamail/hasty/?page=message&uid=35435&mailbox_page=3&sort_by=ARRIVAL&filter_by=ALL&mailbox=INBOX
Line Number 1, Column 23548:
Safari 3.2.2 is also unable to render it; the error is:
This page contains the following errors:
error on line 1 at column 14237: internal error
Below is a rendering of the page up to the first error.
cheers,
Fred
These problems occur because we are not properly handling a character in the email content that needs to be replaced with an HTML entity. The strict behavior is caused by the application/xhtml+xml content header. Issuing this header allows browsers that support it to render pages more efficiently. The cost is that a single incorrect entity causes the entire page to fail. You can disable the header by changing the $http_content_header variable in the index.php file of the latest SVN version to html instead of xhtml. This should make it possible to view the message that is failing.
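As a rough illustration of the kind of escaping involved, the sketch below makes a UTF-8 string safe to embed in a page served as application/xhtml+xml. `xhtml_safe` is a hypothetical helper, not the actual Hastymail fix. It assumes the text has already been converted to UTF-8, and it also drops control characters that XML 1.0 does not allow at all (for example the ESC byte used by ISO-2022-JP escape sequences), which is a guess at one possible cause rather than a confirmed diagnosis.

```php
<?php
// Hypothetical helper for illustration only; not Hastymail code.
function xhtml_safe($utf8_text) {
    // Remove code points outside the XML 1.0 Char production:
    // only tab, LF, CR and U+0020 upward (minus surrogates, U+FFFE,
    // U+FFFF) may appear in an XML document.
    $clean = preg_replace(
        '/[^\x{9}\x{A}\x{D}\x{20}-\x{D7FF}\x{E000}-\x{FFFD}\x{10000}-\x{10FFFF}]/u',
        '',
        $utf8_text
    );
    if ($clean === null) {
        // preg_replace returns null when the input is not valid UTF-8.
        $clean = '';
    }
    // Replace &, <, >, " and ' with entities.
    return htmlspecialchars($clean, ENT_QUOTES, 'UTF-8');
}

// Usage: the ESC byte is dropped and the markup characters are escaped.
$safe = xhtml_safe("a\x1Bb<c>&");
```

Stripping only keeps the page well-formed; text that was never converted from its declared character set would still be unreadable.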
If you can send a copy of the message (forwarded as an attachment so it is not altered) to jason@hastymail.org I will see if I can reproduce and fix the issue.
Thanks for the feedback,
Jason