Hastymail2

Hastymail2 is an Open Source IMAP webmail client written in PHP. Our focus is compliance, usability, security, and speed.

Character Encoding (Non UTF-8)

When Hastymail receives an email in a non-UTF-8 character set, for instance ISO-2022-JP with 8-bit encoding (very common in Japan), it converts the message to UTF-8 by default for display.

I have dug through all of Hastymail's PHP scripts, and UTF-8 is hard-coded everywhere.

When replying to an email that uses a different character set, the reply stays encoded in UTF-8.

It is quite possible that the recipient cannot read such an email, so it is good practice to reply in the same character set as the original. This is currently not possible with Hastymail. There is no option to set the character set; only the transfer encoding (8-bit, Base64 or Quoted-Printable) can be chosen.

Other open source webmail clients, such as Horde, follow the sender's encoding correctly.

I think Hastymail would be much better if the character set were not fixed.

The PHP manual's page on multibyte encoding is also quite clear on this:

http://jp.php.net/manual/en/mbstring.ja-basic.php

Hastymail uses mb_convert_encoding from the PHP Multibyte String Functions to convert other encodings into UTF-8.

Since Hastymail already depends on PHP 4 (or even PHP 5 when users need other features such as database support), maybe it would be a good idea to use more mb_ functions, as in this example?

  // Assume the encoding of the received text is detected like this:
  // $enc = mb_detect_encoding($received_text);
  mb_language("japanese");
  mb_internal_encoding("EUC-JP");

  // Send a Japanese email
  $to = "katou@example.com";
  $subject = "例の件について";
  $body = "どうでしょう?";
  $from = mb_encode_mimeheader(mb_convert_encoding("山本 正喜","JIS","EUC-JP"))." <masaki@example.com>";

  // The Japanese email is sent correctly
  mb_send_mail($to,$subject,$body,"From: ".$from);

No further comments. I like this lightweight Hastymail very much after a few days of playing with it!

Cheers!

Fred

Re: Character Encoding (Non UTF-8)

The problem as I see it is that in order to display different character sets on the same HTML page we have no choice but to convert them all to UTF-8. This is obvious when considering the mailbox view. Imagine subjects in different encodings and languages. Once converted to UTF-8 they can all be properly displayed on the same page under a single character set, because HTML only allows a web page to have one character set. So while it is feasible to enable outgoing messages to use a different character set, we cannot correctly display that on the compose page, because the interface and the contents of the textarea would be in different encodings. The only thing I can think of would be to convert to UTF-8 as we do now while the reply is being composed, then convert back to the original character set after the user hits send.
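A rough sketch of that round trip, for illustration only. The helper names to_compose_text and to_outgoing_text are made up and are not Hastymail functions; the sketch assumes the mbstring extension is loaded and that falling back to UTF-8 is acceptable when the reply contains characters the original character set cannot represent:

```php
<?php
// Hypothetical helpers for illustration; not part of Hastymail.

// Convert the original message text to UTF-8 so it can be shown
// in the compose textarea alongside the UTF-8 interface.
function to_compose_text($body, $orig_charset)
{
    return mb_convert_encoding($body, 'UTF-8', $orig_charset);
}

// After the user hits send, convert the UTF-8 reply back to the
// original charset. Returns array(charset, body). If converting
// back and forth changes the text, some characters were lost, so
// the reply is kept in UTF-8 instead.
function to_outgoing_text($utf8_body, $orig_charset)
{
    $converted  = mb_convert_encoding($utf8_body, $orig_charset, 'UTF-8');
    $round_trip = mb_convert_encoding($converted, 'UTF-8', $orig_charset);
    if ($round_trip !== $utf8_body) {
        return array('UTF-8', $utf8_body);
    }
    return array($orig_charset, $converted);
}
```

The returned charset name would then go into the outgoing Content-Type header, so the header always matches the bytes that are actually sent.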

I already assigned the bug you created for this to myself in the tracker. I appreciate the feedback. I will think about it some more and let you know what I come up with.

Take care,
Jason

Re: Character Encoding (Non UTF-8)

Hi Jason,

Thanks for considering it! I also think the best idea is to convert back to the original character set just before sending, since that has the least impact on your great mailer package. On the other hand, I would also introduce a parameter for a preferred local character set, e.g. replace every hard-coded UTF-8 with a configurable variable such as $local_character_set. I might give it a try myself, to learn from your application and gain some PHP programming experience.

FYI, this is a sample list of the character set and encoding combinations used by various senders:


COUNTRY OF ORIGIN   CharacterSet   Encoding
Taiwan              big5           base64
China               gb2312         base64
Singapore           us-ascii       8-Bit
Japan               gb2312         quoted-printable
Netherlands         iso-8859-1     quoted-printable
Japan               iso-2022-jp    8-Bit
USA                 utf-8          7-Bit
Brazil              utf-8          quoted-printable

I found a rare exception where Japanese is not shown correctly in Hastymail's UTF-8 view.

This was copied from the email's header details in Hastymail after clicking "Full-headers":

Content-Type: text/plain; charset="iso-2022-jp"
Content-Transfer-Encoding: 7bit

However, the character set is not shown at the bottom of the page under "Message Parts"; only the 7bit encoding and the MIME type text/plain appear, as below:

Message Parts

Mime-type    Filename    Description   Charset   Encoding   Size
text/plain   message_1                           7bit       7.7 KB
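For reference, the charset parameter in a Content-Type header like the one quoted above can be read with a regular expression. This is only a minimal sketch with a made-up helper name (charset_from_content_type), and it says nothing about how Hastymail actually parses headers or why the Charset column is empty here:

```php
<?php
// Hypothetical helper for illustration; not part of Hastymail.
// Returns the charset parameter in lowercase, or null when the
// header carries no charset parameter.
function charset_from_content_type($header)
{
    // Accept both charset="iso-2022-jp" and charset=iso-2022-jp.
    if (preg_match('/charset\s*=\s*"?([^";\s]+)"?/i', $header, $m)) {
        return strtolower($m[1]);
    }
    return null;
}
```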

Other observations:

With "Full-headers" selected, the email is displayed readably. With "Small-headers" selected, the mail contents are not readable. I am not sure whether this should be considered a bug or whether the sender did not comply with the RFCs.

Thanks and best regards,

Fred

Haste is waste
Hastymail does prevail

 

Re: Character Encoding (Non UTF-8)

B.T.W.

Only Internet Explorer 7.0 can display the contents of that problematic email.

Opera 9.6 shows this error:

XML parsing failed: syntax error (Line: 1, Character: 23546)

Reparse document as HTML
Error:invalid character
Specification:http://www.w3.org/TR/REC-xml/#NT-Char

Firefox 3.0:

XML Parsing Error: not well-formed
Location: https://tyo.pilship.com/ilohamail/hasty/?page=message&uid=35435&mailbox_page=3&sort_by=ARRIVAL&filter_by=ALL&mailbox=INBOX
Line Number 1, Column 23548:

Safari 3.2.2 is also unable to render it. The error is:

This page contains the following errors:
error on line 1 at column 14237: internal error
Below is a rendering of the page up to the first error.

cheers,

Fred

Re: Character Encoding (Non UTF-8)

These problems occur because we are not properly handling a character in the email content that needs to be replaced with an HTML entity. The strict behavior is caused by the application/xhtml+xml content type header. Issuing this header allows browsers that support it to render pages more efficiently. The cost is that a single incorrect entity causes the entire page to fail. You can disable the header by changing the $http_content_header variable in the index.php file of the latest SVN version to html instead of xhtml. This should make it possible to view the message that is failing.
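As an illustration of the kind of handling involved, here is a minimal sketch that makes message text safe to embed in a page served as application/xhtml+xml. The helper name xhtml_safe is made up and this is not the actual Hastymail fix; it assumes the text has already been converted to UTF-8. It drops characters that the XML 1.0 Char production forbids (for example a stray ESC byte left over from ISO-2022-JP escape sequences) and then escapes markup characters:

```php
<?php
// Hypothetical helper for illustration; not part of Hastymail.
// Assumes $text is UTF-8.
function xhtml_safe($text)
{
    // Remove code points outside the XML 1.0 Char production:
    // #x9 | #xA | #xD | #x20-#xD7FF | #xE000-#xFFFD | #x10000-#x10FFFF
    $clean = preg_replace(
        '/[^\x{9}\x{A}\x{D}\x{20}-\x{D7FF}\x{E000}-\x{FFFD}\x{10000}-\x{10FFFF}]/u',
        '',
        $text
    );
    // preg_replace returns null when the input is not valid UTF-8.
    if ($clean === null) {
        $clean = '';
    }
    // Replace &, <, > and quotes with entities.
    return htmlspecialchars($clean, ENT_QUOTES, 'UTF-8');
}
```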

If you can send a copy of the message (forwarded as an attachment so it is not altered) to jason@hastymail.org I will see if I can reproduce and fix the issue.

Thanks for the feedback,
Jason
