Comment: In November 2003, UTF-8 was restricted by RFC 3629 to match the constraints of the UTF-16 character encoding: explicitly prohibiting code points corresponding to the high and low surrogate characters removed more than 3% of the three-byte sequences, and ending at U+10FFFF removed more than 48% of the four-byte sequences and all five- and six-byte sequences.

...

The standard UTF-8 character set used by MySQL databases (called utf8) does not truly support all UTF-8 characters: it can only use a maximum of 3 bytes per character. This leaves out the remaining 4-byte characters, including most “emojis” (😕 for example).
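To make the 3-byte limit concrete, here is a minimal Python sketch that measures how many bytes a character needs in UTF-8. The helper names `utf8_length` and `fits_mysql_utf8` are hypothetical and exist only for this illustration; they are not part of MySQL, Jira, or JEMH.

```python
def utf8_length(ch: str) -> int:
    """Return the number of bytes needed to encode one character in UTF-8."""
    return len(ch.encode("utf-8"))


def fits_mysql_utf8(text: str) -> bool:
    """True if every character in text encodes to 3 bytes or fewer,
    which is the limit of MySQL's utf8 character set described above."""
    return all(utf8_length(ch) <= 3 for ch in text)


# Sample characters: Latin letter, accented letter, euro sign, emoji (U+1F615).
for sample in ["A", "\u00e9", "\u20ac", "\U0001F615"]:
    print(repr(sample), utf8_length(sample), fits_mysql_utf8(sample))
```

Only the last sample, the emoji, needs 4 bytes, so it is the only one that would not fit.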

Both Atlassian and ourselves recommend using PostgreSQL, an SQL database that fully supports all possible UTF-8 characters. According to Atlassian, as of Jira 7.3 they also support MySQL 5.7, which should apparently work with the utf8mb4 character set. However, if upgrading Jira and MySQL is not possible, there are some things that can be done using JEMH to alleviate the problems.

...

One of JEMH's great features is its modular pre-processing task system.  Particular email processing problems can be overcome by enabling specific tasks to run before the main email processing begins.

The MySQL Subject Cleaner pre-processing task has been added to JEMH. See JEMH-5291 for the versions it was added in. This task filters out 4-byte characters from email subjects, meaning that Jira should not have a problem storing the resulting issue summary.
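The effect of such a filter can be sketched in a few lines of Python. This is an illustration only, not JEMH's actual implementation; the function name `strip_four_byte_chars` is hypothetical. It relies on the fact that code points above U+FFFF are exactly the ones that need 4 bytes in UTF-8.

```python
def strip_four_byte_chars(subject: str) -> str:
    """Drop every character whose code point is above U+FFFF.

    Those are the characters that take 4 bytes in UTF-8; everything
    else (1 to 3 bytes) is kept unchanged.
    """
    return "".join(ch for ch in subject if ord(ch) <= 0xFFFF)


# The emoji (U+1F525) is removed; surrounding whitespace is left as-is.
print(strip_four_byte_chars("Printer on fire \U0001F525 please help"))
```

Note that the sketch removes characters without tidying the spaces around them, so a subject may end up with a doubled space where an emoji used to be.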

...

Cleaning email body content using a Body Cleanup Regular Expression

If your Jira is running on MySQL, unsupported 4-byte characters in the email body could also be a problem. Jira will try to save the content as the description or a comment, and may fail if such characters are present and unsupported. If you suspect this to be the case, you can use the Body Cleanup Regexps setting found under Profile>Email to cut out these characters, allowing successful processing.
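As a rough illustration of what such an expression has to match, the Python sketch below uses a character class covering U+10000 to U+10FFFF, the range of code points that need 4 bytes in UTF-8. The names `FOUR_BYTE_CHARS` and `clean_body` are hypothetical, and the pattern syntax is Python's; the exact form the Body Cleanup Regexps setting expects is not confirmed here and should be checked before use.

```python
import re

# Matches any single code point outside the Basic Multilingual Plane,
# i.e. any character that needs 4 bytes in UTF-8.
FOUR_BYTE_CHARS = re.compile("[\U00010000-\U0010FFFF]")


def clean_body(body: str) -> str:
    """Remove all 4-byte characters from an email body."""
    return FOUR_BYTE_CHARS.sub("", body)


# The thumbs-up emoji (U+1F44D) is removed; the euro sign (3 bytes) is kept.
print(clean_body("Thanks \U0001F44D, the total is \u20ac20."))
```

Since Jira and JEMH run on Java, an equivalent often written in Java regular expression syntax is `[^\u0000-\uFFFF]`; treat that as an assumption to test against a sample email rather than a confirmed setting value.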

...