How To Use Pre-Processing Tasks
- 1 Scenario
- 2 Setting up a Pre-Processing Task
- 3 Pre-Processing Task details
- 3.1 Script Pre Processing Task
- 3.2 Missing To: Address
- 3.3 Groupwise Addressee
- 3.4 Content Type Mapper
- 3.5 Mime Multipart
- 3.6 Invalid Content Transfer Encoding Remover
- 3.7 MIME Multipart Boundary Updater
- 3.8 Quoted Address Removal
- 3.9 Single Leading Quoted Personal Fixer
- 3.10 Trailing Dot Address Removal
- 3.11 Double Route Address Removal
- 3.12 Content Disposition Filename with unquoted filenames
- 3.13 Missing From Address
- 3.14 Illegal Disposition Attribute Task
- 3.15 Potential Spam Header Locator
- 3.16 Local Address Removal Task
- 3.17 Single Quoted contenttype Removal Task
- 3.18 Semi-colon address separator removal Task
- 3.19 MySQL Subject Cleaner
- 3.20 Single Quote to Double Quote email address personal part converter
- 4 Debugging
- 5 Related articles
Tasks available vary by JEMH version
Scenario
Pre-processing tasks allow email message headers to be manipulated in specific ways in order to change the outcome of email processing.
Pre-processing tasks are designed to help overcome specific problems that emails can have. It is strongly recommended that a task is not enabled unless you are certain of the cause of a processing problem.
Setting up a Pre-Processing Task
Enabling the use of tasks
First, JEMH's Auditing feature must be enabled. To do this, go to JEMH>Auditing>Settings>Inbound auditing and click the Toggle Switch. This is required as the Auditing function handles the storage of incoming emails.
In this example we will be enabling the Groupwise Addressee Pre-processing task, but the process is the same for all other Tasks. Edit the Email section of the Profile:
Scroll down to the Pre-processing section, and check the Use Reprocessed Message check-box.
Selecting Pre-Processing Tasks
Under Use Reprocessed Message, you will see a list of available pre-processing tasks. Select a task with a click; de-select it with CTRL + click.
In this example, the Groupwise task will be selected:
Once selected, scroll to the bottom of the page and save the changes to the JEMH profile by using the Submit button.
Once the form is saved the selected task will be shown on the main configuration view. Some tasks can be configured further. Clicking the edit icon next to a task opens a task-specific configuration pop-up. Where no configuration exists, some helpful text will be presented.
Pre-Processing Task details
Script Pre Processing Task
See subpage : How to use Script Pre-Proc Task
Missing To: Address
Since 1.7.x
Function
If the To: header is missing (perhaps because the message was BCC delivered and the final-leg mail host didn't include a courteous Delivered-To: header), JEMH can inject a configurable value. "Missing" also covers the case where the To: header is present but has no value.
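The behaviour described above can be sketched as follows. This is only an illustration using Python's standard `email` package, not JEMH's implementation; the function name `ensure_to` and the sample messages are assumptions made for the example.

```python
from email import message_from_string

def ensure_to(raw_message, default_to):
    """Inject default_to when the To: header is absent or has no value."""
    msg = message_from_string(raw_message)
    current = msg.get("To")
    if current is None or not current.strip():
        # Deleting a header that is not present is a no-op, so this is safe
        # for both the "absent" and the "present but empty" cases.
        del msg["To"]
        msg["To"] = default_to
    return msg

fixed = ensure_to("Subject: hi\n\nbody", '"Personal Part" <user@place.com>')
print(fixed["To"])
```

A message that already carries a non-empty To: header passes through unchanged.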
Scenario
In this case, clicking the edit icon brings up the editable address entry form. A single valid SMTP address should be supplied, which can include a personal part if needed, e.g.:
<user@place.com>
"Personal Part" <user@place.com>
Here is what the task looks like once selected while editing the Email section:
Clicking the pen icon above allows entry of the address:
Groupwise Addressee
Since 1.5.x
Function
Fixes various problems that can occur with Groupwise addressees.
Scenario
In this case, clicking the edit icon brings up the properties dialog for the task:
In this case, the options have been set to manipulate the nominated Groupwise group (which did not expand to addressees, and therefore made the message unreadable), converting spaces to underscores and embedding the result in the %group% rename pattern. As a result, when a group 'Some Whitelisted Groupname' is encountered, it is replaced with the valid email address <Some_Whitelisted_Groupname@yourco.net>, enabling the message to be processed automatically.
Group Handling
Remove (all groups) such that only email addresses remain.
Rename (some or all groups) to email addresses using the remaining fields.
Group Whitelisting
If specified, a comma-separated list of the actual group names found in the email addressee fields that should be renamed to email addresses. Leaving this field empty means that all groups will be converted.
Space Replacement Char
Replacing the spaces in the group name allows a valid email address to be formed (in the test cases seen to date). The default replacement is an underscore. Only the first character of a supplied value is used.
Rename Pattern
The pattern %group%@yourco.net is a template: the group name above is injected in place of %group% so that a valid address is formed.
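How the whitelist, space replacement and rename pattern fit together can be illustrated with a short Python sketch. It is an assumption-laden illustration rather than JEMH's code: the function name `rename_group` and its parameters are hypothetical, and only the Rename handling is shown.

```python
def rename_group(group, pattern="%group%@yourco.net", space_char="_", whitelist=None):
    """Return the address formed from a group name, or None if not whitelisted."""
    # An empty whitelist means every group is converted.
    if whitelist and group not in whitelist:
        return None
    # Only the first character of the supplied replacement is used.
    replacement = space_char[:1]
    local_part = group.replace(" ", replacement)
    return "<" + pattern.replace("%group%", local_part) + ">"

print(rename_group("Some Whitelisted Groupname"))
# prints <Some_Whitelisted_Groupname@yourco.net>
```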
Content Type Mapper
Since 1.6.x
Illegal Main content CharType
Maps top-level email header MIME content types that are invalid for Java Mail to ones that are valid (see https://docs.oracle.com/javase/8/docs/technotes/guides/intl/encoding.doc.html).
Scenario
Let's say you wish to process the following email:
Subject: TEST
To: changeme@thiswontwork.com
From: sender@example.com
Date: Wed, 13 Nov 2015 11:22:11 +0900
MIME-Version: 1.0
Content-type: text/plain; charset=GB2312
Content-transfer-encoding: base64
<encoded content>
Charset GB2312 is not valid for Java Mail to process. Therefore, when the mail is processed, you may find some characters are not shown properly. In order to get around this we will map to the similar but valid GBK charset.
Configuration
Select the task from the list of tasks and save
Click Edit on the task
Enter the mapping that you would like to make
Mappings for Known Problem Content Types
The following is a list of known configurations for problem content types.
Problem Type | Description | Configuration |
---|---|---|
GB2312 | GB2312 charset is not valid for Java Mail to process. Mapping to the similar GBK charset makes it work. | GB2312=GBK |
CP932 | CP932 is Microsoft's extension of the Shift_JIS character encoding. Java Mail does not recognise this as an alias. MS932 is a supported alias. | CP932=MS932 |
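The effect of a mapping such as GB2312=GBK on the Content-Type header can be sketched in Python. This is a rough illustration only; `map_charset` and the dictionary form of the mappings are assumptions for the example and do not reflect how JEMH stores or applies its configuration.

```python
import re

# Matches the charset parameter, with or without double quotes around the value.
CHARSET_PARAM = re.compile(r'(charset="?)([^";\s]+)("?)', re.IGNORECASE)

def map_charset(content_type, mappings):
    """Swap a problem charset in a Content-Type value for its mapped replacement."""
    lookup = {bad.lower(): good for bad, good in mappings.items()}

    def swap(match):
        replacement = lookup.get(match.group(2).lower())
        if replacement is None:
            return match.group(0)  # charset is not in the mapping, keep as-is
        return match.group(1) + replacement + match.group(3)

    return CHARSET_PARAM.sub(swap, content_type)

mappings = {"GB2312": "GBK", "CP932": "MS932"}
print(map_charset("text/plain; charset=GB2312", mappings))
# prints text/plain; charset=GBK
```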
Mime Multipart
Since 3.3.52
Function
This pre-proc task can also update MIME multipart encoded mail to correct illegal or incorrect MIME types. In the following example, the sender declares the content type of the attachment as 'unknown/pdf'. This in itself causes no processing problem, but when the attachment is downloaded, client browsers don't know how to handle that type and cannot associate it with a PDF reader application.
From: <sender@blah.com>
Subject: stuff
To: <c@d.com>
CC: <a@b.com>
Date: Fri, 27 Aug 2021 05:15:17 +0200
Message-ID: <OF15486DFF.F74916F2-ONC125873E.0011E105@nnnnn.com>
Content-Type: multipart/mixed;
boundary="PART.BOUNDARY.MU.123.456.789"
MIME-Version: 1.0
--PART.BOUNDARY.MU.123.456.789
Content-Transfer-Encoding: base64
Content-Type: text/plain; charset="borked"
payload here
--PART.BOUNDARY.MU.123.456.789
Content-Type: unknown/pdf; name="FBB BE.PDF"
Content-Disposition: attachment; filename="FBB BE.PDF"
Content-Transfer-Encoding: base64
encoded content here
--PART.BOUNDARY.MU.123.456.789--
This can then be corrected with:
unknown/pdf=application/pdf
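To show what such a mapping does to an individual MIME part, here is a small sketch using Python's standard `email` package. It is not JEMH's implementation; `remap_part_types` and the shortened sample message are assumptions made for illustration.

```python
from email import message_from_string

# Shortened stand-in for the multipart message shown above.
RAW = (
    'Content-Type: multipart/mixed; boundary="B"\n'
    'MIME-Version: 1.0\n'
    '\n'
    '--B\n'
    'Content-Type: text/plain\n'
    '\n'
    'payload here\n'
    '--B\n'
    'Content-Type: unknown/pdf; name="FBB BE.PDF"\n'
    'Content-Disposition: attachment; filename="FBB BE.PDF"\n'
    '\n'
    'encoded content here\n'
    '--B--\n'
)

def remap_part_types(message, mappings):
    """Replace the media type of every part whose type appears in mappings."""
    for part in message.walk():
        replacement = mappings.get(part.get_content_type())
        if replacement:
            # set_type keeps existing parameters such as name="..."
            part.set_type(replacement)
    return message

fixed = remap_part_types(message_from_string(RAW), {"unknown/pdf": "application/pdf"})
types = [part.get_content_type() for part in fixed.walk()]
print(types)
```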
Invalid Content Transfer Encoding Remover
Since 1.7.x
Function
Allows an illegal or redundant value (e.g. UTF-8) in the Content-Transfer-Encoding header to be removed or replaced with another value.
The example below caters to a specific issue where an email defines the Content-Transfer-Encoding header with a value of UTF-8. This is an invalid content transfer encoding and will cause email processing to fail.
The pre-processing task can be configured with a key=value replacement for invalid content transfer encoding values. In the screenshot below, the value UTF-8 has been replaced with an empty value by specifying UTF-8= in the pre-processing task configuration.
Configuration
Email header before pre-processing task | Email header after pre-processing task |
---|---|
Content-Transfer-Encoding: UTF-8 | Content-Transfer-Encoding: |
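The key=value replacement can be sketched in Python as below. This is a hedged illustration, not the task's real code: `parse_rule` and `fix_transfer_encoding` are hypothetical names, and handling a single rule at a time is an assumption.

```python
import re

# One Content-Transfer-Encoding header line, value captured without padding.
HEADER = re.compile(
    r"^(Content-Transfer-Encoding:)[ \t]*(.*?)[ \t]*$",
    re.IGNORECASE | re.MULTILINE,
)

def parse_rule(rule):
    """Split 'UTF-8=' into ('UTF-8', ''); the part after '=' may be empty."""
    bad, _, good = rule.partition("=")
    return bad.strip(), good.strip()

def fix_transfer_encoding(raw_headers, rule):
    """Replace a matching Content-Transfer-Encoding value, ignoring case."""
    bad, good = parse_rule(rule)

    def swap(match):
        if match.group(2).lower() != bad.lower():
            return match.group(0)  # some other encoding, leave untouched
        return match.group(1) + (" " + good if good else "")

    return HEADER.sub(swap, raw_headers)

print(fix_transfer_encoding("Content-Transfer-Encoding: UTF-8", "UTF-8="))
# prints Content-Transfer-Encoding:
```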
MIME Multipart Boundary Updater
Since 1.7.x
Function
Allows illegal MIME multipart boundaries to be fixed dynamically.
Configuration
No configurable properties are available for this pre-processing task.
Quoted Address Removal
Since 1.5.x
Function
Filter the "From", "To" and "Cc" addresses for illegal quoting of addresses.
Scenario
JEMH may encounter an incoming email with illegal quotes in an address field, for example:
To: "Bob Bobbington" <'bob@bob.com'>
The Quoted Address Removal task can fix this field in order to allow successful processing:
To: "Bob Bobbington" <bob@bob.com>
Configuration
No configurable properties are available for this pre-processing task.
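The kind of rewrite this task performs can be approximated with a regular expression. The sketch below is an assumption about the general approach, not JEMH's actual logic, and `strip_quoted_address` is a hypothetical name.

```python
import re

def strip_quoted_address(header_value):
    """Remove single quotes wrapped around an address inside angle brackets."""
    return re.sub(r"<'([^'<>]+)'>", r"<\1>", header_value)

print(strip_quoted_address("\"Bob Bobbington\" <'bob@bob.com'>"))
# prints "Bob Bobbington" <bob@bob.com>
```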
Single Leading Quoted Personal Fixer
Since 1.9.x
Function
Filters From, To and Cc addresses for personal names that carry an illegal single leading quote.
Scenario
JEMH may encounter an incoming email that does not correctly quote the personal name of an address:
" <some@place.com>
This pre-processing task can fix this problem in order for processing to be successful:
<some@place.com>
Configuration
No configurable properties are available for this pre-processing task.
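As a rough illustration of the fix, the stray leading quote can be dropped with a regular expression. This is a sketch under the assumption that the header value holds a single address; `drop_leading_quote` is a hypothetical name and not part of JEMH.

```python
import re

def drop_leading_quote(value):
    """Remove a lone leading double quote that directly precedes <address>."""
    return re.sub(r'^\s*"\s*(?=<[^<>]+>)', "", value)

print(drop_leading_quote('" <some@place.com>'))
# prints <some@place.com>
```

A correctly quoted personal name such as "Bob" <bob@bob.com> is left untouched, because the quote is not immediately followed by the bracketed address.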
Trailing Dot Address Removal
Since 1.7.x
Function
Filter the "From", "To" and "Cc" addresses for illegal trailing dots.
Scenario
JEMH may encounter an incoming email with trailing dots in an address field, for example:
From: "Bob Bobbington" <bob@bob.com.>
The Trailing Dot Address Removal task can "fix" this field:
From: "Bob Bobbington" <bob@bob.com>
Configuration
No configurable properties are available for this pre-processing task.
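A minimal sketch of the trailing-dot fix, assuming a regular-expression approach (the function name `strip_trailing_dot` is hypothetical and this is not JEMH's code):

```python
import re

def strip_trailing_dot(header_value):
    """Remove dots that sit just before '>' or at the very end of the value."""
    return re.sub(r"\.+(?=>|\s*$)", "", header_value)

print(strip_trailing_dot('From: "Bob Bobbington" <bob@bob.com.>'))
# prints From: "Bob Bobbington" <bob@bob.com>
```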
Double Route Address Removal
Since 1.7.x
Function
Filter the "From", "To" and "Cc" addresses for illegal double address routes.
Scenario
JEMH may encounter an incoming email with double address routes in an address field, for example:
From: <bob@bob.com> <bob@bob.com>
This can be fixed:
From: "bob@bob.com" <bob@bob.com>
Configuration
No configurable properties are available for this pre-processing task.
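The before and after shown in the scenario suggest that the first route is reused as the personal part. A sketch of that transformation, assuming a regular-expression approach and a hypothetical function name:

```python
import re

def fix_double_route(header_value):
    """Turn '<a> <b>' into '"a" <b>' so only one route address remains."""
    return re.sub(r"<([^<>]+)>\s*<([^<>]+)>", r'"\1" <\2>', header_value)

print(fix_double_route("From: <bob@bob.com> <bob@bob.com>"))
# prints From: "bob@bob.com" <bob@bob.com>
```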
Content Disposition Filename with unquoted filenames
Since 1.8.x
Function
Filter body content for Content Disposition multi-parts that are non-compliant and not parseable (spaces in file names that are not quoted).
Scenario
When attachments are present in a multi-part email, JEMH may encounter file names that contain spaces that are unquoted.
Configuration
No configurable properties are available for this pre-processing task.
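The page gives no before/after sample for this task, so the following is only a guess at the shape of the fix: an unquoted filename containing whitespace is wrapped in double quotes. The function name `quote_filename` and the sample header are assumptions.

```python
import re

# An unquoted filename value that contains at least one whitespace character.
UNQUOTED = re.compile(r'(filename=)([^";\r\n]*\s[^";\r\n]*)')

def quote_filename(header_line):
    """Wrap an unquoted, space-containing filename parameter in double quotes."""
    def wrap(match):
        return match.group(1) + '"' + match.group(2).strip() + '"'
    return UNQUOTED.sub(wrap, header_line)

print(quote_filename("Content-Disposition: attachment; filename=FBB BE.PDF"))
# prints Content-Disposition: attachment; filename="FBB BE.PDF"
```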
Missing From Address
Since 3.0.19
Function
Allows messages without a From: address to be dynamically updated with a configured address.
Scenario
In some cases, JEMH may encounter a mail which does not have the From address specified.
Configuration
The DefaultAddress to be used if the From address is not specified is configurable. Ensure that a valid email address is used, in the format x@y.com.
Illegal Disposition Attribute Task
Since 3.1.3
Function
Allows messages with a missing Content-Disposition attribute to be defaulted to Content-Disposition: attachment.
Scenario
In some cases the sending mail server fails to specify the content disposition of an attachment (whether it should be inline, attached and so on). JEMH then fails to parse the mail and throws an error. The disposition of each attachment must therefore be specified for the parsing phase to succeed.
Configuration
No configurable properties are available for this pre-processing task. If the Content-Disposition is invalid or not specified, the default value (attachment) will be injected.
Potential Spam Header Locator
Since 3.1.5
Function
Allows the user to specify a header and value used to determine whether a mail is spam (potentially generated by an automated service).
Scenario
Automated systems may send out numerous emails that are not easy to identify as spam and don't always contain a corresponding spam header. With this task, you can specify a header and a header value, and JEMH will check the incoming mail for the existence of that header and value. If found, a spam flag is injected, and the Email section of the Project Mapping can then be configured to process flagged mail accordingly.
Configuration
The following values are configurable:
Spam Header - the name of the header that the search is performed against
Spam Header Value - the value that the header (configured in Spam Header) contains
Note that a case-insensitive search is performed against the contents of both values.
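The check can be pictured as a case-insensitive match on both the header name and its contents. The sketch below is illustrative only: `is_potential_spam`, the list-of-pairs header representation, and the sample header X-Auto-Generated are assumptions, and substring matching on the value is a guess at what "contains" means here.

```python
def is_potential_spam(headers, spam_header, spam_value):
    """Return True if any header matches the configured name and contains the value."""
    wanted_name = spam_header.lower()
    wanted_value = spam_value.lower()
    for name, value in headers:
        if name.lower() == wanted_name and wanted_value in value.lower():
            return True
    return False

headers = [("Subject", "Build report"), ("X-Auto-Generated", "Nightly-Job")]
print(is_potential_spam(headers, "x-auto-generated", "nightly"))
# prints True
```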
Local Address Removal Task
Function
Filters From, To and Cc/Bcc addresses for local-only email addresses (missing @domain) and removes them.
Scenario
An automated system is sending out emails that contain only the first part of an email address (the name section, e.g. bob instead of bob@localhost.com), which causes issues when processing these emails. This task detects any address within From, To and Cc/Bcc that lacks the domain part (@localhost.com) and removes it.
Configuration
There are no Configuration options for this task.
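A minimal sketch of the filtering, using `parseaddr` from Python's standard library. It assumes the header has already been split into individual addresses; `remove_local_addresses` is a hypothetical name and this is not JEMH's implementation.

```python
from email.utils import parseaddr

def remove_local_addresses(addresses):
    """Keep only addresses whose address part includes an @domain."""
    kept = []
    for address in addresses:
        _, addr_spec = parseaddr(address)
        if "@" in addr_spec:
            kept.append(address)
    return kept

print(remove_local_addresses(["bob", "alice@example.com"]))
# prints ['alice@example.com']
```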
Single Quoted contenttype Removal Task
Function
Removes any single quotes found within the Content-Type value, e.g. Content-Type: text/plain; charset='utf-8'.
Scenario
Sometimes JEMH may encounter an email that contains single quotes around the charset value, which will cause the email not to be processed correctly. This task removes the single quotes so that JEMH can process the email, e.g. changing Content-Type: text/plain; charset='utf-8' to Content-Type: text/plain; charset=utf-8.
Configuration
There are no Configuration options for this task.
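In sketch form, the change amounts to stripping single quotes from the value of a Content-Type header line while leaving other headers alone. The function name below is hypothetical and the code is an illustration, not JEMH's own.

```python
def strip_single_quotes(header_line):
    """Remove single quotes from the value of a Content-Type header line."""
    name, sep, value = header_line.partition(":")
    if name.strip().lower() != "content-type":
        return header_line  # other headers are not touched
    return name + sep + value.replace("'", "")

print(strip_single_quotes("Content-Type: text/plain; charset='utf-8'"))
# prints Content-Type: text/plain; charset=utf-8
```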
Semi-colon address separator removal Task
Function
Removes Semi-Colons from the To/Cc headers when they are used to separate recipients.
Scenario
JEMH receives an email that uses a semi-colon to separate the recipients within the To/Cc headers. This task replaces each semi-colon with a comma so that the header uses the correct format, e.g. changing user@domain.com;user2@domain.com to user@domain.com, user2@domain.com.
Configuration
There are no Configuration options for this task.
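The separator change can be sketched as a split and re-join. This naive version assumes no semi-colon appears inside a quoted personal part; `replace_semicolons` is a hypothetical name used only for the example.

```python
def replace_semicolons(header_value):
    """Re-join semi-colon separated recipients with a comma and a space."""
    recipients = [part.strip() for part in header_value.split(";") if part.strip()]
    return ", ".join(recipients)

print(replace_semicolons("user@domain.com;user2@domain.com"))
# prints user@domain.com, user2@domain.com
```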
MySQL Subject Cleaner
Function
Removes 4-byte characters from encoded email subjects which can cause failures when saving to MySQL databases.
Scenario
Jira is configured to save its data in a MySQL database and you are receiving emails whose subject contains 4-byte characters, which can cause failures when saving to the database. This task matches and removes these 4-byte characters so that saving to MySQL succeeds.
Configuration
There are no Configuration options for this task.
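Characters that need four bytes in UTF-8 are those above U+FFFF (most emoji, for example). A minimal sketch of stripping them from an already decoded subject, with a hypothetical function name and no claim to match JEMH's matching rules:

```python
def remove_four_byte_chars(subject):
    """Drop every character whose code point lies above U+FFFF."""
    return "".join(ch for ch in subject if ord(ch) <= 0xFFFF)

cleaned = remove_four_byte_chars("Printer on fire \U0001F525 again")
print(cleaned)
```

Characters that take three bytes or fewer, such as most CJK text, are left in place.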
Single Quote to Double Quote email address personal part converter
Since 5.0.0
Function
Converts quoting of personal parts from single to double quote.
Scenario
Email personal parts containing a colon (:) are not valid when single quoted.
Configuration
There are no Configuration options for this task.
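As an illustration of the conversion, a single-quoted personal part in front of a bracketed address can be re-quoted with a regular expression. The sample address and the name `requote_personal` are assumptions, and personal names that themselves contain an apostrophe are not handled by this sketch.

```python
import re

def requote_personal(header_value):
    """Convert 'Personal' <addr> into "Personal" <addr>."""
    return re.sub(r"'([^'<>]*)'(\s*<[^<>]+>)", r'"\1"\2', header_value)

print(requote_personal("'Support: Tier 1' <help@example.com>"))
# prints "Support: Tier 1" <help@example.com>
```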
Debugging
Auditing allows exporting of both the unprocessed mail and the post-processed mail. When a change has affected an email, a new export icon is shown that contains the delta email, which is helpful in diagnosing what may have gone wrong with a task: