Network Working Group                                         J. Degener
Request for Comments: 5173                                   P. Guenther
Updates: 5229                                             Sendmail, Inc.
Category: Standards Track                                     April 2008


                 Sieve Email Filtering: Body Extension

Status of This Memo

   This document specifies an Internet standards track protocol for the
   Internet community, and requests discussion and suggestions for
   improvements. Please refer to the current edition of the "Internet
   Official Protocol Standards" (STD 1) for the standardization state
   and status of this protocol. Distribution of this memo is unlimited.

Abstract

   This document defines a new command for the "Sieve" email filtering
   language that tests for the occurrence of one or more strings in the
   body of an email message.

Degener & Guenther          Standards Track                     [Page 1]

RFC 5173         Sieve Email Filtering: Body Extension        April 2008

1. Introduction

   The "body" test checks for the occurrence of one or more strings in
   the body of an email message. Such a test was initially discussed
   for the [SIEVE] base document, but was subsequently removed because
   it was thought to be too costly to implement.

   Nevertheless, several server vendors have implemented some form of
   the "body" test.

   This document reintroduces the "body" test as an extension, and
   specifies its syntax and semantics.

2. Conventions Used in This Document

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in [KEYWORDS].

   Conventions for notations are as in [SIEVE] Section 1.1, including
   the use of the "Usage:" label for the definition of text and tagged
   arguments syntax.

   The rules for interpreting the grammar are defined in [SIEVE] and
   inherited by this specification. In particular, readers of this
   document are reminded that according to [SIEVE] Sections 2.6.2 and
   2.6.3, optional arguments such as COMPARATOR and MATCH-TYPE can
   appear in any order.

3. Capability Identifier

   The capability string associated with the extension defined in this
   document is "body".

4. Test body

   Usage: "body" [COMPARATOR] [MATCH-TYPE] [BODY-TRANSFORM]
          <key-list: string-list>

   The body test matches content in the body of an email message, that
   is, anything following the first empty line after the header. (The
   empty line itself, if present, is not considered to be part of the
   body.)

   The COMPARATOR and MATCH-TYPE keyword parameters are defined in
   [SIEVE]. As specified in Sections 2.7.1 and 2.7.3 of [SIEVE], the
   default COMPARATOR is "i;ascii-casemap" and the default MATCH-TYPE is
   ":is".

   The BODY-TRANSFORM is a keyword parameter that governs how a set of
   strings to be matched against are extracted from the body of the
   message. If a message consists of a header only, not followed by an
   empty line, then that set is empty and all "body" tests return false,
   including those that test for an empty string. (This is similar to
   how the "header" test always fails when the named header fields
   aren't present.) Otherwise, the transform must be followed as
   defined below in Section 5.
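
   As an illustration only, and not part of this specification, the
   header/body boundary rule above can be sketched in Python. The
   names split_body and body_contains are hypothetical. The sketch
   assumes CRLF line endings and plain octet-wise, case-sensitive
   substring matching on the undecoded body.

```python
from typing import Optional


def split_body(message: bytes) -> Optional[bytes]:
    """Return the body of a raw message, or None if it has no body.

    The body is everything after the first empty line following the
    header; the empty line itself is not part of the body.
    """
    # A message that starts with an empty line has an empty header.
    if message.startswith(b"\r\n"):
        return message[2:]
    _header, separator, body = message.partition(b"\r\n\r\n")
    if not separator:
        # Header only, no empty line: there is nothing to match against.
        return None
    return body


def body_contains(message: bytes, key: bytes) -> bool:
    """Simplified substring test against the undecoded body."""
    body = split_body(message)
    if body is None:
        # Every "body" test is false here, even for the empty key.
        return False
    return key in body
```

   Under these assumptions, a header-only message fails a test for
   the empty string, while a message whose header is followed by an
   empty line and an empty body passes it.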

   Note that the transformations defined here do *not* match against
   each line of the message independently, so the strings will usually
   contain CRLFs. How these can be matched is governed by the
   comparator and match-type. For example, with the default comparator
   of "i;ascii-casemap", they can be included literally in the key
   strings, or be matched with the "*" or "?" wildcards of the :matches
   match-type, or be skipped with :contains.
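
   The following Python sketch illustrates how the ":matches"
   wildcards can span CRLFs. It is an illustration under stated
   assumptions, not a normative algorithm: the function name
   matches_to_regex is hypothetical, and "i;ascii-casemap" is
   approximated by ASCII-only case-insensitive regular expression
   matching.

```python
import re


def matches_to_regex(pattern: str) -> "re.Pattern[str]":
    """Translate a Sieve :matches pattern into a compiled regex."""
    pieces = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "\\" and i + 1 < len(pattern):
            # A backslash makes the following character literal.
            pieces.append(re.escape(pattern[i + 1]))
            i += 2
            continue
        if ch == "*":
            pieces.append(".*")   # zero or more characters
        elif ch == "?":
            pieces.append(".")    # exactly one character
        else:
            pieces.append(re.escape(ch))
        i += 1
    # DOTALL lets "*" and "?" match CR and LF; IGNORECASE with ASCII
    # restricts case folding to the ASCII letters.
    flags = re.DOTALL | re.IGNORECASE | re.ASCII
    return re.compile("".join(pieces), flags)


body = "Dear customer,\r\nyour invoice is attached.\r\n"

# Two "?" wildcards consume the CR and the LF between the lines.
spans_crlf = matches_to_regex("*customer,??your*").fullmatch(body)

# A literal CRLF in the key also works with a substring search.
literal_crlf = "customer,\r\nyour" in body
```

   Because ":matches" must cover the whole string, the leading and
   trailing "*" in the first pattern are needed; ":contains" avoids
   that by searching for the key anywhere in the string.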

5. Body Transform

   Prior to matching content in a message body, "transformations" can be
   applied that filter and decode certain parts of the body. These
   transformations are selected by a "BODY-TRANSFORM" keyword parameter.

   Usage: ":raw"
        / ":content" <content-types: string-list>
        / ":text"

   The default transformation is :text.

5.1. Body Transform ":raw"

   The ":raw" transform matches against the entire undecoded body of a
   message as a single item.

   If the specified body-transform is ":raw", the [MIME] structure of
   the body is irrelevant. The implementation MUST NOT remove any
   transfer encoding from the message, MUST NOT refuse to filter
   messages with syntactic errors (unless the environment it is part of
   rejects them outright), and MUST treat multipart boundaries or the
   MIME headers of enclosed body parts as part of the content being
   matched against, instead of MIME structures to interpret.

   Example:

        require "body";

        # This will match a message containing the literal text
        # "MAKE MONEY FAST" in body parts (ignoring any
        # content-transfer-encodings) or MIME headers other than
        # the outermost RFC 2822 header.

        if body :raw :contains "MAKE MONEY FAST" {
                discard;
        }
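
   The consequence of not removing transfer encodings can be seen in
   a short Python sketch. It is illustrative only: the function name
   raw_contains and the sample body are hypothetical, and
   "i;ascii-casemap" is approximated by lowercasing ASCII letters.

```python
import base64


def raw_contains(body: bytes, key: bytes) -> bool:
    """Approximate ':raw :contains' on the undecoded body octets."""
    # bytes.lower() only folds ASCII letters, like i;ascii-casemap.
    return key.lower() in body.lower()


encoded = base64.b64encode(b"MAKE MONEY FAST")

# A body whose only part carries the phrase in base64 form.
body = (
    b"--b\r\n"
    b"Content-Type: text/plain\r\n"
    b"Content-Transfer-Encoding: base64\r\n"
    b"\r\n" + encoded + b"\r\n"
    b"--b--\r\n"
)

# The phrase is not found, because :raw never decodes the part.
phrase_found = raw_contains(body, b"make money fast")

# MIME headers of enclosed parts are ordinary content for :raw.
header_found = raw_contains(body, b"content-transfer-encoding")
```

   A script that needs to see through the encoding would use
   ":content" or ":text" instead.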

5.2. Body Transform ":content"

   If the body transform is ":content", the MIME parts that have the
   specified content types are matched against independently.

   If an individual content type begins or ends with a '/' (slash) or
   contains multiple slashes, then it matches no content types.
   Otherwise, if it contains a slash, then it specifies a full
   <type>/<subtype> pair, and matches only that specific content type.
   If it is the empty string, all MIME content types are matched.
   Otherwise, it specifies a <type> only, and any subtype of that type
   matches.
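
   The content type rules above can be written down directly. The
   Python sketch below is illustrative; the function name
   content_type_matches is hypothetical, and the case-insensitive
   comparison is an assumption based on general MIME practice rather
   than on text in this section.

```python
def content_type_matches(spec: str, actual: str) -> bool:
    """Decide whether one :content specification selects a MIME type.

    spec is one string from the content-types list; actual is the
    part's type in "type/subtype" form.
    """
    spec = spec.lower()
    actual = actual.lower()
    if spec == "":
        # The empty string selects every content type.
        return True
    if spec.startswith("/") or spec.endswith("/") or spec.count("/") > 1:
        # Malformed specifications select nothing.
        return False
    if "/" in spec:
        # A full type/subtype pair selects exactly that type.
        return spec == actual
    # A bare type selects any subtype of that type.
    return actual.split("/", 1)[0] == spec
```

   For instance, "text" selects both text/plain and text/html, while
   "text/" selects nothing.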

   The search for MIME parts matching the :content specification is
   recursive and automatically descends into multipart and
   message/rfc822 MIME parts. All MIME parts with matching types are
   searched for the key strings. The test returns true if any
   combination of a searched MIME part and key-list argument match.

   If the :content specification matches a multipart MIME part, only the
   prologue and epilogue sections of the part will be searched for the
   key strings, treating the entire prologue and the entire epilogue as
   separate strings; the contents of nested parts are only searched if
   their respective types match the :content specification.

   If the :content specification matches a message/rfc822 MIME part,
   only the header of the nested message will be searched for the key
   strings, treating the header as a single string; the contents of the
   nested message body parts are only searched if their content type
   matches the :content specification.

   For other MIME types, the entire part will be searched as a single
   string.

   (Matches against container types with an empty match string can be
   useful as tests for the existence of such parts.)

   Example:

     From: Whomever
     To: Someone
     Date: Whenever
     Subject: whatever
     Content-Type: multipart/mixed; boundary=outer

   & This is a multi-part message in MIME format.
   &
     --outer
     Content-Type: multipart/alternative; boundary=inner

   & This is a nested multi-part message in MIME format.
   &
     --inner
     Content-Type: text/plain; charset="us-ascii"

   $ Hello
   $
     --inner
     Content-Type: text/html; charset="us-ascii"

   % <html><body>Hello</body></html>
   %
     --inner--
   &
   & This is the end of the inner MIME multipart.
   &
     --outer
     Content-Type: message/rfc822

   ! From: Someone Else
   ! Subject: hello request

   $ Please say Hello
   $
     --outer--
   &
   & This is the end of the outer MIME multipart.

   In the above example, the '&', '$', '%', and '!' characters at the
   start of a line are used to illustrate what portions of the example
   message are used in tests:

   - the lines starting with '&' are the ones that are tested when a
     'body :content "multipart" :contains "MIME"' test is executed.

   - the lines starting with '$' are the ones that are tested when a
     'body :content "text/plain" :contains "Hello"' test is executed.

   - the lines starting with '%' are the ones that are tested when a
     'body :content "text/html" :contains "Hello"' test is executed.

   - the lines starting with '$' or '%' are the ones that are tested
     when a 'body :content "text" :contains "Hello"' test is executed.

   - the lines starting with '!' are the ones that are tested when a
     'body :content "message/rfc822" :contains "Hello"' test is
     executed.
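
   The recursive selection described in this section can be sketched
   with the MIME parser from the Python standard library. This is an
   illustration under several assumptions, not a reference
   implementation: the names type_matches, searchable_strings, and
   EXAMPLE are hypothetical, the sample message is a shortened
   message modeled on the one above, and the header of a nested
   message is rebuilt from the parsed fields, so it only approximates
   the original octets.

```python
from email import message_from_bytes
from email.message import Message
from typing import List


def type_matches(spec: str, actual: str) -> bool:
    """Apply the :content type rules (compared case-insensitively)."""
    spec, actual = spec.lower(), actual.lower()
    if spec == "":
        return True
    if spec.startswith("/") or spec.endswith("/") or spec.count("/") > 1:
        return False
    if "/" in spec:
        return spec == actual
    return actual.split("/", 1)[0] == spec


def searchable_strings(part: Message, specs: List[str]) -> List[str]:
    """Collect, in order, the strings a ':content specs' test searches."""
    found: List[str] = []
    ctype = part.get_content_type()
    hit = any(type_matches(spec, ctype) for spec in specs)
    if part.get_content_maintype() == "multipart" and part.is_multipart():
        if hit:
            # Prologue and epilogue are two separate strings.
            found.append(part.preamble or "")
            found.append(part.epilogue or "")
        for child in part.get_payload():
            found.extend(searchable_strings(child, specs))
    elif ctype == "message/rfc822" and part.is_multipart():
        nested = part.get_payload(0)
        if hit:
            # Only the nested header is searched, as a single string.
            found.append("".join(f"{k}: {v}\r\n" for k, v in nested.items()))
        found.extend(searchable_strings(nested, specs))
    elif hit:
        data = part.get_payload(decode=True) or b""
        charset = part.get_content_charset() or "us-ascii"
        try:
            found.append(data.decode(charset, errors="replace"))
        except LookupError:
            # Unknown charset: this sketch falls back to US-ASCII.
            found.append(data.decode("ascii", errors="replace"))
    return found


EXAMPLE = (
    b"From: a@example.com\r\n"
    b"Content-Type: multipart/mixed; boundary=outer\r\n"
    b"\r\n"
    b"This is a multi-part message in MIME format.\r\n"
    b"--outer\r\n"
    b"Content-Type: text/plain; charset=us-ascii\r\n"
    b"\r\n"
    b"Hello\r\n"
    b"--outer\r\n"
    b"Content-Type: message/rfc822\r\n"
    b"\r\n"
    b"Subject: hello request\r\n"
    b"\r\n"
    b"Please say Hello\r\n"
    b"--outer--\r\n"
    b"This is the end of the outer MIME multipart.\r\n"
)
```

   Each returned string is matched on its own, so a key can never
   match across two parts. Calling searchable_strings on the parsed
   EXAMPLE with ["multipart"] yields the prologue and the epilogue,
   and with ["text"] it yields the text/plain part and the body of
   the nested message.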

   Comparisons are performed on octets. Implementations decode the
   content-transfer-encoding and convert text to [UTF-8] as input to the
   comparator. MIME parts that cannot be decoded and converted MAY be
   treated as plain US-ASCII, omitted, or processed according to local
   conventions. A NUL octet (character zero) SHOULD NOT cause early
   termination of the content being compared against. Implementations
   MUST support the "quoted-printable", "base64", "7bit", "8bit", and
   "binary" content transfer encodings. Implementations MUST be capable
   of converting to UTF-8 the US-ASCII, ISO-8859-1, and the US-ASCII
   subset of ISO-8859-* character sets.
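
   A minimal Python sketch of this decoding step follows. It is an
   illustration only: the function name decode_part is hypothetical,
   and the fallback for undecodable text (treating it as US-ASCII and
   replacing invalid octets) is just one of the behaviors the
   paragraph above permits.

```python
import base64
import quopri


def decode_part(data: bytes, cte: str = "7bit",
                charset: str = "us-ascii") -> bytes:
    """Return a part's content as UTF-8 octets for the comparator."""
    cte = cte.lower()
    if cte == "base64":
        data = base64.b64decode(data)
    elif cte == "quoted-printable":
        data = quopri.decodestring(data)
    elif cte not in ("7bit", "8bit", "binary"):
        raise ValueError("unsupported content-transfer-encoding: " + cte)
    try:
        text = data.decode(charset)
    except (LookupError, UnicodeDecodeError):
        # Local convention chosen for this sketch: treat the part as
        # US-ASCII and replace octets that are not valid ASCII.
        text = data.decode("ascii", errors="replace")
    # Python strings carry NUL like any other character, so a NUL
    # octet does not cut the content short.
    return text.encode("utf-8")
```

   For example, decode_part(b"caf=E9", "quoted-printable",
   "iso-8859-1") produces the UTF-8 octets for "café", which a key
   string written in UTF-8 can then match.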

   Each matched part is matched against independently: search
   expressions MUST NOT match across MIME part boundaries. MIME headers
   of the containing part MUST NOT be included in the data.

   Example:

        require ["body", "fileinto"];

        # Save any message with any text MIME part that contains the
        # words "missile" or "coordinates" in the "secrets" folder.

        if body :content "text" :contains ["missile", "coordinates"] {
                fileinto "secrets";
        }

        # Save any message with an audio/mp3 MIME part in
        # the "jukebox" folder.

        if body :content "audio/mp3" :contains "" {
                fileinto "jukebox";
        }

5.3. Body Transform ":text"

   The ":text" body transform matches against the results of an
   implementation's best effort at extracting UTF-8 encoded text from a
   message.

   It is unspecified whether this transformation results in a single
   string or multiple strings being matched against. All the text
   extracted from a given non-container MIME part MUST be in the same
   string.

   In simple implementations, :text MAY be treated the same as :content
   "text".

   Sophisticated implementations MAY strip mark-up from the text prior
   to matching, and MAY convert media types other than text to text
   prior to matching.

   (For example, they may be able to convert proprietary text editor
   formats to text or apply optical character recognition algorithms to
   image data.)

   Example:

        require ["body", "fileinto"];

        # Save messages mentioning the project schedule in the
        # project/schedule folder.
        if body :text :contains "project schedule" {
                fileinto "project/schedule";
        }
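
   The optional mark-up stripping mentioned in this section could
   look like the following Python sketch, which uses the HTML parser
   from the standard library. It is an assumption-laden illustration:
   the names _TextCollector and strip_markup are hypothetical, and a
   real implementation would probably also drop script and style
   content, which this sketch keeps.

```python
from html.parser import HTMLParser
from typing import List


class _TextCollector(HTMLParser):
    """Collect character data and ignore the tags around it."""

    def __init__(self) -> None:
        # convert_charrefs=True turns references such as &amp; into
        # the characters they stand for.
        super().__init__(convert_charrefs=True)
        self.chunks: List[str] = []

    def handle_data(self, data: str) -> None:
        self.chunks.append(data)


def strip_markup(html: str) -> str:
    """Return the text of an HTML part with all tags removed."""
    collector = _TextCollector()
    collector.feed(html)
    collector.close()  # flush any buffered text
    return "".join(collector.chunks)
```

   With this step applied first, a key such as "project schedule"
   still matches when the words are separated by tags in the HTML
   source, for example "project <b>schedule</b>".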

6. Interaction with Other Sieve Extensions

   Any extension that extends the grammar for the COMPARATOR or MATCH-
   TYPE nonterminals will also affect the implementation of "body".

   Wildcard expressions used with "body" are exempt from the side
   effects described in [VARIABLES]. That is, they MUST NOT set match
   variables (${1}, ${2}...) to the input values corresponding to
   wildcard sequences in the matched pattern. However, if the extension
   is present, variable references in the key strings or content type
   strings are evaluated as described in this document.

7. IANA Considerations

   The following template specifies the IANA registration of the Sieve
   extension specified in this document:

   To: iana@iana.org
   Subject: Registration of new Sieve extension

   Capability name: body
   Description:     Provides a test for matching against the
                    body of the message being processed
   RFC number:      RFC 5173
   Contact Address: The Sieve discussion list
                    <ietf-mta-filters@imc.org>

8. Security Considerations

   The system MUST be sized and restricted in such a manner that even
   malicious use of body matching does not deny service to other users
   of the host system.

   Filters relying on string matches in the raw body of an email message
   may be more general than intended. Text matches are no replacement
   for a spam, virus, or other security related filtering system.

9. Acknowledgments

   This document has been revised in part based on comments and
   discussions that took place on and off the SIEVE mailing list.
   Thanks to Cyrus Daboo, Ned Freed, Bob Johannessen, Simon Josefsson,
   Mark E. Mallett, Chris Markle, Alexey Melnikov, Ken Murchison, Greg
   Shapiro, Tim Showalter, Nigel Swinson, Dowson Tong, and Christian
   Vogt for reviews and suggestions.

10. References

10.1. Normative References

   [KEYWORDS]  Bradner, S., "Key words for use in RFCs to Indicate
               Requirement Levels", BCP 14, RFC 2119, March 1997.

   [MIME]      Freed, N. and N. Borenstein, "Multipurpose Internet Mail
               Extensions (MIME) Part One: Format of Internet Message
               Bodies", RFC 2045, November 1996.

   [SIEVE]     Guenther, P., Ed., and T. Showalter, Ed., "Sieve: An
               Email Filtering Language", RFC 5228, January 2008.

   [UTF-8]     Yergeau, F., "UTF-8, a transformation format of ISO
               10646", STD 63, RFC 3629, November 2003.

10.2. Informative References

   [VARIABLES] Homme, K., "Sieve Email Filtering: Variables Extension",
               RFC 5229, January 2008.

Authors' Addresses

   Jutta Degener
   5245 College Ave, Suite #127

   EMail: jutta@pobox.com


   Philip Guenther
   6425 Christie Ave, 4th Floor

   EMail: guenther@sendmail.com

Full Copyright Statement

   Copyright (C) The IETF Trust (2008).

   This document is subject to the rights, licenses and restrictions
   contained in BCP 78, and except as set forth therein, the authors
   retain all their rights.

   This document and the information contained herein are provided on an
   "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
   OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY, THE IETF TRUST AND
   THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS
   OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF
   THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
   WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

Intellectual Property

   The IETF takes no position regarding the validity or scope of any
   Intellectual Property Rights or other rights that might be claimed to
   pertain to the implementation or use of the technology described in
   this document or the extent to which any license under such rights
   might or might not be available; nor does it represent that it has
   made any independent effort to identify any such rights. Information
   on the procedures with respect to rights in RFC documents can be
   found in BCP 78 and BCP 79.

   Copies of IPR disclosures made to the IETF Secretariat and any
   assurances of licenses to be made available, or the result of an
   attempt made to obtain a general license or permission for the use of
   such proprietary rights by implementers or users of this
   specification can be obtained from the IETF on-line IPR repository at
   http://www.ietf.org/ipr.

   The IETF invites any interested party to bring to its attention any
   copyrights, patents or patent applications, or other proprietary
   rights that may cover technology that may be required to implement
   this standard. Please address the information to the IETF at
   ietf-ipr@ietf.org.