Network Working Group                                     A. Gulbrandsen
Request for Comments: 4978                        Oryx Mail Systems GmbH
Category: Standards Track                                    August 2007

                      The IMAP COMPRESS Extension

Status of This Memo

   This document specifies an Internet standards track protocol for the
   Internet community, and requests discussion and suggestions for
   improvements. Please refer to the current edition of the "Internet
   Official Protocol Standards" (STD 1) for the standardization state
   and status of this protocol. Distribution of this memo is unlimited.

Abstract

   The COMPRESS extension allows an IMAP connection to be effectively
   and efficiently compressed.

Table of Contents

   1. Introduction and Overview
   2. Conventions Used in This Document
   3. The COMPRESS Command
   4. Compression Efficiency
   5. Formal Syntax
   6. Security Considerations
   7. IANA Considerations
   8. Acknowledgements
   9. References
      9.1. Normative References
      9.2. Informative References

Gulbrandsen                 Standards Track                     [Page 1]

RFC 4978              The IMAP COMPRESS Extension            August 2007

1. Introduction and Overview

   A server which supports the COMPRESS extension indicates this with
   one or more capability names consisting of "COMPRESS=" followed by a
   supported compression algorithm name as described in this document.

   The goal of COMPRESS is to reduce the bandwidth usage of IMAP.

   Compared to PPP compression (see [RFC1962]) and modem-based
   compression (see [MNP] and [V42BIS]), COMPRESS offers much better
   compression efficiency. COMPRESS can be used together with Transport
   Layer Security (TLS) [RFC4346], Simple Authentication and Security
   Layer (SASL) encryption, Virtual Private Networks (VPNs), etc.
   Compared to TLS compression [RFC3749], COMPRESS has the following
   advantages and disadvantages:

   - COMPRESS can be implemented easily both by IMAP servers and
     clients.

   - IMAP COMPRESS benefits from an intimate knowledge of the IMAP
     protocol's state machine, allowing for dynamic and aggressive
     optimization of the underlying compression algorithm's parameters.

   - When the TLS layer implements compression, any protocol using that
     layer can transparently benefit from that compression (e.g., SMTP
     and IMAP). COMPRESS is specific to IMAP.

   In order to increase interoperation, it is desirable to have as few
   different compression algorithms as possible, so this document
   specifies only one. The DEFLATE algorithm (defined in [RFC1951]) is
   standard, widely available and fairly efficient, so it is the only
   algorithm defined by this document.

   In order to increase interoperation, IMAP servers that advertise this
   extension SHOULD also advertise the TLS DEFLATE compression mechanism
   as defined in [RFC3749]. IMAP clients MAY use either COMPRESS or TLS
   compression; however, if the client and server support both, it is
   RECOMMENDED that the client choose TLS compression.

   The extension adds one new command (COMPRESS) and no new responses.

2. Conventions Used in This Document

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in [RFC2119].

   Formal syntax is defined by [RFC4234] as modified by [RFC3501].

   In the examples, "C:" and "S:" indicate lines sent by the client and
   server respectively. "[...]" denotes elision.

3. The COMPRESS Command

   Arguments: Name of compression mechanism: "DEFLATE".

   Result: OK  The server will compress its responses and expects the
               client to compress its commands.
           NO  Compression is already active via another layer.
           BAD Command unknown, invalid or unknown argument, or COMPRESS
               already active.

   The COMPRESS command instructs the server to use the named
   compression mechanism ("DEFLATE" is the only one defined) for all
   commands and/or responses after COMPRESS.

   The client MUST NOT send any further commands until it has seen the
   result of COMPRESS. If the response was OK, the client MUST compress
   starting with the first command after COMPRESS. If the server
   response was BAD or NO, the client MUST NOT turn on compression.

   If the server responds NO because it knows that the same mechanism is
   active already (e.g., because TLS has negotiated the same mechanism),
   it MUST send COMPRESSIONACTIVE as resp-text-code (see [RFC3501],
   Section 7.1), and the resp-text SHOULD say which layer compresses.

   If the server issues an OK response, the server MUST compress
   starting immediately after the CRLF which ends the tagged OK
   response. (Responses issued by the server before the OK response
   will, of course, still be uncompressed.) If the server issues a BAD
   or NO response, the server MUST NOT turn on compression.
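
   The client-side rules above can be sketched as follows. This is a
   minimal illustration rather than a reference implementation: the
   helper names compression_starts() and already_compressed() are
   hypothetical, and the regular expression only covers the tagged
   status line, not the full response grammar of [RFC3501].

```python
import re

# Hypothetical pattern: tag, status word, optional bracketed response code.
_TAGGED = re.compile(rb"^(\S+) (OK|NO|BAD)\b(?: \[([^\]]+)\])?", re.IGNORECASE)


def compression_starts(tag: bytes, reply: bytes) -> bool:
    """Return True only if the tagged reply to COMPRESS is OK.

    On NO or BAD the client must leave compression off.
    """
    match = _TAGGED.match(reply)
    if match is None or match.group(1) != tag:
        raise ValueError("not the tagged reply to COMPRESS")
    return match.group(2).upper() == b"OK"


def already_compressed(reply: bytes) -> bool:
    """Return True if the server refused with NO [COMPRESSIONACTIVE]."""
    match = _TAGGED.match(reply)
    if match is None or match.group(2).upper() != b"NO":
        return False
    code = match.group(3)
    return code is not None and code.upper() == b"COMPRESSIONACTIVE"
```

   A client would call compression_starts() on the tagged reply and
   switch its compressor and decompressor on only when it returns True,
   starting with the very next command it sends.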

   For DEFLATE (as for many other compression mechanisms), the
   compressor can trade speed against quality. When decompressing there
   isn't much of a tradeoff. Consequently, the client and server are
   both free to pick the best reasonable rate of compression for the
   data they send.

   When COMPRESS is combined with TLS (see [RFC4346]) or SASL (see
   [RFC4422]) security layers, the sending order of the three extensions
   MUST be first COMPRESS, then SASL, and finally TLS. That is, before
   data is transmitted it is first compressed. Second, if a SASL
   security layer has been negotiated, the compressed data is then
   signed and/or encrypted accordingly. Third, if a TLS security layer
   has been negotiated, the data from the previous step is signed and/or
   encrypted accordingly. When receiving data, the processing order
   MUST be reversed. This ensures that before sending, data is
   compressed before it is encrypted, independent of the order in which
   the client issues COMPRESS, AUTHENTICATE, and STARTTLS.
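
   The layering order can be illustrated with a small sketch. It is not
   a security implementation: xor_layer() is a toy, hypothetical
   stand-in for the SASL and TLS security layers, used only so that the
   order of the three steps is visible. The compression step uses
   Python's zlib module with a raw DEFLATE stream.

```python
import zlib


def xor_layer(data: bytes, key: int) -> bytes:
    """Toy stand-in for a security layer. This is NOT encryption."""
    return bytes(byte ^ key for byte in data)


def send_path(plain: bytes, compressor) -> bytes:
    # 1. COMPRESS: compress first, flushing so the peer can decode now.
    compressed = compressor.compress(plain) + compressor.flush(zlib.Z_SYNC_FLUSH)
    # 2. SASL security layer (stand-in) wraps the compressed data.
    after_sasl = xor_layer(compressed, 0x5A)
    # 3. TLS (stand-in) is applied last, closest to the wire.
    return xor_layer(after_sasl, 0xA5)


def receive_path(wire: bytes, decompressor) -> bytes:
    # Receiving reverses the order: TLS, then SASL, then decompression.
    after_tls = xor_layer(wire, 0xA5)
    after_sasl = xor_layer(after_tls, 0x5A)
    return decompressor.decompress(after_sasl)
```

   Whatever order the client issued COMPRESS, AUTHENTICATE, and STARTTLS
   in, the data path stays the same: compression always sees plaintext,
   which is the only point at which it can be effective.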

   The following example illustrates how commands and responses are
   compressed during a simple login sequence:

        S: * OK [CAPABILITY IMAP4REV1 STARTTLS COMPRESS=DEFLATE]

   From this point on, everything is encrypted.

        S: b OK Logged in as arnt
        C: c compress deflate
        S: c OK DEFLATE active

   From this point on, everything is compressed before being
   encrypted.

   The following example demonstrates how a server may refuse to
   compress twice:

        S: * OK [CAPABILITY IMAP4REV1 STARTTLS COMPRESS=DEFLATE]
        C: c compress deflate
        S: c NO [COMPRESSIONACTIVE] DEFLATE active via TLS

4. Compression Efficiency

   This section is informative, not normative.

   IMAP poses some unusual problems for a compression layer.

   Upstream is fairly simple. Most IMAP clients send the same few
   commands again and again, so any compression algorithm that can
   exploit repetition works efficiently. The APPEND command is an
   exception; clients that send many APPEND commands may want to
   surround large literals with flushes in the same way as is
   recommended for servers later in this section.

   Downstream has the unusual property that several kinds of data are
   sent, confusing all dictionary-based compression algorithms.

   One type is IMAP responses. These are highly compressible; zlib
   using its least CPU-intensive setting compresses typical responses to
   25-40% of their original size.

   Another type is email headers. These are equally compressible, and
   benefit from using the same dictionary as the IMAP responses.

   A third type is email body text. Text is usually fairly short and
   includes much ASCII, so the same compression dictionary will do a
   good job here, too. When multiple messages in the same thread are
   read at the same time, quoted lines etc. can often be compressed

   Finally, attachments (non-text email bodies) are transmitted, either
   in binary form or encoded with base-64.

   When attachments are retrieved in binary form, DEFLATE may be able to
   compress them, but the format of the attachment is usually not
   IMAP-like, so the dictionary built while compressing IMAP does not
   help. The compressor has to adapt its dictionary from IMAP to the
   attachment's format, and then back. A few file formats aren't
   compressible at all using deflate, e.g., .gz, .zip, and .jpg files.

   When attachments are retrieved in base-64 form, the same problems
   apply, but the base-64 encoding adds another problem. 8-bit
   compression algorithms such as deflate work well on 8-bit file
   formats; however, base-64 turns a file into something resembling
   6-bit bytes, hiding most of the 8-bit file format from the
   compressor.
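
   The effect of base-64 on a byte-oriented compressor can be observed
   with a short experiment. The sketch below is only an illustration:
   the word list and the seeded pseudo-random sample are invented
   stand-ins for an attachment, and the sizes it produces say nothing
   about any particular mail corpus.

```python
import base64
import random
import zlib


def deflate_size(data: bytes) -> int:
    """Size of data as a raw DEFLATE stream at zlib's fastest level."""
    compressor = zlib.compressobj(1, zlib.DEFLATED, -15)
    return len(compressor.compress(data) + compressor.flush())


# Invented sample: repetitive 8-bit content standing in for an attachment.
rng = random.Random(4978)
words = [b"invoice", b"quarter", b"report", b"total",
         b"customer", b"amount", b"region", b"status"]
sample = b" ".join(rng.choice(words) for _ in range(2000))

binary_size = deflate_size(sample)
base64_size = deflate_size(base64.encodebytes(sample))

print("compressed, sent as binary: ", binary_size)
print("compressed, sent as base-64:", base64_size)
```

   The same word lands on three different alignments inside the base-64
   text, so the compressor sees three different encodings of it and
   finds fewer and shorter matches than it does in the 8-bit original.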

   When using the zlib library (see [RFC1951]), the functions
   deflateInit2(), deflate(), inflateInit2(), and inflate() suffice to
   implement this extension. The windowBits value must be in the range
   -8 to -15, or else deflateInit2() uses the wrong format.
   deflateParams() can be used to improve compression rate and resource
   use. The Z_FULL_FLUSH argument to deflate() can be used to clear the
   dictionary (the receiving peer does not need to do anything).
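
   A rough equivalent using Python's zlib module, which wraps the same
   library, might look like the sketch below. The class name
   CompressedLink and its methods are hypothetical. Passing a negative
   wbits value selects the raw DEFLATE format, and each write is
   flushed so that the peer can decode the command or response
   immediately.

```python
import zlib

WBITS_RAW = -15  # negative window bits: raw DEFLATE, no zlib header


class CompressedLink:
    """Hypothetical holder for one endpoint's two DEFLATE streams."""

    def __init__(self, level: int = 6):
        self._out = zlib.compressobj(level, zlib.DEFLATED, WBITS_RAW)
        self._in = zlib.decompressobj(WBITS_RAW)

    def encode(self, data: bytes, reset_dictionary: bool = False) -> bytes:
        # Z_SYNC_FLUSH makes everything written so far decodable by the
        # peer; Z_FULL_FLUSH additionally clears the dictionary.
        mode = zlib.Z_FULL_FLUSH if reset_dictionary else zlib.Z_SYNC_FLUSH
        return self._out.compress(data) + self._out.flush(mode)

    def decode(self, wire: bytes) -> bytes:
        # The receiver needs no special handling for either flush mode.
        return self._in.decompress(wire)
```

   One stream per direction is kept for the lifetime of the connection,
   so the dictionary built from earlier commands and responses keeps
   helping later ones.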

   A client can improve downstream compression by implementing BINARY
   (defined in [RFC3516]) and using FETCH BINARY instead of FETCH BODY.
   In the author's experience, the improvement ranges from 5% to 40%
   depending on the attachment being downloaded.

   A server can improve downstream compression if it hints to the
   compressor that the data type is about to change strongly, e.g., by
   sending a Z_FULL_FLUSH at the start and end of large non-text
   literals (before and after '*CHAR8' in the definition of literal in
   RFC 3501, page 86). Small literals are best left alone. A possible

   A server can improve the CPU efficiency both of the server and the
   client if it adjusts the compression level (e.g., using the
   deflateParams() function in zlib) at these points, to avoid trying to
   compress incompressible attachments. A very simple strategy is to
   change the level to 0 at the start of a literal provided the first
   two bytes are either 0x1F 0x8B (as in gzip-compressed files) or
   0xFF 0xD8 (JPEG), and to keep it at 1-5 the rest of the time. More
   complex strategies are possible.
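
   The two server-side hints can be sketched as below. Several things
   here are assumptions made for illustration: the helper names, the
   5 KiB threshold for what counts as a large literal, and the choice
   of level 5 as the normal level. The sketch only computes the level;
   applying it in the middle of a stream needs deflateParams(), which
   the Python zlib binding used here is assumed not to expose.

```python
import zlib

# First two bytes of gzip and JPEG data, as named in the text above.
INCOMPRESSIBLE_MAGIC = (b"\x1f\x8b", b"\xff\xd8")
LARGE_LITERAL = 5 * 1024  # hypothetical threshold, not a fixed rule


def level_for_literal(literal: bytes, normal_level: int = 5) -> int:
    """Pick level 0 for literals that look already compressed."""
    return 0 if literal[:2] in INCOMPRESSIBLE_MAGIC else normal_level


def encode_literal(compressor, literal: bytes) -> bytes:
    """Compress one literal, isolating large ones with full flushes."""
    if len(literal) < LARGE_LITERAL:
        # Small literals are left alone and share the IMAP dictionary.
        return compressor.compress(literal) + compressor.flush(zlib.Z_SYNC_FLUSH)
    out = compressor.flush(zlib.Z_FULL_FLUSH)   # forget the IMAP dictionary
    out += compressor.compress(literal)
    out += compressor.flush(zlib.Z_FULL_FLUSH)  # forget the attachment
    return out
```

   In a C implementation the value returned by level_for_literal()
   would be passed to deflateParams() just before the literal and the
   normal level restored just after it.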

5. Formal Syntax

   The following syntax specification uses the Augmented Backus-Naur
   Form (ABNF) notation as specified in [RFC4234]. This syntax augments
   the grammar specified in [RFC3501]. [RFC4234] defines SP and
   [RFC3501] defines command-auth, capability, and resp-text-code.

   Except as noted otherwise, all alphabetic characters are
   case-insensitive. The use of upper or lower case characters to define
   token strings is for editorial clarity only. Implementations MUST
   accept these strings in a case-insensitive fashion.

   command-auth =/ compress

   compress    = "COMPRESS" SP algorithm

   capability  =/ "COMPRESS=" algorithm
                 ;; multiple COMPRESS capabilities allowed

   algorithm   = "DEFLATE"

   resp-text-code =/ "COMPRESSIONACTIVE"
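
   As an illustration of the grammar, the sketch below extracts the
   advertised algorithms from a capability list and builds the matching
   command. The function names are hypothetical, and the capability
   list is assumed to be already split out of the server greeting or
   CAPABILITY response as a space-separated string.

```python
def compress_algorithms(capabilities: str) -> list[str]:
    """Return the algorithms named by COMPRESS= capabilities.

    Matching is case-insensitive, and several COMPRESS= capabilities
    may appear in one list.
    """
    prefix = "COMPRESS="
    found = []
    for token in capabilities.split():
        if token.upper().startswith(prefix) and len(token) > len(prefix):
            found.append(token[len(prefix):].upper())
    return found


def build_compress_command(tag: str, algorithm: str) -> bytes:
    """Build 'tag COMPRESS algorithm' for the one defined algorithm."""
    if algorithm.upper() != "DEFLATE":
        raise ValueError("only DEFLATE is defined")
    return f"{tag} COMPRESS {algorithm.upper()}\r\n".encode("ascii")
```

   A client would typically send the command only when "DEFLATE" is
   among the values returned by compress_algorithms().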

   Note that due to the syntax of capability names, future algorithm
   names must be atoms.

6. Security Considerations

   As for TLS compression [RFC3749].

7. IANA Considerations

   The IANA has added COMPRESS=DEFLATE to the list of IMAP capabilities.

8. Acknowledgements

   Eric Burger, Dave Cridland, Tony Finch, Ned Freed, Philip Guenther,
   Randall Gellens, Tony Hansen, Cullen Jennings, Stephane Maes, Alexey
   Melnikov, Lyndon Nerenberg, and Zoltan Ordogh have all helped with
   this document.

   The author would also like to thank various people in the rooms at
   meetings, whose help is real, but not reflected in the author's

9. References

9.1. Normative References

   [RFC1951] Deutsch, P., "DEFLATE Compressed Data Format Specification
             version 1.3", RFC 1951, May 1996.

   [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
             Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC3501] Crispin, M., "INTERNET MESSAGE ACCESS PROTOCOL - VERSION
             4rev1", RFC 3501, March 2003.

   [RFC4234] Crocker, D. and P. Overell, "Augmented BNF for Syntax
             Specifications: ABNF", RFC 4234, October 2005.

9.2. Informative References

   [RFC1962] Rand, D., "The PPP Compression Control Protocol (CCP)",

   [RFC3516] Nerenberg, L., "IMAP4 Binary Content Extension", RFC 3516,

   [RFC3749] Hollenbeck, S., "Transport Layer Security Protocol
             Compression Methods", RFC 3749, May 2004.

   [RFC4346] Dierks, T. and E. Rescorla, "The Transport Layer Security
             (TLS) Protocol Version 1.1", RFC 4346, April 2006.

   [RFC4422] Melnikov, A. and K. Zeilenga, "Simple Authentication and
             Security Layer (SASL)", RFC 4422, June 2006.

   [V42BIS]  ITU, "V.42bis: Data compression procedures for data
             circuit-terminating equipment (DCE) using error correction
             procedures", http://www.itu.int/rec/T-REC-V.42bis, January

   [MNP]     Gilbert Held, "The Complete Modem Reference", Second
             Edition, Wiley Professional Computing, ISBN 0-471-00852-4,

Author's Address

   Oryx Mail Systems GmbH
   Fax: +49 89 4502 9758

Full Copyright Statement

   Copyright (C) The IETF Trust (2007).

   This document is subject to the rights, licenses and restrictions
   contained in BCP 78, and except as set forth therein, the authors
   retain all their rights.

   This document and the information contained herein are provided on an
   "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
   OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY, THE IETF TRUST AND
   THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS
   OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF
   THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
   WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

Intellectual Property

   The IETF takes no position regarding the validity or scope of any
   Intellectual Property Rights or other rights that might be claimed to
   pertain to the implementation or use of the technology described in
   this document or the extent to which any license under such rights
   might or might not be available; nor does it represent that it has
   made any independent effort to identify any such rights. Information
   on the procedures with respect to rights in RFC documents can be
   found in BCP 78 and BCP 79.

   Copies of IPR disclosures made to the IETF Secretariat and any
   assurances of licenses to be made available, or the result of an
   attempt made to obtain a general license or permission for the use of
   such proprietary rights by implementers or users of this
   specification can be obtained from the IETF on-line IPR repository at
   http://www.ietf.org/ipr.

   The IETF invites any interested party to bring to its attention any
   copyrights, patents or patent applications, or other proprietary
   rights that may cover technology that may be required to implement
   this standard. Please address the information to the IETF at