Network Working Group                                     A. Gulbrandsen
Request for Comments: 4978                        Oryx Mail Systems GmbH
Category: Standards Track                                    August 2007


                      The IMAP COMPRESS Extension

Status of this Memo

   This document specifies an Internet standards track protocol for the
   Internet community, and requests discussion and suggestions for
   improvements. Please refer to the current edition of the "Internet
   Official Protocol Standards" (STD 1) for the standardization state
   and status of this protocol. Distribution of this memo is unlimited.

Abstract

   The COMPRESS extension allows an IMAP connection to be effectively
   and efficiently compressed.

Table of Contents

   1. Introduction and Overview
   2. Conventions Used in This Document
   3. The COMPRESS Command
   4. Compression Efficiency
   5. Formal Syntax
   6. Security Considerations
   7. IANA Considerations
   8. Acknowledgements
   9. References
      9.1. Normative References
      9.2. Informative References

Gulbrandsen                 Standards Track                     [Page 1]

RFC 4978              The IMAP COMPRESS Extension            August 2007

1. Introduction and Overview

   A server which supports the COMPRESS extension indicates this with
   one or more capability names consisting of "COMPRESS=" followed by a
   supported compression algorithm name as described in this document.
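
   As an illustration only, a client might extract the advertised
   algorithms from a capability list roughly as follows. The function
   name compress_algorithms and the assumption that the capabilities
   arrive as one space-separated string are choices made for this
   sketch, not part of the extension.

```python
def compress_algorithms(capabilities: str) -> list:
    """Return the algorithm names advertised as COMPRESS=<name> capabilities."""
    prefix = "COMPRESS="
    algorithms = []
    for token in capabilities.split():
        # Capability names are case-insensitive, so compare in upper case.
        upper = token.upper()
        if upper.startswith(prefix) and len(upper) > len(prefix):
            algorithms.append(upper[len(prefix):])
    return algorithms


# Example: the capability list from a server greeting.
advertised = compress_algorithms("IMAP4rev1 STARTTLS COMPRESS=DEFLATE")
```

   A client would issue COMPRESS only if the returned list contains an
   algorithm it implements.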

   The goal of COMPRESS is to reduce the bandwidth usage of IMAP.

   Compared to PPP compression (see [RFC1962]) and modem-based
   compression (see [MNP] and [V42BIS]), COMPRESS offers much better
   compression efficiency. COMPRESS can be used together with Transport
   Layer Security (TLS) [RFC4346], Simple Authentication and Security
   Layer (SASL) encryption, Virtual Private Networks (VPNs), etc.
   Compared to TLS compression [RFC3749], COMPRESS has the following
   (dis)advantages:

   - COMPRESS can be implemented easily both by IMAP servers and
     clients.

   - IMAP COMPRESS benefits from an intimate knowledge of the IMAP
     protocol's state machine, allowing for dynamic and aggressive
     optimization of the underlying compression algorithm's parameters.

   - When the TLS layer implements compression, any protocol using that
     layer can transparently benefit from that compression (e.g., SMTP
     and IMAP). COMPRESS is specific to IMAP.

   In order to increase interoperation, it is desirable to have as few
   different compression algorithms as possible, so this document
   specifies only one. The DEFLATE algorithm (defined in [RFC1951]) is
   standard, widely available and fairly efficient, so it is the only
   algorithm defined by this document.

   In order to increase interoperation, IMAP servers that advertise this
   extension SHOULD also advertise the TLS DEFLATE compression mechanism
   as defined in [RFC3749]. IMAP clients MAY use either COMPRESS or TLS
   compression; however, if the client and server support both, it is
   RECOMMENDED that the client choose TLS compression.

   The extension adds one new command (COMPRESS) and no new responses.

2. Conventions Used in This Document

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in [RFC2119].

   Formal syntax is defined by [RFC4234] as modified by [RFC3501].

   In the examples, "C:" and "S:" indicate lines sent by the client and
   server respectively. "[...]" denotes elision.

3. The COMPRESS Command

   Arguments: Name of compression mechanism: "DEFLATE".

   Responses: None

   Result: OK  The server will compress its responses and expects the
               client to compress its commands.
           NO  Compression is already active via another layer.
           BAD Command unknown, invalid or unknown argument, or COMPRESS
               already active.

   The COMPRESS command instructs the server to use the named
   compression mechanism ("DEFLATE" is the only one defined) for all
   commands and/or responses after COMPRESS.

   The client MUST NOT send any further commands until it has seen the
   result of COMPRESS. If the response was OK, the client MUST compress
   starting with the first command after COMPRESS. If the server
   response was BAD or NO, the client MUST NOT turn on compression.

   If the server responds NO because it knows that the same mechanism is
   active already (e.g., because TLS has negotiated the same mechanism),
   it MUST send COMPRESSIONACTIVE as resp-text-code (see [RFC3501],
   Section 7.1), and the resp-text SHOULD say which layer compresses.
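
   The client-side decision described above could be sketched as
   follows. This is a minimal illustration, not a complete IMAP
   response parser: the function name compress_result and its three
   return values ("on", "already", "off") are hypothetical, and the
   input is assumed to be a single tagged response line without the
   trailing CRLF.

```python
def compress_result(tagged_response: str, tag: str) -> str:
    """Classify the tagged response to a COMPRESS command.

    Returns "on" (start compressing with the next command), "already"
    (another layer compresses, signalled by COMPRESSIONACTIVE), or
    "off" (carry on without COMPRESS).
    """
    parts = tagged_response.strip().split(" ", 2)
    if len(parts) < 2 or parts[0] != tag:
        raise ValueError("not the tagged response for this command")
    status = parts[1].upper()
    text = parts[2] if len(parts) > 2 else ""
    if status == "OK":
        return "on"
    if status == "NO" and text.upper().startswith("[COMPRESSIONACTIVE]"):
        return "already"
    if status in ("NO", "BAD"):
        return "off"
    raise ValueError("unexpected status: " + parts[1])
```

   Only the "on" result permits the client to turn on compression; in
   both other cases the connection continues exactly as before.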

   If the server issues an OK response, the server MUST compress
   starting immediately after the CRLF which ends the tagged OK
   response. (Responses issued by the server before the OK response
   will, of course, still be uncompressed.) If the server issues a BAD
   or NO response, the server MUST NOT turn on compression.

   For DEFLATE (as for many other compression mechanisms), the
   compressor can trade speed against quality. When decompressing there
   isn't much of a tradeoff. Consequently, the client and server are
   both free to pick the best reasonable rate of compression for the
   data they send.

   When COMPRESS is combined with TLS (see [RFC4346]) or SASL (see
   [RFC4422]) security layers, the sending order of the three extensions
   MUST be first COMPRESS, then SASL, and finally TLS. That is, before
   data is transmitted it is first compressed. Second, if a SASL
   security layer has been negotiated, the compressed data is then
   signed and/or encrypted accordingly. Third, if a TLS security layer
   has been negotiated, the data from the previous step is signed and/or
   encrypted accordingly. When receiving data, the processing order
   MUST be reversed. This ensures that before sending, data is
   compressed before it is encrypted, independent of the order in which
   the client issues COMPRESS, AUTHENTICATE, and STARTTLS.
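
   The layering order can be pictured with the following sketch. The
   two security layers are deliberately fake: xor_layer is a
   hypothetical stand-in (XOR with a constant, which is its own
   inverse) used only so that the example runs without a TLS or SASL
   library. Only the order of the calls is the point. The use of
   Z_SYNC_FLUSH to push out each message is an assumption of the
   sketch.

```python
import zlib


def xor_layer(key: int):
    """Hypothetical stand-in for a security layer; XOR is its own inverse."""
    def transform(data: bytes) -> bytes:
        return bytes(b ^ key for b in data)
    return transform


sasl_wrap = sasl_unwrap = xor_layer(0x5A)  # stand-in for a SASL security layer
tls_wrap = tls_unwrap = xor_layer(0xA5)    # stand-in for TLS record protection

deflater = zlib.compressobj(6, zlib.DEFLATED, -15)  # raw DEFLATE stream
inflater = zlib.decompressobj(-15)


def send(plain: bytes) -> bytes:
    """Sending order: COMPRESS first, then SASL, and finally TLS."""
    compressed = deflater.compress(plain) + deflater.flush(zlib.Z_SYNC_FLUSH)
    return tls_wrap(sasl_wrap(compressed))


def receive(wire: bytes) -> bytes:
    """Receiving order is reversed: TLS, then SASL, then decompression."""
    return inflater.decompress(sasl_unwrap(tls_unwrap(wire)))
```

   Each peer keeps one compressor and one decompressor for the life of
   the connection, so later messages benefit from the dictionary built
   up by earlier ones.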

   The following example illustrates how commands and responses are
   compressed during a simple login sequence:

        S: * OK [CAPABILITY IMAP4REV1 STARTTLS COMPRESS=DEFLATE]
        C: a starttls
        S: a OK TLS active

        From this point on, everything is encrypted.

        C: b login arnt tnra
        S: b OK Logged in as arnt
        C: c compress deflate
        S: c OK DEFLATE active

        From this point on, everything is compressed before being
        encrypted.

   The following example demonstrates how a server may refuse to
   compress twice:

        S: * OK [CAPABILITY IMAP4REV1 STARTTLS COMPRESS=DEFLATE]
        [...]
        C: c compress deflate
        S: c NO [COMPRESSIONACTIVE] DEFLATE active via TLS

4. Compression Efficiency

   This section is informative, not normative.

   IMAP poses some unusual problems for a compression layer.

   Upstream is fairly simple. Most IMAP clients send the same few
   commands again and again, so any compression algorithm that can
   exploit repetition works efficiently. The APPEND command is an
   exception; clients that send many APPEND commands may want to
   surround large literals with flushes in the same way as is
   recommended for servers later in this section.

   Downstream has the unusual property that several kinds of data are
   sent, confusing all dictionary-based compression algorithms.

   One type is IMAP responses. These are highly compressible; zlib
   using its least CPU-intensive setting compresses typical responses to
   25-40% of their original size.

   Another type is email headers. These are equally compressible, and
   benefit from using the same dictionary as the IMAP responses.

   A third type is email body text. Text is usually fairly short and
   includes much ASCII, so the same compression dictionary will do a
   good job here, too. When multiple messages in the same thread are
   read at the same time, quoted lines etc. can often be compressed
   almost to zero.

   Finally, attachments (non-text email bodies) are transmitted, either
   in binary form or encoded with base-64.

   When attachments are retrieved in binary form, DEFLATE may be able to
   compress them, but the format of the attachment is usually not IMAP-
   like, so the dictionary built while compressing IMAP does not help.
   The compressor has to adapt its dictionary from IMAP to the
   attachment's format, and then back. A few file formats aren't
   compressible at all using deflate, e.g., .gz, .zip, and .jpg files.

   When attachments are retrieved in base-64 form, the same problems
   apply, but the base-64 encoding adds another problem. 8-bit
   compression algorithms such as deflate work well on 8-bit file
   formats; however, base-64 turns a file into something resembling
   6-bit bytes, hiding most of the 8-bit file format from the
   compressor.
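
   The effect of base-64 on an 8-bit compressor can be observed with a
   small experiment. The sketch below is illustrative only: the
   payload is synthetic (pseudo-random words from a made-up
   vocabulary, standing in for an 8-bit file with byte-aligned
   structure), and the helper raw_deflate is a hypothetical name. It
   compresses the same payload once as raw bytes and once after
   base-64 encoding.

```python
import base64
import random
import zlib


def raw_deflate(data: bytes) -> bytes:
    """Compress data as one complete raw DEFLATE stream at level 1."""
    compressor = zlib.compressobj(1, zlib.DEFLATED, -15)
    return compressor.compress(data) + compressor.flush()


# Synthetic payload with byte-aligned repetition (an assumption of the demo).
rng = random.Random(0)
vocabulary = ["word%d" % n for n in range(50)]
body = " ".join(rng.choice(vocabulary) for _ in range(2000)).encode("ascii")

size_8bit = len(raw_deflate(body))
size_base64 = len(raw_deflate(base64.b64encode(body)))
```

   In the encoded form the same word appears in three different
   alignments relative to the 3-byte base-64 groups, so the compressor
   finds fewer and shorter matches, and size_base64 comes out larger
   than size_8bit.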

   When using the zlib library (see [RFC1951]), the functions
   deflateInit2(), deflate(), inflateInit2(), and inflate() suffice to
   implement this extension. The windowBits value must be in the range
   -8 to -15, or else deflateInit2() uses the wrong format.
   deflateParams() can be used to improve compression rate and resource
   use. The Z_FULL_FLUSH argument to deflate() can be used to clear the
   dictionary (the receiving peer does not need to do anything).
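
   For readers who prefer to experiment from a scripting language, the
   same zlib facilities are reachable through Python's zlib module.
   The following is a minimal sketch under stated assumptions: the
   helper name compress_chunk is hypothetical, compression level 1 is
   an arbitrary choice, and the use of Z_SYNC_FLUSH to make each chunk
   decodable on arrival is an assumption of the sketch rather than
   something this document specifies.

```python
import zlib

# A negative windowBits value selects raw DEFLATE (no zlib or gzip header).
deflater = zlib.compressobj(1, zlib.DEFLATED, -15)
inflater = zlib.decompressobj(-15)


def compress_chunk(data: bytes, clear_dictionary: bool = False) -> bytes:
    """Compress one chunk and flush so the peer can decode it immediately.

    With clear_dictionary=True the flush is a Z_FULL_FLUSH, which also
    resets the compressor's dictionary.
    """
    mode = zlib.Z_FULL_FLUSH if clear_dictionary else zlib.Z_SYNC_FLUSH
    return deflater.compress(data) + deflater.flush(mode)


wire1 = compress_chunk(b"a1 SELECT INBOX\r\n")
wire2 = compress_chunk(b"a2 FETCH 1:* FLAGS\r\n", clear_dictionary=True)

# The receiver uses a single decompressor and needs no special handling
# for the full flush.
plain1 = inflater.decompress(wire1)
plain2 = inflater.decompress(wire2)
```

   The compressor and decompressor objects live as long as the
   connection; the stream is never finished with Z_FINISH while the
   connection is open.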

   A client can improve downstream compression by implementing BINARY
   (defined in [RFC3516]) and using FETCH BINARY instead of FETCH BODY.
   In the author's experience, the improvement ranges from 5% to 40%
   depending on the attachment being downloaded.

   A server can improve downstream compression if it hints to the
   compressor that the data type is about to change strongly, e.g., by
   sending a Z_FULL_FLUSH at the start and end of large non-text
   literals (before and after '*CHAR8' in the definition of literal in
   RFC 3501, page 86). Small literals are best left alone. A possible
   boundary is 5k.
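
   One way to picture this hint is the sketch below. It is an
   illustration under assumptions: the function name compress_literal,
   the is_text flag supplied by the caller, and the exact threshold of
   5 * 1024 bytes are all choices made for the example. The
   Z_SYNC_FLUSH in the small-literal branch is there only so that the
   function returns decodable output.

```python
import zlib

LARGE_LITERAL = 5 * 1024  # the "possible boundary" of 5k


def compress_literal(deflater, literal: bytes, is_text: bool) -> bytes:
    """Compress a literal body, isolating large non-text data with full flushes."""
    if is_text or len(literal) < LARGE_LITERAL:
        # Small or textual literals are left alone: same dictionary as before.
        return deflater.compress(literal) + deflater.flush(zlib.Z_SYNC_FLUSH)
    out = deflater.flush(zlib.Z_FULL_FLUSH)   # drop the IMAP-flavoured dictionary
    out += deflater.compress(literal)
    out += deflater.flush(zlib.Z_FULL_FLUSH)  # drop the attachment's dictionary
    return out
```

   The receiving side decompresses the result with its ordinary
   decompressor; nothing in the output needs special treatment.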

   A server can improve the CPU efficiency both of the server and the
   client if it adjusts the compression level (e.g., using the
   deflateParams() function in zlib) at these points, to avoid trying to
   compress incompressible attachments. A very simple strategy is to
   change the level to 0 at the start of a literal provided the first
   two bytes are either 0x1F 0x8B (as in gzip-compressed files) or
   0xFF 0xD8 (JPEG), and to keep it at 1-5 the rest of the time. More
   complex strategies are possible.
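
   The very simple strategy can be written down in a few lines. The
   sketch covers only the decision; applying the chosen level to a
   live stream (for example with deflateParams() in zlib) is left out.
   The names INCOMPRESSIBLE_MAGIC and level_for_literal, and the
   default level of 1, are assumptions of this example.

```python
# First two bytes of formats that deflate cannot usefully compress.
INCOMPRESSIBLE_MAGIC = (b"\x1f\x8b", b"\xff\xd8")  # gzip, JPEG


def level_for_literal(first_bytes: bytes, normal_level: int = 1) -> int:
    """Return 0 (store only) for literals that look already compressed."""
    if first_bytes[:2] in INCOMPRESSIBLE_MAGIC:
        return 0
    return normal_level
```

   A more complex strategy could extend the tuple with further
   signatures or inspect the Content-Type of the body part instead.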

5. Formal Syntax

   The following syntax specification uses the Augmented Backus-Naur
   Form (ABNF) notation as specified in [RFC4234]. This syntax augments
   the grammar specified in [RFC3501]. [RFC4234] defines SP and
   [RFC3501] defines command-auth, capability, and resp-text-code.

   Except as noted otherwise, all alphabetic characters are case-
   insensitive. The use of upper or lower case characters to define
   token strings is for editorial clarity only. Implementations MUST
   accept these strings in a case-insensitive fashion.

   command-auth =/ compress

   compress = "COMPRESS" SP algorithm

   capability =/ "COMPRESS=" algorithm
                 ;; multiple COMPRESS capabilities allowed

   algorithm = "DEFLATE"

   resp-text-code =/ "COMPRESSIONACTIVE"

   Note that due to the syntax of capability names, future algorithm
   names must be atoms.
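
   As a rough illustration of the case-insensitivity requirement, a
   server might recognise the command line as sketched below. This is
   a simplification: the pattern accepts any run of non-space
   characters as the tag, whereas a real parser would apply the full
   tag grammar of [RFC3501]. The name parse_compress is hypothetical.

```python
import re

# Simplified: "\S+" stands in for the RFC 3501 "tag" production.
_COMPRESS = re.compile(r"(\S+) COMPRESS (DEFLATE)\r\n", re.IGNORECASE)


def parse_compress(line: str):
    """Return (tag, algorithm) for a COMPRESS command line, else None."""
    match = _COMPRESS.fullmatch(line)
    if match is None:
        return None
    # Normalise the algorithm name so callers can compare it directly.
    return match.group(1), match.group(2).upper()
```

   An unknown algorithm name does not match, which corresponds to the
   BAD result for an invalid or unknown argument.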

6. Security Considerations

   As for TLS compression [RFC3749].

7. IANA Considerations

   The IANA has added COMPRESS=DEFLATE to the list of IMAP capabilities.

8. Acknowledgements

   Eric Burger, Dave Cridland, Tony Finch, Ned Freed, Philip Guenther,
   Randall Gellens, Tony Hansen, Cullen Jennings, Stephane Maes, Alexey
   Melnikov, Lyndon Nerenberg, and Zoltan Ordogh have all helped with
   this document.

   The author would also like to thank various people in the rooms at
   meetings, whose help is real, but not reflected in the author's
   mailbox.

9. References

9.1. Normative References

   [RFC1951] Deutsch, P., "DEFLATE Compressed Data Format Specification
             version 1.3", RFC 1951, May 1996.

   [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
             Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC3501] Crispin, M., "INTERNET MESSAGE ACCESS PROTOCOL - VERSION
             4rev1", RFC 3501, March 2003.

   [RFC4234] Crocker, D. and P. Overell, "Augmented BNF for Syntax
             Specifications: ABNF", RFC 4234, October 2005.

9.2. Informative References

   [RFC1962] Rand, D., "The PPP Compression Control Protocol (CCP)",
             RFC 1962, June 1996.

   [RFC3516] Nerenberg, L., "IMAP4 Binary Content Extension", RFC 3516,
             April 2003.

   [RFC3749] Hollenbeck, S., "Transport Layer Security Protocol
             Compression Methods", RFC 3749, May 2004.

   [RFC4346] Dierks, T. and E. Rescorla, "The Transport Layer Security
             (TLS) Protocol Version 1.1", RFC 4346, April 2006.

   [RFC4422] Melnikov, A. and K. Zeilenga, "Simple Authentication and
             Security Layer (SASL)", RFC 4422, June 2006.

   [V42BIS]  ITU, "V.42bis: Data compression procedures for data
             circuit-terminating equipment (DCE) using error correction
             procedures", http://www.itu.int/rec/T-REC-V.42bis, January
             1990.

   [MNP]     Gilbert Held, "The Complete Modem Reference", Second
             Edition, Wiley Professional Computing, ISBN 0-471-00852-4,
             May 1994.

Author's Address

   Arnt Gulbrandsen
   Oryx Mail Systems GmbH
   Schweppermannstr. 8
   D-81671 Muenchen
   Germany

   Fax: +49 89 4502 9758
   EMail: arnt@oryx.com

Full Copyright Statement

   Copyright (C) The IETF Trust (2007).

   This document is subject to the rights, licenses and restrictions
   contained in BCP 78, and except as set forth therein, the authors
   retain all their rights.

   This document and the information contained herein are provided on an
   "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
   OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY, THE IETF TRUST AND
   THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS
   OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF
   THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
   WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

Intellectual Property

   The IETF takes no position regarding the validity or scope of any
   Intellectual Property Rights or other rights that might be claimed to
   pertain to the implementation or use of the technology described in
   this document or the extent to which any license under such rights
   might or might not be available; nor does it represent that it has
   made any independent effort to identify any such rights. Information
   on the procedures with respect to rights in RFC documents can be
   found in BCP 78 and BCP 79.

   Copies of IPR disclosures made to the IETF Secretariat and any
   assurances of licenses to be made available, or the result of an
   attempt made to obtain a general license or permission for the use of
   such proprietary rights by implementers or users of this
   specification can be obtained from the IETF on-line IPR repository at
   http://www.ietf.org/ipr.

   The IETF invites any interested party to bring to its attention any
   copyrights, patents or patent applications, or other proprietary
   rights that may cover technology that may be required to implement
   this standard. Please address the information to the IETF at
   ietf-ipr@ietf.org.