Internet Engineering Task Force (IETF)                   P. Resnick, Ed.
Request for Comments: 6855                         Qualcomm Incorporated
Obsoletes: 5738                                           C. Newman, Ed.
Category: Standards Track                                         Oracle
ISSN: 2070-1721                                             S. Shen, Ed.
                                                                   CNNIC
                                                              March 2013


                         IMAP Support for UTF-8

Abstract

   This specification extends the Internet Message Access Protocol
   (IMAP) to support UTF-8 encoded international characters in user
   names, mail addresses, and message headers. This specification
   replaces RFC 5738.

Status of This Memo

   This is an Internet Standards Track document.

   This document is a product of the Internet Engineering Task Force
   (IETF). It represents the consensus of the IETF community. It has
   received public review and has been approved for publication by the
   Internet Engineering Steering Group (IESG). Further information on
   Internet Standards is available in Section 2 of RFC 5741.

   Information about the current status of this document, any errata,
   and how to provide feedback on it may be obtained at
   http://www.rfc-editor.org/info/rfc6855.

Copyright Notice

   Copyright (c) 2013 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document. Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Resnick, et al.              Standards Track                    [Page 1]

RFC 6855                 IMAP Support for UTF-8               March 2013

Table of Contents

   1. Introduction
   2. Conventions Used in This Document
   3. "UTF8=ACCEPT" IMAP Capability and UTF-8 in IMAP Quoted-Strings
   4. IMAP UTF8 "APPEND" Data Extension
   5. "LOGIN" Command and UTF-8
   6. "UTF8=ONLY" Capability
   7. Dealing with Legacy Clients
   8. Issues with UTF-8 Header Mailstore
   9. IANA Considerations
   10. Security Considerations
   11. References
       11.1. Normative References
       11.2. Informative References
   Appendix A. Design Rationale
   Appendix B. Acknowledgments

1. Introduction

   This specification forms part of the Email Address
   Internationalization protocols described in the Email Address
   Internationalization Framework document [RFC6530]. It extends IMAP
   [RFC3501] to permit UTF-8 [RFC3629] in headers, as described in
   "Internationalized Email Headers" [RFC6532]. It also adds a
   mechanism to support mailbox names using the UTF-8 charset. This
   specification creates two new IMAP capabilities to allow servers to
   advertise these new extensions.

   This specification assumes that the IMAP server will be operating in
   a fully internationalized environment, i.e., one in which all clients
   accessing the server will be able to accept non-ASCII message header
   fields and other information, as specified in Section 3. At least
   during a transition period, that assumption will not be realistic for
   many environments; the issues involved are discussed in Section 7
   below.

   This specification replaces an earlier, experimental approach to the
   same problem [RFC5738].

2. Conventions Used in This Document

   The key words "MUST", "MUST NOT", "SHOULD", "SHOULD NOT", and "MAY"
   in this document are to be interpreted as defined in "Key words for
   use in RFCs to Indicate Requirement Levels" [RFC2119].

   The formal syntax uses the Augmented Backus-Naur Form (ABNF)
   [RFC5234] notation. In addition, rules from IMAP [RFC3501], UTF-8
   [RFC3629], Extensions to IMAP ABNF [RFC4466], and IMAP "LIST" command
   extensions [RFC5258] are also referenced. This document assumes that
   the reader will have a reasonably good understanding of these RFCs.

3. "UTF8=ACCEPT" IMAP Capability and UTF-8 in IMAP Quoted-Strings

   The "UTF8=ACCEPT" capability indicates that the server supports the
   ability to open mailboxes containing internationalized messages with
   the "SELECT" and "EXAMINE" commands, and the server can provide UTF-8
   responses to the "LIST" and "LSUB" commands. This capability also
   affects other IMAP extensions that can return mailbox names or their
   prefixes, such as NAMESPACE [RFC2342] and ACL [RFC4314].

   The "UTF8=ONLY" capability, described in Section 6, implies the
   "UTF8=ACCEPT" capability. A server is said to support "UTF8=ACCEPT"
   if it advertises either "UTF8=ACCEPT" or "UTF8=ONLY".

   A client MUST use the "ENABLE" command [RFC5161] with the
   "UTF8=ACCEPT" option (defined in this section) to indicate to the
   server that the client accepts UTF-8 in quoted-strings and supports
   the "UTF8=ACCEPT" extension. The "ENABLE UTF8=ACCEPT" command is
   only valid in the authenticated state.

   The IMAP base specification [RFC3501] forbids the use of 8-bit
   characters in atoms or quoted-strings. Thus, a UTF-8 string can only
   be sent as a literal. This can be inconvenient from a coding
   standpoint, and unless the server offers IMAP non-synchronizing
   literals [RFC2088], this requires an extra round trip for each UTF-8
   string sent by the client. When the IMAP server supports
   "UTF8=ACCEPT", it supports UTF-8 in quoted-strings with the following
   syntax:

      quoted        =/ DQUOTE *uQUOTED-CHAR DQUOTE
                    ; QUOTED-CHAR is not modified, as it will affect
                    ; other RFC 3501 ABNF non-terminals.

      uQUOTED-CHAR  = QUOTED-CHAR / UTF8-2 / UTF8-3 / UTF8-4

      UTF8-2        = <Defined in Section 4 of RFC 3629>

      UTF8-3        = <Defined in Section 4 of RFC 3629>

      UTF8-4        = <Defined in Section 4 of RFC 3629>

   When this extended quoting mechanism is used by the client, the
   server MUST reject, with a "BAD" response, any octet sequences with
   the high bit set that fail to comply with the formal syntax
   requirements of UTF-8 [RFC3629]. The IMAP server MUST NOT send UTF-8
   in quoted-strings to the client unless the client has indicated
   support for that syntax by using the "ENABLE UTF8=ACCEPT" command.

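   As an illustration of the rejection rule above, the following Python
   sketch shows one way a server might decode the octets found between
   the double quotes of an extended quoted-string. The function name
   parse_utf8_quoted is hypothetical and not part of this
   specification. A ValueError stands in for the "BAD" response, and
   Python's strict UTF-8 decoder is assumed to be an adequate check of
   the RFC 3629 syntax.

```python
def parse_utf8_quoted(raw: bytes) -> str:
    """Decode the octets between the DQUOTEs of an extended quoted-string.

    Raises ValueError wherever a server would answer with BAD.
    """
    out = bytearray()
    i = 0
    while i < len(raw):
        b = raw[i]
        if b == 0x5C:
            # Backslash may only escape a DQUOTE or another backslash.
            if i + 1 >= len(raw) or raw[i + 1] not in (0x22, 0x5C):
                raise ValueError("invalid escape in quoted-string")
            out.append(raw[i + 1])
            i += 2
            continue
        if b in (0x00, 0x0A, 0x0D, 0x22):
            # NUL, LF, CR, and an unescaped DQUOTE are not QUOTED-CHARs.
            raise ValueError("octet not allowed in quoted-string")
        out.append(b)
        i += 1
    try:
        # Strict decoding rejects truncated, overlong, and surrogate forms.
        return out.decode("utf-8")
    except UnicodeDecodeError as exc:
        raise ValueError("invalid UTF-8 in quoted-string") from exc
```

   A caller would map the ValueError to a tagged "BAD" response and
   otherwise pass the decoded string on to normal argument handling.
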
   If the server supports "UTF8=ACCEPT", the client MAY use extended
   quoted syntax with any IMAP argument that permits a string (including
   astring and nstring). However, if characters outside the US-ASCII
   repertoire are used in an inappropriate place, the results would be
   the same as if other syntactically valid but semantically invalid
   characters were used. Specific cases where UTF-8 characters are
   permitted or not permitted are described in the following paragraphs.

   All IMAP servers that support "UTF8=ACCEPT" SHOULD accept UTF-8 in
   mailbox names, and those that also support the Mailbox International
   Naming Convention described in RFC 3501, Section 5.1.3, MUST accept
   UTF8-quoted mailbox names and convert them to the appropriate
   internal format. Mailbox names MUST comply with the Net-Unicode
   Definition ([RFC5198], Section 2) with the specific exception that
   they MUST NOT contain control characters (U+0000-U+001F and
   U+0080-U+009F), a delete character (U+007F), a line separator
   (U+2028), or a paragraph separator (U+2029).

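   The character exclusions for mailbox names listed above can be
   sketched as a small check. This is an illustrative Python fragment,
   not a complete Net-Unicode validator: the function name
   mailbox_name_problem is hypothetical, and the final test for
   Normalization Form C reflects an assumption about how an
   implementation might apply the normalization guidance in RFC 5198.
   Whether a non-normalized name is rejected or silently normalized is
   a local policy choice.

```python
import unicodedata
from typing import Optional

# Single code points excluded in addition to the control ranges:
# DELETE, LINE SEPARATOR, PARAGRAPH SEPARATOR.
_FORBIDDEN = {0x7F, 0x2028, 0x2029}


def mailbox_name_problem(name: str) -> Optional[str]:
    """Return a description of the first problem found, or None."""
    for ch in name:
        cp = ord(ch)
        if cp <= 0x1F or 0x80 <= cp <= 0x9F:
            return "control character U+%04X" % cp
        if cp in _FORBIDDEN:
            return "forbidden character U+%04X" % cp
    # Assumption: treat names that are not already in NFC as a problem.
    if unicodedata.normalize("NFC", name) != name:
        return "not in Normalization Form C"
    return None
```

   A server could run such a check on the decoded name in "CREATE" and
   "RENAME" before converting it to its internal format.
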
   Once an IMAP client has enabled UTF-8 support with the "ENABLE
   UTF8=ACCEPT" command, it MUST NOT issue a "SEARCH" command that
   contains a charset specification. If an IMAP server receives such a
   "SEARCH" command in that situation, it SHOULD reject the command with
   a "BAD" response (due to the conflicting charset labels).

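   The "SEARCH" restriction can be pictured as a check that runs before
   the search keys are evaluated. The Python sketch below assumes a
   hypothetical server that has already split the command arguments
   into tokens and tracks, per connection, whether "ENABLE UTF8=ACCEPT"
   was issued. The function name and the response texts are
   illustrative only.

```python
from typing import List, Tuple


def check_search(args: List[str], utf8_enabled: bool) -> Tuple[str, str]:
    """Pre-check the tokenized arguments of a SEARCH command.

    Returns a (status, text) pair; "OK" means the search may proceed.
    """
    # In the RFC 3501 grammar, an optional CHARSET specification is the
    # first argument, ahead of the search keys.
    has_charset = bool(args) and args[0].upper() == "CHARSET"
    if utf8_enabled and has_charset:
        return ("BAD", "CHARSET not allowed after ENABLE UTF8=ACCEPT")
    return ("OK", "proceed with search")
```
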
4. IMAP UTF8 "APPEND" Data Extension

   If the server supports "UTF8=ACCEPT", then the server accepts UTF-8
   headers in the "APPEND" command message argument. A client that
   sends a message with UTF-8 headers to the server MUST send them using
   the "UTF8" data extension to the "APPEND" command. If the server
   also advertises the "CATENATE" capability [RFC4469], the client can
   use the same data extension to include such a message in a catenated
   message part. The ABNF for the "APPEND" data extension and
   "CATENATE" extension follows:

      utf8-literal  = "UTF8" SP "(" literal8 ")"

      literal8      = <Defined in RFC 4466>

      append-data   =/ utf8-literal

      cat-part      =/ utf8-literal

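   To make the utf8-literal syntax concrete, the following Python
   sketch assembles the octets a client might send for an "APPEND" that
   uses the "UTF8" data extension. The helper names are hypothetical.
   The sketch assumes a synchronizing literal8 of the form "~{" number
   "}" CRLF followed by the message octets, no flags or date-time
   arguments, and a mailbox name that contains no CR or LF. The command
   is returned in two parts because, with a synchronizing literal, the
   client has to wait for the server's continuation request before
   sending the second part.

```python
from typing import Tuple


def quote_mailbox(name: str) -> bytes:
    """Encode a mailbox name as a UTF-8 quoted-string."""
    escaped = name.replace("\\", "\\\\").replace('"', '\\"')
    return b'"' + escaped.encode("utf-8") + b'"'


def build_utf8_append(tag: str, mailbox: str,
                      message: bytes) -> Tuple[bytes, bytes]:
    """Return (command up to the literal header, message plus closing)."""
    first = b"%s APPEND %s UTF8 (~{%d}\r\n" % (
        tag.encode("ascii"), quote_mailbox(mailbox), len(message))
    # The closing parenthesis and CRLF follow the literal data.
    rest = message + b")\r\n"
    return first, rest
```

   The octet count in the literal header is the length of the message
   alone; the closing parenthesis is not part of the literal.
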
   If an IMAP server supports "UTF8=ACCEPT" and the IMAP client has not
   issued the "ENABLE UTF8=ACCEPT" command, the server MUST reject, with
   a "NO" response, an "APPEND" command that includes any 8-bit
   character in message header fields.

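   A server-side view of this rule might look like the Python sketch
   below. The function names are hypothetical, and the sketch makes a
   simplifying assumption: only the top-level header section (the
   octets before the first empty line) is inspected, not the headers of
   nested MIME parts.

```python
def header_octets(message: bytes) -> bytes:
    """Return the top-level header section of a raw message."""
    if message.startswith(b"\r\n"):
        return b""  # the message begins with an empty header section
    end = message.find(b"\r\n\r\n")
    return message if end == -1 else message[:end]


def check_append(message: bytes, utf8_enabled: bool) -> str:
    """Return "NO" if the APPEND has to be rejected, otherwise "OK"."""
    has_8bit = any(octet >= 0x80 for octet in header_octets(message))
    if has_8bit and not utf8_enabled:
        return "NO"
    return "OK"
```
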
5. "LOGIN" Command and UTF-8

   This specification does not extend the IMAP "LOGIN" command [RFC3501]
   to support UTF-8 usernames and passwords. Whenever a client needs to
   use UTF-8 usernames or passwords, it MUST use the IMAP "AUTHENTICATE"
   command, which is already capable of passing UTF-8 usernames and
   credentials.

   Although using the IMAP "AUTHENTICATE" command in this way makes it
   syntactically legal to have a UTF-8 username or password, there is no
   guarantee that the user provisioning system utilized by the IMAP
   server will allow such identities. This is an implementation
   decision and may depend on what identity system the IMAP server is
   configured to use.

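   A client-side sketch of this choice is given below in Python. It is
   illustrative only: this specification does not name a SASL
   mechanism, so the use of PLAIN here is an assumption, as are the
   function names. The sketch leaves the authorization identity empty
   and omits any SASLprep preparation of the strings (see Section 10),
   which a real client would need to consider.

```python
import base64


def needs_authenticate(user: str, password: str) -> bool:
    """True if LOGIN cannot be used because a credential is not ASCII."""
    return not (user.isascii() and password.isascii())


def sasl_plain_response(user: str, password: str) -> str:
    """Build the base64 response for the assumed SASL PLAIN mechanism.

    The message is: empty authzid, NUL, username, NUL, password,
    with both strings encoded as UTF-8.
    """
    raw = b"\x00" + user.encode("utf-8") + b"\x00" + password.encode("utf-8")
    return base64.b64encode(raw).decode("ascii")
```

   A client would send the response after "AUTHENTICATE PLAIN", and it
   should do so only over a connection that protects the credentials in
   transit.
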
6. "UTF8=ONLY" Capability

   The "UTF8=ONLY" capability indicates that the server supports
   "UTF8=ACCEPT" (see Section 3) and that it requires support for UTF-8
   from clients. In particular, this means that the server will send
   UTF-8 in quoted-strings, and it will not accept the older
   international mailbox name convention (modified UTF-7 [RFC3501]).
   Because these are incompatible changes to IMAP, explicit server
   announcement and client confirmation are necessary: clients MUST use
   the "ENABLE UTF8=ACCEPT" command before using this server. A server
   that advertises "UTF8=ONLY" will reject, with a "NO [CANNOT]"
   response [RFC5530], any command that might require UTF-8 support and
   is not preceded by an "ENABLE UTF8=ACCEPT" command.

   IMAP clients that find support for a server that announces
   "UTF8=ONLY" problematic are encouraged to at least detect the
   announcement and provide an informative error message to the
   end-user.

   Because the "UTF8=ONLY" server capability includes support for
   "UTF8=ACCEPT", the capability string will include, at most, one of
   those and never both. For the client, "ENABLE UTF8=ACCEPT" is always
   used -- never "ENABLE UTF8=ONLY".

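   The relationship between the two capabilities can be sketched from
   the client's point of view as follows. This Python fragment is
   illustrative: the function names are hypothetical, capability names
   are compared case-insensitively as an assumption, and the check for
   the "ENABLE" capability reflects the requirement in RFC 5161 that
   the server advertise it.

```python
from typing import Iterable, Optional


def supports_utf8_accept(capabilities: Iterable[str]) -> bool:
    """True if the server advertises UTF8=ACCEPT or UTF8=ONLY."""
    caps = {c.upper() for c in capabilities}
    return "UTF8=ACCEPT" in caps or "UTF8=ONLY" in caps


def requires_utf8(capabilities: Iterable[str]) -> bool:
    """True if the server insists on UTF-8 support (UTF8=ONLY)."""
    return "UTF8=ONLY" in {c.upper() for c in capabilities}


def enable_command(tag: str, capabilities: Iterable[str]) -> Optional[str]:
    """Return the ENABLE command line to send, or None if unsupported."""
    caps = [c.upper() for c in capabilities]
    if "ENABLE" not in caps or not supports_utf8_accept(caps):
        return None
    # The argument is always UTF8=ACCEPT, even on a UTF8=ONLY server.
    return tag + " ENABLE UTF8=ACCEPT"
```

   A client that cannot handle UTF-8 could use requires_utf8 to detect
   the announcement and show an informative error to the user.
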
7. Dealing with Legacy Clients

   In most situations, it will be difficult or impossible for the
   implementer or operator of an IMAP (or POP) server to know whether
   all of the clients that might access it, or the associated mail store
   more generally, will be able to support the facilities defined in
   this document. In almost all cases, servers that conform to this
   specification will have to be prepared to deal with clients that do
   not enable the relevant capabilities. Unfortunately, there is no
   completely satisfactory way to do so other than for systems that wish
   to receive email that requires SMTPUTF8 capabilities to be sure that
   all components of those systems -- including IMAP and other clients
   selected by users -- are upgraded appropriately.

   When a message that requires SMTPUTF8 is encountered and the client
   does not enable UTF-8 capability, choices available to the server
   include hiding the problematic message(s), creating in-band or
   out-of-band notifications or error messages, or somehow trying to
   create a surrogate of the message with the intention of providing
   useful information to that client about what has occurred. Such
   surrogate messages cannot be actual substitutes for the original
   message: they will almost always be impossible to reply to (either at
   all or without loss of information) and the new header fields or
   specialized constructs for server-client communications may go beyond
   the requirements of current email specifications (e.g., [RFC5322]).
   Consequently, such messages may confuse some legacy mail user agents
   (including IMAP clients) or not provide expected information to
   users. There are also trade-offs in constructing surrogates of the
   original message between accepting complexity and additional
   computation costs in order to try to preserve as much information as
   possible (for example, in "Post-Delivery Message Downgrading for
   Internationalized Email Messages" [RFC6857]) and trying to minimize
   those costs while still providing useful information (for example, in
   "Simplified POP and IMAP Downgrading for Internationalized Email"
   [RFC6858]).

   Implementations that choose to perform downgrading SHOULD use one of
   the standardized algorithms provided in RFC 6857 or RFC 6858.
   Getting downgrade algorithms right, and minimizing the risk of
   operational problems and harm to the email system, is tricky and
   requires careful engineering. These two algorithms are well
   understood and carefully designed.

   Because such messages are really surrogates of the original ones, not
   really "downgraded" ones (although that terminology is often used for
   convenience), they inevitably have relationships to the originals
   that the IMAP specification [RFC3501] did not anticipate. This
   brings up two concerns in particular: First, digital signatures
   computed over and intended for the original message will often not be
   applicable to the surrogate message, and will often fail signature
   verification. (It will be possible for some digital signatures to be
   verified, if they cover only parts of the original message that are
   not affected in the creation of the surrogate.) Second, servers that
   may be accessed by the same user with different clients or methods
   (e.g., POP or webmail systems in addition to IMAP or IMAP clients
   with different capabilities) will need to exert extreme care to be
   sure that UIDVALIDITY [RFC3501] behaves as the user would expect.
   Those issues may be especially sensitive if the server caches the
   surrogate message or computes and stores it when the message arrives
   with the intent of making either form available depending on client
   capabilities. Additionally, in order to cope with the case when a
   server compliant with this extension returns the same UIDVALIDITY to
   both legacy and "UTF8=ACCEPT"-aware clients, a client upgraded from
   being non-"UTF8=ACCEPT"-aware MUST discard its cache of messages
   downloaded from the server.

   The best (or "least bad") approach for any given environment will
   depend on local conditions, local assumptions about user behavior,
   the degree of control the server operator has over client usage and
   upgrading, the options that are actually available, and so on. It is
   impossible, at least at the time of publication of this
   specification, to give good advice that will apply to all situations,
   or even particular profiles of situations, other than "upgrade legacy
   clients as soon as possible".

8. Issues with UTF-8 Header Mailstore

   When an IMAP server uses a mailbox format that supports UTF-8 headers
   and it permits selection or examination of that mailbox without
   issuing "ENABLE UTF8=ACCEPT" first, it is the responsibility of the
   server to comply with the IMAP base specification [RFC3501] and the
   Internet Message Format [RFC5322] with respect to all header
   information transmitted over the wire. The issue of handling
   messages containing non-ASCII characters in legacy environments is
   discussed in Section 7.

9. IANA Considerations

   This document redefines two capabilities ("UTF8=ACCEPT" and
   "UTF8=ONLY") in the "IMAP 4 Capabilities" registry [RFC3501]. Three
   other capabilities that were described in the experimental
   predecessor to this document ("UTF8=ALL", "UTF8=APPEND", "UTF8=USER")
   are now OBSOLETE. IANA has updated the registry as follows:

   OLD:
      +--------------+-----------------+
      | UTF8=ACCEPT  | [RFC5738]       |
      | UTF8=ALL     | [RFC5738]       |
      | UTF8=APPEND  | [RFC5738]       |
      | UTF8=ONLY    | [RFC5738]       |
      | UTF8=USER    | [RFC5738]       |
      +--------------+-----------------+

   NEW:
      +------------------------+---------------------+
      | UTF8=ACCEPT            | [RFC6855]           |
      | UTF8=ALL (OBSOLETE)    | [RFC5738] [RFC6855] |
      | UTF8=APPEND (OBSOLETE) | [RFC5738] [RFC6855] |
      | UTF8=ONLY              | [RFC6855]           |
      | UTF8=USER (OBSOLETE)   | [RFC5738] [RFC6855] |
      +------------------------+---------------------+

10. Security Considerations

   The security considerations of UTF-8 [RFC3629] and SASLprep [RFC4013]
   apply to this specification, particularly with respect to use of
   UTF-8 in usernames and passwords. Otherwise, this is not believed to
   alter the security considerations of IMAP.

   Special considerations, some of them with security implications,
   occur if a server that conforms to this specification is accessed by
   a client that does not, as well as in some more complex situations in
   which a given message is accessed by multiple clients that might use
   different protocols and/or support different capabilities. Those
   issues are discussed in Section 7.

11. References

11.1. Normative References

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC3501]  Crispin, M., "INTERNET MESSAGE ACCESS PROTOCOL - VERSION
              4rev1", RFC 3501, March 2003.

   [RFC3629]  Yergeau, F., "UTF-8, a transformation format of ISO
              10646", STD 63, RFC 3629, November 2003.

   [RFC4013]  Zeilenga, K., "SASLprep: Stringprep Profile for User Names
              and Passwords", RFC 4013, February 2005.

   [RFC4466]  Melnikov, A. and C. Daboo, "Collected Extensions to IMAP4
              ABNF", RFC 4466, April 2006.

   [RFC4469]  Resnick, P., "Internet Message Access Protocol (IMAP)
              CATENATE Extension", RFC 4469, April 2006.

   [RFC5161]  Gulbrandsen, A. and A. Melnikov, "The IMAP ENABLE
              Extension", RFC 5161, March 2008.

   [RFC5198]  Klensin, J. and M. Padlipsky, "Unicode Format for Network
              Interchange", RFC 5198, March 2008.

   [RFC5234]  Crocker, D. and P. Overell, "Augmented BNF for Syntax
              Specifications: ABNF", STD 68, RFC 5234, January 2008.

   [RFC5258]  Leiba, B. and A. Melnikov, "Internet Message Access
              Protocol version 4 - LIST Command Extensions", RFC 5258,
              June 2008.

   [RFC5322]  Resnick, P., Ed., "Internet Message Format", RFC 5322,
              October 2008.

   [RFC6530]  Klensin, J. and Y. Ko, "Overview and Framework for
              Internationalized Email", RFC 6530, February 2012.

   [RFC6532]  Yang, A., Steele, S., and N. Freed, "Internationalized
              Email Headers", RFC 6532, February 2012.

   [RFC6857]  Fujiwara, K., "Post-Delivery Message Downgrading for
              Internationalized Email Messages", RFC 6857, March 2013.

   [RFC6858]  Gulbrandsen, A., "Simplified POP and IMAP Downgrading for
              Internationalized Email", RFC 6858, March 2013.

11.2. Informative References

   [RFC2088]  Myers, J., "IMAP4 non-synchronizing literals", RFC 2088,
              January 1997.

   [RFC2342]  Gahrns, M. and C. Newman, "IMAP4 Namespace", RFC 2342,
              May 1998.

   [RFC4314]  Melnikov, A., "IMAP4 Access Control List (ACL) Extension",
              RFC 4314, December 2005.

   [RFC5530]  Gulbrandsen, A., "IMAP Response Codes", RFC 5530,
              May 2009.

   [RFC5738]  Resnick, P. and C. Newman, "IMAP Support for UTF-8",
              RFC 5738, March 2010.

Appendix A. Design Rationale

   This non-normative section discusses the reasons behind some of the
   design choices in this specification.

   The "UTF8=ONLY" mechanism simplifies diagnosis of interoperability
   problems when legacy support goes away. In the situation where
   backwards compatibility is not working anyway, the non-conforming
   "just-send-UTF-8 IMAP" has the advantage that it might work with some
   legacy clients. However, the difficulty of diagnosing
   interoperability problems caused by a "just-send-UTF-8 IMAP"
   mechanism is the reason the "UTF8=ONLY" capability mechanism was
   chosen.

Appendix B. Acknowledgments

   The authors wish to thank the participants of the EAI working group
   for their contributions to this document, with particular thanks to
   Harald Alvestrand, David Black, Randall Gellens, Arnt Gulbrandsen,
   Kari Hurtta, John Klensin, Xiaodong Lee, Charles Lindsey, Alexey
   Melnikov, Subramanian Moonesamy, Shawn Steele, Daniel Taharlev, and
   Joseph Yee for their specific contributions to the discussion.

Authors' Addresses

   Pete Resnick (editor)
   Qualcomm Incorporated
   5775 Morehouse Drive
   San Diego, CA 92121-1714
   USA

   Phone: +1 858 651 4478
   EMail: presnick@qti.qualcomm.com


   Chris Newman (editor)
   Oracle
   800 Royal Oaks
   Monrovia, CA 91016
   USA

   Phone:
   EMail: chris.newman@oracle.com


   Sean Shen (editor)
   CNNIC
   No.4 South 4th Zhongguancun Street
   Beijing, 100190
   China

   Phone: +86 10-58813038
   EMail: shenshuo@cnnic.cn