Internet Engineering Task Force (IETF)                        P. Resnick
Request for Comments: 9755                                      Episteme
Category: Standards Track                                          CNNIC
ISSN: 2070-1721                                           A. Gulbrandsen

                         IMAP Support for UTF-8

   This specification extends the Internet Message Access Protocol,
   specifically IMAP4rev1 (RFC 3501), to support UTF-8 encoded
   international characters in user names, mail addresses, and message
   headers. This specification replaces RFC 6855. This specification
   does not extend IMAP4rev2 (RFC 9051), since that protocol includes
   everything in this extension.

   This is an Internet Standards Track document.

   This document is a product of the Internet Engineering Task Force
   (IETF). It represents the consensus of the IETF community. It has
   received public review and has been approved for publication by the
   Internet Engineering Steering Group (IESG). Further information on
   Internet Standards is available in Section 2 of RFC 7841.

   Information about the current status of this document, any errata,
   and how to provide feedback on it may be obtained at
   https://www.rfc-editor.org/info/rfc9755.

   Copyright (c) 2025 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (https://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document. Code Components extracted from this document must
   include Revised BSD License text as described in Section 4.e of the
   Trust Legal Provisions and are provided without warranty as described
   in the Revised BSD License.

   2. Requirements Language
   3. "UTF8=ACCEPT" IMAP Capability and UTF-8 in IMAP Quoted-Strings
   5. "LOGIN" Command and UTF-8
   6. FETCH BODYSTRUCTURE and message/global
   7. "UTF8=ONLY" Capability
   8. Dealing with Legacy Clients
   9. Issues with UTF-8 Header Mailstore
   10. IANA Considerations
   11. Security Considerations
     12.1. Normative References
     12.2. Informative References
   Appendix A. Design Rationale
   Appendix B. Changes Since RFC 6855
     B.2. FETCH BODYSTRUCTURE

   This specification forms part of the Email Address
   Internationalization protocols described in the Email Address
   Internationalization Framework document [RFC6530]. It extends IMAP
   [RFC3501] to permit UTF-8 [RFC3629] in headers, as described in
   "Internationalized Email Headers" [RFC6532]. It also adds a
   mechanism to support mailbox names using the UTF-8 charset. This
   specification creates two new IMAP capabilities to allow servers to
   advertise these new extensions.

   This specification assumes that the IMAP server will be operating in
   a fully internationalized environment, i.e., one in which all clients
   accessing the server will be able to accept non-ASCII message header
   fields and other information, as specified in Section 3. At least
   during a transition period, that assumption will not be realistic for
   many environments; the issues involved are discussed in Section 8.

   This specification replaces an earlier, experimental approach to the
   same problem; see [RFC5738] as well as [RFC6855].

2. Requirements Language

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
   "OPTIONAL" in this document are to be interpreted as described in
   BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all
   capitals, as shown here.

3. "UTF8=ACCEPT" IMAP Capability and UTF-8 in IMAP Quoted-Strings

   The "UTF8=ACCEPT" capability indicates that the server supports the
   ability to open mailboxes containing internationalized messages with
   the "SELECT" and "EXAMINE" commands, and the server can provide UTF-8
   responses to the "LIST" and "LSUB" commands. This capability also
   affects other IMAP extensions that can return mailbox names or their
   prefixes, such as NAMESPACE [RFC2342] and ACL [RFC4314].

   The "UTF8=ONLY" capability, described in Section 7, implies the
   "UTF8=ACCEPT" capability. A server is said to support "UTF8=ACCEPT"
   if it advertises either "UTF8=ACCEPT" or "UTF8=ONLY".

   A client MUST use the "ENABLE" command [RFC5161] with the
   "UTF8=ACCEPT" option (defined in Section 4 below) to indicate to the
   server that the client accepts UTF-8 in quoted-strings and supports
   the "UTF8=ACCEPT" extension. The "ENABLE UTF8=ACCEPT" command is
   only valid in the authenticated state.

   The IMAP base specification [RFC3501] forbids the use of 8-bit
   characters in atoms or quoted-strings. Thus, a UTF-8 string can only
   be sent as a literal. This can be inconvenient from a coding
   standpoint, and unless the server offers IMAP non-synchronizing
   literals [RFC7888], this requires an extra round trip for each UTF-8
   string sent by the client. When the IMAP server supports
   "UTF8=ACCEPT", it supports UTF-8 in quoted-strings with the following
   ABNF syntax [RFC5234]:

     quoted        =/ DQUOTE *uQUOTED-CHAR DQUOTE
            ; QUOTED-CHAR is not modified, as it will affect
            ; other RFC 3501 ABNF non-terminals.

     uQUOTED-CHAR  = QUOTED-CHAR / UTF8-2 / UTF8-3 / UTF8-4

     UTF8-2        = <Defined in Section 4 of RFC 3629>

     UTF8-3        = <Defined in Section 4 of RFC 3629>

     UTF8-4        = <Defined in Section 4 of RFC 3629>
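
   As a minimal client-side sketch of the extended quoted syntax, the
   Python function below builds a quoted string that may contain UTF-8.
   The function name quote_utf8 is hypothetical and not part of this
   specification. It assumes the caller has already confirmed that the
   server supports "UTF8=ACCEPT"; values containing CR, LF, or NUL are
   refused because RFC 3501 does not allow them in a quoted string.

```python
def quote_utf8(value: str) -> bytes:
    """Return value as an IMAP quoted string that may carry UTF-8.

    Hypothetical helper: only valid once the server is known to
    support UTF8=ACCEPT.
    """
    out = bytearray(b'"')
    for ch in value:
        if ch in ("\r", "\n", "\x00"):
            # CR, LF, and NUL cannot appear in a quoted string at all;
            # such values still have to be sent as literals.
            raise ValueError("value must be sent as a literal")
        if ch in ('"', "\\"):
            out += b"\\"  # quoted-specials are backslash-escaped
        # ASCII stays one octet; other characters become UTF8-2/3/4.
        out += ch.encode("utf-8")
    out += b'"'
    return bytes(out)
```

   A client would fall back to a literal whenever the function raises
   ValueError.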

   When this extended quoting mechanism is used by the client, the
   server MUST reject, with a "BAD" response, any octet sequences with
   the high bit set that fail to comply with the formal syntax
   requirements of UTF-8 [RFC3629]. The IMAP server MUST NOT send UTF-8
   in quoted-strings to the client unless the client has indicated
   support for that syntax by using the "ENABLE UTF8=ACCEPT" command.
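
   The server-side check described above can be sketched as follows.
   This is an illustration only: check_quoted_octets is a hypothetical
   name, the input is assumed to be the octets found between the two
   DQUOTE characters, and the sketch relies on Python's strict UTF-8
   decoder to apply the RFC 3629 syntax rules. It does not check the
   ASCII-level QUOTED-CHAR rules of RFC 3501.

```python
def check_quoted_octets(octets: bytes) -> str:
    """Return "OK" if octets are well-formed UTF-8, otherwise "BAD"."""
    try:
        # Strict decoding rejects overlong forms, encoded surrogates,
        # truncated sequences, and code points above U+10FFFF.
        octets.decode("utf-8")
    except UnicodeDecodeError:
        return "BAD"
    return "OK"
```

   Pure ASCII input always passes this check, since it contains no
   octets with the high bit set.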

   If the server supports "UTF8=ACCEPT", the client MAY use extended
   quoted syntax with any IMAP argument that permits a string (including
   astring and nstring). However, if characters outside the US-ASCII
   repertoire are used in an inappropriate place, the results would be
   the same as if other syntactically valid but semantically invalid
   characters were used. Specific cases where UTF-8 characters are
   permitted or not permitted are described in the following paragraphs.

   All IMAP servers that support "UTF8=ACCEPT" SHOULD accept UTF-8 in
   mailbox names, and those that also support the Mailbox International
   Naming Convention described in [RFC3501], Section 5.1.3, MUST accept
   UTF-8 in mailbox names and convert them to the appropriate internal
   format. Mailbox names MUST comply with the Net-Unicode Definition
   ([RFC5198], Section 2) with the specific exception that they MUST NOT
   contain control characters (U+0000 - U+001F and U+0080 - U+009F), a
   delete character (U+007F), a line separator (U+2028), or a paragraph
   separator (U+2029).
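
   The character restrictions on mailbox names can be sketched as a
   small Python check. The function name mailbox_name_problems is
   hypothetical. The NFC test covers only one aspect of Net-Unicode
   [RFC5198] and is an assumption about how an implementation might
   approximate that definition; the code point ranges are taken from
   the text above.

```python
import unicodedata

# Single code points excluded by name: DELETE, LINE SEPARATOR,
# PARAGRAPH SEPARATOR.
_FORBIDDEN = {0x7F, 0x2028, 0x2029}


def mailbox_name_problems(name: str) -> list:
    """Return a list of human-readable problems; empty means no findings."""
    problems = []
    if not unicodedata.is_normalized("NFC", name):
        problems.append("not NFC-normalized")
    for ch in name:
        cp = ord(ch)
        if cp <= 0x1F or 0x80 <= cp <= 0x9F:
            problems.append(f"control character U+{cp:04X}")
        elif cp in _FORBIDDEN:
            problems.append(f"forbidden character U+{cp:04X}")
    return problems
```

   An empty result means only that none of these particular checks
   failed; it is not a complete Net-Unicode validation.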

   Once an IMAP client has enabled UTF-8 support with the "ENABLE
   UTF8=ACCEPT" command, it MUST NOT issue a "SEARCH" command that
   contains a charset specification. If an IMAP server receives such a
   "SEARCH" command in that situation, it SHOULD reject the command with
   a "BAD" response (due to the conflicting charset labels). This also
   applies to any IMAP command or extension that includes an optional
   charset label and associated strings in the command arguments,
   including the MULTISEARCH extension. For commands with a mandatory
   charset field, such as SORT and THREAD, servers SHOULD reject charset
   values other than UTF-8 with a "BAD" response (due to the conflicting
   charset labels).
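
   The charset rules for SEARCH, SORT, and THREAD could be checked on
   the server along the following lines. This is a sketch under stated
   assumptions: charset_verdict is a hypothetical name, the command has
   already been parsed into a name and an optional charset label, and
   the server chooses to follow the SHOULD-level guidance by rejecting.

```python
from typing import Optional


def charset_verdict(command: str, charset: Optional[str],
                    utf8_enabled: bool) -> str:
    """Return "BAD" when the charset argument conflicts with UTF8=ACCEPT."""
    if not utf8_enabled:
        return "OK"  # the rule only applies after ENABLE UTF8=ACCEPT
    name = command.upper()
    if name in ("SORT", "THREAD"):
        # Mandatory charset field: only UTF-8 is acceptable.
        ok = charset is not None and charset.upper() == "UTF-8"
        return "OK" if ok else "BAD"
    # SEARCH and similar commands: the optional charset must be absent.
    return "BAD" if charset is not None else "OK"
```

   Here "OK" means only that this particular check passed; the command
   may still fail for other reasons.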

   If the server supports "UTF8=ACCEPT", then the server accepts UTF-8
   headers in the "APPEND" command message argument.

   If an IMAP server supports "UTF8=ACCEPT" and the IMAP client has not
   issued the "ENABLE UTF8=ACCEPT" command, the server MUST reject, with
   a "NO" response, an "APPEND" command that includes any 8-bit
   character in message header fields.
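
   A minimal sketch of the APPEND check follows. The name
   append_verdict is hypothetical, and the sketch assumes the header
   section ends at the first empty line (CRLF CRLF) of the appended
   message; if no empty line is present, the whole message is treated
   as header.

```python
def append_verdict(message: bytes, utf8_enabled: bool) -> str:
    """Return "NO" for 8-bit header octets without ENABLE UTF8=ACCEPT."""
    # Everything before the first empty line is the header section.
    header, _, _ = message.partition(b"\r\n\r\n")
    if not utf8_enabled and any(octet >= 0x80 for octet in header):
        return "NO"
    return "OK"
```

   Note that 8-bit octets in the body alone do not trigger the
   rejection, since the rule above concerns header fields only.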

5. "LOGIN" Command and UTF-8

   This specification does not extend the IMAP "LOGIN" command [RFC3501]
   to support UTF-8 usernames and passwords. Whenever a client needs to
   use UTF-8 usernames or passwords, it MUST use the IMAP "AUTHENTICATE"
   command, which is already capable of passing UTF-8 usernames and
   passwords.
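
   As an illustration, a client might decide between "LOGIN" and
   "AUTHENTICATE" as sketched below. The function names are
   hypothetical, and the choice of the SASL PLAIN mechanism is an
   assumption made for the example; this specification does not name a
   mechanism. The sketch also omits the PRECIS preparation of
   usernames and passwords described in [RFC8265].

```python
import base64


def needs_authenticate(user: str, password: str) -> bool:
    """True when either credential contains non-ASCII characters."""
    return not (user.isascii() and password.isascii())


def sasl_plain_response(user: str, password: str) -> str:
    """Build a base64 SASL PLAIN response with an empty authzid."""
    # PLAIN message layout: authzid NUL authcid NUL passwd, in UTF-8.
    raw = b"\x00" + user.encode("utf-8") + b"\x00" + password.encode("utf-8")
    return base64.b64encode(raw).decode("ascii")
```

   When needs_authenticate returns True, the client would send
   "AUTHENTICATE" rather than "LOGIN".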

   Although using the IMAP "AUTHENTICATE" command in this way makes it
   syntactically legal to have a UTF-8 username or password, there is no
   guarantee that the user provisioning system utilized by the IMAP
   server will allow such identities. This is an implementation
   decision and may depend on what identity system the IMAP server is
   configured to use.

6. FETCH BODYSTRUCTURE and message/global

   [RFC9051], Section 7.5.2 treats message/global like message/rfc822,
   which means that for some messages, the response to FETCH
   BODYSTRUCTURE varies depending on whether IMAP4rev1 or IMAP4rev2 is
   in use.

   [RFC6855] does not extend [RFC3501] in this respect. This document
   extends the media-message ABNF production to match [RFC9051].

     media-message = DQUOTE "MESSAGE" DQUOTE SP
                     DQUOTE ("RFC822" / "GLOBAL") DQUOTE

   When IMAP4rev1 and UTF8=ACCEPT have been enabled, the server MAY
   treat message/global like message/rfc822 when computing the body
   structure, but MAY also treat it as described in [RFC3501]. Clients
   MUST accept both.

   When IMAP4rev2 and UTF8=ACCEPT are in use, the server MUST behave as
   described in [RFC9051].
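
   A client that has to cope with both varieties might distinguish
   them as sketched below. The sketch assumes a hypothetical parser
   that has already turned a single-part body structure into a Python
   list, with parenthesized lists as nested lists and NIL as None. It
   relies on the observation that, after the seven common body fields,
   the message form continues with an envelope and a nested body
   (both lists), whereas the basic form continues with extension data
   whose first item is a string or NIL.

```python
def has_message_fields(part: list) -> bool:
    """True if a message/rfc822 or message/global part is in message form.

    Message form: the 7 common body fields are followed by an envelope
    (list), a nested body structure (list), and a line count.
    """
    if len(part) < 2 or str(part[0]).upper() != "MESSAGE":
        return False
    if str(part[1]).upper() not in ("RFC822", "GLOBAL"):
        return False
    return (len(part) >= 10
            and isinstance(part[7], list)
            and isinstance(part[8], list))
```

   When the function returns False for a message/global part, the
   client would handle the part like any other basic body part.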

7. "UTF8=ONLY" Capability

   The "UTF8=ONLY" capability indicates that the server supports
   "UTF8=ACCEPT" (see Section 3) and that it requires support for UTF-8
   from clients. In particular, this means that the server will send
   UTF-8 in quoted-strings, and it will not accept the older
   international mailbox name convention (modified UTF-7 [RFC3501]).
   Because these are incompatible changes to IMAP, explicit server
   announcement and client confirmation are necessary: clients MUST use
   the "ENABLE UTF8=ACCEPT" command before using this server. A server
   that advertises "UTF8=ONLY" will reject, with a "NO [CANNOT]"
   response [RFC5530], any command that might require UTF-8 support and
   is not preceded by an "ENABLE UTF8=ACCEPT" command.

   IMAP clients that find support for a server that announces
   "UTF8=ONLY" problematic are encouraged to at least detect the
   announcement and provide an informative error message to the end
   user.

   Because the "UTF8=ONLY" server capability includes support for
   "UTF8=ACCEPT", the capability string will include, at most, one of
   those and never both. For the client, "ENABLE UTF8=ACCEPT" is always
   used -- never "ENABLE UTF8=ONLY".
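
   The capability handling described in this section can be sketched
   for a client as follows. The function names and the return values
   "only" and "accept" are hypothetical; the input is assumed to be
   the list of capability names taken from the server's CAPABILITY
   response.

```python
from typing import Iterable, Optional


def utf8_mode(capabilities: Iterable[str]) -> Optional[str]:
    """Classify the server as "only", "accept", or None (no UTF-8 support)."""
    caps = {cap.upper() for cap in capabilities}
    if "UTF8=ONLY" in caps:
        return "only"  # implies UTF8=ACCEPT; ENABLE is required
    if "UTF8=ACCEPT" in caps:
        return "accept"
    return None


def enable_command(tag: str, capabilities: Iterable[str]) -> Optional[bytes]:
    """Return the ENABLE command to send, or None if UTF-8 is unsupported."""
    if utf8_mode(capabilities) is None:
        return None
    # The argument is always UTF8=ACCEPT, never UTF8=ONLY.
    return f"{tag} ENABLE UTF8=ACCEPT\r\n".encode("ascii")
```

   A legacy client that cannot support UTF-8 could use utf8_mode to
   detect "only" and show an informative error instead of proceeding.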

8. Dealing with Legacy Clients

   In most situations, it will be difficult or impossible for the
   implementer or operator of an IMAP (or POP) server to know whether
   all of the clients that might access it, or the associated mail store
   more generally, will be able to support the facilities defined in
   this document. In almost all cases, servers that conform to this
   specification will have to be prepared to deal with clients that do
   not enable the relevant capabilities. Unfortunately, there is no
   completely satisfactory way to do so other than for systems that wish
   to receive email that requires SMTPUTF8 capabilities to be sure that
   all components of those systems -- including IMAP and other clients
   selected by users -- are upgraded appropriately.

   When a message that requires SMTPUTF8 is encountered and the client
   does not enable UTF-8 capability, choices available to the server
   include hiding the problematic message(s), creating in-band or
   out-of-band notifications or error messages, or somehow trying to
   create a surrogate of the message with the intention of providing
   useful information to that client about what has occurred. Such
   surrogate messages cannot be actual substitutes for the original
   message: they will almost always be impossible to reply to (either at
   all or without loss of information) and the new header fields or
   specialized constructs for server-client communications may go beyond
   the requirements of current email specifications (e.g., [RFC5322]).
   Consequently, such messages may confuse some legacy mail user agents
   (including IMAP clients) or not provide expected information to
   users. There are also trade-offs in constructing surrogates of the
   original message between accepting complexity and additional
   computation costs in order to try to preserve as much information as
   possible (for example, in "Post-Delivery Message Downgrading for
   Internationalized Email Messages" [RFC6857]) and trying to minimize
   those costs while still providing useful information (for example, in
   "Simplified POP and IMAP Downgrading for Internationalized Email"
   [RFC6858]).

   Implementations that choose to perform downgrading SHOULD use one of
   the standardized algorithms provided in [RFC6857] or [RFC6858].
   Getting downgrade algorithms right, and minimizing the risk of
   operational problems and harm to the email system, is tricky and
   requires careful engineering. These two algorithms are well
   understood and carefully designed.

   Because such messages are really surrogates of the original ones, not
   really "downgraded" ones (although that terminology is often used for
   convenience), they inevitably have relationships to the originals
   that the IMAP specification [RFC3501] did not anticipate. This
   brings up two concerns in particular: First, digital signatures
   computed over and intended for the original message will often not be
   applicable to the surrogate message, and will often fail signature
   verification. (It will be possible for some digital signatures to be
   verified, if they cover only parts of the original message that are
   not affected in the creation of the surrogate.) Second, servers that
   may be accessed by the same user with different clients or methods
   (e.g., POP or webmail systems in addition to IMAP or IMAP clients
   with different capabilities) will need to exert extreme care to be
   sure that UIDVALIDITY [RFC3501] behaves as the user would expect.
   Those issues may be especially sensitive if the server caches the
   surrogate message or computes and stores it when the message arrives
   with the intent of making either form available depending on client
   capabilities. Additionally, in order to cope with the case when a
   server compliant with this extension returns the same UIDVALIDITY to
   both legacy and "UTF8=ACCEPT"-aware clients, a client upgraded from
   being non-"UTF8=ACCEPT"-aware MUST discard its cache of messages
   downloaded from the server.
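
   One way a client could honor the cache requirement above is to
   record, alongside UIDVALIDITY, whether the cached messages were
   fetched with "UTF8=ACCEPT" enabled. The sketch below is only an
   illustration of that idea: MailboxCache, its fields, and reconcile
   are hypothetical names, and the text above does not prescribe any
   particular cache design. Only the upgrade direction is handled,
   since that is the case the requirement covers.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class MailboxCache:
    uidvalidity: int
    fetched_with_utf8: bool          # was UTF8=ACCEPT enabled when filling?
    messages: dict = field(default_factory=dict)  # UID -> raw message


def reconcile(cache: Optional[MailboxCache], uidvalidity: int,
              utf8_enabled: bool) -> MailboxCache:
    """Return a usable cache, discarding the old one when it is stale."""
    stale = (
        cache is None
        or cache.uidvalidity != uidvalidity
        # Upgraded client: cached copies may be surrogates, so drop them
        # even though the server reports the same UIDVALIDITY.
        or (utf8_enabled and not cache.fetched_with_utf8)
    )
    if stale:
        return MailboxCache(uidvalidity, utf8_enabled)
    return cache
```

   A client would call reconcile after each SELECT or EXAMINE, passing
   the UIDVALIDITY value reported by the server.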

   The best (or "least bad") approach for any given environment will
   depend on local conditions, local assumptions about user behavior,
   the degree of control the server operator has over client usage and
   upgrading, the options that are actually available, and so on. It is
   impossible, at least at the time of publication of this
   specification, to give good advice that will apply to all situations,
   or even particular profiles of situations, other than "upgrade legacy
   clients as soon as possible".

9. Issues with UTF-8 Header Mailstore

   When an IMAP server uses a mailbox format that supports UTF-8 headers
   and it permits selection or examination of that mailbox without
   issuing "ENABLE UTF8=ACCEPT" first, it is the responsibility of the
   server to comply with the IMAP base specification [RFC3501] and the
   Internet Message Format [RFC5322] with respect to all header
   information transmitted over the wire. The issue of handling
   messages containing non-ASCII characters in legacy environments is
   discussed in Section 8.

10. IANA Considerations

   The "IMAP Capabilities" registry contained a number of references to
   [RFC6855]. IANA has updated them to point to this document instead.
   The affected references are:

   * UTF8=ALL (OBSOLETE)

   * UTF8=APPEND (OBSOLETE)

   * UTF8=USER (OBSOLETE)

11. Security Considerations

   The security considerations of UTF-8 [RFC3629] and PRECIS Usernames
   and Passwords [RFC8265] apply to this specification, particularly
   with respect to use of UTF-8 in usernames and passwords. Otherwise,
   this is not believed to alter the security considerations of IMAP.

   Special considerations, some of them with security implications,
   occur if a server that conforms to this specification is accessed by
   a client that does not, as well as in some more complex situations in
   which a given message is accessed by multiple clients that might use
   different protocols and/or support different capabilities. Those
   issues are discussed in Section 8.

12.1. Normative References

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119,
              DOI 10.17487/RFC2119, March 1997,
              <https://www.rfc-editor.org/info/rfc2119>.

   [RFC3501]  Crispin, M., "INTERNET MESSAGE ACCESS PROTOCOL - VERSION
              4rev1", RFC 3501, DOI 10.17487/RFC3501, March 2003,
              <https://www.rfc-editor.org/info/rfc3501>.

   [RFC3629]  Yergeau, F., "UTF-8, a transformation format of ISO
              10646", STD 63, RFC 3629, DOI 10.17487/RFC3629, November
              2003, <https://www.rfc-editor.org/info/rfc3629>.

   [RFC5161]  Gulbrandsen, A., Ed. and A. Melnikov, Ed., "The IMAP
              ENABLE Extension", RFC 5161, DOI 10.17487/RFC5161, March
              2008, <https://www.rfc-editor.org/info/rfc5161>.

   [RFC5198]  Klensin, J. and M. Padlipsky, "Unicode Format for Network
              Interchange", RFC 5198, DOI 10.17487/RFC5198, March 2008,
              <https://www.rfc-editor.org/info/rfc5198>.

   [RFC5234]  Crocker, D., Ed. and P. Overell, "Augmented BNF for Syntax
              Specifications: ABNF", STD 68, RFC 5234,
              DOI 10.17487/RFC5234, January 2008,
              <https://www.rfc-editor.org/info/rfc5234>.

   [RFC5322]  Resnick, P., Ed., "Internet Message Format", RFC 5322,
              DOI 10.17487/RFC5322, October 2008,
              <https://www.rfc-editor.org/info/rfc5322>.

   [RFC6530]  Klensin, J. and Y. Ko, "Overview and Framework for
              Internationalized Email", RFC 6530, DOI 10.17487/RFC6530,
              February 2012, <https://www.rfc-editor.org/info/rfc6530>.

   [RFC6532]  Yang, A., Steele, S., and N. Freed, "Internationalized
              Email Headers", RFC 6532, DOI 10.17487/RFC6532, February
              2012, <https://www.rfc-editor.org/info/rfc6532>.

   [RFC8174]  Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC
              2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174,
              May 2017, <https://www.rfc-editor.org/info/rfc8174>.

   [RFC8265]  Saint-Andre, P. and A. Melnikov, "Preparation,
              Enforcement, and Comparison of Internationalized Strings
              Representing Usernames and Passwords", RFC 8265,
              DOI 10.17487/RFC8265, October 2017,
              <https://www.rfc-editor.org/info/rfc8265>.

12.2. Informative References

   [RFC2342]  Gahrns, M. and C. Newman, "IMAP4 Namespace", RFC 2342,
              DOI 10.17487/RFC2342, May 1998,
              <https://www.rfc-editor.org/info/rfc2342>.

   [RFC4314]  Melnikov, A., "IMAP4 Access Control List (ACL) Extension",
              RFC 4314, DOI 10.17487/RFC4314, December 2005,
              <https://www.rfc-editor.org/info/rfc4314>.

   [RFC5530]  Gulbrandsen, A., "IMAP Response Codes", RFC 5530,
              DOI 10.17487/RFC5530, May 2009,
              <https://www.rfc-editor.org/info/rfc5530>.

   [RFC5738]  Resnick, P. and C. Newman, "IMAP Support for UTF-8",
              RFC 5738, DOI 10.17487/RFC5738, March 2010,
              <https://www.rfc-editor.org/info/rfc5738>.

   [RFC6855]  Resnick, P., Ed., Newman, C., Ed., and S. Shen, Ed., "IMAP
              Support for UTF-8", RFC 6855, DOI 10.17487/RFC6855, March
              2013, <https://www.rfc-editor.org/info/rfc6855>.

   [RFC6857]  Fujiwara, K., "Post-Delivery Message Downgrading for
              Internationalized Email Messages", RFC 6857,
              DOI 10.17487/RFC6857, March 2013,
              <https://www.rfc-editor.org/info/rfc6857>.

   [RFC6858]  Gulbrandsen, A., "Simplified POP and IMAP Downgrading for
              Internationalized Email", RFC 6858, DOI 10.17487/RFC6858,
              March 2013, <https://www.rfc-editor.org/info/rfc6858>.

   [RFC7888]  Melnikov, A., Ed., "IMAP4 Non-synchronizing Literals",
              RFC 7888, DOI 10.17487/RFC7888, May 2016,
              <https://www.rfc-editor.org/info/rfc7888>.

   [RFC8620]  Jenkins, N. and C. Newman, "The JSON Meta Application
              Protocol (JMAP)", RFC 8620, DOI 10.17487/RFC8620, July
              2019, <https://www.rfc-editor.org/info/rfc8620>.

   [RFC9051]  Melnikov, A., Ed. and B. Leiba, Ed., "Internet Message
              Access Protocol (IMAP) - Version 4rev2", RFC 9051,
              DOI 10.17487/RFC9051, August 2021,
              <https://www.rfc-editor.org/info/rfc9051>.

Appendix A. Design Rationale

   This non-normative section discusses the reasons behind some of the
   design choices in this specification.

   The "UTF8=ONLY" mechanism simplifies diagnosis of interoperability
   problems when legacy support goes away. In the situation where
   backwards compatibility is not working anyway, the non-conforming
   "just-send-UTF-8 IMAP" has the advantage that it might work with some
   legacy clients. However, the difficulty of diagnosing
   interoperability problems caused by a "just-send-UTF-8 IMAP"
   mechanism is the reason the "UTF8=ONLY" capability mechanism was
   chosen.

Appendix B. Changes Since RFC 6855

   This non-normative section describes the changes made since
   [RFC6855].

   This document removes APPEND's UTF8 data item, making the
   UTF8-related syntax compatible with IMAP4rev2 as defined by [RFC9051]
   and making it simpler for clients to support IMAP4rev1 and IMAP4rev2

   IMAP4rev2 [RFC9051] provides roughly the same abilities as [RFC6855]
   but does not include APPEND's UTF8 item. None of [RFC6855],
   IMAP4rev2, or JMAP [RFC8620] specify any way to learn whether a
   particular message was stored using the UTF8 data item. As of today,
   an IMAP client cannot learn whether a particular message was stored
   using the UTF8 data item, nor would it be able to trust that
   information even if IMAP4rev1 and 2 were extended to provide that
   information.

   In July 2023, one of the authors found only one IMAP client that uses
   the UTF8 data item, and that client uses it incorrectly (it sends the
   data item for all messages if the server supports UTF8=ACCEPT,
   without regard to whether a particular message includes any UTF8 at
   all).

   For these reasons, it was judged best to revise [RFC6855] and adopt
   the same syntax as IMAP4rev2.

B.2. FETCH BODYSTRUCTURE

   [RFC6532] defines a new media type, message/global, which is
   substantially like message/rfc822 except that the submessage may
   (also) use the syntax defined in [RFC6532]. [RFC3501] and [RFC9051]
   define a FETCH item to return the MIME structure of a message, which
   servers usually compute once and store.

   None of the RFCs point out to implementers that IMAP4rev1 and
   IMAP4rev2 are slightly different, so storing the BODYSTRUCTURE in the
   way servers and clients often do can easily lead to problems.

   This document makes the syntax optional, making it simple for server
   authors to implement this extension correctly. This implies that
   clients need to parse and handle both varieties, which they need to
   do anyway if they want to support both IMAP4rev1 and IMAP4rev2.

   This document is an almost unchanged copy of [RFC6855], which was
   written by Pete Resnick, Chris Newman, and Sean Shen. Sean has since
   changed jobs and the current authors do not have a new email address
   for him. We cannot be sure that he would approve of the changes in
   this document, so we did not list him as author, but do gratefully
   acknowledge his work on [RFC6855]. Jiankang Yao replaces him.

   The next paragraph is a straight copy of the acknowledgments in
   [RFC6855]:

   |  The authors wish to thank the participants of the EAI working
   |  group for their contributions to this document, with particular
   |  thanks to Harald Alvestrand, David Black, Randall Gellens, Arnt
   |  Gulbrandsen, Kari Hurtta, John Klensin, Xiaodong Lee, Charles
   |  Lindsey, Alexey Melnikov, Subramanian Moonesamy, Shawn Steele,
   |  Daniel Taharlev, and Joseph Yee for their specific contributions

   Many of them also reread the document during this revision.

   Episteme Technology Consulting LLC
   503 West Indiana Avenue
   Urbana, IL 61801-4941
   United States of America
   Email: resnick@episteme.net

   No.4 South 4th Zhongguancun Street
   Email: yaojk@cnnic.cn

   6 Rond Point Schumann, Bd. 1
   Email: arnt@gulbrandsen.priv.no