7Network Working Group M. Crispin
8Request for Comments: 5256 Panda Programming
9Category: Standards Track K. Murchison
10 Carnegie Mellon University
11 June 2008
12
13
14 Internet Message Access Protocol - SORT and THREAD Extensions
15
16Status of This Memo
17
18 This document specifies an Internet standards track protocol for the
19 Internet community, and requests discussion and suggestions for
20 improvements. Please refer to the current edition of the "Internet
21 Official Protocol Standards" (STD 1) for the standardization state
22 and status of this protocol. Distribution of this memo is unlimited.
23
24Abstract
25
26 This document describes the base-level server-based sorting and
27 threading extensions to the IMAP protocol. These extensions provide
28 substantial performance improvements for IMAP clients that offer
29 sorted and threaded views.
30
311. Introduction
32
33 The SORT and THREAD extensions to the [IMAP] protocol provide a means
34 of server-based sorting and threading of messages, without requiring
35 that the client download the necessary data to do so itself. This is
36 particularly useful for online clients as described in [IMAP-MODELS].
37
38 A server that supports the base-level SORT extension indicates this
39 with a capability name which starts with "SORT". Future, upwards-
40 compatible extensions to the SORT extension will all start with
41 "SORT", indicating support for this base level.
42
43 A server that supports the THREAD extension indicates this with one
44 or more capability names consisting of "THREAD=" followed by a
45 supported threading algorithm name as described in this document.
46 This provides for future upwards-compatible extensions.
47
48 A server that implements the SORT and/or THREAD extensions MUST
49 collate strings in accordance with the requirements of I18NLEVEL=1,
50 as described in [IMAP-I18N], and SHOULD implement and advertise the
51 I18NLEVEL=1 extension. Alternatively, a server MAY implement
52 I18NLEVEL=2 (or higher) and comply with the rules of that level.
53
54
55
56
57
58Crispin & Murchison Standards Track [Page 1]
59
60RFC 5256 IMAP Sort June 2008
61
62
63 Discussion: The SORT and THREAD extensions predate [IMAP-I18N] by
64 several years. At the time of this writing, all known server
65 implementations of SORT and THREAD comply with the rules of
66 I18NLEVEL=1, but do not necessarily advertise it. As discussed in
67 [IMAP-I18N] section 4.5, all server implementations should
68 eventually be updated to comply with the I18NLEVEL=2 extension.
69
70 Historical note: The REFERENCES threading algorithm is based on the
71 [THREADING] algorithm written and used in "Netscape Mail and News"
72 versions 2.0 through 3.0.
73
742. Terminology
75
76 The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
77 "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
78 document are to be interpreted as described in [KEYWORDS].
79
80 The word "can" (not "may") is used to refer to a possible
81 circumstance or situation, as opposed to an optional facility of the
82 protocol.
83
84 "User" is used to refer to a human user, whereas "client" refers to
85 the software being run by the user.
86
87 In examples, "C:" and "S:" indicate lines sent by the client and
88 server, respectively.
89
902.1. Base Subject ../store/account.go:458
91
92 Subject sorting and threading use the "base subject", which has
93 specific subject artifacts removed. Due to the complexity of these
94 artifacts, the formal syntax for the subject extraction rules is
95 ambiguous. The following procedure is followed to determine the
96 "base subject", using the [ABNF] formal syntax rules described in
97 section 5:
98
99 (1) Convert any RFC 2047 encoded-words in the subject to [UTF-8]
100 as described in "Internationalization Considerations".
101 Convert all tabs and continuations to space. Convert all ../message/threadsubject.go:22
102 multiple spaces to a single space.
103
104 (2) Remove all trailing text of the subject that matches the 5256:817 ../message/threadsubject.go:90
105 subj-trailer ABNF; repeat until no more matches are possible.
106
107 (3) Remove all prefix text of the subject that matches the subj- 5256:811 ../message/threadsubject.go:36 ../message/threadsubject.go:53
108 leader ABNF.
109
119 (4) If there is prefix text of the subject that matches the subj-
120 blob ABNF, and removing that prefix leaves a non-empty subj-
121 base, then remove the prefix text.
122
123 (5) Repeat (3) and (4) until no matches remain. ../message/threadsubject.go:109
124
125 Note: It is possible to defer step (2) until step (6), but this
126 requires checking for subj-trailer in step (4).
127
128 (6) If the resulting text begins with the subj-fwd-hdr ABNF and 5256:805 ../message/threadsubject.go:115
129 ends with the subj-fwd-trl ABNF, remove the subj-fwd-hdr and
130 subj-fwd-trl and repeat from step (2).
131
132 (7) The resulting text is the "base subject" used in SORT and THREAD.
133
134 All servers and disconnected (as described in [IMAP-MODELS]) clients
135 MUST use exactly this algorithm to determine the "base subject".
136 Otherwise, there is potential for a user to get inconsistent results
137 based on whether they are running in connected or disconnected mode.
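
The procedure above can be sketched in Python as follows. This is a minimal, non-normative sketch: the name base_subject and the regular expressions are illustrative assumptions transcribed from the ABNF in section 5, and the input is assumed to have already had its RFC 2047 encoded-words decoded (the first part of step 1). The second return value records whether a reply or forward artifact was removed, which the REFERENCES threading algorithm needs.

```python
import re
from typing import Tuple

# Patterns transcribed from the ABNF in section 5 (matched case-insensitively).
BLOB = r"\[[^\[\]]*\][ \t]*"                        # subj-blob
REFWD = r"(?:re|fwd?)[ \t]*(?:" + BLOB + r")?:"     # subj-refwd
LEADER = re.compile(r"(?:" + BLOB + r")*" + REFWD + r"|[ \t]", re.I)
BLOB_RE = re.compile(BLOB)
TRAILER = re.compile(r"(?:\(fwd\)|[ \t])\Z", re.I)  # subj-trailer


def base_subject(subject: str) -> Tuple[str, bool]:
    """Return (base subject, True if a reply/forward artifact was removed)."""
    # Step 1 (encoded-word decoding is assumed to be done by the caller).
    s = re.sub(r"[\t\r\n]+", " ", subject)
    s = re.sub(r" {2,}", " ", s)
    refwd = False
    while True:
        # Step 2: strip subj-trailer until none is left.
        m = TRAILER.search(s)
        while m:
            refwd = refwd or m.group().strip() != ""
            s = s[:m.start()]
            m = TRAILER.search(s)
        # Steps 3-5: strip subj-leader and subj-blob prefixes until stable.
        changed = True
        while changed:
            changed = False
            m = LEADER.match(s)
            if m:
                refwd = refwd or m.group().strip() != ""
                s = s[m.end():]
                changed = True
            m = BLOB_RE.match(s)
            if m and s[m.end():]:    # only if a non-empty subj-base remains
                s = s[m.end():]
                changed = True
        # Step 6: unwrap "[fwd: ... ]" and start over from step 2.
        if s[:5].lower() == "[fwd:" and s.endswith("]"):
            s, refwd = s[5:-1], True
            continue
        return s, refwd
```

Under these assumptions, the subject "Re: [list] Re: hello (fwd)" reduces to the base subject "hello", while a subject consisting only of "[announce]" is left unchanged because removing the blob would leave an empty subj-base.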
138
1392.2. Sent Date
140
141 As used in this document, the term "sent date" refers to the date and
142 time from the Date: header, adjusted by time zone to normalize to
143 UTC. For example, "31 Dec 2000 16:01:33 -0800" is equivalent to the
144 UTC date and time of "1 Jan 2001 00:01:33 +0000".
145
146 If the time zone is invalid, the date and time SHOULD be treated as
147 UTC. If the time is also invalid, the time SHOULD be treated as
148 00:00:00. If there is no valid date or time, the date and time
149 SHOULD be treated as 00:00:00 on the earliest possible date.
150
151 This differs from the date-related criteria in the SEARCH command
152 (described in [IMAP] section 6.4.4), which use just the date and not
153 the time, and are not adjusted by time zone.
154
155 If the sent date cannot be determined (a Date: header is missing or
156 cannot be parsed), the INTERNALDATE for that message is used as the
157 sent date.
158
159 When comparing two sent dates that match exactly, the order in which
160 the two messages appear in the mailbox (that is, by sequence number)
161 is used as a tie-breaker to determine the order.
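
The rules in this section can be illustrated with a short Python sketch. It is a simplification under stated assumptions: the names sent_date and date_order are hypothetical, the standard library parser is used as a stand-in for a real Date: header parser, and any parse failure falls back to the INTERNALDATE rather than attempting the finer-grained recovery of an invalid time zone or time described above.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def sent_date(date_header, internal_date):
    """Return the sent date as a timezone-aware UTC datetime."""
    try:
        dt = parsedate_to_datetime(date_header) if date_header else None
    except (TypeError, ValueError):   # the exception type varies by version
        dt = None
    if dt is None:
        dt = internal_date            # missing or unparseable Date: header
    if dt.tzinfo is None:
        dt = dt.replace(tzinfo=timezone.utc)   # no usable zone: treat as UTC
    return dt.astimezone(timezone.utc)


def date_order(messages):
    """messages: iterable of (sequence number, Date: header, INTERNALDATE).

    Returns sequence numbers ordered by sent date, using the sequence
    number as the tie-breaker.
    """
    keyed = [(sent_date(hdr, internal), seq) for seq, hdr, internal in messages]
    return [seq for _, seq in sorted(keyed)]
```

With this sketch, "31 Dec 2000 16:01:33 -0800" and "1 Jan 2001 00:01:33 +0000" yield the same sent date, so two messages carrying them are ordered by sequence number.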
162
1753. Additional Commands
176
177 These commands are extensions to the [IMAP] base protocol.
178
179 The section headings are intended to correspond with where they would
180 be located in the main document if they were part of the base
181 specification.
182
183BASE.6.4.SORT. SORT Command
184
185 Arguments: sort program
186 charset specification
187 searching criteria (one or more)
188
189 Data: untagged responses: SORT
190
191 Result: OK - sort completed
192 NO - sort error: can't sort that charset or
193 criteria
194 BAD - command unknown or arguments invalid
195
196 The SORT command is a variant of SEARCH with sorting semantics for
197 the results. There are two arguments before the searching
198 criteria argument: a parenthesized list of sort criteria, and the
199 searching charset.
200
201 The charset argument is mandatory (unlike SEARCH) and indicates
202 the [CHARSET] of the strings that appear in the searching
203 criteria. The US-ASCII and [UTF-8] charsets MUST be implemented.
204 All other charsets are optional.
205
206 There is also a UID SORT command that returns unique identifiers
207 instead of message sequence numbers. Note that there are separate
208 searching criteria for message sequence numbers and UIDs; thus,
209 the arguments to UID SORT are interpreted the same as in SORT.
210 This is analogous to the behavior of UID SEARCH, as opposed to UID
211 COPY, UID FETCH, or UID STORE.
212
213 The SORT command first searches the mailbox for messages that
214 match the given searching criteria using the charset argument for
215 the interpretation of strings in the searching criteria. It then
216 returns the matching messages in an untagged SORT response, sorted
217 according to one or more sort criteria.
218
219 Sorting is in ascending order. Earlier dates sort before later
220 dates; smaller sizes sort before larger sizes; and strings are
221 sorted according to ascending values established by their
222 collation algorithm (see "Internationalization Considerations").
223
231 If two or more messages exactly match according to the sorting
232 criteria, these messages are sorted according to the order in
233 which they appear in the mailbox. In other words, there is an
234 implicit sort criterion of "sequence number".
235
236 When multiple sort criteria are specified, the result is sorted in
237 the priority order in which the criteria appear. For example,
238 (SUBJECT DATE) will sort messages in order by their base subject
239 text; and for messages with the same base subject text, it will
240 sort by their sent date.
241
242 Untagged EXPUNGE responses are not permitted while the server is
243 responding to a SORT command, but are permitted during a UID SORT
244 command.
245
246 The defined sort criteria are as follows. Refer to the Formal
247 Syntax section for the precise syntactic definitions of the
248 arguments. If the associated RFC-822 header for a particular
249 criterion is absent, it is treated as the empty string. The empty
250 string always collates before non-empty strings.
251
252 ARRIVAL
253 Internal date and time of the message. This differs from the
254 ON criterion in SEARCH, which uses just the internal date.
255
256 CC
257 [IMAP] addr-mailbox of the first "cc" address.
258
259 DATE
260 Sent date and time, as described in section 2.2.
261
262 FROM
263 [IMAP] addr-mailbox of the first "From" address.
264
265 REVERSE
266 Followed by another sort criterion, has the effect of that
267 criterion but in reverse (descending) order.
268 Note: REVERSE only reverses a single criterion, and does not
269 affect the implicit "sequence number" sort criterion if all
270 other criteria are identical. Consequently, a sort of
271 REVERSE SUBJECT is not the same as a reverse ordering of a
272 SUBJECT sort. This can be avoided by use of additional
273 criteria, e.g., SUBJECT DATE vs. REVERSE SUBJECT REVERSE
274 DATE. In general, however, it's better (and faster, if the
275 client has a "reverse current ordering" command) to reverse
276 the results in the client instead of issuing a new SORT.
277
287 SIZE
288 Size of the message in octets.
289
290 SUBJECT
291 Base subject text.
292
293 TO
294 [IMAP] addr-mailbox of the first "To" address.
295
296 Example: C: A282 SORT (SUBJECT) UTF-8 SINCE 1-Feb-1994
297 S: * SORT 2 84 882
298 S: A282 OK SORT completed
299 C: A283 SORT (SUBJECT REVERSE DATE) UTF-8 ALL
300 S: * SORT 5 3 4 1 2
301 S: A283 OK SORT completed
302 C: A284 SORT (SUBJECT) US-ASCII TEXT "not in mailbox"
303 S: * SORT
304 S: A284 OK SORT completed
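
The interaction between multiple sort criteria, REVERSE, and the implicit "sequence number" criterion can be sketched as below. This is an illustrative Python sketch, not a server implementation: the Msg class, the KEYS table, and sort_messages are hypothetical names, only three sort keys are shown, and str.casefold() is merely a stand-in for the i;unicode-casemap collation that this specification requires. The sketch relies on Python's sort being stable, including when reverse=True is given, so messages that tie on every criterion stay in ascending sequence number order.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Msg:
    seq: int                                 # message sequence number
    subject: str = ""                        # base subject (section 2.1)
    sent: datetime = datetime(1970, 1, 1)    # sent date (section 2.2)
    size: int = 0                            # size in octets


# Assumption: casefold() approximates the required collation.
KEYS = {
    "SUBJECT": lambda m: m.subject.casefold(),
    "DATE": lambda m: m.sent,
    "SIZE": lambda m: m.size,
}


def sort_messages(msgs, criteria):
    """criteria: list of (sort key, reverse flag), most significant first."""
    ordered = sorted(msgs, key=lambda m: m.seq)      # implicit criterion
    # Apply the least significant criterion first; each later stable sort
    # preserves the order established by the earlier ones among ties.
    for name, reverse in reversed(criteria):
        ordered = sorted(ordered, key=KEYS[name], reverse=reverse)
    return [m.seq for m in ordered]
```

For example, with one message whose subject is "b" followed by three whose subject is "a", a sort on REVERSE SUBJECT returns 1 2 3 4, whereas reversing the result of a SUBJECT sort would give 1 4 3 2. This is the difference described in the note under REVERSE.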
305
306BASE.6.4.THREAD. THREAD Command
307
308Arguments: threading algorithm
309 charset specification
310 searching criteria (one or more)
311
312Data: untagged responses: THREAD
313
314Result: OK - thread completed
315 NO - thread error: can't thread that charset or
316 criteria
317 BAD - command unknown or arguments invalid
318
319 The THREAD command is a variant of SEARCH with threading semantics
320 for the results. There are two arguments before the searching
321 criteria argument: a threading algorithm and the searching
322 charset.
323
324 The charset argument is mandatory (unlike SEARCH) and indicates
325 the [CHARSET] of the strings that appear in the searching
326 criteria. The US-ASCII and [UTF-8] charsets MUST be implemented.
327 All other charsets are optional.
328
329 There is also a UID THREAD command that returns unique identifiers
330 instead of message sequence numbers. Note that there are separate
331 searching criteria for message sequence numbers and UIDs; thus the
332 arguments to UID THREAD are interpreted the same as in THREAD.
333 This is analogous to the behavior of UID SEARCH, as opposed to UID
334 COPY, UID FETCH, or UID STORE.
335
343 The THREAD command first searches the mailbox for messages that
344 match the given searching criteria using the charset argument for
345 the interpretation of strings in the searching criteria. It then
346 returns the matching messages in an untagged THREAD response,
347 threaded according to the specified threading algorithm.
348
349 All collation is in ascending order. Earlier dates collate before
350 later dates and strings are collated according to ascending values
351 established by their collation algorithm (see
352 "Internationalization Considerations").
353
354 Untagged EXPUNGE responses are not permitted while the server is
355 responding to a THREAD command, but are permitted during a UID
356 THREAD command.
357
358 The defined threading algorithms are as follows:
359
360 ORDEREDSUBJECT
361
362 The ORDEREDSUBJECT threading algorithm is also referred to as
363 "poor man's threading". The searched messages are sorted by
364 base subject and then by the sent date. The messages are then
365 split into separate threads, with each thread containing
366 messages with the same base subject text. Finally, the threads
367 are sorted by the sent date of the first message in the thread.
368
369 The top level or "root" in ORDEREDSUBJECT threading contains
370 the first message of every thread. All messages in the root
371 are siblings of each other. The second message of a thread is
372 the child of the first message, and subsequent messages of the
373 thread are siblings of the second message and hence children of
374 the message at the root. Hence, there are no grandchildren in
375 ORDEREDSUBJECT threading.
376
377 Children in ORDEREDSUBJECT threading do not have descendants.
378 Client implementations SHOULD treat descendants of a child in a
379 server response as being siblings of that child.
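
A minimal Python sketch of ORDEREDSUBJECT follows. The function names are hypothetical, base subjects and sent dates are assumed to have been computed already (sections 2.1 and 2.2), and str.casefold() stands in for the required collation. The formatting helper mirrors the shape of the responses in the example for this command: a two-message thread is written as "(a b)" and a longer one as "(a (b)(c)...)".

```python
from datetime import datetime
from itertools import groupby


def ordered_subject(msgs):
    """msgs: iterable of (sequence number, base subject, sent date).

    Returns a list of threads; each thread is a list of sequence numbers
    whose first element is the thread's top-level message.
    """
    subject = lambda m: m[1].casefold()
    # Sort by base subject, then sent date, then sequence number.
    ordered = sorted(msgs, key=lambda m: (subject(m), m[2], m[0]))
    # Split into one thread per base subject.
    threads = [list(group) for _, group in groupby(ordered, key=subject)]
    # Order the threads by the sent date of their first message.
    threads.sort(key=lambda t: (t[0][2], t[0][0]))
    return [[m[0] for m in thread] for thread in threads]


def format_threads(threads):
    """Render threads as an untagged THREAD response line."""
    out = []
    for first, *rest in threads:
        if len(rest) <= 1:
            out.append("(%s)" % " ".join(str(n) for n in [first] + rest))
        else:
            out.append("(%d %s)" % (first, "".join("(%d)" % n for n in rest)))
    return "* THREAD " + "".join(out) if out else "* THREAD"


example = [
    (1, "b", datetime(2000, 3, 3)),
    (2, "a", datetime(2000, 3, 2)),
    (3, "a", datetime(2000, 3, 5)),
    (4, "A", datetime(2000, 3, 4)),
    (5, "c", datetime(2000, 3, 1)),
]
```

Every message after the first in a thread is emitted as a direct child of the first, so the result never contains grandchildren.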
380
381 REFERENCES
382
383 The REFERENCES threading algorithm threads the searched
384 messages by grouping them together in parent/child
385 relationships based on which messages are replies to others.
386 The parent/child relationships are built using two methods:
387 reconstructing a message's ancestry using the references
388 contained within it; and checking the original (not base)
389 subject of a message to see if it is a reply to (or forward of)
390 another message.
391
399 Note: "Message ID" in the following description refers to a
400 normalized form of the msg-id in [RFC2822]. The actual text
401 in RFC 2822 may use quoting, resulting in multiple ways of
402 expressing the same Message ID. Implementations of the
403 REFERENCES threading algorithm MUST normalize any msg-id in
404 order to avoid false non-matches due to differences in
405 quoting.
406
407 For example, the msg-id
408 <"01KF8JCEOCBS0045PS"@xxx.yyy.com>
409 and the msg-id
410 <01KF8JCEOCBS0045PS@xxx.yyy.com>
411 MUST be interpreted as being the same Message ID.
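
One possible normalization is sketched below in Python. It is deliberately simplified and its name, normalize_msgid, is hypothetical: it only removes the angle brackets and unquotes a quoted local part, and it does not attempt to handle comments, folding white space, or domain literals.

```python
import re


def normalize_msgid(raw):
    """Return a msg-id without angle brackets or local-part quoting."""
    s = raw.strip()
    if s.startswith("<") and s.endswith(">"):
        s = s[1:-1]
    local, at, domain = s.rpartition("@")
    if not at:
        return s                      # no "@": leave the text untouched
    if len(local) >= 2 and local[0] == '"' and local[-1] == '"':
        # Drop the surrounding quotes and undo backslash escapes.
        local = re.sub(r"\\(.)", r"\1", local[1:-1])
    return local + "@" + domain
```

Both forms in the example above normalize to 01KF8JCEOCBS0045PS@xxx.yyy.com under this sketch. No case folding is applied.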
412
413 The references used for reconstructing a message's ancestry are
414 found using the following rules:
415
416 If a message contains a References header line, then use the
417 Message IDs in the References header line as the references.
418
419 If a message does not contain a References header line, or
420 the References header line does not contain any valid
421 Message IDs, then use the first (if any) valid Message ID
422 found in the In-Reply-To header line as the only reference
423 (parent) for this message.
424
425 Note: Although [RFC2822] permits multiple Message IDs in
426 the In-Reply-To header, in actual practice this
427 discipline has not been followed. For example,
428 In-Reply-To headers have been observed with message
429 addresses after the Message ID, and there are no good
430 heuristics for software to determine the difference.
431 This is not a problem with the References header,
432 however.
433
434 If a message does not contain an In-Reply-To header line, or
435 the In-Reply-To header line does not contain a valid Message
436 ID, then the message does not have any references (NIL).
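
The three rules above can be sketched in Python as follows. The function name references_of and the MSGID pattern are assumptions made for illustration; in particular, the pattern is a simplified notion of a "valid Message ID" that does not accept quoted local parts containing white space.

```python
import re

# Assumption: a simplified pattern for a valid Message ID.
MSGID = re.compile(r"<[^<>\s]+@[^<>\s]+>")


def references_of(headers):
    """headers: dict mapping header names to their (unfolded) values.

    Returns the list of reference Message IDs, oldest first; an empty
    list corresponds to NIL.
    """
    refs = MSGID.findall(headers.get("References", ""))
    if refs:
        return refs
    # Fall back to the first valid Message ID in In-Reply-To only.
    m = MSGID.search(headers.get("In-Reply-To", ""))
    return [m.group()] if m else []
```

Taking only the first Message ID from In-Reply-To sidesteps the problem described in the note above, where other text follows the Message ID in that header.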
437
438 A message is considered to be a reply or forward if the base
439 subject extraction rules, applied to the original subject,
440 remove any of the following: a subj-refwd, a "(fwd)" subj-
441 trailer, or a subj-fwd-hdr and subj-fwd-trl.
442
443 The REFERENCES algorithm is significantly more complex than ../store/threads.go:236
444 ORDEREDSUBJECT and consists of six main steps. These steps are
445 outlined in detail below.
446
455 (1) For each searched message:
456
457 (A) Using the Message IDs in the message's references, link
458 the corresponding messages (those whose Message-ID
459 header line contains the given reference Message ID)
460 together as parent/child. Make the first reference the
461 parent of the second (and the second a child of the
462 first), the second the parent of the third (and the
463 third a child of the second), etc. The following rules
464 govern the creation of these links:
465
466 If a message does not contain a Message-ID header
467 line, or the Message-ID header line does not
468 contain a valid Message ID, then assign a unique
469 Message ID to this message.
470
471 If two or more messages have the same Message ID,
472 then only use that Message ID in the first (lowest
473 sequence number) message, and assign a unique
474 Message ID to each of the subsequent messages with
475 a duplicate of that Message ID.
476
477 If no message can be found with a given Message ID,
478 create a dummy message with this ID. Use this
479 dummy message for all subsequent references to this
480 ID.
481
482 If a message already has a parent, don't change the
483 existing link. This is done because the References
484 header line may have been truncated by a Mail User
485 Agent (MUA). As a result, there is no guarantee
486 that the messages corresponding to adjacent Message
487 IDs in the References header line are parent and
488 child.
489
490 Do not create a parent/child link if creating that
491 link would introduce a loop. For example, before
492 making message A the parent of B, make sure that A
493 is not a descendant of B.
494
495 Note: Message ID comparisons are case-sensitive. ../store/account.go:453
496
497 (B) Create a parent/child link between the last reference
498 (or NIL if there are no references) and the current
499 message. If the current message already has a parent,
500 it is probably the result of a truncated References
501 header line, so break the current parent/child link
502 before creating the new correct one. As in step 1.A,
511 do not create the parent/child link if creating that
512 link would introduce a loop. Note that if this message
513 has no references, it will now have no parent.
514
515 Note: The parent/child links created in steps 1.A
516 and 1.B MUST be kept consistent with one another at
517 ALL times.
518
519 (2) Gather together all of the messages that have no parents
520 and make them all children (siblings of one another) of a
521 dummy parent (the "root"). These messages constitute the
522 first (head) message of the threads created thus far.
523
524 (3) Prune dummy messages from the thread tree. Traverse each
525 thread under the root, and for each message:
526
527 If it is a dummy message with NO children, delete it.
528
529 If it is a dummy message with children, delete it, but
530 promote its children to the current level. In other
531 words, splice them in with the dummy's siblings.
532
533 Do not promote the children if doing so would make them
534 children of the root, unless there is only one child.
535
536 (4) Sort the messages under the root (top-level siblings only)
537 by sent date as described in section 2.2. In the case of a
538 dummy message, sort its children by sent date and then use
539 the first child for the top-level sort.
540
541 (5) Gather together messages under the root that have the same
542 base subject text.
543
544 (A) Create a table for associating base subjects with
545 messages, called the subject table.
546
547 (B) Populate the subject table with one message per each
548 base subject. For each child of the root:
549
550 (i) Find the subject of this thread, by using the
551 base subject from either the current message or
552 its first child if the current message is a
553 dummy. This is the thread subject.
554
555 (ii) If the thread subject is empty, skip this
556 message.
557
567 (iii) Look up the message associated with the thread
568 subject in the subject table.
569
570 (iv) If there is no message in the subject table with
571 the thread subject, add the current message and
572 the thread subject to the subject table.
573
574 Otherwise, if the message in the subject table is
575 not a dummy, AND either of the following criteria
576 is true:
577
578 The current message is a dummy, OR
579
580 The message in the subject table is a reply
581 or forward and the current message is not.
582
583 then replace the message in the subject table
584 with the current message.
585
586 (C) Merge threads with the same thread subject. For each
587 child of the root:
588
589 (i) Find the message's thread subject as in step
590 5.B.i above.
591
592 (ii) If the thread subject is empty, skip this
593 message.
594
595 (iii) Look up the message associated with this thread
596 subject in the subject table.
597
598 (iv) If the message in the subject table is the
599 current message, skip this message.
600
601 Otherwise, merge the current message with the one in
602 the subject table using the following rules:
603
604 If both messages are dummies, append the current
605 message's children to the children of the message
606 in the subject table (the children of both messages
607 become siblings), and then delete the current
608 message.
609
610 If the message in the subject table is a dummy and
611 the current message is not, make the current
612 message a child of the message in the subject table
613 (a sibling of its children).
614
623 If the current message is a reply or forward and
624 the message in the subject table is not, make the
625 current message a child of the message in the
626 subject table (a sibling of its children).
627
628 Otherwise, create a new dummy message and make both
629 the current message and the message in the subject
630 table children of the dummy. Then replace the
631 message in the subject table with the dummy
632 message.
633
634 Note: Subject comparisons are case-insensitive,
635 as described under "Internationalization
636 Considerations".
637
638 (6) Traverse the messages under the root and sort each set of
639 siblings by sent date as described in section 2.2.
640 Traverse the messages in such a way that the "youngest" set
641 of siblings is sorted first, and the "oldest" set of
642 siblings is sorted last (grandchildren are sorted before
643 children, etc). In the case of a dummy message (which can
644 only occur with top-level siblings), use its first child
645 for sorting.
646
647 Example: C: A283 THREAD ORDEREDSUBJECT UTF-8 SINCE 5-MAR-2000
648 S: * THREAD (166)(167)(168)(169)(172)(170)(171)
649 (173)(174 (175)(176)(178)(181)(180))(179)(177
650 (183)(182)(188)(184)(185)(186)(187)(189))(190)
651 (191)(192)(193)(194 195)(196 (197)(198))(199)
652 (200 202)(201)(203)(204)(205)(206 207)(208)
653 S: A283 OK THREAD completed
654 C: A284 THREAD ORDEREDSUBJECT US-ASCII TEXT "gewp"
655 S: * THREAD
656 S: A284 OK THREAD completed
657 C: A285 THREAD REFERENCES UTF-8 SINCE 5-MAR-2000
658 S: * THREAD (166)(167)(168)(169)(172)((170)(179))
659 (171)(173)((174)(175)(176)(178)(181)(180))
660 ((177)(183)(182)(188 (184)(189))(185 186)(187))
661 (190)(191)(192)(193)((194)(195 196))(197 198)
662 (199)(200 202)(201)(203)(204)(205 206 207)(208)
663 S: A285 OK THREAD completed
664
665 Note: The line breaks in the first and third server
666 responses are for editorial clarity and do not appear in
667 real THREAD responses.
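
Steps (1) and (2) of the REFERENCES algorithm can be sketched in Python as follows. This is a partial, non-normative sketch: pruning, subject merging, and sorting (steps 3 through 6) are omitted, all names (Node, would_loop, link, thread_roots, shape) are hypothetical, and the generated "unique" Message IDs are a placeholder scheme. Message IDs are assumed to be normalized already, and all messages are registered before any links are made, so that a dummy is only created for a Message ID that no searched message carries.

```python
class Node:
    def __init__(self, msgid, seq=None):
        self.msgid = msgid
        self.seq = seq            # None marks a dummy message
        self.parent = None
        self.children = []


def would_loop(parent, child):
    """True if parent is child itself or one of its descendants."""
    node = parent
    while node is not None:
        if node is child:
            return True
        node = node.parent
    return False


def link(parent, child):
    """Link parent -> child unless child has a parent or a loop would result."""
    if child.parent is None and not would_loop(parent, child):
        child.parent = parent
        parent.children.append(child)


def thread_roots(msgs):
    """msgs: list of (sequence number, Message ID or None, reference IDs),
    in ascending sequence number order. Returns the children of the root."""
    table, pending = {}, []
    for seq, msgid, refs in msgs:
        if not msgid or msgid in table:
            msgid = "<unique-%d@invalid>" % seq   # hypothetical unique ID
        table[msgid] = Node(msgid, seq)
        pending.append((table[msgid], refs))
    for node, refs in pending:
        prev = None
        for ref in refs:                                  # step 1.A
            ref_node = table.setdefault(ref, Node(ref))   # dummy if unknown
            if prev is not None:
                link(prev, ref_node)
            prev = ref_node
        if node.parent is not None:       # step 1.B: drop any earlier link
            node.parent.children.remove(node)
            node.parent = None
        if prev is not None:
            link(prev, node)
    return [n for n in table.values() if n.parent is None]   # step 2


def shape(node):
    """Return a nested (sequence number, children) view for inspection."""
    return (node.seq, [shape(c) for c in node.children])
```

In this sketch, two messages that reference each other produce a single thread in which the first link made wins, because the second link would introduce a loop and is therefore skipped.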
668
6794. Additional Responses
680
681 These responses are extensions to the [IMAP] base protocol.
682
683 The section headings of these responses are intended to correspond
684 with where they would be located in the main document.
685
686BASE.7.2.SORT. SORT Response
687
688 Data: zero or more numbers
689
690 The SORT response occurs as a result of a SORT or UID SORT
691 command. The number(s) refer to those messages that match the
692 search criteria. For SORT, these are message sequence numbers;
693 for UID SORT, these are unique identifiers. Each number is
694 delimited by a space.
695
696 Example: S: * SORT 2 3 6
697
698BASE.7.2.THREAD. THREAD Response
699
700 Data: zero or more threads
701
702 The THREAD response occurs as a result of a THREAD or UID THREAD
703 command. It contains zero or more threads. A thread consists of
704 a parenthesized list of thread members.
705
706 Thread members consist of zero or more message numbers, delimited
707 by spaces, indicating successive parent and child. This continues
708 until the thread splits into multiple sub-threads, at which point,
709 the thread nests into multiple sub-threads with the first member
710 of each sub-thread being siblings at this level. There is no
711 limit to the nesting of threads.
712
713 The message numbers refer to those messages that match the search
714 criteria. For THREAD, these are message sequence numbers; for UID
715 THREAD, these are unique identifiers.
716
717 Example: S: * THREAD (2)(3 6 (4 23)(44 7 96))
718
719 The first thread consists only of message 2. The second thread
720 consists of the messages 3 (parent) and 6 (child), after which it
721 splits into two sub-threads; the first of which contains messages
722 4 (child of 6, sibling of 44) and 23 (child of 4), and the second
723 of which contains messages 44 (child of 6, sibling of 4), 7 (child
724 of 44), and 96 (child of 7). Since some later messages are
725 parents of earlier messages, the messages were probably moved from
726 some other mailbox at different times.
727
735 -- 2
736
737 -- 3
738 \-- 6
739 |-- 4
740 | \-- 23
741 |
742 \-- 44
743 \-- 7
744 \-- 96
745
746 Example: S: * THREAD ((3)(5))
747
748 In this example, 3 and 5 are siblings of a parent that does not
749 match the search criteria (and/or does not exist in the mailbox);
750 however, they are members of the same thread.
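
A client might turn the data of a THREAD response into a tree along the following lines. This Python sketch is illustrative only: parse_thread is a hypothetical name, the input is the text following "* THREAD", and well-formed input is assumed (malformed input raises an exception rather than being diagnosed). Each node is a pair of a message number and a list of child nodes; the number is None where the parent is not present in the response.

```python
import re

TOKEN = re.compile(r"\(|\)|\d+")


def parse_thread(data):
    """Parse THREAD response data into a list of (number, children) nodes."""
    tokens = TOKEN.findall(data)
    pos = 0

    def parse_list():
        nonlocal pos
        pos += 1                               # consume "("
        numbers = []
        while tokens[pos].isdigit():           # successive parent and child
            numbers.append(int(tokens[pos]))
            pos += 1
        subs = []
        while tokens[pos] == "(":              # nested sub-threads
            subs.append(parse_list())
        pos += 1                               # consume ")"
        # The sub-threads hang off the last number; earlier numbers form
        # a chain of single parents above it.
        node = (numbers[-1] if numbers else None, subs)
        for n in reversed(numbers[:-1]):
            node = (n, [node])
        return node

    threads = []
    while pos < len(tokens):
        threads.append(parse_list())
    return threads
```

Applied to "(2)(3 6 (4 23)(44 7 96))", the sketch yields two top-level nodes, 2 and 3, with 6 as the only child of 3 and with 4 and 44 as the children of 6, matching the diagram above. Applied to "((3)(5))", it yields one node with no number whose children are 3 and 5.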
751
7525. Formal Syntax of SORT and THREAD Commands and Responses
753
754 The following syntax specification uses the Augmented Backus-Naur
755 Form (ABNF) notation as specified in [ABNF]. It also uses [ABNF]
756 rules defined in [IMAP].
757
758sort = ["UID" SP] "SORT" SP sort-criteria SP search-criteria
759
760sort-criteria = "(" sort-criterion *(SP sort-criterion) ")"
761
762sort-criterion = ["REVERSE" SP] sort-key
763
764sort-key = "ARRIVAL" / "CC" / "DATE" / "FROM" / "SIZE" /
765 "SUBJECT" / "TO"
766
767thread = ["UID" SP] "THREAD" SP thread-alg SP search-criteria
768
769thread-alg = "ORDEREDSUBJECT" / "REFERENCES" / thread-alg-ext
770
771thread-alg-ext = atom
772 ; New algorithms MUST be registered with IANA
773
774search-criteria = charset 1*(SP search-key)
775
776charset = atom / quoted
777 ; CHARSET values MUST be registered with IANA
778
779sort-data = "SORT" *(SP nz-number)
780
781thread-data = "THREAD" [SP 1*thread-list]
782
791thread-list = "(" (thread-members / thread-nested) ")"
792
793thread-members = nz-number *(SP nz-number) [SP thread-nested]
794
795thread-nested = 2*thread-list
796
797 The following syntax describes base subject extraction rules (2)-(6):
798
799subject = *subj-leader [subj-middle] *subj-trailer
800
801subj-refwd = ("re" / ("fw" ["d"])) *WSP [subj-blob] ":"
802
803subj-blob = "[" *BLOBCHAR "]" *WSP
804
805subj-fwd = subj-fwd-hdr subject subj-fwd-trl 5256:128 ../message/threadsubject.go:115
806
807subj-fwd-hdr = "[fwd:"
808
809subj-fwd-trl = "]"
810
811subj-leader = (*subj-blob subj-refwd) / WSP 5256:107 ../message/threadsubject.go:36 ../message/threadsubject.go:53
812
813subj-middle = *subj-blob (subj-base / subj-fwd)
814 ; last subj-blob is subj-base if subj-base would
815 ; otherwise be empty
816
817subj-trailer = "(fwd)" / WSP 5256:104 ../message/threadsubject.go:90
818
819subj-base = NONWSP *(*WSP NONWSP)
820 ; can be a subj-blob
821
822BLOBCHAR = %x01-5a / %x5c / %x5e-ff
823 ; any CHAR8 except '[' and ']'.
824 ; SHOULD comply with [UTF-8]
825
826NONWSP = %x01-08 / %x0a-1f / %x21-ff
827 ; any CHAR8 other than WSP.
828 ; SHOULD comply with [UTF-8]
829
8306. Security Considerations
831
832 The SORT and THREAD extensions do not raise any security
833 considerations that are not present in the base [IMAP] protocol, and
834 these issues are discussed in [IMAP]. Nevertheless, it is important
835 to remember that [IMAP] protocol transactions, including message
836 data, are sent in the clear over the network unless protection from
837 snooping is negotiated, either by the use of STARTTLS, privacy
838 protection in AUTHENTICATE, or some other protection mechanism.
839
847 Although not a security consideration, it is important to recognize
848 that threading by REFERENCES can lead to misleading threading trees.
849 For example, a message with false References: header data will cause
850 a thread to be incorporated into another thread.
851
852 The process of extracting the base subject may lead to incorrect
853 collation if the extracted data was significant text as opposed to a
854 subject artifact.
855
8567. Internationalization Considerations
857
858 As stated in the introduction, the rules of I18NLEVEL=1 as described
859 in [IMAP-I18N] MUST be followed; that is, the SORT and THREAD
860 extensions MUST collate strings according to the i;unicode-casemap
861 collation described in [UNICASEMAP]. Servers SHOULD also advertise
862 the I18NLEVEL=1 extension. Alternatively, a server MAY implement
863 I18NLEVEL=2 (or higher) and comply with the rules of that level.
864
865 As discussed in [IMAP-I18N] section 4.5, all server implementations
866 should eventually be updated to support the [IMAP-I18N] I18NLEVEL=2
867 extension.
868
869 Translations of the "re" or "fw"/"fwd" tokens are not specified for
870 removal in the base subject extraction process. An attempt to add
871 such translated tokens would result in a geometrically complex, and
872 ultimately unimplementable, task.
873
   Instead, note that [RFC2822] section 3.6.5 recommends that "re:"
   (from the Latin "res", meaning "in the matter of") be used to
   identify a reply. Although the many token forms used to identify a
   forwarded message show considerable variation in the wild, the
   variations are (still) manageable. Consequently, it is suggested
   that "re:" and one of the variations of the tokens for a forward
   supported by the base subject extraction rules be adopted for
   Internet mail messages, since doing so makes it a simple
   display-time task to localize the token language for the user.
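
   The display-time localization suggested above might be sketched as
   follows. This is an illustration only: the stored subject keeps the
   canonical "re:" token, and only the rendered text changes. The
   function name display_subject and the table of localized labels
   are assumptions made for this example and are not defined by this
   document.

```python
import re

# Illustrative mapping from a user's language to a reply label.
# These entries are example assumptions, not part of the specification.
REPLY_LABELS = {"en": "Re:", "de": "AW:", "sv": "SV:"}

# Matches a single leading "re:" token, ignoring case and whitespace.
_REPLY_PREFIX = re.compile(r"^\s*re:\s*", re.IGNORECASE)


def display_subject(subject: str, language: str) -> str:
    """Render `subject` with the reply token localized for `language`."""
    match = _REPLY_PREFIX.match(subject)
    if match is None:
        return subject  # not a reply; show the subject unchanged
    label = REPLY_LABELS.get(language, "Re:")
    return f"{label} {subject[match.end():]}"
```

   Because the message itself still carries "re:", server-side base
   subject extraction is unaffected by what the client chooses to
   display.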
884
8858. IANA Considerations
886
887 [IMAP] capabilities are registered by publishing a standards track or
888 IESG-approved experimental RFC. This document constitutes
889 registration of the SORT and THREAD capabilities in the [IMAP]
890 capabilities registry.
891
903 This document creates a new [IMAP] threading algorithms registry,
904 which registers threading algorithms by publishing a standards track
905 or IESG-approved experimental RFC. This document constitutes
906 registration of the ORDEREDSUBJECT and REFERENCES algorithms in that
907 registry.
908
9099. Normative References
910
911 [ABNF] Crocker, D., Ed., and P. Overell, "Augmented BNF for
912 Syntax Specifications: ABNF", STD 68, RFC 5234, January
913 2008.
914
915 [CHARSET] Freed, N. and J. Postel, "IANA Charset Registration
916 Procedures", BCP 19, RFC 2978, October 2000.
917
918 [IMAP] Crispin, M., "INTERNET MESSAGE ACCESS PROTOCOL -
919 VERSION 4rev1", RFC 3501, March 2003.
920
921 [IMAP-I18N] Newman, C., Gulbrandsen, A., and A. Melnikov, "Internet
922 Message Access Protocol Internationalization", RFC
923 5255, June 2008.
924
925 [KEYWORDS] Bradner, S., "Key words for use in RFCs to Indicate
926 Requirement Levels", BCP 14, RFC 2119, March 1997.
927
928 [RFC2822] Resnick, P., Ed., "Internet Message Format", RFC 2822,
929 April 2001.
930
931 [UNICASEMAP] Crispin, M., "i;unicode-casemap - Simple Unicode
932 Collation Algorithm", RFC 5051, October 2007.
933
934 [UTF-8] Yergeau, F., "UTF-8, a transformation format of ISO
935 10646", STD 63, RFC 3629, November 2003.
936
93710. Informative References
938
939 [IMAP-MODELS] Crispin, M., "Distributed Electronic Mail Models in
940 IMAP4", RFC 1733, December 1994.
941
   [THREADING]   Zawinski, J., "Message Threading",
                 http://www.jwz.org/doc/threading.html, 1997-2002.
944
959Authors' Addresses
960
961 Mark R. Crispin
962 Panda Programming
963 6158 NE Lariat Loop
964 Bainbridge Island, WA 98110-2098
965
966 Phone: +1 (206) 842-2385
967 EMail: IMAP+SORT+THREAD@Lingling.Panda.COM
968
969
970 Kenneth Murchison
971 Carnegie Mellon University
972 5000 Forbes Avenue
973 Cyert Hall 285
974 Pittsburgh, PA 15213
975
976 Phone: +1 (412) 268-2638
977 EMail: murch@andrew.cmu.edu
978
1015Full Copyright Statement
1016
1017 Copyright (C) The IETF Trust (2008).
1018
1019 This document is subject to the rights, licenses and restrictions
1020 contained in BCP 78, and except as set forth therein, the authors
1021 retain all their rights.
1022
1023 This document and the information contained herein are provided on an
1024 "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
1025 OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY, THE IETF TRUST AND
1026 THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS
1027 OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF
1028 THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
1029 WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.
1030
1031Intellectual Property
1032
1033 The IETF takes no position regarding the validity or scope of any
1034 Intellectual Property Rights or other rights that might be claimed to
1035 pertain to the implementation or use of the technology described in
1036 this document or the extent to which any license under such rights
1037 might or might not be available; nor does it represent that it has
1038 made any independent effort to identify any such rights. Information
1039 on the procedures with respect to rights in RFC documents can be
1040 found in BCP 78 and BCP 79.
1041
1042 Copies of IPR disclosures made to the IETF Secretariat and any
1043 assurances of licenses to be made available, or the result of an
1044 attempt made to obtain a general license or permission for the use of
1045 such proprietary rights by implementers or users of this
1046 specification can be obtained from the IETF on-line IPR repository at
1047 http://www.ietf.org/ipr.
1048
1049 The IETF invites any interested party to bring to its attention any
1050 copyrights, patents or patent applications, or other proprietary
1051 rights that may cover technology that may be required to implement
1052 this standard. Please address the information to the IETF at
1053 ietf-ipr@ietf.org.