Internet Engineering Task Force (IETF)                         R. Bellis
Request for Comments: 8490                                           ISC
Updates: 1035, 7766                                          S. Cheshire
Category: Standards Track                                     Apple Inc.
ISSN: 2070-1721                                             J. Dickinson
                                                            S. Dickinson
                                                                 Sinodun
                                                                T. Lemon
                                                     Nibbhaya Consulting
                                                             T. Pusateri
                                                            Unaffiliated
                                                              March 2019


                         DNS Stateful Operations
22
Abstract
24
25 This document defines a new DNS OPCODE for DNS Stateful Operations
26 (DSO). DSO messages communicate operations within persistent
27 stateful sessions using Type Length Value (TLV) syntax. Three TLVs
28 are defined that manage session timeouts, termination, and encryption
29 padding, and a framework is defined for extensions to enable new
30 stateful operations. This document updates RFC 1035 by adding a new
31 DNS header OPCODE that has both different message semantics and a new
32 result code. This document updates RFC 7766 by redefining a session,
33 providing new guidance on connection reuse, and providing a new
34 mechanism for handling session idle timeouts.
35
Status of This Memo
37
38 This is an Internet Standards Track document.
39
40 This document is a product of the Internet Engineering Task Force
41 (IETF). It represents the consensus of the IETF community. It has
42 received public review and has been approved for publication by the
43 Internet Engineering Steering Group (IESG). Further information on
44 Internet Standards is available in Section 2 of RFC 7841.
45
46 Information about the current status of this document, any errata,
47 and how to provide feedback on it may be obtained at
48 https://www.rfc-editor.org/info/rfc8490.
49
58Bellis, et al. Standards Track [Page 1]
59
60RFC 8490 DNS Stateful Operations March 2019
61
62
Copyright Notice
64
65 Copyright (c) 2019 IETF Trust and the persons identified as the
66 document authors. All rights reserved.
67
68 This document is subject to BCP 78 and the IETF Trust's Legal
69 Provisions Relating to IETF Documents
70 (https://trustee.ietf.org/license-info) in effect on the date of
71 publication of this document. Please review these documents
72 carefully, as they describe your rights and restrictions with respect
73 to this document. Code Components extracted from this document must
74 include Simplified BSD License text as described in Section 4.e of
75 the Trust Legal Provisions and are provided without warranty as
76 described in the Simplified BSD License.
77
Table of Contents
79
   1.  Introduction
   2.  Requirements Language
   3.  Terminology
   4.  Applicability
     4.1.  Use Cases
       4.1.1.  Session Management
       4.1.2.  Long-Lived Subscriptions
     4.2.  Applicable Transports
   5.  Protocol Details
     5.1.  DSO Session Establishment
       5.1.1.  DSO Session Establishment Failure
       5.1.2.  DSO Session Establishment Success
     5.2.  Operations after DSO Session Establishment
     5.3.  DSO Session Termination
       5.3.1.  Handling Protocol Errors
     5.4.  Message Format
       5.4.1.  DNS Header Fields in DSO Messages
       5.4.2.  DSO Data
       5.4.3.  DSO Unidirectional Messages
       5.4.4.  TLV Syntax
       5.4.5.  Unrecognized TLVs
       5.4.6.  EDNS(0) and TSIG
     5.5.  Message Handling
       5.5.1.  Delayed Acknowledgement Management
       5.5.2.  MESSAGE ID Namespaces
       5.5.3.  Error Responses
     5.6.  Responder-Initiated Operation Cancellation
   6.  DSO Session Lifecycle and Timers
     6.1.  DSO Session Initiation
     6.2.  DSO Session Timeouts
     6.3.  Inactive DSO Sessions
     6.4.  The Inactivity Timeout
       6.4.1.  Closing Inactive DSO Sessions
       6.4.2.  Values for the Inactivity Timeout
     6.5.  The Keepalive Interval
       6.5.1.  Keepalive Interval Expiry
       6.5.2.  Values for the Keepalive Interval
     6.6.  Server-Initiated DSO Session Termination
       6.6.1.  Server-Initiated Retry Delay Message
       6.6.2.  Misbehaving Clients
       6.6.3.  Client Reconnection
   7.  Base TLVs for DNS Stateful Operations
     7.1.  Keepalive TLV
       7.1.1.  Client Handling of Received Session Timeout Values
       7.1.2.  Relationship to edns-tcp-keepalive EDNS(0) Option
     7.2.  Retry Delay TLV
       7.2.1.  Retry Delay TLV Used as a Primary TLV
       7.2.2.  Retry Delay TLV Used as a Response Additional TLV
     7.3.  Encryption Padding TLV
   8.  Summary Highlights
     8.1.  QR Bit and MESSAGE ID
     8.2.  TLV Usage
   9.  Additional Considerations
     9.1.  Service Instances
     9.2.  Anycast Considerations
     9.3.  Connection Sharing
     9.4.  Operational Considerations for Middleboxes
     9.5.  TCP Delayed Acknowledgement Considerations
   10. IANA Considerations
     10.1.  DSO OPCODE Registration
     10.2.  DSO RCODE Registration
     10.3.  DSO Type Code Registry
   11. Security Considerations
     11.1.  TLS Zero Round-Trip Considerations
   12. References
     12.1.  Normative References
     12.2.  Informative References
   Acknowledgements
   Authors' Addresses
157
174
1.  Introduction
176
177 This document specifies a mechanism for managing stateful DNS
178 connections. DNS most commonly operates over a UDP transport, but it
179 can also operate over streaming transports; the original DNS RFC
180 specifies DNS-over-TCP [RFC1035], and a profile for DNS-over-TLS
181 [RFC7858] has been specified. These transports can offer persistent
182 long-lived sessions and therefore, when using them for transporting
183 DNS messages, it is of benefit to have a mechanism that can establish
184 parameters associated with those sessions, such as timeouts. In such
185 situations, it is also advantageous to support server-initiated
186 messages (such as DNS Push Notifications [Push]).
187
188 The existing Extension Mechanism for DNS (EDNS(0)) [RFC6891] is
189 explicitly defined to only have "per-message" semantics. While
190 EDNS(0) has been used to signal at least one session-related
191 parameter (edns-tcp-keepalive EDNS(0) Option [RFC7828]), the result
192 is less than optimal due to the restrictions imposed by the EDNS(0)
193 semantics and the lack of server-initiated signaling. For example, a
194 server cannot arbitrarily instruct a client to close a connection
195 because the server can only send EDNS(0) options in responses to
196 queries that contained EDNS(0) options.
197
198 This document defines a new DNS OPCODE for DNS Stateful Operations
199 (DSO) with a value of 6. DSO messages are used to communicate
200 operations within persistent stateful sessions, expressed using Type
201 Length Value (TLV) syntax. This document defines an initial set of
202 three TLVs used to manage session timeouts, termination, and
203 encryption padding.
204
205 All three TLVs defined here are mandatory for all implementations of
206 DSO. Further TLVs may be defined in additional specifications.
207
208 DSO messages may or may not be acknowledged. Whether a DSO message
209 is to be acknowledged (a DSO request message) or is not to be
210 acknowledged (a DSO unidirectional message) is specified in the
211 definition of that particular DSO message type. The MESSAGE ID is
212 nonzero for DSO request messages, and zero for DSO unidirectional
213 messages. Messages are pipelined and responses may appear out of
214 order when multiple requests are being processed concurrently.
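
   As an illustration of the distinction just described, the following
   Python sketch classifies a message from its twelve-byte header alone.
   It is a minimal example under stated assumptions, not a reference
   implementation: the function name and the returned strings are
   hypothetical, and the positions of the QR bit and OPCODE field are
   those of the standard DNS header [RFC1035].

```python
import struct

DSO_OPCODE = 6  # OPCODE value assigned to DSO by this document


def classify_dso_header(header: bytes) -> str:
    """Classify a DNS header as a kind of DSO message (hypothetical helper)."""
    if len(header) < 12:
        raise ValueError("a DNS header is twelve bytes long")
    # The first two 16-bit words are the MESSAGE ID and the flags word.
    message_id, flags = struct.unpack("!HH", header[:4])
    qr = flags >> 15              # QR is the top bit of the flags word
    opcode = (flags >> 11) & 0xF  # OPCODE is the next four bits
    if opcode != DSO_OPCODE:
        return "not-dso"
    if qr == 1:
        return "response"
    # QR is 0: a nonzero MESSAGE ID marks a request, zero a unidirectional
    # message.
    return "request" if message_id != 0 else "unidirectional"
```

   A real receiver would also check that the MESSAGE ID is consistent
   with what the specification of the Primary TLV requires; this sketch
   only reads the header.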
215
216 The format for DSO messages (Section 5.4) differs somewhat from the
217 traditional DNS message format used for standard queries and
218 responses. The standard twelve-byte header is used, but the four
219 count fields (QDCOUNT, ANCOUNT, NSCOUNT, ARCOUNT) are set to zero,
220 and accordingly their corresponding sections are not present.
221
231 The actual data pertaining to DNS Stateful Operations (expressed in
232 TLV syntax) is appended to the end of the DNS message header. Just
233 as in traditional DNS-over-TCP [RFC1035] [RFC7766], the stream
234 protocol carrying DSO messages (which are just another kind of DNS
235 message) frames them by putting a 16-bit message length at the start.
236 The length of the DSO message is therefore determined from that
237 length rather than from any of the DNS header counts.
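
   The layout described above can be sketched in Python as follows.
   This is an illustrative encoder only.  It assumes the TLV layout
   given in Section 5.4.4 (a 16-bit DSO-TYPE, a 16-bit DSO-LENGTH, then
   the data) and the Keepalive TLV of Section 7.1 (DSO-TYPE 1 carrying
   two 32-bit millisecond values).  The function names and the timeout
   values are hypothetical.

```python
import struct

DSO_OPCODE = 6           # OPCODE value assigned to DSO
KEEPALIVE_TYPE = 0x0001  # DSO-TYPE of the Keepalive TLV (Section 7.1)


def encode_tlv(dso_type: int, data: bytes) -> bytes:
    """Encode one TLV: 16-bit type, 16-bit length, then the data."""
    return struct.pack("!HH", dso_type, len(data)) + data


def encode_dso_request(message_id: int, tlvs: bytes) -> bytes:
    """Return a framed DSO request: length prefix, header, then TLVs."""
    flags = DSO_OPCODE << 11  # QR=0, OPCODE=6, all other bits zero
    # The four section counts are always zero in a DSO message.
    header = struct.pack("!HHHHHH", message_id, flags, 0, 0, 0, 0)
    message = header + tlvs
    # The stream framing is a 16-bit length covering header plus TLVs.
    return struct.pack("!H", len(message)) + message


# Illustrative values only: 15 s inactivity timeout, 60 s keepalive interval.
keepalive = encode_tlv(KEEPALIVE_TYPE, struct.pack("!II", 15000, 60000))
framed = encode_dso_request(0x1234, keepalive)
```

   Because the counts are all zero, a receiver finds the end of the TLV
   data from the 16-bit length prefix, not from anything in the header.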
238
   Packet analyzer tools that have not been updated to recognize the
   DSO format will display the DSO data as unknown extra data after the
   end of the DNS message.
242
243 This new format has distinct advantages over an RR-based format
244 because it is more explicit and more compact. Each TLV definition is
245 specific to its use case and, as a result, contains no redundant or
246 overloaded fields. Importantly, it completely avoids conflating DNS
247 Stateful Operations in any way with normal DNS operations or with
248 existing EDNS(0)-based functionality. A goal of this approach is to
249 avoid the operational issues that have befallen EDNS(0), particularly
250 relating to middlebox behavior (see sections discussing EDNS(0), and
251 problems caused by firewalls and load balancers, in the recent work
252 describing causes of DNS failures [Fail]).
253
254 With EDNS(0), multiple options may be packed into a single OPT
255 pseudo-RR, and there is no generalized mechanism for a client to be
256 able to tell whether a server has processed or otherwise acted upon
257 each individual option within the combined OPT pseudo-RR. The
258 specifications for each individual option need to define how each
259 different option is to be acknowledged, if necessary.
260
261 In contrast to EDNS(0), with DSO there is no compelling motivation to
262 pack multiple operations into a single message for efficiency
263 reasons, because DSO always operates using a connection-oriented
264 transport protocol. Each DSO operation is communicated in its own
265 separate DNS message, and the transport protocol can take care of
266 packing several DNS messages into a single IP packet if appropriate.
267 For example, TCP can pack multiple small DNS messages into a single
268 TCP segment. This simplification allows for clearer semantics. Each
269 DSO request message communicates just one primary operation, and the
270 RCODE in the corresponding response message indicates the success or
271 failure of that operation.
272
2.  Requirements Language
288
289 The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
290 "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
291 "OPTIONAL" in this document are to be interpreted as described in
292 BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all
293 capitals, as shown here.
294
3.  Terminology
296
297 DSO: DNS Stateful Operations.
298
299 connection: a bidirectional byte (or message) stream, where the
300 bytes (or messages) are delivered reliably and in order, such as
301 provided by using DNS-over-TCP [RFC1035] [RFC7766] or DNS-over-TLS
302 [RFC7858].
303
304 session: the unqualified term "session" in the context of this
305 document refers to a persistent network connection between two
306 endpoints that allows for the exchange of DNS messages over a
307 connection where either end of the connection can send messages to
308 the other end. (The term has no relationship to the "session
309 layer" of the OSI "seven-layer model".)
310
311 DSO Session: a session established between two endpoints that
312 acknowledge persistent DNS state via the exchange of DSO messages
313 over the connection. This is distinct from a DNS-over-TCP session
314 as described in the previous specification for DNS-over-TCP
315 [RFC7766].
316
317 close gracefully: a normal session shutdown where the client closes
318 the TCP connection to the server using a graceful close such that
319 no data is lost (e.g., using TCP FIN; see Section 5.3).
320
321 forcibly abort: a session shutdown as a result of a fatal error
322 where the TCP connection is unilaterally aborted without regard
323 for data loss (e.g., using TCP RST; see Section 5.3).
324
325 server: the software with a listening socket, awaiting incoming
326 connection requests, in the usual DNS sense.
327
328 client: the software that initiates a connection to the server's
329 listening socket, in the usual DNS sense.
330
331 initiator: the software that sends a DSO request message or a DSO
332 unidirectional message during a DSO Session. Either a client or
333 server can be an initiator.
334
343 responder: the software that receives a DSO request message or a DSO
344 unidirectional message during a DSO Session. Either a client or a
345 server can be a responder.
346
347 sender: the software that is sending a DNS message, a DSO message, a
348 DNS response, or a DSO response.
349
350 receiver: the software that is receiving a DNS message, a DSO
351 message, a DNS response, or a DSO response.
352
353 service instance: a specific instance of server software running on
354 a specific host (Section 9.1).
355
356 long-lived operation: an outstanding operation on a DSO Session
357 where either the client or server, acting as initiator, has
358 requested that the responder send new information regarding the
359 request, as it becomes available.
360
361 early data: a TLS 1.3 handshake containing data on the first flight
362 that begins a DSO Session (Section 2.3 of the TLS 1.3
363 specification [RFC8446]). TCP Fast Open [RFC7413] is only
364 permitted when using TLS.
365
366 DNS message: any DNS message, including DNS queries, responses,
367 updates, DSO messages, etc.
368
369 DNS request message: any DNS message where the QR bit is 0.
370
371 DNS response message: any DNS message where the QR bit is 1.
372
373 DSO message: a DSO request message, DSO unidirectional message, or
374 DSO response to a DSO request message. If the QR bit is 1 in a
375 DSO message, it is a DSO response message. If the QR bit is 0 in
376 a DSO message, it is a DSO request message or DSO unidirectional
377 message, as determined by the specification of its Primary TLV.
378
379 DSO response message: a response to a DSO request message.
380
381 DSO request message: a DSO message that requires a response.
382
383 DSO unidirectional message: a DSO message that does not require and
384 cannot induce a response.
385
386 Primary TLV: the first TLV in a DSO request message or DSO
387 unidirectional message; this determines the nature of the
388 operation being performed.
389
399 Additional TLV: any TLVs that follow the Primary TLV in a DSO
400 request message or DSO unidirectional message.
401
402 Response Primary TLV: in a DSO response, any TLVs with the same DSO-
403 TYPE as the Primary TLV from the corresponding DSO request
404 message. If present, any Response Primary TLV(s) MUST appear
405 first in the DSO response message, before any Response Additional
406 TLVs.
407
408 Response Additional TLV: any TLVs in a DSO response that follow the
409 (optional) Response Primary TLV(s).
410
411 inactivity timer: the time since the most recent non-keepalive DNS
412 message was sent or received (see Section 6.4).
413
414 keepalive timer: the time since the most recent DNS message was sent
415 or received (see Section 6.5).
416
417 session timeouts: the inactivity timer and the keepalive timer.
418
419 inactivity timeout: the maximum value that the inactivity timer can
420 have before the connection is gracefully closed.
421
422 keepalive interval: the maximum value that the keepalive timer can
423 have before the client is required to send a keepalive (see
424 Section 7.1).
425
426 resetting a timer: setting the timer value to zero and restarting
427 the timer.
428
429 clearing a timer: setting the timer value to zero but not restarting
430 the timer.
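
   The timer terms defined above can be modeled with a small Python
   sketch.  It is deliberately simplified and is not the full timer
   behavior of Section 6: it only shows the difference between
   resetting and clearing a timer, and which messages restart which
   timer.  The class names and the injectable clock parameter are
   hypothetical and exist only to make the example testable.

```python
import time
from typing import Callable, Optional


class Timer:
    """Elapsed-time timer with the 'reset' and 'clear' operations."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._started_at: Optional[float] = None  # None means not running

    def reset(self) -> None:
        """Set the timer value to zero and restart the timer."""
        self._started_at = self._clock()

    def clear(self) -> None:
        """Set the timer value to zero without restarting the timer."""
        self._started_at = None

    def value(self) -> float:
        """Seconds elapsed since the last reset, or 0.0 if cleared."""
        if self._started_at is None:
            return 0.0
        return self._clock() - self._started_at


class SessionTimers:
    """The inactivity timer and the keepalive timer of one session."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self.inactivity = Timer(clock)
        self.keepalive = Timer(clock)
        self.inactivity.reset()
        self.keepalive.reset()

    def on_message(self, is_keepalive: bool) -> None:
        """Call whenever a DNS message is sent or received."""
        self.keepalive.reset()       # any DNS message restarts this timer
        if not is_keepalive:
            self.inactivity.reset()  # keepalive traffic is not activity
```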
431
4.  Applicability
456
457 DNS Stateful Operations are applicable to several known use cases and
458 are only applicable on transports that are capable of supporting a
459 DSO Session.
460
4.1.  Use Cases
462
463 Several use cases for DNS Stateful Operations are described below.
464
4.1.1.  Session Management
466
467 In one use case, establishing session parameters such as server-
468 defined timeouts is of great use in the general management of
469 persistent connections. For example, using DSO Sessions for stub-to-
470 recursive DNS-over-TLS [RFC7858] is more flexible for both the client
471 and the server than attempting to manage sessions using just the
472 edns-tcp-keepalive EDNS(0) Option [RFC7828]. The simple set of TLVs
473 defined in this document is sufficient to greatly enhance connection
474 management for this use case.
475
4.1.2.  Long-Lived Subscriptions
477
478 In another use case, DNS-based Service Discovery (DNS-SD) [RFC6763]
479 has evolved into a naturally session-based mechanism where, for
480 example, long-lived subscriptions lend themselves to 'push'
481 mechanisms as opposed to polling. Long-lived stateful connections
482 and server-initiated messages align with this use case [Push].
483
484 A general use case is that DNS traffic is often bursty, but session
485 establishment can be expensive. One challenge with long-lived
486 connections is sustaining sufficient traffic to maintain NAT and
487 firewall state. To mitigate this issue, this document introduces a
488 new concept for the DNS -- DSO "keepalive traffic". This traffic
489 carries no DNS data and is not considered 'activity' in the classic
490 DNS sense, but it serves to maintain state in middleboxes and to
491 assure the client and server that they still have connectivity to
492 each other.
493
4.2.  Applicable Transports
512
513 DNS Stateful Operations are applicable in cases where it is useful to
514 maintain an open session between a DNS client and server, where the
515 transport allows such a session to be maintained, and where the
516 transport guarantees in-order delivery of messages on which DSO
517 depends. Two specific transports that meet the requirements to
518 support DNS Stateful Operations are DNS-over-TCP [RFC1035] [RFC7766]
519 and DNS-over-TLS [RFC7858].
520
521 Note that in the case of DNS-over-TLS, there is no mechanism for
522 upgrading from DNS-over-TCP to DNS-over-TLS mid-connection (see
523 Section 7 of the DNS-over-TLS specification [RFC7858]). A connection
524 is either DNS-over-TCP from the start, or DNS-over-TLS from the
525 start.
526
   DNS Stateful Operations are not applicable for transports that cannot
   support clean session semantics or that do not guarantee in-order
   delivery.  While in principle such a transport could be constructed
   over UDP, the current specification of DNS-over-UDP [RFC1035] does
   not provide in-order delivery or session semantics and hence cannot
   be used.  Similarly, DNS-over-HTTPS [RFC8484] cannot be used because
   HTTP has its own mechanism for managing sessions, which is
   incompatible with the mechanism specified here.
535
536 Only DNS-over-TCP and DNS-over-TLS are currently defined for use with
537 DNS Stateful Operations. Other transports may be added in the future
538 if they meet the requirements set out in the first paragraph of this
539 section.
540
5.  Protocol Details
568
569 The overall flow of DNS Stateful Operations goes through a series of
570 phases:
571
572 Connection Establishment: A client establishes a connection to a
573 server (Section 4.2).
574
575 Connected but Sessionless: A connection exists, but a DSO Session
576 has not been established. DNS messages can be sent from the
577 client to server, and DNS responses can be sent from the server to
578 the client. In this state, a client that wishes to use DSO can
579 attempt to establish a DSO Session (Section 5.1). Standard DNS-
580 over-TCP inactivity timeout handling is in effect [RFC7766] (see
581 Section 7.1.2 of this document).
582
583 DSO Session Establishment in Progress: A client has sent a DSO
584 request within the last 30 seconds, but has not yet received a DSO
585 response for that request. In this phase, the client may send
586 more DSO requests and more DNS requests, but MUST NOT send DSO
587 unidirectional messages (Section 5.1).
588
589 DSO Session Establishment Timeout: A client has sent a DSO request,
590 and after 30 seconds has still received no DSO response for that
591 request. This means that the server is now in an indeterminate
592 state. The client forcibly aborts the connection. The client MAY
593 reconnect without using DSO, if appropriate.
594
595 DSO Session Establishment Failed: A client has sent a DSO request,
596 and received a corresponding DSO response with a nonzero RCODE.
597 This means that the attempt to establish the DSO Session did not
598 succeed. At this point, the client is permitted to continue
599 operating without a DSO Session (Connected but Sessionless) but
600 does not send further DSO messages (Section 5.1).
601
602 DSO Session Established: A client has sent a DSO request, and
603 received a corresponding DSO response with RCODE set to NOERROR
604 (0). A DSO Session has now been successfully established. Both
605 client and server may send DSO messages and DNS messages; both may
606 send replies in response to messages they receive (Section 5.2).
607 The inactivity timer (Section 6.4) is active; the keepalive timer
608 (Section 6.5) is active. Standard DNS-over-TCP inactivity timeout
609 handling is no longer in effect [RFC7766] (see Section 7.1.2 of
610 this document).
611
623 Server Shutdown: The server has decided to gracefully terminate the
624 session and has sent the client a Retry Delay message
625 (Section 6.6.1). There may still be unprocessed messages from the
626 client; the server will ignore these. The server will not send
627 any further messages to the client (Section 6.6.1.1).
628
629 Client Shutdown: The client has decided to disconnect, either
630 because it no longer needs service, the connection is inactive
631 (Section 6.4.1), or because the server sent it a Retry Delay
632 message (Section 6.6.1). The client closes the connection
633 gracefully (Section 5.3).
634
635 Reconnect: The client disconnected as a result of a server shutdown.
636 The client either waits for the server-specified Retry Delay to
637 expire (Section 6.6.3) or else contacts a different server
638 instance. If the client no longer needs service, it does not
639 reconnect.
640
641 Forcibly Abort: The client or server detected a protocol error, and
642 further communication would have undefined behavior. The client
643 or server forcibly aborts the connection (Section 5.3).
644
645 Abort Reconnect Wait: The client has forcibly aborted the connection
646 but still needs service. Or, the server forcibly aborted the
647 connection, but the client still needs service. The client either
648 connects to a different service instance (Section 9.1) or waits to
649 reconnect (Section 6.6.3.1).
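
   The establishment phases in the list above can be sketched as a
   small Python function, seen from the client's side.  This is an
   illustrative model only: the enum, the function, and its parameters
   are hypothetical names, and the later phases (shutdown, reconnect,
   abort) are omitted for brevity.

```python
from enum import Enum, auto
from typing import Optional

ESTABLISHMENT_WAIT_SECONDS = 30.0  # how long a client waits for a response
NOERROR = 0


class Phase(Enum):
    CONNECTED_SESSIONLESS = auto()
    ESTABLISHMENT_IN_PROGRESS = auto()
    ESTABLISHMENT_TIMEOUT = auto()
    ESTABLISHMENT_FAILED = auto()
    SESSION_ESTABLISHED = auto()


def establishment_phase(request_sent_at: Optional[float],
                        now: float,
                        response_rcode: Optional[int]) -> Phase:
    """Derive the client's phase from what it has sent and received.

    request_sent_at is None if no DSO request has been sent yet;
    response_rcode is None if no DSO response has arrived yet.
    """
    if request_sent_at is None:
        return Phase.CONNECTED_SESSIONLESS
    if response_rcode is not None:
        if response_rcode == NOERROR:
            return Phase.SESSION_ESTABLISHED
        return Phase.ESTABLISHMENT_FAILED  # any nonzero RCODE
    if now - request_sent_at >= ESTABLISHMENT_WAIT_SECONDS:
        # The client forcibly aborts the connection in this phase.
        return Phase.ESTABLISHMENT_TIMEOUT
    return Phase.ESTABLISHMENT_IN_PROGRESS
```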
650
5.1.  DSO Session Establishment
652
653 In order for a session to be established between a client and a
654 server, the client must first establish a connection to the server
655 using an applicable transport (see Section 4.2).
656
657 In some environments, it may be known in advance by external means
658 that both client and server support DSO, and in these cases either
659 client or server may initiate DSO messages at any time. In this
660 case, the session is established as soon as the connection is
661 established; this is referred to as implicit DSO Session
662 establishment.
663
664 However, in the typical case a server will not know in advance
665 whether a client supports DSO, so in general, unless it is known in
666 advance by other means that a client does support DSO, a server MUST
667 NOT initiate DSO request messages or DSO unidirectional messages
668 until a DSO Session has been mutually established by at least one
669 successful DSO request/response exchange initiated by the client, as
679 described below. This is referred to as explicit DSO Session
680 establishment.
681
682 Until a DSO Session has been implicitly or explicitly established, a
683 client MUST NOT initiate DSO unidirectional messages.
684
685 A DSO Session is established over a connection by the client sending
686 a DSO request message, such as a DSO Keepalive request message
687 (Section 7.1), and receiving a response with a matching MESSAGE ID,
688 and RCODE set to NOERROR (0), indicating that the DSO request was
689 successful.
690
691 Some DSO messages are permitted as early data (Section 11.1). Others
692 are not. Unidirectional messages are never permitted as early data,
693 unless an implicit DSO Session exists.
694
695 If a server receives a DSO message in early data whose Primary TLV is
696 not permitted to appear in early data, the server MUST forcibly abort
697 the connection. If a client receives a DSO message in early data,
698 and there is no implicit DSO Session, the client MUST forcibly abort
699 the connection. This can only be enforced on TLS connections;
700 therefore, servers MUST NOT enable TCP Fast Open (TFO) when listening
701 for a connection that does not require TLS.
702
5.1.1.  DSO Session Establishment Failure
704
705 If the response RCODE is set to NOTIMP (4), or in practice any value
706 other than NOERROR (0) or DSOTYPENI (defined below), then the client
707 MUST assume that the server does not implement DSO at all. In this
708 case, the client is permitted to continue sending DNS messages on
709 that connection but MUST NOT issue further DSO messages on that
710 connection.
711
712 If the RCODE in the response is set to DSOTYPENI ("DSO-TYPE Not
713 Implemented"; RCODE 11), this indicates that the server does support
714 DSO but does not implement the DSO-TYPE of the Primary TLV in this
715 DSO request message. A server implementing DSO MUST NOT return
716 DSOTYPENI for a DSO Keepalive request message because the Keepalive
717 TLV is mandatory to implement. But in the future, if a client
718 attempts to establish a DSO Session using a response-requiring DSO
719 request message using some newly-defined DSO-TYPE that the server
720 does not understand, that would result in a DSOTYPENI response. If
721 the server returns DSOTYPENI, then a DSO Session is not considered
722 established. The client is, however, permitted to continue sending
723 DNS messages on the connection, including other DSO messages such as
724 the DSO Keepalive, which may result in a successful NOERROR response,
725 yielding the establishment of a DSO Session.
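
   The three RCODE cases described above can be summarized in a short
   Python sketch.  The enum and function names are hypothetical; only
   the RCODE values NOERROR (0) and DSOTYPENI (11) are taken from the
   text.

```python
from enum import Enum

NOERROR = 0
DSOTYPENI = 11  # "DSO-TYPE Not Implemented"


class Outcome(Enum):
    ESTABLISHED = "DSO Session established"
    TYPE_NOT_IMPLEMENTED = "DSO supported, session not yet established"
    DSO_UNSUPPORTED = "server assumed not to implement DSO"


def interpret_establishment_rcode(rcode: int) -> Outcome:
    """Map the RCODE of the first DSO response to an outcome."""
    if rcode == NOERROR:
        return Outcome.ESTABLISHED
    if rcode == DSOTYPENI:
        return Outcome.TYPE_NOT_IMPLEMENTED
    # NOTIMP (4), or in practice any other value.
    return Outcome.DSO_UNSUPPORTED


def may_send_more_dso(outcome: Outcome) -> bool:
    """Whether further DSO messages are allowed on this connection."""
    return outcome is not Outcome.DSO_UNSUPPORTED
```

   After a DSOTYPENI response the client may, for example, retry with
   a DSO Keepalive request, which a DSO-capable server is required to
   implement.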
726
735 When a DSO message is received by an existing DNS server that doesn't
736 recognize the DSO OPCODE, two other possible outcomes exist: the
737 server might send no response to the DSO message, or the server might
738 drop the connection.
739
740 If the server sends no response to the DSO message, the client SHOULD
741 wait 30 seconds, after which time the server will be assumed not to
742 support DSO. If the server doesn't respond within 30 seconds, it can
743 be assumed that it is not going to respond; this leaves it in an
744 unspecified state: there is no specification requiring that a
745 response be sent to an unknown message, but there is also no
746 specification stating what state the server is in if no response is
747 sent. Therefore the client MUST forcibly abort the connection to the
748 server. The client MAY reconnect, but not use DSO, if appropriate
749 (Section 6.6.3.1). By disconnecting and reconnecting, the client
750 ensures that the server is in a known state before sending any
751 subsequent requests.
752
753 If the server drops the connection the client SHOULD mark that
754 service instance as not supporting DSO, and not attempt a DSO
755 connection for some period of time (at least an hour) after the
756 failed attempt. The client MAY reconnect but not use DSO, if
757 appropriate (Section 6.6.3.2).
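
   One possible way for a client to remember that a service instance
   dropped the connection is a small hold-off cache, sketched below in
   Python.  Nothing here is mandated by the text beyond the "at least
   an hour" guidance: the class name, the use of a string key to
   identify a service instance, and the injectable clock are all
   assumptions made for illustration.

```python
import time
from typing import Callable, Dict

DSO_RETRY_HOLDOFF = 3600.0  # seconds; the text suggests at least an hour


class DsoSupportCache:
    """Remembers service instances that appear not to support DSO."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._no_dso_until: Dict[str, float] = {}

    def mark_unsupported(self, instance: str) -> None:
        """Record that a DSO attempt made the server drop the connection."""
        self._no_dso_until[instance] = self._clock() + DSO_RETRY_HOLDOFF

    def may_attempt_dso(self, instance: str) -> bool:
        """Return True if a DSO attempt to this instance is reasonable now."""
        until = self._no_dso_until.get(instance)
        if until is None:
            return True
        if self._clock() >= until:
            del self._no_dso_until[instance]  # hold-off period has expired
            return True
        return False
```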
758
5.1.2.  DSO Session Establishment Success
760
761 When the server receives a DSO request message from a client, and
762 transmits a successful NOERROR response to that request, the server
763 considers the DSO Session established.
764
765 When the client receives the server's NOERROR response to its DSO
766 request message, the client considers the DSO Session established.
767
768 Once a DSO Session has been established, either end may unilaterally
769 send appropriate DSO messages at any time, and therefore either
770 client or server may be the initiator of a message.
771
5.2.  Operations after DSO Session Establishment
773
774 Once a DSO Session has been established, clients and servers should
775 behave as described in this specification with regard to inactivity
776 timeouts and session termination, not as previously prescribed in the
777 earlier specification for DNS-over-TCP [RFC7766].
778
779 Because a server that supports DNS Stateful Operations MUST return an
780 RCODE of "NOERROR" when it receives a Keepalive TLV DSO request
781 message, the Keepalive TLV is an ideal candidate for use in
782 establishing a DSO Session. Any other option that can only succeed
791 when sent to a server of the desired kind is also a good candidate
792 for use in establishing a DSO Session. For clients that implement
793 only the DSO-TYPEs defined in this base specification, sending a
794 Keepalive TLV is the only DSO request message they have available to
795 initiate a DSO Session. Even for clients that do implement other
796 future DSO-TYPEs, for simplicity they MAY elect to always send an
797 initial DSO Keepalive request message as their way of initiating a
798 DSO Session. A future definition of a new response-requiring DSO-
799 TYPE gives implementers the option of using that new DSO-TYPE if they
800 wish, but does not change the fact that sending a Keepalive TLV
801 remains a valid way of initiating a DSO Session.
802
5.3.  DSO Session Termination
804
   A DSO Session is terminated when the underlying connection is closed.
   DSO Sessions are "closed gracefully" when the server closes a DSO
   Session because it is overloaded, when the client closes the DSO
   Session because it is done, or when the client closes the DSO
   Session because it is inactive. DSO Sessions are "forcibly aborted"
   when either the client or server closes the connection because of a
   protocol error.
812
813 o Where this specification says "close gracefully", it means sending
814 a TLS close_notify (if TLS is in use) followed by a TCP FIN, or
815 the equivalent for other protocols. Where this specification
816 requires a connection to be closed gracefully, the requirement to
817 initiate that graceful close is placed on the client in order to
818 place the burden of TCP's TIME-WAIT state on the client rather
819 than the server.
820
821 o Where this specification says "forcibly abort", it means sending a
822 TCP RST or the equivalent for other protocols. In the BSD Sockets
823 API, this is achieved by setting the SO_LINGER option to zero
824 before closing the socket.
825
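   The two ways of closing a connection described above might look as
   follows with Python's socket module. This is a sketch under stated
   assumptions: the function names are illustrative, and the packed
   struct linger layout (two C ints) is assumed to match POSIX systems
   and may differ on other platforms.

```python
import socket
import struct

# struct linger { int l_onoff; int l_linger; } -- assumed POSIX layout.
# l_onoff=1 with l_linger=0 makes close() send RST instead of FIN.
LINGER_ZERO = struct.pack("ii", 1, 0)


def forcibly_abort(sock):
    """Close the socket so that the TCP stack resets the connection."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_LINGER, LINGER_ZERO)
    sock.close()


def close_gracefully(sock):
    """Send TLS close_notify (if TLS is in use), then a TCP FIN, then close."""
    if hasattr(sock, "unwrap"):
        # ssl.SSLSocket.unwrap() performs the TLS shutdown handshake and
        # returns the underlying plain socket.
        sock = sock.unwrap()
    try:
        sock.shutdown(socket.SHUT_WR)   # half-close: sends FIN
    except OSError:
        pass                            # peer already gone or never connected
    sock.close()
```
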
5.3.1.  Handling Protocol Errors
827
828 In protocol implementation, there are generally two kinds of errors
829 that software writers have to deal with. The first is situations
830 that arise due to factors in the environment, such as temporary loss
831 of connectivity. While undesirable, these situations do not indicate
832 a flaw in the software and are situations that software should
833 generally be able to recover from.
834
835 The second is situations that should never happen when communicating
836 with a compliant DSO implementation. If they do happen, they
837 indicate a serious flaw in the protocol implementation beyond what is
838 reasonable to expect software to recover from. This document
847 describes this latter form of error condition as a "fatal error" and
848 specifies that an implementation encountering a fatal error condition
849 "MUST forcibly abort the connection immediately".
850
5.4.  Message Format
852
853 A DSO message begins with the standard twelve-byte DNS message header
854 [RFC1035] with the OPCODE field set to the DSO OPCODE (6). However,
855 unlike standard DNS messages, the question section, answer section,
856 authority records section, and additional records sections are not
857 present. The corresponding count fields (QDCOUNT, ANCOUNT, NSCOUNT,
858 ARCOUNT) MUST be set to zero on transmission.
859
860 If a DSO message is received where any of the count fields are not
861 zero, then a FORMERR MUST be returned.
862
                                                1   1   1   1   1   1
        0   1   2   3   4   5   6   7   8   9   0   1   2   3   4   5
      +---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
      |                          MESSAGE ID                           |
      +---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
      |QR |  OPCODE (6)   |             Z             |     RCODE     |
      +---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
      |                    QDCOUNT (MUST be zero)                     |
      +---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
      |                    ANCOUNT (MUST be zero)                     |
      +---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
      |                    NSCOUNT (MUST be zero)                     |
      +---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
      |                    ARCOUNT (MUST be zero)                     |
      +---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
      |                                                               |
      /                           DSO Data                            /
      /                                                               /
      +---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
882
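   As a non-normative illustration of the layout above, the following
   Python sketch packs and parses the twelve-byte header. The names
   pack_dso_header, parse_dso_header, and FormatError are assumptions
   made for this example; FormatError merely marks the case in which
   the text above requires a FORMERR to be returned.

```python
import struct

DSO_OPCODE = 6
HEADER = struct.Struct("!HHHHHH")   # ID, flags, QDCOUNT, ANCOUNT, NSCOUNT, ARCOUNT


class FormatError(Exception):
    """Signals that the receiver should answer with FORMERR."""


def pack_dso_header(message_id, qr=0, rcode=0):
    # QR is the top bit, OPCODE the next four bits, RCODE the low four bits.
    # The seven Z bits in between are left at zero.
    flags = (qr << 15) | (DSO_OPCODE << 11) | (rcode & 0xF)
    return HEADER.pack(message_id, flags, 0, 0, 0, 0)


def parse_dso_header(data):
    """Return (message_id, qr, rcode, dso_data) for a DSO message."""
    if len(data) < HEADER.size:
        raise ValueError("message shorter than the 12-byte header")
    message_id, flags, qd, an, ns, ar = HEADER.unpack_from(data)
    if (flags >> 11) & 0xF != DSO_OPCODE:
        raise ValueError("not a DSO message")
    if qd or an or ns or ar:
        raise FormatError("count fields must be zero in a DSO message")
    # The Z bits (mask 0x07F0) are deliberately not examined on reception.
    return message_id, flags >> 15, flags & 0xF, data[HEADER.size:]
```

   For instance, pack_dso_header(0x1234) yields the bytes 12 34 30 00
   followed by eight zero bytes, since OPCODE 6 shifted into place is
   0x3000.
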
5.4.1.  DNS Header Fields in DSO Messages
904
905 In a DSO unidirectional message, the MESSAGE ID field MUST be set to
906 zero. In a DSO request message, the MESSAGE ID field MUST be set to
907 a unique nonzero value that the initiator is not currently using for
908 any other active operation on this connection. For the purposes
909 here, a MESSAGE ID is in use in this DSO Session if the initiator has
910 used it in a DSO request message for which it is still awaiting a
911 response, or if the client has used it to set up a long-lived
912 operation that has not yet been canceled. For example, a long-lived
913 operation could be a Push Notification subscription [Push] or a
914 Discovery Relay interface subscription [Relay].
915
916 Whether a message is a DSO request message or a DSO unidirectional
917 message is determined only by the specification for the Primary TLV.
918 An acknowledgment cannot be requested by including a nonzero MESSAGE
919 ID in a message that is required according to its Primary TLV to be
920 unidirectional. Nor can an acknowledgment be prevented by sending a
921 MESSAGE ID of zero in a message that is required to be a DSO request
922 message according to its Primary TLV. A responder that receives
923 either such malformed message MUST treat it as a fatal error and
924 forcibly abort the connection immediately.
925
926 In a DSO request message or DSO unidirectional message, the DNS
927 Header Query/Response (QR) bit MUST be zero (QR=0). If the QR bit is
928 not zero, the message is not a DSO request or DSO unidirectional
929 message.
930
931 In a DSO response message, the DNS Header QR bit MUST be one (QR=1).
932 If the QR bit is not one, the message is not a DSO response message.
933
934 In a DSO response message (QR=1), the MESSAGE ID field MUST NOT be
935 zero, and MUST contain a copy of the value of the (nonzero) MESSAGE
936 ID field in the DSO request message being responded to. If a DSO
937 response message (QR=1) is received where the MESSAGE ID is zero,
938 this is a fatal error and the recipient MUST forcibly abort the
939 connection immediately.
940
941 The DNS Header OPCODE field holds the DSO OPCODE value (6).
942
943 The Z bits are currently unused in DSO messages; in both DSO request
944 messages and DSO responses, the Z bits MUST be set to zero (0) on
945 transmission and MUST be ignored on reception.
946
947 In a DSO request message (QR=0), the RCODE is set according to the
948 definition of the request. For example, in a Retry Delay message
949 (Section 6.6.1), the RCODE indicates the reason for termination.
950 However, in most DSO request messages (QR=0), except where clearly
959 specified otherwise, the RCODE is set to zero on transmission, and
960 silently ignored on reception.
961
962 The RCODE value in a response message (QR=1) may be one of the
963 following values:
964
   +------+-----------+------------------------------------------------+
   | Code | Mnemonic  | Description                                    |
   +------+-----------+------------------------------------------------+
   | 0    | NOERROR   | Operation processed successfully               |
   |      |           |                                                |
   | 1    | FORMERR   | Format error                                   |
   |      |           |                                                |
   | 2    | SERVFAIL  | Server failed to process DSO request message   |
   |      |           | due to a problem with the server               |
   |      |           |                                                |
   | 4    | NOTIMP    | DSO not supported                              |
   |      |           |                                                |
   | 5    | REFUSED   | Operation declined for policy reasons          |
   |      |           |                                                |
   | 11   | DSOTYPENI | Primary TLV's DSO-Type is not implemented      |
   +------+-----------+------------------------------------------------+
981
982 Use of the above RCODEs is likely to be common in DSO but does not
983 preclude the definition and use of other codes in future documents
984 that make use of DSO.
985
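   An implementation might represent the RCODE values listed above as
   an enumeration. The following is a minimal sketch; the names
   DsoRcode and describe_rcode are illustrative assumptions, and the
   fallback label for unlisted values is a convention of this example
   only.

```python
from enum import IntEnum


class DsoRcode(IntEnum):
    NOERROR = 0
    FORMERR = 1
    SERVFAIL = 2
    NOTIMP = 4
    REFUSED = 5
    DSOTYPENI = 11


def describe_rcode(value):
    """Return the mnemonic for a listed RCODE, or a generic label otherwise."""
    try:
        return DsoRcode(value).name
    except ValueError:
        # Other codes are not precluded; a future document may define them.
        return f"RCODE{value}"
```
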
986 If a document defining a new DSO-TYPE makes use of response codes not
987 defined here, then that document MUST specify the specific
988 interpretation of those RCODE values in the context of that new DSO
989 TLV.
990
991 The RCODE field is followed by the four zero-valued count fields,
992 followed by the DSO Data.
993
5.4.2.  DSO Data
995
996 The standard twelve-byte DNS message header with its zero-valued
997 count fields is followed by the DSO Data, expressed using TLV syntax,
998 as described in Section 5.4.4.
999
1000 A DSO request message or DSO unidirectional message MUST contain at
1001 least one TLV. The first TLV in a DSO request message or DSO
1002 unidirectional message is referred to as the "Primary TLV" and
1003 determines the nature of the operation being performed, including
1004 whether it is a DSO request or a DSO unidirectional operation. In
1005 some cases, it may be appropriate to include other TLVs in a DSO
1006 request message or DSO unidirectional message, such as the DSO
1015 Encryption Padding TLV (Section 7.3). Additional TLVs follow the
1016 Primary TLV. Additional TLVs are not limited to what is defined in
1017 this document. New Additional TLVs may be defined in the future.
1018 Their definitions will describe when their use is appropriate.
1019
1020 An unrecognized Primary TLV results in a DSOTYPENI error response.
1021 Unrecognized Additional TLVs are silently ignored, as described in
1022 Sections 5.4.5 and 8.2.
1023
1024 A DSO response message may contain no TLVs, or may contain one or
1025 more TLVs, appropriate to the information being communicated.
1026
1027 Any TLVs with the same DSO-TYPE as the Primary TLV from the
1028 corresponding DSO request message are Response Primary TLV(s) and
1029 MUST appear first in a DSO response message. A DSO response message
1030 may contain multiple Response Primary TLVs, or a single Response
1031 Primary TLV, or in some cases, no Response Primary TLV. A Response
1032 Primary TLV is not required; for most DSO operations the MESSAGE ID
1033 field in the DNS message header is sufficient to identify the DSO
1034 request message to which a particular response message relates.
1035
1036 Any other TLVs in a DSO response message are Response Additional
1037 TLVs, such as the DSO Encryption Padding TLV (Section 7.3). Response
1038 Additional TLVs follow the Response Primary TLV(s), if present.
1039 Response Additional TLVs are not limited to what is defined in this
1040 document. New Response Additional TLVs may be defined in the future.
1041 Their definitions will describe when their use is appropriate.
1042 Unrecognized Response Additional TLVs are silently ignored, as
1043 described in Sections 5.4.5 and 8.2.
1044
1045 The specification for each DSO TLV determines what TLVs are required
1046 in a response to a DSO request message using that TLV. If a DSO
1047 response is received for an operation where the specification
1048 requires that the response carry a particular TLV or TLVs, and the
1049 required TLV(s) are not present, then this is a fatal error and the
1050 recipient of the defective response message MUST forcibly abort the
1051 connection immediately. Similarly, if more than the specified number
1052 of instances of a given TLV are present, this is a fatal error and
1053 the recipient of the defective response message MUST forcibly abort
1054 the connection immediately.
1055
5.4.3.  DSO Unidirectional Messages
1072
1073 It is anticipated that most DSO operations will be specified to use
1074 DSO request messages, which generate corresponding DSO responses. In
1075 some specialized high-traffic use cases, it may be appropriate to
1076 specify DSO unidirectional messages. DSO unidirectional messages can
1077 be more efficient on the network because they don't generate a stream
1078 of corresponding reply messages. Using DSO unidirectional messages
1079 can also simplify software in some cases by removing the need for an
1080 initiator to maintain state while it waits to receive replies it
1081 doesn't care about. When the specification for a particular TLV used
1082 as a Primary TLV (i.e., first) in an outgoing DSO request message
1083 (i.e., QR=0) states that a message is to be unidirectional, the
1084 MESSAGE ID field MUST be set to zero and the receiver MUST NOT
1085 generate any response message corresponding to that DSO
1086 unidirectional message.
1087
1088 The previous point, that the receiver MUST NOT generate responses to
1089 DSO unidirectional messages, applies even in the case of errors.
1090
1091 When a DSO message is received where both the QR bit and the MESSAGE
1092 ID field are zero, the receiver MUST NOT generate any response. For
1093 example, if the DSO-TYPE in the Primary TLV is unrecognized, then a
1094 DSOTYPENI error MUST NOT be returned; instead, the receiver MUST
1095 forcibly abort the connection immediately.
1096
1097 DSO unidirectional messages MUST NOT be used "speculatively" in cases
1098 where the sender doesn't know if the receiver supports the Primary
1099 TLV in the message because there is no way to receive any response to
1100 indicate success or failure. DSO unidirectional messages are only
1101 appropriate in cases where the sender already knows that the receiver
1102 supports and wishes to receive these messages.
1103
1104 For example, after a client has subscribed for Push Notifications
1105 [Push], the subsequent event notifications are then sent as DSO
1106 unidirectional messages. This is appropriate because the client
1107 initiated the message stream by virtue of its Push Notification
1108 subscription, thereby indicating its support of Push Notifications
1109 and its desire to receive those notifications.
1110
1111 Similarly, after a Discovery Relay client has subscribed to receive
1112 inbound multicast DNS (mDNS) [RFC6762] traffic from a Discovery
1113 Relay, the subsequent stream of received packets is then sent using
1114 DSO unidirectional messages. This is appropriate because the client
1115 initiated the message stream by virtue of its Discovery Relay link
1116 subscription, thereby indicating its support of Discovery Relay and
1117 its desire to receive inbound mDNS packets over that DSO Session
1118 [Relay].
1119
5.4.4.  TLV Syntax
1128
1129 All TLVs, whether used as "Primary", "Additional", "Response
1130 Primary", or "Response Additional", use the same encoding syntax.
1131
1132 A specification that defines a new TLV must specify whether the DSO-
1133 TYPE can be used as a Primary TLV, and whether the DSO-TYPE can be
1134 used as an Additional TLV. Some DSO-TYPEs are dual-purpose and can
1135 be used as Primary TLVs in some messages, and in other messages as
1136 Additional TLVs. The specification for a DSO-TYPE must also state
1137 whether, when used as the Primary (i.e., first) TLV in a DSO message
1138 (i.e., QR=0), that DSO message is unidirectional, or is a DSO request
1139 message that requires a response.
1140
   If a DSO request message requires a response, the specification must
   also state which TLVs, if any, are to be included in the response and
   how many instances of each of the TLVs are allowed. The Primary TLV
   may or may not be contained in the response, depending on what is
   specified for that TLV. If multiple instances of the Primary TLV are
   allowed, the specification should clearly describe how they should
   be processed.
1148
                                                1   1   1   1   1   1
        0   1   2   3   4   5   6   7   8   9   0   1   2   3   4   5
      +---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
      |                           DSO-TYPE                            |
      +---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
      |                          DSO-LENGTH                           |
      +---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
      |                                                               |
      /                           DSO-DATA                            /
      /                                                               /
      +---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
1160
1161 DSO-TYPE: A 16-bit unsigned integer, in network (big endian) byte
1162 order, giving the DSO-TYPE of the current DSO TLV per the IANA
1163 "DSO Type Codes" registry.
1164
1165 DSO-LENGTH: A 16-bit unsigned integer, in network (big endian) byte
1166 order, giving the size in bytes of the DSO-DATA.
1167
1168 DSO-DATA: Type-code specific format. The generic DSO machinery
1169 treats the DSO-DATA as an opaque "blob" without attempting to
1170 interpret it. Interpretation of the meaning of the DSO-DATA for a
1171 particular DSO-TYPE is the responsibility of the software that
1172 implements that DSO-TYPE.
1173
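   The TLV encoding above can be illustrated with a short,
   non-normative Python sketch. The function names encode_tlv and
   decode_tlvs are assumptions of this example, as is the choice to
   raise ValueError for truncated input; the sketch treats DSO-DATA as
   an opaque byte string, in keeping with the generic machinery
   described above.

```python
import struct

TLV_HEADER = struct.Struct("!HH")   # DSO-TYPE, DSO-LENGTH (network byte order)


def encode_tlv(dso_type, data=b""):
    if len(data) > 0xFFFF:
        raise ValueError("DSO-DATA does not fit in a 16-bit DSO-LENGTH")
    return TLV_HEADER.pack(dso_type, len(data)) + data


def decode_tlvs(dso_data):
    """Split DSO Data into a list of (dso_type, data) pairs, in wire order."""
    tlvs = []
    offset = 0
    while offset < len(dso_data):
        if len(dso_data) - offset < TLV_HEADER.size:
            raise ValueError("truncated TLV header")
        dso_type, length = TLV_HEADER.unpack_from(dso_data, offset)
        offset += TLV_HEADER.size
        if len(dso_data) - offset < length:
            raise ValueError("DSO-LENGTH runs past the end of the message")
        tlvs.append((dso_type, dso_data[offset:offset + length]))
        offset += length
    return tlvs
```

   The first pair returned by decode_tlvs for a request or
   unidirectional message corresponds to the Primary TLV; any later
   pairs are Additional TLVs.
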
5.4.5.  Unrecognized TLVs
1184
1185 If a DSO request message is received containing an unrecognized
1186 Primary TLV, with a nonzero MESSAGE ID (indicating that a response is
1187 expected), then the receiver MUST send an error response with a
1188 matching MESSAGE ID, and RCODE DSOTYPENI. The error response MUST
1189 NOT contain a copy of the unrecognized Primary TLV.
1190
1191 If a DSO unidirectional message is received containing both an
1192 unrecognized Primary TLV and a zero MESSAGE ID (indicating that no
1193 response is expected), then this is a fatal error and the recipient
1194 MUST forcibly abort the connection immediately.
1195
1196 If a DSO request message or DSO unidirectional message is received
1197 where the Primary TLV is recognized, containing one or more
1198 unrecognized Additional TLVs, the unrecognized Additional TLVs MUST
1199 be silently ignored, and the remainder of the message is interpreted
1200 and handled as if the unrecognized parts were not present.
1201
   Similarly, if a DSO response message is received containing one or
   more unrecognized TLVs, the unrecognized TLVs MUST be silently
   ignored, and the remainder of the message is interpreted and handled
   as if the unrecognized parts were not present.
1206
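   The rules in this section can be summarized as a small decision
   function. The following Python sketch is illustrative only: the
   function names, the action labels, and the representation of known
   DSO-TYPEs as sets of integers are all assumptions made for this
   example.

```python
def triage_request(message_id, primary_type, additional_types,
                   known_primary, known_additional):
    """Triage a received QR=0 DSO message by TLV recognition.

    Returns (action, additional_to_process), where action is one of the
    illustrative labels "process", "reply_dsotypeni", or "forcibly_abort".
    """
    if primary_type not in known_primary:
        if message_id != 0:
            # Request: answer with RCODE DSOTYPENI and a matching MESSAGE ID;
            # the error response does not copy the unrecognized Primary TLV.
            return "reply_dsotypeni", []
        # Unidirectional: no response is permitted, so this is a fatal error.
        return "forcibly_abort", []
    # Recognized Primary TLV: unrecognized Additional TLVs are silently dropped.
    kept = [t for t in additional_types if t in known_additional]
    return "process", kept


def filter_response_tlvs(tlv_types, known_types):
    """In a response (QR=1), silently drop every unrecognized TLV."""
    return [t for t in tlv_types if t in known_types]
```
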
5.4.6.  EDNS(0) and TSIG
1240
1241 Since the ARCOUNT field MUST be zero, a DSO message cannot contain a
1242 valid EDNS(0) option in the additional records section. If
1243 functionality provided by current or future EDNS(0) options is
1244 desired for DSO messages, one or more new DSO TLVs need to be defined
1245 to carry the necessary information.
1246
1247 For example, the EDNS(0) Padding Option [RFC7830] used for security
1248 purposes is not permitted in a DSO message, so if message padding is
1249 desired for DSO messages, then the DSO Encryption Padding TLV
1250 described in Section 7.3 MUST be used.
1251
1252 A DSO message can't contain a TSIG record because a TSIG record is
1253 included in the additional section of the message, which would mean
1254 that ARCOUNT would be greater than zero. DSO messages are required
1255 to have an ARCOUNT of zero. Therefore, if use of signatures with DSO
1256 messages becomes necessary in the future, a new DSO TLV would have to
1257 be defined to perform this function.
1258
1259 Note, however, that while DSO *messages* cannot include EDNS(0) or
1260 TSIG records, a DSO *session* is typically used to carry a whole
1261 series of DNS messages of different kinds, including DSO messages and
1262 other DNS message types like Query [RFC1034] [RFC1035] and Update
1263 [RFC2136]. These messages can carry EDNS(0) and TSIG records.
1264
1265 Although messages may contain other EDNS(0) options as appropriate,
1266 this specification explicitly prohibits use of the edns-tcp-keepalive
1267 EDNS(0) Option [RFC7828] in *any* messages sent on a DSO Session
1268 (because it is obsoleted by the functionality provided by the DSO
1269 Keepalive operation). If any message sent on a DSO Session contains
1270 an edns-tcp-keepalive EDNS(0) Option, this is a fatal error and the
1271 recipient of the defective message MUST forcibly abort the connection
1272 immediately.
1273
5.5.  Message Handling
1296
1297 As described in Section 5.4.1, whether an outgoing DSO message with
1298 the QR bit in the DNS header set to zero is a DSO request or a DSO
1299 unidirectional message is determined by the specification for the
1300 Primary TLV, which in turn determines whether the MESSAGE ID field in
1301 that outgoing message will be zero or nonzero.
1302
1303 Every DSO message with the QR bit in the DNS header set to zero and a
1304 nonzero MESSAGE ID field is a DSO request message, and MUST elicit a
1305 corresponding response, with the QR bit in the DNS header set to one
1306 and the MESSAGE ID field set to the value given in the corresponding
1307 DSO request message.
1308
1309 Valid DSO request messages sent by the client with a nonzero MESSAGE
1310 ID field elicit a response from the server, and valid DSO request
1311 messages sent by the server with a nonzero MESSAGE ID field elicit a
1312 response from the client.
1313
1314 Every DSO message with both the QR bit in the DNS header and the
1315 MESSAGE ID field set to zero is a DSO unidirectional message and MUST
1316 NOT elicit a response.
1317
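   The classification of received DSO messages by QR bit and MESSAGE
   ID, together with the consistency check against the Primary TLV from
   Section 5.4.1, might be sketched as follows. The function names and
   string labels are hypothetical and serve only to illustrate the
   rules above.

```python
def classify_dso_message(qr, message_id):
    """Classify a DSO message from its QR bit and MESSAGE ID alone."""
    if qr == 0:
        return "request" if message_id != 0 else "unidirectional"
    if message_id == 0:
        # A response with a zero MESSAGE ID is a fatal error.
        return "fatal_error"
    return "response"


def matches_primary_tlv(kind, primary_is_unidirectional):
    """Check the classification against what the Primary TLV requires.

    A False result means the message is malformed, which the receiver
    treats as a fatal error.
    """
    if kind == "request":
        return not primary_is_unidirectional
    if kind == "unidirectional":
        return primary_is_unidirectional
    return True
```
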
5.5.1.  Delayed Acknowledgement Management
1352
   Most good TCP implementations employ a delayed acknowledgement timer
   to provide more efficient use of the network and better performance.
1356
1357 With a bidirectional exchange over TCP, such as with a DSO request
1358 message, the operating system TCP implementation waits for the
1359 application-layer client software to generate the corresponding DSO
1360 response message. The TCP implementation can then send a single
1361 combined packet containing the TCP acknowledgement, the TCP window
1362 update, and the application-generated DSO response message. This is
1363 more efficient than sending three separate packets, as would occur if
1364 the TCP packet containing the DSO request were acknowledged
1365 immediately.
1366
1367 With a DSO unidirectional message or DSO response message, there is
1368 no corresponding application-generated DSO response message, and
1369 consequently, no hint to the transport protocol about when it should
1370 send its acknowledgement and window update.
1371
1372 Some networking APIs provide a mechanism that allows the application-
1373 layer client software to signal to the transport protocol that no
1374 response will be forthcoming (in effect it can be thought of as a
1375 zero-length "empty" write). Where available in the networking API
1376 being used, the recipient of a DSO unidirectional message or DSO
1377 response message, having parsed and interpreted the message, SHOULD
1378 then use this mechanism provided by the networking API to signal that
1379 no response for this message will be forthcoming. The TCP
1380 implementation can then go ahead and send its acknowledgement and
1381 window update without further delay. See Section 9.5 for further
1382 discussion of why this is important.
1383
5.5.2.  MESSAGE ID Namespaces
1408
   The namespaces of 16-bit MESSAGE IDs are independent in each
   direction. This means it is *not* an error for both client and
   server to send DSO request messages at the same time as each other,
   using the same MESSAGE ID, in different directions. This
   simplification is necessary in order for the protocol to be
   implementable. Requiring the client and server to coordinate with
   each other regarding allocation of new unique MESSAGE IDs would be
   infeasible, and it is also unnecessary. The value of the 16-bit
   MESSAGE ID combined with the identity of the initiator (client or
   server) is sufficient to unambiguously identify the operation in
   question. This can be thought of as a 17-bit message identifier
   space using message identifiers 0x00001-0x0FFFF for client-to-server
   DSO request messages, and 0x10001-0x1FFFF for server-to-client DSO
   request messages. The least-significant 16 bits are stored
   explicitly in the MESSAGE ID field of the DSO message, and the
   most-significant bit is implicit from the direction of the message.
1427
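   The 17-bit identifier space described above can be computed
   directly. In this minimal sketch, the function name operation_key
   and the boolean parameter from_server are illustrative assumptions.

```python
def operation_key(message_id, from_server):
    """Combine a request's MESSAGE ID with its direction into a 17-bit key."""
    if not 1 <= message_id <= 0xFFFF:
        raise ValueError("DSO request MESSAGE IDs are nonzero 16-bit values")
    # The most-significant (17th) bit is implied by the direction of the
    # request and never appears on the wire.
    return (0x10000 if from_server else 0) | message_id
```

   An endpoint that keeps a single table of operations for a DSO
   Session could use such a key to keep client-initiated and
   server-initiated operations with equal MESSAGE IDs apart.
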
1428 As described in Section 5.4.1, an initiator MUST NOT reuse a MESSAGE
1429 ID that it already has in use for an outstanding DSO request message
1430 (unless specified otherwise by the relevant specification for the
1431 DSO-TYPE in question). At the very least, this means that a MESSAGE
1432 ID can't be reused in a particular direction on a particular DSO
1433 Session while the initiator is waiting for a response to a previous
1434 DSO request message using that MESSAGE ID on that DSO Session (unless
1435 specified otherwise by the relevant specification for the DSO-TYPE in
1436 question), and for a long-lived operation, the MESSAGE ID for the
1437 operation can't be reused while that operation remains active.
1438
1439 If a client or server receives a response (QR=1) where the MESSAGE ID
1440 is zero, or is any other value that does not match the MESSAGE ID of
1441 any of its outstanding operations, this is a fatal error and the
1442 recipient MUST forcibly abort the connection immediately.
1443
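   One possible way for an initiator to track its MESSAGE IDs is
   sketched below. This is not the only workable design: the class
   MessageIdAllocator, the exception FatalProtocolError, and the
   long_lived parameter are assumptions of this example, and the
   sequential allocation strategy is chosen for simplicity only.

```python
class FatalProtocolError(Exception):
    """The caller should forcibly abort the connection."""


class MessageIdAllocator:
    """Tracks MESSAGE IDs this endpoint has in use as an initiator."""

    def __init__(self):
        self._in_use = set()
        self._next = 1

    def allocate(self):
        for _ in range(0xFFFF):
            candidate = self._next
            self._next = candidate % 0xFFFF + 1   # cycles 1..0xFFFF, never 0
            if candidate not in self._in_use:
                self._in_use.add(candidate)
                return candidate
        raise RuntimeError("all 65535 MESSAGE IDs are in use")

    def on_response(self, message_id, long_lived=False, rcode=0):
        """Validate a received response and release the ID when appropriate."""
        if message_id == 0 or message_id not in self._in_use:
            raise FatalProtocolError(
                "response does not match an outstanding operation")
        # A long-lived operation acknowledged with NOERROR keeps its ID
        # until the operation is canceled; everything else frees the ID.
        if not (long_lived and rcode == 0):
            self._in_use.discard(message_id)

    def release(self, message_id):
        """Free the ID of a long-lived operation the initiator has canceled."""
        self._in_use.discard(message_id)
```
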
1444 If a responder receives a DSO request message (QR=0) where the
1445 MESSAGE ID is not zero, the responder tracks request MESSAGE IDs, and
1446 the MESSAGE ID matches the MESSAGE ID of a DSO request message it
1447 received for which a response has not yet been sent, it MUST forcibly
1448 abort the connection immediately. This behavior is required to
1449 prevent a hypothetical attack that takes advantage of undefined
1450 behavior in this case. However, if the responder does not track
1451 MESSAGE IDs in this way, no such risk exists. Therefore, tracking
1452 MESSAGE IDs just to implement this sanity check is not required.
1453
5.5.3.  Error Responses
1464
1465 When a DSO request message is unsuccessful for some reason, the
1466 responder returns an error code to the initiator.
1467
1468 In the case of a server returning an error code to a client in
1469 response to an unsuccessful DSO request message, the server MAY
1470 choose to end the DSO Session or MAY choose to allow the DSO Session
1471 to remain open. For error conditions that only affect the single
1472 operation in question, the server SHOULD return an error response to
1473 the client and leave the DSO Session open for further operations.
1474
1475 For error conditions that are likely to make all operations
1476 unsuccessful in the immediate future, the server SHOULD return an
1477 error response to the client and then end the DSO Session by sending
1478 a Retry Delay message as described in Section 6.6.1.
1479
1480 Upon receiving an error response from the server, a client SHOULD NOT
1481 automatically close the DSO Session. An error relating to one
1482 particular operation on a DSO Session does not necessarily imply that
1483 all other operations on that DSO Session have also failed or that
1484 future operations will fail. The client should assume that the
1485 server will make its own decision about whether or not to end the DSO
1486 Session based on the server's determination of whether the error
1487 condition pertains to this particular operation or to any subsequent
1488 operations. If the server does not end the DSO Session by sending
1489 the client a Retry Delay message (Section 6.6.1), then the client
1490 SHOULD continue to use that DSO Session for subsequent operations.
1491
1492 When a DSO unidirectional message type is received (MESSAGE ID field
1493 is zero), the receiver should already be expecting this DSO message
1494 type. Section 5.4.5 describes the handling of unknown DSO message
1495 types. When a DSO unidirectional message of an unexpected type is
1496 received, the receiver SHOULD forcibly abort the connection. Whether
1497 the connection should be forcibly aborted for other internal errors
1498 processing the DSO unidirectional message is implementation dependent
1499 according to the severity of the error.
1500
5.6.  Responder-Initiated Operation Cancellation
1520
1521 This document, the base specification for DNS Stateful Operations,
1522 does not itself define any long-lived operations, but it defines a
1523 framework for supporting long-lived operations such as Push
1524 Notification subscriptions [Push] and Discovery Relay interface
1525 subscriptions [Relay].
1526
1527 Long-lived operations, if successful, will remain active until the
1528 initiator terminates the operation.
1529
1530 However, it is possible that a long-lived operation may be valid at
1531 the time it was initiated, but then a later change of circumstances
1532 may render that operation invalid. For example, a long-lived client
1533 operation may pertain to a name that the server is authoritative for,
1534 but then the server configuration is changed such that it is no
1535 longer authoritative for that name.
1536
1537 In such cases, instead of terminating the entire session, it may be
1538 desirable for the responder to be able to cancel selectively only
1539 those operations that have become invalid.
1540
1541 The responder performs this selective cancellation by sending a new
1542 DSO response message with the MESSAGE ID field containing the MESSAGE
1543 ID of the long-lived operation that is to be terminated (that it had
1544 previously acknowledged with a NOERROR RCODE) and the RCODE field of
1545 the new DSO response message giving the reason for cancellation.
1546
1547 After a DSO response message with nonzero RCODE has been sent, that
1548 operation has been terminated from the responder's point of view, and
1549 the responder sends no more messages relating to that operation.
1550
1551 After a DSO response message with nonzero RCODE has been received by
1552 the initiator, that operation has been terminated from the
1553 initiator's point of view, and the canceled operation's MESSAGE ID is
1554 now free for reuse.
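
   The initiator's side of this cancellation rule can be sketched as
   follows.  This is a minimal illustration, not part of the
   specification: the class name, the "pending" and "active" sets, and
   the returned strings are hypothetical, and the sketch assumes that
   every tracked request is a long-lived operation.

```python
from dataclasses import dataclass, field

NOERROR = 0  # DNS RCODE value for "no error"


@dataclass
class InitiatorOperations:
    """Hypothetical initiator-side table of operations, keyed by MESSAGE ID."""

    pending: set = field(default_factory=set)  # requests awaiting a first response
    active: set = field(default_factory=set)   # operations acknowledged with NOERROR

    def on_response(self, message_id: int, rcode: int) -> str:
        if message_id in self.pending:
            # First response to a request: NOERROR establishes the operation.
            self.pending.remove(message_id)
            if rcode == NOERROR:
                self.active.add(message_id)
                return "established"
            return "rejected"
        if message_id in self.active and rcode != NOERROR:
            # A later response with a nonzero RCODE cancels the operation;
            # the MESSAGE ID is then free for reuse.
            self.active.remove(message_id)
            return "canceled"
        return "unexpected"
```

   In this sketch, a second response carrying a nonzero RCODE for an
   already acknowledged MESSAGE ID removes the operation from the
   active set, after which the same MESSAGE ID may be allocated again.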
1555
6.  DSO Session Lifecycle and Timers

6.1.  DSO Session Initiation
1578
1579 A DSO Session begins as described in Section 5.1.
1580
1581 Once a DSO Session has been created, client or server may initiate as
1582 many DNS operations as they wish using the DSO Session.
1583
1584 When an initiator has multiple messages to send, it SHOULD NOT wait
1585 for each response before sending the next message.
1586
1587 A responder MUST act on messages in the order they are received, and
1588 SHOULD return responses to request messages as they become available.
1589 A responder SHOULD NOT delay sending responses for the purpose of
1590 delivering responses in the same order that the corresponding
1591 requests were received.
1592
1593 Section 6.2.1.1 of the DNS-over-TCP specification [RFC7766] specifies
1594 this in more detail.
1595
6.2.  DSO Session Timeouts
1632
1633 Two timeout values are associated with a DSO Session: the inactivity
1634 timeout and the keepalive interval. Both values are communicated in
1635 the same TLV, the Keepalive TLV (Section 7.1).
1636
1637 The first timeout value, the inactivity timeout, is the maximum time
1638 for which a client may speculatively keep an inactive DSO Session
1639 open in the expectation that it may have future requests to send to
1640 that server.
1641
1642 The second timeout value, the keepalive interval, is the maximum
1643 permitted interval between messages if the client wishes to keep the
1644 DSO Session alive.
1645
1646 The two timeout values are independent. The inactivity timeout may
1647 be shorter, the same, or longer than the keepalive interval, though
1648 in most cases the inactivity timeout is expected to be shorter than
1649 the keepalive interval.
1650
1651 A shorter inactivity timeout with a longer keepalive interval signals
1652 to the client that it should not speculatively keep an inactive DSO
1653 Session open for very long without reason, but when it does have an
1654 active reason to keep a DSO Session open, it doesn't need to be
1655 sending an aggressive level of DSO keepalive traffic to maintain that
1656 session. An example of this would be a client that has subscribed to
1657 DNS Push notifications. In this case, the client is not sending any
1658 traffic to the server, but the session is not inactive because there
1659 is an active request to the server to receive push notifications.
1660
1661 A longer inactivity timeout with a shorter keepalive interval signals
1662 to the client that it may speculatively keep an inactive DSO Session
1663 open for a long time, but to maintain that inactive DSO Session it
1664 should be sending a lot of DSO keepalive traffic. This configuration
1665 is expected to be less common.
1666
1667 In the usual case where the inactivity timeout is shorter than the
1668 keepalive interval, it is only when a client has a long-lived, low-
1669 traffic operation that the keepalive interval comes into play in
1670 order to ensure that a sufficient residual amount of traffic is
1671 generated to maintain NAT and firewall state, and to assure the
1672 client and server that they still have connectivity to each other.
1673
1674 On a new DSO Session, if no explicit DSO Keepalive message exchange
1675 has taken place, the default value for both timeouts is 15 seconds.
1676
1677 For both timeouts, lower values of the timeout result in higher
1678 network traffic and a higher CPU load on the server.
1679
6.3.  Inactive DSO Sessions
1688
1689 At both servers and clients, the generation or reception of any
1690 complete DNS message (including DNS requests, responses, updates, DSO
1691 messages, etc.) resets both timers for that DSO Session, with the one
1692 exception being that a DSO Keepalive message resets only the
1693 keepalive timer, not the inactivity timeout timer.
1694
1695 In addition, for as long as the client has an outstanding operation
1696 in progress, the inactivity timer remains cleared and an inactivity
1697 timeout cannot occur.
1698
1699 For short-lived DNS operations like traditional queries and updates,
1700 an operation is considered "in progress" for the time between request
1701 and response, typically a period of a few hundred milliseconds at
1702 most. At the client, the inactivity timer is cleared upon
1703 transmission of a request and remains cleared until reception of the
1704 corresponding response. At the server, the inactivity timer is
1705 cleared upon reception of a request and remains cleared until
1706 transmission of the corresponding response.
1707
1708 For long-lived DNS Stateful Operations (such as a Push Notification
1709 subscription [Push] or a Discovery Relay interface subscription
1710 [Relay]), an operation is considered "in progress" for as long as the
1711 operation is active, i.e., until it is canceled. This means that a
1712 DSO Session can exist with active operations, with no messages
1713 flowing in either direction, for far longer than the inactivity
1714 timeout. This is not an error. This is why there are two separate
1715 timers: the inactivity timeout and the keepalive interval. Just
1716 because a DSO Session has no traffic for an extended period of time,
1717 it does not automatically make that DSO Session "inactive", if it has
1718 an active operation that is awaiting events.
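
   The timer rules in this section can be sketched as a small piece of
   bookkeeping.  This is an illustrative sketch only: the class and
   field names are hypothetical, times are in seconds on an arbitrary
   clock supplied by the caller, and both timeouts start at the
   15-second default.

```python
import math
from dataclasses import dataclass


@dataclass
class SessionTimers:
    """Hypothetical per-session timer state; all times are in seconds."""

    inactivity_timeout: float = 15.0
    keepalive_interval: float = 15.0
    last_activity: float = 0.0       # last message other than a DSO Keepalive
    last_message: float = 0.0        # last message of any kind
    operations_in_progress: int = 0  # outstanding requests and long-lived operations

    def on_message(self, now: float, is_dso_keepalive: bool = False) -> None:
        # Any complete DNS message resets the keepalive timer.
        self.last_message = now
        # Only non-Keepalive messages reset the inactivity timer.
        if not is_dso_keepalive:
            self.last_activity = now

    def inactivity_deadline(self) -> float:
        # While an operation is in progress, an inactivity timeout cannot occur.
        if self.operations_in_progress > 0:
            return math.inf
        return self.last_activity + self.inactivity_timeout

    def keepalive_deadline(self) -> float:
        return self.last_message + self.keepalive_interval
```

   With an active long-lived operation, inactivity_deadline() stays at
   infinity while keepalive_deadline() continues to advance, which is
   the reason the two timers are tracked separately.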
1719
6.4.  The Inactivity Timeout
1744
1745 The purpose of the inactivity timeout is for the server to balance
1746 the trade-off between the costs of setting up new DSO Sessions and
1747 the costs of maintaining inactive DSO Sessions. A server with
1748 abundant DSO Session capacity can offer a high inactivity timeout to
1749 permit clients to keep a speculative DSO Session open for a long time
1750 and to save the cost of establishing a new DSO Session for future
1751 communications with that server. A server with scarce memory
1752 resources can offer a low inactivity timeout to cause clients to
1753 promptly close DSO Sessions whenever they have no outstanding
1754 operations with that server and then create a new DSO Session later
1755 when needed.
1756
6.4.1.  Closing Inactive DSO Sessions
1758
1759 When a connection's inactivity timeout is reached, the client MUST
1760 begin closing the idle connection, but a client is not required to
1761 keep an idle connection open until the inactivity timeout is reached.
1762 A client MAY close a DSO Session at any time, at the client's
1763 discretion. If a client determines that it has no current or
1764 reasonably anticipated future need for a currently inactive DSO
1765 Session, then the client SHOULD gracefully close that connection.
1766
1767 If, at any time during the life of the DSO Session, the inactivity
1768 timeout value (i.e., 15 seconds by default) elapses without there
1769 being any operation active on the DSO Session, the client MUST close
1770 the connection gracefully.
1771
1772 If, at any time during the life of the DSO Session, too much time
1773 elapses without there being any operation active on the DSO Session,
1774 then the server MUST consider the client delinquent and MUST forcibly
1775 abort the DSO Session. What is considered "too much time" in this
1776 context is five seconds or twice the current inactivity timeout
1777 value, whichever is greater. If the inactivity timeout has its
1778 default value of 15 seconds, this means that a client will be
1779 considered delinquent and disconnected if it has not closed its
1780 connection after 30 seconds of inactivity.
1781
1782 In this context, an operation being active on a DSO Session includes
1783 a query waiting for a response, an update waiting for a response, or
1784 an active long-lived operation, but not a DSO Keepalive message
1785 exchange itself. A DSO Keepalive message exchange resets only the
1786 keepalive interval timer, not the inactivity timeout timer.
1787
1788 If the client wishes to keep an inactive DSO Session open for longer
1789 than the default duration, then it uses the DSO Keepalive message to
1790 request longer timeout values as described in Section 7.1.
1791
6.4.2.  Values for the Inactivity Timeout
1800
1801 For the inactivity timeout value, lower values result in more
1802 frequent DSO Session teardowns and re-establishments. Higher values
1803 result in lower traffic and a lower CPU load on the server, but a
1804 higher memory burden to maintain state for inactive DSO Sessions.
1805
1806 A server may dictate any value it chooses for the inactivity timeout
1807 (either in a response to a client-initiated request or in a server-
1808 initiated message) including values under one second, or even zero.
1809
1810 An inactivity timeout of zero informs the client that it should not
1811 speculatively maintain idle connections at all, and as soon as the
1812 client has completed the operation or operations relating to this
1813 server, the client should immediately begin closing this session.
1814
1815 A server will forcibly abort an idle client session after five
1816 seconds or twice the inactivity timeout value, whichever is greater.
1817 In the case of a zero inactivity timeout value, this means that if a
1818 client fails to close an idle client session, then the server will
1819 forcibly abort the idle session after five seconds.
1820
1821 An inactivity timeout of 0xFFFFFFFF represents "infinity" and informs
1822 the client that it may keep an idle connection open as long as it
1823 wishes. Note that after granting an unlimited inactivity timeout in
1824 this way, at any point the server may revise that inactivity timeout
1825 by sending a new DSO Keepalive message dictating new Session Timeout
1826 values to the client.
1827
1828 The largest *finite* inactivity timeout supported by the current
1829 Keepalive TLV is 0xFFFFFFFE (2^32-2 milliseconds, approximately 49.7
1830 days).
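
   The relationship between the inactivity timeout and the point at
   which a server forcibly aborts an idle session can be worked through
   as follows.  The function and constant names are hypothetical, and
   returning None for the "infinity" value is an assumption of this
   sketch rather than anything the specification prescribes.

```python
INFINITE_TIMEOUT_MS = 0xFFFFFFFF  # "infinity" in the Keepalive TLV


def server_abort_after_ms(inactivity_timeout_ms: int):
    """Idle time after which the server forcibly aborts, or None if never."""
    if inactivity_timeout_ms == INFINITE_TIMEOUT_MS:
        return None
    # Five seconds or twice the inactivity timeout, whichever is greater.
    return max(5_000, 2 * inactivity_timeout_ms)


# Largest finite timeout, expressed in days (86,400,000 ms per day).
largest_finite_days = 0xFFFFFFFE / 86_400_000
```

   For the default timeout of 15 seconds, this gives 30 seconds; for a
   timeout of zero, it gives the five-second floor.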
1831
6.5.  The Keepalive Interval
1856
   The purpose of the keepalive interval is to manage the generation of
   sufficient messages to maintain state in middleboxes (such as NAT
   gateways or firewalls) and for the client and server to periodically
   verify that they still have connectivity to each other.  This allows
   them to clean up state when connectivity is lost and to establish a
   new session if appropriate.
1863
6.5.1.  Keepalive Interval Expiry
1865
1866 If, at any time during the life of the DSO Session, the keepalive
1867 interval value (i.e., 15 seconds by default) elapses without any DNS
1868 messages being sent or received on a DSO Session, the client MUST
1869 take action to keep the DSO Session alive by sending a DSO Keepalive
1870 message (Section 7.1). A DSO Keepalive message exchange resets only
1871 the keepalive timer, not the inactivity timer.
1872
1873 If a client disconnects from the network abruptly, without cleanly
1874 closing its DSO Session, perhaps leaving a long-lived operation
1875 uncanceled, the server learns of this after failing to receive the
1876 required DSO keepalive traffic from that client. If, at any time
1877 during the life of the DSO Session, twice the keepalive interval
1878 value (i.e., 30 seconds by default) elapses without any DNS messages
1879 being sent or received on a DSO Session, the server SHOULD consider
1880 the client delinquent and SHOULD forcibly abort the DSO Session.
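
   The two thresholds above can be expressed as a pair of checks.  This
   is a minimal sketch with hypothetical function names; "idle_ms" is
   assumed to be the time since any DNS message was last sent or
   received on the session, and the "infinity" interval value is not
   handled here.

```python
DEFAULT_KEEPALIVE_INTERVAL_MS = 15_000


def client_must_send_keepalive(idle_ms: int,
                               interval_ms: int = DEFAULT_KEEPALIVE_INTERVAL_MS) -> bool:
    # The client acts once a full keepalive interval has passed in silence.
    return idle_ms >= interval_ms


def server_should_abort(idle_ms: int,
                        interval_ms: int = DEFAULT_KEEPALIVE_INTERVAL_MS) -> bool:
    # The server waits for twice the interval before treating the client
    # as delinquent.
    return idle_ms >= 2 * interval_ms
```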
1881
6.5.2.  Values for the Keepalive Interval
1883
1884 For the keepalive interval value, lower values result in a higher
1885 volume of DSO keepalive traffic. Higher values of the keepalive
1886 interval reduce traffic and the CPU load, but have minimal effect on
1887 the memory burden at the server because clients keep a DSO Session
1888 open for the same length of time (determined by the inactivity
1889 timeout) regardless of the level of DSO keepalive traffic required.
1890
1891 It may be appropriate for clients and servers to select different
1892 keepalive intervals depending on the type of network they are on.
1893
1894 A corporate DNS server that knows it is serving only clients on the
1895 internal network, with no intervening NAT gateways or firewalls, can
1896 impose a longer keepalive interval because frequent DSO keepalive
1897 traffic is not required.
1898
1899 A public DNS server that is serving primarily residential consumer
1900 clients, where it is likely there will be a NAT gateway on the path,
1901 may impose a shorter keepalive interval to generate more frequent DSO
1902 keepalive traffic.
1903
1904
1911 A smart client may be adaptive to its environment. A client using a
1912 private IPv4 address [RFC1918] to communicate with a DNS server at an
1913 address outside that IPv4 private address block may conclude that
1914 there is likely to be a NAT gateway on the path, and accordingly
1915 request a shorter keepalive interval.
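
   One way a client might apply this heuristic is sketched below.  The
   function names are hypothetical, and the sketch simplifies "outside
   that IPv4 private address block" to "not in any of the three
   [RFC1918] ranges"; a real client may want a finer test.

```python
import ipaddress

# The three private IPv4 address blocks defined in RFC 1918.
RFC1918_NETWORKS = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
]


def is_rfc1918(address: str) -> bool:
    ip = ipaddress.ip_address(address)
    return ip.version == 4 and any(ip in net for net in RFC1918_NETWORKS)


def nat_likely(local_address: str, server_address: str) -> bool:
    # A private local address talking to a non-private server address
    # suggests that a NAT gateway is on the path.
    return is_rfc1918(local_address) and not is_rfc1918(server_address)
```

   A client for which nat_likely() returns true could then request a
   shorter keepalive interval in its DSO Keepalive message.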
1916
1917 By default, it is RECOMMENDED that clients request, and servers
1918 grant, a keepalive interval of 60 minutes. This keepalive interval
1919 provides for reasonably timely detection if a client abruptly
1920 disconnects without cleanly closing the session. Also, it is
1921 sufficient to maintain state in firewalls and NAT gateways that
1922 follow the IETF recommended Best Current Practice that the
1923 "established connection idle-timeout" used by middleboxes be at least
1924 2 hours and 4 minutes [RFC5382] [RFC7857].
1925
1926 Note that the shorter the keepalive interval value, the higher the
1927 load on client and server. Moreover, for a keepalive value that is
1928 shorter than the time needed for the transport to retransmit, the
1929 loss of a single packet would cause a server to overzealously abort
1930 the connection. For example, a (hypothetical and unrealistic)
1931 keepalive interval value of 100 ms would result in a continuous
1932 stream of ten messages per second or more (if allowed by the current
1933 congestion control window) in both directions to keep the DSO Session
1934 alive. And, in this extreme example, a single retransmission over a
1935 path with, as an example, 100 ms RTT would introduce a momentary
1936 pause in the stream of messages long enough to cause the server to
1937 abort the connection.
1938
1939 Because of this concern, the server MUST NOT send a DSO Keepalive
1940 message (either a DSO response to a client-initiated DSO request or a
1941 server-initiated DSO message) with a keepalive interval value less
1942 than ten seconds. If a client receives a DSO Keepalive message
1943 specifying a keepalive interval value less than ten seconds, this is
1944 a fatal error and the client MUST forcibly abort the connection
1945 immediately.
1946
1947 A keepalive interval value of 0xFFFFFFFF represents "infinity" and
1948 informs the client that it should generate no DSO keepalive traffic.
1949 Note that after signaling that the client should generate no DSO
1950 keepalive traffic in this way, the server may at any point revise
1951 that DSO keepalive traffic requirement by sending a new DSO Keepalive
1952 message dictating new Session Timeout values to the client.
1953
1954 The largest *finite* keepalive interval supported by the current
1955 Keepalive TLV is 0xFFFFFFFE (2^32-2 milliseconds, approximately 49.7
1956 days).
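
   A client's handling of a received keepalive interval value might
   look like the following sketch.  The exception class and function
   name are hypothetical; raising an exception stands in for "forcibly
   abort the connection", and None stands in for "generate no DSO
   keepalive traffic".

```python
INFINITE_INTERVAL_MS = 0xFFFFFFFF       # "infinity": no keepalive traffic
MIN_KEEPALIVE_INTERVAL_MS = 10_000      # ten-second minimum


class FatalProtocolError(Exception):
    """Hypothetical signal that the connection must be forcibly aborted."""


def client_keepalive_period_ms(keepalive_interval_ms: int):
    """Return how often to send DSO Keepalive messages, or None for never."""
    if keepalive_interval_ms < MIN_KEEPALIVE_INTERVAL_MS:
        raise FatalProtocolError("keepalive interval below ten seconds")
    if keepalive_interval_ms == INFINITE_INTERVAL_MS:
        return None
    return keepalive_interval_ms
```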
1957
6.6.  Server-Initiated DSO Session Termination
1968
   In addition to canceling individual long-lived operations selectively
   (Section 5.6), there are also occasions where a server may need to
   terminate one or more entire DSO Sessions.  An entire DSO Session may
   need to be terminated if the client is defective in some way or
   departs from the network without closing its DSO Session.  DSO
   Sessions may also need to be terminated if the server becomes
   overloaded or is reconfigured and lacks the ability to be selective
   about which operations need to be canceled.

   This section discusses various reasons a DSO Session may be
   terminated and the mechanisms for doing so.
1980
1981 In normal operation, closing a DSO Session is the client's
1982 responsibility. The client makes the determination of when to close
1983 a DSO Session based on an evaluation of both its own needs and the
1984 inactivity timeout value dictated by the server. A server only
1985 causes a DSO Session to be ended in the exceptional circumstances
1986 outlined below. Some of the exceptional situations in which a server
1987 may terminate a DSO Session include:
1988
1989 o The server application software or underlying operating system is
1990 shutting down or restarting.
1991
1992 o The server application software terminates unexpectedly (perhaps
1993 due to a bug that makes it crash, causing the underlying operating
1994 system to send a TCP RST).
1995
1996 o The server is undergoing a reconfiguration or maintenance
1997 procedure that, due to the way the server software is implemented,
1998 requires clients to be disconnected. For example, some software
1999 is implemented such that it reads a configuration file at startup,
2000 and changing the server's configuration entails modifying the
2001 configuration file and then killing and restarting the server
2002 software, which generally entails a loss of network connections.
2003
2004 o The client fails to meet its obligation to generate the required
2005 DSO keepalive traffic or to close an inactive session by the
2006 prescribed time (five seconds or twice the time interval dictated
2007 by the server, whichever is greater, as described in Section 6.2).
2008
2009 o The client sends a grossly invalid or malformed request that is
2010 indicative of a seriously defective client implementation.
2011
2012 o The server is over capacity and needs to shed some load.
2013
6.6.1.  Server-Initiated Retry Delay Message
2024
   In the cases described above where a server elects to terminate a DSO
   Session, it could do so simply by forcibly aborting the connection.
   However, the client would then likely treat this as a network failure
   and reconnect immediately, putting more burden on the server.
2030
2031 Therefore, to avoid this reconnection implosion, a server SHOULD
2032 instead choose to shed client load by sending a Retry Delay message
2033 with an appropriate RCODE value informing the client of the reason
2034 the DSO Session needs to be terminated. The format of the DSO Retry
2035 Delay TLV and the interpretations of the various RCODE values are
2036 described in Section 7.2. After sending a DSO Retry Delay message,
2037 the server MUST NOT send any further messages on that DSO Session.
2038
2039 The server MAY randomize retry delays in situations where many retry
2040 delays are sent in quick succession so as to avoid all the clients
2041 attempting to reconnect at once. In general, implementations should
2042 avoid using the DSO Retry Delay message in a way that would result in
2043 many clients reconnecting at the same time if every client attempts
2044 to reconnect at the exact time specified.
2045
2046 Upon receipt of a DSO Retry Delay message from the server, the client
2047 MUST make note of the reconnect delay for this server and then
2048 immediately close the connection gracefully.
2049
2050 After sending a DSO Retry Delay message, the server SHOULD allow the
2051 client five seconds to close the connection, and if the client has
2052 not closed the connection after five seconds, then the server SHOULD
2053 forcibly abort the connection.
2054
2055 A DSO Retry Delay message MUST NOT be initiated by a client. If a
2056 server receives a DSO Retry Delay message, this is a fatal error and
2057 the server MUST forcibly abort the connection immediately.
2058
6.6.1.1.  Outstanding Operations
2060
2061 At the instant a server chooses to initiate a DSO Retry Delay
2062 message, there may be DNS requests already in flight from client to
2063 server on this DSO Session, which will arrive at the server after its
2064 DSO Retry Delay message has been sent. The server MUST silently
2065 ignore such incoming requests and MUST NOT generate any response
2066 messages for them. When the DSO Retry Delay message from the server
2067 arrives at the client, the client will determine that any DNS
2068 requests it previously sent on this DSO Session that have not yet
2069 received a response will now certainly not be receiving any response.
2079 Such requests should be considered failed and should be retried at a
2080 later time, as appropriate.
2081
2082 In the case where some, but not all, of the existing operations on a
2083 DSO Session have become invalid (perhaps because the server has been
2084 reconfigured and is no longer authoritative for some of the names),
2085 but the server is terminating all affected DSO Sessions en masse by
2086 sending them all a DSO Retry Delay message, the reconnect delay MAY
2087 be zero, indicating that the clients SHOULD immediately attempt to
2088 re-establish operations.
2089
2090 It is likely that some of the attempts will be successful and some
2091 will not, depending on the nature of the reconfiguration.
2092
2093 In the case where a server is terminating a large number of DSO
2094 Sessions at once (e.g., if the system is restarting) and the server
2095 doesn't want to be inundated with a flood of simultaneous retries, it
2096 SHOULD send different reconnect delay values to each client. These
2097 adjustments MAY be selected randomly, pseudorandomly, or
2098 deterministically (e.g., incrementing the time value by one tenth of
2099 a second for each successive client, yielding a post-restart
2100 reconnection rate of ten clients per second).
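
   The deterministic variant described above can be sketched as
   follows.  The function name and parameters are hypothetical; the
   cap at 0xFFFFFFFF reflects the 32-bit millisecond field and is an
   assumption of this sketch.

```python
MAX_RETRY_DELAY_MS = 0xFFFFFFFF


def staggered_retry_delays_ms(client_count: int,
                              base_delay_ms: int = 0,
                              step_ms: int = 100) -> list:
    """One retry delay per client, each one step later than the previous."""
    return [
        min(base_delay_ms + index * step_ms, MAX_RETRY_DELAY_MS)
        for index in range(client_count)
    ]
```

   With the default step of one tenth of a second, clients that honor
   their delays exactly reconnect at a rate of ten per second.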
2101
6.6.2.  Misbehaving Clients
2103
2104 A server may determine that a client is not following the protocol
2105 correctly. There may be no way for the server to recover the DSO
2106 session, in which case the server forcibly terminates the connection.
2107 Since the client doesn't know why the connection dropped, it may
2108 reconnect immediately. If the server has determined that a client is
2109 not following the protocol correctly, it MAY terminate the DSO
2110 Session as soon as it is established, specifying a long retry-delay
2111 to prevent the client from immediately reconnecting.
2112
6.6.3.  Client Reconnection
2114
2115 After a DSO Session is ended by the server (either by sending the
2116 client a DSO Retry Delay message or by forcibly aborting the
2117 underlying transport connection), the client SHOULD try to reconnect
2118 to that service instance or to another suitable service instance if
2119 more than one is available. If reconnecting to the same service
2120 instance, the client MUST respect the indicated delay, if available,
2121 before attempting to reconnect. Clients SHOULD NOT attempt to
2122 randomize the delay; the server will randomly jitter the retry delay
2123 values it sends to each client if this behavior is desired.
2124
2125
2135 If a particular service instance will only be out of service for a
2136 short maintenance period, it should indicate a retry delay value that
2137 is a little longer than the expected maintenance window. It should
2138 not default to a very large delay value, or clients may not attempt
2139 to reconnect promptly after it resumes service.
2140
2141 If a service instance does not want a client to reconnect ever
2142 (perhaps the service instance is being decommissioned), it SHOULD set
2143 the retry delay to the maximum value 0xFFFFFFFF (2^32-1 milliseconds,
2144 approximately 49.7 days). It is not possible to instruct a client to
2145 stay away for longer than 49.7 days. If, after 49.7 days, the DNS or
2146 other configuration information still indicates that this is the
2147 valid service instance for a particular service, then clients MAY
2148 attempt to reconnect. In reality, if a client is rebooted or
2149 otherwise loses state, it may well attempt to reconnect before 49.7
2150 days elapse, for as long as the DNS or other configuration
2151 information continues to indicate that this is the service instance
2152 the client should use.
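
   A client's computation of the earliest permitted reconnection time
   to the same service instance might look like this sketch.  The
   function name is hypothetical; times are in seconds on a clock
   supplied by the caller, and None represents the case where no Retry
   Delay message was received (for example, after a forcible abort).

```python
from typing import Optional


def earliest_reconnect_time(ended_at: float,
                            retry_delay_ms: Optional[int]) -> float:
    """Earliest time at which the client may reconnect to this instance."""
    if retry_delay_ms is None:
        # No delay was indicated, so none needs to be respected.
        return ended_at
    # The client applies the delay as given and adds no jitter of its own;
    # any randomization is the server's responsibility.
    return ended_at + retry_delay_ms / 1000.0


# The maximum delay value, expressed in days.
max_delay_days = 0xFFFFFFFF / 1000.0 / 86_400
```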
2153
6.6.3.1.  Reconnecting after a Forcible Abort
2155
2156 If a connection was forcibly aborted by the client due to
2157 noncompliant behavior by the server, the client SHOULD mark that
2158 service instance as not supporting DSO. The client MAY reconnect but
2159 not attempt to use DSO, or it may connect to a different service
2160 instance if applicable.
2161
6.6.3.2.  Reconnecting after an Unexplained Connection Drop
2163
2164 It is also possible for a server to forcibly terminate the
2165 connection; in this case, the client doesn't know whether the
2166 termination was the result of a protocol error or a network outage.
2167 When the client notices that the connection has been dropped, it can
2168 attempt to reconnect immediately. However, if the connection is
2169 dropped again without the client being able to successfully do
2170 whatever it is trying to do, it should mark the server as not
2171 supporting DSO.
2172
6.6.3.3.  Probing for Working DSO Support
2174
2175 Once a server has been marked by the client as not supporting DSO,
2176 the client SHOULD NOT attempt DSO operations on that server until
2177 some time has elapsed. A reasonable minimum would be an hour. Since
2178 forcibly aborted connections are the result of a software failure,
2179 it's not likely that the problem will be solved in the first hour
2180 after it's first encountered. However, by restricting the retry
2181 interval to an hour, the client will be able to notice when the
2182 problem has been fixed without placing an undue burden on the server.
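
   The probing behavior can be sketched as a small cache of servers
   marked as not supporting DSO.  The class and method names are
   hypothetical, the server is identified by an arbitrary string key,
   and the one-hour default follows the "reasonable minimum" suggested
   above.

```python
class DsoSupportCache:
    """Hypothetical client-side record of servers marked as lacking DSO."""

    def __init__(self, retry_after_s: float = 3600.0):
        self.retry_after_s = retry_after_s
        self._marked_at = {}  # server key -> time it was marked

    def mark_unsupported(self, server: str, now: float) -> None:
        self._marked_at[server] = now

    def may_attempt_dso(self, server: str, now: float) -> bool:
        marked = self._marked_at.get(server)
        if marked is None:
            return True
        if now - marked >= self.retry_after_s:
            # Enough time has elapsed; forget the mark and probe again.
            del self._marked_at[server]
            return True
        return False
```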
2183
7.  Base TLVs for DNS Stateful Operations
2192
2193 This section describes the three base TLVs for DNS Stateful
2194 Operations: Keepalive, Retry Delay, and Encryption Padding.
2195
7.1.  Keepalive TLV
2197
2198 The Keepalive TLV (DSO-TYPE=1) performs two functions. Primarily, it
2199 establishes the values for the Session Timeouts. Incidentally, it
2200 also resets the keepalive timer for the DSO Session, meaning that it
2201 can be used as a kind of "no-op" message for the purpose of keeping a
2202 session alive. The client will request the desired Session Timeout
2203 values and the server will acknowledge with the response values that
2204 it requires the client to use.
2205
2206 DSO messages with the Keepalive TLV as the Primary TLV may appear in
2207 early data.
2208
2209 The DSO-DATA for the Keepalive TLV is as follows:
2210
                            1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 3 3
        0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       |                   INACTIVITY TIMEOUT (32 bits)                |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       |                   KEEPALIVE INTERVAL (32 bits)                |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
2218
2219 INACTIVITY TIMEOUT: The inactivity timeout for the current DSO
2220 Session, specified as a 32-bit unsigned integer, in network (big
2221 endian) byte order in units of milliseconds. This is the timeout
2222 at which the client MUST begin closing an inactive DSO Session.
2223 The inactivity timeout can be any value of the server's choosing.
2224 If the client does not gracefully close an inactive DSO Session,
2225 then after five seconds or twice this interval, whichever is
2226 greater, the server will forcibly abort the connection.
2227
2228 KEEPALIVE INTERVAL: The keepalive interval for the current DSO
2229 Session, specified as a 32-bit unsigned integer, in network (big
2230 endian) byte order in units of milliseconds. This is the interval
2231 at which a client MUST generate DSO keepalive traffic to maintain
2232 connection state. The keepalive interval MUST NOT be less than
2233 ten seconds. If the client does not generate the mandated DSO
2234 keepalive traffic, then after twice this interval the server will
2235 forcibly abort the connection. Since the minimum allowed
2236 keepalive interval is ten seconds, the minimum time at which a
2237 server will forcibly disconnect a client for failing to generate
2238 the mandated DSO keepalive traffic is twenty seconds.
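
   Encoding and decoding of the eight-octet DSO-DATA shown above can be
   sketched as follows.  The function names are hypothetical, and the
   sketch covers only the DSO-DATA portion; the enclosing TLV header
   and DNS message header are outside its scope.

```python
import struct

# Two 32-bit unsigned integers in network (big endian) byte order.
KEEPALIVE_DATA_FORMAT = "!II"


def encode_keepalive_data(inactivity_timeout_ms: int,
                          keepalive_interval_ms: int) -> bytes:
    return struct.pack(KEEPALIVE_DATA_FORMAT,
                       inactivity_timeout_ms, keepalive_interval_ms)


def decode_keepalive_data(data: bytes) -> tuple:
    if len(data) != struct.calcsize(KEEPALIVE_DATA_FORMAT):
        raise ValueError("Keepalive DSO-DATA must be exactly 8 octets")
    return struct.unpack(KEEPALIVE_DATA_FORMAT, data)
```

   For example, an inactivity timeout of 15 seconds and a keepalive
   interval of 60 minutes are carried as the millisecond values 15000
   and 3600000.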
2239
2247 The transmission or reception of DSO Keepalive messages (i.e.,
   messages where the Keepalive TLV is the first TLV) resets only the
2249 keepalive timer, not the inactivity timer. The reason for this is
2250 that periodic DSO Keepalive messages are sent for the sole purpose of
2251 keeping a DSO Session alive when that DSO Session has current or
2252 recent non-maintenance activity that warrants keeping that DSO
2253 Session alive. Sending DSO keepalive traffic itself is not
2254 considered a client activity; it is considered a maintenance activity
2255 that is performed in service of other client activities. If DSO
2256 keepalive traffic itself were to reset the inactivity timer, then
2257 that would create a circular livelock where keepalive traffic would
2258 be sent indefinitely to keep a DSO Session alive. In this scenario,
2259 the only activity on that DSO Session would be the keepalive traffic
2260 keeping the DSO Session alive so that further keepalive traffic can
2261 be sent. For a DSO Session to be considered active, it must be
2262 carrying something more than just keepalive traffic. This is why
2263 merely sending or receiving a DSO Keepalive message does not reset
2264 the inactivity timer.
2265
2266 When sent by a client, the DSO Keepalive request message MUST be sent
2267 as a DSO request message with a nonzero MESSAGE ID. If a server
2268 receives a DSO Keepalive message with a zero MESSAGE ID, then this is
2269 a fatal error and the server MUST forcibly abort the connection
2270 immediately. The DSO Keepalive request message resets a DSO
2271 Session's keepalive timer and, at the same time, communicates to the
2272 server the client's requested Session Timeout values. In a server
2273 response to a client-initiated DSO Keepalive request message, the
2274 Session Timeouts contain the server's chosen values from this point
2275 forward in the DSO Session, which the client MUST respect. This is
2276 modeled after the DHCP protocol, where the client requests a certain
2277 lease lifetime using DHCP option 51 [RFC2132], but the server is the
2278 ultimate authority for deciding what lease lifetime is actually
2279 granted.
2280
2281 When a client is sending its second and subsequent DSO Keepalive
2282 request messages to the server, the client SHOULD continue to request
2283 its preferred values each time. This allows flexibility so that if
2284 conditions change during the lifetime of a DSO Session, the server
2285 can adapt its responses to better fit the client's needs.
2286
2287 Once a DSO Session is in progress (Section 5.1), a DSO Keepalive
2288 message MAY be initiated by a server. When sent by a server, the DSO
2289 Keepalive message MUST be sent as a DSO unidirectional message with
2290 the MESSAGE ID set to zero. The client MUST NOT generate a response
2291 to a server-initiated DSO Keepalive message. If a client receives a
2292 DSO Keepalive request message with a nonzero MESSAGE ID, then this is
2293 a fatal error and the client MUST forcibly abort the connection
2294 immediately. The DSO Keepalive unidirectional message from the
2303 server resets a DSO Session's keepalive timer and, at the same time,
2304 unilaterally informs the client of the new Session Timeout values to
2305 use from this point forward in this DSO Session. No client DSO
2306 response to this unilateral declaration is required or allowed.
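
   The MESSAGE ID rules for received DSO Keepalive messages (QR=0) can
   be summarized in a short sketch. The function name and the string
   values used for the receiver role are assumptions for illustration.

```python
def keepalive_id_is_fatal(received_by: str, message_id: int) -> bool:
    """Return True if a received DSO Keepalive message (QR=0) requires
    the receiver to forcibly abort the connection."""
    if received_by == "server":
        # Client Keepalives are requests and need a nonzero MESSAGE ID.
        return message_id == 0
    if received_by == "client":
        # Server Keepalives are unidirectional and need a zero MESSAGE ID.
        return message_id != 0
    raise ValueError("received_by must be 'server' or 'client'")
```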
2307
2308 In DSO Keepalive response messages, exactly one instance of the
2309 Keepalive TLV MUST be present and is used only as a Response Primary
2310 TLV sent as a reply to a DSO Keepalive request message from the
2311 client. A Keepalive TLV MUST NOT be added to other responses as a
2312 Response Additional TLV. If the server wishes to update a client's
2313 Session Timeout values other than in response to a DSO Keepalive
2314 request message from the client, then it does so by sending a DSO
2315 Keepalive unidirectional message of its own, as described above.
2316
2317 It is not required that the Keepalive TLV be used in every DSO
2318 Session. While many DSO operations will be used in conjunction with
2319 a long-lived session state, not all DSO operations require a long-
2320 lived session state, and in some cases the default 15-second value
2321 for both the inactivity timeout and keepalive interval may be
2322 perfectly appropriate. However, note that for clients that implement
2323 only the DSO-TYPEs defined in this document, a DSO Keepalive request
2324 message is the only way for a client to initiate a DSO Session.
2325
23267.1.1. Client Handling of Received Session Timeout Values
2327
2328 When a client receives a response to its client-initiated DSO
2329 Keepalive request message, or receives a server-initiated DSO
2330 Keepalive unidirectional message, the client has then received
2331 Session Timeout values dictated by the server. The two timeout
2332 values contained in the Keepalive TLV from the server may each be
2333 higher, lower, or the same as the respective Session Timeout values
2334 the client previously had for this DSO Session.
2335
2336 In the case of the keepalive timer, the handling of the received
2337 value is straightforward. The act of receiving the message
2338 containing the DSO Keepalive TLV itself resets the keepalive timer
2339 and updates the keepalive interval for the DSO Session. The new
2340 keepalive interval indicates the maximum time that may elapse before
2341 another message must be sent or received on this DSO Session, if the
2342 DSO Session is to remain alive.
2343
2344 In the case of the inactivity timeout, the handling of the received
2345 value is a little more subtle, though the meaning of the inactivity
2346 timeout remains as specified; it still indicates the maximum
2347 permissible time allowed without useful activity on a DSO Session.
2348 The act of receiving the message containing the Keepalive TLV does
2349 not itself reset the inactivity timer. The time elapsed since the
2350 last useful activity on this DSO Session is unaffected by exchange of
2359 DSO Keepalive messages. The new inactivity timeout value in the
2360 Keepalive TLV in the received message does update the timeout
2361 associated with the running inactivity timer; that becomes the new
2362 maximum permissible time without activity on a DSO Session.
2363
2364 o If the current inactivity timer value is less than the new
2365 inactivity timeout, then the DSO Session may remain open for now.
2366 When the inactivity timer value reaches the new inactivity
2367 timeout, the client MUST then begin closing the DSO Session as
2368 described above.
2369
2370 o If the current inactivity timer value is equal to the new
2371 inactivity timeout, then this DSO Session has been inactive for
2372 exactly as long as the server will permit, and now the client MUST
2373 immediately begin closing this DSO Session.
2374
2375 o If the current inactivity timer value is already greater than the
2376 new inactivity timeout, then this DSO Session has already been
2377 inactive for longer than the server permits, and the client MUST
2378 immediately begin closing this DSO Session.
2379
2380 o If the current inactivity timer value is already more than twice
2381 the new inactivity timeout, then the client is immediately
2382 considered delinquent (this DSO Session is immediately eligible to
2383 be forcibly terminated by the server) and the client MUST
2384 immediately begin closing this DSO Session. However, if a server
2385 abruptly reduces the inactivity timeout in this way, then, to give
2386 the client time to close the connection gracefully before the
2387 server resorts to forcibly aborting it, the server SHOULD give the
2388 client an additional grace period of either five seconds or one
2389 quarter of the new inactivity timeout, whichever is greater.
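
   The four cases above can be sketched as follows. This is an
   illustrative reading of the rules, not a normative algorithm; the
   function names and the dictionary keys are assumptions.

```python
def on_new_inactivity_timeout(elapsed_ms: int, new_timeout_ms: int) -> dict:
    """Client-side outcome of receiving a new inactivity timeout.

    elapsed_ms is the current inactivity timer value, i.e., the time
    since the last useful activity on the DSO Session.
    """
    must_close_now = elapsed_ms >= new_timeout_ms
    return {
        "must_close_now": must_close_now,
        # Remaining time before the client must begin closing.
        "close_in_ms": 0 if must_close_now else new_timeout_ms - elapsed_ms,
        # More than twice the timeout: eligible for forcible termination.
        "delinquent": elapsed_ms > 2 * new_timeout_ms,
    }


def server_grace_period_ms(new_timeout_ms: int) -> int:
    """Grace period a server SHOULD allow after abruptly reducing the
    timeout: the greater of five seconds or one quarter of the timeout."""
    return max(5_000, new_timeout_ms // 4)
```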
2390
23917.1.2. Relationship to edns-tcp-keepalive EDNS(0) Option
2392
2393 The inactivity timeout value in the Keepalive TLV (DSO-TYPE=1) has
2394 similar intent to the edns-tcp-keepalive EDNS(0) Option [RFC7828]. A
2395 client/server pair that supports DSO MUST NOT use the edns-tcp-
2396 keepalive EDNS(0) Option within any message after a DSO Session has
2397 been established. A client that has sent a DSO message to establish
2398 a session MUST NOT send an edns-tcp-keepalive EDNS(0) Option from
2399 this point on. Once a DSO Session has been established, if either
2400 client or server receives a DNS message over the DSO Session that
2401 contains an edns-tcp-keepalive EDNS(0) Option, this is a fatal error
2402 and the receiver of the edns-tcp-keepalive EDNS(0) Option MUST
2403 forcibly abort the connection immediately.
2404
24157.2. Retry Delay TLV
2416
2417 The Retry Delay TLV (DSO-TYPE=2) can be used as a Primary TLV
2418 (unidirectional) in a server-to-client message, or as a Response
   Additional TLV in either direction. DSO messages with a Retry Delay
   TLV as their Primary TLV are not permitted in early data.
2421
2422 The DSO-DATA for the Retry Delay TLV is as follows:
2423
                        1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 3 3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                     RETRY DELAY (32 bits)                     |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
2429
2430 RETRY DELAY: A time value, specified as a 32-bit unsigned integer in
2431 network (big endian) byte order, in units of milliseconds, within
2432 which the initiator MUST NOT retry this operation or retry
2433 connecting to this server. Recommendations for the RETRY DELAY
2434 value are given in Section 6.6.1.
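
   As a hedged sketch, a complete Retry Delay TLV (the 16-bit DSO-TYPE
   and 16-bit DSO-LENGTH header followed by the DSO-DATA shown above)
   could be encoded and decoded with Python's struct module. The
   function names are assumptions made for this example.

```python
import struct

DSO_TYPE_RETRY_DELAY = 2


def encode_retry_delay_tlv(retry_delay_ms: int) -> bytes:
    """Pack DSO-TYPE, DSO-LENGTH (always 4), and RETRY DELAY in
    network (big endian) byte order."""
    return struct.pack("!HHI", DSO_TYPE_RETRY_DELAY, 4, retry_delay_ms)


def decode_retry_delay_tlv(tlv: bytes) -> int:
    """Return RETRY DELAY in milliseconds, or raise ValueError."""
    if len(tlv) != 8:
        raise ValueError("Retry Delay TLV must be exactly 8 bytes")
    dso_type, dso_length, retry_delay_ms = struct.unpack("!HHI", tlv)
    if dso_type != DSO_TYPE_RETRY_DELAY or dso_length != 4:
        raise ValueError("not a well-formed Retry Delay TLV")
    return retry_delay_ms
```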
2435
24367.2.1. Retry Delay TLV Used as a Primary TLV
2437
2438 When used as the Primary TLV in a DSO unidirectional message, the
2439 Retry Delay TLV is sent from server to client. It is used by a
2440 server to instruct a client to close the DSO Session and underlying
2441 connection, and not to reconnect for the indicated time interval.
2442
2443 In this case, it applies to the DSO Session as a whole, and the
2444 client MUST begin closing the DSO Session as described in
2445 Section 6.6.1. The RCODE in the message header SHOULD indicate the
2446 principal reason for the termination:
2447
2448 o NOERROR indicates a routine shutdown or restart.
2449
2450 o FORMERR indicates that a client DSO request was too badly
2451 malformed for the session to continue.
2452
2453 o SERVFAIL indicates that the server is overloaded due to resource
2454 exhaustion and needs to shed load.
2455
2456 o REFUSED indicates that the server has been reconfigured, and at
2457 this time it is now unable to perform one or more of the long-
2458 lived client operations that were previously being performed on
2459 this DSO Session.
2460
2471 o NOTAUTH indicates that the server has been reconfigured and at
2472 this time it is now unable to perform one or more of the long-
2473 lived client operations that were previously being performed on
2474 this DSO Session because it does not have authority over the names
2475 in question (for example, a DNS Push Notification server could be
2476 reconfigured such that it is no longer accepting DNS Push
2477 Notification requests for one or more of the currently subscribed
2478 names).
2479
2480 This document specifies only these RCODE values for the DSO Retry
2481 Delay message. Servers sending DSO Retry Delay messages SHOULD use
2482 one of these values. However, future circumstances may create
2483 situations where other RCODE values are appropriate in DSO Retry
2484 Delay messages, so clients MUST be prepared to accept DSO Retry Delay
2485 messages with any RCODE value.
2486
2487 In some cases, when a server sends a DSO Retry Delay unidirectional
2488 message to a client, there may be more than one reason for the server
2489 wanting to end the session. Possibly, the configuration could have
2490 been changed such that some long-lived client operations can no
2491 longer be continued due to policy (REFUSED), and other long-lived
2492 client operations can no longer be performed due to the server no
2493 longer being authoritative for those names (NOTAUTH). In such cases,
2494 the server MAY use any of the applicable RCODE values, or
2495 RCODE=NOERROR (routine shutdown or restart).
2496
   Note that the selection of RCODE value in a DSO Retry Delay message
   is not critical since the RCODE value is generally used only for
   informational purposes, such as writing to a log file for future
   human analysis regarding the nature of the disconnection. Generally,
2501 clients do not modify their behavior depending on the RCODE value.
2502 The RETRY DELAY in the message tells the client how long it should
2503 wait before attempting a new connection to this service instance.
2504
   Clients that do in some way modify their behavior depending on the
   RCODE value should treat unknown RCODE values the same as
   RCODE=NOERROR (routine shutdown or restart).
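
   For example, a client that logs the reason for a disconnection might
   map RCODE values to text as sketched below. The table and function
   names and the wording of the log strings are assumptions; the RCODE
   numbers are the standard DNS values.

```python
RETRY_DELAY_REASONS = {
    0: "NOERROR: routine shutdown or restart",
    1: "FORMERR: client DSO request too badly malformed",
    2: "SERVFAIL: server overloaded and shedding load",
    5: "REFUSED: server reconfigured, operation no longer permitted",
    9: "NOTAUTH: server no longer has authority over the names",
}


def retry_delay_reason(rcode: int) -> str:
    """Log text for a DSO Retry Delay message. Any RCODE is accepted;
    unknown values are treated the same as NOERROR."""
    return RETRY_DELAY_REASONS.get(rcode, RETRY_DELAY_REASONS[0])
```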
2508
2509 A DSO Retry Delay message (DSO message where the Primary TLV is Retry
2510 Delay) from server to client is a DSO unidirectional message; the
2511 MESSAGE ID MUST be set to zero in the outgoing message and the client
2512 MUST NOT send a response.
2513
2514 A client MUST NOT send a DSO Retry Delay message to a server. If a
2515 server receives a DSO message where the Primary TLV is the Retry
2516 Delay TLV, this is a fatal error and the server MUST forcibly abort
2517 the connection immediately.
2518
25277.2.2. Retry Delay TLV Used as a Response Additional TLV
2528
2529 In the case of a DSO request message that results in a nonzero RCODE
2530 value, the responder MAY append a Retry Delay TLV to the response,
2531 indicating the time interval during which the initiator SHOULD NOT
2532 attempt this operation again.
2533
2534 The indicated time interval during which the initiator SHOULD NOT
2535 retry applies only to the failed operation, not to the DSO Session as
2536 a whole.
2537
2538 Either a client or a server, whichever is acting in the role of the
2539 responder for a particular DSO request message, MAY append a Retry
2540 Delay TLV to an error response that it sends.
2541
25427.3. Encryption Padding TLV
2543
2544 The Encryption Padding TLV (DSO-TYPE=3) can only be used as an
2545 Additional or Response Additional TLV. It is only applicable when
2546 the DSO Transport layer uses encryption such as TLS.
2547
2548 The DSO-DATA for the Padding TLV is optional and is a variable length
2549 field containing non-specified values. A DSO-LENGTH of 0 essentially
2550 provides for 4 bytes of padding (the minimum amount).
2551
                                             1   1   1   1   1   1
     0   1   2   3   4   5   6   7   8   9   0   1   2   3   4   5
   +---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
   /                                                               /
   /              PADDING -- VARIABLE NUMBER OF BYTES              /
   /                                                               /
   +---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
2559
2560 As specified for the EDNS(0) Padding Option [RFC7830], the PADDING
2561 bytes SHOULD be set to 0x00. Other values MAY be used, for example,
2562 in cases where there is a concern that the padded message could be
2563 subject to compression before encryption. PADDING bytes of any value
2564 MUST be accepted in the messages received.
2565
2566 The Encryption Padding TLV may be included in either a DSO request
2567 message, response, or both. As specified for the EDNS(0) Padding
2568 Option [RFC7830], if a DSO request message is received with an
2569 Encryption Padding TLV, then the DSO response MUST also include an
2570 Encryption Padding TLV.
2571
2572 The length of padding is intentionally not specified in this document
2573 and is a function of current best practices with respect to the type
2574 and length of data in the preceding TLVs [RFC8467].
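
   As one possible illustration, a sender that has chosen a block-
   length padding policy could build the Encryption Padding TLV as
   sketched below. The function name, the block_size parameter, and the
   assumption that message_len counts the whole DSO message before the
   padding TLV is appended are all choices made for this example; the
   policy itself is left to current best practices.

```python
import struct

DSO_TYPE_ENCRYPTION_PADDING = 3
TLV_HEADER_LEN = 4  # DSO-TYPE (2 bytes) + DSO-LENGTH (2 bytes)


def padding_tlv_for(message_len: int, block_size: int) -> bytes:
    """Return an Encryption Padding TLV that rounds message_len up to
    the next multiple of block_size (assumed policy parameter)."""
    # The TLV header alone already contributes four bytes.
    pad_bytes = -(message_len + TLV_HEADER_LEN) % block_size
    header = struct.pack("!HH", DSO_TYPE_ENCRYPTION_PADDING, pad_bytes)
    return header + b"\x00" * pad_bytes  # PADDING bytes SHOULD be 0x00
```

   A DSO-LENGTH of 0 is produced when the four header bytes are enough
   to reach the block boundary.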
2575
25838. Summary Highlights
2584
2585 This section summarizes some noteworthy highlights about various
2586 aspects of the DSO protocol.
2587
25888.1. QR Bit and MESSAGE ID
2589
2590 In DSO request messages, the QR bit is 0 and the MESSAGE ID is
2591 nonzero.
2592
2593 In DSO response messages, the QR bit is 1 and the MESSAGE ID is
2594 nonzero.
2595
2596 In DSO unidirectional messages, the QR bit is 0 and the MESSAGE ID is
2597 zero.
2598
2599 The table below illustrates which combinations are legal and how they
2600 are interpreted:
2601
            +------------------------------+------------------------+
            |       MESSAGE ID zero        |   MESSAGE ID nonzero   |
   +--------+------------------------------+------------------------+
   |  QR=0  |  DSO unidirectional message  |  DSO request message   |
   +--------+------------------------------+------------------------+
   |  QR=1  |    Invalid - Fatal Error     |  DSO response message  |
   +--------+------------------------------+------------------------+
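
   The table can be expressed as a small classification function. This
   is a sketch; the function name and the choice to signal the fatal
   error with ValueError are assumptions.

```python
def classify_dso_message(qr: int, message_id: int) -> str:
    """Classify a DSO message from its QR bit and MESSAGE ID."""
    if qr == 0:
        return "unidirectional" if message_id == 0 else "request"
    if message_id == 0:
        # QR=1 with a zero MESSAGE ID is invalid.
        raise ValueError("fatal error: response with zero MESSAGE ID")
    return "response"
```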
2609
26398.2. TLV Usage
2640
2641 The table below indicates, for each of the three TLVs defined in this
2642 document, whether they are valid in each of ten different contexts.
2643
2644 The first five contexts are DSO requests or DSO unidirectional
2645 messages from client to server, and the corresponding responses from
2646 server back to client:
2647
   o  C-P - Primary TLV, sent in a DSO request message, from client to
      server, with a nonzero MESSAGE ID indicating that this request
      MUST generate a response message.

   o  C-U - Primary TLV, sent in a DSO unidirectional message, from
      client to server, with a zero MESSAGE ID indicating that this
      message MUST NOT generate a response message.
2655
2656 o C-A - Additional TLV, optionally added to a DSO request message or
2657 DSO unidirectional message from client to server.
2658
   o  CRP - Response Primary TLV, included in a response message sent
      back to the client (in response to a client "C-P" request with a
      nonzero MESSAGE ID indicating that a response is required) where
      the DSO-TYPE of the Response TLV matches the DSO-TYPE of the
      Primary TLV in the request.

   o  CRA - Response Additional TLV, included in a response message
      sent back to the client (in response to a client "C-P" request
      with a nonzero MESSAGE ID indicating that a response is
      required) where the DSO-TYPE of the Response TLV does not match
      the DSO-TYPE of the Primary TLV in the request.
2670
2671 The second five contexts are their counterparts in the opposite
2672 direction: DSO requests or DSO unidirectional messages from server to
2673 client, and the corresponding responses from client back to server.
2674
   o  S-P - Primary TLV, sent in a DSO request message, from server to
      client, with a nonzero MESSAGE ID indicating that this request
      MUST generate a response message.

   o  S-U - Primary TLV, sent in a DSO unidirectional message, from
      server to client, with a zero MESSAGE ID indicating that this
      message MUST NOT generate a response message.
2682
2683 o S-A - Additional TLV, optionally added to a DSO request message or
2684 DSO unidirectional message from server to client.
2685
   o  SRP - Response Primary TLV, included in a response message sent
      back to the server (in response to a server "S-P" request with a
      nonzero MESSAGE ID indicating that a response is required) where
      the DSO-TYPE of the Response TLV matches the DSO-TYPE of the
      Primary TLV in the request.

   o  SRA - Response Additional TLV, included in a response message
      sent back to the server (in response to a server "S-P" request
      with a nonzero MESSAGE ID indicating that a response is
      required) where the DSO-TYPE of the Response TLV does not match
      the DSO-TYPE of the Primary TLV in the request.
2706
                +-------------------------+-------------------------+
                | C-P  C-U  C-A  CRP  CRA | S-P  S-U  S-A  SRP  SRA |
   +------------+-------------------------+-------------------------+
   | KeepAlive  |  X              X       |       X                 |
   +------------+-------------------------+-------------------------+
   | RetryDelay |                      X  |       X              X  |
   +------------+-------------------------+-------------------------+
   | Padding    |            X         X  |            X         X  |
   +------------+-------------------------+-------------------------+
2716
2717 Note that some of the columns in this table are currently empty. The
2718 table provides a template for future TLV definitions to follow. It
2719 is recommended that definitions of future TLVs include a similar
2720 table summarizing the contexts where the new TLV is valid.
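
   An implementation might encode the same information as a lookup
   structure, as in the following sketch. The entries follow the
   per-TLV rules in Section 7; the names TLV_CONTEXTS and tlv_allowed
   are assumptions made for illustration.

```python
# Contexts in which each TLV defined in this document is valid.
TLV_CONTEXTS = {
    "KeepAlive": {"C-P", "CRP", "S-U"},
    "RetryDelay": {"CRA", "S-U", "SRA"},
    "Padding": {"C-A", "CRA", "S-A", "SRA"},
}


def tlv_allowed(tlv_name: str, context: str) -> bool:
    """Return True if the named TLV is valid in the given context."""
    return context in TLV_CONTEXTS.get(tlv_name, set())
```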
2721
27519. Additional Considerations
2752
27539.1. Service Instances
2754
2755 We use the term "service instance" to refer to software running on a
2756 host that can receive connections on some set of { IP address, port }
2757 tuples. What makes the software an instance is that regardless of
2758 which of these tuples the client uses to connect to it, the client is
2759 connected to the same software, running on the same logical node (see
2760 Section 9.2), and will receive the same answers and the same keying
2761 information.
2762
2763 Service instances are identified from the perspective of the client.
2764 If the client is configured with { IP address, port } tuples, it has
2765 no way to tell if the service offered at one tuple is the same server
2766 that is listening on a different tuple. So in this case, the client
2767 treats each different tuple as if it references a different service
2768 instance.
2769
2770 In some cases, a client is configured with a hostname and a port
2771 number. The port number may be given explicitly, along with the
2772 hostname. The port number may be omitted, and assumed to have some
2773 default value. The hostname and a port number may be learned from
2774 the network, as in the case of DNS SRV records. In these cases, the
2775 { hostname, port } tuple uniquely identifies the service instance,
2776 subject to the usual case-insensitive DNS comparison of names
2777 [RFC1034].
2778
2779 It is possible that two hostnames might point to some common IP
2780 addresses; this is a configuration anomaly that the client is not
2781 obliged to detect. The effect of this could be that after being told
2782 to disconnect, the client might reconnect to the same server because
2783 it is represented as a different service instance.
2784
2785 Implementations SHOULD NOT resolve hostnames and then perform the
2786 process of matching IP address(es) in order to evaluate whether two
2787 entities should be determined to be the "same service instance".
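
   One way a client could derive an identity for a service instance
   from its configuration is sketched below. Consistent with the
   guidance above, no name resolution is performed. The function name
   and the shape of the returned key are assumptions.

```python
import ipaddress
from typing import Tuple


def service_instance_key(host: str, port: int) -> Tuple[str, str, int]:
    """Identity of a service instance from the client's perspective."""
    try:
        # Configured with an address: each { IP address, port } tuple
        # is treated as a different service instance.
        return ("ip", str(ipaddress.ip_address(host)), port)
    except ValueError:
        # Configured with a hostname: the { hostname, port } tuple
        # identifies the instance, compared case-insensitively.
        return ("name", host.lower(), port)
```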
2788
28079.2. Anycast Considerations
2808
2809 When an anycast service is configured on a particular IP address and
2810 port, it must be the case that although there is more than one
2811 physical server responding on that IP address, each such server can
   be treated as equivalent. What we mean by "equivalent" here is that
   all such servers can provide the same service and, where appropriate,
   the same authentication information, such as PKI certificates, when
   establishing connections.
2816
2817 If a change in network topology causes packets in a particular TCP
2818 connection to be sent to an anycast server instance that does not
2819 know about the connection, the new server will automatically
2820 terminate the connection with a TCP reset, since it will have no
2821 record of the connection, and then the client can reconnect or stop
2822 using the connection as appropriate.
2823
2824 If, after the connection is re-established, the client's assumption
2825 that it is connected to the same instance is violated in some way,
   that would be considered incorrect behavior in this context. It
   is, however, out of scope for this specification to make specific
   recommendations in this regard; that is left to follow-on documents
   that describe specific uses of DNS Stateful Operations.
2830
28639.3. Connection Sharing
2864
2865 As previously specified for DNS-over-TCP [RFC7766]:
2866
2867 To mitigate the risk of unintentional server overload, DNS
2868 clients MUST take care to minimize the number of concurrent
2869 TCP connections made to any individual server. It is RECOMMENDED
2870 that for any given client/server interaction there SHOULD be
2871 no more than one connection for regular queries, one for zone
2872 transfers, and one for each protocol that is being used on top
2873 of TCP (for example, if the resolver was using TLS). However,
2874 it is noted that certain primary/secondary configurations
2875 with many busy zones might need to use more than one TCP
2876 connection for zone transfers for operational reasons (for
2877 example, to support concurrent transfers of multiple zones).
2878
2879 A single server may support multiple services, including DNS Updates
2880 [RFC2136], DNS Push Notifications [Push], and other services, for one
2881 or more DNS zones. When a client discovers that the target server
2882 for several different operations is the same service instance (see
2883 Section 9.1), the client SHOULD use a single shared DSO Session for
2884 all those operations.
2885
2886 This requirement has two benefits. First, it reduces unnecessary
2887 connection load on the DNS server. Second, it avoids the connection
2888 startup time that would be spent establishing each new additional
2889 connection to the same DNS server.
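
   A client-side sketch of session sharing, assuming a hypothetical
   SessionPool class and a caller-supplied connect function that
   establishes a DSO Session for a given service instance key:

```python
from typing import Callable, Dict, Hashable


class SessionPool:
    """Hands out one shared session object per service instance."""

    def __init__(self, connect: Callable[[Hashable], object]) -> None:
        self._connect = connect
        self._sessions: Dict[Hashable, object] = {}

    def get(self, instance_key: Hashable) -> object:
        # Reuse the existing session when the target is the same
        # service instance; connect only on first use.
        if instance_key not in self._sessions:
            self._sessions[instance_key] = self._connect(instance_key)
        return self._sessions[instance_key]

    def discard(self, instance_key: Hashable) -> None:
        # Forget a session that has been closed or aborted.
        self._sessions.pop(instance_key, None)
```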
2890
2891 However, server implementers and operators should be aware that
2892 connection sharing may not be possible in all cases. A single host
2893 device may be home to multiple independent client software instances
2894 that don't coordinate with each other. Similarly, multiple
2895 independent client devices behind the same NAT gateway will also
2896 typically appear to the DNS server as different source ports on the
2897 same client IP address. Because of these constraints, a DNS server
2898 MUST be prepared to accept multiple connections from different source
2899 ports on the same client IP address.
2900
29199.4. Operational Considerations for Middleboxes
2920
2921 Where an application-layer middlebox (e.g., a DNS proxy, forwarder,
2922 or session multiplexer) is in the path, care must be taken to avoid a
2923 configuration in which DSO traffic is mishandled. The simplest way
2924 to avoid such problems is to avoid using middleboxes. When this is
2925 not possible, middleboxes should be evaluated to make sure that they
2926 behave correctly.
2927
2928 Correct behavior for middleboxes consists of one of the following:
2929
2930 o The middlebox does not forward DSO messages and responds to DSO
2931 messages with a response code other than NOERROR or DSOTYPENI.
2932
2933 o The middlebox acts as a DSO server and follows this specification
2934 in establishing connections.
2935
2936 o There is a 1:1 correspondence between incoming and outgoing
2937 connections such that when a connection is established to the
2938 middlebox, it is guaranteed that exactly one corresponding
2939 connection will be established from the middlebox to some DNS
2940 resolver, and all incoming messages will be forwarded without
2941 modification or reordering. An example of this would be a NAT
2942 forwarder or TCP connection optimizer (e.g., for a high-latency
2943 connection such as a geosynchronous satellite link).
2944
2945 Middleboxes that do not meet one of the above criteria are very
2946 likely to fail in unexpected and difficult-to-diagnose ways. For
2947 example, a DNS load balancer might unbundle DNS messages from the
2948 incoming TCP stream and forward each message from the stream to a
2949 different DNS server. If such a load balancer is in use, and the DNS
2950 servers it points to implement DSO and are configured to enable DSO,
2951 DSO Session establishment will succeed, but no coherent session will
2952 exist between the client and the server. If such a load balancer is
2953 pointed at a DNS server that does not implement DSO or is configured
2954 not to allow DSO, no such problem will exist, but such a
2955 configuration risks unexpected failure if new server software is
2956 installed that does implement DSO.
2957
2958 It is of course possible to implement a middlebox that properly
2959 supports DSO. It is even possible to implement one that implements
2960 DSO with long-lived operations. This can be done either by
2961 maintaining a 1:1 correspondence between incoming and outgoing
2962 connections, as mentioned above, or by terminating incoming sessions
2963 at the middlebox but maintaining state in the middlebox about any
2964 long-lived operations that are requested. Specifying this in detail
2965 is beyond the scope of this document.
2966
29759.5. TCP Delayed Acknowledgement Considerations
2976
2977 Most modern implementations of the Transmission Control Protocol
2978 (TCP) include a feature called "Delayed Acknowledgement" [RFC1122].
2979
2980 Without this feature, TCP can be very wasteful on the network. For
2981 illustration, consider a simple example like remote login using a
2982 very simple TCP implementation that lacks delayed acks. When the
2983 user types a keystroke, a data packet is sent. When the data packet
2984 arrives at the server, the simple TCP implementation sends an
2985 immediate acknowledgement. Mere milliseconds later, the server
2986 process reads the one byte of keystroke data, and consequently the
2987 simple TCP implementation sends an immediate window update. Mere
2988 milliseconds later, the server process generates the character echo
2989 and sends this data back in reply. The simple TCP implementation
2990 then sends this data packet immediately too. In this case, this
2991 simple TCP implementation sends a burst of three packets almost
2992 instantaneously (ack, window update, data).
2993
2994 Clearly it would be more efficient if the TCP implementation were to
2995 combine the three separate packets into one, and this is what the
2996 delayed ack feature enables.
2997
2998 With delayed ack, the TCP implementation waits after receiving a data
2999 packet, typically for 200 ms, and then sends its ack if (a) more data
3000 packet(s) arrive, (b) the receiving process generates some reply
3001 data, or (c) 200 ms elapse without either of the above occurring.
3002
3003 With delayed ack, remote login becomes much more efficient,
3004 generating just one packet instead of three for each character echo.
3005
3006 The logic of delayed ack is that the 200 ms delay cannot do any
3007 significant harm. If something at the other end were waiting for
3008 something, then the receiving process should generate the reply that
3009 the thing at the other end is waiting for, and TCP will then
3010 immediately send that reply (combined with the ack and window
3011 update). And if the receiving process does not in fact generate any
3012 reply for this particular message, then by definition the thing at
3013 the other end cannot be waiting for anything. Therefore, the 200 ms
3014 delay is harmless.
3015
3016 This assumption may be true unless the sender is using Nagle's
3017 algorithm, a similar efficiency feature, created to protect the
3018 network from poorly written client software that performs many rapid
3019 small writes in succession. Nagle's algorithm allows these small
3020 writes to be coalesced into larger, less wasteful packets.
3021
3031 Unfortunately, Nagle's algorithm and delayed ack, two valuable
3032 efficiency features, can interact badly with each other when used
3033 together [NagleDA].
3034
3035 DSO request messages elicit responses; DSO unidirectional messages
3036 and DSO response messages do not.
3037
3038 For DSO request messages, which do elicit responses, Nagle's
3039 algorithm and delayed ack work as intended.
3040
3041 For DSO messages that do not elicit responses, the delayed ack
3042 mechanism causes the ack to be delayed by 200 ms. The 200 ms delay
3043 on the ack can in turn cause Nagle's algorithm to prevent the sender
3044 from sending any more data for 200 ms until the awaited ack arrives.
3045 On an enterprise Gigabit Ethernet (GigE) backbone with sub-
3046 millisecond round-trip times, a 200 ms delay is enormous in
3047 comparison.
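
   The size of this stall can be estimated with a deliberately
   simplified model. The sketch below assumes one small segment in
   flight, a sender that must wait for its ack before sending the next
   small segment, and a receiver that acks either when its reply is
   ready or when the delayed ack timer fires. The function name and the
   1 ms default processing time are assumptions chosen for illustration.

```python
def next_small_send_delay_ms(rtt_ms, elicits_response,
                             processing_ms=1.0, delayed_ack_ms=200.0):
    """Estimate how long Nagle's algorithm holds the next small write.

    The sender waits for the ack of its previous small segment. The
    receiver releases that ack when reply data is generated, or when
    the delayed ack timer expires, whichever comes first.
    """
    if elicits_response:
        receiver_wait = min(processing_ms, delayed_ack_ms)
    else:
        # No reply is ever generated, so only the timer releases the ack.
        receiver_wait = delayed_ack_ms
    return rtt_ms + receiver_wait
```

   With a 0.5 ms round-trip time, this model gives 1.5 ms for a message
   that elicits a response and 200.5 ms for one that does not.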
3048
   When this issue is raised, two solutions are often offered, neither
   of them ideal:
3051
3052 1. Disable delayed ack. For DSO messages that elicit no response,
3053 removing delayed ack avoids the needless 200 ms delay and sends
3054 back an immediate ack that tells Nagle's algorithm that it should
3055 immediately grant the sender permission to send its next packet.
3056 Unfortunately, for DSO messages that *do* elicit a response,
3057 removing delayed ack removes the efficiency gains of combining
3058 acks with data, and the responder will now send two or three
3059 packets instead of one.
3060
3061 2. Disable Nagle's algorithm. When acks are delayed by the delayed
3062 ack algorithm, removing Nagle's algorithm prevents the sender
3063 from being blocked from sending its next small packet
3064 immediately. Unfortunately, on a network with a higher round-
3065 trip time, removing Nagle's algorithm removes the efficiency
3066 gains of combining multiple small packets into fewer larger ones,
3067 with the goal of limiting the number of small packets in flight
3068 at any one time.
3069
3070 The problem here is that with DSO messages that elicit no response,
3071 the TCP implementation is stuck waiting, unsure if a response is
3072 about to be generated or whether the TCP implementation should go
3073 ahead and send an ack and window update.
3074
3075 The solution is networking APIs that allow the receiver to inform the
3076 TCP implementation that a received message has been read, processed,
3077 and no response for this message will be generated. TCP can then
3087 stop waiting for a response that will never come, and immediately go
3088 ahead and send an ack and window update.
3089
3090 For implementations of DSO, disabling delayed ack is NOT RECOMMENDED
3091 because of the harm this can do to the network.
3092
3093 For implementations of DSO, disabling Nagle's algorithm is NOT
3094 RECOMMENDED because of the harm this can do to the network.
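
   An implementation that follows this guidance might verify that
   nothing else in the process has disabled Nagle's algorithm on a DSO
   connection. The following is a minimal sketch using the TCP_NODELAY
   option from Python's standard socket module; the helper name
   nagle_enabled is hypothetical.

```python
import socket


def nagle_enabled(sock):
    """Return True if Nagle's algorithm is active on a TCP socket.

    TCP_NODELAY reads back as zero while Nagle's algorithm is enabled,
    which is the default for a newly created TCP socket.
    """
    return sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY) == 0
```

   A caller could log a warning or refuse to start a DSO Session when
   this check returns False.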
3095
3096 At the time that this document is being prepared for publication, it
3097 is known that at least one TCP implementation provides the ability
3098 for the recipient of a TCP message to signal that it is not going to
3099 send a response, and hence the delayed ack mechanism can stop
3100 waiting. Implementations on operating systems where this feature is
3101 available SHOULD make use of it.
3102
10.  IANA Considerations
3144
10.1.  DSO OPCODE Registration
3146
3147 The IANA has assigned the value 6 for DNS Stateful Operations (DSO)
3148 in the "DNS OpCodes" registry.
3149
10.2.  DSO RCODE Registration
3151
3152 IANA has assigned the value 11 for the DSOTYPENI error code in the
3153 "DNS RCODEs" registry. The DSOTYPENI error code ("DSO-TYPE Not
3154 Implemented") indicates that the receiver does implement DNS Stateful
3155 Operations, but does not implement the specific DSO-TYPE of the
3156 Primary TLV in the DSO request message.
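
   As an illustration of where these two values sit on the wire, the
   sketch below packs and parses a 12-octet DNS header carrying
   OPCODE 6 and, optionally, RCODE 11. It assumes the RFC 1035 header
   layout with all four count fields set to zero; the function names
   are hypothetical and TLV data is not handled.

```python
import struct

DSO_OPCODE = 6        # "DNS OpCodes" registry value (Section 10.1)
RCODE_DSOTYPENI = 11  # "DNS RCODEs" registry value (Section 10.2)


def pack_dso_header(message_id, is_response, rcode=0):
    """Build a 12-octet DNS header with the DSO OPCODE."""
    # QR is the top bit, OPCODE occupies the next four bits, and RCODE
    # the low four bits of the 16-bit flags word.
    flags = (int(is_response) << 15) | (DSO_OPCODE << 11) | (rcode & 0xF)
    # The four count fields are all zero in this sketch.
    return struct.pack("!HHHHHH", message_id, flags, 0, 0, 0, 0)


def unpack_dso_header(header):
    """Return (message_id, is_response, opcode, rcode) from a header."""
    message_id, flags = struct.unpack("!HH", header[:4])
    return message_id, bool(flags >> 15), (flags >> 11) & 0xF, flags & 0xF
```

   A responder that does not implement the requested DSO-TYPE could
   build its reply header with
   pack_dso_header(message_id, True, RCODE_DSOTYPENI).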
3157
10.3.  DSO Type Code Registry
3159
3160 The IANA has created the 16-bit "DSO Type Codes" registry, with
3161 initial (hexadecimal) values as shown below:
3162
   +-----------+-----------------------+-------+-----------+-----------+
   | Type      | Name                  | Early | Status    | Reference |
   |           |                       | Data  |           |           |
   +-----------+-----------------------+-------+-----------+-----------+
   | 0000      | Reserved              | NO    | Standards | RFC 8490  |
   |           |                       |       | Track     |           |
   |           |                       |       |           |           |
   | 0001      | KeepAlive             | OK    | Standards | RFC 8490  |
   |           |                       |       | Track     |           |
   |           |                       |       |           |           |
   | 0002      | RetryDelay            | NO    | Standards | RFC 8490  |
   |           |                       |       | Track     |           |
   |           |                       |       |           |           |
   | 0003      | EncryptionPadding     | NA    | Standards | RFC 8490  |
   |           |                       |       | Track     |           |
   |           |                       |       |           |           |
   | 0004-003F | Unassigned, reserved  | NO    |           |           |
   |           | for DSO session-      |       |           |           |
   |           | management TLVs       |       |           |           |
   |           |                       |       |           |           |
   | 0040-F7FF | Unassigned            | NO    |           |           |
   |           |                       |       |           |           |
   | F800-FBFF | Experimental/local    | NO    |           |           |
   |           | use                   |       |           |           |
   |           |                       |       |           |           |
   | FC00-FFFF | Reserved for future   | NO    |           |           |
   |           | expansion             |       |           |           |
   +-----------+-----------------------+-------+-----------+-----------+
3191
3199 The meanings of the fields are as follows:
3200
3201 Type: The 16-bit DSO type code.
3202
3203 Name: The human-readable name of the TLV.
3204
   Early Data: If OK, this TLV may be sent as early data in a TLS zero
      round-trip (Section 2.3 of the TLS 1.3 specification [RFC8446])
      initial handshake. If NA, the TLV may appear as an Additional TLV
      in a DSO message that is sent as early data. If NO, the TLV is
      not permitted as early data.
3209
3210 Status: RFC status (e.g., "Standards Track") or "External" if not
3211 documented in an RFC.
3212
3213 Reference: A stable reference to the document in which this TLV is
3214 defined.
3215
3216 Note: DSO Type Code zero is reserved and is not currently intended
3217 for allocation.
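
   The Early Data column of the initial registry can be represented as
   a simple lookup. This is a sketch only: the names below are
   hypothetical, and treating every unlisted code as "NO" reflects the
   initial table contents rather than any later registrations.

```python
# Early Data column of the initial "DSO Type Codes" registry.
EARLY_DATA = {
    0x0001: "OK",  # KeepAlive
    0x0002: "NO",  # RetryDelay
    0x0003: "NA",  # EncryptionPadding
}


def early_data_status(type_code):
    """Return the Early Data column value for a 16-bit DSO type code."""
    if not 0 <= type_code <= 0xFFFF:
        raise ValueError("DSO type codes are 16-bit values")
    # Every other row of the initial registry, including 0000, lists NO.
    return EARLY_DATA.get(type_code, "NO")


def may_be_sent_as_early_data(type_code):
    """True only for TLVs whose Early Data column is OK."""
    return early_data_status(type_code) == "OK"
```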
3218
3219 Registrations of new DSO Type Codes in the "Reserved for DSO session-
3220 management" range 0004-003F and the "Reserved for future expansion"
3221 range FC00-FFFF require publication of an IETF Standards Action
3222 document [RFC8126].
3223
3224 Requests to register additional new DSO Type Codes in the
3225 "Unassigned" range 0040-F7FF are to be recorded by IANA after Expert
3226 Review [RFC8126]. The expert review should validate that the
3227 requested type code is specified in a way that conforms to this
3228 specification, and that the intended use for the code would not be
3229 addressed with an experimental/local assignment.
3230
3231 DSO Type Codes in the "experimental/local" range F800-FBFF may be
3232 used as Experimental Use or Private Use values [RFC8126] and may be
3233 used freely for development purposes or for other purposes within a
3234 single site. No attempt is made to prevent multiple sites from using
3235 the same value in different (and incompatible) ways. There is no
3236 need for IANA to review such assignments (since IANA does not record
3237 them) and assignments are not generally useful for broad
3238 interoperability. It is the responsibility of the sites making use
3239 of "experimental/local" values to ensure that no conflicts occur
3240 within the intended scope of use.
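
   The range boundaries and registration policies described above can
   be summarized in code. The following is an illustrative sketch; the
   function name and the returned labels are hypothetical and are not
   part of the registry.

```python
def registration_policy(type_code):
    """Map a 16-bit DSO type code to the policy governing its range."""
    if not 0 <= type_code <= 0xFFFF:
        raise ValueError("DSO type codes are 16-bit values")
    if type_code == 0x0000:
        return "reserved"            # not currently intended for allocation
    if type_code <= 0x0003:
        return "assigned"            # KeepAlive, RetryDelay, EncryptionPadding
    if type_code <= 0x003F:
        return "standards-action"    # DSO session-management TLVs
    if type_code <= 0xF7FF:
        return "expert-review"       # general "Unassigned" range
    if type_code <= 0xFBFF:
        return "experimental-local"  # not recorded by IANA
    return "standards-action"        # FC00-FFFF, future expansion
```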
3241
3242 Any document defining a new TLV that lists a value of "OK" in the
3243 Early Data column must include a threat analysis for the use of the
3244 TLV in the case of TLS zero round-trip. See Section 11.1 for
3245 details.
3246
11.  Security Considerations
3256
3257 If this mechanism is to be used with DNS-over-TLS, then these
3258 messages are subject to the same constraints as any other DNS-over-
3259 TLS messages and MUST NOT be sent in the clear before the TLS session
3260 is established.
3261
3262 The data field of the "Encryption Padding" TLV could be used as a
3263 covert channel.
3264
3265 When designing new DSO TLVs, the potential for data in the TLV to be
3266 used as a tracking identifier should be taken into consideration and
3267 should be avoided when not required.
3268
3269 When used without TLS or similar cryptographic protection, a
3270 malicious entity may be able to inject a malicious unidirectional DSO
3271 Retry Delay message into the data stream, specifying an unreasonably
3272 large RETRY DELAY, causing a denial-of-service attack against the
3273 client.
3274
3275 The establishment of DSO Sessions has an impact on the number of open
3276 TCP connections on a DNS server. Additional resources may be used on
3277 the server as a result. However, because the server can limit the
3278 number of DSO Sessions established and can also close existing DSO
3279 Sessions as needed, denial of service or resource exhaustion should
3280 not be a concern.
3281
11.1.  TLS Zero Round-Trip Considerations
3283
3284 DSO permits zero round-trip operation using TCP Fast Open with
3285 TLS 1.3 [RFC8446] early data to reduce or eliminate round trips in
3286 session establishment. TCP Fast Open is only permitted in
3287 combination with TLS 1.3 early data. In the rest of this section, we
3288 refer to TLS 1.3 early data in a TLS zero round-trip initial
3289 handshake message, regardless of whether or not it is included in a
3290 TCP SYN packet with early data using the TCP Fast Open option, as
3291 "early data."
3292
3293 A DSO message may or may not be permitted to be sent as early data.
3294 The definition for each TLV that can be used as a Primary TLV is
3295 required to state whether or not that TLV is permitted as early data.
3296 Only response-requiring messages are ever permitted as early data,
3297 and only clients are permitted to send a DSO message as early data
3298 unless there is an implicit DSO session (see Section 5.1).
3299
3311 For DSO messages that are permitted as early data, a client MAY
3312 include one or more such messages as early data without having to
3313 wait for a DSO response to the first DSO request message to confirm
3314 successful establishment of a DSO Session.
3315
3316 However, unless there is an implicit DSO session, a client MUST NOT
3317 send DSO unidirectional messages until after a DSO Session has been
3318 mutually established.
3319
3320 Similarly, unless there is an implicit DSO session, a server MUST NOT
3321 send DSO request messages until it has received a response-requiring
3322 DSO request message from a client and transmitted a successful
3323 NOERROR response for that request.
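
   The two restrictions above amount to a gate on what each side may
   send before the DSO Session exists. The sketch below is one possible
   way to track that state; the class and method names are
   hypothetical, and it assumes each side treats the session as
   established once a response-requiring DSO request has been answered
   with NOERROR.

```python
class DsoSessionGate:
    """Tracks DSO Session establishment for pre-send checks."""

    def __init__(self, implicit_session=False):
        # An implicit DSO session lifts both restrictions from the start.
        self.implicit_session = implicit_session
        self.established = False

    def on_noerror_response(self):
        """Record a NOERROR response to a response-requiring DSO request.

        A client calls this when it receives the response; a server
        calls it when it has transmitted the response.
        """
        self.established = True

    def client_may_send_unidirectional(self):
        return self.implicit_session or self.established

    def server_may_send_request(self):
        return self.implicit_session or self.established
```

   A client would consult client_may_send_unidirectional() before
   queueing a unidirectional message, while still being free to send
   permitted DSO request messages as early data.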
3324
3325 Caution must be taken to ensure that DSO messages sent as early data
3326 are idempotent or are otherwise immune to any problems that could
3327 result from the inadvertent replay that can occur with zero round-
3328 trip operation.
3329
3330 It would be possible to add a TLV that requires the server to do some
3331 significant work and send that to the server as initial data in a TCP
3332 SYN packet. A flood of such packets could be used as a DoS attack on
3333 the server. None of the TLVs defined here have this property.
3334
3335 If a new TLV is specified that does have this property, that TLV must
3336 be specified as not permitted in zero round-trip messages. This
3337 prevents work from being done until a round-trip has occurred from
3338 the server to the client to verify that the source address of the
3339 packet is reachable.
3340
3341 Documents that define new TLVs must state whether each new TLV may be
3342 sent as early data. Such documents must include a threat analysis in
3343 the security considerations section for each TLV defined in the
   document that may be sent as early data. This threat analysis should
   be based on the advice given in Sections 2.3 and 8 and Appendix E.5
   of the TLS 1.3 specification [RFC8446].
3347
12.  References
3349
12.1.  Normative References
3351
3352 [RFC1034] Mockapetris, P., "Domain names - concepts and facilities",
3353 STD 13, RFC 1034, DOI 10.17487/RFC1034, November 1987,
3354 <https://www.rfc-editor.org/info/rfc1034>.
3355
3356 [RFC1035] Mockapetris, P., "Domain names - implementation and
3357 specification", STD 13, RFC 1035, DOI 10.17487/RFC1035,
3358 November 1987, <https://www.rfc-editor.org/info/rfc1035>.
3359
3367 [RFC1918] Rekhter, Y., Moskowitz, B., Karrenberg, D., de Groot, G.,
3368 and E. Lear, "Address Allocation for Private Internets",
3369 BCP 5, RFC 1918, DOI 10.17487/RFC1918, February 1996,
3370 <https://www.rfc-editor.org/info/rfc1918>.
3371
3372 [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
3373 Requirement Levels", BCP 14, RFC 2119,
3374 DOI 10.17487/RFC2119, March 1997,
3375 <https://www.rfc-editor.org/info/rfc2119>.
3376
3377 [RFC2136] Vixie, P., Ed., Thomson, S., Rekhter, Y., and J. Bound,
3378 "Dynamic Updates in the Domain Name System (DNS UPDATE)",
3379 RFC 2136, DOI 10.17487/RFC2136, April 1997,
3380 <https://www.rfc-editor.org/info/rfc2136>.
3381
3382 [RFC6891] Damas, J., Graff, M., and P. Vixie, "Extension Mechanisms
3383 for DNS (EDNS(0))", STD 75, RFC 6891,
3384 DOI 10.17487/RFC6891, April 2013,
3385 <https://www.rfc-editor.org/info/rfc6891>.
3386
3387 [RFC7766] Dickinson, J., Dickinson, S., Bellis, R., Mankin, A., and
3388 D. Wessels, "DNS Transport over TCP - Implementation
3389 Requirements", RFC 7766, DOI 10.17487/RFC7766, March 2016,
3390 <https://www.rfc-editor.org/info/rfc7766>.
3391
3392 [RFC7830] Mayrhofer, A., "The EDNS(0) Padding Option", RFC 7830,
3393 DOI 10.17487/RFC7830, May 2016,
3394 <https://www.rfc-editor.org/info/rfc7830>.
3395
3396 [RFC8126] Cotton, M., Leiba, B., and T. Narten, "Guidelines for
3397 Writing an IANA Considerations Section in RFCs", BCP 26,
3398 RFC 8126, DOI 10.17487/RFC8126, June 2017,
3399 <https://www.rfc-editor.org/info/rfc8126>.
3400
3401 [RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC
3402 2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174,
3403 May 2017, <https://www.rfc-editor.org/info/rfc8174>.
3404
12.2.  Informative References
3406
3407 [Fail] Andrews, M. and R. Bellis, "A Common Operational Problem
3408 in DNS Servers - Failure To Communicate", Work in
3409 Progress, draft-ietf-dnsop-no-response-issue-13, February
3410 2019.
3411
3423 [NagleDA] Cheshire, S., "TCP Performance problems caused by
3424 interaction between Nagle's Algorithm and Delayed ACK",
3425 May 2005,
3426 <http://www.stuartcheshire.org/papers/nagledelayedack/>.
3427
3428 [Push] Pusateri, T. and S. Cheshire, "DNS Push Notifications",
3429 Work in Progress, draft-ietf-dnssd-push-18, March 2019.
3430
3431 [Relay] Lemon, T. and S. Cheshire, "Multicast DNS Discovery
3432 Relay", Work in Progress, draft-ietf-dnssd-mdns-relay-02,
3433 March 2019.
3434
3435 [RFC1122] Braden, R., Ed., "Requirements for Internet Hosts -
3436 Communication Layers", STD 3, RFC 1122,
3437 DOI 10.17487/RFC1122, October 1989,
3438 <https://www.rfc-editor.org/info/rfc1122>.
3439
3440 [RFC2132] Alexander, S. and R. Droms, "DHCP Options and BOOTP Vendor
3441 Extensions", RFC 2132, DOI 10.17487/RFC2132, March 1997,
3442 <https://www.rfc-editor.org/info/rfc2132>.
3443
3444 [RFC5382] Guha, S., Ed., Biswas, K., Ford, B., Sivakumar, S., and P.
3445 Srisuresh, "NAT Behavioral Requirements for TCP", BCP 142,
3446 RFC 5382, DOI 10.17487/RFC5382, October 2008,
3447 <https://www.rfc-editor.org/info/rfc5382>.
3448
3449 [RFC6762] Cheshire, S. and M. Krochmal, "Multicast DNS", RFC 6762,
3450 DOI 10.17487/RFC6762, February 2013,
3451 <https://www.rfc-editor.org/info/rfc6762>.
3452
3453 [RFC6763] Cheshire, S. and M. Krochmal, "DNS-Based Service
3454 Discovery", RFC 6763, DOI 10.17487/RFC6763, February 2013,
3455 <https://www.rfc-editor.org/info/rfc6763>.
3456
3457 [RFC7413] Cheng, Y., Chu, J., Radhakrishnan, S., and A. Jain, "TCP
3458 Fast Open", RFC 7413, DOI 10.17487/RFC7413, December 2014,
3459 <https://www.rfc-editor.org/info/rfc7413>.
3460
3461 [RFC7828] Wouters, P., Abley, J., Dickinson, S., and R. Bellis, "The
3462 edns-tcp-keepalive EDNS0 Option", RFC 7828,
3463 DOI 10.17487/RFC7828, April 2016,
3464 <https://www.rfc-editor.org/info/rfc7828>.
3465
3466 [RFC7857] Penno, R., Perreault, S., Boucadair, M., Ed., Sivakumar,
3467 S., and K. Naito, "Updates to Network Address Translation
3468 (NAT) Behavioral Requirements", BCP 127, RFC 7857,
3469 DOI 10.17487/RFC7857, April 2016,
3470 <https://www.rfc-editor.org/info/rfc7857>.
3471
3479 [RFC7858] Hu, Z., Zhu, L., Heidemann, J., Mankin, A., Wessels, D.,
3480 and P. Hoffman, "Specification for DNS over Transport
3481 Layer Security (TLS)", RFC 7858, DOI 10.17487/RFC7858, May
3482 2016, <https://www.rfc-editor.org/info/rfc7858>.
3483
3484 [RFC8446] Rescorla, E., "The Transport Layer Security (TLS) Protocol
3485 Version 1.3", RFC 8446, DOI 10.17487/RFC8446, August 2018,
3486 <https://www.rfc-editor.org/info/rfc8446>.
3487
3488 [RFC8467] Mayrhofer, A., "Padding Policies for Extension Mechanisms
3489 for DNS (EDNS(0))", RFC 8467, DOI 10.17487/RFC8467,
3490 October 2018, <https://www.rfc-editor.org/info/rfc8467>.
3491
3492 [RFC8484] Hoffman, P. and P. McManus, "DNS Queries over HTTPS
3493 (DoH)", RFC 8484, DOI 10.17487/RFC8484, October 2018,
3494 <https://www.rfc-editor.org/info/rfc8484>.
3495
Acknowledgements
3497
3498 Thanks to Stephane Bortzmeyer, Tim Chown, Ralph Droms, Paul Hoffman,
3499 Jan Komissar, Edward Lewis, Allison Mankin, Rui Paulo, David
3500 Schinazi, Manju Shankar Rao, Bernie Volz, and Bob Harold for their
3501 helpful contributions to this document.
3502
Authors' Addresses
3504
3505 Ray Bellis
3506 Internet Systems Consortium, Inc.
3507 950 Charter Street
3508 Redwood City, CA 94063
3509 United States of America
3510
3511 Phone: +1 (650) 423-1200
3512 Email: ray@isc.org
3513
3514
3515 Stuart Cheshire
3516 Apple Inc.
3517 One Apple Park Way
3518 Cupertino, CA 95014
3519 United States of America
3520
3521 Phone: +1 (408) 996-1010
3522 Email: cheshire@apple.com
3523
3524
3535 John Dickinson
3536 Sinodun Internet Technologies
   Magdalen Centre
3538 Oxford Science Park
3539 Oxford OX4 4GA
3540 United Kingdom
3541
3542 Email: jad@sinodun.com
3543
3544
3545 Sara Dickinson
3546 Sinodun Internet Technologies
   Magdalen Centre
3548 Oxford Science Park
3549 Oxford OX4 4GA
3550 United Kingdom
3551
3552 Email: sara@sinodun.com
3553
3554
3555 Ted Lemon
3556 Nibbhaya Consulting
3557 P.O. Box 958
3558 Brattleboro, VT 05302-0958
3559 United States of America
3560
3561 Email: mellon@fugue.com
3562
3563
3564 Tom Pusateri
3565 Unaffiliated
3566 Raleigh, NC 27608
3567 United States of America
3568
3569 Phone: +1 (919) 867-1330
3570 Email: pusateri@bangj.com