@Comment(-*-SCRIBE-*-) @Comment(SCRIBE Text Formatter Input for the KERMIT Protocol Manual) @Make @Comment(Use /draft:F on command line to produce .LPT file w/cc in column 1) @Style @Modify @Comment(Printing Device Dependencies) @Define @Case @Comment(Set desired spacing around various environments) @Modify @Modify @Modify @Modify @Modify @Modify @Comment(Set spacing and paging requirements for chapter & section headings) @Modify(Hdx,Above 2,Below 1,Need 8) @Modify(Hd0,Above 2,Below 1,Need 8) @Modify(Hd2,Above 2,Below 1,Need 8) @Modify(Hd3,Above 2,Below 1,Need 8) @Modify(Hd4,Above 2,Below 1,Need 8) @String(What="Manual") @Comment(Ptxt is for text -- comments -- in a program) @Define(Ptxt=text,continue,break off,fill,facecode R,leftmargin +3,indent 0, initialize "@ ") @Comment(Start the document off with a titlepage) @Begin(TitlePage,Initialize "@BlankSpace(2.5inches)",sink 0) @MajorHeading(KERMIT PROTOCOL MANUAL) @i Frank da Cruz, Bill Catchings@foot Columbia University Center for Computing Activities New York, New York 10027 @value Copyright (C) 1981,1982,1983 Trustees of Columbia University in the City of New York @i @end(titlepage) @case @pageheading @PrefaceSection In the second edition, the Kermit manual contained everything -- user's guide, protocol description, appendices describing the operation of various microcomputers, etc. Due to the proliferation of Kermit implementations, and to the growing number of users who aren't interested in the protocol, the manual has been split in two -- a Users Guide and a Protocol Manual. This is the Protocol Manual. It is intended for use by those who wish to produce a new implementation of Kermit. Readers are assumed to be thoroughly familiar with the Kermit Users Guide. In splitting the manual into two parts, some new material was added to each part. This part has a new state table, descriptions of some of the features new to version 2 of the protocol (8-bit quoting, server functions). 
A minor bug was fixed in the @q listing. Version 3 differs from version 2 by allowing for optional checksum types and data compression. The version 3 manual also clarifies some minor points that were left unstated in the version 2 manual, and the @q listing has been replaced by one for an actual production UNIX KERMIT. Before attempting to write a new Kermit program, be sure to check with the authors to make sure that no one else is working on the same thing, and that you have the latest information. To avoid confusion, the Columbia University DEC-20 implementation of KERMIT should be considered the definitive implementation of the protocol. The true test of any new Kermit is whether it can talk to Columbia DEC-20 Kermit. KERMIT is distributed from Columbia University on magnetic tape. Complete ordering instructions are in the Kermit Users Guide. @begin KERMIT Distribution Columbia University Center for Computing Activities 7th Floor, Watson Laboratory 612 West 115th Street New York, NY 10025 @end No warranty of the software nor of the accuracy of the documentation surrounding it is expressed or implied, and neither the authors nor Columbia University acknowledge any liability resulting from program or documentation errors. @Chapter @label<-protocol> This manual describes the @Index[KERMIT] KERMIT @Index[Protocol] protocol. It is assumed that you understand the purpose and operation of the Kermit file transfer facility, described in the @i. @Section The KERMIT file transfer protocol is intended for use in an environment where there is a diverse mixture of computers of all types -- micros, workstations, laboratory computers, timesharing systems. All these systems need have in common is the ability to communicate in ASCII over ordinary serial telecommunication lines. KERMIT makes no assumptions about the speed, flow control, or duplex of the systems involved. Various protocols for transferring files over TTY lines already existed when KERMIT was developed. 
Unlike many of its predecessors, KERMIT is not truly full duplex or "asynchronous"; in order to accommodate itself to half-@|duplex systems (and to the DEC-20, with its "Achilles Heel" front end), it does not "stack" packets and it does not send long packets; it normally waits for a reply to each packet it sends. Thus it cannot achieve transfer rates as high as those of truly asynchronous full duplex protocols. Nevertheless, KERMIT runs at 50-80% efficiency (user bits / baud rate). Its half duplex mode of communication allows it to be implemented on computers with either full or half duplex terminal communications, and in fact it presently runs on IBM mainframes (half duplex), UNIX systems (full duplex, but line-@|oriented), and on character-@|oriented full duplex systems (TOPS-20, most micros). The primary design goals of KERMIT are reliability, portability, simplicity, and file transparency (for textual files). Efficiency has been sacrificed to some extent for portability (especially to work for half duplex systems) and for simplicity. Complex approaches (like multiprocessing/@|scheduling schemes, multiplexed physical links, multichannel logical links, etc.) were avoided so that the KERMIT specification can be easily implemented on any computer, and so that the person who works from the specification has some chance of understanding it. The procedure for running KERMIT tends to be complicated because it is intended for individual, rather than system, use. Host servers, embedded signon procedures, etc., are not required, so that use of KERMIT will be relatively uniform between any two systems, and any ordinary user can establish a connection. Actually, version 2 of the protocol simplifies the "user interface" a great deal by allowing a server or "slave" mode of operation -- the server is not a permanent feature of the host, but is cranked up by the user as needed. But once running on the user's line, the server simplifies things a great deal. 
In the following sections, we define some terminology, discuss some minimal requirements for host systems, describe the packet format and the protocol. A state table for the protocol is given, and a listing is provided of an actual working implementation of the Kermit "Kernel" in the C language. @Section The KERMIT protocol is specifically designed for character-@|oriented transmission over serial telecommunication lines. The design allows for the restrictions and peculiarities of the medium and the requirements of diverse operating systems -- buffering, duplex, parity, character set, file organization, etc. The service provided is minimal -- no attempt is made at an "integrated" link between two systems. KERMIT provides terminal connection and file transfer, period. File transfer is accomplished by sending packets back and forth; the sender sends file names, file contents, and control information; the receiver acknowledges (positively or negatively) each packet. The packets have a layered design, in keeping with the ANSI and ISO philosophy, with the outermost fields used by the data link layer to verify data integrity, the next by the session layer to verify continuity, and the data itself at the highest level to perform any transformations that may be necessary. @Section All @u in the following text are expressed in @i@index (base 8) notation unless otherwise specified. @ux @Begin NUL@\Null, idle, ASCII character 0. SOH@\Start-of-header, ASCII character 1. SP@\Space, blank, ASCII 40. CR@\Carriage return, ASCII 15. LF@\Linefeed, ASCII 12. CRLF@\A carriage-@|return linefeed sequence. DEL@\Delete, rubout, ASCII 177. @end A @ux is considered to be any ASCII character in the range 0 through 37, or the DEL character (177). A @ux is considered to be any character in the range 40 (SP) through 176 (tilde). Several @u are useful in the description of the protocol and in the program example. 
The machine that Kermit runs on need operate only on integer data, so these are functions that operate upon the numeric value of single ASCII characters. @begin @Index[char(x)] @q@\Transforms the integer @i, which is assumed to lie in the range 0 to 136, into a printable ASCII character; 0 becomes SP, 1 becomes "!", etc. @Index[unchar(x)] @q@\Transforms the character @i, which is assumed to be in the printable range (SP through tilde), into an integer in the range 0 to 136. @Index[ctl(x)] @q@\Maps between control characters and their printable representations, preserving the high-@|order bit@foot{The high order bit is normally the parity bit, but Kermit uses this bit as data in order to do 8-bit transmission for binary files when the systems permit.}. If @i is a control character, then @example that is, the same function is used to controllify and uncontrollify. The argument is assumed to be a true control character (0 to 37), or the result of applying @c to a true control character (i.e. 100 to 137). The transformation is the expected one, viz. @q(^A becomes A and vice versa). @end @Index[ACK] @u stands for "Acknowledge", a packet that acknowledges receipt of another packet. Not to be confused with the ASCII character ACK. @Index[NAK] @u stands for "Negative Acknowledge". A packet that says a packet was received in bad condition (e.g. bad checksum), the wrong packet was received, or an expected packet was never received. Not to be confused with the ASCII character NAK. @Section In order to run on as many different systems as possible, KERMIT makes the following assumptions: @begin All printable @Index[ASCII] ASCII characters are acceptable as input to the host and will not be transformed in any way. A single nonprintable ASCII character can be used for synchronization. The character is normally Control-A (SOH, ASCII 1), but can be redefined. 
@IndexEntry[Key="Line Terminator (see End-Of-Line)", Text="Line Terminator (see End-Of-Line)"] @Index[End-Of-Line (EOL)] If a host requires a line terminator for terminal input, that terminator must be a single ASCII character, presumably a control character such as CR or LF. @Index[Remote] When using a job's controlling terminal for file transfer, the system must allow the KERMIT program to set the terminal to half duplex, infinite width (no "wraparound" or CRLF insertion by the operating system), and no translation of incoming or outgoing characters (for instance, raising lowercase letters to uppercase, transforming control characters to printable sequences, etc). In short, the terminal must be put in "raw" mode, and, hopefully, restored afterwards to normal operation. @Index[ACK] The host's terminal input buffer is at least long enough to receive the longest ACK packet (the ACK to the send-@|initiate packet can be 10 or 12 characters long). @Index[Padding] If a host requires padding, the padding character is in the range ASCII 0-37 or ASCII 177. @Index[Binary Files] Both communicating hosts are capable of 8-bit terminal i/o if Kermit is to transfer binary files. @end The last item may be circumvented for those hosts which insist upon having a parity bit; version 2 of Kermit provides a new 8-bit quoting mechanism, which is described later. KERMIT does @i assume: @begin Anything about @Index[Baud] baud rate. That the host can do @Index[XON/XOFF] XON/XOFF or any other kind of flow control. This kind of flow control can be initiated behind Kermit's back by commands to the host computers. If the hosts support any kind of flow control, then it should be used if possible, since it will cut down on retransmission due to buffering problems. @Index[Duplex]@Index[Full Duplex]@Index[Half Duplex] That the host is capable of full duplex operation. Any mixture of half and full duplex hosts is supported. 
@end @Section @Index[Records]@Index[Logical Records] @Index[Printable Files]@Index[Binary Files] For transmission between unlike systems, files must be assigned to one of two categories: @i or @i. A printable file is defined to be one that will make sense on the foreign system -- a document, program source, textual data, etc. A binary file is one that will (and probably can) not make sense on the foreign system -- an executable program, numbers stored in internal format, etc. When binary files are transmitted to an unlike system, it is important only that they can be brought back to the original system (or one like it) intact; no special conversions are necessary during transmission. But for printable files to be transferred in a useful fashion, there must be a standard way to represent them during transmission. KERMIT's standard is simple: ASCII characters, with "logical records" (lines) delimited by CRLFs. @Index[ASCII]@Index[Parity] All characters are transmitted and interpreted in 7- or 8-bit ASCII. If any conversion is necessary to or from ASCII it is the responsibility of the non-@|ASCII host to do so. It is assumed that the computers implementing this protocol can transmit and receive the printable ASCII characters between 040 and 176 (octal), i.e. space through tilde, without translation. Similarly, it is the responsibility of systems that do not store printable files as sequences of lines delimited by CRLFs to perform the necessary conversions upon input and output. For instance, IBM mainframes might strip trailing blanks on output and add them back on input; UNIX would prepend a CR to its normal record terminator, LF, upon output and discard it upon input. Since computers can't be expected to distinguish a printable file from a binary file -- especially one originating from an unlike system -- the user will generally have to give a command to Kermit to tell it whether to perform these conversions. 
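As a minimal sketch of the record conversion a UNIX-like host would perform on a printable file, assuming the file contents are held in memory as bytes: the function names to_wire and from_wire are illustrative and not part of the protocol, and normalizing an existing CRLF before conversion is an assumption of this sketch.

```python
def to_wire(text: bytes) -> bytes:
    """Convert local LF-terminated lines to the CRLF form used in transmission."""
    # Assumption: fold any CRLF already present back to LF first,
    # so that the output never contains CR CR LF.
    return text.replace(b"\r\n", b"\n").replace(b"\n", b"\r\n")


def from_wire(data: bytes) -> bytes:
    """Discard the CR of each CRLF pair when storing a received printable file."""
    return data.replace(b"\r\n", b"\n")
```

A binary file would bypass both functions entirely, which is why the user has to tell Kermit which kind of file is being transferred.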
@Index[Control Characters] Any ASCII control characters (0-37 octal) and DEL (177 octal) are preceded by a special quote character and mapped to characters in the printable range on transmission and unquoted and unmapped upon receipt. This is to prevent unpredictable actions that can occur when the remote host receives raw control characters (for instance, it might interpret them as flow control signals). For binary files, eight bit character transmission is permissible as long as the two Kermit programs involved can control the value of the parity bit. In that case, the 8th bit of a transmitted character will match that of the original data byte, after control-@|quoting has been done. When one or both sides cannot control the parity bit, a special prefix character may be prepended, as described below. Data compression is also allowed. A special prefix character will denote that the following character is a repeat count, and the next character is the one to be repeated. These prefix characters can be combined in various ways; for instance a control character with its 8th bit set may be repeated @i times, the repeat prefix may be quoted as a data character, etc. See the appendix for a listing of the ASCII alphabet, with EBCDIC equivalents. @Section @Index[Packet] The KERMIT protocol is built around exchange of packets of the following format: @begin <@i(check)> @end or @begin @end where all fields consist of ASCII characters, and: @begin @u@\Is a synchronization character to mark the beginning of the packet. In standard KERMIT, this is SOH (Control-A, ASCII 1), but may be redefined. @u@\The number of ASCII characters within the packet that follow this field, in other words the packet length minus two. Since this number is transformed to a single character via the char function, packet character counts of 0 to 94 (decimal) are permitted, and 96 (decimal) is the maximum total packet length. 
Does not include end of line or padding characters, which are outside the packet and are strictly for the benefit of the operating system. @u@\The packet sequence number, modulo 100, ranging from 0 to 77 (octal). Sequence numbers "wrap around" to 0 after each group of 64 (decimal) packets. @u@\The packet type, a single ASCII character, one of the following: @begin D@\Data packet Y@\Acknowledge (ACK) N@\Negative acknowledge (NAK) S@\Send initiate B@\Break transmission F@\File header Z@\End of file (EOF) E@\Error @end The following are new packet types for version 2 of the protocol. Not all of them have been implemented, and some may never be at all, but their use should be reserved. These packet types have been added to allow for a "Kermit Server"@index@index, which receives all its commands from the other Kermit, rather than directly from the user. @begin R@\Receive Initiate. Ask the server to send the specified file(s). C@\Host Command. The data field contains a string to be executed as a command by the host. G@\Generic Kermit Command. Single character in data field (possibly followed by operands, shown in {braces}, optional fields in [brackets]) specifies the command: @begin I@\Login {user~password~account} C@\Connect, Change working directory {directory[~password]} L@\Logout, Bye F@\Finish (Shut down the server, but don't logout). D@\Directory @q([){filespec}@q(]) U@\Disk Usage Query E@\Erase (delete) {filespec} T@\Type {filespec} S@\Submit for batch processing {filespec~options} P@\Print {filespec~options} W@\Who's logged in? (Finger) @q([){user ID}@q(]) M@\Send a Message {user or line ID} H@\Help Q@\Server Status Query @end Note that tilde ("@q<~>") is chosen to delimit fields when a command takes more than one operand, on the assumption that tildes are not found in user IDs, passwords, directory names, or account designators on systems that might support Kermit servers. @blankspace(1) X@\Text header. 
Allows transfer of text to the other Kermit's screen in response to a generic or host command. This works just like file transfer except that the destination "device" is the screen rather than a file. @end @u@\The "contents" of the packet, if any contents are required in the given type of packet, interpreted according to the packet type. Nonprintable ASCII characters (and possibly 8-bit and/@|or repeated characters) are quoted with prefix characters and suitably transformed. Quoted or prefixed sequences may not be broken across packets. Logical records in printable files are delimited with quoted CRLFs. Any quote characters are included in the count. @u<@i(check)>@\@Index[Checksum]A block check on the characters in the packet between, but not including, the mark and the checksum itself, taken modulo 100 (octal). The check for each packet is computed by both hosts, and must agree if a packet is to be accepted. There are presently 3 types of checks (bit 0 is the least significant): @begin Single-character arithmetic sum (@i); only six bits of the arithmetic sum are included. In order that all the bits of each character contribute to this quantity, the bits 6 and 7 of the final value are added to the quantity formed by bits 0-5. Thus if @i is the arithmetic sum of the ASCII characters, then @example[chksum = (s + ((s AND 300)/100)) AND 77] The final result is transformed by @c[CHAR]. This is the default block check, and all Kermits must be capable of performing it. Two-character arithmetic sum (optional). Bits 6-11 form the first character (via @c[CHAR]), bits 0-5 form the second (also via @c[CHAR]). Three-character 16-bit CRC-CCITT (optional). The 16-bit CRC formed from the generating polynomial @i@+(16)+@i@+(12)+@i@+(5)+1 is parcelled into three printable characters, via @c[CHAR]: bits 12-15 in the first, 6-11 in the second, and 0-5 in the third. This option will be described in greater detail should it ever actually be implemented. 
@end The single-character checksum has proven quite adequate in practice. The other options can be used only if both sides agree to do so. @end @Index[End-Of-Line (EOL)] Any line terminator that may be required by the host may be appended to the packet; this is carriage return (ASCII 15) by default. Line terminators are not considered part of the packet, and are not accounted for in the count or checksum. Terminators are not necessary to the protocol, and are invisible to it, as are any characters that may appear between packets. If a host cannot do single character input from a TTY line, then a terminator will be required for that host. The terminator can be specified in the initial connection protocol. The contents of the data field for each type of packet may vary depending upon the state of the transmission. For instance, the acknowledgement to a send-@|initiate packet contains various parameters (timing and buffer-@|size information, etc.), whereas when in file-@|receiving state, the acknowledgment packet contains no data. @Section @Index[Smart]@Index[Dumb] A "smart" Kermit is one that is capable of timing out; a "dumb" Kermit is one that cannot. A @Index[Timeout] timeout feature is desirable so that Kermit won't wait forever for expected data to arrive. While timing out lets you detect when the remote system or program crashes, the most important use of timeouts is to prevent @Index[Deadlock] deadlocks, such as might happen when all or part of a packet is lost in transmission (leaving the receiving Kermit waiting for the packet that never arrived, and the sending Kermit for the ACK that is never sent). In any conversation between two Kermits, one smart Kermit is sufficient to prevent deadlocks. Two smart Kermits also work reliably together. Two dumb Kermits, however, must be watched carefully. The local Kermit should keep some sort of running confirmation on the screen, so that the user can detect when transmission stops. 
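Returning to the packet format and the default single-character block check described in the preceding section, the following is a minimal sketch of how a packet might be assembled. The names tochar, unchar, ctl, chk1, make_packet, and check_ok are illustrative; ctl is assumed to be an exclusive-or with 100 (octal), which is consistent with the statement that one function both controllifies and uncontrollifies while preserving the high-order bit.

```python
SOH = 0o1  # default mark character (Control-A)


def tochar(x: int) -> int:
    """char(x): map an integer 0..94 to a printable ASCII code (SP..tilde)."""
    return x + 0o40


def unchar(c: int) -> int:
    """unchar(x): inverse of char(x)."""
    return c - 0o40


def ctl(c: int) -> int:
    """ctl(x): toggle between a control character and its printable form.
    Assumption: implemented as XOR with 100 octal."""
    return c ^ 0o100


def chk1(body: bytes) -> int:
    """Single-character block check: fold bits 6-7 of the sum into bits 0-5."""
    s = sum(body)
    return (s + ((s & 0o300) >> 6)) & 0o77


def make_packet(seq: int, ptype: str, data: bytes = b"") -> bytes:
    """Build MARK LEN SEQ TYPE DATA CHECK; the data must already be quoted."""
    length = len(data) + 3  # characters after LEN: SEQ, TYPE, DATA, CHECK
    if length > 94:
        raise ValueError("data field too long for a single packet")
    body = bytes([tochar(length), tochar(seq % 64)]) + ptype.encode("ascii") + data
    return bytes([SOH]) + body + bytes([tochar(chk1(body))])


def check_ok(packet: bytes) -> bool:
    """Recompute the block check over everything between mark and check."""
    return tochar(chk1(packet[1:-1])) == packet[-1]
```

For example, make_packet(0, "N") yields the five characters SOH, "#", SP, "N", "3". Any line terminator or padding would be added outside this string, since neither is counted in the length or the check.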
For instance, microcomputer Kermits keep the current packet number on the screen, so that the user can watch it changing. Some provision should be made in a dumb Kermit for manual intervention; for instance, if input appears at the keyboard while waiting for a packet, then send a NAK for the expected packet or resend the current one. Shared systems which can become sluggish when heavily used should adjust their own timeout intervals on a per-@|packet basis, based on the system load, so that file transfers won't fail simply because the system was too slow. @Section @Index[Local]@Index[Remote] "Local" refers to the host that has control of the user's terminal; "remote" refers to the other host. "Sending" refers to the host that is sending a file; "receiving" refers to the host receiving a file. If the local Kermit is sending, the remote Kermit is receiving, and vice versa. A @Index[Microcomputer] microcomputer is always in control of the screen, so it can always be considered local. An implementation of Kermit for a multi-@|terminal system must be able to determine whether it is in control of the terminal or not. In general, if it is sending packets to its controlling terminal (primary output, stdout, TTY:, KB:, or whatever) then it is a remote Kermit; if it is sending packets to an assigned terminal or device (e.g.@ as specified by the SET @Index[Line] LINE command), it is a local Kermit and it should update the screen. At initialization, a Kermit program should determine whether it is local or remote and set a flag that can be used later to determine how it updates the screen, handles errors, etc. @Section A @i@Index@Index is always a remote Kermit, a "slave". Unlike ordinary interactive Kermits, it does not have a "user interface"; it gets all its commands from another Kermit. Most Kermits are not capable of acting as servers, but when a server is available, it is desirable that all implementations of Kermit know how to talk to it. 
The server is a concept new to version 2 of the Kermit protocol and differs from earlier Kermits by the addition of several new packet types (Receive Init, Generic Command, Host Command). Kermits that want to talk to servers should implement handling for these new packet types. Old Kermits can still send files to servers, but they have no way to ask a server to send files, and no way to shut down a remote server (other than connecting to it and killing it with some host function like @q(^C)). Note that between transactions, when the server has no tasks pending, it sends out periodic NAKs to prevent a deadlock in case a command was sent to it but was lost. These NAKs can pile up in the local "user" Kermit's input buffer (if it has one), so the user Kermit should be prepared to clear its input buffer before sending a command to a server. @Section @Index[Initial Connection Protocol] The user starts the remote Kermit first via a virtual terminal connection, and then escapes back to the local host and starts or continues the local Kermit. The receiving Kermit waits for a @q packet from the sending Kermit. It doesn't matter whether the sending Kermit is started before or after the receiving Kermit (if before, the @q packet will be retransmitted periodically until the receiving Kermit acknowledges it). The data field in the @q packet is optional; trailing fields can be omitted to accept default values. @begin <8-bit-quote> @end where: @begin 1. @u@\The sending Kermit's maximum buffer size. If none specified, the default value of 96 (decimal) is assumed. The receiving Kermit should send packets no longer than this length. 2. @u@\The number of seconds after which the sending Kermit wishes to be timed out if no packets have been successfully received by the receiving Kermit, provided the receiving Kermit is capable of timeouts. If none specified, the receiving Kermit's default is accepted. The normal value is in the 5-15 second range. A value of 0 means "don't time me out". 
This value is taken as a guideline rather than an absolute, and may be adjusted on a per-@|packet basis by timesharing systems depending upon system load. 3. @u@\The number of @Index[Padding] padding characters the sending Kermit needs preceding each packet. Some systems may require padding; for instance, some @Index[Half Duplex] half duplex systems may need some time to "turn the line around". If none specified, or a value of 0 (ASCII NUL) is specified, no padding is done and the contents of the next field is ignored. 4. @u@\The character the sending Kermit wants used for padding. Normally NUL (ASCII 0), but some IBM systems want DEL (ASCII 177). If npad is nonzero but pad is omitted, 0 will be used as a padding character. 5. @Index[End-Of-Line (EOL)] @u@\The desired line terminator for incoming packets. Only a single control character, transformed by @c (not @c!), is permitted in this field. Hosts cannot specify printable terminators or multi-@|character terminator sequences. If none specified, carriage return (CR, ASCII 15) is used. 6. @Index[Quote] @u@\The printable ASCII character the sending Kermit will use when quoting control characters, in the range 41-76 or 140-176. If none specified, "#" is used. This character is taken literally. 7. @Index[8-Bit-Quote]@u<8-Bit-Quote>@\Specify quoting mechanism for 8-bit quantities. A quoting mechanism is necessary when sending binary files to hosts which prevent use of the 8th bit for data. When elected, the quoting mechanism will be used by @i hosts, and the quote character must be in the range 41-76 or 140-176, but different from the control-@|quoting character. This field is interpreted as follows: @begin @q@\I agree to 8-bit quoting if you request it. @q@\I will not do 8-bit quoting. @q<&>@\(or any other character in the range 41-76 or 140-176 besides Y and N) I want to do 8-bit quoting using this character (it will be done if the other Kermit puts a @q in this field). 
The recommended 8th-bit quoting prefix character is "&". @i: 8-bit quoting will not be done. @end 8. @Index[Checksum]@u[chktype]@\The type of block check. The only values presently allowed in this field are "1", "2", and "3", though future implementations may allow others. These values specify the single- and double-@|character arithmetic checksums, or the three-@|character CRC, described above. If anything other than "1" or "2", or if this field is omitted, "1" will be used. The sender requests the desired type of checksum in this field; if the receiver replies with the same type in its ACK then the requested type will be used, otherwise the single-@|character arithmetic checksum must be used ("1"). Both sides, of course, must use the same checksum type. 9. @Index[Repeat Count]@u[repeat]@\The prefix character to be used to indicate a repeated character. This can be any printable character other than blank (which denotes no repeat count prefix), but "~" is recommended. Both sides must agree (as they must for the block check type), or else repeat counts will not be done. Groups of 4 identical characters or more may be transmitted more efficiently using a repeat count, though an individual implementation may wish to set a higher threshold. 10-11. @Index[Reserved Fields]@u[Reserved Fields]@\Sites that wish to add their own parameters to the initial connection negotiation must start at field 12 (decimal). Any intervening fields may be left blank (that is, they may contain the space character). @end The receiving Kermit responds with an ACK ("Y") packet containing the same information as it applies to itself. From that point, both Kermits are "configured" to communicate with each other. In the case of 8-bit quoting, one side must specify the character to be used, and the other must agree with a "Y" in the same field, but the order in which this occurs does not matter. 
Similarly for checksums -- if one side requests 2 character checksums and the other side responds with a "1" or with nothing at all, then single-@|character checksums will be done, since not all implementations can be expected to do 2-@|character checksums or CRCs. And for repeat counts; if the repeat field of the send-init and the ACK do not agree, repeat processing will not be done. All send-init fields are optional. The data field may be left totally empty. Similarly, intervening fields may be defaulted by setting them to blank. Kermit implementations should know what to do in these cases, namely apply appropriate defaults. The defaults should be: @begin bufsiz:@\system dependent npad, pad:@\0, no padding. eol:@\CR (carriage return) quote:@\the character "#" 8-bit-quote:@\none, don't do 8-bit quoting chktype:@\"1", single-character checksum repeat:@\No repeat count processing @end Note that there are no prolonged negotiations during this initial connection protocol -- there is one @q and one ACK in reply. Everything must be settled in this exchange. The very first @q may not get through if the sending Kermit makes wrong assumptions about the receiving host. For instance, the receiving host may require some padding or a special end of line character in order to read the @q packet. For this reason, there should be @Index[SET] SET command parameters to allow the user to specify whatever may be necessary to get the first packet through. When Kermit is running as a server@index, it is possible for the user side to send a @q packet to it, telling it to fetch some files. Since we can't assume that the two Kermits are running on like systems, the local (user) Kermit must parse the file specification as a character string and let the server check it. If the server likes the filespec, it sends a send-init packet -- @i -- to the user, and then behaves as described above. 
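The way the three optional features are settled in the single send-init / ACK exchange can be sketched as follows. This is an illustration under stated assumptions rather than a definitive implementation: the names field, is_prefix_char, and settle are hypothetical, only fields 7 through 9 are examined, and the first six characters of the sample strings in the usage note are placeholder values.

```python
def field(data: str, number: int) -> str:
    """Return send-init field `number` (1-based, as numbered in the text),
    or "" if the field is omitted or left blank."""
    c = data[number - 1:number]
    return "" if c == " " else c


def is_prefix_char(c: str) -> bool:
    """True for a character in the octal ranges 41-76 or 140-176."""
    return len(c) == 1 and (0o41 <= ord(c) <= 0o76 or 0o140 <= ord(c) <= 0o176)


def settle(sent: str, acked: str) -> dict:
    """Combine the data fields of the send-init and of its ACK."""
    # 8-bit quoting: one side names the prefix, the other agrees with "Y".
    s8, a8 = field(sent, 7), field(acked, 7)
    if is_prefix_char(s8) and a8 == "Y":
        qbin = s8
    elif is_prefix_char(a8) and s8 == "Y":
        qbin = a8
    else:
        qbin = None
    # Block check: the requested type is used only if the ACK repeats it.
    sc, ac = field(sent, 8), field(acked, 8)
    chktype = sc if sc == ac and sc in ("1", "2", "3") else "1"
    # Repeat counts: both sides must name the same prefix character.
    sr, ar = field(sent, 9), field(acked, 9)
    repeat = sr if sr and sr == ar else None
    return {"qbin": qbin, "chktype": chktype, "repeat": repeat}
```

With a send-init data field of "~* @-#&2~" and an ACK of "~* @-#Y2~", the sketch settles on "&" for 8-bit quoting, block check type "2", and "~" as the repeat prefix. If the ACK stops after its sixth field, everything falls back to the defaults listed above.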
The server may also recognize some kinds of "bureaucratic" packets (containing commands to type a file, provide a directory listing, access a directory, etc). In fact, it will sit and listen for packets forever, until it gets one that tells it to shut itself down. @Section Quoting is used for control characters, and may also be used for 8-bit quantities or repeat counts. When more than one type of quoting is in effect, a single character can be preceded by more than one quote character. A receiver will never do any kind of quoting, since quoting can only occur in the data field, and the receiver only sends ACKs and NAKs with empty or special data fields. Repeat count processing can only be requested by the sender, and will only be used by the sender if the receiver agrees. 8th-bit quoting is a special case, since it is normally not desirable to use it because it increases both processing and transmission overhead. However, since it is the only mechanism for binary file transfer available to those systems that usurp the parity bit, a receiver must be able to request the sender to do 8th-bit quoting, since most senders will not normally do it by default. The following table should clarify Kermit's quoting mechanism:
@begin
Quoted With @u @ux
 A      A       ~(A     ["(" is ASCII 40 - 32 = 8]
 ^A     #A      ~(#A
 'A     &A      ~(&A
 '^A    &#A     ~(&#A
 #      ##      ~(##
 '#     &##     ~(&##
 &      #&      ~(#&
 '&     &#&     ~(&#&
 ~      #~      ~(#~
 '~     &#~     ~(&#~
@end
@q represents any printable character, @q<^A> represents any control character, @q<'x> represents any character with the 8th bit set. The @q<#> character is used for control-@|character quoting, and the @q<&> character for 8-bit quoting. The repeat count must always precede any other prefix character. The repeat count is taken literally (after transformation by UNCHAR); for instance "#" and "&" immediately following a "~" denote repeat counts, not control characters or 8-bit characters. 
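The rows of the quoting table can be reproduced with a short sketch, assuming that all three mechanisms are in effect with the characters used in the table ("#" for control quoting, "&" for 8-bit quoting, "~" for repeat counts). The names encode_byte and encode_run are illustrative, and the control transformation is assumed to be an exclusive-or with 100 (octal).

```python
QCTL = ord("#")  # control-quote character
QBIN = ord("&")  # 8th-bit prefix
REPT = ord("~")  # repeat-count prefix


def encode_byte(b: int) -> bytes:
    """Encode one 8-bit data byte: 8-bit prefix first, then control quote."""
    out = bytearray()
    if b & 0x80:
        out.append(QBIN)  # flag the 8th bit, then work on the low 7 bits
        b &= 0x7F
    if b < 0o40 or b == 0o177:
        out += bytes([QCTL, b ^ 0o100])  # control character: quote and map
    elif b in (QCTL, QBIN, REPT):
        out += bytes([QCTL, b])  # prefix used as data: quote, taken literally
    else:
        out.append(b)
    return bytes(out)


def encode_run(b: int, count: int) -> bytes:
    """Encode `count` repetitions of byte `b` using a repeat-count prefix."""
    if not 0 <= count <= 94:
        raise ValueError("repeat count must fit in a single printable character")
    return bytes([REPT, count + 0o40]) + encode_byte(b)
```

For instance, encode_byte(0x81) gives "&#A" (a Control-A with its 8th bit set), and encode_run(0x81, 8) gives "~(&#A", since a count of 8 is transmitted as "(".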
The quote character "#" is most closely bound to the data character, followed by the 8-bit prefix, followed by the repeat count; in other words, the order should be: repeat prefix and count, 8-bit quote, control quote, the data character itself. To illustrate, note that @q<&#&> is @i<not> equivalent to @q<#&&>. And finally, note that: @display(@i)

@Section
The protocol is defined over a @i. A transaction consists of the successful or unsuccessful transfer of one or more files. An initial connection is made for each transaction, and the connection is broken at the end of a transaction.

The machine sending the file(s) transmits a send-@|initiate packet. Upon acknowledgement, a file header packet (containing the file name as data) is sent, followed by as many data packets as necessary to transmit the file. The file is followed by an end of file packet. The sending machine waits for an acknowledgement of each packet from the receiving machine. When all the files are transmitted, an end of transmission packet is sent.

If a host times out waiting for an acknowledgement, it tries to retransmit the unacknowledged packet several times. If a host times out waiting for some other kind of packet, it can send either a NAK packet for the expected packet or another ACK for the last packet it got. If any packet is garbled or lost in transmission (the latter is detected when the sequence number increases by more than 1, modulo 100 octal, i.e. 64 decimal; the former by a bad checksum), the host that received it sends a NAK for the garbled or missing packet.

The procedure is the same when a Kermit server is involved, except that the user may send a receive-@|initiate packet, which merely requests the server to send back a send-@|initiate packet, followed by the files requested.

A few heuristics are useful:
@begin
A NAK for the current packet is equivalent to an ACK for the previous packet. This covers the common situation in which a packet is successfully received, and then ACK'd, but the ACK is lost.
The ACKing side then times out waiting for the next packet and NAKs it. The side that receives a NAK for packet @i while waiting for an ACK for packet @i simply sends packet @i.

If packet @i arrives more than once, simply ACK it and discard it. This can happen when the first ACK was lost. Resending the ACK is necessary and sufficient -- don't write the packet out to the file, because it's already there!

When opening a connection, discard the contents of the line's input buffer before reading or sending the first packet. This is especially important if the other side is either in receive mode, or acting as a server, in which case it has been sending out periodic NAKs for your expected SEND-@|INIT or command packet. If you don't do this, you may find that there are sufficient NAKs to prevent the transfer -- you send a SEND-INIT, read the response, which is an old NAK, so you send another SEND-INIT, read the next old NAK, and so forth, up to the retransmission limit, and give up before getting to the ACKs that are waiting in line behind all the old NAKs. If the number of NAKs is below the cutoff, then each packet may be transmitted more than once.

Similarly, after reading a packet (successfully or not), you should clear the input buffer. There should be nothing there for you anyway, since the other side must normally wait for you to send your packet in response. Failure to clear the buffer could let stacked-up NAKs keep causing packets to be transmitted repeatedly.
@end

@Section
The KERMIT protocol consists of a set of states, and rules for what to do when changing from one state to another. State changes normally occur based on the type of packets that are sent or received, or errors that may occur. Packets always go back and forth; the sender of a file always sends data packets of some kind (init, header, data) and the receiver always returns ACK or NAK packets. "Smart" Kermits can time out while waiting for a packet.
Timeouts@index have been omitted from the following state table for simplicity, but the action is the same as if the expected packet had been received in bad condition -- it is NAK'd. A local Kermit can print error messages on its own screen. If an error occurs in the remote Kermit, an error message cannot simply be printed at the "terminal" because it will come to the local Kermit in the packet data stream, and will be discarded because it's not a valid packet. Therefore, remote Kermits should handle error conditions by sending Error packets containing the text of the error message, and then taking appropriate action, such as breaking transmission with a Break packet. Error conditions generally arise in low level routines which are outside the scope of the state table -- for instance, trying to write onto a full or write-@|protected disk. Kermit servers always return to "start" state after an error. Here's the state table for version 2 of Kermit; it differs from version 1 only by the addition of the "super state" (which is different for "user" and "server" Kermits). Version 1 always starts out in state S or R. Note that upon entering a given state, a certain kind of packet is either being sent or is expected to arrive -- this is shown on top of the description of that state. As a result of the action, various events may take place; these are shown in the EVENT column. For each event, an appropriate ACTION is taken, and the protocol enters a NEW STATE. 
@begin
@u@ux

@i(@ @ @ @ @ -- SUPER STATE --)

@!@i(User Mode)@/@&_@\
start   Want to send              (none)                          S
        Want to rcv               Send rcv-init                   R

@i(Server Mode)@/@&_@\
start   Get send-init             (none)                          R
        Get rcv-init              Find file to send               S
                                  Can't find file, give error     start
        Get "G" cmd               ACK, Execute generic command    start
        Get "C" cmd               ACK, Execute host command       start
        (other)                   Report error                    start

@i(@ @ @ @ @ -- SEND STATES --)

@!@i(Send Send-Init Packet)@/@&_@\
S       Get NAK,bad ACK           (None)                          S
        Get good ACK              Set remote's parms, open file   SF
        (Other)                   (None)                          A

@i(Send File-Header Packet)@/@&_@\
SF      Get NAK,bad ACK           (None)                          SF
        Get good ACK              Get bufferful of file data      SD
        (Other)                   (None)                          A

@i(Send File-Data Packet)@/@&_@\
SD      Get NAK,bad ACK           (None)                          SD
        Get good ACK              Get bufferful of file data      SD
        (End of file)             (None)                          SZ
        (Other)                   (None)                          A

@i(Send EOF Packet)@/@&_@\
SZ      Get NAK,bad ACK           (None)                          SZ
        Get good ACK              Get next file to send           SF
        (No more files)           (None)                          SB
        (Other)                   (None)                          A

@i@/@&_@\
SB      Get NAK,bad ACK           (None)                          SB
        Get good ACK              (None)                          C
        (Other)                   (None)                          A

@i<@ @ @ @ @ -- RECEIVE STATES -->

@i(Wait for Send-Init Packet)@/@&_@\
R       Get Send-Init             ACK w/local parms               RF
        (Other)                   (None)                          A

@i(Wait for File-Header Packet)@/@&_@\
RF      Get Send-Init             ACK w/local parms
                                  (previous ACK was lost)         RF
        Get Send-EOF              ACK (prev ACK lost)             RF
        Get Break                 ACK                             C
        Get File-Header           Open file, ACK                  RD
        (Other)                   (None)                          A

@i(Wait for File-Data Packet)@/@&_@\
RD      Get previous packet(D,F)  ACK it again                    RD
        Get EOF                   ACK it, close the file          RF
        Get good data             Write to file, ACK              RD
        (Other)                   (None)                          A

@i<@ @ @ @ @ -- STATES COMMON TO SENDING AND RECEIVING -->

C       (Send Complete)                                           start
A       ("Abort")                                                 start
@end

@Section
The preceding state table shows the packet level protocol. When writing a new Kermit, you still must worry about building the packets and getting them in and out of your machine. This work is best done by low level routines with names like "@q", "@q", "@q", "@q", etc.
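
One way to read the send half of the preceding state table is as a transition function that sits above such routines. The C fragment below is only a sketch under simplifying assumptions: the type and function names are invented for illustration, the three event values collapse "Get NAK,bad ACK", "Get good ACK", and "(Other)" from the table, the flags at_eof and more_files stand in for whatever the file routines report, and retry limits and timeouts are left out. States C and A are treated as final here; per the table, the caller then returns to "start".

```c
/* Send-side states from the state table, plus the two common states. */
typedef enum { ST_S, ST_SF, ST_SD, ST_SZ, ST_SB, ST_C, ST_A } send_state;

/* The three kinds of event the send states distinguish (hypothetical names). */
typedef enum { EV_NAK_OR_BAD_ACK, EV_GOOD_ACK, EV_OTHER } send_event;

/*
 * Return the state to enter after `ev` arrives while in state `s`.
 * at_eof:     nonzero if no file data remains for the current file.
 * more_files: nonzero if another file is waiting to be sent.
 */
send_state next_send_state(send_state s, send_event ev,
                           int at_eof, int more_files)
{
    if (s == ST_C || s == ST_A)
        return s;                       /* final: caller goes back to start */
    if (ev == EV_OTHER)
        return ST_A;                    /* unexpected packet: "Abort" */
    if (ev == EV_NAK_OR_BAD_ACK)
        return s;                       /* stay put; packet is sent again */

    /* Good ACK: advance as the table's NEW STATE column says. */
    switch (s) {
    case ST_S:  return ST_SF;                       /* send-init ACK'd */
    case ST_SF: return ST_SD;                       /* file header ACK'd */
    case ST_SD: return at_eof ? ST_SZ : ST_SD;      /* more data, or EOF */
    case ST_SZ: return more_files ? ST_SF : ST_SB;  /* next file, or break */
    case ST_SB: return ST_C;                        /* send complete */
    default:    return ST_A;
    }
}
```

A driver loop would send the packet that belongs to the current state, read the reply, classify it as one of the three events, and call next_send_state until it returns ST_C or ST_A.
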
The packet-@|level routines -- which do quoting, build checksums, etc -- would be fairly standard in any implementation of Kermit.

Low level i/o routines, however, must be customized for each machine or operating system. Some operating systems make i/o very easy, while others require you to pay a lot of attention to small details. The C program in the next section is an example where the operating system (UNIX in this case) does all the work; you send and receive characters simply by using system READ and WRITE functions.

A few general low-level considerations are worth mentioning:
@begin
End of Line@\If your host does not require a record terminator for terminal input -- that is, if it can "wake up" on every character -- then you don't have to worry about line terminators on incoming packets; you can just skip over everything between the end of a packet and the beginning of the next one. On the other hand, you must supply whatever terminator the other host requires (it tells you what that is in its send-init packet, or in its acknowledgement to yours).

Padding@\Some hosts may require padding, a sequence of "idle characters" (typically NUL or DEL). You have to worry about sending them. You don't have to worry about them on input though, since they come between packets, and will probably have been eaten by some other process or interface anyway (or else why would you need them?).

Timeout@\When doing input on the serial port, it is desirable to get an interrupt to wake you up after a certain amount of time if nothing comes in. Or, if you have a port status register, you can have a loop that looks at it a certain number of times before giving up. Or, you can check both the port and the keyboard each time through the loop, which lets the user wake up a hung protocol by typing something.

Quoting@\Once characters have been successfully received, quoted control characters, prefixed 8-bit bytes, or repeat-@|count sequences must be converted back to the characters they represent.
Similarly, the quoting transformations must be done when filling a packet, before sending it out or computing the block check.

Parity@\Character-level routines must know how to handle the parity bit. This is normally controlled by the settings of some flags that say whether incoming characters have data or parity in the 8th bit, and whether the 8th bit must be used for parity on outgoing characters, and if so, what kind of parity.

Handshake@\When dealing with record-@|oriented or half duplex systems, you may have to worry about line turnaround. Again, the user normally sets a special flag, which should be checked before sending a packet. For instance, when communicating with an IBM host, a packet cannot be sent until you have received an XON.

Local file i/o@\In addition to port i/o, each Kermit must know how to do i/o to its own file structure. The DEC-20 has to worry about whether the file is to be opened in 8-bit or 7-bit mode, and should make sure not to try to send certain kinds of files (directories, archived files, RMS files, etc). IBM VM/CMS Kermit must worry about LRECL, BLKSIZE, RECFM, and so on. Micros must worry about @q<^Z>s at end of file, and funny files created by word processing software. Most systems have to worry about conflicts arising when an incoming file has the same name as an existing file.

Terminal emulation@\ When doing terminal emulation (although this is outside the protocol), Kermit must worry about whether to echo characters locally or to let the host do it; it must do whatever must be done with parity bits in both directions; it must interpret screen control codes; it must watch out for the escape character.
@end

Here, for instance, is more or less what a micro has to do to send a packet (which is already built, with all quoting done):
@begin
Loop through all chars in packet, building checksum.

Add eol if required.

Padding? Yes, send n pad chars.

Sending to IBM? Yes, wait for XON. Timed out? Whoops, give up.

Send each character.
@end

To send a character:
@begin
Get port status.

Ready? No, repeat previous step.

Set parity bit appropriately.

Output the character.
@end

Getting a character:
@begin
Get port status. If no character, then check console.

If nothing in either place, back to step 1.

Got char from port. Doing parity? Yes, turn it off.

Dispatch appropriately.
@end

@Chapter
@label<-KProg> @Index[Program, Kermit]
What follows is a listing of a real production version of KERMIT, written in the C language, that runs under the UNIX operating system. This program implements version 1 of the protocol, and even that not entirely (for instance, error packets are not sent or processed). Only the most rudimentary command parser is provided; the @i shows the commands that more advanced Kermits have.

It must be emphasized that this is a bare minimum implementation of Kermit. Anyone writing a new Kermit from scratch is encouraged to look at the source for one of the more advanced implementations -- Kermit-20, Kermit-80, Kermit-86 -- as a model. Although you may not understand the language they're written in, there are profuse comments that can be useful.

@Begin @Include @End @Include @case @SendEnd(#Index "@begin@end") @COMMENT '>