From: Wayne Cannon <wayne@sgl.crestech.ca>
To: alan whitney <arw@dopey.haystack.edu>; tetsuro kondo <kondo@crl.go.jp>; <sgleng@sgl.crestech.ca>
Cc: Wayne Cannon <wayne@sgl.crestech.ca>
Subject: SGL Comments on Draft VSI-S Spec
Date: February 21, 2001 5:24


Dear Whitney-san and Kondo-san:

Here are some comments from SGL on the draft ``straw man'' VSI-S
specification. Please feel free to get back to us with any comments
or queries.

  Best wishes,
  Wayne

********************************************************************************


                             VSI-S COMMENTS

General Comments

We like the overall VSI-S design. It does a good job of remaining
generic while incorporating elements made necessary by the VSI-H specification.
We also agree with the idea of an ASCII protocol, which makes debugging and
testing easier.

Protocol Reliability

Our main concerns relate to protocol reliability. There is currently no
checksum mechanism to ensure that commands and responses have been received
correctly. This is important mainly in the case of RS232 control connections,
although it might also be important with socket connections in the event of
broken connections or other unusual socket behaviour (any experienced socket
programmer has seen plenty of this).
A suggested solution is to add an ASCII checksum character which is located
before the <CR> but after the (last) semicolon. To maintain the spirit of
ASCII-only, the checksum must be a printable ASCII character from 0x20 to 0x7E.
It can be a simple additive checksum calculated by summing all the ASCII values
of the preceding bytes and then calculating:  ((sum % 0x5F) + 0x20)
where % is the modulo operator. We might want to exclude some special 
characters from the allowed set, but more possibilities are better in the
case of checksums.
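
As a rough illustration of the additive checksum suggested above, here is a
minimal Python sketch. The function names are ours, chosen only for the
example; the formula is the one given in the text, and the 'ident' keyword in
the usage note is the one proposed later in these comments.

```python
def vsi_checksum(data: bytes) -> int:
    """Return the proposed printable checksum byte for the preceding bytes."""
    total = sum(data)              # sum of the ASCII values of all bytes
    return (total % 0x5F) + 0x20   # fold into the printable range 0x20..0x7E


def append_checksum(body: str) -> bytes:
    """Append the checksum character and the terminating <CR> to a command line."""
    raw = body.encode("ascii")
    return raw + bytes([vsi_checksum(raw)]) + b"\r"
```

For example, append_checksum("ident;") gives b"ident;5\r": the byte values sum
to 591, 591 % 0x5F is 21, and 21 + 0x20 is 0x35, the character '5'.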

Action on bad checksum: According to data communications theory the safest
action to take when a bad checksum is detected is to discard the data and
wait for a retry. In our case we discard everything up to and including
the <CR> character. The reason for this is that the extent of the data loss
is not known, so we can't reliably return an error saying "the checksum was
bad". For example suppose it was a semicolon, or the <CR> character itself
which was garbled, then we don't even know how many commands have been received
and thus how many error responses to send. Another reason
"discard-and-wait-for-retry" is better is that the lost/garbled data could
have been the response itself, in which case the command was executed but
the client side doesn't know the result. Note that both the server and client
discard anything with a bad checksum, and the client retries the command after
a specified timeout period. Exceeding the maximum allowed number of retries
means that the server is "dead", or the connection has been broken.
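
The client side of "discard-and-wait-for-retry" might look roughly like the
Python sketch below. It is only an illustration: the exchange() callable
(send one frame, return the reply or None on timeout) is a hypothetical
stand-in for the real RS232 or socket I/O, and the default timeout and retry
count are the S2/S3 RCL values quoted later in these comments.

```python
from typing import Callable, Optional


def checksum_ok(frame: bytes) -> bool:
    """True if the frame ends in <checksum><CR> and the checksum matches."""
    if len(frame) < 2 or frame[-1:] != b"\r":
        return False
    body, received = frame[:-2], frame[-2]
    return (sum(body) % 0x5F) + 0x20 == received


def send_with_retry(
    frame: bytes,
    exchange: Callable[[bytes, float], Optional[bytes]],  # hypothetical I/O hook
    timeout: float = 0.5,
    max_retries: int = 2,
) -> bytes:
    """Send a frame; resend on timeout or bad checksum, up to max_retries times."""
    for _attempt in range(1 + max_retries):
        reply = exchange(frame, timeout)  # None means nothing arrived in time
        if reply is not None and checksum_ok(reply):
            return reply
        # Missing or garbled reply: discard it silently and try again.
    raise ConnectionError("no valid response: server dead or connection broken")
```

Note that the client never reports a bad checksum to the server; it simply
resends the identical frame.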

This brings up another "pathological" situation to avoid: the case of repeated
non-idempotent commands. Examples of non-idempotent commands include status
reads that result in status conditions being cleared, or 
incrementing/decrementing quantities. If such commands are repeated via
retries then double execution is possible but undesirable. To solve this
problem we must also include a sequence number along with the checksum.
This is simply an ASCII byte value that increments by one for each new command,
but stays the same on retries. The server should ignore multiple commands with
the same sequence number (the response to the client uses the original
sequence number). Therefore we end up with a revised command/query/response
syntax as follows:

<command/query;><command/query;><command/query;><S><C><CR>

Where <S> and <C> are the sequence number and checksum (both in the range
of ASCII values 0x20 to 0x7E). To allow for easy manual testing, a
command should be provided to enable and disable checksum/sequence number
checking, for example keyword 'cscheck' with values 'on' or 'off'.
We also need to define a timeout period and maximum number of retries. In
the S2/S3 RCL protocol we normally use 0.5 seconds timeout and max. 2 retries,
although some commands allow a longer timeout. Note that certain commands
should be exempt from automatic retries, specifically commands which
contain information that is time-sensitive -- for example time setting commands
like 'dotset'. If such a command fails it should be resent by the user or
higher-level software with an updated time value.
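
To make the revised syntax concrete, here is a small Python sketch of framing
and of duplicate suppression on the server. Several details are assumptions
of ours and not part of the proposal: the sequence byte wraps from 0x7E back
to 0x20, the server remembers only the most recent sequence number, and a
repeated command gets the saved response sent again without being executed
a second time. All names are illustrative.

```python
SEQ_MIN, SEQ_MAX = 0x20, 0x7E


def next_seq(seq: int) -> int:
    """Increment the sequence byte, wrapping within 0x20..0x7E (assumed)."""
    return SEQ_MIN if seq >= SEQ_MAX else seq + 1


def build_frame(commands: str, seq: int) -> bytes:
    """Return <commands><S><C><CR>; the checksum covers the sequence byte too."""
    payload = commands.encode("ascii") + bytes([seq])
    check = (sum(payload) % 0x5F) + 0x20
    return payload + bytes([check]) + b"\r"


def parse_frame(frame: bytes):
    """Return (commands, seq), or None if the frame must be discarded."""
    if len(frame) < 3 or not frame.endswith(b"\r"):
        return None
    payload, check = frame[:-2], frame[-2]
    if (sum(payload) % 0x5F) + 0x20 != check:
        return None
    return payload[:-1].decode("ascii", "replace"), payload[-1]


class Server:
    """Executes each new sequence number once; retries get the saved reply."""

    def __init__(self, execute):
        self.execute = execute      # callable: command string -> response string
        self.last_seq = None
        self.last_reply = None

    def handle(self, frame: bytes):
        parsed = parse_frame(frame)
        if parsed is None:
            return None             # bad frame: discard and wait for the retry
        commands, seq = parsed
        if seq != self.last_seq:    # new command: execute and remember the reply
            self.last_reply = build_frame(self.execute(commands), seq)
            self.last_seq = seq
        return self.last_reply      # a retry reuses the original sequence number
```

With this arrangement a retried non-idempotent command runs only once, while
the client still receives the response it missed.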

Unsolicited Responses

The straw-man VSI-S protocol allows unsolicited responses from the server
to the client. For example, delayed completion responses, unsolicited error
responses, and periodic responses.
We feel that unsolicited responses should be avoided. The main reason is that
it is difficult to ensure that unsolicited responses were properly
received by the other side -- there is no "backwards" error return mechanism
and putting one in would be clumsy. Also eliminating unsolicited responses
is more in line with client/server philosophy, since the client is not
obligated to listen to the server except immediately after a request
is initiated. A simple way to work around this issue is to use the
empty command (just <CR>) to "trigger" the next unsolicited response to
be returned. If none is available then the server still sends a response
indicating this (for example a new VSI-S return code 3 could be used).
Using the <CR> to trigger unsolicited responses still allows relatively
easy viewing of such responses under manual control of the VSI-S link.
Another possible approach (used in the S2) is to return this kind of 
information (delayed completion, unsolicited errors) in response to a
periodic 'status' request. We in fact distinguish between error codes and
status codes, where error codes are conditions which result immediately from
a command and status codes indicate delayed actions or problems which are
not connected with a particular command. The key is that there is a one-to-one
correspondence between messages sent to, and messages received from, the server.
A final, additional advantage of eliminating unsolicited responses is that 
it allows the communication channel to be half-duplex, and makes it possible
to control multiple devices on a common channel. While this is not needed
for socket or RS232 links, it at least doesn't rule out multi-drop links.
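
The empty-command "trigger" could work roughly as in the Python sketch below.
It models responses as (return code, text) pairs and does not attempt the
real VSI-S response syntax. Return code 3 is the hypothetical value mentioned
above; using 0 for success, and every name here, is our assumption.

```python
from collections import deque

OK = 0               # assumed success code, for illustration only
NOTHING_PENDING = 3  # hypothetical new return code suggested in the text


class PolledServer:
    """Holds would-be unsolicited responses until the client asks for them."""

    def __init__(self):
        self._pending = deque()

    def post(self, message: str) -> None:
        """Queue a delayed completion, error, or periodic message."""
        self._pending.append(message)

    def handle(self, request: str):
        """Return exactly one response for each request received."""
        if request == "":                       # empty command: just <CR>
            if self._pending:
                return (OK, self._pending.popleft())
            return (NOTHING_PENDING, "")
        return (OK, "executed " + request)      # placeholder for real commands
```

Every request gets exactly one response, which preserves the one-to-one
correspondence described above.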

Some of these points may seem like nit-picking, but we have found it
very valuable to design a robust, reliable protocol from the outset. 

Comment Handling

The handling for comments described in section 4.3 seems a bit unnecessary.
It is probably better to eliminate comments before they are even transmitted.
Then they are simply not part of the protocol, just part of the handling
for prepared ASCII command files. We should make an important distinction
between the VSI-S protocol, which is a machine-to-machine protocol, and
any user-level interfaces that derive from it, such as prepared command
files. The command file syntax could be made nearly identical to the VSI-S
syntax, but it would have certain important differences:
-Comments and possibly other "readability" extensions allowed in command file.
-No checksum/sequence number in the command file.

In any case we feel such a command file definition is outside the scope of the 
VSI-S document.

Multiple Commands

In Section 4.4 on multiple commands, it states that "Any number of 
commands/queries may be concatenated in this manner". In the interest of
limiting buffer sizes we should probably specify a hard (but generous) limit,
say 4096 command bytes total.
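
A receiver could enforce such a limit with a bounded line buffer, sketched
below in Python. Two points are our assumptions: the limit counts the bytes
before the <CR>, and an over-long line is dropped silently, in keeping with
the discard policy for bad checksums. The class name is illustrative.

```python
MAX_COMMAND_BYTES = 4096  # the limit suggested in the text

CR = 0x0D


class BoundedLineBuffer:
    """Collects bytes up to <CR>, never storing more than the limit."""

    def __init__(self, limit: int = MAX_COMMAND_BYTES):
        self.limit = limit
        self._buf = bytearray()
        self._overflow = False

    def feed(self, data: bytes):
        """Return the complete lines found in data, without their <CR>."""
        lines = []
        for byte in data:
            if byte == CR:
                if not self._overflow:
                    lines.append(bytes(self._buf))
                self._buf.clear()          # start fresh after every <CR>
                self._overflow = False
            elif self._overflow:
                continue                   # skip the rest of an over-long line
            elif len(self._buf) >= self.limit:
                self._buf.clear()          # too long: drop what we have so far
                self._overflow = True
            else:
                self._buf.append(byte)
        return lines
```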

Additional Standard Command Keywords

We suggest the addition of these commands which should be standard on all
systems conforming to VSI-S:

  Keyword           Function
  -------           --------

   ident         Returns a device type string so that the client can
                 recognize the type of device it is talking to and perhaps
                 also adjust its command set for DTS-specific features.
                 Example response: 'S3-REC'

   version       Returns a software version string from the device, e.g.
                 ROS version 3.2c [162] (compiled Wed Jul 19 15:17:49 EDT 2000)

   diag          Initiate self-test (POST or other standard internal test
                 sequence). Useful to run or repeat the self-test, if available.

Position Command (6.1)

We suggest defining parameter value 0 to mean rewind to BOT (beginning of tape)
in the 'position' command.

Ethernet Control Port Number (5.2)

Yes, the VSI-S spec should suggest a "standard" port number to use. 
Port number assignments are handled by the Internet Assigned Numbers
Authority; check at http://www.iana.org/numbers.htm under 'Port Numbers',
which has links to both a full list of all currently assigned port numbers,
and the application forms for new numbers. We would want a 'user/registered'
number for this in the 1024-49151 range.  Ports from 0-1023 are for system
or well-known protocols and require a full standard submitted to the 
Internet Engineering Task Force, while ports 49152-65535 are reserved for
dynamic allocation or private purposes. Some systems use everything from 
32768 on up for dynamic purposes, so the 1024-32767 range would be best.
There are still plenty of unassigned numbers in that range. Note: The S2/S3
RCL link uses port number 1025 (not registered with the authority), so that
number should be avoided.
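
The port ranges above can be summed up in a few lines of Python. This is only
a sketch of the reasoning in this section; the function name is ours, and it
says nothing about which numbers are actually unassigned in the registry.

```python
def is_suitable_control_port(port: int) -> bool:
    """True if the port fits the criteria described in these comments."""
    # Registered range, but below 32768, where some systems begin
    # allocating dynamic ports.
    in_preferred_range = 1024 <= port <= 32767
    # 1025 is already used (unregistered) by the S2/S3 RCL link.
    return in_preferred_range and port != 1025
```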

 