.PS 64,72
.B.C; ^LPA11-K DIGITAL I/O FOR THE VAX11^
.HL 1 INTRODUCTION.
.B.I5;The LPA11-K is capable of medium-speed digital input/output on the VAX, around 20,000 bytes/sec. Since it is a direct memory access (DMA) device, it does not load the central processor as severely as would a parallel interface directly on the UNIBUS. The LPA11 assembly consists of two microprocessors, a silo memory, a programmable real-time clock, and, for digital I/O, one or more DR11-K parallel interface boards. Communication between the LPA11-K and the VAX user is by way of calls to subroutines provided with the VMS operating system or by QIO requests to the LPA driver. We will be concerned with the former, both because it can be performed by high-level or macro assembler programs and because relatively little system overhead is avoided by using the QIO alternative when large buffers are transferred.
.B2.I5;The next section describes the steps necessary to get the LPA11-K into operation. Then follows a series of precautions and lessons that are either obscure or undocumented. Next is a section devoted to finding out what is wrong, and finally some sample programs that can be used as templates.
.HL 1 Installation and Testing.
.B.I5;The LPA11-K hardware is presumably installed by field service, though the cabling between the external device and the LPA is frequently connected by the user or a vendor of an external device. In our case the external device is a laboratory instrument containing a PDP8 as a built-in controller. The PDP8 contains a vendor-supplied parallel digital I/O interface board, as well as others to operate the instrument.
.B2.I5;Field service routinely tests part of the LPA11-K capabilities; there is a diagnostic routine which uses a maintenance cable to wrap output back around to input. This does not test the external data ready or data accept lines; an oscilloscope or digital logic analyzer is required for this purpose.
The individual boards making up the LPA11-K can be tested by field service directly on the UNIBUS more completely than is possible when assembled. VMS includes a driver and a microcode loader program. The latter is invoked by a two-line command procedure, LPA11START.COM, supplied in the [SYSMGR] directory, at least with Version 3.0 of VMS. It is invoked at system startup and produces a detached process that hibernates. Thereafter microcode can be reloaded by calls from high-level languages. User programs and UETP diagnostics report an error loading microcode if this has not been done. The UETP test of the LPA11 is similar to that used by field service; in several upgrades of version 2 of VMS it produced errors when there was nothing wrong with the hardware.
.HL 2 Configuration.
.B.I5;The minimum configuration for the LPA11-K for digital I/O consists of the dual microprocessors, the programmable real-time clock, and at least one DR11-K digital interface. Additional DR11-K's may be used for simultaneous communication with several devices, and A/D or D/A converters may be added for analog I/O.
.B2.I5;The microcode for the externally directed, or slave, microprocessor is in read-only memory. The microcode for the internally directed, or master, microprocessor is partly in read-only memory and partly downloaded from the VAX. Two versions of loadable microcode are furnished: dedicated, for single-user analog I/O, and multirequest, for multiple-user and/or digital I/O. Since we are interested in digital I/O we will suppose that the multirequest microcode is loaded. This is the default. The microcode is not user modifiable.
.B2.I5;The software provided under VMS for the LPA consists of a QIO interface and a set of subroutines callable from high-level language applications in a manner very similar to system service routines. The names of these routines begin with LPA$.
.HL 2 User Software.
.B.I5;User programs are required to furnish a 50-longword array for communication between VMS and the LPA; it is passed by reference in most of the calls to the service routines. In addition, buffers for transfer of data between the user program and the external device must be defined in the user program. These must be contiguous, which is usually assured in FORTRAN by placing them in a COMMON block. One to eight buffers may be used; they must be of equal size. For purposes of communication with the LPA software, the buffers are numbered from zero up to a maximum of seven. The LPA works with 16-bit PDP11 words, though it is possible to connect a 12-bit word device to the DR11-K; the high-order 4 bits are dropped on output or zero filled on input. The LPA transfers entire buffers to or from memory; what happens during the process is inaccessible to the user program. The communication is achieved by transferring control of buffers back and forth between the LPA and the user program.
.HL 3 Order of Calls.
The logical steps in a user program are:
(1) Define the scratch and buffer arrays.
(2) Load the multirequest microcode by a call to LPA$LOADMC. Condition and error codes are provided; the most common error results from failure to have executed the startup command file since the system was last booted.
(3) The clock rate and preset are declared in a call to LPA$CLOCKA. The clock must be started even if the I/O operations will be controlled by the external device's data ready/data accept signals.
(4) The number and the names (beginning addresses) of the buffers are communicated to the LPA software by a call to LPA$SETIBF, which also locks them in memory so that they are not paged.
(5) If the operation is digital output, the buffers are next filled by the user program.
(6) The buffers are released to the LPA by a call to LPA$RLSBUF.
(7) Digital input is begun with a call to LPA$DISWP to start a sweep, digital output by LPA$DOSWP.
(8) When the LPA has finished filling or emptying a buffer, it is retrieved for processing by the user program by calls to either LPA$IWTBUF, which waits on buffer availability, or LPA$IGTBUF, which returns a buffer if one is available or a failure code otherwise.
(9) Sweeps are terminated gracefully by a call to LPA$STPSWP.
.HL 3 Software Verification.
To test the handshake between the LPA11 and an external device it is necessary to monitor the lines with an oscilloscope or other instrument. The test program, LPATEST, included as an appendix, fills buffers with the buffer number and transmits them back to itself via a maintenance cable linking output back to input on the DR11-K interface module. Displaying the data received provides easy verification that the microcode has been loaded, and that the major functions of the high-level language interface are in place. This program also provides a template for specific digital I/O applications. We found the sample program in [SYSHLP.EXAMPLES] difficult to read, and it was therefore difficult to decide whether our problems lay in the hardware, in the software, or merely in our interpretation of the instructions and requirements in the manuals. As noted above, it is not possible to test the external data ready/data accept handshake without an external device or instrument on the assembled DR11-K. The individual components are tested by field service directly on the UNIBUS if necessary to isolate a specific problem. On one occasion all components tested correctly on the UNIBUS, but the assembled unit did not. The problem was in a power supply.
.HL 3 Errors and Problems.
Likely errors are buffer overruns and underruns. The LPA does not recognize timeout, so the user must arrange to send or receive an integer number of full buffers. This makes it convenient to use buffers small enough that an odd fraction of a buffer can be padded out to full size with minimum waste of time and memory.
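The padding arithmetic can be sketched as follows. This is a minimal illustration in Python rather than the FORTRAN of an actual application; the helper name pad_to_full_buffers, the fill value of zero, and the 600-word and 256-word sizes are assumptions made for the example and are not part of the LPA software.

```python
def pad_to_full_buffers(words, buffer_words, fill=0):
    """Return (padded_words, n_buffers) so the length is a whole number of buffers.

    Hypothetical helper: only complete buffers are transferred, so a
    trailing partial buffer is filled out with `fill` words.
    """
    if buffer_words <= 0:
        raise ValueError("buffer_words must be positive")
    # Ceiling division: number of buffers needed to hold all the words.
    n_buffers = -(-len(words) // buffer_words)
    padding = n_buffers * buffer_words - len(words)
    return list(words) + [fill] * padding, n_buffers


# Example: 600 data words sent in 256-word buffers (assumed sizes).
data = list(range(600))
padded, n_buffers = pad_to_full_buffers(data, 256)
wasted = len(padded) - len(data)
print(n_buffers, len(padded), wasted)
```

With these assumed sizes three buffers are needed and 168 words are padding; with 1024-word buffers the same 600 words would need 424 words of padding, which is the waste that smaller buffers reduce.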
For digital output applications this is a minor nuisance. For digital input, it is often not so easy to arrange that the external device fill the last buffer completely. In our application, the external device typically sends a number of buffers of data, then pauses for a number of seconds to a number of minutes before resuming transmission with a logically distinct quantity of data. Being able to use this pause or delay time for processing, in our case, is conditional on the external device transmitting a complete buffer of data before pausing, since the interpretation of the data depends on parameters not available until just before the pause. Since the LPA is a direct memory access device, the VAX CPU is available to process data while the LPA is active.
.B2.I5;The user program has the responsibility for filling or emptying buffers and verifying the quality of the data. Since all of this overhead occurs on a per-buffer basis, it is desirable to use large buffers.
.B2.I5;For digital output, data is transferred either at a fixed speed set by the programmable real-time clock, or at whatever speed is permitted by the external data ready/data accept handshake. Because the master microprocessor has a more complex program to execute than the slave, its maximum rate is slightly slower. As a result, the slave can process data as fast as the master can transfer it or the external device can respond, whichever is slower. On output this is not a problem.
.B2.I5;For digital input, the situation is more complicated. We spent a great deal of time and effort debugging a problem with digital input where the external device supplied data at a speed just within the capability of the slave processor, but slightly too fast for the master. These effective rates are functions of the buffer size. We found that using eight large buffers, 1024 words each, made it almost always possible to keep up with the external device, while smaller buffers resulted in overruns, particularly on startup.
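The effect of buffer size on the time available to the user program can be put in rough numbers. The sketch below, in Python for illustration only, computes how long the device takes to fill one buffer and how long the program has to empty one buffer before a ring of buffers wraps around. The steady input rate of 10,000 words per second is an assumption taken from the lower end of the range quoted in the section on the clock, the 64-word size stands in for a "small" buffer, and the function names are hypothetical. It is a simplified model: it does not represent the master and slave microprocessors or startup behavior, and it calls none of the LPA software.

```python
def buffer_fill_time(buffer_words, words_per_sec):
    """Seconds the device needs to fill one buffer at a steady word rate."""
    return buffer_words / words_per_sec


def ring_slack(buffer_words, n_buffers, words_per_sec):
    """Seconds available to empty one buffer before the ring wraps.

    While the program works on one buffer, the other n_buffers - 1
    can still be filled without an overrun.
    """
    return (n_buffers - 1) * buffer_fill_time(buffer_words, words_per_sec)


RATE = 10_000  # words per second (assumed steady input rate)

for words in (64, 1024):
    fill = buffer_fill_time(words, RATE)
    slack = ring_slack(words, 8, RATE)
    print(f"{words:5d} words: {fill * 1000:7.1f} ms per buffer, "
          f"{slack * 1000:7.1f} ms slack with 8 buffers")
```

At the assumed rate a 1024-word buffer takes about 0.1 second to fill, and eight of them leave the program roughly 0.7 second per buffer, against about 45 milliseconds for eight 64-word buffers.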
As originally operated, the external device was fast enough to crash VMS, in one session a dozen times in an hour. Test routines, including one to transmit output to input using a maintenance cable, showed the LPA11 to be performing properly. With a digital logic analyzer we were finally able to demonstrate that data was being supplied at about the maximum rate for the slave processor to accept it, but slightly too fast for the master microprocessor to keep up. The bugcheck message referred to AST delivery. This experience may be a matter of concern on systems on which the LPA is used to limit the load on the CPU of the VAX in order to permit simultaneous interactive processes. In our case the overhead of entering and leaving a subroutine on the PDP8 slowed the device sufficiently to eliminate the crashes with the small buffers and, with the large buffers, to eliminate any overrun except occasionally on startup.
.B2.I5;The lesson from our experience is that the handshake involving the external data ready/data accept will not force a faster external device to slow to the maximum data rate of the LPA. No obvious error is reported. When the speed mismatch is marginal, data overruns are observed; when it is worse, VMS will bugcheck and crash. Data overrun on the first buffer on digital input is not unusual; the extra buffer provided for this possibility is usually adequate to avoid loss of data. Other users tend to appreciate this sort of experimentation not being carried on while they are using the VAX.
.HL 3 The Clock.
The programmable real-time clock is related to a user program by three parameters. The first, IRATE, sets the speed of the clock. All other timings are multiples of this cycle, so it is normally set to 1 MHz so that multiple users are afforded the maximum range of timings for different applications. Next is the preset, IPRSET. Although obscure at first glance, it is simple in operation. A negative integer is specified. In operation, this value is copied into a counter to initialize it.
On each clock cycle the counter is incremented. When it overflows from negative to zero, the next level of timing is stepped, and the counter is reinitialized. The final timing parameter, DWELL, is the number of times the counter is reset per operation. IRATE and IPRSET are described in the documentation as hardware parameters, DWELL as software. For any parameter combination, the final digital I/O rate will not exceed a figure between 10,000 and 15,000 words per second. For digital I/O, the clock must be set and started even if the external device is to provide a handshake. The documentation indicates that for multiple users, the first sets the rate and starts the clock; subsequent ones specify values of DWELL appropriate to their tasks.
.HL 2 Data Storage.
.B2.I5;We suppose that the application program writes out the data received, with or without processing it, in a reasonably efficient manner. For example, in FORTRAN:
.B2.I10;DIMENSION LBUFFS(1024,8),LBUFR1(1024)......LBUFR8(1024)
.B0.I10;EQUIVALENCE (LBUFFS(1,1),LBUFR1(1)),....(LBUFFS(1,8),LBUFR8(1))
.B0.I10;....(code)
.B0.I10;WRITE(IOUT)LBUFRn etc.
.B2.I5;We have found that we could not stress even a single RM03 disk system storing data, and it appears that doing physical output to RX02 disks might be possible with only a slight reduction in throughput. For production use of the instrument we have dual RM03 disks on separate MASSBUS adapters, with one containing the system and a program to translate and store the data received, the second receiving the data. Timing experiments indicate that we are able to do extensive data reduction on the VAX at no sensible decrease of data rate relative to simple acquisition and storage.
.HL 2 Documentation.
.B.I5;The LPA11-K driver and the high-level language routines are described in the VAX/VMS I/O User's Guide, Chapter 5 of AA-M540A-TE for version 3.0, Chapter 10 of AA-D028B-TE for version 2.X.
There is a philosophical discussion in the Real Time User's Guide which we understood better in hindsight than in foresight. The LPA11-K Laboratory Peripheral Accelerator Installation and Maintenance Manual, EK-LPA11-IN-002, is also intended for VAX/VMS sites.
.B2.I5;Other documentation which tacitly assumes PDP11, rather than VAX, computers includes:
.LM15.B.I-5;LPA11-K Laboratory Peripheral Accelerator User's Guide, EK-LPA-11-UG-001
.B0.I-5;LPA11-K ^FORTRAN^ User's Reference Guide, AA-H852A-TC
.B0.I-5;DR11-K Interface User's Guide and Maintenance Manual, EK-DR11K-MM-001
.B0.I-5;KMC11 General Purpose Microprocessor User's Manual, EK-KMC11-OP-PRE
.B6.I5;L.C. Cusachs 30/VIII/82