PAGESWAPPER - May 1983 - Volume 4 Number 7

The Power Fail of a 780

by A. T. McClinton

The VAX 11/780 is supposed to run forever. But the local power company wishes it were in the phone business so it could start selling computers. This has led to several interesting problems in providing the electrons necessary to chase the bits and bytes out the terminal lines and keep the programmers happy.

We, being of sound mind, purchased several "DIEHARDS" to attach to our machine. But I purchased the machine at the same time that I bought a new 1978 Dodge Colt. When the OEM battery went on the Colt, I began asking field service how to check the battery on the memory. They replace the batteries when they fail, and there are no test points to see how much charge is left.

This led to trying a modified version of a battery test method that I heard about at a DECUS meeting. I asked my weekend operator, at 5 am Sunday morning when very few people are on the system, to throw the main breaker on the back of the VAX and leave it off for 5 minutes. This places a real load on the batteries, as I have 3 megabytes on each controller, both of which are battery backed up.

The second time we did this we found a very interesting problem: the third battery in the VAX failed. This battery is present in all VAX 11/780's; it backs up the TOY (Time of Year) clock. The symptom was that the VAX continued running, but the time and date were now in April of 1984. I recognized it as a hardware problem that should have been caught by the software.

A trip through the microfiche showed that the clock was being set to zero because of the battery failure. The VMS operating system, per both the fiche and the hardware handbook, uses the fact that the clock is less than 10000000 to indicate a time failure, and requests the operator to enter the time on a cold boot. On a warm restart, no test is made.
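The boot-time check just described can be sketched as follows. This is only an illustration of the logic from the handbook, not VMS source; the function name and return strings are invented, and only the 10000000 threshold comes from the article.

```python
# Sketch of the TOY clock validity check described above. On a COLD boot,
# a TOY value below the threshold is treated as a time failure and the
# operator is prompted; on a WARM restart no test is made, so a zeroed
# clock (dead battery) is misread as a wrap. Names are illustrative.

TOY_VALID_THRESHOLD = 10_000_000  # per the fiche/handbook: below this = failure

def boot_time_check(toy_value: int, cold_boot: bool) -> str:
    """Return the action the boot path takes for a given TOY reading."""
    if cold_boot:
        if toy_value < TOY_VALID_THRESHOLD:
            return "prompt operator for time"
        return "use TOY clock"
    # Warm restart: no validity test, so a dead battery yields a bogus
    # future date (the "April of 1984" symptom).
    return "assume clock wrapped"
```

A dead battery on a warm restart thus falls straight through to the "clock wrapped" case, which is exactly the failure observed.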
The result is that the software assumes the clock has wrapped, and that the system has not been booted during months 13, 14 or 15 of the clock. The time is then set to a random value, normally in the future.

The problem has been SPR'ed, and DEC will consider changing it in a future major release. They feel it is better to come up with the wrong time than to require operator intervention on the auto reboot. I hope that they will also fix all of the programs that now have incorrect time stamps: accounting records, errorlog entries, etc. SPEAR absolutely refuses to read past the point where the data shifted forward past today. BACKUP plans to back up the files, if they are still on the disk next year. And all of the timed entries in the batch queue were fired off immediately; many of these required operator intervention.

In the standard method of solving problems, I would propose that two new SYSGEN parameters be added:

1. WARMWAIT - The number of microfortnights that the operating system would wait for an operator to enter the correct time if the warm reboot process finds an error in the time during a warm reboot.

2. WARMTICK - The number of clock ticks to add to the last known time before the power fail if the operator does not respond within WARMWAIT time. Note that WARMWAIT will also be added. In the event of a power fail while waiting for an operator response, the system will add all recognized time plus WARMTICK times the number of power fails.

All of the above is obviously unnecessary if you can make certain that the clock is in good repair. Perhaps replacing the TOY clock with a real one would solve the entire problem.

In this issue...

The Power Fail of a 780 . . . . . . . . . . . . . . 1
In this issue... . . . . . . . . . . . . . . . . . . 3
Editor's Workfile . . .
. . . . . . . . . . . . . . 3
Stand-Alone BACKUP on 11/750 System Disk . . . . . . 4
Who "Owns" the PageSwapper Mascot? . . . . . . . . . 5
Performance Tuning Large VAX Installations . . . . . 6
VAX/VMS Real Time Performance Test Results . . . . 12
INPUT/OUTPUT . . . . . . . . . . . . . . . . . . . 38
INPUT/OUTPUT Submission Form . . . . . . . . . . . 45
System Improvement Request Submission Form . . . . 47

Material for publication in the Pageswapper should be sent (US mail only -- no "express" services please) to:

Larry Kilgallen
Box 81, MIT Station
Cambridge, MA 02139-0901

Preference is given to material submitted as machine-readable Runoff source. Mailing list requests are NOT handled at the above address; they should be sent to the DECUS office.

Editor's Workfile

I heard from one poor author who tried to send his Pageswapper article in machine-readable form and TWICE had it returned with "no such person" or some such status code. Being acquainted with Jim Ebright, the author even checked to make sure there really was such a person as Larry Kilgallen. The answer is that there is, but any submissions to the Pageswapper should be MAILED to the address given. Do NOT attempt to use any of the so-called "express" or "overnight-delivery" services. It always slows things down, and in at least one recorded case caused total non-delivery.

I have been informed that last month's article by Richard Garland was a repeat. As best I can figure it out, the floppy he submitted was turned over to me when I signed on as editor, and I had presumed it was as yet unpublished. I included it in the April Pageswapper and returned the floppy. Well, at least now Richard has his floppy back and we can't do it again (not with the same article, anyway).
Stand-Alone BACKUP on 11/750 System Disk

Joe Springer
Software Services, Digital Equipment Corporation (NYO-8)
1 Penn Plaza, 8th Floor, New York, NY 10119

In reference to the article entitled "Stand-alone Backup on Your System Disk" that appeared in the PAGESWAPPER of February 1983, Vol. 4, No. 4:

1. The procedure described will work for both the VAX 780 and the 730. However, the 750 is somewhat different. After stand-alone BACKUP has been built in [SYSE.SYSEXE] on the system disk, the bootfile (BCKBOO.CMD) has been written out to the console medium, and the system has been shut down, boot from the system console as follows:

>>>B DDA0

This boots BOOT58 from the console medium (device DDA0), after which you can run the command file:

BOOT58>@BCKBOO.CMD

At this point, the command file runs and brings up stand-alone BACKUP. The reason for the extra step of booting from DDA0 is that on the 750 you cannot specify a command file from the normal console subsystem prompt (>>>) as is normally done on other processors. When you type, for example,

>>>B DUA0

you are really specifying a device (in this case the R80), not a bootable file.

2. While the procedure in the article for connecting, and using FLX to read from and write to, the console medium is correct, a command procedure exists that will do most of it for you. It works on all three VAX processors: SYS$UPDATE:DXCOPY. Note that it mounts CSA1 (730 users note: not CSA2). For further information see the V3.2 Installation Guides (Section 3.2.4 for the VAX-11/780, Section 5.1.2 for the VAX-11/750; it is not in the 730 Software Installation Guide).

3. You can use any hex digit for the 'n' in [SYSn.SYSEXE] except 0 (where your system is now) and F, which is reserved for Digital. Be sure that you modify BCKBOO.CMD properly, and that it matches the new [SYSn.SYSEXE] that you have created.
Who "Owns" the PageSwapper Mascot?

I have been using idle time on my VAX to generate computer graphics frames on a bit-mapped printer/plotter for "flip-packs". These are small booklets that you hold in your hands and flip through to animate the pictures. Even with professional printing and binding, it's probably the cheapest computer-animated graphics output you can produce!

[Figure: one frame of a stochastically modelled mountain.]

I bring all this to your attention because I would like to make a flip-pack of the cheshire cat mascot. I envision a booklet in which the cat fades from one end to the other (the smile remains, of course!). Flip the book backwards, and the cat reappears. The only problem is, I assume I would need permission to use the design. Who can give this to me? And can I get a larger copy of it to make digitizing easier? Perhaps I could justify the use of the design by selling the resulting flip-packs at the DECUS bookstore next fall, profits to go to DECUS, the BAYVAX LUG, or some other worthy cause.

Mike Higgins
Virtual MicroSystems Inc.
2150 Shattuck Ave, Suite 720
Berkeley, CA 94704

Performance Tuning Large VAX Installations

Mark Love
Northeastern University
Boston, Mass 02115

Tuning is an issue about which everyone wants to know more. In this article I will try not to bore readers by rehashing old material, but rather to provide an overall logical framework for discussion, provide pointers to where good information may be found, and give a few "pearly words of wisdom" which have proved helpful at our large site. My definition of "large" is any VAX 11/780 (if it's large, why use a 750 or 730?) supporting more than 30 simultaneous interactive users with, perhaps, additional jobs running in batch queues.
Tuning a large VAX facility can be viewed in 3 contexts:

o tuning VMS itself;
o tuning application programs;
o user education and site management.

This article deals mainly with tuning VMS. Some comments do address user education and site management.

TUNING VMS
__________

The first Golden Rule of Tuning, which applies to any size VAX, is: the more work VMS does managing its resources, the less work it does for users - the "real" work. Paging, swapping, file lookups, window turns, process creations, image activations and device interrupts are all examples of VMS overhead activities which should be minimized.

Golden Rule of Tuning #2: providing more resources - hardware - helps minimize VMS overhead and maximize user work.

Corollary to Golden Rule of Tuning #2: you can't put 10 pounds of ____ (user work) into a 5 pound sack (VAX). In order to process a given workload, you must provide sufficient hardware; no amount of tuning will overcome insufficient resources. There are three potential hardware bottlenecks:

o memory;
o I/O;
o CPU.

MEMORY TUNING
_____________

Any swapping or excessive paging ( > 100 pages/sec systemwide ) is strong but not conclusive evidence of insufficient memory. Our observation has been that any large VAX with less than 4 Mbytes of memory is definitely underconfigured. A large VAX with 4 - 5.5 Mbytes is marginally configured, while more than 5.5 Mbytes is usually sufficient. We have seen large VAXes where more than 5.5 Mbytes was a waste, while other large VAXes profitably used 8 Mbytes. Moral: the most effective tuning advice usually is: buy more memory, especially since it's become so cheap ( < $4K/Mbyte ).

How should VMS be tuned to use this memory? Much of the black-art aspect of VMS tuning has been removed as of release 3.2 with the addition of Chapter 12 to the Systems Management and Operations Guide.
This chapter now contains over 90 pages of useful information which is must reading for serious and effective tuning. However, some additional suggestions may prove helpful. An implicit assumption in what follows is that swapping should rarely, if ever, occur and that paging should be reduced to an absolute minimum.

The system (EXEC) should page very little: once per minute is not unrealistic. SYSMWCNT will probably need to be larger than the default value. Its exact value is best determined by results: use MONITOR PAGE to glean system page fault rates.

Process paging can be reduced by generous WSQUOTAs in each authorization record and a large WSEXTENT SYSGEN parameter value (WSMAX should equal WSEXTENT). WSQUOTA = 500 and WSMAX = WSEXTENT = 1000 work well in an academic environment, but these are not magic values suitable for all installations. Further reductions in process paging can be gained by making working set adjustment more sensitive to increase but less sensitive to decrease than the default values. An example might be WSINC = 50, PFRATL = 1, PFRATH = 50, and WSDEC = 2. (If PFRATL = 0, the system seems not to "take back" extent pages, so memory is quickly exhausted and swapping begins. Perhaps this is a scheduler bug in VMS 3.2?)

Page faults from the modified list are less expensive than faults to disk, so configure the modified list as large as possible, within the guidelines set forth in Chapter 12. Reasonable values might be MPW_WRTCLUSTER = 120, MPW_HILIM = 1500, MPW_LOLIM = 250 and MPW_THRESH = 750.

Making sure the disk ACPs run efficiently is another good use of memory. First, running one ACP for each separate MASSBUS (and UNIBUS if it has disks) is usually a good idea. Additionally, if you have disks with significantly different performance, use a separate ACP for each type.
Do not run multiple ACPs for each disk, however; caching is more effective than separate processes which must be context switched. Second, caching ACP information in memory via the ACP_*CACHE parameters can have a big effect on overall performance, and on I/O throughput in particular. The more information the ACPs can keep in memory via these caches, the less overhead disk I/O the ACPs must perform.

ACP_MAPCACHE is used to hold the bitmap of free and used disk blocks. One bit per allocation cluster is required. As an example, an RM05 with about 500,000 blocks and a cluster allocation factor of 3 requires 500000/3 ~= 166666 bits for the entire bitmap. Since there are 8 * 512 = 4096 bits/page, it takes 166666/4096 ~= 41 pages to map an entire RM05 bitmap. A maximal ACP_MAPCACHE value would be sufficient to hold the entire bitmap for every disk the ACP controls. A more reasonable but still effective value would be sufficient to hold the free-blocks portions of all the disk bitmaps controlled by a single ACP. Note, however, that as the disks approach being full, values closer to the maximal value are probably required for efficient caching.

ACP_HDRCACHE is used to hold the file headers (1 disk block each) of open files. Set this parameter to the number of files you expect to be simultaneously open on all disks controlled by an ACP. A rough guess of the number of open files on each disk can be obtained with the SHOW DEVICE/FILE xxxx: command (remember to give yourself WORLD privilege). A reasonable value for ACP_DIRCACHE is the value given to ACP_HDRCACHE. If disk quotas are active, ACP_QUOCACHE should be large enough to hold the entire quota file ([0,0]QUOTA.SYS) for each disk controlled by the ACP.

You can use the MONITOR FCP display to determine how effective your ACP cache values are. Small numbers of window turns and high numbers of cache hits, relative to the number of FCP calls, are a good indication of effective caching.
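The ACP_MAPCACHE sizing arithmetic above can be packaged as a small calculation. This is just the article's arithmetic restated; the helper name is ours, and the only constants used are the ones given in the text (one bit per allocation cluster, 512-byte pages, so 8 * 512 = 4096 bits per page).

```python
# Sketch of the ACP_MAPCACHE sizing arithmetic from the text: pages of
# cache needed to hold the entire free/used bitmap for one disk.

BITS_PER_PAGE = 8 * 512  # 4096 bits in one 512-byte page

def mapcache_pages(disk_blocks: int, cluster_factor: int) -> float:
    """Pages needed to cache the whole storage bitmap for one disk."""
    bitmap_bits = disk_blocks // cluster_factor  # one bit per cluster
    return bitmap_bits / BITS_PER_PAGE

# The RM05 example from the text: ~500,000 blocks, cluster factor 3.
print(mapcache_pages(500_000, 3))  # roughly 41 pages
```

A maximal ACP_MAPCACHE would sum this figure over every disk the ACP controls; the "more reasonable" setting in the text scales it down to cover only the free-blocks portions.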
Significant overhead can be saved during image activation of frequently used programs by installing them /OPEN/SHARED/HEADER_RESIDENT with the INSTALL utility. Additionally, installing frequently used images /SHARE saves memory, since multiple users map into the same physical pages. Adding images to the installed lists will probably mean increasing the KFILSTCNT, GBLPAGES and GBLSECTIONS SYSGEN parameters.

You can determine the most frequently used images by running image accounting. Remember that image accounting is costly and can cause the accounting file to grow rapidly. Try running image accounting for a few days during heavy loading to collect a statistically significant sample. To turn on image accounting, use the SET ACCOUNTING/ENABLE=IMAGE command; SET ACCOUNTING/DISABLE=IMAGE turns it off. To produce a list of images and frequency counts, use the command

ACCOUNT/TYPE=IMAGE/SUM=IMAGE/SORT=IMAGE/OUT=listfile accountfile

The list of images will not be in order of use frequency, but a simple run through the SORT utility can remedy that. Installing 70 images is not profligate, provided the use statistics warrant it. Installing the top n% of used images, or installing all images run more frequently than x times per day, are reasonable selection criteria.

A word of caution when installing images /SHARE. Creating global sections in this manner is an excellent way of utilizing memory as noted above, but you will not reap the benefits unless the file protections are set to allow world read-execute access (W:RE). Setting just W:E defeats sharing of global pages. This problem can be observed via the INSTALL utility with the /GLOBALS command: the reference count will always be 0, even when people are using the image. Note, however, that the reference count is normally 0 if nobody is using the image.
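The "top n% of used images" selection criterion above amounts to a sort-and-slice over the frequency counts that image accounting produces. A minimal sketch, with made-up image names and counts standing in for the ACCOUNTING summary output:

```python
# Sketch of the "install the top n% of used images" selection rule.
# Input would come from the image accounting summary; these counts are
# invented for illustration.

def top_images(counts: dict[str, int], top_percent: float) -> list[str]:
    """Rank images by use count and keep the top slice."""
    ranked = sorted(counts, key=counts.get, reverse=True)
    keep = max(1, round(len(ranked) * top_percent / 100))
    return ranked[:keep]

usage = {"FORTRAN": 900, "MAIL": 1200, "LOGINOUT": 5000, "SORT": 40}
print(top_images(usage, 50))  # the two most-used images: LOGINOUT, MAIL
```

On the VAX itself, the same effect comes from running the ACCOUNT listing through the SORT utility, as the text suggests.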
I/O TUNING
__________

An I/O bottleneck is probably best characterized by complaints of slow response time, most processes in LEF state, and only a few processes in COM state ( < 5 ). Use the MONITOR STATES display to glean this information. The discussions above concerning ACP caching, INSTALLing, and reducing paging and swapping describe significant steps that will reduce I/O bottlenecks. The RMS Tuning Guide may prove useful, particularly for programs accessing large indexed sequential files. An excellent discussion of FORTRAN file I/O optimization techniques can be found in an article by Tom Kent appearing in the March 82 Pageswapper.

I/O load balancing can help reduce bottlenecks. A large VAX should have multiple MASSBUSes, and possibly multiple UNIBUSes, with devices spread evenly across them. The activity on each bus should be about equal, as should activity on each disk, assuming all disks are similar. The SHOW DEVICE/FULL command will display device operation counts which can be used as an approximate measure of activity over a system. If load seems to be concentrated on one disk or bus, try moving users or applications around to level the load.

Users should not be placed on the system disk, as it is usually the most heavily used. In certain situations it may be profitable to make the primary and secondary swap files on the system disk small, and create large secondary swap and page files on a less used bus/disk. This moves paging and swapping activity off the system disk (although, given the above discussion, you shouldn't be doing that much paging and swapping). Do not place tape drives on the same bus as the system disk, especially if they are used frequently.

In configurations with large numbers of terminals, you can reduce the overhead associated with character processing by replacing DZ-11s with DMF32s or DMF equivalents from non-DEC vendors.
These devices are more intelligent than DZ-11s. Devices which generate large amounts of output, such as plotters and graphics terminals, benefit greatly from such a reconfiguration. If a lot of printing is done, consider interfacing the line printers with DMF32s or other DMA line printer controllers to further reduce character output overhead.

Sometimes a modem, terminal or DZ-11 breaks in such a way that, to the VAX, it looks as if someone is attempting to log in over and over again, very rapidly. This behavior can also be caused by improper grounding in the terminal lines. This situation can cause substantial overhead (loss of VAX responsiveness) and can be observed via the MONITOR MODES display. Excessive time on the interrupt stack ( > 40% ) may indicate such a problem. Confirmation can be obtained by searching the accounting files for login failures. Login failure recording may be turned on with the SET ACCOUNTING/ENABLE=LOGIN_FAILURE command, and a summary of such events may be obtained with the command

ACCOUNTING/TYPE=LOGFAIL/SUMMARY=LOGFAIL accountfile

By removing the /SUMMARY qualifier from the above command, the terminal lines on which the failures are occurring will be displayed.

For sites such as academic computing centers which support large numbers of usernames ( > 1000 ), directory organization can have a large impact on I/O and overall system performance. Placing all default directories in the top level ([0,0]) can cause the ACP to spend considerable time doing directory searches; this effect is exaggerated if the directories cluster alphabetically. ACP directory search time can be reduced in this case by organizing default directories hierarchically. For example, suppose a username must be provided to each of 30 students in 100 different classes.
Instead of placing 3000 directories in [0,0], put 100 class directories in [0,0], and then make 30 subdirectories in each of the 100 class directories. Each username in the AUTHORIZE file would then point to a subdirectory.

Every site should have some backup procedure for keeping redundant copies of files. Performing backups regularly has the additional benefit of consolidating file allocations, which reduces ACP overhead activity. As files are created, extended and deleted during normal operations, the disk space tends to become fragmented. Even individual files can become fragmented: the space allocated to them becomes discontiguous. Running BACKUP/IMAGE operations not only creates redundant copies of disks, it makes file allocations contiguous (or nearly so). If the MONITOR FCP display shows large numbers of window turns, this may indicate fragmented disk space, especially if a BACKUP/IMAGE has not been performed in a while.

CPU TUNING
__________

Not much can be said about CPU bottlenecks except "buy another VAX". The number of users that a given VAX configuration can support varies substantially, and depends greatly on the type of work being done. Assuming a well-tuned VAX, CPU saturation can be said to occur when there are consistently more than 15 computable processes (in the COM state). The only remedies, outside of acquiring more VAXes, are program tuning or management decree. Program tuning is a very large topic and cannot be adequately discussed here; generally, each case must be considered separately.

In situations where demand is excessive during some hours but light during others, steps can be taken to even out the CPU demands. One step is to create batch queues and encourage people to submit their work to batch. This isn't very difficult at sites where resource charging is in effect, if the rates for batch work are lower than interactive rates.
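The payoff of the hierarchical directory layout described earlier (30 students in each of 100 classes) is easy to quantify under a simple assumption. Treating a directory lookup as a linear scan of the entries in each directory on the path, which is only a rough model of what the ACP does:

```python
# Rough model of the directory-search saving from reorganizing 3000
# default directories into a two-level hierarchy. Linear-scan cost per
# directory is an assumption for illustration, not a claim about the
# exact ACP algorithm.

def flat_cost(total_dirs: int) -> int:
    """Entries examined, worst case, with every directory in [0,0]."""
    return total_dirs

def hierarchical_cost(classes: int, students_per_class: int) -> int:
    """Entries examined: scan the class level, then one class directory."""
    return classes + students_per_class

print(flat_cost(3000))             # 3000 entries in the worst case
print(hierarchical_cost(100, 30))  # 130 entries in the worst case
```

Even in this crude model the two-level layout cuts the worst-case scan by more than a factor of 20, which is why the reorganization helps busy academic sites.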
A more draconian but effective means of forcing batch work is to place a CPU time limit in the AUTHORIZE record of each user. During an interactive session this limit causes images to exit if the limit is exceeded. Initializing batch queues with a maximum CPU time will override this limit. Set up properly, this permits larger jobs to run in batch but not interactively, forcing users to make use of batch jobs.

VAX/VMS Real Time Performance Test Results

by Richard Kayfes
The Aerospace Corporation
P. O. Box 92957 M1-166
Los Angeles, CA 90009

1. Introduction

This paper presents the results of some of the tests that were developed to assess the response and overhead of VAX/VMS and DECnet. The tests described here are very detailed and specific in nature. They do not address general multi-user applications, nor do they test VAX products such as DATATRIEVE, EDT and the system utilities. They were designed to obtain detailed and repeatable timings of VMS/DECnet components. The tests are divided into three categories:

1. System Response Tests

These tests measure the microsecond-level execution time of various VMS system services and various VMS system mechanisms, such as process context switches, real time clock interrupt service, asynchronous system trap (AST) delivery and paging.

2. VMS System Load Tests

These tests measure sample instruction execution rates and the disk input/output (I/O) rates that can be realized using FORTRAN unformatted sequential I/O and virtual QIO I/O. The tests also measure the impact of the SWAPPER on the wakeup of real time tasks.

3. DECnet Load Tests

These tests measure the nontransparent interprocess communication transfer rates that can be realized between two nodes in a DECnet network, and also the load placed on the VAX by these communicating processes. The tests also measure the impact of DECnet on the wakeup of real time tasks.
The tests documented here were run on three different VAX computers with no other users on the machine. The following describes the relevant hardware/software configurations for the tests:

                     VAX #1        VAX #2        VAX #3
CPU:                 VAX-11/780    VAX-11/780    VAX-11/780
Memory:              4.0 Mbytes    4.0 Mbytes    4.0 Mbytes
                     1 controller  1 controller  1 controller
User Disk:           1 RM03        1 RM05        1 RM05
System Disk:         Same as user  Same as user  1 RP06
Software:            VMS 3.1       VMS 3.0       VMS 2.5
Communications Line: PCL11-B

2. Capabilities Needed

In order to obtain microsecond-level elapsed times and investigate memory cache performance, a capability to read and set internal VAX CPU registers is needed. The interval count (ICR) register is a microsecond counter which is used to generate an interrupt at level 24 every 10 milliseconds for updating software time and for servicing timed events. By reading the ICR register and software time, microsecond-level elapsed times can be computed. The SBI/Cache Maintenance (SBIMT) register contains bits which, if set, disable the memory cache facility. These bits must be set/reset to investigate cache performance. In order to read or set the ICR and SBIMT registers, a program must execute in kernel mode. In all microsecond-level tests, the time to read the ICR register is precomputed and subtracted from the test timings. An average value is used for this overhead, which typically varies by +/- 15 microseconds from the average.

3. System Response Tests

The system response tests determine the microsecond-level execution time of various VMS system services, the time to service a page fault, the time to perform a process context switch, the time to service the 10 millisecond (10 MS) clock interrupt, the time to respond to a timed event, the time to deliver an AST, and the time to execute various output statements. The system response tests are primarily written in FORTRAN and are run in standalone mode.
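The elapsed-time computation described in Section 2 combines the 10 ms software tick count with the microsecond ICR residue. A minimal sketch of that arithmetic follows; the register reads are simulated as plain integers (on a real 780 they require kernel mode), and the function name is ours.

```python
# Sketch of microsecond-level elapsed-time computation from Section 2:
# software time advances in 10 ms ticks, and the ICR supplies the
# microsecond residue within the current tick. The samples here are
# plain integers standing in for privileged register reads.

TICK_US = 10_000  # the 10 ms clock interrupt period, in microseconds

def elapsed_us(ticks0: int, icr0: int, ticks1: int, icr1: int) -> int:
    """Microseconds between two (tick count, ICR residue) samples."""
    return (ticks1 - ticks0) * TICK_US + (icr1 - icr0)

# Example: 3 clock interrupts elapse, ICR residue goes 9800 -> 50.
print(elapsed_us(100, 9_800, 103, 50))  # 20250 microseconds
```

The paper additionally subtracts the precomputed cost of the ICR read itself, an overhead the sketch omits.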
All code and common regions which are directly involved in test timings are locked into the working set. Code and data which must be locked from an elevated access mode, however, are not locked into the working set. In addition, for the non-page-fault tests, the entire set of tests is executed once before any timing values are accumulated. The tests are also run with a large working set size with automatic working set adjustment disabled. This ensures that all code and data referenced are in the working set and that no page faults will occur during the tests. To ensure that 10 MS interrupt processing times are not included in services which take less than 10 MS to complete, a routine is called to wait for a 10 MS interrupt before starting each test.

Due to the memory cache facility in the VAX, microsecond-level timings are influenced by whether the test is performed in a loop and by whether the memory cache is flushed before a test. When the LKWSET and ULWSET system services were executed in a loop, the timings obtained were 45% faster than when they were not executed in a loop. In the page fault test, successive page cache faults were timed with the memory cache flushed before the first fault. As expected, the second fault was serviced faster (256 microseconds) than the first fault (387 microseconds). In order to generate consistent and repeatable timings, therefore, the memory cache is flushed before each measurement by referencing every byte of a 40000 byte array. In addition, any code which is to be timed is not executed in a loop.

3.1 System Service Test

Table 3.1-1 lists the system service execution times for eight repetitions of the system service test on VAX #1. For each service, the maximum/minimum/average time expressed in microseconds is listed, along with a short description of the test.
Table 3.1-1  System Service Execution Times (max/min/avg, microseconds)

Lock Management

ENQW    717/675/699       Null lock granted. No other process using lock.
ENQ     559/515/538       Null lock granted synchronously. No other process using lock.
ENQ     646/593/616       Null lock granted synchronously. One other process using lock.
ENQ     628/577/596       Concurrent read sublock granted synchronously. No other process using lock.
ENQ     357/326/341       Synchronously convert null mode lock to concurrent read mode. No other process using lock.
ENQ     445/408/424       Lock conversion not granted. Place on conversion queue.
ENQ     380/336/359       Synchronously convert null mode lock to concurrent read mode. One other process using lock.
ENQ     714/659/678       Lock not granted. Place on wait queue.
DEQ     469/432/449       Remove lock. No other process using lock.
DEQ     502/447/474       Remove sublock. No other process using lock.
DEQ     1680/744/875      Remove lock from conversion queue.
DEQ     747/696/714       Remove lock from wait queue.
DEQ     732/709/722       Release lock that a suspended process is waiting on.

Memory Management

CRMPSC  10511/9520/9676   Create 1 page private section. File opened but not previously used.
CRMPSC  933/890/911       Create 1 page private section. File previously used.
CRMPSC  4935/2509/2826    Create 1 page global section. File previously used.
DELTVA  597/577/588       Delete 1 page private section.
DELTVA  6384/6066/6317    Delete 1 page global section previously updated to disk.
UPDSEC  1463/860/1015     Update 1 page unmodified global section to disk. System service execution time.
UPDSEC  1572/1535/1544    Update 1 page modified global section to disk. System service execution time.
UPDSEC  26538/8675/16684  Update 1 page modified global section to disk. Time to complete write and set event flag.
LKPAG   563/514/532       Lock a page in physical memory. The page is in a working set, but is not locked in the working set.
ULKPAG  513/467/489       Unlock previously locked page from physical memory.
SETSWM  122/81/96         Disable swapping.
SETSWM  118/82/92         Enable swapping.
EXPREG  429/395/406       Increase the P0 virtual address space by 1 page.
DELTVA  510/472/486       Delete 1 page from P0 virtual address space.
ADJWSL  224/191/203       Increase working set size by 1 page.
ADJWSL  238/200/216       Increase working set size by 10 pages.
ADJWSL  245/207/223       Decrease working set size by 11 pages. Working set is not full.
PURGWS  4430/4091/4352    Purge 1 page from the working set.
LKWSET  680/644/659       Lock previously purged page into the working set.
ULWSET  561/470/496       Unlock a page from the working set.

Timer

GETTIM  100/70/83         Get software time.
ASCTIM  1938/1875/1904    Convert software time to ASCII.
SETIMR  440/394/414       Create timed event with time in absolute system format.
CANTIM  403/349/372       Cancel timed event.
SCHDWK  387/310/365       Schedule a wakeup with time in absolute system format.
SETIME  53916/33772/44812 Set software time to a value in absolute system format.
CANWAK  378/322/352       Cancel a wakeup request.

Event Flag

ASCEFC  2037/1397/1504    Associate to a temporary global event flag cluster.
ASCEFC  1331/1266/1295    Associate to a permanent global event flag cluster.
SETEF   161/126/138       Set an event flag in a temporary global event flag cluster. No other process associated to the cluster.
SETEF   167/130/147       Set an event flag in a permanent global event flag cluster. No other process associated to the cluster.
SETEF   163/136/148       Set an event flag in a temporary global event flag cluster. One other process associated to the cluster.
SETEF   174/140/157       Set a local event flag.
CLREF   120/91/106        Clear an event flag in a temporary event flag cluster. No other process associated to the cluster.
CLREF   123/88/106        Clear an event flag in a permanent event flag cluster. No other process associated to the cluster.
    CLREF   118/85/98         Clear a local event flag.
    READEF  89/75/86          Read an event flag in a temporary global event flag cluster.
    WAITFR  152/105/124       Wait on a temporary global event flag that is already set.
    WAITFR  143/107/122       Wait on a permanent global event flag that is already set.
    WAITFR  142/95/116        Wait for a local event flag that is already set.
    DACEFC  444/402/414       Disassociate from a temporary event flag cluster.
    DACEFC  315/273/283       Disassociate from a permanent event flag cluster.
    DLCEF   1292/1236/1260    Delete a permanent event flag cluster.

Process Control

    CREPRC  3318/3049/3081    Execute CREPRC to create a low priority process.
    WAITFR  264201/149198/214617  Force invocation of a low priority process that was created but not activated.
    CREPRC  474500/333529/394685  Create a high priority process.
    DELPRC  651/594/622       Execute DELPRC to delete a low priority process.
    WAKE    225/185/198       Execute WAKE with no process context switch.
    RESUME  194/168/177       Execute RESUME with no process context switch.
    SUSPND  379/316/346       Execute SUSPND for a low priority computable process.
    SETPRI  368/310/330       Set a subprocess priority to a lower value.
    FORCEX  376/318/344       Force exit of a computable subprocess.
    DELPRC  215711/210467/212615  Execute DELPRC and wait for the process to be deleted.

Input/Output

    ASSIGN  911/875/885       Assign a channel to the null device (NLA0:).
    ASSIGN  1453/1382/1416    Assign a channel to a mailbox already created by the same process.
    CREMBX  820/757/795       Create a temporary mailbox and assign a channel to it.
    ASSIGN  1002/950/980      Assign a channel to a VT100 terminal.
    DASSGN  489/461/473       Deassign channel from NLA0:.
    DASSGN  1319/1288/1304    Deassign channel from a temporary mailbox.
    DASSGN  742/712/728       Deassign channel from a VT100 terminal.
    QIO     912/858/887       Queue a write of 1 byte to NLA0:.
    QIOW    989/900/937       Write 1 byte to NLA0:.
    QIO     1245/1210/1225    Queue a write of 1 byte to a temporary mailbox.
    QIOW    1326/1294/1306    Write 1 byte to a temporary mailbox.
    QIO     2065/1992/2033    Queue a write of 1 byte to a VT100 terminal.
    QIOW    2152/2071/2093    Write 1 byte to a VT100 terminal.
    QIOW    1396/1336/1370    Write 100 bytes to a temporary mailbox.
    QIOW    1591/1533/1566    Write 100 bytes to a temporary mailbox and deliver an AST when done.
    QIOW    1843/1769/1804    Write 100 bytes to a mailbox, hibernate and use an AST to wake the process.
    QIOW    1299/1078/1129    Queue a virtual write of one 512-byte sector to an RM03 disk file.
    QIO     4438/3981/4059    Queue a virtual write of 99 sectors to an RM03 disk file.

3.2 Page Fault Test

The page fault test was designed to measure the VMS response time to the following types of page faults:

o page fault to the disk
o page fault to the free list
o page fault to the modified list
o demand zero page fault
o page fault to the modified list which causes the SWAPPER to be invoked

The page fault test was executed as a separate program with automatic working set adjustment disabled to ensure control over page faulting. The GETJPI system service was used to ensure that the proper number of page faults were generated for each timing value obtained. The working set was limited to 300 pages. Three FORTRAN arrays were used:

1. COPYONREF - A 100000 byte array initialized to non-zero values and treated as copy-on-reference.

2. READONLY - A 6000 byte array initialized to non-zero values and defined as read-only with an option command to the linker.

3. DEMANDZ - A 4 megabyte uninitialized array which is treated as demand-zero memory.

Each of these arrays was assigned to a separate cluster via the CLUSTER option command to the linker. The cluster factor was varied from 1 to the default value. Disk faults were triggered by referencing each page of COPYONREF and READONLY.
Demand zero faults were generated by initial references to pages in DEMANDZ. The results for disk type faults are as follows:

                        Time (Milliseconds)
                        Max/Min/Aver
    VAX #1 - RM03
      Cluster = 32      89/67/80
      Cluster = 1       36/6/17
    VAX #2 - RM05
      Cluster = 64      114/67/89
      Cluster = 1       39/6/13

As seen in the above test results, the use of a small cluster size resulted in a low average per-fault service time. This is because many of the disk faults were to pages on the same disk track. Page cache faults were generated by referencing pages in READONLY and COPYONREF that were already in physical memory. References to READONLY triggered faults to the free list, while modified list faults were generated by setting an element in COPYONREF. Page cache faults were first generated with space available in the working set. This was done by first purging the page to be referenced from the working set and then timing a reference to the purged page. Page cache faults were then timed with the working set full. The working set was filled by referencing pages from DEMANDZ. The results from the page cache tests are as follows:

                     Time (Microseconds)
    Working set      Mod List       Free List      Demand Zero    Mod List*
                     Max/Min/Aver   Max/Min/Aver   Max/Min/Aver   Max/Min/Aver
    not full         439/353/381    694/333/393    774/715/737    429/202/243
    full             696/601/642    659/595/636    1014/706/889   476/391/439

In the above test results, two sets of values are shown for faults on pages contained in the modified list. The second (starred) set is for faults which occur immediately after a previous page fault and for which the page cache is not flushed; these values correspond to faults with good memory cache performance. The first set is for faults which occur when the memory cache has been cleared. The above results clearly illustrate the need to know how a test is performed in order to draw any conclusions as to whether the test is realistic.
In a paper authored by Digital Equipment Corporation employees (1), the page cache fault time was stated to be approximately 200 microseconds. As seen in the test results above, this time corresponds to a test with good memory cache performance and a partially filled working set. Execution of the SWAPPER to write pages from the modified list was timed by filling the working set and then timing references to 600 pages of DEMANDZ. The test was run on VAX #1. The values of the VMS SYSGEN parameters which control the size of the modified list and how it is written to disk were:

1. MPW_HILIMIT = 500

2. MPW_LOWLIMIT = 200

3. MPW_WRTCLUSTER = 96

The average value obtained for execution of these tests was 27 milliseconds.

3.3 Process Context Switch Test

The process context switch tests were designed to measure the time VMS takes to remove one process from execution and replace it with another. In these tests a driver process creates subprocesses which are invoked by executing VMS system services or FORTRAN statements which trigger process context switches. The subprocesses created pass their timing measurements back to the driver via a permanently installed global section. The process context switch tests were run on VAX #1 and include the time to execute the VMS triggering mechanism.

    VMS            Process     Time (Microseconds)
    Mechanism      Priority    Max/Min/Aver
    SETEF          Increased   590/432/573
    WAITFR         Decreased   394/356/378
    WAKE           Increased   639/570/609
    HIBER          Decreased   397/363/328
    RESUME         Increased   698/653/670
    SUSPND         Decreased   696/661/678
    FORTRAN STOP   Decreased   3876/3792/3820

In the test results shown above, the process priority column indicates whether the software priority is increased or decreased as a result of the process context switch. For example, the SETEF test caused a higher priority process to be invoked by setting an event flag that it was waiting on.
In the HIBER test, the currently executing process forces itself to hibernate, which causes a lower priority process to be invoked.

3.4 Real Time Clock Interrupt Test

The real time clock interrupt test program consists of a loop which, for each iteration, saves the time that VMS took to process the last 10 MS interrupt into an array which is locked into the working set. The test program continually reads the ICR register. Whenever the ICR register becomes more negative, the value is saved, since a more negative value indicates that a 10 MS interrupt has just been serviced. The real time clock interrupt test measures quantum expiration processing for real and non-real time processes, the effect of automatic working set adjustment, the time to update software time and the time to execute periodic system procedures. Figure 3.4-1 contains output from the real time clock interrupt test run on VAX #2 with automatic working set adjustment enabled. Each number printed corresponds to the time that VMS took to process a 10 MS interrupt. Since 200 values are printed and the elapsed time of the run was 2 seconds, control was returned to the test program at the end of each 10 MS interrupt. Each row in the test printout contains 10 values, so that for each column of the printout successive row elements are separated in time by 100 MS. The test program contains several paths which detect occurrences of the 10 MS interrupt. Since the execution time of each path is different, the test program also prints the execution time of the longest path (7 microseconds). The interrupt test program generates no page faults and uses less than 100 bytes of the memory cache. This means that most of the times listed in Figure 3.4-1 correspond to code execution with excellent memory cache usage.
Most of the times listed are low (about 36 microseconds) and correspond to the time VMS takes to update software time and to check if any timed events have expired. To determine the quantum expiration time, one needs to look for large interrupt processing times separated by the quantum expiration interval. Since the value of the system parameter (QUANTUM) controlling the frequency of quantum expiration was 20, VMS processing of quantum expiration occurred at least as frequently as once every 200 MS. As shown in Figure 3.4-1, the seventh column contains 10 large values which are separated by 200 MS. The large 7 millisecond values are caused by VMS automatic working set adjustment; just prior to this interrupt test, the test program had been generating many page faults. When the interrupt test program was expanded to a 20 second interval, the quantum expiration times eventually dropped to an 800 microsecond average. In the printout shown in Figure 3.4-1 there are also two other sets of large 10 MS interrupt processing times besides the quantum expiration times. Execution times for these are contained in columns 3 and 5 of the printout. Within each column the values are separated by 1 second intervals. These sets of values are present in every run of the test program made on the VAX #2 computer. The values printed correspond to the time to check for terminal timeout and to system procedures such as EXE$TIMEOUT which are automatically invoked at 1 second intervals. The EXE$TIMEOUT subroutine, among other things, checks to see if the swapper or error logger should be awakened and calls device drivers that have exceeded their timeout intervals.
The real time clock interrupt test results for VAX #1 are as follows:

                                     Execution Time (Microseconds)
                                     First Value    Subsequent Values
    Normal Interrupt Processing                     37
    System Procedure Execution       1430           1026
    Terminal Polling                 289            223
    Quantum Expiration
      Real Time Priority             60             50
      Normal Priority                347            260

In the above test results, two sets of values are shown: one for the first value read in a test and the other for subsequent values read in a test. Since the memory cache is flushed before the start of the interrupt test, the first value read for a particular type of interrupt processing should be higher than subsequent values read.

Figure 3.4-1  Quantum Expiration Test - VAX #2
              Automatic Working Set Adjustment

10 ms interrupt overhead   max time = 7640   min time = 40

10ms interrupt times:

      49   40   44   42   40   44   42   40   43   42
      45   43   42   45   43   42  367   43   42   45
      43   42   45   45   43   41   45   43   42   45
      43   42   45   43   42   45 2829   63   44   42
      40   44   42   40 1138   50   45   43   42   40
      44   42  339   42   45   43 7308   57   43   41
      45   43   42   45   44   42   40   44   42   40
      44   42   40   44   42   40 7640   62   44   43
      41   45   43   41   45   43   42   45   43   42
      45   43   42   45   43   42 7592   56   41   44
      42   40   44   42   40   44   42   40   44   42
      40   44   42   40   44   42 7133   61   42   40
      44   42   40   44   42   40   44   42   40   44
      42   40   44   42   40   44 7561   53   42   40
      44   42   40   44 1099   60   41   45   43   42
      45   43  335   40   44   42 7315   60   42   45
      43   42   45   43   42   45   43   42   45   43
      42   45   43   42   45   43 7076   55   40   45
      43   42   45   43   42   45   43   42   45   43
      42   45   43   42   45   43 7069   57   43   42

elapsed time = 2.000   MAX CALC. OVERHEAD (MICROS) = 8

3.5 Timed Event Response

Tests were run to determine how long it takes VMS to deliver timed events generated by the SETIMR and SCHDWK system services. These tests did not use AST routines and were run on VAX #1.
                         CPU Time (Microseconds)
    Type of Event        Max/Min/Aver
    Timer Expiration     703/577/624
    Scheduled Wakeup     674/587/611

3.6 AST Delivery

The time to deliver an AST was determined by executing the SETIMR and QIO system services with and without an AST. As can be seen below, AST delivery requires approximately 240 microseconds.

                                        CPU Time (Microseconds)
    VMS Mechanism                       Max/Min/Aver
    Timer Expiration
      - With AST Delivery, Return       895/872/881
      - With AST Delivery               909/832/864
      - Without AST                     703/577/624
    QIOW to a Mailbox
      - With AST Delivery               1608/1533/1567
      - Without AST                     1356/1325/1337

3.7 Input/Output

Microsecond level timings were obtained for FORTRAN statements which write to a mailbox, open and close a scratch disk file, use virtual QIO writes to disk, and use FORTRAN formatted and unformatted writes to disk. The FORTRAN disk output tests were run with the file parameter BUFFERCOUNT set to 1. The timings obtained were only for writes to the RMS buffer and did not include the time to write to the disk. The virtual QIO to disk test measured the time to execute the QIO system service as well as the time to finish the QIO. The time to finish the QIO was determined by a routine executing in kernel mode which continually reads the ICR register and records any large jumps in the register value. These jumps are caused by the VMS disk driver executing at an elevated interrupt priority level (IPL). ICR register jumps caused by the 10 MS interrupt were not recorded. These tests were run on VAX #1.
    Category                       CPU Time (Microseconds)
    Mailbox Write                  1300 Fixed + 2.4/fullword
    FORTRAN write to RMS buffer
      - I4 Format                  989 Fixed + 703/fullword
      - O4 Format                  992 Fixed + 721/fullword
      - Z3 Format                  958 Fixed + 744/fullword
      - A4 Format                  965 Fixed + 563/fullword
      - F4.1 Format                1156 Fixed + 1017/fullword
      - Unformatted                1287 Fixed + 4.3/fullword
    Virtual QIO to disk            2275 Fixed + 50/sector
    FORTRAN OPEN Statement         elapsed time about 250000
    FORTRAN CLOSE Statement        elapsed time about 100000

4. VMS System Load Tests

The objective of the VMS system load tests is to measure the performance of the VAX while executing a controlled synthetic workload in a stand-alone environment. The load is placed on the VAX by a software tool which is driven interactively by command input. These commands provide the means to vary the demand on the CPU, memory and disk I/O subsystem. The load created is self-measuring and provides two essential statistics: for compute-bound loads the machine-language instruction execution rate, and for disk I/O bound loads the data input/output rate to disk. The tests described here measure the instruction execution rates for four different instruction mixes and compare the disk I/O rates that can be realized using FORTRAN sequential and virtual QIO disk access methods.

4.1 Synthetic Load Tool

A software tool is used to create a synthetic workload consisting of a set of concurrently executing subprocesses written in FORTRAN. The main program (CONTROL) for the tool inputs test parameters from an interactive terminal and outputs test results to the terminal and optionally to a file. The test input parameters determine how many subprocesses should be created and their type. CONTROL creates the appropriate number of subprocesses and places their execution characteristics into a permanently installed writable shared image.
CONTROL sets a common event flag to start subprocess execution and resets a shared image variable to stop subprocess execution. At the end of a test, each subprocess places its execution statistics into the shared image and then enters a common event flag wait state. Each subprocess also sets other variables in the shared image to indicate that it has stopped and the time at which it stopped. CONTROL then uses the SETIMR system service to test at one second intervals whether all subprocesses have stopped. After all subprocesses have stopped, CONTROL retrieves the execution statistics from the shared image and outputs the test results. In this way, subprocess execution is efficiently synchronized and timed, and test characteristics can be properly changed. Subprocess execution is controlled by test parameters placed in the shared image. The subprocesses created by CONTROL are all identical and are implemented as infinite loops. Each loop is divided into a compute-bound portion and an I/O-bound portion. To determine the instruction execution rate for a test, CONTROL multiplies the number of machine language instructions executed per loop iteration by the total number of loop iterations executed. The number of machine language instructions executed per loop iteration is determined by examination of the subprocess compiled code listing. Four different instruction mixes were executed; they are shown in Figure 4.1-1. Notice that for MIX #3, 25-30% of the instructions use longword relative deferred addressing. This is due to the way in which VMS 3.0 implements universal symbol references (references to a shared image variable) as indirect references through a "fixup vector". Although the compiled code references to the shared image variable use PC relative addressing, the linker will change them to PC relative deferred.
Figure 4.1-1  Instruction Mixes

    VAX #3 - MIX #1
    ADDL2   I,R                 7.7%
    ADDL2   (R),R               7.7%
    AOBLEQ  R,(R),PC            7.7%
    ASHL    #7,(R),R            7.7%
    ASHL    #7,R,R              7.7%
    BNEQ    PC                 15.4%
    CMPL    PC,#                7.7%
    DIVL3   I,(R),(R)           7.7%
    MOVL    [R],(R) or (R),[R]  7.7%
    SUBL3   R,(R),(R)           7.7%
    TSTL    PC                  7.7%

    VAX #2 - MIX #2
    AOBLEQ  (R),(R),PC         14.2%
    ADDL3   (R),(R),R          14.2%
    BNEQ    PC                 28.4%
    CMPL    @PC                14.2%
    MOVL    [R],(R) or (R),[R] 14.2%
    TSTL    @PC                14.2%

    VAX #2 - MIX #3
    ADDL3   (R),(R),R          12.8%
    AOBLEQ  (R),(R),PC         12.8%
    AOBLEQ  I,(R),PC            1.3%
    ASHL    #7,R,(R)            1.3%
    BGTR    PC                  1.3%
    BNEQ    PC                 25.6%
    CMPL    (R),(R)             1.3%
    CMPL    @PC,#              12.8%
    DIVL3   (R),(R),R           1.3%
    MOVL    [R],(R) or (R),[R] 12.8%
    MOVL    #,(R)               1.3%
    MULL3   (R),R,R             1.3%
    SUBL3   R,(R),R             1.3%
    TSTL    @PC                12.8%

    VAX #1, VAX #2 - MIX #4
    ADDL3   (R),(R),R          12.8%
    AOBLEQ  (R),(R),PC         12.8%
    AOBLEQ  I,(R),PC            1.3%
    ASHL    #7,(R),(R)          1.3%
    BGTR    PC                  1.3%
    BNEQ    PC                 25.6%
    CMPL    (R),(R)             1.3%
    CMPL    (R),#              12.8%
    DIVL3   (R),(R),(R)         1.3%
    MOVL    [R],(R) or (R),[R] 12.8%
    MOVL    #,(R)               1.3%
    MULL3   (R),(R),R           1.3%
    SUBL3   R,(R),(R)           1.3%
    TSTL    (R)                12.8%

    Notation:
    (R)   displacement from a register
    [R]   indexed
    @PC   longword relative deferred
    I     immediate
    #     short literal
    PC    displacement from PC
    R     register

4.2 Test Methodology

All tests are run in a stand-alone environment. Each test scenario is executed for approximately 30 seconds and the test results are saved on a file for subsequent printing. For tests in which VMS overhead statistics are gathered, the MONITOR utility is run from another terminal logged in at high priority. The processor modes display is used at a 15 second sampling interval.

4.3 VAX Instruction Execution Rate

To determine the rate at which the VAX executes user program instructions, tests were run with a load consisting of a single compute-bound process. They were run in such a manner that no page faults were generated.
The compute-bound code executed should exhibit excellent memory cache performance, since it consists of a small loop which accesses a small amount of memory (4000 bytes for the test). To verify this, identical workloads were executed with the memory cache enabled and disabled. As can be seen in Figure 4.3-1, the performance improvement due to the memory cache ranges from 4.5 to 2.6, depending on how much of the time was spent executing user code. The compute rates varied as follows:

                             Compute Rate (Instructions/Second)
    Activity                 Enabled    Disabled   Performance Improvement
    With no page faults      618556     135248     4.5
    With page cache faults   156702     50741      3.1
    With page file faults    101232     38800      2.6

To compare the subprocess cache performance with that of typical programs, elapsed time measurements were taken for identical executions of the FORTRAN compiler and the linker with the memory cache enabled and disabled. The tests showed that the memory cache improved execution time by only about a factor of 2.0 for the linker and 2.5 for the compiler. The instruction execution rate tests were run on three different VAXs (#1, #2, #3) and utilized the four instruction mixes listed in Figure 4.1-1. They were run at non-real-time priority with automatic working set adjustment disabled.

                      Instruction Execution Rate
    Computer          Mix    Instructions/sec
    VAX #1/VMS 2.5    #4     746194
    VAX #2            #4     750568
    VAX #2            #3     628345
    VAX #2            #2     651796
    VAX #3            #1     470961

The execution rates listed above are best-case rates for the instruction mixes tested. These rates decrease dramatically as the number of subprocesses increases and as their real memory requirements exceed their working set limits. Figure 4.3-1 shows the effect of varying the real memory requirements of a compute-bound load consisting of two subprocesses. For this load, working set adjustment is disabled and the working set default size and quota are set to 150 pages.
The real memory requirement of the load is varied by sequentially referencing differing portions of a large (280000 byte) virtual memory array. For the test in Figure 4.3-1, this array is modified. Figure 4.3-1 plots the instruction execution rate of the subprocesses against the portion of the virtual memory array that is modified. As shown, a dramatic fall in execution rate occurs when the increase in the portion of the array modified causes the real memory requirements of the subprocess to exceed the working set size (LENGTH = 14000). At this point paging to the modified list begins. As the portion of the array referenced increases further, the execution rate again drops off (LENGTH = 30000) as the SWAPPER is invoked to write the modified list to disk.

[Figure 4.3-1: a line-printer plot of execution rate (0 to 700 thousand instructions/second, VAX #2 - MIX #3) against the length of the array modified (10000 to 60000 longwords), with one curve for the cache enabled and one for the cache disabled. Both curves fall steeply past LENGTH = 14000 and then level off, with the cache-enabled curve above the cache-disabled curve.]

4.4 Disk Output Rates

The disk I/O tests described here were run on VAX #1 using a single RM03 disk that also served as the system disk. Tests were run so that no user page faults were generated. To do this, each test scenario was first executed for 5 seconds after the synthetic load was established, to ensure that all the virtual memory referenced by the test was in the subprocess working sets. After the 5 second interval was over, an approximately 30 second interval was then executed in which measurements were made and saved on a file.
No attempt was made to pack the user disk. Approximately 20000 sectors were available on the disk for the tests. Only scratch disk files were used in the tests. Two types of disk access methods were used: virtual QIO, in which the QIOW system service was called to sequentially write memory buffers that were a multiple of 512 bytes, and sequential disk file I/O, in which unformatted FORTRAN WRITE statements were used. The files created for the virtual QIO tests were contiguous and 100 sectors long. For the FORTRAN sequential file tests, files were created with an initial and total size of 100 sectors using a FORTRAN OPEN statement (contiguous-best-try) with BUFFERCOUNT = 2. Since unformatted WRITE statements were used, the record length was set to 4 bytes less than the block size selected for the file. Two forms of the WRITE statement were used:

o implied DO list:  WRITE (.....) (DAT(I),I=1,N)
o array name:       WRITE (.....) DAT

In the array name case, a subroutine call was needed to execute the WRITE statement. The following table shows the disk transfer rates realized by a single subprocess for the various disk access methods as a function of the transfer size (BLOCKSIZE) to disk:

                  Disk Transfer Rate (Kilobytes/second)
                  FORTRAN Sequential WRITE
    Blocksize     Implied    Array
    (bytes)       DO         Name       QIO
    512           27         27         29
    1024          32         50         57
    2048          35         86         109
    4096          36         130        187
    8192          36         170        286
    16384         36         194        393

As can be seen, the virtual QIO access method generates the highest transfer rate. The low transfer rate shown for the implied DO list case was caused by the FORTRAN compiler generating a call to a library routine for each word transferred.
The CPU utilization statistics, as measured by the MONITOR utility, were also different and are shown below for a transfer size of 4096 bytes:

                              CPU Utilization (%)
                        Interrupt  Kernel  Exec  User  Idle
    Virtual QIO             6        8                  84
    FORTRAN Sequential
      Array Name            6        8      17     3    64
      Implied DO            1        2       4    74    18

The CPU was heavily loaded by the implied DO list test. The test illustrates that if FORTRAN READ/WRITE statements are not used properly, they can become a very inefficient way of transferring data to or from a disk.

4.5 Impact of the SWAPPER on Real Time Tasks

As shown in the page fault system response test (Section 3.2), when the SWAPPER is invoked to reduce the size of the modified list page cache, it can seize the CPU at real time priority for long periods of time (30 milliseconds) to write portions of the modified list to disk. The tests described here further investigate these actions of the SWAPPER by measuring its impact on scheduled wakeups of a real time task. To do this, a real time task (WAKEUP) was created to schedule wakeups at automatically repeated 10 millisecond intervals. The task then executes a 2000 iteration loop in which, for each iteration, it executes the following steps in sequence:

1. Read software time to the 10 millisecond level.

2. Execute the hibernate (HIBER) system service to wait for the next 10 millisecond interrupt.

3. Read software time to the microsecond level.

4. Calculate the microsecond delay in waking the real time task and store this time into an array.

The WAKEUP program was executed stand-alone and with a load created by the loading tool described in Section 4.1. The load consisted of 5 compute-bound subprocesses executing with automatic working set adjustment disabled and executing in such a fashion that the SWAPPER would be frequently invoked to write the modified list to disk.
The WAKEUP program was run at a priority both above and below the SWAPPER. The wakeup time distributions for WAKEUP tests run for durations of 20 seconds are as follows:

    Priority of        Test          Wakeup Time Distribution (%)      Max Wakeup
    WAKEUP program     Load          >1MS   >5MS   >10MS   >20MS       Time (MS)
    Below SWAPPER      None            1                                   2
    Below SWAPPER      5 processes    33     11     11      5             30
    Above SWAPPER      5 processes    29     11      4                    19

The above test results illustrate that the SWAPPER can block real time tasks at higher priority than itself for significant (20 milliseconds or more) periods of time. If a real time application requires response to timed events in less than 20 milliseconds, then the timed event may need to be recognized and responded to at elevated IPL.

5. DECnet Load Tests

The DECnet load tests measure the interprocess communication transfer rates that can be realized between processes communicating across two DECnet nodes. The load is generated by a software tool which is controlled by interactive command input. The load created is self-measuring, and the message transfer rates are displayed to the user. The tests also measure the effect of DECnet transfers on scheduled wakeups of a real time task. The tests were run on VAX #2 with a PCL11-B time division multiplexed communication line set up for an effective line speed rating of 1.6 megabits per second.

5.1 DECnet Load Tool

A software tool is used to create the test loads. The load consists of a set of concurrently executing subprocesses which form logical links with companion subprocesses on a remote node and transfer data over these logical links via nontransparent interprocess communication. The loading tool has a structure very similar to the VMS synthetic load tool described in Section 4. The driver for the tool (LOADNET) inputs test parameters from an interactive terminal and outputs test results to the terminal and optionally to a file.
The test parameters determine how many processes should be created and the size of the messages that they will transfer. LOADNET creates the appropriate number of subprocesses and places their message size in a permanently installed writable shared image. Once created, each subprocess first establishes a logical link with its companion subprocess, sends the message transfer size to the companion subprocess, and then waits on a common event flag to be set by LOADNET before starting message transfer. LOADNET sets a common event flag to start subprocess execution and then resets a shared image variable to stop subprocess execution. During a test, each subprocess increments a variable in the shared memory each time it sends and receives a message. At the end of a test, each subprocess sets other variables in the shared image to indicate that it has stopped and the time at which it stopped. LOADNET then uses the SETIMR system service to test at one second intervals whether all subprocesses have stopped. After all subprocesses have ended message communication, LOADNET accesses the run statistics from the shared memory and outputs the test results. The subprocesses created by LOADNET are all identical and are implemented as infinite loops. The message transfer portion of each subprocess consists of two calls to the QIOW system service, to send one message and then receive one message per loop iteration.

5.2 Test Methodology

The DECnet load tests are run in a stand-alone environment on 2 VAX computers. Each test is executed for approximately 30 seconds and the test results are then saved on a file for subsequent printing. For tests in which VMS/DECnet overhead statistics are gathered, the MONITOR utility is run from another terminal logged in at high priority. The processor modes display is used at a 15 second sampling interval.
5.3 Interprocess Communication Transfer Rates

Tests were run on VAX #2 with 1 and 2 subprocess test loads using a DECnet buffer size of 2112 bytes. The following table shows the relationship between the data transfer rate and the message size for these tests:

                    Data Transfer Rate (Kbytes/sec)
    Message Size    1 Subprocess    2 Subprocesses
    128             11.9            12.8
    256             23.3            25.2
    512             39.5            48.0
    1024            66.2            85.8
    2048            90.4            136.8
    4096            116.8           164.0
    8192            143.4           188.9
    16384           125.3           193.7

During the 2-subprocess test above, the MONITOR utility was run and CPU utilization statistics were gathered. From these statistics, an average DECnet processing time per message (QIOW) can be computed (CPU utilization rate divided by message transfer rate). The following table lists the DECnet message processing time as a function of message size for the 2-subprocess test:

    Message   Data Rate   Message Rate   CPU Utilization (%)      Processing Time
    Size      (kbytes/    (msg/sec)      Interrupt  Kernel  Idle  per Message
    (bytes)   sec)                                                (milliseconds)
    128       12.8        100            75         23      0     10.0
    256       25.2        98             73         20      4     9.8
    512       48.0        93             71         23      3     10.3
    1024      85.8        84             65         26      5     11.3
    2048      136.8       67             53         27      17    12.4
    4096      164.0       40             58         24      16    20.9
    8192      188.9       23             59         21      18    35.6
    16384     193.7       12             59         17      22    66.0

As seen above, the CPU time used by DECnet to transfer a message can be quite significant for large messages. Since most DECnet processing is done at elevated IPL, transfer of large messages can severely impact real time tasks.

5.4 Impact of DECnet on Real Time Tasks

As shown in Section 5.3, DECnet performs most of its processing at elevated IPL. The tests described here were run to show the effect of DECnet on scheduled wakeups of a real time task. The WAKEUP program, described in Section 4.5, was run with and without DECnet activity. Without DECnet activity, the average wakeup time was measured to be 486 microseconds with a maximum of 1681 microseconds.
Using the DECnet loading tool described in Section 5.1, a 2-subprocess load was executed at low priority. The subprocesses used DECnet to transfer 1024-byte messages using the QIOW system service. The DECnet buffer size was set at 576 bytes. When the MONITOR utility was run for a 70-second test, the following results were obtained:

    Message    Data Rate   Message Rate     CPU Utilization (%)     Processing Time
    Size       (kbytes/    (msg/sec)     Interrupt  Kernel  Idle    per Message
    (bytes)    sec)                                                 (milliseconds)
     1024        86.8           85           68        26      1        11.7

When the WAKEUP program was run at real time priority with the above described DECnet load, the following distribution of wakeup times was measured over a 70-second interval:

    Maximum:            29.9 milliseconds
    Average:             5.6 milliseconds
    >  1 millisecond     74%
    >  5 milliseconds    38%
    > 10 milliseconds    22%

The above test results indicate that if a real time application must respond to a real time event in less than 100 milliseconds while DECnet is transferring user messages, then the recognition of and response to the real time event may have to be performed at elevated IPL.

6. Summary

This paper has presented many timing measurements of services offered by VMS and DECnet and of components of VMS. The points worth emphasizing are as follows:

1. Know the conditions under which performance tests are made. If test conditions which directly affect performance are not known, then it is very difficult to relate the test results to other test environments. For example, the memory cache facility is an important performance feature of the VAX. The timing values obtained from any test on the VAX are strongly influenced by how the memory cache is utilized. In the system response tests reported here, the memory cache was flushed before each test.
In the instruction execution rate, disk I/O rate, and DECnet load tests, the memory cache was very efficiently used, since the test programs consisted primarily of small loops.

2. Be aware of the impact of the SWAPPER and DECnet. When the SWAPPER is writing the modified list to disk and when DECnet is transferring user messages, real time processes can be severely impacted, since the SWAPPER and DECnet perform these tasks at elevated IPL.

3. Be aware of the overhead of FORTRAN READ/WRITE statements. The efficiency of these statements can very easily be abused, as seen in the disk I/O test when an implied DO list was used in the FORTRAN WRITE statement.

7. References

(1) Levy, H. M. and Lipman, P. H., "Virtual Memory Management in the VAX/VMS Operating System", Computer, March 1982

INPUT/OUTPUT
A SIG Information Interchange

A form for INPUT/OUTPUT submissions is available at the back of the issue.

INPUT/OUTPUT 124

Caption: RJE between VAXes

Message: I am looking for information on an RJE capability using DECnet between two VAX systems. We currently have a 780 on which we run four batch queues in support of a large chemical engineering application. Each queue is single streamed; CPU time limits are stair-stepped on each queue to segregate the jobs based on run time requirements. In May a 750 will become available to us, and we would like to off-load the batch queues to it. What we would like to do is set up an RJE capability between the 780 and the 750 using DECnet. SUBMIT/REMOTE does not offer enough capabilities with respect to parameter passing, returning output, or use of other than the SYS$BATCH queue. Have any sites implemented such a capability between VAXes? Are there any products, DEC or otherwise, that provide this capability?

Contact: H. C.
Braesicke
Celanese Chemical Company, Technical Center
Box 9077
Corpus Christi, Texas 78408
(512) 241-2343 x4276

Date: March 21, 1983

INPUT/OUTPUT 125

Caption: Auto-call feature on DF03-AC modem

Message: Has anyone had any luck with software that can run a DF03? We are trying to get ours to call out by sending appropriate CTRL and character strings. We've had little or no luck so far and would appreciate any help.

Contact: Chris Yetman
Teradyne, Incorporated
179 Lincoln Street, M.S. 37
Boston, MA 02111
(617) 482-2700 x3151

Date: March 28, 1983

INPUT/OUTPUT 126

Caption: Change and Configuration Control

Message: Two change and configuration control environments are available from Softool. Change Control supports code management, automating the control and documentation of changes. It can handle anything: source code, object code, documents... and can deal with any language. Change and Configuration Control also provides full support for configuration control. Both offer automatic reconstruction of previous versions, problem tracking, difference reports, management reports, access control, archiving, compression, encryption, automatic recovery, etc.

Contact: Bruce Hanna
340 South Kellog Avenue
Goleta, CA 93117
(805) 964-0560

Date: March 30, 1983

INPUT/OUTPUT 127

Caption: LISP for VAX - REPLY TO I/O #117

Message: DECUS has a LISP interpreter written in MACRO-11 for RSX systems which I was able to bring on-line on a VAX with minimal changes. Unfortunately, I don't remember exactly what I had to do, and I have changed jobs since then. The DECUS order number is 11-433.

Contact: Chris H.
Ruhnke
1815 North Fort Myer Drive
Arlington, VA 22209
(703) 841-3711

Date: April 4, 1983

INPUT/OUTPUT 128

Caption: Automatic failover for dual redundant VAX/VMS system

Message: We would like information on the description and availability of hardware and software features which have the capability for automatic failover of computers and peripheral devices in a dual redundant VAX/VMS configuration.

Contact: Frank F. Islam
Computer Sciences Corporation
6301 Ivy Lane
Greenbelt, MD 20770
(301) 441-3613

Date: April 5, 1983

INPUT/OUTPUT 129

Caption: Proxy Login (does it work?)

Message: Tried proxy login as shown in Pageswapper. Can get it to work only on the local node but not on remote nodes. Would like to speak with someone who has it working.

Contact: Steve Lipshutz
General Electric, M-2445 VFSC-100
P. O. Box 8555
Philadelphia, PA 19101
(215) 962-1137

Date: April 6, 1983

INPUT/OUTPUT 130

Caption: Command screen editor and shell for VAX/VMS

Message: We have developed a command editor and shell for VMS. With it one may: edit the (DCL) command line using EDT keypad editor commands; recall previous commands for editing/reexecution; recall local/global symbols for editing/execution; concatenate multiple commands (with ); redirect input (with <) and output (with >); pipeline output of one command to the input of another (with !).

Contact: Dan Dill
Chemistry Department
Boston University
Boston, MA 02215
(617) 353-4279

Date: April 7, 1983

INPUT/OUTPUT 131

Caption: Milestones program wanted

Message: We need a milestones/project management program that is in the public domain. Any information will be appreciated.

Contact: Garnik Abrahamian
21255 Califa Street
Woodland Hills, CA 91367
(213) 888-4850

Date: April 11, 1983

INPUT/OUTPUT 132

Caption: Statistical Quality Assessment Software

Message: We are looking for software to handle Dr. W. E.
Deming's methods of statistical quality assessment. We would like the software to have the ability to calculate and plot the control charts used by the "Deming Method". If you have anything at all on this subject, please contact me.

Contact: Dan Danciger
Florida Wire and Cable Company
Post Office Box 6835
Jacksonville, FL 32236
(904) 781-9224

Date: April 15, 1983

INPUT/OUTPUT 133

Caption: LISP on a VAX/VMS machine - REPLY to I/O #117

Message: The USC Information Sciences Institute has implemented the Interlisp language, from Xerox Corporation, on a VAX running VMS. This work was sponsored by DARPA; therefore it is free. To get a copy mail your request to: USC Information Sciences Institute, Interlisp-VAX Project, 4676 Admiralty Way, Marina del Rey, CA 90291.

Contact: Gordon Krauter
LLNL, P. O. Box 808, L-153
Livermore, CA 94550
(415) 423-2836

Date: April 18, 1983

INPUT/OUTPUT 134

Caption: PDP-11 RSX-11M to VAX 11/780 VMS and back task

Message: We are looking for a task to run on a PDP-11/44 (running real time process control applications) to transmit data files to a VAX 11/780 over 1200 bps modems. Also, the task should be capable of retrieving data files on the VAX as needed over the 1200 bps modems.

Contact: Kalidas Madhaupeddi
Phelps Dodge Corporation
Morenci, AZ 85540
(602) 865-4521 x564

Date: April 14, 1983

INPUT/OUTPUT 135

Caption: X.25 link from VAX to Tandem - REPLY TO I/O #102

Message: Last year we installed and tested a direct X.25 link between VAX 11/780 and Tandem computers. On the VAX side we used PSI Rev. 1.2 under VMS 3.0, while the Tandem side, emulating a DCE node, used the X25AM package, Rev. E04. The tests covered process-to-process communication, using both SVC and PVC schemes and both "transparent" and "non-transparent" (Network Process) modes.
Admittedly, we had some minor difficulties in configuring a non-standard, direct (DTE to DCE) link, but no significant system problems were encountered during these tests. In view of the successful completion of these tests, all process-to-process logical links between the VAX and Tandem computers of the new AEP Control Center are currently being developed using the X.25 protocol.

Our tests of X.29 links, on the other hand, were not successful. Here, as a secondary objective, we thought of configuring the Tandem as a concentrator for the development CRT terminals which are currently switchable between these two computers. We are interested in exchanging information on PSI in general, and on configuration parameters for X.29 in particular.

Contact: Mietek Flam
American Electric Power Service Corporation
2 Broadway, Room 901
New York, NY 10004
(212) 440-9349

Date: April 22, 1983

INPUT/OUTPUT 136

Caption: Fortran Access of VMS Virtual Memory

Message: While allocating and deallocating memory from a FORTRAN program is straightforward using LIB$GET_VM and LIB$FREE_VM, the overhead generated when accessing the allocated memory (by passing the address of the allocated area to a function subroutine using the %VAL function) makes use of the virtual memory for writing and reading individual data elements very slow: 11 instructions as opposed to the original one read or write. I am interested in finding some trick in FORTRAN or MACRO that allows access to virtual memory with less overhead. Anyone who has solved this problem will earn my eternal appreciation if they send me a copy of the trick.
Contact: Gregory Aharonian
Higher Order Software
955 Massachusetts Avenue
Cambridge, MA 02139
(617) 661-8900

Date: April 25, 1983

INPUT/OUTPUT Submission Form
A SIG Information Interchange

Please reprint in the next issue of the Pageswapper

Caption: ______________________________________________________

Message: ______________________________________________________
________________________________________________________________
________________________________________________________________
________________________________________________________________
________________________________________________________________
________________________________________________________________
________________________________________________________________
________________________________________________________________
________________________________________________________________
________________________________________________________________

Contact:
Name _______________________________________________________
Address ____________________________________________________
____________________________________________________________
Telephone ____________________________

If this is a reply to a previous I/O, which number? ________

Signature _____________________________ Date ________________

Mail this form to: PAGESWAPPER Editor, DECUS, MRO2-1/C11, One Iron Way, Marlborough, MA 01752, USA

Tear out to submit an I/O item

PAGESWAPPER Editor
DECUS, MRO2-1/C11
One Iron Way
Marlborough, MA 01752
USA

System Improvement Request Submission Form

SIG ref no.
_________                                      Page 1 of _____
________________________________________________________________
Submittor:                      Firm:
Address:                        Phone:
________________________________________________________________
Circle application area(s) most closely related to yours (OEMs circle end use):

Transaction Processing        Business EDP (accounting)
Program Development           Systems Development
General Timesharing           Student Timesharing
Shared Small Applications     Shared Large Applications
Process Control               Word Processing
Large Simulation
________________________________________________________________
System Configuration:
CPU Model:                    System Disk:
Memory Size:                  Average User Load:
Operating System:             Version:
________________________________________________________________
Abstract (Please limit to four lines):
________________________________________________________________
Description (include justification and expected usefulness):
Use additional pages if required

Completed SIR should be returned to: Gary L. Grebus, Battelle Columbus Laboratories, 505 King Avenue, Columbus, Ohio 43201, USA

Tear out to submit an SIR

Gary L. Grebus
Battelle Columbus Laboratories
505 King Avenue
Columbus, Ohio 43201
USA