From: Glenn C. Everhart
Date: 6 June 1997
Re: IRP Stack

For switching to work most efficiently & with least risk I want to add
a context stack to the IRP definition, adding it at the end.

Operation would be as follows:

IRP gets 3 new fields at the end:

IRP$L_CURCSP	Context stack pointer, which is absolute address of the
		currently in use context stack
IRP$L_STKFS	Flags. Initially only one is defined, IRP$V_ONSTACK = 0 (bit 0),
		which when set means that the current context is in the
		in-IRP context stack (for an intercept)
IRP$A_CTXSTK	Area to be used for the context stack. For an initial bogey
		I'm assuming 64 longwords.
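
As a rough illustration of the layout above, here is a minimal C sketch.
The struct names, the stand-in for the existing IRP body, the lw_t type
(a host-sized substitute for a VMS longword, so that a pointer fits) and
the initial flag value of zero are all assumptions made for the example.
It also assumes a stack that grows downward, so an empty stack has the
pointer just past the end of the area.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uintptr_t lw_t;               /* stand-in for one longword */

#define IRP_CTXSTK_LONGS 64u          /* the "initial bogey" size */
#define IRP_M_ONSTACK    ((lw_t)1)    /* mask for IRP$V_ONSTACK (bit 0) */

/* Hypothetical stand-in for the existing IRP; the real layout differs. */
struct irp_old {
    uint16_t size;                    /* structure size word */
    lw_t     pid, ucb, sts, media;
};

/* Old IRP with the three proposed fields appended at the end. */
struct irp_new {
    struct irp_old base;
    lw_t *curcsp;                     /* IRP$L_CURCSP: absolute address */
    lw_t  stkfs;                      /* IRP$L_STKFS: flags */
    lw_t  ctxstk[IRP_CTXSTK_LONGS];   /* IRP$A_CTXSTK: stack area */
};

/* Roughly what the IRP creator would do (assumed): mark the stack empty. */
static void irp_ctx_init(struct irp_new *irp)
{
    assert(irp != NULL);
    irp->base.size = (uint16_t)sizeof *irp;
    irp->curcsp = irp->ctxstk + IRP_CTXSTK_LONGS;  /* empty, grows down */
    irp->stkfs  = 0;                  /* no intercept context saved yet */
}
```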


A typical context to be saved is
  Original irp$l_ucb
  Original irp$l_pid
  Original irp$l_stkfs
  Original irp$l_sts
  Original irp$l_media
  Original irp$l_stkfs
  Value of intercept UCB
  Original irp$l_curcsp (before intercept)

Thus 64 longs is enough for 8 of these intercepts. Hopefully that is enough
for most cases.
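
The arithmetic can be checked with a small C sketch of one saved context.
The struct and macro names are hypothetical, and lw_t is a host-sized
stand-in for a longword. The list above names irp$l_stkfs twice; the
sketch keeps eight slots and simply marks the second one as a duplicate
rather than guessing what else might belong there.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uintptr_t lw_t;               /* stand-in for one longword */

#define IRP_CTXSTK_LONGS 64u

/* One saved context, in the order listed above (names are illustrative). */
struct ctx_frame {
    lw_t ucb;        /* original irp$l_ucb */
    lw_t pid;        /* original irp$l_pid */
    lw_t stkfs;      /* original irp$l_stkfs */
    lw_t sts;        /* original irp$l_sts */
    lw_t media;      /* original irp$l_media */
    lw_t stkfs_dup;  /* second irp$l_stkfs entry, kept as listed */
    lw_t icpt_ucb;   /* value of intercept UCB */
    lw_t curcsp;     /* original irp$l_curcsp (before intercept) */
};

#define CTX_FRAME_LONGS (sizeof(struct ctx_frame) / sizeof(lw_t))
#define CTX_MAX_FRAMES  (IRP_CTXSTK_LONGS / CTX_FRAME_LONGS)

/* Eight longwords per frame, so 64 longwords hold 8 frames. */
static_assert(sizeof(struct ctx_frame) == 8 * sizeof(lw_t),
              "a frame is expected to be 8 longwords");
```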

To use this, intercept code checks

* IRP size is long enough to hold the new fields
* IRP$L_CURCSP points within the IRP

If both of these are true (optionally also if a new IRP status bit is
set to say this is a new format one but I think that can be dispensed
with) the IRP is considered a new format one with a context stack.
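
A minimal C sketch of the two checks, under stated assumptions: the
struct layouts and function names are invented for the example, lw_t is
a host-sized stand-in for a longword, and the stack is assumed to grow
downward so that a pointer one past the end of the area (an empty stack)
counts as valid.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

typedef uintptr_t lw_t;               /* stand-in for one longword */
#define CTXSTK_LONGS 64u

struct irp_old { uint16_t size; lw_t pid, ucb; };   /* hypothetical */
struct irp_new {
    struct irp_old base;
    lw_t *curcsp;                     /* IRP$L_CURCSP */
    lw_t  stkfs;                      /* IRP$L_STKFS */
    lw_t  ctxstk[CTXSTK_LONGS];       /* IRP$A_CTXSTK */
};

/* True only if the IRP is long enough AND its CSP points into itself. */
static bool irp_is_new_format(const void *p)
{
    assert(p != NULL);
    const struct irp_old *hdr = p;
    if (hdr->size < sizeof(struct irp_new))
        return false;                 /* too short: never read new fields */
    const struct irp_new *irp = p;
    uintptr_t lo  = (uintptr_t)irp->ctxstk;
    uintptr_t hi  = (uintptr_t)(irp->ctxstk + CTXSTK_LONGS);
    uintptr_t csp = (uintptr_t)irp->curcsp;
    return csp >= lo && csp <= hi;    /* hi itself means "empty stack" */
}

/* A naive byte-for-byte copy, as an unaware driver might make. */
static void irp_copy_raw(struct irp_new *dst, const struct irp_new *src)
{
    memcpy(dst, src, sizeof *dst);    /* CSP still points into src */
}
```

In this sketch a raw copy fails the check because its CSP still holds an
address inside the original, and it passes again only once the CSP is
reset to point into the copy.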

When saving context, the intercept checks that there's enough room for
a new context on the IRP's stack and if so saves it in the IRP. If
there is not room, it clears IRP$V_ONSTACK and allocates a separate
structure (I have code using a separate list and code using an IRPE)
and links it somewhere. The separate structure needs to have the original
value of IRP$V_ONSTACK saved too, so it can be restored. Old format
IRPs also use separate structures. We must catch post processing and
steal irp$l_pid so we can clean our intercept up.
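
The save path might look roughly like the following C sketch. It is an
illustration only: the names, the malloc-backed side list (standing in
for pool allocation or an IRPE) and the return codes are assumptions,
and it presumes the IRP has already passed the new-format checks. The
stack is assumed to grow downward.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

typedef uintptr_t lw_t;               /* stand-in for one longword */
#define CTXSTK_LONGS 64u
#define FRAME_LONGS  8u
#define M_ONSTACK    ((lw_t)1)        /* mask for IRP$V_ONSTACK */

struct irp {                          /* hypothetical, trimmed-down IRP */
    lw_t  ucb, pid, sts, media;
    lw_t *curcsp;
    lw_t  stkfs;
    lw_t  ctxstk[CTXSTK_LONGS];
};

struct side_ctx {                     /* separate structure for overflow */
    struct side_ctx *next;
    struct irp      *irp;
    lw_t             frame[FRAME_LONGS];
};

/* Fill one context in the order given earlier in this memo. */
static void fill_frame(lw_t *f, const struct irp *irp, lw_t icpt_ucb)
{
    f[0] = irp->ucb;   f[1] = irp->pid;   f[2] = irp->stkfs;
    f[3] = irp->sts;   f[4] = irp->media; f[5] = irp->stkfs;
    f[6] = icpt_ucb;   f[7] = (lw_t)irp->curcsp;
}

/* Returns 1 if saved in the IRP, 0 if saved separately, -1 on failure. */
static int ctx_save(struct irp *irp, lw_t icpt_ucb, struct side_ctx **list)
{
    assert(irp->curcsp >= irp->ctxstk &&
           irp->curcsp <= irp->ctxstk + CTXSTK_LONGS);
    if ((size_t)(irp->curcsp - irp->ctxstk) >= FRAME_LONGS) {
        lw_t *f = irp->curcsp - FRAME_LONGS;   /* room: push downward */
        fill_frame(f, irp, icpt_ucb);          /* saves old stkfs too */
        irp->curcsp = f;
        irp->stkfs |= M_ONSTACK;
        return 1;
    }
    struct side_ctx *s = malloc(sizeof *s);    /* no room: go separate */
    if (s == NULL)
        return -1;
    fill_frame(s->frame, irp, icpt_ucb);       /* keeps original ONSTACK */
    s->irp  = irp;
    s->next = *list;
    *list   = s;
    irp->stkfs &= ~M_ONSTACK;
    return 0;
}
```

The point of the sketch is that the common path touches only the IRP
itself; allocation and the later search of the side list happen only on
overflow (or for old format IRPs, which are not shown).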

If anything else intercepts us, new contexts get stored lower on the
stack. 

When anything intercepting us finishes, its intercept will run first
since IRP$L_PID will be left pointing at it. It will restore the
context pointer, so when we get control at i/o post time, which we
must, our context is on top of stack. We must clear it off before
posting to the next lower layer. (An intercept is presumed to insert
itself to grab control before whatever was there when it was inserted
and must pass control on when done.)

Therefore a stack is the sensible construct.

Suppose we have 3 intercepts I1, I2, and I3 intercepting driver D
in that order.

Control goes

I3, I2, I1, D

I3 pushes original context, sets irp$l_pid to P3, its post proc.
I2 pushes I3 context, sets irp$l_pid to P2, its post proc
I1 pushes I2 context, sets irp$l_pid to P1, its post proc
Driver D does its thing

IRP goes to driver postproc which calls P1 (irp$l_pid hook)
P1 does its thing, pops I2 context, and its REQCOM gets to P2
P2 does its thing, pops I3 context, and its REQCOM gets to P3
P3 does its thing, pops original context, and its REQCOM finally completes
	the IRP.
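
The ordering above can be simulated with a short C sketch. Everything
here is a simplification for illustration: post procs are represented
by small integer ids (3, 2, 1) rather than addresses, only the
irp$l_pid slot of each saved context is filled in, and a loop stands in
for each post proc passing control on through REQCOM.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uintptr_t lw_t;               /* stand-in for one longword */
#define CTXSTK_LONGS 64u
#define FRAME_LONGS  8u
#define FRAME_PID    1u               /* slot holding the saved irp$l_pid */

struct irp {                          /* hypothetical, trimmed-down IRP */
    lw_t  pid;
    lw_t *curcsp;
    lw_t  ctxstk[CTXSTK_LONGS];
};

/* Intercept entry: push the current context, hook post-processing. */
static void intercept_start(struct irp *irp, lw_t post_id)
{
    assert((size_t)(irp->curcsp - irp->ctxstk) >= FRAME_LONGS);
    lw_t *f = irp->curcsp - FRAME_LONGS;
    for (size_t i = 0; i < FRAME_LONGS; i++)
        f[i] = 0;
    f[FRAME_PID] = irp->pid;          /* whoever was hooked before us */
    irp->curcsp  = f;
    irp->pid     = post_id;
}

/* Intercept post proc: pop our context, restore the previous hook. */
static lw_t intercept_post(struct irp *irp)
{
    lw_t ran = irp->pid;              /* the post proc now running */
    irp->pid = irp->curcsp[FRAME_PID];
    irp->curcsp += FRAME_LONGS;
    return ran;
}

/* Runs I3, I2, I1, D, then post-processing; records the hook order. */
static size_t run_chain(struct irp *irp, lw_t order[3])
{
    size_t n = 0;
    irp->curcsp = irp->ctxstk + CTXSTK_LONGS;   /* empty stack */
    intercept_start(irp, 3);          /* I3 pushes original context */
    intercept_start(irp, 2);          /* I2 pushes I3 context */
    intercept_start(irp, 1);          /* I1 pushes I2 context */
    /* driver D does its thing here */
    while (irp->curcsp != irp->ctxstk + CTXSTK_LONGS)
        order[n++] = intercept_post(irp);
    return n;
}
```

In the simulation the post procs run in the order P1, P2, P3, and the
original irp$l_pid value is back in place once the stack is empty.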


WHY NOT USE OFFSET IN IRP$L_CURCSP?
 If we use an offset, we cannot tell if an IRP context stack is
real, or if the IRP is an edited copy. In general an intercept
must send and expect a normal IRP flow...one in, same one out.
This is how the whole I/O system works. What is sent to drivers
by special cases like shdriver is still individual IRPs that get
treated like normal ones from anywhere else. 

We don't in general want intercepts above such drivers to be seeing
extra IRPs from such pseudodrivers beneath; they want to see the
original IRP only. Hence we don't use an offset. Requiring the
address to be inside the IRP also makes it harder to mistake a cloned
IRP for a new format one. Where we find an invalid context stack
pointer, we should treat the context stack as empty if the IRP is
long enough and some intercepts are below the level of the shdriver
or similar driver.

CAN THIS STUFF BE SPOOFED?

A simple copy of an IRP will have the wrong context stack pointer
(CSP) address so can be rejected. A copy of the first part of a new
IRP is unlikely to have the context stack pointer right, and should
have the wrong (too short) size. The size would reject it. A newly
cloned short IRP would be too short also.

If someone allocates an old-size IRP buffer and copies the first part
of a new larger IRP into it INCLUDING THE SIZE, the buffer size will
be invalid and cause trouble when deallocating. THIS IS BROKEN BEHAVIOR
IN ANY CASE and deserves to be found out. I don't believe anything
actually behaves like this, but even if it does, it needs to be
cleaned up.

One could add a new flag in addition, but I believe it gains nothing
significant over requiring the CSP and length to be the new values
and dislike using up yet another status bit.

While this is point release time for OVMS, it may also be that there
won't be another .0 release. Best to get things fixed up now.


WILL THIS INTERFERE WITH EXISTING THINGS?
No. The new material is all past the old end of the IRP. Only a few
changes to sysqioreq would be needed to initialize the context stack
pointer and the valid flag. Anything else would produce an IRP that
was of the old type until it was made aware of the new behavior. As
has been said, the only usage that gets in trouble is one that
copies an incorrect size word, which is arguably broken anyway.

WILL THIS HELP OTHER THINGS?
Yes. Besides switching, EDO wants to use this for snappy disk (they
have TWO levels of interception!). Besides that I have furnished
some code to the Galaxies folks which works in this way and would be
able to profit from this.

The two slowdowns in adding I/O processing in layers have been 
  1. The need to get an IPL 4 interrupt and then fork back to
	grab postprocessing control. That's being fixed separately, and
  2. The need to allocate some structure to hold intercept context and
	search for it when needed at posting time.

This proposal addresses the second need. Switching paths needs it
now to avoid a performance hit. With it in place, the time spent on
pool allocation, deallocation, and searching can be avoided.
Without it, the context can be stored only separately and switching
will be slower.  The other products will take a hit.

BUT THIS IS A POINT RELEASE
Yes, it's a point release. However it has been remarked that VMS may
not have anything but point releases in the future. And the logic
here doesn't mean any driver need be rebuilt. Most drivers won't
even notice this, and drivers that create entire new IRPs, or copy
IRPs, will create IRPs that follow the old rules unless they are
modified. The needed checks mentioned above (which are needed in all
cases anyway for an intercept, to account for context stack
overflow) will detect that these are old style IRPs and give correct
operation.

SHOULD A NEW FLAG BE IN IRP$L_STS2 TO SAY IT'S A NEW FORMAT IRP?
I disfavor adding new flag bits if they aren't needed. We need to
check length and CSP validity anyway and it isn't clear that a new
bit adds much. It would not be there for old apps that create brand
new IRPs but copied IRPs would tend to have it even if they weren't
fully copied.

Note finally that a cloned IRP CAN be made correct as a new format
one, if a driver so wishes, by resetting the context stack pointer
(CSP) in the new IRP that is copied. Any driver that did this would
need to be aware of some intercept characteristics though, which is
improbable. The base abstraction is the original driver interface
which operates on the same UCB throughout. Where this is not
followed, it is a simple rule to state that intercepts need to keep
away from drivers that use different IRPs.

Of course, if there is no intercept in place using the new IRP
contents, there is no behavior change at all.