;-*- Mode:Text -*-

;forked 5/23/86 ...
; this is a tentative, imperfect spec for call-hardware with
; return-then-open, return-then-t-open etc.
; I think the timing is just too hairy.

;current state: instructions fit together OK, except for
; overlapping clocks at end and start of adjacent ops,
; and open and open-call need to be delayed one cycle.

;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;


23 bit PC: 8MW, 64MB instructions
do 22-24 if going through map; else 21 is enough for 2MW.
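
A quick arithmetic check of the sizes above. The 8-byte instruction word is
not stated in these notes; it is inferred here from 64MB / 8MW. The helper
name words_addressable is made up for illustration.

```python
MEG = 1 << 20  # "M" is taken to mean 2**20 here

def words_addressable(pc_bits):
    """Number of instruction words a PC of this width can address."""
    return 1 << pc_bits

# 23-bit PC: 8MW
pc23_words = words_addressable(23)

# 64MB spread over 8MW implies 8 bytes per instruction (inferred, not stated)
bytes_per_instruction = (64 * MEG) // pc23_words

# 21 bits is enough for 2MW
pc21_words = words_addressable(21)

print(pc23_words // MEG, "MW;", bytes_per_instruction, "bytes/instruction;",
      pc21_words // MEG, "MW")
```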

to-do:

1. static low memory, short calls, open-call, t-open-call

2. select old / new frame addresses for destinations
	bit select new value / last value for dest
	always previous value for source
	last-value is only ref'd as dest; restarted instruction
	will restore last value during source, so last-value
	pipeline does not need back-up hardware.

return PC saved in Hram:
	PC+1 or old Hpc
	note that output must be valid at end of cycle that writes it.

static low ram; call-0: in parallel with normal alu op.
	8-10 bit vector
	share PC mux inputs with traps?
	single valid bit for whole area

current 4-bit continuation-code can't afford decoder time for
selecting PC mux ... can PC mux select be part of IR? (for all but force-trap)

check address setup time for Hram and freelist writes.

timing of O/A/R old / new select
	have to select old/new early?
	reg ram address setup time?
...
	source is the critical one; since it's always new O/A/R then
	at worst that can be hardwired?
	dest requires old/new, but actual reg address is delayed two clocks
	past source.
	Ret-dest vs. dest frame select: selected late; pipeline reg
	delays reg addr for normal dest, but ret-dest is selec

retd-return-then-t-open:
	open is not available in time for source for next instruction;
	see modified t-open that adds temp-open so open can clock early.

;;;;

call hardware operations start on the mid-instruction 35ns clock;
some complete in one half-cycle, others take both.
do any take more than two cycles?

all instructions get old frames as sources?
by "default", destinations get the new frames.
sometimes want old frames as destinations?
next instruction always gets new frames.
return-destination always gets restored state (state as of before the call).
return-then-open: what frames for destination? (between return and open vs. after open?)

try bidirectional / tri-state paths for Hram and frame passing

;;;;

ALU requires a DEST field:
	N-bits for frame-select (open, active, return, global ...)
	4 bits for offset within frame
CALL also requires a return-destination:
	N-bit frame-select; 4-bit offset.
	RETURN uses return-destination frame and offset as destination
	instead of normal DEST.

reg addr mux select input was using clock to select source / dest fields;
still select source addrs during read cycle,
now select dest or ret-dest during write.
normal dest is always from IR;
ret-dest is from ret-dest ram output, possibly through ret-dest reg.
use ret-dest reg for indirect dest?

brute-force:
reg addr mux was two-input, selected by clock.  source / dest reg addr.
now is four-input, selected by clock and ch-op.  source / source / dest / ret-dest ?
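
The four-input select can be sketched as a truth table. This is only one
reading of the "source / source / dest / ret-dest ?" line: it assumes the
clock distinguishes read cycle from write cycle and that the ch-op bit means
"this is a return". The function and argument names are hypothetical.

```python
def reg_addr_mux(write_cycle, ch_op_is_return):
    """Return the name of the field that drives the register address.

    write_cycle     -- False during the read cycle, True during the write cycle
    ch_op_is_return -- True when the call-hardware op is a return (assumption)
    """
    if not write_cycle:
        # read cycle: source address, whatever the ch-op is (inputs 0 and 1)
        return "source"
    # write cycle: normal dest from IR, or ret-dest for a return
    return "ret-dest" if ch_op_is_return else "dest"

# enumerate all four mux inputs
for write_cycle in (False, True):
    for is_return in (False, True):
        print(write_cycle, is_return, reg_addr_mux(write_cycle, is_return))
```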

dest is already delayed two clocks from IR to write-cycle;
ret-dest is popped from Hret-dest by return instruction.
dest is from IR of return instruction; delay twice to get to write cycle.

that ret-dest will be selected is known at IR of return instruction,
but ret-dest value is not popped from Hret-dest until clock at end of IR cycle,
so it is really delayed only one clock.

return frame is only valid until next open (same frame is allocated again);
would a previous-active option for the active frame reg work OK?

;;;;;;;;;;;;;;;;

Frames available as source and destination bases:

Open:
	Frame to store arguments for function that is about to be called.

Active:
	Frame for current function local variables and the arguments
	the function was called with.

Return:
	Frame in which to store values to be returned by the active function.

Global:
	Frame number comes from instruction.

and possibly:

Return-destination:
	Frame and offset come from H(active)-return-dest.
	(no latch)

Indirect:
	Frame and offset come from a previously computed value.

;;;;;;;;;;;;;;;;

Function-call operations:

Open:
	Create a new open frame in preparation for a function call.

	1. allocate a new frame and save open and active in it.
	2. make the new frame be the open frame.

	Source: before (open frame should not be sourced)
	Dest: after

	asserted during second cycle:
		H(free) <- active, open
	clocked at end of second cycle:
		open <- free
		increment free-ptr
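
The register transfers for Open can be sketched in software. This is a
behavioral model only; it ignores the half-cycle timing. It assumes "free"
means the frame number at free(free-ptr), and the dict layout and initial
state are made up for illustration.

```python
def make_state():
    # hypothetical initial state: frame 0 is both open and active
    return {"open": 0, "active": 0,
            "free": [1, 2, 3], "free_ptr": 0,   # freelist RAM and its pointer
            "hram": {}}                          # Hram, indexed by frame number

def do_open(s):
    new = s["free"][s["free_ptr"]]               # "free": frame at head of freelist
    # asserted during second cycle: H(free) <- active, open
    s["hram"][new] = {"active": s["active"], "open": s["open"]}
    # clocked at end of second cycle: open <- free; increment free-ptr
    s["open"] = new
    s["free_ptr"] += 1

state = make_state()
do_open(state)
print(state["open"], state["active"], state["hram"][state["open"]])
```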

Call:
	Call a function.  Make the current open frame become the active frame.

	Can be followed by retd-return-then-t-open; must address Hram with
	new active frame during R cycle.

	IR bits for one source and for ALU operation are used for PC;
	ALU op or external path is wired to connect source and dest.
	(use one source and byte op; not ALU op)
	Also need return-destination from IR.

	1. copy open frame pointer to active frame. (active was saved in OPEN)
	2. save the (return) PC and return-destination in the H(active). (new active)
	3. jump (set pc) to the code for the function to be called.

	Source: before (for new active, use open?)
	Dest: after; maybe before?

	clocked at start of first cycle:
		PC mux selects short or long PC from IR

	clocked at start of second cycle:
		active <- open
	asserted during second cycle:
		H(active) <- PC+1, IR:ret-dest  (OK if "reading" Hram gets this)
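
A behavioral sketch of Call, ignoring cycle timing. It assumes an Open has
already saved open and active in H(open), so Call only adds ret-PC and
ret-dest to that same entry. The dict layout, the PC values and the
("active", 3) ret-dest encoding are made up for illustration.

```python
def do_call(s, target_pc, ret_dest):
    # clocked at start of second cycle: active <- open
    s["active"] = s["open"]
    # asserted during second cycle: H(active) <- PC+1, IR:ret-dest  (new active)
    entry = s["hram"].setdefault(s["active"], {})
    entry["ret_pc"] = s["pc"] + 1
    entry["ret_dest"] = ret_dest
    # PC mux selects the call target from IR
    s["pc"] = target_pc

# hypothetical state just after an Open: frame 1 is open, frame 0 is active
state = {"pc": 100, "open": 1, "active": 0,
         "hram": {1: {"active": 0, "open": 0}}}
do_call(state, target_pc=500, ret_dest=("active", 3))
print(state["active"], state["pc"], state["hram"][1])
```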

Open-call:
	Create a new open frame, make it the active frame and call a function.
	(Intended for fast calling of single-argument functions.)
	(Only legal for short-call?)
**	Requires IR bits for ALU, Source-1, Source-2, Dest, Ret-dest, Short-PC.

	Can be followed by retd-return-then-t-open; must address Hram with
	new active frame during R cycle.

	1. allocate a new frame and save open, active, ret-PC and ret-dest in it.
	2. set open and active to the new frame.
	3. jump (set PC) to the function to be called.

	Source: before
**	Dest: after; maybe before?

	clocked at start of first cycle:
		PC mux selects short or long PC from IR

	asserted during second cycle:
		H(free) <- open, active, PC+1, IR:ret-dest  (same as new active)
	clocked at end of second cycle:
		open <- free
**		active <- free  (delayed, active <- open)
		increment free-ptr
1. delay so that Hram is used only during next first cycle.
2. shuffle so that Hram is addressed by active, using activetemp to save active for write.

Tail-recursive-open:
	Copy the current open frame state into a new frame, in preparation for
	a tail-recursive call, which will throw away the active frame as
	if returning; then start a new call by copying open to active.

	1. copy H(active) into open, temp-active, temp-pc and ret-dest.
	2. allocate a new frame and save open, temp-active, temp-pc and ret-dest in it.
	3. make the new frame be the open frame.

	Source: before (open should not be sourced)
**	Dest: after

	asserted during second cycle:
		H addressed by active
	clocked at end of second cycle:
		open, temp-active, temp-pc, ret-dest <- H(active)
	asserted during next first cycle:
		H(free) <- open, temp-active, temp-pc, ret-dest
	clocked at end of next first cycle:
		open <- free
		increment free-ptr

(alt, for retd-ret-then-t-open, so open is valid in time to be used as source)
	asserted during second cycle:
		H addressed by active
	clocked at end of second cycle:
		open <- free
		temp-open, temp-active, temp-pc, ret-dest <- H(active)
	asserted during next first cycle:
		H(free-or-open) <- temp-open, temp-active, temp-pc, ret-dest
	clocked at end of next first cycle:
		increment free-ptr

Tail-recursive-call:
	Call a function.  Discard the active frame as if returning, and make the
	current open frame become the active frame.

	1. push active frame pointer onto frame freelist
	2. copy open to active.
	3. jump (set pc) to the code for the function to be called.

	Source: before (for new active, use open)
**	Dest: after

	clocked at start of first cycle:
		PC mux selects short or long PC from IR
	clocked at start of second cycle:
		decrement free-ptr
	asserted during second cycle:
		free(free-ptr) <- active
	clocked at end of second cycle:
		active <- open

Return:
	Return, and invoke call-hardware operation from H(active) ret-dest op.
	Dest for this instruction is H(active) ret-dest dest and offset.

**	Requires that Hram ret-dest be addressed by active during second half
	of IR cycle, so that Hram ret-PC can be selected for next fetch.
**	Can Hram ret-dest access for ch-ops use just one cycle, so PC
	mux can get it on time?
	Need Hram ret-dest in time to start indirect op on time.

**	asserted during second half of IR cycle:
		Hram ret-PC and ret-dest addressed by active (new active for preceding cycle)
	clocked at start of decode cycle:
		PC mux selects return PC from Hram
		next call-hardware op from H(active) ret-dest op (use 1/2 cycle later)

Retd-return: (invoked indirectly)
	Restore the frame environment to that prior to the preceding CALL.
	Makes return values in old active frame addressable through return frame.

	1. push return onto frame freelist.
	2. copy active to return.
	3. pop open, active, PC and ret-dest from H(active)

	Source: before
	Dest: after; use Hret-dest.

(ret-PC and selecting this ch-op done by "return")
	clocked at start of second cycle:
		decrement free-ptr
	asserted during second cycle:
		free(free-ptr) <- return
		Hram addressed by active
	clocked at end of second cycle:
		return <- active
		dest-select, active, open <- H(active)   (ret-pc was already used)
		;Hram ret-dest goes to dest-select latch inputs
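
A behavioral round trip (open, call, retd-return) showing the frame
environment restored to its state before the call, with the old active frame
reachable as the return frame. Cycle timing is ignored; transfers clocked on
the same edge are modeled by reading H(active) before any register changes.
The dict layout and all frame and PC numbers are made up for illustration.

```python
def do_open(s):
    new = s["free"][s["free_ptr"]]
    s["hram"][new] = {"active": s["active"], "open": s["open"]}  # H(free) <- active, open
    s["open"] = new                                              # open <- free
    s["free_ptr"] += 1                                           # increment free-ptr

def do_call(s, target_pc, ret_dest):
    s["active"] = s["open"]                                      # active <- open
    s["hram"][s["active"]].update(ret_pc=s["pc"] + 1,            # H(active) <- PC+1,
                                  ret_dest=ret_dest)             #   IR:ret-dest
    s["pc"] = target_pc

def do_retd_return(s):
    h = s["hram"][s["active"]]            # Hram addressed by active
    s["pc"] = h["ret_pc"]                 # ret-PC selection is done by "return"
    s["free_ptr"] -= 1                    # decrement free-ptr
    s["free"][s["free_ptr"]] = s["ret"]   # free(free-ptr) <- return
    s["ret"] = s["active"]                # return <- active
    s["dest_select"] = h["ret_dest"]      # dest-select, active, open <- H(active)
    s["active"], s["open"] = h["active"], h["open"]

state = {"pc": 100, "open": 0, "active": 0, "ret": 1, "dest_select": None,
         "free": [2, 3, 4], "free_ptr": 0, "hram": {}}
do_open(state)
do_call(state, target_pc=500, ret_dest=("active", 3))
do_retd_return(state)
print(state["open"], state["active"], state["ret"], state["pc"], state["dest_select"])
```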

Retd-return-then-open:
	Do return followed by open.

	1. push return onto frame freelist.
	2. copy active to return.
	3. pop open, active, PC and ret-dest saved by preceding CALL.
	4. save ret-dest frame and offset in dest-select pipeline

	5. allocate a new frame and save open and active in it.
	6. make the new frame be the open frame.

	Source: before (open frame should not be sourced)
	Dest: after

   (return)
	clocked at start of second cycle:
		decrement free-ptr
	asserted during second cycle:
		free(free-ptr) <- return
		Hram addressed by active
	clocked at end of second cycle:
		return <- active
		active, open, dest-select <- H(active)  (ret-pc was already used)
   (open)
	asserted during next first cycle:
		H(free) <- active, open
	clocked at end of next first cycle:
		open <- free
		increment free-ptr

Retd-return-then-tail-recursive-open:
	Do return followed by tail-recursive-open.

	Can only be followed by open and t-call;
	check that those do not use their R cycle.

   (return)
	1. push return onto frame freelist.
	2. copy active to return.
	3. pop open, active, PC and ret-dest saved by preceding CALL.
	4. save ret-dest frame and offset in dest-select pipeline
   (t-open)
	5. copy H(active) into open, temp-active, temp-pc and ret-dest.
	6. allocate a new frame and save open, temp-active, temp-pc and ret-dest in it.
	7. make the new frame be the open frame.

   (return)
	clocked at start of second cycle:
		decrement free-ptr
	asserted during second cycle:
		free(free-ptr) <- return
		Hram addressed by active
	clocked at end of second cycle:
		return <- active
		active, open, dest-select <- H(active)  (ret-pc was already used)
   (t-open)
	asserted during next first cycle:
		Hram addressed by active
	clocked at end of next first cycle:
		open, temp-active, temp-pc, ret-dest <- H(active)
	asserted during next second cycle:
(Hram not addressed by active, but Hram output regs contain correct data)
		H(free) <- open, temp-active, temp-pc, ret-dest
	clocked at end of next second cycle:
		open <- free
		increment free-ptr

cancel-open-frame:
	Undo an open or a t-open.

	1. push the open frame on the freelist.
	2. restore active and open from H(open).

	clocked at start of second cycle:
		decrement free-ptr
		;;active <- open
	asserted during second cycle:
		free(free-ptr) <- open
		Hram addressed by open  ;;active
	clocked at end of second cycle:
		open, active <- H(open)  ;;active

;;;;;;;;;;;;;;;;


call-hardware operations executed (only) from IR:
	no-op
	open
	call
	open-call
	t-open
	t-call
	t-open-call
	cancel-open
	return (only to invoke indirect return op from return-destination)
	jump
	open-jump
	t-open-jump
maybe open-jump and t-open-jump are only short jumps.
(?) all jumps are conditional; call-hardware operations always happen.
NOTE: conditional jumps must be -ex-next or -ex-next-ex-next.
  only unconditional jumps to constant or precomputed addresses can
  take effect in the next cycle.

call-hardware operations executed (only) from return-destination:
	retd-return
	retd-return-then-open
	retd-return-then-t-open

call-hardware operations that specify a new PC in IR:
	call / short-call
	open-call / short-open-call
	t-call / short-t-call
	t-open-call / short-t-open-call
	jump / short-jump
short-call and short-open-call are same ch ops as call and open-call;
short- just selects PC mux and ALU op differently.
short...call calls page 0 with 8-bit offset (shifted left?)
short-jump jumps within page.

call-hardware operations that specify a return-destination in IR:
	call
	open-call

call-hardware frame-select inputs:
	open
	active
	return
	global
old-frame-state select inputs:
	three possible states:
		before, in-progress, after
	but only one or two are ever meaningful.
	mux selects old / new;
	regs are clocked differently for each op,
	so old / new are the two possibilities.
	Note that old / new for source vs. dest may be different.
	old - source is always "before";
	new - dest is always "after".
	hopefully ... new source is one clock past old source,
	old dest is one clock before new dest.
	return-then-...

;;;;

PC mux select slop should be good enough for return-NOW.
... also for normal select of PC+1.

change RETURN protocol to free the returned frame before the
next time return is loaded, instead of freeing it as return
is loaded.

;;;;

illegal call-order combinations cause trap (and no-op).

;;;;

call hardware operations use at least R and 1st half of ALU.
return requires Hram addressed by active during 2nd half
of IR (R of preceding).

if Hram(active) is assumed during IR cycle, does active have
to be restored more than two instructions back (for trap)?

alternatives:
0. separate addressing for Hram open/active/ret-d vs. ret-PC
1. shuffle cycles within current Decode-Read-ALU1.
2. shift ch time to Read-ALU1-ALU2.
3. compiler adds no-op if preceding op doesn't assert correct H(active).
4. call-hardware forces no-op if preceding op doesn't assert correct H(active).

5. delete retd-return-then-t-open.

remember that H(active) for return-PC must be NEW active for preceding op.

call-order may be in our favor.

;;;;

Instructions:
	call
	open-call
	t-call
precede:
	retd-return
	retd-return-then-open
	retd-return-then-t-open
and must assert the correct ret-dest and ret-pc during the R cycle.
(simple: assert H(new-active); ok: at least have correct outputs)

Instruction:
	retd-return-then-t-open
is a 3-cycle call-hardware op and is followed by:
	open
	open-call
	t-call
which must not use the R cycle.
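
A sketch of how the successor constraint above could be checked over a
sequence of call-hardware ops, for example in an assembler or simulator. The
allowed set is the three-op list given just above; the table name, function
name and the "index of first trapping op" convention are hypothetical. Only
this one rule is encoded.

```python
# ops allowed immediately after a given op; ops not listed are unconstrained here
ALLOWED_AFTER = {
    "retd-return-then-t-open": {"open", "open-call", "t-call"},
}

def first_illegal(ops):
    """Return the index of the first op that breaks the rule, or None."""
    for i in range(1, len(ops)):
        allowed = ALLOWED_AFTER.get(ops[i - 1])
        if allowed is not None and ops[i] not in allowed:
            return i   # this op would trap (and act as a no-op)
    return None

print(first_illegal(["retd-return-then-t-open", "t-call"]))
print(first_illegal(["retd-return-then-t-open", "call"]))
```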

;;;;

current problem:
H(active) during R before return is likely,
but WHICH active is not. ... must be "new" active for that op.

screw is
	retd-return-then-t-open
	open-call
	retd-return-then-t-open

or
	retd-return-then-t-open
	t-call
	retd-return-then-t-open

open-call and t-call can trust retd-return-then-t-open to
assert H(active) during next-R, and can assert it anyway
if not preceded by retd-return-then-t-open.

;;;;

retd-return-then-t-open now asserts correct ret-dest and ret-pc
during its second R cycle, but is this state destroyed and/or restored
for a trap?