units
       branch
       integer
       floating point
on 601
       issue at most one per unit per cycle
       eight entry instruction queue
       can fill queue from cache in one clock cycle
               loads from requested address to end of cache block
pipeline
       prefetch
               includes ins. cache access cycles
       decode
       execute
       writeback

fpu
       IQ[3210] → fpu buffer/decode [≥1 cycle] → execute 1 → execute 2 → writeback
iu
       IQ0/decode → buffer [if exec busy] → execute [hold for dependency] →
                                       circulate in load/store
                                       writeback
bpu
       IQ[3210] → decode/execute → writeback

notes
       address calculation must complete before stored value enters write buffer