# HIGH LEVEL HARDWARE

# **ORION 1/05**

The ORION 1/05 processor is an advanced high-performance 32-bit computer architecture which draws on mainframe/supercomputer techniques to provide outstanding performance at reasonable cost. The CPU includes an IEEE 754 standard floating-point unit and two combination cache-plus-memory management units, one for data and one for instructions. It makes extensive use of concurrency, including a sophisticated pipeline and separate buses for instructions and data.

The processor is complemented by a range of autonomous microprocessor-based I/O subsystems that connect to the 32 Mbyte/sec ORION synchronous backplane. To these can be attached a variety of peripherals including disks from 168 to 474 Mbytes, 1/4 inch cartridge and 1/2 inch reel-to-reel tape drives, and the ORION LaserWriter. Communications capabilities include support for Ethernet, RS232-C and X.25.

Software includes High Level Hardware's implementation of the Berkeley 4.2 version of the UNIX operating system, compilers for C, FORTRAN, Pascal and other languages, and a growing range of applications software.

**ORION 1/05 Processor**

# **Processor**

The ORION 1/05 processor is designed with the primary goal of providing a dramatic increase in performance. The synthesis of supercomputer design concepts with state-of-the-art very large scale integration (VLSI) provides an average execution rate exceeding five MIPS (millions of instructions per second), with a peak execution rate of 33 MIPS.
The basic architectural concepts draw heavily on recent research into supercomputer and reduced instruction set computer (RISC) architectures:

- Simplified instruction set
- Hardwired logic
- Highly pipelined central processing unit
- Separate integer and floating-point execution units
- Dual high-speed cache memories for instructions and data with independent 32-bit data paths to processor
- Demand paged virtual memory

The ORION 1/05 processor includes features of supercomputer architectures not previously used in microprocessors. The processor is highly pipelined with up to four instructions executing simultaneously, supports instruction prefetching, has a one Mflop on-chip floating-point unit, runs at 33 MHz, and executes a load/store register-to-register instruction set.

**ORION 1/05 Processor Interface**

The central processing unit (CPU) uses a three-stage execution pipeline to improve performance by providing a degree of parallelism. The integer execution unit, which is itself pipelined, includes two sets of 16 general purpose 32-bit registers. The floating-point unit supports the IEEE 754 standard and includes eight dedicated 64-bit registers. Floating-point and integer operations proceed in parallel. A resource manager keeps track of required resources and issues instructions only when those resources are available.

Two levels of prefetching are included: instruction prefetching within the CPU, and cache prefetching. The latter ensures that whenever a word is requested from the instruction cache, the next four sequential words from main memory are available. Virtual memory address translation is overlapped with cache access, resulting in zero overhead.

The ORION 1/05 CPU consists of a three chip set with a transistor count of 846,000.
A register-to-register architecture with multiple register sets and a load/store interface to memory embodies the best features of reduced instruction set computers (RISC) whilst being simple enough to implement in hardwired logic with a pipelined execution unit. A full range of addressing modes allows the data structures of high level languages to be compiled easily into efficient code.

In order to overcome the remaining deficiencies of a true RISC architecture and to provide efficient execution of the complex instruction sequences which occur in scientific and engineering applications, a 1K by 48-bit macro instruction ROM holds such sequences as strings of machine instructions and delivers them directly to the instruction pipeline.

# Memory

The ORION 1/05 features a minimum of eight Mbytes of main memory as standard, organised as 32-bit words with four-way interleaving, allowing 16 bytes of data to be fetched or stored in one operation. Each eight Mbyte memory module is provided with byte parity protection and is constructed using 1 Mbit dynamic CMOS RAMs. Random access cycle time is 300 ns per 32-bit word, but multi-word transfers to and from cache yield an effective cycle time of 125 ns per 32-bit word (32 Mbytes/sec). Memory capacity is currently expandable to 16 Mbytes. In the future, by using alternative technology, the overall memory capacity will be raised.

Independent cache memory and memory management units are provided for instructions and data. Each has a 32-bit data path to the CPU and contains 4 Kbytes of cache memory and a virtual memory translation buffer. These provide typical hit ratios of over 96% for instructions and 90% for data. This is achieved by using a two-way set associative organisation with a 16 byte line size which fully exploits the four-way interleave of main memory. Both write-through and copy-back strategies are supported and cache coherence is ensured by hardware monitoring of bus activity.
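The cache and memory figures quoted above can be related by simple arithmetic. The following Python sketch is illustrative only: it derives the number of sets and the address field widths implied by a 4 Kbyte, two-way set associative cache with 16 byte lines, and the bandwidth implied by the quoted cycle times. The 32-bit address width, the function names, and the reading of "Mbytes/sec" as 10^6 bytes per second are assumptions made for the example, not statements about the actual hardware design.

```python
# Illustrative arithmetic only; names and the 32-bit address width are assumptions.

CACHE_BYTES = 4 * 1024   # 4 Kbytes per cache (from the text)
WAYS = 2                 # two-way set associative (from the text)
LINE_BYTES = 16          # 16 byte line size (from the text)
ADDRESS_BITS = 32        # assumption: 32-bit addresses presented to the cache


def cache_geometry(cache_bytes=CACHE_BYTES, ways=WAYS,
                   line_bytes=LINE_BYTES, address_bits=ADDRESS_BITS):
    """Return the line count, set count and address field widths."""
    lines = cache_bytes // line_bytes
    sets = lines // ways
    offset_bits = (line_bytes - 1).bit_length()
    index_bits = (sets - 1).bit_length()
    return {
        "lines": lines,
        "sets": sets,
        "offset_bits": offset_bits,
        "index_bits": index_bits,
        "tag_bits": address_bits - index_bits - offset_bits,
    }


def split_address(addr, geometry):
    """Split an address into (tag, set index, byte offset within the line)."""
    ob = geometry["offset_bits"]
    ib = geometry["index_bits"]
    offset = addr & ((1 << ob) - 1)
    index = (addr >> ob) & ((1 << ib) - 1)
    tag = addr >> (ob + ib)
    return tag, index, offset


def mbytes_per_sec(bytes_per_cycle, cycle_ns):
    """Bandwidth in units of 10^6 bytes per second for a given cycle time."""
    return bytes_per_cycle * 1000.0 / cycle_ns


if __name__ == "__main__":
    geom = cache_geometry()
    print(geom)
    print(split_address(0x12345678, geom))
    print("random access:  %.1f Mbytes/sec" % mbytes_per_sec(4, 300))
    print("multi-word:     %.1f Mbytes/sec" % mbytes_per_sec(4, 125))
```

Under these assumptions each cache holds 256 lines in 128 sets, and a four-word line fill at 125 ns per word matches the 32 Mbytes/sec quoted in the text, while isolated single-word accesses at 300 ns would reach only about 13 Mbytes/sec.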
**ORION 1/05 Memory Module**

[Figure: block diagram; labels include Read Operands, ALU, Shifter, Results, Floating Point Unit, I/O Subsystem, Peripherals]

# I/O Subsystems

Up to eight intelligent I/O subsystems can be installed to control peripheral activity, together with I/O buffer memory of up to 16 Mbytes. Each I/O subsystem includes a full function microcomputer which performs control functions and housekeeping, whilst data transfers to and from peripheral devices take place via a direct memory access (DMA) path constructed using bit-slice microprocessors. This allows the full performance of the ORION I/O subsystem interconnect (bandwidth 32 Mbytes/sec) and peripheral device to be exploited, with the I/O microcomputer able to take corrective action on soft I/O errors. Software on ORION communicates with the I/O subsystems using a high level message passing protocol.

# **Peripherals**

The following peripheral interfaces are currently supported on the ORION 1/05 and are directly connected to the I/O subsystem interconnect:

- OR-PIC Disk/Tape Controller. Supports up to three disk or tape units in any combination and also provides four RS232-C terminal lines. A variety of high performance Winchester technology disk subsystems is available, with capacities ranging from 168 to 474 Mbytes. For backup or archiving, a 1/4 inch cartridge tape or industry standard 1/2 inch magtape subsystem is available.
- OR-COM Communications Controller. Supports an IEEE 802.3 Ethernet channel and also provides four RS232-C terminal lines. Ethernet building blocks include a transceiver, an eight channel fanout unit and a variety of Ethernet and transceiver cables. More than one OR-COM can be installed, allowing gateways between Ethernets to be constructed.
- OR-MUX Terminal Multiplexor. Supports 12 RS232-C terminal lines. Each port operates at up to 38,400 Baud and has a full set of modem control lines.
Other peripherals include the ORION LaserWriter, which can be shared by a work group to perform tasks ranging from printing text files to the production of high quality camera ready copy; an X.25 adapter to support access to wide area networks such as PSS and JANET; and interfaces to the IEEE 488 instrumentation bus and floppy disk drives. A general purpose 16-bit DMA parallel interface is also available.

[Figure: Dhrystone Benchmark bar chart, scale 0 to 8000, comparing AT&T 3B2/400, APOLLO DN330, SUN 3/160, PYRAMID 98X, GOULD PN 9080, DEC VAX 8600 and HLH ORION 1/05]

## Software

The ORION Time Sharing Operating System (OTS) is a version of the UNIX operating system derived directly from the Berkeley 4.2 BSD design. The operating system supports multiple processes running in separate protected demand-paged virtual address spaces. Interprocess communication (IPC) facilities include asynchronous software interrupts (signals) and message buffers (pipes and sockets). The system is autoconfiguring and supports any combination of currently available peripheral devices. There is a generic mechanism for supporting networking protocols including TCP/IP which, combined with Ethernet and the powerful IPC mechanisms, allows applications to be distributed transparently across a network.

The OTS distribution comprises over 500 utility programs, as well as an optimising C compiler and a full set of printed and on-line documentation. Language options include FORTRAN 77, Pascal, COBOL, BASIC Plus, Common LISP and Prolog. Applications include ACEGEN, a fourth generation program generator, the INGRES database management system, the REDUCE symbolic algebra system and the Network File System. A growing range of application software is available through the ORION Users' Group and other third parties.
# ORION 1/05 Programming Model

- Sixteen general purpose registers
- Eight floating-point registers
- Two operand register-register instruction set
- Load/store memory interface
- Nine addressing modes
- Complex instructions implemented as macros
- Precise traps and interrupts
- Demand paged virtual memory
- Both write-through and copy-back cache strategies
- Data types: 8, 16 and 32 bit integer; 32 and 64 bit floating-point

# Specification at a glance

### Processor

**Type:**

- 32 bit CPU
- CMOS VLSI technology
- 64 bit IEEE 754 floating-point execution unit (FPU)
- Dual independent pipelines for concurrent integer and floating-point execution
- Pipelined instruction fetch and decode

**Cycle time:** 30 ns

**Cache memory:**

- Independent 4 Kbyte caches for instructions and data
- Cache access time four clock cycles (120 ns)
- Quadword line buffer acts as a cache-within-a-cache with access time of two clock cycles (60 ns)
- Two-way set associativity with 16 byte line size
- Cache coherence ensured by hardware bus monitoring
- Typical hit ratios of over 96% for instructions and over 90% for data
- Cache prefetching provides 100% instruction hit ratio on in-line code
- Supports both write-through and copy-back caching strategies

### Memory

**Type:**

- CMOS 100 ns, 1 Mbit or 4 Mbit dynamic RAM
- Up to 64 Mbytes in increments of 8 and 32 Mbytes
- Organisation 32 bits data plus 4 bits parity, four way interleaved
- Access length 1, 2, 4 and 16 bytes
- Bandwidth 32 Mbyte/sec sustained

**Address space:**

- Instantaneous virtual address space of 4096 Mbytes
- Three physical address spaces, each of 4096 Mbytes
- Page size 4 Kbytes

**Random access cycle time:** 300 ns (4 bytes), 500 ns (16 bytes)

**Translation buffer cache:**

- Dual independent 128 entry caches for instructions and data
- Address translation carried out in parallel with main caches, giving an effective translation time of zero
- Two-way set associativity
- Typical hit ratio of over 99%

### System Interconnect

**Processor to caches:** Dual independent 32 bit internal data paths to instruction and data caches

**Processor to memory:** 32 bit synchronous, bandwidth 64 Mbyte/sec (16 MHz clock)

**I/O to memory:** 32 bit synchronous, supporting multiple DMA channels, bandwidth 32 Mbyte/sec (8 MHz clock)

### I/O Subsystems

**Type:**

- Multiple intelligent microprocessor based peripheral controllers
- Microprogrammed bit slice DMA channels
- Dedicated I/O buffer memory (16 Mbytes max.)
- All data transfers via DMA to minimise CPU overhead

### **High Level Hardware Limited**

PO Box 170, Windmill Road, Headington, Oxford OX3 7BN, England

Telephone: (0865) 750494 Telex: 838854 HLH G
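Several of the specification figures can be cross-checked against one another. The Python sketch below is a hedged illustration rather than a description of the hardware: it assumes one 32-bit transfer per bus clock, reads "Mbyte/sec" as 10^6 bytes per second and "Mbytes"/"Kbytes" of address space as powers of two, and all function names are hypothetical.

```python
# Cross-checks on the quoted specification; assumptions are noted in comments.

def clock_mhz(cycle_ns):
    """Clock frequency in MHz implied by a cycle time in nanoseconds."""
    return 1000.0 / cycle_ns


def bus_bandwidth_mbytes(width_bits, bus_clock_mhz):
    """Bandwidth in 10^6 bytes/sec, assuming one transfer per bus clock."""
    return width_bits / 8 * bus_clock_mhz


def translation_fields(space_mbytes=4096, page_kbytes=4,
                       tlb_entries=128, tlb_ways=2):
    """Address field widths implied by the address space and page size."""
    space_bytes = space_mbytes * 2**20      # assumption: 1 Mbyte = 2^20 bytes
    page_bytes = page_kbytes * 2**10        # assumption: 1 Kbyte = 2^10 bytes
    address_bits = (space_bytes - 1).bit_length()
    offset_bits = (page_bytes - 1).bit_length()
    tlb_sets = tlb_entries // tlb_ways
    return {
        "address_bits": address_bits,
        "page_offset_bits": offset_bits,
        "page_number_bits": address_bits - offset_bits,
        "pages": space_bytes // page_bytes,
        "tlb_sets": tlb_sets,
        "tlb_reach_kbytes": tlb_entries * page_kbytes,
    }


if __name__ == "__main__":
    print("CPU clock: %.1f MHz" % clock_mhz(30))
    print("cache access (4 cycles): %d ns" % (4 * 30))
    print("line buffer (2 cycles):  %d ns" % (2 * 30))
    print("processor to memory: %.0f Mbyte/sec" % bus_bandwidth_mbytes(32, 16))
    print("I/O to memory:       %.0f Mbyte/sec" % bus_bandwidth_mbytes(32, 8))
    print(translation_fields())
```

Under these assumptions the 30 ns cycle time corresponds to a clock of about 33.3 MHz, the two bus bandwidths follow directly from a 32-bit path at 16 MHz and 8 MHz, and each 128 entry translation buffer covers 512 Kbytes of the 4096 Mbyte virtual address space at a time.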