Computer Architecture

Computer Architecture is the conceptual design and fundamental operational structure of a computer system. That is, it is a model and a functional description of design requirements and implementations for various parts of a computer, with special interest in how the central processing unit (CPU) works internally and accesses memory addresses.

It is also often defined as the way to select and interconnect hardware components to create computers according to the requirements of functionality, performance and cost.

The computer receives and sends information through its peripherals via I/O channels. The CPU is in charge of processing the information that reaches the computer, and this information must be exchanged between the peripherals and the CPU.

All units of a system except the CPU are called peripherals, so the computer has two distinct parts: the CPU (responsible for running programs, and composed of main memory, the Arithmetic Logic Unit (ALU), and the Control Unit) and the peripherals (which can be input, output, I/O, or communication devices).

The execution of instructions resembles an assembly line in a manufacturing plant. On an assembly line, the product passes through many stages of production before it is finished. Each stage or segment of the line specializes in a specific part of the production process and always carries out the same activity. This technique is applied in the design of efficient processors.

These processors are known as pipelined processors. They consist of a linear, sequential series of segments, each of which performs one task or group of computational tasks. Data arriving from outside enters the system to be processed; the computer operates on the data it has stored in memory and produces new data or information for external use.
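The cycle count of an ideal pipeline can be sketched with a small model. This is a simplification, not a real CPU simulator: it assumes a stall-free pipeline in which every instruction advances one stage per clock cycle, so after the pipeline fills, one instruction completes per cycle. The stage names are illustrative.

```python
# Illustrative stage names for a classic four-stage pipeline.
STAGES = ["fetch", "decode", "execute", "writeback"]

def pipeline_cycles(num_instructions, num_stages=len(STAGES)):
    """Total cycles to run `num_instructions` through an ideal,
    stall-free pipeline: the first instruction needs `num_stages`
    cycles to drain, and each later one finishes a cycle after
    the previous one."""
    if num_instructions == 0:
        return 0
    return num_stages + (num_instructions - 1)
```

For example, 10 instructions on a 4-stage pipeline take 13 cycles instead of the 40 an unpipelined design would need, which is where the speedup of pipelining comes from.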

The architectures and sets of instructions can be classified considering the following aspects:

  • Storage of operands in the CPU: where operands are held apart from memory.
  • Number of explicit operands per instruction: how many operands are expressed explicitly in a typical instruction. Usually 0, 1, 2, or 3.
  • Operand position: can any operand be in memory, or must some or all be in the CPU's internal registers? How memory addresses are specified (the available addressing modes).
  • Operations: What operations are available in the instruction set.
  • Type and size of Operands and how they are specified.

Storage of Operands in the CPU

The basic difference lies in the internal storage of the CPU.

The main alternatives are:

  • Accumulator.
  • Stack.
  • Register set.

Features: In an accumulator architecture, one operand is implicitly the accumulator, which is always read and written. (E.g., a basic desk calculator.)

In a stack architecture it is not necessary to name the operands, since they sit at the top of the stack. (E.g., an HP stack-based RPN calculator.)

A register architecture has only explicit operands (each operand is named), held in registers or in memory.

Advantages of Architectures

  • Stack: Simple model for expression evaluation (reverse Polish notation). Short instructions can give good code density.
  • Accumulator: Short instructions. Minimizes internal machine state (simple control unit).
  • Registers: The most general model for code generation. It simplifies automatic code generation and the reuse of operands, and reduces traffic to memory. A modern computer typically provides 32 registers as standard, and access to data in them is the fastest available.
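The stack model's fit with reverse Polish notation can be made concrete with a tiny evaluator. This is a minimal sketch of the stack-machine idea, not any particular processor's instruction set: operands are pushed, and each operator pops the top two values and pushes its result, so no operand ever needs to be named.

```python
def eval_rpn(expression):
    """Evaluate a reverse-Polish (postfix) expression such as "3 4 + 2 *"
    the way a stack machine would: push operands, and let each operator
    pop its two inputs from the top of the stack and push the result."""
    stack = []
    ops = {"+": lambda a, b: a + b,
           "-": lambda a, b: a - b,
           "*": lambda a, b: a * b,
           "/": lambda a, b: a // b}
    for token in expression.split():
        if token in ops:
            b = stack.pop()   # top of stack is the second operand
            a = stack.pop()
            stack.append(ops[token](a, b))
        else:
            stack.append(int(token))
    return stack.pop()
```

Evaluating `"3 4 + 2 *"` computes (3 + 4) × 2 = 14 without a single named operand, which is exactly the code-density advantage described above.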

Disadvantages of Architectures

  • Stack: A stack cannot be accessed randomly. This limitation makes it difficult to generate efficient code. It also hinders an efficient implementation, since the stack becomes a bottleneck: it is difficult to move data through it at speed.
  • Accumulator: As the accumulator is only temporary storage, memory traffic is the highest in this approach.
  • Registers: All operands must be named, leading to longer instructions.


In computer architecture, a register is a high-speed, low-capacity memory, integrated into the microprocessor, which allows temporary storage of, and fast access to, frequently used values, usually in mathematical operations.

Registers are at the top of the memory hierarchy and are the fastest way the system has to store data. Registers are usually measured by the number of bits they store; for example, an "8-bit register" or a "32-bit register". Registers are usually implemented as a register file, but formerly individual flip-flops, SRAM, or even more primitive forms were used.

The term is generally used to refer to the group of registers that can be directly indexed as operands of an instruction, as defined in the instruction set. However, microprocessors also have many other registers that serve a specific purpose, such as the program counter. For example, the IA-32 architecture's instruction set defines eight 32-bit registers.

Types of Registers

Data registers are used to store integers. In some early computers there was a single register, called the accumulator, where all the information was stored.
Memory registers are used exclusively to hold memory addresses. They were widely used in the Harvard architecture, since addresses often had a different word size than the data.

General-purpose registers (GPRs) can hold both data and addresses. They are fundamental to the von Neumann architecture. Most modern computers use GPRs.

Floating-point registers are used to store floating-point data.

Constant registers hold read-only hardwired values. For example, in MIPS the zero register always reads 0.

Special-purpose registers store system-specific state, such as the stack pointer or the status register.

There are also flag registers and base registers.

Memory Management Unit (MMU)

The Memory Management Unit is a hardware device, formed by a group of integrated circuits, responsible for managing accesses to memory by the central processing unit (CPU).

Among this device's functions are the translation of logical (or virtual) addresses into physical (or real) addresses, memory protection, cache control, and, in simpler computer architectures (especially 8-bit ones), bank switching.

When the CPU tries to access a logical memory address, the MMU searches a special cache called the Translation Lookaside Buffer (TLB), which holds the most recently used portion of the page table. This cache stores page table entries (called PTEs), from which the physical addresses corresponding to some logical addresses can be retrieved directly. When the address required by the CPU is in the TLB, its translation to a real or physical address is delivered immediately, in what is known as a 'TLB hit'.

Otherwise, when the searched address is not in the TLB, the processor looks in the process's page table, using the page number as an index. The page table entry contains a presence bit, which indicates whether the requested page is in main memory. If the presence bit is set, the PTE is loaded into the TLB and the physical address is returned. Otherwise, the operating system is informed of the situation by means of a page fault. The operating system then makes the necessary adjustments (that is, loads the page into physical memory) using one of the page-replacement algorithms, so that execution can continue from the instruction that caused the fault.
A fundamental benefit of the MMU is the ability to implement memory protection, preventing programs from accessing forbidden portions of memory. For example, it can prevent a program from reading or modifying memory belonging to other programs.
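The translation path just described can be sketched in a few lines. This is a toy model, not a real MMU: the 4 KiB page size, the dictionary-based TLB and page table, and the use of a `KeyError` to stand in for a page fault are all assumptions chosen only for illustration.

```python
# Hypothetical page size: 4 KiB, so the low 12 bits are the page offset.
PAGE_SIZE = 4096

def translate(vaddr, tlb, page_table):
    """Translate a virtual address to a physical one: consult the TLB
    first (TLB hit); on a miss, walk the page table and cache the entry.
    An unmapped page raises KeyError, standing in for a page fault."""
    vpn, offset = divmod(vaddr, PAGE_SIZE)
    if vpn in tlb:                    # TLB hit: translation is immediate
        frame = tlb[vpn]
    else:                             # TLB miss: look in the page table
        frame = page_table[vpn]       # KeyError here models a page fault
        tlb[vpn] = frame              # load the PTE into the TLB
    return frame * PAGE_SIZE + offset
```

With `page_table = {0: 5, 1: 9}` and an empty TLB, translating address 4100 (page 1, offset 4) first misses, fills the TLB, and returns frame 9's base plus the offset; a second lookup of the same page is a TLB hit.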

Arithmetic Logic Unit (ALU)

The Arithmetic Logic Unit is a digital circuit that computes arithmetic operations (such as addition, subtraction, and multiplication) and logical operations (AND, OR, NOT) between two numbers.

Many types of electronic circuits need to perform some sort of arithmetic operation, so even the circuit inside a digital clock will have a tiny ALU that keeps adding 1 to the current time, and keeps checking if it should turn on the alarm sound, etc.

By far the most complex electronic circuits are those built into modern microprocessor chips. Therefore, these processors have within them a very complex and powerful ALU. In fact, a modern microprocessor (and mainframes) can have multiple cores, each core with multiple execution units, each with multiple ALUs.

Many other circuits can contain an arithmetic logic unit inside: graphics processing units such as those in modern GPUs, FPUs like the old 80387 mathematical coprocessor, and digital signal processors such as those found in sound cards, CD readers, and HDTVs. All of these contain several powerful and complex ALUs.

The mathematician John von Neumann proposed the concept of the ALU in 1945, when he wrote a report on the foundations for a new computer called EDVAC (Electronic Discrete Variable Automatic Computer). Later, in 1946, he worked with his colleagues designing a computer for the Institute for Advanced Study (IAS) in Princeton. The IAS computer became the prototype for many later computers. In this proposal, von Neumann outlined what he believed would be necessary in his machine, including an ALU.

Von Neumann explained that an ALU is a fundamental requirement for a computer because it must perform basic mathematical operations: addition, subtraction, multiplication, and division. He therefore believed it was "reasonable that a computer should contain specialized organs for these operations".

Numerical Systems

An ALU must process numbers using the same format as the rest of the digital circuit. For modern processors, this format is almost always two's-complement binary representation. Early computers used a wide variety of number systems, including one's complement, sign-magnitude format, and even true decimal systems with ten tubes per digit.

The ALUs for each of these number systems required different designs, and this influenced the current preference for two's complement, since it is the simplest representation for the ALU's circuitry when computing additions, subtractions, and so on.
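A brief sketch of what two's-complement representation actually is may help here. The two helper functions below are illustrative names, not any library's API; they encode a signed integer into a fixed-width bit pattern and decode it back.

```python
def to_twos(value, bits):
    """Encode a signed integer as its two's-complement bit pattern
    of the given width (Python's & with a mask does the wraparound)."""
    return value & ((1 << bits) - 1)

def from_twos(pattern, bits):
    """Decode a two's-complement bit pattern back into a signed integer:
    if the sign bit is set, the pattern represents pattern - 2**bits."""
    if pattern & (1 << (bits - 1)):   # sign bit set -> negative value
        return pattern - (1 << bits)
    return pattern
```

For example, in 8 bits the value -5 is stored as the pattern 0b11111011 (251), and decoding that pattern recovers -5. The appeal for hardware is that the same binary adder then handles signed and unsigned operands identically.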

The ALU is basically composed of an operational circuit, input registers, an accumulator register, and a status register: a set of registers that makes each of the operations possible.

Most computer actions are performed by the ALU. The ALU takes data from the processor registers. This data is processed and the results of this operation are stored in the output registers of the ALU. Other mechanisms move data between these registers and memory.

A control unit directs the ALU, setting the signals that tell the ALU which operations to perform.

Simple Operations

Most ALUs can perform the following operations:

  • Arithmetic operations of integers (addition, subtraction, and sometimes multiplication and division, although this is more complex)
  • Bitwise logical operations (AND, NOT, OR, XOR, XNOR)
  • Bit-shift operations (shifting or rotating a word by a specific number of bits to the left or right, with or without sign extension). Shifts can be interpreted as multiplications or divisions by powers of 2.
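The shift-as-arithmetic interpretation in the last bullet can be demonstrated directly. These are sketch functions with an assumed 8-bit width, mimicking fixed-width hardware rather than Python's unbounded integers.

```python
def shift_left(x, n, bits=8):
    """Logical left shift within a fixed width: equivalent to
    multiplying by 2**n, modulo 2**bits (overflowing bits are lost)."""
    return (x << n) & ((1 << bits) - 1)

def arith_shift_right(x, n, bits=8):
    """Arithmetic right shift: replicates the sign bit, so for a
    two's-complement pattern it equals floor division by 2**n."""
    if x & (1 << (bits - 1)):     # negative pattern: decode it first
        x -= (1 << bits)
    return (x >> n) & ((1 << bits) - 1)
```

Shifting 5 left by one gives 10 (multiplication by 2), and arithmetically shifting the 8-bit pattern 0xF0 (which encodes -16) right by two gives 0xFC (which encodes -4), i.e. division by 4 with the sign preserved.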

Complex Operations

An engineer can design an ALU to calculate any operation, no matter how complex it may be; the problem is that the more complex the operation, the more expensive the ALU will be, the more space it will use in the processor, and the more energy it will dissipate.

Therefore, engineers always strike a compromise, providing the processor (or other circuit) with an ALU powerful enough to calculate quickly, but not so complex that the ALU becomes economically prohibitive. Imagine that you need to calculate, say, the square root of a number; the digital engineer will examine the following options to implement this operation:

  • Design a very complex ALU that calculates the square root of any number in a single step. This is called calculation in a single clock cycle.
  • Design a complex ALU that calculates the square root in several steps (like the algorithm we learned in school). This is called iterative calculation, and it generally relies on a complex control unit with built-in microcode.
  • Design a simple ALU in the processor, and sell a separate, specialized, expensive processor that the customer can install alongside it, implementing one of the options above. This is called a coprocessor or floating-point unit.
  • Emulate the existence of the coprocessor, that is, whenever a program tries to calculate the square root, have the processor check if a coprocessor is present and use it if there is one; if there is not one, interrupt the program process and invoke the operating system to perform the calculation of the square root by means of a certain software algorithm. This is called software emulation.
  • Tell programmers that there is no coprocessor and there is no emulation, so they will have to write their own algorithms to calculate square roots by software. This is done by software libraries.
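As an illustration of the last option, here is the kind of routine a software library might supply when no hardware square root exists: an integer square root computed by Newton's iteration. This is a sketch of one well-known algorithm, not the method any particular library uses.

```python
def isqrt(n):
    """Integer square root by Newton's iteration: repeatedly average
    the current guess x with n // x until the guess stops shrinking.
    Returns the largest integer whose square does not exceed n."""
    if n < 2:
        return n
    x = n
    y = (x + n // x) // 2
    while y < x:           # the iterates decrease monotonically
        x = y
        y = (x + n // x) // 2
    return x
```

Each iteration refines the guess, so the "single operation" costs many ALU additions and divisions, which is exactly why the software options are the slowest in the list above.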

The options above range from the fastest and most expensive to the slowest and most economical. Therefore, while even the simplest computer can calculate the most complicated formula, the simplest computers will usually take a long time, because several of the steps in calculating the formula will involve options 3, 4, and 5 above.

Complex processors such as the Pentium 4 and AMD Athlon 64 implement option 1 for most complex operations and the slower option 2 for extremely complex operations. This is possible because of the ability to build very complex ALUs into these processors.

Inputs and Outputs

The inputs to the ALU are the data on which the operations will be performed (called operands) and a code from the control unit indicating which operation to perform. Its output is the result of the operation.

In many designs the ALU also takes or generates, as inputs or outputs, a set of condition codes from or to a status register. These codes are used to indicate cases such as carry-in or carry-out, overflow, division by zero, etc.
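The carry and overflow condition codes can be modeled with a small function. This is a sketch assuming an 8-bit data path; the function name and the tuple it returns are illustrative, not any real instruction set's interface.

```python
def alu_add(a, b, bits=8):
    """Add two `bits`-wide values and report the result together with
    the carry flag (unsigned overflow) and the signed-overflow flag,
    as a status register would record them."""
    mask = (1 << bits) - 1
    total = (a & mask) + (b & mask)
    result = total & mask
    carry = total > mask                 # carry out of the top bit
    sign = 1 << (bits - 1)
    # Signed overflow: the operands share a sign, but the result's
    # sign differs from theirs.
    overflow = bool((~(a ^ b) & (a ^ result)) & sign)
    return result, carry, overflow
```

For example, 200 + 100 in 8 bits wraps to 44 and sets carry (the unsigned result does not fit), while 100 + 50 yields 150 with the signed-overflow flag set (150 is outside the signed range −128..127).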

Floating-Point Unit (FPU)

A floating-point unit, also known as a mathematical coprocessor, is a component of the central processing unit specialized in floating-point calculations. The basic operations every FPU can perform are the usual addition and multiplication, although some more complex systems are also capable of performing trigonometric or exponential calculations.

Not all central processing units have a dedicated FPU. In the absence of an FPU, the CPU can use microcode routines to emulate floating-point operations through the arithmetic logic unit (ALU), which reduces hardware cost in exchange for a significant loss of speed.

In some architectures, floating-point operations are treated quite differently from integer operations, with dedicated registers and different cycle times. Even complex operations, such as division, may have a circuit dedicated to them.

Until the mid-1990s, it was common for the CPUs of home computers not to incorporate an FPU; instead, it was an optional element known as a coprocessor. Examples include the 80387 and 80487 FPUs used with the Intel 80386 and Intel 80486SX central processing units (the 80486DX model already included the coprocessor as standard), and the 68881 FPU used with the 680x0 central processing units in Macintosh computers.

It should be noted that English-speaking countries use the point as the decimal separator rather than the comma recognized in the International System of Units; for that reason the Spanish term is "unidad de coma flotante" ("floating-comma unit"), since the unit's operations move the decimal separator.


A floating-point unit (FPU) also performs arithmetic operations between two values, but it does so on floating-point numbers, which is much more complicated than the two's-complement representation commonly used in an ALU. To do these calculations, an FPU has several complex circuits built in, including some internal ALUs.

Generally, engineers use "ALU" for the circuit that performs arithmetic operations on integer formats (such as two's complement and BCD), whereas circuits that calculate in more complex formats such as floating point, complex numbers, etc. are usually given a more specific name, such as FPU.

