Design of n-Bit Adder without Applying Binary to Quaternary Conversion

The microprocessor has been among the most important products of IC manufacturing for more than 50 years, so increasing microprocessor speed receives attention in every technology. The ALU is known as the slowest part of the microprocessor because of the ripple carry. Current microprocessors use 8 units as a pipeline, each handling 8 bits, to implement 64-bit operation; this arrangement has constrained microprocessor development and limited the speed of all its computations. Parallel processing and high-speed ICs keep trying to raise this speed, but it remains limited. A contemporary solution for increasing microprocessor speed is Multiple Valued Logic (MVL) technology, which reduces 8 bits to 4 qbits. This paper proposes a new design of a 2-qbit full adder (FA) as the basic unit of an MVL ALU (AMLU) that has 8 units as a pipeline, each consisting of 4 qbits, to implement 32 qbits (equivalent to 64 bits) without applying binary-to-quaternary conversion or vice versa. The proposed design increases microprocessor speed by up to 1.65 times, at the price of a small increase in implementation cost.

Keywords: CCCi, Full Adder (FA), Multi Valued Logic (MVL), Quaternary Logic.

How to cite this article: W.MH. Khalaf, D.R. Zaghar and K.A. Al-majdi, “Design of n-Bit Adder Without Applying Binary to Quaternary Conversion,” Engineering and Technology Journal, Vol. 37, Part A, No. 03, pp. 106-111, 2019.


Introduction
The microprocessor is the core of diverse systems such as PCs and embedded systems, so its logic circuits are continually reviewed and enhanced. The growth of microelectronics pushes toward larger and faster ICs, and therefore toward high-speed processor families such as the Intel microprocessors. New generations of these microprocessors led to the Pentium technology, which opened the way to very high speed ICs (3.1 million transistors with 273 pins). The Arithmetic Logic Unit (ALU) is considered the heart of the microprocessor, since it governs the speed and operation of the processor. The adder is the basic part of the processing element, so optimizing the adder circuit improves the processing unit: it reduces the total calculation time of the ALU and therefore improves the performance of the unit [1]. Because of the constraints of parallel processing, new technologies still have limitations that prevent them from improving microprocessor performance, but these limitations have not stopped researchers from finding enhancements in this field.
An N-bit adder is built from N cascaded full adder circuits. Each full adder gives two outputs, the sum and the carry; the latter is called the carry out. The carry out of one full adder is the carry in of the next, and so on, which is why this type is called a ripple carry adder. Its disadvantage is delay: the second sum bit must wait until the first sum and carry out are ready, and so on, so the final sum and carry must wait until all previous carry outs have been generated [2].
All traditional logic gates work with binary logic values (two levels). Researchers are trying to increase the number of logic values beyond two, for example to three levels (Ternary Logic) or four levels (Quaternary Logic); this is called Multiple Valued Logic (MVL) [3]. MVL offers some advantages over binary logic systems: by increasing the number of levels, more data is transmitted on the same line and the number of components is reduced, which means either a smaller total chip area or more logic gates on the same chip area. MVL also reduces the total number of interconnections and the power consumption. On the other hand, MVL has some disadvantages: noise margins are reduced, implementing stable and compact MVL systems is still subject to restrictions, and such a system always needs a conversion stage to and from traditional binary logic.
Vasundara and Gurumurthy used down-literal circuits to build quaternary-to-binary and binary-to-quaternary converters and to implement arithmetic operations such as subtraction, addition and multiplication, both in modulo-4 and in Galois field, using MVL [4]. Sharifi et al. proposed a new design of a CNTFET-based quaternary full adder in voltage-mode multiple-threshold MVL, consisting of a quaternary-to-binary converter, a sum and carry generator, and a binary-to-quaternary converter [5]. Tabrizchi et al. proposed a new design of a ternary half adder and multiplier based on carbon nanotube field effect transistors (CNTFETs) to reduce power consumption and chip area as well as to increase calculation speed [6]. Kalbande aimed at high performance by minimizing hardware and power dissipation through a lower transistor count, proposing a new design of HA and FA that takes quaternary inputs and gives a quaternary output without any binary-to-quaternary or quaternary-to-binary converters [7]. In addition to the works mentioned above, a survey can be found in [8].
The rest of this article is organized as follows. Section 2 gives a brief description of multiple valued logic in quaternary and all its phases, section 3 presents the proposed design of the two q-bit adder, section 4 gives the implementation cost of each part of the proposed design, section 5 presents the experimental results, and section 6 gives the conclusions.

CCCi Theory for Quaternary (CCC4)
CCCi is a theory proposed by Zaghar to satisfy the main requirements of multiple valued logic [9,10]. For the case r = 4 of CCCi, i.e. the quaternary system with S4 = {0, 1, 2, 3}, it is called CCC4. It consists of three major phases, which are explained as follows.

I. Phase One
This phase is called the Convert phase. It consists of four major logic gates: LN (Lower Not), UN (Upper Not), LR (Lower Reject), and UR (Upper Reject). These gates cover all the prerequisites of the convert phase in CCC4. There is another gate called AN (All Not), but it can be replaced by LN and UN in series; therefore, AN is omitted to reduce the total number of gates used. Table 1 shows the basic function of each gate.

II. Phase Two
This phase is called the Coded phase. It gives eight outputs in CCC4 space. Four of them are the functions E0, E1, E2, and E3, which are defined as follows: The remaining four outputs are the functions F0, F1, F2, and F3, which are defined as follows for quaternary: When the radix is equal to 4, the maximum value is equal to 3 (where r − 1 = 3), while the minimum value is always equal to zero.

III. Phase Three
This phase is called the Collect phase. In this phase, several inputs are collected to give a single output, which is defined as the maximum of the collected inputs.

Two q-bit Adder Design in CCC4
All previous related works have tried to implement and improve a 1-qbit adder (full or half adder), while the practical use of MVL requires an N-qbit adder. If a single (1-qbit) adder has an adding time equal to T, then using this unit to implement an N-qbit adder gives an adder with N ripple carries, and as a result the total adding time becomes N×T. Addition time is the most important factor in any adder design. This article focuses on the design of a fast multi-qbit adder, and a new implementation of a 2-qbit adder is proposed as shown in Figure 1. The main idea of this adder is to separate out the input carry and then use it to generate the output carry over the shortest possible path. Note that the input carry has to activate 4 transistors to generate the output carry, while the inputs (a1 and b1) have to activate 8 transistors to generate the output carry. The proposed design of Figure 1 has a delay of 4 transistors with a latency of 4 transistors, and as a result the N-qbit adder shown in Figure 2 requires a delay $D$, where
$$D = 4 \cdot \frac{N}{2} + 4 = 2N + 4 \qquad (1)$$
The direct design of a 1-qbit adder as in [9], in contrast, requires 6 transistors of delay for each adder, and as a result the total delay is
$$D = 6N \qquad (2)$$
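As a behavioural illustration of the ripple structure described above, the following Python sketch models one 2-qbit block and chains N/2 such blocks. It is only an arithmetic model, not a description of the transistor-level circuit of Figure 1; the function names and the helpers that pack an integer into quaternary digits are hypothetical and exist only to check the result.

```python
def add_2qbit(a, b, carry_in):
    """Behavioural model of one 2-qbit adder block.

    a and b are (low_digit, high_digit) pairs of quaternary digits (0..3),
    carry_in is 0 or 1. Returns ((sum_low, sum_high), carry_out).
    """
    a_val = a[0] + 4 * a[1]
    b_val = b[0] + 4 * b[1]
    total = a_val + b_val + carry_in      # at most 15 + 15 + 1 = 31
    carry_out, rem = divmod(total, 16)    # two quaternary digits hold 0..15
    return (rem % 4, rem // 4), carry_out


def add_nqbit(a_digits, b_digits, carry_in=0):
    """Ripple 2-qbit blocks over little-endian quaternary digit lists."""
    if len(a_digits) != len(b_digits) or len(a_digits) % 2:
        raise ValueError("operands must have the same even number of digits")
    result, carry = [], carry_in
    for i in range(0, len(a_digits), 2):
        pair_a = (a_digits[i], a_digits[i + 1])
        pair_b = (b_digits[i], b_digits[i + 1])
        (s0, s1), carry = add_2qbit(pair_a, pair_b, carry)
        result += [s0, s1]
    return result, carry


# Test helpers only (hypothetical): each quaternary digit holds two bits.
def to_quaternary(value, n_digits):
    return [(value >> (2 * i)) & 3 for i in range(n_digits)]


def from_quaternary(digits):
    return sum(d << (2 * i) for i, d in enumerate(digits))


if __name__ == "__main__":
    digits, carry = add_nqbit(to_quaternary(0xBEEF, 8), to_quaternary(0x1234, 8))
    print(hex(from_quaternary(digits)), carry)
```

Each quaternary digit carries exactly two bits, so the 8-digit operands in the usage lines cover the same range as 16-bit binary words, and only four carries ripple between blocks instead of sixteen.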

Gate's Implementation
The next step in evaluating and enhancing the design is to implement the proposed circuit and calculate its cost and delay. The implementation and cost calculation of each proposed gate are explained in the following subsections.

I. Implementation mode
Current-mode realization is used in this paper for the main gates. The reference current Ib is chosen to be 5 µA, and the logic levels are integer multiples of it, as shown in Table 2. In general, differences in power supplies, active element dimensions, and technology parameters cause the actual current levels to deviate from their nominal values. Unlike in binary voltage-mode circuits, such a deviation in the output signal is passed on to the following stages [11]. The deviation can be tolerated as long as the output can still be sensed correctly; the range of values for which this holds is called the noise margin [12]. In this article, deviations of the current from its nominal values are taken into account, and the equivalent logic levels are specified as follows: Here the result is the logic level, and I is the current value. From Eq. (3), the noise margin is ±Ib/2. In contrast to voltage-mode binary circuits, the error accumulates, because deviations in the output current are carried over from stage to stage. For this reason, the output current level of a multiple-valued logic circuit must be renewed before the current signal exceeds the noise margin; this operation is called level restoration [13].
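The level assignment and restoration discussed above can be sketched in Python. This is a minimal sketch under stated assumptions, not the authors' equation: it assumes nominal levels at 0, 5, 10 and 15 µA (integer multiples of Ib = 5 µA) and assumes that a current within half of Ib of a nominal level is read as that level. The function names are hypothetical.

```python
import math

IB_UA = 5.0   # reference current Ib in microamperes, as chosen in the text
RADIX = 4     # quaternary logic: levels 0..3


def logic_level(current_ua):
    """Map a measured current to a logic level.

    ASSUMPTION: the level is the nearest integer multiple of Ib,
    clamped to the valid range 0..RADIX-1.
    """
    level = math.floor(current_ua / IB_UA + 0.5)
    return max(0, min(RADIX - 1, level))


def restore(current_ua):
    """Level restoration: replace a deviated current by its nominal value."""
    return logic_level(current_ua) * IB_UA


if __name__ == "__main__":
    for current in (4.1, 7.0, 8.0, 11.9, 14.0):
        print(current, logic_level(current), restore(current))
```

Restoring the current after every few stages keeps the accumulated deviation inside the assumed margin, which is the purpose of level restoration as described in the text.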

II. V gate Implementation
The V gate can be considered an overflow detector: it gives an indication (output at maximum) when the sum of the inputs is higher than the maximum limit. The mathematical model of this gate is given in Eq. (1), and its circuit diagram is shown in Figure 3 [10]. As shown in Figure 3, its total cost is 5 transistors, and the maximum delay between inputs and output is 2d, where d is the delay of one transistor. Since d depends on the fabrication technology, it is not calculated in this paper; all delays are expressed in terms of d.
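The behaviour of the V gate described above can be sketched as follows. The sketch assumes that the output is 0 whenever there is no overflow, which the text does not state explicitly; the function name v_gate is hypothetical.

```python
MAX_LEVEL = 3  # maximum quaternary value (r - 1 with r = 4)


def v_gate(*inputs):
    """Overflow detector: output at maximum when the input sum exceeds 3.

    ASSUMPTION: the output is 0 when the sum does not overflow.
    """
    return MAX_LEVEL if sum(inputs) > MAX_LEVEL else 0


if __name__ == "__main__":
    # In an adder, an overflow of a + b + carry_in means a carry out.
    for a, b, c in ((1, 2, 0), (1, 2, 1), (3, 3, 0)):
        print(a, b, c, v_gate(a, b, c))
```

In current-mode logic the sum of the inputs is obtained by joining wires, so a detector of this kind only needs to compare one summed current against the maximum level.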

III. Vs+i Gate implementation
This gate is just a wire connection, as shown in Figure 4; therefore, its cost is zero transistors and its delay is 0d.

IV. Current mirror implementation
The current mirror (CM) is also implemented with MOSFET transistors, as shown in Figure 5. It consists of two transistors. The first, M1, operates in the saturation region because its VDS is greater than or equal to its VGS; the second, M2, also operates in the saturation region when the output voltage is larger than its saturation voltage. In this simple configuration, the output current IOUT is related to IIN [14]. The total cost of this gate is (n+1) transistors, where n is the number of outputs, and the maximum delay between input and output is 2d.

V. Simple switch and SEL implementation
The proposed switch requires two complementary transistors, as shown in Figure 6. The total cost of the switch is two transistors, and the maximum delay between inputs and output is 1d. The SEL, or binary selector, is a set of switches connected in the same way as a binary selector [14,15]. The 4-input SEL used in Figure 1 costs 12 transistors and has a maximum delay of 2d between inputs and output, while the 2-input SEL consists of 4 transistors and has a maximum delay of 1d.
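One arrangement that reproduces the stated SEL figures is a tree of 2-input selectors. The Python sketch below assumes this tree structure, which the text does not spell out; the function names sel2 and sel4 are hypothetical.

```python
SWITCH_COST = 2    # transistors per simple switch, from the text
SWITCH_DELAY = 1   # delay of one switch, in units of d


def sel2(select, x0, x1):
    """2-input selector: two switches, one of which conducts."""
    return x1 if select else x0


def sel4(s1, s0, x0, x1, x2, x3):
    """4-input selector, ASSUMED to be a two-level tree of 2-input selectors."""
    low = sel2(s0, x0, x1)
    high = sel2(s0, x2, x3)
    return sel2(s1, low, high)


SEL2_COST = 2 * SWITCH_COST     # two switches
SEL2_DELAY = SWITCH_DELAY       # one switch on the signal path
SEL4_COST = 3 * SEL2_COST       # three 2-input selectors in the tree
SEL4_DELAY = 2 * SEL2_DELAY     # two selector levels on the signal path

if __name__ == "__main__":
    print(SEL2_COST, SEL2_DELAY, SEL4_COST, SEL4_DELAY)
    print(sel4(1, 0, 10, 11, 12, 13))
```

Under this assumption the counts agree with the text: 4 transistors and 1d for the 2-input SEL, 12 transistors and 2d for the 4-input SEL.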

VI. D3 Gate implementation
This gate is considered a complex gate because it consists of four different parts, as shown in Figure 7. The first part is a CM, the second is a V gate, the third is a switch, and the fourth is a V3 gate. The V3 gate performs the same operation as the V gate, but it gives an indication (output at maximum) when the input is higher than the value 2. The mathematical expression of this gate is as follows: The final expression of the D3 gate is shown in Eq. (3) below. The total cost of this gate is 15 transistors, and its maximum delay between inputs and output is 4d.

System Evaluation
In this section, cost is counted in transistors and delay in units of the single-transistor delay d, and the proposed system is compared with other systems on this basis, because practical cost and area depend on the implementation technology. As shown in Table 3, the proposed model of Figure 1 has a total cost of 82 transistors, a maximum delay of 6d at C1, and a delay of 4d from C0 to C1. The adder circuit requires an additional 4 CMs at the inputs, which add 28 transistors and 2d of delay. As a result, the adder circuit requires 110 transistors, the maximum delay at C1 is 8d, and the delay from C0 to C1 is 4d; that is, it has 4d delay and 4d latency (8d - 4d). The circuit of Figure 2 requires 220 transistors, and its maximum delay at C2 is 12d (8d + 4d). Therefore, an 8-qbit adder (equivalent to a 16-bit adder) requires 440 transistors, and its maximum delay at C4 is 20d. In MOSFET technology [16], each binary adder circuit consists of 14 transistors, with a maximum delay of 3d and a carry (Ci) delay of 2d; a 16-bit adder therefore consists of 224 transistors and has a total delay of 33d. Comparing the proposed model with this binary design gives a total speed ratio of 165% and a total cost ratio of 196%. The most interesting result of the proposed model is the reduction in total delay, which becomes only 20d instead of 33d.
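The figures in this section can be checked with a short Python calculation. The sketch below assumes that each 2-qbit block costs 110 transistors and adds 4d on the carry path after an initial 4d latency, and that the 33d of the binary adder arises from 15 carry stages of 2d followed by a final stage of 3d; the latter is one reading that reproduces the stated total. All function names are hypothetical.

```python
def proposed_cost(n_qbits):
    """Transistor count: 110 transistors per 2-qbit block."""
    return 110 * (n_qbits // 2)


def proposed_delay(n_qbits):
    """Delay in units of d: 4d per 2-qbit block plus 4d latency."""
    return 4 * (n_qbits // 2) + 4


def binary_cost(n_bits):
    """Transistor count of the binary ripple adder: 14 per full adder."""
    return 14 * n_bits


def binary_delay(n_bits):
    """ASSUMPTION: 2d per carry stage for n-1 stages, 3d for the last stage."""
    return 2 * (n_bits - 1) + 3


if __name__ == "__main__":
    n_qbits, n_bits = 8, 16   # an 8-qbit adder is equivalent to a 16-bit adder
    speed_ratio = binary_delay(n_bits) / proposed_delay(n_qbits)
    cost_ratio = proposed_cost(n_qbits) / binary_cost(n_bits)
    print(proposed_cost(n_qbits), proposed_delay(n_qbits))
    print(binary_cost(n_bits), binary_delay(n_bits))
    print(f"speed ratio {speed_ratio:.2f}, cost ratio {cost_ratio:.2f}")
```

The same functions give 8d for a single 2-qbit block and 12d for the 4-qbit circuit of Figure 2, matching the intermediate values quoted above.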

Conclusion
In this paper, an MVL multi-qbit adder is proposed that does not require binary-to-quaternary conversion or vice versa. The most interesting advantage of implementing a multi-qbit adder instead of a 1-qbit adder is the reduction in the total delay of the ripple carry. Compared with other proposed methods, the total number of transistors is increased while the total delay is decreased. With the huge advances in VLSI technology, the number of transistors needed to implement a 16, 32 or 64 bit adder will not affect the area or cost of the CPU, while reducing the total delay of the addition operation will have a good effect on the speed of the CPU.