CS 461/561 Computer Architecture

December 03, 2022 Essayheroes

Introduction

A complex number consists of a real and an imaginary component and is usually written in the form a + bi, where a and b are either integer or floating-point values and i (the imaginary value) = √(-1). Sometimes in engineering, the letter j is used in place of i because i is used for other values.

Multiplying two complex numbers is done by applying the FOIL (Firsts, Outers, Inners and Lasts) method, similar to that of binomial multiplication. For example, multiplying (a + bi)(c + di) is accomplished as follows:

Firsts: a * c

Outers: a * di

Inners: bi * c

Lasts: bi * di

This produces (a+bi)(c+di) = ac + adi + bci + bdi². Because i² = -1, the terms combine into a product of the same form as the operands: (ac - bd) + (ad + bc)i.

An example using actual values: (2.5 + 3i)(4.0 + 2i)

Firsts: 2.5 * 4.0

Outers: 2.5 * 2i

Inners: 3i * 4.0

Lasts: 3i * 2i

This produces 10 + 5i + 12i + 6i² = 10 + 17i + 6(-1) = 4 + 17i.
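
The FOIL steps above can be checked with a short sketch. It is written in Python purely for illustration; the function name foil_multiply and the (real, imaginary) tuple representation are assumptions and not part of the assignment.

```python
def foil_multiply(a, b, c, d):
    """Multiply (a + bi)(c + di) with FOIL and return (real, imaginary)."""
    firsts = a * c   # real term
    outers = a * d   # coefficient of i
    inners = b * c   # coefficient of i
    lasts = b * d    # coefficient of i^2, which contributes -(b * d)
    real = firsts - lasts
    imag = outers + inners
    return real, imag


product = foil_multiply(2.5, 3.0, 4.0, 2.0)
print(product)  # (4.0, 17.0)
```

The result matches the worked example: 4 + 17i.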

Some contemporary programming languages natively support complex numbers (Python, MATLAB). Some older languages added support in later revisions (C in the C99 standard; FORTRAN by the FORTRAN 66 standard). Other programming languages have no native support for complex numbers.
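
As an illustration of native support, Python's built-in complex type can reproduce the earlier product directly. This is a minimal sketch; the variable names are arbitrary, and note that Python writes the imaginary unit as j rather than i.

```python
z1 = complex(2.5, 3.0)  # 2.5 + 3i
z2 = complex(4.0, 2.0)  # 4.0 + 2i
z = z1 * z2             # the language applies the multiplication rule itself

print(z.real, z.imag)   # 4.0 17.0
```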

Assignment Definition

Consider the following high-level language code which multiplies two vectors that contain single-precision complex numbers:

Values a, b and c are vectors; _re is the real component element and _im is the imaginary component element in each vector.
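
The loop listing itself does not appear in this text. A minimal sketch of an element-wise complex multiply over separate real and imaginary arrays might look like the following. The array names (a_re, a_im, b_re, b_im, c_re, c_im) are assumed from the _re and _im suffixes described above, the function name is hypothetical, and the vector length is taken from the input because the assignment's actual length is not given here.

```python
def complex_vector_multiply(a_re, a_im, b_re, b_im):
    """Return (c_re, c_im) where c[i] = a[i] * b[i] for complex elements."""
    n = len(a_re)  # assumed: all four input arrays have the same length
    c_re = [0.0] * n
    c_im = [0.0] * n
    for i in range(n):
        # real part: Firsts minus Lasts (because i^2 = -1)
        c_re[i] = a_re[i] * b_re[i] - a_im[i] * b_im[i]
        # imaginary part: Outers plus Inners
        c_im[i] = a_re[i] * b_im[i] + a_im[i] * b_re[i]
    return c_re, c_im


c_re, c_im = complex_vector_multiply([2.5, 1.0], [3.0, 0.0], [4.0, 0.0], [2.0, 1.0])
print(c_re, c_im)  # [4.0, 0.0] [17.0, 1.0]
```

Each iteration performs four multiplies, one subtract, and one add, reading four input elements and writing two output elements.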

Convert this loop into pseudo RV64V assembly code using strip mining, assuming the following architectural features:

Register s0 = loop counter & array index [i]

Vector registers: v0 – v31

MVL (maximum vector length) = 64

Instructions:

vld (vector load)

vst (vector store)

vadd (vector add)

vsub (vector subtract)

vmul (vector multiply)

bne (branch if not equal)*

blt (branch if less than)*

j (unconditional jump)*

addi (integer add immediate)*

ori (logical or immediate)*

Note: instructions marked with an asterisk are used only for setting the initial index value, incrementing it, and loop control.
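
Strip mining splits a long vector into pieces no longer than MVL. The sketch below models only the index arithmetic, not the assembly itself. The function name strip_mine and the length n = 150 are hypothetical; MVL = 64 comes from the assignment. It follows the common convention of handling the odd-sized remainder first and full-length strips afterwards.

```python
MVL = 64  # maximum vector length, from the assignment


def strip_mine(n, mvl=MVL):
    """Return a list of (start_index, strip_length) pairs covering 0..n-1."""
    strips = []
    start = 0
    length = n % mvl or mvl  # remainder first; a full strip if n divides evenly
    while start < n:
        strips.append((start, length))
        start += length
        length = mvl         # every later strip is full length
    return strips


print(strip_mine(150))  # [(0, 22), (22, 64), (86, 64)]
```

Each (start, length) pair would correspond to one pass through the vector load, multiply, add/subtract, and store instructions, with s0 advanced by the strip length between passes.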

Assume the vector processor implements chaining, has two lanes, and has a single vector load/store unit. Using the pseudo assembly code from question 1 (the strip-mined loop), show how convoys would be constructed to execute in the vector pipeline. How many chimes are required to execute the convoys?
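
One way to reason about convoys is a greedy, in-order packing: an instruction joins the current convoy unless the functional unit it needs is already taken, in which case a new convoy starts. The sketch below applies that rule to a hypothetical instruction order for one strip. The register assignments, the instruction order, and the assumption of exactly one load/store unit, one multiply unit, and one add/subtract unit are illustrative; with chaining, dependent instructions are allowed to share a convoy.

```python
# Functional unit used by each vector instruction (assumed: one unit of each kind).
UNIT = {"vld": "load/store", "vst": "load/store",
        "vmul": "multiply", "vadd": "add/sub", "vsub": "add/sub"}

# Hypothetical vector instructions for one strip of the complex multiply.
PROGRAM = [
    "vld v0, a_re",
    "vld v1, b_re",
    "vmul v2, v0, v1",   # a_re * b_re
    "vld v3, a_im",
    "vld v4, b_im",
    "vmul v5, v3, v4",   # a_im * b_im
    "vsub v6, v2, v5",   # real part
    "vst v6, c_re",
    "vmul v7, v0, v4",   # a_re * b_im
    "vmul v8, v3, v1",   # a_im * b_re
    "vadd v9, v7, v8",   # imaginary part
    "vst v9, c_im",
]


def build_convoys(program):
    """Greedy in-order packing: start a new convoy on a structural hazard."""
    convoys = []
    busy = set()
    for instr in program:
        unit = UNIT[instr.split()[0]]
        if not convoys or unit in busy:
            convoys.append([])
            busy = set()
        convoys[-1].append(instr)
        busy.add(unit)
    return convoys


convoys = build_convoys(PROGRAM)
for number, convoy in enumerate(convoys, start=1):
    print(number, convoy)
print("chimes:", len(convoys))  # chimes: 6
```

Under these assumptions the six memory operations contend for the single load/store unit, so the schedule cannot use fewer than six convoys. The number of lanes changes how long each chime lasts, not how many chimes there are. A different instruction order may group the instructions differently.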

Assume the functional units in the vector processor have the following startup overheads: load/store unit, 12 cycles; multiply unit, 7 cycles; add/subtract unit, 6 cycles. How many clock cycles are required for each iteration of the loop, including startup overhead?
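
A rough cycle estimate can be sketched as chime time plus startup time. The accounting below is one simplified convention and rests on several assumptions: a six-chime schedule, a per-strip mix of six memory operations, four multiplies, and two add/subtract operations, each chime lasting MVL divided by the number of lanes, and every instruction's startup overhead simply added to the total. Other conventions, for example ones that overlap startup within a chained convoy, give different totals.

```python
import math

MVL = 64    # maximum vector length, from the assignment
LANES = 2   # number of lanes, from the assignment
STARTUP = {"load/store": 12, "multiply": 7, "add/sub": 6}  # cycles, from the assignment

# Assumed per-strip instruction mix: 4 loads + 2 stores, 4 multiplies, 1 add + 1 subtract.
INSTRUCTION_COUNT = {"load/store": 6, "multiply": 4, "add/sub": 2}
CHIMES = 6  # assumed convoy count; depends on the schedule chosen


def cycles_per_iteration(chimes, counts, mvl=MVL, lanes=LANES, startup=STARTUP):
    """Chime cycles plus the summed startup overhead of every instruction."""
    chime_cycles = math.ceil(mvl / lanes)  # elements are spread across the lanes
    startup_cycles = sum(startup[unit] * count for unit, count in counts.items())
    return chimes * chime_cycles + startup_cycles


print(cycles_per_iteration(CHIMES, INSTRUCTION_COUNT))  # 304
```

Here 6 chimes of 32 cycles give 192 cycles, and the startup terms add 72 + 28 + 12 = 112 cycles.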

How many iterations are required to complete processing the vectors?
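
The iteration count is the vector length divided by MVL, rounded up. The vector length is not stated in this text, so the values of n below are hypothetical, as is the function name strip_count.

```python
def strip_count(n, mvl=64):
    """Number of strip-mined iterations needed for n elements."""
    return -(-n // mvl)  # ceiling division using integer arithmetic


print(strip_count(150))  # 3
print(strip_count(128))  # 2
```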

Instruction Formats

vld (vector load): vld v_D, vec_ref

vst (vector store): vst v_D, vec_ref

vadd (vector add): vadd v_D, v_S1, v_S2

vsub (vector subtract): vsub v_D, v_S1, v_S2

vmul (vector multiply): vmul v_D, v_S1, v_S2

bne (branch if not equal): bne x₁, x₂, target_label

blt (branch if less than): blt x₁, x₂, target_label

j (unconditional jump): j target_label

addi (integer add immediate): addi x_D, x_S1, const

ori (logical or immediate): ori x_D, x_S1, const

Format Definitions

v_D = destination vector register

v_S1 = first source vector register

v_S2 = second source vector register

vec_ref = vector reference (name)

x₁ = first general purpose register for comparison

x₂ = second general purpose register for comparison

x_S1 = first source general purpose register

x_S2 = second source general purpose register

target_label = label of the target instruction for branch

const = an integer constant
