

| <b>Course code: Course Title</b>             | <b>Course Structure</b> |          |          | <b>Pre-Requisite</b>         |
|----------------------------------------------|-------------------------|----------|----------|------------------------------|
| <b>SE310: Parallel Computer Architecture</b> | <b>L</b>                | <b>T</b> | <b>P</b> | <b>Computer Architecture</b> |
|                                              | <b>3</b>                | <b>1</b> | <b>0</b> |                              |

**Course Objective:** To introduce the fundamentals of parallel, pipelined, and superscalar architectures.

| <b>S.No.</b> | <b>Course Outcomes (CO)</b>                                                                                                          |
|--------------|--------------------------------------------------------------------------------------------------------------------------------------|
| <b>CO1</b>   | Understand the fundamentals of parallel computing, architectural classifications, and performance evaluation techniques.             |
| <b>CO2</b>   | Apply multi-core programming techniques, optimization strategies, and parallel processing libraries.                                 |
| <b>CO3</b>   | Analyze and understand multi-threaded architectures, cache coherence mechanisms, and memory consistency models.                      |
| <b>CO4</b>   | Understand and analyze compiler optimization and operating system issues for multiprocessing, and approaches to resolving them.      |
| <b>CO5</b>   | Analyze and implement parallel computing techniques in real-world applications.                                                      |

| <b>S.No.</b>  | <b>Contents</b>                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                 | <b>Contact Hours</b> |
|---------------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|----------------------|
| <b>UNIT 1</b> | <b>Introduction:</b> Introduction to parallel computing, need for parallel computing, parallel architectural classification schemes (Flynn's and Feng's classifications), performance of parallel processors, distributed processing, processor and memory hierarchy, bus, cache and shared memory, introduction to superscalar architectures, quantitative evaluation of performance gain using memory, cache hits/misses. | <b>6</b> |
| <b>UNIT 2</b> | <b>Multi-core Architectures:</b> Introduction to multi-core architectures, issues involved in writing code for multi-core architectures, development of programs for these architectures, program optimization techniques, building some of these techniques into compilers, OpenMP, message-passing libraries, threads, mutexes, etc. | <b>6</b> |
| <b>UNIT 3</b> | <b>Multi-threaded Architectures:</b> Parallel computers, instruction-level parallelism (ILP) vs. thread-level parallelism (TLP); Performance issues: brief introduction to cache hierarchy and communication latency; Shared memory multiprocessors: general architectures and the problem of cache coherence; Synchronization primitives: atomic primitives; locks: TTS, ticket, array; barriers: central and tree; performance implications in shared memory programs; Chip multiprocessors: why CMP (Moore's law, wire delay); shared L2 vs. tiled CMP; core complexity; power/performance; Snoopy coherence: invalidate vs. update, MSI, MESI, MOESI, MOSI; performance trade-offs; pipelined snoopy bus design; Memory consistency models: SC, PC, TSO, PSO, WO/WC, RC; Chip multiprocessor case studies: Intel Montecito and dual-core Pentium 4, IBM POWER4, Sun Niagara. | <b>10</b> |
| <b>UNIT 4</b> | <b>Compiler Optimization Issues:</b> Introduction to optimization, overview of parallelization; Shared memory programming, introduction to OpenMP; Dataflow analysis, pointer analysis, alias analysis; Data dependence analysis, solving data dependence equations (integer linear programming problem); Loop optimizations; Memory hierarchy issues in code optimization. | <b>8</b> |
| <b>UNIT 5</b> | <b>Operating System Issues:</b> Operating system issues for multiprocessor scheduling, need for a pre-emptive OS; Scheduling techniques: usual OS scheduling techniques, threads, distributed scheduler, multiprocessor scheduling, gang scheduling; Communication between processes: message boxes, shared memory; Sharing issues and synchronization: sharing memory and other structures, sharing I/O devices, distributed semaphores, monitors, spin-locks; Implementation techniques on multi-cores; OpenMP, MPI, and case studies. | <b>8</b> |
| <b>UNIT 6</b> | <b>Applications:</b> Case studies from applications: digital signal processing, image processing, speech processing. | <b>4</b> |
|               | <b>TOTAL</b> | <b>42</b> |

**References**

| <b>S.No.</b> | <b>Name of Books/Authors/Publishers</b> | <b>Year of Publication / Reprint</b> |
|--------------|-----------------------------------------|--------------------------------------|
| <b>1.</b> | Kai Hwang, Naresh Jotwani, "Advanced Computer Architecture: Parallelism, Scalability, Programmability", TMH, 3<sup>rd</sup> Edition. | <b>2003</b> |
| <b>2.</b> | John P. Hayes, "Computer Architecture and Organization", McGraw Hill, 3<sup>rd</sup> Edition. | <b>2017</b> |
| <b>3.</b> | Michael J. Flynn, "Computer Architecture: Pipelined and Parallel Processor Design", Narosa Publishing. | <b>1998</b> |
| <b>4.</b> | John L. Hennessy, David A. Patterson, "Computer Architecture: A Quantitative Approach", Morgan Kaufmann, 6<sup>th</sup> Edition. | <b>2017</b> |
| <b>5.</b> | Kai Hwang, Faye A. Briggs, "Computer Architecture and Parallel Processing", MGH. | <b>2000</b> |