This repository contains implementations of various parallel algorithms in CUDA.
Each folder has its own README.md describing the programs it contains.
- A0
  - Sum of two large arrays (with time metrics)
  - Vector reduction
- A1
  - Vector addition using Thrust
  - Image blur
- A2
  - Convert RGB image to grayscale
  - Matrix multiplication
  - Tiled matrix multiplication
- A3
  - Histogram for ASCII characters and numbers
  - Histogram sort using Thrust
  - Image convolution
  - Shared memory tiling with a 7-point stencil
- A4
  - Parallel BFS
    - Vertex-parallel method
    - Edge-parallel method
    - Work-efficient method