Spiral's Machine Learning Library (Python; updated May 29, 2024)
CUDA C++ Core Libraries
A performance-oriented prototyping harness for state-of-the-art molecular dynamics algorithms
A general-purpose parallel and heterogeneous task programming system
Fast parallel visualization of Julia sets with CUDA and OpenMP
TinyChatEngine: On-Device LLM Inference Library
Safe Rust wrapper around the CUDA toolkit
A fast and scalable CUDA implementation for running highly parallelized evolutionary tests on large-scale genomic data.
AI, IoT and Robotics Hardware + ROS
CUDA from zero to hero: accelerating maths and machine learning on the GPU.
Accelerate and optimize existing C/C++ CPU-only applications using the most essential CUDA tools and techniques.
CUDA programming projects in C++, written to learn CUDA from scratch.
GPU-accelerated video stabilizer using CUDA and OpenCL
My research, playground, and techniques for parallel programming
🎉 CUDA notes / hand-written CUDA kernels for large models / C++ notes, updated irregularly: flash_attn, sgemm, sgemv, warp reduce, block reduce, dot product, elementwise, softmax, layernorm, rmsnorm, hist, etc.
Python implementation of Yet Another Scattering Framework
μ-Cuda: cover the last mile of CUDA. Features: IntelliSense-friendly, structured launch, automatic CUDA graph generation and updating.
A place where I learn about CUDA