Recreating PyTorch from scratch (C/C++, CUDA and Python, with GPU support and automatic differentiation!)
Updated May 16, 2024 - Python
CUDA® is a parallel computing platform and programming model developed by NVIDIA for general computing on graphics processing units (GPUs). With CUDA, developers can dramatically speed up computing applications by harnessing the power of GPUs.
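As a minimal sketch of the programming model described above (not code taken from any of the repositories listed here), the kernel below adds two vectors on the GPU; the `vector_add` name and the 256-thread block size are illustrative choices.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Each thread adds one element of a and b into out.
__global__ void vector_add(const float *a, const float *b, float *out, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        out[i] = a[i] + b[i];
    }
}

int main() {
    const int n = 1 << 20;
    const size_t bytes = n * sizeof(float);

    // Allocate and fill host buffers.
    float *h_a = (float *)malloc(bytes);
    float *h_b = (float *)malloc(bytes);
    float *h_out = (float *)malloc(bytes);
    for (int i = 0; i < n; ++i) { h_a[i] = 1.0f; h_b[i] = 2.0f; }

    // Allocate device buffers and copy the inputs to the GPU.
    float *d_a, *d_b, *d_out;
    cudaMalloc(&d_a, bytes);
    cudaMalloc(&d_b, bytes);
    cudaMalloc(&d_out, bytes);
    cudaMemcpy(d_a, h_a, bytes, cudaMemcpyHostToDevice);
    cudaMemcpy(d_b, h_b, bytes, cudaMemcpyHostToDevice);

    // Launch enough 256-thread blocks to cover all n elements.
    const int threads = 256;
    const int blocks = (n + threads - 1) / threads;
    vector_add<<<blocks, threads>>>(d_a, d_b, d_out, n);

    // Copy the result back and spot-check one element.
    cudaMemcpy(h_out, d_out, bytes, cudaMemcpyDeviceToHost);
    printf("out[0] = %f\n", h_out[0]);  // expected 3.0

    cudaFree(d_a); cudaFree(d_b); cudaFree(d_out);
    free(h_a); free(h_b); free(h_out);
    return 0;
}
```

A file like this is compiled with nvcc, the compiler shipped with the CUDA Toolkit.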
PyTorch/TorchScript/FX compiler for NVIDIA GPUs using TensorRT
PyTorch domain library for recommendation systems
The open-source serverless GPU container runtime.
An efficient C++17 GPU numerical computing library with Python-like syntax
A high-throughput and memory-efficient inference and serving engine for LLMs
A fast, scalable, high-performance gradient boosting on decision trees library, used for ranking, classification, regression, and other machine learning tasks in Python, R, Java, and C++. Supports computation on CPU and GPU.
Main repository for QMCPACK, an open-source production level many-body ab initio Quantum Monte Carlo code for computing the electronic structure of atoms, molecules, and solids with full performance portable GPU support
Deep tech research and development
A retargetable MLIR-based machine learning compiler and runtime toolkit.
A model-independent chemistry module for atmosphere models
A library for accelerating Transformer models on NVIDIA GPUs, including support for 8-bit floating point (FP8) precision on Hopper and Ada GPUs, providing better performance with lower memory utilization in both training and inference.
An animal can do training and inference every day of its existence until the day of its death. A forward pass is all you need.
Deep Learning Framework Written in Rust
A high-performance inference system for large language models, designed for production environments.
Created by NVIDIA - Released June 23, 2007