PyTorch domain library for recommendation systems
Updated Jun 11, 2024 - Python
CUDA® is a parallel computing platform and programming model developed by NVIDIA for general computing on graphics processing units (GPUs). With CUDA, developers can dramatically speed up computing applications by harnessing the power of GPUs.
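As a minimal sketch of that programming model, the classic example is a vector-add kernel: the work is split across thousands of GPU threads, each handling one element. This assumes an NVIDIA GPU and the CUDA toolkit (`nvcc`); the kernel name and sizes are illustrative, not from any repository listed here.

```cuda
#include <cstdio>

// Hypothetical kernel: each GPU thread adds one pair of elements.
__global__ void vecAdd(const float *a, const float *b, float *c, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) c[i] = a[i] + b[i];
}

int main() {
    const int n = 1 << 20;               // 1M elements
    size_t bytes = n * sizeof(float);
    float *a, *b, *c;
    // Unified memory keeps the sketch short; production code often
    // uses explicit cudaMalloc/cudaMemcpy instead.
    cudaMallocManaged(&a, bytes);
    cudaMallocManaged(&b, bytes);
    cudaMallocManaged(&c, bytes);
    for (int i = 0; i < n; ++i) { a[i] = 1.0f; b[i] = 2.0f; }

    int threads = 256;
    int blocks = (n + threads - 1) / threads;  // one thread per element
    vecAdd<<<blocks, threads>>>(a, b, c, n);
    cudaDeviceSynchronize();               // wait for the kernel to finish

    printf("c[0] = %f\n", c[0]);
    cudaFree(a); cudaFree(b); cudaFree(c);
    return 0;
}
```

Compile with `nvcc vecadd.cu -o vecadd`. The `<<<blocks, threads>>>` launch configuration is what maps the loop onto the GPU's parallel hardware.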
Main repository for QMCPACK, an open-source, production-level many-body ab initio Quantum Monte Carlo code for computing the electronic structure of atoms, molecules, and solids, with fully performance-portable GPU support
PyTorch/TorchScript/FX compiler for NVIDIA GPUs using TensorRT
NVIDIA GPU Operator creates, configures, and manages GPUs on top of Kubernetes
A model-independent chemistry module for atmosphere models
Open Voice OS Status Page
FlashInfer: Kernel Library for LLM Serving
A high-throughput and memory-efficient inference and serving engine for LLMs
A library for accelerating Transformer models on NVIDIA GPUs, including support for 8-bit floating point (FP8) precision on Hopper and Ada GPUs, providing better performance with lower memory utilization in both training and inference.
(in progress) Implementation of a parallel construction algorithm for SAH kd-trees
Containers for machine learning
The PennyLane-Lightning plugin provides a fast state-vector simulator written in C++ for use with PennyLane
The open-source serverless GPU container runtime.
Implementations of various simulations of integrate-and-fire models, as well as conductance-based models with synaptic neurotransmission
A retargetable MLIR-based machine learning compiler and runtime toolkit.
Created by NVIDIA
Released June 23, 2007