Recreating PyTorch from scratch (C/C++, CUDA and Python, with GPU support and automatic differentiation!)
Updated May 16, 2024 - Python
CUDA® is a parallel computing platform and programming model developed by NVIDIA for general computing on graphics processing units (GPUs). With CUDA, developers can dramatically speed up computing applications by harnessing the power of GPUs.
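As a minimal sketch of the programming model described above (not code taken from any of the repositories listed here), the kernel below adds two vectors on the GPU; the `vector_add` name and the 256-thread block size are illustrative choices.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Each thread adds one element of a and b into out.
__global__ void vector_add(const float *a, const float *b, float *out, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        out[i] = a[i] + b[i];
    }
}

int main() {
    const int n = 1 << 20;
    const size_t bytes = n * sizeof(float);

    // Allocate and fill host buffers.
    float *h_a = (float *)malloc(bytes);
    float *h_b = (float *)malloc(bytes);
    float *h_out = (float *)malloc(bytes);
    for (int i = 0; i < n; ++i) { h_a[i] = 1.0f; h_b[i] = 2.0f; }

    // Allocate device buffers and copy the inputs to the GPU.
    float *d_a, *d_b, *d_out;
    cudaMalloc(&d_a, bytes);
    cudaMalloc(&d_b, bytes);
    cudaMalloc(&d_out, bytes);
    cudaMemcpy(d_a, h_a, bytes, cudaMemcpyHostToDevice);
    cudaMemcpy(d_b, h_b, bytes, cudaMemcpyHostToDevice);

    // Launch enough 256-thread blocks to cover all n elements.
    const int threads = 256;
    const int blocks = (n + threads - 1) / threads;
    vector_add<<<blocks, threads>>>(d_a, d_b, d_out, n);

    // Copy the result back and spot-check one element.
    cudaMemcpy(h_out, d_out, bytes, cudaMemcpyDeviceToHost);
    printf("out[0] = %f\n", h_out[0]);  // expected 3.0

    cudaFree(d_a); cudaFree(d_b); cudaFree(d_out);
    free(h_a); free(h_b); free(h_out);
    return 0;
}
```

A file like this is compiled with nvcc, the compiler shipped with the CUDA Toolkit.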
PyTorch/TorchScript/FX compiler for NVIDIA GPUs using TensorRT
PyTorch domain library for recommendation systems
The open-source serverless GPU container runtime.
An efficient C++17 GPU numerical computing library with Python-like syntax
A high-throughput and memory-efficient inference and serving engine for LLMs
A fast, scalable, high-performance gradient boosting on decision trees library, used for ranking, classification, regression, and other machine learning tasks in Python, R, Java, and C++. Supports computation on CPU and GPU.
Main repository for QMCPACK, an open-source production level many-body ab initio Quantum Monte Carlo code for computing the electronic structure of atoms, molecules, and solids with full performance portable GPU support
Deep tech research and development
A retargetable MLIR-based machine learning compiler and runtime toolkit.
A model-independent chemistry module for atmosphere models
A library for accelerating Transformer models on NVIDIA GPUs, including support for 8-bit floating point (FP8) precision on Hopper and Ada GPUs, providing better performance with lower memory utilization in both training and inference.
An animal can do training and inference every day of its existence until the day of its death. A forward pass is all you need.
Deep Learning Framework Written in Rust
A high-performance inference system for large language models, designed for production environments.
Created by NVIDIA - Released June 23, 2007