Pack of LLMs: Model Fusion at Test-Time via Perplexity Optimization
Analysis of token routing for different implementations of Mixture of Experts
Early release of the official implementation for "GraphMETRO: Mitigating Complex Graph Distribution Shifts via Mixture of Aligned Experts"
Gaussian Process-Gated Hierarchical Mixture of Experts
This repository provides instructions to reproduce the results in the paper “Mesh-clustered Gaussian process emulator for partial differential equation boundary value problems” (2024), to appear in Technometrics.
Using CCR to predict piezoresponse force microscopy datasets
Code, data, and pre-trained models for our EMNLP 2021 paper "Think about it! Improving defeasible reasoning by first modeling the question scenario"
Faster alternative to Fast Feedforward Layer that uses angular distance for routing
This is the repo for the MixKABRN Neural Network (Mixture of Kolmogorov-Arnold Bit Retentive Networks), and an attempt at first adapting it for training on text, and later adjusting it for other modalities.
[Preprint] Co-Supervised Learning: Improving Weak-to-Strong Generalization with Hierarchical Mixture of Experts
Code repository for: Nguyen, H., Nguyen, T., Nguyen, K., & Ho, N. (2024). Towards Convergence Rates for Parameter Estimation in Gaussian-gated Mixture of Experts. In Proceedings of the 27th International Conference on Artificial Intelligence and Statistics (AISTATS 2024); acceptance rate 27.6% of 1,980 submissions.
MoE Decoder Transformer implementation with MLX
Implementations of mixture models for different tasks.
Differentially private retriever using transformer memory as a search index for information retrieval
Anomaly Detection by Recombining Gated Unsupervised Experts
This is a prototype of a Mixture-of-Experts LLM built with PyTorch. It is currently in development; I am testing its learning ability with small toy tasks before training it on large language datasets.
This collaborative framework is designed to harness the power of a Mixture of Experts (MoE) to automate a wide range of software engineering tasks, thereby enhancing code quality and expediting development processes.
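Several of the repositories above revolve around expert routing (e.g., the token-routing analysis entry). As background, here is a minimal sketch of top-k token routing in a Mixture-of-Experts layer; it assumes a plain learned linear gate, and the function and parameter names (`route_tokens`, `gate_weight`, `top_k`) are hypothetical, not taken from any listed project.

```python
# Minimal sketch of top-k token routing in a Mixture-of-Experts layer.
# Illustrative only: assumes a simple linear gating projection; names are hypothetical.
import torch
import torch.nn.functional as F

def route_tokens(hidden: torch.Tensor, gate_weight: torch.Tensor, top_k: int = 2):
    """Pick the top-k experts for each token and renormalize their gate weights.

    hidden:      (num_tokens, d_model) token representations
    gate_weight: (num_experts, d_model) gating projection
    returns:     expert indices (num_tokens, top_k), routing weights (num_tokens, top_k)
    """
    logits = hidden @ gate_weight.t()                   # (num_tokens, num_experts)
    topk_logits, topk_idx = logits.topk(top_k, dim=-1)  # choose top-k experts per token
    weights = F.softmax(topk_logits, dim=-1)            # renormalize over the chosen experts
    return topk_idx, weights

# Example: route 4 tokens of dimension 8 over 6 experts.
if __name__ == "__main__":
    torch.manual_seed(0)
    idx, w = route_tokens(torch.randn(4, 8), torch.randn(6, 8), top_k=2)
    print(idx, w, sep="\n")
```

Real implementations differ mainly in the gating function (softmax vs. noisy top-k, angular/cosine distance) and in how load balancing is enforced across experts.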