🔥🔥🔥 A curated list of papers on LLM-based multimodal generation (image, video, 3D and audio).
Stable Diffusion, SDXL, LoRA Training, DreamBooth Training, Automatic1111 Web UI, DeepFake, Deep Fakes, TTS, Animation, Text To Video, Tutorials, Guides, Lectures, Courses, ComfyUI, Google Colab, RunPod, Notebooks, ControlNet, Voice Cloning, AI, AI News, ML, ML News, News, Tech, Tech News, Kohya LoRA, Kandinsky 2, DeepFloyd IF, Midjourney
Text-to-video generator in brainrot form. Learn about any topic from your favorite personalities.
Convert an .srt captions file to American Sign Language (ASL) using public videos
Papers and resources on Controllable Generation using Diffusion Models, including ControlNet, DreamBooth, T2I-Adapter, IP-Adapter.
[CVPR2024 Highlight] VBench - We Evaluate Video Generation
Code repository for T2V-Turbo
Paddle Multimodal Integration and eXploration, supporting mainstream multi-modal tasks, including end-to-end large-scale multi-modal pretrain models and diffusion model toolbox. Equipped with high performance and flexibility.
A curated list of recent diffusion models for video generation, editing, restoration, understanding, etc.
A benchmark for evaluating hallucination of text-to-video models
KandinskyVideo — multilingual end-to-end text2video latent diffusion model
Create a deepfake video by just uploading the original video and specifying the text the character will read
MagicTime: Time-lapse Video Generation Models as Metamorphic Simulators
Diffusion model papers, survey, and taxonomy
[ICLR 2024] Cross-Modal Contextualized Diffusion Models for Text-Guided Visual Generation and Editing
This repository contains hand-curated resources for Prompt Engineering with a focus on Generative Pre-trained Transformer (GPT), ChatGPT, PaLM, etc.
Ground-A-Video: Zero-shot Grounded Video Editing using Text-to-image Diffusion Models (ICLR 2024)
Script that generates TikTok-style videos within a minute using ffmpeg, moviepy, ChatGPT, and the SDXL API
Cassette is designed to create 30-second explanatory videos suitable for Instagram Reels or YouTube Shorts. You could also call it a free Python alternative to Brainrot.js
VideoCrafter2: Overcoming Data Limitations for High-Quality Video Diffusion Models