Issues: mosaicml/streaming
Issues list
clean_stale_shared_memory duplicating the master process when called in a train.py script
bug
Something isn't working
#663
opened Apr 26, 2024 by
antoinedandi
Support large size index.json (20GB +)
enhancement
New feature or request
#662
opened Apr 25, 2024 by
andreamad8
Integer overflow and data corruption (uncompressed mds file size is larger than 2^32)
bug
Something isn't working
#659
opened Apr 22, 2024 by
jarnoseppanen-sc
Does it support Preference data (for training Reward / DPO)?
enhancement
New feature or request
#656
opened Apr 17, 2024 by
ericxsun
Azure Databricks MDS write ops in error: MapInPandas write_mds gives message Spark higher-order functions are not supported in Unity Catalog
bug
Something isn't working
#655
opened Apr 15, 2024 by
wolliq
Out of Memory when using Streaming Dataloader
bug
Something isn't working
#652
opened Apr 13, 2024 by
VikaasVarma
Augment existing dataset
enhancement
New feature or request
#646
opened Apr 3, 2024 by
LWprogramming
GPU utilisation drop between epochs
bug
Something isn't working
#643
opened Mar 29, 2024 by
rishabhm12
Integrating MDS Streaming with HF Dataset Streaming
enhancement
New feature or request
#633
opened Mar 19, 2024 by
siddk
Unexpected mds format data for json encoding / how to encode list of strings
bug
Something isn't working
#613
opened Feb 28, 2024 by
ssharpe42
Add support for registering custom Cloud Uploaders
enhancement
New feature or request
#612
opened Feb 28, 2024 by
JAEarly
On the fly sample filtering and limiting of datasets
enhancement
New feature or request
#578
opened Jan 26, 2024 by
ssharpe42
Making Streaming Dataset framework agnostic: Removing PyTorch dependency
enhancement
New feature or request
#551
opened Dec 26, 2023 by
Abhijit-2592
Distributed Key Value Tensor Store
enhancement
New feature or request
#539
opened Dec 16, 2023 by
OrenLeung
ValueError: cannot reshape array of size 24 into shape (8,newaxis,8) in Dataloader
#535
opened Dec 15, 2023 by
germanjke
Allow Stream's repeat option to cycle through entire dataset before repeating, when shuffle=True
enhancement
New feature or request
#521
opened Dec 6, 2023 by
m-harmonic
option to host files via https/remote being a https url
enhancement
New feature or request
#511
opened Nov 24, 2023 by
felix-red-panda
Is StreamingDataset compatible with Ray distributed training?
#503
opened Nov 7, 2023 by
genesis-jamin
Support for sub-sampling long videos
enhancement
New feature or request
#489
opened Oct 28, 2023 by
con-bren