chores: record requirements that are not yet organized and features that are not yet finished #44

Open
7 tasks done
hunjixin opened this issue Dec 14, 2023 · 4 comments
Labels
question Further information is requested

Comments

@hunjixin
Collaborator

hunjixin commented Dec 14, 2023

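  • Access control (permissions)
  • AK/SK (access key / secret key) support
  • Public and private datasets
  • Email verification
  • Store objects separately per repository, to make future GC easier; this means no longer being compatible with git
  • Collaboration
  • Distributed tracing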
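
A minimal sketch of the per-repository object item above, to make the GC argument concrete. Everything here is an assumption for illustration and not the project's actual design: the key layout `repos/<repo_id>/objects/...`, the use of SHA-256, the `ObjectStore` class, and the in-memory dict standing in for real blob storage. It also assumes repository IDs contain no `/`.

```python
from __future__ import annotations

import hashlib


def object_key(repo_id: str, content: bytes) -> str:
    """Build a repository-scoped, content-addressed key (hypothetical layout)."""
    digest = hashlib.sha256(content).hexdigest()
    return f"repos/{repo_id}/objects/{digest[:2]}/{digest[2:]}"


class ObjectStore:
    """In-memory stand-in for a blob store, used only for illustration."""

    def __init__(self) -> None:
        self._blobs: dict[str, bytes] = {}

    def put(self, repo_id: str, content: bytes) -> str:
        # Identical content in two repositories gets two different keys.
        key = object_key(repo_id, content)
        self._blobs[key] = content
        return key

    def keys(self, repo_id: str) -> set[str]:
        # Only this repository's prefix is scanned.
        prefix = f"repos/{repo_id}/objects/"
        return {k for k in self._blobs if k.startswith(prefix)}

    def gc(self, repo_id: str, reachable: set[str]) -> int:
        """Delete objects of one repository whose keys are not in `reachable`."""
        garbage = self.keys(repo_id) - reachable
        for key in garbage:
            del self._blobs[key]
        return len(garbage)
```

With this layout, GC for one repository never has to ask whether another repository still references an object. The cost is that identical content is stored once per repository, and object IDs no longer match git's.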
hunjixin added the "question" label (Further information is requested) Dec 14, 2023
hunjixin pinned this issue Dec 15, 2023
@hunjixin
Collaborator Author

hunjixin commented Dec 31, 2023

https://www.youtube.com/watch?v=aqMXEvWTuVY

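1. Scale: PB level
2. Structured data: not sure what this part was about, I didn't follow it; it sounded like multi-source data
3. Versioned data
4. Collaboration
5. Multimodal data: nothing much here, we already have this
6. Extra metadata processing for images and text: dimensions and size, language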

@hunjixin
Collaborator Author

File comparison:

  1. File diff
  2. Audio and video: support playback for side-by-side comparison

@hunjixin
Collaborator Author

Support third-party login, with GitHub as the first priority

@taoshengshi
Contributor

  1. GPT-powered Diffs ?
  2. S3 Sync? Create and sync repos directly from an S3 bucket
  3. Efficiently store version history
  4. LLM ETL integration: https://github.com/Unstructured-IO/unstructured
