The open-source tool for building high-quality datasets and computer vision models
Updated May 2, 2024 - Python
Towhee is a framework that is dedicated to making neural data processing pipelines simple and fast.
🔮 Instill Core is an open-source no-/low-code data, model, and pipeline orchestration platform
Handles all kinds of unstructured data, supporting applications such as reverse image search, audio search, molecular search, video analysis, question-answering systems, and NLP.
A curated list of resources on Document Understanding (DU)
Interactively explore unstructured datasets from your dataframe.
Interact, analyze and structure massive text, image, embedding, audio and video datasets
Curate better data for LLMs
NucliaDB, The AI Search database for RAG
A multi-modal vector database that supports upserts and vector queries using unified SQL (MySQL-Compatible) on structured and unstructured data, while meeting the requirements of high concurrency and ultra-low latency.
Embedding Studio is a framework that allows you to transform your Vector Database into a feature-rich Search Engine.
Python implementation of jordansissel's grok regular expression library
Enforce structured output from LLMs 100% of the time
No-code LLM Platform to launch APIs and ETL Pipelines to structure unstructured documents
Home of the AI workforce - Multi-agent system, AI agents & tools
Dynamic Kernel Matching (DKM) for Classifying Data with Non-conforming Features
Programming language for symbolic computation with an unusual combination of pattern matching features: tree patterns, associative patterns, and expressions embedded in patterns.
A Jupyter notebook that uses the Watson Visual Recognition and Natural Language Understanding services to enrich Facebook Analytics and uses Cognos Dashboard Embedded to explore and visualize the results in Watson Studio
Adansons Base is a data programming tool for error analysis of training results. It organizes the metadata of unstructured data and creates and organizes datasets. It makes dataset creation more effective, helps find low-quality data by using the training results, and improves AI performance.
⛅ Versatile Data Pipeline (VDP) console website