unstructured-data

Here are 127 public repositories matching this topic...

voxel51 / fiftyone

The open-source tool for building high-quality datasets and computer vision models

visualization python data-science machine-learning computer-vision deep-learning artificial-intelligence developer-tools image-classification object-detection data-cleaning active-learning data-quality data-curation unstructured-data vector-search data-centric-ai

Updated May 2, 2024
Python

towhee-io / towhee

Towhee is a framework that is dedicated to making neural data processing pipelines simple and fast.

machine-learning computer-vision pipeline image-processing embeddings transformer video-processing feature-extraction convolutional-networks vit feature-vector image-retrieval unstructured-data embedding-vectors milvus vision-transformer towhee llm

Updated Jan 20, 2024
Python

instill-ai / instill-core

🔮 Instill Core is an open-source no-/low-code data, model, and pipeline orchestration platform

python api cli golang open-source typescript ai pipeline etl developer-tools gpt hacktoberfest low-code no-code unstructured-data llm stable-diffusion generative-ai

Updated May 2, 2024
Makefile

milvus-io / bootcamp

Dealing with all unstructured data, such as reverse image search, audio search, molecular search, video analysis, question and answer systems, NLP, etc.

python nlp deep-learning question-answering image-classification image-recognition image-search hacktoberfest unstructured-data audio-search milvus benchmark-testing

Updated May 1, 2024
Python

tstanislawek / awesome-document-understanding

A curated list of resources on the topic of Document Understanding (DU)

Updated Jun 2, 2023

Renumics / spotlight

Interactively explore unstructured datasets from your dataframe.

audio machine-learning video computer-vision timeseries images exploratory-data-analysis data-visualization hacktoberfest meshes data-curation unstructured-data data-centric-ai

Updated Mar 20, 2024
TypeScript

nomic-ai / nomic

Interact, analyze and structure massive text, image, embedding, audio and video datasets

python clustering text embeddings topic-modeling duplicate-detection unstructured-data

Updated Apr 30, 2024
Python

lilacai / lilac

Curate better data for LLMs

artificial-intelligence data-analysis unstructured-data dataset-analysis

Updated Mar 19, 2024
Python

nuclia / nucliadb

NucliaDB, The AI Search database for RAG

python search rust search-engine semantic machine-learning database text-classification semantic-search-engine vectors language-model search-engines unstructured-data mlops vector-search vector-search-engine nuclia ai-powered-search

Updated May 2, 2024
Python

dingodb / dingo

A multi-modal vector database that supports upserts and vector queries using unified SQL (MySQL-Compatible) on structured and unstructured data, while meeting the requirements of high concurrency and ultra-low latency.

structured-data serving unstructured-data unified-sql vector-database mysql-compatibility embedding-search embedding-store key-value-distributed-store vector-ocean real-time-semantic-search

Updated Apr 30, 2024
Java

EulerSearch / embedding_studio

Embedding Studio is a framework that lets you transform your Vector Database into a feature-rich Search Engine.

search-engine embeddings semantic-similarity search-algorithm query-parser fine-tuning unstructured-data vector-database embeddings-similarity unstructured-search llm-inference search-query-parser

Updated Mar 15, 2024
Python

garyelephant / pygrok

Python implementation of jordansissel's grok regular expression library

python extract-information grok parse-strings unstructured-data

Updated Nov 22, 2023
Python

automorphic-ai / trex

Enforce structured output from LLMs 100% of the time

json etl regex unstructured-data large-language-models llm

Updated Sep 14, 2023
Python

Zipstack / unstract

No-code LLM Platform to launch APIs and ETL Pipelines to structure unstructured documents

unstructured-data etl-pipeline llm-platform

Updated May 2, 2024
Python

RelevanceAI / relevanceai

Home of the AI workforce - Multi-agent system, AI agents & tools

python search nlp search-engine natural-language-processing computer-vision clustering embeddings unstructured-data vector-search vector-database

Updated Jan 31, 2024
Python

jostmey / dkm

Dynamic Kernel Matching (DKM) for Classifying Data with Non-conforming Features

machine-learning genomics repertoire unstructured-data nonconforming-data tcell-receptors statistical-classifiers dkm

Updated May 16, 2023
HTML

BartJongejan / Bracmat

Programming language for symbolic computation with unusual combination of pattern matching features: Tree patterns, associative patterns and expressions embedded in patterns.

Updated Feb 8, 2024
C

IBM / pixiedust-facebook-analysis

A Jupyter notebook that uses the Watson Visual Recognition and Natural Language Understanding services to enrich Facebook Analytics and uses Cognos Dashboard Embedded to explore and visualize the results in Watson Studio

data-science watson notebook pandas-dataframe natural-language jupyter-notebook watson-services watson-visual-recognition watson-api ibm-developer-technology-cognitive ibmcode enriched-data watson-natural-language unstructured-data watson-studio watson-apis

Updated Jan 21, 2021
Jupyter Notebook

adansons / base

Adansons Base is a data programming tool for error-analysis of training results. It organizes metadata of unstructured data and creates and organizes datasets. It makes dataset creation more effective and helps to find low-quality data by using the training results and improves AI performance.

machine-learning database artificial-intelligence data-management unstructured-data

Updated Aug 10, 2022
Jupyter Notebook

instill-ai / console

⛅ Versatile Data Pipeline (VDP) console website

console ui computer-vision deep-learning frontend image-classification object-detection structured-data hacktoberfest data-pipeline no-code model-serving vdp unstructured-data data-connector vision-ai versatile-data-pipeline

Updated May 2, 2024
TypeScript

