Intent detection and Slot filling
Updated Jan 7, 2020 - Python
Entity resolution (also known as data matching, data linkage, record linkage, and many other terms) is the task of finding records in a dataset that refer to the same entity across different data sources (e.g., data files, books, websites, and databases). It is necessary when joining data sets based on entities that may or may not share a common identifier (e.g., database key, URI, national identification number), which may be due to differences in record shape, storage location, or curator style or preference.
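As a rough illustration of the definition above, the following minimal Python sketch links records from two sources that share no common identifier by comparing their names with Jaccard similarity over word tokens. The record contents, the helper names (`tokens`, `jaccard`, `match_records`), and the 0.5 threshold are all assumptions made for this example; they are not taken from any repository listed here, and real systems typically add blocking and more robust similarity measures.

```python
def tokens(text):
    """Lowercase a string and split it into a set of alphanumeric word tokens."""
    cleaned = "".join(ch if ch.isalnum() else " " for ch in text.lower())
    return set(cleaned.split())


def jaccard(a, b):
    """Jaccard similarity of two token sets: |a & b| / |a | b|."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def match_records(left, right, threshold=0.5):
    """Compare every left record with every right record (no blocking)
    and return (left_id, right_id, score) for pairs at or above threshold."""
    matches = []
    for lid, lname in left.items():
        for rid, rname in right.items():
            score = jaccard(tokens(lname), tokens(rname))
            if score >= threshold:
                matches.append((lid, rid, score))
    return matches


# Hypothetical records from two sources whose keys are unrelated.
source_a = {"a1": "Acme Corp., New York", "a2": "Globex Inc"}
source_b = {"b7": "ACME Corp New York", "b9": "Initech LLC"}

matches = match_records(source_a, source_b)
print(matches)
```

The all-pairs comparison is quadratic in the number of records, which is why practical pipelines usually restrict candidate pairs first (for example with blocking) before scoring them.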
This project aims to cluster duplicates from the Music_Brainz dataset using a custom K-Means implementation.
Contains Ruby scripts for accessing the Rosette API
Details for reproducing the experiments in our d-blink paper
ProxCluster is a framework for Incremental Entity Resolution that leverages concepts similar to K-Means for clustering duplicates. This work was developed as the final paper for my Bachelor's degree in Computer Science.
Pre-processing script for data from the Survey of Household Income and Wealth
U.S. Hospital and Hotel Recommendation System based on CMS and Kaggle Datasets
TFIDF / KNN based string matching
Person entity identification and matching using face recognition and machine learning algorithms
Super Fast String Matching in Python
The website for FinTech Studios's zentity fork.
Submitted solution for the ACM SIGMOD 2022 Programming Contest 💻🏅
Mock data that is used for unit testing of the Rosette API bindings
Addresses Entity Resolution challenges. Tasks include schema-agnostic blocking, pairwise comparisons, Meta-Blocking graph construction, and Jaccard similarity computation. Deliverables include Python source code, reports, and reproducibility guidelines.
Company Match algorithm with Spark and Python on DataBricks
Created by Halbert L. Dunn
Released 1946