@wbsg-uni-mannheim

Web-based Systems Group @ University of Mannheim

We explore technical and empirical questions concerning the development of global, decentralized information environments.

Pinned

  1. productbert-intermediate Public

    This repository contains code and data download scripts for the paper "Intermediate Training of BERT for Product Matching" by Ralph Peeters, Christian Bizer and Goran Glavaš.

    Python 33 9

  2. wdc-lspc-v2 Public

    This repository contains code and data download scripts for the paper "Using schema.org annotations for training and maintaining product matchers" by Ralph Peeters, Anna Primpeli, Benedikt Wichtlhu…

    Jupyter Notebook 15 2

  3. WDCFramework Public

    Java framework used by the Web Data Commons project to extract Microdata, Microformats, and RDFa data, Web graphs, and HTML tables from the web crawls provided by the Common Crawl Foundation.

    Java 7 1

  4. contrastive-product-matching Public

    This repository contains the code to reproduce the experiments of the poster "Supervised Contrastive Learning for Product Matching".

    Python 36 12

  5. TabAnnGPT Public

    This repository contains the code for the experiments run in the papers "Column Type Annotation using ChatGPT" and "Table Column Annotation using Large Language Models".

    Jupyter Notebook 7 2

  6. MatchGPT Public

    This repository contains code and extensive prompt examples to reproduce and extend the experiments in our papers "Using ChatGPT for Entity Matching" and "Entity Matching using Large Language Models".

    Jupyter Notebook 26 3

Repositories

Showing 10 of 26 repositories
  • TabAnnGPT Public

    This repository contains the code for the experiments run in the papers "Column Type Annotation using ChatGPT" and "Table Column Annotation using Large Language Models".

    Jupyter Notebook 7 2 0 0 Updated May 7, 2024
  • MatchGPT Public

    This repository contains code and extensive prompt examples to reproduce and extend the experiments in our papers "Using ChatGPT for Entity Matching" and "Entity Matching using Large Language Models".

    Jupyter Notebook 26 3 0 0 Updated Apr 24, 2024
  • ExtractGPT Public

    Attribute Value Extraction using Large Language Models

    Python 12 Apache-2.0 3 0 0 Updated Apr 16, 2024
  • Jupyter Notebook 4 1 1 0 Updated Apr 15, 2024
  • wdc-page Public

    This repository contains the source files of the Web Data Commons website and is used to maintain the site. The Web Data Commons project extracts structured data from the Common Crawl.

    HTML 0 1 0 0 Updated Mar 15, 2024
  • wdc-pave Public

    Web Data Commons - Using LLMs for Product Attribute Value Extraction and Normalization

    Python 2 0 0 0 Updated Mar 15, 2024
  • SC-Block Public

    SC-Block is a supervised contrastive blocking method which combines supervised contrastive learning for positioning records in an embedding space and nearest neighbour search for candidate set building.

    Python 5 BSD-3-Clause 2 1 0 Updated Jan 26, 2024
  • wdc-smb Public

    This repository contains the code and data download links to reproduce building the WDC SMB Benchmark.

    0 BSD-3-Clause 0 0 0 Updated Dec 11, 2023
  • pie_chatgpt Public

    Product Information Extraction using ChatGPT

    Jupyter Notebook 2 0 0 0 Updated Oct 4, 2023
  • wdc-lspc-v2 Public

    This repository contains code and data download scripts for the paper "Using schema.org annotations for training and maintaining product matchers" by Ralph Peeters, Anna Primpeli, Benedikt Wichtlhuber and Christian Bizer.

    Jupyter Notebook 15 BSD-3-Clause 2 1 0 Updated Aug 29, 2023
