dell-research-harvard

Popular repositories

  1. AmericanStories (Public)

    The official GitHub repository for the American Stories dataset, as described in {link}

    Python 92 7

  2. linktransformer (Public)

    A convenient way to link, deduplicate, aggregate and cluster data(frames) in Python using deep learning

    Python 79 6

  3. effocr (Public)

    A model(ing framework) for sample-efficient OCR

    Python 38 5

  4. HJDataset (Public)

    A Large Dataset of Historical Japanese Documents with Complex Layouts

    Jupyter Notebook 28 4

  5. NEWS-COPY (Public)

    Noise-robust de-duplication at scale

    Python 15

  6. HomoglyphsCJKTraining (Public)

    Quantifying Character Similarity with Vision Transformers

    Python 5

Repositories

Showing 10 of 28 repositories
