A place to release saved machine learning models for tika-dl
Updated Sep 28, 2018
PHP application to test loading of PDF files, using docker-compose and Apache Tika.
Document management system implemented with microservices
A tool set for indexing and searching through documents
A spatial search website that allows users to search documents from the FBI Vault website. It extracts the most frequently occurring location in each document, loads the geo-tagged data into Apache Solr to index the documents, and visualizes search results using the Google Maps API.
[SLOW][WIP] Broodmother is a high-performance, distributed search engine using Apache Tika, Apache Solr, Akka, Neo4j, and Spring.
Using Apache Lucene, Tika, and Solr
Tika detector for MKV and WebM
This API uses Annif as a local server and includes an NER component. It also includes Tesseract and uses Apache Tika for language detection, with limited multilingual support.
AWS Lambda code to index S3 buckets into Elasticsearch
Run Apache Tika as a service in AWS Lambda by scanning documents in S3 and storing the extracted text back to S3
PDF parsing and extraction utility using Apache Tika
Information retrieval system for indexing and searching files stored on disk, with support for the Romanian language
ApacheDeepLearning101
All my processors (NARs) in one place
Apache Tika integration built in Scala for indexing OneDrive files into Elasticsearch.
🚴♂️⛷ Data Lake, performance tuning for text extraction from a huge number of files.
Apache Tika adapter in Go
Analysis of PixStory social media data combined with Snapchat, COVID-19, and YouTube data. This project uses the Apache Tika Clustering software to cluster certain social media posts together.