ragomusic/crumbs

Crumbs Document Viewer

Cookie logo of crumbs

Info

Downloads

Executables for Windows and Mac, examples, and more: https://github.com/ragomusic/crumbs/releases

What is Crumbs?

Crumbs is a cross-platform application designed to make it easy to navigate and search collections of documents. The files of interest are stored as PDF documents (for ease of viewing), and a text-only version is created for ease of search.

Motivation:

I created the initial version of Crumbs back in 2013 to help navigate the large amount of information in the discrimination and retaliation case my wife has against the University of Chicago Medical Center (Artunduaga v. The University of Chicago Medical Center, No. 1:2012cv08733 https://dockets.justia.com/docket/illinois/ilndce/1:2012cv08733/276044/), scheduled to go to trial on 1/30/2017.

Although there are several commercial applications used mostly by big attorney firms, they were outside of my price range.

By open sourcing this software, I hope to help plaintiffs and their lawyers handle large amounts of data and find the key pieces of evidence that can help their case find its way to trial.

It has worked for us, and I hope it works for you too.

Good luck!

-Ricardo Garcia (rago)

Features:

  • Cross-platform: runs on Mac and Windows (easy to port to Linux if desired)
  • Run it from a flash drive: the software and documents can be stored on a flash drive and run from there
  • Tested with tens of thousands of documents
  • Complex searches using regular expressions
  • Integration with Excel and other office utilities

crumbs in windows


Example and how to use:

The example can be downloaded as a zip file from https://github.com/ragomusic/crumbs/releases. It contains executables (Mac and Windows) and a small database with a subset of emails from the infamous Enron dataset (https://www.cs.cmu.edu/~./enron/).

By default, Crumbs reads the details of how the files are organized from its ini file, crumbs2.ini.

Ini file:

[GENERAL]
DatabaseFile=crumbs2.db3
TableCount=1

[TABLE1]
TableName=ENRON
PathPDF=ENRON/pdf
PathTXT=ENRON/txt
DocumentRoot=ENR
DBtable=ENRONfiles

The sections of the ini file:

  • [GENERAL]
      • DatabaseFile= Name of the SQLite database file
      • TableCount= Number of tables

  • [TABLE1] Add one such section per table, with the table's number at the end of the section name
      • TableName= Friendly name of the table, as shown in Crumbs
      • PathPDF= Path to the root of the PDF files
      • PathTXT= Path to the root of the txt files
      • DocumentRoot= Prefix for all the documents in this table
      • DBtable= Desired name for this table in the SQLite database
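To check an ini file before launching Crumbs, the layout described above can be read with a few lines of Python. This is only a sketch using the standard `configparser` module; it is not how Crumbs itself parses the file, and the helper name `read_crumbs_ini` is made up for this illustration. It assumes the sections are numbered `TABLE1` through `TABLE<TableCount>`, as the description above suggests.

```python
import configparser

# Sample text copied from the ini example in this README.
SAMPLE_INI = """
[GENERAL]
DatabaseFile=crumbs2.db3
TableCount=1

[TABLE1]
TableName=ENRON
PathPDF=ENRON/pdf
PathTXT=ENRON/txt
DocumentRoot=ENR
DBtable=ENRONfiles
"""

# Keys expected in every [TABLEn] section.
TABLE_KEYS = ("TableName", "PathPDF", "PathTXT", "DocumentRoot", "DBtable")


def read_crumbs_ini(text):
    """Hypothetical helper: return (database_file, list of table dicts)."""
    parser = configparser.ConfigParser()
    parser.read_string(text)

    general = parser["GENERAL"]
    database_file = general["DatabaseFile"]
    table_count = general.getint("TableCount")

    tables = []
    for number in range(1, table_count + 1):
        # Assumption: sections are named TABLE1, TABLE2, ... up to TableCount.
        section = parser[f"TABLE{number}"]  # raises KeyError if the section is missing
        tables.append({key: section[key] for key in TABLE_KEYS})
    return database_file, tables


database_file, tables = read_crumbs_ini(SAMPLE_INI)
print(database_file)
for table in tables:
    print(table)
```

A missing section or key raises `KeyError`, which makes a miscounted `TableCount` or a misspelled key easy to spot.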

Files location:

Crumbs expects the ini file to be in the same folder as your CrumbsViewer executable. The SQLite database file will also be placed in that folder.

The ini file points to the root folder of each table's PDF and text files. Although these can be anywhere on your drive (paths can be absolute), we have had a better experience with documents stored in folders at the same level as the CrumbsViewer executable (as in the example).

Crumbs will crawl ALL the subfolders inside each of those paths, so it is recommended to organize your documents into subfolders that each contain just a portion of them. We found that batches of 1000 work very well.
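If your documents currently sit in one large folder, a small script can split them into batches. The following Python sketch is not part of Crumbs; the function name `split_into_batches` and the `batch_0001`-style folder names are invented for this example. It moves the files found directly inside a folder into numbered subfolders of at most `batch_size` files each. The demo runs in a temporary directory with a small batch size so that no real files are touched.

```python
import shutil
import tempfile
from pathlib import Path


def split_into_batches(source_dir, batch_size=1000, prefix="batch"):
    """Move files directly inside source_dir into numbered subfolders.

    Hypothetical helper; returns the list of subfolders it filled.
    """
    source = Path(source_dir)
    # Sort by name so repeated runs on similar folders batch files the same way.
    files = sorted(p for p in source.iterdir() if p.is_file())
    created = []
    for start in range(0, len(files), batch_size):
        target = source / f"{prefix}_{start // batch_size + 1:04d}"
        target.mkdir(exist_ok=True)
        for path in files[start:start + batch_size]:
            shutil.move(str(path), str(target / path.name))
        created.append(target)
    return created


# Demo: 25 empty placeholder files split into batches of 10.
with tempfile.TemporaryDirectory() as tmp:
    for i in range(25):
        (Path(tmp) / f"doc_{i:04d}.pdf").touch()
    folders = split_into_batches(tmp, batch_size=10)
    batch_names = [folder.name for folder in folders]
    batch_sizes = [len(list(folder.iterdir())) for folder in folders]
    print(batch_names, batch_sizes)
```

For a real collection you would call it once on the PDF root and once on the txt root with the default `batch_size=1000`. Whether Crumbs needs the PDF and txt trees to share the same subfolder layout is not stated here, so try it on a copy of your documents first.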

How to compile:

See the compilation notes in the docs folder.

License:

Copyright 2013 Ricardo Garcia (rago)

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.

Donations and acknowledgments:

After several software developer friends insisted, we caved in and opened a donations button on PayPal:

We pledge the proceeds of these donations (if any) to continue developing and documenting Crumbs. Thanks in advance.

Donate

If you use Crumbs (as an executable from this or another site) for your case and feel like sharing a note with us, send an email to: info .at. crumbssoftware . com

If you create a derivative of Crumbs (follow the license guidelines), feel free to send us a note and share your project.

Big thanks:

Special thanks to:

  • Maria Artunduaga: For her support, strength, inspiration and drive to make the world a better place, even if it hurts
  • Jamie Franklin: For all her feedback. Best alpha tester ever
  • Mario Chamorro: For his expertise with reaching large audiences