This repository has been archived by the owner on May 14, 2018. It is now read-only.

Normalizing databases #23

Open
hilaryparker opened this issue Apr 1, 2014 · 1 comment

Comments

@hilaryparker

I asked an Etsy colleague how he normalizes data (in the database sense). Here was his response:

"I believe what I did was to lowercase, remove duplicate spaces, stem and get rid of stop words. all of these concepts are general"

Also: http://www.omegahat.org/Rstem/stemming.pdf
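For illustration, here is a rough sketch of those four steps (lowercase, collapse duplicate spaces, drop stop words, stem) in Python. This is not the colleague's actual code. The stop-word list and the `naive_stem` helper are hypothetical stand-ins: a real pipeline would use a full stop-word list and a proper stemmer such as the Porter stemmer that Rstem wraps.

```python
import re

# Hypothetical, deliberately tiny stop-word list; a real one would be much longer.
STOP_WORDS = {"a", "an", "and", "of", "the", "in", "for", "to"}

# Crude suffix stripping, used here only as a stand-in for a real stemmer.
SUFFIXES = ("ing", "es", "s")


def naive_stem(word):
    """Strip the first matching suffix, keeping at least three characters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def normalize(text):
    """Lowercase, collapse whitespace, remove stop words, then stem."""
    text = text.lower()
    # Collapse runs of whitespace into single spaces and trim the ends.
    text = re.sub(r"\s+", " ", text).strip()
    tokens = [t for t in text.split(" ") if t and t not in STOP_WORDS]
    return " ".join(naive_stem(t) for t in tokens)


print(normalize("Hand-Made   Wooden  Bowls for the Kitchen"))
```

The point of the exercise is that differently typed versions of the same value collapse to one key: with this sketch, `normalize("Wooden Bowls")` and `normalize("  the wooden   BOWL ")` produce the same string.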

@karthik
Owner

karthik commented Apr 5, 2014

Awesome! Thanks Hilary!
I know Duncan TL (he's also on our board). I'll bring this up in my next
conversation with him.

Cheers!
