forked from rnewson/couchdb-lucene
If you're going to include a "tokenize" option, I'd lobby for "porter"
to be one of the non-custom options :)

If this Ruby hacker can code it, it's not going to add to the overall
complexity ;) The code:

http://github.com/eee-c/couchdb-lucene/blob/8ebffd5ba1f887616b5ec47e10ca003f4d7d9797/src/main/java/com/github/rnewson/couchdb/lucene/MyAnalyzer.java

At first blush I was mildly concerned about having the default field
and the tokenize options in separate places, because they are
intimately tied together in the QueryParser constructor. Still, I
don't know that it would be of any benefit to expose that in your API.
The "defaults" options that you already have planned make semantic
sense. Anything else would be harder to read, weakening the API. What
I'm trying to say is that I like the proposed "tokenize" option
(though I might name it "tokenizer").

I have no immediate use for i18n, so I personally wouldn't miss it in
0.3, but it is a definite nice-to-have for the future.

Ah, I confused you. The "defaults" thing is so that you can say, for
example, that everything is stored by default (the built-in default is
"no"), e.g.,

{
  "view1": {
    "defaults": { "store": "yes" },
    "index": "fun(doc) ..."
  }
}

The "tokenize" option would take values from Lucene's Field.Index, like
"analyzed_no_norms", etc., so "tokenize" is correct. "tokenizer" could
be an extra option that does what you expect ("simple", "standard",
"porter", etc.).

What I mean by default field is that DEFAULT_FIELD in Config will
become "default", and QP will use that. If, in your function, you do
this;

  ret.add("hello");

then "hello" is added to the field called "default". In order to
change that, you'll do this;

  ret.add("hello", {"field":"subject"});

And that would do what field("subject", "hello"); does today. The
reason is to move everything that's Lucene-specific into an optional
object parameter, the hope being that other fulltext indexes use a
similar (or identical) syntax. It's also how you'd override the other
defaults on a per-call basis;

{
  "defaults": { "store": "yes" },
  "index": "function(doc) { var ret=new Document(); ret.add('take the defaults'); ret.add('but not here', {'field':'subject', 'store':'no'}); return ret; }"
}
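
To make the override rule concrete, here is a minimal sketch of how the lookup could work: built-in defaults first, then the view's "defaults", then whatever is passed to the individual add() call. This is not couchdb-lucene's actual code. The Document class below, its constructor argument, BUILTIN_DEFAULTS and the fields array are all hypothetical stand-ins invented for illustration.

```javascript
// Hypothetical stand-in for the proposed Document API (not the real implementation).
// Built-in defaults as described in the notes: field "default", store "no".
const BUILTIN_DEFAULTS = { field: "default", store: "no" };

class Document {
  // viewDefaults is assumed to be the view's "defaults" object.
  constructor(viewDefaults = {}) {
    // View-level defaults override the built-in ones.
    this.defaults = Object.assign({}, BUILTIN_DEFAULTS, viewDefaults);
    this.fields = [];
  }

  // Per-call options override the view-level defaults.
  add(value, options = {}) {
    const resolved = Object.assign({}, this.defaults, options);
    this.fields.push(Object.assign({ value: value }, resolved));
    return this;
  }
}

const ret = new Document({ store: "yes" });
ret.add("take the defaults");
ret.add("but not here", { field: "subject", store: "no" });
console.log(ret.fields);
```

With "store" set to "yes" at the view level, the first call ends up stored in the "default" field, while the second call goes to "subject" and is not stored.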

The i18n support would, I think, just be the ability to say this;

{
  "defaults": { "language": "fr" },
  "index": "function(doc) { var ret=new Document(); ret.add('bonjour'); ret.add('hello', {'language':'en'}); ...."
}
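
As a rough sketch of what the "language" option might drive, assuming it ends up selecting an analyzer per add() call: the function below resolves the language (per-call value first, then the view default) and maps it to an analyzer name. The table, the fallback and the analyzerFor function are hypothetical. StandardAnalyzer and FrenchAnalyzer are real Lucene classes, but they appear here only as plain labels.

```javascript
// Hypothetical language-to-analyzer table; the class names are plain strings here.
const ANALYZER_BY_LANGUAGE = new Map([
  ["en", "StandardAnalyzer"],
  ["fr", "FrenchAnalyzer"],
]);
const FALLBACK_ANALYZER = "StandardAnalyzer"; // assumed fallback, not from the notes

function analyzerFor(viewDefaults = {}, callOptions = {}) {
  // A per-call "language" wins over the view-level default.
  const language = callOptions.language || viewDefaults.language;
  return ANALYZER_BY_LANGUAGE.get(language) || FALLBACK_ANALYZER;
}

console.log(analyzerFor({ language: "fr" }));                     // FrenchAnalyzer
console.log(analyzerFor({ language: "fr" }, { language: "en" })); // StandardAnalyzer
```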

So I think it'll go in; I quite like that.