Add configurable stopwords feature #305

Open
wants to merge 5 commits into
base: master
4 changes: 4 additions & 0 deletions CONFIGURATION.md
@@ -25,6 +25,10 @@ Sonic Configuration
* `list_limit_default` (type: _integer_, allowed: numbers, default: `100`) — Default listed words limit for a list command (if the LIMIT command modifier is not used when issuing a LIST command)
* `list_limit_maximum` (type: _integer_, allowed: numbers, default: `500`) — Maximum listed words limit for a list command (if the LIMIT command modifier is being used when issuing a LIST command)

**[channel.search.stopwords]**

* `${language_code}` (type: _string[]_, allowed: [supported language codes](https://github.com/valeriansaliou/sonic/tree/master/src/stopwords), default: none) — User-defined stopwords for the selected language. Set it only if you want to override Sonic's built-in preset. Setting this value explicitly to `[]` disables stopwords altogether.

**[store]**

**[store.kv]**
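
The override rule documented for `[channel.search.stopwords]` can be read as a three-way choice: key absent, key set to a list, or key set to `[]`. A minimal sketch of that reading follows. It assumes a hypothetical helper named `effective_stopwords` and a `preset` slice standing in for Sonic's built-in list; neither name appears in this pull request, and the actual lookup code may be organised differently.

```rust
/// Hypothetical helper (not part of this PR): picks the stopword list that
/// would apply to one language.
///
/// `user` mirrors one `Option<Vec<String>>` field of the new config struct;
/// `preset` stands in for the built-in list shipped with Sonic.
pub fn effective_stopwords(user: Option<&[String]>, preset: &[&str]) -> Vec<String> {
    match user {
        // Key present in the config: the user list replaces the preset.
        // An explicit empty list therefore leaves no stopwords at all.
        Some(list) => list.to_vec(),
        // Key absent from the config: fall back to the built-in preset.
        None => preset.iter().map(|word| word.to_string()).collect(),
    }
}
```

Under this reading, no special case is needed for `[]`: an empty override simply replaces the preset with nothing.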
1 change: 1 addition & 0 deletions config.cfg
@@ -28,6 +28,7 @@ suggest_limit_maximum = 20
list_limit_default = 100
list_limit_maximum = 500

[channel.search.stopwords]

[store]

5 changes: 5 additions & 0 deletions src/config/defaults.rs
@@ -4,6 +4,7 @@
// Copyright: 2019, Valerian Saliou <valerian@valeriansaliou.name>
// License: Mozilla Public License v2.0 (MPL v2.0)

use super::options::ConfigChannelSearchStopwords;
use std::net::SocketAddr;
use std::path::PathBuf;

@@ -47,6 +48,10 @@ pub fn channel_search_list_limit_maximum() -> u16 {
    500
}

pub fn channel_search_stopwords() -> ConfigChannelSearchStopwords {
    ConfigChannelSearchStopwords::default()
}

pub fn store_kv_path() -> PathBuf {
    PathBuf::from("./data/store/kv/")
}
82 changes: 82 additions & 0 deletions src/config/options.rs
@@ -65,6 +65,88 @@ pub struct ConfigChannelSearch {

    #[serde(default = "defaults::channel_search_list_limit_maximum")]
    pub list_limit_maximum: u16,

    #[serde(default = "defaults::channel_search_stopwords")]
    pub stopwords: ConfigChannelSearchStopwords,
}

#[derive(Deserialize, Default)]
pub struct ConfigChannelSearchStopwords {
    pub epo: Option<Vec<String>>,
    pub eng: Option<Vec<String>>,
    pub rus: Option<Vec<String>>,
    pub cmn: Option<Vec<String>>,
    pub spa: Option<Vec<String>>,
    pub por: Option<Vec<String>>,
    pub ita: Option<Vec<String>>,
    pub ben: Option<Vec<String>>,
    pub fra: Option<Vec<String>>,
    pub deu: Option<Vec<String>>,

    pub ukr: Option<Vec<String>>,
    pub kat: Option<Vec<String>>,
    pub ara: Option<Vec<String>>,
    pub hin: Option<Vec<String>>,
    pub jpn: Option<Vec<String>>,
    pub heb: Option<Vec<String>>,
    pub yid: Option<Vec<String>>,
    pub pol: Option<Vec<String>>,
    pub amh: Option<Vec<String>>,
    pub jav: Option<Vec<String>>,

    pub kor: Option<Vec<String>>,
    pub nob: Option<Vec<String>>,
    pub dan: Option<Vec<String>>,
    pub swe: Option<Vec<String>>,
    pub fin: Option<Vec<String>>,
    pub tur: Option<Vec<String>>,
    pub nld: Option<Vec<String>>,
    pub hun: Option<Vec<String>>,
    pub ces: Option<Vec<String>>,
    pub ell: Option<Vec<String>>,

    pub bul: Option<Vec<String>>,
    pub bel: Option<Vec<String>>,
    pub mar: Option<Vec<String>>,
    pub kan: Option<Vec<String>>,
    pub ron: Option<Vec<String>>,
    pub slv: Option<Vec<String>>,
    pub hrv: Option<Vec<String>>,
    pub srp: Option<Vec<String>>,
    pub mkd: Option<Vec<String>>,
    pub lit: Option<Vec<String>>,

    pub lav: Option<Vec<String>>,
    pub est: Option<Vec<String>>,
    pub tam: Option<Vec<String>>,
    pub vie: Option<Vec<String>>,
    pub urd: Option<Vec<String>>,
    pub tha: Option<Vec<String>>,
    pub guj: Option<Vec<String>>,
    pub uzb: Option<Vec<String>>,
    pub pan: Option<Vec<String>>,
    pub aze: Option<Vec<String>>,

    pub ind: Option<Vec<String>>,
    pub tel: Option<Vec<String>>,
    pub pes: Option<Vec<String>>,
    pub mal: Option<Vec<String>>,
    pub ori: Option<Vec<String>>,
    pub mya: Option<Vec<String>>,
    pub nep: Option<Vec<String>>,
    pub sin: Option<Vec<String>>,
    pub khm: Option<Vec<String>>,
    pub tuk: Option<Vec<String>>,

    pub aka: Option<Vec<String>>,
    pub zul: Option<Vec<String>>,
    pub sna: Option<Vec<String>>,
    pub afr: Option<Vec<String>>,
    pub lat: Option<Vec<String>>,
    pub slk: Option<Vec<String>>,
    pub cat: Option<Vec<String>>,
    pub tgl: Option<Vec<String>>,
    pub hye: Option<Vec<String>>,
}
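
Because the struct above exposes one `Option<Vec<String>>` field per language, code that only knows a language code at runtime needs some way to reach the matching field. The sketch below shows one possible shape for that lookup. It uses a reduced stand-in struct named `Stopwords` with three of the fields, and a hypothetical accessor `for_code`; both are assumptions for illustration and are not part of this diff.

```rust
/// Reduced stand-in for `ConfigChannelSearchStopwords`, keeping only three
/// of its fields so the example stays short.
#[derive(Default)]
pub struct Stopwords {
    pub eng: Option<Vec<String>>,
    pub fra: Option<Vec<String>>,
    pub deu: Option<Vec<String>>,
}

impl Stopwords {
    /// Hypothetical accessor: returns the user override for a language code,
    /// or `None` when the key was not set or the code is unknown.
    pub fn for_code(&self, code: &str) -> Option<&[String]> {
        match code {
            "eng" => self.eng.as_deref(),
            "fra" => self.fra.as_deref(),
            "deu" => self.deu.as_deref(),
            _ => None,
        }
    }
}
```

With `#[derive(Default)]`, every field starts as `None`, which matches the empty `[channel.search.stopwords]` table added to `config.cfg`: no key is set, so no override is returned for any language.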

#[derive(Deserialize)]