Docker & OpenTapioca #31
It would be great to have that indeed! I am unlikely to find the time to work on this soon but would be very much in favour of including that in the repository. |
@wetneb Hey, could you specify which version of Zookeeper you are using and what your local config is? It would be helpful to have a series of steps describing your specific Zookeeper install procedure. |
I use Solr 7.7.3 and the Zookeeper that is bundled in it. I do not install Zookeeper itself, I just download Solr and that comes with Zookeeper in it. |
Solr versions before 8.11.1 are affected by the Log4Shell CVE, which is a security problem. |
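To make the version concern concrete, here is a minimal sketch that checks whether a given Solr version string predates the 8.11.1 threshold named in the comment above. The helper names `parse_version` and `predates_fix` are hypothetical, and the check only encodes that one threshold; it is not a vulnerability scanner.

```python
def parse_version(version: str) -> tuple:
    """Turn '7.7.3' into (7, 7, 3) so that versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


# Threshold taken from the comment above (assumption: plain x.y.z versions).
FIRST_PATCHED = "8.11.1"


def predates_fix(solr_version: str, patched: str = FIRST_PATCHED) -> bool:
    """Return True if solr_version is older than the patched release."""
    return parse_version(solr_version) < parse_version(patched)


print(predates_fix("7.7.3"))
```

Comparing integer tuples rather than raw strings matters here: as strings, "8.9.0" would sort after "8.11.1".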
I have not checked. I am not actively maintaining this project as you can see. But I will always be happy to merge PRs. |
Ok, I kinda solved the previous problem. I will have a PR ready soon. One question: should I update the settings_template.py file?
# The name of the Solr collection where Wikidata is indexed
SOLR_COLLECTION = 'wd_2019-02-24'
# The path to the language model, trained with "tapioca train-bow"
LANGUAGE_MODEL_PATH='data/wd_2019-02-24.bow.pkl'
# The path to the pagerank Numpy vector, computed with "tapioca compute-pagerank"
PAGERANK_PATH='data/wd_2019-02-24.pgrank.npy'
# The path to the trained classifier, obtained from "tapioca train-classifier"
CLASSIFIER_PATH='data/rss_istex_classifier.pkl' |
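When adapting these values, it can help to check which of the configured files actually exist yet. The following is a small sketch under that assumption: the `SETTINGS` dictionary just repeats the paths quoted above, and `missing_files` is a hypothetical helper, not part of OpenTapioca.

```python
from pathlib import Path

# Paths copied from the settings_template.py excerpt quoted above.
SETTINGS = {
    "LANGUAGE_MODEL_PATH": "data/wd_2019-02-24.bow.pkl",
    "PAGERANK_PATH": "data/wd_2019-02-24.pgrank.npy",
    "CLASSIFIER_PATH": "data/rss_istex_classifier.pkl",
}


def missing_files(settings: dict, base_dir: str = ".") -> list:
    """Return the setting names whose file does not exist under base_dir."""
    base = Path(base_dir)
    return [name for name, rel in settings.items() if not (base / rel).is_file()]


for name in missing_files(SETTINGS):
    print(f"{name} points to a file that does not exist yet")
```

Each missing file would presumably be produced by the tapioca subcommand named in the corresponding template comment.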
I am not sure what you want to change in the |
The CLI was asking me something about the settings.py file, which is probably not covered in the docs. Another question, about this command:
tapioca index-dump my_collection_name latest-all.json.bz2 --profile profiles/human_organization_place.json
What is my_collection_name? Could you provide some example values? |
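As an illustration of one possible value, the sketch below builds the same command with the collection name `wd_2019-02-24` from the settings template. That the name passed to `index-dump` should match `SOLR_COLLECTION` is an assumption drawn from the template comment, and `index_dump_command` is a hypothetical helper.

```python
import shlex


def index_dump_command(collection: str, dump_path: str, profile_path: str) -> list:
    """Build the argument list for the index-dump command quoted above."""
    return ["tapioca", "index-dump", collection, dump_path, "--profile", profile_path]


cmd = index_dump_command(
    "wd_2019-02-24",  # assumed to match SOLR_COLLECTION in settings.py
    "latest-all.json.bz2",
    "profiles/human_organization_place.json",
)
print(shlex.join(cmd))
```

The list form can be passed directly to `subprocess.run` if you prefer to launch the indexing from a script.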
Indeed! And feel free to have a look at its contents and check if there is anything there that you want to change for your own purposes.
The docs say:
So the intention behind this sentence is to say that:
If you can think of ways to make the docs more understandable in both locations, do not hesitate to open a PR with the phrasing you would have preferred there. I am sure it is going to be much better. |
@wetneb Hi Antonin, I also notice there is the parameter skip_docs |
Hi @eracle, On my previous server I had 20+GB RAM. Now much less, so I can no longer update the index. Yes I suspect skip_docs can be used to resume the indexing from an offset, but I do not remember exactly. |
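The general idea of resuming from an offset can be sketched as follows. This is only an illustration of skipping the first N items of a stream; how OpenTapioca actually interprets skip_docs is not confirmed in this thread, and `resume_from` is a hypothetical helper.

```python
from itertools import islice


def resume_from(docs, skip_docs: int = 0):
    """Yield items from docs after discarding the first skip_docs of them."""
    return islice(docs, skip_docs, None)


# Stand-in for a stream of dump entries; real entries would be parsed JSON.
docs = (f"Q{i}" for i in range(1, 8))
print(list(resume_from(docs, skip_docs=5)))
```

Note that the skipped items are still read from the stream, so on a compressed dump the skipped portion would still need to be decompressed.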
Would it be possible to have a Docker image to help with testing and deploying OpenTapioca?
It could be a great feature to help newcomers get started with OpenTapioca.