Skip to content

Commit

Permalink
Merge branch 'release-0.13.4'
Browse files Browse the repository at this point in the history
  • Loading branch information
tmylk committed Dec 25, 2016
2 parents 3067cb0 + 9f3d31e commit e838391
Show file tree
Hide file tree
Showing 82 changed files with 50,350 additions and 5,694 deletions.
3 changes: 2 additions & 1 deletion .travis.yml
Original file line number Diff line number Diff line change
Expand Up @@ -5,7 +5,7 @@ python:
- "2.7"
- "3.3"
- "3.4"
- "3.5"
- "3.5"
before_install:
- wget 'http://repo.continuum.io/miniconda/Miniconda-latest-Linux-x86_64.sh' -O miniconda.sh
- chmod +x miniconda.sh
Expand All @@ -18,5 +18,6 @@ install:
- pip install pyemd
- pip install annoy
- pip install testfixtures
- pip install unittest2
- python setup.py install
script: python setup.py test
35 changes: 34 additions & 1 deletion CHANGELOG.md
Original file line number Diff line number Diff line change
@@ -1,5 +1,37 @@
Changes
=======
===========

Unreleased:

None

0.13.4, 2016-12-22

* Evaluation of word2vec models against semantic similarity datasets like SimLex-999 (#1047) (@akutuzov, [#1047](https://github.com/RaRe-Technologies/gensim/pull/1047))
* TensorBoard word embedding visualisation of Gensim Word2vec format (@loretoparisi, [#1051](https://github.com/RaRe-Technologies/gensim/pull/1051))
* Throw exception if load() is called on instance rather than the class in word2vec and doc2vec (@dus0x,[(#889](https://github.com/RaRe-Technologies/gensim/pull/889))
* Loading and Saving LDA Models across Python 2 and 3. Fix #853 (@anmolgulati, #913, [#1093](https://github.com/RaRe-Technologies/gensim/pull/1093))
* Fix automatic learning of eta (prior over words) in LDA (@olavurmortensen, [#1024](https://github.com/RaRe-Technologies/gensim/pull/1024)).
* eta should have dimensionality V (size of vocab) not K (number of topics). eta with shape K x V is still allowed, as the user may want to impose specific prior information to each topic.
* eta is no longer allowed the "asymmetric" option. Asymmetric priors over words in general are fine (learned or user defined).
* As a result, the eta update (`update_eta`) was simplified some. It also no longer logs eta when updated, because it is too large for that.
* Unit tests were updated accordingly. The unit tests expect a different shape than before; some unit tests were redundant after the change; `eta='asymmetric'` now should raise an error.
* Optimise show_topics to only call get_lambda once. Fix #1006. (@bhargavvader, [#1028](https://github.com/RaRe-Technologies/gensim/pull/1028))
* HdpModel doc improvement. Inference and print_topics (@dsquareindia, [#1029](https://github.com/RaRe-Technologies/gensim/pull/1029))
* Removing Doc2Vec defaults so that it won't override Word2Vec defaults. Fix #795 (@markroxor, [#929](https://github.com/RaRe-Technologies/gensim/pull/929))
Remove warning on gensim import "pattern not installed". Fix #1009 (@shashankg7, #1018)
* Add delete_temporary_training_data() function to word2vec and doc2vec models. (@deepmipt-VladZhukov, [#987](https://github.com/RaRe-Technologies/gensim/pull/987))
* New class KeyedVectors to store embedding separate from training code (@anmol01gulati and @droudy, [#980](https://github.com/RaRe-Technologies/gensim/pull/980))
* Documentation improvements (@IrinaGoloshchapova, [#1010](https://github.com/RaRe-Technologies/gensim/pull/1010), [#1011](https://github.com/RaRe-Technologies/gensim/pull/1011))
* LDA tutorial by Olavur, tips and tricks (@olavurmortensen, [#779](https://github.com/RaRe-Technologies/gensim/pull/779))
* Add double quote in commmand line to run on Windows (@akarazeev, [#1005](https://github.com/RaRe-Technologies/gensim/pull/1005))
* Fix directory names in notebooks to be OS-independent (@mamamot, [#1004](https://github.com/RaRe-Technologies/gensim/pull/1004))
* Respect clip_start, clip_end in most_similar. Fix #601. (@parulsethi, [#994](https://github.com/RaRe-Technologies/gensim/pull/994))
* Replace Python sigmoid function with scipy in word2vec & doc2vec (@markroxor, [#989](https://github.com/RaRe-Technologies/gensim/pull/989))
* WMD to return 0 instead of inf for sentences that contain a single word (@rbahumi, [#986](https://github.com/RaRe-Technologies/gensim/pull/986))
* Pass all the params through the apply call in lda.get_document_topics(), test case to use the per_word_topics through the corpus in test_ldamodel (@parthoiiitm, [#978](https://github.com/RaRe-Technologies/gensim/pull/978))
* Pyro annotations for lsi_worker (@markroxor, [#968](https://github.com/RaRe-Technologies/gensim/pull/968))


0.13.3, 2016-10-20

Expand All @@ -20,6 +52,7 @@ Changes
* Remove ShardedCorpus from init because of Theano dependency (@tmylk, [#919](https://github.com/RaRe-Technologies/gensim/pull/919))
* Documentation improvements ( @dsquareindia & @tmylk, [#914](https://github.com/RaRe-Technologies/gensim/pull/914), [#906](https://github.com/RaRe-Technologies/gensim/pull/906) )
* Add Annoy memory-mapping example (@harshul1610, [#899](https://github.com/RaRe-Technologies/gensim/pull/899))
* Fixed issue [#601](https://github.com/RaRe-Technologies/gensim/issues/601), correct docID in most_similar for clip range (@parulsethi, [#994](https://github.com/RaRe-Technologies/gensim/pull/994))

0.13.2, 2016-08-19

Expand Down
7 changes: 3 additions & 4 deletions appveyor.yml
Original file line number Diff line number Diff line change
Expand Up @@ -11,7 +11,7 @@ environment:
WHEELHOUSE_UPLOADER_USERNAME: "Lev.Konstantinovskiy"
WHEELHOUSE_UPLOADER_SECRET:
secure: qXqY3dFmLOqvxa3Om2gQi/BjotTOK+EP2IPLolBNo0c61yDtNWxbmE4wH3up72Be

matrix:
- PYTHON: "C:\\Python27"
PYTHON_VERSION: "2.7.8"
Expand Down Expand Up @@ -45,7 +45,7 @@ install:
# Install the build and runtime dependencies of the project.
# Install the build and runtime dependencies of the project.
- "%CMD_IN_ENV% pip install --timeout=60 --trusted-host 28daf2247a33ed269873-7b1aad3fab3cc330e1fd9d109892382a.r6.cf2.rackcdn.com -r continuous_integration/appveyor/requirements.txt"
- "%CMD_IN_ENV% python setup.py bdist_wheel bdist_wininst "
- "%CMD_IN_ENV% python setup.py bdist_wheel bdist_wininst"
- ps: "ls dist"

# Install the genreated wheel package to test it
Expand All @@ -59,7 +59,7 @@ test_script:
# installed library.
- "mkdir empty_folder"
- "cd empty_folder"
- "pip install pyemd testfixtures"
- "pip install pyemd testfixtures unittest2"

- "python -c \"import nose; nose.main()\" -s -v gensim"
# Move back to the project folder
Expand All @@ -86,4 +86,3 @@ cache:
# container, speed up the appveyor jobs and reduce bandwidth
# usage on our rackspace account.
- '%APPDATA%\pip\Cache'

2 changes: 2 additions & 0 deletions docs/notebooks/Word2Vec_FastText_Comparison.ipynb
Original file line number Diff line number Diff line change
Expand Up @@ -459,6 +459,8 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"The `accuracy` takes an optional parameter `restrict_vocab`, which limits the vocabulary of model considered for fast approximate evaluation (default is 30000).\n",
"\n",
"Word2Vec embeddings seem to be slightly better than fastText embeddings at the semantic tasks, while the fastText embeddings do significantly better on the syntactic analogies. Makes sense, since fastText embeddings are trained for understanding morphological nuances, and most of the syntactic analogies are morphology based. \n",
"\n",
"Let me explain that better.\n",
Expand Down
115 changes: 9 additions & 106 deletions docs/notebooks/datasets/mycorpus.txt
Original file line number Diff line number Diff line change
@@ -1,106 +1,9 @@
<!DOCTYPE html>
<!--[if lt IE 7]> <html class="no-js ie6 oldie" lang="en-US"> <![endif]-->
<!--[if IE 7]> <html class="no-js ie7 oldie" lang="en-US"> <![endif]-->
<!--[if IE 8]> <html class="no-js ie8 oldie" lang="en-US"> <![endif]-->
<!--[if gt IE 8]><!--> <html class="no-js" lang="en-US"> <!--<![endif]-->
<head>
<title>Attention Required! | CloudFlare</title>
<meta charset="UTF-8" />
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
<meta http-equiv="X-UA-Compatible" content="IE=Edge,chrome=1" />
<meta name="robots" content="noindex, nofollow" />
<meta name="viewport" content="width=device-width,initial-scale=1,maximum-scale=1" />
<link rel="stylesheet" id="cf_styles-css" href="/cdn-cgi/styles/cf.errors.css" type="text/css" media="screen,projection" />
<!--[if lt IE 9]><link rel="stylesheet" id='cf_styles-ie-css' href="/cdn-cgi/styles/cf.errors.ie.css" type="text/css" media="screen,projection" /><![endif]-->
<style type="text/css">body{margin:0;padding:0}</style>
<!--[if lte IE 9]><script type="text/javascript" src="/cdn-cgi/scripts/jquery.min.js"></script><![endif]-->
<!--[if gte IE 10]><!--><script type="text/javascript" src="/cdn-cgi/scripts/zepto.min.js"></script><!--<![endif]-->
<script type="text/javascript" src="/cdn-cgi/scripts/cf.common.js"></script>


</head>
<body>
<div id="cf-wrapper">
<div class="cf-alert cf-alert-error cf-cookie-error" id="cookie-alert" data-translate="enable_cookies">Please enable cookies.</div>
<div id="cf-error-details" class="cf-error-details-wrapper">
<div class="cf-wrapper cf-header cf-error-overview">
<h1 data-translate="challenge_headline">One more step</h1>
<h2 class="cf-subheadline"><span data-translate="complete_sec_check">Please complete the security check to access</span> radimrehurek.com</h2>
</div><!-- /.header -->

<div class="cf-section cf-highlight cf-captcha-container">
<div class="cf-wrapper">
<div class="cf-columns two">
<div class="cf-column">
<div class="cf-highlight-inverse cf-form-stacked">
<form class="challenge-form" id="challenge-form" action="/cdn-cgi/l/chk_captcha" method="get">
<script type="text/javascript" src="/cdn-cgi/scripts/cf.challenge.js" data-type="normal" data-ray="2e6e057767f92eff" async data-sitekey="6LfOYgoTAAAAAInWDVTLSc8Yibqp-c9DaLimzNGM" data-stoken="fl5gc_M14MlvBWkagabZ26h9QmYXjG-MwP3RCYKAYmws6uvoh8HXypNV4EIs6bWsA3DnLB2JtDqrqDW3zZUAHxQSSST-szj1FSE9yoiQTqw"></script>
<div class="g-recaptcha"></div>
<noscript id="cf-captcha-bookmark" class="cf-captcha-info">
<div><div style="width: 302px">
<div>
<iframe src="https://www.google.com/recaptcha/api/fallback?k=6LfOYgoTAAAAAInWDVTLSc8Yibqp-c9DaLimzNGM&stoken=fl5gc_M14MlvBWkagabZ26h9QmYXjG-MwP3RCYKAYmws6uvoh8HXypNV4EIs6bWsA3DnLB2JtDqrqDW3zZUAHxQSSST-szj1FSE9yoiQTqw" frameborder="0" scrolling="no" style="width: 302px; height:422px; border-style: none;"></iframe>
</div>
<div style="width: 300px; border-style: none; bottom: 12px; left: 25px; margin: 0px; padding: 0px; right: 25px; background: #f9f9f9; border: 1px solid #c1c1c1; border-radius: 3px;">
<textarea id="g-recaptcha-response" name="g-recaptcha-response" class="g-recaptcha-response" style="width: 250px; height: 40px; border: 1px solid #c1c1c1; margin: 10px 25px; padding: 0px; resize: none;"></textarea>
<input type="submit" value="Submit"></input>
</div>
</div></div>
</noscript>
</form>

</div>
</div>

<div class="cf-column">
<div class="cf-screenshot-container">

<span class="cf-no-screenshot"></span>

</div>
</div>
</div><!-- /.columns -->
</div>
</div><!-- /.captcha-container -->

<div class="cf-section cf-wrapper">
<div class="cf-columns two">
<div class="cf-column">
<h2 data-translate="why_captcha_headline">Why do I have to complete a CAPTCHA?</h2>

<p data-translate="why_captcha_detail">Completing the CAPTCHA proves you are a human and gives you temporary access to the web property.</p>
</div>

<div class="cf-column">
<h2 data-translate="resolve_captcha_headline">What can I do to prevent this in the future?</h2>

<p data-translate="resolve_captcha_antivirus">If you are on a personal connection, like at home, you can run an anti-virus scan on your device to make sure it is not infected with malware.</p>

<p data-translate="resolve_captcha_network">If you are at an office or shared network, you can ask the network administrator to run a scan across the network looking for misconfigured or infected devices.</p>
</div>
</div>
</div><!-- /.section -->

<div class="cf-error-footer cf-wrapper">
<p>
<span class="cf-footer-item">CloudFlare Ray ID: <strong>2e6e057767f92eff</strong></span>
<span class="cf-footer-separator">&bull;</span>
<span class="cf-footer-item"><span data-translate="your_ip">Your IP</span>: 202.41.10.3</span>
<span class="cf-footer-separator">&bull;</span>
<span class="cf-footer-item"><span data-translate="performance_security_by">Performance &amp; security by</span> <a data-orig-proto="https" data-orig-ref="www.cloudflare.com/5xx-error-landing?utm_source=error_footer" id="brand_link" target="_blank">CloudFlare</a></span>

</p>
</div><!-- /.error-footer -->


</div><!-- /#cf-error-details -->
</div><!-- /#cf-wrapper -->

<script type="text/javascript">
window._cf_translation = {};


</script>

</body>
</html>
Human machine interface for lab abc computer applications
A survey of user opinion of computer system response time
The EPS user interface management system
System and human system engineering testing of EPS
Relation of user perceived response time to error measurement
The generation of random binary unordered trees
The intersection graph of paths in trees
Graph minors IV Widths of trees and well quasi ordering
Graph minors A survey

0 comments on commit e838391

Please sign in to comment.