NEW: add a report summary #63

davidsbatista · 2023-04-16T12:25:15Z

Main Changes

Added two new functions:

summary_report_ent(): prints the evaluation results, aggregated by entity type, for a given scenario (i.e., strict, ent_type, partial, exact)
summary_report_overall(): the report over all entities for each of the 4 possible scenarios

Example of the output:

'Strict'
          correct   incorrect     partial      missed    spurious   precision      recall    f1-score

 LOC          475         507           0         102          28        0.47        0.44        0.45
MISC            0         290           0          50           0        0.00        0.00        0.00
 ORG          902         440           0          58         166        0.60        0.64        0.62
 PER          532         153           0          50          25        0.75        0.72        0.74


Overall
              correct   incorrect     partial      missed    spurious   precision      recall    f1-score

ent_type         2052        1247           0         260         219        0.58        0.58        0.58
 partial         2954           0         345         260         219        0.89        0.88        0.88
  strict         1909        1390           0         260         219        0.54        0.54        0.54
   exact         2954         345           0         260         219        0.84        0.83        0.83
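
For readers who want to check the figures, here is a minimal sketch of how the precision, recall and f1-score columns can be derived from the five count columns. It assumes the usual SemEval-2013 style definitions (a partial match counts as half a hit in the partial scenario), which agree with the numbers shown above. The names `prf` and `format_report` are hypothetical helpers written for this illustration; they are not the functions added in this PR.

```python
from typing import Dict, Tuple

COUNT_COLUMNS = ["correct", "incorrect", "partial", "missed", "spurious"]


def prf(counts: Dict[str, int], partial_credit: bool = False) -> Tuple[float, float, float]:
    """Return (precision, recall, f1) for one row of counts.

    Assumption: 'possible' is the number of gold entities and 'actual' the
    number of predicted entities, as in the SemEval-2013 scheme.
    """
    correct = counts["correct"]
    incorrect = counts["incorrect"]
    partial = counts["partial"]
    possible = correct + incorrect + partial + counts["missed"]
    actual = correct + incorrect + partial + counts["spurious"]

    # In the partial scenario a partial match is worth half a correct one.
    hits = correct + (0.5 * partial if partial_credit else 0.0)

    precision = hits / actual if actual else 0.0
    recall = hits / possible if possible else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


def format_report(rows: Dict[str, Dict[str, int]], partial_credit: bool = False) -> str:
    """Lay out the rows as a right-aligned text table similar to the output above."""
    width = max(len(label) for label in rows)
    headers = COUNT_COLUMNS + ["precision", "recall", "f1-score"]
    lines = [" " * width + "".join(f"{h:>12}" for h in headers), ""]
    for label, counts in rows.items():
        p, r, f = prf(counts, partial_credit)
        cells = "".join(f"{counts[c]:>12}" for c in COUNT_COLUMNS)
        lines.append(f"{label:>{width}}{cells}{p:>12.2f}{r:>12.2f}{f:>12.2f}")
    return "\n".join(lines)


# Counts copied from the 'Strict' table above.
strict = {
    "LOC": {"correct": 475, "incorrect": 507, "partial": 0, "missed": 102, "spurious": 28},
    "PER": {"correct": 532, "incorrect": 153, "partial": 0, "missed": 50, "spurious": 25},
}
print(format_report(strict))
```

Passing `partial_credit=True` for the partial row of the overall table gives the half-credit variant; the other rows have no partial matches, so the flag makes no difference there.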

Other "minor" changes:

removed the example notebook and added a fully functional example, examples/example_no_loader.py
fixed some of the pre-commit and setup.cfg configuration, so that mypy is not applied to tests/ and examples/
added a dedicated make clean_venv target that only removes the virtual environment
the build process was broken because codecov no longer exists
collect_named_entities() was duplicated; removed it from one file (nerevaluate.py) and corrected its type definition, which was wrong
renamed nerevaluate.py to evaluate.py since I was having lots of conflicts; this was the easiest fix

ivyleavedtoadflax · 2023-04-23T16:33:25Z

This looks good, but do you think it is worth being more explicit about the dependencies required for the new example, as it won't run out of the box as it is?

davidsbatista · 2023-04-23T16:45:49Z

probably yes, but I don't have the whole picture; can you give a concrete example?

in the meantime, I will just merge this, if you agree.

ivyleavedtoadflax · 2023-04-24T09:17:45Z

my point was that if you try to run the example without installing the dependencies, it fails, and there's no way to know what you need to install except by running the script and seeing the various errors, like:

(virtualenv) ~/Documents/mantis/nervaluate (NEW/add-code-with-examples) $ python examples/example_no_loader.py
Loading CoNLL 2002 NER Spanish data
Traceback (most recent call last):
  File "/home/matthew/Documents/mantis/nervaluate/build/virtualenv/lib/python3.8/site-packages/nltk/corpus/util.py", line 84, in __load
    root = nltk.data.find(f"{self.subdir}/{zip_name}")
  File "/home/matthew/Documents/mantis/nervaluate/build/virtualenv/lib/python3.8/site-packages/nltk/data.py", line 583, in find
    raise LookupError(resource_not_found)
LookupError:
**********************************************************************
  Resource conll2002 not found.
  Please use the NLTK Downloader to obtain the resource:

  >>> import nltk
  >>> nltk.download('conll2002')

  For more information see: https://www.nltk.org/data.html

  Attempted to load corpora/conll2002.zip/conll2002/

  Searched in:
    - '/home/matthew/nltk_data'
    - '/home/matthew/Documents/mantis/nervaluate/build/virtualenv/nltk_data'
    - '/home/matthew/Documents/mantis/nervaluate/build/virtualenv/share/nltk_data'
    - '/home/matthew/Documents/mantis/nervaluate/build/virtualenv/lib/nltk_data'
    - '/usr/share/nltk_data'
    - '/usr/local/share/nltk_data'
    - '/usr/lib/nltk_data'
    - '/usr/local/lib/nltk_data'
**********************************************************************


During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "examples/example_no_loader.py", line 120, in <module>
    main()
  File "examples/example_no_loader.py", line 71, in main
    nltk.corpus.conll2002.fileids()
  File "/home/matthew/Documents/mantis/nervaluate/build/virtualenv/lib/python3.8/site-packages/nltk/corpus/util.py", line 121, in __getattr__
    self.__load()
  File "/home/matthew/Documents/mantis/nervaluate/build/virtualenv/lib/python3.8/site-packages/nltk/corpus/util.py", line 86, in __load
    raise e
  File "/home/matthew/Documents/mantis/nervaluate/build/virtualenv/lib/python3.8/site-packages/nltk/corpus/util.py", line 81, in __load
    root = nltk.data.find(f"{self.subdir}/{self.__name}")
  File "/home/matthew/Documents/mantis/nervaluate/build/virtualenv/lib/python3.8/site-packages/nltk/data.py", line 583, in find
    raise LookupError(resource_not_found)
LookupError:
**********************************************************************
  Resource conll2002 not found.
  Please use the NLTK Downloader to obtain the resource:

  >>> import nltk
  >>> nltk.download('conll2002')

  For more information see: https://www.nltk.org/data.html

  Attempted to load corpora/conll2002

  Searched in:
    - '/home/matthew/nltk_data'
    - '/home/matthew/Documents/mantis/nervaluate/build/virtualenv/nltk_data'
    - '/home/matthew/Documents/mantis/nervaluate/build/virtualenv/share/nltk_data'
    - '/home/matthew/Documents/mantis/nervaluate/build/virtualenv/lib/nltk_data'
    - '/usr/share/nltk_data'
    - '/usr/local/share/nltk_data'
    - '/usr/lib/nltk_data'
    - '/usr/local/lib/nltk_data'
**********************************************************************

(virtualenv) ✘ ~/Documents/mantis/nervaluate (NEW/add-code-with-examples) $

Might be nice to include a list of dependencies for the example somewhere.
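
One possible way to make the example fail less abruptly is to check for the corpus before using it and download it only when it is missing. The sketch below is an illustration under assumptions, not what the example script currently does: `ensure_nltk_corpus` is a hypothetical helper, and its `finder`/`downloader` parameters exist only so the logic can be exercised without NLTK or network access. By default it relies on `nltk.data.find` and `nltk.download`, the same calls that appear in the traceback and in the error message above.

```python
from typing import Callable, Optional


def ensure_nltk_corpus(
    name: str,
    finder: Optional[Callable[[str], object]] = None,
    downloader: Optional[Callable[[str], object]] = None,
) -> bool:
    """Make sure the NLTK corpus `name` is available locally.

    Returns True if a download was triggered, False if the corpus was
    already present. `finder` and `downloader` are hypothetical injection
    points; when omitted, nltk.data.find and nltk.download are used.
    """
    if finder is None or downloader is None:
        import nltk  # imported lazily so the helper can be tested without NLTK

        finder = finder or nltk.data.find
        downloader = downloader or nltk.download

    try:
        # nltk.data.find raises LookupError when the resource is missing.
        finder(f"corpora/{name}")
        return False
    except LookupError:
        downloader(name)
        return True


# Intended use at the top of the example script (requires NLTK installed):
#     ensure_nltk_corpus("conll2002")
#     nltk.corpus.conll2002.fileids()
```

Listing the example's extra dependencies in the README or in a separate requirements file would still be useful, since this helper only covers the missing corpus, not missing Python packages.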

davidsbatista · 2023-04-24T10:23:27Z

ah ok - I see your point - I will try to fix this

ivyleavedtoadflax · 2023-04-24T10:46:33Z

Also it seems like line 11 of the Makefile is failing on my system:

source ${VIRTUALENV}/bin/activate && pip3 install --editable .

make virtualenv

$ make virtualenv
virtualenv --python python3.8 .venv
created virtual environment CPython3.8.10.final.0-64 in 160ms
  creator CPython3Posix(dest=/home/matthew/Documents/mantis/nervaluate/.venv, clear=False, no_vcs_ignore=False, global=False)
  seeder FromAppData(download=False, pip=bundle, setuptools=bundle, wheel=bundle, via=copy, app_data_dir=/home/matthew/.local/share/virtualenv)
    added seed packages: pip==23.0.1, setuptools==67.6.1, wheel==0.40.0
  activators BashActivator,CShellActivator,FishActivator,NushellActivator,PowerShellActivator,PythonActivator
.venv/bin/pip3 install -r requirements_dev.txt
Collecting black
  Using cached black-23.3.0-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (1.6 MB)
Collecting flake8
  Using cached flake8-6.0.0-py2.py3-none-any.whl (57 kB)
Collecting gitchangelog
  Using cached gitchangelog-3.0.4-py2.py3-none-any.whl (54 kB)
Collecting mypy
  Using cached mypy-1.2.0-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (12.2 MB)
Collecting pre-commit
  Using cached pre_commit-3.2.2-py2.py3-none-any.whl (202 kB)
Collecting pylint
  Using cached pylint-2.17.2-py3-none-any.whl (536 kB)
Collecting pytest
  Using cached pytest-7.3.1-py3-none-any.whl (320 kB)
Collecting pytest-cov
  Using cached pytest_cov-4.0.0-py3-none-any.whl (21 kB)
Collecting tox
  Using cached tox-4.4.12-py3-none-any.whl (148 kB)
Collecting tox-gh-actions
  Using cached tox_gh_actions-3.1.0-py2.py3-none-any.whl (9.8 kB)
Collecting twine
  Using cached twine-4.0.2-py3-none-any.whl (36 kB)
Requirement already satisfied: wheel in ./.venv/lib/python3.8/site-packages (from -r requirements_dev.txt (line 12)) (0.40.0)
Collecting click>=8.0.0
  Using cached click-8.1.3-py3-none-any.whl (96 kB)
Collecting mypy-extensions>=0.4.3
  Using cached mypy_extensions-1.0.0-py3-none-any.whl (4.7 kB)
Collecting tomli>=1.1.0
  Using cached tomli-2.0.1-py3-none-any.whl (12 kB)
Collecting packaging>=22.0
  Using cached packaging-23.1-py3-none-any.whl (48 kB)
Collecting typing-extensions>=3.10.0.0
  Using cached typing_extensions-4.5.0-py3-none-any.whl (27 kB)
Collecting platformdirs>=2
  Using cached platformdirs-3.2.0-py3-none-any.whl (14 kB)
Collecting pathspec>=0.9.0
  Using cached pathspec-0.11.1-py3-none-any.whl (29 kB)
Collecting pyflakes<3.1.0,>=3.0.0
  Using cached pyflakes-3.0.1-py2.py3-none-any.whl (62 kB)
Collecting mccabe<0.8.0,>=0.7.0
  Using cached mccabe-0.7.0-py2.py3-none-any.whl (7.3 kB)
Collecting pycodestyle<2.11.0,>=2.10.0
  Using cached pycodestyle-2.10.0-py2.py3-none-any.whl (41 kB)
Collecting nodeenv>=0.11.1
  Using cached nodeenv-1.7.0-py2.py3-none-any.whl (21 kB)
Collecting identify>=1.0.0
  Using cached identify-2.5.22-py2.py3-none-any.whl (98 kB)
Collecting pyyaml>=5.1
  Using cached PyYAML-6.0-cp38-cp38-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_12_x86_64.manylinux2010_x86_64.whl (701 kB)
Collecting cfgv>=2.0.0
  Using cached cfgv-3.3.1-py2.py3-none-any.whl (7.3 kB)
Collecting virtualenv>=20.10.0
  Downloading virtualenv-20.22.0-py3-none-any.whl (3.2 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 3.2/3.2 MB 11.7 MB/s eta 0:00:00
Collecting isort<6,>=4.2.5
  Using cached isort-5.12.0-py3-none-any.whl (91 kB)
Collecting dill>=0.2
  Using cached dill-0.3.6-py3-none-any.whl (110 kB)
Collecting tomlkit>=0.10.1
  Using cached tomlkit-0.11.7-py3-none-any.whl (35 kB)
Collecting astroid<=2.17.0-dev0,>=2.15.2
  Downloading astroid-2.15.4-py3-none-any.whl (278 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 278.1/278.1 kB 28.8 MB/s eta 0:00:00
Collecting pluggy<2.0,>=0.12
  Using cached pluggy-1.0.0-py2.py3-none-any.whl (13 kB)
Collecting iniconfig
  Using cached iniconfig-2.0.0-py3-none-any.whl (5.9 kB)
Collecting exceptiongroup>=1.0.0rc8
  Using cached exceptiongroup-1.1.1-py3-none-any.whl (14 kB)
Collecting coverage[toml]>=5.2.1
  Using cached coverage-7.2.3-cp38-cp38-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl (229 kB)
Collecting chardet>=5.1
  Using cached chardet-5.1.0-py3-none-any.whl (199 kB)
Collecting cachetools>=5.3
  Using cached cachetools-5.3.0-py3-none-any.whl (9.3 kB)
Collecting filelock>=3.11
  Using cached filelock-3.12.0-py3-none-any.whl (10 kB)
Collecting pyproject-api>=1.5.1
  Using cached pyproject_api-1.5.1-py3-none-any.whl (12 kB)
Collecting colorama>=0.4.6
  Using cached colorama-0.4.6-py2.py3-none-any.whl (25 kB)
Collecting keyring>=15.1
  Using cached keyring-23.13.1-py3-none-any.whl (37 kB)
Collecting readme-renderer>=35.0
  Using cached readme_renderer-37.3-py3-none-any.whl (14 kB)
Collecting requests-toolbelt!=0.9.0,>=0.8.0
  Using cached requests_toolbelt-0.10.1-py2.py3-none-any.whl (54 kB)
Collecting requests>=2.20
  Using cached requests-2.28.2-py3-none-any.whl (62 kB)
Collecting importlib-metadata>=3.6
  Downloading importlib_metadata-6.6.0-py3-none-any.whl (22 kB)
Collecting pkginfo>=1.8.1
  Using cached pkginfo-1.9.6-py3-none-any.whl (30 kB)
Collecting urllib3>=1.26.0
  Using cached urllib3-1.26.15-py2.py3-none-any.whl (140 kB)
Collecting rich>=12.0.0
  Using cached rich-13.3.4-py3-none-any.whl (238 kB)
Collecting rfc3986>=1.4.0
  Using cached rfc3986-2.0.0-py2.py3-none-any.whl (31 kB)
Collecting lazy-object-proxy>=1.4.0
  Using cached lazy_object_proxy-1.9.0-cp38-cp38-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl (61 kB)
Collecting wrapt<2,>=1.11
  Using cached wrapt-1.15.0-cp38-cp38-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl (81 kB)
Collecting zipp>=0.5
  Using cached zipp-3.15.0-py3-none-any.whl (6.8 kB)
Collecting jaraco.classes
  Using cached jaraco.classes-3.2.3-py3-none-any.whl (6.0 kB)
Collecting SecretStorage>=3.2
  Using cached SecretStorage-3.3.3-py3-none-any.whl (15 kB)
Collecting importlib-resources
  Using cached importlib_resources-5.12.0-py3-none-any.whl (36 kB)
Collecting jeepney>=0.4.2
  Using cached jeepney-0.8.0-py3-none-any.whl (48 kB)
Requirement already satisfied: setuptools in ./.venv/lib/python3.8/site-packages (from nodeenv>=0.11.1->pre-commit->-r requirements_dev.txt (line 5)) (67.6.1)
Collecting docutils>=0.13.1
  Using cached docutils-0.19-py3-none-any.whl (570 kB)
Collecting bleach>=2.1.0
  Using cached bleach-6.0.0-py3-none-any.whl (162 kB)
Collecting Pygments>=2.5.1
  Using cached Pygments-2.15.1-py3-none-any.whl (1.1 MB)
Collecting idna<4,>=2.5
  Using cached idna-3.4-py3-none-any.whl (61 kB)
Collecting certifi>=2017.4.17
  Using cached certifi-2022.12.7-py3-none-any.whl (155 kB)
Collecting charset-normalizer<4,>=2
  Using cached charset_normalizer-3.1.0-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (195 kB)
Collecting markdown-it-py<3.0.0,>=2.2.0
  Using cached markdown_it_py-2.2.0-py3-none-any.whl (84 kB)
Collecting distlib<1,>=0.3.6
  Using cached distlib-0.3.6-py2.py3-none-any.whl (468 kB)
Collecting webencodings
  Using cached webencodings-0.5.1-py2.py3-none-any.whl (11 kB)
Collecting six>=1.9.0
  Using cached six-1.16.0-py2.py3-none-any.whl (11 kB)
Collecting mdurl~=0.1
  Using cached mdurl-0.1.2-py3-none-any.whl (10.0 kB)
Collecting cryptography>=2.0
  Using cached cryptography-40.0.2-cp36-abi3-manylinux_2_28_x86_64.whl (3.7 MB)
Collecting more-itertools
  Using cached more_itertools-9.1.0-py3-none-any.whl (54 kB)
Collecting cffi>=1.12
  Using cached cffi-1.15.1-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (442 kB)
Collecting pycparser
  Using cached pycparser-2.21-py2.py3-none-any.whl (118 kB)
Installing collected packages: webencodings, gitchangelog, distlib, zipp, wrapt, urllib3, typing-extensions, tomlkit, tomli, six, rfc3986, pyyaml, Pygments, pyflakes, pycparser, pycodestyle, pluggy, platformdirs, pkginfo, pathspec, packaging, nodeenv, mypy-extensions, more-itertools, mdurl, mccabe, lazy-object-proxy, jeepney, isort, iniconfig, idna, identify, filelock, exceptiongroup, docutils, dill, coverage, colorama, click, charset-normalizer, chardet, cfgv, certifi, cachetools, virtualenv, requests, pytest, pyproject-api, mypy, markdown-it-py, jaraco.classes, importlib-resources, importlib-metadata, flake8, cffi, bleach, black, astroid, tox, rich, requests-toolbelt, readme-renderer, pytest-cov, pylint, pre-commit, cryptography, tox-gh-actions, SecretStorage, keyring, twine
Successfully installed Pygments-2.15.1 SecretStorage-3.3.3 astroid-2.15.4 black-23.3.0 bleach-6.0.0 cachetools-5.3.0 certifi-2022.12.7 cffi-1.15.1 cfgv-3.3.1 chardet-5.1.0 charset-normalizer-3.1.0 click-8.1.3 colorama-0.4.6 coverage-7.2.3 cryptography-40.0.2 dill-0.3.6 distlib-0.3.6 docutils-0.19 exceptiongroup-1.1.1 filelock-3.12.0 flake8-6.0.0 gitchangelog-3.0.4 identify-2.5.22 idna-3.4 importlib-metadata-6.6.0 importlib-resources-5.12.0 iniconfig-2.0.0 isort-5.12.0 jaraco.classes-3.2.3 jeepney-0.8.0 keyring-23.13.1 lazy-object-proxy-1.9.0 markdown-it-py-2.2.0 mccabe-0.7.0 mdurl-0.1.2 more-itertools-9.1.0 mypy-1.2.0 mypy-extensions-1.0.0 nodeenv-1.7.0 packaging-23.1 pathspec-0.11.1 pkginfo-1.9.6 platformdirs-3.2.0 pluggy-1.0.0 pre-commit-3.2.2 pycodestyle-2.10.0 pycparser-2.21 pyflakes-3.0.1 pylint-2.17.2 pyproject-api-1.5.1 pytest-7.3.1 pytest-cov-4.0.0 pyyaml-6.0 readme-renderer-37.3 requests-2.28.2 requests-toolbelt-0.10.1 rfc3986-2.0.0 rich-13.3.4 six-1.16.0 tomli-2.0.1 tomlkit-0.11.7 tox-4.4.12 tox-gh-actions-3.1.0 twine-4.0.2 typing-extensions-4.5.0 urllib3-1.26.15 virtualenv-20.22.0 webencodings-0.5.1 wrapt-1.15.0 zipp-3.15.0

[notice] A new release of pip is available: 23.0.1 -> 23.1
[notice] To update, run: python -m pip install --upgrade pip
.venv/bin/pre-commit install
pre-commit installed at .git/hooks/pre-commit
source .venv/bin/activate && pip3 install --editable .
/bin/sh: 1: source: not found
make: *** [Makefile:11: virtualenv] Error 127

Do you have any thoughts about moving to poetry for managing dependencies, builds, uploads to PyPI, etc.? This would also resolve any virtualenv issues.

davidsbatista · 2023-04-24T18:38:11Z

I've never worked with poetry - have you before?

I'm looking into poetry

ivyleavedtoadflax · 2023-04-25T09:07:44Z

> I've never worked with poetry - have you before?
>
> I'm looking into poetry

yes, we started recently, and it's actually pretty good!

ivyleavedtoadflax

Going to approve this to keep things moving, I think our poetry question is probably out of scope for this PR in any case!

davidsbatista · 2023-04-26T09:03:48Z

sorry Matt, I have too much on my plate - I will merge, but let's keep in mind to use poetry for managing dependencies and the dev environment

davidsbatista added 9 commits April 14, 2023 19:56

removing duplicated code and fixing type hit

402e361

removed codecov from requirements.txt

2f701c0

wip: adding summary report

b2e11dc

wip: adding summary report

50c0ee0

wip: adding summary report and examples

278f44b

wip: adding summary report and examples

0622617

wip: adding summary report and examples

7594a80

wip: adding summary report and examples

3b966c2

wip: adding summary report and examples

d379fef

davidsbatista requested a review from ivyleavedtoadflax April 16, 2023 12:25

nit

36b30d7

full working example

290229c

ivyleavedtoadflax approved these changes Apr 26, 2023

davidsbatista merged commit 057a63f into main Apr 26, 2023
4 checks passed

davidsbatista deleted the NEW/add-code-with-examples branch May 16, 2023 19:28
