Named-entity recognition (NER)

NER is a subtask of information extraction that seeks to locate named entities mentioned in unstructured text and classify them into pre-defined categories such as person names, organizations, locations, medical codes, time expressions, quantities, monetary values, and percentages.
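As a rough illustration of what "locate and classify" means in practice, the sketch below turns per-token tags into entity spans. It assumes the common BIO tagging scheme (B- starts an entity, I- continues it, O is outside any entity); the function name `bio_to_spans`, the example sentence, and the tag names are hypothetical and not taken from this project's notebooks.

```python
def bio_to_spans(tokens, tags):
    """Group BIO-tagged tokens into (entity text, entity type) pairs."""
    spans = []
    current_tokens, current_type = [], None
    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            # A new entity starts: close the previous one, if any.
            if current_tokens:
                spans.append((" ".join(current_tokens), current_type))
            current_tokens, current_type = [token], tag[2:]
        elif tag.startswith("I-") and current_type == tag[2:]:
            # Continuation of the entity that is currently open.
            current_tokens.append(token)
        else:
            # "O", or an I- tag that does not match the open entity
            # (treated as outside in this simplified sketch).
            if current_tokens:
                spans.append((" ".join(current_tokens), current_type))
            current_tokens, current_type = [], None
    if current_tokens:
        spans.append((" ".join(current_tokens), current_type))
    return spans


tokens = ["Ada", "Lovelace", "visited", "London", "in", "1842"]
tags = ["B-PER", "I-PER", "O", "B-LOC", "O", "O"]
print(bio_to_spans(tokens, tags))
```

The tagger's job is to produce the `tags` list; grouping tags into spans is a post-processing step.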

CRF MODEL

Conditional random fields (CRFs) are a class of statistical models used for structured prediction in pattern recognition and machine learning. Whereas an ordinary classifier predicts a label for a single sample without considering neighboring samples, a CRF takes context into account: the prediction is modeled as a graphical model that encodes dependencies between the predicted labels. In natural language processing, linear-chain CRFs are popular; they model sequential dependencies between the labels of adjacent tokens.
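To make the role of context concrete, here is a minimal sketch of Viterbi decoding for a linear-chain CRF, written in plain Python. The scores are made-up numbers rather than values learned by this project's model, and the names `viterbi`, `emissions`, and `transitions` are assumptions chosen for illustration. `emissions[t][y]` is the score of label `y` at token `t`, and `transitions[p][y]` is the score of moving from label `p` to label `y`.

```python
LABELS = ["O", "B-PER", "I-PER"]

def viterbi(emissions, transitions):
    """Return the highest-scoring label index sequence."""
    n_labels = len(transitions)
    best = list(emissions[0])  # best score of any path ending in each label
    backptrs = []
    for scores in emissions[1:]:
        new_best, ptrs = [], []
        for y in range(n_labels):
            # Best previous label to arrive at label y.
            prev = max(range(n_labels), key=lambda p: best[p] + transitions[p][y])
            ptrs.append(prev)
            new_best.append(best[prev] + transitions[prev][y] + scores[y])
        best = new_best
        backptrs.append(ptrs)
    # Follow back-pointers from the best final label.
    path = [max(range(n_labels), key=lambda y: best[y])]
    for ptrs in reversed(backptrs):
        path.append(ptrs[path[-1]])
    path.reverse()
    return path

# Illustrative scores for a two-token input.
emissions = [[1.0, 0.8, 0.0],
             [0.2, 0.0, 1.0]]
# O -> I-PER is heavily penalized; B-PER -> I-PER is rewarded.
transitions = [[0.0, 0.0, -5.0],
               [0.0, -1.0, 1.0],
               [0.0, 0.0, 0.5]]

# Per-token argmax ignores neighbors and yields the invalid sequence O, I-PER.
greedy = [max(range(len(LABELS)), key=lambda y: row[y]) for row in emissions]
decoded = viterbi(emissions, transitions)
print([LABELS[i] for i in greedy])
print([LABELS[i] for i in decoded])
```

With these numbers the independent classifier picks O then I-PER, while the CRF decoding prefers B-PER then I-PER because the transition scores make the first sequence very unlikely.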

Character Embeddings

As deep learning in NLP exploded, larger and larger vocabulary sizes were needed. Character and subword embeddings were an attempt to limit the size of embedding matrices. However, these types of embeddings do not encode the same deep semantics that word embeddings encode.

Character embeddings are constructed in a similar fashion to word embeddings. However, instead of embedding at the word level, the vectors represent each character in a language. For example, instead of a single vector for "king", there would be a separate vector for each of the letters "k", "i", "n", and "g". Because they do not carry the same type of information as word embeddings, character-level embeddings are best thought of as encoding lexical information, and they may be used to enhance or enrich word-level embeddings.

Character embeddings have two practical advantages:

They can handle new words and misspellings.
The required embedding matrix is much smaller than the one required for word-level embeddings.
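The following sketch illustrates both points with a toy character embedding table. The vectors are randomly initialized, not trained, and the names `EMBED_DIM`, `char_table`, `embed_chars`, and `pool` are hypothetical. Mean pooling is used only as the simplest way to get one vector per word; it is an assumption of this sketch, and models often use an LSTM or CNN over the character vectors instead.

```python
import random
import string

EMBED_DIM = 4  # illustrative size, far smaller than a real model would use

# Character vocabulary: two special symbols plus lowercase ASCII letters.
chars = ["<pad>", "<unk>"] + list(string.ascii_lowercase)
char_to_id = {c: i for i, c in enumerate(chars)}

# One small random vector per character (28 rows in total).
rng = random.Random(0)
char_table = [[rng.uniform(-0.5, 0.5) for _ in range(EMBED_DIM)] for _ in chars]

def embed_chars(word):
    """Look up one vector per character; unknown characters map to <unk>."""
    ids = [char_to_id.get(c, char_to_id["<unk>"]) for c in word.lower()]
    return [char_table[i] for i in ids]

def pool(vectors):
    """Average character vectors into a single word-level vector."""
    if not vectors:
        return [0.0] * EMBED_DIM
    return [sum(col) / len(vectors) for col in zip(*vectors)]

king = embed_chars("king")         # 4 vectors: k, i, n, g
misspelled = embed_chars("kingg")  # a misspelling still gets vectors
print(len(king), len(misspelled), len(char_table))
print(pool(king))
```

The table has one row per character regardless of how many distinct words appear in the corpus, which is why it stays small, and any string built from known characters can be embedded even if the word itself was never seen.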

Accuracy Report

References

Towards Data Science - here
Idea Based on - Paper Link
Dataset - here

Raj123majumder/NER_project