Question - How to run Entity extraction #1345

Open
imvetri opened this issue Jul 24, 2023 · 5 comments

Comments

@imvetri

imvetri commented Jul 24, 2023

No description provided.

@jesus-seijas-sp
Contributor

A few examples

  1. Running NER when the number of entities is huge (20,000 airports)
    https://github.com/axa-group/nlp.js/blob/master/examples/06-huge-ner/index.js

  2. Using NER in the corpus with the contextData field
    https://github.com/axa-group/nlp.js/tree/master/examples/14-ner-corpus

  3. Using the entities detected by the NER in answers generated by the NLG
    https://github.com/axa-group/nlp.js/blob/master/examples/17-ner-nlg/index.js

  4. Using the built-in detectors from the Microsoft library
    https://github.com/axa-group/nlp.js/blob/master/examples/18-ner-builtin-ms/index.js

  5. Using the Trim entity
    https://github.com/axa-group/nlp.js/blob/master/examples/19-ner-trim-entities/index.js

Documentation

  1. NER quickstart
    https://github.com/axa-group/nlp.js/blob/master/docs/v4/ner-quickstart.md

  2. Explanation of NER manager, enum entities, regex entities, trim entities and builtin entities
    https://github.com/axa-group/nlp.js/blob/master/docs/v4/ner-manager.md

  3. Explanation of builtin entity extraction, the different builtin entities, and how they work with and without Duckling
    https://github.com/axa-group/nlp.js/blob/master/docs/v3/builtin-entity-extraction.md

  4. Integration with Duckling
    https://github.com/axa-group/nlp.js/blob/master/docs/v3/builtin-duckling.md

  5. Integration with Compromise
    https://github.com/axa-group/nlp.js/blob/master/packages/builtin-compromise/README.md

  6. Slot Filling
    https://github.com/axa-group/nlp.js/blob/master/docs/v4/slot-filling.md

Unit Tests that can be used as examples

  1. Default builtins
    https://github.com/axa-group/nlp.js/blob/master/packages/builtin-default/test/builtin-default.test.js

  2. Compromise builtins
    https://github.com/axa-group/nlp.js/blob/master/packages/builtin-compromise/test/compromise.test.js

  3. Microsoft builtins
    https://github.com/axa-group/nlp.js/blob/master/packages/builtin-microsoft/test/builtin-microsoft.test.js

  4. NER tests
    https://github.com/axa-group/nlp.js/blob/master/packages/ner/test/ner.test.js

  5. Enum Extractor tests
    https://github.com/axa-group/nlp.js/blob/master/packages/ner/test/extractor-enum.test.js

  6. Regex Extractor tests
    https://github.com/axa-group/nlp.js/blob/master/packages/ner/test/extractor-regex.test.js

  7. Trim Extractor tests
    https://github.com/axa-group/nlp.js/blob/master/packages/ner/test/extractor-trim.test.js

  8. NLP Manager entity extraction tests
    https://github.com/axa-group/nlp.js/blob/master/packages/node-nlp/test/nlp/nlp-manager.test.js#L357

  9. Many more NER tests in the NLP library
    https://github.com/axa-group/nlp.js/blob/master/packages/nlp/test/nlp.test.js

@MarketingPip

Here is an example using trimmed entity extraction - hope this helps.

import { containerBootstrap } from "https://cdn.skypack.dev/@nlpjs/core@4.26.1";
import { Nlp } from "https://cdn.skypack.dev/@nlpjs/nlp@4.26.1";
import { LangEn } from "https://cdn.skypack.dev/@nlpjs/lang-en-min@4.26.1";


(async () => {
  // Set up the container with the NLP plugin and English language support.
  const container = await containerBootstrap();
  container.use(Nlp);
  container.use(LangEn);
  const nlp = container.get('nlp');
  nlp.settings.autoSave = false;
  nlp.addLanguage('en');

  nlp.addDocument('en', 'I want to travel from %fromCity% to @toCity', 'travel');

  // Trim rules: capture the text between two marker words, or after the last marker word.
  nlp.addNerBetweenLastCondition('en', 'fromCity', 'from', 'to');
  nlp.addNerAfterLastCondition('en', 'fromCity', 'from');
  nlp.addNerBetweenLastCondition('en', 'toCity', 'to', 'from');
  nlp.addNerAfterLastCondition('en', 'toCity', 'to');

  // Register both entities as mandatory slots of the 'travel' intent (once each).
  nlp.slotManager.addSlot('travel', 'fromCity', true);
  nlp.slotManager.addSlot('travel', 'toCity', true);

  await nlp.train();
  const response = await nlp.process('en', 'I want to travel from here to you go bro!');
  console.log(response.entities[0].utteranceText); // Outputs: here
  console.log(response.entities[1].utteranceText); // Outputs: you go bro!
})();

@imvetri
Author

imvetri commented Jul 27, 2023

I have something in mind, but I'm not sure whether it's right or not. What do you think about this? https://vimeo.com/manage/videos/754241914

@imvetri
Author

imvetri commented Jul 28, 2023

I have an idea and I’d like to experiment with it on a model.

Idea: teach a language or text-based model custom truths.

Prompt - Give me the list of words in the model
Prompt - Give me the list of action words in the model
Prompt - Build a fact tree of the model and return the text
Prompt - What is a fact?
Answer - No answer
Prompt - A fact is a truth found after an experiment
Answer - (It doesn't know what an experiment is) What is an experiment?
Prompt - Feed it information about experiments and how to conduct one

At this point, you have built a system that takes natural language as instructions to the model, and we can instruct it to take the shape we intend.

In other words, if I feed it the grammar of a language and then the language, it runs the language against the grammar. If I feed it information on how to trace control flow, then given a program, it can return information on the control flow that occurs.

So what I’m asking for is a way to make the model programmable in natural language and to force it to correct itself. If I try to correct it, I want to see that happening visually, so having a fact tree will help me understand the model and be sure whether it actually learned or not.

This also helps me build a truth finder.

The first step is to build a fact tree, where the subject part of a sentence is an action that happened in the past and the verb part of a sentence is an action. At the very core of this fact tree is an ultimate truth, which is movement, time, and space, and more units can appear above it. Since we define a fundamental unit for this fact tree and make the fact tree visual, it holds information about its own visual structure, which gives it an opportunity to reach a state of singularity. In the beginning it may be just a visual singularity or consciousness, just to keep the focus ring of the model always in the core, because that's where the truth is. The more time the focus ring spends on a node, the more that memory is strengthened.

Terminology
Model - Nodes of information
Fact tree - A visual structure of the words in the model
Focus ring - Nodes spark up based on the tuning level or the text generation quantity; wherever and whenever this focus ring moves across them, text is generated. It is like the focus part of our brain. The shape need not be a ring, because it can be multidimensional, connecting dots from different pieces of the tree.

Fact tree - It changes shape visually. This is not a performant technique.

@MarketingPip

@imvetri - you will need to define an array and give the bot some context / memory. For example, define an utterance with slots such as "@thing is my @value". If the input is "GitHub is my favorite", then GitHub and favorite are the "slots" or "entities" that get filled.

Use a statement to add those slots to your "memory".

Then define another intent with an utterance such as "What is @value". If the value is found in memory (i.e. your array), return it.

Otherwise, say "I do not know what @value is".

To show the memory, define another intent for that and have it call a function you create to display the memory.
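
The steps above can be sketched without any library, just to show the memory part. This is a minimal illustration and not nlp.js code: the two regular expressions are hypothetical stand-ins for the "@thing is my @value" and "What is @value" utterances, and `memory`, `handle`, and the "show memory" command are names made up for this example. It also assumes the memory is keyed by the value and returns the thing. In a real bot the entities would come from the NER result instead.

```javascript
// Hypothetical in-memory store, keyed by the lowercased @value.
const memory = new Map();

// Stand-in for the "@thing is my @value" utterance.
const TEACH = /^(.+?) is my (.+)$/i;
// Stand-in for the "What is @value" utterance (optional trailing "?").
const ASK = /^what is (.+?)\??$/i;

function handle(input) {
  const text = input.trim();

  // Separate "intent" that shows everything remembered so far.
  if (/^show memory$/i.test(text)) {
    return JSON.stringify(Object.fromEntries(memory));
  }

  // Check the question first, so "What is ..." is never stored as a fact.
  const ask = text.match(ASK);
  if (ask) {
    const key = ask[1].toLowerCase();
    return memory.has(key)
      ? memory.get(key)
      : `I do not know what ${ask[1]} is`;
  }

  // Store the filled slots in memory.
  const teach = text.match(TEACH);
  if (teach) {
    memory.set(teach[2].toLowerCase(), teach[1]);
    return 'Noted.';
  }

  return 'Sorry, I did not understand.';
}
```

For example, calling `handle('GitHub is my favorite')` stores the pair, after which `handle('What is favorite')` returns the remembered thing, while an unknown value falls through to the "I do not know" reply.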


For a more advanced approach, you could use another library to find all nouns etc., check whether they are in the list, and see whether there is any context. If a noun is not in the list, return "What is an experiment", await input, add it to memory, and so on.

Hope this helps. Though I highly doubt this will be added as a feature or implemented for you. You will need to do it yourself. Best of luck!

3 participants