Adopt BNF (via llamacpp) and/or Dspy #1070

Open
2 tasks done
kratosok opened this issue Nov 21, 2023 · 2 comments
Labels
needs triage Needs labels assigned.

Comments

@kratosok

Feature/Improvement Description

I've been using LocalAI (as an OpenAI replacement) to serve multiple local models for inference and came across the idea of BNF grammars in llama.cpp. I recall some talk here about the challenges of getting consistent responses from open-source models.

Adopting BNF grammars to constrain responses (see BNF grammars in llama.cpp and other projects), or the new DSPy work from Stanford, might be the solution to reliable responses. With DSPy, you might be able to provide good working conversation flows/examples from, say, GPT-4 as targets for the DSPy compiler, so it can learn how to get the same or a similar type of response from any given open-source model.

One or both of these approaches would push AGiXT to the next level of usability and reliability, in my opinion.

Proposed Solution

See the BNF conversation here:
https://www.reddit.com/r/LocalLLaMA/comments/16wf3w5/ggml_bnf_grammar_the_key_to_autonomous_agents/

Or:
https://github.com/stanfordnlp/dspy

Acknowledgements

  • I have searched the existing issues to make sure this feature has not been requested yet.
  • I have provided enough information for everyone to understand why this feature request is needed in AGiXT.
@kratosok kratosok added the needs triage Needs labels assigned. label Nov 21, 2023
@kratosok
Author

So it seems I'm going to take a crack at this myself. I'm adding "grammar_json_functions" and "grammar" attributes to the "ChatCompletions" class in agixt/Models.py. Next I will need to check the custom provider options to see how ChatCompletions are used and whether they pass the grammar attributes along, and add that where they don't. Ultimately, since the grammar would affect the responses coming back, I think I will add custom prompts for my custom provider so that I can have some sort of embedded grammar instruction in the prompts.

class ChatCompletions(BaseModel):
    model: str = "gpt-3.5-turbo"
    messages: List[dict] = None
    temperature: Optional[float] = 0.9
    top_p: Optional[float] = 1.0
    functions: Optional[List[dict]] = None
    function_call: Optional[str] = None
    n: Optional[int] = 1
    stream: Optional[bool] = False
    stop: Optional[List[str]] = None
    max_tokens: Optional[int] = 8192
    presence_penalty: Optional[float] = 0.0
    frequency_penalty: Optional[float] = 0.0
    logit_bias: Optional[Dict[str, float]] = None
    user: Optional[str] = None
    system_message: Optional[str] = ""
    grammar_json_functions: Optional[List[dict]] = None
    grammar: Optional[str] = None
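To make the "check how the provider uses ChatCompletions" step concrete, here is a rough sketch of how a provider might forward the new field. It uses a plain dataclass as a stand-in so it runs without pydantic; `ChatRequest` and `build_payload` are hypothetical names for illustration, not existing AGiXT code.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ChatRequest:
    """Hypothetical stand-in for the ChatCompletions fields used here."""
    model: str
    messages: List[dict]
    temperature: float = 0.9
    max_tokens: int = 8192
    grammar: Optional[str] = None  # raw grammar text, if the caller supplied one


def build_payload(req: ChatRequest) -> dict:
    """Build the JSON body for the backend, adding "grammar" only when set."""
    payload = {
        "model": req.model,
        "messages": req.messages,
        "temperature": req.temperature,
        "max_tokens": req.max_tokens,
    }
    # Backends that do not understand "grammar" never see the key
    # unless a grammar was explicitly requested.
    if req.grammar:
        payload["grammar"] = req.grammar
    return payload
```

Leaving the key out when no grammar is given should keep existing providers working unchanged, which seems like the safer default.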

Example BNF grammar call to the llama.cpp API (the quotes inside the grammar have to be escaped to keep the JSON valid):
curl http://localhost:8080/v1/chat/completions -H "Content-Type: application/json" -d '{
"model": "gpt-4",
"messages": [{"role": "user", "content": "Do you like apples?"}],
"grammar": "root ::= (\"yes\" | \"no\")"
}'
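The same request can be built from Python, which avoids hand-escaping the quotes inside the grammar because `json.dumps` does it. This is only a sketch using the standard library; `make_request` is a hypothetical helper, and the URL and model name are simply the ones from the curl example.

```python
import json
import urllib.request

# Grammar written naturally; json.dumps escapes the inner quotes.
GRAMMAR = 'root ::= ("yes" | "no")'


def make_request(url: str, question: str, grammar: str) -> urllib.request.Request:
    """Prepare (but do not send) a chat completion request with a grammar."""
    body = {
        "model": "gpt-4",
        "messages": [{"role": "user", "content": question}],
        "grammar": grammar,
    }
    data = json.dumps(body).encode("utf-8")
    return urllib.request.Request(
        url,
        data=data,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = make_request(
    "http://localhost:8080/v1/chat/completions",
    "Do you like apples?",
    GRAMMAR,
)

# Sending is left commented out because it needs a running server:
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp))
```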

So I'm thinking a prompt such as "Break into steps.txt" might look like:
{context}

Primary objective: {user_input}

{introduction}

Instructions:

  1. Instead of solving the problem, we want to break it down into smaller problems that are easier to solve.
  2. Be very descriptive for every step that will be required and give those steps in a numbered list.
  3. Consider that each step needs to be worded as an instruction for another AI to follow that is not as smart as you.

{"grammar": "numbered_list ::= (task_number task_description)+; task_number ::= INTEGER; task_description ::= /[^"]+/;"}
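The grammar line above reads more like generic BNF pseudocode. As far as I understand llama.cpp's grammar format (GBNF), it expects a `root` rule, one rule per line, and character classes rather than `INTEGER` or regex slashes, and it is applied at sampling time through the request's "grammar" field rather than through the prompt text. A rough sketch of what the numbered-list grammar could look like in that format, with a regex that mirrors it for checking responses locally (the rule names and helpers here are my own assumptions, not anything from AGiXT):

```python
import re

# Assumed GBNF for "1. some step\n2. another step\n".
# Raw string so that \n reaches the grammar parser as a backslash escape.
STEP_LIST_GRAMMAR = r'''
root ::= item+
item ::= number ". " description "\n"
number ::= [0-9]+
description ::= [^\n]+
'''.strip()

# Regex with the same shape as the grammar, for checking output client-side.
STEP_LIST_RE = re.compile(r"(?:[0-9]+\. [^\n]+\n)+")
STEP_RE = re.compile(r"[0-9]+\. ([^\n]+)\n")


def is_step_list(text: str) -> bool:
    """True if the whole text is one or more numbered steps."""
    return STEP_LIST_RE.fullmatch(text) is not None


def parse_steps(text: str) -> list:
    """Return the step descriptions without their numbers."""
    return [m.group(1) for m in STEP_RE.finditer(text)]
```

The grammar string would be sent as the "grammar" value alongside the rendered prompt, and `parse_steps` could then split the reply into individual instructions.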

Any thoughts/advice?

@Josh-XT
Owner

Josh-XT commented Mar 7, 2024

Hey @kratosok! I actually haven't used the grammars. I found an approach that worked before those started being implemented and stuck with it. I am open to moving to grammars since they're trending towards being the standard for that, but I don't currently have any data formatting issues with AGiXT that I am aware of.
