
Exposing grammar as a request parameter in completion/chat with go-side grammar validation #4525

Open · wants to merge 3 commits into main

Conversation


richardanaya commented May 19, 2024

Why is passing down grammars needed?

Relying on the context of a prompt to dictate structure can be unreliable (it depends on the model and on generation randomness) and takes up context space. Grammars are a well-proven way to constrain generated output; in fact, format="JSON" already depends on one. But format="JSON" allows no reliable specification of large, complex structures and can even be tricked with prompt attacks.


Why grammar and not JSON schema?

While JSON schema support would make a nice future addition, there's interest in data structures outside of JSON (simple enum values, programming languages, etc.). Also, JSON schema generators fundamentally rely on grammars, so validating the grammar generated from a JSON schema will also benefit from grammar checking.

Why not just pass along the grammar to llama.cpp?

I looked into the complexities of passing the grammar along to the llama.cpp server. There are a few challenges:

  • the llama.cpp server doesn't return errors when a bad grammar is passed to it with streaming mode on; it gives an incomprehensible "unexpected EOF"
  • the in-memory model is reused if the grammar is valid or changed, BUT the in-memory model appears to get reloaded if you give it a bad grammar and then follow up with a good one
  • reusing in-memory models appears to work perfectly when passing along a completely valid grammar (even a variety of valid grammars)

My conclusion from this, given the advice of the community, is that we do indeed have to do our own GBNF grammar validation on the Go server side to do our best at preventing bad grammar from being passed down.


In this PR I've created:

  • the functionality to pass along a grammar in chat and completion mode
  • documentation in the README for the new property
  • prevention of using the grammar and format parameters at the same time
  • validation code for grammars
  • an extensive set of 30+ tests for grammars, ranging over character classes, strings, internationalization, comments, etc.
  • tests of every known grammar in llama.cpp, plus individual unit tests
  • no usage of regexes, to keep the parsing clear and understandable

Edge cases:

  • I've probably not implemented the entirety of what's possible in character classes, but I have a limited subset compatible with the grammars listed on llama.cpp. My assumption is most people's grammars will be less complex than these.
  • there might be some valid grammars I don't currently support (though to the best of my knowledge we support all the major publicly available ones, including ones as complex as the C programming language). I chose not to use a full-on Go parser library because I wanted the cognitive load of this code to be approachable initially (rather than requiring every reader of this code to learn a new library). If in the future we want to replace it with a more formal technology, we can, and the tests can be reused.

Examples of success:

(two screenshots of successful grammar-constrained responses)

Example of failure:

(screenshot of a request rejected with a grammar error)

I believe this PR satisfies #4074 with an acceptable amount of protection against sending invalid GBNF grammars, along with useful error messages.

@richardanaya
Author

richardanaya commented May 20, 2024

I think I've added as many tests as I can think of that meaningfully add coverage. I'll await feedback. @jmorganca

@mitar

mitar commented May 31, 2024

Have you seen #3618? It adds both grammar and JSON schema options, which are then passed to llama.cpp. I think it would be nice to combine the two PRs (especially the tests from this one).

return fmt.Errorf("grammar and format cannot be used together")
}

err := ValidateGrammar(req.Grammar)

Is it really necessary to validate the grammar? Llama.cpp does that anyway?

Author

I showed in my investigation above that bad grammars eject the model from memory and force a reload. The streaming llama.cpp server has bad error handling. The Go-side grammar check prevents that.


Hm, where is go-side in this PR? I see that you implemented your own validator?

Author

Yah, I wrote my own validator in the grammar.go file of the commit. I didn't want to obligate this project to a particular parsing library, so I tried to be as straightforward as possible to get some initial validation going. I was a bit paranoid the PR might seem too strange if I did something too esoteric. The suite of tests, I think, could be useful for whatever the Go validation evolves into.

I was aiming for something broadly adequate at validation, to protect the llama.cpp server from model ejections. I noticed that a lot of PRs didn't get accepted because they were pretty simplistic pass-throughs. Someone I talked with on Discord said that the lack of even basic protection of the server's state might have been the reason. That's sort of why this PR evolved the way it did.


Oh, I thought "go-side" is some library for grammar checking. :-) Lol.

Author

S'all good

@richardanaya
Author

richardanaya commented Jun 1, 2024

Have you seen #3618? It adds both grammar and JSON schema options, which are then passed to llama.cpp. I think it would be nice to combine the two PRs (especially the tests from this one).

I have, but I think it suffers from the same problem as bad grammars: erroneous JSON schemas cause model ejections. I think it would require a JSON schema validator. I think that's a big enough task that it'd make sense to either leave it to another wrapper project to convert JSON schema to grammar, or put it in a separate PR. I'm interested in writing that separate PR, but I'd like to at least get this one finished. I think a lot of folks have been waiting on even just basic grammar support. Thanks!

@mitar

mitar commented Jun 1, 2024

I think it suffers from the same problem as bad grammars: erroneous JSON schemas cause model ejections.

I am not sure why I would have to pay for grammar and JSON schema validation at every API request; I find this strange. In general, in my case the JSON schema and/or grammar is part of trusted input and not provided by the user. At the least, validation should be cached or something.

Also, is the point of this validation to prevent malicious values for grammar, or just accidental erroneous values? Because if the goal is to prevent malicious values, then the validator should match exactly what llama.cpp would reject. Otherwise an attacker could bypass this validator by crafting a value which passes it but is still rejected by llama.cpp.

So I am not sure exactly why this validation is needed.

@richardanaya
Author

These are two valid concerns:

  • Caching could definitely help; I could add something small.
  • You're right that a malicious hacker could find some way to bust the internal model. The goal of this PR isn't to be perfect security; it's to get the ball rolling on getting grammar into the project, get feedback, and be generally aligned with the desired principles. As I said above, my validator isn't a perfect representation of llama.cpp's capabilities. To my knowledge, a spec of their grammar support doesn't even exist, so my validation is a subset, may have holes, and is based on what I could determine from their public documentation and publicly available grammars.

Again :) I know nothing about the mindset of the project owners on what's holding back merging grammar support. I did my best to make a PR in line with conversations with older members on Discord, to speculatively address their issues while not making something too esoteric.

@richardanaya
Author

@mitar added simple caching and some simple sanity checking around the size of the grammar

@richardanaya
Author

@mitar I thought about your concern: I now only process grammars if OLLAMA_GRAMMAR is set to "true". That way custom grammar is opt-in for people okay with its trade-offs.

@mitar

mitar commented Jun 1, 2024

I now only process grammars if OLLAMA_GRAMMAR is set to "true". That way custom grammar is opt-in for people okay with its trade-offs.

I do not get why this would be useful. You should maybe only make the validation optional. But you should always pass the grammar through if the user wants to use it?

@richardanaya richardanaya requested a review from mitar June 1, 2024 20:12
@richardanaya
Author

richardanaya commented Jun 1, 2024

I now only process grammars if OLLAMA_GRAMMAR is set to "true". That way custom grammar is opt-in for people okay with its trade-offs.

I do not get why this would be useful. You should maybe only make the validation optional. But you should always pass the grammar through if the user wants to use it?

My understanding is that the goal is to protect the model from being ejected by the llama.cpp server. We should never pass down invalid grammar, to the best of our ability. Therefore passing grammar in is opt-in until the community is comfortable with the validation. In other words, we shouldn't let a vulnerability be the default behavior.

@MHugonKaliop

Would like to see this merged; using grammar, plus having a warning when the syntax is wrong, is highly interesting for me ;)

@mann1x
Contributor

mann1x commented Jun 7, 2024

@BruceMacD
This seems a pretty solid implementation; can someone review it and consider merging?

@ketsapiwiq

ketsapiwiq commented Jun 7, 2024

If the maintainers are afraid to add this grammar property to the API, maybe we can merge this PR with the one exposing OpenAI-like function calling (defining tools) in the API? (#3303)
I think that grammar constraints are mostly for JSON / function calling anyway.

@richardanaya
Author

Just for clarity, I'm not interested in @ketsapiwiq's PR.

@mitar

mitar commented Jun 8, 2024

Personally, I think the only thing which should be done is https://gitlab.com/peerdb/ollama/-/commit/f1f7b7ea0cc1c582fdb122b69646fc1b5661c9c8 (+ a documentation update). This adds little code for ollama to maintain, and I think its simplicity is better than the potential downsides of passing invalid data. (Passing invalid data could also be handled in the future by llama.cpp.)

@richardanaya
Author

@mitar

I would actually prefer this as well, out of simplicity. But there have been countless PRs in Ollama that have suggested that already and never made it. The reason this PR is so complex is that I'm addressing what I assume are objections that were concerns.

@mitar

mitar commented Jun 9, 2024

I assume are objections that were concerns.

But those objections you assume were never stated.

@richardanaya
Author

richardanaya commented Jun 9, 2024

@mitar I'm just trying to do whatever possible to make this happen. I did not do it on a whim; I took my best guess at an implementation based on advice from @mann1x, who is more familiar with the community than I am. But you are correct that there is a lot of silence around this issue; trust me, I've tried to get answers.

@mann1x
Contributor

mann1x commented Jun 9, 2024

@richardanaya @mitar

I don't know why there's no interest from the maintainers in this. I can only speculate that they have something different in mind, like an implementation that requires changes in the library.

I think a big factor is that grammars can easily tank performance; until a little while ago this could easily get so bad that llama.cpp would become unresponsive. You don't want to implement something that brings more support headaches than anything else. Now that the situation is better in llama.cpp, maybe they are more open to considering it.

Personally, I much prefer the complex implementation. It's not that complex, and pre-filtering could be used to mitigate the issues, which will come up for sure.

But the specifics of the implementation are irrelevant if there's no feedback from the maintainers. Nothing can be done without it.

@richardanaya
Author

@mann1x yah, something I'll offer about this specific PR to emphasize is that it's only enabled by an env flag. Even if the maintainers are nervous about making it the default, I see no reason why we can't enable it for those who voluntarily opt in.

@mann1x
Contributor

mann1x commented Jun 9, 2024

@mann1x yah, something I'll offer about this specific PR to emphasize is that it's only enabled by an env flag. Even if the maintainers are nervous about making it the default, I see no reason why we can't enable it for those who voluntarily opt in.

This is indeed a plus.

@MHugonKaliop

MHugonKaliop commented Jun 9, 2024

Some comments, as I asked for this merge first ;) (well, as a simple user).

This feature (from what I have seen) allows asking for an answer in a specific format. I know we can do it in the prompt, but from time to time the answer doesn't follow the rules, and we have to code extra verification and "manual" correction (or rerun). I already "cheated" by using the function format, but even then, from time to time I get answers not following the rules I need.

With a grammar, if I want a JSON with a value chosen from three options, I'll have it. That's why this feature is very interesting for my use cases.

Concerning the validation, I understand the point. There are other ways to check the syntax, and I can agree that it may not be ollama's role. I can deal without ;)

@zutto

zutto commented Jun 10, 2024

Looked briefly at some of the other pull requests related to this grammar change.

I think some effort from the maintainers is required to get this going; this change seems to be stagnating, and it would be great to get some insight into why that is and why the efforts are being ignored.
If the first version of the feature is not perfect, it can and will be improved in the future. But for now, it would be very good to get grammar support in Ollama.

--
Similar/related pull requests:
