
Allow to swap the DefaultInlineCompletionHandler #702

Open
krassowski opened this issue Mar 27, 2024 · 6 comments
Labels
enhancement New feature or request

Comments

@krassowski
Member

Problem

DefaultInlineCompletionHandler makes some choices which may be sub-optimal when using models tuned for code infill tasks.

For example, the reliance on the LangChain template system here:

model_arguments = self._template_inputs_from_request(request)
suggestion = await self.llm_chain.ainvoke(input=model_arguments)
suggestion = self._post_process_suggestion(suggestion, request)

makes it hard to use structured prefix/suffix queries as described in
#669.

Further, the post-processing assumptions may differ when using an infill model:

def _post_process_suggestion(
    self, suggestion: str, request: InlineCompletionRequest
) -> str:

Proposed Solution

Either:

  • (a) add a traitlet allowing users to swap the DefaultInlineCompletionHandler for a different class
  • (b) add an entry point allowing users to swap the DefaultInlineCompletionHandler for a different class

Any preferences?

Additional context

Chat slash command handlers can be added (and possibly swapped?) using entry points:

[project.entry-points."jupyter_ai.chat_handlers"]
custom = "custom_package:CustomChatHandler"
@krassowski
Member Author

The traitlet approach is commonly used for swapping classes and can be configured in standard jupyter config files. For example, it is used by jupyter-scheduler here:

scheduler_class = Type(
    default_value="jupyter_scheduler.scheduler.Scheduler",
    klass="jupyter_scheduler.scheduler.BaseScheduler",
    config=True,
    help=_i18n("The scheduler class to use."),
)

and extensively in jupyter-server itself.
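For illustration, a jupyter-ai traitlet following the same pattern could look roughly like this; the trait name and class paths are assumptions, not existing jupyter-ai API:

from traitlets import Type

# hypothetical trait; the name and class paths are illustrative only
inline_completion_handler_class = Type(
    default_value="jupyter_ai.completions.handlers.DefaultInlineCompletionHandler",
    klass="jupyter_ai.completions.handlers.BaseInlineCompletionHandler",
    config=True,
    help="The inline completion handler class to use.",
)

A user could then swap the handler from a standard Jupyter config file, e.g. with something like c.AiExtension.inline_completion_handler_class = "my_package.MyHandler" (again, the trait name is hypothetical).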

While jupyter-ai uses traitlets for configuration, most of the extensibility is implemented using entry points:

  • jupyter_ai.chat_handlers,
  • jupyter_ai.model_providers,
  • jupyter_ai.default_tasks,
  • jupyter_ai.embeddings_model_providers

However, these entry points currently act as extension points rather than points for overriding the default handlers. In particular, a long-standing issue is that we cannot use jupyter_ai.chat_handlers to override the default handlers (#507).

@dlqqq @JasonWeill any thoughts/preferences?

@krassowski
Member Author

Thinking more about it, there is a clear case to favour the traitlets approach:

  • Using a traitlet makes sense if there can only ever be one completion handler; this ensures there is no confusion if multiple packages provide entry points, which would otherwise lead to undefined behaviour.
  • If we wanted multiple models to generate completions, we would probably want some kind of meta-handler rather than registering multiple handlers, because the top-level handler must coordinate sending the responses and streaming. Thus, even if we imagine having multiple handlers in the future, there should always be one top-level handler to govern them all, and that one could be traitlets-configurable to specify the list of child handlers (see the sketch below).
  • A user may wish to install multiple external providers but may not want them to replace the default handler; installation should not be equivalent to agreeing to change the default handler.
  • The handler exposes server resources; swapping it should be a conscious user decision, because swapping in a handler which does not follow auth practices may turn out to be costly. Another point in favour of the traitlets approach.
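
A minimal sketch of such a traitlets-configurable meta-handler; all class and method names here are hypothetical:

from traitlets import List, Type
from traitlets.config import LoggingConfigurable

# hypothetical meta-handler coordinating several child completion handlers
class MetaInlineCompletionHandler(LoggingConfigurable):
    handler_classes = List(
        trait=Type(),
        config=True,
        help="Child completion handler classes to delegate to.",
    )

    async def handle_request(self, request):
        # return the first non-empty reply; a real coordinator would also
        # manage streaming and enforce auth before delegating
        for klass in self.handler_classes:
            reply = await klass(parent=self).handle_request(request)
            if reply is not None:
                return reply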

@krassowski krassowski self-assigned this Apr 4, 2024
@michaelchia
Contributor

Sorry if I'm intruding, but as a user who has been extending jupyter-ai for use in my organisation, my personal preference would be for an entry point approach with different completion providers, mostly for consistency with the LLM and embedding providers (I've gone into more detail in #669). There are customisations in the current default handler that would make more sense to me implemented in some completion provider subclass instead.

My preference would be to generalise the completion handler and have most of the current logic (prompt templates and llm_chain) be part of a subclass of a base completions provider class that you can easily apply with the current LLM providers, while the base completions provider class would be more general, allowing extensions to implement some other logic (potentially non-LangChain) to provide completions.
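
To illustrate, the hierarchy I have in mind would be roughly as below; all class and method names are hypothetical, and the helpers referenced come from the current default handler logic:

from abc import ABC, abstractmethod

# hypothetical hierarchy; names are illustrative only
class BaseCompletionsProvider(ABC):
    """Any completion logic (LangChain-based or not) can implement this."""

    @abstractmethod
    async def generate(self, request):
        """Return an inline completion suggestion for the request."""

class LangChainCompletionsProvider(BaseCompletionsProvider):
    """Holds the current default logic: prompt templates plus llm_chain."""

    async def generate(self, request):
        # _template_inputs_from_request and llm_chain as in the current handler
        model_arguments = self._template_inputs_from_request(request)
        return await self.llm_chain.ainvoke(input=model_arguments)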

thanks for your consideration.

@krassowski
Member Author

My preference would be to generalise the completion handler and have most of the current logic (prompt templates and llm_chain) be part of a subclass of a base completions provider class that you can easily apply with the current LLM providers

I think I agree. FYI, I previously moved the prompt templates from handlers to providers in #581.

The provider could get two methods with default implementations (see the sketch after the list):

  • async generate_inline_completion(request) -> InlineCompletionReply
  • async stream_inline_completion(request) -> AsyncGenerator[InlineCompletionStreamChunk]
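
For illustration, the default implementations could delegate to the existing template logic roughly as below; the helper names follow the snippets above, while the reply constructors and exact signatures are assumptions:

from typing import AsyncGenerator

class BaseProvider:
    # sketch only; real signatures and reply construction may differ
    async def generate_inline_completion(
        self, request: InlineCompletionRequest
    ) -> InlineCompletionReply:
        model_arguments = self._template_inputs_from_request(request)
        suggestion = await self.llm_chain.ainvoke(input=model_arguments)
        suggestion = self._post_process_suggestion(suggestion, request)
        return InlineCompletionReply(suggestion=suggestion)  # assumed constructor

    async def stream_inline_completion(
        self, request: InlineCompletionRequest
    ) -> AsyncGenerator[InlineCompletionStreamChunk, None]:
        model_arguments = self._template_inputs_from_request(request)
        async for chunk in self.llm_chain.astream(input=model_arguments):
            yield InlineCompletionStreamChunk(content=chunk)  # assumed constructor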

This would be backward compatible. The only downside is extending the API surface of BaseProvider, and I am not sure if maintainers would accept such a contribution; I sometimes find it hard to get a reply here. I will open two PRs, one allowing users to swap the default handler and another implementing the above methods in the provider; let's see which one works out better.

@michaelchia
Contributor

Although I am not totally against overloading the same LLM provider class, as I can see how you could get it to work, I would prefer having a separate provider for inline completion logic.

I think it would be cleaner to separate the provider class for chat from the provider class for completion.
Firstly, you can cleanly separate completion concerns from chat concerns.
Secondly, I personally think chat LLMs and code completion models are different, based on my own experimentation trying to get models like Claude v2 and the Google Vertex models to do code completion reliably in a variety of scenarios (e.g. proper fill-in-the-middle from incomplete lines in prefix/suffix). I found that only true code completion models (e.g. code-llama, starcoder, code-gecko) are suitable for general code completion. Some general LLMs may be able to do it and could be offered by both providers, but that is not generally the case.
However, even with a separate completion provider class, if you do want to offer all LLM providers as code completion providers by default, I'm sure there is some way of automatically creating completion provider classes for all LLM providers.

@krassowski
Member Author

I would prefer having a separate provider for inline completion logic.

I do not disagree.

I think it would be cleaner to separate the provider class for chat from the provider class for completion.
Firstly, you can cleanly separate completion concerns from chat concerns.

True

Secondly, I personally think chat LLMs and code completion models are different, based on my own experimentation trying to get models like Claude v2 and the Google Vertex models to do code completion reliably in a variety of scenarios (e.g. proper fill-in-the-middle from incomplete lines in prefix/suffix). I found that only true code completion models (e.g. code-llama, starcoder, code-gecko) are suitable for general code completion. Some general LLMs may be able to do it and could be offered by both providers, but that is not generally the case.

Right, this is being addressed in #711, which adds the ability to mark specific models as suitable for completion only, chat only, or both.

However, even with a separate completion provider class, if you do want to offer all LLM providers as code completion providers by default, I'm sure there is some way of automatically creating completion provider classes for all LLM providers.

Sure. It does not seem pretty to me, but of course we can code anything up.
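
For example, a completion provider class could be generated per LLM provider with type(); everything below is hypothetical and builds on the sketched LangChainCompletionsProvider above:

# hypothetical auto-generation of a completion provider per LLM provider
def make_completion_provider(llm_provider_class):
    return type(
        f"{llm_provider_class.__name__}CompletionProvider",
        (LangChainCompletionsProvider, llm_provider_class),
        # assuming providers expose an id attribute, as jupyter-ai providers do
        {"id": f"{llm_provider_class.id}-completions"},
    )

# assuming llm_provider_classes is the list of registered LLM providers
completion_providers = [make_completion_provider(p) for p in llm_provider_classes]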

All this is to say: while I mostly agree with you, I also think the way to get things merged and released today is to make small, incremental changes in a non-breaking fashion, and once it is time for a new major release, to rewrite the class hierarchy following the lessons learned along the way.
