Langchain4J embeddings - tests and native support #5973

Open
zbendhiba opened this issue Apr 6, 2024 · 3 comments

@zbendhiba
Contributor

Describe the feature here

Improve the Langchain4J embeddings extension by adding integration tests and native mode support.

@jamesnetherton
Contributor

I wrote some tests but am currently blocked on a native issue related to JNI usage:

Caused by: java.lang.UnsatisfiedLinkError: ai.djl.huggingface.tokenizers.jni.TokenizersLibrary.encode(JLjava/lang/String;Z)J [symbol: Java_ai_djl_huggingface_tokenizers_jni_TokenizersLibrary_encode or Java_ai_djl_huggingface_tokenizers_jni_TokenizersLibrary_encode__JLjava_lang_String_2Z]
	at org.graalvm.nativeimage.builder/com.oracle.svm.core.jni.access.JNINativeLinkage.getOrFindEntryPoint(JNINativeLinkage.java:152)
	at org.graalvm.nativeimage.builder/com.oracle.svm.core.jni.JNIGeneratedMethodSupport.nativeCallAddress(JNIGeneratedMethodSupport.java:54)
	at ai.djl.huggingface.tokenizers.jni.TokenizersLibrary.encode(Native Method)
	at ai.djl.huggingface.tokenizers.HuggingFaceTokenizer.encode(HuggingFaceTokenizer.java:213)
	at ai.djl.huggingface.tokenizers.HuggingFaceTokenizer.encode(HuggingFaceTokenizer.java:224)
	at ai.djl.huggingface.tokenizers.HuggingFaceTokenizer.tokenize(HuggingFaceTokenizer.java:183)
	at dev.langchain4j.model.embedding.OnnxBertBiEncoder.embed(OnnxBertBiEncoder.java:59)
	at dev.langchain4j.model.embedding.AbstractInProcessEmbeddingModel.embedAll(AbstractInProcessEmbeddingModel.java:50)
	at dev.langchain4j.model.embedding.EmbeddingModel.embed(EmbeddingModel.java:34)
	at org.apache.camel.component.langchain4j.embeddings.LangChain4jEmbeddingsProducer.process(LangChain4jEmbeddingsProducer.java:41)

It seems there were some similar(ish) problems in quarkiverse-langchain4j. They actually have their embeddings native tests disabled.
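For readers unfamiliar with the two symbol names in the error above: as far as I understand the JNI specification, they are the "short" and "long" mangled names that the runtime looks up for a native method. The sketch below derives both from the class name, method name and descriptor shown in the stack trace. The class `JniSymbolNames` and its helper methods are hypothetical names made up for this illustration; they are not part of DJL, LangChain4j or GraalVM.

```java
// A minimal sketch of JNI native-method name mangling (assumed from the JNI spec).
class JniSymbolNames {

    // Escape one name component: '.' and '/' become '_', and the characters
    // '_', ';' and '[' become "_1", "_2" and "_3". Anything else that is not
    // an ASCII letter or digit becomes "_0xxxx" (four lowercase hex digits).
    static String mangle(String s) {
        StringBuilder sb = new StringBuilder();
        for (char c : s.toCharArray()) {
            if (c == '.' || c == '/') {
                sb.append('_');
            } else if (c == '_') {
                sb.append("_1");
            } else if (c == ';') {
                sb.append("_2");
            } else if (c == '[') {
                sb.append("_3");
            } else if ((c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') || (c >= '0' && c <= '9')) {
                sb.append(c);
            } else {
                sb.append(String.format("_0%04x", (int) c));
            }
        }
        return sb.toString();
    }

    // Short name: "Java_" + mangled class name + "_" + mangled method name.
    static String shortName(String className, String method) {
        return "Java_" + mangle(className) + "_" + mangle(method);
    }

    // Long name: short name + "__" + mangled argument part of the descriptor
    // (the text between the parentheses; the return type is not included).
    static String longName(String className, String method, String descriptor) {
        String args = descriptor.substring(descriptor.indexOf('(') + 1, descriptor.indexOf(')'));
        return shortName(className, method) + "__" + mangle(args);
    }

    public static void main(String[] argv) {
        String cls = "ai.djl.huggingface.tokenizers.jni.TokenizersLibrary";
        System.out.println(shortName(cls, "encode"));
        System.out.println(longName(cls, "encode", "(JLjava/lang/String;Z)J"));
    }
}
```

Neither name being resolvable suggests (this is an assumption, not something confirmed in this thread) that the native tokenizers library was not loaded or linked into the native executable, rather than that the Java-side method signature is wrong.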

@zbendhiba
Contributor Author

@jamesnetherton Do you know from where this Tokenizer is pulled?
Is it coming directly from our Camel component?

@jamesnetherton
Contributor

Do you know from where this Tokenizer is pulled

Basically, it is pulled in whenever you use any of the langchain4j-embeddings-* modules.
