Default metadata field names in PagePdfDocumentReader can't be parsed in a filter expression #696

Open
markpollack opened this issue May 8, 2024 · 0 comments

The default metadata field name `file_name` cannot be parsed by the filter expression parser. For example:

		SearchRequest searchRequest = SearchRequest.defaults()
				.withTopK(4)
				.withFilterExpression(PagePdfDocumentReader.METADATA_FILE_NAME + " == 'medicaid-wa-faqs.pdf'");

where `public static final String METADATA_FILE_NAME = "file_name";`

throws the exception:

Caused by: org.antlr.v4.runtime.NoViableAltException: null
	at org.antlr.v4.runtime.atn.ParserATNSimulator.noViableAlt(ParserATNSimulator.java:2014) ~[antlr4-runtime-4.13.1.jar:4.13.1]
	at org.antlr.v4.runtime.atn.ParserATNSimulator.execATN(ParserATNSimulator.java:445) ~[antlr4-runtime-4.13.1.jar:4.13.1]
	at org.antlr.v4.runtime.atn.ParserATNSimulator.adaptivePredict(ParserATNSimulator.java:371) ~[antlr4-runtime-4.13.1.jar:4.13.1]
	at org.springframework.ai.vectorstore.filter.antlr4.FiltersParser.booleanExpression(FiltersParser.java:556) ~[spring-ai-core-1.0.0-SNAPSHOT.jar:1.0.0-SNAPSHOT]
	at org.springframework.ai.vectorstore.filter.antlr4.FiltersParser.where(FiltersParser.java:199) ~[spring-ai-core-1.0.0-SNAPSHOT.jar:1.0.0-SNAPSHOT]
	at org.springframework.ai.vectorstore.filter.FilterExpressionTextParser.parse(FilterExpressionTextParser.java:147) ~[spring-ai-core-1.0.0-SNAPSHOT.jar:1.0.0-SNAPSHOT]
	... 46 common frames omitted

The underscore seems to be the issue. I suggest we switch to camel case for the metadata field names that document readers add.
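Until the reader constants change, a possible stopgap is to rename the metadata keys before the documents are stored. The sketch below is only an illustration of the camel-case suggestion: `MetadataKeys`, `toCamelCase`, and `withCamelCaseKeys` are hypothetical names that are not part of Spring AI, and it assumes the underscore really is what the parser rejects.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of Spring AI: renames snake_case metadata keys
// to camelCase before documents are written to the vector store.
final class MetadataKeys {

    private MetadataKeys() {
    }

    // Converts "file_name" to "fileName"; keys without underscores are returned unchanged.
    static String toCamelCase(String key) {
        StringBuilder out = new StringBuilder(key.length());
        boolean upperNext = false;
        for (int i = 0; i < key.length(); i++) {
            char c = key.charAt(i);
            if (c == '_') {
                // Drop the underscore and capitalize the next character,
                // unless nothing has been emitted yet (leading underscore).
                upperNext = out.length() > 0;
            } else if (upperNext) {
                out.append(Character.toUpperCase(c));
                upperNext = false;
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    // Returns a copy of the metadata map with every key converted to camelCase.
    // If two keys collapse to the same name, the later entry wins.
    static Map<String, Object> withCamelCaseKeys(Map<String, Object> metadata) {
        Map<String, Object> renamed = new LinkedHashMap<>();
        metadata.forEach((key, value) -> renamed.put(toCamelCase(key), value));
        return renamed;
    }
}
```

With the keys renamed this way, the filter from the example above would be written as `fileName == 'medicaid-wa-faqs.pdf'`, which contains no underscore in the identifier.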

markpollack added this to the 1.0.0-M1 milestone on May 8, 2024