(Constructs): Web RAG - Web Crawler, Chatting with Web Pages and Search #291

Open
1 of 2 tasks
spugachev opened this issue Feb 28, 2024 · 3 comments · May be fixed by #474
Assignees
Labels
RFC-proposal (RFC Proposal - used for tracking through process on Project board. NOT an "issue" as such.), stale

Comments

@spugachev
Contributor

Describe the feature

Many RAG experiences are built around websites. Users want to crawl one or more websites, extract the content of each page, schedule periodic re-crawls, and ingest the results into OpenSearch so that RAG requests can be answered from website data.
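
For illustration only, the crawl step could look roughly like the sketch below. It is not the proposed construct's API: `crawl`, `LinkParser`, and the injected `fetch` callable are hypothetical names, and the sketch assumes a breadth-first crawl restricted to the starting host. Scheduling, robots.txt handling, and OpenSearch ingestion are left out.

```python
from collections import deque
from html.parser import HTMLParser
from typing import Callable, Dict, List
from urllib.parse import urldefrag, urljoin, urlparse


class LinkParser(HTMLParser):
    """Collects href values from <a> tags."""

    def __init__(self) -> None:
        super().__init__()
        self.links: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(start_url: str, fetch: Callable[[str], str], max_pages: int = 50) -> Dict[str, str]:
    """Breadth-first crawl limited to the host of start_url.

    `fetch` is a hypothetical callable that returns the HTML for a URL
    (for example a thin wrapper around urllib.request.urlopen).
    Returns a mapping of URL -> raw HTML for every page visited.
    """
    host = urlparse(start_url).netloc
    queue = deque([start_url])
    seen = {start_url}
    pages: Dict[str, str] = {}
    while queue and len(pages) < max_pages:
        url = queue.popleft()
        html = fetch(url)
        pages[url] = html
        parser = LinkParser()
        parser.feed(html)
        for href in parser.links:
            # Resolve relative links and drop #fragments before de-duplicating.
            absolute, _fragment = urldefrag(urljoin(url, href))
            if urlparse(absolute).netloc == host and absolute not in seen:
                seen.add(absolute)
                queue.append(absolute)
    return pages
```

Passing `fetch` in as a parameter keeps the traversal logic testable without network access; a real implementation would also need error handling and rate limiting.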

To support this scenario, a WebCrawler construct can be created. It should be capable of creating new OpenSearch indexes or using existing ones.

This construct can also be used to obtain data from websites in real-time. For example, a user could ask a chatbot to summarize a specific webpage. In this case, the web crawler should extract data from the webpage and provide it to the chatbot.
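
As a rough sketch of the real-time case, the snippet below pulls the visible text out of a single page's HTML so it can be handed to a chatbot as context. `TextExtractor`, `extract_text`, and the `max_chars` limit are hypothetical names chosen for this example; it assumes the HTML has already been downloaded and uses only the Python standard library.

```python
from html.parser import HTMLParser
from typing import List


class TextExtractor(HTMLParser):
    """Collects visible text, skipping script/style/noscript content."""

    SKIP = {"script", "style", "noscript"}

    def __init__(self) -> None:
        super().__init__()
        self._skip_depth = 0
        self._chunks: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Collapse internal whitespace and ignore empty text nodes.
        cleaned = " ".join(data.split())
        if self._skip_depth == 0 and cleaned:
            self._chunks.append(cleaned)

    def text(self) -> str:
        return " ".join(self._chunks)


def extract_text(html: str, max_chars: int = 4000) -> str:
    """Return the page's visible text, truncated to max_chars characters."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()[:max_chars]
```

The truncation is a placeholder for whatever context-size limit the chatbot's model imposes; a production version would more likely chunk the text than cut it off.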

We should also consider web search scenarios, where users want to use a search engine to obtain results. The results found by the search engine should be parsed and returned to the chatbot.

Use Case

RAG over websites

Proposed Solution

No response

Other Information

No response

Acknowledgements

  • I may be able to implement this feature request
  • This feature might incur a breaking change
@spugachev added the needs-triage label (This issue or PR still needs to be triaged.) on Feb 28, 2024
@krokoko added the RFC-proposal label (RFC Proposal - used for tracking through process on Project board. NOT an "issue" as such.) and removed the needs-triage label on Feb 28, 2024
@krokoko
Collaborator

krokoko commented Mar 11, 2024

As discussed, assigning it temporarily to you, @spugachev. Thanks! :)

Contributor

This issue is now marked as stale because it hasn't seen activity for a while. Add a comment or it will be closed soon. If you wish to exclude this issue from being marked as stale, add the "backlog" label.

@github-actions github-actions bot added the stale label May 11, 2024
Contributor

Closing this issue as it hasn't seen activity for a while. Please add a comment @mentioning a maintainer to reopen. If you wish to exclude this issue from being marked as stale, add the "backlog" label.

@krokoko reopened this on May 20, 2024
@krokoko linked a pull request on May 22, 2024 that will close this issue
2 participants