[Feat] A competitive Web Browsing agent #1856

Merged
18 commits merged on May 21, 2024

Conversation

frankxu2004
Collaborator

@frankxu2004 frankxu2004 commented May 17, 2024

This PR aims to enable a competitive browsing agent for #1470.

I have ported the simplified demo agent used in WebArena into our agent hub.

For testing, it works best with GPT-4-class LLMs such as GPT-4o:

poetry run python ./opendevin/core/main.py -i 5 -t "tell me the usa's president using google search" -c BrowsingAgent -m gpt-4o-2024-05-13

@frankxu2004 frankxu2004 marked this pull request as ready for review May 20, 2024 21:57
@frankxu2004
Collaborator Author

frankxu2004 commented May 20, 2024

Example logs:

17:52:07 - opendevin:INFO: browsing_agent.py:128 - Last action failed:
click('235')
Try again with the current state of the page.


# Current Accessibility Tree:
RootWebArea 'Google', focused
        [20] navigation ''
                [22] link 'About'
                [23] link 'Store'
                [31] link 'Gmail'
                [33] link 'Search for Images'
                [38] button 'Google apps', expanded=False
                        [39] image ''
                [40] link 'Sign in'
                [a] IframePresentational ''
        [48] image 'Google'
        [78] search ''
                [88] image ''
                [92] combobox 'Search' value='current president of the USA', focused, autocomplete='both', hasPopup='listbox', expanded=True, controls='Alh6id'
                [98] button 'Clear'
                        [100] image ''
                [103] button 'Search by voice'
                        [104] image ''
                [106] button 'Search by image'
                        [107] image ''
                [127] listbox '', multiselectable=False, orientation='vertical'
                        [141] option 'current president of the usa', selected=False
                        [141] option 'who is the president of the usa', selected=False
                        [141] option 'who is the president of the usa now', selected=False
                        [141] option 'who is the president of the usa 2024', selected=False
                        [141] option 'who is the president of the usa 2023', selected=False
                        [141] option 'president of the senate us', selected=False
                        [141] option 'who is the president of the usa 2020', selected=False
                        [141] option 'who is the president of the usa 2021', selected=False
                        [141] option 'who is the president of the usa today', selected=False
                        [141] option 'who is the president of the usa during ww1', selected=False
                [226] button 'Google Search'
                [227] button "I'm Feeling Lucky"
                [230] button 'Report inappropriate predictions'
                [235] button 'Google Search'
                [236] button "I'm Feeling Lucky"
        [271] contentinfo ''
                [275] link 'Advertising'
                [276] link 'Business'
                [277] link 'How Search works'
                [279] link 'Our third decade of climate action: join us'
                        [280] image ''
                [283] link 'Privacy'
                [284] link 'Terms'
                [289] button 'Settings', hasPopup='menu', expanded=False
                        generic '', hasPopup='menu'

# Previous Actions
goto('https://www.google.com')
fill('92', 'current president of the USA')
click('235')
click('235')


Here is an example with chain of thought of a valid action when clicking on a button:
"
In order to accomplish my goal I need to click on the button with bid 12
```click("12")```
"

17:52:07 - opendevin:INFO: browsing_agent.py:129 - In order to accomplish my goal, I need to click on the button with bid 226 to perform the Google search.
```click('226')```
17:52:07 - ACTION
BrowseInteractiveAction(browser_actions="click('226')", thought='In order to accomplish my goal, I need to click on the button with bid 226 to perform the Google search.', action='browse_interactive')


==============
STEP 4

17:52:13 - opendevin:INFO: browsing_agent.py:141 - Cost: 0.02 USD | Accumulated Cost: 0.05 USD
17:52:13 - opendevin:INFO: browsing_agent.py:128 - 

# Current Accessibility Tree:
RootWebArea 'current president of the USA - Google Search', focused
        [15] heading 'Accessibility Links'
        [18] link 'Skip to main content'
        [19] link 'Turn off continuous scrolling'
        [27] link 'Accessibility help'
        [31] link 'Accessibility feedback'
        [35] search ''
                [39] link 'Google'
                        [40] image 'Google'
                [56] combobox 'Search' value='current president of the USA', autocomplete='both', hasPopup='listbox', expanded=False, controls='Alh6id'
                [59] button 'Clear'
                        [61] image ''
                [63] button 'Search by voice'
                        [64] image ''
                [65] button 'Search by image'
                        [66] image ''
                [67] button 'Search'
                        [70] image ''
        [248] button 'Settings'
                [250] image ''
        [253] banner ''
                [256] button 'Google apps', expanded=False
                        [257] image ''
                [259] link 'Sign in'
        [280] navigation ''
                [283] navigation ''
                        [287] heading 'Filters and Topics'
                        [291] list ''
                                [292] listitem ''
                                        StaticText 'All'
                                [295] listitem ''
                                        [296] link 'Images'
                                [298] listitem ''
                                        [299] link 'News'
                                [301] listitem ''
                                        [302] link 'Videos'
                                [304] listitem ''
                                        [305] link 'Shopping'
                                [307] listitem ''
                                        [309] button 'More', hasPopup='menu', expanded=False
                                                [312] image ''
                        [336] button 'Tools', expanded=False, controls='hdtbMenus'
                [340] list ''
                        [346] button 'SafeSearch', hasPopup='menu', expanded=False
                                [353] image ''
        [554] main ''
                [558] heading 'Search Results'
                [575] heading 'United States/President'
                [583] heading 'Joe Biden'
                        [585] link 'Joe Biden'
                [597] button 'Credit: Getty Images/The White House'
                        [600] image 'Credit: Getty Images/The White House'
                StaticText 'The 46th and current president of the United States is Joseph R. Biden, Jr. He was sworn into office on January 20, 2021.'
                StaticText 'Dec 6, 2023'
                [618] link 'Presidents, vice presidents, and first ladies | USAGov USA.gov https://www.usa.gov › ... › U.S. facts and figures'
                        [620] heading 'Presidents, vice presidents, and first ladies | USAGov'
                [649] button 'About this result'
                        [652] image ''
                [656] heading 'People also search for'
                        [657] link 'People also search for'
                [660] link 'Benjamin Netanyahu (Trending)'
                        [666] image ''
                [670] link 'Donald Trump'
                [676] link 'Katie Britt (Trending)'
                        [682] image ''
                [686] link 'Jill Biden'
                [692] link 'Barack Obama'
                [698] link 'Neilia Hunter Biden'
                [704] link 'Kamala Harris'
                [725] button 'Feedback'
                StaticText 'Sources include:'
                [730] link 'Ballotpedia'
                StaticText ','
                [731] link 'Wikipedia'
                StaticText '.'
                [732] link 'Learn more'
                [753] heading 'People also ask'
                [760] button 'About this result'
                        [763] image ''
                [771] button 'Who is next in line for president of us?', expanded=False, controls='_CMZLZpjVDMvXseMP7LSn8Ao_44'
                [830] button 'Who is the new president of United States?', expanded=False, controls='_CMZLZpjVDMvXseMP7LSn8Ao_34'
                [889] button 'Who is the number 1 US president?', expanded=False, controls='_CMZLZpjVDMvXseMP7LSn8Ao_42'
                [948] button 'What number president is Joe Biden?', expanded=False, controls='_CMZLZpjVDMvXseMP7LSn8Ao_43'
                [1029] button 'Feedback'
                [1051] link 'President of the United States Wikipedia https://en.wikipedia.org › wiki › President_of_the_Unit...'
                        [1053] heading 'President of the United States'
                [1082] button 'About this result'
                        [1085] image ''
                [1089] emphasis ''
                        StaticText 'Joe Biden'
                StaticText 'is the 46th and current president of the United States, having assumed office on January 20, 2021.'
                StaticText '\u200e'
                [1094] link 'List'
                StaticText '· \u200e'
                [1095] link 'Powers'
                [1096] link 'Executive Office of the'
                [1097] link 'Vice President'
                [1105] link 'Joe Biden: The President The White House (.gov) https://www.whitehouse.gov › administration › presiden...'
                        [1107] heading 'Joe Biden: The President'
                [1136] button 'About this result'
                        [1139] image ''
                StaticText 'As President,'
                [1143] emphasis ''
                        StaticText 'Biden'
                StaticText "will restore America's leadership and build our communities back better. Joseph Robinette Biden, Jr. was born in Scranton, Pennsylvania, the\xa0..."
                [1153] link 'President Joe Biden (@potus) • Instagram photos and videos Instagram\xa0·\xa0potus 19.2M+ followers'
                        [1155] heading 'President Joe Biden (@potus) • Instagram photos and videos'
                [1182] button 'About this result'
                        [1185] image ''
                [1189] emphasis ''
                        StaticText '46th'
                StaticText 'President of the United States, husband to @flotus, proud dad and pop. Finishing the job for all Americans. Text me: (302) 404-0880 ... Photo by President\xa0...'
                [1199] link 'President Joe Biden Facebook\xa0·\xa0President Joe Biden 11.9M+ followers'
                        [1201] heading 'President Joe Biden'
                [1228] button 'About this result'
                        [1231] image ''
                [1235] emphasis ''
                        StaticText 'President Joe Biden'
                StaticText '. 10M likes · 72129 talking about this. 46th President of the United States, husband to @FLOTUS, proud father and pop. Text me (302)...'
                [1245] link 'The Executive Branch The White House (.gov) https://www.whitehouse.gov › ... › Our Government'
                        [1247] heading 'The Executive Branch'
                [1276] button 'About this result'
                        [1279] image ''
                [1283] emphasis ''
                        StaticText 'President'
                StaticText 'is both the head of state and head of government of the'
                [1284] emphasis ''
                        StaticText 'United States of America'
                StaticText ', and Commander-in-Chief of the armed forces. Under Article II of\xa0...'
                [1294] link 'President of the United States United States Mission to the United Nations (.gov) https://usun.usmission.gov › Our Leaders'
                        [1296] heading 'President of the United States'
                [1325] button 'About this result'
                        [1328] image ''
                [1332] emphasis ''
                        StaticText 'Joseph R. Biden'
                StaticText '. President Biden represented Delaware for 36 years in the U.S. Senate before becoming the 47th Vice President of the United States.'
                [1342] link 'President of the United States Ballotpedia https://ballotpedia.org › President_of_the_United_States'
                        [1344] heading 'President of the United States'
                [1373] button 'About this result'
                        [1376] image ''
                StaticText 'The current president is'
                [1380] emphasis ''
                        StaticText 'Joe Biden (D'
                StaticText '). Election ... The executive Power shall be vested in a President of the United States of America. ... The President, Vice\xa0...'
                [1391] link 'Joe Biden Wikipedia https://en.wikipedia.org › wiki › Joe_Biden'
                        [1393] heading 'Joe Biden'
                [1422] button 'About this result'
                        [1425] image ''
                [1429] emphasis ''
                        StaticText 'Joseph Robinette Biden Jr'
                StaticText 'is an American politician who is the 46th and current president of the United States since 2021. A member of the Democratic Party,\xa0...'
                StaticText '\u200e'
                [1434] link 'Political positions'
                StaticText '· \u200e'
                [1435] link 'Electoral history'
                [1436] link '2008 Presidential Campaign'
                [1437] link 'Jill Biden'
                [1445] link 'Images'
                [1452] button 'About this result'
                        [1455] image ''
                [1462] button 'Joe Biden: The President | The White House'
                        [1465] image 'Joe Biden: The President | The White House'
                [1468] link 'Joe Biden: The President | The White House The White House'
                        [1473] image ''
                [1483] button 'About this result'
                        [1486] image ''
                [1488] button 'President of the USA | Current Leader'
                        [1491] image 'President of the USA | Current Leader'
                [1494] link 'President of the USA | Current Leader PlanetRulers'
                        [1499] image ''
                [1509] button 'About this result'
                        [1512] image ''
                [1514] button 'Joe Biden: The President | The White House'
                        [1517] image 'Joe Biden: The President | The White House'
                [1520] link 'Joe Biden: The President | The White House The White House'
                        [1525] image ''
                [1535] button 'About this result'
                        [1538] image ''
                [1708] button 'Feedback'
                [1720] button '6 more images'
                        [1726] image ''
        [1771] heading 'Related searches'
        [1778] button 'About this result'
                [1781] image ''
        [1786] link 'who is the 46th president'
        [1792] link 'who is the vice president of the united states'
        [1799] link 'who is the prime minister of usa'
        [1805] link 'all presidents in order'
        [1811] link 'first president of usa'
        [1816] link '5 requirements to be president'
        [1821] link 'joe biden'
        [1826] link 'presidential line of succession today'
        generic '', hidden=True
        generic '', hidden=True
        generic '', owns='rhs'
                [1868] complementary ''
                        generic '', hidden=True
                        generic '', hidden=True
                        [1872] heading 'Complementary Results'
                        [1891] link 'Joe Biden'
                                [1892] heading 'Joe Biden'
                        [1895] heading '46th U.S. President'
                        [1900] button 'More options', hasPopup='menu', expanded=False
                                [1901] image 'More options'
                                        [1902] image ''
                        [1992] link ''
                                [1994] image ''
                        [2009] link 'whitehouse.gov'
                                [2011] image ''
                        [2018] heading 'Description'
                        StaticText 'Joseph Robinette Biden Jr. is an American politician who is the 46th and current president of the United States since 2021.'
                        [2023] link 'Wikipedia'
                        StaticText 'Born'
                        StaticText ':'
                        StaticText 'November 20, 1942 (age 81\xa0years),'
                        [2033] link 'Scranton, PA'
                        StaticText 'Edited works'
                        StaticText ':'
                        [2042] link 'Halting the Spread of HIV/AIDS: Future Efforts in the U. S. Bilateral and Multilateral Response: Congressional Hearings'
                        StaticText ','
                        [2045] link 'MORE'
                        StaticText 'Organizations founded'
                        StaticText ':'
                        [2055] link 'United States Department of Defense China Task Force'
                        StaticText ','
                        [2058] link 'MORE'
                        StaticText 'Grandchildren'
                        StaticText ':'
                        [2068] link 'Navy Joan Roberts'
                        StaticText ','
                        [2069] link 'Natalie Biden'
                        [2070] link 'Maisy Biden'
                        [2071] link 'Robert Biden II'
                        StaticText ','
                        [2072] link 'Naomi Biden'
                        [2073] link 'Finnegan Biden'
                        StaticText 'Grandparents'
                        StaticText ':'
                        [2082] link 'Ambrose J. Finnegan'
                        StaticText ','
                        [2083] link 'Mary Elizabeth Robinette Biden'
                        [2084] link 'Joseph H. Biden'
                        [2085] link 'Geraldine C. Blewitt'
                        StaticText 'Great-grandparents'
                        StaticText ':'
                        [2094] link 'George Hamilton Robinette'
                        StaticText ','
                        [2097] link 'MORE'
                        StaticText 'Marriage location'
                        StaticText ':'
                        [2107] link 'New York, NY'
                        StaticText 'Sources include:'
                        [2111] link 'Ballotpedia'
                        [2112] link 'Wikipedia'
                        StaticText '.'
                        [2113] link 'Learn more'
                        [2119] heading 'Profiles'
                        [2125] link 'Instagram'
                                [2127] image ''
                        [2132] link 'X (Twitter)'
                                [2134] image ''
                        [2139] link 'Facebook'
                                [2141] image ''
                        [2146] link 'YouTube'
                                [2148] image ''
                        [2159] link 'More about Joe Biden'
                        [2166] button 'Feedback'
                        generic '', hidden=True
                        generic '', hidden=True
        [1843] progressbar 'Loading...', live='polite', relevant='additions text', valuemin=0, valuemax=100, valuetext=''
        [1847] heading 'Page Navigation'
        [1848] button 'More results'
        [1856] button '', live='polite', relevant='additions text'
        [1864] navigation ''
        generic '', live='polite', relevant='additions text'
        generic '', live='polite', relevant='additions text'
        generic '', live='polite', relevant='additions text'
        generic '', live='polite', relevant='additions text'

# Previous Actions
goto('https://www.google.com')
fill('92', 'current president of the USA')
click('235')
click('235')
click('226')


Here is an example with chain of thought of a valid action when clicking on a button:
"
In order to accomplish my goal I need to click on the button with bid 12
```click("12")```
"

17:52:13 - opendevin:INFO: browsing_agent.py:129 - In order to accomplish my goal of telling you the current president of the USA, I need to send a message with the relevant information found in the search results.

```send_msg_to_user('The current president of the USA is Joe Biden.')```
17:52:13 - ACTION
MessageAction(content='The current president of the USA is Joe Biden.', wait_for_response=False, action='message')
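
For readers following the logs above: each model reply pairs a free-text thought with one action wrapped in triple backticks, and the logged action carries both parts. Here is a minimal sketch of that split. `ParsedReply` and `parse_reply` are hypothetical names for illustration and are not the code in browsing_agent.py; the fallback for a reply without a fenced action is also an assumption.

```python
import re
from dataclasses import dataclass

# Three backticks, built programmatically so the example is easy to embed.
FENCE = "`" * 3
ACTION_RE = re.compile(re.escape(FENCE) + r"(.*?)" + re.escape(FENCE), re.DOTALL)


@dataclass
class ParsedReply:
    thought: str  # reasoning text that precedes the fenced action
    action: str   # action string such as click('226'); empty if none was found


def parse_reply(reply: str) -> ParsedReply:
    """Split a model reply into its thought and its first fenced action."""
    match = ACTION_RE.search(reply)
    if match is None:
        # Assumption: with no fenced action, keep the whole reply as the thought.
        return ParsedReply(thought=reply.strip(), action="")
    return ParsedReply(
        thought=reply[: match.start()].strip(),
        action=match.group(1).strip(),
    )


reply = (
    "In order to accomplish my goal, I need to click on the button with "
    "bid 226 to perform the Google search.\n" + FENCE + "click('226')" + FENCE
)
parsed = parse_reply(reply)
print(parsed.action)
```

Only the first fenced span is used here, which matches the one-action-per-step replies in the logs.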

Collaborator

@xingyaoww xingyaoww left a comment

LGTM! This mainly adds a browsing agent to the agent hub and tweaks the browser env a little. I think we can approve it to unblock the integration of BrowserGym.

EDIT: I also locally tested and confirmed the sample command works on my end!

PS: When we figure out a way to do task decomposition, CodeAct can eventually delegate complex web browsing tasks to this BrowsingAgent!

Collaborator

@yufansong yufansong left a comment

Left some nits. Mostly LGTM. I would appreciate it if you could add more comments or briefly explain your design and some of the parameter settings, so that other people can build on your code. I don't want to block our integration progress, so I'll approve it. I can help with follow-up refactoring or nits if you have no time.

agenthub/browsing_agent/README.md (outdated, resolved)
agenthub/browsing_agent/browsing_agent.py (outdated, resolved)
agenthub/browsing_agent/browsing_agent.py (outdated, resolved)
agenthub/browsing_agent/prompt.py (outdated, resolved)
agenthub/browsing_agent/prompt.py (outdated, resolved)
agenthub/browsing_agent/prompt.py (resolved)
@frankxu2004
Collaborator Author

frankxu2004 commented May 21, 2024

Thanks! @yufansong I added comments for the parts that were unclear. Hope it's good for now. Since I changed BrowserOutputObservation a bit, some of the integration tests are failing. Would you mind taking a look at how to fix those?

EDIT: NVM, just fixed those, should be ready to go

@yufansong yufansong enabled auto-merge (squash) May 21, 2024 18:54
@yufansong yufansong disabled auto-merge May 21, 2024 19:02
@yufansong yufansong enabled auto-merge (squash) May 21, 2024 19:03
@yufansong yufansong merged commit 1fe290a into OpenDevin:main May 21, 2024
23 checks passed
@frankxu2004 frankxu2004 deleted the browsing-agent branch May 21, 2024 20:03
@li-boxuan
Collaborator

li-boxuan commented May 22, 2024

Sad, our project's test coverage dropped by 5.87%... let me see if there's anything we could do to test this.

@li-boxuan
Collaborator

I've made some progress in creating an integration test for this agent! Will create a PR in a day.

)


class SystemPrompt(PromptElement):
Collaborator

@frankxu2004 this prompt (along with many other prompts in this file) seems unused? Is that intentional?

Collaborator Author

Yes, basically this whole prompt.py file is not used yet. The current agent is a simplified version for ease of understanding. I included the file with the intention of building a more complex agent that uses more comprehensive information as a next step. It is still useful here, as it gives others building blocks for prompts and shows what information could be included as context for LLMs.

Collaborator Author

These PRs are mostly for chasing the NeurIPS paper deadline, so not all features are implemented yet.

Collaborator

I see, sounds fair. I am just having a bit of trouble reproducing poetry run python ./opendevin/core/main.py -i 5 -t "tell me the usa's president using google search" -c BrowsingAgent -m gpt-4o-2024-05-13... I tried about 5 times and only succeeded once.

Collaborator Author

That's a bit weird. What error are you seeing? Do you have logs?

Collaborator Author

Currently the agent does not return AgentFinishAction, so in the eyes of the framework the run always ends in an error. Maybe I should add this Finish action.
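
To illustrate the idea of ending a run explicitly, here is a minimal sketch that treats a `send_msg_to_user(...)` action as the terminal step and everything else as an ordinary browser action. This is an assumption about one possible design, not what this PR implements. `Finish`, `Browse`, and `to_action` are hypothetical stand-ins and are not OpenDevin's AgentFinishAction or BrowseInteractiveAction.

```python
import re
from dataclasses import dataclass


@dataclass
class Finish:
    """Hypothetical terminal action carrying the final answer."""
    final_answer: str


@dataclass
class Browse:
    """Hypothetical non-terminal action carrying a raw browser command."""
    browser_actions: str


# Matches send_msg_to_user('...') or send_msg_to_user("...") and captures the text.
SEND_MSG_RE = re.compile(r"^send_msg_to_user\((['\"])(.*)\1\)$", re.DOTALL)


def to_action(action_str: str):
    """Map a parsed action string to Finish (assumed terminal) or Browse."""
    action_str = action_str.strip()
    match = SEND_MSG_RE.match(action_str)
    if match:
        # Assumption: a message to the user means the task is done.
        return Finish(final_answer=match.group(2))
    return Browse(browser_actions=action_str)


last = to_action("send_msg_to_user('The current president of the USA is Joe Biden.')")
step = to_action("click('226')")
print(type(last).__name__, type(step).__name__)
```

With a terminal action like this, a control loop could stop cleanly after the answer instead of running until the iteration limit.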

Collaborator

logs.zip

Basically, it keeps clicking without making progress.
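
The same pattern is visible in the example logs earlier in this thread, where `click('235')` appears twice in a row under "Previous Actions". As a rough sketch of how such a stall could be detected from the action history, assuming the history is kept as a list of action strings: `is_stuck` and its `window` parameter are hypothetical and are not part of this PR.

```python
from typing import Sequence


def is_stuck(previous_actions: Sequence[str], window: int = 2) -> bool:
    """Return True when the last `window` actions are all identical."""
    if window < 2 or len(previous_actions) < window:
        return False
    tail = previous_actions[-window:]
    return all(action == tail[0] for action in tail)


# Action history taken from the example logs in this thread.
history = [
    "goto('https://www.google.com')",
    "fill('92', 'current president of the USA')",
    "click('235')",
    "click('235')",
]
print(is_stuck(history))
```

A check like this only catches exact repeats; an agent alternating between two failing actions would need a different heuristic.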

Collaborator Author

Yeah, sometimes it behaves like this. I improved the agent a bit and fixed some issues in #1993.

super-dainiu pushed a commit to super-dainiu/OpenDevin that referenced this pull request May 23, 2024
* initial attempt at a browsing only agent

* add browsing agent

* update

* implement agent

* update

* fix comments

* remove unnecessary things from memory extras

* update image processing

---------

Co-authored-by: Yufan Song <33971064+yufansong@users.noreply.github.com>
4 participants