Sweep: Augment on_ticket so that when a user adds a slack link, we automatically unroll the thread and extract the information #3656

Closed
3 tasks done
wwzeng1 opened this issue May 2, 2024 · 12 comments · Fixed by #3668 · May be fixed by #3657


wwzeng1 commented May 2, 2024

Details

This involves adding a way for Sweep to authenticate to the user's Slack workspace (the user will provide an API key in server.py), fetching the linked message thread, and adding that thread's content as context to on_ticket's issue summary.
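As a starting point for the "user adds a Slack link" step, here is a minimal sketch of pulling thread references out of the issue text. It assumes the usual Slack permalink shape (`/archives/<channel>/p<16 digits>`, optionally with a `thread_ts` query parameter for replies). The names `SlackThreadRef` and `extract_slack_threads` are hypothetical and do not exist in the Sweep codebase.

```python
import re
from typing import NamedTuple


class SlackThreadRef(NamedTuple):
    """Hypothetical container for the two values needed to look up a thread."""
    channel_id: str
    thread_ts: str


# Assumed permalink shape:
#   https://<workspace>.slack.com/archives/<CHANNEL_ID>/p<16 digits>[?thread_ts=...]
SLACK_LINK_RE = re.compile(
    r"https://[\w.-]+\.slack\.com/archives/(?P<channel>[A-Z0-9]+)/p(?P<ts>\d{16})"
    r"(?:\?(?P<query>[^\s)>\]]*))?"
)


def extract_slack_threads(text: str) -> list[SlackThreadRef]:
    """Return one reference per Slack permalink found in `text`."""
    refs = []
    for match in SLACK_LINK_RE.finditer(text):
        raw = match.group("ts")
        # The permalink drops the dot from the timestamp: 10 digits of seconds,
        # then 6 digits of microseconds.
        thread_ts = f"{raw[:10]}.{raw[10:]}"
        # A link to a reply carries the parent's timestamp in `thread_ts`.
        query = match.group("query") or ""
        parent = re.search(r"(?:^|&)thread_ts=(\d+\.\d+)", query)
        if parent:
            thread_ts = parent.group(1)
        refs.append(SlackThreadRef(match.group("channel"), thread_ts))
    return refs
```

Calling `extract_slack_threads(summary)` early in on_ticket would give the list of threads to unroll; an empty list means the issue has no Slack links and the rest of the flow is unchanged.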

Checklist
wwzeng1 added the sweep label on May 2, 2024

sweep-nightly bot commented May 2, 2024

Sweeping

✨ Track Sweep's progress on our progress dashboard!


0%

💎 Sweep Pro: I'm using GPT-4. You have unlimited GPT-4 tickets. (tracking ID: a591e8c142)

Tip

I can email you when I complete this pull request if you set up your email here!



I am currently looking into this ticket! I will update the progress of the ticket in this comment. I am currently searching through your code, looking for relevant snippets.


Step 1: 🔎 Searching

I'm searching for relevant snippets in your repository. If this is your first time using Sweep, I'm indexing your repository. You can monitor the progress using the progress dashboard


🎉 Latest improvements to Sweep:
  • New dashboard launched for real-time tracking of Sweep issues, covering all stages from search to coding.
  • Integration of OpenAI's latest Assistant API for more efficient and reliable code planning and editing, improving speed by 3x.
  • Use the GitHub issues extension for creating Sweep issues directly from your editor.

💡 To recreate the pull request edit the issue title or description.
Something wrong? Let us know.


sweep-nightly bot commented May 2, 2024


❌ Unable to Complete PR

The issue was rejected with the following response:

Thank you for submitting this issue. To make it more suitable for Sweep to handle, please provide more details on the following:
  1. How should Sweep authenticate to Slack? Are there specific API keys or credentials that can be provided?

  2. Can you clarify if Sweep is expected to make direct API calls to Slack to fetch the message threads? If so, please provide more information on the Slack API and how it should be used.

  3. Please give more context on the existing pipeline and how the information from the Slack threads should be incorporated. What specific changes need to be made to the current code?

With these additional details, we can better assess if this is a task that Sweep is capable of handling. Let me know if you have any other questions!
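On questions 1 and 2, one possible shape is sketched below, assuming a bot token is sent as a Bearer header to Slack's `conversations.replies` Web API method (the token would need the matching history scope for the channel type). The function names, the `get_json` injection point, and the "Slack thread context" heading are assumptions for illustration, not existing Sweep code.

```python
import json
import urllib.parse
import urllib.request

SLACK_REPLIES_URL = "https://slack.com/api/conversations.replies"


def _http_get_json(url: str, token: str) -> dict:
    """Default transport: GET the URL with a Bearer token and decode the JSON body."""
    request = urllib.request.Request(url, headers={"Authorization": f"Bearer {token}"})
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.loads(response.read().decode("utf-8"))


def fetch_slack_thread(channel_id: str, thread_ts: str, token: str, get_json=_http_get_json) -> list[dict]:
    """Collect every message of a thread, following cursor-based pagination."""
    messages: list[dict] = []
    cursor = ""
    while True:
        params = {"channel": channel_id, "ts": thread_ts, "limit": "200"}
        if cursor:
            params["cursor"] = cursor
        payload = get_json(f"{SLACK_REPLIES_URL}?{urllib.parse.urlencode(params)}", token)
        if not payload.get("ok"):
            raise RuntimeError(f"Slack API error: {payload.get('error', 'unknown')}")
        messages.extend(payload.get("messages", []))
        cursor = (payload.get("response_metadata") or {}).get("next_cursor", "")
        if not cursor:
            return messages


def format_thread(messages: list[dict]) -> str:
    """Render messages as 'user: text' lines, oldest first as returned by Slack."""
    return "\n".join(f"{m.get('user', 'unknown')}: {m.get('text', '')}" for m in messages)


def augment_summary(summary: str, messages: list[dict]) -> str:
    """Append the unrolled thread to the issue summary; no-op for an empty thread."""
    if not messages:
        return summary
    return f"{summary}\n\nSlack thread context:\n{format_thread(messages)}"
```

Passing the transport in as `get_json` keeps the pagination logic testable without network access. In on_ticket, the augmented summary would replace `summary` before it is handed to the search and planning steps.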

For bonus GPT-4 tickets, please report this bug on Discord (tracking ID: 5057af5492).



This is an automated message generated by Sweep AI.


sweep-nightly bot commented May 2, 2024

🚀 Here's the PR! #3657

See Sweep's progress at the progress dashboard!
💎 Sweep Pro: I'm using GPT-4. You have unlimited GPT-4 tickets. (tracking ID: 406bb96ad2)

Tip

I can email you next time I complete a pull request if you set up your email here!



Step 1: 🔎 Searching

I found the following snippets in your repository. I will now analyze these snippets and come up with a plan.

Some code snippets I think are relevant in decreasing order of relevance (click to expand). If some file is missing from here, you can mention the path in the ticket description.

import base64
import os
from dotenv import load_dotenv
from loguru import logger
logger.print = logger.info
load_dotenv(dotenv_path=".env", override=True, verbose=True)
os.environ["GITHUB_APP_PEM"] = os.environ.get("GITHUB_APP_PEM") or base64.b64decode(
    os.environ.get("GITHUB_APP_PEM_BASE64", "")
).decode("utf-8")
if os.environ["GITHUB_APP_PEM"]:
    os.environ["GITHUB_APP_ID"] = (
        (os.environ.get("GITHUB_APP_ID") or os.environ.get("APP_ID"))
        .replace("\\n", "\n")
        .strip('"')
    )
os.environ["TRANSFORMERS_CACHE"] = os.environ.get(
"TRANSFORMERS_CACHE", "/tmp/cache/model"
) # vector_db.py
os.environ["TIKTOKEN_CACHE_DIR"] = os.environ.get(
"TIKTOKEN_CACHE_DIR", "/tmp/cache/tiktoken"
) # utils.py
SENTENCE_TRANSFORMERS_MODEL = os.environ.get(
"SENTENCE_TRANSFORMERS_MODEL",
"sentence-transformers/all-MiniLM-L6-v2", # "all-mpnet-base-v2"
)
TEST_BOT_NAME = "sweep-nightly[bot]"
ENV = os.environ.get("ENV", "dev")
# ENV = os.environ.get("MODAL_ENVIRONMENT", "dev")
# ENV = PREFIX
# ENVIRONMENT = PREFIX
DB_MODAL_INST_NAME = "db"
DOCS_MODAL_INST_NAME = "docs"
API_MODAL_INST_NAME = "api"
UTILS_MODAL_INST_NAME = "utils"
BOT_TOKEN_NAME = "bot-token"
# goes under Modal 'discord' secret name (optional, can leave env var blank)
DISCORD_WEBHOOK_URL = os.environ.get("DISCORD_WEBHOOK_URL")
DISCORD_MEDIUM_PRIORITY_URL = os.environ.get("DISCORD_MEDIUM_PRIORITY_URL")
DISCORD_LOW_PRIORITY_URL = os.environ.get("DISCORD_LOW_PRIORITY_URL")
DISCORD_FEEDBACK_WEBHOOK_URL = os.environ.get("DISCORD_FEEDBACK_WEBHOOK_URL")
SWEEP_HEALTH_URL = os.environ.get("SWEEP_HEALTH_URL")
DISCORD_STATUS_WEBHOOK_URL = os.environ.get("DISCORD_STATUS_WEBHOOK_URL")
# goes under Modal 'github' secret name
GITHUB_APP_ID = os.environ.get("GITHUB_APP_ID", os.environ.get("APP_ID"))
# deprecated: old logic transfer so upstream can use this
if GITHUB_APP_ID is None:
    if ENV == "prod":
        GITHUB_APP_ID = "307814"
    elif ENV == "dev":
        GITHUB_APP_ID = "324098"
    elif ENV == "staging":
        GITHUB_APP_ID = "327588"
GITHUB_BOT_USERNAME = os.environ.get("GITHUB_BOT_USERNAME")
# deprecated: left to support old logic
if not GITHUB_BOT_USERNAME:
    if ENV == "prod":
        GITHUB_BOT_USERNAME = "sweep-ai[bot]"
    elif ENV == "dev":
        GITHUB_BOT_USERNAME = "sweep-nightly[bot]"
    elif ENV == "staging":
        GITHUB_BOT_USERNAME = "sweep-canary[bot]"
elif not GITHUB_BOT_USERNAME.endswith("[bot]"):
    GITHUB_BOT_USERNAME = GITHUB_BOT_USERNAME + "[bot]"
GITHUB_LABEL_NAME = os.environ.get("GITHUB_LABEL_NAME", "sweep")
GITHUB_LABEL_COLOR = os.environ.get("GITHUB_LABEL_COLOR", "9400D3")
GITHUB_LABEL_DESCRIPTION = os.environ.get(
"GITHUB_LABEL_DESCRIPTION", "Sweep your software chores"
)
GITHUB_APP_PEM = os.environ.get("GITHUB_APP_PEM")
GITHUB_APP_PEM = GITHUB_APP_PEM or os.environ.get("PRIVATE_KEY")
if GITHUB_APP_PEM is not None:
    GITHUB_APP_PEM = GITHUB_APP_PEM.strip(' \n"')  # Remove whitespace and quotes
    GITHUB_APP_PEM = GITHUB_APP_PEM.replace("\\n", "\n")
GITHUB_CONFIG_BRANCH = os.environ.get("GITHUB_CONFIG_BRANCH", "sweep/add-sweep-config")
GITHUB_DEFAULT_CONFIG = os.environ.get(
"GITHUB_DEFAULT_CONFIG",
"""# Sweep AI turns bugs & feature requests into code changes (https://sweep.dev)
# For details on our config file, check out our docs at https://docs.sweep.dev/usage/config
# This setting contains a list of rules that Sweep will check for. If any of these rules are broken in a new commit, Sweep will create a pull request to fix the broken rule.
rules:
{additional_rules}
# This is the branch that Sweep will develop from and make pull requests to. Most people use 'main' or 'master' but some users also use 'dev' or 'staging'.
branch: 'main'
# By default Sweep will read the logs and outputs from your existing Github Actions. To disable this, set this to false.
gha_enabled: True
# This is the description of your project. It will be used by sweep when creating PRs. You can tell Sweep what's unique about your project, what frameworks you use, or anything else you want.
#
# Example:
#
# description: sweepai/sweep is a python project. The main api endpoints are in sweepai/api.py. Write code that adheres to PEP8.
description: ''
# This sets whether to create pull requests as drafts. If this is set to True, then all pull requests will be created as drafts and GitHub Actions will not be triggered.
draft: False
# This is a list of directories that Sweep will not be able to edit.
blocked_dirs: []
""",
)
MONGODB_URI = os.environ.get("MONGODB_URI", None)
IS_SELF_HOSTED = os.environ.get("IS_SELF_HOSTED", "true").lower() == "true"
REDIS_URL = os.environ.get("REDIS_URL")
if not REDIS_URL:
REDIS_URL = os.environ.get("redis_url", "redis://0.0.0.0:6379/0")
ORG_ID = os.environ.get("ORG_ID", None)
POSTHOG_API_KEY = os.environ.get(
"POSTHOG_API_KEY", "phc_CnzwIB0W548wN4wEGeRuxXqidOlEUH2AcyV2sKTku8n"
)
E2B_API_KEY = os.environ.get("E2B_API_KEY")
SUPPORT_COUNTRY = os.environ.get("GDRP_LIST", "").split(",")
WHITELISTED_REPOS = os.environ.get("WHITELISTED_REPOS", "").split(",")
BLACKLISTED_USERS = os.environ.get("BLACKLISTED_USERS", "").split(",")
os.environ["TOKENIZERS_PARALLELISM"] = "false"
ACTIVELOOP_TOKEN = os.environ.get("ACTIVELOOP_TOKEN", None)
VECTOR_EMBEDDING_SOURCE = os.environ.get(
"VECTOR_EMBEDDING_SOURCE", "openai"
) # Alternate option is openai or huggingface and set the corresponding env vars
BASERUN_API_KEY = os.environ.get("BASERUN_API_KEY", None)
# Huggingface settings, only checked if VECTOR_EMBEDDING_SOURCE == "huggingface"
HUGGINGFACE_URL = os.environ.get("HUGGINGFACE_URL", None)
HUGGINGFACE_TOKEN = os.environ.get("HUGGINGFACE_TOKEN", None)
# Replicate settings, only checked if VECTOR_EMBEDDING_SOURCE == "replicate"
REPLICATE_API_KEY = os.environ.get("REPLICATE_API_KEY", None)
REPLICATE_URL = os.environ.get("REPLICATE_URL", None)
REPLICATE_DEPLOYMENT_URL = os.environ.get("REPLICATE_DEPLOYMENT_URL", None)
# Default OpenAI
OPENAI_API_KEY = os.environ.get("OPENAI_API_KEY", None)
OPENAI_API_TYPE = os.environ.get("OPENAI_API_TYPE", "anthropic")
assert OPENAI_API_TYPE in ["anthropic", "azure", "openai"], "Invalid OPENAI_API_TYPE"
OPENAI_EMBEDDINGS_API_TYPE = os.environ.get("OPENAI_EMBEDDINGS_API_TYPE", "openai")
AZURE_API_KEY = os.environ.get("AZURE_API_KEY", None)
OPENAI_API_BASE = os.environ.get("OPENAI_API_BASE", None)
OPENAI_API_VERSION = os.environ.get("OPENAI_API_VERSION", None)
AZURE_OPENAI_DEPLOYMENT = os.environ.get("AZURE_OPENAI_DEPLOYMENT", None)
OPENAI_EMBEDDINGS_API_TYPE = os.environ.get("OPENAI_EMBEDDINGS_API_TYPE", "openai")
OPENAI_EMBEDDINGS_AZURE_ENDPOINT = os.environ.get(
"OPENAI_EMBEDDINGS_AZURE_ENDPOINT", None
)
OPENAI_EMBEDDINGS_AZURE_API_KEY = os.environ.get(
"OPENAI_EMBEDDINGS_AZURE_API_KEY", None
)
OPENAI_EMBEDDINGS_AZURE_DEPLOYMENT = os.environ.get(
"OPENAI_EMBEDDINGS_AZURE_DEPLOYMENT", None
)
OPENAI_EMBEDDINGS_AZURE_API_VERSION = os.environ.get(
"OPENAI_EMBEDDINGS_AZURE_API_VERSION", None
)
OPENAI_API_ENGINE_GPT35 = os.environ.get("OPENAI_API_ENGINE_GPT35", None)
OPENAI_API_ENGINE_GPT4 = os.environ.get("OPENAI_API_ENGINE_GPT4", None)
OPENAI_API_ENGINE_GPT4_32K = os.environ.get("OPENAI_API_ENGINE_GPT4_32K", None)
MULTI_REGION_CONFIG = os.environ.get("MULTI_REGION_CONFIG", None)
if isinstance(MULTI_REGION_CONFIG, str):
    MULTI_REGION_CONFIG = MULTI_REGION_CONFIG.strip("'").replace("\\n", "\n")
    MULTI_REGION_CONFIG = [item.split(",") for item in MULTI_REGION_CONFIG.split("\n")]
WHITELISTED_USERS = os.environ.get("WHITELISTED_USERS", None)
if WHITELISTED_USERS:
    WHITELISTED_USERS = WHITELISTED_USERS.split(",")
    WHITELISTED_USERS.append(GITHUB_BOT_USERNAME)
DEFAULT_GPT4_32K_MODEL = os.environ.get("DEFAULT_GPT4_32K_MODEL", "gpt-4-turbo-2024-04-09")
DEFAULT_GPT35_MODEL = os.environ.get("DEFAULT_GPT35_MODEL", "gpt-3.5-turbo-1106")
RESEND_API_KEY = os.environ.get("RESEND_API_KEY", None)
LOKI_URL = None
DEBUG = os.environ.get("DEBUG", "false").lower() == "true"
ENV = "prod" if GITHUB_BOT_USERNAME != TEST_BOT_NAME else "dev"
PROGRESS_BASE_URL = os.environ.get(
"PROGRESS_BASE_URL", "https://progress.sweep.dev"
).rstrip("/")
DISABLED_REPOS = os.environ.get("DISABLED_REPOS", "").split(",")
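# Parse the flags explicitly: a raw os.environ.get() returns a string, so "false" would be truthy.
GHA_AUTOFIX_ENABLED: bool = os.environ.get("GHA_AUTOFIX_ENABLED", "false").lower() == "true"
MERGE_CONFLICT_ENABLED: bool = os.environ.get("MERGE_CONFLICT_ENABLED", "false").lower() == "true"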
INSTALLATION_ID = os.environ.get("INSTALLATION_ID", None)
AWS_ACCESS_KEY=os.environ.get("AWS_ACCESS_KEY")
AWS_SECRET_KEY=os.environ.get("AWS_SECRET_KEY")
AWS_REGION=os.environ.get("AWS_REGION")
ANTHROPIC_AVAILABLE = AWS_ACCESS_KEY and AWS_SECRET_KEY and AWS_REGION
USE_ASSISTANT = os.environ.get("USE_ASSISTANT", "true").lower() == "true"
ANTHROPIC_API_KEY = os.environ.get("ANTHROPIC_API_KEY", None)
COHERE_API_KEY = os.environ.get("COHERE_API_KEY", None)
VOYAGE_API_KEY = os.environ.get("VOYAGE_API_KEY", None)
VOYAGE_API_AWS_ACCESS_KEY=os.environ.get("VOYAGE_API_AWS_ACCESS_KEY_ID")
VOYAGE_API_AWS_SECRET_KEY=os.environ.get("VOYAGE_API_AWS_SECRET_KEY")
VOYAGE_API_AWS_REGION=os.environ.get("VOYAGE_API_AWS_REGION")
VOYAGE_API_AWS_ENDPOINT_NAME=os.environ.get("VOYAGE_API_AWS_ENDPOINT_NAME", "voyage-code-2")
VOYAGE_API_USE_AWS = VOYAGE_API_AWS_ACCESS_KEY and VOYAGE_API_AWS_SECRET_KEY and VOYAGE_API_AWS_REGION
PAREA_API_KEY = os.environ.get("PAREA_API_KEY", None)
# TODO: we need to make this dynamic + backoff
BATCH_SIZE = int(
os.environ.get("BATCH_SIZE", 64 if VOYAGE_API_KEY else 256) # Voyage only allows 128 items per batch and 120000 tokens per batch
)
DEPLOYMENT_GHA_ENABLED = os.environ.get("DEPLOYMENT_GHA_ENABLED", "true").lower() == "true"
JIRA_USER_NAME = os.environ.get("JIRA_USER_NAME", None)
JIRA_API_TOKEN = os.environ.get("JIRA_API_TOKEN", None)
JIRA_URL = os.environ.get("JIRA_URL", None)
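Following the pattern of the `JIRA_*` settings above, the Slack key described in the issue could be read from the environment in the same way. This is only a sketch: the names `SLACK_API_KEY` and `SLACK_ENABLED` are assumptions, since the issue says only that the key will be provided in server.py.

```python
import os

# Hypothetical setting name; not part of the snippet shown above.
SLACK_API_KEY = os.environ.get("SLACK_API_KEY", None)
# Skip Slack thread unrolling entirely when no key is configured.
SLACK_ENABLED = SLACK_API_KEY is not None and SLACK_API_KEY.strip() != ""
```

Gating on `SLACK_ENABLED` would let on_ticket behave exactly as before for installations that never set a key.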

"""
on_ticket is the main function that is called when a new issue is created.
It is only called by the webhook handler in sweepai/api.py.
"""
import difflib
import io
import os
import re
import traceback
from typing import Any
import zipfile
from time import time
import markdown
import openai
import requests
import yaml
import yamllint.config as yamllint_config
from github import BadCredentialsException, Github, Repository
from github.Issue import Issue
from github.PullRequest import PullRequest as GithubPullRequest
from loguru import logger
from tabulate import tabulate
from tqdm import tqdm
from yamllint import linter
from sweepai.core.sweep_bot import GHA_PROMPT
from sweepai.agents.pr_description_bot import PRDescriptionBot
from sweepai.agents.image_description_bot import ImageDescriptionBot
from sweepai.config.client import (
RESET_FILE,
RESTART_SWEEP_BUTTON,
REVERT_CHANGED_FILES_TITLE,
SweepConfig,
get_documentation_dict,
get_gha_enabled,
)
from sweepai.config.server import (
DEPLOYMENT_GHA_ENABLED,
ENV,
GITHUB_LABEL_NAME,
IS_SELF_HOSTED,
MONGODB_URI,
PROGRESS_BASE_URL,
)
from sweepai.core.entities import (
AssistantRaisedException,
FileChangeRequest,
MaxTokensExceeded,
NoFilesException,
PullRequest,
SandboxResponse,
)
from sweepai.core.entities import create_error_logs as entities_create_error_logs
from sweepai.core.pr_reader import PRReader
from sweepai.core.sweep_bot import SweepBot, get_files_to_change, get_files_to_change_for_gha, validate_file_change_requests
from sweepai.handlers.create_pr import (
create_config_pr,
create_pr_changes,
safe_delete_sweep_branch,
)
from sweepai.handlers.on_check_suite import clean_gh_logs
from sweepai.utils.image_utils import get_image_contents_from_urls, get_image_urls_from_issue
from sweepai.utils.issue_validator import validate_issue
from sweepai.utils.validate_license import validate_license
from sweepai.utils.buttons import Button, ButtonList, create_action_buttons
from sweepai.utils.chat_logger import ChatLogger
from sweepai.utils.diff import generate_diff
from sweepai.utils.event_logger import posthog
from sweepai.utils.github_utils import (
CURRENT_USERNAME,
ClonedRepo,
convert_pr_draft_field,
get_github_client,
get_token,
sanitize_string_for_github,
)
from sweepai.utils.progress import (
AssistantConversation,
PaymentContext,
TicketContext,
TicketProgress,
TicketProgressStatus,
)
from sweepai.utils.prompt_constructor import HumanMessagePrompt
from sweepai.utils.str_utils import (
BOT_SUFFIX,
FASTER_MODEL_MESSAGE,
UPDATES_MESSAGE,
blockquote,
bot_suffix,
checkbox_template,
clean_logs,
collapsible_template,
create_checkbox,
create_collapsible,
discord_suffix,
format_sandbox_success,
get_hash,
sep,
stars_suffix,
strip_sweep,
to_branch_name,
)
from sweepai.utils.ticket_utils import (
center,
fetch_relevant_files,
fire_and_forget_wrapper,
log_error,
prep_snippets,
)
from sweepai.utils.user_settings import UserSettings
# from sandbox.sandbox_utils import Sandbox
sweeping_gif = """<a href="https://github.com/sweepai/sweep"><img class="swing" src="https://raw.githubusercontent.com/sweepai/sweep/main/.assets/sweeping.gif" width="100" style="width:50px; margin-bottom:10px" alt="Sweeping"></a>"""
custom_config = """
extends: relaxed
rules:
line-length: disable
indentation: disable
"""
INSTRUCTIONS_FOR_REVIEW = """\
### 💡 To get Sweep to edit this pull request, you can:
* Comment below, and Sweep can edit the entire PR
* Comment on a file, Sweep will only modify the commented file
* Edit the original issue to get Sweep to recreate the PR from scratch"""
email_template = """Hey {name},
<br/><br/>
🚀 I just finished creating a pull request for your issue ({repo_full_name}#{issue_number}) at <a href="{pr_url}">{repo_full_name}#{pr_number}</a>!
<br/><br/>
You can view how I created this pull request <a href="{progress_url}">here</a>.
<h2>Summary</h2>
<blockquote>
{summary}
</blockquote>
<h2>Files Changed</h2>
<ul>
{files_changed}
</ul>
{sweeping_gif}
<br/>
Cheers,
<br/>
Sweep
<br/>"""
FAILING_GITHUB_ACTION_PROMPT = """\
The following Github Actions failed on a previous attempt at fixing this issue.
Propose a fix to the failing github actions. You must edit the source code, not the github action itself.
{github_action_log}
"""
# Add :eyes: emoji to ticket
def add_emoji(issue: Issue, comment_id: int = None, reaction_content="eyes"):
    item_to_react_to = issue.get_comment(comment_id) if comment_id else issue
    item_to_react_to.create_reaction(reaction_content)
# If the bot reacted to item_to_react_to with content_to_delete (default "eyes"), then remove it.
def remove_emoji(issue: Issue, comment_id: int = None, content_to_delete="eyes"):
    item_to_react_to = issue.get_comment(comment_id) if comment_id else issue
    reactions = item_to_react_to.get_reactions()
    for reaction in reactions:
        if (
            reaction.content == content_to_delete
            and reaction.user.login == CURRENT_USERNAME
        ):
            item_to_react_to.delete_reaction(reaction.id)
def create_error_logs(
    commit_url_display: str,
    sandbox_response: SandboxResponse,
    status: str = "✓",
):
    return (
        (
            "<br/>"
            + create_collapsible(
                f"Sandbox logs for {commit_url_display} {status}",
                blockquote(
                    "\n\n".join(
                        [
                            create_collapsible(
                                f"<code>{output}</code> {i + 1}/{len(sandbox_response.outputs)} {format_sandbox_success(sandbox_response.success)}",
                                f"<pre>{clean_logs(output)}</pre>",
                                i == len(sandbox_response.outputs) - 1,
                            )
                            for i, output in enumerate(sandbox_response.outputs)
                            if len(sandbox_response.outputs) > 0
                        ]
                    )
                ),
                opened=True,
            )
        )
        if sandbox_response
        else ""
    )
# takes in a list of workflow runs and returns the concatenated logs of the failing runs
def get_failing_gha_logs(runs, installation_id) -> str:
    token = get_token(installation_id)
    all_logs = ""
    for run in runs:
        # jobs_url
        jobs_url = run.jobs_url
        jobs_response = requests.get(
            jobs_url,
            headers={
                "Accept": "application/vnd.github+json",
                "Authorization": f"Bearer {token}",
                "X-GitHub-Api-Version": "2022-11-28",
            },
        )
        if jobs_response.status_code == 200:
            failed_jobs = []
            jobs = jobs_response.json()["jobs"]
            for job in jobs:
                if job["conclusion"] == "failure":
                    failed_jobs.append(job)
            failed_jobs_name_list = []
            for job in failed_jobs:
                # add failed steps
                for step in job["steps"]:
                    if step["conclusion"] == "failure":
                        failed_jobs_name_list.append(
                            f"{job['name']}/{step['number']}_{step['name']}"
                        )
        else:
            logger.error(
                "Failed to get jobs for failing github actions, possibly a credentials issue"
            )
            return all_logs
        # make sure jobs is valid
        if jobs_response.json()['total_count'] == 0:
            logger.error(f"no jobs for this run: {run}, continuing...")
            continue
        # logs url
        logs_url = run.logs_url
        logs_response = requests.get(
            logs_url,
            headers={
                "Accept": "application/vnd.github+json",
                "Authorization": f"Bearer {token}",
                "X-GitHub-Api-Version": "2022-11-28",
            },
            allow_redirects=True,
        )
        # Check if the request was successful
        if logs_response.status_code == 200:
            zip_data = io.BytesIO(logs_response.content)
            zip_file = zipfile.ZipFile(zip_data, "r")
            zip_file_names = zip_file.namelist()
            for file in failed_jobs_name_list:
                if f"{file}.txt" in zip_file_names:
                    logs = zip_file.read(f"{file}.txt").decode("utf-8")
                    logs_prompt = clean_gh_logs(logs)
                    all_logs += logs_prompt + "\n"
        else:
            logger.error(
                "Failed to get logs for failing github actions, likely a credentials issue"
            )
    return all_logs
def delete_old_prs(repo: Repository, issue_number: int):
    logger.info("Deleting old PRs...")
    prs = repo.get_pulls(
        state="open",
        sort="created",
        direction="desc",
        base=SweepConfig.get_branch(repo),
    )
    for pr in tqdm(prs.get_page(0)):
        # # Check if this issue is mentioned in the PR, and pr is owned by bot
        # # This is done in create_pr, (pr_description = ...)
        if pr.user.login == CURRENT_USERNAME and f"Fixes #{issue_number}.\n" in pr.body:
            safe_delete_sweep_branch(pr, repo)
            break
def construct_sweep_bot(
    repo: Repository,
    repo_name: str,
    issue_url: str,
    repo_description: str,
    title: str,
    message_summary: str,
    cloned_repo: ClonedRepo,
    ticket_progress: TicketProgress,
    chat_logger: ChatLogger,
    snippets: Any = None,
    tree: Any = None,
    comments: Any = None,
) -> SweepBot:
    human_message = HumanMessagePrompt(
        repo_name=repo_name,
        issue_url=issue_url,
        repo_description=repo_description.strip(),
        title=title,
        summary=message_summary,
        snippets=snippets,
        tree=tree,
    )
    sweep_bot = SweepBot.from_system_message_content(
        human_message=human_message,
        repo=repo,
        is_reply=bool(comments),
        chat_logger=chat_logger,
        cloned_repo=cloned_repo,
        ticket_progress=ticket_progress,
    )
    return sweep_bot
def get_comment_header(
    index: int,
    g: Github,
    repo_full_name: str,
    user_settings: UserSettings,
    progress_headers: list[None | str],
    tracking_id: str | None,
    payment_message_start: str,
    user_settings_message: str,
    errored: bool = False,
    pr_message: str = "",
    done: bool = False,
    initial_sandbox_response: int | SandboxResponse = -1,
    initial_sandbox_response_file=None,
    config_pr_url: str | None = None,
):
    config_pr_message = (
        "\n"
        + f"<div align='center'>Install Sweep Configs: <a href='{config_pr_url}'>Pull Request</a></div>"
        if config_pr_url is not None
        else ""
    )
    actions_message = create_action_buttons(
        [
            RESTART_SWEEP_BUTTON,
        ]
    )
    sandbox_execution_message = "\n\n## GitHub Actions failed\n\nThe sandbox appears to be unavailable or down.\n\n"
    if initial_sandbox_response == -1:
        sandbox_execution_message = ""
    elif initial_sandbox_response is not None:
        repo = g.get_repo(repo_full_name)
        commit_hash = repo.get_commits()[0].sha
        success = initial_sandbox_response.outputs and initial_sandbox_response.success
        status = "✓" if success else "X"
        # Trailing space keeps the status mark separate from the heading text.
        sandbox_execution_message = (
            "\n\n## GitHub Actions "
            + status
            + "\n\nHere are the GitHub Actions logs prior to making any changes:\n\n"
        )
        sandbox_execution_message += entities_create_error_logs(
            f'<a href="https://github.com/{repo_full_name}/commit/{commit_hash}"><code>{commit_hash[:7]}</code></a>',
            initial_sandbox_response,
            initial_sandbox_response_file,
        )
        if success:
            sandbox_execution_message += f"\n\nSandbox passed on the latest `{repo.default_branch}`, so sandbox checks will be enabled for this issue."
        else:
            sandbox_execution_message += "\n\nSandbox failed, so all sandbox checks will be disabled for this issue."
    if index < 0:
        index = 0
    if index == 4:
        return (
            pr_message
            + config_pr_message
            + f"\n\n---\n{user_settings.get_message(completed=True)}"
            + f"\n\n---\n{actions_message}"
            + sandbox_execution_message
        )
    # Convert the step index into a percentage for the progress bar.
    total = len(progress_headers)
    index += 1 if done else 0
    index *= 100 / total
    index = int(index)
    index = min(100, index)
    if errored:
        pbar = f"\n\n<img src='https://progress-bar.dev/{index}/?&title=Errored&width=600' alt='{index}%' />"
        return (
            f"{center(sweeping_gif)}<br/>{center(pbar)}\n\n"
            + f"\n\n---\n{actions_message}"
            + sandbox_execution_message
        )
    pbar = f"\n\n<img src='https://progress-bar.dev/{index}/?&title=Progress&width=600' alt='{index}%' />"
    return (
        f"{center(sweeping_gif)}"
        + (
            center(
                f'\n\n<h2>✨ Track Sweep\'s progress on our <a href="{PROGRESS_BASE_URL}/issues/{tracking_id}">progress dashboard</a>!</h2>'
            )
            if MONGODB_URI is not None
            else ""
        )
        + f"<br/>{center(pbar)}"
        + ("\n" + stars_suffix if index != -1 else "")
        + "\n"
        + center(payment_message_start)
        + f"\n\n---\n{user_settings_message}"
        + config_pr_message
        + f"\n\n---\n{actions_message}"
        + sandbox_execution_message
    )
def on_ticket(
title: str,
summary: str,
issue_number: int,
issue_url: str, # purely for logging purposes
username: str,
repo_full_name: str,
repo_description: str,
installation_id: int,
comment_id: int = None,
edited: bool = False,
tracking_id: str | None = None,
):
if not os.environ.get("CLI"):
assert validate_license(), "License key is invalid or expired. Please contact us at team@sweep.dev to upgrade to an enterprise license."
with logger.contextualize(
tracking_id=tracking_id,
):
if tracking_id is None:
tracking_id = get_hash()
on_ticket_start_time = time()
logger.info(f"Starting on_ticket with title {title} and summary {summary}")
(
title,
slow_mode,
do_map,
subissues_mode,
sandbox_mode,
fast_mode,
lint_mode,
) = strip_sweep(title)
# fetch images from body of issue
image_urls = get_image_urls_from_issue(issue_number, repo_full_name, installation_id)
image_contents = get_image_contents_from_urls(image_urls)
summary = summary or ""
summary = re.sub(
"<details (open)?>(\r)?\n<summary>Checklist</summary>.*",
"",
summary,
flags=re.DOTALL,
).strip()
summary = re.sub(
r"---\s+Checklist:(\r)?\n(\r)?\n- \[[ X]\].*",
"",
summary,
flags=re.DOTALL,
).strip()
summary = re.sub(
"### Details\n\n_No response_", "", summary, flags=re.DOTALL
)
summary = re.sub("\n\n", "\n", summary, flags=re.DOTALL)
repo_name = repo_full_name
user_token, g = get_github_client(installation_id)
repo = g.get_repo(repo_full_name)
current_issue: Issue = repo.get_issue(number=issue_number)
assignee = current_issue.assignee.login if current_issue.assignee else None
if assignee is None:
assignee = current_issue.user.login
ticket_progress = TicketProgress(
tracking_id=tracking_id,
username=username,
context=TicketContext(
title=title,
description=summary,
repo_full_name=repo_full_name,
issue_number=issue_number,
is_public=repo.private is False,
start_time=int(time()),
),
)
branch_match = re.search(
r"([Bb]ranch:) *(?P<branch_name>.+?)(\s|$)", summary
)
overrided_branch_name = None
if branch_match and "branch_name" in branch_match.groupdict():
overrided_branch_name = (
branch_match.groupdict()["branch_name"].strip().strip("`\"'")
)
# TODO: this code might be finicky, might have missed edge cases
if overrided_branch_name.startswith("https://github.com/"):
overrided_branch_name = overrided_branch_name.split("?")[0].split(
"tree/"
)[-1]
SweepConfig.get_branch(repo, overrided_branch_name)
chat_logger = (
ChatLogger(
{
"repo_name": repo_name,
"title": title,
"summary": summary,
"issue_number": issue_number,
"issue_url": issue_url,
"username": (
username if not username.startswith("sweep") else assignee
),
"repo_full_name": repo_full_name,
"repo_description": repo_description,
"installation_id": installation_id,
"type": "ticket",
"mode": ENV,
"comment_id": comment_id,
"edited": edited,
"tracking_id": tracking_id,
},
active=True,
)
if MONGODB_URI
else None
)
if chat_logger and not IS_SELF_HOSTED:
is_paying_user = chat_logger.is_paying_user()
is_consumer_tier = chat_logger.is_consumer_tier()
use_faster_model = chat_logger.use_faster_model()
else:
is_paying_user = True
is_consumer_tier = False
use_faster_model = False
if use_faster_model:
raise Exception(FASTER_MODEL_MESSAGE)
if fast_mode:
use_faster_model = True
if not comment_id and not edited and chat_logger and not sandbox_mode:
fire_and_forget_wrapper(chat_logger.add_successful_ticket)(
gpt3=use_faster_model
)
organization, repo_name = repo_full_name.split("/")
metadata = {
"issue_url": issue_url,
"repo_full_name": repo_full_name,
"organization": organization,
"repo_name": repo_name,
"repo_description": repo_description,
"username": username,
"comment_id": comment_id,
"title": title,
"installation_id": installation_id,
"function": "on_ticket",
"edited": edited,
"model": "gpt-3.5" if use_faster_model else "gpt-4",
"tier": "pro" if is_paying_user else "free",
"mode": ENV,
"slow_mode": slow_mode,
"do_map": do_map,
"subissues_mode": subissues_mode,
"sandbox_mode": sandbox_mode,
"fast_mode": fast_mode,
"is_self_hosted": IS_SELF_HOSTED,
"tracking_id": tracking_id,
}
fire_and_forget_wrapper(posthog.capture)(
username, "started", properties=metadata
)
try:
if current_issue.state == "closed":
fire_and_forget_wrapper(posthog.capture)(
username,
"issue_closed",
properties={
**metadata,
"duration": round(time() - on_ticket_start_time),
},
)
return {"success": False, "reason": "Issue is closed"}
fire_and_forget_wrapper(add_emoji)(current_issue, comment_id)
fire_and_forget_wrapper(remove_emoji)(
current_issue, comment_id, content_to_delete="rocket"
)
fire_and_forget_wrapper(remove_emoji)(
current_issue, comment_id, content_to_delete="confused"
)
fire_and_forget_wrapper(current_issue.edit)(body=summary)
replies_text = ""
summary = summary if summary else ""
fire_and_forget_wrapper(delete_old_prs)(repo, issue_number)
if not sandbox_mode:
progress_headers = [
None,
"Step 1: 🔎 Searching",
"Step 2: ⌨️ Coding",
"Step 3: 🔁 Code Review",
]
else:
progress_headers = [
None,
"📖 Reading File",
"🛠️ Executing Sandbox",
]
issue_comment = None
payment_message, payment_message_start = get_payment_messages(
chat_logger
)
ticket_progress.context.payment_context = PaymentContext(
use_faster_model=use_faster_model,
pro_user=is_paying_user,
daily_tickets_used=(
chat_logger.get_ticket_count(use_date=True)
if chat_logger
else 0
),
monthly_tickets_used=(
chat_logger.get_ticket_count() if chat_logger else 0
),
)
ticket_progress.save()
config_pr_url = None
user_settings = UserSettings.from_username(username=username)
user_settings_message = user_settings.get_message()
cloned_repo = ClonedRepo(
repo_full_name,
installation_id=installation_id,
token=user_token,
repo=repo,
branch=overrided_branch_name,
)
# check that repo's directory is non-empty
if os.listdir(cloned_repo.cached_dir) == []:
logger.info("Empty repo")
first_comment = (
"Sweep is currently not supported on empty repositories. Please add some"
f" code to your repository and try again.\n{sep}##"
f" {progress_headers[1]}\n{bot_suffix}{discord_suffix}"
)
if issue_comment is None:
issue_comment = current_issue.create_comment(
first_comment + BOT_SUFFIX
)
else:
issue_comment.edit(first_comment + BOT_SUFFIX)
fire_and_forget_wrapper(add_emoji)(
current_issue, comment_id, reaction_content="confused"
)
fire_and_forget_wrapper(remove_emoji)(current_issue, comment_id, content_to_delete="eyes")
return {"success": False}
indexing_message = (
"I'm searching for relevant snippets in your repository. If this is your first"
" time using Sweep, I'm indexing your repository. You can monitor the progress using the progress dashboard"
)
first_comment = (
f"{get_comment_header(0, g, repo_full_name, user_settings, progress_headers, tracking_id, payment_message_start, user_settings_message)}\n{sep}I am currently looking into this ticket! I"
" will update the progress of the ticket in this comment. I am currently"
f" searching through your code, looking for relevant snippets.\n{sep}##"
f" {progress_headers[1]}\n{indexing_message}{bot_suffix}{discord_suffix}"
)
# Find Sweep's previous comment
comments = []
for comment in current_issue.get_comments():
comments.append(comment)
if comment.user.login == CURRENT_USERNAME:
issue_comment = comment
break
if issue_comment is None:
issue_comment = current_issue.create_comment(first_comment)
else:
fire_and_forget_wrapper(issue_comment.edit)(first_comment)
old_edit = issue_comment.edit
issue_comment.edit = lambda msg: old_edit(msg + BOT_SUFFIX)
past_messages = {}
current_index = 0
table = None
initial_sandbox_response = -1
initial_sandbox_response_file = None
def refresh_token():
user_token, g = get_github_client(installation_id)
repo = g.get_repo(repo_full_name)
return user_token, g, repo
def edit_sweep_comment(
message: str,
index: int,
pr_message="",
done=False,
add_bonus_message=True,
):
nonlocal current_index, user_token, g, repo, issue_comment, initial_sandbox_response, initial_sandbox_response_file
message = sanitize_string_for_github(message)
if pr_message:
pr_message = sanitize_string_for_github(pr_message)
# -1 = error, -2 = retry
# Only update the progress bar if the issue generation errors.
errored = index == -1
if index >= 0:
past_messages[index] = message
current_index = index
agg_message = None
# Include progress history
# index = -2 is reserved for retries
for i in range(
current_index + 2
): # go to next header (for Working on it... text)
if i == 0 or i >= len(progress_headers):
continue # skip None header
header = progress_headers[i]
if header is not None:
header = "## " + header + "\n"
else:
header = "No header\n"
msg = header + (past_messages.get(i) or "Working on it...")
if agg_message is None:
agg_message = msg
else:
agg_message = agg_message + f"\n{sep}" + msg
suffix = bot_suffix + discord_suffix
if errored:
agg_message = (
"## ❌ Unable to Complete PR"
+ "\n"
+ message
+ (
"\n\nFor bonus GPT-4 tickets, please report this bug on"
f" **[Discord](https://discord.gg/invite/sweep)** (tracking ID: `{tracking_id}`)."
if add_bonus_message
else ""
)
)
if table is not None:
agg_message = (
agg_message
+ f"\n{sep}Please look at the generated plan. If something looks"
f" wrong, please add more details to your issue.\n\n{table}"
)
suffix = bot_suffix # don't include discord suffix for error messages
# Update the issue comment
msg = f"{get_comment_header(current_index, g, repo_full_name, user_settings, progress_headers, tracking_id, payment_message_start, user_settings_message, errored=errored, pr_message=pr_message, done=done, initial_sandbox_response=initial_sandbox_response, initial_sandbox_response_file=initial_sandbox_response_file, config_pr_url=config_pr_url)}\n{sep}{agg_message}{suffix}"
try:
issue_comment.edit(msg)
except BadCredentialsException:
logger.error(
f"Bad credentials, refreshing token (tracking ID: `{tracking_id}`)"
)
user_token, g = get_github_client(installation_id)
repo = g.get_repo(repo_full_name)
issue_comment = None
for comment in comments:
if comment.user.login == CURRENT_USERNAME:
issue_comment = comment
current_issue = repo.get_issue(number=issue_number)
if issue_comment is None:
issue_comment = current_issue.create_comment(msg)
else:
issue_comment = [
comment
for comment in current_issue.get_comments()
if comment.user.login == CURRENT_USERNAME
][0]
issue_comment.edit(msg)
if use_faster_model:
edit_sweep_comment(
FASTER_MODEL_MESSAGE, -1, add_bonus_message=False
)
posthog.capture(
username,
"ran_out_of_tickets",
properties={
**metadata,
"duration": round(time() - on_ticket_start_time),
},
)
fire_and_forget_wrapper(add_emoji)(
current_issue, comment_id, reaction_content="confused"
)
fire_and_forget_wrapper(remove_emoji)(content_to_delete="eyes")
return {
"success": False,
"error_message": "We have deprecated support for GPT-3.5.",
}
error_message = validate_issue(title + summary)
if error_message:
logger.warning(f"Validation error: {error_message}")
edit_sweep_comment(
(
f"The issue was rejected with the following response:\n\n{blockquote(error_message)}"
),
-1,
)
fire_and_forget_wrapper(add_emoji)(
current_issue, comment_id, reaction_content="confused"
)
fire_and_forget_wrapper(remove_emoji)(content_to_delete="eyes")
posthog.capture(
username,
"invalid_issue",
properties={
**metadata,
"duration": round(time() - on_ticket_start_time),
},
)
return {"success": True}
prs_extracted = PRReader.extract_prs(repo, summary)
message_summary = summary
if prs_extracted:
message_summary += "\n\n" + prs_extracted
edit_sweep_comment(
create_collapsible(
"I found that you mentioned the following Pull Requests that might be important:",
blockquote(
prs_extracted,
),
),
1,
)
try:
# search/context manager
logger.info("Searching for relevant snippets...")
if image_contents: # doing it here to avoid editing the original issue
message_summary += ImageDescriptionBot().describe_images(text=title + message_summary, images=image_contents)
snippets, tree, _, repo_context_manager = fetch_relevant_files(
cloned_repo,
title,
message_summary,
replies_text,
username,
metadata,
on_ticket_start_time,
tracking_id,
is_paying_user,
is_consumer_tier,
issue_url,
chat_logger,
ticket_progress,
images=image_contents
)
cloned_repo = repo_context_manager.cloned_repo
except Exception as e:
edit_sweep_comment(
(
"An error occurred while fetching the relevant files."
f" The exception was {str(e)}. If this error persists,"
f" contact team@sweep.dev.\n\n> @{username}, editing this issue description to include more details will automatically make me relaunch. Please join our Discord server for support (tracking_id={tracking_id})"
),
-1,
)
raise Exception("Failed to fetch files") from e
_user_token, g = get_github_client(installation_id)
user_token, g, repo = refresh_token()
cloned_repo.token = user_token
repo = g.get_repo(repo_full_name)
ticket_progress.search_progress.indexing_progress = (
ticket_progress.search_progress.indexing_total
)
ticket_progress.status = TicketProgressStatus.PLANNING
ticket_progress.save()
# Fetch git commit history
if not repo_description:
repo_description = "No description provided."
message_summary += replies_text
get_documentation_dict(repo)
docs_results = ""
sweep_bot = construct_sweep_bot(
repo=repo,
repo_name=repo_name,
issue_url=issue_url,
repo_description=repo_description,
title=title,
message_summary=message_summary,
cloned_repo=cloned_repo,
ticket_progress=ticket_progress,
chat_logger=chat_logger,
snippets=snippets,
tree=tree,
comments=comments,
)
# Check repository for sweep.yaml file.
sweep_yml_exists = False
sweep_yml_failed = False
for content_file in repo.get_contents(""):
if content_file.name == "sweep.yaml":
sweep_yml_exists = True
# Check if YAML is valid
yaml_content = content_file.decoded_content.decode("utf-8")
sweep_yaml_dict = {}
try:
sweep_yaml_dict = yaml.safe_load(yaml_content)
except Exception:
logger.error(f"Failed to load YAML file: {yaml_content}")
if sweep_yaml_dict:  # yaml.safe_load returns None for an empty file, so avoid len()
break
linter_config = yamllint_config.YamlLintConfig(custom_config)
problems = list(linter.run(yaml_content, linter_config))
if problems:
errors = [
f"Line {problem.line}: {problem.desc} (rule: {problem.rule})"
for problem in problems
]
error_message = "\n".join(errors)
markdown_error_message = f"**There is something wrong with your [sweep.yaml](https://github.com/{repo_full_name}/blob/main/sweep.yaml):**\n```\n{error_message}\n```"
sweep_yml_failed = True
logger.error(markdown_error_message)
edit_sweep_comment(markdown_error_message, -1)
else:
logger.info("The YAML file is valid. No errors found.")
break
# If sweep.yaml does not exist, then create a new PR that simply creates the sweep.yaml file.
if not sweep_yml_exists:
try:
logger.info("Creating sweep.yaml file...")
config_pr = create_config_pr(sweep_bot, cloned_repo=cloned_repo)
config_pr_url = config_pr.html_url
edit_sweep_comment(message="", index=-2)
except Exception as e:
logger.error(
f"Failed to create new branch for sweep.yaml file.\n{e}\n{traceback.format_exc()}"
)
else:
logger.info("sweep.yaml file already exists.")
try:
# ANALYZE SNIPPETS
newline = "\n"
edit_sweep_comment(
"I found the following snippets in your repository. I will now analyze"
" these snippets and come up with a plan."
+ "\n\n"
+ create_collapsible(
"Some code snippets I think are relevant in decreasing order of relevance (click to expand). If some file is missing from here, you can mention the path in the ticket description.",
"\n".join(
[
f"https://github.com/{organization}/{repo_name}/blob/{repo.get_commits()[0].sha}/{snippet.file_path}#L{max(snippet.start, 1)}-L{min(snippet.end, snippet.content.count(newline) - 1)}\n"
for snippet in snippets
]
),
)
+ (
create_collapsible(
"I also found that you mentioned the following Pull Requests that may be helpful:",
blockquote(prs_extracted),
)
if prs_extracted
else ""
)
+ (f"\n\n{docs_results}\n\n" if docs_results else ""),
1,
)
logger.info("Fetching files to modify/create...")
file_change_requests, plan = get_files_to_change(
relevant_snippets=repo_context_manager.current_top_snippets,
read_only_snippets=repo_context_manager.read_only_snippets,
problem_statement=f"{title}\n\n{message_summary}",
repo_name=repo_full_name,
cloned_repo=cloned_repo,
images=image_contents
)
validate_file_change_requests(file_change_requests, cloned_repo)
ticket_progress.planning_progress.file_change_requests = (
file_change_requests
)
ticket_progress.coding_progress.file_change_requests = (
file_change_requests
)
ticket_progress.coding_progress.assistant_conversations = [
AssistantConversation() for fcr in file_change_requests
]
ticket_progress.status = TicketProgressStatus.CODING
ticket_progress.save()
if not file_change_requests:
if len(title + summary) < 60:
edit_sweep_comment(
(
"Sorry, I could not find any files to modify, can you please"
" provide more details? Please make sure that the title and"
" summary of the issue are at least 60 characters."
),
-1,
)
else:
edit_sweep_comment(
(
"Sorry, I could not find any files to modify, can you please"
" provide more details?"
),
-1,
)
raise Exception("No files to modify.")
file_change_requests: list[
FileChangeRequest
] = sweep_bot.validate_file_change_requests(
file_change_requests,
)
ticket_progress.planning_progress.file_change_requests = (
file_change_requests
)
ticket_progress.coding_progress.assistant_conversations = [
AssistantConversation() for fcr in file_change_requests
]
ticket_progress.save()
table = tabulate(
[
[
file_change_request.entity_display,
file_change_request.instructions_display.replace(
"\n", "<br/>"
).replace("```", "\\```"),
]
for file_change_request in file_change_requests
if file_change_request.change_type != "check"
],
headers=["File Path", "Proposed Changes"],
tablefmt="pipe",
)
logger.info("Generating PR...")
pull_request = PullRequest(
title="Sweep: " + title,
branch_name="sweep/" + to_branch_name(title),
content="",
)
logger.info("Making PR...")
ticket_progress.context.branch_name = pull_request.branch_name
ticket_progress.save()
files_progress: list[tuple[str, str, str, str]] = [
(
file_change_request.entity_display,
file_change_request.instructions_display,
"⏳ In Progress",
"",
)
for file_change_request in file_change_requests
]
checkboxes_progress: list[tuple[str, str, str]] = [
(
file_change_request.entity_display,
file_change_request.instructions_display,
" ",
)
for file_change_request in file_change_requests
if not file_change_request.change_type == "check"
]
checkboxes_contents = "\n".join(
[
create_checkbox(
f"`{filename}`", blockquote(instructions), check == "X"
)
for filename, instructions, check in checkboxes_progress
]
)
create_collapsible("Checklist", checkboxes_contents, opened=True)
file_change_requests[0].status = "running"
condensed_checkboxes_contents = "\n".join(
[
create_checkbox(f"`{filename}`", "", check == "X").strip()
for filename, instructions, check in checkboxes_progress
]
)
condensed_checkboxes_collapsible = create_collapsible(
"Checklist", condensed_checkboxes_contents, opened=True
)
current_issue = repo.get_issue(number=issue_number)
current_issue.edit(
body=summary + "\n\n" + condensed_checkboxes_collapsible
)
delete_branch = False
generator = create_pr_changes(
file_change_requests,
pull_request,
sweep_bot,
username,
installation_id,
issue_number,
chat_logger=chat_logger,
base_branch=overrided_branch_name,
additional_messages=[],
)
edit_sweep_comment(checkboxes_contents, 2)
if not file_change_requests:
raise NoFilesException()
response = {
"error": Exception(
f"Sweep failed to generate any file change requests! This could mean that Sweep failed to find the correct lines of code to modify or that GPT-4 did not respond in our specified format. Sometimes, retrying will fix this error. Otherwise, reach out to our Discord server for support (tracking_id={tracking_id})."
)
}
changed_files = []
for item in generator:
if isinstance(item, dict):
response = item
break
(
new_file_contents,
_,
commit,
file_change_requests,
) = item
# append all files that have been changed
if new_file_contents:
for file_name, _ in new_file_contents.items():
changed_files.append(file_name)
commit_hash: str = (
commit
if isinstance(commit, str)
else (
commit.sha
if commit is not None
else repo.get_branch(
pull_request.branch_name
).commit.sha
)
)
commit_url = (
f"https://github.com/{repo_full_name}/commit/{commit_hash}"
)
commit_url_display = (
f"<a href='{commit_url}'><code>{commit_hash[:7]}</code></a>"
)
create_error_logs(
commit_url_display,
None,
status=(
"✓"
),
)
checkboxes_progress = [
(
file_change_request.display_summary
+ " "
+ file_change_request.status_display
+ " "
+ (file_change_request.commit_hash_url or "")
+ f" [Edit]({file_change_request.get_edit_url(repo.full_name, pull_request.branch_name)})",
file_change_request.instructions_ticket_display
+ f"\n\n{file_change_request.diff_display}",
(
"X"
if file_change_request.status
in ("succeeded", "failed")
else " "
),
)
for file_change_request in file_change_requests
]
checkboxes_contents = "\n".join(
[
checkbox_template.format(
check=check,
filename=filename,
instructions=blockquote(instructions),
)
for filename, instructions, check in checkboxes_progress
]
)
collapsible_template.format(
summary="Checklist",
body=checkboxes_contents,
opened="open",
)
condensed_checkboxes_contents = "\n".join(
[
checkbox_template.format(
check=check,
filename=filename,
instructions="",
).strip()
for filename, instructions, check in checkboxes_progress
if not instructions.lower().startswith("run")
]
)
condensed_checkboxes_collapsible = collapsible_template.format(
summary="Checklist",
body=condensed_checkboxes_contents,
opened="open",
)
try:
current_issue = repo.get_issue(number=issue_number)
except BadCredentialsException:
user_token, g, repo = refresh_token()
cloned_repo.token = user_token
current_issue.edit(
body=summary + "\n\n" + condensed_checkboxes_collapsible
)
logger.info(files_progress)
edit_sweep_comment(checkboxes_contents, 2)
if not response.get("success"):
raise Exception(f"Failed to create PR: {response.get('error')}")
checkboxes_contents = "\n".join(
[
checkbox_template.format(
check=check,
filename=filename,
instructions=blockquote(instructions),
)
for filename, instructions, check in checkboxes_progress
]
)
condensed_checkboxes_contents = "\n".join(
[
checkbox_template.format(
check=check,
filename=filename,
instructions="",
).strip()
for filename, instructions, check in checkboxes_progress
if not instructions.lower().startswith("run")
]
)
condensed_checkboxes_collapsible = collapsible_template.format(
summary="Checklist",
body=condensed_checkboxes_contents,
opened="open",
)
for _ in range(3):
try:
current_issue.edit(
body=summary + "\n\n" + condensed_checkboxes_collapsible
)
break
except Exception:
from time import sleep
sleep(1)
edit_sweep_comment(checkboxes_contents, 2)
pr_changes = response["pull_request"]
# change the body here
diff_text = get_branch_diff_text(
repo=repo,
branch=pull_request.branch_name,
base_branch=overrided_branch_name,
)
new_description = PRDescriptionBot().describe_diffs(
diff_text,
pull_request.title,
)
# TODO: update the title as well
if new_description:
pr_changes.body = (
f"{new_description}\n\nFixes"
f" #{issue_number}.\n\n---\n\n{UPDATES_MESSAGE}\n\n---\n\n{INSTRUCTIONS_FOR_REVIEW}{BOT_SUFFIX}"
)
edit_sweep_comment(
"I have finished coding the issue. I am now reviewing it for completeness.",
3,
)
change_location = f" [`{pr_changes.pr_head}`](https://github.com/{repo_full_name}/commits/{pr_changes.pr_head}).\n\n"
review_message = (
"Here are my self-reviews of my changes at" + change_location
)
try:
fire_and_forget_wrapper(remove_emoji)(content_to_delete="eyes")
except Exception:
pass
changes_required, review_message = False, ""
if changes_required:
edit_sweep_comment(
review_message
+ "\n\nI finished incorporating these changes.",
3,
)
else:
edit_sweep_comment(
f"I have finished reviewing the code for completeness. I did not find errors for {change_location}",
3,
)
revert_buttons = []
for changed_file in set(changed_files):
revert_buttons.append(
Button(label=f"{RESET_FILE} {changed_file}")
)
revert_buttons_list = ButtonList(
buttons=revert_buttons, title=REVERT_CHANGED_FILES_TITLE
)
# delete failing sweep yaml if applicable
if sweep_yml_failed:
try:
repo.delete_file(
"sweep.yaml",
"Delete failing sweep.yaml",
branch=pr_changes.pr_head,
sha=repo.get_contents("sweep.yaml").sha,
)
except Exception:
pass
# create draft pr, then convert to regular pr later
pr: GithubPullRequest = repo.create_pull(
title=pr_changes.title,
body=pr_changes.body,
head=pr_changes.pr_head,
base=overrided_branch_name or SweepConfig.get_branch(repo),
# removed draft PR
draft=False,
)
try:
pr.add_to_assignees(username)
except Exception as e:
logger.error(
f"Failed to add assignee {username}: {e}, probably a bot."
)
ticket_progress.status = TicketProgressStatus.COMPLETE
ticket_progress.context.done_time = time()
ticket_progress.context.pr_id = pr.number
ticket_progress.save()
if revert_buttons:
pr.create_issue_comment(
revert_buttons_list.serialize() + BOT_SUFFIX
)
# add comments before labelling
pr.add_to_labels(GITHUB_LABEL_NAME)
current_issue.create_reaction("rocket")
heres_pr_message = f'<h1 align="center">🚀 Here\'s the PR! <a href="{pr.html_url}">#{pr.number}</a></h1>'
progress_message = f'<div align="center"><b>See Sweep\'s progress at <a href="{PROGRESS_BASE_URL}/issues/{tracking_id}">the progress dashboard</a>!</b></div>'
edit_sweep_comment(
review_message + "\n\nSuccess! 🚀",
4,
pr_message=(
f"{center(heres_pr_message)}\n{center(progress_message)}\n{center(payment_message_start)}"
),
done=True,
)
user_settings = UserSettings.from_username(username=username)
user = g.get_user(username)
full_name = user.name or user.login
name = full_name.split(" ")[0]
files_changed = []
for fcr in file_change_requests:
if fcr.change_type in ("create", "modify"):
diff = list(
difflib.unified_diff(
(fcr.old_content or "").splitlines() or [],
(fcr.new_content or "").splitlines() or [],
lineterm="",
)
)
added = sum(
1
for line in diff
if line.startswith("+") and not line.startswith("+++")
)
removed = sum(
1
for line in diff
if line.startswith("-") and not line.startswith("---")
)
files_changed.append(
f"<code>{fcr.filename}</code> (+{added}/-{removed})"
)
user_settings.send_email(
subject=f"Sweep Pull Request Complete for {repo_name}#{issue_number} {title}",
html=email_template.format(
name=name,
pr_url=pr.html_url,
issue_number=issue_number,
repo_full_name=repo_full_name,
pr_number=pr.number,
progress_url=f"{PROGRESS_BASE_URL}/issues/{tracking_id}",
summary=markdown.markdown(pr_changes.body),
files_changed="\n".join(
[f"<li>{item}</li>" for item in files_changed]
),
sweeping_gif=sweeping_gif,
),
)
# poll for github to check when gha are done
total_poll_attempts = 0
total_edit_attempts = 0
SLEEP_DURATION_SECONDS = 15
GITHUB_ACTIONS_ENABLED = get_gha_enabled(repo=repo) and DEPLOYMENT_GHA_ENABLED
GHA_MAX_EDIT_ATTEMPTS = 5 # max number of times to edit PR
current_commit = pr.head.sha
while GITHUB_ACTIONS_ENABLED:
logger.info(
f"Polling to see if GitHub Actions have finished... {total_poll_attempts}"
)
# we wait at most 60 minutes
if total_poll_attempts * SLEEP_DURATION_SECONDS // 60 >= 60:
break
else:
# wait SLEEP_DURATION_SECONDS seconds between check attempts
total_poll_attempts += 1
from time import sleep
sleep(SLEEP_DURATION_SECONDS)
runs = list(repo.get_workflow_runs(branch=pr.head.ref, head_sha=current_commit))
# if all runs have succeeded, break
if all([run.conclusion == "success" for run in runs]):
break
# if any of them have failed we retry
if any([run.conclusion == "failure" for run in runs]):
failed_runs = [
run for run in runs if run.conclusion == "failure"
]
failed_gha_logs: list[str] = get_failing_gha_logs(
failed_runs,
installation_id,
)
if failed_gha_logs:
# make edits to the PR
# TODO: look into rollbacks so we don't continue adding onto errors
cloned_repo = ClonedRepo( # reinitialize cloned_repo to avoid conflicts
repo_full_name,
installation_id=installation_id,
token=user_token,
repo=repo,
branch=pr.head.ref,
)
diffs = get_branch_diff_text(repo=repo, branch=pr.head.ref, base_branch=pr.base.ref)
problem_statement = f"{title}\n{message_summary}\n{replies_text}"
all_information_prompt = GHA_PROMPT.format(
problem_statement=problem_statement,
github_actions_logs=failed_gha_logs,
changes_made=diffs,
)
repo_context_manager = prep_snippets(cloned_repo=cloned_repo, query=(title + message_summary + replies_text).strip("\n"), ticket_progress=ticket_progress) # need to do this, can use the old query for speed
sweep_bot: SweepBot = construct_sweep_bot(
repo=repo,
repo_name=repo_name,
issue_url=issue_url,
repo_description=repo_description,
title="Fix the following errors to complete the user request.",
message_summary=all_information_prompt,
cloned_repo=cloned_repo,
ticket_progress=ticket_progress,
chat_logger=chat_logger,
snippets=snippets,
tree=tree,
comments=comments,
)
file_change_requests, plan = get_files_to_change_for_gha(
relevant_snippets=repo_context_manager.current_top_snippets,
read_only_snippets=repo_context_manager.read_only_snippets,
problem_statement=all_information_prompt,
updated_files=new_file_contents,
cloned_repo=cloned_repo,
chat_logger=chat_logger,
)
validate_file_change_requests(file_change_requests, cloned_repo)
previous_modify_files_dict: dict[str, dict[str, str | list[str]]] | None = None
_, commit, _ = sweep_bot.handle_modify_file_main(
branch=pr.head.ref,
assistant_conversation=None,
additional_messages=[],
previous_modify_files_dict=previous_modify_files_dict,
file_change_requests=file_change_requests,
username=username
)
current_commit = commit.sha
pr = repo.get_pull(pr.number) # IMPORTANT: resync PR otherwise you'll fetch old GHA runs
total_edit_attempts += 1
if total_edit_attempts >= GHA_MAX_EDIT_ATTEMPTS:
logger.info(f"Tried to edit PR {GHA_MAX_EDIT_ATTEMPTS} times, giving up.")
break
# if none of the runs have completed we wait and poll github
logger.info(
f"No GitHub Actions have failed yet and not all have succeeded yet, waiting for {SLEEP_DURATION_SECONDS} seconds before polling again..."
)
# break from main for loop
convert_pr_draft_field(pr, is_draft=False, installation_id=installation_id)
except MaxTokensExceeded as e:
logger.info("Max tokens exceeded")
ticket_progress.status = TicketProgressStatus.ERROR
ticket_progress.error_message = "Max tokens exceeded. Feel free to add more details to the issue description for Sweep to better address it, or alternatively, reach out to Kevin or William for help at https://discord.gg/sweep."
ticket_progress.save()
log_error(
is_paying_user,
is_consumer_tier,
username,
issue_url,
"Max Tokens Exceeded",
str(e) + "\n" + traceback.format_exc(),
priority=2,
)
if chat_logger and chat_logger.is_paying_user():
edit_sweep_comment(
(
f"Sorry, I could not edit `{e.filename}` as this file is too long."
" We are currently working on improved file streaming to address"
" this issue.\n"
),
-1,
)
else:
edit_sweep_comment(
(
f"Sorry, I could not edit `{e.filename}` as this file is too"
" long.\n\nIf this file is incorrect, please describe the desired"
" file in the prompt. However, if you would like to edit longer"
" files, consider upgrading to [Sweep Pro](https://sweep.dev/) for"
" longer context lengths.\n"
),
-1,
)
delete_branch = True
raise e
except NoFilesException as e:
ticket_progress.status = TicketProgressStatus.ERROR
ticket_progress.error_message = "Sweep could not find files to modify to address this issue. Feel free to add more details to the issue description for Sweep to better address it, or alternatively, reach out to Kevin or William for help at https://discord.gg/sweep."
ticket_progress.save()
logger.info("Sweep could not find files to modify")
log_error(
is_paying_user,
is_consumer_tier,
username,
issue_url,
"Sweep could not find files to modify",
str(e) + "\n" + traceback.format_exc(),
priority=2,
)
edit_sweep_comment(
(
"Sorry, Sweep could not find any appropriate files to edit to address"
" this issue. If this is a mistake, please provide more context and Sweep"
f" will retry!\n\n@{username}, please edit the issue description to"
" include more details. You can also ask for help on our community"
" forum: https://community.sweep.dev/"
),
-1,
)
delete_branch = True
raise e
except openai.BadRequestError as e:
ticket_progress.status = TicketProgressStatus.ERROR
ticket_progress.error_message = "Sorry, it looks like there is an error with communicating with OpenAI. If this error persists, reach out to Kevin or William for help at https://discord.gg/sweep."
ticket_progress.save()
logger.error(traceback.format_exc())
logger.error(e)
edit_sweep_comment(
(
"I'm sorry, but it looks like our model has run out of context length. We're"
" trying to make this happen less, but one way to mitigate this is to"
" code smaller files. If this error persists, report it at"
" https://discord.gg/sweep."
),
-1,
)
log_error(
is_paying_user,
is_consumer_tier,
username,
issue_url,
"Context Length",
str(e) + "\n" + traceback.format_exc(),
priority=2,
)
posthog.capture(
username,
"failed",
properties={
"error": str(e),
"trace": traceback.format_exc(),
"reason": "Invalid request error / context length",
**metadata,
"duration": round(time() - on_ticket_start_time),
},
)
delete_branch = True
raise e
except AssistantRaisedException as e:
if ticket_progress is not None:
ticket_progress.status = TicketProgressStatus.ERROR
ticket_progress.error_message = f"Sweep raised an error with the following message: {e.message}. Feel free to add more details to the issue description for Sweep to better address it, or alternatively, reach out to Kevin or William for help at https://discord.gg/sweep."
ticket_progress.save()
logger.exception(e)
edit_sweep_comment(
f"Sweep raised an error with the following message:\n{blockquote(e.message)}",
-1,
)
log_error(
is_paying_user,
is_consumer_tier,
username,
issue_url,
"Workflow",
str(e) + "\n" + traceback.format_exc(),
priority=1,
)
raise e
except Exception as e:
ticket_progress.status = TicketProgressStatus.ERROR
ticket_progress.error_message = f"Internal server error: {str(e)}. Feel free to add more details to the issue description for Sweep to better address it, or alternatively, reach out to Kevin or William for help at https://discord.gg/sweep."
ticket_progress.save()
logger.error(traceback.format_exc())
logger.error(e)
# title and summary are defined elsewhere
if len(title + summary) < 60:
edit_sweep_comment(
(
"I'm sorry, but it looks like an error occurred due to"
f" a planning failure. The error message is {str(e)}. Feel free to add more details to the issue description"
" so Sweep can better address it. Alternatively, post on our community forum"
" for assistance: https://community.sweep.dev/"
),
-1,
)
else:
edit_sweep_comment(
(
"I'm sorry, but it looks like an error has occurred due to"
+ f" a planning failure. The error message is {str(e)}. Feel free to add more details to the issue description"
+ " so Sweep can better address it. Alternatively, reach out to Kevin or William for help at"
+ " https://discord.gg/sweep."
),
-1,
)
log_error(
is_paying_user,
is_consumer_tier,
username,
issue_url,
"Workflow",
str(e) + "\n" + traceback.format_exc(),
priority=1,
)
raise e
else:
try:
fire_and_forget_wrapper(remove_emoji)(content_to_delete="eyes")
fire_and_forget_wrapper(add_emoji)("rocket")
except SystemExit:
raise  # re-raise the original SystemExit so its exit code is preserved
except Exception as e:
logger.error(e)
if delete_branch:
try:
if pull_request.branch_name.startswith("sweep"):
repo.get_git_ref(
f"heads/{pull_request.branch_name}"
).delete()
else:
raise Exception(
f"Branch name {pull_request.branch_name} does not start with sweep/"
)
except Exception as e:
logger.error(e)
logger.error(traceback.format_exc())
logger.info(f"Deleted branch {pull_request.branch_name}")
except Exception as e:
posthog.capture(
username,
"failed",
properties={
**metadata,
"error": str(e),
"trace": traceback.format_exc(),
"duration": round(time() - on_ticket_start_time),
},
)
raise e
posthog.capture(
username,
"success",
properties={**metadata, "duration": round(time() - on_ticket_start_time)},
)
logger.info("on_ticket success in " + str(round(time() - on_ticket_start_time)))
return {"success": True}
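The per-file `(+added/-removed)` counts that `on_ticket` builds for the completion email can be reproduced in isolation. The following is a minimal sketch of that counting step, not the project's own helper: the function name `count_changed_lines` is hypothetical, and only `difflib.unified_diff` from the standard library is used.

```python
import difflib
from typing import Optional, Tuple


def count_changed_lines(
    old_content: Optional[str], new_content: Optional[str]
) -> Tuple[int, int]:
    """Return (added, removed) line counts between two versions of a file.

    Hypothetical helper mirroring the counting logic in on_ticket.
    """
    diff = list(
        difflib.unified_diff(
            (old_content or "").splitlines(),
            (new_content or "").splitlines(),
            lineterm="",
        )
    )
    # The "+++" and "---" lines are file headers, not content changes.
    added = sum(
        1 for line in diff if line.startswith("+") and not line.startswith("+++")
    )
    removed = sum(
        1 for line in diff if line.startswith("-") and not line.startswith("---")
    )
    return added, removed
```

One caveat of this approach: a changed line whose own text begins with `++` or `--` looks like a header once the diff prefix is added, so it is not counted.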
def handle_sandbox_mode(
title, repo_full_name, repo, ticket_progress, edit_sweep_comment
):
logger.info("Running in sandbox mode")
sweep_bot = SweepBot(repo=repo, ticket_progress=ticket_progress)
logger.info("Getting file contents")
file_name = title.split(":")[1].strip()
file_contents = sweep_bot.get_contents(file_name).decoded_content.decode("utf-8")
try:
ext = file_name.split(".")[-1]
except Exception:
ext = ""
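# NOTE: str.replace returns a new string; the result is discarded here, so file_contents stays unescaped.
file_contents.replace("```", "\\`\\`\\`")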
sha = repo.get_branch(repo.default_branch).commit.sha
permalink = f"https://github.com/{repo_full_name}/blob/{sha}/{file_name}#L1-L{len(file_contents.splitlines())}"
logger.info("Running sandbox")
edit_sweep_comment(
f"Running sandbox for {file_name}. Current Code:\n\n{permalink}",
1,
)
updated_contents, sandbox_response = sweep_bot.check_sandbox(
file_name, file_contents
)
logger.info("Sandbox finished")
logs = (
(
"<br/>"
+ create_collapsible(
"Sandbox logs",
blockquote(
"\n\n".join(
[
create_collapsible(
f"<code>{output}</code> {i + 1}/{len(sandbox_response.outputs)} {format_sandbox_success(sandbox_response.success)}",
f"<pre>{clean_logs(output)}</pre>",
i == len(sandbox_response.outputs) - 1,
)
for i, output in enumerate(sandbox_response.outputs)
if len(sandbox_response.outputs) > 0
]
)
),
opened=True,
)
)
if sandbox_response
else ""
)
updated_contents = updated_contents.replace("```", "\\`\\`\\`")
diff = generate_diff(file_contents, updated_contents).replace("```", "\\`\\`\\`")
diff_display = (
f"Updated Code:\n\n```{ext}\n{updated_contents}```\nDiff:\n```diff\n{diff}\n```"
if diff
else f"Sandbox made no changes to {file_name} (formatters were not configured or Sweep didn't make changes)."
)
edit_sweep_comment(
f"{logs}\n{diff_display}",
2,
)
edit_sweep_comment("N/A", 3)
logger.info("Sandbox comments updated")
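`handle_sandbox_mode` escapes triple backticks before embedding file contents in a Markdown comment. Because Python strings are immutable, the escaped text only exists in the value returned by `str.replace`. A minimal sketch of that step follows; `escape_code_fences` and `FENCE` are hypothetical names, not part of the project.

```python
FENCE = "`" * 3  # a Markdown code fence, built indirectly to keep this example readable


def escape_code_fences(text: str) -> str:
    """Return a copy of text with every backtick of a code fence backslash-escaped."""
    return text.replace(FENCE, "\\`" * 3)


original = "before " + FENCE + " after"
escaped = escape_code_fences(original)
# `original` is unchanged; only `escaped` carries the backslashes.
```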
def get_branch_diff_text(repo, branch, base_branch=None):
base_branch = base_branch or SweepConfig.get_branch(repo)
comparison = repo.compare(base_branch, branch)
file_diffs = comparison.files
pr_diffs = []
for file in file_diffs:
diff = file.patch
if file.status in ("added", "modified", "removed"):
pr_diffs.append((file.filename, diff))
else:
logger.info(
f"File status {file.status} not recognized"
) # TODO(sweep): We don't handle renamed files
return "\n".join([f"{filename}\n{diff}" for filename, diff in pr_diffs])
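The filtering and joining done by `get_branch_diff_text` can be exercised without a GitHub connection. This is a sketch under stated assumptions: `FileDiff` is a hypothetical stand-in for the entries of `comparison.files`, and `join_file_diffs` is an illustrative name. Unlike the original, the sketch substitutes an empty string when a patch is missing, which is a deliberate deviation.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class FileDiff:
    """Hypothetical stand-in for one entry of comparison.files."""

    filename: str
    status: str
    patch: Optional[str]


HANDLED_STATUSES = {"added", "modified", "removed"}


def join_file_diffs(files: Iterable[FileDiff]) -> str:
    """Concatenate 'filename + patch' sections for the handled statuses."""
    sections = []
    for file in files:
        if file.status not in HANDLED_STATUSES:
            continue  # e.g. "renamed", which the original logs and skips
        # Assumption: a missing patch becomes "" instead of the string "None".
        sections.append(f"{file.filename}\n{file.patch or ''}")
    return "\n".join(sections)
```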
def get_payment_messages(chat_logger: ChatLogger | None):
if chat_logger:
is_paying_user = chat_logger.is_paying_user()
is_consumer_tier = chat_logger.is_consumer_tier()
use_faster_model = chat_logger.use_faster_model()
else:
is_paying_user = True
is_consumer_tier = False
use_faster_model = False
tracking_id = (
chat_logger.data["tracking_id"]
if chat_logger and MONGODB_URI is not None
else None
)
# Monthly ticket allowance by tier
tickets_allocated = 5
if is_consumer_tier:
tickets_allocated = 15
if is_paying_user:
tickets_allocated = 500
purchased_ticket_count = (
chat_logger.get_ticket_count(purchased=True) if chat_logger else 0
)
ticket_count = (
max(tickets_allocated - chat_logger.get_ticket_count(), 0)
+ purchased_ticket_count
if chat_logger
else 999
)
daily_ticket_count = (
(3 - chat_logger.get_ticket_count(use_date=True) if not use_faster_model else 0)
if chat_logger
else 999
)
model_name = "GPT-4"
single_payment_link = "https://buy.stripe.com/00g3fh7qF85q0AE14d"
pro_payment_link = "https://buy.stripe.com/00g5npeT71H2gzCfZ8"
daily_message = (
f" and {daily_ticket_count} for the day"
if not is_paying_user and not is_consumer_tier
else ""
)
user_type = "💎 <b>Sweep Pro</b>" if is_paying_user else "⚡ <b>Sweep Basic Tier</b>"
gpt_tickets_left_message = (
f"{ticket_count} GPT-4 tickets left for the month"
if not is_paying_user
else "unlimited GPT-4 tickets"
)
purchase_message = f"<br/><br/> For more GPT-4 tickets, visit <a href={single_payment_link}>our payment portal</a>. For a one week free trial, try <a href={pro_payment_link}>Sweep Pro</a> (unlimited GPT-4 tickets)."
payment_message = (
f"{user_type}: I used {model_name} to create this ticket. You have {gpt_tickets_left_message}{daily_message}. (tracking ID: <code>{tracking_id}</code>)"
+ (purchase_message if not is_paying_user else "")
)
payment_message_start = (
f"{user_type}: I'm using {model_name}. You have {gpt_tickets_left_message}{daily_message}. (tracking ID: <code>{tracking_id}</code>)"
+ (purchase_message if not is_paying_user else "")
)

from collections import defaultdict
import copy
import traceback
from time import time
from loguru import logger
from tqdm import tqdm
import networkx as nx
from sweepai.config.client import SweepConfig, get_blocked_dirs
from sweepai.config.server import COHERE_API_KEY
from sweepai.core.context_pruning import RepoContextManager, add_relevant_files_to_top_snippets, build_import_trees, integrate_graph_retrieval
from sweepai.core.entities import Snippet
from sweepai.core.lexical_search import (
compute_vector_search_scores,
prepare_lexical_search_index,
search_index,
)
from sweepai.core.sweep_bot import context_get_files_to_change
from sweepai.logn.cache import file_cache
from sweepai.utils.chat_logger import discord_log_error
from sweepai.utils.cohere_utils import cohere_rerank_call
from sweepai.utils.event_logger import posthog
from sweepai.utils.github_utils import ClonedRepo
from sweepai.utils.multi_query import generate_multi_queries
from sweepai.utils.openai_listwise_reranker import listwise_rerank_snippets
from sweepai.utils.progress import TicketProgress
from sweepai.utils.tree_utils import DirectoryTree
"""
Input queries are in natural language so both lexical search
and vector search have a heavy bias towards natural language
files such as tests, docs and localization files. Therefore,
we add adjustment scores to compensate for this bias.
"""
prefix_adjustment = {
".": 0.5,
"doc": 0.3,
"example": 0.7,
}
suffix_adjustment = {
".cfg": 0.8,
".ini": 0.8,
".txt": 0.8,
".rst": 0.8,
".md": 0.8,
".html": 0.8,
".po": 0.5,
".json": 0.8,
".toml": 0.8,
".yaml": 0.8,
".yml": 0.8,
".1": 0.5, # man pages
".spec.ts": 0.6,
".spec.js": 0.6,
".test.ts": 0.6,
".generated.ts": 0.5,
".generated.graphql": 0.5,
".generated.js": 0.5,
"ChangeLog": 0.5,
}
substring_adjustment = {
"tests/": 0.5,
"test/": 0.5,
"/test": 0.5,
"_test": 0.5,
"egg-info": 0.5,
"LICENSE": 0.5,
}
def apply_adjustment_score(
snippet: str,
old_score: float,
):
snippet_score = old_score
file_path, *_ = snippet.rsplit(":", 1)
file_path = file_path.lower()
for prefix, adjustment in prefix_adjustment.items():
if file_path.startswith(prefix):
snippet_score *= adjustment
break
for suffix, adjustment in suffix_adjustment.items():
if file_path.endswith(suffix):
snippet_score *= adjustment
break
for substring, adjustment in substring_adjustment.items():
if substring in file_path:
snippet_score *= adjustment
break
# Penalize numbers as they are usually examples of:
# 1. Test files (e.g. test_utils_3*.py)
# 2. Generated files (from builds or snapshot tests)
# 3. Versioned files (e.g. v1.2.3)
# 4. Migration files (e.g. 2022_01_01_*.sql)
base_file_name = file_path.split("/")[-1]
num_numbers = sum(c.isdigit() for c in base_file_name)
snippet_score *= (1 - 1 / len(base_file_name)) ** num_numbers
return snippet_score
NUM_SNIPPETS_TO_RERANK = 100
@file_cache()
def multi_get_top_k_snippets(
cloned_repo: ClonedRepo,
queries: list[str],
ticket_progress: TicketProgress | None = None,
k: int = 15,
):
"""
Handles multiple queries at once now. Makes the vector search faster.
"""
sweep_config: SweepConfig = SweepConfig()
blocked_dirs = get_blocked_dirs(cloned_repo.repo)
sweep_config.exclude_dirs += blocked_dirs
_, snippets, lexical_index = prepare_lexical_search_index(
cloned_repo.cached_dir,
sweep_config,
ticket_progress,
ref_name=f"{str(cloned_repo.git_repo.head.commit.hexsha)}",
)
if ticket_progress:
ticket_progress.search_progress.indexing_progress = (
ticket_progress.search_progress.indexing_total
)
ticket_progress.save()
for snippet in snippets:
snippet.file_path = snippet.file_path[len(cloned_repo.cached_dir) + 1 :]
# We can mget the lexical search scores for all queries at once
# But it's not that slow anyways
content_to_lexical_score_list = [search_index(query, lexical_index) for query in queries]
files_to_scores_list = compute_vector_search_scores(queries, snippets)
for i, query in enumerate(queries):
for snippet in tqdm(snippets):
vector_score = files_to_scores_list[i].get(snippet.denotation, 0.04)
snippet_score = 0.02
if snippet.denotation in content_to_lexical_score_list[i]:
# roughly fine tuned vector score weight based on average score from search_eval.py on 10 test cases Feb. 13, 2024
snippet_score = content_to_lexical_score_list[i][snippet.denotation] + (
vector_score * 3.5
)
content_to_lexical_score_list[i][snippet.denotation] = snippet_score
else:
content_to_lexical_score_list[i][snippet.denotation] = snippet_score * vector_score
content_to_lexical_score_list[i][snippet.denotation] = apply_adjustment_score(
snippet.denotation, content_to_lexical_score_list[i][snippet.denotation]
)
ranked_snippets_list = [
sorted(
snippets,
key=lambda snippet: content_to_lexical_score[snippet.denotation],
reverse=True,
)[:k] for content_to_lexical_score in content_to_lexical_score_list
]
return ranked_snippets_list, snippets, content_to_lexical_score_list
@file_cache()
def get_top_k_snippets(
cloned_repo: ClonedRepo,
query: str,
ticket_progress: TicketProgress | None = None,
k: int = 15,
):
ranked_snippets_list, snippets, content_to_lexical_score_list = multi_get_top_k_snippets(
cloned_repo, [query], ticket_progress, k
)
return ranked_snippets_list[0], snippets, content_to_lexical_score_list[0]
def get_pointwise_reranked_snippet_scores(
query: str,
snippets: list[Snippet],
snippet_scores: dict[str, float],
):
"""
Ranks 1-5 snippets are frozen. They're just passed into Cohere since it helps with reranking. We multiply the scores by 1_000 to make them more significant.
Ranks 6-100 are reranked using Cohere. Then we divide the scores by 1_000 to make them comparable to the original scores.
"""
if not COHERE_API_KEY:
return snippet_scores
sorted_snippets = sorted(
snippets,
key=lambda snippet: snippet_scores[snippet.denotation],
reverse=True,
)
NUM_SNIPPETS_TO_KEEP = 5
NUM_SNIPPETS_TO_RERANK = 100
response = cohere_rerank_call(
model='rerank-english-v3.0',
query=query,
documents=[snippet.xml for snippet in sorted_snippets[:NUM_SNIPPETS_TO_RERANK]],
max_chunks_per_doc=900 // NUM_SNIPPETS_TO_RERANK,
)
new_snippet_scores = {k: v / 1000 for k, v in snippet_scores.items()}
for document in response.results:
new_snippet_scores[sorted_snippets[document.index].denotation] = apply_adjustment_score(
sorted_snippets[document.index].denotation,
document.relevance_score,
)
for snippet in sorted_snippets[:NUM_SNIPPETS_TO_KEEP]:
new_snippet_scores[snippet.denotation] = snippet_scores[snippet.denotation] * 1_000
# override score with Cohere score
for snippet in sorted_snippets[:NUM_SNIPPETS_TO_RERANK]:
if snippet.denotation in new_snippet_scores:
snippet.score = new_snippet_scores[snippet.denotation]
return new_snippet_scores
def multi_prep_snippets(
cloned_repo: ClonedRepo,
queries: list[str],
ticket_progress: TicketProgress | None = None,
k: int = 15,
skip_reranking: bool = False, # This is only for pointwise reranking
skip_pointwise_reranking: bool = False,
) -> RepoContextManager:
"""
Assume 0th index is the main query.
"""
rank_fusion_offset = 0
if len(queries) > 1:
logger.info("Using multi query...")
ranked_snippets_list, snippets, content_to_lexical_score_list = multi_get_top_k_snippets(
cloned_repo, queries, ticket_progress, k * 3 # k * 3 to have enough snippets to rerank
)
# Use RRF to rerank snippets
content_to_lexical_score = defaultdict(float)
for i, ordered_snippets in enumerate(ranked_snippets_list):
for j, snippet in enumerate(ordered_snippets):
content_to_lexical_score[snippet.denotation] += content_to_lexical_score_list[i][snippet.denotation] * (1 / 2 ** (rank_fusion_offset + j))
if not skip_pointwise_reranking:
content_to_lexical_score = get_pointwise_reranked_snippet_scores(
queries[0], snippets, content_to_lexical_score
)
ranked_snippets = sorted(
snippets,
key=lambda snippet: content_to_lexical_score[snippet.denotation],
reverse=True,
)[:k]
else:
ranked_snippets, snippets, content_to_lexical_score = get_top_k_snippets(
cloned_repo, queries[0], ticket_progress, k
)
if not skip_pointwise_reranking:
content_to_lexical_score = get_pointwise_reranked_snippet_scores(
queries[0], snippets, content_to_lexical_score
)
ranked_snippets = sorted(
snippets,
key=lambda snippet: content_to_lexical_score[snippet.denotation],
reverse=True,
)[:k]
if ticket_progress:
ticket_progress.search_progress.retrieved_snippets = ranked_snippets
ticket_progress.save()
# you can use snippet.denotation and snippet.get_snippet()
if not skip_reranking and skip_pointwise_reranking:
ranked_snippets[:NUM_SNIPPETS_TO_RERANK] = listwise_rerank_snippets(queries[0], ranked_snippets[:NUM_SNIPPETS_TO_RERANK])
snippet_paths = [snippet.file_path for snippet in ranked_snippets]
prefixes = []
for snippet_path in snippet_paths:
snippet_depth = len(snippet_path.split("/"))
for idx in range(snippet_depth): # heuristic
if idx > snippet_depth // 2:
prefixes.append("/".join(snippet_path.split("/")[:idx]) + "/")
prefixes.append(snippet_path)
# _, dir_obj = cloned_repo.list_directory_tree(
# included_directories=list(set(prefixes)),
# included_files=list(set(snippet_paths)),
# )
dir_obj = DirectoryTree() # init dummy one for now, this shouldn't be used
repo_context_manager = RepoContextManager(
dir_obj=dir_obj,
current_top_tree=str(dir_obj),
current_top_snippets=ranked_snippets,
snippets=snippets,
snippet_scores=content_to_lexical_score,
cloned_repo=cloned_repo,
)
return repo_context_manager
def prep_snippets(
cloned_repo: ClonedRepo,
query: str,
ticket_progress: TicketProgress | None = None,
k: int = 15,
skip_reranking: bool = False,
use_multi_query: bool = True,
) -> RepoContextManager:
if use_multi_query:
queries = [query, *generate_multi_queries(query)]
else:
queries = [query]
return multi_prep_snippets(
cloned_repo, queries, ticket_progress, k, skip_reranking
)
def get_relevant_context(
query: str,
repo_context_manager: RepoContextManager,
seed: int = None,
import_graph: nx.DiGraph = None,
chat_logger = None,
images = None
) -> RepoContextManager:
logger.info("Seed: " + str(seed))
repo_context_manager = build_import_trees(
repo_context_manager,
import_graph,
)
repo_context_manager = add_relevant_files_to_top_snippets(repo_context_manager)
repo_context_manager.dir_obj.add_relevant_files(
repo_context_manager.relevant_file_paths
)
relevant_files, read_only_files = context_get_files_to_change(
relevant_snippets=repo_context_manager.current_top_snippets,
read_only_snippets=repo_context_manager.read_only_snippets,
problem_statement=query,
repo_name=repo_context_manager.cloned_repo.repo_full_name,
import_graph=import_graph,
chat_logger=chat_logger,
seed=seed,
cloned_repo=repo_context_manager.cloned_repo,
images=images
)
previous_top_snippets = copy.deepcopy(repo_context_manager.current_top_snippets)
previous_read_only_snippets = copy.deepcopy(repo_context_manager.read_only_snippets)
repo_context_manager.current_top_snippets = []
repo_context_manager.read_only_snippets = []
for relevant_file in relevant_files:
try:
content = repo_context_manager.cloned_repo.get_file_contents(relevant_file)
except FileNotFoundError:
continue
snippet = Snippet(
file_path=relevant_file,
start=0,
end=len(content.split("\n")),
content=content,
)
repo_context_manager.current_top_snippets.append(snippet)
for read_only_file in read_only_files:
try:
content = repo_context_manager.cloned_repo.get_file_contents(read_only_file)
except FileNotFoundError:
continue
snippet = Snippet(
file_path=read_only_file,
start=0,
end=len(content.split("\n")),
content=content,
)
repo_context_manager.read_only_snippets.append(snippet)
if not repo_context_manager.current_top_snippets and not repo_context_manager.read_only_snippets:
repo_context_manager.current_top_snippets = copy.deepcopy(previous_top_snippets)
repo_context_manager.read_only_snippets = copy.deepcopy(previous_read_only_snippets)
return repo_context_manager
def fetch_relevant_files(
cloned_repo,
title,
summary,
replies_text,
username,
metadata,
on_ticket_start_time,
tracking_id,
is_paying_user,
is_consumer_tier,
issue_url,
chat_logger,
ticket_progress: TicketProgress,
images = None
):
logger.info("Fetching relevant files...")
try:
search_query = (title + summary + replies_text).strip("\n")
replies_text = f"\n{replies_text}" if replies_text else ""
formatted_query = (f"{title.strip()}\n{summary.strip()}" + replies_text).strip(
"\n"
)
repo_context_manager = prep_snippets(cloned_repo, search_query, ticket_progress)
repo_context_manager, import_graph = integrate_graph_retrieval(search_query, repo_context_manager)
ticket_progress.save()
repo_context_manager = get_relevant_context(
formatted_query,
repo_context_manager,
ticket_progress,
chat_logger=chat_logger,
import_graph=import_graph,
images=images
)
snippets = repo_context_manager.current_top_snippets
ticket_progress.search_progress.final_snippets = snippets
ticket_progress.save()
dir_obj = repo_context_manager.dir_obj
tree = str(dir_obj)
except Exception as e:
trace = traceback.format_exc()
logger.exception(f"{trace} (tracking ID: `{tracking_id}`)")
log_error(
is_paying_user,
is_consumer_tier,
username,
issue_url,
"File Fetch",
str(e) + "\n" + traceback.format_exc(),
priority=1,
)
posthog.capture(
username,
"failed",
properties={
**metadata,
"error": str(e),
"duration": time() - on_ticket_start_time,
},
)
raise e
return snippets, tree, dir_obj, repo_context_manager
SLOW_MODE = False
SLOW_MODE = True
def log_error(
is_paying_user,
is_trial_user,
username,
issue_url,
error_type,
exception,
priority=0,
):
if is_paying_user or is_trial_user:
if priority == 1:
priority = 0
elif priority == 2:
priority = 1
prefix = ""
if is_trial_user:
prefix = " (TRIAL)"
if is_paying_user:
prefix = " (PRO)"
content = (
f"**{error_type} Error**{prefix}\n{username}:"
f" {issue_url}\n```{exception}```"
)
discord_log_error(content, priority=2)
def center(text: str) -> str:
return f"<div align='center'>{text}</div>"
def fire_and_forget_wrapper(call):
"""
This decorator is used to run a function in a separate thread.
It does not return anything and does not wait for the function to finish.
It fails silently.
"""
def wrapper(*args, **kwargs):
try:
return call(*args, **kwargs)
except Exception:
pass
# def run_in_thread(call, *a, **kw):
# try:
# call(*a, **kw)
# except:
# pass
# thread = Thread(target=run_in_thread, args=(call,) + args, kwargs=kwargs)
# thread.start()
return wrapper
if __name__ == "__main__":
from sweepai.utils.github_utils import MockClonedRepo
cloned_repo = MockClonedRepo(
_repo_dir="/tmp/sweep",
repo_full_name="sweepai/sweep",
)
cloned_repo = MockClonedRepo(
_repo_dir="/tmp/pulse-alp",
repo_full_name="trilogy-group/pulse-alp",
)
rcm = prep_snippets(
cloned_repo,
# "I am trying to set up payment processing in my app using Stripe, but I keep getting a 400 error when I try to create a payment intent. I have checked the API key and the request body, but I can't figure out what's wrong. Here is the error message I'm getting: 'Invalid request: request parameters are invalid'. I have attached the relevant code snippets below. Can you help me find the part of the code that is causing this error?",
"Where can I find the section that checks if assembly line workers are active or disabled?",
use_multi_query=False,
skip_reranking=True


Step 2: ⌨️ Coding

Modify sweepai/config/server.py with contents: Add a new environment variable for the user's Slack API key.

<original_code>
RESEND_API_KEY = os.environ.get("RESEND_API_KEY", None)
LOKI_URL = None

DEBUG = os.environ.get("DEBUG", "false").lower() == "true"
ENV = "prod" if GITHUB_BOT_USERNAME != TEST_BOT_NAME else "dev"

PROGRESS_BASE_URL = os.environ.get(
"PROGRESS_BASE_URL", "https://progress.sweep.dev"
).rstrip("/")

DISABLED_REPOS = os.environ.get("DISABLED_REPOS", "").split(",")

GHA_AUTOFIX_ENABLED: bool = os.environ.get("GHA_AUTOFIX_ENABLED", False)
MERGE_CONFLICT_ENABLED: bool = os.environ.get("MERGE_CONFLICT_ENABLED", False)
INSTALLATION_ID = os.environ.get("INSTALLATION_ID", None)

AWS_ACCESS_KEY=os.environ.get("AWS_ACCESS_KEY")
AWS_SECRET_KEY=os.environ.get("AWS_SECRET_KEY")
AWS_REGION=os.environ.get("AWS_REGION")
ANTHROPIC_AVAILABLE = AWS_ACCESS_KEY and AWS_SECRET_KEY and AWS_REGION
</original_code>

<new_code>
RESEND_API_KEY = os.environ.get("RESEND_API_KEY", None)
LOKI_URL = None

DEBUG = os.environ.get("DEBUG", "false").lower() == "true"
ENV = "prod" if GITHUB_BOT_USERNAME != TEST_BOT_NAME else "dev"

PROGRESS_BASE_URL = os.environ.get(
"PROGRESS_BASE_URL", "https://progress.sweep.dev"
).rstrip("/")

DISABLED_REPOS = os.environ.get("DISABLED_REPOS", "").split(",")

GHA_AUTOFIX_ENABLED: bool = os.environ.get("GHA_AUTOFIX_ENABLED", False)
MERGE_CONFLICT_ENABLED: bool = os.environ.get("MERGE_CONFLICT_ENABLED", False)
INSTALLATION_ID = os.environ.get("INSTALLATION_ID", None)

AWS_ACCESS_KEY=os.environ.get("AWS_ACCESS_KEY")
AWS_SECRET_KEY=os.environ.get("AWS_SECRET_KEY")
AWS_REGION=os.environ.get("AWS_REGION")
ANTHROPIC_AVAILABLE = AWS_ACCESS_KEY and AWS_SECRET_KEY and AWS_REGION

SLACK_API_KEY = os.environ.get("SLACK_API_KEY", None)
</new_code>
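
Because `SLACK_API_KEY` defaults to `None`, anything that depends on it has to treat the key as optional. The sketch below is one way to read such a variable, assuming that a blank or whitespace-only value should count as "not configured". The helper names `load_slack_api_key` and `slack_unrolling_enabled` are hypothetical and not part of the Sweep codebase.

```python
import os


def load_slack_api_key(env=os.environ):
    """Return the Slack token from the environment, or None when unset or blank.

    Hypothetical helper: the name and the blank-value handling are assumptions.
    """
    value = env.get("SLACK_API_KEY")
    if value is None:
        return None
    value = value.strip()
    # Treat an empty string the same as a missing variable.
    return value or None


def slack_unrolling_enabled(env=os.environ):
    """Feature gate: only try to unroll Slack threads when a token is configured."""
    return load_slack_api_key(env) is not None
```

Passing the environment as a parameter keeps the helper easy to exercise with a plain dictionary instead of mutating `os.environ`.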

Modify sweepai/handlers/on_ticket.py with contents:

Import the slack_sdk package and add the SLACK_API_KEY import.

<original_code>
"""
on_ticket is the main function that is called when a new issue is created.
It is only called by the webhook handler in sweepai/api.py.
"""

import difflib
import io
import os
import re
import traceback
from typing import Any
import zipfile
from time import time

import markdown
import openai
import requests
import yaml
import yamllint.config as yamllint_config
from github import BadCredentialsException, Github, Repository
from github.Issue import Issue
from github.PullRequest import PullRequest as GithubPullRequest
from loguru import logger
from tabulate import tabulate
from tqdm import tqdm
from yamllint import linter

from sweepai.core.sweep_bot import GHA_PROMPT
from sweepai.agents.pr_description_bot import PRDescriptionBot
from sweepai.agents.image_description_bot import ImageDescriptionBot
from sweepai.config.client import (
RESET_FILE,
RESTART_SWEEP_BUTTON,
REVERT_CHANGED_FILES_TITLE,
SweepConfig,
get_documentation_dict,
get_gha_enabled,
)
from sweepai.config.server import (
DEPLOYMENT_GHA_ENABLED,
ENV,
GITHUB_LABEL_NAME,
IS_SELF_HOSTED,
MONGODB_URI,
PROGRESS_BASE_URL,
)
</original_code>

<new_code>
"""
on_ticket is the main function that is called when a new issue is created.
It is only called by the webhook handler in sweepai/api.py.
"""

import difflib
import io
import os
import re
import traceback
from typing import Any
import zipfile
from time import time

import markdown
import openai
import requests
import yaml
import yamllint.config as yamllint_config
from github import BadCredentialsException, Github, Repository
from github.Issue import Issue
from github.PullRequest import PullRequest as GithubPullRequest
from loguru import logger
from tabulate import tabulate
from tqdm import tqdm
from yamllint import linter
from slack_sdk import WebClient

from sweepai.core.sweep_bot import GHA_PROMPT
from sweepai.agents.pr_description_bot import PRDescriptionBot
from sweepai.agents.image_description_bot import ImageDescriptionBot
from sweepai.config.client import (
RESET_FILE,
RESTART_SWEEP_BUTTON,
REVERT_CHANGED_FILES_TITLE,
SweepConfig,
get_documentation_dict,
get_gha_enabled,
)
from sweepai.config.server import (
DEPLOYMENT_GHA_ENABLED,
ENV,
GITHUB_LABEL_NAME,
IS_SELF_HOSTED,
MONGODB_URI,
PROGRESS_BASE_URL,
SLACK_API_KEY,
)
</new_code>

Modify sweepai/handlers/on_ticket.py with contents: Add logic to check for Slack links, authenticate, fetch the thread, and integrate the information into the issue summary.

<original_code>
summary = summary or ""
summary = re.sub(
"<details (open)?>(\r)?\nChecklist.",
"",
summary,
flags=re.DOTALL,
).strip()
summary = re.sub(
"---\s+Checklist:(\r)?\n(\r)?\n- [[ X]].",
"",
summary,
flags=re.DOTALL,
).strip()
summary = re.sub(
"### Details\n\n_No response_", "", summary, flags=re.DOTALL
)
summary = re.sub("\n\n", "\n", summary, flags=re.DOTALL)
repo_name = repo_full_name
user_token, g = get_github_client(installation_id)
repo = g.get_repo(repo_full_name)
current_issue: Issue = repo.get_issue(number=issue_number)
assignee = current_issue.assignee.login if current_issue.assignee else None
if assignee is None:
assignee = current_issue.user.login
</original_code>

<new_code>
summary = summary or ""
summary = re.sub(
"<details (open)?>(\r)?\nChecklist.",
"",
summary,
flags=re.DOTALL,
).strip()
summary = re.sub(
"---\s+Checklist:(\r)?\n(\r)?\n- [[ X]].",
"",
summary,
flags=re.DOTALL,
).strip()
summary = re.sub(
"### Details\n\n_No response_", "", summary, flags=re.DOTALL
)
summary = re.sub("\n\n", "\n", summary, flags=re.DOTALL)

    # Check if the issue summary contains a Slack link
    slack_link_match = re.search(r'https://\S+\.slack\.com/archives/\S+/p\d+', summary)
    if slack_link_match and SLACK_API_KEY:
        # Authenticate to Slack using the provided API key
        slack_client = WebClient(token=SLACK_API_KEY)
        
        # Extract the Slack link from the match
        slack_link = slack_link_match.group()
        
        # Fetch the Slack message thread
        channel_id = slack_link.split('/')[4]
        message_ts = slack_link.split('/')[-1][1:]
        
        try:
            result = slack_client.conversations_replies(channel=channel_id, ts=message_ts)
            
            # Extract the relevant information from the Slack thread
            slack_messages = result['messages']
            slack_thread_info = '\n'.join([msg['text'] for msg in slack_messages])
            
            # Append the Slack thread information to the issue summary
            summary += f'\n\nSlack Thread:\n{slack_thread_info}'
        except Exception as e:
            logger.warning(f'Failed to fetch Slack thread: {e}')
    
    repo_name = repo_full_name
    user_token, g = get_github_client(installation_id)
    repo = g.get_repo(repo_full_name)
    current_issue: Issue = repo.get_issue(number=issue_number)
    assignee = current_issue.assignee.login if current_issue.assignee else None
    if assignee is None:
        assignee = current_issue.user.login

</new_code>
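
One detail worth checking in a plan like this is how the permalink maps onto the arguments of `conversations_replies`. In a Slack permalink the final path segment is the message timestamp with the dot removed (`p1714600000123456` corresponds to `1714600000.123456`), while the Web API expects the dotted form. The sketch below shows one way to do that conversion under stated assumptions: the function name `parse_slack_permalink`, the exact regular expression, and the choice to prefer a `thread_ts` query parameter when the link points at a reply are all illustrative and not taken from the Sweep codebase.

```python
import re
from typing import Optional, Tuple

# Assumed permalink shape: https://<workspace>.slack.com/archives/<CHANNEL>/p<digits>[?query]
SLACK_PERMALINK_RE = re.compile(
    r"https://[\w.-]+\.slack\.com/archives/(?P<channel>[A-Z0-9]+)/p(?P<raw_ts>\d{7,})"
    r"(?:\?(?P<query>[^\s>)]*))?"
)


def parse_slack_permalink(text: str) -> Optional[Tuple[str, str]]:
    """Return (channel_id, thread_ts) for the first Slack permalink in text, or None."""
    match = SLACK_PERMALINK_RE.search(text)
    if match is None:
        return None
    raw_ts = match.group("raw_ts")
    # Re-insert the dot before the last six digits (the microsecond part).
    message_ts = f"{raw_ts[:-6]}.{raw_ts[-6:]}"
    # Links to a reply carry the parent message's timestamp in the query string.
    query = match.group("query") or ""
    thread_match = re.search(r"(?:^|&)thread_ts=(\d+\.\d+)", query)
    thread_ts = thread_match.group(1) if thread_match else message_ts
    return match.group("channel"), thread_ts
```

The returned pair could then be passed as `channel` and `ts` to `slack_client.conversations_replies`. Restricting the channel segment to `[A-Z0-9]+` also avoids the greedy `\S+` match swallowing extra path segments.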


Step 3: 🔁 Code Review

I have finished reviewing the code for completeness. I did not find errors for sweep/augment_on_ticket_so_that_when_a_user_ad.



💡 To recreate the pull request, edit the issue title or description.
Something wrong? Let us know.

This is an automated message generated by Sweep AI.

Contributor

sweep-nightly bot commented May 2, 2024



❌ Unable to Complete PR

I'm sorry, but it looks like an error has occurred due to a planning failure. The error message is 401 {"message": "Bad credentials", "documentation_url": "https://docs.github.com/rest"}. Feel free to add more details to the issue description so Sweep can better address it. Alternatively, reach out to Kevin or William for help at https://discord.gg/sweep.

For bonus GPT-4 tickets, please report this bug on Discord (tracking ID: 406bb96ad2).


Please look at the generated plan. If something looks wrong, please add more details to your issue.

File Path Proposed Changes
sweepai/config/server.py Modify sweepai/config/server.py with contents:
Add a new environment variable for the user's Slack API key.

<original_code>
RESEND_API_KEY = os.environ.get("RESEND_API_KEY", None)
LOKI_URL = None

DEBUG = os.environ.get("DEBUG", "false").lower() == "true"
ENV = "prod" if GITHUB_BOT_USERNAME != TEST_BOT_NAME else "dev"

PROGRESS_BASE_URL = os.environ.get(
"PROGRESS_BASE_URL", "https://progress.sweep.dev"
).rstrip("/")

DISABLED_REPOS = os.environ.get("DISABLED_REPOS", "").split(",")

GHA_AUTOFIX_ENABLED: bool = os.environ.get("GHA_AUTOFIX_ENABLED", False)
MERGE_CONFLICT_ENABLED: bool = os.environ.get("MERGE_CONFLICT_ENABLED", False)
INSTALLATION_ID = os.environ.get("INSTALLATION_ID", None)

AWS_ACCESS_KEY=os.environ.get("AWS_ACCESS_KEY")
AWS_SECRET_KEY=os.environ.get("AWS_SECRET_KEY")
AWS_REGION=os.environ.get("AWS_REGION")
ANTHROPIC_AVAILABLE = AWS_ACCESS_KEY and AWS_SECRET_KEY and AWS_REGION
</original_code>

<new_code>
RESEND_API_KEY = os.environ.get("RESEND_API_KEY", None)
LOKI_URL = None

DEBUG = os.environ.get("DEBUG", "false").lower() == "true"
ENV = "prod" if GITHUB_BOT_USERNAME != TEST_BOT_NAME else "dev"

PROGRESS_BASE_URL = os.environ.get(
"PROGRESS_BASE_URL", "https://progress.sweep.dev"
).rstrip("/")

DISABLED_REPOS = os.environ.get("DISABLED_REPOS", "").split(",")

GHA_AUTOFIX_ENABLED: bool = os.environ.get("GHA_AUTOFIX_ENABLED", False)
MERGE_CONFLICT_ENABLED: bool = os.environ.get("MERGE_CONFLICT_ENABLED", False)
INSTALLATION_ID = os.environ.get("INSTALLATION_ID", None)

AWS_ACCESS_KEY=os.environ.get("AWS_ACCESS_KEY")
AWS_SECRET_KEY=os.environ.get("AWS_SECRET_KEY")
AWS_REGION=os.environ.get("AWS_REGION")
ANTHROPIC_AVAILABLE = AWS_ACCESS_KEY and AWS_SECRET_KEY and AWS_REGION

SLACK_API_KEY = os.environ.get("SLACK_API_KEY", None)
</new_code>
sweepai/handlers/on_ticket.py Modify sweepai/handlers/on_ticket.py with contents:

Import the slack_sdk package and add the SLACK_API_KEY import.

<original_code>
"""
on_ticket is the main function that is called when a new issue is created.
It is only called by the webhook handler in sweepai/api.py.
"""

import difflib
import io
import os
import re
import traceback
from typing import Any
import zipfile
from time import time

import markdown
import openai
import requests
import yaml
import yamllint.config as yamllint_config
from github import BadCredentialsException, Github, Repository
from github.Issue import Issue
from github.PullRequest import PullRequest as GithubPullRequest
from loguru import logger
from tabulate import tabulate
from tqdm import tqdm
from yamllint import linter

from sweepai.core.sweep_bot import GHA_PROMPT
from sweepai.agents.pr_description_bot import PRDescriptionBot
from sweepai.agents.image_description_bot import ImageDescriptionBot
from sweepai.config.client import (
RESET_FILE,
RESTART_SWEEP_BUTTON,
REVERT_CHANGED_FILES_TITLE,
SweepConfig,
get_documentation_dict,
get_gha_enabled,
)
from sweepai.config.server import (
DEPLOYMENT_GHA_ENABLED,
ENV,
GITHUB_LABEL_NAME,
IS_SELF_HOSTED,
MONGODB_URI,
PROGRESS_BASE_URL,
)
</original_code>

<new_code>
"""
on_ticket is the main function that is called when a new issue is created.
It is only called by the webhook handler in sweepai/api.py.
"""

import difflib
import io
import os
import re
import traceback
from typing import Any
import zipfile
from time import time

import markdown
import openai
import requests
import yaml
import yamllint.config as yamllint_config
from github import BadCredentialsException, Github, Repository
from github.Issue import Issue
from github.PullRequest import PullRequest as GithubPullRequest
from loguru import logger
from tabulate import tabulate
from tqdm import tqdm
from yamllint import linter
from slack_sdk import WebClient

from sweepai.core.sweep_bot import GHA_PROMPT
from sweepai.agents.pr_description_bot import PRDescriptionBot
from sweepai.agents.image_description_bot import ImageDescriptionBot
from sweepai.config.client import (
RESET_FILE,
RESTART_SWEEP_BUTTON,
REVERT_CHANGED_FILES_TITLE,
SweepConfig,
get_documentation_dict,
get_gha_enabled,
)
from sweepai.config.server import (
DEPLOYMENT_GHA_ENABLED,
ENV,
GITHUB_LABEL_NAME,
IS_SELF_HOSTED,
MONGODB_URI,
PROGRESS_BASE_URL,
SLACK_API_KEY,
)
</new_code>
sweepai/handlers/on_ticket.py Modify sweepai/handlers/on_ticket.py with contents:
Add logic to check for Slack links, authenticate, fetch the thread, and integrate the information into the issue summary.

<original_code>
summary = summary or ""
summary = re.sub(
"<details (open)?>(\r)?\nChecklist.",
"",
summary,
flags=re.DOTALL,
).strip()
summary = re.sub(
"---\s+Checklist:(\r)?\n(\r)?\n- [[ X]].",
"",
summary,
flags=re.DOTALL,
).strip()
summary = re.sub(
"### Details\n\n_No response_", "", summary, flags=re.DOTALL
)
summary = re.sub("\n\n", "\n", summary, flags=re.DOTALL)
repo_name = repo_full_name
user_token, g = get_github_client(installation_id)
repo = g.get_repo(repo_full_name)
current_issue: Issue = repo.get_issue(number=issue_number)
assignee = current_issue.assignee.login if current_issue.assignee else None
if assignee is None:
assignee = current_issue.user.login
</original_code>

<new_code>
summary = summary or ""
summary = re.sub(
"<details (open)?>(\r)?\nChecklist.",
"",
summary,
flags=re.DOTALL,
).strip()
summary = re.sub(
"---\s+Checklist:(\r)?\n(\r)?\n- [[ X]].",
"",
summary,
flags=re.DOTALL,
).strip()
summary = re.sub(
"### Details\n\n_No response_", "", summary, flags=re.DOTALL
)
summary = re.sub("\n\n", "\n", summary, flags=re.DOTALL)

    # Check if the issue summary contains a Slack link
    slack_link_match = re.search(r'https://\S+\.slack\.com/archives/\S+/p\d+', summary)
    if slack_link_match and SLACK_API_KEY:
        # Authenticate to Slack using the provided API key
        slack_client = WebClient(token=SLACK_API_KEY)

        # Extract the Slack link from the match
        slack_link = slack_link_match.group()

        # Fetch the Slack message thread
        channel_id = slack_link.split('/')[4]
        message_ts = slack_link.split('/')[-1][1:]

        try:
            result = slack_client.conversations_replies(channel=channel_id, ts=message_ts)

            # Extract the relevant information from the Slack thread
            slack_messages = result['messages']
            slack_thread_info = '\n'.join([msg['text'] for msg in slack_messages])

            # Append the Slack thread information to the issue summary
            summary += f'\n\nSlack Thread:\n{slack_thread_info}'
        except Exception as e:
            logger.warning(f'Failed to fetch Slack thread: {e}')

    repo_name = repo_full_name
    user_token, g = get_github_client(installation_id)
    repo = g.get_repo(repo_full_name)
    current_issue: Issue = repo.get_issue(number=issue_number)
    assignee = current_issue.assignee.login if current_issue.assignee else None
    if assignee is None:
        assignee = current_issue.user.login
</new_code>
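
The Slack lookup in the proposed change hinges on turning a permalink into a channel ID and a message timestamp. Below is a minimal sketch of that step, assuming the usual permalink shape `https://<workspace>.slack.com/archives/<channel>/p<digits>`. The name `parse_slack_permalink` and the regex are illustrative assumptions, not part of Sweep or the Slack SDK.

```python
import re
from typing import Optional
from urllib.parse import parse_qs

# Assumed permalink shape: https://<workspace>.slack.com/archives/<channel>/p<digits>
SLACK_PERMALINK_RE = re.compile(
    r"https://[\w.-]+\.slack\.com/archives/(?P<channel>[A-Z0-9]+)"
    r"/p(?P<ts>\d{7,})(?P<query>\?[^\s)>\]]*)?"
)


def parse_slack_permalink(text: str) -> Optional[tuple[str, str]]:
    """Return (channel_id, thread_ts) for the first Slack permalink in text, or None."""
    match = SLACK_PERMALINK_RE.search(text)
    if match is None:
        return None
    digits = match.group("ts")
    # The permalink drops the decimal point; the last six digits are microseconds.
    message_ts = f"{digits[:-6]}.{digits[-6:]}"
    # A link to a reply carries the parent message's timestamp in ?thread_ts=...
    query = parse_qs((match.group("query") or "").lstrip("?"))
    thread_ts = query.get("thread_ts", [message_ts])[0]
    return match.group("channel"), thread_ts
```

The returned pair could then be passed as `channel` and `ts` to `conversations_replies`. Preferring `thread_ts` when present means a link to a reply should still unroll the whole thread rather than a single message.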

💡 To recreate the pull request, edit the issue title or description.

sweep-nightly bot commented May 2, 2024

🚀 Here's the PR! #3664

See Sweep's progress at the progress dashboard!
💎 Sweep Pro: I'm using GPT-4. You have unlimited GPT-4 tickets. (tracking ID: b1144b23a8)

Step 1: 🔎 Searching

I found the following snippets in your repository. I will now analyze these snippets and come up with a plan.

Some code snippets I think are relevant, in decreasing order of relevance. If a file is missing from here, you can mention its path in the ticket description.

import base64
import os
from dotenv import load_dotenv
from loguru import logger
logger.print = logger.info
load_dotenv(dotenv_path=".env", override=True, verbose=True)
os.environ["GITHUB_APP_PEM"] = os.environ.get("GITHUB_APP_PEM") or base64.b64decode(
os.environ.get("GITHUB_APP_PEM_BASE64", "")
).decode("utf-8")
if os.environ["GITHUB_APP_PEM"]:
os.environ["GITHUB_APP_ID"] = (
(os.environ.get("GITHUB_APP_ID") or os.environ.get("APP_ID"))
.replace("\\n", "\n")
.strip('"')
)
TEST_BOT_NAME = "sweep-nightly[bot]"
ENV = os.environ.get("ENV", "dev")
BOT_TOKEN_NAME = "bot-token"
# goes under Modal 'discord' secret name (optional, can leave env var blank)
DISCORD_WEBHOOK_URL = os.environ.get("DISCORD_WEBHOOK_URL")
DISCORD_MEDIUM_PRIORITY_URL = os.environ.get("DISCORD_MEDIUM_PRIORITY_URL")
DISCORD_LOW_PRIORITY_URL = os.environ.get("DISCORD_LOW_PRIORITY_URL")
DISCORD_FEEDBACK_WEBHOOK_URL = os.environ.get("DISCORD_FEEDBACK_WEBHOOK_URL")
SWEEP_HEALTH_URL = os.environ.get("SWEEP_HEALTH_URL")
DISCORD_STATUS_WEBHOOK_URL = os.environ.get("DISCORD_STATUS_WEBHOOK_URL")
# goes under Modal 'github' secret name
GITHUB_APP_ID = os.environ.get("GITHUB_APP_ID", os.environ.get("APP_ID"))
# deprecated: old logic transfer so upstream can use this
if GITHUB_APP_ID is None:
if ENV == "prod":
GITHUB_APP_ID = "307814"
elif ENV == "dev":
GITHUB_APP_ID = "324098"
elif ENV == "staging":
GITHUB_APP_ID = "327588"
GITHUB_BOT_USERNAME = os.environ.get("GITHUB_BOT_USERNAME")
# deprecated: left to support old logic
if not GITHUB_BOT_USERNAME:
if ENV == "prod":
GITHUB_BOT_USERNAME = "sweep-ai[bot]"
elif ENV == "dev":
GITHUB_BOT_USERNAME = "sweep-nightly[bot]"
elif ENV == "staging":
GITHUB_BOT_USERNAME = "sweep-canary[bot]"
elif not GITHUB_BOT_USERNAME.endswith("[bot]"):
GITHUB_BOT_USERNAME = GITHUB_BOT_USERNAME + "[bot]"
GITHUB_LABEL_NAME = os.environ.get("GITHUB_LABEL_NAME", "sweep")
GITHUB_LABEL_COLOR = os.environ.get("GITHUB_LABEL_COLOR", "9400D3")
GITHUB_LABEL_DESCRIPTION = os.environ.get(
"GITHUB_LABEL_DESCRIPTION", "Sweep your software chores"
)
GITHUB_APP_PEM = os.environ.get("GITHUB_APP_PEM")
GITHUB_APP_PEM = GITHUB_APP_PEM or os.environ.get("PRIVATE_KEY")
if GITHUB_APP_PEM is not None:
GITHUB_APP_PEM = GITHUB_APP_PEM.strip(' \n"') # Remove whitespace and quotes
GITHUB_APP_PEM = GITHUB_APP_PEM.replace("\\n", "\n")
GITHUB_CONFIG_BRANCH = os.environ.get("GITHUB_CONFIG_BRANCH", "sweep/add-sweep-config")
GITHUB_DEFAULT_CONFIG = os.environ.get(
"GITHUB_DEFAULT_CONFIG",
"""# Sweep AI turns bugs & feature requests into code changes (https://sweep.dev)
# For details on our config file, check out our docs at https://docs.sweep.dev/usage/config
# This setting contains a list of rules that Sweep will check for. If any of these rules are broken in a new commit, Sweep will create a pull request to fix the broken rule.
rules:
{additional_rules}
# This is the branch that Sweep will develop from and make pull requests to. Most people use 'main' or 'master' but some users also use 'dev' or 'staging'.
branch: 'main'
# By default Sweep will read the logs and outputs from your existing Github Actions. To disable this, set this to false.
gha_enabled: True
# This is the description of your project. It will be used by sweep when creating PRs. You can tell Sweep what's unique about your project, what frameworks you use, or anything else you want.
#
# Example:
#
# description: sweepai/sweep is a python project. The main api endpoints are in sweepai/api.py. Write code that adheres to PEP8.
description: ''
# This sets whether to create pull requests as drafts. If this is set to True, then all pull requests will be created as drafts and GitHub Actions will not be triggered.
draft: False
# This is a list of directories that Sweep will not be able to edit.
blocked_dirs: []
""",
)
MONGODB_URI = os.environ.get("MONGODB_URI", None)
IS_SELF_HOSTED = os.environ.get("IS_SELF_HOSTED", "true").lower() == "true"
REDIS_URL = os.environ.get("REDIS_URL")
if not REDIS_URL:
REDIS_URL = os.environ.get("redis_url", "redis://0.0.0.0:6379/0")
ORG_ID = os.environ.get("ORG_ID", None)
POSTHOG_API_KEY = os.environ.get(
"POSTHOG_API_KEY", "phc_CnzwIB0W548wN4wEGeRuxXqidOlEUH2AcyV2sKTku8n"
)
SUPPORT_COUNTRY = os.environ.get("GDRP_LIST", "").split(",")
WHITELISTED_REPOS = os.environ.get("WHITELISTED_REPOS", "").split(",")
BLACKLISTED_USERS = os.environ.get("BLACKLISTED_USERS", "").split(",")
# Default OpenAI
OPENAI_API_KEY = os.environ.get("OPENAI_API_KEY", None) # this may be none, and it will use azure
OPENAI_API_TYPE = os.environ.get("OPENAI_API_TYPE", "anthropic")
assert OPENAI_API_TYPE in ["anthropic", "azure", "openai"], "Invalid OPENAI_API_TYPE"
OPENAI_EMBEDDINGS_API_TYPE = os.environ.get("OPENAI_EMBEDDINGS_API_TYPE", "openai")
AZURE_API_KEY = os.environ.get("AZURE_API_KEY", None)
OPENAI_API_BASE = os.environ.get("OPENAI_API_BASE", None)
OPENAI_API_VERSION = os.environ.get("OPENAI_API_VERSION", None)
AZURE_OPENAI_DEPLOYMENT = os.environ.get("AZURE_OPENAI_DEPLOYMENT", None)
OPENAI_EMBEDDINGS_API_TYPE = os.environ.get("OPENAI_EMBEDDINGS_API_TYPE", "openai")
OPENAI_EMBEDDINGS_AZURE_ENDPOINT = os.environ.get(
"OPENAI_EMBEDDINGS_AZURE_ENDPOINT", None
)
OPENAI_EMBEDDINGS_AZURE_DEPLOYMENT = os.environ.get(
"OPENAI_EMBEDDINGS_AZURE_DEPLOYMENT", None
)
OPENAI_EMBEDDINGS_AZURE_API_VERSION = os.environ.get(
"OPENAI_EMBEDDINGS_AZURE_API_VERSION", None
)
OPENAI_API_ENGINE_GPT35 = os.environ.get("OPENAI_API_ENGINE_GPT35", None)
OPENAI_API_ENGINE_GPT4 = os.environ.get("OPENAI_API_ENGINE_GPT4", None)
MULTI_REGION_CONFIG = os.environ.get("MULTI_REGION_CONFIG", None)
if isinstance(MULTI_REGION_CONFIG, str):
MULTI_REGION_CONFIG = MULTI_REGION_CONFIG.strip("'").replace("\\n", "\n")
MULTI_REGION_CONFIG = [item.split(",") for item in MULTI_REGION_CONFIG.split("\n")]
WHITELISTED_USERS = os.environ.get("WHITELISTED_USERS", None)
if WHITELISTED_USERS:
WHITELISTED_USERS = WHITELISTED_USERS.split(",")
WHITELISTED_USERS.append(GITHUB_BOT_USERNAME)
DEFAULT_GPT4_MODEL = os.environ.get("DEFAULT_GPT4_MODEL", "gpt-4-0125-preview")
RESEND_API_KEY = os.environ.get("RESEND_API_KEY", None)
LOKI_URL = None
DEBUG = os.environ.get("DEBUG", "false").lower() == "true"
ENV = "prod" if GITHUB_BOT_USERNAME != TEST_BOT_NAME else "dev"
PROGRESS_BASE_URL = os.environ.get(
"PROGRESS_BASE_URL", "https://progress.sweep.dev"
).rstrip("/")
DISABLED_REPOS = os.environ.get("DISABLED_REPOS", "").split(",")
GHA_AUTOFIX_ENABLED: bool = os.environ.get("GHA_AUTOFIX_ENABLED", False)
MERGE_CONFLICT_ENABLED: bool = os.environ.get("MERGE_CONFLICT_ENABLED", False)
INSTALLATION_ID = os.environ.get("INSTALLATION_ID", None)
AWS_ACCESS_KEY=os.environ.get("AWS_ACCESS_KEY")
AWS_SECRET_KEY=os.environ.get("AWS_SECRET_KEY")
AWS_REGION=os.environ.get("AWS_REGION")
ANTHROPIC_AVAILABLE = AWS_ACCESS_KEY and AWS_SECRET_KEY and AWS_REGION
ANTHROPIC_API_KEY = os.environ.get("ANTHROPIC_API_KEY", None)
COHERE_API_KEY = os.environ.get("COHERE_API_KEY", None)
VOYAGE_API_KEY = os.environ.get("VOYAGE_API_KEY", None)
VOYAGE_API_AWS_ACCESS_KEY=os.environ.get("VOYAGE_API_AWS_ACCESS_KEY_ID")
VOYAGE_API_AWS_SECRET_KEY=os.environ.get("VOYAGE_API_AWS_SECRET_KEY")
VOYAGE_API_AWS_REGION=os.environ.get("VOYAGE_API_AWS_REGION")
VOYAGE_API_AWS_ENDPOINT_NAME=os.environ.get("VOYAGE_API_AWS_ENDPOINT_NAME", "voyage-code-2")
VOYAGE_API_USE_AWS = VOYAGE_API_AWS_ACCESS_KEY and VOYAGE_API_AWS_SECRET_KEY and VOYAGE_API_AWS_REGION
PAREA_API_KEY = os.environ.get("PAREA_API_KEY", None)
# TODO: we need to make this dynamic + backoff
BATCH_SIZE = int(
os.environ.get("BATCH_SIZE", 64 if VOYAGE_API_KEY else 256) # Voyage only allows 128 items per batch and 120000 tokens per batch
)
DEPLOYMENT_GHA_ENABLED = os.environ.get("DEPLOYMENT_GHA_ENABLED", "true").lower() == "true"
JIRA_USER_NAME = os.environ.get("JIRA_USER_NAME", None)
JIRA_API_TOKEN = os.environ.get("JIRA_API_TOKEN", None)
JIRA_URL = os.environ.get("JIRA_URL", None)
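
The ticket asks for the Slack API key to be supplied through server.py. Here is a sketch of how that setting might sit next to the Jira entries above, assuming an environment variable named `SLACK_API_KEY` (the name used in the proposed change). The helper `get_optional_secret` is hypothetical and does not exist in the repository.

```python
import os


def get_optional_secret(name: str):
    """Read an optional secret from the environment; blank values count as unset."""
    value = os.environ.get(name, "").strip().strip('"')
    return value or None


# Assumed variable name; None when unset, which would leave Slack unrolling disabled.
SLACK_API_KEY = get_optional_secret("SLACK_API_KEY")
```

Treating a blank value as unset keeps a check such as `if slack_link_match and SLACK_API_KEY:` from attempting a Slack call with an empty token.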

"""
on_ticket is the main function that is called when a new issue is created.
It is only called by the webhook handler in sweepai/api.py.
"""
import difflib
import io
import os
import re
import traceback
from typing import Any
import zipfile
from time import time
import markdown
import openai
import requests
import yaml
import yamllint.config as yamllint_config
from github import BadCredentialsException, Github, Repository
from github.Issue import Issue
from github.PullRequest import PullRequest as GithubPullRequest
from loguru import logger
from tabulate import tabulate
from tqdm import tqdm
from yamllint import linter
from sweepai.core.sweep_bot import GHA_PROMPT
from sweepai.agents.pr_description_bot import PRDescriptionBot
from sweepai.agents.image_description_bot import ImageDescriptionBot
from sweepai.config.client import (
RESET_FILE,
RESTART_SWEEP_BUTTON,
REVERT_CHANGED_FILES_TITLE,
SweepConfig,
get_documentation_dict,
get_gha_enabled,
)
from sweepai.config.server import (
DEPLOYMENT_GHA_ENABLED,
ENV,
GITHUB_LABEL_NAME,
IS_SELF_HOSTED,
MONGODB_URI,
PROGRESS_BASE_URL,
)
from sweepai.core.entities import (
AssistantRaisedException,
FileChangeRequest,
MaxTokensExceeded,
NoFilesException,
PullRequest,
SandboxResponse,
)
from sweepai.core.entities import create_error_logs as entities_create_error_logs
from sweepai.core.pr_reader import PRReader
from sweepai.core.sweep_bot import SweepBot, get_files_to_change, get_files_to_change_for_gha, validate_file_change_requests
from sweepai.handlers.create_pr import (
create_config_pr,
create_pr_changes,
safe_delete_sweep_branch,
)
from sweepai.handlers.on_check_suite import clean_gh_logs
from sweepai.utils.image_utils import get_image_contents_from_urls, get_image_urls_from_issue
from sweepai.utils.issue_validator import validate_issue
from sweepai.utils.validate_license import validate_license
from sweepai.utils.buttons import Button, ButtonList, create_action_buttons
from sweepai.utils.chat_logger import ChatLogger
from sweepai.utils.diff import generate_diff
from sweepai.utils.event_logger import posthog
from sweepai.utils.github_utils import (
CURRENT_USERNAME,
ClonedRepo,
convert_pr_draft_field,
get_github_client,
get_token,
sanitize_string_for_github,
)
from sweepai.utils.progress import (
AssistantConversation,
PaymentContext,
TicketContext,
TicketProgress,
TicketProgressStatus,
)
from sweepai.utils.prompt_constructor import HumanMessagePrompt
from sweepai.utils.str_utils import (
BOT_SUFFIX,
FASTER_MODEL_MESSAGE,
UPDATES_MESSAGE,
blockquote,
bot_suffix,
checkbox_template,
clean_logs,
collapsible_template,
create_checkbox,
create_collapsible,
discord_suffix,
format_sandbox_success,
get_hash,
sep,
stars_suffix,
strip_sweep,
to_branch_name,
)
from sweepai.utils.ticket_utils import (
center,
fetch_relevant_files,
fire_and_forget_wrapper,
log_error,
prep_snippets,
)
from sweepai.utils.user_settings import UserSettings
# from sandbox.sandbox_utils import Sandbox
sweeping_gif = """<a href="https://github.com/sweepai/sweep"><img class="swing" src="https://raw.githubusercontent.com/sweepai/sweep/main/.assets/sweeping.gif" width="100" style="width:50px; margin-bottom:10px" alt="Sweeping"></a>"""
custom_config = """
extends: relaxed
rules:
line-length: disable
indentation: disable
"""
INSTRUCTIONS_FOR_REVIEW = """\
### 💡 To get Sweep to edit this pull request, you can:
* Comment below, and Sweep can edit the entire PR
* Comment on a file, Sweep will only modify the commented file
* Edit the original issue to get Sweep to recreate the PR from scratch"""
email_template = """Hey {name},
<br/><br/>
🚀 I just finished creating a pull request for your issue ({repo_full_name}#{issue_number}) at <a href="{pr_url}">{repo_full_name}#{pr_number}</a>!
<br/><br/>
You can view how I created this pull request <a href="{progress_url}">here</a>.
<h2>Summary</h2>
<blockquote>
{summary}
</blockquote>
<h2>Files Changed</h2>
<ul>
{files_changed}
</ul>
{sweeping_gif}
<br/>
Cheers,
<br/>
Sweep
<br/>"""
FAILING_GITHUB_ACTION_PROMPT = """\
The following Github Actions failed on a previous attempt at fixing this issue.
Propose a fix to the failing github actions. You must edit the source code, not the github action itself.
{github_action_log}
"""
# Add :eyes: emoji to ticket
def add_emoji(issue: Issue, comment_id: int = None, reaction_content="eyes"):
item_to_react_to = issue.get_comment(comment_id) if comment_id else issue
item_to_react_to.create_reaction(reaction_content)
# If the bot (CURRENT_USERNAME) reacted to item_to_react_to with content_to_delete, remove that reaction.
def remove_emoji(issue: Issue, comment_id: int = None, content_to_delete="eyes"):
item_to_react_to = issue.get_comment(comment_id) if comment_id else issue
reactions = item_to_react_to.get_reactions()
for reaction in reactions:
if (
reaction.content == content_to_delete
and reaction.user.login == CURRENT_USERNAME
):
item_to_react_to.delete_reaction(reaction.id)
def create_error_logs(
commit_url_display: str,
sandbox_response: SandboxResponse,
status: str = "✓",
):
return (
(
"<br/>"
+ create_collapsible(
f"Sandbox logs for {commit_url_display} {status}",
blockquote(
"\n\n".join(
[
create_collapsible(
f"<code>{output}</code> {i + 1}/{len(sandbox_response.outputs)} {format_sandbox_success(sandbox_response.success)}",
f"<pre>{clean_logs(output)}</pre>",
i == len(sandbox_response.outputs) - 1,
)
for i, output in enumerate(sandbox_response.outputs)
if len(sandbox_response.outputs) > 0
]
)
),
opened=True,
)
)
if sandbox_response
else ""
)
# takes in a list of workflow runs and returns a list of messages containing the logs of the failing runs
def get_failing_gha_logs(runs, installation_id) -> str:
token = get_token(installation_id)
all_logs = ""
for run in runs:
# jobs_url
jobs_url = run.jobs_url
jobs_response = requests.get(
jobs_url,
headers={
"Accept": "application/vnd.github+json",
"Authorization": f"Bearer {token}",
"X-GitHub-Api-Version": "2022-11-28",
},
)
if jobs_response.status_code == 200:
failed_jobs = []
jobs = jobs_response.json()["jobs"]
for job in jobs:
if job["conclusion"] == "failure":
failed_jobs.append(job)
failed_jobs_name_list = []
for job in failed_jobs:
# add failed steps
for step in job["steps"]:
if step["conclusion"] == "failure":
failed_jobs_name_list.append(
f"{job['name']}/{step['number']}_{step['name']}"
)
else:
logger.error(
"Failed to get jobs for failing github actions, possibly a credentials issue"
)
return all_logs
# make sure this run actually has jobs
if jobs_response.json()['total_count'] == 0:
logger.error(f"no jobs for this run: {run}, continuing...")
continue
# logs url
logs_url = run.logs_url
logs_response = requests.get(
logs_url,
headers={
"Accept": "application/vnd.github+json",
"Authorization": f"Bearer {token}",
"X-GitHub-Api-Version": "2022-11-28",
},
allow_redirects=True,
)
# Check if the request was successful
if logs_response.status_code == 200:
zip_data = io.BytesIO(logs_response.content)
zip_file = zipfile.ZipFile(zip_data, "r")
zip_file_names = zip_file.namelist()
for file in failed_jobs_name_list:
if f"{file}.txt" in zip_file_names:
logs = zip_file.read(f"{file}.txt").decode("utf-8")
logs_prompt = clean_gh_logs(logs)
all_logs += logs_prompt + "\n"
else:
logger.error(
"Failed to get logs for failing github actions, likely a credentials issue"
)
return all_logs
def delete_old_prs(repo: Repository, issue_number: int):
logger.info("Deleting old PRs...")
prs = repo.get_pulls(
state="open",
sort="created",
direction="desc",
base=SweepConfig.get_branch(repo),
)
for pr in tqdm(prs.get_page(0)):
# # Check if this issue is mentioned in the PR, and pr is owned by bot
# # This is done in create_pr, (pr_description = ...)
if pr.user.login == CURRENT_USERNAME and f"Fixes #{issue_number}.\n" in pr.body:
safe_delete_sweep_branch(pr, repo)
break
def construct_sweep_bot(
repo: Repository,
repo_name: str,
issue_url: str,
repo_description: str,
title: str,
message_summary: str,
cloned_repo: ClonedRepo,
ticket_progress: TicketProgress,
chat_logger: ChatLogger,
snippets: Any = None,
tree: Any = None,
comments: Any = None,
) -> SweepBot:
human_message = HumanMessagePrompt(
repo_name=repo_name,
issue_url=issue_url,
repo_description=repo_description.strip(),
title=title,
summary=message_summary,
snippets=snippets,
tree=tree,
)
sweep_bot = SweepBot.from_system_message_content(
human_message=human_message,
repo=repo,
is_reply=bool(comments),
chat_logger=chat_logger,
cloned_repo=cloned_repo,
ticket_progress=ticket_progress,
)
return sweep_bot
def get_comment_header(
index: int,
g: Github,
repo_full_name: str,
user_settings: UserSettings,
progress_headers: list[None | str],
tracking_id: str | None,
payment_message_start: str,
user_settings_message: str,
errored: bool = False,
pr_message: str = "",
done: bool = False,
initial_sandbox_response: int | SandboxResponse = -1,
initial_sandbox_response_file=None,
config_pr_url: str | None = None,
):
config_pr_message = (
"\n"
+ f"<div align='center'>Install Sweep Configs: <a href='{config_pr_url}'>Pull Request</a></div>"
if config_pr_url is not None
else ""
)
actions_message = create_action_buttons(
[
RESTART_SWEEP_BUTTON,
]
)
sandbox_execution_message = "\n\n## GitHub Actions failed\n\nThe sandbox appears to be unavailable or down.\n\n"
if initial_sandbox_response == -1:
sandbox_execution_message = ""
elif initial_sandbox_response is not None:
repo = g.get_repo(repo_full_name)
commit_hash = repo.get_commits()[0].sha
success = initial_sandbox_response.outputs and initial_sandbox_response.success
status = "✓" if success else "X"
sandbox_execution_message = (
"\n\n## GitHub Actions"
+ status
+ "\n\nHere are the GitHub Actions logs prior to making any changes:\n\n"
)
sandbox_execution_message += entities_create_error_logs(
f'<a href="https://github.com/{repo_full_name}/commit/{commit_hash}"><code>{commit_hash[:7]}</code></a>',
initial_sandbox_response,
initial_sandbox_response_file,
)
if success:
sandbox_execution_message += f"\n\nSandbox passed on the latest `{repo.default_branch}`, so sandbox checks will be enabled for this issue."
else:
sandbox_execution_message += "\n\nSandbox failed, so all sandbox checks will be disabled for this issue."
if index < 0:
index = 0
if index == 4:
return (
pr_message
+ config_pr_message
+ f"\n\n---\n{user_settings.get_message(completed=True)}"
+ f"\n\n---\n{actions_message}"
+ sandbox_execution_message
)
total = len(progress_headers)
index += 1 if done else 0
index *= 100 / total
index = int(index)
index = min(100, index)
if errored:
pbar = f"\n\n<img src='https://progress-bar.dev/{index}/?&title=Errored&width=600' alt='{index}%' />"
return (
f"{center(sweeping_gif)}<br/>{center(pbar)}\n\n"
+ f"\n\n---\n{actions_message}"
+ sandbox_execution_message
)
pbar = f"\n\n<img src='https://progress-bar.dev/{index}/?&title=Progress&width=600' alt='{index}%' />"
return (
f"{center(sweeping_gif)}"
+ (
center(
f'\n\n<h2>✨ Track Sweep\'s progress on our <a href="{PROGRESS_BASE_URL}/issues/{tracking_id}">progress dashboard</a>!</h2>'
)
if MONGODB_URI is not None
else ""
)
+ f"<br/>{center(pbar)}"
+ ("\n" + stars_suffix if index != -1 else "")
+ "\n"
+ center(payment_message_start)
+ f"\n\n---\n{user_settings_message}"
+ config_pr_message
+ f"\n\n---\n{actions_message}"
+ sandbox_execution_message
)
def on_ticket(
title: str,
summary: str,
issue_number: int,
issue_url: str, # purely for logging purposes
username: str,
repo_full_name: str,
repo_description: str,
installation_id: int,
comment_id: int = None,
edited: bool = False,
tracking_id: str | None = None,
):
if not os.environ.get("CLI"):
assert validate_license(), "License key is invalid or expired. Please contact us at team@sweep.dev to upgrade to an enterprise license."
with logger.contextualize(
tracking_id=tracking_id,
):
if tracking_id is None:
tracking_id = get_hash()
on_ticket_start_time = time()
logger.info(f"Starting on_ticket with title {title} and summary {summary}")
(
title,
slow_mode,
do_map,
subissues_mode,
sandbox_mode,
fast_mode,
lint_mode,
) = strip_sweep(title)
# fetch images from body of issue
image_urls = get_image_urls_from_issue(issue_number, repo_full_name, installation_id)
image_contents = get_image_contents_from_urls(image_urls)
summary = summary or ""
summary = re.sub(
"<details (open)?>(\r)?\n<summary>Checklist</summary>.*",
"",
summary,
flags=re.DOTALL,
).strip()
summary = re.sub(
"---\s+Checklist:(\r)?\n(\r)?\n- \[[ X]\].*",
"",
summary,
flags=re.DOTALL,
).strip()
summary = re.sub(
"### Details\n\n_No response_", "", summary, flags=re.DOTALL
)
summary = re.sub("\n\n", "\n", summary, flags=re.DOTALL)
repo_name = repo_full_name
user_token, g = get_github_client(installation_id)
repo = g.get_repo(repo_full_name)
current_issue: Issue = repo.get_issue(number=issue_number)
assignee = current_issue.assignee.login if current_issue.assignee else None
if assignee is None:
assignee = current_issue.user.login
ticket_progress = TicketProgress(
tracking_id=tracking_id,
username=username,
context=TicketContext(
title=title,
description=summary,
repo_full_name=repo_full_name,
issue_number=issue_number,
is_public=repo.private is False,
start_time=int(time()),
),
)
branch_match = re.search(
r"([B|b]ranch:) *(?P<branch_name>.+?)(\s|$)", summary
)
overrided_branch_name = None
if branch_match and "branch_name" in branch_match.groupdict():
overrided_branch_name = (
branch_match.groupdict()["branch_name"].strip().strip("`\"'")
)
# TODO: this code might be finicky, might have missed edge cases
if overrided_branch_name.startswith("https://github.com/"):
overrided_branch_name = overrided_branch_name.split("?")[0].split(
"tree/"
)[-1]
SweepConfig.get_branch(repo, overrided_branch_name)
chat_logger = (
ChatLogger(
{
"repo_name": repo_name,
"title": title,
"summary": summary,
"issue_number": issue_number,
"issue_url": issue_url,
"username": (
username if not username.startswith("sweep") else assignee
),
"repo_full_name": repo_full_name,
"repo_description": repo_description,
"installation_id": installation_id,
"type": "ticket",
"mode": ENV,
"comment_id": comment_id,
"edited": edited,
"tracking_id": tracking_id,
},
active=True,
)
if MONGODB_URI
else None
)
if chat_logger and not IS_SELF_HOSTED:
is_paying_user = chat_logger.is_paying_user()
is_consumer_tier = chat_logger.is_consumer_tier()
use_faster_model = chat_logger.use_faster_model()
else:
is_paying_user = True
is_consumer_tier = False
use_faster_model = False
if use_faster_model:
raise Exception(FASTER_MODEL_MESSAGE)
if fast_mode:
use_faster_model = True
if not comment_id and not edited and chat_logger and not sandbox_mode:
fire_and_forget_wrapper(chat_logger.add_successful_ticket)(
gpt3=use_faster_model
)
organization, repo_name = repo_full_name.split("/")
metadata = {
"issue_url": issue_url,
"repo_full_name": repo_full_name,
"organization": organization,
"repo_name": repo_name,
"repo_description": repo_description,
"username": username,
"comment_id": comment_id,
"title": title,
"installation_id": installation_id,
"function": "on_ticket",
"edited": edited,
"model": "gpt-3.5" if use_faster_model else "gpt-4",
"tier": "pro" if is_paying_user else "free",
"mode": ENV,
"slow_mode": slow_mode,
"do_map": do_map,
"subissues_mode": subissues_mode,
"sandbox_mode": sandbox_mode,
"fast_mode": fast_mode,
"is_self_hosted": IS_SELF_HOSTED,
"tracking_id": tracking_id,
}
fire_and_forget_wrapper(posthog.capture)(
username, "started", properties=metadata
)
try:
if current_issue.state == "closed":
fire_and_forget_wrapper(posthog.capture)(
username,
"issue_closed",
properties={
**metadata,
"duration": round(time() - on_ticket_start_time),
},
)
return {"success": False, "reason": "Issue is closed"}
fire_and_forget_wrapper(add_emoji)(current_issue, comment_id)
fire_and_forget_wrapper(remove_emoji)(
current_issue, comment_id, content_to_delete="rocket"
)
fire_and_forget_wrapper(remove_emoji)(
current_issue, comment_id, content_to_delete="confused"
)
fire_and_forget_wrapper(current_issue.edit)(body=summary)
replies_text = ""
summary = summary if summary else ""
fire_and_forget_wrapper(delete_old_prs)(repo, issue_number)
if not sandbox_mode:
progress_headers = [
None,
"Step 1: 🔎 Searching",
"Step 2: ⌨️ Coding",
"Step 3: 🔁 Code Review",
]
else:
progress_headers = [
None,
"📖 Reading File",
"🛠️ Executing Sandbox",
]
issue_comment = None
payment_message, payment_message_start = get_payment_messages(
chat_logger
)
ticket_progress.context.payment_context = PaymentContext(
use_faster_model=use_faster_model,
pro_user=is_paying_user,
daily_tickets_used=(
chat_logger.get_ticket_count(use_date=True)
if chat_logger
else 0
),
monthly_tickets_used=(
chat_logger.get_ticket_count() if chat_logger else 0
),
)
ticket_progress.save()
config_pr_url = None
user_settings = UserSettings.from_username(username=username)
user_settings_message = user_settings.get_message()
cloned_repo = ClonedRepo(
repo_full_name,
installation_id=installation_id,
token=user_token,
repo=repo,
branch=overrided_branch_name,
)
# check that repo's directory is non-empty
if os.listdir(cloned_repo.cached_dir) == []:
logger.info("Empty repo")
first_comment = (
"Sweep is currently not supported on empty repositories. Please add some"
f" code to your repository and try again.\n{sep}##"
f" {progress_headers[1]}\n{bot_suffix}{discord_suffix}"
)
if issue_comment is None:
issue_comment = current_issue.create_comment(
first_comment + BOT_SUFFIX
)
else:
issue_comment.edit(first_comment + BOT_SUFFIX)
fire_and_forget_wrapper(add_emoji)(
current_issue, comment_id, reaction_content="confused"
)
fire_and_forget_wrapper(remove_emoji)(current_issue, comment_id, content_to_delete="eyes")
return {"success": False}
indexing_message = (
"I'm searching for relevant snippets in your repository. If this is your first"
" time using Sweep, I'm indexing your repository. You can monitor the progress using the progress dashboard"
)
first_comment = (
f"{get_comment_header(0, g, repo_full_name, user_settings, progress_headers, tracking_id, payment_message_start, user_settings_message)}\n{sep}I am currently looking into this ticket! I"
" will update the progress of the ticket in this comment. I am currently"
f" searching through your code, looking for relevant snippets.\n{sep}##"
f" {progress_headers[1]}\n{indexing_message}{bot_suffix}{discord_suffix}"
)
# Find Sweep's previous comment
comments = []
for comment in current_issue.get_comments():
comments.append(comment)
if comment.user.login == CURRENT_USERNAME:
issue_comment = comment
break
if issue_comment is None:
issue_comment = current_issue.create_comment(first_comment)
else:
fire_and_forget_wrapper(issue_comment.edit)(first_comment)
old_edit = issue_comment.edit
issue_comment.edit = lambda msg: old_edit(msg + BOT_SUFFIX)
past_messages = {}
current_index = 0
table = None
initial_sandbox_response = -1
initial_sandbox_response_file = None
def refresh_token():
user_token, g = get_github_client(installation_id)
repo = g.get_repo(repo_full_name)
return user_token, g, repo
def edit_sweep_comment(
message: str,
index: int,
pr_message="",
done=False,
add_bonus_message=True,
):
nonlocal current_index, user_token, g, repo, issue_comment, initial_sandbox_response, initial_sandbox_response_file
message = sanitize_string_for_github(message)
if pr_message:
pr_message = sanitize_string_for_github(pr_message)
# -1 = error, -2 = retry
# Only update the progress bar if the issue generation errors.
errored = index == -1
if index >= 0:
past_messages[index] = message
current_index = index
agg_message = None
# Include progress history
# index = -2 is reserved for
for i in range(
current_index + 2
): # go to next header (for Working on it... text)
if i == 0 or i >= len(progress_headers):
continue # skip None header
header = progress_headers[i]
if header is not None:
header = "## " + header + "\n"
else:
header = "No header\n"
msg = header + (past_messages.get(i) or "Working on it...")
if agg_message is None:
agg_message = msg
else:
agg_message = agg_message + f"\n{sep}" + msg
suffix = bot_suffix + discord_suffix
if errored:
agg_message = (
"## ❌ Unable to Complete PR"
+ "\n"
+ message
+ (
"\n\nFor bonus GPT-4 tickets, please report this bug on"
f" **[Discord](https://discord.gg/invite/sweep)** (tracking ID: `{tracking_id}`)."
if add_bonus_message
else ""
)
)
if table is not None:
agg_message = (
agg_message
+ f"\n{sep}Please look at the generated plan. If something looks"
f" wrong, please add more details to your issue.\n\n{table}"
)
suffix = bot_suffix # don't include discord suffix for error messages
# Update the issue comment
msg = f"{get_comment_header(current_index, g, repo_full_name, user_settings, progress_headers, tracking_id, payment_message_start, user_settings_message, errored=errored, pr_message=pr_message, done=done, initial_sandbox_response=initial_sandbox_response, initial_sandbox_response_file=initial_sandbox_response_file, config_pr_url=config_pr_url)}\n{sep}{agg_message}{suffix}"
try:
issue_comment.edit(msg)
except BadCredentialsException:
logger.error(
f"Bad credentials, refreshing token (tracking ID: `{tracking_id}`)"
)
user_token, g = get_github_client(installation_id)
repo = g.get_repo(repo_full_name)
issue_comment = None
for comment in comments:
if comment.user.login == CURRENT_USERNAME:
issue_comment = comment
current_issue = repo.get_issue(number=issue_number)
if issue_comment is None:
issue_comment = current_issue.create_comment(msg)
else:
issue_comment = [
comment
for comment in current_issue.get_comments()
if comment.user.login == CURRENT_USERNAME
][0]
issue_comment.edit(msg)
if use_faster_model:
edit_sweep_comment(
FASTER_MODEL_MESSAGE, -1, add_bonus_message=False
)
posthog.capture(
username,
"ran_out_of_tickets",
properties={
**metadata,
"duration": round(time() - on_ticket_start_time),
},
)
fire_and_forget_wrapper(add_emoji)(
current_issue, comment_id, reaction_content="confused"
)
fire_and_forget_wrapper(remove_emoji)(current_issue, comment_id, content_to_delete="eyes")
return {
"success": False,
"error_message": "We deprecated supporting GPT 3.5.",
}
error_message = validate_issue(title + summary)
if error_message:
logger.warning(f"Validation error: {error_message}")
edit_sweep_comment(
(
f"The issue was rejected with the following response:\n\n{blockquote(error_message)}"
),
-1,
)
fire_and_forget_wrapper(add_emoji)(
current_issue, comment_id, reaction_content="confused"
)
fire_and_forget_wrapper(remove_emoji)(current_issue, comment_id, content_to_delete="eyes")
posthog.capture(
username,
"invalid_issue",
properties={
**metadata,
"duration": round(time() - on_ticket_start_time),
},
)
return {"success": True}
prs_extracted = PRReader.extract_prs(repo, summary)
message_summary = summary
if prs_extracted:
message_summary += "\n\n" + prs_extracted
edit_sweep_comment(
create_collapsible(
"I found that you mentioned the following Pull Requests that might be important:",
blockquote(
prs_extracted,
),
),
1,
)
try:
# search/context manager
logger.info("Searching for relevant snippets...")
if image_contents: # doing it here to avoid editing the original issue
message_summary += ImageDescriptionBot().describe_images(text=title + message_summary, images=image_contents)
snippets, tree, _, repo_context_manager = fetch_relevant_files(
cloned_repo,
title,
message_summary,
replies_text,
username,
metadata,
on_ticket_start_time,
tracking_id,
is_paying_user,
is_consumer_tier,
issue_url,
chat_logger,
ticket_progress,
images=image_contents
)
cloned_repo = repo_context_manager.cloned_repo
except Exception as e:
edit_sweep_comment(
(
"It looks like an issue has occurred around fetching the files."
f" The exception was {str(e)}. If this error persists"
f" contact team@sweep.dev.\n\n> @{username}, editing this issue description to include more details will automatically make me relaunch. Please join our Discord server for support (tracking_id={tracking_id})"
),
-1,
)
raise Exception("Failed to fetch files") from e
_user_token, g = get_github_client(installation_id)
user_token, g, repo = refresh_token()
cloned_repo.token = user_token
repo = g.get_repo(repo_full_name)
ticket_progress.search_progress.indexing_progress = (
ticket_progress.search_progress.indexing_total
)
ticket_progress.status = TicketProgressStatus.PLANNING
ticket_progress.save()
# Fetch git commit history
if not repo_description:
repo_description = "No description provided."
message_summary += replies_text
get_documentation_dict(repo)
docs_results = ""
sweep_bot = construct_sweep_bot(
repo=repo,
repo_name=repo_name,
issue_url=issue_url,
repo_description=repo_description,
title=title,
message_summary=message_summary,
cloned_repo=cloned_repo,
ticket_progress=ticket_progress,
chat_logger=chat_logger,
snippets=snippets,
tree=tree,
comments=comments,
)
# Check repository for sweep.yaml file.
sweep_yml_exists = False
sweep_yml_failed = False
for content_file in repo.get_contents(""):
if content_file.name == "sweep.yaml":
sweep_yml_exists = True
# Check if YAML is valid
yaml_content = content_file.decoded_content.decode("utf-8")
sweep_yaml_dict = {}
try:
sweep_yaml_dict = yaml.safe_load(yaml_content)
except Exception:
logger.error(f"Failed to load YAML file: {yaml_content}")
if len(sweep_yaml_dict) > 0:
break
linter_config = yamllint_config.YamlLintConfig(custom_config)
problems = list(linter.run(yaml_content, linter_config))
if problems:
errors = [
f"Line {problem.line}: {problem.desc} (rule: {problem.rule})"
for problem in problems
]
error_message = "\n".join(errors)
markdown_error_message = f"**There is something wrong with your [sweep.yaml](https://github.com/{repo_full_name}/blob/main/sweep.yaml):**\n```\n{error_message}\n```"
sweep_yml_failed = True
logger.error(markdown_error_message)
edit_sweep_comment(markdown_error_message, -1)
else:
logger.info("The YAML file is valid. No errors found.")
break
# If sweep.yaml does not exist, then create a new PR that simply creates the sweep.yaml file.
if not sweep_yml_exists:
try:
logger.info("Creating sweep.yaml file...")
config_pr = create_config_pr(sweep_bot, cloned_repo=cloned_repo)
config_pr_url = config_pr.html_url
edit_sweep_comment(message="", index=-2)
except Exception as e:
logger.error(
"Failed to create new branch for sweep.yaml file.\n",
e,
traceback.format_exc(),
)
else:
logger.info("sweep.yaml file already exists.")
try:
# ANALYZE SNIPPETS
newline = "\n"
edit_sweep_comment(
"I found the following snippets in your repository. I will now analyze"
" these snippets and come up with a plan."
+ "\n\n"
+ create_collapsible(
"Some code snippets I think are relevant in decreasing order of relevance (click to expand). If some file is missing from here, you can mention the path in the ticket description.",
"\n".join(
[
f"https://github.com/{organization}/{repo_name}/blob/{repo.get_commits()[0].sha}/{snippet.file_path}#L{max(snippet.start, 1)}-L{min(snippet.end, snippet.content.count(newline) - 1)}\n"
for snippet in snippets
]
),
)
+ (
create_collapsible(
"I also found that you mentioned the following Pull Requests that may be helpful:",
blockquote(prs_extracted),
)
if prs_extracted
else ""
)
+ (f"\n\n{docs_results}\n\n" if docs_results else ""),
1,
)
logger.info("Fetching files to modify/create...")
file_change_requests, plan = get_files_to_change(
relevant_snippets=repo_context_manager.current_top_snippets,
read_only_snippets=repo_context_manager.read_only_snippets,
problem_statement=f"{title}\n\n{message_summary}",
repo_name=repo_full_name,
cloned_repo=cloned_repo,
images=image_contents
)
validate_file_change_requests(file_change_requests, cloned_repo)
ticket_progress.planning_progress.file_change_requests = (
file_change_requests
)
ticket_progress.coding_progress.file_change_requests = (
file_change_requests
)
ticket_progress.coding_progress.assistant_conversations = [
AssistantConversation() for fcr in file_change_requests
]
ticket_progress.status = TicketProgressStatus.CODING
ticket_progress.save()
if not file_change_requests:
if len(title + summary) < 60:
edit_sweep_comment(
(
"Sorry, I could not find any files to modify, can you please"
" provide more details? Please make sure that the title and"
" summary of the issue are at least 60 characters."
),
-1,
)
else:
edit_sweep_comment(
(
"Sorry, I could not find any files to modify, can you please"
" provide more details?"
),
-1,
)
raise Exception("No files to modify.")
file_change_requests: list[
FileChangeRequest
] = sweep_bot.validate_file_change_requests(
file_change_requests,
)
ticket_progress.planning_progress.file_change_requests = (
file_change_requests
)
ticket_progress.coding_progress.assistant_conversations = [
AssistantConversation() for fcr in file_change_requests
]
ticket_progress.save()
table = tabulate(
[
[
file_change_request.entity_display,
file_change_request.instructions_display.replace(
"\n", "<br/>"
).replace("```", "\\```"),
]
for file_change_request in file_change_requests
if file_change_request.change_type != "check"
],
headers=["File Path", "Proposed Changes"],
tablefmt="pipe",
)
logger.info("Generating PR...")
pull_request = PullRequest(
title="Sweep: " + title,
branch_name="sweep/" + to_branch_name(title),
content="",
)
logger.info("Making PR...")
ticket_progress.context.branch_name = pull_request.branch_name
ticket_progress.save()
files_progress: list[tuple[str, str, str, str]] = [
(
file_change_request.entity_display,
file_change_request.instructions_display,
"⏳ In Progress",
"",
)
for file_change_request in file_change_requests
]
checkboxes_progress: list[tuple[str, str, str]] = [
(
file_change_request.entity_display,
file_change_request.instructions_display,
" ",
)
for file_change_request in file_change_requests
if not file_change_request.change_type == "check"
]
checkboxes_contents = "\n".join(
[
create_checkbox(
f"`{filename}`", blockquote(instructions), check == "X"
)
for filename, instructions, check in checkboxes_progress
]
)
create_collapsible("Checklist", checkboxes_contents, opened=True)
file_change_requests[0].status = "running"
condensed_checkboxes_contents = "\n".join(
[
create_checkbox(f"`{filename}`", "", check == "X").strip()
for filename, instructions, check in checkboxes_progress
]
)
condensed_checkboxes_collapsible = create_collapsible(
"Checklist", condensed_checkboxes_contents, opened=True
)
current_issue = repo.get_issue(number=issue_number)
current_issue.edit(
body=summary + "\n\n" + condensed_checkboxes_collapsible
)
delete_branch = False
generator = create_pr_changes(
file_change_requests,
pull_request,
sweep_bot,
username,
installation_id,
issue_number,
chat_logger=chat_logger,
base_branch=overrided_branch_name,
additional_messages=[],
)
edit_sweep_comment(checkboxes_contents, 2)
if not file_change_requests:
raise NoFilesException()
response = {
"error": Exception(
f"Sweep failed to generate any file change requests! This could mean that Sweep failed to find the correct lines of code to modify or that GPT-4 did not respond in our specified format. Sometimes, retrying will fix this error. Otherwise, reach out to our Discord server for support (tracking_id={tracking_id})."
)
}
changed_files = []
for item in generator:
if isinstance(item, dict):
response = item
break
(
new_file_contents,
_,
commit,
file_change_requests,
) = item
# append all files that have been changed
if new_file_contents:
for file_name, _ in new_file_contents.items():
changed_files.append(file_name)
commit_hash: str = (
commit
if isinstance(commit, str)
else (
commit.sha
if commit is not None
else repo.get_branch(
pull_request.branch_name
).commit.sha
)
)
commit_url = (
f"https://github.com/{repo_full_name}/commit/{commit_hash}"
)
commit_url_display = (
f"<a href='{commit_url}'><code>{commit_hash[:7]}</code></a>"
)
create_error_logs(
commit_url_display,
None,
status=(
"✓"
),
)
checkboxes_progress = [
(
file_change_request.display_summary
+ " "
+ file_change_request.status_display
+ " "
+ (file_change_request.commit_hash_url or "")
+ f" [Edit]({file_change_request.get_edit_url(repo.full_name, pull_request.branch_name)})",
file_change_request.instructions_ticket_display
+ f"\n\n{file_change_request.diff_display}",
(
"X"
if file_change_request.status
in ("succeeded", "failed")
else " "
),
)
for file_change_request in file_change_requests
]
checkboxes_contents = "\n".join(
[
checkbox_template.format(
check=check,
filename=filename,
instructions=blockquote(instructions),
)
for filename, instructions, check in checkboxes_progress
]
)
collapsible_template.format(
summary="Checklist",
body=checkboxes_contents,
opened="open",
)
condensed_checkboxes_contents = "\n".join(
[
checkbox_template.format(
check=check,
filename=filename,
instructions="",
).strip()
for filename, instructions, check in checkboxes_progress
if not instructions.lower().startswith("run")
]
)
condensed_checkboxes_collapsible = collapsible_template.format(
summary="Checklist",
body=condensed_checkboxes_contents,
opened="open",
)
try:
current_issue = repo.get_issue(number=issue_number)
except BadCredentialsException:
user_token, g, repo = refresh_token()
cloned_repo.token = user_token
current_issue.edit(
body=summary + "\n\n" + condensed_checkboxes_collapsible
)
logger.info(files_progress)
edit_sweep_comment(checkboxes_contents, 2)
if not response.get("success"):
raise Exception(f"Failed to create PR: {response.get('error')}")
checkboxes_contents = "\n".join(
[
checkbox_template.format(
check=check,
filename=filename,
instructions=blockquote(instructions),
)
for filename, instructions, check in checkboxes_progress
]
)
condensed_checkboxes_contents = "\n".join(
[
checkbox_template.format(
check=check,
filename=filename,
instructions="",
).strip()
for filename, instructions, check in checkboxes_progress
if not instructions.lower().startswith("run")
]
)
condensed_checkboxes_collapsible = collapsible_template.format(
summary="Checklist",
body=condensed_checkboxes_contents,
opened="open",
)
for _ in range(3):
try:
current_issue.edit(
body=summary + "\n\n" + condensed_checkboxes_collapsible
)
break
except Exception:
from time import sleep
sleep(1)
edit_sweep_comment(checkboxes_contents, 2)
pr_changes = response["pull_request"]
# change the body here
diff_text = get_branch_diff_text(
repo=repo,
branch=pull_request.branch_name,
base_branch=overrided_branch_name,
)
new_description = PRDescriptionBot().describe_diffs(
diff_text,
pull_request.title,
)
# TODO: update the title as well
if new_description:
pr_changes.body = (
f"{new_description}\n\nFixes"
f" #{issue_number}.\n\n---\n\n{UPDATES_MESSAGE}\n\n---\n\n{INSTRUCTIONS_FOR_REVIEW}{BOT_SUFFIX}"
)
edit_sweep_comment(
"I have finished coding the issue. I am now reviewing it for completeness.",
3,
)
change_location = f" [`{pr_changes.pr_head}`](https://github.com/{repo_full_name}/commits/{pr_changes.pr_head}).\n\n"
review_message = (
"Here are my self-reviews of my changes at" + change_location
)
try:
fire_and_forget_wrapper(remove_emoji)(content_to_delete="eyes")
except Exception:
pass
changes_required, review_message = False, ""
if changes_required:
edit_sweep_comment(
review_message
+ "\n\nI finished incorporating these changes.",
3,
)
else:
edit_sweep_comment(
f"I have finished reviewing the code for completeness. I did not find errors for {change_location}",
3,
)
revert_buttons = []
for changed_file in set(changed_files):
revert_buttons.append(
Button(label=f"{RESET_FILE} {changed_file}")
)
revert_buttons_list = ButtonList(
buttons=revert_buttons, title=REVERT_CHANGED_FILES_TITLE
)
# delete failing sweep yaml if applicable
if sweep_yml_failed:
try:
repo.delete_file(
"sweep.yaml",
"Delete failing sweep.yaml",
branch=pr_changes.pr_head,
sha=repo.get_contents("sweep.yaml").sha,
)
except Exception:
pass
# create draft pr, then convert to regular pr later
pr: GithubPullRequest = repo.create_pull(
title=pr_changes.title,
body=pr_changes.body,
head=pr_changes.pr_head,
base=overrided_branch_name or SweepConfig.get_branch(repo),
# removed draft PR
draft=False,
)
try:
pr.add_to_assignees(username)
except Exception as e:
logger.error(
f"Failed to add assignee {username}: {e}, probably a bot."
)
ticket_progress.status = TicketProgressStatus.COMPLETE
ticket_progress.context.done_time = time()
ticket_progress.context.pr_id = pr.number
ticket_progress.save()
if revert_buttons:
pr.create_issue_comment(
revert_buttons_list.serialize() + BOT_SUFFIX
)
# add comments before labelling
pr.add_to_labels(GITHUB_LABEL_NAME)
current_issue.create_reaction("rocket")
heres_pr_message = f'<h1 align="center">🚀 Here\'s the PR! <a href="{pr.html_url}">#{pr.number}</a></h1>'
progress_message = f'<div align="center"><b>See Sweep\'s progress at <a href="{PROGRESS_BASE_URL}/issues/{tracking_id}">the progress dashboard</a>!</b></div>'
edit_sweep_comment(
review_message + "\n\nSuccess! 🚀",
4,
pr_message=(
f"{center(heres_pr_message)}\n{center(progress_message)}\n{center(payment_message_start)}"
),
done=True,
)
user_settings = UserSettings.from_username(username=username)
user = g.get_user(username)
full_name = user.name or user.login
name = full_name.split(" ")[0]
files_changed = []
for fcr in file_change_requests:
if fcr.change_type in ("create", "modify"):
diff = list(
difflib.unified_diff(
(fcr.old_content or "").splitlines() or [],
(fcr.new_content or "").splitlines() or [],
lineterm="",
)
)
added = sum(
1
for line in diff
if line.startswith("+") and not line.startswith("+++")
)
removed = sum(
1
for line in diff
if line.startswith("-") and not line.startswith("---")
)
files_changed.append(
f"<code>{fcr.filename}</code> (+{added}/-{removed})"
)
user_settings.send_email(
subject=f"Sweep Pull Request Complete for {repo_name}#{issue_number} {title}",
html=email_template.format(
name=name,
pr_url=pr.html_url,
issue_number=issue_number,
repo_full_name=repo_full_name,
pr_number=pr.number,
progress_url=f"{PROGRESS_BASE_URL}/issues/{tracking_id}",
summary=markdown.markdown(pr_changes.body),
files_changed="\n".join(
[f"<li>{item}</li>" for item in files_changed]
),
sweeping_gif=sweeping_gif,
),
)
# poll for github to check when gha are done
total_poll_attempts = 0
total_edit_attempts = 0
SLEEP_DURATION_SECONDS = 15
GITHUB_ACTIONS_ENABLED = get_gha_enabled(repo=repo) and DEPLOYMENT_GHA_ENABLED
GHA_MAX_EDIT_ATTEMPTS = 5 # max number of times to edit PR
current_commit = pr.head.sha
while True and GITHUB_ACTIONS_ENABLED:
logger.info(
f"Polling to see if Github Actions have finished... {total_poll_attempts}"
)
# we wait at most 60 minutes
if total_poll_attempts * SLEEP_DURATION_SECONDS // 60 >= 60:
break
else:
# wait SLEEP_DURATION_SECONDS (15 seconds) between check attempts
total_poll_attempts += 1
from time import sleep
sleep(SLEEP_DURATION_SECONDS)
runs = list(repo.get_workflow_runs(branch=pr.head.ref, head_sha=current_commit))
# if all runs have succeeded, break
if all([run.conclusion == "success" for run in runs]):
break
# if any of them have failed we retry
if any([run.conclusion == "failure" for run in runs]):
failed_runs = [
run for run in runs if run.conclusion == "failure"
]
failed_gha_logs: list[str] = get_failing_gha_logs(
failed_runs,
installation_id,
)
if failed_gha_logs:
# make edits to the PR
# TODO: look into rollbacks so we don't continue adding onto errors
cloned_repo = ClonedRepo( # reinitialize cloned_repo to avoid conflicts
repo_full_name,
installation_id=installation_id,
token=user_token,
repo=repo,
branch=pr.head.ref,
)
diffs = get_branch_diff_text(repo=repo, branch=pr.head.ref, base_branch=pr.base.ref)
problem_statement = f"{title}\n{message_summary}\n{replies_text}"
all_information_prompt = GHA_PROMPT.format(
problem_statement=problem_statement,
github_actions_logs=failed_gha_logs,
changes_made=diffs,
)
repo_context_manager = prep_snippets(cloned_repo=cloned_repo, query=(title + message_summary + replies_text).strip("\n"), ticket_progress=ticket_progress) # need to do this, can use the old query for speed
sweep_bot: SweepBot = construct_sweep_bot(
repo=repo,
repo_name=repo_name,
issue_url=issue_url,
repo_description=repo_description,
title="Fix the following errors to complete the user request.",
message_summary=all_information_prompt,
cloned_repo=cloned_repo,
ticket_progress=ticket_progress,
chat_logger=chat_logger,
snippets=snippets,
tree=tree,
comments=comments,
)
file_change_requests, plan = get_files_to_change_for_gha(
relevant_snippets=repo_context_manager.current_top_snippets,
read_only_snippets=repo_context_manager.read_only_snippets,
problem_statement=all_information_prompt,
updated_files=new_file_contents,
cloned_repo=cloned_repo,
chat_logger=chat_logger,
)
validate_file_change_requests(file_change_requests, cloned_repo)
previous_modify_files_dict: dict[str, dict[str, str | list[str]]] | None = None
_, commit, _ = sweep_bot.handle_modify_file_main(
branch=pr.head.ref,
assistant_conversation=None,
additional_messages=[],
previous_modify_files_dict=previous_modify_files_dict,
file_change_requests=file_change_requests,
username=username
)
current_commit = commit.sha
pr = repo.get_pull(pr.number) # IMPORTANT: resync PR otherwise you'll fetch old GHA runs
total_edit_attempts += 1
if total_edit_attempts >= GHA_MAX_EDIT_ATTEMPTS:
logger.info(f"Tried to edit PR {GHA_MAX_EDIT_ATTEMPTS} times, giving up.")
break
# if none of the runs have completed we wait and poll github
logger.info(
f"No Github Actions have failed yet and not all have succeeded yet, waiting for {SLEEP_DURATION_SECONDS} seconds before polling again..."
)
# break from main for loop
convert_pr_draft_field(pr, is_draft=False, installation_id=installation_id)
except MaxTokensExceeded as e:
logger.info("Max tokens exceeded")
ticket_progress.status = TicketProgressStatus.ERROR
ticket_progress.error_message = "Max tokens exceeded. Feel free to add more details to the issue description for Sweep to better address it, or alternatively, reach out to Kevin or William for help at https://discord.gg/sweep."
ticket_progress.save()
log_error(
is_paying_user,
is_consumer_tier,
username,
issue_url,
"Max Tokens Exceeded",
str(e) + "\n" + traceback.format_exc(),
priority=2,
)
if chat_logger and chat_logger.is_paying_user():
edit_sweep_comment(
(
f"Sorry, I could not edit `{e.filename}` as this file is too long."
" We are currently working on improved file streaming to address"
" this issue.\n"
),
-1,
)
else:
edit_sweep_comment(
(
f"Sorry, I could not edit `{e.filename}` as this file is too"
" long.\n\nIf this file is incorrect, please describe the desired"
" file in the prompt. However, if you would like to edit longer"
" files, consider upgrading to [Sweep Pro](https://sweep.dev/) for"
" longer context lengths.\n"
),
-1,
)
delete_branch = True
raise e
except NoFilesException as e:
ticket_progress.status = TicketProgressStatus.ERROR
ticket_progress.error_message = "Sweep could not find files to modify to address this issue. Feel free to add more details to the issue description for Sweep to better address it, or alternatively, reach out to Kevin or William for help at https://discord.gg/sweep."
ticket_progress.save()
logger.info("Sweep could not find files to modify")
log_error(
is_paying_user,
is_consumer_tier,
username,
issue_url,
"Sweep could not find files to modify",
str(e) + "\n" + traceback.format_exc(),
priority=2,
)
edit_sweep_comment(
(
"Sorry, Sweep could not find any appropriate files to edit to address"
" this issue. If this is a mistake, please provide more context and Sweep"
f" will retry!\n\n@{username}, please edit the issue description to"
" include more details. You can also ask for help on our community"
" forum: https://community.sweep.dev/"
),
-1,
)
delete_branch = True
raise e
except openai.BadRequestError as e:
ticket_progress.status = TicketProgressStatus.ERROR
ticket_progress.error_message = "Sorry, it looks like there is an error with communicating with OpenAI. If this error persists, reach out to Kevin or William for help at https://discord.gg/sweep."
ticket_progress.save()
logger.error(traceback.format_exc())
logger.error(e)
edit_sweep_comment(
(
"I'm sorry, but it looks like our model has run out of context length. We're"
" trying to make this happen less, but one way to mitigate this is to"
" code smaller files. If this error persists report it at"
" https://discord.gg/sweep."
),
-1,
)
log_error(
is_paying_user,
is_consumer_tier,
username,
issue_url,
"Context Length",
str(e) + "\n" + traceback.format_exc(),
priority=2,
)
posthog.capture(
username,
"failed",
properties={
"error": str(e),
"trace": traceback.format_exc(),
"reason": "Invalid request error / context length",
**metadata,
"duration": round(time() - on_ticket_start_time),
},
)
delete_branch = True
raise e
except AssistantRaisedException as e:
if ticket_progress is not None:
ticket_progress.status = TicketProgressStatus.ERROR
ticket_progress.error_message = f"Sweep raised an error with the following message: {e.message}. Feel free to add more details to the issue description for Sweep to better address it, or alternatively, reach out to Kevin or William for help at https://discord.gg/sweep."
ticket_progress.save()
logger.exception(e)
edit_sweep_comment(
f"Sweep raised an error with the following message:\n{blockquote(e.message)}",
-1,
)
log_error(
is_paying_user,
is_consumer_tier,
username,
issue_url,
"Workflow",
str(e) + "\n" + traceback.format_exc(),
priority=1,
)
raise e
except Exception as e:
ticket_progress.status = TicketProgressStatus.ERROR
ticket_progress.error_message = f"Internal server error: {str(e)}. Feel free to add more details to the issue description for Sweep to better address it, or alternatively, reach out to Kevin or William for help at https://discord.gg/sweep."
ticket_progress.save()
logger.error(traceback.format_exc())
logger.error(e)
# title and summary are defined elsewhere
if len(title + summary) < 60:
edit_sweep_comment(
(
"I'm sorry, but it looks like an error occurred due to"
f" a planning failure. The error message is {str(e)}. Feel free to add more details to the issue description"
" so Sweep can better address it. Alternatively, post on our community forum"
" for assistance: https://community.sweep.dev/"
),
-1,
)
else:
edit_sweep_comment(
(
"I'm sorry, but it looks like an error has occurred due to"
+ f" a planning failure. The error message is {str(e)}. Feel free to add more details to the issue description"
+ " so Sweep can better address it. Alternatively, reach out to Kevin or William for help at"
+ " https://discord.gg/sweep."
),
-1,
)
log_error(
is_paying_user,
is_consumer_tier,
username,
issue_url,
"Workflow",
str(e) + "\n" + traceback.format_exc(),
priority=1,
)
raise e
else:
try:
fire_and_forget_wrapper(remove_emoji)(content_to_delete="eyes")
fire_and_forget_wrapper(add_emoji)("rocket")
except SystemExit:
raise SystemExit
except Exception as e:
logger.error(e)
if delete_branch:
try:
if pull_request.branch_name.startswith("sweep"):
repo.get_git_ref(
f"heads/{pull_request.branch_name}"
).delete()
else:
raise Exception(
f"Branch name {pull_request.branch_name} does not start with sweep/"
)
except Exception as e:
logger.error(e)
logger.error(traceback.format_exc())
logger.info(f"Deleted branch {pull_request.branch_name}")
except Exception as e:
posthog.capture(
username,
"failed",
properties={
**metadata,
"error": str(e),
"trace": traceback.format_exc(),
"duration": round(time() - on_ticket_start_time),
},
)
raise e
posthog.capture(
username,
"success",
properties={**metadata, "duration": round(time() - on_ticket_start_time)},
)
logger.info("on_ticket success in " + str(round(time() - on_ticket_start_time)))
return {"success": True}
def handle_sandbox_mode(
title, repo_full_name, repo, ticket_progress, edit_sweep_comment
):
logger.info("Running in sandbox mode")
sweep_bot = SweepBot(repo=repo, ticket_progress=ticket_progress)
logger.info("Getting file contents")
file_name = title.split(":")[1].strip()
file_contents = sweep_bot.get_contents(file_name).decoded_content.decode("utf-8")
try:
ext = file_name.split(".")[-1]
except Exception:
ext = ""
file_contents.replace("```", "\`\`\`")
sha = repo.get_branch(repo.default_branch).commit.sha
permalink = f"https://github.com/{repo_full_name}/blob/{sha}/{file_name}#L1-L{len(file_contents.splitlines())}"
logger.info("Running sandbox")
edit_sweep_comment(
f"Running sandbox for {file_name}. Current Code:\n\n{permalink}",
1,
)
updated_contents, sandbox_response = sweep_bot.check_sandbox(
file_name, file_contents
)
logger.info("Sandbox finished")
logs = (
(
"<br/>"
+ create_collapsible(
"Sandbox logs",
blockquote(
"\n\n".join(
[
create_collapsible(
f"<code>{output}</code> {i + 1}/{len(sandbox_response.outputs)} {format_sandbox_success(sandbox_response.success)}",
f"<pre>{clean_logs(output)}</pre>",
i == len(sandbox_response.outputs) - 1,
)
for i, output in enumerate(sandbox_response.outputs)
if len(sandbox_response.outputs) > 0
]
)
),
opened=True,
)
)
if sandbox_response
else ""
)
updated_contents = updated_contents.replace("```", "\`\`\`")
diff = generate_diff(file_contents, updated_contents).replace("```", "\`\`\`")
diff_display = (
f"Updated Code:\n\n```{ext}\n{updated_contents}```\nDiff:\n```diff\n{diff}\n```"
if diff
else f"Sandbox made no changes to {file_name} (formatters were not configured or Sweep didn't make changes)."
)
edit_sweep_comment(
f"{logs}\n{diff_display}",
2,
)
edit_sweep_comment("N/A", 3)
logger.info("Sandbox comments updated")
def get_branch_diff_text(repo, branch, base_branch=None):
    base_branch = base_branch or SweepConfig.get_branch(repo)
    comparison = repo.compare(base_branch, branch)
    file_diffs = comparison.files
    pr_diffs = []
    for file in file_diffs:
        diff = file.patch
        if (
            file.status == "added"
            or file.status == "modified"
            or file.status == "removed"
        ):
            pr_diffs.append((file.filename, diff))
        else:
            logger.info(
                f"File status {file.status} not recognized"
            )  # TODO(sweep): We don't handle renamed files
    return "\n".join([f"{filename}\n{diff}" for filename, diff in pr_diffs])
def get_payment_messages(chat_logger: ChatLogger):
    if chat_logger:
        is_paying_user = chat_logger.is_paying_user()
        is_consumer_tier = chat_logger.is_consumer_tier()
        use_faster_model = chat_logger.use_faster_model()
    else:
        is_paying_user = True
        is_consumer_tier = False
        use_faster_model = False
    tracking_id = chat_logger.data["tracking_id"] if MONGODB_URI is not None else None
    # Find the first comment made by the bot
    tickets_allocated = 5
    if is_consumer_tier:
        tickets_allocated = 15
    if is_paying_user:
        tickets_allocated = 500
    purchased_ticket_count = (
        chat_logger.get_ticket_count(purchased=True) if chat_logger else 0
    )
    ticket_count = (
        max(tickets_allocated - chat_logger.get_ticket_count(), 0)
        + purchased_ticket_count
        if chat_logger
        else 999
    )
    daily_ticket_count = (
        (3 - chat_logger.get_ticket_count(use_date=True) if not use_faster_model else 0)
        if chat_logger
        else 999
    )
    model_name = "GPT-4"
    single_payment_link = "https://buy.stripe.com/00g3fh7qF85q0AE14d"
    pro_payment_link = "https://buy.stripe.com/00g5npeT71H2gzCfZ8"
    daily_message = (
        f" and {daily_ticket_count} for the day"
        if not is_paying_user and not is_consumer_tier
        else ""
    )
    user_type = "💎 <b>Sweep Pro</b>" if is_paying_user else "⚡ <b>Sweep Basic Tier</b>"
    gpt_tickets_left_message = (
        f"{ticket_count} GPT-4 tickets left for the month"
        if not is_paying_user
        else "unlimited GPT-4 tickets"
    )
    purchase_message = f"<br/><br/> For more GPT-4 tickets, visit <a href={single_payment_link}>our payment portal</a>. For a one week free trial, try <a href={pro_payment_link}>Sweep Pro</a> (unlimited GPT-4 tickets)."
    payment_message = (
        f"{user_type}: I used {model_name} to create this ticket. You have {gpt_tickets_left_message}{daily_message}. (tracking ID: <code>{tracking_id}</code>)"
        + (purchase_message if not is_paying_user else "")
    )
    payment_message_start = (
        f"{user_type}: I'm using {model_name}. You have {gpt_tickets_left_message}{daily_message}. (tracking ID: <code>{tracking_id}</code>)"
        + (purchase_message if not is_paying_user else "")
    )


Step 2: ⌨️ Coding

Modify sweepai/config/server.py with contents: Add a new environment variable for the Slack API key.

<original_code>
RESEND_API_KEY = os.environ.get("RESEND_API_KEY", None)
LOKI_URL = None

DEBUG = os.environ.get("DEBUG", "false").lower() == "true"
ENV = "prod" if GITHUB_BOT_USERNAME != TEST_BOT_NAME else "dev"

PROGRESS_BASE_URL = os.environ.get(
"PROGRESS_BASE_URL", "https://progress.sweep.dev"
).rstrip("/")

DISABLED_REPOS = os.environ.get("DISABLED_REPOS", "").split(",")

GHA_AUTOFIX_ENABLED: bool = os.environ.get("GHA_AUTOFIX_ENABLED", False)
MERGE_CONFLICT_ENABLED: bool = os.environ.get("MERGE_CONFLICT_ENABLED", False)
INSTALLATION_ID = os.environ.get("INSTALLATION_ID", None)

AWS_ACCESS_KEY=os.environ.get("AWS_ACCESS_KEY")
AWS_SECRET_KEY=os.environ.get("AWS_SECRET_KEY")
AWS_REGION=os.environ.get("AWS_REGION")
ANTHROPIC_AVAILABLE = AWS_ACCESS_KEY and AWS_SECRET_KEY and AWS_REGION
</original_code>

<new_code>
RESEND_API_KEY = os.environ.get("RESEND_API_KEY", None)
LOKI_URL = None

DEBUG = os.environ.get("DEBUG", "false").lower() == "true"
ENV = "prod" if GITHUB_BOT_USERNAME != TEST_BOT_NAME else "dev"

PROGRESS_BASE_URL = os.environ.get(
"PROGRESS_BASE_URL", "https://progress.sweep.dev"
).rstrip("/")

DISABLED_REPOS = os.environ.get("DISABLED_REPOS", "").split(",")

GHA_AUTOFIX_ENABLED: bool = os.environ.get("GHA_AUTOFIX_ENABLED", False)
MERGE_CONFLICT_ENABLED: bool = os.environ.get("MERGE_CONFLICT_ENABLED", False)
INSTALLATION_ID = os.environ.get("INSTALLATION_ID", None)

AWS_ACCESS_KEY=os.environ.get("AWS_ACCESS_KEY")
AWS_SECRET_KEY=os.environ.get("AWS_SECRET_KEY")
AWS_REGION=os.environ.get("AWS_REGION")
ANTHROPIC_AVAILABLE = AWS_ACCESS_KEY and AWS_SECRET_KEY and AWS_REGION

SLACK_API_KEY = os.environ.get("SLACK_API_KEY", None)
</new_code>
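
Because `SLACK_API_KEY` defaults to `None`, any code that uses it probably needs to handle the unset case before building a Slack client. The following is a minimal sketch of such a check, not Sweep's actual handling; `get_slack_token` is a hypothetical helper name introduced only for illustration.

```python
import os
from typing import Mapping, Optional


def get_slack_token(env: Mapping[str, str] = os.environ) -> Optional[str]:
    """Return the configured Slack token, or None when it is unset or blank.

    Hypothetical helper for illustration; it only reads SLACK_API_KEY,
    the variable added to server.py above.
    """
    token = env.get("SLACK_API_KEY")
    if token is None:
        return None
    # Treat whitespace-only values the same as an unset variable.
    token = token.strip()
    return token or None
```

A caller could skip Slack unrolling entirely when this returns `None`, so deployments without a Slack key keep working as before.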

  • Modify sweepai/handlers/on_ticket.py (7f526be)
Modify sweepai/handlers/on_ticket.py with contents: Import the necessary modules for handling Slack links and authentication.

<original_code>
import difflib
import io
import os
import re
import traceback
from typing import Any
import zipfile
from time import time

import markdown
import openai
import requests
import yaml
import yamllint.config as yamllint_config
from github import BadCredentialsException, Github, Repository
from github.Issue import Issue
from github.PullRequest import PullRequest as GithubPullRequest
from loguru import logger
from tabulate import tabulate
from tqdm import tqdm
from yamllint import linter
</original_code>

<new_code>
import difflib
import io
import os
import re
import traceback
from typing import Any
import zipfile
from time import time

import markdown
import openai
import requests
import yaml
import yamllint.config as yamllint_config
from github import BadCredentialsException, Github, Repository
from github.Issue import Issue
from github.PullRequest import PullRequest as GithubPullRequest
from loguru import logger
from tabulate import tabulate
from tqdm import tqdm
from yamllint import linter

from slack_sdk import WebClient
</new_code>
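
The `WebClient` imported here exposes `conversations_replies`, which Slack paginates with a cursor, so a long thread may need more than one call. Below is a rough sketch of collecting all pages, assuming a hypothetical helper `fetch_thread_messages` (not part of this change) and a client object that behaves like `slack_sdk.WebClient`. The de-duplication by `ts` is a defensive assumption in case a page repeats a message.

```python
from typing import Any, Dict, List


def fetch_thread_messages(
    client: Any, channel: str, ts: str, page_size: int = 200
) -> List[Dict[str, Any]]:
    """Collect every message in a thread by following cursor pagination.

    `client` is assumed to expose
    conversations_replies(channel=..., ts=..., cursor=..., limit=...)
    the way slack_sdk's WebClient does.
    """
    messages: List[Dict[str, Any]] = []
    seen = set()
    cursor = None
    while True:
        response = client.conversations_replies(
            channel=channel, ts=ts, cursor=cursor, limit=page_size
        )
        for message in response.get("messages", []):
            key = message.get("ts")
            # Skip messages already collected from an earlier page.
            if key is not None and key in seen:
                continue
            if key is not None:
                seen.add(key)
            messages.append(message)
        metadata = response.get("response_metadata") or {}
        cursor = metadata.get("next_cursor")
        # Stop when Slack reports no further pages or gives no cursor.
        if not response.get("has_more") or not cursor:
            break
    return messages
```

Taking the client as a parameter keeps the helper testable with a stub, without a real Slack token.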

  • Modify sweepai/handlers/on_ticket.py (7f526be)
Modify sweepai/handlers/on_ticket.py with contents: Extract the Slack link from the issue summary and authenticate with the user's Slack account.
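
One detail worth spelling out for this step: a Slack permalink encodes the message timestamp as `p` followed by the digits with the dot removed, while the API expects the dotted form (seconds, a dot, then six digits). The sketch below shows the conversion and a simple way to flatten a thread into text. It assumes 16-digit permalink timestamps, and the names `parse_slack_permalink` and `format_thread` are hypothetical. Permalinks to replies usually carry a `thread_ts` query parameter, which this sketch ignores.

```python
import re
from typing import Dict, Iterable, Optional, Tuple

# Assumed permalink shape:
# https://<workspace>.slack.com/archives/<channel>/p<16 digits>
# Workspace subdomains may contain hyphens, hence [\w-]+.
SLACK_PERMALINK = re.compile(
    r"https://[\w-]+\.slack\.com/archives/(?P<channel>\w+)/p(?P<digits>\d{16})"
)


def parse_slack_permalink(text: str) -> Optional[Tuple[str, str]]:
    """Return (channel_id, ts) for the first Slack permalink in text, else None."""
    match = SLACK_PERMALINK.search(text)
    if match is None:
        return None
    digits = match.group("digits")
    # Restore the dot before the last six digits to get the API timestamp.
    return match.group("channel"), f"{digits[:-6]}.{digits[-6:]}"


def format_thread(messages: Iterable[Dict[str, str]]) -> str:
    """Flatten thread messages into one text block, one message per line."""
    lines = []
    for message in messages:
        text = message.get("text", "").strip()
        if text:
            # Fall back to "unknown" when a message has no user field.
            lines.append(f"{message.get('user', 'unknown')}: {text}")
    return "\n".join(lines)
```

Keeping the parsing separate from the API call makes the link handling easy to unit test without network access.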

<original_code>
summary = summary or ""
summary = re.sub(
"<details (open)?>(\r)?\n<summary>Checklist</summary>.*",
"",
summary,
flags=re.DOTALL,
).strip()
summary = re.sub(
"---\s+Checklist:(\r)?\n(\r)?\n- \[[ X]\].*",
"",
summary,
flags=re.DOTALL,
).strip()
summary = re.sub(
"### Details\n\n_No response_", "", summary, flags=re.DOTALL
)
summary = re.sub("\n\n", "\n", summary, flags=re.DOTALL)
repo_name = repo_full_name
user_token, g = get_github_client(installation_id)
repo = g.get_repo(repo_full_name)
current_issue: Issue = repo.get_issue(number=issue_number)
assignee = current_issue.assignee.login if current_issue.assignee else None
if assignee is None:
assignee = current_issue.user.login
</original_code>

<new_code>
summary = summary or ""
summary = re.sub(
    "<details (open)?>(\r)?\n<summary>Checklist</summary>.*",
    "",
    summary,
    flags=re.DOTALL,
).strip()
summary = re.sub(
    "---\s+Checklist:(\r)?\n(\r)?\n- \[[ X]\].*",
    "",
    summary,
    flags=re.DOTALL,
).strip()
summary = re.sub(
    "### Details\n\n_No response_", "", summary, flags=re.DOTALL
)
summary = re.sub("\n\n", "\n", summary, flags=re.DOTALL)

slack_link_pattern = re.compile(r'https://\w+\.slack\.com/archives/\w+/p\d+')
slack_link_match = slack_link_pattern.search(summary)

if slack_link_match:
    slack_link = slack_link_match.group()
    slack_client = WebClient(token=os.environ["SLACK_API_KEY"])
    
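    # Extract the channel ID and message timestamp from the Slack link.
    # The permalink encodes the timestamp as "p" + digits with the dot removed,
    # while conversations_replies expects "seconds.microseconds".
    link_parts = slack_link.split('/')
    channel_id = link_parts[-2]
    raw_ts = link_parts[-1].lstrip('p')
    message_ts = f"{raw_ts[:-6]}.{raw_ts[-6:]}"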

    # Fetch the Slack message thread
    thread_response = slack_client.conversations_replies(
        channel=channel_id,
        ts=message_ts
    )
    
    # Extract relevant information from the Slack thread
    slack_thread_info = ""
    for message in thread_response["messages"]:
        slack_thread_info += f"{message['text']}\n"
    
    # Append the Slack thread information to the issue summary
    summary += f"\n\nSlack Thread:\n{slack_thread_info}"

repo_name = repo_full_name
user_token, g = get_github_client(installation_id)
repo = g.get_repo(repo_full_name)
current_issue: Issue = repo.get_issue(number=issue_number)
assignee = current_issue.assignee.login if current_issue.assignee else None
if assignee is None:
    assignee = current_issue.user.login

</new_code>
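
The link handling in the change above can be sketched in isolation, without calling Slack. This assumes permalinks of the form `https://<workspace>.slack.com/archives/<channel>/p<digits>`, where the last six digits are the fractional part of the message timestamp. `parse_slack_permalink` and `format_thread` are hypothetical helper names, and the message dicts only mimic the `text` field used in the proposed code.

```python
import re
from typing import Optional

# Assumed permalink shape: https://<workspace>.slack.com/archives/<channel>/p<digits>
SLACK_PERMALINK = re.compile(
    r"https://[\w-]+\.slack\.com/archives/(?P<channel>\w+)/p(?P<ts>\d{7,})"
)


def parse_slack_permalink(text: str) -> Optional[tuple[str, str]]:
    """Return (channel_id, ts) for the first Slack permalink in text, or None."""
    match = SLACK_PERMALINK.search(text)
    if match is None:
        return None
    digits = match.group("ts")
    # The permalink drops the dot; the last six digits are the fractional part.
    return match.group("channel"), f"{digits[:-6]}.{digits[-6:]}"


def format_thread(messages: list[dict]) -> str:
    """Join message texts, skipping entries that have no "text" field."""
    return "\n".join(m["text"] for m in messages if m.get("text"))


if __name__ == "__main__":
    body = "Context: https://acme.slack.com/archives/C0123ABC/p1714650000123456"
    print(parse_slack_permalink(body))
    print(format_thread([{"text": "first"}, {"subtype": "joined"}, {"text": "second"}]))
```

Keeping the parsing and formatting as pure functions means they can be unit-tested without a Slack token; only the `conversations_replies` call itself needs the network.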


Step 3: 🔁 Code Review

I have finished reviewing the code for completeness. I did not find errors for sweep/augment_on_ticket_so_that_when_a_user_ad_a3c2f.



💡 To recreate the pull request, edit the issue title or description.
Something wrong? Let us know.

This is an automated message generated by Sweep AI.

Contributor

sweep-nightly bot commented May 2, 2024

Sweeping

✨ Track Sweep's progress on our progress dashboard!


50%

💎 Sweep Pro: I'm using GPT-4. You have unlimited GPT-4 tickets. (tracking ID: a1b68d3b50)


Step 1: 🔎 Searching

I found the following snippets in your repository. I will now analyze these snippets and come up with a plan.

Some code snippets I think are relevant in decreasing order of relevance (click to expand). If some file is missing from here, you can mention the path in the ticket description.

"""
on_ticket is the main function that is called when a new issue is created.
It is only called by the webhook handler in sweepai/api.py.
"""
import difflib
import io
import os
import re
import traceback
from typing import Any
import zipfile
from time import time
import markdown
import openai
import requests
import yaml
import yamllint.config as yamllint_config
from github import BadCredentialsException, Github, Repository
from github.Issue import Issue
from github.PullRequest import PullRequest as GithubPullRequest
from loguru import logger
from tabulate import tabulate
from tqdm import tqdm
from yamllint import linter
from sweepai.core.sweep_bot import GHA_PROMPT
from sweepai.agents.pr_description_bot import PRDescriptionBot
from sweepai.agents.image_description_bot import ImageDescriptionBot
from sweepai.config.client import (
RESET_FILE,
RESTART_SWEEP_BUTTON,
REVERT_CHANGED_FILES_TITLE,
SweepConfig,
get_documentation_dict,
get_gha_enabled,
)
from sweepai.config.server import (
DEPLOYMENT_GHA_ENABLED,
ENV,
GITHUB_LABEL_NAME,
IS_SELF_HOSTED,
MONGODB_URI,
PROGRESS_BASE_URL,
)
from sweepai.core.entities import (
AssistantRaisedException,
FileChangeRequest,
MaxTokensExceeded,
NoFilesException,
PullRequest,
SandboxResponse,
)
from sweepai.core.entities import create_error_logs as entities_create_error_logs
from sweepai.core.pr_reader import PRReader
from sweepai.core.sweep_bot import SweepBot, get_files_to_change, get_files_to_change_for_gha, validate_file_change_requests
from sweepai.handlers.create_pr import (
create_config_pr,
create_pr_changes,
safe_delete_sweep_branch,
)
from sweepai.handlers.on_check_suite import clean_gh_logs
from sweepai.utils.image_utils import get_image_contents_from_urls, get_image_urls_from_issue
from sweepai.utils.issue_validator import validate_issue
from sweepai.utils.validate_license import validate_license
from sweepai.utils.buttons import Button, ButtonList, create_action_buttons
from sweepai.utils.chat_logger import ChatLogger
from sweepai.utils.diff import generate_diff
from sweepai.utils.event_logger import posthog
from sweepai.utils.github_utils import (
CURRENT_USERNAME,
ClonedRepo,
convert_pr_draft_field,
get_github_client,
get_token,
sanitize_string_for_github,
)
from sweepai.utils.progress import (
AssistantConversation,
PaymentContext,
TicketContext,
TicketProgress,
TicketProgressStatus,
)
from sweepai.utils.prompt_constructor import HumanMessagePrompt
from sweepai.utils.str_utils import (
BOT_SUFFIX,
FASTER_MODEL_MESSAGE,
UPDATES_MESSAGE,
blockquote,
bot_suffix,
checkbox_template,
clean_logs,
collapsible_template,
create_checkbox,
create_collapsible,
discord_suffix,
format_sandbox_success,
get_hash,
sep,
stars_suffix,
strip_sweep,
to_branch_name,
)
from sweepai.utils.ticket_utils import (
center,
fetch_relevant_files,
fire_and_forget_wrapper,
log_error,
prep_snippets,
)
from sweepai.utils.user_settings import UserSettings
# from sandbox.sandbox_utils import Sandbox
sweeping_gif = """<a href="https://github.com/sweepai/sweep"><img class="swing" src="https://raw.githubusercontent.com/sweepai/sweep/main/.assets/sweeping.gif" width="100" style="width:50px; margin-bottom:10px" alt="Sweeping"></a>"""
custom_config = """
extends: relaxed
rules:
    line-length: disable
    indentation: disable
"""
INSTRUCTIONS_FOR_REVIEW = """\
### 💡 To get Sweep to edit this pull request, you can:
* Comment below, and Sweep can edit the entire PR
* Comment on a file, Sweep will only modify the commented file
* Edit the original issue to get Sweep to recreate the PR from scratch"""
email_template = """Hey {name},
<br/><br/>
🚀 I just finished creating a pull request for your issue ({repo_full_name}#{issue_number}) at <a href="{pr_url}">{repo_full_name}#{pr_number}</a>!
<br/><br/>
You can view how I created this pull request <a href="{progress_url}">here</a>.
<h2>Summary</h2>
<blockquote>
{summary}
</blockquote>
<h2>Files Changed</h2>
<ul>
{files_changed}
</ul>
{sweeping_gif}
<br/>
Cheers,
<br/>
Sweep
<br/>"""
FAILING_GITHUB_ACTION_PROMPT = """\
The following Github Actions failed on a previous attempt at fixing this issue.
Propose a fix to the failing github actions. You must edit the source code, not the github action itself.
{github_action_log}
"""
# Add :eyes: emoji to ticket
def add_emoji(issue: Issue, comment_id: int = None, reaction_content="eyes"):
    item_to_react_to = issue.get_comment(comment_id) if comment_id else issue
    item_to_react_to.create_reaction(reaction_content)
# If SWEEP_BOT reacted to item_to_react_to with "rocket", then remove it.
def remove_emoji(issue: Issue, comment_id: int = None, content_to_delete="eyes"):
    item_to_react_to = issue.get_comment(comment_id) if comment_id else issue
    reactions = item_to_react_to.get_reactions()
    for reaction in reactions:
        if (
            reaction.content == content_to_delete
            and reaction.user.login == CURRENT_USERNAME
        ):
            item_to_react_to.delete_reaction(reaction.id)
def create_error_logs(
    commit_url_display: str,
    sandbox_response: SandboxResponse,
    status: str = "✓",
):
    return (
        (
            "<br/>"
            + create_collapsible(
                f"Sandbox logs for {commit_url_display} {status}",
                blockquote(
                    "\n\n".join(
                        [
                            create_collapsible(
                                f"<code>{output}</code> {i + 1}/{len(sandbox_response.outputs)} {format_sandbox_success(sandbox_response.success)}",
                                f"<pre>{clean_logs(output)}</pre>",
                                i == len(sandbox_response.outputs) - 1,
                            )
                            for i, output in enumerate(sandbox_response.outputs)
                            if len(sandbox_response.outputs) > 0
                        ]
                    )
                ),
                opened=True,
            )
        )
        if sandbox_response
        else ""
    )
# takes in a list of workflow runs and returns a list of messages containing the logs of the failing runs
def get_failing_gha_logs(runs, installation_id) -> str:
    token = get_token(installation_id)
    all_logs = ""
    for run in runs:
        # jobs_url
        jobs_url = run.jobs_url
        jobs_response = requests.get(
            jobs_url,
            headers={
                "Accept": "application/vnd.github+json",
                "Authorization": f"Bearer {token}",
                "X-GitHub-Api-Version": "2022-11-28",
            },
        )
        if jobs_response.status_code == 200:
            failed_jobs = []
            jobs = jobs_response.json()["jobs"]
            for job in jobs:
                if job["conclusion"] == "failure":
                    failed_jobs.append(job)
            failed_jobs_name_list = []
            for job in failed_jobs:
                # add failed steps
                for step in job["steps"]:
                    if step["conclusion"] == "failure":
                        failed_jobs_name_list.append(
                            f"{job['name']}/{step['number']}_{step['name']}"
                        )
        else:
            logger.error(
                "Failed to get jobs for failing github actions, possible a credentials issue"
            )
            return all_logs
        # make sure jobs in valid
        if jobs_response.json()['total_count'] == 0:
            logger.error(f"no jobs for this run: {run}, continuing...")
            continue
        # logs url
        logs_url = run.logs_url
        logs_response = requests.get(
            logs_url,
            headers={
                "Accept": "application/vnd.github+json",
                "Authorization": f"Bearer {token}",
                "X-GitHub-Api-Version": "2022-11-28",
            },
            allow_redirects=True,
        )
        # Check if the request was successful
        if logs_response.status_code == 200:
            zip_data = io.BytesIO(logs_response.content)
            zip_file = zipfile.ZipFile(zip_data, "r")
            zip_file_names = zip_file.namelist()
            for file in failed_jobs_name_list:
                if f"{file}.txt" in zip_file_names:
                    logs = zip_file.read(f"{file}.txt").decode("utf-8")
                    logs_prompt = clean_gh_logs(logs)
                    all_logs += logs_prompt + "\n"
        else:
            logger.error(
                "Failed to get logs for failing github actions, likely a credentials issue"
            )
    return all_logs
def delete_old_prs(repo: Repository, issue_number: int):
    logger.info("Deleting old PRs...")
    prs = repo.get_pulls(
        state="open",
        sort="created",
        direction="desc",
        base=SweepConfig.get_branch(repo),
    )
    for pr in tqdm(prs.get_page(0)):
        # # Check if this issue is mentioned in the PR, and pr is owned by bot
        # # This is done in create_pr, (pr_description = ...)
        if pr.user.login == CURRENT_USERNAME and f"Fixes #{issue_number}.\n" in pr.body:
            safe_delete_sweep_branch(pr, repo)
            break
def construct_sweep_bot(
    repo: Repository,
    repo_name: str,
    issue_url: str,
    repo_description: str,
    title: str,
    message_summary: str,
    cloned_repo: ClonedRepo,
    ticket_progress: TicketProgress,
    chat_logger: ChatLogger,
    snippets: Any = None,
    tree: Any = None,
    comments: Any = None,
) -> SweepBot:
    human_message = HumanMessagePrompt(
        repo_name=repo_name,
        issue_url=issue_url,
        repo_description=repo_description.strip(),
        title=title,
        summary=message_summary,
        snippets=snippets,
        tree=tree,
    )
    sweep_bot = SweepBot.from_system_message_content(
        human_message=human_message,
        repo=repo,
        is_reply=bool(comments),
        chat_logger=chat_logger,
        cloned_repo=cloned_repo,
        ticket_progress=ticket_progress,
    )
    return sweep_bot
def get_comment_header(
index: int,
g: Github,
repo_full_name: str,
user_settings: UserSettings,
progress_headers: list[None | str],
tracking_id: str | None,
payment_message_start: str,
user_settings_message: str,
errored: bool = False,
pr_message: str = "",
done: bool = False,
initial_sandbox_response: int | SandboxResponse = -1,
initial_sandbox_response_file=None,
config_pr_url: str | None = None,
):
config_pr_message = (
"\n"
+ f"<div align='center'>Install Sweep Configs: <a href='{config_pr_url}'>Pull Request</a></div>"
if config_pr_url is not None
else ""
)
actions_message = create_action_buttons(
[
RESTART_SWEEP_BUTTON,
]
)
sandbox_execution_message = "\n\n## GitHub Actions failed\n\nThe sandbox appears to be unavailable or down.\n\n"
if initial_sandbox_response == -1:
sandbox_execution_message = ""
elif initial_sandbox_response is not None:
repo = g.get_repo(repo_full_name)
commit_hash = repo.get_commits()[0].sha
success = initial_sandbox_response.outputs and initial_sandbox_response.success
status = "✓" if success else "X"
sandbox_execution_message = (
"\n\n## GitHub Actions"
+ status
+ "\n\nHere are the GitHub Actions logs prior to making any changes:\n\n"
)
sandbox_execution_message += entities_create_error_logs(
f'<a href="https://github.com/{repo_full_name}/commit/{commit_hash}"><code>{commit_hash[:7]}</code></a>',
initial_sandbox_response,
initial_sandbox_response_file,
)
if success:
sandbox_execution_message += f"\n\nSandbox passed on the latest `{repo.default_branch}`, so sandbox checks will be enabled for this issue."
else:
sandbox_execution_message += "\n\nSandbox failed, so all sandbox checks will be disabled for this issue."
if index < 0:
index = 0
if index == 4:
return (
pr_message
+ config_pr_message
+ f"\n\n---\n{user_settings.get_message(completed=True)}"
+ f"\n\n---\n{actions_message}"
+ sandbox_execution_message
)
total = len(progress_headers)
index += 1 if done else 0
index *= 100 / total
index = int(index)
index = min(100, index)
if errored:
pbar = f"\n\n<img src='https://progress-bar.dev/{index}/?&title=Errored&width=600' alt='{index}%' />"
return (
f"{center(sweeping_gif)}<br/>{center(pbar)}\n\n"
+ f"\n\n---\n{actions_message}"
+ sandbox_execution_message
)
pbar = f"\n\n<img src='https://progress-bar.dev/{index}/?&title=Progress&width=600' alt='{index}%' />"
return (
f"{center(sweeping_gif)}"
+ (
center(
f'\n\n<h2>✨ Track Sweep\'s progress on our <a href="{PROGRESS_BASE_URL}/issues/{tracking_id}">progress dashboard</a>!</h2>'
)
if MONGODB_URI is not None
else ""
)
+ f"<br/>{center(pbar)}"
+ ("\n" + stars_suffix if index != -1 else "")
+ "\n"
+ center(payment_message_start)
+ f"\n\n---\n{user_settings_message}"
+ config_pr_message
+ f"\n\n---\n{actions_message}"
+ sandbox_execution_message
)
def on_ticket(
title: str,
summary: str,
issue_number: int,
issue_url: str, # purely for logging purposes
username: str,
repo_full_name: str,
repo_description: str,
installation_id: int,
comment_id: int = None,
edited: bool = False,
tracking_id: str | None = None,
):
if not os.environ.get("CLI"):
assert validate_license(), "License key is invalid or expired. Please contact us at team@sweep.dev to upgrade to an enterprise license."
with logger.contextualize(
tracking_id=tracking_id,
):
if tracking_id is None:
tracking_id = get_hash()
on_ticket_start_time = time()
logger.info(f"Starting on_ticket with title {title} and summary {summary}")
(
title,
slow_mode,
do_map,
subissues_mode,
sandbox_mode,
fast_mode,
lint_mode,
) = strip_sweep(title)
# fetch images from body of issue
image_urls = get_image_urls_from_issue(issue_number, repo_full_name, installation_id)
image_contents = get_image_contents_from_urls(image_urls)
summary = summary or ""
summary = re.sub(
"<details (open)?>(\r)?\n<summary>Checklist</summary>.*",
"",
summary,
flags=re.DOTALL,
).strip()
summary = re.sub(
"---\s+Checklist:(\r)?\n(\r)?\n- \[[ X]\].*",
"",
summary,
flags=re.DOTALL,
).strip()
summary = re.sub(
"### Details\n\n_No response_", "", summary, flags=re.DOTALL
)
summary = re.sub("\n\n", "\n", summary, flags=re.DOTALL)
repo_name = repo_full_name
user_token, g = get_github_client(installation_id)
repo = g.get_repo(repo_full_name)
current_issue: Issue = repo.get_issue(number=issue_number)
assignee = current_issue.assignee.login if current_issue.assignee else None
if assignee is None:
assignee = current_issue.user.login
ticket_progress = TicketProgress(
tracking_id=tracking_id,
username=username,
context=TicketContext(
title=title,
description=summary,
repo_full_name=repo_full_name,
issue_number=issue_number,
is_public=repo.private is False,
start_time=int(time()),
),
)
branch_match = re.search(
r"([B|b]ranch:) *(?P<branch_name>.+?)(\s|$)", summary
)
overrided_branch_name = None
if branch_match and "branch_name" in branch_match.groupdict():
overrided_branch_name = (
branch_match.groupdict()["branch_name"].strip().strip("`\"'")
)
# TODO: this code might be finicky, might have missed edge cases
if overrided_branch_name.startswith("https://github.com/"):
overrided_branch_name = overrided_branch_name.split("?")[0].split(
"tree/"
)[-1]
SweepConfig.get_branch(repo, overrided_branch_name)
chat_logger = (
ChatLogger(
{
"repo_name": repo_name,
"title": title,
"summary": summary,
"issue_number": issue_number,
"issue_url": issue_url,
"username": (
username if not username.startswith("sweep") else assignee
),
"repo_full_name": repo_full_name,
"repo_description": repo_description,
"installation_id": installation_id,
"type": "ticket",
"mode": ENV,
"comment_id": comment_id,
"edited": edited,
"tracking_id": tracking_id,
},
active=True,
)
if MONGODB_URI
else None
)
if chat_logger and not IS_SELF_HOSTED:
is_paying_user = chat_logger.is_paying_user()
is_consumer_tier = chat_logger.is_consumer_tier()
use_faster_model = chat_logger.use_faster_model()
else:
is_paying_user = True
is_consumer_tier = False
use_faster_model = False
if use_faster_model:
raise Exception(FASTER_MODEL_MESSAGE)
if fast_mode:
use_faster_model = True
if not comment_id and not edited and chat_logger and not sandbox_mode:
fire_and_forget_wrapper(chat_logger.add_successful_ticket)(
gpt3=use_faster_model
)
organization, repo_name = repo_full_name.split("/")
metadata = {
"issue_url": issue_url,
"repo_full_name": repo_full_name,
"organization": organization,
"repo_name": repo_name,
"repo_description": repo_description,
"username": username,
"comment_id": comment_id,
"title": title,
"installation_id": installation_id,
"function": "on_ticket",
"edited": edited,
"model": "gpt-3.5" if use_faster_model else "gpt-4",
"tier": "pro" if is_paying_user else "free",
"mode": ENV,
"slow_mode": slow_mode,
"do_map": do_map,
"subissues_mode": subissues_mode,
"sandbox_mode": sandbox_mode,
"fast_mode": fast_mode,
"is_self_hosted": IS_SELF_HOSTED,
"tracking_id": tracking_id,
}
fire_and_forget_wrapper(posthog.capture)(
username, "started", properties=metadata
)
try:
if current_issue.state == "closed":
fire_and_forget_wrapper(posthog.capture)(
username,
"issue_closed",
properties={
**metadata,
"duration": round(time() - on_ticket_start_time),
},
)
return {"success": False, "reason": "Issue is closed"}
fire_and_forget_wrapper(add_emoji)(current_issue, comment_id)
fire_and_forget_wrapper(remove_emoji)(
current_issue, comment_id, content_to_delete="rocket"
)
fire_and_forget_wrapper(remove_emoji)(
current_issue, comment_id, content_to_delete="confused"
)
fire_and_forget_wrapper(current_issue.edit)(body=summary)
replies_text = ""
summary = summary if summary else ""
fire_and_forget_wrapper(delete_old_prs)(repo, issue_number)
if not sandbox_mode:
progress_headers = [
None,
"Step 1: 🔎 Searching",
"Step 2: ⌨️ Coding",
"Step 3: 🔁 Code Review",
]
else:
progress_headers = [
None,
"📖 Reading File",
"🛠️ Executing Sandbox",
]
issue_comment = None
payment_message, payment_message_start = get_payment_messages(
chat_logger
)
ticket_progress.context.payment_context = PaymentContext(
use_faster_model=use_faster_model,
pro_user=is_paying_user,
daily_tickets_used=(
chat_logger.get_ticket_count(use_date=True)
if chat_logger
else 0
),
monthly_tickets_used=(
chat_logger.get_ticket_count() if chat_logger else 0
),
)
ticket_progress.save()
config_pr_url = None
user_settings = UserSettings.from_username(username=username)
user_settings_message = user_settings.get_message()
cloned_repo = ClonedRepo(
repo_full_name,
installation_id=installation_id,
token=user_token,
repo=repo,
branch=overrided_branch_name,
)
# check that repo's directory is non-empty
if os.listdir(cloned_repo.cached_dir) == []:
logger.info("Empty repo")
first_comment = (
"Sweep is currently not supported on empty repositories. Please add some"
f" code to your repository and try again.\n{sep}##"
f" {progress_headers[1]}\n{bot_suffix}{discord_suffix}"
)
if issue_comment is None:
issue_comment = current_issue.create_comment(
first_comment + BOT_SUFFIX
)
else:
issue_comment.edit(first_comment + BOT_SUFFIX)
fire_and_forget_wrapper(add_emoji)(
current_issue, comment_id, reaction_content="confused"
)
fire_and_forget_wrapper(remove_emoji)(content_to_delete="eyes")
return {"success": False}
indexing_message = (
"I'm searching for relevant snippets in your repository. If this is your first"
" time using Sweep, I'm indexing your repository. You can monitor the progress using the progress dashboard"
)
first_comment = (
f"{get_comment_header(0, g, repo_full_name, user_settings, progress_headers, tracking_id, payment_message_start, user_settings_message)}\n{sep}I am currently looking into this ticket! I"
" will update the progress of the ticket in this comment. I am currently"
f" searching through your code, looking for relevant snippets.\n{sep}##"
f" {progress_headers[1]}\n{indexing_message}{bot_suffix}{discord_suffix}"
)
# Find Sweep's previous comment
comments = []
for comment in current_issue.get_comments():
comments.append(comment)
if comment.user.login == CURRENT_USERNAME:
issue_comment = comment
break
if issue_comment is None:
issue_comment = current_issue.create_comment(first_comment)
else:
fire_and_forget_wrapper(issue_comment.edit)(first_comment)
old_edit = issue_comment.edit
issue_comment.edit = lambda msg: old_edit(msg + BOT_SUFFIX)
past_messages = {}
current_index = 0
table = None
initial_sandbox_response = -1
initial_sandbox_response_file = None
def refresh_token():
user_token, g = get_github_client(installation_id)
repo = g.get_repo(repo_full_name)
return user_token, g, repo
def edit_sweep_comment(
message: str,
index: int,
pr_message="",
done=False,
add_bonus_message=True,
):
nonlocal current_index, user_token, g, repo, issue_comment, initial_sandbox_response, initial_sandbox_response_file
message = sanitize_string_for_github(message)
if pr_message:
pr_message = sanitize_string_for_github(pr_message)
# -1 = error, -2 = retry
# Only update the progress bar if the issue generation errors.
errored = index == -1
if index >= 0:
past_messages[index] = message
current_index = index
agg_message = None
# Include progress history
# index = -2 is reserved for
for i in range(
current_index + 2
): # go to next header (for Working on it... text)
if i == 0 or i >= len(progress_headers):
continue # skip None header
header = progress_headers[i]
if header is not None:
header = "## " + header + "\n"
else:
header = "No header\n"
msg = header + (past_messages.get(i) or "Working on it...")
if agg_message is None:
agg_message = msg
else:
agg_message = agg_message + f"\n{sep}" + msg
suffix = bot_suffix + discord_suffix
if errored:
agg_message = (
"## ❌ Unable to Complete PR"
+ "\n"
+ message
+ (
"\n\nFor bonus GPT-4 tickets, please report this bug on"
f" **[Discord](https://discord.gg/invite/sweep)** (tracking ID: `{tracking_id}`)."
if add_bonus_message
else ""
)
)
if table is not None:
agg_message = (
agg_message
+ f"\n{sep}Please look at the generated plan. If something looks"
f" wrong, please add more details to your issue.\n\n{table}"
)
suffix = bot_suffix # don't include discord suffix for error messages
# Update the issue comment
msg = f"{get_comment_header(current_index, g, repo_full_name, user_settings, progress_headers, tracking_id, payment_message_start, user_settings_message, errored=errored, pr_message=pr_message, done=done, initial_sandbox_response=initial_sandbox_response, initial_sandbox_response_file=initial_sandbox_response_file, config_pr_url=config_pr_url)}\n{sep}{agg_message}{suffix}"
try:
issue_comment.edit(msg)
except BadCredentialsException:
logger.error(
f"Bad credentials, refreshing token (tracking ID: `{tracking_id}`)"
)
user_token, g = get_github_client(installation_id)
repo = g.get_repo(repo_full_name)
issue_comment = None
for comment in comments:
if comment.user.login == CURRENT_USERNAME:
issue_comment = comment
current_issue = repo.get_issue(number=issue_number)
if issue_comment is None:
issue_comment = current_issue.create_comment(msg)
else:
issue_comment = [
comment
for comment in current_issue.get_comments()
if comment.user.login == CURRENT_USERNAME
][0]
issue_comment.edit(msg)
if use_faster_model:
edit_sweep_comment(
FASTER_MODEL_MESSAGE, -1, add_bonus_message=False
)
posthog.capture(
username,
"ran_out_of_tickets",
properties={
**metadata,
"duration": round(time() - on_ticket_start_time),
},
)
fire_and_forget_wrapper(add_emoji)(
current_issue, comment_id, reaction_content="confused"
)
fire_and_forget_wrapper(remove_emoji)(content_to_delete="eyes")
return {
"success": False,
"error_message": "We deprecated supporting GPT 3.5.",
}
error_message = validate_issue(title + summary)
if error_message:
logger.warning(f"Validation error: {error_message}")
edit_sweep_comment(
(
f"The issue was rejected with the following response:\n\n{blockquote(error_message)}"
),
-1,
)
fire_and_forget_wrapper(add_emoji)(
current_issue, comment_id, reaction_content="confused"
)
fire_and_forget_wrapper(remove_emoji)(content_to_delete="eyes")
posthog.capture(
username,
"invalid_issue",
properties={
**metadata,
"duration": round(time() - on_ticket_start_time),
},
)
return {"success": True}
prs_extracted = PRReader.extract_prs(repo, summary)
message_summary = summary
if prs_extracted:
message_summary += "\n\n" + prs_extracted
edit_sweep_comment(
create_collapsible(
"I found that you mentioned the following Pull Requests that might be important:",
blockquote(
prs_extracted,
),
),
1,
)
try:
# search/context manager
logger.info("Searching for relevant snippets...")
if image_contents: # doing it here to avoid editing the original issue
message_summary += ImageDescriptionBot().describe_images(text=title + message_summary, images=image_contents)
snippets, tree, _, repo_context_manager = fetch_relevant_files(
cloned_repo,
title,
message_summary,
replies_text,
username,
metadata,
on_ticket_start_time,
tracking_id,
is_paying_user,
is_consumer_tier,
issue_url,
chat_logger,
ticket_progress,
images=image_contents
)
cloned_repo = repo_context_manager.cloned_repo
except Exception as e:
edit_sweep_comment(
(
"It looks like an issue has occurred around fetching the files."
f" The exception was {str(e)}. If this error persists"
f" contact team@sweep.dev.\n\n> @{username}, editing this issue description to include more details will automatically make me relaunch. Please join our Discord server for support (tracking_id={tracking_id})"
),
-1,
)
raise Exception("Failed to fetch files") from e
_user_token, g = get_github_client(installation_id)
user_token, g, repo = refresh_token()
cloned_repo.token = user_token
repo = g.get_repo(repo_full_name)
ticket_progress.search_progress.indexing_progress = (
ticket_progress.search_progress.indexing_total
)
ticket_progress.status = TicketProgressStatus.PLANNING
ticket_progress.save()
# Fetch git commit history
if not repo_description:
repo_description = "No description provided."
message_summary += replies_text
get_documentation_dict(repo)
docs_results = ""
sweep_bot = construct_sweep_bot(
repo=repo,
repo_name=repo_name,
issue_url=issue_url,
repo_description=repo_description,
title=title,
message_summary=message_summary,
cloned_repo=cloned_repo,
ticket_progress=ticket_progress,
chat_logger=chat_logger,
snippets=snippets,
tree=tree,
comments=comments,
)
# Check repository for sweep.yml file.
sweep_yml_exists = False
sweep_yml_failed = False
for content_file in repo.get_contents(""):
if content_file.name == "sweep.yaml":
sweep_yml_exists = True
# Check if YAML is valid
yaml_content = content_file.decoded_content.decode("utf-8")
sweep_yaml_dict = {}
try:
sweep_yaml_dict = yaml.safe_load(yaml_content)
except Exception:
logger.error(f"Failed to load YAML file: {yaml_content}")
if len(sweep_yaml_dict) > 0:
break
linter_config = yamllint_config.YamlLintConfig(custom_config)
problems = list(linter.run(yaml_content, linter_config))
if problems:
errors = [
f"Line {problem.line}: {problem.desc} (rule: {problem.rule})"
for problem in problems
]
error_message = "\n".join(errors)
markdown_error_message = f"**There is something wrong with your [sweep.yaml](https://github.com/{repo_full_name}/blob/main/sweep.yaml):**\n```\n{error_message}\n```"
sweep_yml_failed = True
logger.error(markdown_error_message)
edit_sweep_comment(markdown_error_message, -1)
else:
logger.info("The YAML file is valid. No errors found.")
break
# If sweep.yaml does not exist, then create a new PR that simply creates the sweep.yaml file.
if not sweep_yml_exists:
try:
logger.info("Creating sweep.yaml file...")
config_pr = create_config_pr(sweep_bot, cloned_repo=cloned_repo)
config_pr_url = config_pr.html_url
edit_sweep_comment(message="", index=-2)
except Exception as e:
logger.error(
"Failed to create new branch for sweep.yaml file.\n",
e,
traceback.format_exc(),
)
else:
logger.info("sweep.yaml file already exists.")
try:
# ANALYZE SNIPPETS
newline = "\n"
edit_sweep_comment(
"I found the following snippets in your repository. I will now analyze"
" these snippets and come up with a plan."
+ "\n\n"
+ create_collapsible(
"Some code snippets I think are relevant in decreasing order of relevance (click to expand). If some file is missing from here, you can mention the path in the ticket description.",
"\n".join(
[
f"https://github.com/{organization}/{repo_name}/blob/{repo.get_commits()[0].sha}/{snippet.file_path}#L{max(snippet.start, 1)}-L{min(snippet.end, snippet.content.count(newline) - 1)}\n"
for snippet in snippets
]
),
)
+ (
create_collapsible(
"I also found that you mentioned the following Pull Requests that may be helpful:",
blockquote(prs_extracted),
)
if prs_extracted
else ""
)
+ (f"\n\n{docs_results}\n\n" if docs_results else ""),
1,
)
logger.info("Fetching files to modify/create...")
file_change_requests, plan = get_files_to_change(
relevant_snippets=repo_context_manager.current_top_snippets,
read_only_snippets=repo_context_manager.read_only_snippets,
problem_statement=f"{title}\n\n{message_summary}",
repo_name=repo_full_name,
cloned_repo=cloned_repo,
images=image_contents
)
validate_file_change_requests(file_change_requests, cloned_repo)
ticket_progress.planning_progress.file_change_requests = (
file_change_requests
)
ticket_progress.coding_progress.file_change_requests = (
file_change_requests
)
ticket_progress.coding_progress.assistant_conversations = [
AssistantConversation() for fcr in file_change_requests
]
ticket_progress.status = TicketProgressStatus.CODING
ticket_progress.save()
if not file_change_requests:
if len(title + summary) < 60:
edit_sweep_comment(
(
"Sorry, I could not find any files to modify, can you please"
" provide more details? Please make sure that the title and"
" summary of the issue are at least 60 characters."
),
-1,
)
else:
edit_sweep_comment(
(
"Sorry, I could not find any files to modify, can you please"
" provide more details?"
),
-1,
)
raise Exception("No files to modify.")
file_change_requests: list[
FileChangeRequest
] = sweep_bot.validate_file_change_requests(
file_change_requests,
)
ticket_progress.planning_progress.file_change_requests = (
file_change_requests
)
ticket_progress.coding_progress.assistant_conversations = [
AssistantConversation() for fcr in file_change_requests
]
ticket_progress.save()
table = tabulate(
[
[
file_change_request.entity_display,
file_change_request.instructions_display.replace(
"\n", "<br/>"
).replace("```", "\\```"),
]
for file_change_request in file_change_requests
if file_change_request.change_type != "check"
],
headers=["File Path", "Proposed Changes"],
tablefmt="pipe",
)
logger.info("Generating PR...")
pull_request = PullRequest(
title="Sweep: " + title,
branch_name="sweep/" + to_branch_name(title),
content="",
)
logger.info("Making PR...")
ticket_progress.context.branch_name = pull_request.branch_name
ticket_progress.save()
files_progress: list[tuple[str, str, str, str]] = [
(
file_change_request.entity_display,
file_change_request.instructions_display,
"⏳ In Progress",
"",
)
for file_change_request in file_change_requests
]
checkboxes_progress: list[tuple[str, str, str]] = [
(
file_change_request.entity_display,
file_change_request.instructions_display,
" ",
)
for file_change_request in file_change_requests
if file_change_request.change_type != "check"
]
checkboxes_contents = "\n".join(
[
create_checkbox(
f"`{filename}`", blockquote(instructions), check == "X"
)
for filename, instructions, check in checkboxes_progress
]
)
create_collapsible("Checklist", checkboxes_contents, opened=True)
file_change_requests[0].status = "running"
condensed_checkboxes_contents = "\n".join(
[
create_checkbox(f"`{filename}`", "", check == "X").strip()
for filename, instructions, check in checkboxes_progress
]
)
condensed_checkboxes_collapsible = create_collapsible(
"Checklist", condensed_checkboxes_contents, opened=True
)
current_issue = repo.get_issue(number=issue_number)
current_issue.edit(
body=summary + "\n\n" + condensed_checkboxes_collapsible
)
delete_branch = False
generator = create_pr_changes(
file_change_requests,
pull_request,
sweep_bot,
username,
installation_id,
issue_number,
chat_logger=chat_logger,
base_branch=overrided_branch_name,
additional_messages=[],
)
edit_sweep_comment(checkboxes_contents, 2)
if not file_change_requests:
raise NoFilesException()
response = {
"error": Exception(
f"Sweep failed to generate any file change requests! This could mean that Sweep failed to find the correct lines of code to modify or that GPT-4 did not respond in our specified format. Sometimes, retrying will fix this error. Otherwise, reach out to our Discord server for support (tracking_id={tracking_id})."
)
}
changed_files = []
for item in generator:
if isinstance(item, dict):
response = item
break
(
new_file_contents,
_,
commit,
file_change_requests,
) = item
# append all files that have been changed
if new_file_contents:
for file_name, _ in new_file_contents.items():
changed_files.append(file_name)
commit_hash: str = (
commit
if isinstance(commit, str)
else (
commit.sha
if commit is not None
else repo.get_branch(
pull_request.branch_name
).commit.sha
)
)
commit_url = (
f"https://github.com/{repo_full_name}/commit/{commit_hash}"
)
commit_url_display = (
f"<a href='{commit_url}'><code>{commit_hash[:7]}</code></a>"
)
create_error_logs(
commit_url_display,
None,
status=(
"✓"
),
)
checkboxes_progress = [
(
file_change_request.display_summary
+ " "
+ file_change_request.status_display
+ " "
+ (file_change_request.commit_hash_url or "")
+ f" [Edit]({file_change_request.get_edit_url(repo.full_name, pull_request.branch_name)})",
file_change_request.instructions_ticket_display
+ f"\n\n{file_change_request.diff_display}",
(
"X"
if file_change_request.status
in ("succeeded", "failed")
else " "
),
)
for file_change_request in file_change_requests
]
checkboxes_contents = "\n".join(
[
checkbox_template.format(
check=check,
filename=filename,
instructions=blockquote(instructions),
)
for filename, instructions, check in checkboxes_progress
]
)
collapsible_template.format(
summary="Checklist",
body=checkboxes_contents,
opened="open",
)
condensed_checkboxes_contents = "\n".join(
[
checkbox_template.format(
check=check,
filename=filename,
instructions="",
).strip()
for filename, instructions, check in checkboxes_progress
if not instructions.lower().startswith("run")
]
)
condensed_checkboxes_collapsible = collapsible_template.format(
summary="Checklist",
body=condensed_checkboxes_contents,
opened="open",
)
try:
current_issue = repo.get_issue(number=issue_number)
except BadCredentialsException:
user_token, g, repo = refresh_token()
cloned_repo.token = user_token
current_issue.edit(
body=summary + "\n\n" + condensed_checkboxes_collapsible
)
logger.info(files_progress)
edit_sweep_comment(checkboxes_contents, 2)
if not response.get("success"):
raise Exception(f"Failed to create PR: {response.get('error')}")
checkboxes_contents = "\n".join(
[
checkbox_template.format(
check=check,
filename=filename,
instructions=blockquote(instructions),
)
for filename, instructions, check in checkboxes_progress
]
)
condensed_checkboxes_contents = "\n".join(
[
checkbox_template.format(
check=check,
filename=filename,
instructions="",
).strip()
for filename, instructions, check in checkboxes_progress
if not instructions.lower().startswith("run")
]
)
condensed_checkboxes_collapsible = collapsible_template.format(
summary="Checklist",
body=condensed_checkboxes_contents,
opened="open",
)
for _ in range(3):
try:
current_issue.edit(
body=summary + "\n\n" + condensed_checkboxes_collapsible
)
break
except Exception:
from time import sleep
sleep(1)
edit_sweep_comment(checkboxes_contents, 2)
pr_changes = response["pull_request"]
# change the body here
diff_text = get_branch_diff_text(
repo=repo,
branch=pull_request.branch_name,
base_branch=overrided_branch_name,
)
new_description = PRDescriptionBot().describe_diffs(
diff_text,
pull_request.title,
)
# TODO: update the title as well
if new_description:
pr_changes.body = (
f"{new_description}\n\nFixes"
f" #{issue_number}.\n\n---\n\n{UPDATES_MESSAGE}\n\n---\n\n{INSTRUCTIONS_FOR_REVIEW}{BOT_SUFFIX}"
)
edit_sweep_comment(
"I have finished coding the issue. I am now reviewing it for completeness.",
3,
)
change_location = f" [`{pr_changes.pr_head}`](https://github.com/{repo_full_name}/commits/{pr_changes.pr_head}).\n\n"
review_message = (
"Here are my self-reviews of my changes at" + change_location
)
try:
fire_and_forget_wrapper(remove_emoji)(content_to_delete="eyes")
except Exception:
pass
changes_required, review_message = False, ""
if changes_required:
edit_sweep_comment(
review_message
+ "\n\nI finished incorporating these changes.",
3,
)
else:
edit_sweep_comment(
f"I have finished reviewing the code for completeness. I did not find errors for {change_location}",
3,
)
revert_buttons = []
for changed_file in set(changed_files):
revert_buttons.append(
Button(label=f"{RESET_FILE} {changed_file}")
)
revert_buttons_list = ButtonList(
buttons=revert_buttons, title=REVERT_CHANGED_FILES_TITLE
)
# delete failing sweep yaml if applicable
if sweep_yml_failed:
try:
repo.delete_file(
"sweep.yaml",
"Delete failing sweep.yaml",
branch=pr_changes.pr_head,
sha=repo.get_contents("sweep.yaml").sha,
)
except Exception:
pass
# create draft pr, then convert to regular pr later
pr: GithubPullRequest = repo.create_pull(
title=pr_changes.title,
body=pr_changes.body,
head=pr_changes.pr_head,
base=overrided_branch_name or SweepConfig.get_branch(repo),
# removed draft PR
draft=False,
)
try:
pr.add_to_assignees(username)
except Exception as e:
logger.error(
f"Failed to add assignee {username}: {e}, probably a bot."
)
ticket_progress.status = TicketProgressStatus.COMPLETE
ticket_progress.context.done_time = time()
ticket_progress.context.pr_id = pr.number
ticket_progress.save()
if revert_buttons:
pr.create_issue_comment(
revert_buttons_list.serialize() + BOT_SUFFIX
)
# add comments before labelling
pr.add_to_labels(GITHUB_LABEL_NAME)
current_issue.create_reaction("rocket")
heres_pr_message = f'<h1 align="center">🚀 Here\'s the PR! <a href="{pr.html_url}">#{pr.number}</a></h1>'
progress_message = f'<div align="center"><b>See Sweep\'s progress at <a href="{PROGRESS_BASE_URL}/issues/{tracking_id}">the progress dashboard</a>!</b></div>'
edit_sweep_comment(
review_message + "\n\nSuccess! 🚀",
4,
pr_message=(
f"{center(heres_pr_message)}\n{center(progress_message)}\n{center(payment_message_start)}"
),
done=True,
)
user_settings = UserSettings.from_username(username=username)
user = g.get_user(username)
full_name = user.name or user.login
name = full_name.split(" ")[0]
files_changed = []
for fcr in file_change_requests:
if fcr.change_type in ("create", "modify"):
diff = list(
difflib.unified_diff(
(fcr.old_content or "").splitlines() or [],
(fcr.new_content or "").splitlines() or [],
lineterm="",
)
)
added = sum(
1
for line in diff
if line.startswith("+") and not line.startswith("+++")
)
removed = sum(
1
for line in diff
if line.startswith("-") and not line.startswith("---")
)
files_changed.append(
f"<code>{fcr.filename}</code> (+{added}/-{removed})"
)
user_settings.send_email(
subject=f"Sweep Pull Request Complete for {repo_name}#{issue_number} {title}",
html=email_template.format(
name=name,
pr_url=pr.html_url,
issue_number=issue_number,
repo_full_name=repo_full_name,
pr_number=pr.number,
progress_url=f"{PROGRESS_BASE_URL}/issues/{tracking_id}",
summary=markdown.markdown(pr_changes.body),
files_changed="\n".join(
[f"<li>{item}</li>" for item in files_changed]
),
sweeping_gif=sweeping_gif,
),
)
# poll for github to check when gha are done
total_poll_attempts = 0
total_edit_attempts = 0
SLEEP_DURATION_SECONDS = 15
GITHUB_ACTIONS_ENABLED = get_gha_enabled(repo=repo) and DEPLOYMENT_GHA_ENABLED
GHA_MAX_EDIT_ATTEMPTS = 5 # max number of times to edit PR
current_commit = pr.head.sha
while GITHUB_ACTIONS_ENABLED:
logger.info(
f"Polling to see if Github Actions have finished... {total_poll_attempts}"
)
# we wait at most 60 minutes
if total_poll_attempts * SLEEP_DURATION_SECONDS // 60 >= 60:
break
else:
# wait SLEEP_DURATION_SECONDS (15 seconds) between check attempts
total_poll_attempts += 1
from time import sleep
sleep(SLEEP_DURATION_SECONDS)
runs = list(repo.get_workflow_runs(branch=pr.head.ref, head_sha=current_commit))
# if all runs have succeeded, break
if all([run.conclusion == "success" for run in runs]):
break
# if any of them have failed we retry
if any([run.conclusion == "failure" for run in runs]):
failed_runs = [
run for run in runs if run.conclusion == "failure"
]
failed_gha_logs: list[str] = get_failing_gha_logs(
failed_runs,
installation_id,
)
if failed_gha_logs:
# make edits to the PR
# TODO: look into rollbacks so we don't continue adding onto errors
cloned_repo = ClonedRepo( # reinitialize cloned_repo to avoid conflicts
repo_full_name,
installation_id=installation_id,
token=user_token,
repo=repo,
branch=pr.head.ref,
)
diffs = get_branch_diff_text(repo=repo, branch=pr.head.ref, base_branch=pr.base.ref)
problem_statement = f"{title}\n{message_summary}\n{replies_text}"
all_information_prompt = GHA_PROMPT.format(
problem_statement=problem_statement,
github_actions_logs=failed_gha_logs,
changes_made=diffs,
)
repo_context_manager = prep_snippets(cloned_repo=cloned_repo, query=(title + message_summary + replies_text).strip("\n"), ticket_progress=ticket_progress) # need to do this, can use the old query for speed
sweep_bot: SweepBot = construct_sweep_bot(
repo=repo,
repo_name=repo_name,
issue_url=issue_url,
repo_description=repo_description,
title="Fix the following errors to complete the user request.",
message_summary=all_information_prompt,
cloned_repo=cloned_repo,
ticket_progress=ticket_progress,
chat_logger=chat_logger,
snippets=snippets,
tree=tree,
comments=comments,
)
file_change_requests, plan = get_files_to_change_for_gha(
relevant_snippets=repo_context_manager.current_top_snippets,
read_only_snippets=repo_context_manager.read_only_snippets,
problem_statement=all_information_prompt,
updated_files=new_file_contents,
cloned_repo=cloned_repo,
chat_logger=chat_logger,
)
validate_file_change_requests(file_change_requests, cloned_repo)
previous_modify_files_dict: dict[str, dict[str, str | list[str]]] | None = None
_, commit, _ = sweep_bot.handle_modify_file_main(
branch=pr.head.ref,
assistant_conversation=None,
additional_messages=[],
previous_modify_files_dict=previous_modify_files_dict,
file_change_requests=file_change_requests,
username=username
)
current_commit = commit.sha
pr = repo.get_pull(pr.number) # IMPORTANT: resync PR otherwise you'll fetch old GHA runs
total_edit_attempts += 1
if total_edit_attempts >= GHA_MAX_EDIT_ATTEMPTS:
logger.info(f"Tried to edit PR {GHA_MAX_EDIT_ATTEMPTS} times, giving up.")
break
# if none of the runs have completed we wait and poll github
logger.info(
f"No Github Actions have failed yet and not all have succeeded yet, waiting for {SLEEP_DURATION_SECONDS} seconds before polling again..."
)
# break from main for loop
convert_pr_draft_field(pr, is_draft=False, installation_id=installation_id)
except MaxTokensExceeded as e:
logger.info("Max tokens exceeded")
ticket_progress.status = TicketProgressStatus.ERROR
ticket_progress.error_message = "Max tokens exceeded. Feel free to add more details to the issue description for Sweep to better address it, or alternatively, reach out to Kevin or William for help at https://discord.gg/sweep."
ticket_progress.save()
log_error(
is_paying_user,
is_consumer_tier,
username,
issue_url,
"Max Tokens Exceeded",
str(e) + "\n" + traceback.format_exc(),
priority=2,
)
if chat_logger and chat_logger.is_paying_user():
edit_sweep_comment(
(
f"Sorry, I could not edit `{e.filename}` as this file is too long."
" We are currently working on improved file streaming to address"
" this issue.\n"
),
-1,
)
else:
edit_sweep_comment(
(
f"Sorry, I could not edit `{e.filename}` as this file is too"
" long.\n\nIf this file is incorrect, please describe the desired"
" file in the prompt. However, if you would like to edit longer"
" files, consider upgrading to [Sweep Pro](https://sweep.dev/) for"
" longer context lengths.\n"
),
-1,
)
delete_branch = True
raise e
except NoFilesException as e:
ticket_progress.status = TicketProgressStatus.ERROR
ticket_progress.error_message = "Sweep could not find files to modify to address this issue. Feel free to add more details to the issue description for Sweep to better address it, or alternatively, reach out to Kevin or William for help at https://discord.gg/sweep."
ticket_progress.save()
logger.info("Sweep could not find files to modify")
log_error(
is_paying_user,
is_consumer_tier,
username,
issue_url,
"Sweep could not find files to modify",
str(e) + "\n" + traceback.format_exc(),
priority=2,
)
edit_sweep_comment(
(
"Sorry, Sweep could not find any appropriate files to edit to address"
" this issue. If this is a mistake, please provide more context and Sweep"
f" will retry!\n\n@{username}, please edit the issue description to"
" include more details. You can also ask for help on our community"
" forum: https://community.sweep.dev/"
),
-1,
)
delete_branch = True
raise e
except openai.BadRequestError as e:
ticket_progress.status = TicketProgressStatus.ERROR
ticket_progress.error_message = "Sorry, it looks like there is an error with communicating with OpenAI. If this error persists, reach out to Kevin or William for help at https://discord.gg/sweep."
ticket_progress.save()
logger.error(traceback.format_exc())
logger.error(e)
edit_sweep_comment(
(
"I'm sorry, but it looks like our model has run out of context length. We're"
" trying to make this happen less often, but one way to mitigate this is to"
" code smaller files. If this error persists, report it at"
" https://discord.gg/sweep."
),
-1,
)
log_error(
is_paying_user,
is_consumer_tier,
username,
issue_url,
"Context Length",
str(e) + "\n" + traceback.format_exc(),
priority=2,
)
posthog.capture(
username,
"failed",
properties={
"error": str(e),
"trace": traceback.format_exc(),
"reason": "Invalid request error / context length",
**metadata,
"duration": round(time() - on_ticket_start_time),
},
)
delete_branch = True
raise e
except AssistantRaisedException as e:
if ticket_progress is not None:
ticket_progress.status = TicketProgressStatus.ERROR
ticket_progress.error_message = f"Sweep raised an error with the following message: {e.message}. Feel free to add more details to the issue description for Sweep to better address it, or alternatively, reach out to Kevin or William for help at https://discord.gg/sweep."
ticket_progress.save()
logger.exception(e)
edit_sweep_comment(
f"Sweep raised an error with the following message:\n{blockquote(e.message)}",
-1,
)
log_error(
is_paying_user,
is_consumer_tier,
username,
issue_url,
"Workflow",
str(e) + "\n" + traceback.format_exc(),
priority=1,
)
raise e
except Exception as e:
ticket_progress.status = TicketProgressStatus.ERROR
ticket_progress.error_message = f"Internal server error: {str(e)}. Feel free to add more details to the issue description for Sweep to better address it, or alternatively, reach out to Kevin or William for help at https://discord.gg/sweep."
ticket_progress.save()
logger.error(traceback.format_exc())
logger.error(e)
# title and summary are defined elsewhere
if len(title + summary) < 60:
edit_sweep_comment(
(
"I'm sorry, but it looks like an error occurred due to"
f" a planning failure. The error message is {str(e)}. Feel free to add more details to the issue description"
" so Sweep can better address it. Alternatively, post on our community forum"
" for assistance: https://community.sweep.dev/"
),
-1,
)
else:
edit_sweep_comment(
(
"I'm sorry, but it looks like an error has occurred due to"
+ f" a planning failure. The error message is {str(e)}. Feel free to add more details to the issue description"
+ " so Sweep can better address it. Alternatively, reach out to Kevin or William for help at"
+ " https://discord.gg/sweep."
),
-1,
)
log_error(
is_paying_user,
is_consumer_tier,
username,
issue_url,
"Workflow",
str(e) + "\n" + traceback.format_exc(),
priority=1,
)
raise e
else:
try:
fire_and_forget_wrapper(remove_emoji)(content_to_delete="eyes")
fire_and_forget_wrapper(add_emoji)("rocket")
except SystemExit:
raise SystemExit
except Exception as e:
logger.error(e)
if delete_branch:
try:
if pull_request.branch_name.startswith("sweep"):
repo.get_git_ref(
f"heads/{pull_request.branch_name}"
).delete()
else:
raise Exception(
f"Branch name {pull_request.branch_name} does not start with sweep/"
)
except Exception as e:
logger.error(e)
logger.error(traceback.format_exc())
logger.info(f"Deleted branch {pull_request.branch_name}")  # loguru drops positional args that have no "{}" placeholder
except Exception as e:
posthog.capture(
username,
"failed",
properties={
**metadata,
"error": str(e),
"trace": traceback.format_exc(),
"duration": round(time() - on_ticket_start_time),
},
)
raise e
posthog.capture(
username,
"success",
properties={**metadata, "duration": round(time() - on_ticket_start_time)},
)
logger.info("on_ticket success in " + str(round(time() - on_ticket_start_time)))
return {"success": True}
def handle_sandbox_mode(
title, repo_full_name, repo, ticket_progress, edit_sweep_comment
):
logger.info("Running in sandbox mode")
sweep_bot = SweepBot(repo=repo, ticket_progress=ticket_progress)
logger.info("Getting file contents")
file_name = title.split(":")[1].strip()
file_contents = sweep_bot.get_contents(file_name).decoded_content.decode("utf-8")
try:
ext = file_name.split(".")[-1]
except Exception:
ext = ""
file_contents.replace("```", "\`\`\`")
sha = repo.get_branch(repo.default_branch).commit.sha
permalink = f"https://github.com/{repo_full_name}/blob/{sha}/{file_name}#L1-L{len(file_contents.splitlines())}"
logger.info("Running sandbox")
edit_sweep_comment(
f"Running sandbox for {file_name}. Current Code:\n\n{permalink}",
1,
)
updated_contents, sandbox_response = sweep_bot.check_sandbox(
file_name, file_contents
)
logger.info("Sandbox finished")
logs = (
(
"<br/>"
+ create_collapsible(
"Sandbox logs",
blockquote(
"\n\n".join(
[
create_collapsible(
f"<code>{output}</code> {i + 1}/{len(sandbox_response.outputs)} {format_sandbox_success(sandbox_response.success)}",
f"<pre>{clean_logs(output)}</pre>",
i == len(sandbox_response.outputs) - 1,
)
for i, output in enumerate(sandbox_response.outputs)
if len(sandbox_response.outputs) > 0
]
)
),
opened=True,
)
)
if sandbox_response
else ""
)
updated_contents = updated_contents.replace("```", "\`\`\`")
diff = generate_diff(file_contents, updated_contents).replace("```", "\`\`\`")
diff_display = (
f"Updated Code:\n\n```{ext}\n{updated_contents}```\nDiff:\n```diff\n{diff}\n```"
if diff
else f"Sandbox made no changes to {file_name} (formatters were not configured or Sweep didn't make changes)."
)
edit_sweep_comment(
f"{logs}\n{diff_display}",
2,
)
edit_sweep_comment("N/A", 3)
logger.info("Sandbox comments updated")
def get_branch_diff_text(repo, branch, base_branch=None):
base_branch = base_branch or SweepConfig.get_branch(repo)
comparison = repo.compare(base_branch, branch)
file_diffs = comparison.files
pr_diffs = []
for file in file_diffs:
diff = file.patch
if (
file.status == "added"
or file.status == "modified"
or file.status == "removed"
):
pr_diffs.append((file.filename, diff))
else:
logger.info(
f"File status {file.status} not recognized"
) # TODO(sweep): We don't handle renamed files
return "\n".join([f"{filename}\n{diff}" for filename, diff in pr_diffs])
def get_payment_messages(chat_logger: ChatLogger):
if chat_logger:
is_paying_user = chat_logger.is_paying_user()
is_consumer_tier = chat_logger.is_consumer_tier()
use_faster_model = chat_logger.use_faster_model()
else:
is_paying_user = True
is_consumer_tier = False
use_faster_model = False
# Guard against chat_logger being None, which the branches above already allow for
tracking_id = chat_logger.data["tracking_id"] if chat_logger and MONGODB_URI is not None else None
# Find the first comment made by the bot
tickets_allocated = 5
if is_consumer_tier:
tickets_allocated = 15
if is_paying_user:
tickets_allocated = 500
purchased_ticket_count = (
chat_logger.get_ticket_count(purchased=True) if chat_logger else 0
)
ticket_count = (
max(tickets_allocated - chat_logger.get_ticket_count(), 0)
+ purchased_ticket_count
if chat_logger
else 999
)
daily_ticket_count = (
(3 - chat_logger.get_ticket_count(use_date=True) if not use_faster_model else 0)
if chat_logger
else 999
)
model_name = "GPT-4"
single_payment_link = "https://buy.stripe.com/00g3fh7qF85q0AE14d"
pro_payment_link = "https://buy.stripe.com/00g5npeT71H2gzCfZ8"
daily_message = (
f" and {daily_ticket_count} for the day"
if not is_paying_user and not is_consumer_tier
else ""
)
user_type = "💎 <b>Sweep Pro</b>" if is_paying_user else "⚡ <b>Sweep Basic Tier</b>"
gpt_tickets_left_message = (
f"{ticket_count} GPT-4 tickets left for the month"
if not is_paying_user
else "unlimited GPT-4 tickets"
)
purchase_message = f"<br/><br/> For more GPT-4 tickets, visit <a href={single_payment_link}>our payment portal</a>. For a one week free trial, try <a href={pro_payment_link}>Sweep Pro</a> (unlimited GPT-4 tickets)."
payment_message = (
f"{user_type}: I used {model_name} to create this ticket. You have {gpt_tickets_left_message}{daily_message}. (tracking ID: <code>{tracking_id}</code>)"
+ (purchase_message if not is_paying_user else "")
)
payment_message_start = (
f"{user_type}: I'm using {model_name}. You have {gpt_tickets_left_message}{daily_message}. (tracking ID: <code>{tracking_id}</code>)"
+ (purchase_message if not is_paying_user else "")
)

import base64
import os
from dotenv import load_dotenv
from loguru import logger
logger.print = logger.info
load_dotenv(dotenv_path=".env", override=True, verbose=True)
os.environ["GITHUB_APP_PEM"] = os.environ.get("GITHUB_APP_PEM") or base64.b64decode(
os.environ.get("GITHUB_APP_PEM_BASE64", "")
).decode("utf-8")
if os.environ["GITHUB_APP_PEM"]:
os.environ["GITHUB_APP_ID"] = (
(os.environ.get("GITHUB_APP_ID") or os.environ.get("APP_ID"))
.replace("\\n", "\n")
.strip('"')
)
TEST_BOT_NAME = "sweep-nightly[bot]"
ENV = os.environ.get("ENV", "dev")
BOT_TOKEN_NAME = "bot-token"
# goes under Modal 'discord' secret name (optional, can leave env var blank)
DISCORD_WEBHOOK_URL = os.environ.get("DISCORD_WEBHOOK_URL")
DISCORD_MEDIUM_PRIORITY_URL = os.environ.get("DISCORD_MEDIUM_PRIORITY_URL")
DISCORD_LOW_PRIORITY_URL = os.environ.get("DISCORD_LOW_PRIORITY_URL")
DISCORD_FEEDBACK_WEBHOOK_URL = os.environ.get("DISCORD_FEEDBACK_WEBHOOK_URL")
SWEEP_HEALTH_URL = os.environ.get("SWEEP_HEALTH_URL")
DISCORD_STATUS_WEBHOOK_URL = os.environ.get("DISCORD_STATUS_WEBHOOK_URL")
# goes under Modal 'github' secret name
GITHUB_APP_ID = os.environ.get("GITHUB_APP_ID", os.environ.get("APP_ID"))
# deprecated: old logic transfer so upstream can use this
if GITHUB_APP_ID is None:
if ENV == "prod":
GITHUB_APP_ID = "307814"
elif ENV == "dev":
GITHUB_APP_ID = "324098"
elif ENV == "staging":
GITHUB_APP_ID = "327588"
GITHUB_BOT_USERNAME = os.environ.get("GITHUB_BOT_USERNAME")
# deprecated: left to support old logic
if not GITHUB_BOT_USERNAME:
if ENV == "prod":
GITHUB_BOT_USERNAME = "sweep-ai[bot]"
elif ENV == "dev":
GITHUB_BOT_USERNAME = "sweep-nightly[bot]"
elif ENV == "staging":
GITHUB_BOT_USERNAME = "sweep-canary[bot]"
elif not GITHUB_BOT_USERNAME.endswith("[bot]"):
GITHUB_BOT_USERNAME = GITHUB_BOT_USERNAME + "[bot]"
GITHUB_LABEL_NAME = os.environ.get("GITHUB_LABEL_NAME", "sweep")
GITHUB_LABEL_COLOR = os.environ.get("GITHUB_LABEL_COLOR", "9400D3")
GITHUB_LABEL_DESCRIPTION = os.environ.get(
"GITHUB_LABEL_DESCRIPTION", "Sweep your software chores"
)
GITHUB_APP_PEM = os.environ.get("GITHUB_APP_PEM")
GITHUB_APP_PEM = GITHUB_APP_PEM or os.environ.get("PRIVATE_KEY")
if GITHUB_APP_PEM is not None:
GITHUB_APP_PEM = GITHUB_APP_PEM.strip(' \n"') # Remove whitespace and quotes
GITHUB_APP_PEM = GITHUB_APP_PEM.replace("\\n", "\n")
GITHUB_CONFIG_BRANCH = os.environ.get("GITHUB_CONFIG_BRANCH", "sweep/add-sweep-config")
GITHUB_DEFAULT_CONFIG = os.environ.get(
"GITHUB_DEFAULT_CONFIG",
"""# Sweep AI turns bugs & feature requests into code changes (https://sweep.dev)
# For details on our config file, check out our docs at https://docs.sweep.dev/usage/config
# This setting contains a list of rules that Sweep will check for. If any of these rules are broken in a new commit, Sweep will create a pull request to fix the broken rule.
rules:
{additional_rules}
# This is the branch that Sweep will develop from and make pull requests to. Most people use 'main' or 'master' but some users also use 'dev' or 'staging'.
branch: 'main'
# By default Sweep will read the logs and outputs from your existing Github Actions. To disable this, set this to false.
gha_enabled: True
# This is the description of your project. It will be used by sweep when creating PRs. You can tell Sweep what's unique about your project, what frameworks you use, or anything else you want.
#
# Example:
#
# description: sweepai/sweep is a python project. The main api endpoints are in sweepai/api.py. Write code that adheres to PEP8.
description: ''
# This sets whether to create pull requests as drafts. If this is set to True, then all pull requests will be created as drafts and GitHub Actions will not be triggered.
draft: False
# This is a list of directories that Sweep will not be able to edit.
blocked_dirs: []
""",
)
MONGODB_URI = os.environ.get("MONGODB_URI", None)
IS_SELF_HOSTED = os.environ.get("IS_SELF_HOSTED", "true").lower() == "true"
REDIS_URL = os.environ.get("REDIS_URL")
if not REDIS_URL:
REDIS_URL = os.environ.get("redis_url", "redis://0.0.0.0:6379/0")
ORG_ID = os.environ.get("ORG_ID", None)
POSTHOG_API_KEY = os.environ.get(
"POSTHOG_API_KEY", "phc_CnzwIB0W548wN4wEGeRuxXqidOlEUH2AcyV2sKTku8n"
)
SUPPORT_COUNTRY = os.environ.get("GDRP_LIST", "").split(",")
WHITELISTED_REPOS = os.environ.get("WHITELISTED_REPOS", "").split(",")
BLACKLISTED_USERS = os.environ.get("BLACKLISTED_USERS", "").split(",")
# Default OpenAI
OPENAI_API_KEY = os.environ.get("OPENAI_API_KEY", None) # this may be none, and it will use azure
OPENAI_API_TYPE = os.environ.get("OPENAI_API_TYPE", "anthropic")
assert OPENAI_API_TYPE in ["anthropic", "azure", "openai"], "Invalid OPENAI_API_TYPE"
OPENAI_EMBEDDINGS_API_TYPE = os.environ.get("OPENAI_EMBEDDINGS_API_TYPE", "openai")
AZURE_API_KEY = os.environ.get("AZURE_API_KEY", None)
OPENAI_API_BASE = os.environ.get("OPENAI_API_BASE", None)
OPENAI_API_VERSION = os.environ.get("OPENAI_API_VERSION", None)
AZURE_OPENAI_DEPLOYMENT = os.environ.get("AZURE_OPENAI_DEPLOYMENT", None)
OPENAI_EMBEDDINGS_AZURE_ENDPOINT = os.environ.get(
"OPENAI_EMBEDDINGS_AZURE_ENDPOINT", None
)
OPENAI_EMBEDDINGS_AZURE_DEPLOYMENT = os.environ.get(
"OPENAI_EMBEDDINGS_AZURE_DEPLOYMENT", None
)
OPENAI_EMBEDDINGS_AZURE_API_VERSION = os.environ.get(
"OPENAI_EMBEDDINGS_AZURE_API_VERSION", None
)
OPENAI_API_ENGINE_GPT35 = os.environ.get("OPENAI_API_ENGINE_GPT35", None)
OPENAI_API_ENGINE_GPT4 = os.environ.get("OPENAI_API_ENGINE_GPT4", None)
MULTI_REGION_CONFIG = os.environ.get("MULTI_REGION_CONFIG", None)
if isinstance(MULTI_REGION_CONFIG, str):
MULTI_REGION_CONFIG = MULTI_REGION_CONFIG.strip("'").replace("\\n", "\n")
MULTI_REGION_CONFIG = [item.split(",") for item in MULTI_REGION_CONFIG.split("\n")]
WHITELISTED_USERS = os.environ.get("WHITELISTED_USERS", None)
if WHITELISTED_USERS:
WHITELISTED_USERS = WHITELISTED_USERS.split(",")
WHITELISTED_USERS.append(GITHUB_BOT_USERNAME)
DEFAULT_GPT4_MODEL = os.environ.get("DEFAULT_GPT4_MODEL", "gpt-4-0125-preview")
RESEND_API_KEY = os.environ.get("RESEND_API_KEY", None)
LOKI_URL = None
DEBUG = os.environ.get("DEBUG", "false").lower() == "true"
ENV = "prod" if GITHUB_BOT_USERNAME != TEST_BOT_NAME else "dev"
PROGRESS_BASE_URL = os.environ.get(
"PROGRESS_BASE_URL", "https://progress.sweep.dev"
).rstrip("/")
DISABLED_REPOS = os.environ.get("DISABLED_REPOS", "").split(",")
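# Parse as real booleans; os.environ.get returns a string, so "false" would otherwise be truthy
GHA_AUTOFIX_ENABLED: bool = os.environ.get("GHA_AUTOFIX_ENABLED", "false").lower() == "true"
MERGE_CONFLICT_ENABLED: bool = os.environ.get("MERGE_CONFLICT_ENABLED", "false").lower() == "true"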
INSTALLATION_ID = os.environ.get("INSTALLATION_ID", None)
AWS_ACCESS_KEY=os.environ.get("AWS_ACCESS_KEY")
AWS_SECRET_KEY=os.environ.get("AWS_SECRET_KEY")
AWS_REGION=os.environ.get("AWS_REGION")
ANTHROPIC_AVAILABLE = AWS_ACCESS_KEY and AWS_SECRET_KEY and AWS_REGION
ANTHROPIC_API_KEY = os.environ.get("ANTHROPIC_API_KEY", None)
COHERE_API_KEY = os.environ.get("COHERE_API_KEY", None)
VOYAGE_API_KEY = os.environ.get("VOYAGE_API_KEY", None)
VOYAGE_API_AWS_ACCESS_KEY=os.environ.get("VOYAGE_API_AWS_ACCESS_KEY_ID")
VOYAGE_API_AWS_SECRET_KEY=os.environ.get("VOYAGE_API_AWS_SECRET_KEY")
VOYAGE_API_AWS_REGION=os.environ.get("VOYAGE_API_AWS_REGION")
VOYAGE_API_AWS_ENDPOINT_NAME=os.environ.get("VOYAGE_API_AWS_ENDPOINT_NAME", "voyage-code-2")
VOYAGE_API_USE_AWS = VOYAGE_API_AWS_ACCESS_KEY and VOYAGE_API_AWS_SECRET_KEY and VOYAGE_API_AWS_REGION
PAREA_API_KEY = os.environ.get("PAREA_API_KEY", None)
# TODO: we need to make this dynamic + backoff
BATCH_SIZE = int(
os.environ.get("BATCH_SIZE", 64 if VOYAGE_API_KEY else 256) # Voyage only allows 128 items per batch and 120000 tokens per batch
)
DEPLOYMENT_GHA_ENABLED = os.environ.get("DEPLOYMENT_GHA_ENABLED", "true").lower() == "true"
JIRA_USER_NAME = os.environ.get("JIRA_USER_NAME", None)
JIRA_API_TOKEN = os.environ.get("JIRA_API_TOKEN", None)
JIRA_URL = os.environ.get("JIRA_URL", None)

import re
from sweepai.core.chat import ChatGPT
issue_validator_instructions_prompt = """# Instructions
A good issue for Sweep is actionable and it is clear how to resolve it. Here is what Sweep is currently capable of:
- Access to the entire codebase, with a high-quality search engine to find specific code snippets. Sweep is able to pinpoint the exact location of the code that needs to be changed based on vague descriptions.
- Making code changes to fix bugs or add features.
- Reading the GitHub Action logs to run tests and check the results.
- Ability to read images such as screenshots and charts.
Here are some examples of things Sweep does not currently support:
- Large-scale changes like migrations and large version upgrades.
- Tasks requiring accessing outside information like AWS consoles or retrieving API keys.
- Tasks requiring fixes outside of code changes
- Issues that have an existing fix or duplicate issues
Respond in the following format:
<thinking>
Provide an analysis of why it is a good or bad issue to pass on to Sweep. If it is a bad issue, suggest how the issue could be improved or clarified to make it more suitable for Sweep.
</thinking>
<pass>True or False</pass>
If False, respond to the user:
<response_to_user>
Response to user with justification on why the issue is unclear.
</response_to_user>"""
issue_validator_system_prompt = """You are an AI assistant tasked with determining whether an issue reported by customer support should be passed on to be resolved by Sweep, an AI-powered software engineer.
""" + issue_validator_instructions_prompt
issue_validator_user_prompt = """<issue>
{issue}
</issue>\n\n""" + issue_validator_instructions_prompt
def validate_issue(issue: str) -> str:
"""
Somehow haiku and GPT-4 can't do this consistently.
"""
chat_gpt = ChatGPT.from_system_message_string(
prompt_string=issue_validator_system_prompt,
)
response = chat_gpt.chat_anthropic(
issue_validator_user_prompt.format(
issue=issue
),
model="claude-3-opus-20240229",
temperature=0.0,
)
if "<pass>False</pass>" in response:
pattern = "<response_to_user>(.*)</response_to_user>"
return re.search(pattern, response, re.DOTALL).group(1).strip()
return ""
if __name__ == "__main__":


Step 2: ⌨️ Coding

  • sweepai/config/server.py
Modify sweepai/config/server.py with contents: Add a new environment variable for the user's Slack API key.

<original_code>
JIRA_USER_NAME = os.environ.get("JIRA_USER_NAME", None)
JIRA_API_TOKEN = os.environ.get("JIRA_API_TOKEN", None)
JIRA_URL = os.environ.get("JIRA_URL", None)

LICENSE_KEY = os.environ.get("LICENSE_KEY", None)
ALTERNATE_AWS = os.environ.get("ALTERNATE_AWS", "none").lower() == "true"
</original_code>

<new_code>
JIRA_USER_NAME = os.environ.get("JIRA_USER_NAME", None)
JIRA_API_TOKEN = os.environ.get("JIRA_API_TOKEN", None)
JIRA_URL = os.environ.get("JIRA_URL", None)

SLACK_API_KEY = os.environ.get("SLACK_API_KEY", None)

LICENSE_KEY = os.environ.get("LICENSE_KEY", None)
ALTERNATE_AWS = os.environ.get("ALTERNATE_AWS", "none").lower() == "true"
</new_code>

  • sweepai/handlers/on_ticket.py
Modify sweepai/handlers/on_ticket.py with contents: Import the necessary modules for making HTTP requests and parsing Slack message threads.

<original_code>
import re
import traceback
from typing import Any
import zipfile
from time import time

import markdown
import openai
import requests
import yaml
import yamllint.config as yamllint_config
from github import BadCredentialsException, Github, Repository
from github.Issue import Issue
from github.PullRequest import PullRequest as GithubPullRequest
from loguru import logger
from tabulate import tabulate
from tqdm import tqdm
from yamllint import linter
</original_code>

<new_code>
import re
import traceback
from typing import Any
import zipfile
from time import time

import markdown
import openai
import requests
import yaml
import yamllint.config as yamllint_config
from github import BadCredentialsException, Github, Repository
from github.Issue import Issue
from github.PullRequest import PullRequest as GithubPullRequest
from loguru import logger
from tabulate import tabulate
from tqdm import tqdm
from yamllint import linter

from slack_sdk import WebClient
from slack_sdk.errors import SlackApiError
</new_code>

  • sweepai/handlers/on_ticket.py
Modify sweepai/handlers/on_ticket.py with contents: In the `on_ticket` function, check if the issue description contains a Slack link. If found, authenticate to Slack and fetch the message thread.

<original_code>
summary = summary or ""
summary = re.sub(
"<details (open)?>(\r)?\n

Checklist.",
"",
summary,
flags=re.DOTALL,
).strip()
summary = re.sub(
"---\s+Checklist:(\r)?\n(\r)?\n- [[ X]].
",
"",
summary,
flags=re.DOTALL,
).strip()
summary = re.sub(
"### Details\n\n_No response_", "", summary, flags=re.DOTALL
)
summary = re.sub("\n\n", "\n", summary, flags=re.DOTALL)
repo_name = repo_full_name
user_token, g = get_github_client(installation_id)
repo = g.get_repo(repo_full_name)
current_issue: Issue = repo.get_issue(number=issue_number)
assignee = current_issue.assignee.login if current_issue.assignee else None
if assignee is None:
assignee = current_issue.user.login
</original_code>

<new_code>
summary = summary or ""
summary = re.sub(
"<details (open)?>(\r)?\n

Checklist.",
"",
summary,
flags=re.DOTALL,
).strip()
summary = re.sub(
"---\s+Checklist:(\r)?\n(\r)?\n- [[ X]].
",
"",
summary,
flags=re.DOTALL,
).strip()
summary = re.sub(
"### Details\n\n_No response_", "", summary, flags=re.DOTALL
)
summary = re.sub("\n\n", "\n", summary, flags=re.DOTALL)

slack_link_match = re.search(r'(https://\w+\.slack\.com/archives/\w+/p\d+)', summary)
if slack_link_match:
    slack_link = slack_link_match.group(1)
    slack_client = WebClient(token=SLACK_API_KEY)
    
    try:
        slack_permalink_data = slack_client.chat_getPermalink(
            link=slack_link
        )
        slack_channel_id = slack_permalink_data['channel']
        slack_message_ts = slack_permalink_data['message_ts']

        slack_thread_replies = slack_client.conversations_replies(
            channel=slack_channel_id,
            ts=slack_message_ts
        )

        slack_thread_messages = [message['text'] for message in slack_thread_replies['messages']]
        slack_thread_text = '\n'.join(slack_thread_messages)

        summary += f"\n\nSlack Thread:\n{slack_thread_text}"

    except SlackApiError as e:
        logger.error(f"Error fetching Slack thread: {e}")

repo_name = repo_full_name
user_token, g = get_github_client(installation_id)
repo = g.get_repo(repo_full_name)
current_issue: Issue = repo.get_issue(number=issue_number)
assignee = current_issue.assignee.login if current_issue.assignee else None
if assignee is None:
    assignee = current_issue.user.login

</new_code>


Step 3: 🔁 Code Review

Working on it...


🎉 Latest improvements to Sweep:
  • New dashboard launched for real-time tracking of Sweep issues, covering all stages from search to coding.
  • Integration of OpenAI's latest Assistant API for more efficient and reliable code planning and editing, improving speed by 3x.
  • Use the GitHub issues extension for creating Sweep issues directly from your editor.

💡 To recreate the pull request edit the issue title or description.
Something wrong? Let us know.

This is an automated message generated by Sweep AI.

Copy link
Contributor

sweep-nightly bot commented May 2, 2024

🚀 Here's the PR! #3665

See Sweep's progress at the progress dashboard!
💎 Sweep Pro: I'm using GPT-4. You have unlimited GPT-4 tickets. (tracking ID: db9aeea37c)

Tip

I can email you next time I complete a pull request if you set up your email here!


Actions (click)

  • ↻ Restart Sweep

Step 1: 🔎 Searching

I found the following snippets in your repository. I will now analyze these snippets and come up with a plan.

Some code snippets I think are relevant in decreasing order of relevance (click to expand). If some file is missing from here, you can mention the path in the ticket description.

"""
on_ticket is the main function that is called when a new issue is created.
It is only called by the webhook handler in sweepai/api.py.
"""
import difflib
import io
import os
import re
import traceback
from typing import Any
import zipfile
from time import time
import markdown
import openai
import requests
import yaml
import yamllint.config as yamllint_config
from github import BadCredentialsException, Github, Repository
from github.Issue import Issue
from github.PullRequest import PullRequest as GithubPullRequest
from loguru import logger
from tabulate import tabulate
from tqdm import tqdm
from yamllint import linter
from sweepai.core.sweep_bot import GHA_PROMPT
from sweepai.agents.pr_description_bot import PRDescriptionBot
from sweepai.agents.image_description_bot import ImageDescriptionBot
from sweepai.config.client import (
RESET_FILE,
RESTART_SWEEP_BUTTON,
REVERT_CHANGED_FILES_TITLE,
SweepConfig,
get_documentation_dict,
get_gha_enabled,
)
from sweepai.config.server import (
DEPLOYMENT_GHA_ENABLED,
ENV,
GITHUB_LABEL_NAME,
IS_SELF_HOSTED,
MONGODB_URI,
PROGRESS_BASE_URL,
)
from sweepai.core.entities import (
AssistantRaisedException,
FileChangeRequest,
MaxTokensExceeded,
NoFilesException,
PullRequest,
SandboxResponse,
)
from sweepai.core.entities import create_error_logs as entities_create_error_logs
from sweepai.core.pr_reader import PRReader
from sweepai.core.sweep_bot import SweepBot, get_files_to_change, get_files_to_change_for_gha, validate_file_change_requests
from sweepai.handlers.create_pr import (
create_config_pr,
create_pr_changes,
safe_delete_sweep_branch,
)
from sweepai.handlers.on_check_suite import clean_gh_logs
from sweepai.utils.image_utils import get_image_contents_from_urls, get_image_urls_from_issue
from sweepai.utils.issue_validator import validate_issue
from sweepai.utils.validate_license import validate_license
from sweepai.utils.buttons import Button, ButtonList, create_action_buttons
from sweepai.utils.chat_logger import ChatLogger
from sweepai.utils.diff import generate_diff
from sweepai.utils.event_logger import posthog
from sweepai.utils.github_utils import (
CURRENT_USERNAME,
ClonedRepo,
convert_pr_draft_field,
get_github_client,
get_token,
sanitize_string_for_github,
)
from sweepai.utils.progress import (
AssistantConversation,
PaymentContext,
TicketContext,
TicketProgress,
TicketProgressStatus,
)
from sweepai.utils.prompt_constructor import HumanMessagePrompt
from sweepai.utils.str_utils import (
BOT_SUFFIX,
FASTER_MODEL_MESSAGE,
UPDATES_MESSAGE,
blockquote,
bot_suffix,
checkbox_template,
clean_logs,
collapsible_template,
create_checkbox,
create_collapsible,
discord_suffix,
format_sandbox_success,
get_hash,
sep,
stars_suffix,
strip_sweep,
to_branch_name,
)
from sweepai.utils.ticket_utils import (
center,
fetch_relevant_files,
fire_and_forget_wrapper,
log_error,
prep_snippets,
)
from sweepai.utils.user_settings import UserSettings
# from sandbox.sandbox_utils import Sandbox
sweeping_gif = """<a href="https://github.com/sweepai/sweep"><img class="swing" src="https://raw.githubusercontent.com/sweepai/sweep/main/.assets/sweeping.gif" width="100" style="width:50px; margin-bottom:10px" alt="Sweeping"></a>"""
custom_config = """
extends: relaxed
rules:
line-length: disable
indentation: disable
"""
INSTRUCTIONS_FOR_REVIEW = """\
### 💡 To get Sweep to edit this pull request, you can:
* Comment below, and Sweep can edit the entire PR
* Comment on a file, Sweep will only modify the commented file
* Edit the original issue to get Sweep to recreate the PR from scratch"""
email_template = """Hey {name},
<br/><br/>
🚀 I just finished creating a pull request for your issue ({repo_full_name}#{issue_number}) at <a href="{pr_url}">{repo_full_name}#{pr_number}</a>!
<br/><br/>
You can view how I created this pull request <a href="{progress_url}">here</a>.
<h2>Summary</h2>
<blockquote>
{summary}
</blockquote>
<h2>Files Changed</h2>
<ul>
{files_changed}
</ul>
{sweeping_gif}
<br/>
Cheers,
<br/>
Sweep
<br/>"""
FAILING_GITHUB_ACTION_PROMPT = """\
The following Github Actions failed on a previous attempt at fixing this issue.
Propose a fix to the failing github actions. You must edit the source code, not the github action itself.
{github_action_log}
"""
# Add :eyes: emoji to ticket
def add_emoji(issue: Issue, comment_id: int = None, reaction_content="eyes"):
item_to_react_to = issue.get_comment(comment_id) if comment_id else issue
item_to_react_to.create_reaction(reaction_content)
# If SWEEP_BOT reacted to item_to_react_to with "rocket", then remove it.
def remove_emoji(issue: Issue, comment_id: int = None, content_to_delete="eyes"):
item_to_react_to = issue.get_comment(comment_id) if comment_id else issue
reactions = item_to_react_to.get_reactions()
for reaction in reactions:
if (
reaction.content == content_to_delete
and reaction.user.login == CURRENT_USERNAME
):
item_to_react_to.delete_reaction(reaction.id)
def create_error_logs(
commit_url_display: str,
sandbox_response: SandboxResponse,
status: str = "✓",
):
return (
(
"<br/>"
+ create_collapsible(
f"Sandbox logs for {commit_url_display} {status}",
blockquote(
"\n\n".join(
[
create_collapsible(
f"<code>{output}</code> {i + 1}/{len(sandbox_response.outputs)} {format_sandbox_success(sandbox_response.success)}",
f"<pre>{clean_logs(output)}</pre>",
i == len(sandbox_response.outputs) - 1,
)
for i, output in enumerate(sandbox_response.outputs)
if len(sandbox_response.outputs) > 0
]
)
),
opened=True,
)
)
if sandbox_response
else ""
)
# takes in a list of workflow runs and returns a list of messages containing the logs of the failing runs
def get_failing_gha_logs(runs, installation_id) -> str:
token = get_token(installation_id)
all_logs = ""
for run in runs:
# jobs_url
jobs_url = run.jobs_url
jobs_response = requests.get(
jobs_url,
headers={
"Accept": "application/vnd.github+json",
"Authorization": f"Bearer {token}",
"X-GitHub-Api-Version": "2022-11-28",
},
)
if jobs_response.status_code == 200:
failed_jobs = []
jobs = jobs_response.json()["jobs"]
for job in jobs:
if job["conclusion"] == "failure":
failed_jobs.append(job)
failed_jobs_name_list = []
for job in failed_jobs:
# add failed steps
for step in job["steps"]:
if step["conclusion"] == "failure":
failed_jobs_name_list.append(
f"{job['name']}/{step['number']}_{step['name']}"
)
else:
logger.error(
"Failed to get jobs for failing github actions, possible a credentials issue"
)
return all_logs
# make sure jobs in valid
if jobs_response.json()['total_count'] == 0:
logger.error(f"no jobs for this run: {run}, continuing...")
continue
# logs url
logs_url = run.logs_url
logs_response = requests.get(
logs_url,
headers={
"Accept": "application/vnd.github+json",
"Authorization": f"Bearer {token}",
"X-GitHub-Api-Version": "2022-11-28",
},
allow_redirects=True,
)
# Check if the request was successful
if logs_response.status_code == 200:
zip_data = io.BytesIO(logs_response.content)
zip_file = zipfile.ZipFile(zip_data, "r")
zip_file_names = zip_file.namelist()
for file in failed_jobs_name_list:
if f"{file}.txt" in zip_file_names:
logs = zip_file.read(f"{file}.txt").decode("utf-8")
logs_prompt = clean_gh_logs(logs)
all_logs += logs_prompt + "\n"
else:
logger.error(
"Failed to get logs for failing github actions, likely a credentials issue"
)
return all_logs
def delete_old_prs(repo: Repository, issue_number: int):
logger.info("Deleting old PRs...")
prs = repo.get_pulls(
state="open",
sort="created",
direction="desc",
base=SweepConfig.get_branch(repo),
)
for pr in tqdm(prs.get_page(0)):
# # Check if this issue is mentioned in the PR, and pr is owned by bot
# # This is done in create_pr, (pr_description = ...)
if pr.user.login == CURRENT_USERNAME and f"Fixes #{issue_number}.\n" in pr.body:
safe_delete_sweep_branch(pr, repo)
break
def construct_sweep_bot(
repo: Repository,
repo_name: str,
issue_url: str,
repo_description: str,
title: str,
message_summary: str,
cloned_repo: ClonedRepo,
ticket_progress: TicketProgress,
chat_logger: ChatLogger,
snippets: Any = None,
tree: Any = None,
comments: Any = None,
) -> SweepBot:
human_message = HumanMessagePrompt(
repo_name=repo_name,
issue_url=issue_url,
repo_description=repo_description.strip(),
title=title,
summary=message_summary,
snippets=snippets,
tree=tree,
)
sweep_bot = SweepBot.from_system_message_content(
human_message=human_message,
repo=repo,
is_reply=bool(comments),
chat_logger=chat_logger,
cloned_repo=cloned_repo,
ticket_progress=ticket_progress,
)
return sweep_bot
def get_comment_header(
index: int,
g: Github,
repo_full_name: str,
user_settings: UserSettings,
progress_headers: list[None | str],
tracking_id: str | None,
payment_message_start: str,
user_settings_message: str,
errored: bool = False,
pr_message: str = "",
done: bool = False,
initial_sandbox_response: int | SandboxResponse = -1,
initial_sandbox_response_file=None,
config_pr_url: str | None = None,
):
config_pr_message = (
"\n"
+ f"<div align='center'>Install Sweep Configs: <a href='{config_pr_url}'>Pull Request</a></div>"
if config_pr_url is not None
else ""
)
actions_message = create_action_buttons(
[
RESTART_SWEEP_BUTTON,
]
)
sandbox_execution_message = "\n\n## GitHub Actions failed\n\nThe sandbox appears to be unavailable or down.\n\n"
if initial_sandbox_response == -1:
sandbox_execution_message = ""
elif initial_sandbox_response is not None:
repo = g.get_repo(repo_full_name)
commit_hash = repo.get_commits()[0].sha
success = initial_sandbox_response.outputs and initial_sandbox_response.success
status = "✓" if success else "X"
sandbox_execution_message = (
"\n\n## GitHub Actions"
+ status
+ "\n\nHere are the GitHub Actions logs prior to making any changes:\n\n"
)
sandbox_execution_message += entities_create_error_logs(
f'<a href="https://github.com/{repo_full_name}/commit/{commit_hash}"><code>{commit_hash[:7]}</code></a>',
initial_sandbox_response,
initial_sandbox_response_file,
)
if success:
sandbox_execution_message += f"\n\nSandbox passed on the latest `{repo.default_branch}`, so sandbox checks will be enabled for this issue."
else:
sandbox_execution_message += "\n\nSandbox failed, so all sandbox checks will be disabled for this issue."
if index < 0:
index = 0
if index == 4:
return (
pr_message
+ config_pr_message
+ f"\n\n---\n{user_settings.get_message(completed=True)}"
+ f"\n\n---\n{actions_message}"
+ sandbox_execution_message
)
total = len(progress_headers)
index += 1 if done else 0
index *= 100 / total
index = int(index)
index = min(100, index)
if errored:
pbar = f"\n\n<img src='https://progress-bar.dev/{index}/?&title=Errored&width=600' alt='{index}%' />"
return (
f"{center(sweeping_gif)}<br/>{center(pbar)}\n\n"
+ f"\n\n---\n{actions_message}"
+ sandbox_execution_message
)
pbar = f"\n\n<img src='https://progress-bar.dev/{index}/?&title=Progress&width=600' alt='{index}%' />"
return (
f"{center(sweeping_gif)}"
+ (
center(
f'\n\n<h2>✨ Track Sweep\'s progress on our <a href="{PROGRESS_BASE_URL}/issues/{tracking_id}">progress dashboard</a>!</h2>'
)
if MONGODB_URI is not None
else ""
)
+ f"<br/>{center(pbar)}"
+ ("\n" + stars_suffix if index != -1 else "")
+ "\n"
+ center(payment_message_start)
+ f"\n\n---\n{user_settings_message}"
+ config_pr_message
+ f"\n\n---\n{actions_message}"
+ sandbox_execution_message
)
def on_ticket(
title: str,
summary: str,
issue_number: int,
issue_url: str, # purely for logging purposes
username: str,
repo_full_name: str,
repo_description: str,
installation_id: int,
comment_id: int = None,
edited: bool = False,
tracking_id: str | None = None,
):
if not os.environ.get("CLI"):
assert validate_license(), "License key is invalid or expired. Please contact us at team@sweep.dev to upgrade to an enterprise license."
with logger.contextualize(
tracking_id=tracking_id,
):
if tracking_id is None:
tracking_id = get_hash()
on_ticket_start_time = time()
logger.info(f"Starting on_ticket with title {title} and summary {summary}")
(
title,
slow_mode,
do_map,
subissues_mode,
sandbox_mode,
fast_mode,
lint_mode,
) = strip_sweep(title)
# fetch images from body of issue
image_urls = get_image_urls_from_issue(issue_number, repo_full_name, installation_id)
image_contents = get_image_contents_from_urls(image_urls)
summary = summary or ""
summary = re.sub(
"<details (open)?>(\r)?\n<summary>Checklist</summary>.*",
"",
summary,
flags=re.DOTALL,
).strip()
summary = re.sub(
"---\s+Checklist:(\r)?\n(\r)?\n- \[[ X]\].*",
"",
summary,
flags=re.DOTALL,
).strip()
summary = re.sub(
"### Details\n\n_No response_", "", summary, flags=re.DOTALL
)
summary = re.sub("\n\n", "\n", summary, flags=re.DOTALL)
repo_name = repo_full_name
user_token, g = get_github_client(installation_id)
repo = g.get_repo(repo_full_name)
current_issue: Issue = repo.get_issue(number=issue_number)
assignee = current_issue.assignee.login if current_issue.assignee else None
if assignee is None:
assignee = current_issue.user.login
ticket_progress = TicketProgress(
tracking_id=tracking_id,
username=username,
context=TicketContext(
title=title,
description=summary,
repo_full_name=repo_full_name,
issue_number=issue_number,
is_public=repo.private is False,
start_time=int(time()),
),
)
branch_match = re.search(
r"([B|b]ranch:) *(?P<branch_name>.+?)(\s|$)", summary
)
overrided_branch_name = None
if branch_match and "branch_name" in branch_match.groupdict():
overrided_branch_name = (
branch_match.groupdict()["branch_name"].strip().strip("`\"'")
)
# TODO: this code might be finicky, might have missed edge cases
if overrided_branch_name.startswith("https://github.com/"):
overrided_branch_name = overrided_branch_name.split("?")[0].split(
"tree/"
)[-1]
SweepConfig.get_branch(repo, overrided_branch_name)
chat_logger = (
ChatLogger(
{
"repo_name": repo_name,
"title": title,
"summary": summary,
"issue_number": issue_number,
"issue_url": issue_url,
"username": (
username if not username.startswith("sweep") else assignee
),
"repo_full_name": repo_full_name,
"repo_description": repo_description,
"installation_id": installation_id,
"type": "ticket",
"mode": ENV,
"comment_id": comment_id,
"edited": edited,
"tracking_id": tracking_id,
},
active=True,
)
if MONGODB_URI
else None
)
if chat_logger and not IS_SELF_HOSTED:
is_paying_user = chat_logger.is_paying_user()
is_consumer_tier = chat_logger.is_consumer_tier()
use_faster_model = chat_logger.use_faster_model()
else:
is_paying_user = True
is_consumer_tier = False
use_faster_model = False
if use_faster_model:
raise Exception(FASTER_MODEL_MESSAGE)
if fast_mode:
use_faster_model = True
if not comment_id and not edited and chat_logger and not sandbox_mode:
fire_and_forget_wrapper(chat_logger.add_successful_ticket)(
gpt3=use_faster_model
)
organization, repo_name = repo_full_name.split("/")
metadata = {
"issue_url": issue_url,
"repo_full_name": repo_full_name,
"organization": organization,
"repo_name": repo_name,
"repo_description": repo_description,
"username": username,
"comment_id": comment_id,
"title": title,
"installation_id": installation_id,
"function": "on_ticket",
"edited": edited,
"model": "gpt-3.5" if use_faster_model else "gpt-4",
"tier": "pro" if is_paying_user else "free",
"mode": ENV,
"slow_mode": slow_mode,
"do_map": do_map,
"subissues_mode": subissues_mode,
"sandbox_mode": sandbox_mode,
"fast_mode": fast_mode,
"is_self_hosted": IS_SELF_HOSTED,
"tracking_id": tracking_id,
}
fire_and_forget_wrapper(posthog.capture)(
username, "started", properties=metadata
)
try:
if current_issue.state == "closed":
fire_and_forget_wrapper(posthog.capture)(
username,
"issue_closed",
properties={
**metadata,
"duration": round(time() - on_ticket_start_time),
},
)
return {"success": False, "reason": "Issue is closed"}
fire_and_forget_wrapper(add_emoji)(current_issue, comment_id)
fire_and_forget_wrapper(remove_emoji)(
current_issue, comment_id, content_to_delete="rocket"
)
fire_and_forget_wrapper(remove_emoji)(
current_issue, comment_id, content_to_delete="confused"
)
fire_and_forget_wrapper(current_issue.edit)(body=summary)
replies_text = ""
summary = summary if summary else ""
fire_and_forget_wrapper(delete_old_prs)(repo, issue_number)
if not sandbox_mode:
progress_headers = [
None,
"Step 1: 🔎 Searching",
"Step 2: ⌨️ Coding",
"Step 3: 🔁 Code Review",
]
else:
progress_headers = [
None,
"📖 Reading File",
"🛠️ Executing Sandbox",
]
issue_comment = None
payment_message, payment_message_start = get_payment_messages(
chat_logger
)
ticket_progress.context.payment_context = PaymentContext(
use_faster_model=use_faster_model,
pro_user=is_paying_user,
daily_tickets_used=(
chat_logger.get_ticket_count(use_date=True)
if chat_logger
else 0
),
monthly_tickets_used=(
chat_logger.get_ticket_count() if chat_logger else 0
),
)
ticket_progress.save()
config_pr_url = None
user_settings = UserSettings.from_username(username=username)
user_settings_message = user_settings.get_message()
cloned_repo = ClonedRepo(
repo_full_name,
installation_id=installation_id,
token=user_token,
repo=repo,
branch=overrided_branch_name,
)
# check that repo's directory is non-empty
if os.listdir(cloned_repo.cached_dir) == []:
logger.info("Empty repo")
first_comment = (
"Sweep is currently not supported on empty repositories. Please add some"
f" code to your repository and try again.\n{sep}##"
f" {progress_headers[1]}\n{bot_suffix}{discord_suffix}"
)
if issue_comment is None:
issue_comment = current_issue.create_comment(
first_comment + BOT_SUFFIX
)
else:
issue_comment.edit(first_comment + BOT_SUFFIX)
fire_and_forget_wrapper(add_emoji)(
current_issue, comment_id, reaction_content="confused"
)
fire_and_forget_wrapper(remove_emoji)(content_to_delete="eyes")
return {"success": False}
indexing_message = (
"I'm searching for relevant snippets in your repository. If this is your first"
" time using Sweep, I'm indexing your repository. You can monitor the progress using the progress dashboard"
)
first_comment = (
f"{get_comment_header(0, g, repo_full_name, user_settings, progress_headers, tracking_id, payment_message_start, user_settings_message)}\n{sep}I am currently looking into this ticket! I"
" will update the progress of the ticket in this comment. I am currently"
f" searching through your code, looking for relevant snippets.\n{sep}##"
f" {progress_headers[1]}\n{indexing_message}{bot_suffix}{discord_suffix}"
)
# Find Sweep's previous comment
comments = []
for comment in current_issue.get_comments():
comments.append(comment)
if comment.user.login == CURRENT_USERNAME:
issue_comment = comment
break
if issue_comment is None:
issue_comment = current_issue.create_comment(first_comment)
else:
fire_and_forget_wrapper(issue_comment.edit)(first_comment)
old_edit = issue_comment.edit
issue_comment.edit = lambda msg: old_edit(msg + BOT_SUFFIX)
past_messages = {}
current_index = 0
table = None
initial_sandbox_response = -1
initial_sandbox_response_file = None
def refresh_token():
user_token, g = get_github_client(installation_id)
repo = g.get_repo(repo_full_name)
return user_token, g, repo
def edit_sweep_comment(
message: str,
index: int,
pr_message="",
done=False,
add_bonus_message=True,
):
nonlocal current_index, user_token, g, repo, issue_comment, initial_sandbox_response, initial_sandbox_response_file
message = sanitize_string_for_github(message)
if pr_message:
pr_message = sanitize_string_for_github(pr_message)
# -1 = error, -2 = retry
# Only update the progress bar if the issue generation errors.
errored = index == -1
if index >= 0:
past_messages[index] = message
current_index = index
agg_message = None
# Include progress history
# index = -2 is reserved for
for i in range(
current_index + 2
): # go to next header (for Working on it... text)
if i == 0 or i >= len(progress_headers):
continue # skip None header
header = progress_headers[i]
if header is not None:
header = "## " + header + "\n"
else:
header = "No header\n"
msg = header + (past_messages.get(i) or "Working on it...")
if agg_message is None:
agg_message = msg
else:
agg_message = agg_message + f"\n{sep}" + msg
suffix = bot_suffix + discord_suffix
if errored:
agg_message = (
"## ❌ Unable to Complete PR"
+ "\n"
+ message
+ (
"\n\nFor bonus GPT-4 tickets, please report this bug on"
f" **[Discord](https://discord.gg/invite/sweep)** (tracking ID: `{tracking_id}`)."
if add_bonus_message
else ""
)
)
if table is not None:
agg_message = (
agg_message
+ f"\n{sep}Please look at the generated plan. If something looks"
f" wrong, please add more details to your issue.\n\n{table}"
)
suffix = bot_suffix # don't include discord suffix for error messages
# Update the issue comment
msg = f"{get_comment_header(current_index, g, repo_full_name, user_settings, progress_headers, tracking_id, payment_message_start, user_settings_message, errored=errored, pr_message=pr_message, done=done, initial_sandbox_response=initial_sandbox_response, initial_sandbox_response_file=initial_sandbox_response_file, config_pr_url=config_pr_url)}\n{sep}{agg_message}{suffix}"
try:
issue_comment.edit(msg)
except BadCredentialsException:
logger.error(
f"Bad credentials, refreshing token (tracking ID: `{tracking_id}`)"
)
user_token, g = get_github_client(installation_id)
repo = g.get_repo(repo_full_name)
issue_comment = None
for comment in comments:
if comment.user.login == CURRENT_USERNAME:
issue_comment = comment
current_issue = repo.get_issue(number=issue_number)
if issue_comment is None:
issue_comment = current_issue.create_comment(msg)
else:
issue_comment = [
comment
for comment in current_issue.get_comments()
if comment.user.login == CURRENT_USERNAME
][0]
issue_comment.edit(msg)
if use_faster_model:
edit_sweep_comment(
FASTER_MODEL_MESSAGE, -1, add_bonus_message=False
)
posthog.capture(
username,
"ran_out_of_tickets",
properties={
**metadata,
"duration": round(time() - on_ticket_start_time),
},
)
fire_and_forget_wrapper(add_emoji)(
current_issue, comment_id, reaction_content="confused"
)
fire_and_forget_wrapper(remove_emoji)(content_to_delete="eyes")
return {
"success": False,
"error_message": "We deprecated supporting GPT 3.5.",
}
error_message = validate_issue(title + summary)
if error_message:
logger.warning(f"Validation error: {error_message}")
edit_sweep_comment(
(
f"The issue was rejected with the following response:\n\n{blockquote(error_message)}"
),
-1,
)
fire_and_forget_wrapper(add_emoji)(
current_issue, comment_id, reaction_content="confused"
)
fire_and_forget_wrapper(remove_emoji)(content_to_delete="eyes")
posthog.capture(
username,
"invalid_issue",
properties={
**metadata,
"duration": round(time() - on_ticket_start_time),
},
)
return {"success": True}
prs_extracted = PRReader.extract_prs(repo, summary)
message_summary = summary
if prs_extracted:
message_summary += "\n\n" + prs_extracted
edit_sweep_comment(
create_collapsible(
"I found that you mentioned the following Pull Requests that might be important:",
blockquote(
prs_extracted,
),
),
1,
)
try:
# search/context manager
logger.info("Searching for relevant snippets...")
if image_contents: # doing it here to avoid editing the original issue
message_summary += ImageDescriptionBot().describe_images(text=title + message_summary, images=image_contents)
snippets, tree, _, repo_context_manager = fetch_relevant_files(
cloned_repo,
title,
message_summary,
replies_text,
username,
metadata,
on_ticket_start_time,
tracking_id,
is_paying_user,
is_consumer_tier,
issue_url,
chat_logger,
ticket_progress,
images=image_contents
)
cloned_repo = repo_context_manager.cloned_repo
except Exception as e:
edit_sweep_comment(
(
"It looks like an issue has occurred around fetching the files."
f" The exception was {str(e)}. If this error persists"
f" contact team@sweep.dev.\n\n> @{username}, editing this issue description to include more details will automatically make me relaunch. Please join our Discord server for support (tracking_id={tracking_id})"
),
-1,
)
raise Exception("Failed to fetch files") from e
_user_token, g = get_github_client(installation_id)
user_token, g, repo = refresh_token()
cloned_repo.token = user_token
repo = g.get_repo(repo_full_name)
ticket_progress.search_progress.indexing_progress = (
ticket_progress.search_progress.indexing_total
)
ticket_progress.status = TicketProgressStatus.PLANNING
ticket_progress.save()
# Fetch git commit history
if not repo_description:
repo_description = "No description provided."
message_summary += replies_text
get_documentation_dict(repo)
docs_results = ""
sweep_bot = construct_sweep_bot(
repo=repo,
repo_name=repo_name,
issue_url=issue_url,
repo_description=repo_description,
title=title,
message_summary=message_summary,
cloned_repo=cloned_repo,
ticket_progress=ticket_progress,
chat_logger=chat_logger,
snippets=snippets,
tree=tree,
comments=comments,
)
# Check repository for sweep.yml file.
sweep_yml_exists = False
sweep_yml_failed = False
for content_file in repo.get_contents(""):
if content_file.name == "sweep.yaml":
sweep_yml_exists = True
# Check if YAML is valid
yaml_content = content_file.decoded_content.decode("utf-8")
sweep_yaml_dict = {}
try:
sweep_yaml_dict = yaml.safe_load(yaml_content)
except Exception:
logger.error(f"Failed to load YAML file: {yaml_content}")
if len(sweep_yaml_dict) > 0:
break
linter_config = yamllint_config.YamlLintConfig(custom_config)
problems = list(linter.run(yaml_content, linter_config))
if problems:
errors = [
f"Line {problem.line}: {problem.desc} (rule: {problem.rule})"
for problem in problems
]
error_message = "\n".join(errors)
markdown_error_message = f"**There is something wrong with your [sweep.yaml](https://github.com/{repo_full_name}/blob/main/sweep.yaml):**\n```\n{error_message}\n```"
sweep_yml_failed = True
logger.error(markdown_error_message)
edit_sweep_comment(markdown_error_message, -1)
else:
logger.info("The YAML file is valid. No errors found.")
break
# If sweep.yaml does not exist, then create a new PR that simply creates the sweep.yaml file.
if not sweep_yml_exists:
try:
logger.info("Creating sweep.yaml file...")
config_pr = create_config_pr(sweep_bot, cloned_repo=cloned_repo)
config_pr_url = config_pr.html_url
edit_sweep_comment(message="", index=-2)
except Exception as e:
logger.error(
"Failed to create new branch for sweep.yaml file.\n"
f"{e}\n{traceback.format_exc()}"
)
else:
logger.info("sweep.yaml file already exists.")
try:
# ANALYZE SNIPPETS
newline = "\n"
edit_sweep_comment(
"I found the following snippets in your repository. I will now analyze"
" these snippets and come up with a plan."
+ "\n\n"
+ create_collapsible(
"Some code snippets I think are relevant in decreasing order of relevance (click to expand). If some file is missing from here, you can mention the path in the ticket description.",
"\n".join(
[
f"https://github.com/{organization}/{repo_name}/blob/{repo.get_commits()[0].sha}/{snippet.file_path}#L{max(snippet.start, 1)}-L{min(snippet.end, snippet.content.count(newline) - 1)}\n"
for snippet in snippets
]
),
)
+ (
create_collapsible(
"I also found that you mentioned the following Pull Requests that may be helpful:",
blockquote(prs_extracted),
)
if prs_extracted
else ""
)
+ (f"\n\n{docs_results}\n\n" if docs_results else ""),
1,
)
logger.info("Fetching files to modify/create...")
file_change_requests, plan = get_files_to_change(
relevant_snippets=repo_context_manager.current_top_snippets,
read_only_snippets=repo_context_manager.read_only_snippets,
problem_statement=f"{title}\n\n{message_summary}",
repo_name=repo_full_name,
cloned_repo=cloned_repo,
images=image_contents
)
validate_file_change_requests(file_change_requests, cloned_repo)
ticket_progress.planning_progress.file_change_requests = (
file_change_requests
)
ticket_progress.coding_progress.file_change_requests = (
file_change_requests
)
ticket_progress.coding_progress.assistant_conversations = [
AssistantConversation() for fcr in file_change_requests
]
ticket_progress.status = TicketProgressStatus.CODING
ticket_progress.save()
if not file_change_requests:
if len(title + summary) < 60:
edit_sweep_comment(
(
"Sorry, I could not find any files to modify, can you please"
" provide more details? Please make sure that the title and"
" summary of the issue are at least 60 characters."
),
-1,
)
else:
edit_sweep_comment(
(
"Sorry, I could not find any files to modify, can you please"
" provide more details?"
),
-1,
)
raise Exception("No files to modify.")
file_change_requests: list[
FileChangeRequest
] = sweep_bot.validate_file_change_requests(
file_change_requests,
)
ticket_progress.planning_progress.file_change_requests = (
file_change_requests
)
ticket_progress.coding_progress.assistant_conversations = [
AssistantConversation() for fcr in file_change_requests
]
ticket_progress.save()
table = tabulate(
[
[
file_change_request.entity_display,
file_change_request.instructions_display.replace(
"\n", "<br/>"
).replace("```", "\\```"),
]
for file_change_request in file_change_requests
if file_change_request.change_type != "check"
],
headers=["File Path", "Proposed Changes"],
tablefmt="pipe",
)
logger.info("Generating PR...")
pull_request = PullRequest(
title="Sweep: " + title,
branch_name="sweep/" + to_branch_name(title),
content="",
)
logger.info("Making PR...")
ticket_progress.context.branch_name = pull_request.branch_name
ticket_progress.save()
files_progress: list[tuple[str, str, str, str]] = [
(
file_change_request.entity_display,
file_change_request.instructions_display,
"⏳ In Progress",
"",
)
for file_change_request in file_change_requests
]
checkboxes_progress: list[tuple[str, str, str]] = [
(
file_change_request.entity_display,
file_change_request.instructions_display,
" ",
)
for file_change_request in file_change_requests
if not file_change_request.change_type == "check"
]
checkboxes_contents = "\n".join(
[
create_checkbox(
f"`{filename}`", blockquote(instructions), check == "X"
)
for filename, instructions, check in checkboxes_progress
]
)
create_collapsible("Checklist", checkboxes_contents, opened=True)
file_change_requests[0].status = "running"
condensed_checkboxes_contents = "\n".join(
[
create_checkbox(f"`{filename}`", "", check == "X").strip()
for filename, instructions, check in checkboxes_progress
]
)
condensed_checkboxes_collapsible = create_collapsible(
"Checklist", condensed_checkboxes_contents, opened=True
)
current_issue = repo.get_issue(number=issue_number)
current_issue.edit(
body=summary + "\n\n" + condensed_checkboxes_collapsible
)
delete_branch = False
generator = create_pr_changes(
file_change_requests,
pull_request,
sweep_bot,
username,
installation_id,
issue_number,
chat_logger=chat_logger,
base_branch=overrided_branch_name,
additional_messages=[],
)
edit_sweep_comment(checkboxes_contents, 2)
if not file_change_requests:
raise NoFilesException()
response = {
"error": Exception(
f"Sweep failed to generate any file change requests! This could mean that Sweep failed to find the correct lines of code to modify or that GPT-4 did not respond in our specified format. Sometimes, retrying will fix this error. Otherwise, reach out to our Discord server for support (tracking_id={tracking_id})."
)
}
changed_files = []
for item in generator:
if isinstance(item, dict):
response = item
break
(
new_file_contents,
_,
commit,
file_change_requests,
) = item
# append all files that have been changed
if new_file_contents:
for file_name, _ in new_file_contents.items():
changed_files.append(file_name)
commit_hash: str = (
commit
if isinstance(commit, str)
else (
commit.sha
if commit is not None
else repo.get_branch(
pull_request.branch_name
).commit.sha
)
)
commit_url = (
f"https://github.com/{repo_full_name}/commit/{commit_hash}"
)
commit_url_display = (
f"<a href='{commit_url}'><code>{commit_hash[:7]}</code></a>"
)
create_error_logs(
commit_url_display,
None,
status=(
"✓"
),
)
checkboxes_progress = [
(
file_change_request.display_summary
+ " "
+ file_change_request.status_display
+ " "
+ (file_change_request.commit_hash_url or "")
+ f" [Edit]({file_change_request.get_edit_url(repo.full_name, pull_request.branch_name)})",
file_change_request.instructions_ticket_display
+ f"\n\n{file_change_request.diff_display}",
(
"X"
if file_change_request.status
in ("succeeded", "failed")
else " "
),
)
for file_change_request in file_change_requests
]
checkboxes_contents = "\n".join(
[
checkbox_template.format(
check=check,
filename=filename,
instructions=blockquote(instructions),
)
for filename, instructions, check in checkboxes_progress
]
)
collapsible_template.format(
summary="Checklist",
body=checkboxes_contents,
opened="open",
)
condensed_checkboxes_contents = "\n".join(
[
checkbox_template.format(
check=check,
filename=filename,
instructions="",
).strip()
for filename, instructions, check in checkboxes_progress
if not instructions.lower().startswith("run")
]
)
condensed_checkboxes_collapsible = collapsible_template.format(
summary="Checklist",
body=condensed_checkboxes_contents,
opened="open",
)
try:
current_issue = repo.get_issue(number=issue_number)
except BadCredentialsException:
user_token, g, repo = refresh_token()
cloned_repo.token = user_token
current_issue.edit(
body=summary + "\n\n" + condensed_checkboxes_collapsible
)
logger.info(files_progress)
edit_sweep_comment(checkboxes_contents, 2)
if not response.get("success"):
raise Exception(f"Failed to create PR: {response.get('error')}")
checkboxes_contents = "\n".join(
[
checkbox_template.format(
check=check,
filename=filename,
instructions=blockquote(instructions),
)
for filename, instructions, check in checkboxes_progress
]
)
condensed_checkboxes_contents = "\n".join(
[
checkbox_template.format(
check=check,
filename=filename,
instructions="",
).strip()
for filename, instructions, check in checkboxes_progress
if not instructions.lower().startswith("run")
]
)
condensed_checkboxes_collapsible = collapsible_template.format(
summary="Checklist",
body=condensed_checkboxes_contents,
opened="open",
)
for _ in range(3):
try:
current_issue.edit(
body=summary + "\n\n" + condensed_checkboxes_collapsible
)
break
except Exception:
from time import sleep
sleep(1)
edit_sweep_comment(checkboxes_contents, 2)
pr_changes = response["pull_request"]
# change the body here
diff_text = get_branch_diff_text(
repo=repo,
branch=pull_request.branch_name,
base_branch=overrided_branch_name,
)
new_description = PRDescriptionBot().describe_diffs(
diff_text,
pull_request.title,
)
# TODO: update the title as well
if new_description:
pr_changes.body = (
f"{new_description}\n\nFixes"
f" #{issue_number}.\n\n---\n\n{UPDATES_MESSAGE}\n\n---\n\n{INSTRUCTIONS_FOR_REVIEW}{BOT_SUFFIX}"
)
edit_sweep_comment(
"I have finished coding the issue. I am now reviewing it for completeness.",
3,
)
change_location = f" [`{pr_changes.pr_head}`](https://github.com/{repo_full_name}/commits/{pr_changes.pr_head}).\n\n"
review_message = (
"Here are my self-reviews of my changes at" + change_location
)
try:
fire_and_forget_wrapper(remove_emoji)(content_to_delete="eyes")
except Exception:
pass
changes_required, review_message = False, ""
if changes_required:
edit_sweep_comment(
review_message
+ "\n\nI finished incorporating these changes.",
3,
)
else:
edit_sweep_comment(
f"I have finished reviewing the code for completeness. I did not find errors for {change_location}",
3,
)
revert_buttons = []
for changed_file in set(changed_files):
revert_buttons.append(
Button(label=f"{RESET_FILE} {changed_file}")
)
revert_buttons_list = ButtonList(
buttons=revert_buttons, title=REVERT_CHANGED_FILES_TITLE
)
# delete failing sweep yaml if applicable
if sweep_yml_failed:
try:
repo.delete_file(
"sweep.yaml",
"Delete failing sweep.yaml",
branch=pr_changes.pr_head,
sha=repo.get_contents("sweep.yaml").sha,
)
except Exception:
pass
# create draft pr, then convert to regular pr later
pr: GithubPullRequest = repo.create_pull(
title=pr_changes.title,
body=pr_changes.body,
head=pr_changes.pr_head,
base=overrided_branch_name or SweepConfig.get_branch(repo),
# removed draft PR
draft=False,
)
try:
pr.add_to_assignees(username)
except Exception as e:
logger.error(
f"Failed to add assignee {username}: {e}, probably a bot."
)
ticket_progress.status = TicketProgressStatus.COMPLETE
ticket_progress.context.done_time = time()
ticket_progress.context.pr_id = pr.number
ticket_progress.save()
if revert_buttons:
pr.create_issue_comment(
revert_buttons_list.serialize() + BOT_SUFFIX
)
# add comments before labelling
pr.add_to_labels(GITHUB_LABEL_NAME)
current_issue.create_reaction("rocket")
heres_pr_message = f'<h1 align="center">🚀 Here\'s the PR! <a href="{pr.html_url}">#{pr.number}</a></h1>'
progress_message = f'<div align="center"><b>See Sweep\'s progress at <a href="{PROGRESS_BASE_URL}/issues/{tracking_id}">the progress dashboard</a>!</b></div>'
edit_sweep_comment(
review_message + "\n\nSuccess! 🚀",
4,
pr_message=(
f"{center(heres_pr_message)}\n{center(progress_message)}\n{center(payment_message_start)}"
),
done=True,
)
user_settings = UserSettings.from_username(username=username)
user = g.get_user(username)
full_name = user.name or user.login
name = full_name.split(" ")[0]
files_changed = []
for fcr in file_change_requests:
if fcr.change_type in ("create", "modify"):
diff = list(
difflib.unified_diff(
(fcr.old_content or "").splitlines() or [],
(fcr.new_content or "").splitlines() or [],
lineterm="",
)
)
added = sum(
1
for line in diff
if line.startswith("+") and not line.startswith("+++")
)
removed = sum(
1
for line in diff
if line.startswith("-") and not line.startswith("---")
)
files_changed.append(
f"<code>{fcr.filename}</code> (+{added}/-{removed})"
)
user_settings.send_email(
subject=f"Sweep Pull Request Complete for {repo_name}#{issue_number} {title}",
html=email_template.format(
name=name,
pr_url=pr.html_url,
issue_number=issue_number,
repo_full_name=repo_full_name,
pr_number=pr.number,
progress_url=f"{PROGRESS_BASE_URL}/issues/{tracking_id}",
summary=markdown.markdown(pr_changes.body),
files_changed="\n".join(
[f"<li>{item}</li>" for item in files_changed]
),
sweeping_gif=sweeping_gif,
),
)
# poll for github to check when gha are done
total_poll_attempts = 0
total_edit_attempts = 0
SLEEP_DURATION_SECONDS = 15
GITHUB_ACTIONS_ENABLED = get_gha_enabled(repo=repo) and DEPLOYMENT_GHA_ENABLED
GHA_MAX_EDIT_ATTEMPTS = 5 # max number of times to edit PR
current_commit = pr.head.sha
while GITHUB_ACTIONS_ENABLED:
logger.info(
f"Polling to see if Github Actions have finished... {total_poll_attempts}"
)
# we wait at most 60 minutes
if total_poll_attempts * SLEEP_DURATION_SECONDS // 60 >= 60:
break
else:
# wait SLEEP_DURATION_SECONDS between check attempts
total_poll_attempts += 1
from time import sleep
sleep(SLEEP_DURATION_SECONDS)
runs = list(repo.get_workflow_runs(branch=pr.head.ref, head_sha=current_commit))
# if all runs have succeeded, break
if all([run.conclusion == "success" for run in runs]):
break
# if any of them have failed we retry
if any([run.conclusion == "failure" for run in runs]):
failed_runs = [
run for run in runs if run.conclusion == "failure"
]
failed_gha_logs: list[str] = get_failing_gha_logs(
failed_runs,
installation_id,
)
if failed_gha_logs:
# make edits to the PR
# TODO: look into rollbacks so we don't continue adding onto errors
cloned_repo = ClonedRepo( # reinitialize cloned_repo to avoid conflicts
repo_full_name,
installation_id=installation_id,
token=user_token,
repo=repo,
branch=pr.head.ref,
)
diffs = get_branch_diff_text(repo=repo, branch=pr.head.ref, base_branch=pr.base.ref)
problem_statement = f"{title}\n{message_summary}\n{replies_text}"
all_information_prompt = GHA_PROMPT.format(
problem_statement=problem_statement,
github_actions_logs=failed_gha_logs,
changes_made=diffs,
)
repo_context_manager = prep_snippets(cloned_repo=cloned_repo, query=(title + message_summary + replies_text).strip("\n"), ticket_progress=ticket_progress) # need to do this, can use the old query for speed
sweep_bot: SweepBot = construct_sweep_bot(
repo=repo,
repo_name=repo_name,
issue_url=issue_url,
repo_description=repo_description,
title="Fix the following errors to complete the user request.",
message_summary=all_information_prompt,
cloned_repo=cloned_repo,
ticket_progress=ticket_progress,
chat_logger=chat_logger,
snippets=snippets,
tree=tree,
comments=comments,
)
file_change_requests, plan = get_files_to_change_for_gha(
relevant_snippets=repo_context_manager.current_top_snippets,
read_only_snippets=repo_context_manager.read_only_snippets,
problem_statement=all_information_prompt,
updated_files=new_file_contents,
cloned_repo=cloned_repo,
chat_logger=chat_logger,
)
validate_file_change_requests(file_change_requests, cloned_repo)
previous_modify_files_dict: dict[str, dict[str, str | list[str]]] | None = None
_, commit, _ = sweep_bot.handle_modify_file_main(
branch=pr.head.ref,
assistant_conversation=None,
additional_messages=[],
previous_modify_files_dict=previous_modify_files_dict,
file_change_requests=file_change_requests,
username=username
)
current_commit = commit.sha
pr = repo.get_pull(pr.number) # IMPORTANT: resync PR otherwise you'll fetch old GHA runs
total_edit_attempts += 1
if total_edit_attempts >= GHA_MAX_EDIT_ATTEMPTS:
logger.info(f"Tried to edit PR {GHA_MAX_EDIT_ATTEMPTS} times, giving up.")
break
# if none of the runs have completed we wait and poll github
logger.info(
f"No Github Actions have failed yet and not all have succeeded yet, waiting for {SLEEP_DURATION_SECONDS} seconds before polling again..."
)
# break from main for loop
convert_pr_draft_field(pr, is_draft=False, installation_id=installation_id)
except MaxTokensExceeded as e:
logger.info("Max tokens exceeded")
ticket_progress.status = TicketProgressStatus.ERROR
ticket_progress.error_message = "Max tokens exceeded. Feel free to add more details to the issue description for Sweep to better address it, or alternatively, reach out to Kevin or William for help at https://discord.gg/sweep."
ticket_progress.save()
log_error(
is_paying_user,
is_consumer_tier,
username,
issue_url,
"Max Tokens Exceeded",
str(e) + "\n" + traceback.format_exc(),
priority=2,
)
if chat_logger and chat_logger.is_paying_user():
edit_sweep_comment(
(
f"Sorry, I could not edit `{e.filename}` as this file is too long."
" We are currently working on improved file streaming to address"
" this issue.\n"
),
-1,
)
else:
edit_sweep_comment(
(
f"Sorry, I could not edit `{e.filename}` as this file is too"
" long.\n\nIf this file is incorrect, please describe the desired"
" file in the prompt. However, if you would like to edit longer"
" files, consider upgrading to [Sweep Pro](https://sweep.dev/) for"
" longer context lengths.\n"
),
-1,
)
delete_branch = True
raise e
except NoFilesException as e:
ticket_progress.status = TicketProgressStatus.ERROR
ticket_progress.error_message = "Sweep could not find files to modify to address this issue. Feel free to add more details to the issue description for Sweep to better address it, or alternatively, reach out to Kevin or William for help at https://discord.gg/sweep."
ticket_progress.save()
logger.info("Sweep could not find files to modify")
log_error(
is_paying_user,
is_consumer_tier,
username,
issue_url,
"Sweep could not find files to modify",
str(e) + "\n" + traceback.format_exc(),
priority=2,
)
edit_sweep_comment(
(
"Sorry, Sweep could not find any appropriate files to edit to address"
" this issue. If this is a mistake, please provide more context and Sweep"
f" will retry!\n\n@{username}, please edit the issue description to"
" include more details. You can also ask for help on our community"
" forum: https://community.sweep.dev/"
),
-1,
)
delete_branch = True
raise e
except openai.BadRequestError as e:
ticket_progress.status = TicketProgressStatus.ERROR
ticket_progress.error_message = "Sorry, it looks like there is an error with communicating with OpenAI. If this error persists, reach out to Kevin or William for help at https://discord.gg/sweep."
ticket_progress.save()
logger.error(traceback.format_exc())
logger.error(e)
edit_sweep_comment(
(
"I'm sorry, but it looks like our model has run out of context length. We're"
" trying to make this happen less, but one way to mitigate this is to"
" code smaller files. If this error persists report it at"
" https://discord.gg/sweep."
),
-1,
)
log_error(
is_paying_user,
is_consumer_tier,
username,
issue_url,
"Context Length",
str(e) + "\n" + traceback.format_exc(),
priority=2,
)
posthog.capture(
username,
"failed",
properties={
"error": str(e),
"trace": traceback.format_exc(),
"reason": "Invalid request error / context length",
**metadata,
"duration": round(time() - on_ticket_start_time),
},
)
delete_branch = True
raise e
except AssistantRaisedException as e:
if ticket_progress is not None:
ticket_progress.status = TicketProgressStatus.ERROR
ticket_progress.error_message = f"Sweep raised an error with the following message: {e.message}. Feel free to add more details to the issue description for Sweep to better address it, or alternatively, reach out to Kevin or William for help at https://discord.gg/sweep."
ticket_progress.save()
logger.exception(e)
edit_sweep_comment(
f"Sweep raised an error with the following message:\n{blockquote(e.message)}",
-1,
)
log_error(
is_paying_user,
is_consumer_tier,
username,
issue_url,
"Workflow",
str(e) + "\n" + traceback.format_exc(),
priority=1,
)
raise e
except Exception as e:
ticket_progress.status = TicketProgressStatus.ERROR
ticket_progress.error_message = f"Internal server error: {str(e)}. Feel free to add more details to the issue description for Sweep to better address it, or alternatively, reach out to Kevin or William for help at https://discord.gg/sweep."
ticket_progress.save()
logger.error(traceback.format_exc())
logger.error(e)
# title and summary are defined elsewhere
if len(title + summary) < 60:
edit_sweep_comment(
(
"I'm sorry, but it looks like an error occurred due to"
f" a planning failure. The error message is {str(e)}. Feel free to add more details to the issue description"
" so Sweep can better address it. Alternatively, post on our community forum"
" for assistance: https://community.sweep.dev/"
),
-1,
)
else:
edit_sweep_comment(
(
"I'm sorry, but it looks like an error has occurred due to"
+ f" a planning failure. The error message is {str(e)}. Feel free to add more details to the issue description"
+ " so Sweep can better address it. Alternatively, reach out to Kevin or William for help at"
+ " https://discord.gg/sweep."
),
-1,
)
log_error(
is_paying_user,
is_consumer_tier,
username,
issue_url,
"Workflow",
str(e) + "\n" + traceback.format_exc(),
priority=1,
)
raise e
else:
try:
fire_and_forget_wrapper(remove_emoji)(content_to_delete="eyes")
fire_and_forget_wrapper(add_emoji)("rocket")
except SystemExit:
raise SystemExit
except Exception as e:
logger.error(e)
if delete_branch:
try:
if pull_request.branch_name.startswith("sweep"):
repo.get_git_ref(
f"heads/{pull_request.branch_name}"
).delete()
else:
raise Exception(
f"Branch name {pull_request.branch_name} does not start with sweep/"
)
except Exception as e:
logger.error(e)
logger.error(traceback.format_exc())
logger.info(f"Deleted branch {pull_request.branch_name}")
except Exception as e:
posthog.capture(
username,
"failed",
properties={
**metadata,
"error": str(e),
"trace": traceback.format_exc(),
"duration": round(time() - on_ticket_start_time),
},
)
raise e
posthog.capture(
username,
"success",
properties={**metadata, "duration": round(time() - on_ticket_start_time)},
)
logger.info("on_ticket success in " + str(round(time() - on_ticket_start_time)))
return {"success": True}
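The GitHub Actions polling section of `on_ticket` above bounds its work in two ways: a total wait budget and a maximum number of fix attempts. The following is a minimal, self-contained sketch of that control flow only. The function name `poll_workflow_runs` and its callback parameters are hypothetical stand-ins for the real GitHub and SweepBot calls, and the sketch additionally requires at least one run before reporting success, which is an assumption and not what the original loop does.

```python
from typing import Callable, Iterable, Optional


def poll_workflow_runs(
    fetch_conclusions: Callable[[], Iterable[Optional[str]]],
    on_failure: Callable[[], None],
    sleep: Callable[[float], None],
    sleep_seconds: int = 15,
    max_wait_minutes: int = 60,
    max_edit_attempts: int = 5,
) -> str:
    """Poll until runs succeed, the time budget runs out, or fixes are exhausted.

    Returns "success", "timeout", or "gave_up".
    fetch_conclusions and on_failure are hypothetical callbacks standing in
    for listing workflow runs and pushing a fix commit.
    """
    poll_attempts = 0
    edit_attempts = 0
    while True:
        # Same budget check as the original: elapsed minutes, rounded down.
        if poll_attempts * sleep_seconds // 60 >= max_wait_minutes:
            return "timeout"
        poll_attempts += 1
        sleep(sleep_seconds)
        conclusions = list(fetch_conclusions())
        # Assumption: require at least one run, because all() of an
        # empty list is True and would report success immediately.
        if conclusions and all(c == "success" for c in conclusions):
            return "success"
        if any(c == "failure" for c in conclusions):
            on_failure()
            edit_attempts += 1
            if edit_attempts >= max_edit_attempts:
                return "gave_up"
        # Otherwise some runs are still pending, so poll again.
```

Passing `sleep` as a parameter keeps the loop testable without real delays; in production it would be `time.sleep`.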
def handle_sandbox_mode(
title, repo_full_name, repo, ticket_progress, edit_sweep_comment
):
logger.info("Running in sandbox mode")
sweep_bot = SweepBot(repo=repo, ticket_progress=ticket_progress)
logger.info("Getting file contents")
file_name = title.split(":")[1].strip()
file_contents = sweep_bot.get_contents(file_name).decoded_content.decode("utf-8")
try:
ext = file_name.split(".")[-1]
except Exception:
ext = ""
file_contents.replace("```", "\`\`\`")
sha = repo.get_branch(repo.default_branch).commit.sha
permalink = f"https://github.com/{repo_full_name}/blob/{sha}/{file_name}#L1-L{len(file_contents.splitlines())}"
logger.info("Running sandbox")
edit_sweep_comment(
f"Running sandbox for {file_name}. Current Code:\n\n{permalink}",
1,
)
updated_contents, sandbox_response = sweep_bot.check_sandbox(
file_name, file_contents
)
logger.info("Sandbox finished")
logs = (
(
"<br/>"
+ create_collapsible(
"Sandbox logs",
blockquote(
"\n\n".join(
[
create_collapsible(
f"<code>{output}</code> {i + 1}/{len(sandbox_response.outputs)} {format_sandbox_success(sandbox_response.success)}",
f"<pre>{clean_logs(output)}</pre>",
i == len(sandbox_response.outputs) - 1,
)
for i, output in enumerate(sandbox_response.outputs)
if len(sandbox_response.outputs) > 0
]
)
),
opened=True,
)
)
if sandbox_response
else ""
)
updated_contents = updated_contents.replace("```", "\`\`\`")
diff = generate_diff(file_contents, updated_contents).replace("```", "\`\`\`")
diff_display = (
f"Updated Code:\n\n```{ext}\n{updated_contents}```\nDiff:\n```diff\n{diff}\n```"
if diff
else f"Sandbox made no changes to {file_name} (formatters were not configured or Sweep didn't make changes)."
)
edit_sweep_comment(
f"{logs}\n{diff_display}",
2,
)
edit_sweep_comment("N/A", 3)
logger.info("Sandbox comments updated")
def get_branch_diff_text(repo, branch, base_branch=None):
base_branch = base_branch or SweepConfig.get_branch(repo)
comparison = repo.compare(base_branch, branch)
file_diffs = comparison.files
pr_diffs = []
for file in file_diffs:
diff = file.patch
if (
file.status == "added"
or file.status == "modified"
or file.status == "removed"
):
pr_diffs.append((file.filename, diff))
else:
logger.info(
f"File status {file.status} not recognized"
) # TODO(sweep): We don't handle renamed files
return "\n".join([f"{filename}\n{diff}" for filename, diff in pr_diffs])
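`get_branch_diff_text` above skips any file whose status is not `added`, `modified`, or `removed`, and its TODO notes that renamed files are not handled. The sketch below shows one possible way to include renames. `FileDiff` is a hypothetical stand-in for the file objects returned by the comparison, the `previous_filename` field and the `old -> new` header format are assumptions, and a missing patch is treated as an empty string.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FileDiff:
    # Hypothetical stand-in for one entry of comparison.files.
    filename: str
    status: str
    patch: Optional[str] = None
    previous_filename: Optional[str] = None


def format_branch_diff(files: list[FileDiff]) -> str:
    """Join per-file diffs into one text block, including renamed files."""
    sections = []
    for file in files:
        patch = file.patch or ""  # assumption: treat a missing patch as empty
        if file.status in ("added", "modified", "removed"):
            sections.append(f"{file.filename}\n{patch}")
        elif file.status == "renamed":
            # Assumed header format so the reader can see both paths.
            sections.append(f"{file.previous_filename} -> {file.filename}\n{patch}")
        # Any other status is skipped, as in the original function.
    return "\n".join(sections)
```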
def get_payment_messages(chat_logger: ChatLogger):
if chat_logger:
is_paying_user = chat_logger.is_paying_user()
is_consumer_tier = chat_logger.is_consumer_tier()
use_faster_model = chat_logger.use_faster_model()
else:
is_paying_user = True
is_consumer_tier = False
use_faster_model = False
tracking_id = chat_logger.data["tracking_id"] if chat_logger and MONGODB_URI is not None else None
# Monthly ticket allocation by tier
tickets_allocated = 5
if is_consumer_tier:
tickets_allocated = 15
if is_paying_user:
tickets_allocated = 500
purchased_ticket_count = (
chat_logger.get_ticket_count(purchased=True) if chat_logger else 0
)
ticket_count = (
max(tickets_allocated - chat_logger.get_ticket_count(), 0)
+ purchased_ticket_count
if chat_logger
else 999
)
daily_ticket_count = (
(3 - chat_logger.get_ticket_count(use_date=True) if not use_faster_model else 0)
if chat_logger
else 999
)
model_name = "GPT-4"
single_payment_link = "https://buy.stripe.com/00g3fh7qF85q0AE14d"
pro_payment_link = "https://buy.stripe.com/00g5npeT71H2gzCfZ8"
daily_message = (
f" and {daily_ticket_count} for the day"
if not is_paying_user and not is_consumer_tier
else ""
)
user_type = "💎 <b>Sweep Pro</b>" if is_paying_user else "⚡ <b>Sweep Basic Tier</b>"
gpt_tickets_left_message = (
f"{ticket_count} GPT-4 tickets left for the month"
if not is_paying_user
else "unlimited GPT-4 tickets"
)
purchase_message = f"<br/><br/> For more GPT-4 tickets, visit <a href={single_payment_link}>our payment portal</a>. For a one week free trial, try <a href={pro_payment_link}>Sweep Pro</a> (unlimited GPT-4 tickets)."
payment_message = (
f"{user_type}: I used {model_name} to create this ticket. You have {gpt_tickets_left_message}{daily_message}. (tracking ID: <code>{tracking_id}</code>)"
+ (purchase_message if not is_paying_user else "")
)
payment_message_start = (
f"{user_type}: I'm using {model_name}. You have {gpt_tickets_left_message}{daily_message}. (tracking ID: <code>{tracking_id}</code>)"
+ (purchase_message if not is_paying_user else "")
)
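The ticket arithmetic in `get_payment_messages` above can be isolated as a small pure function. This is a sketch of the same calculation only; the name `tickets_remaining` and its parameters are hypothetical and do not exist in the Sweep codebase.

```python
def tickets_remaining(
    is_paying_user: bool,
    is_consumer_tier: bool,
    used_this_month: int,
    purchased: int = 0,
) -> int:
    """Monthly tickets left, following the tiers used in get_payment_messages."""
    tickets_allocated = 5
    if is_consumer_tier:
        tickets_allocated = 15
    if is_paying_user:
        # Checked last, so paying status overrides the consumer tier.
        tickets_allocated = 500
    # The allocation never goes negative; purchased tickets are added on top.
    return max(tickets_allocated - used_this_month, 0) + purchased
```

For example, a basic-tier user who has used 3 tickets has `tickets_remaining(False, False, 3)` equal to 2, and a consumer-tier user who has used 20 tickets but purchased 4 still has 4.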

import base64
import os
from dotenv import load_dotenv
from loguru import logger
logger.print = logger.info
load_dotenv(dotenv_path=".env", override=True, verbose=True)
os.environ["GITHUB_APP_PEM"] = os.environ.get("GITHUB_APP_PEM") or base64.b64decode(
os.environ.get("GITHUB_APP_PEM_BASE64", "")
).decode("utf-8")
if os.environ["GITHUB_APP_PEM"]:
os.environ["GITHUB_APP_ID"] = (
(os.environ.get("GITHUB_APP_ID") or os.environ.get("APP_ID"))
.replace("\\n", "\n")
.strip('"')
)
TEST_BOT_NAME = "sweep-nightly[bot]"
ENV = os.environ.get("ENV", "dev")
BOT_TOKEN_NAME = "bot-token"
# goes under Modal 'discord' secret name (optional, can leave env var blank)
DISCORD_WEBHOOK_URL = os.environ.get("DISCORD_WEBHOOK_URL")
DISCORD_MEDIUM_PRIORITY_URL = os.environ.get("DISCORD_MEDIUM_PRIORITY_URL")
DISCORD_LOW_PRIORITY_URL = os.environ.get("DISCORD_LOW_PRIORITY_URL")
DISCORD_FEEDBACK_WEBHOOK_URL = os.environ.get("DISCORD_FEEDBACK_WEBHOOK_URL")
SWEEP_HEALTH_URL = os.environ.get("SWEEP_HEALTH_URL")
DISCORD_STATUS_WEBHOOK_URL = os.environ.get("DISCORD_STATUS_WEBHOOK_URL")
# goes under Modal 'github' secret name
GITHUB_APP_ID = os.environ.get("GITHUB_APP_ID", os.environ.get("APP_ID"))
# deprecated: old logic transfer so upstream can use this
if GITHUB_APP_ID is None:
if ENV == "prod":
GITHUB_APP_ID = "307814"
elif ENV == "dev":
GITHUB_APP_ID = "324098"
elif ENV == "staging":
GITHUB_APP_ID = "327588"
GITHUB_BOT_USERNAME = os.environ.get("GITHUB_BOT_USERNAME")
# deprecated: left to support old logic
if not GITHUB_BOT_USERNAME:
if ENV == "prod":
GITHUB_BOT_USERNAME = "sweep-ai[bot]"
elif ENV == "dev":
GITHUB_BOT_USERNAME = "sweep-nightly[bot]"
elif ENV == "staging":
GITHUB_BOT_USERNAME = "sweep-canary[bot]"
elif not GITHUB_BOT_USERNAME.endswith("[bot]"):
GITHUB_BOT_USERNAME = GITHUB_BOT_USERNAME + "[bot]"
GITHUB_LABEL_NAME = os.environ.get("GITHUB_LABEL_NAME", "sweep")
GITHUB_LABEL_COLOR = os.environ.get("GITHUB_LABEL_COLOR", "9400D3")
GITHUB_LABEL_DESCRIPTION = os.environ.get(
"GITHUB_LABEL_DESCRIPTION", "Sweep your software chores"
)
GITHUB_APP_PEM = os.environ.get("GITHUB_APP_PEM")
GITHUB_APP_PEM = GITHUB_APP_PEM or os.environ.get("PRIVATE_KEY")
if GITHUB_APP_PEM is not None:
GITHUB_APP_PEM = GITHUB_APP_PEM.strip(' \n"') # Remove whitespace and quotes
GITHUB_APP_PEM = GITHUB_APP_PEM.replace("\\n", "\n")
GITHUB_CONFIG_BRANCH = os.environ.get("GITHUB_CONFIG_BRANCH", "sweep/add-sweep-config")
GITHUB_DEFAULT_CONFIG = os.environ.get(
"GITHUB_DEFAULT_CONFIG",
"""# Sweep AI turns bugs & feature requests into code changes (https://sweep.dev)
# For details on our config file, check out our docs at https://docs.sweep.dev/usage/config
# This setting contains a list of rules that Sweep will check for. If any of these rules are broken in a new commit, Sweep will create a pull request to fix the broken rule.
rules:
{additional_rules}
# This is the branch that Sweep will develop from and make pull requests to. Most people use 'main' or 'master' but some users also use 'dev' or 'staging'.
branch: 'main'
# By default Sweep will read the logs and outputs from your existing Github Actions. To disable this, set this to false.
gha_enabled: True
# This is the description of your project. It will be used by sweep when creating PRs. You can tell Sweep what's unique about your project, what frameworks you use, or anything else you want.
#
# Example:
#
# description: sweepai/sweep is a python project. The main api endpoints are in sweepai/api.py. Write code that adheres to PEP8.
description: ''
# This sets whether to create pull requests as drafts. If this is set to True, then all pull requests will be created as drafts and GitHub Actions will not be triggered.
draft: False
# This is a list of directories that Sweep will not be able to edit.
blocked_dirs: []
""",
)
MONGODB_URI = os.environ.get("MONGODB_URI", None)
IS_SELF_HOSTED = os.environ.get("IS_SELF_HOSTED", "true").lower() == "true"
REDIS_URL = os.environ.get("REDIS_URL")
if not REDIS_URL:
REDIS_URL = os.environ.get("redis_url", "redis://0.0.0.0:6379/0")
ORG_ID = os.environ.get("ORG_ID", None)
POSTHOG_API_KEY = os.environ.get(
"POSTHOG_API_KEY", "phc_CnzwIB0W548wN4wEGeRuxXqidOlEUH2AcyV2sKTku8n"
)
SUPPORT_COUNTRY = os.environ.get("GDRP_LIST", "").split(",")
WHITELISTED_REPOS = os.environ.get("WHITELISTED_REPOS", "").split(",")
BLACKLISTED_USERS = os.environ.get("BLACKLISTED_USERS", "").split(",")
# Default OpenAI
OPENAI_API_KEY = os.environ.get("OPENAI_API_KEY", None) # this may be none, and it will use azure
OPENAI_API_TYPE = os.environ.get("OPENAI_API_TYPE", "anthropic")
assert OPENAI_API_TYPE in ["anthropic", "azure", "openai"], "Invalid OPENAI_API_TYPE"
OPENAI_EMBEDDINGS_API_TYPE = os.environ.get("OPENAI_EMBEDDINGS_API_TYPE", "openai")
AZURE_API_KEY = os.environ.get("AZURE_API_KEY", None)
OPENAI_API_BASE = os.environ.get("OPENAI_API_BASE", None)
OPENAI_API_VERSION = os.environ.get("OPENAI_API_VERSION", None)
AZURE_OPENAI_DEPLOYMENT = os.environ.get("AZURE_OPENAI_DEPLOYMENT", None)
OPENAI_EMBEDDINGS_AZURE_ENDPOINT = os.environ.get(
"OPENAI_EMBEDDINGS_AZURE_ENDPOINT", None
)
OPENAI_EMBEDDINGS_AZURE_DEPLOYMENT = os.environ.get(
"OPENAI_EMBEDDINGS_AZURE_DEPLOYMENT", None
)
OPENAI_EMBEDDINGS_AZURE_API_VERSION = os.environ.get(
"OPENAI_EMBEDDINGS_AZURE_API_VERSION", None
)
OPENAI_API_ENGINE_GPT35 = os.environ.get("OPENAI_API_ENGINE_GPT35", None)
OPENAI_API_ENGINE_GPT4 = os.environ.get("OPENAI_API_ENGINE_GPT4", None)
MULTI_REGION_CONFIG = os.environ.get("MULTI_REGION_CONFIG", None)
if isinstance(MULTI_REGION_CONFIG, str):
MULTI_REGION_CONFIG = MULTI_REGION_CONFIG.strip("'").replace("\\n", "\n")
MULTI_REGION_CONFIG = [item.split(",") for item in MULTI_REGION_CONFIG.split("\n")]
WHITELISTED_USERS = os.environ.get("WHITELISTED_USERS", None)
if WHITELISTED_USERS:
WHITELISTED_USERS = WHITELISTED_USERS.split(",")
WHITELISTED_USERS.append(GITHUB_BOT_USERNAME)
DEFAULT_GPT4_MODEL = os.environ.get("DEFAULT_GPT4_MODEL", "gpt-4-0125-preview")
RESEND_API_KEY = os.environ.get("RESEND_API_KEY", None)
LOKI_URL = None
DEBUG = os.environ.get("DEBUG", "false").lower() == "true"
ENV = "prod" if GITHUB_BOT_USERNAME != TEST_BOT_NAME else "dev"
PROGRESS_BASE_URL = os.environ.get(
"PROGRESS_BASE_URL", "https://progress.sweep.dev"
).rstrip("/")
DISABLED_REPOS = os.environ.get("DISABLED_REPOS", "").split(",")
GHA_AUTOFIX_ENABLED: bool = os.environ.get("GHA_AUTOFIX_ENABLED", False)
MERGE_CONFLICT_ENABLED: bool = os.environ.get("MERGE_CONFLICT_ENABLED", False)
INSTALLATION_ID = os.environ.get("INSTALLATION_ID", None)
AWS_ACCESS_KEY=os.environ.get("AWS_ACCESS_KEY")
AWS_SECRET_KEY=os.environ.get("AWS_SECRET_KEY")
AWS_REGION=os.environ.get("AWS_REGION")
ANTHROPIC_AVAILABLE = AWS_ACCESS_KEY and AWS_SECRET_KEY and AWS_REGION
ANTHROPIC_API_KEY = os.environ.get("ANTHROPIC_API_KEY", None)
COHERE_API_KEY = os.environ.get("COHERE_API_KEY", None)
VOYAGE_API_KEY = os.environ.get("VOYAGE_API_KEY", None)
VOYAGE_API_AWS_ACCESS_KEY=os.environ.get("VOYAGE_API_AWS_ACCESS_KEY_ID")
VOYAGE_API_AWS_SECRET_KEY=os.environ.get("VOYAGE_API_AWS_SECRET_KEY")
VOYAGE_API_AWS_REGION=os.environ.get("VOYAGE_API_AWS_REGION")
VOYAGE_API_AWS_ENDPOINT_NAME=os.environ.get("VOYAGE_API_AWS_ENDPOINT_NAME", "voyage-code-2")
VOYAGE_API_USE_AWS = VOYAGE_API_AWS_ACCESS_KEY and VOYAGE_API_AWS_SECRET_KEY and VOYAGE_API_AWS_REGION
PAREA_API_KEY = os.environ.get("PAREA_API_KEY", None)
# TODO: we need to make this dynamic + backoff
BATCH_SIZE = int(
os.environ.get("BATCH_SIZE", 64 if VOYAGE_API_KEY else 256) # Voyage only allows 128 items per batch and 120000 tokens per batch
)
DEPLOYMENT_GHA_ENABLED = os.environ.get("DEPLOYMENT_GHA_ENABLED", "true").lower() == "true"
JIRA_USER_NAME = os.environ.get("JIRA_USER_NAME", None)
JIRA_API_TOKEN = os.environ.get("JIRA_API_TOKEN", None)
JIRA_URL = os.environ.get("JIRA_URL", None)

import re
from sweepai.core.chat import ChatGPT
issue_validator_instructions_prompt = """# Instructions
A good issue for Sweep is actionable and it is clear how to resolve it. Here is what Sweep is currently capable of:
- Access to the entire codebase, with a high-quality search engine to find specific code snippets. Sweep is able to pinpoint the exact location of the code that needs to be changed based on vague descriptions.
- Making code changes to fix bugs or add features.
- Reading the GitHub Action logs to run tests and check the results.
- Ability to read images such as screenshots and charts.
Here are some examples of things Sweep does not currently support:
- Large-scale changes like migrations and large version upgrades.
- Tasks requiring accessing outside information like AWS consoles or retrieving API keys.
- Tasks requiring fixes outside of code changes
- Issues that have an existing fix or duplicate issues
Respond in the following format:
<thinking>
Provide an analysis of why it is a good or bad issue to pass on to Sweep. If it is a bad issue, suggest how the issue could be improved or clarified to make it more suitable for Sweep.
</thinking>
<pass>True or False</pass>
If False, respond to the user:
<response_to_user>
Response to user with justification on why the issue is unclear.
</response_to_user>"""
issue_validator_system_prompt = """You are an AI assistant tasked with determining whether an issue reported by customer support should be passed on to be resolved by Sweep, an AI-powered software engineer.
""" + issue_validator_instructions_prompt
issue_validator_user_prompt = """<issue>
{issue}
</issue>\n\n""" + issue_validator_instructions_prompt
def validate_issue(issue: str) -> str:
    """
    Somehow haiku and GPT-4 can't do this consistently.
    """
    chat_gpt = ChatGPT.from_system_message_string(
        prompt_string=issue_validator_system_prompt,
    )
    response = chat_gpt.chat_anthropic(
        issue_validator_user_prompt.format(
            issue=issue
        ),
        model="claude-3-opus-20240229",
        temperature=0.0,
    )
    if "<pass>False</pass>" in response:
        pattern = "<response_to_user>(.*)</response_to_user>"
        return re.search(pattern, response, re.DOTALL).group(1).strip()
    return ""
if __name__ == "__main__":


Step 2: ⌨️ Coding

Modify sweepai/config/server.py with contents: Add a new environment variable for the user's Slack API key.

<original_code>
JIRA_USER_NAME = os.environ.get("JIRA_USER_NAME", None)
JIRA_API_TOKEN = os.environ.get("JIRA_API_TOKEN", None)
JIRA_URL = os.environ.get("JIRA_URL", None)

LICENSE_KEY = os.environ.get("LICENSE_KEY", None)
ALTERNATE_AWS = os.environ.get("ALTERNATE_AWS", "none").lower() == "true"
</original_code>

<new_code>
JIRA_USER_NAME = os.environ.get("JIRA_USER_NAME", None)
JIRA_API_TOKEN = os.environ.get("JIRA_API_TOKEN", None)
JIRA_URL = os.environ.get("JIRA_URL", None)

SLACK_API_KEY = os.environ.get("SLACK_API_KEY", None)

LICENSE_KEY = os.environ.get("LICENSE_KEY", None)
ALTERNATE_AWS = os.environ.get("ALTERNATE_AWS", "none").lower() == "true"
</new_code>
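
A minimal sketch of how the new setting could be read defensively, assuming that a blank or whitespace-only `SLACK_API_KEY` should be treated the same as an unset one. The helper name `read_optional_env` is hypothetical and is not part of `sweepai/config/server.py`.

```python
import os
from typing import Mapping, Optional


def read_optional_env(name: str, environ: Mapping[str, str] = os.environ) -> Optional[str]:
    """Return the stripped value of an environment variable, or None if unset or blank."""
    value = environ.get(name)
    if value is None:
        return None
    value = value.strip()
    # An empty string after stripping is treated as "not configured".
    return value or None


# Hypothetical usage mirroring the new_code above.
SLACK_API_KEY = read_optional_env("SLACK_API_KEY")
```

With this shape, callers can skip the Slack lookup entirely when `SLACK_API_KEY is None` instead of constructing a client with an empty token.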

  • Modify sweepai/handlers/on_ticket.py (c6dd95d)
Modify sweepai/handlers/on_ticket.py with contents: Import the necessary modules for making HTTP requests and parsing Slack message threads.

<original_code>
import re
import traceback
from typing import Any
import zipfile
from time import time

import markdown
import openai
import requests
import yaml
import yamllint.config as yamllint_config
from github import BadCredentialsException, Github, Repository
from github.Issue import Issue
from github.PullRequest import PullRequest as GithubPullRequest
from loguru import logger
from tabulate import tabulate
from tqdm import tqdm
from yamllint import linter
</original_code>

<new_code>
import re
import traceback
from typing import Any
import zipfile
from time import time

import markdown
import openai
import requests
import yaml
import yamllint.config as yamllint_config
from github import BadCredentialsException, Github, Repository
from github.Issue import Issue
from github.PullRequest import PullRequest as GithubPullRequest
from loguru import logger
from tabulate import tabulate
from tqdm import tqdm
from yamllint import linter

from slack_sdk import WebClient
from slack_sdk.errors import SlackApiError
</new_code>
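
Because `slack_sdk` is a new dependency for this module, deployments that never use Slack links may not have it installed. The following is a rough sketch of an availability check, assuming the Slack feature should be skipped rather than crash the import of `on_ticket.py`. The helper name `slack_sdk_available` is hypothetical.

```python
import importlib.util


def slack_sdk_available() -> bool:
    """Return True when the optional slack_sdk package can be imported."""
    # find_spec returns None for a top-level package that is not installed.
    return importlib.util.find_spec("slack_sdk") is not None
```

A caller could check this once at startup and log a warning when a Slack link is found but the package is missing.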

  • Modify sweepai/handlers/on_ticket.py (c6dd95d)
Modify sweepai/handlers/on_ticket.py with contents: In the `on_ticket` function, check if the issue description contains a Slack link. If found, authenticate to Slack and fetch the message thread.

<original_code>
summary = summary or ""
summary = re.sub(
"<details (open)?>(\r)?\n<summary>Checklist</summary>.*",
"",
summary,
flags=re.DOTALL,
).strip()
summary = re.sub(
"---\s+Checklist:(\r)?\n(\r)?\n- \[[ X]\].*",
"",
summary,
flags=re.DOTALL,
).strip()
summary = re.sub(
"### Details\n\n_No response_", "", summary, flags=re.DOTALL
)
summary = re.sub("\n\n", "\n", summary, flags=re.DOTALL)
repo_name = repo_full_name
user_token, g = get_github_client(installation_id)
repo = g.get_repo(repo_full_name)
current_issue: Issue = repo.get_issue(number=issue_number)
assignee = current_issue.assignee.login if current_issue.assignee else None
if assignee is None:
assignee = current_issue.user.login
</original_code>

<new_code>
summary = summary or ""
summary = re.sub(
"<details (open)?>(\r)?\n<summary>Checklist</summary>.*",
"",
summary,
flags=re.DOTALL,
).strip()
summary = re.sub(
"---\s+Checklist:(\r)?\n(\r)?\n- \[[ X]\].*",
"",
summary,
flags=re.DOTALL,
).strip()
summary = re.sub(
"### Details\n\n_No response_", "", summary, flags=re.DOTALL
)
summary = re.sub("\n\n", "\n", summary, flags=re.DOTALL)

slack_link_match = re.search(r'(https://\w+\.slack\.com/archives/\w+/p\d+)', summary)
if slack_link_match:
    slack_link = slack_link_match.group(1)
    slack_client = WebClient(token=SLACK_API_KEY)
    
    try:
        slack_permalink_data = slack_client.chat_getPermalink(
            link=slack_link
        )
        slack_channel_id = slack_permalink_data['channel']
        slack_message_ts = slack_permalink_data['message_ts']

        slack_thread_replies = slack_client.conversations_replies(
            channel=slack_channel_id,
            ts=slack_message_ts
        )

        slack_thread_messages = [message['text'] for message in slack_thread_replies['messages']]
        slack_thread_text = '\n'.join(slack_thread_messages)

        summary += f"\n\nSlack Thread:\n{slack_thread_text}"

    except SlackApiError as e:
        logger.error(f"Error fetching Slack thread: {e}")

repo_name = repo_full_name
user_token, g = get_github_client(installation_id)
repo = g.get_repo(repo_full_name)
current_issue: Issue = repo.get_issue(number=issue_number)
assignee = current_issue.assignee.login if current_issue.assignee else None
if assignee is None:
    assignee = current_issue.user.login

</new_code>
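
A note on the Slack lookup above: as far as I can tell from the Slack Web API documentation, `chat.getPermalink` takes a `channel` and a `message_ts` and returns a permalink, so it is unlikely to accept a `link` argument or resolve an existing URL. Also, `SLACK_API_KEY` would need to be imported from `sweepai.config.server`. A possible alternative is to parse the channel ID and timestamp straight from the permalink. The sketch below assumes the usual permalink shape (`/archives/<channel>/p<digits>`, where the last six digits are the fractional part of the timestamp, plus an optional `thread_ts` query parameter on replies). The names `SlackMessageRef` and `parse_slack_permalink` are hypothetical.

```python
import re
from typing import NamedTuple, Optional
from urllib.parse import parse_qs, urlparse


class SlackMessageRef(NamedTuple):
    channel: str    # channel ID taken from the URL path
    ts: str         # timestamp of the linked message, e.g. "1714672345.123456"
    thread_ts: str  # timestamp of the thread parent (equals ts for a top-level message)


# Workspace names may contain hyphens, so [\w-]+ is used instead of \w+.
_LINK_RE = re.compile(
    r"https://[\w-]+\.slack\.com/archives/\w+/p\d{7,}(?:\?[^\s<>)\]]*)?"
)
_PATH_RE = re.compile(r"^/archives/(?P<channel>\w+)/p(?P<digits>\d{7,})$")


def parse_slack_permalink(text: str) -> Optional[SlackMessageRef]:
    """Find the first Slack permalink in text and extract channel and timestamps."""
    link = _LINK_RE.search(text)
    if link is None:
        return None
    url = urlparse(link.group(0))
    path = _PATH_RE.match(url.path)
    if path is None:
        return None
    digits = path.group("digits")
    # Assumption: the last six digits are the microsecond part of the timestamp.
    ts = f"{digits[:-6]}.{digits[-6:]}"
    # Links to replies carry the parent timestamp in the thread_ts query parameter.
    thread_ts = parse_qs(url.query).get("thread_ts", [ts])[0]
    return SlackMessageRef(path.group("channel"), ts, thread_ts)
```

The result could then be passed to `slack_client.conversations_replies(channel=ref.channel, ts=ref.thread_ts)`, with `message.get("text", "")` used when joining the replies so that messages without a `text` field do not raise a `KeyError`.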


Step 3: 🔁 Code Review

I have finished reviewing the code for completeness. I did not find errors for sweep/augment_on_ticket_so_that_when_a_user_ad_663e7.


💡 To recreate the pull request, edit the issue title or description.
Something wrong? Let us know.

This is an automated message generated by Sweep AI.


sweep-nightly bot commented May 2, 2024

Sweeping

✨ Track Sweep's progress on our progress dashboard!


50%

💎 Sweep Pro: I'm using GPT-4. You have unlimited GPT-4 tickets. (tracking ID: c684dda8f8)


Step 1: 🔎 Searching

I found the following snippets in your repository. I will now analyze these snippets and come up with a plan.

Some code snippets I think are relevant, in decreasing order of relevance. If a file is missing from here, you can mention its path in the ticket description.

"""
on_ticket is the main function that is called when a new issue is created.
It is only called by the webhook handler in sweepai/api.py.
"""
import difflib
import io
import os
import re
import traceback
from typing import Any
import zipfile
from time import time
import markdown
import openai
import requests
import yaml
import yamllint.config as yamllint_config
from github import BadCredentialsException, Github, Repository
from github.Issue import Issue
from github.PullRequest import PullRequest as GithubPullRequest
from loguru import logger
from tabulate import tabulate
from tqdm import tqdm
from yamllint import linter
from sweepai.core.sweep_bot import GHA_PROMPT
from sweepai.agents.pr_description_bot import PRDescriptionBot
from sweepai.agents.image_description_bot import ImageDescriptionBot
from sweepai.config.client import (
RESET_FILE,
RESTART_SWEEP_BUTTON,
REVERT_CHANGED_FILES_TITLE,
SweepConfig,
get_documentation_dict,
get_gha_enabled,
)
from sweepai.config.server import (
DEPLOYMENT_GHA_ENABLED,
ENV,
GITHUB_LABEL_NAME,
IS_SELF_HOSTED,
MONGODB_URI,
PROGRESS_BASE_URL,
)
from sweepai.core.entities import (
AssistantRaisedException,
FileChangeRequest,
MaxTokensExceeded,
NoFilesException,
PullRequest,
SandboxResponse,
)
from sweepai.core.entities import create_error_logs as entities_create_error_logs
from sweepai.core.pr_reader import PRReader
from sweepai.core.sweep_bot import SweepBot, get_files_to_change, get_files_to_change_for_gha, validate_file_change_requests
from sweepai.handlers.create_pr import (
create_config_pr,
create_pr_changes,
safe_delete_sweep_branch,
)
from sweepai.handlers.on_check_suite import clean_gh_logs
from sweepai.utils.image_utils import get_image_contents_from_urls, get_image_urls_from_issue
from sweepai.utils.issue_validator import validate_issue
from sweepai.utils.validate_license import validate_license
from sweepai.utils.buttons import Button, ButtonList, create_action_buttons
from sweepai.utils.chat_logger import ChatLogger
from sweepai.utils.diff import generate_diff
from sweepai.utils.event_logger import posthog
from sweepai.utils.github_utils import (
CURRENT_USERNAME,
ClonedRepo,
convert_pr_draft_field,
get_github_client,
get_token,
sanitize_string_for_github,
)
from sweepai.utils.progress import (
AssistantConversation,
PaymentContext,
TicketContext,
TicketProgress,
TicketProgressStatus,
)
from sweepai.utils.prompt_constructor import HumanMessagePrompt
from sweepai.utils.str_utils import (
BOT_SUFFIX,
FASTER_MODEL_MESSAGE,
UPDATES_MESSAGE,
blockquote,
bot_suffix,
checkbox_template,
clean_logs,
collapsible_template,
create_checkbox,
create_collapsible,
discord_suffix,
format_sandbox_success,
get_hash,
sep,
stars_suffix,
strip_sweep,
to_branch_name,
)
from sweepai.utils.ticket_utils import (
center,
fetch_relevant_files,
fire_and_forget_wrapper,
log_error,
prep_snippets,
)
from sweepai.utils.user_settings import UserSettings
# from sandbox.sandbox_utils import Sandbox
sweeping_gif = """<a href="https://github.com/sweepai/sweep"><img class="swing" src="https://raw.githubusercontent.com/sweepai/sweep/main/.assets/sweeping.gif" width="100" style="width:50px; margin-bottom:10px" alt="Sweeping"></a>"""
custom_config = """
extends: relaxed
rules:
line-length: disable
indentation: disable
"""
INSTRUCTIONS_FOR_REVIEW = """\
### 💡 To get Sweep to edit this pull request, you can:
* Comment below, and Sweep can edit the entire PR
* Comment on a file, Sweep will only modify the commented file
* Edit the original issue to get Sweep to recreate the PR from scratch"""
email_template = """Hey {name},
<br/><br/>
🚀 I just finished creating a pull request for your issue ({repo_full_name}#{issue_number}) at <a href="{pr_url}">{repo_full_name}#{pr_number}</a>!
<br/><br/>
You can view how I created this pull request <a href="{progress_url}">here</a>.
<h2>Summary</h2>
<blockquote>
{summary}
</blockquote>
<h2>Files Changed</h2>
<ul>
{files_changed}
</ul>
{sweeping_gif}
<br/>
Cheers,
<br/>
Sweep
<br/>"""
FAILING_GITHUB_ACTION_PROMPT = """\
The following Github Actions failed on a previous attempt at fixing this issue.
Propose a fix to the failing github actions. You must edit the source code, not the github action itself.
{github_action_log}
"""
# Add :eyes: emoji to ticket
def add_emoji(issue: Issue, comment_id: int = None, reaction_content="eyes"):
    item_to_react_to = issue.get_comment(comment_id) if comment_id else issue
    item_to_react_to.create_reaction(reaction_content)
# If SWEEP_BOT reacted to item_to_react_to with "rocket", then remove it.
def remove_emoji(issue: Issue, comment_id: int = None, content_to_delete="eyes"):
    item_to_react_to = issue.get_comment(comment_id) if comment_id else issue
    reactions = item_to_react_to.get_reactions()
    for reaction in reactions:
        if (
            reaction.content == content_to_delete
            and reaction.user.login == CURRENT_USERNAME
        ):
            item_to_react_to.delete_reaction(reaction.id)
def create_error_logs(
commit_url_display: str,
sandbox_response: SandboxResponse,
status: str = "✓",
):
return (
(
"<br/>"
+ create_collapsible(
f"Sandbox logs for {commit_url_display} {status}",
blockquote(
"\n\n".join(
[
create_collapsible(
f"<code>{output}</code> {i + 1}/{len(sandbox_response.outputs)} {format_sandbox_success(sandbox_response.success)}",
f"<pre>{clean_logs(output)}</pre>",
i == len(sandbox_response.outputs) - 1,
)
for i, output in enumerate(sandbox_response.outputs)
if len(sandbox_response.outputs) > 0
]
)
),
opened=True,
)
)
if sandbox_response
else ""
)
# takes in a list of workflow runs and returns a list of messages containing the logs of the failing runs
def get_failing_gha_logs(runs, installation_id) -> str:
token = get_token(installation_id)
all_logs = ""
for run in runs:
# jobs_url
jobs_url = run.jobs_url
jobs_response = requests.get(
jobs_url,
headers={
"Accept": "application/vnd.github+json",
"Authorization": f"Bearer {token}",
"X-GitHub-Api-Version": "2022-11-28",
},
)
if jobs_response.status_code == 200:
failed_jobs = []
jobs = jobs_response.json()["jobs"]
for job in jobs:
if job["conclusion"] == "failure":
failed_jobs.append(job)
failed_jobs_name_list = []
for job in failed_jobs:
# add failed steps
for step in job["steps"]:
if step["conclusion"] == "failure":
failed_jobs_name_list.append(
f"{job['name']}/{step['number']}_{step['name']}"
)
else:
logger.error(
"Failed to get jobs for failing github actions, possible a credentials issue"
)
return all_logs
# make sure jobs in valid
if jobs_response.json()['total_count'] == 0:
logger.error(f"no jobs for this run: {run}, continuing...")
continue
# logs url
logs_url = run.logs_url
logs_response = requests.get(
logs_url,
headers={
"Accept": "application/vnd.github+json",
"Authorization": f"Bearer {token}",
"X-GitHub-Api-Version": "2022-11-28",
},
allow_redirects=True,
)
# Check if the request was successful
if logs_response.status_code == 200:
zip_data = io.BytesIO(logs_response.content)
zip_file = zipfile.ZipFile(zip_data, "r")
zip_file_names = zip_file.namelist()
for file in failed_jobs_name_list:
if f"{file}.txt" in zip_file_names:
logs = zip_file.read(f"{file}.txt").decode("utf-8")
logs_prompt = clean_gh_logs(logs)
all_logs += logs_prompt + "\n"
else:
logger.error(
"Failed to get logs for failing github actions, likely a credentials issue"
)
return all_logs
def delete_old_prs(repo: Repository, issue_number: int):
    logger.info("Deleting old PRs...")
    prs = repo.get_pulls(
        state="open",
        sort="created",
        direction="desc",
        base=SweepConfig.get_branch(repo),
    )
    for pr in tqdm(prs.get_page(0)):
        # # Check if this issue is mentioned in the PR, and pr is owned by bot
        # # This is done in create_pr, (pr_description = ...)
        if pr.user.login == CURRENT_USERNAME and f"Fixes #{issue_number}.\n" in pr.body:
            safe_delete_sweep_branch(pr, repo)
            break
def construct_sweep_bot(
repo: Repository,
repo_name: str,
issue_url: str,
repo_description: str,
title: str,
message_summary: str,
cloned_repo: ClonedRepo,
ticket_progress: TicketProgress,
chat_logger: ChatLogger,
snippets: Any = None,
tree: Any = None,
comments: Any = None,
) -> SweepBot:
human_message = HumanMessagePrompt(
repo_name=repo_name,
issue_url=issue_url,
repo_description=repo_description.strip(),
title=title,
summary=message_summary,
snippets=snippets,
tree=tree,
)
sweep_bot = SweepBot.from_system_message_content(
human_message=human_message,
repo=repo,
is_reply=bool(comments),
chat_logger=chat_logger,
cloned_repo=cloned_repo,
ticket_progress=ticket_progress,
)
return sweep_bot
def get_comment_header(
index: int,
g: Github,
repo_full_name: str,
user_settings: UserSettings,
progress_headers: list[None | str],
tracking_id: str | None,
payment_message_start: str,
user_settings_message: str,
errored: bool = False,
pr_message: str = "",
done: bool = False,
initial_sandbox_response: int | SandboxResponse = -1,
initial_sandbox_response_file=None,
config_pr_url: str | None = None,
):
config_pr_message = (
"\n"
+ f"<div align='center'>Install Sweep Configs: <a href='{config_pr_url}'>Pull Request</a></div>"
if config_pr_url is not None
else ""
)
actions_message = create_action_buttons(
[
RESTART_SWEEP_BUTTON,
]
)
sandbox_execution_message = "\n\n## GitHub Actions failed\n\nThe sandbox appears to be unavailable or down.\n\n"
if initial_sandbox_response == -1:
sandbox_execution_message = ""
elif initial_sandbox_response is not None:
repo = g.get_repo(repo_full_name)
commit_hash = repo.get_commits()[0].sha
success = initial_sandbox_response.outputs and initial_sandbox_response.success
status = "✓" if success else "X"
sandbox_execution_message = (
"\n\n## GitHub Actions"
+ status
+ "\n\nHere are the GitHub Actions logs prior to making any changes:\n\n"
)
sandbox_execution_message += entities_create_error_logs(
f'<a href="https://github.com/{repo_full_name}/commit/{commit_hash}"><code>{commit_hash[:7]}</code></a>',
initial_sandbox_response,
initial_sandbox_response_file,
)
if success:
sandbox_execution_message += f"\n\nSandbox passed on the latest `{repo.default_branch}`, so sandbox checks will be enabled for this issue."
else:
sandbox_execution_message += "\n\nSandbox failed, so all sandbox checks will be disabled for this issue."
if index < 0:
index = 0
if index == 4:
return (
pr_message
+ config_pr_message
+ f"\n\n---\n{user_settings.get_message(completed=True)}"
+ f"\n\n---\n{actions_message}"
+ sandbox_execution_message
)
total = len(progress_headers)
index += 1 if done else 0
index *= 100 / total
index = int(index)
index = min(100, index)
if errored:
pbar = f"\n\n<img src='https://progress-bar.dev/{index}/?&title=Errored&width=600' alt='{index}%' />"
return (
f"{center(sweeping_gif)}<br/>{center(pbar)}\n\n"
+ f"\n\n---\n{actions_message}"
+ sandbox_execution_message
)
pbar = f"\n\n<img src='https://progress-bar.dev/{index}/?&title=Progress&width=600' alt='{index}%' />"
return (
f"{center(sweeping_gif)}"
+ (
center(
f'\n\n<h2>✨ Track Sweep\'s progress on our <a href="{PROGRESS_BASE_URL}/issues/{tracking_id}">progress dashboard</a>!</h2>'
)
if MONGODB_URI is not None
else ""
)
+ f"<br/>{center(pbar)}"
+ ("\n" + stars_suffix if index != -1 else "")
+ "\n"
+ center(payment_message_start)
+ f"\n\n---\n{user_settings_message}"
+ config_pr_message
+ f"\n\n---\n{actions_message}"
+ sandbox_execution_message
)
def on_ticket(
title: str,
summary: str,
issue_number: int,
issue_url: str, # purely for logging purposes
username: str,
repo_full_name: str,
repo_description: str,
installation_id: int,
comment_id: int = None,
edited: bool = False,
tracking_id: str | None = None,
):
if not os.environ.get("CLI"):
assert validate_license(), "License key is invalid or expired. Please contact us at team@sweep.dev to upgrade to an enterprise license."
with logger.contextualize(
tracking_id=tracking_id,
):
if tracking_id is None:
tracking_id = get_hash()
on_ticket_start_time = time()
logger.info(f"Starting on_ticket with title {title} and summary {summary}")
(
title,
slow_mode,
do_map,
subissues_mode,
sandbox_mode,
fast_mode,
lint_mode,
) = strip_sweep(title)
# fetch images from body of issue
image_urls = get_image_urls_from_issue(issue_number, repo_full_name, installation_id)
image_contents = get_image_contents_from_urls(image_urls)
summary = summary or ""
summary = re.sub(
"<details (open)?>(\r)?\n<summary>Checklist</summary>.*",
"",
summary,
flags=re.DOTALL,
).strip()
summary = re.sub(
"---\s+Checklist:(\r)?\n(\r)?\n- \[[ X]\].*",
"",
summary,
flags=re.DOTALL,
).strip()
summary = re.sub(
"### Details\n\n_No response_", "", summary, flags=re.DOTALL
)
summary = re.sub("\n\n", "\n", summary, flags=re.DOTALL)
repo_name = repo_full_name
user_token, g = get_github_client(installation_id)
repo = g.get_repo(repo_full_name)
current_issue: Issue = repo.get_issue(number=issue_number)
assignee = current_issue.assignee.login if current_issue.assignee else None
if assignee is None:
assignee = current_issue.user.login
ticket_progress = TicketProgress(
tracking_id=tracking_id,
username=username,
context=TicketContext(
title=title,
description=summary,
repo_full_name=repo_full_name,
issue_number=issue_number,
is_public=repo.private is False,
start_time=int(time()),
),
)
branch_match = re.search(
r"([B|b]ranch:) *(?P<branch_name>.+?)(\s|$)", summary
)
overrided_branch_name = None
if branch_match and "branch_name" in branch_match.groupdict():
overrided_branch_name = (
branch_match.groupdict()["branch_name"].strip().strip("`\"'")
)
# TODO: this code might be finicky, might have missed edge cases
if overrided_branch_name.startswith("https://github.com/"):
overrided_branch_name = overrided_branch_name.split("?")[0].split(
"tree/"
)[-1]
SweepConfig.get_branch(repo, overrided_branch_name)
chat_logger = (
ChatLogger(
{
"repo_name": repo_name,
"title": title,
"summary": summary,
"issue_number": issue_number,
"issue_url": issue_url,
"username": (
username if not username.startswith("sweep") else assignee
),
"repo_full_name": repo_full_name,
"repo_description": repo_description,
"installation_id": installation_id,
"type": "ticket",
"mode": ENV,
"comment_id": comment_id,
"edited": edited,
"tracking_id": tracking_id,
},
active=True,
)
if MONGODB_URI
else None
)
if chat_logger and not IS_SELF_HOSTED:
is_paying_user = chat_logger.is_paying_user()
is_consumer_tier = chat_logger.is_consumer_tier()
use_faster_model = chat_logger.use_faster_model()
else:
is_paying_user = True
is_consumer_tier = False
use_faster_model = False
if use_faster_model:
raise Exception(FASTER_MODEL_MESSAGE)
if fast_mode:
use_faster_model = True
if not comment_id and not edited and chat_logger and not sandbox_mode:
fire_and_forget_wrapper(chat_logger.add_successful_ticket)(
gpt3=use_faster_model
)
organization, repo_name = repo_full_name.split("/")
metadata = {
"issue_url": issue_url,
"repo_full_name": repo_full_name,
"organization": organization,
"repo_name": repo_name,
"repo_description": repo_description,
"username": username,
"comment_id": comment_id,
"title": title,
"installation_id": installation_id,
"function": "on_ticket",
"edited": edited,
"model": "gpt-3.5" if use_faster_model else "gpt-4",
"tier": "pro" if is_paying_user else "free",
"mode": ENV,
"slow_mode": slow_mode,
"do_map": do_map,
"subissues_mode": subissues_mode,
"sandbox_mode": sandbox_mode,
"fast_mode": fast_mode,
"is_self_hosted": IS_SELF_HOSTED,
"tracking_id": tracking_id,
}
fire_and_forget_wrapper(posthog.capture)(
username, "started", properties=metadata
)
try:
if current_issue.state == "closed":
fire_and_forget_wrapper(posthog.capture)(
username,
"issue_closed",
properties={
**metadata,
"duration": round(time() - on_ticket_start_time),
},
)
return {"success": False, "reason": "Issue is closed"}
fire_and_forget_wrapper(add_emoji)(current_issue, comment_id)
fire_and_forget_wrapper(remove_emoji)(
current_issue, comment_id, content_to_delete="rocket"
)
fire_and_forget_wrapper(remove_emoji)(
current_issue, comment_id, content_to_delete="confused"
)
fire_and_forget_wrapper(current_issue.edit)(body=summary)
replies_text = ""
summary = summary if summary else ""
fire_and_forget_wrapper(delete_old_prs)(repo, issue_number)
if not sandbox_mode:
progress_headers = [
None,
"Step 1: 🔎 Searching",
"Step 2: ⌨️ Coding",
"Step 3: 🔁 Code Review",
]
else:
progress_headers = [
None,
"📖 Reading File",
"🛠️ Executing Sandbox",
]
issue_comment = None
payment_message, payment_message_start = get_payment_messages(
chat_logger
)
ticket_progress.context.payment_context = PaymentContext(
use_faster_model=use_faster_model,
pro_user=is_paying_user,
daily_tickets_used=(
chat_logger.get_ticket_count(use_date=True)
if chat_logger
else 0
),
monthly_tickets_used=(
chat_logger.get_ticket_count() if chat_logger else 0
),
)
ticket_progress.save()
config_pr_url = None
user_settings = UserSettings.from_username(username=username)
user_settings_message = user_settings.get_message()
cloned_repo = ClonedRepo(
repo_full_name,
installation_id=installation_id,
token=user_token,
repo=repo,
branch=overrided_branch_name,
)
# check that repo's directory is non-empty
if os.listdir(cloned_repo.cached_dir) == []:
logger.info("Empty repo")
first_comment = (
"Sweep is currently not supported on empty repositories. Please add some"
f" code to your repository and try again.\n{sep}##"
f" {progress_headers[1]}\n{bot_suffix}{discord_suffix}"
)
if issue_comment is None:
issue_comment = current_issue.create_comment(
first_comment + BOT_SUFFIX
)
else:
issue_comment.edit(first_comment + BOT_SUFFIX)
fire_and_forget_wrapper(add_emoji)(
current_issue, comment_id, reaction_content="confused"
)
fire_and_forget_wrapper(remove_emoji)(content_to_delete="eyes")
return {"success": False}
indexing_message = (
"I'm searching for relevant snippets in your repository. If this is your first"
" time using Sweep, I'm indexing your repository. You can monitor the progress using the progress dashboard"
)
first_comment = (
f"{get_comment_header(0, g, repo_full_name, user_settings, progress_headers, tracking_id, payment_message_start, user_settings_message)}\n{sep}I am currently looking into this ticket! I"
" will update the progress of the ticket in this comment. I am currently"
f" searching through your code, looking for relevant snippets.\n{sep}##"
f" {progress_headers[1]}\n{indexing_message}{bot_suffix}{discord_suffix}"
)
# Find Sweep's previous comment
comments = []
for comment in current_issue.get_comments():
comments.append(comment)
if comment.user.login == CURRENT_USERNAME:
issue_comment = comment
break
if issue_comment is None:
issue_comment = current_issue.create_comment(first_comment)
else:
fire_and_forget_wrapper(issue_comment.edit)(first_comment)
old_edit = issue_comment.edit
issue_comment.edit = lambda msg: old_edit(msg + BOT_SUFFIX)
past_messages = {}
current_index = 0
table = None
initial_sandbox_response = -1
initial_sandbox_response_file = None
def refresh_token():
user_token, g = get_github_client(installation_id)
repo = g.get_repo(repo_full_name)
return user_token, g, repo
def edit_sweep_comment(
message: str,
index: int,
pr_message="",
done=False,
add_bonus_message=True,
):
nonlocal current_index, user_token, g, repo, issue_comment, initial_sandbox_response, initial_sandbox_response_file
message = sanitize_string_for_github(message)
if pr_message:
pr_message = sanitize_string_for_github(pr_message)
# -1 = error, -2 = retry
# Only update the progress bar if the issue generation errors.
errored = index == -1
if index >= 0:
past_messages[index] = message
current_index = index
agg_message = None
# Include progress history
# index = -2 is reserved for
for i in range(
current_index + 2
): # go to next header (for Working on it... text)
if i == 0 or i >= len(progress_headers):
continue # skip None header
header = progress_headers[i]
if header is not None:
header = "## " + header + "\n"
else:
header = "No header\n"
msg = header + (past_messages.get(i) or "Working on it...")
if agg_message is None:
agg_message = msg
else:
agg_message = agg_message + f"\n{sep}" + msg
suffix = bot_suffix + discord_suffix
if errored:
agg_message = (
"## ❌ Unable to Complete PR"
+ "\n"
+ message
+ (
"\n\nFor bonus GPT-4 tickets, please report this bug on"
f" **[Discord](https://discord.gg/invite/sweep)** (tracking ID: `{tracking_id}`)."
if add_bonus_message
else ""
)
)
if table is not None:
agg_message = (
agg_message
+ f"\n{sep}Please look at the generated plan. If something looks"
f" wrong, please add more details to your issue.\n\n{table}"
)
suffix = bot_suffix # don't include discord suffix for error messages
# Update the issue comment
msg = f"{get_comment_header(current_index, g, repo_full_name, user_settings, progress_headers, tracking_id, payment_message_start, user_settings_message, errored=errored, pr_message=pr_message, done=done, initial_sandbox_response=initial_sandbox_response, initial_sandbox_response_file=initial_sandbox_response_file, config_pr_url=config_pr_url)}\n{sep}{agg_message}{suffix}"
try:
issue_comment.edit(msg)
except BadCredentialsException:
logger.error(
f"Bad credentials, refreshing token (tracking ID: `{tracking_id}`)"
)
user_token, g = get_github_client(installation_id)
repo = g.get_repo(repo_full_name)
issue_comment = None
for comment in comments:
if comment.user.login == CURRENT_USERNAME:
issue_comment = comment
current_issue = repo.get_issue(number=issue_number)
if issue_comment is None:
issue_comment = current_issue.create_comment(msg)
else:
issue_comment = [
comment
for comment in current_issue.get_comments()
if comment.user.login == CURRENT_USERNAME
][0]
issue_comment.edit(msg)
if use_faster_model:
edit_sweep_comment(
FASTER_MODEL_MESSAGE, -1, add_bonus_message=False
)
posthog.capture(
username,
"ran_out_of_tickets",
properties={
**metadata,
"duration": round(time() - on_ticket_start_time),
},
)
fire_and_forget_wrapper(add_emoji)(
current_issue, comment_id, reaction_content="confused"
)
fire_and_forget_wrapper(remove_emoji)(content_to_delete="eyes")
return {
"success": False,
"error_message": "We deprecated supporting GPT 3.5.",
}
error_message = validate_issue(title + summary)
if error_message:
logger.warning(f"Validation error: {error_message}")
edit_sweep_comment(
(
f"The issue was rejected with the following response:\n\n{blockquote(error_message)}"
),
-1,
)
fire_and_forget_wrapper(add_emoji)(
current_issue, comment_id, reaction_content="confused"
)
fire_and_forget_wrapper(remove_emoji)(content_to_delete="eyes")
posthog.capture(
username,
"invalid_issue",
properties={
**metadata,
"duration": round(time() - on_ticket_start_time),
},
)
return {"success": True}
prs_extracted = PRReader.extract_prs(repo, summary)
message_summary = summary
if prs_extracted:
message_summary += "\n\n" + prs_extracted
edit_sweep_comment(
create_collapsible(
"I found that you mentioned the following Pull Requests that might be important:",
blockquote(
prs_extracted,
),
),
1,
)
try:
# search/context manager
logger.info("Searching for relevant snippets...")
if image_contents: # doing it here to avoid editing the original issue
message_summary += ImageDescriptionBot().describe_images(text=title + message_summary, images=image_contents)
snippets, tree, _, repo_context_manager = fetch_relevant_files(
cloned_repo,
title,
message_summary,
replies_text,
username,
metadata,
on_ticket_start_time,
tracking_id,
is_paying_user,
is_consumer_tier,
issue_url,
chat_logger,
ticket_progress,
images=image_contents
)
cloned_repo = repo_context_manager.cloned_repo
except Exception as e:
edit_sweep_comment(
(
"It looks like an issue has occurred around fetching the files."
f" The exception was {str(e)}. If this error persists,"
f" contact team@sweep.dev.\n\n> @{username}, editing this issue description to include more details will automatically make me relaunch. Please join our Discord server for support (tracking_id={tracking_id})"
),
-1,
)
raise Exception("Failed to fetch files") from e
_user_token, g = get_github_client(installation_id)
user_token, g, repo = refresh_token()
cloned_repo.token = user_token
repo = g.get_repo(repo_full_name)
ticket_progress.search_progress.indexing_progress = (
ticket_progress.search_progress.indexing_total
)
ticket_progress.status = TicketProgressStatus.PLANNING
ticket_progress.save()
# Fetch git commit history
if not repo_description:
repo_description = "No description provided."
message_summary += replies_text
get_documentation_dict(repo)
docs_results = ""
sweep_bot = construct_sweep_bot(
repo=repo,
repo_name=repo_name,
issue_url=issue_url,
repo_description=repo_description,
title=title,
message_summary=message_summary,
cloned_repo=cloned_repo,
ticket_progress=ticket_progress,
chat_logger=chat_logger,
snippets=snippets,
tree=tree,
comments=comments,
)
# Check repository for sweep.yaml file.
sweep_yml_exists = False
sweep_yml_failed = False
for content_file in repo.get_contents(""):
if content_file.name == "sweep.yaml":
sweep_yml_exists = True
# Check if YAML is valid
yaml_content = content_file.decoded_content.decode("utf-8")
sweep_yaml_dict = {}
try:
sweep_yaml_dict = yaml.safe_load(yaml_content)
except Exception:
logger.error(f"Failed to load YAML file: {yaml_content}")
if len(sweep_yaml_dict) > 0:
break
linter_config = yamllint_config.YamlLintConfig(custom_config)
problems = list(linter.run(yaml_content, linter_config))
if problems:
errors = [
f"Line {problem.line}: {problem.desc} (rule: {problem.rule})"
for problem in problems
]
error_message = "\n".join(errors)
markdown_error_message = f"**There is something wrong with your [sweep.yaml](https://github.com/{repo_full_name}/blob/main/sweep.yaml):**\n```\n{error_message}\n```"
sweep_yml_failed = True
logger.error(markdown_error_message)
edit_sweep_comment(markdown_error_message, -1)
else:
logger.info("The YAML file is valid. No errors found.")
break
# If sweep.yaml does not exist, then create a new PR that simply creates the sweep.yaml file.
if not sweep_yml_exists:
try:
logger.info("Creating sweep.yaml file...")
config_pr = create_config_pr(sweep_bot, cloned_repo=cloned_repo)
config_pr_url = config_pr.html_url
edit_sweep_comment(message="", index=-2)
except Exception as e:
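logger.error(
# loguru formats messages with str.format, so extra positional args without placeholders are dropped
"Failed to create new branch for sweep.yaml file.\n"
f"{e}\n{traceback.format_exc()}"
)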
else:
logger.info("sweep.yaml file already exists.")
try:
# ANALYZE SNIPPETS
newline = "\n"
edit_sweep_comment(
"I found the following snippets in your repository. I will now analyze"
" these snippets and come up with a plan."
+ "\n\n"
+ create_collapsible(
"Some code snippets I think are relevant in decreasing order of relevance (click to expand). If some file is missing from here, you can mention the path in the ticket description.",
"\n".join(
[
f"https://github.com/{organization}/{repo_name}/blob/{repo.get_commits()[0].sha}/{snippet.file_path}#L{max(snippet.start, 1)}-L{min(snippet.end, snippet.content.count(newline) - 1)}\n"
for snippet in snippets
]
),
)
+ (
create_collapsible(
"I also found that you mentioned the following Pull Requests that may be helpful:",
blockquote(prs_extracted),
)
if prs_extracted
else ""
)
+ (f"\n\n{docs_results}\n\n" if docs_results else ""),
1,
)
logger.info("Fetching files to modify/create...")
file_change_requests, plan = get_files_to_change(
relevant_snippets=repo_context_manager.current_top_snippets,
read_only_snippets=repo_context_manager.read_only_snippets,
problem_statement=f"{title}\n\n{message_summary}",
repo_name=repo_full_name,
cloned_repo=cloned_repo,
images=image_contents
)
validate_file_change_requests(file_change_requests, cloned_repo)
ticket_progress.planning_progress.file_change_requests = (
file_change_requests
)
ticket_progress.coding_progress.file_change_requests = (
file_change_requests
)
ticket_progress.coding_progress.assistant_conversations = [
AssistantConversation() for fcr in file_change_requests
]
ticket_progress.status = TicketProgressStatus.CODING
ticket_progress.save()
if not file_change_requests:
if len(title + summary) < 60:
edit_sweep_comment(
(
"Sorry, I could not find any files to modify, can you please"
" provide more details? Please make sure that the title and"
" summary of the issue are at least 60 characters."
),
-1,
)
else:
edit_sweep_comment(
(
"Sorry, I could not find any files to modify, can you please"
" provide more details?"
),
-1,
)
raise Exception("No files to modify.")
file_change_requests: list[
FileChangeRequest
] = sweep_bot.validate_file_change_requests(
file_change_requests,
)
ticket_progress.planning_progress.file_change_requests = (
file_change_requests
)
ticket_progress.coding_progress.assistant_conversations = [
AssistantConversation() for fcr in file_change_requests
]
ticket_progress.save()
table = tabulate(
[
[
file_change_request.entity_display,
file_change_request.instructions_display.replace(
"\n", "<br/>"
).replace("```", "\\```"),
]
for file_change_request in file_change_requests
if file_change_request.change_type != "check"
],
headers=["File Path", "Proposed Changes"],
tablefmt="pipe",
)
logger.info("Generating PR...")
pull_request = PullRequest(
title="Sweep: " + title,
branch_name="sweep/" + to_branch_name(title),
content="",
)
logger.info("Making PR...")
ticket_progress.context.branch_name = pull_request.branch_name
ticket_progress.save()
files_progress: list[tuple[str, str, str, str]] = [
(
file_change_request.entity_display,
file_change_request.instructions_display,
"⏳ In Progress",
"",
)
for file_change_request in file_change_requests
]
checkboxes_progress: list[tuple[str, str, str]] = [
(
file_change_request.entity_display,
file_change_request.instructions_display,
" ",
)
for file_change_request in file_change_requests
if not file_change_request.change_type == "check"
]
checkboxes_contents = "\n".join(
[
create_checkbox(
f"`{filename}`", blockquote(instructions), check == "X"
)
for filename, instructions, check in checkboxes_progress
]
)
create_collapsible("Checklist", checkboxes_contents, opened=True)
file_change_requests[0].status = "running"
condensed_checkboxes_contents = "\n".join(
[
create_checkbox(f"`{filename}`", "", check == "X").strip()
for filename, instructions, check in checkboxes_progress
]
)
condensed_checkboxes_collapsible = create_collapsible(
"Checklist", condensed_checkboxes_contents, opened=True
)
current_issue = repo.get_issue(number=issue_number)
current_issue.edit(
body=summary + "\n\n" + condensed_checkboxes_collapsible
)
delete_branch = False
generator = create_pr_changes(
file_change_requests,
pull_request,
sweep_bot,
username,
installation_id,
issue_number,
chat_logger=chat_logger,
base_branch=overrided_branch_name,
additional_messages=[],
)
edit_sweep_comment(checkboxes_contents, 2)
if not file_change_requests:
raise NoFilesException()
response = {
"error": Exception(
f"Sweep failed to generate any file change requests! This could mean that Sweep failed to find the correct lines of code to modify or that GPT-4 did not respond in our specified format. Sometimes, retrying will fix this error. Otherwise, reach out to our Discord server for support (tracking_id={tracking_id})."
)
}
changed_files = []
for item in generator:
if isinstance(item, dict):
response = item
break
(
new_file_contents,
_,
commit,
file_change_requests,
) = item
# append all files that have been changed
if new_file_contents:
for file_name, _ in new_file_contents.items():
changed_files.append(file_name)
commit_hash: str = (
commit
if isinstance(commit, str)
else (
commit.sha
if commit is not None
else repo.get_branch(
pull_request.branch_name
).commit.sha
)
)
commit_url = (
f"https://github.com/{repo_full_name}/commit/{commit_hash}"
)
commit_url_display = (
f"<a href='{commit_url}'><code>{commit_hash[:7]}</code></a>"
)
create_error_logs(
commit_url_display,
None,
status=(
"✓"
),
)
checkboxes_progress = [
(
file_change_request.display_summary
+ " "
+ file_change_request.status_display
+ " "
+ (file_change_request.commit_hash_url or "")
+ f" [Edit]({file_change_request.get_edit_url(repo.full_name, pull_request.branch_name)})",
file_change_request.instructions_ticket_display
+ f"\n\n{file_change_request.diff_display}",
(
"X"
if file_change_request.status
in ("succeeded", "failed")
else " "
),
)
for file_change_request in file_change_requests
]
checkboxes_contents = "\n".join(
[
checkbox_template.format(
check=check,
filename=filename,
instructions=blockquote(instructions),
)
for filename, instructions, check in checkboxes_progress
]
)
collapsible_template.format(
summary="Checklist",
body=checkboxes_contents,
opened="open",
)
condensed_checkboxes_contents = "\n".join(
[
checkbox_template.format(
check=check,
filename=filename,
instructions="",
).strip()
for filename, instructions, check in checkboxes_progress
if not instructions.lower().startswith("run")
]
)
condensed_checkboxes_collapsible = collapsible_template.format(
summary="Checklist",
body=condensed_checkboxes_contents,
opened="open",
)
try:
current_issue = repo.get_issue(number=issue_number)
except BadCredentialsException:
user_token, g, repo = refresh_token()
cloned_repo.token = user_token
current_issue.edit(
body=summary + "\n\n" + condensed_checkboxes_collapsible
)
logger.info(files_progress)
edit_sweep_comment(checkboxes_contents, 2)
if not response.get("success"):
raise Exception(f"Failed to create PR: {response.get('error')}")
checkboxes_contents = "\n".join(
[
checkbox_template.format(
check=check,
filename=filename,
instructions=blockquote(instructions),
)
for filename, instructions, check in checkboxes_progress
]
)
condensed_checkboxes_contents = "\n".join(
[
checkbox_template.format(
check=check,
filename=filename,
instructions="",
).strip()
for filename, instructions, check in checkboxes_progress
if not instructions.lower().startswith("run")
]
)
condensed_checkboxes_collapsible = collapsible_template.format(
summary="Checklist",
body=condensed_checkboxes_contents,
opened="open",
)
for _ in range(3):
try:
current_issue.edit(
body=summary + "\n\n" + condensed_checkboxes_collapsible
)
break
except Exception:
from time import sleep
sleep(1)
edit_sweep_comment(checkboxes_contents, 2)
pr_changes = response["pull_request"]
# change the body here
diff_text = get_branch_diff_text(
repo=repo,
branch=pull_request.branch_name,
base_branch=overrided_branch_name,
)
new_description = PRDescriptionBot().describe_diffs(
diff_text,
pull_request.title,
)
# TODO: update the title as well
if new_description:
pr_changes.body = (
f"{new_description}\n\nFixes"
f" #{issue_number}.\n\n---\n\n{UPDATES_MESSAGE}\n\n---\n\n{INSTRUCTIONS_FOR_REVIEW}{BOT_SUFFIX}"
)
edit_sweep_comment(
"I have finished coding the issue. I am now reviewing it for completeness.",
3,
)
change_location = f" [`{pr_changes.pr_head}`](https://github.com/{repo_full_name}/commits/{pr_changes.pr_head}).\n\n"
review_message = (
"Here are my self-reviews of my changes at" + change_location
)
try:
fire_and_forget_wrapper(remove_emoji)(content_to_delete="eyes")
except Exception:
pass
changes_required, review_message = False, ""
if changes_required:
edit_sweep_comment(
review_message
+ "\n\nI finished incorporating these changes.",
3,
)
else:
edit_sweep_comment(
f"I have finished reviewing the code for completeness. I did not find errors for {change_location}",
3,
)
revert_buttons = []
for changed_file in set(changed_files):
revert_buttons.append(
Button(label=f"{RESET_FILE} {changed_file}")
)
revert_buttons_list = ButtonList(
buttons=revert_buttons, title=REVERT_CHANGED_FILES_TITLE
)
# delete failing sweep yaml if applicable
if sweep_yml_failed:
try:
repo.delete_file(
"sweep.yaml",
"Delete failing sweep.yaml",
branch=pr_changes.pr_head,
sha=repo.get_contents("sweep.yaml").sha,
)
except Exception:
pass
# create the PR as a regular (non-draft) PR; see draft=False below
pr: GithubPullRequest = repo.create_pull(
title=pr_changes.title,
body=pr_changes.body,
head=pr_changes.pr_head,
base=overrided_branch_name or SweepConfig.get_branch(repo),
# removed draft PR
draft=False,
)
try:
pr.add_to_assignees(username)
except Exception as e:
logger.error(
f"Failed to add assignee {username}: {e}, probably a bot."
)
ticket_progress.status = TicketProgressStatus.COMPLETE
ticket_progress.context.done_time = time()
ticket_progress.context.pr_id = pr.number
ticket_progress.save()
if revert_buttons:
pr.create_issue_comment(
revert_buttons_list.serialize() + BOT_SUFFIX
)
# add comments before labelling
pr.add_to_labels(GITHUB_LABEL_NAME)
current_issue.create_reaction("rocket")
heres_pr_message = f'<h1 align="center">🚀 Here\'s the PR! <a href="{pr.html_url}">#{pr.number}</a></h1>'
progress_message = f'<div align="center"><b>See Sweep\'s progress at <a href="{PROGRESS_BASE_URL}/issues/{tracking_id}">the progress dashboard</a>!</b></div>'
edit_sweep_comment(
review_message + "\n\nSuccess! 🚀",
4,
pr_message=(
f"{center(heres_pr_message)}\n{center(progress_message)}\n{center(payment_message_start)}"
),
done=True,
)
user_settings = UserSettings.from_username(username=username)
user = g.get_user(username)
full_name = user.name or user.login
name = full_name.split(" ")[0]
files_changed = []
for fcr in file_change_requests:
if fcr.change_type in ("create", "modify"):
diff = list(
difflib.unified_diff(
(fcr.old_content or "").splitlines() or [],
(fcr.new_content or "").splitlines() or [],
lineterm="",
)
)
added = sum(
1
for line in diff
if line.startswith("+") and not line.startswith("+++")
)
removed = sum(
1
for line in diff
if line.startswith("-") and not line.startswith("---")
)
files_changed.append(
f"<code>{fcr.filename}</code> (+{added}/-{removed})"
)
user_settings.send_email(
subject=f"Sweep Pull Request Complete for {repo_name}#{issue_number} {title}",
html=email_template.format(
name=name,
pr_url=pr.html_url,
issue_number=issue_number,
repo_full_name=repo_full_name,
pr_number=pr.number,
progress_url=f"{PROGRESS_BASE_URL}/issues/{tracking_id}",
summary=markdown.markdown(pr_changes.body),
files_changed="\n".join(
[f"<li>{item}</li>" for item in files_changed]
),
sweeping_gif=sweeping_gif,
),
)
# poll for github to check when gha are done
total_poll_attempts = 0
total_edit_attempts = 0
SLEEP_DURATION_SECONDS = 15
GITHUB_ACTIONS_ENABLED = get_gha_enabled(repo=repo) and DEPLOYMENT_GHA_ENABLED
GHA_MAX_EDIT_ATTEMPTS = 5 # max number of times to edit PR
current_commit = pr.head.sha
while GITHUB_ACTIONS_ENABLED:
logger.info(
f"Polling to see if Github Actions have finished... {total_poll_attempts}"
)
# we wait at most 60 minutes
if total_poll_attempts * SLEEP_DURATION_SECONDS // 60 >= 60:
break
else:
# wait SLEEP_DURATION_SECONDS between check attempts
total_poll_attempts += 1
from time import sleep
sleep(SLEEP_DURATION_SECONDS)
runs = list(repo.get_workflow_runs(branch=pr.head.ref, head_sha=current_commit))
# if all runs have succeeded, break
if all([run.conclusion == "success" for run in runs]):
break
# if any of them have failed we retry
if any([run.conclusion == "failure" for run in runs]):
failed_runs = [
run for run in runs if run.conclusion == "failure"
]
failed_gha_logs: list[str] = get_failing_gha_logs(
failed_runs,
installation_id,
)
if failed_gha_logs:
# make edits to the PR
# TODO: look into rollbacks so we don't continue adding onto errors
cloned_repo = ClonedRepo( # reinitialize cloned_repo to avoid conflicts
repo_full_name,
installation_id=installation_id,
token=user_token,
repo=repo,
branch=pr.head.ref,
)
diffs = get_branch_diff_text(repo=repo, branch=pr.head.ref, base_branch=pr.base.ref)
problem_statement = f"{title}\n{message_summary}\n{replies_text}"
all_information_prompt = GHA_PROMPT.format(
problem_statement=problem_statement,
github_actions_logs=failed_gha_logs,
changes_made=diffs,
)
repo_context_manager = prep_snippets(cloned_repo=cloned_repo, query=(title + message_summary + replies_text).strip("\n"), ticket_progress=ticket_progress) # need to do this, can use the old query for speed
sweep_bot: SweepBot = construct_sweep_bot(
repo=repo,
repo_name=repo_name,
issue_url=issue_url,
repo_description=repo_description,
title="Fix the following errors to complete the user request.",
message_summary=all_information_prompt,
cloned_repo=cloned_repo,
ticket_progress=ticket_progress,
chat_logger=chat_logger,
snippets=snippets,
tree=tree,
comments=comments,
)
file_change_requests, plan = get_files_to_change_for_gha(
relevant_snippets=repo_context_manager.current_top_snippets,
read_only_snippets=repo_context_manager.read_only_snippets,
problem_statement=all_information_prompt,
updated_files=new_file_contents,
cloned_repo=cloned_repo,
chat_logger=chat_logger,
)
validate_file_change_requests(file_change_requests, cloned_repo)
previous_modify_files_dict: dict[str, dict[str, str | list[str]]] | None = None
_, commit, _ = sweep_bot.handle_modify_file_main(
branch=pr.head.ref,
assistant_conversation=None,
additional_messages=[],
previous_modify_files_dict=previous_modify_files_dict,
file_change_requests=file_change_requests,
username=username
)
current_commit = commit.sha
pr = repo.get_pull(pr.number) # IMPORTANT: resync PR otherwise you'll fetch old GHA runs
total_edit_attempts += 1
if total_edit_attempts >= GHA_MAX_EDIT_ATTEMPTS:
logger.info(f"Tried to edit PR {GHA_MAX_EDIT_ATTEMPTS} times, giving up.")
break
# if none of the runs have completed we wait and poll github
logger.info(
f"No Github Actions have failed yet and not all have succeeded yet, waiting for {SLEEP_DURATION_SECONDS} seconds before polling again..."
)
# break from main for loop
convert_pr_draft_field(pr, is_draft=False, installation_id=installation_id)
except MaxTokensExceeded as e:
logger.info("Max tokens exceeded")
ticket_progress.status = TicketProgressStatus.ERROR
ticket_progress.error_message = "Max tokens exceeded. Feel free to add more details to the issue description for Sweep to better address it, or alternatively, reach out to Kevin or William for help at https://discord.gg/sweep."
ticket_progress.save()
log_error(
is_paying_user,
is_consumer_tier,
username,
issue_url,
"Max Tokens Exceeded",
str(e) + "\n" + traceback.format_exc(),
priority=2,
)
if chat_logger and chat_logger.is_paying_user():
edit_sweep_comment(
(
f"Sorry, I could not edit `{e.filename}` as this file is too long."
" We are currently working on improved file streaming to address"
" this issue.\n"
),
-1,
)
else:
edit_sweep_comment(
(
f"Sorry, I could not edit `{e.filename}` as this file is too"
" long.\n\nIf this file is incorrect, please describe the desired"
" file in the prompt. However, if you would like to edit longer"
" files, consider upgrading to [Sweep Pro](https://sweep.dev/) for"
" longer context lengths.\n"
),
-1,
)
delete_branch = True
raise e
except NoFilesException as e:
ticket_progress.status = TicketProgressStatus.ERROR
ticket_progress.error_message = "Sweep could not find files to modify to address this issue. Feel free to add more details to the issue description for Sweep to better address it, or alternatively, reach out to Kevin or William for help at https://discord.gg/sweep."
ticket_progress.save()
logger.info("Sweep could not find files to modify")
log_error(
is_paying_user,
is_consumer_tier,
username,
issue_url,
"Sweep could not find files to modify",
str(e) + "\n" + traceback.format_exc(),
priority=2,
)
edit_sweep_comment(
(
"Sorry, Sweep could not find any appropriate files to edit to address"
" this issue. If this is a mistake, please provide more context and Sweep"
f" will retry!\n\n@{username}, please edit the issue description to"
" include more details. You can also ask for help on our community"
" forum: https://community.sweep.dev/"
),
-1,
)
delete_branch = True
raise e
except openai.BadRequestError as e:
ticket_progress.status = TicketProgressStatus.ERROR
ticket_progress.error_message = "Sorry, it looks like there is an error with communicating with OpenAI. If this error persists, reach out to Kevin or William for help at https://discord.gg/sweep."
ticket_progress.save()
logger.error(traceback.format_exc())
logger.error(e)
edit_sweep_comment(
(
"I'm sorry, but it looks like our model has run out of context length. We're"
" trying to make this happen less, but one way to mitigate this is to"
" code smaller files. If this error persists, report it at"
" https://discord.gg/sweep."
),
-1,
)
log_error(
is_paying_user,
is_consumer_tier,
username,
issue_url,
"Context Length",
str(e) + "\n" + traceback.format_exc(),
priority=2,
)
posthog.capture(
username,
"failed",
properties={
"error": str(e),
"trace": traceback.format_exc(),
"reason": "Invalid request error / context length",
**metadata,
"duration": round(time() - on_ticket_start_time),
},
)
delete_branch = True
raise e
except AssistantRaisedException as e:
if ticket_progress is not None:
ticket_progress.status = TicketProgressStatus.ERROR
ticket_progress.error_message = f"Sweep raised an error with the following message: {e.message}. Feel free to add more details to the issue description for Sweep to better address it, or alternatively, reach out to Kevin or William for help at https://discord.gg/sweep."
ticket_progress.save()
logger.exception(e)
edit_sweep_comment(
f"Sweep raised an error with the following message:\n{blockquote(e.message)}",
-1,
)
log_error(
is_paying_user,
is_consumer_tier,
username,
issue_url,
"Workflow",
str(e) + "\n" + traceback.format_exc(),
priority=1,
)
raise e
except Exception as e:
ticket_progress.status = TicketProgressStatus.ERROR
ticket_progress.error_message = f"Internal server error: {str(e)}. Feel free to add more details to the issue description for Sweep to better address it, or alternatively, reach out to Kevin or William for help at https://discord.gg/sweep."
ticket_progress.save()
logger.error(traceback.format_exc())
logger.error(e)
# title and summary are defined elsewhere
if len(title + summary) < 60:
edit_sweep_comment(
(
"I'm sorry, but it looks like an error occurred due to"
f" a planning failure. The error message is {str(e)}. Feel free to add more details to the issue description"
" so Sweep can better address it. Alternatively, post on our community forum"
" for assistance: https://community.sweep.dev/"
),
-1,
)
else:
edit_sweep_comment(
(
"I'm sorry, but it looks like an error has occurred due to"
+ f" a planning failure. The error message is {str(e)}. Feel free to add more details to the issue description"
+ " so Sweep can better address it. Alternatively, reach out to Kevin or William for help at"
+ " https://discord.gg/sweep."
),
-1,
)
log_error(
is_paying_user,
is_consumer_tier,
username,
issue_url,
"Workflow",
str(e) + "\n" + traceback.format_exc(),
priority=1,
)
raise e
else:
try:
fire_and_forget_wrapper(remove_emoji)(content_to_delete="eyes")
fire_and_forget_wrapper(add_emoji)("rocket")
except SystemExit:
raise # re-raise the original SystemExit so its exit code is preserved
except Exception as e:
logger.error(e)
if delete_branch:
try:
if pull_request.branch_name.startswith("sweep"):
repo.get_git_ref(
f"heads/{pull_request.branch_name}"
).delete()
else:
raise Exception(
f"Branch name {pull_request.branch_name} does not start with sweep/"
)
except Exception as e:
logger.error(e)
logger.error(traceback.format_exc())
logger.info(f"Deleted branch {pull_request.branch_name}")
except Exception as e:
posthog.capture(
username,
"failed",
properties={
**metadata,
"error": str(e),
"trace": traceback.format_exc(),
"duration": round(time() - on_ticket_start_time),
},
)
raise e
posthog.capture(
username,
"success",
properties={**metadata, "duration": round(time() - on_ticket_start_time)},
)
logger.info("on_ticket success in " + str(round(time() - on_ticket_start_time)))
return {"success": True}
def handle_sandbox_mode(
title, repo_full_name, repo, ticket_progress, edit_sweep_comment
):
logger.info("Running in sandbox mode")
sweep_bot = SweepBot(repo=repo, ticket_progress=ticket_progress)
logger.info("Getting file contents")
file_name = title.split(":")[1].strip()
file_contents = sweep_bot.get_contents(file_name).decoded_content.decode("utf-8")
try:
ext = file_name.split(".")[-1]
except Exception:
ext = ""
file_contents.replace("```", "\`\`\`")
sha = repo.get_branch(repo.default_branch).commit.sha
permalink = f"https://github.com/{repo_full_name}/blob/{sha}/{file_name}#L1-L{len(file_contents.splitlines())}"
logger.info("Running sandbox")
edit_sweep_comment(
f"Running sandbox for {file_name}. Current Code:\n\n{permalink}",
1,
)
updated_contents, sandbox_response = sweep_bot.check_sandbox(
file_name, file_contents
)
logger.info("Sandbox finished")
logs = (
(
"<br/>"
+ create_collapsible(
"Sandbox logs",
blockquote(
"\n\n".join(
[
create_collapsible(
f"<code>{output}</code> {i + 1}/{len(sandbox_response.outputs)} {format_sandbox_success(sandbox_response.success)}",
f"<pre>{clean_logs(output)}</pre>",
i == len(sandbox_response.outputs) - 1,
)
for i, output in enumerate(sandbox_response.outputs)
if len(sandbox_response.outputs) > 0
]
)
),
opened=True,
)
)
if sandbox_response
else ""
)
updated_contents = updated_contents.replace("```", "\`\`\`")
diff = generate_diff(file_contents, updated_contents).replace("```", "\`\`\`")
diff_display = (
f"Updated Code:\n\n```{ext}\n{updated_contents}```\nDiff:\n```diff\n{diff}\n```"
if diff
else f"Sandbox made no changes to {file_name} (formatters were not configured or Sweep didn't make changes)."
)
edit_sweep_comment(
f"{logs}\n{diff_display}",
2,
)
edit_sweep_comment("N/A", 3)
logger.info("Sandbox comments updated")
def get_branch_diff_text(repo, branch, base_branch=None):
base_branch = base_branch or SweepConfig.get_branch(repo)
comparison = repo.compare(base_branch, branch)
file_diffs = comparison.files
pr_diffs = []
for file in file_diffs:
diff = file.patch
if (
file.status == "added"
or file.status == "modified"
or file.status == "removed"
):
pr_diffs.append((file.filename, diff))
else:
logger.info(
f"File status {file.status} not recognized"
) # TODO(sweep): We don't handle renamed files
return "\n".join([f"{filename}\n{diff}" for filename, diff in pr_diffs])
def get_payment_messages(chat_logger: ChatLogger):
if chat_logger:
is_paying_user = chat_logger.is_paying_user()
is_consumer_tier = chat_logger.is_consumer_tier()
use_faster_model = chat_logger.use_faster_model()
else:
is_paying_user = True
is_consumer_tier = False
use_faster_model = False
# guard against chat_logger being None even when MONGODB_URI is set
tracking_id = chat_logger.data["tracking_id"] if chat_logger and MONGODB_URI is not None else None
# Find the first comment made by the bot
tickets_allocated = 5
if is_consumer_tier:
tickets_allocated = 15
if is_paying_user:
tickets_allocated = 500
purchased_ticket_count = (
chat_logger.get_ticket_count(purchased=True) if chat_logger else 0
)
ticket_count = (
max(tickets_allocated - chat_logger.get_ticket_count(), 0)
+ purchased_ticket_count
if chat_logger
else 999
)
daily_ticket_count = (
(3 - chat_logger.get_ticket_count(use_date=True) if not use_faster_model else 0)
if chat_logger
else 999
)
model_name = "GPT-4"
single_payment_link = "https://buy.stripe.com/00g3fh7qF85q0AE14d"
pro_payment_link = "https://buy.stripe.com/00g5npeT71H2gzCfZ8"
daily_message = (
f" and {daily_ticket_count} for the day"
if not is_paying_user and not is_consumer_tier
else ""
)
user_type = "💎 <b>Sweep Pro</b>" if is_paying_user else "⚡ <b>Sweep Basic Tier</b>"
gpt_tickets_left_message = (
f"{ticket_count} GPT-4 tickets left for the month"
if not is_paying_user
else "unlimited GPT-4 tickets"
)
purchase_message = f"<br/><br/> For more GPT-4 tickets, visit <a href={single_payment_link}>our payment portal</a>. For a one week free trial, try <a href={pro_payment_link}>Sweep Pro</a> (unlimited GPT-4 tickets)."
payment_message = (
f"{user_type}: I used {model_name} to create this ticket. You have {gpt_tickets_left_message}{daily_message}. (tracking ID: <code>{tracking_id}</code>)"
+ (purchase_message if not is_paying_user else "")
)
payment_message_start = (
f"{user_type}: I'm using {model_name}. You have {gpt_tickets_left_message}{daily_message}. (tracking ID: <code>{tracking_id}</code>)"
+ (purchase_message if not is_paying_user else "")
)
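
The monthly allowance arithmetic in `get_payment_messages` can be checked in isolation. Below is a minimal sketch, assuming a hypothetical helper `tickets_left` that takes plain integers in place of the `ChatLogger.get_ticket_count` calls; the tier sizes (5, 15, 500) are the values hard-coded in the function above, and the helper name and parameters are illustrative only.

```python
def tickets_left(
    is_paying_user: bool,
    is_consumer_tier: bool,
    used: int,
    purchased: int = 0,
) -> int:
    """Hypothetical helper mirroring the monthly ticket arithmetic above."""
    tickets_allocated = 5  # default allowance
    if is_consumer_tier:
        tickets_allocated = 15
    if is_paying_user:
        tickets_allocated = 500  # the paying tier overrides the consumer tier
    # The monthly allowance never goes negative; purchased tickets are added on top.
    return max(tickets_allocated - used, 0) + purchased


if __name__ == "__main__":
    print(tickets_left(False, False, used=2))
    print(tickets_left(False, False, used=9, purchased=2))
```

Clamping with `max(..., 0)` before adding `purchased` means that overusing the monthly allowance does not eat into separately purchased tickets.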

import base64
import os
from dotenv import load_dotenv
from loguru import logger
logger.print = logger.info
load_dotenv(dotenv_path=".env", override=True, verbose=True)
os.environ["GITHUB_APP_PEM"] = os.environ.get("GITHUB_APP_PEM") or base64.b64decode(
os.environ.get("GITHUB_APP_PEM_BASE64", "")
).decode("utf-8")
if os.environ["GITHUB_APP_PEM"]:
os.environ["GITHUB_APP_ID"] = (
(os.environ.get("GITHUB_APP_ID") or os.environ.get("APP_ID"))
.replace("\\n", "\n")
.strip('"')
)
TEST_BOT_NAME = "sweep-nightly[bot]"
ENV = os.environ.get("ENV", "dev")
BOT_TOKEN_NAME = "bot-token"
# goes under Modal 'discord' secret name (optional, can leave env var blank)
DISCORD_WEBHOOK_URL = os.environ.get("DISCORD_WEBHOOK_URL")
DISCORD_MEDIUM_PRIORITY_URL = os.environ.get("DISCORD_MEDIUM_PRIORITY_URL")
DISCORD_LOW_PRIORITY_URL = os.environ.get("DISCORD_LOW_PRIORITY_URL")
DISCORD_FEEDBACK_WEBHOOK_URL = os.environ.get("DISCORD_FEEDBACK_WEBHOOK_URL")
SWEEP_HEALTH_URL = os.environ.get("SWEEP_HEALTH_URL")
DISCORD_STATUS_WEBHOOK_URL = os.environ.get("DISCORD_STATUS_WEBHOOK_URL")
# goes under Modal 'github' secret name
GITHUB_APP_ID = os.environ.get("GITHUB_APP_ID", os.environ.get("APP_ID"))
# deprecated: old logic transfer so upstream can use this
if GITHUB_APP_ID is None:
if ENV == "prod":
GITHUB_APP_ID = "307814"
elif ENV == "dev":
GITHUB_APP_ID = "324098"
elif ENV == "staging":
GITHUB_APP_ID = "327588"
GITHUB_BOT_USERNAME = os.environ.get("GITHUB_BOT_USERNAME")
# deprecated: left to support old logic
if not GITHUB_BOT_USERNAME:
if ENV == "prod":
GITHUB_BOT_USERNAME = "sweep-ai[bot]"
elif ENV == "dev":
GITHUB_BOT_USERNAME = "sweep-nightly[bot]"
elif ENV == "staging":
GITHUB_BOT_USERNAME = "sweep-canary[bot]"
elif not GITHUB_BOT_USERNAME.endswith("[bot]"):
GITHUB_BOT_USERNAME = GITHUB_BOT_USERNAME + "[bot]"
GITHUB_LABEL_NAME = os.environ.get("GITHUB_LABEL_NAME", "sweep")
GITHUB_LABEL_COLOR = os.environ.get("GITHUB_LABEL_COLOR", "9400D3")
GITHUB_LABEL_DESCRIPTION = os.environ.get(
"GITHUB_LABEL_DESCRIPTION", "Sweep your software chores"
)
GITHUB_APP_PEM = os.environ.get("GITHUB_APP_PEM")
GITHUB_APP_PEM = GITHUB_APP_PEM or os.environ.get("PRIVATE_KEY")
if GITHUB_APP_PEM is not None:
GITHUB_APP_PEM = GITHUB_APP_PEM.strip(' \n"') # Remove whitespace and quotes
GITHUB_APP_PEM = GITHUB_APP_PEM.replace("\\n", "\n")
GITHUB_CONFIG_BRANCH = os.environ.get("GITHUB_CONFIG_BRANCH", "sweep/add-sweep-config")
GITHUB_DEFAULT_CONFIG = os.environ.get(
"GITHUB_DEFAULT_CONFIG",
"""# Sweep AI turns bugs & feature requests into code changes (https://sweep.dev)
# For details on our config file, check out our docs at https://docs.sweep.dev/usage/config
# This setting contains a list of rules that Sweep will check for. If any of these rules are broken in a new commit, Sweep will create an pull request to fix the broken rule.
rules:
{additional_rules}
# This is the branch that Sweep will develop from and make pull requests to. Most people use 'main' or 'master' but some users also use 'dev' or 'staging'.
branch: 'main'
# By default Sweep will read the logs and outputs from your existing Github Actions. To disable this, set this to false.
gha_enabled: True
# This is the description of your project. It will be used by sweep when creating PRs. You can tell Sweep what's unique about your project, what frameworks you use, or anything else you want.
#
# Example:
#
# description: sweepai/sweep is a python project. The main api endpoints are in sweepai/api.py. Write code that adheres to PEP8.
description: ''
# This sets whether to create pull requests as drafts. If this is set to True, then all pull requests will be created as drafts and GitHub Actions will not be triggered.
draft: False
# This is a list of directories that Sweep will not be able to edit.
blocked_dirs: []
""",
)
MONGODB_URI = os.environ.get("MONGODB_URI", None)
IS_SELF_HOSTED = os.environ.get("IS_SELF_HOSTED", "true").lower() == "true"
REDIS_URL = os.environ.get("REDIS_URL")
if not REDIS_URL:
REDIS_URL = os.environ.get("redis_url", "redis://0.0.0.0:6379/0")
ORG_ID = os.environ.get("ORG_ID", None)
POSTHOG_API_KEY = os.environ.get(
"POSTHOG_API_KEY", "phc_CnzwIB0W548wN4wEGeRuxXqidOlEUH2AcyV2sKTku8n"
)
SUPPORT_COUNTRY = os.environ.get("GDRP_LIST", "").split(",")
WHITELISTED_REPOS = os.environ.get("WHITELISTED_REPOS", "").split(",")
BLACKLISTED_USERS = os.environ.get("BLACKLISTED_USERS", "").split(",")
# Default OpenAI
OPENAI_API_KEY = os.environ.get("OPENAI_API_KEY", None) # this may be none, and it will use azure
OPENAI_API_TYPE = os.environ.get("OPENAI_API_TYPE", "anthropic")
assert OPENAI_API_TYPE in ["anthropic", "azure", "openai"], "Invalid OPENAI_API_TYPE"
OPENAI_EMBEDDINGS_API_TYPE = os.environ.get("OPENAI_EMBEDDINGS_API_TYPE", "openai")
AZURE_API_KEY = os.environ.get("AZURE_API_KEY", None)
OPENAI_API_BASE = os.environ.get("OPENAI_API_BASE", None)
OPENAI_API_VERSION = os.environ.get("OPENAI_API_VERSION", None)
AZURE_OPENAI_DEPLOYMENT = os.environ.get("AZURE_OPENAI_DEPLOYMENT", None)
OPENAI_EMBEDDINGS_API_TYPE = os.environ.get("OPENAI_EMBEDDINGS_API_TYPE", "openai")
OPENAI_EMBEDDINGS_AZURE_ENDPOINT = os.environ.get(
"OPENAI_EMBEDDINGS_AZURE_ENDPOINT", None
)
OPENAI_EMBEDDINGS_AZURE_DEPLOYMENT = os.environ.get(
"OPENAI_EMBEDDINGS_AZURE_DEPLOYMENT", None
)
OPENAI_EMBEDDINGS_AZURE_API_VERSION = os.environ.get(
"OPENAI_EMBEDDINGS_AZURE_API_VERSION", None
)
OPENAI_API_ENGINE_GPT35 = os.environ.get("OPENAI_API_ENGINE_GPT35", None)
OPENAI_API_ENGINE_GPT4 = os.environ.get("OPENAI_API_ENGINE_GPT4", None)
MULTI_REGION_CONFIG = os.environ.get("MULTI_REGION_CONFIG", None)
if isinstance(MULTI_REGION_CONFIG, str):
MULTI_REGION_CONFIG = MULTI_REGION_CONFIG.strip("'").replace("\\n", "\n")
MULTI_REGION_CONFIG = [item.split(",") for item in MULTI_REGION_CONFIG.split("\n")]
WHITELISTED_USERS = os.environ.get("WHITELISTED_USERS", None)
if WHITELISTED_USERS:
WHITELISTED_USERS = WHITELISTED_USERS.split(",")
WHITELISTED_USERS.append(GITHUB_BOT_USERNAME)
DEFAULT_GPT4_MODEL = os.environ.get("DEFAULT_GPT4_MODEL", "gpt-4-0125-preview")
RESEND_API_KEY = os.environ.get("RESEND_API_KEY", None)
LOKI_URL = None
DEBUG = os.environ.get("DEBUG", "false").lower() == "true"
ENV = "prod" if GITHUB_BOT_USERNAME != TEST_BOT_NAME else "dev"
PROGRESS_BASE_URL = os.environ.get(
"PROGRESS_BASE_URL", "https://progress.sweep.dev"
).rstrip("/")
DISABLED_REPOS = os.environ.get("DISABLED_REPOS", "").split(",")
GHA_AUTOFIX_ENABLED: bool = os.environ.get("GHA_AUTOFIX_ENABLED", "false").lower() == "true"
MERGE_CONFLICT_ENABLED: bool = os.environ.get("MERGE_CONFLICT_ENABLED", "false").lower() == "true"
INSTALLATION_ID = os.environ.get("INSTALLATION_ID", None)
AWS_ACCESS_KEY=os.environ.get("AWS_ACCESS_KEY")
AWS_SECRET_KEY=os.environ.get("AWS_SECRET_KEY")
AWS_REGION=os.environ.get("AWS_REGION")
ANTHROPIC_AVAILABLE = AWS_ACCESS_KEY and AWS_SECRET_KEY and AWS_REGION
ANTHROPIC_API_KEY = os.environ.get("ANTHROPIC_API_KEY", None)
COHERE_API_KEY = os.environ.get("COHERE_API_KEY", None)
VOYAGE_API_KEY = os.environ.get("VOYAGE_API_KEY", None)
VOYAGE_API_AWS_ACCESS_KEY=os.environ.get("VOYAGE_API_AWS_ACCESS_KEY_ID")
VOYAGE_API_AWS_SECRET_KEY=os.environ.get("VOYAGE_API_AWS_SECRET_KEY")
VOYAGE_API_AWS_REGION=os.environ.get("VOYAGE_API_AWS_REGION")
VOYAGE_API_AWS_ENDPOINT_NAME=os.environ.get("VOYAGE_API_AWS_ENDPOINT_NAME", "voyage-code-2")
VOYAGE_API_USE_AWS = VOYAGE_API_AWS_ACCESS_KEY and VOYAGE_API_AWS_SECRET_KEY and VOYAGE_API_AWS_REGION
PAREA_API_KEY = os.environ.get("PAREA_API_KEY", None)
# TODO: we need to make this dynamic + backoff
BATCH_SIZE = int(
os.environ.get("BATCH_SIZE", 64 if VOYAGE_API_KEY else 256) # Voyage only allows 128 items per batch and 120000 tokens per batch
)
DEPLOYMENT_GHA_ENABLED = os.environ.get("DEPLOYMENT_GHA_ENABLED", "true").lower() == "true"
JIRA_USER_NAME = os.environ.get("JIRA_USER_NAME", None)
JIRA_API_TOKEN = os.environ.get("JIRA_API_TOKEN", None)
JIRA_URL = os.environ.get("JIRA_URL", None)

import re
from sweepai.core.chat import ChatGPT
issue_validator_instructions_prompt = """# Instructions
A good issue for Sweep is actionable and it is clear how to resolve it. Here is what Sweep is currently capable of:
- Access to the entire codebase, with a high-quality search engine to find specific code snippets. Sweep is able to pinpoint the exact location of the code that needs to be changed based on vague descriptions.
- Making code changes to fix bugs or add features.
- Reading the GitHub Action logs to run tests and check the results.
- Ability to read images such as screenshots and charts.
Here are some examples of things Sweep does not currently support:
- Large-scale changes like migrations and large version upgrades.
- Tasks requiring accessing outside information like AWS consoles or retrieving API keys.
- Tasks requiring fixes outside of code changes
- Issues that have an existing fix or duplicate issues
Respond in the following format:
<thinking>
Provide an analysis of why it is a good or bad issue to pass on to Sweep. If it is a bad issue, suggest how the issue could be improved or clarified to make it more suitable for Sweep.
</thinking>
<pass>True or False</pass>
If False, respond to the user:
<response_to_user>
Response to user with justification on why the issue is unclear.
</response_to_user>"""
issue_validator_system_prompt = """You are an AI assistant tasked with determining whether an issue reported by customer support should be passed on to be resolved by Sweep, an AI-powered software engineer.
""" + issue_validator_instructions_prompt
issue_validator_user_prompt = """<issue>
{issue}
</issue>\n\n""" + issue_validator_instructions_prompt
def validate_issue(issue: str) -> str:
"""
Somehow haiku and GPT-4 can't do this consistently.
"""
chat_gpt = ChatGPT.from_system_message_string(
prompt_string=issue_validator_system_prompt,
)
response = chat_gpt.chat_anthropic(
issue_validator_user_prompt.format(
issue=issue
),
model="claude-3-opus-20240229",
temperature=0.0,
)
if "<pass>False</pass>" in response:
pattern = "<response_to_user>(.*)</response_to_user>"
return re.search(pattern, response, re.DOTALL).group(1).strip()
return ""
if __name__ == "__main__":


Step 2: ⌨️ Coding

  • Modify sweepai/config/server.py ! No changes made
Modify sweepai/config/server.py with contents: Add a new environment variable for the user's Slack API key.

<original_code>
JIRA_USER_NAME = os.environ.get("JIRA_USER_NAME", None)
JIRA_API_TOKEN = os.environ.get("JIRA_API_TOKEN", None)
JIRA_URL = os.environ.get("JIRA_URL", None)

LICENSE_KEY = os.environ.get("LICENSE_KEY", None)
ALTERNATE_AWS = os.environ.get("ALTERNATE_AWS", "none").lower() == "true"
</original_code>

<new_code>
JIRA_USER_NAME = os.environ.get("JIRA_USER_NAME", None)
JIRA_API_TOKEN = os.environ.get("JIRA_API_TOKEN", None)
JIRA_URL = os.environ.get("JIRA_URL", None)

SLACK_API_KEY = os.environ.get("SLACK_API_KEY", None)

LICENSE_KEY = os.environ.get("LICENSE_KEY", None)
ALTERNATE_AWS = os.environ.get("ALTERNATE_AWS", "none").lower() == "true"
</new_code>
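
As a minimal sketch of how an optional secret like this could be read defensively: the helper name `read_optional_secret` is hypothetical and not part of the Sweep codebase, and the quote-stripping simply mirrors how `GITHUB_APP_PEM` is cleaned up elsewhere in `server.py`. It treats an unset or blank variable as "Slack integration disabled".

```python
import os
from typing import Mapping, Optional


def read_optional_secret(name: str, env: Mapping[str, str] = os.environ) -> Optional[str]:
    """Return the cleaned value of an environment variable, or None if unset or blank."""
    value = env.get(name)
    if value is None:
        return None
    # Drop surrounding whitespace and stray double quotes left over from .env files.
    value = value.strip().strip('"')
    return value or None


# Hypothetical usage: None means no Slack token was configured.
SLACK_API_KEY = read_optional_secret("SLACK_API_KEY")
```

Callers can then skip the Slack lookup entirely when `SLACK_API_KEY` is `None` rather than constructing a client with an empty token.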

  • Modify sweepai/handlers/on_ticket.py ! No changes made
Modify sweepai/handlers/on_ticket.py with contents: Import the necessary modules for making HTTP requests and parsing Slack message threads.

<original_code>
import re
import traceback
from typing import Any
import zipfile
from time import time

import markdown
import openai
import requests
import yaml
import yamllint.config as yamllint_config
from github import BadCredentialsException, Github, Repository
from github.Issue import Issue
from github.PullRequest import PullRequest as GithubPullRequest
from loguru import logger
from tabulate import tabulate
from tqdm import tqdm
from yamllint import linter
</original_code>

<new_code>
import re
import traceback
from typing import Any
import zipfile
from time import time

import markdown
import openai
import requests
import yaml
import yamllint.config as yamllint_config
from github import BadCredentialsException, Github, Repository
from github.Issue import Issue
from github.PullRequest import PullRequest as GithubPullRequest
from loguru import logger
from tabulate import tabulate
from tqdm import tqdm
from yamllint import linter

from slack_sdk import WebClient
from slack_sdk.errors import SlackApiError
</new_code>
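
Because `slack_sdk` would be a new dependency, one option is to guard the import so that installations without it keep working. This is only a sketch under that assumption; the `slack_enabled` helper is hypothetical and not part of the proposed change.

```python
try:
    from slack_sdk import WebClient
    from slack_sdk.errors import SlackApiError
except ImportError:
    # slack_sdk is not installed: fall back to placeholders so the module still imports.
    WebClient = None
    SlackApiError = Exception


def slack_enabled(token) -> bool:
    """True only when the SDK is importable and a non-empty token was provided."""
    return WebClient is not None and bool(token)
```

With this guard, the Slack branch in `on_ticket` could be wrapped in `if slack_enabled(SLACK_API_KEY):` so a missing package or token degrades to the existing behaviour.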

  • Modify sweepai/handlers/on_ticket.py ! No changes made
Modify sweepai/handlers/on_ticket.py with contents: In the `on_ticket` function, check if the issue description contains a Slack link. If found, authenticate to Slack and fetch the message thread.

<original_code>
summary = summary or ""
summary = re.sub(
"<details (open)?>(\r)?\n<summary>Checklist</summary>.*",
"",
summary,
flags=re.DOTALL,
).strip()
summary = re.sub(
"---\s+Checklist:(\r)?\n(\r)?\n- \[[ X]\].*",
"",
summary,
flags=re.DOTALL,
).strip()
summary = re.sub(
"### Details\n\n_No response_", "", summary, flags=re.DOTALL
)
summary = re.sub("\n\n", "\n", summary, flags=re.DOTALL)
repo_name = repo_full_name
user_token, g = get_github_client(installation_id)
repo = g.get_repo(repo_full_name)
current_issue: Issue = repo.get_issue(number=issue_number)
assignee = current_issue.assignee.login if current_issue.assignee else None
if assignee is None:
assignee = current_issue.user.login
</original_code>

<new_code>
summary = summary or ""
summary = re.sub(
"<details (open)?>(\r)?\n<summary>Checklist</summary>.*",
"",
summary,
flags=re.DOTALL,
).strip()
summary = re.sub(
"---\s+Checklist:(\r)?\n(\r)?\n- \[[ X]\].*",
"",
summary,
flags=re.DOTALL,
).strip()
summary = re.sub(
"### Details\n\n_No response_", "", summary, flags=re.DOTALL
)
summary = re.sub("\n\n", "\n", summary, flags=re.DOTALL)

# SLACK_API_KEY is assumed to be imported from sweepai.config.server.
slack_link_match = re.search(r'https://[\w-]+\.slack\.com/archives/(\w+)/p(\d{10})(\d{6})', summary)
if slack_link_match and SLACK_API_KEY:
    # The permalink already encodes the channel ID and the message timestamp:
    # the digits after "p" are the ts with the "." before the last six digits removed.
    slack_channel_id = slack_link_match.group(1)
    slack_message_ts = f"{slack_link_match.group(2)}.{slack_link_match.group(3)}"
    slack_client = WebClient(token=SLACK_API_KEY)

    try:
        slack_thread_replies = slack_client.conversations_replies(
            channel=slack_channel_id,
            ts=slack_message_ts
        )

        slack_thread_messages = [message.get('text', '') for message in slack_thread_replies['messages']]
        slack_thread_text = '\n'.join(slack_thread_messages)

        summary += f"\n\nSlack Thread:\n{slack_thread_text}"

    except SlackApiError as e:
        logger.error(f"Error fetching Slack thread: {e}")

repo_name = repo_full_name
user_token, g = get_github_client(installation_id)
repo = g.get_repo(repo_full_name)
current_issue: Issue = repo.get_issue(number=issue_number)
assignee = current_issue.assignee.login if current_issue.assignee else None
if assignee is None:
    assignee = current_issue.user.login

</new_code>
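
The link-detection step described above can also be sketched as a standalone helper that is easy to unit-test without network access. The names `SLACK_PERMALINK_RE`, `parse_slack_permalink`, and `format_thread` are hypothetical. The sketch assumes the usual Slack permalink shape, where the digits after `p` are the message timestamp with the decimal point removed.

```python
import re
from typing import Iterable, Mapping, Optional, Tuple

# Assumed permalink shape: https://<workspace>.slack.com/archives/<channel>/p<16 digits>
SLACK_PERMALINK_RE = re.compile(
    r"https://[\w-]+\.slack\.com/archives/(?P<channel>\w+)/p(?P<sec>\d{10})(?P<usec>\d{6})"
)


def parse_slack_permalink(text: str) -> Optional[Tuple[str, str]]:
    """Return (channel_id, message_ts) for the first Slack permalink in text, or None."""
    match = SLACK_PERMALINK_RE.search(text)
    if match is None:
        return None
    # Re-insert the "." that the permalink drops from the timestamp.
    return match["channel"], f"{match['sec']}.{match['usec']}"


def format_thread(messages: Iterable[Mapping[str, str]]) -> str:
    """Join the text of each message in a thread, one message per line."""
    return "\n".join(message.get("text", "") for message in messages)
```

The returned pair is meant to be passed as the `channel` and `ts` arguments of `WebClient.conversations_replies`, and `format_thread` can be applied to the `messages` list in its response before appending the result to `summary`.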


Step 3: 🔁 Code Review

Working on it...



💡 To recreate the pull request, edit the issue title or description.
Something wrong? Let us know.

This is an automated message generated by Sweep AI.

Contributor

sweep-nightly bot commented May 3, 2024

Sweeping

✨ Track Sweep's progress on our progress dashboard!


50%

💎 Sweep Pro: I'm using GPT-4. You have unlimited GPT-4 tickets. (tracking ID: a261678dcc)


Step 1: 🔎 Searching

I found the following snippets in your repository. I will now analyze these snippets and come up with a plan.

Here are the code snippets I think are relevant, in decreasing order of relevance. If a file is missing from this list, you can mention its path in the ticket description.

"""
on_ticket is the main function that is called when a new issue is created.
It is only called by the webhook handler in sweepai/api.py.
"""
import difflib
import io
import os
import re
import traceback
from typing import Any
import zipfile
from time import time
import markdown
import openai
import requests
import yaml
import yamllint.config as yamllint_config
from github import BadCredentialsException, Github, Repository
from github.Issue import Issue
from github.PullRequest import PullRequest as GithubPullRequest
from loguru import logger
from tabulate import tabulate
from tqdm import tqdm
from yamllint import linter
from sweepai.core.sweep_bot import GHA_PROMPT
from sweepai.agents.pr_description_bot import PRDescriptionBot
from sweepai.agents.image_description_bot import ImageDescriptionBot
from sweepai.config.client import (
RESET_FILE,
RESTART_SWEEP_BUTTON,
REVERT_CHANGED_FILES_TITLE,
SweepConfig,
get_documentation_dict,
get_gha_enabled,
)
from sweepai.config.server import (
DEPLOYMENT_GHA_ENABLED,
ENV,
GITHUB_LABEL_NAME,
IS_SELF_HOSTED,
MONGODB_URI,
PROGRESS_BASE_URL,
)
from sweepai.core.entities import (
AssistantRaisedException,
FileChangeRequest,
MaxTokensExceeded,
NoFilesException,
PullRequest,
SandboxResponse,
)
from sweepai.core.entities import create_error_logs as entities_create_error_logs
from sweepai.core.pr_reader import PRReader
from sweepai.core.sweep_bot import SweepBot, get_files_to_change, get_files_to_change_for_gha, validate_file_change_requests
from sweepai.handlers.create_pr import (
create_config_pr,
create_pr_changes,
safe_delete_sweep_branch,
)
from sweepai.handlers.on_check_suite import clean_gh_logs
from sweepai.utils.image_utils import get_image_contents_from_urls, get_image_urls_from_issue
from sweepai.utils.issue_validator import validate_issue
from sweepai.utils.validate_license import validate_license
from sweepai.utils.buttons import Button, ButtonList, create_action_buttons
from sweepai.utils.chat_logger import ChatLogger
from sweepai.utils.diff import generate_diff
from sweepai.utils.event_logger import posthog
from sweepai.utils.github_utils import (
CURRENT_USERNAME,
ClonedRepo,
convert_pr_draft_field,
get_github_client,
get_token,
sanitize_string_for_github,
)
from sweepai.utils.progress import (
AssistantConversation,
PaymentContext,
TicketContext,
TicketProgress,
TicketProgressStatus,
)
from sweepai.utils.prompt_constructor import HumanMessagePrompt
from sweepai.utils.str_utils import (
BOT_SUFFIX,
FASTER_MODEL_MESSAGE,
UPDATES_MESSAGE,
blockquote,
bot_suffix,
checkbox_template,
clean_logs,
collapsible_template,
create_checkbox,
create_collapsible,
discord_suffix,
format_sandbox_success,
get_hash,
sep,
stars_suffix,
strip_sweep,
to_branch_name,
)
from sweepai.utils.ticket_utils import (
center,
fetch_relevant_files,
fire_and_forget_wrapper,
log_error,
prep_snippets,
)
from sweepai.utils.user_settings import UserSettings
# from sandbox.sandbox_utils import Sandbox
sweeping_gif = """<a href="https://github.com/sweepai/sweep"><img class="swing" src="https://raw.githubusercontent.com/sweepai/sweep/main/.assets/sweeping.gif" width="100" style="width:50px; margin-bottom:10px" alt="Sweeping"></a>"""
custom_config = """
extends: relaxed
rules:
line-length: disable
indentation: disable
"""
INSTRUCTIONS_FOR_REVIEW = """\
### 💡 To get Sweep to edit this pull request, you can:
* Comment below, and Sweep can edit the entire PR
* Comment on a file, Sweep will only modify the commented file
* Edit the original issue to get Sweep to recreate the PR from scratch"""
email_template = """Hey {name},
<br/><br/>
🚀 I just finished creating a pull request for your issue ({repo_full_name}#{issue_number}) at <a href="{pr_url}">{repo_full_name}#{pr_number}</a>!
<br/><br/>
You can view how I created this pull request <a href="{progress_url}">here</a>.
<h2>Summary</h2>
<blockquote>
{summary}
</blockquote>
<h2>Files Changed</h2>
<ul>
{files_changed}
</ul>
{sweeping_gif}
<br/>
Cheers,
<br/>
Sweep
<br/>"""
FAILING_GITHUB_ACTION_PROMPT = """\
The following Github Actions failed on a previous attempt at fixing this issue.
Propose a fix to the failing github actions. You must edit the source code, not the github action itself.
{github_action_log}
"""
# Add :eyes: emoji to ticket
def add_emoji(issue: Issue, comment_id: int = None, reaction_content="eyes"):
item_to_react_to = issue.get_comment(comment_id) if comment_id else issue
item_to_react_to.create_reaction(reaction_content)
# If SWEEP_BOT reacted to item_to_react_to with "rocket", then remove it.
def remove_emoji(issue: Issue, comment_id: int = None, content_to_delete="eyes"):
item_to_react_to = issue.get_comment(comment_id) if comment_id else issue
reactions = item_to_react_to.get_reactions()
for reaction in reactions:
if (
reaction.content == content_to_delete
and reaction.user.login == CURRENT_USERNAME
):
item_to_react_to.delete_reaction(reaction.id)
def create_error_logs(
commit_url_display: str,
sandbox_response: SandboxResponse,
status: str = "✓",
):
return (
(
"<br/>"
+ create_collapsible(
f"Sandbox logs for {commit_url_display} {status}",
blockquote(
"\n\n".join(
[
create_collapsible(
f"<code>{output}</code> {i + 1}/{len(sandbox_response.outputs)} {format_sandbox_success(sandbox_response.success)}",
f"<pre>{clean_logs(output)}</pre>",
i == len(sandbox_response.outputs) - 1,
)
for i, output in enumerate(sandbox_response.outputs)
if len(sandbox_response.outputs) > 0
]
)
),
opened=True,
)
)
if sandbox_response
else ""
)
# takes in a list of workflow runs and returns a list of messages containing the logs of the failing runs
def get_failing_gha_logs(runs, installation_id) -> str:
token = get_token(installation_id)
all_logs = ""
for run in runs:
# jobs_url
jobs_url = run.jobs_url
jobs_response = requests.get(
jobs_url,
headers={
"Accept": "application/vnd.github+json",
"Authorization": f"Bearer {token}",
"X-GitHub-Api-Version": "2022-11-28",
},
)
if jobs_response.status_code == 200:
failed_jobs = []
jobs = jobs_response.json()["jobs"]
for job in jobs:
if job["conclusion"] == "failure":
failed_jobs.append(job)
failed_jobs_name_list = []
for job in failed_jobs:
# add failed steps
for step in job["steps"]:
if step["conclusion"] == "failure":
failed_jobs_name_list.append(
f"{job['name']}/{step['number']}_{step['name']}"
)
else:
logger.error(
"Failed to get jobs for failing github actions, possibly a credentials issue"
)
return all_logs
# make sure jobs is valid
if jobs_response.json()['total_count'] == 0:
logger.error(f"no jobs for this run: {run}, continuing...")
continue
# logs url
logs_url = run.logs_url
logs_response = requests.get(
logs_url,
headers={
"Accept": "application/vnd.github+json",
"Authorization": f"Bearer {token}",
"X-GitHub-Api-Version": "2022-11-28",
},
allow_redirects=True,
)
# Check if the request was successful
if logs_response.status_code == 200:
zip_data = io.BytesIO(logs_response.content)
zip_file = zipfile.ZipFile(zip_data, "r")
zip_file_names = zip_file.namelist()
for file in failed_jobs_name_list:
if f"{file}.txt" in zip_file_names:
logs = zip_file.read(f"{file}.txt").decode("utf-8")
logs_prompt = clean_gh_logs(logs)
all_logs += logs_prompt + "\n"
else:
logger.error(
"Failed to get logs for failing github actions, likely a credentials issue"
)
return all_logs
def delete_old_prs(repo: Repository, issue_number: int):
logger.info("Deleting old PRs...")
prs = repo.get_pulls(
state="open",
sort="created",
direction="desc",
base=SweepConfig.get_branch(repo),
)
for pr in tqdm(prs.get_page(0)):
# # Check if this issue is mentioned in the PR, and pr is owned by bot
# # This is done in create_pr, (pr_description = ...)
if pr.user.login == CURRENT_USERNAME and f"Fixes #{issue_number}.\n" in pr.body:
safe_delete_sweep_branch(pr, repo)
break
def construct_sweep_bot(
repo: Repository,
repo_name: str,
issue_url: str,
repo_description: str,
title: str,
message_summary: str,
cloned_repo: ClonedRepo,
ticket_progress: TicketProgress,
chat_logger: ChatLogger,
snippets: Any = None,
tree: Any = None,
comments: Any = None,
) -> SweepBot:
human_message = HumanMessagePrompt(
repo_name=repo_name,
issue_url=issue_url,
repo_description=repo_description.strip(),
title=title,
summary=message_summary,
snippets=snippets,
tree=tree,
)
sweep_bot = SweepBot.from_system_message_content(
human_message=human_message,
repo=repo,
is_reply=bool(comments),
chat_logger=chat_logger,
cloned_repo=cloned_repo,
ticket_progress=ticket_progress,
)
return sweep_bot
def get_comment_header(
index: int,
g: Github,
repo_full_name: str,
user_settings: UserSettings,
progress_headers: list[None | str],
tracking_id: str | None,
payment_message_start: str,
user_settings_message: str,
errored: bool = False,
pr_message: str = "",
done: bool = False,
initial_sandbox_response: int | SandboxResponse = -1,
initial_sandbox_response_file=None,
config_pr_url: str | None = None,
):
config_pr_message = (
"\n"
+ f"<div align='center'>Install Sweep Configs: <a href='{config_pr_url}'>Pull Request</a></div>"
if config_pr_url is not None
else ""
)
actions_message = create_action_buttons(
[
RESTART_SWEEP_BUTTON,
]
)
sandbox_execution_message = "\n\n## GitHub Actions failed\n\nThe sandbox appears to be unavailable or down.\n\n"
if initial_sandbox_response == -1:
sandbox_execution_message = ""
elif initial_sandbox_response is not None:
repo = g.get_repo(repo_full_name)
commit_hash = repo.get_commits()[0].sha
success = initial_sandbox_response.outputs and initial_sandbox_response.success
status = "✓" if success else "X"
sandbox_execution_message = (
"\n\n## GitHub Actions"
+ status
+ "\n\nHere are the GitHub Actions logs prior to making any changes:\n\n"
)
sandbox_execution_message += entities_create_error_logs(
f'<a href="https://github.com/{repo_full_name}/commit/{commit_hash}"><code>{commit_hash[:7]}</code></a>',
initial_sandbox_response,
initial_sandbox_response_file,
)
if success:
sandbox_execution_message += f"\n\nSandbox passed on the latest `{repo.default_branch}`, so sandbox checks will be enabled for this issue."
else:
sandbox_execution_message += "\n\nSandbox failed, so all sandbox checks will be disabled for this issue."
if index < 0:
index = 0
if index == 4:
return (
pr_message
+ config_pr_message
+ f"\n\n---\n{user_settings.get_message(completed=True)}"
+ f"\n\n---\n{actions_message}"
+ sandbox_execution_message
)
total = len(progress_headers)
index += 1 if done else 0
index *= 100 / total
index = int(index)
index = min(100, index)
if errored:
pbar = f"\n\n<img src='https://progress-bar.dev/{index}/?&title=Errored&width=600' alt='{index}%' />"
return (
f"{center(sweeping_gif)}<br/>{center(pbar)}\n\n"
+ f"\n\n---\n{actions_message}"
+ sandbox_execution_message
)
pbar = f"\n\n<img src='https://progress-bar.dev/{index}/?&title=Progress&width=600' alt='{index}%' />"
return (
f"{center(sweeping_gif)}"
+ (
center(
f'\n\n<h2>✨ Track Sweep\'s progress on our <a href="{PROGRESS_BASE_URL}/issues/{tracking_id}">progress dashboard</a>!</h2>'
)
if MONGODB_URI is not None
else ""
)
+ f"<br/>{center(pbar)}"
+ ("\n" + stars_suffix if index != -1 else "")
+ "\n"
+ center(payment_message_start)
+ f"\n\n---\n{user_settings_message}"
+ config_pr_message
+ f"\n\n---\n{actions_message}"
+ sandbox_execution_message
)
def on_ticket(
title: str,
summary: str,
issue_number: int,
issue_url: str, # purely for logging purposes
username: str,
repo_full_name: str,
repo_description: str,
installation_id: int,
comment_id: int = None,
edited: bool = False,
tracking_id: str | None = None,
):
if not os.environ.get("CLI"):
assert validate_license(), "License key is invalid or expired. Please contact us at team@sweep.dev to upgrade to an enterprise license."
with logger.contextualize(
tracking_id=tracking_id,
):
if tracking_id is None:
tracking_id = get_hash()
on_ticket_start_time = time()
logger.info(f"Starting on_ticket with title {title} and summary {summary}")
(
title,
slow_mode,
do_map,
subissues_mode,
sandbox_mode,
fast_mode,
lint_mode,
) = strip_sweep(title)
# fetch images from body of issue
image_urls = get_image_urls_from_issue(issue_number, repo_full_name, installation_id)
image_contents = get_image_contents_from_urls(image_urls)
summary = summary or ""
summary = re.sub(
"<details (open)?>(\r)?\n<summary>Checklist</summary>.*",
"",
summary,
flags=re.DOTALL,
).strip()
summary = re.sub(
"---\s+Checklist:(\r)?\n(\r)?\n- \[[ X]\].*",
"",
summary,
flags=re.DOTALL,
).strip()
summary = re.sub(
"### Details\n\n_No response_", "", summary, flags=re.DOTALL
)
summary = re.sub("\n\n", "\n", summary, flags=re.DOTALL)
repo_name = repo_full_name
user_token, g = get_github_client(installation_id)
repo = g.get_repo(repo_full_name)
current_issue: Issue = repo.get_issue(number=issue_number)
assignee = current_issue.assignee.login if current_issue.assignee else None
if assignee is None:
assignee = current_issue.user.login
ticket_progress = TicketProgress(
tracking_id=tracking_id,
username=username,
context=TicketContext(
title=title,
description=summary,
repo_full_name=repo_full_name,
issue_number=issue_number,
is_public=repo.private is False,
start_time=int(time()),
),
)
branch_match = re.search(
r"([B|b]ranch:) *(?P<branch_name>.+?)(\s|$)", summary
)
overrided_branch_name = None
if branch_match and "branch_name" in branch_match.groupdict():
overrided_branch_name = (
branch_match.groupdict()["branch_name"].strip().strip("`\"'")
)
# TODO: this code might be finicky, might have missed edge cases
if overrided_branch_name.startswith("https://github.com/"):
overrided_branch_name = overrided_branch_name.split("?")[0].split(
"tree/"
)[-1]
SweepConfig.get_branch(repo, overrided_branch_name)
chat_logger = (
ChatLogger(
{
"repo_name": repo_name,
"title": title,
"summary": summary,
"issue_number": issue_number,
"issue_url": issue_url,
"username": (
username if not username.startswith("sweep") else assignee
),
"repo_full_name": repo_full_name,
"repo_description": repo_description,
"installation_id": installation_id,
"type": "ticket",
"mode": ENV,
"comment_id": comment_id,
"edited": edited,
"tracking_id": tracking_id,
},
active=True,
)
if MONGODB_URI
else None
)
if chat_logger and not IS_SELF_HOSTED:
is_paying_user = chat_logger.is_paying_user()
is_consumer_tier = chat_logger.is_consumer_tier()
use_faster_model = chat_logger.use_faster_model()
else:
is_paying_user = True
is_consumer_tier = False
use_faster_model = False
if use_faster_model:
raise Exception(FASTER_MODEL_MESSAGE)
if fast_mode:
use_faster_model = True
if not comment_id and not edited and chat_logger and not sandbox_mode:
fire_and_forget_wrapper(chat_logger.add_successful_ticket)(
gpt3=use_faster_model
)
organization, repo_name = repo_full_name.split("/")
metadata = {
"issue_url": issue_url,
"repo_full_name": repo_full_name,
"organization": organization,
"repo_name": repo_name,
"repo_description": repo_description,
"username": username,
"comment_id": comment_id,
"title": title,
"installation_id": installation_id,
"function": "on_ticket",
"edited": edited,
"model": "gpt-3.5" if use_faster_model else "gpt-4",
"tier": "pro" if is_paying_user else "free",
"mode": ENV,
"slow_mode": slow_mode,
"do_map": do_map,
"subissues_mode": subissues_mode,
"sandbox_mode": sandbox_mode,
"fast_mode": fast_mode,
"is_self_hosted": IS_SELF_HOSTED,
"tracking_id": tracking_id,
}
fire_and_forget_wrapper(posthog.capture)(
username, "started", properties=metadata
)
try:
if current_issue.state == "closed":
fire_and_forget_wrapper(posthog.capture)(
username,
"issue_closed",
properties={
**metadata,
"duration": round(time() - on_ticket_start_time),
},
)
return {"success": False, "reason": "Issue is closed"}
fire_and_forget_wrapper(add_emoji)(current_issue, comment_id)
fire_and_forget_wrapper(remove_emoji)(
current_issue, comment_id, content_to_delete="rocket"
)
fire_and_forget_wrapper(remove_emoji)(
current_issue, comment_id, content_to_delete="confused"
)
fire_and_forget_wrapper(current_issue.edit)(body=summary)
replies_text = ""
summary = summary if summary else ""
fire_and_forget_wrapper(delete_old_prs)(repo, issue_number)
if not sandbox_mode:
progress_headers = [
None,
"Step 1: 🔎 Searching",
"Step 2: ⌨️ Coding",
"Step 3: 🔁 Code Review",
]
else:
progress_headers = [
None,
"📖 Reading File",
"🛠️ Executing Sandbox",
]
issue_comment = None
payment_message, payment_message_start = get_payment_messages(
chat_logger
)
ticket_progress.context.payment_context = PaymentContext(
use_faster_model=use_faster_model,
pro_user=is_paying_user,
daily_tickets_used=(
chat_logger.get_ticket_count(use_date=True)
if chat_logger
else 0
),
monthly_tickets_used=(
chat_logger.get_ticket_count() if chat_logger else 0
),
)
ticket_progress.save()
config_pr_url = None
user_settings = UserSettings.from_username(username=username)
user_settings_message = user_settings.get_message()
cloned_repo = ClonedRepo(
repo_full_name,
installation_id=installation_id,
token=user_token,
repo=repo,
branch=overrided_branch_name,
)
# check that repo's directory is non-empty
if os.listdir(cloned_repo.cached_dir) == []:
logger.info("Empty repo")
first_comment = (
"Sweep is currently not supported on empty repositories. Please add some"
f" code to your repository and try again.\n{sep}##"
f" {progress_headers[1]}\n{bot_suffix}{discord_suffix}"
)
if issue_comment is None:
issue_comment = current_issue.create_comment(
first_comment + BOT_SUFFIX
)
else:
issue_comment.edit(first_comment + BOT_SUFFIX)
fire_and_forget_wrapper(add_emoji)(
current_issue, comment_id, reaction_content="confused"
)
fire_and_forget_wrapper(remove_emoji)(
current_issue, comment_id, content_to_delete="eyes"
)
return {"success": False}
indexing_message = (
"I'm searching for relevant snippets in your repository. If this is your first"
" time using Sweep, I'm indexing your repository. You can monitor the progress using the progress dashboard"
)
first_comment = (
f"{get_comment_header(0, g, repo_full_name, user_settings, progress_headers, tracking_id, payment_message_start, user_settings_message)}\n{sep}I am currently looking into this ticket! I"
" will update the progress of the ticket in this comment. I am currently"
f" searching through your code, looking for relevant snippets.\n{sep}##"
f" {progress_headers[1]}\n{indexing_message}{bot_suffix}{discord_suffix}"
)
# Find Sweep's previous comment
comments = []
for comment in current_issue.get_comments():
comments.append(comment)
if comment.user.login == CURRENT_USERNAME:
issue_comment = comment
break
if issue_comment is None:
issue_comment = current_issue.create_comment(first_comment)
else:
fire_and_forget_wrapper(issue_comment.edit)(first_comment)
old_edit = issue_comment.edit
issue_comment.edit = lambda msg: old_edit(msg + BOT_SUFFIX)
past_messages = {}
current_index = 0
table = None
initial_sandbox_response = -1
initial_sandbox_response_file = None
def refresh_token():
user_token, g = get_github_client(installation_id)
repo = g.get_repo(repo_full_name)
return user_token, g, repo
def edit_sweep_comment(
message: str,
index: int,
pr_message="",
done=False,
add_bonus_message=True,
):
nonlocal current_index, user_token, g, repo, issue_comment, initial_sandbox_response, initial_sandbox_response_file
message = sanitize_string_for_github(message)
if pr_message:
pr_message = sanitize_string_for_github(pr_message)
# -1 = error, -2 = retry
# Only update the progress bar if the issue generation errors.
errored = index == -1
if index >= 0:
past_messages[index] = message
current_index = index
agg_message = None
# Include progress history
# index = -2 is reserved for re-rendering the comment without recording a new progress message
for i in range(
current_index + 2
): # go to next header (for Working on it... text)
if i == 0 or i >= len(progress_headers):
continue # skip None header
header = progress_headers[i]
if header is not None:
header = "## " + header + "\n"
else:
header = "No header\n"
msg = header + (past_messages.get(i) or "Working on it...")
if agg_message is None:
agg_message = msg
else:
agg_message = agg_message + f"\n{sep}" + msg
suffix = bot_suffix + discord_suffix
if errored:
agg_message = (
"## ❌ Unable to Complete PR"
+ "\n"
+ message
+ (
"\n\nFor bonus GPT-4 tickets, please report this bug on"
f" **[Discord](https://discord.gg/invite/sweep)** (tracking ID: `{tracking_id}`)."
if add_bonus_message
else ""
)
)
if table is not None:
agg_message = (
agg_message
+ f"\n{sep}Please look at the generated plan. If something looks"
f" wrong, please add more details to your issue.\n\n{table}"
)
suffix = bot_suffix # don't include discord suffix for error messages
# Update the issue comment
msg = f"{get_comment_header(current_index, g, repo_full_name, user_settings, progress_headers, tracking_id, payment_message_start, user_settings_message, errored=errored, pr_message=pr_message, done=done, initial_sandbox_response=initial_sandbox_response, initial_sandbox_response_file=initial_sandbox_response_file, config_pr_url=config_pr_url)}\n{sep}{agg_message}{suffix}"
try:
issue_comment.edit(msg)
except BadCredentialsException:
logger.error(
f"Bad credentials, refreshing token (tracking ID: `{tracking_id}`)"
)
user_token, g = get_github_client(installation_id)
repo = g.get_repo(repo_full_name)
issue_comment = None
for comment in comments:
if comment.user.login == CURRENT_USERNAME:
issue_comment = comment
current_issue = repo.get_issue(number=issue_number)
if issue_comment is None:
issue_comment = current_issue.create_comment(msg)
else:
issue_comment = [
comment
for comment in current_issue.get_comments()
if comment.user.login == CURRENT_USERNAME
][0]
issue_comment.edit(msg)
if use_faster_model:
edit_sweep_comment(
FASTER_MODEL_MESSAGE, -1, add_bonus_message=False
)
posthog.capture(
username,
"ran_out_of_tickets",
properties={
**metadata,
"duration": round(time() - on_ticket_start_time),
},
)
fire_and_forget_wrapper(add_emoji)(
current_issue, comment_id, reaction_content="confused"
)
fire_and_forget_wrapper(remove_emoji)(content_to_delete="eyes")
return {
"success": False,
"error_message": "We deprecated supporting GPT 3.5.",
}
error_message = validate_issue(title + summary)
if error_message:
logger.warning(f"Validation error: {error_message}")
edit_sweep_comment(
(
f"The issue was rejected with the following response:\n\n{blockquote(error_message)}"
),
-1,
)
fire_and_forget_wrapper(add_emoji)(
current_issue, comment_id, reaction_content="confused"
)
fire_and_forget_wrapper(remove_emoji)(content_to_delete="eyes")
posthog.capture(
username,
"invalid_issue",
properties={
**metadata,
"duration": round(time() - on_ticket_start_time),
},
)
return {"success": True}
prs_extracted = PRReader.extract_prs(repo, summary)
message_summary = summary
if prs_extracted:
message_summary += "\n\n" + prs_extracted
edit_sweep_comment(
create_collapsible(
"I found that you mentioned the following Pull Requests that might be important:",
blockquote(
prs_extracted,
),
),
1,
)
try:
# search/context manager
logger.info("Searching for relevant snippets...")
if image_contents: # doing it here to avoid editing the original issue
message_summary += ImageDescriptionBot().describe_images(text=title + message_summary, images=image_contents)
snippets, tree, _, repo_context_manager = fetch_relevant_files(
cloned_repo,
title,
message_summary,
replies_text,
username,
metadata,
on_ticket_start_time,
tracking_id,
is_paying_user,
is_consumer_tier,
issue_url,
chat_logger,
ticket_progress,
images=image_contents
)
cloned_repo = repo_context_manager.cloned_repo
except Exception as e:
edit_sweep_comment(
(
"It looks like an issue occurred while fetching the files."
f" The exception was {str(e)}. If this error persists,"
f" contact team@sweep.dev.\n\n> @{username}, editing this issue description to include more details will automatically make me relaunch. Please join our Discord server for support (tracking_id={tracking_id})"
),
-1,
)
raise Exception("Failed to fetch files") from e
# refresh_token() already returns a fresh token, client and repo handle
user_token, g, repo = refresh_token()
cloned_repo.token = user_token
ticket_progress.search_progress.indexing_progress = (
ticket_progress.search_progress.indexing_total
)
ticket_progress.status = TicketProgressStatus.PLANNING
ticket_progress.save()
# Fall back to a placeholder when the repository has no description
if not repo_description:
repo_description = "No description provided."
message_summary += replies_text
get_documentation_dict(repo)
docs_results = ""
sweep_bot = construct_sweep_bot(
repo=repo,
repo_name=repo_name,
issue_url=issue_url,
repo_description=repo_description,
title=title,
message_summary=message_summary,
cloned_repo=cloned_repo,
ticket_progress=ticket_progress,
chat_logger=chat_logger,
snippets=snippets,
tree=tree,
comments=comments,
)
# Check repository for sweep.yaml file.
sweep_yml_exists = False
sweep_yml_failed = False
for content_file in repo.get_contents(""):
if content_file.name == "sweep.yaml":
sweep_yml_exists = True
# Check if YAML is valid
yaml_content = content_file.decoded_content.decode("utf-8")
sweep_yaml_dict = {}
try:
# safe_load returns None for an empty file, which would break len() below
sweep_yaml_dict = yaml.safe_load(yaml_content) or {}
except Exception:
logger.error(f"Failed to load YAML file: {yaml_content}")
if len(sweep_yaml_dict) > 0:
break
linter_config = yamllint_config.YamlLintConfig(custom_config)
problems = list(linter.run(yaml_content, linter_config))
if problems:
errors = [
f"Line {problem.line}: {problem.desc} (rule: {problem.rule})"
for problem in problems
]
error_message = "\n".join(errors)
markdown_error_message = f"**There is something wrong with your [sweep.yaml](https://github.com/{repo_full_name}/blob/main/sweep.yaml):**\n```\n{error_message}\n```"
sweep_yml_failed = True
logger.error(markdown_error_message)
edit_sweep_comment(markdown_error_message, -1)
else:
logger.info("The YAML file is valid. No errors found.")
break
# If sweep.yaml does not exist, then create a new PR that simply creates the sweep.yaml file.
if not sweep_yml_exists:
try:
logger.info("Creating sweep.yaml file...")
config_pr = create_config_pr(sweep_bot, cloned_repo=cloned_repo)
config_pr_url = config_pr.html_url
edit_sweep_comment(message="", index=-2)
except Exception as e:
logger.error(
f"Failed to create new branch for sweep.yaml file.\n{e}\n{traceback.format_exc()}"
)
else:
logger.info("sweep.yaml file already exists.")
try:
# ANALYZE SNIPPETS
newline = "\n"
edit_sweep_comment(
"I found the following snippets in your repository. I will now analyze"
" these snippets and come up with a plan."
+ "\n\n"
+ create_collapsible(
"Some code snippets I think are relevant in decreasing order of relevance (click to expand). If some file is missing from here, you can mention the path in the ticket description.",
"\n".join(
[
f"https://github.com/{organization}/{repo_name}/blob/{repo.get_commits()[0].sha}/{snippet.file_path}#L{max(snippet.start, 1)}-L{min(snippet.end, snippet.content.count(newline) - 1)}\n"
for snippet in snippets
]
),
)
+ (
create_collapsible(
"I also found that you mentioned the following Pull Requests that may be helpful:",
blockquote(prs_extracted),
)
if prs_extracted
else ""
)
+ (f"\n\n{docs_results}\n\n" if docs_results else ""),
1,
)
logger.info("Fetching files to modify/create...")
file_change_requests, plan = get_files_to_change(
relevant_snippets=repo_context_manager.current_top_snippets,
read_only_snippets=repo_context_manager.read_only_snippets,
problem_statement=f"{title}\n\n{message_summary}",
repo_name=repo_full_name,
cloned_repo=cloned_repo,
images=image_contents
)
validate_file_change_requests(file_change_requests, cloned_repo)
ticket_progress.planning_progress.file_change_requests = (
file_change_requests
)
ticket_progress.coding_progress.file_change_requests = (
file_change_requests
)
ticket_progress.coding_progress.assistant_conversations = [
AssistantConversation() for fcr in file_change_requests
]
ticket_progress.status = TicketProgressStatus.CODING
ticket_progress.save()
if not file_change_requests:
if len(title + summary) < 60:
edit_sweep_comment(
(
"Sorry, I could not find any files to modify, can you please"
" provide more details? Please make sure that the title and"
" summary of the issue are at least 60 characters."
),
-1,
)
else:
edit_sweep_comment(
(
"Sorry, I could not find any files to modify, can you please"
" provide more details?"
),
-1,
)
raise Exception("No files to modify.")
file_change_requests: list[
FileChangeRequest
] = sweep_bot.validate_file_change_requests(
file_change_requests,
)
ticket_progress.planning_progress.file_change_requests = (
file_change_requests
)
ticket_progress.coding_progress.assistant_conversations = [
AssistantConversation() for fcr in file_change_requests
]
ticket_progress.save()
table = tabulate(
[
[
file_change_request.entity_display,
file_change_request.instructions_display.replace(
"\n", "<br/>"
).replace("```", "\\```"),
]
for file_change_request in file_change_requests
if file_change_request.change_type != "check"
],
headers=["File Path", "Proposed Changes"],
tablefmt="pipe",
)
logger.info("Generating PR...")
pull_request = PullRequest(
title="Sweep: " + title,
branch_name="sweep/" + to_branch_name(title),
content="",
)
logger.info("Making PR...")
ticket_progress.context.branch_name = pull_request.branch_name
ticket_progress.save()
files_progress: list[tuple[str, str, str, str]] = [
(
file_change_request.entity_display,
file_change_request.instructions_display,
"⏳ In Progress",
"",
)
for file_change_request in file_change_requests
]
checkboxes_progress: list[tuple[str, str, str]] = [
(
file_change_request.entity_display,
file_change_request.instructions_display,
" ",
)
for file_change_request in file_change_requests
if file_change_request.change_type != "check"
]
checkboxes_contents = "\n".join(
[
create_checkbox(
f"`{filename}`", blockquote(instructions), check == "X"
)
for filename, instructions, check in checkboxes_progress
]
)
create_collapsible("Checklist", checkboxes_contents, opened=True)
file_change_requests[0].status = "running"
condensed_checkboxes_contents = "\n".join(
[
create_checkbox(f"`{filename}`", "", check == "X").strip()
for filename, instructions, check in checkboxes_progress
]
)
condensed_checkboxes_collapsible = create_collapsible(
"Checklist", condensed_checkboxes_contents, opened=True
)
current_issue = repo.get_issue(number=issue_number)
current_issue.edit(
body=summary + "\n\n" + condensed_checkboxes_collapsible
)
delete_branch = False
generator = create_pr_changes(
file_change_requests,
pull_request,
sweep_bot,
username,
installation_id,
issue_number,
chat_logger=chat_logger,
base_branch=overrided_branch_name,
additional_messages=[],
)
edit_sweep_comment(checkboxes_contents, 2)
if not file_change_requests:
raise NoFilesException()
response = {
"error": Exception(
f"Sweep failed to generate any file change requests! This could mean that Sweep failed to find the correct lines of code to modify or that GPT-4 did not respond in our specified format. Sometimes, retrying will fix this error. Otherwise, reach out to our Discord server for support (tracking_id={tracking_id})."
)
}
changed_files = []
for item in generator:
if isinstance(item, dict):
response = item
break
(
new_file_contents,
_,
commit,
file_change_requests,
) = item
# append all files that have been changed
if new_file_contents:
for file_name, _ in new_file_contents.items():
changed_files.append(file_name)
commit_hash: str = (
commit
if isinstance(commit, str)
else (
commit.sha
if commit is not None
else repo.get_branch(
pull_request.branch_name
).commit.sha
)
)
commit_url = (
f"https://github.com/{repo_full_name}/commit/{commit_hash}"
)
commit_url_display = (
f"<a href='{commit_url}'><code>{commit_hash[:7]}</code></a>"
)
create_error_logs(
commit_url_display,
None,
status=(
"✓"
),
)
checkboxes_progress = [
(
file_change_request.display_summary
+ " "
+ file_change_request.status_display
+ " "
+ (file_change_request.commit_hash_url or "")
+ f" [Edit]({file_change_request.get_edit_url(repo.full_name, pull_request.branch_name)})",
file_change_request.instructions_ticket_display
+ f"\n\n{file_change_request.diff_display}",
(
"X"
if file_change_request.status
in ("succeeded", "failed")
else " "
),
)
for file_change_request in file_change_requests
]
checkboxes_contents = "\n".join(
[
checkbox_template.format(
check=check,
filename=filename,
instructions=blockquote(instructions),
)
for filename, instructions, check in checkboxes_progress
]
)
collapsible_template.format(
summary="Checklist",
body=checkboxes_contents,
opened="open",
)
condensed_checkboxes_contents = "\n".join(
[
checkbox_template.format(
check=check,
filename=filename,
instructions="",
).strip()
for filename, instructions, check in checkboxes_progress
if not instructions.lower().startswith("run")
]
)
condensed_checkboxes_collapsible = collapsible_template.format(
summary="Checklist",
body=condensed_checkboxes_contents,
opened="open",
)
try:
current_issue = repo.get_issue(number=issue_number)
except BadCredentialsException:
user_token, g, repo = refresh_token()
cloned_repo.token = user_token
current_issue.edit(
body=summary + "\n\n" + condensed_checkboxes_collapsible
)
logger.info(files_progress)
edit_sweep_comment(checkboxes_contents, 2)
if not response.get("success"):
raise Exception(f"Failed to create PR: {response.get('error')}")
checkboxes_contents = "\n".join(
[
checkbox_template.format(
check=check,
filename=filename,
instructions=blockquote(instructions),
)
for filename, instructions, check in checkboxes_progress
]
)
condensed_checkboxes_contents = "\n".join(
[
checkbox_template.format(
check=check,
filename=filename,
instructions="",
).strip()
for filename, instructions, check in checkboxes_progress
if not instructions.lower().startswith("run")
]
)
condensed_checkboxes_collapsible = collapsible_template.format(
summary="Checklist",
body=condensed_checkboxes_contents,
opened="open",
)
for _ in range(3):
try:
current_issue.edit(
body=summary + "\n\n" + condensed_checkboxes_collapsible
)
break
except Exception:
from time import sleep
sleep(1)
edit_sweep_comment(checkboxes_contents, 2)
pr_changes = response["pull_request"]
# change the body here
diff_text = get_branch_diff_text(
repo=repo,
branch=pull_request.branch_name,
base_branch=overrided_branch_name,
)
new_description = PRDescriptionBot().describe_diffs(
diff_text,
pull_request.title,
)
# TODO: update the title as well
if new_description:
pr_changes.body = (
f"{new_description}\n\nFixes"
f" #{issue_number}.\n\n---\n\n{UPDATES_MESSAGE}\n\n---\n\n{INSTRUCTIONS_FOR_REVIEW}{BOT_SUFFIX}"
)
edit_sweep_comment(
"I have finished coding the issue. I am now reviewing it for completeness.",
3,
)
change_location = f" [`{pr_changes.pr_head}`](https://github.com/{repo_full_name}/commits/{pr_changes.pr_head}).\n\n"
review_message = (
"Here are my self-reviews of my changes at" + change_location
)
try:
fire_and_forget_wrapper(remove_emoji)(content_to_delete="eyes")
except Exception:
pass
changes_required, review_message = False, ""
if changes_required:
edit_sweep_comment(
review_message
+ "\n\nI finished incorporating these changes.",
3,
)
else:
edit_sweep_comment(
f"I have finished reviewing the code for completeness. I did not find errors for{change_location}",
3,
)
revert_buttons = []
for changed_file in set(changed_files):
revert_buttons.append(
Button(label=f"{RESET_FILE} {changed_file}")
)
revert_buttons_list = ButtonList(
buttons=revert_buttons, title=REVERT_CHANGED_FILES_TITLE
)
# delete failing sweep yaml if applicable
if sweep_yml_failed:
try:
repo.delete_file(
"sweep.yaml",
"Delete failing sweep.yaml",
branch=pr_changes.pr_head,
sha=repo.get_contents("sweep.yaml").sha,
)
except Exception:
pass
# create the PR directly as a regular (non-draft) PR
pr: GithubPullRequest = repo.create_pull(
title=pr_changes.title,
body=pr_changes.body,
head=pr_changes.pr_head,
base=overrided_branch_name or SweepConfig.get_branch(repo),
# removed draft PR
draft=False,
)
try:
pr.add_to_assignees(username)
except Exception as e:
logger.error(
f"Failed to add assignee {username}: {e}, probably a bot."
)
ticket_progress.status = TicketProgressStatus.COMPLETE
ticket_progress.context.done_time = time()
ticket_progress.context.pr_id = pr.number
ticket_progress.save()
if revert_buttons:
pr.create_issue_comment(
revert_buttons_list.serialize() + BOT_SUFFIX
)
# add comments before labelling
pr.add_to_labels(GITHUB_LABEL_NAME)
current_issue.create_reaction("rocket")
heres_pr_message = f'<h1 align="center">🚀 Here\'s the PR! <a href="{pr.html_url}">#{pr.number}</a></h1>'
progress_message = f'<div align="center"><b>See Sweep\'s progress at <a href="{PROGRESS_BASE_URL}/issues/{tracking_id}">the progress dashboard</a>!</b></div>'
edit_sweep_comment(
review_message + "\n\nSuccess! 🚀",
4,
pr_message=(
f"{center(heres_pr_message)}\n{center(progress_message)}\n{center(payment_message_start)}"
),
done=True,
)
user_settings = UserSettings.from_username(username=username)
user = g.get_user(username)
full_name = user.name or user.login
name = full_name.split(" ")[0]
files_changed = []
for fcr in file_change_requests:
if fcr.change_type in ("create", "modify"):
diff = list(
difflib.unified_diff(
(fcr.old_content or "").splitlines() or [],
(fcr.new_content or "").splitlines() or [],
lineterm="",
)
)
added = sum(
1
for line in diff
if line.startswith("+") and not line.startswith("+++")
)
removed = sum(
1
for line in diff
if line.startswith("-") and not line.startswith("---")
)
files_changed.append(
f"<code>{fcr.filename}</code> (+{added}/-{removed})"
)
user_settings.send_email(
subject=f"Sweep Pull Request Complete for {repo_name}#{issue_number} {title}",
html=email_template.format(
name=name,
pr_url=pr.html_url,
issue_number=issue_number,
repo_full_name=repo_full_name,
pr_number=pr.number,
progress_url=f"{PROGRESS_BASE_URL}/issues/{tracking_id}",
summary=markdown.markdown(pr_changes.body),
files_changed="\n".join(
[f"<li>{item}</li>" for item in files_changed]
),
sweeping_gif=sweeping_gif,
),
)
# poll for github to check when gha are done
total_poll_attempts = 0
total_edit_attempts = 0
SLEEP_DURATION_SECONDS = 15
GITHUB_ACTIONS_ENABLED = get_gha_enabled(repo=repo) and DEPLOYMENT_GHA_ENABLED
GHA_MAX_EDIT_ATTEMPTS = 5 # max number of times to edit PR
current_commit = pr.head.sha
while GITHUB_ACTIONS_ENABLED:
logger.info(
f"Polling to see if Github Actions have finished... {total_poll_attempts}"
)
# we wait at most 60 minutes
if total_poll_attempts * SLEEP_DURATION_SECONDS // 60 >= 60:
break
else:
# wait SLEEP_DURATION_SECONDS between check attempts
total_poll_attempts += 1
from time import sleep
sleep(SLEEP_DURATION_SECONDS)
runs = list(repo.get_workflow_runs(branch=pr.head.ref, head_sha=current_commit))
# if all runs have succeeded, break
if all([run.conclusion == "success" for run in runs]):
break
# if any of them have failed we retry
if any([run.conclusion == "failure" for run in runs]):
failed_runs = [
run for run in runs if run.conclusion == "failure"
]
failed_gha_logs: list[str] = get_failing_gha_logs(
failed_runs,
installation_id,
)
if failed_gha_logs:
# make edits to the PR
# TODO: look into rollbacks so we don't continue adding onto errors
cloned_repo = ClonedRepo( # reinitialize cloned_repo to avoid conflicts
repo_full_name,
installation_id=installation_id,
token=user_token,
repo=repo,
branch=pr.head.ref,
)
diffs = get_branch_diff_text(repo=repo, branch=pr.head.ref, base_branch=pr.base.ref)
problem_statement = f"{title}\n{message_summary}\n{replies_text}"
all_information_prompt = GHA_PROMPT.format(
problem_statement=problem_statement,
github_actions_logs=failed_gha_logs,
changes_made=diffs,
)
repo_context_manager = prep_snippets(cloned_repo=cloned_repo, query=(title + message_summary + replies_text).strip("\n"), ticket_progress=ticket_progress) # need to do this, can use the old query for speed
sweep_bot: SweepBot = construct_sweep_bot(
repo=repo,
repo_name=repo_name,
issue_url=issue_url,
repo_description=repo_description,
title="Fix the following errors to complete the user request.",
message_summary=all_information_prompt,
cloned_repo=cloned_repo,
ticket_progress=ticket_progress,
chat_logger=chat_logger,
snippets=snippets,
tree=tree,
comments=comments,
)
file_change_requests, plan = get_files_to_change_for_gha(
relevant_snippets=repo_context_manager.current_top_snippets,
read_only_snippets=repo_context_manager.read_only_snippets,
problem_statement=all_information_prompt,
updated_files=new_file_contents,
cloned_repo=cloned_repo,
chat_logger=chat_logger,
)
validate_file_change_requests(file_change_requests, cloned_repo)
previous_modify_files_dict: dict[str, dict[str, str | list[str]]] | None = None
_, commit, _ = sweep_bot.handle_modify_file_main(
branch=pr.head.ref,
assistant_conversation=None,
additional_messages=[],
previous_modify_files_dict=previous_modify_files_dict,
file_change_requests=file_change_requests,
username=username
)
current_commit = commit.sha
pr = repo.get_pull(pr.number) # IMPORTANT: resync PR otherwise you'll fetch old GHA runs
total_edit_attempts += 1
if total_edit_attempts >= GHA_MAX_EDIT_ATTEMPTS:
logger.info(f"Tried to edit PR {GHA_MAX_EDIT_ATTEMPTS} times, giving up.")
break
# if none of the runs have completed we wait and poll github
logger.info(
f"No Github Actions have failed yet and not all have succeeded yet, waiting for {SLEEP_DURATION_SECONDS} seconds before polling again..."
)
# break from main for loop
convert_pr_draft_field(pr, is_draft=False, installation_id=installation_id)
except MaxTokensExceeded as e:
logger.info("Max tokens exceeded")
ticket_progress.status = TicketProgressStatus.ERROR
ticket_progress.error_message = "Max tokens exceeded. Feel free to add more details to the issue description for Sweep to better address it, or alternatively, reach out to Kevin or William for help at https://discord.gg/sweep."
ticket_progress.save()
log_error(
is_paying_user,
is_consumer_tier,
username,
issue_url,
"Max Tokens Exceeded",
str(e) + "\n" + traceback.format_exc(),
priority=2,
)
if chat_logger and chat_logger.is_paying_user():
edit_sweep_comment(
(
f"Sorry, I could not edit `{e.filename}` as this file is too long."
" We are currently working on improved file streaming to address"
" this issue.\n"
),
-1,
)
else:
edit_sweep_comment(
(
f"Sorry, I could not edit `{e.filename}` as this file is too"
" long.\n\nIf this file is incorrect, please describe the desired"
" file in the prompt. However, if you would like to edit longer"
" files, consider upgrading to [Sweep Pro](https://sweep.dev/) for"
" longer context lengths.\n"
),
-1,
)
delete_branch = True
raise e
except NoFilesException as e:
ticket_progress.status = TicketProgressStatus.ERROR
ticket_progress.error_message = "Sweep could not find files to modify to address this issue. Feel free to add more details to the issue description for Sweep to better address it, or alternatively, reach out to Kevin or William for help at https://discord.gg/sweep."
ticket_progress.save()
logger.info("Sweep could not find files to modify")
log_error(
is_paying_user,
is_consumer_tier,
username,
issue_url,
"Sweep could not find files to modify",
str(e) + "\n" + traceback.format_exc(),
priority=2,
)
edit_sweep_comment(
(
"Sorry, Sweep could not find any appropriate files to edit to address"
" this issue. If this is a mistake, please provide more context and Sweep"
f" will retry!\n\n@{username}, please edit the issue description to"
" include more details. You can also ask for help on our community"
" forum: https://community.sweep.dev/"
),
-1,
)
delete_branch = True
raise e
except openai.BadRequestError as e:
ticket_progress.status = TicketProgressStatus.ERROR
ticket_progress.error_message = "Sorry, it looks like there was an error communicating with OpenAI. If this error persists, reach out to Kevin or William for help at https://discord.gg/sweep."
ticket_progress.save()
logger.error(traceback.format_exc())
logger.error(e)
edit_sweep_comment(
(
"I'm sorry, but it looks like our model has run out of context length. We're"
" trying to make this happen less, but one way to mitigate this is to"
" code smaller files. If this error persists, report it at"
" https://discord.gg/sweep."
),
-1,
)
log_error(
is_paying_user,
is_consumer_tier,
username,
issue_url,
"Context Length",
str(e) + "\n" + traceback.format_exc(),
priority=2,
)
posthog.capture(
username,
"failed",
properties={
"error": str(e),
"trace": traceback.format_exc(),
"reason": "Invalid request error / context length",
**metadata,
"duration": round(time() - on_ticket_start_time),
},
)
delete_branch = True
raise e
except AssistantRaisedException as e:
if ticket_progress is not None:
ticket_progress.status = TicketProgressStatus.ERROR
ticket_progress.error_message = f"Sweep raised an error with the following message: {e.message}. Feel free to add more details to the issue description for Sweep to better address it, or alternatively, reach out to Kevin or William for help at https://discord.gg/sweep."
ticket_progress.save()
logger.exception(e)
edit_sweep_comment(
f"Sweep raised an error with the following message:\n{blockquote(e.message)}",
-1,
)
log_error(
is_paying_user,
is_consumer_tier,
username,
issue_url,
"Workflow",
str(e) + "\n" + traceback.format_exc(),
priority=1,
)
raise e
except Exception as e:
ticket_progress.status = TicketProgressStatus.ERROR
ticket_progress.error_message = f"Internal server error: {str(e)}. Feel free to add more details to the issue description for Sweep to better address it, or alternatively, reach out to Kevin or William for help at https://discord.gg/sweep."
ticket_progress.save()
logger.error(traceback.format_exc())
logger.error(e)
# title and summary are defined elsewhere
if len(title + summary) < 60:
edit_sweep_comment(
(
"I'm sorry, but it looks like an error occurred due to"
f" a planning failure. The error message is {str(e)}. Feel free to add more details to the issue description"
" so Sweep can better address it. Alternatively, post on our community forum"
" for assistance: https://community.sweep.dev/"
),
-1,
)
else:
edit_sweep_comment(
(
"I'm sorry, but it looks like an error has occurred due to"
+ f" a planning failure. The error message is {str(e)}. Feel free to add more details to the issue description"
+ " so Sweep can better address it. Alternatively, reach out to Kevin or William for help at"
+ " https://discord.gg/sweep."
),
-1,
)
log_error(
is_paying_user,
is_consumer_tier,
username,
issue_url,
"Workflow",
str(e) + "\n" + traceback.format_exc(),
priority=1,
)
raise e
else:
try:
fire_and_forget_wrapper(remove_emoji)(content_to_delete="eyes")
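# match the other add_emoji call sites: (issue, comment_id, reaction_content=...)
fire_and_forget_wrapper(add_emoji)(
current_issue, comment_id, reaction_content="rocket"
)
except SystemExit:
raise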
except Exception as e:
logger.error(e)
if delete_branch:
try:
if pull_request.branch_name.startswith("sweep/"):
repo.get_git_ref(
f"heads/{pull_request.branch_name}"
).delete()
else:
raise Exception(
f"Branch name {pull_request.branch_name} does not start with sweep/"
)
except Exception as e:
logger.error(e)
logger.error(traceback.format_exc())
logger.info(f"Deleted branch {pull_request.branch_name}")
except Exception as e:
posthog.capture(
username,
"failed",
properties={
**metadata,
"error": str(e),
"trace": traceback.format_exc(),
"duration": round(time() - on_ticket_start_time),
},
)
raise e
posthog.capture(
username,
"success",
properties={**metadata, "duration": round(time() - on_ticket_start_time)},
)
logger.info("on_ticket success in " + str(round(time() - on_ticket_start_time)))
return {"success": True}
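The GitHub Actions wait in `on_ticket` above is a poll-with-budget loop: sleep, check, stop on a result or after a time limit. A minimal standalone sketch of that pattern follows. `poll_until_done` and `summarize_conclusions` are hypothetical names for illustration, not functions from the Sweep codebase. The sketch also treats "no runs registered yet" as pending, which is an assumption that differs from the `all([...])` check above.

```python
import time
from typing import Callable, Optional


def poll_until_done(
    check: Callable[[], Optional[str]],
    sleep_seconds: float = 15,
    max_wait_seconds: float = 60 * 60,
    sleep_fn: Callable[[float], None] = time.sleep,
) -> tuple[str, int]:
    """Call `check` every `sleep_seconds` until it reports a result or the budget runs out.

    `check` returns "success", "failure", or None while still pending.
    Returns (outcome, attempts); outcome is "timeout" if the budget is exhausted.
    """
    attempts = 0
    while attempts * sleep_seconds < max_wait_seconds:
        attempts += 1
        sleep_fn(sleep_seconds)
        result = check()
        if result is not None:
            return result, attempts
    return "timeout", attempts


def summarize_conclusions(conclusions: list[Optional[str]]) -> Optional[str]:
    """Collapse per-run conclusions into one status."""
    if not conclusions:
        return None  # nothing registered yet; all([]) alone would report success
    if any(c == "failure" for c in conclusions):
        return "failure"
    if all(c == "success" for c in conclusions):
        return "success"
    return None  # some runs are still in progress
```

Passing `sleep_fn` explicitly keeps the loop testable without real waiting, and the caller decides what to do on `"failure"` (for example, attempt a fix and poll again).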
def handle_sandbox_mode(
    title, repo_full_name, repo, ticket_progress, edit_sweep_comment
):
    logger.info("Running in sandbox mode")
    sweep_bot = SweepBot(repo=repo, ticket_progress=ticket_progress)
    logger.info("Getting file contents")
    file_name = title.split(":")[1].strip()
    file_contents = sweep_bot.get_contents(file_name).decoded_content.decode("utf-8")
    try:
        ext = file_name.split(".")[-1]
    except Exception:
        ext = ""
    # file_contents is passed to the sandbox unescaped; escaping is applied only for display below
    sha = repo.get_branch(repo.default_branch).commit.sha
    permalink = f"https://github.com/{repo_full_name}/blob/{sha}/{file_name}#L1-L{len(file_contents.splitlines())}"
    logger.info("Running sandbox")
    edit_sweep_comment(
        f"Running sandbox for {file_name}. Current Code:\n\n{permalink}",
        1,
    )
    updated_contents, sandbox_response = sweep_bot.check_sandbox(
        file_name, file_contents
    )
    logger.info("Sandbox finished")
    logs = (
        (
            "<br/>"
            + create_collapsible(
                "Sandbox logs",
                blockquote(
                    "\n\n".join(
                        [
                            create_collapsible(
                                f"<code>{output}</code> {i + 1}/{len(sandbox_response.outputs)} {format_sandbox_success(sandbox_response.success)}",
                                f"<pre>{clean_logs(output)}</pre>",
                                i == len(sandbox_response.outputs) - 1,
                            )
                            for i, output in enumerate(sandbox_response.outputs)
                            if len(sandbox_response.outputs) > 0
                        ]
                    )
                ),
                opened=True,
            )
        )
        if sandbox_response
        else ""
    )
    # Diff the raw contents first, then escape code fences for display,
    # so the escaping itself never shows up as a change.
    diff = generate_diff(file_contents, updated_contents).replace("```", "\\`\\`\\`")
    updated_contents = updated_contents.replace("```", "\\`\\`\\`")
    diff_display = (
        f"Updated Code:\n\n```{ext}\n{updated_contents}```\nDiff:\n```diff\n{diff}\n```"
        if diff
        else f"Sandbox made no changes to {file_name} (formatters were not configured or Sweep didn't make changes)."
    )
    edit_sweep_comment(
        f"{logs}\n{diff_display}",
        2,
    )
    edit_sweep_comment("N/A", 3)
    logger.info("Sandbox comments updated")
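`handle_sandbox_mode` escapes triple backticks before embedding file contents in a Markdown comment. One thing to keep in mind is that `str.replace` returns a new string rather than modifying its receiver. A small sketch of the escaping step is below; `escape_code_fences` and `FENCE` are hypothetical names used only for illustration.

```python
FENCE = "`" * 3  # built programmatically so this example nests safely in Markdown


def escape_code_fences(text: str) -> str:
    """Return a copy of `text` with each backtick of a fence run backslash-escaped."""
    # str.replace returns a new string; the original is left untouched,
    # so the result must be returned or assigned.
    return text.replace(FENCE, "\\`" * 3)


original = "before " + FENCE + " after"
escaped = escape_code_fences(original)
```

The escaped copy is only meant for rendering. Anything that needs the real file contents (running a sandbox, computing a diff) should keep using the unescaped string.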
def get_branch_diff_text(repo, branch, base_branch=None):
    base_branch = base_branch or SweepConfig.get_branch(repo)
    comparison = repo.compare(base_branch, branch)
    file_diffs = comparison.files
    pr_diffs = []
    for file in file_diffs:
        diff = file.patch
        if (
            file.status == "added"
            or file.status == "modified"
            or file.status == "removed"
        ):
            pr_diffs.append((file.filename, diff))
        else:
            logger.info(
                f"File status {file.status} not recognized"
            )  # TODO(sweep): We don't handle renamed files
    return "\n".join([f"{filename}\n{diff}" for filename, diff in pr_diffs])
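The TODO in `get_branch_diff_text` notes that renamed files are skipped. One possible way to cover them is sketched below, under assumptions: `ChangedFile` is a hypothetical stand-in for an entry of the comparison's file list (with `filename`, `status`, `patch` and `previous_filename` fields), and the `"->"` header and placeholder text are illustrative choices, not Sweep's format.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ChangedFile:
    """Hypothetical stand-in for one file entry in a branch comparison."""

    filename: str
    status: str
    patch: Optional[str] = None
    previous_filename: Optional[str] = None


def format_diff_text(files: list[ChangedFile]) -> str:
    """Join per-file diffs, including renames and files without a textual patch."""
    sections = []
    for file in files:
        if file.status == "renamed" and file.previous_filename:
            header = f"{file.previous_filename} -> {file.filename}"
        elif file.status in ("added", "modified", "removed"):
            header = file.filename
        else:
            continue  # other statuses are skipped in this sketch
        # A patch can be missing (e.g. binary files), so avoid printing "None".
        sections.append(f"{header}\n{file.patch or '(no textual diff available)'}")
    return "\n".join(sections)
```

For example, a pure rename with no content change would show up as `old.py -> new.py` followed by the placeholder line instead of being dropped.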
def get_payment_messages(chat_logger: ChatLogger):
if chat_logger:
is_paying_user = chat_logger.is_paying_user()
is_consumer_tier = chat_logger.is_consumer_tier()
use_faster_model = chat_logger.use_faster_model()
else:
is_paying_user = True
is_consumer_tier = False
use_faster_model = False
tracking_id = chat_logger.data["tracking_id"] if MONGODB_URI is not None else None
# Find the first comment made by the bot
tickets_allocated = 5
if is_consumer_tier:
tickets_allocated = 15
if is_paying_user:
tickets_allocated = 500
purchased_ticket_count = (
chat_logger.get_ticket_count(purchased=True) if chat_logger else 0
)
ticket_count = (
max(tickets_allocated - chat_logger.get_ticket_count(), 0)
+ purchased_ticket_count
if chat_logger
else 999
)
daily_ticket_count = (
(3 - chat_logger.get_ticket_count(use_date=True) if not use_faster_model else 0)
if chat_logger
else 999
)
model_name = "GPT-4"
single_payment_link = "https://buy.stripe.com/00g3fh7qF85q0AE14d"
pro_payment_link = "https://buy.stripe.com/00g5npeT71H2gzCfZ8"
daily_message = (
f" and {daily_ticket_count} for the day"
if not is_paying_user and not is_consumer_tier
else ""
)
user_type = "💎 <b>Sweep Pro</b>" if is_paying_user else "⚡ <b>Sweep Basic Tier</b>"
gpt_tickets_left_message = (
f"{ticket_count} GPT-4 tickets left for the month"
if not is_paying_user
else "unlimited GPT-4 tickets"
)
purchase_message = f"<br/><br/> For more GPT-4 tickets, visit <a href={single_payment_link}>our payment portal</a>. For a one week free trial, try <a href={pro_payment_link}>Sweep Pro</a> (unlimited GPT-4 tickets)."
payment_message = (
f"{user_type}: I used {model_name} to create this ticket. You have {gpt_tickets_left_message}{daily_message}. (tracking ID: <code>{tracking_id}</code>)"
+ (purchase_message if not is_paying_user else "")
)
payment_message_start = (
f"{user_type}: I'm using {model_name}. You have {gpt_tickets_left_message}{daily_message}. (tracking ID: <code>{tracking_id}</code>)"
+ (purchase_message if not is_paying_user else "")
)
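
The quota arithmetic in `get_payment_messages` can be read in isolation. A minimal sketch, assuming the tier sizes shown above (5, 15, 500); the names `tickets_allocated_for` and `tickets_remaining` are hypothetical and not part of the Sweep codebase:

```python
def tickets_allocated_for(is_paying_user: bool, is_consumer_tier: bool) -> int:
    """Monthly allocation by tier, mirroring the constants in the snippet above."""
    if is_paying_user:
        return 500
    if is_consumer_tier:
        return 15
    return 5


def tickets_remaining(allocated: int, used: int, purchased: int = 0) -> int:
    """Unused allocation (never negative) plus any separately purchased tickets."""
    return max(allocated - used, 0) + purchased
```

Clamping at zero before adding purchased tickets means overuse of the monthly allocation does not eat into tickets that were bought separately.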

import base64
import os
from dotenv import load_dotenv
from loguru import logger
logger.print = logger.info
load_dotenv(dotenv_path=".env", override=True, verbose=True)
os.environ["GITHUB_APP_PEM"] = os.environ.get("GITHUB_APP_PEM") or base64.b64decode(
os.environ.get("GITHUB_APP_PEM_BASE64", "")
).decode("utf-8")
if os.environ["GITHUB_APP_PEM"]:
os.environ["GITHUB_APP_ID"] = (
(os.environ.get("GITHUB_APP_ID") or os.environ.get("APP_ID"))
.replace("\\n", "\n")
.strip('"')
)
TEST_BOT_NAME = "sweep-nightly[bot]"
ENV = os.environ.get("ENV", "dev")
BOT_TOKEN_NAME = "bot-token"
# goes under Modal 'discord' secret name (optional, can leave env var blank)
DISCORD_WEBHOOK_URL = os.environ.get("DISCORD_WEBHOOK_URL")
DISCORD_MEDIUM_PRIORITY_URL = os.environ.get("DISCORD_MEDIUM_PRIORITY_URL")
DISCORD_LOW_PRIORITY_URL = os.environ.get("DISCORD_LOW_PRIORITY_URL")
DISCORD_FEEDBACK_WEBHOOK_URL = os.environ.get("DISCORD_FEEDBACK_WEBHOOK_URL")
SWEEP_HEALTH_URL = os.environ.get("SWEEP_HEALTH_URL")
DISCORD_STATUS_WEBHOOK_URL = os.environ.get("DISCORD_STATUS_WEBHOOK_URL")
# goes under Modal 'github' secret name
GITHUB_APP_ID = os.environ.get("GITHUB_APP_ID", os.environ.get("APP_ID"))
# deprecated: old logic transfer so upstream can use this
if GITHUB_APP_ID is None:
if ENV == "prod":
GITHUB_APP_ID = "307814"
elif ENV == "dev":
GITHUB_APP_ID = "324098"
elif ENV == "staging":
GITHUB_APP_ID = "327588"
GITHUB_BOT_USERNAME = os.environ.get("GITHUB_BOT_USERNAME")
# deprecated: left to support old logic
if not GITHUB_BOT_USERNAME:
if ENV == "prod":
GITHUB_BOT_USERNAME = "sweep-ai[bot]"
elif ENV == "dev":
GITHUB_BOT_USERNAME = "sweep-nightly[bot]"
elif ENV == "staging":
GITHUB_BOT_USERNAME = "sweep-canary[bot]"
elif not GITHUB_BOT_USERNAME.endswith("[bot]"):
GITHUB_BOT_USERNAME = GITHUB_BOT_USERNAME + "[bot]"
GITHUB_LABEL_NAME = os.environ.get("GITHUB_LABEL_NAME", "sweep")
GITHUB_LABEL_COLOR = os.environ.get("GITHUB_LABEL_COLOR", "9400D3")
GITHUB_LABEL_DESCRIPTION = os.environ.get(
"GITHUB_LABEL_DESCRIPTION", "Sweep your software chores"
)
GITHUB_APP_PEM = os.environ.get("GITHUB_APP_PEM")
GITHUB_APP_PEM = GITHUB_APP_PEM or os.environ.get("PRIVATE_KEY")
if GITHUB_APP_PEM is not None:
GITHUB_APP_PEM = GITHUB_APP_PEM.strip(' \n"') # Remove whitespace and quotes
GITHUB_APP_PEM = GITHUB_APP_PEM.replace("\\n", "\n")
GITHUB_CONFIG_BRANCH = os.environ.get("GITHUB_CONFIG_BRANCH", "sweep/add-sweep-config")
GITHUB_DEFAULT_CONFIG = os.environ.get(
"GITHUB_DEFAULT_CONFIG",
"""# Sweep AI turns bugs & feature requests into code changes (https://sweep.dev)
# For details on our config file, check out our docs at https://docs.sweep.dev/usage/config
# This setting contains a list of rules that Sweep will check for. If any of these rules are broken in a new commit, Sweep will create a pull request to fix the broken rule.
rules:
{additional_rules}
# This is the branch that Sweep will develop from and make pull requests to. Most people use 'main' or 'master' but some users also use 'dev' or 'staging'.
branch: 'main'
# By default Sweep will read the logs and outputs from your existing Github Actions. To disable this, set this to false.
gha_enabled: True
# This is the description of your project. It will be used by sweep when creating PRs. You can tell Sweep what's unique about your project, what frameworks you use, or anything else you want.
#
# Example:
#
# description: sweepai/sweep is a python project. The main api endpoints are in sweepai/api.py. Write code that adheres to PEP8.
description: ''
# This sets whether to create pull requests as drafts. If this is set to True, then all pull requests will be created as drafts and GitHub Actions will not be triggered.
draft: False
# This is a list of directories that Sweep will not be able to edit.
blocked_dirs: []
""",
)
MONGODB_URI = os.environ.get("MONGODB_URI", None)
IS_SELF_HOSTED = os.environ.get("IS_SELF_HOSTED", "true").lower() == "true"
REDIS_URL = os.environ.get("REDIS_URL")
if not REDIS_URL:
REDIS_URL = os.environ.get("redis_url", "redis://0.0.0.0:6379/0")
ORG_ID = os.environ.get("ORG_ID", None)
POSTHOG_API_KEY = os.environ.get(
"POSTHOG_API_KEY", "phc_CnzwIB0W548wN4wEGeRuxXqidOlEUH2AcyV2sKTku8n"
)
SUPPORT_COUNTRY = os.environ.get("GDRP_LIST", "").split(",")
WHITELISTED_REPOS = os.environ.get("WHITELISTED_REPOS", "").split(",")
BLACKLISTED_USERS = os.environ.get("BLACKLISTED_USERS", "").split(",")
# Default OpenAI
OPENAI_API_KEY = os.environ.get("OPENAI_API_KEY", None) # this may be none, and it will use azure
OPENAI_API_TYPE = os.environ.get("OPENAI_API_TYPE", "anthropic")
assert OPENAI_API_TYPE in ["anthropic", "azure", "openai"], "Invalid OPENAI_API_TYPE"
OPENAI_EMBEDDINGS_API_TYPE = os.environ.get("OPENAI_EMBEDDINGS_API_TYPE", "openai")
AZURE_API_KEY = os.environ.get("AZURE_API_KEY", None)
OPENAI_API_BASE = os.environ.get("OPENAI_API_BASE", None)
OPENAI_API_VERSION = os.environ.get("OPENAI_API_VERSION", None)
AZURE_OPENAI_DEPLOYMENT = os.environ.get("AZURE_OPENAI_DEPLOYMENT", None)
OPENAI_EMBEDDINGS_API_TYPE = os.environ.get("OPENAI_EMBEDDINGS_API_TYPE", "openai")
OPENAI_EMBEDDINGS_AZURE_ENDPOINT = os.environ.get(
"OPENAI_EMBEDDINGS_AZURE_ENDPOINT", None
)
OPENAI_EMBEDDINGS_AZURE_DEPLOYMENT = os.environ.get(
"OPENAI_EMBEDDINGS_AZURE_DEPLOYMENT", None
)
OPENAI_EMBEDDINGS_AZURE_API_VERSION = os.environ.get(
"OPENAI_EMBEDDINGS_AZURE_API_VERSION", None
)
OPENAI_API_ENGINE_GPT35 = os.environ.get("OPENAI_API_ENGINE_GPT35", None)
OPENAI_API_ENGINE_GPT4 = os.environ.get("OPENAI_API_ENGINE_GPT4", None)
MULTI_REGION_CONFIG = os.environ.get("MULTI_REGION_CONFIG", None)
if isinstance(MULTI_REGION_CONFIG, str):
MULTI_REGION_CONFIG = MULTI_REGION_CONFIG.strip("'").replace("\\n", "\n")
MULTI_REGION_CONFIG = [item.split(",") for item in MULTI_REGION_CONFIG.split("\n")]
WHITELISTED_USERS = os.environ.get("WHITELISTED_USERS", None)
if WHITELISTED_USERS:
WHITELISTED_USERS = WHITELISTED_USERS.split(",")
WHITELISTED_USERS.append(GITHUB_BOT_USERNAME)
DEFAULT_GPT4_MODEL = os.environ.get("DEFAULT_GPT4_MODEL", "gpt-4-0125-preview")
RESEND_API_KEY = os.environ.get("RESEND_API_KEY", None)
LOKI_URL = None
DEBUG = os.environ.get("DEBUG", "false").lower() == "true"
ENV = "prod" if GITHUB_BOT_USERNAME != TEST_BOT_NAME else "dev"
PROGRESS_BASE_URL = os.environ.get(
"PROGRESS_BASE_URL", "https://progress.sweep.dev"
).rstrip("/")
DISABLED_REPOS = os.environ.get("DISABLED_REPOS", "").split(",")
GHA_AUTOFIX_ENABLED: bool = os.environ.get("GHA_AUTOFIX_ENABLED", "false").lower() == "true"
MERGE_CONFLICT_ENABLED: bool = os.environ.get("MERGE_CONFLICT_ENABLED", "false").lower() == "true"
INSTALLATION_ID = os.environ.get("INSTALLATION_ID", None)
AWS_ACCESS_KEY=os.environ.get("AWS_ACCESS_KEY")
AWS_SECRET_KEY=os.environ.get("AWS_SECRET_KEY")
AWS_REGION=os.environ.get("AWS_REGION")
ANTHROPIC_AVAILABLE = AWS_ACCESS_KEY and AWS_SECRET_KEY and AWS_REGION
ANTHROPIC_API_KEY = os.environ.get("ANTHROPIC_API_KEY", None)
COHERE_API_KEY = os.environ.get("COHERE_API_KEY", None)
VOYAGE_API_KEY = os.environ.get("VOYAGE_API_KEY", None)
VOYAGE_API_AWS_ACCESS_KEY=os.environ.get("VOYAGE_API_AWS_ACCESS_KEY_ID")
VOYAGE_API_AWS_SECRET_KEY=os.environ.get("VOYAGE_API_AWS_SECRET_KEY")
VOYAGE_API_AWS_REGION=os.environ.get("VOYAGE_API_AWS_REGION")
VOYAGE_API_AWS_ENDPOINT_NAME=os.environ.get("VOYAGE_API_AWS_ENDPOINT_NAME", "voyage-code-2")
VOYAGE_API_USE_AWS = VOYAGE_API_AWS_ACCESS_KEY and VOYAGE_API_AWS_SECRET_KEY and VOYAGE_API_AWS_REGION
PAREA_API_KEY = os.environ.get("PAREA_API_KEY", None)
# TODO: we need to make this dynamic + backoff
BATCH_SIZE = int(
os.environ.get("BATCH_SIZE", 64 if VOYAGE_API_KEY else 256) # Voyage only allows 128 items per batch and 120000 tokens per batch
)
DEPLOYMENT_GHA_ENABLED = os.environ.get("DEPLOYMENT_GHA_ENABLED", "true").lower() == "true"
JIRA_USER_NAME = os.environ.get("JIRA_USER_NAME", None)
JIRA_API_TOKEN = os.environ.get("JIRA_API_TOKEN", None)
JIRA_URL = os.environ.get("JIRA_URL", None)
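
Several settings above are booleans read from environment variables, which always arrive as strings. A minimal sketch of one way to parse them consistently; `env_flag` is a hypothetical helper, not an existing function in `sweepai/config/server.py`:

```python
import os
from typing import Mapping, Optional


def env_flag(name: str, default: bool = False, env: Optional[Mapping[str, str]] = None) -> bool:
    """Interpret an environment variable as a boolean.

    Only the string "true" (case-insensitive, surrounding whitespace ignored)
    counts as True; an unset variable falls back to `default`.
    """
    source = os.environ if env is None else env
    raw = source.get(name)
    if raw is None:
        return default
    return raw.strip().lower() == "true"
```

Passing `env` explicitly keeps the helper testable without mutating the real process environment.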

import re
from sweepai.core.chat import ChatGPT
issue_validator_instructions_prompt = """# Instructions
A good issue for Sweep is actionable and it is clear how to resolve it. Here is what Sweep is currently capable of:
- Access to the entire codebase, with a high-quality search engine to find specific code snippets. Sweep is able to pinpoint the exact location of the code that needs to be changed based on vague descriptions.
- Making code changes to fix bugs or add features.
- Reading the GitHub Action logs to run tests and check the results.
- Ability to read images such as screenshots and charts.
Here are some examples of things Sweep does not currently support:
- Large-scale changes like migrations and large version upgrades.
- Tasks requiring accessing outside information like AWS consoles or retrieving API keys.
- Tasks requiring fixes outside of code changes
- Issues that have an existing fix or duplicate issues
Respond in the following format:
<thinking>
Provide an analysis of why it is a good or bad issue to pass on to Sweep. If it is a bad issue, suggest how the issue could be improved or clarified to make it more suitable for Sweep.
</thinking>
<pass>True or False</pass>
If False, respond to the user:
<response_to_user>
Response to user with justification on why the issue is unclear.
</response_to_user>"""
issue_validator_system_prompt = """You are an AI assistant tasked with determining whether an issue reported by customer support should be passed on to be resolved by Sweep, an AI-powered software engineer.
""" + issue_validator_instructions_prompt
issue_validator_user_prompt = """<issue>
{issue}
</issue>\n\n""" + issue_validator_instructions_prompt
def validate_issue(issue: str) -> str:
    """
    Somehow haiku and GPT-4 can't do this consistently.
    """
    chat_gpt = ChatGPT.from_system_message_string(
        prompt_string=issue_validator_system_prompt,
    )
    response = chat_gpt.chat_anthropic(
        issue_validator_user_prompt.format(
            issue=issue
        ),
        model="claude-3-opus-20240229",
        temperature=0.0,
    )
    if "<pass>False</pass>" in response:
        pattern = "<response_to_user>(.*)</response_to_user>"
        return re.search(pattern, response, re.DOTALL).group(1).strip()
    return ""
if __name__ == "__main__":
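
The tag extraction at the end of `validate_issue` can be sketched on its own. This version assumes the model may omit the `<response_to_user>` block and returns a fallback message in that case; `extract_rejection` and the fallback text are hypothetical, not part of Sweep:

```python
import re

RESPONSE_PATTERN = re.compile(r"<response_to_user>(.*?)</response_to_user>", re.DOTALL)


def extract_rejection(response: str, fallback: str = "The issue could not be validated.") -> str:
    """Return the user-facing rejection message, or "" if the issue passed."""
    if "<pass>False</pass>" not in response:
        return ""
    match = RESPONSE_PATTERN.search(response)
    # A missing block yields the fallback instead of an AttributeError on None.
    return match.group(1).strip() if match else fallback
```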


Step 2: ⌨️ Coding

  • sweepai/config/server.py
Modify sweepai/config/server.py with contents: Add a new environment variable for the user's Slack API key.

<original_code>
JIRA_USER_NAME = os.environ.get("JIRA_USER_NAME", None)
JIRA_API_TOKEN = os.environ.get("JIRA_API_TOKEN", None)
JIRA_URL = os.environ.get("JIRA_URL", None)

LICENSE_KEY = os.environ.get("LICENSE_KEY", None)
ALTERNATE_AWS = os.environ.get("ALTERNATE_AWS", "none").lower() == "true"
</original_code>

<new_code>
JIRA_USER_NAME = os.environ.get("JIRA_USER_NAME", None)
JIRA_API_TOKEN = os.environ.get("JIRA_API_TOKEN", None)
JIRA_URL = os.environ.get("JIRA_URL", None)

SLACK_API_KEY = os.environ.get("SLACK_API_KEY", None)

LICENSE_KEY = os.environ.get("LICENSE_KEY", None)
ALTERNATE_AWS = os.environ.get("ALTERNATE_AWS", "none").lower() == "true"
</new_code>
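
Because the key is user-supplied, it may arrive with stray quotes or whitespace, which the PEM handling earlier in `server.py` already anticipates. A minimal sketch, assuming a blank value should be treated as unset; `read_slack_api_key` is a hypothetical helper:

```python
import os
from typing import Mapping, Optional


def read_slack_api_key(env: Optional[Mapping[str, str]] = None) -> Optional[str]:
    """Return the Slack token from SLACK_API_KEY, or None if unset or blank."""
    source = os.environ if env is None else env
    raw = source.get("SLACK_API_KEY")
    if raw is None:
        return None
    cleaned = raw.strip(' \n"')  # same characters server.py strips from the PEM
    return cleaned or None
```

Returning `None` for a blank value lets callers skip the Slack lookup with a single truthiness check.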

  • sweepai/handlers/on_ticket.py
Modify sweepai/handlers/on_ticket.py with contents: Import the Slack SDK client and error class needed to fetch Slack message threads.

<original_code>
import re
import traceback
from typing import Any
import zipfile
from time import time

import markdown
import openai
import requests
import yaml
import yamllint.config as yamllint_config
from github import BadCredentialsException, Github, Repository
from github.Issue import Issue
from github.PullRequest import PullRequest as GithubPullRequest
from loguru import logger
from tabulate import tabulate
from tqdm import tqdm
from yamllint import linter
</original_code>

<new_code>
import re
import traceback
from typing import Any
import zipfile
from time import time

import markdown
import openai
import requests
import yaml
import yamllint.config as yamllint_config
from github import BadCredentialsException, Github, Repository
from github.Issue import Issue
from github.PullRequest import PullRequest as GithubPullRequest
from loguru import logger
from tabulate import tabulate
from tqdm import tqdm
from yamllint import linter

from slack_sdk import WebClient
from slack_sdk.errors import SlackApiError
</new_code>
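
`slack_sdk` is a new third-party dependency for this module. If it should stay optional, the import could be guarded so that `on_ticket.py` still loads without it. This is a sketch of that design choice, not what the proposed change does; `SLACK_SDK_AVAILABLE` is a hypothetical flag:

```python
try:
    from slack_sdk import WebClient
    from slack_sdk.errors import SlackApiError
    SLACK_SDK_AVAILABLE = True
except ImportError:
    # Fall back to placeholders so the names still exist when slack_sdk is absent.
    WebClient = None
    SlackApiError = Exception
    SLACK_SDK_AVAILABLE = False
```

Callers would then check `SLACK_SDK_AVAILABLE` before constructing a client.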

  • sweepai/handlers/on_ticket.py
Modify sweepai/handlers/on_ticket.py with contents: In the `on_ticket` function, check if the issue description contains a Slack link. If found, authenticate to Slack and fetch the message thread.

<original_code>
summary = summary or ""
summary = re.sub(
"<details (open)?>(\r)?\n<summary>Checklist</summary>.*",
"",
summary,
flags=re.DOTALL,
).strip()
summary = re.sub(
"---\s+Checklist:(\r)?\n(\r)?\n- \[[ X]\].*",
"",
summary,
flags=re.DOTALL,
).strip()
summary = re.sub(
"### Details\n\n_No response_", "", summary, flags=re.DOTALL
)
summary = re.sub("\n\n", "\n", summary, flags=re.DOTALL)
repo_name = repo_full_name
user_token, g = get_github_client(installation_id)
repo = g.get_repo(repo_full_name)
current_issue: Issue = repo.get_issue(number=issue_number)
assignee = current_issue.assignee.login if current_issue.assignee else None
if assignee is None:
assignee = current_issue.user.login
</original_code>

<new_code>
summary = summary or ""
summary = re.sub(
"<details (open)?>(\r)?\n<summary>Checklist</summary>.*",
"",
summary,
flags=re.DOTALL,
).strip()
summary = re.sub(
"---\s+Checklist:(\r)?\n(\r)?\n- \[[ X]\].*",
"",
summary,
flags=re.DOTALL,
).strip()
summary = re.sub(
"### Details\n\n_No response_", "", summary, flags=re.DOTALL
)
summary = re.sub("\n\n", "\n", summary, flags=re.DOTALL)

slack_link_match = re.search(
    r"https://[\w-]+\.slack\.com/archives/(?P<channel>\w+)/p(?P<ts>\d{7,})", summary
)
# SLACK_API_KEY also needs to be imported from sweepai.config.server.
if slack_link_match and SLACK_API_KEY:
    slack_client = WebClient(token=SLACK_API_KEY)
    # The permalink already encodes the channel ID and the message timestamp
    # ("p" followed by the digits of the timestamp), so no lookup call is needed.
    slack_channel_id = slack_link_match.group("channel")
    raw_ts = slack_link_match.group("ts")
    slack_message_ts = f"{raw_ts[:-6]}.{raw_ts[-6:]}"

    try:
        slack_thread_replies = slack_client.conversations_replies(
            channel=slack_channel_id,
            ts=slack_message_ts,
        )

        slack_thread_messages = [
            message.get("text", "") for message in slack_thread_replies["messages"]
        ]
        slack_thread_text = "\n".join(slack_thread_messages)

        summary += f"\n\nSlack Thread:\n{slack_thread_text}"

    except SlackApiError as e:
        logger.error(f"Error fetching Slack thread: {e}")

repo_name = repo_full_name
user_token, g = get_github_client(installation_id)
repo = g.get_repo(repo_full_name)
current_issue: Issue = repo.get_issue(number=issue_number)
assignee = current_issue.assignee.login if current_issue.assignee else None
if assignee is None:
    assignee = current_issue.user.login

</new_code>
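
Slack permalinks carry the channel ID and message timestamp in the URL itself: the path segment after `p` is the timestamp with its decimal point removed, and a link to a reply additionally carries a `thread_ts` query parameter pointing at the thread's parent. A minimal sketch of extracting both without an API call, assuming the usual `https://<workspace>.slack.com/archives/<channel>/p<digits>` shape; `parse_slack_permalink` and `format_thread` are hypothetical helpers:

```python
import re
from typing import Optional, Tuple
from urllib.parse import parse_qs, urlparse

PERMALINK_PATH = re.compile(r"^/archives/(?P<channel>\w+)/p(?P<ts>\d{7,})$")


def parse_slack_permalink(url: str) -> Optional[Tuple[str, str]]:
    """Return (channel_id, thread_ts) for a Slack message permalink, else None."""
    parsed = urlparse(url)
    if not (parsed.hostname or "").endswith(".slack.com"):
        return None
    match = PERMALINK_PATH.match(parsed.path)
    if match is None:
        return None
    raw_ts = match.group("ts")
    # The last six digits are the microsecond part of the timestamp.
    message_ts = f"{raw_ts[:-6]}.{raw_ts[-6:]}"
    # A link to a reply names its parent in thread_ts; prefer it so the whole thread is fetched.
    thread_ts = parse_qs(parsed.query).get("thread_ts", [message_ts])[0]
    return match.group("channel"), thread_ts


def format_thread(messages: list) -> str:
    """Join message dicts (as found in a conversations.replies response) into plain text."""
    return "\n".join(
        f"{m.get('user', 'unknown')}: {m.get('text', '')}" for m in messages
    )
```

The returned pair maps onto the `channel` and `ts` arguments of `conversations_replies`.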


Step 3: 🔁 Code Review

Working on it...


💡 To recreate the pull request edit the issue title or description.
Something wrong? Let us know.

This is an automated message generated by Sweep AI.

Contributor

sweep-nightly bot commented May 3, 2024

Sweeping

✨ Track Sweep's progress on our progress dashboard!


50%

💎 Sweep Pro: I'm using GPT-4. You have unlimited GPT-4 tickets. (tracking ID: 5eb258b35b)

Step 1: 🔎 Searching

I found the following snippets in your repository. I will now analyze these snippets and come up with a plan.

Here are the code snippets I think are relevant, in decreasing order of relevance. If a file is missing here, you can mention its path in the ticket description.

"""
on_ticket is the main function that is called when a new issue is created.
It is only called by the webhook handler in sweepai/api.py.
"""
import difflib
import io
import os
import re
import traceback
from typing import Any
import zipfile
from time import time
import markdown
import openai
import requests
import yaml
import yamllint.config as yamllint_config
from github import BadCredentialsException, Github, Repository
from github.Issue import Issue
from github.PullRequest import PullRequest as GithubPullRequest
from loguru import logger
from tabulate import tabulate
from tqdm import tqdm
from yamllint import linter
from sweepai.core.sweep_bot import GHA_PROMPT
from sweepai.agents.pr_description_bot import PRDescriptionBot
from sweepai.agents.image_description_bot import ImageDescriptionBot
from sweepai.config.client import (
RESET_FILE,
RESTART_SWEEP_BUTTON,
REVERT_CHANGED_FILES_TITLE,
SweepConfig,
get_documentation_dict,
get_gha_enabled,
)
from sweepai.config.server import (
DEPLOYMENT_GHA_ENABLED,
ENV,
GITHUB_LABEL_NAME,
IS_SELF_HOSTED,
MONGODB_URI,
PROGRESS_BASE_URL,
)
from sweepai.core.entities import (
AssistantRaisedException,
FileChangeRequest,
MaxTokensExceeded,
NoFilesException,
PullRequest,
SandboxResponse,
)
from sweepai.core.entities import create_error_logs as entities_create_error_logs
from sweepai.core.pr_reader import PRReader
from sweepai.core.sweep_bot import SweepBot, get_files_to_change, get_files_to_change_for_gha, validate_file_change_requests
from sweepai.handlers.create_pr import (
create_config_pr,
create_pr_changes,
safe_delete_sweep_branch,
)
from sweepai.handlers.on_check_suite import clean_gh_logs
from sweepai.utils.image_utils import get_image_contents_from_urls, get_image_urls_from_issue
from sweepai.utils.issue_validator import validate_issue
from sweepai.utils.validate_license import validate_license
from sweepai.utils.buttons import Button, ButtonList, create_action_buttons
from sweepai.utils.chat_logger import ChatLogger
from sweepai.utils.diff import generate_diff
from sweepai.utils.event_logger import posthog
from sweepai.utils.github_utils import (
CURRENT_USERNAME,
ClonedRepo,
convert_pr_draft_field,
get_github_client,
get_token,
sanitize_string_for_github,
)
from sweepai.utils.progress import (
AssistantConversation,
PaymentContext,
TicketContext,
TicketProgress,
TicketProgressStatus,
)
from sweepai.utils.prompt_constructor import HumanMessagePrompt
from sweepai.utils.str_utils import (
BOT_SUFFIX,
FASTER_MODEL_MESSAGE,
UPDATES_MESSAGE,
blockquote,
bot_suffix,
checkbox_template,
clean_logs,
collapsible_template,
create_checkbox,
create_collapsible,
discord_suffix,
format_sandbox_success,
get_hash,
sep,
stars_suffix,
strip_sweep,
to_branch_name,
)
from sweepai.utils.ticket_utils import (
center,
fetch_relevant_files,
fire_and_forget_wrapper,
log_error,
prep_snippets,
)
from sweepai.utils.user_settings import UserSettings
# from sandbox.sandbox_utils import Sandbox
sweeping_gif = """<a href="https://github.com/sweepai/sweep"><img class="swing" src="https://raw.githubusercontent.com/sweepai/sweep/main/.assets/sweeping.gif" width="100" style="width:50px; margin-bottom:10px" alt="Sweeping"></a>"""
custom_config = """
extends: relaxed
rules:
line-length: disable
indentation: disable
"""
INSTRUCTIONS_FOR_REVIEW = """\
### 💡 To get Sweep to edit this pull request, you can:
* Comment below, and Sweep can edit the entire PR
* Comment on a file, Sweep will only modify the commented file
* Edit the original issue to get Sweep to recreate the PR from scratch"""
email_template = """Hey {name},
<br/><br/>
🚀 I just finished creating a pull request for your issue ({repo_full_name}#{issue_number}) at <a href="{pr_url}">{repo_full_name}#{pr_number}</a>!
<br/><br/>
You can view how I created this pull request <a href="{progress_url}">here</a>.
<h2>Summary</h2>
<blockquote>
{summary}
</blockquote>
<h2>Files Changed</h2>
<ul>
{files_changed}
</ul>
{sweeping_gif}
<br/>
Cheers,
<br/>
Sweep
<br/>"""
FAILING_GITHUB_ACTION_PROMPT = """\
The following Github Actions failed on a previous attempt at fixing this issue.
Propose a fix to the failing github actions. You must edit the source code, not the github action itself.
{github_action_log}
"""
# Add :eyes: emoji to ticket
def add_emoji(issue: Issue, comment_id: int = None, reaction_content="eyes"):
    item_to_react_to = issue.get_comment(comment_id) if comment_id else issue
    item_to_react_to.create_reaction(reaction_content)
# If SWEEP_BOT reacted to item_to_react_to with "rocket", then remove it.
def remove_emoji(issue: Issue, comment_id: int = None, content_to_delete="eyes"):
    item_to_react_to = issue.get_comment(comment_id) if comment_id else issue
    reactions = item_to_react_to.get_reactions()
    for reaction in reactions:
        if (
            reaction.content == content_to_delete
            and reaction.user.login == CURRENT_USERNAME
        ):
            item_to_react_to.delete_reaction(reaction.id)
def create_error_logs(
commit_url_display: str,
sandbox_response: SandboxResponse,
status: str = "✓",
):
return (
(
"<br/>"
+ create_collapsible(
f"Sandbox logs for {commit_url_display} {status}",
blockquote(
"\n\n".join(
[
create_collapsible(
f"<code>{output}</code> {i + 1}/{len(sandbox_response.outputs)} {format_sandbox_success(sandbox_response.success)}",
f"<pre>{clean_logs(output)}</pre>",
i == len(sandbox_response.outputs) - 1,
)
for i, output in enumerate(sandbox_response.outputs)
if len(sandbox_response.outputs) > 0
]
)
),
opened=True,
)
)
if sandbox_response
else ""
)
# takes in a list of workflow runs and returns a list of messages containing the logs of the failing runs
def get_failing_gha_logs(runs, installation_id) -> str:
token = get_token(installation_id)
all_logs = ""
for run in runs:
# jobs_url
jobs_url = run.jobs_url
jobs_response = requests.get(
jobs_url,
headers={
"Accept": "application/vnd.github+json",
"Authorization": f"Bearer {token}",
"X-GitHub-Api-Version": "2022-11-28",
},
)
if jobs_response.status_code == 200:
failed_jobs = []
jobs = jobs_response.json()["jobs"]
for job in jobs:
if job["conclusion"] == "failure":
failed_jobs.append(job)
failed_jobs_name_list = []
for job in failed_jobs:
# add failed steps
for step in job["steps"]:
if step["conclusion"] == "failure":
failed_jobs_name_list.append(
f"{job['name']}/{step['number']}_{step['name']}"
)
else:
logger.error(
"Failed to get jobs for failing github actions, possibly a credentials issue"
)
return all_logs
# make sure jobs is valid
if jobs_response.json()['total_count'] == 0:
logger.error(f"no jobs for this run: {run}, continuing...")
continue
# logs url
logs_url = run.logs_url
logs_response = requests.get(
logs_url,
headers={
"Accept": "application/vnd.github+json",
"Authorization": f"Bearer {token}",
"X-GitHub-Api-Version": "2022-11-28",
},
allow_redirects=True,
)
# Check if the request was successful
if logs_response.status_code == 200:
zip_data = io.BytesIO(logs_response.content)
zip_file = zipfile.ZipFile(zip_data, "r")
zip_file_names = zip_file.namelist()
for file in failed_jobs_name_list:
if f"{file}.txt" in zip_file_names:
logs = zip_file.read(f"{file}.txt").decode("utf-8")
logs_prompt = clean_gh_logs(logs)
all_logs += logs_prompt + "\n"
else:
logger.error(
"Failed to get logs for failing github actions, likely a credentials issue"
)
return all_logs
def delete_old_prs(repo: Repository, issue_number: int):
    logger.info("Deleting old PRs...")
    prs = repo.get_pulls(
        state="open",
        sort="created",
        direction="desc",
        base=SweepConfig.get_branch(repo),
    )
    for pr in tqdm(prs.get_page(0)):
        # # Check if this issue is mentioned in the PR, and pr is owned by bot
        # # This is done in create_pr, (pr_description = ...)
        if pr.user.login == CURRENT_USERNAME and f"Fixes #{issue_number}.\n" in pr.body:
            safe_delete_sweep_branch(pr, repo)
            break
def construct_sweep_bot(
repo: Repository,
repo_name: str,
issue_url: str,
repo_description: str,
title: str,
message_summary: str,
cloned_repo: ClonedRepo,
ticket_progress: TicketProgress,
chat_logger: ChatLogger,
snippets: Any = None,
tree: Any = None,
comments: Any = None,
) -> SweepBot:
human_message = HumanMessagePrompt(
repo_name=repo_name,
issue_url=issue_url,
repo_description=repo_description.strip(),
title=title,
summary=message_summary,
snippets=snippets,
tree=tree,
)
sweep_bot = SweepBot.from_system_message_content(
human_message=human_message,
repo=repo,
is_reply=bool(comments),
chat_logger=chat_logger,
cloned_repo=cloned_repo,
ticket_progress=ticket_progress,
)
return sweep_bot
def get_comment_header(
index: int,
g: Github,
repo_full_name: str,
user_settings: UserSettings,
progress_headers: list[None | str],
tracking_id: str | None,
payment_message_start: str,
user_settings_message: str,
errored: bool = False,
pr_message: str = "",
done: bool = False,
initial_sandbox_response: int | SandboxResponse = -1,
initial_sandbox_response_file=None,
config_pr_url: str | None = None,
):
config_pr_message = (
"\n"
+ f"<div align='center'>Install Sweep Configs: <a href='{config_pr_url}'>Pull Request</a></div>"
if config_pr_url is not None
else ""
)
actions_message = create_action_buttons(
[
RESTART_SWEEP_BUTTON,
]
)
sandbox_execution_message = "\n\n## GitHub Actions failed\n\nThe sandbox appears to be unavailable or down.\n\n"
if initial_sandbox_response == -1:
sandbox_execution_message = ""
elif initial_sandbox_response is not None:
repo = g.get_repo(repo_full_name)
commit_hash = repo.get_commits()[0].sha
success = initial_sandbox_response.outputs and initial_sandbox_response.success
status = "✓" if success else "X"
sandbox_execution_message = (
"\n\n## GitHub Actions "
+ status
+ "\n\nHere are the GitHub Actions logs prior to making any changes:\n\n"
)
sandbox_execution_message += entities_create_error_logs(
f'<a href="https://github.com/{repo_full_name}/commit/{commit_hash}"><code>{commit_hash[:7]}</code></a>',
initial_sandbox_response,
initial_sandbox_response_file,
)
if success:
sandbox_execution_message += f"\n\nSandbox passed on the latest `{repo.default_branch}`, so sandbox checks will be enabled for this issue."
else:
sandbox_execution_message += "\n\nSandbox failed, so all sandbox checks will be disabled for this issue."
if index < 0:
index = 0
if index == 4:
return (
pr_message
+ config_pr_message
+ f"\n\n---\n{user_settings.get_message(completed=True)}"
+ f"\n\n---\n{actions_message}"
+ sandbox_execution_message
)
total = len(progress_headers)
index += 1 if done else 0
index *= 100 / total
index = int(index)
index = min(100, index)
if errored:
pbar = f"\n\n<img src='https://progress-bar.dev/{index}/?&title=Errored&width=600' alt='{index}%' />"
return (
f"{center(sweeping_gif)}<br/>{center(pbar)}\n\n"
+ f"\n\n---\n{actions_message}"
+ sandbox_execution_message
)
pbar = f"\n\n<img src='https://progress-bar.dev/{index}/?&title=Progress&width=600' alt='{index}%' />"
return (
f"{center(sweeping_gif)}"
+ (
center(
f'\n\n<h2>✨ Track Sweep\'s progress on our <a href="{PROGRESS_BASE_URL}/issues/{tracking_id}">progress dashboard</a>!</h2>'
)
if MONGODB_URI is not None
else ""
)
+ f"<br/>{center(pbar)}"
+ ("\n" + stars_suffix if index != -1 else "")
+ "\n"
+ center(payment_message_start)
+ f"\n\n---\n{user_settings_message}"
+ config_pr_message
+ f"\n\n---\n{actions_message}"
+ sandbox_execution_message
)
def on_ticket(
title: str,
summary: str,
issue_number: int,
issue_url: str, # purely for logging purposes
username: str,
repo_full_name: str,
repo_description: str,
installation_id: int,
comment_id: int = None,
edited: bool = False,
tracking_id: str | None = None,
):
if not os.environ.get("CLI"):
assert validate_license(), "License key is invalid or expired. Please contact us at team@sweep.dev to upgrade to an enterprise license."
with logger.contextualize(
tracking_id=tracking_id,
):
if tracking_id is None:
tracking_id = get_hash()
on_ticket_start_time = time()
logger.info(f"Starting on_ticket with title {title} and summary {summary}")
(
title,
slow_mode,
do_map,
subissues_mode,
sandbox_mode,
fast_mode,
lint_mode,
) = strip_sweep(title)
# fetch images from body of issue
image_urls = get_image_urls_from_issue(issue_number, repo_full_name, installation_id)
image_contents = get_image_contents_from_urls(image_urls)
summary = summary or ""
summary = re.sub(
"<details (open)?>(\r)?\n<summary>Checklist</summary>.*",
"",
summary,
flags=re.DOTALL,
).strip()
summary = re.sub(
"---\s+Checklist:(\r)?\n(\r)?\n- \[[ X]\].*",
"",
summary,
flags=re.DOTALL,
).strip()
summary = re.sub(
"### Details\n\n_No response_", "", summary, flags=re.DOTALL
)
summary = re.sub("\n\n", "\n", summary, flags=re.DOTALL)
repo_name = repo_full_name
user_token, g = get_github_client(installation_id)
repo = g.get_repo(repo_full_name)
current_issue: Issue = repo.get_issue(number=issue_number)
assignee = current_issue.assignee.login if current_issue.assignee else None
if assignee is None:
assignee = current_issue.user.login
ticket_progress = TicketProgress(
tracking_id=tracking_id,
username=username,
context=TicketContext(
title=title,
description=summary,
repo_full_name=repo_full_name,
issue_number=issue_number,
is_public=repo.private is False,
start_time=int(time()),
),
)
branch_match = re.search(
r"([Bb]ranch:) *(?P<branch_name>.+?)(\s|$)", summary
)
overrided_branch_name = None
if branch_match and "branch_name" in branch_match.groupdict():
overrided_branch_name = (
branch_match.groupdict()["branch_name"].strip().strip("`\"'")
)
# TODO: this code might be finicky, might have missed edge cases
if overrided_branch_name.startswith("https://github.com/"):
overrided_branch_name = overrided_branch_name.split("?")[0].split(
"tree/"
)[-1]
SweepConfig.get_branch(repo, overrided_branch_name)
chat_logger = (
ChatLogger(
{
"repo_name": repo_name,
"title": title,
"summary": summary,
"issue_number": issue_number,
"issue_url": issue_url,
"username": (
username if not username.startswith("sweep") else assignee
),
"repo_full_name": repo_full_name,
"repo_description": repo_description,
"installation_id": installation_id,
"type": "ticket",
"mode": ENV,
"comment_id": comment_id,
"edited": edited,
"tracking_id": tracking_id,
},
active=True,
)
if MONGODB_URI
else None
)
if chat_logger and not IS_SELF_HOSTED:
is_paying_user = chat_logger.is_paying_user()
is_consumer_tier = chat_logger.is_consumer_tier()
use_faster_model = chat_logger.use_faster_model()
else:
is_paying_user = True
is_consumer_tier = False
use_faster_model = False
if use_faster_model:
raise Exception(FASTER_MODEL_MESSAGE)
if fast_mode:
use_faster_model = True
if not comment_id and not edited and chat_logger and not sandbox_mode:
fire_and_forget_wrapper(chat_logger.add_successful_ticket)(
gpt3=use_faster_model
)
organization, repo_name = repo_full_name.split("/")
metadata = {
"issue_url": issue_url,
"repo_full_name": repo_full_name,
"organization": organization,
"repo_name": repo_name,
"repo_description": repo_description,
"username": username,
"comment_id": comment_id,
"title": title,
"installation_id": installation_id,
"function": "on_ticket",
"edited": edited,
"model": "gpt-3.5" if use_faster_model else "gpt-4",
"tier": "pro" if is_paying_user else "free",
"mode": ENV,
"slow_mode": slow_mode,
"do_map": do_map,
"subissues_mode": subissues_mode,
"sandbox_mode": sandbox_mode,
"fast_mode": fast_mode,
"is_self_hosted": IS_SELF_HOSTED,
"tracking_id": tracking_id,
}
fire_and_forget_wrapper(posthog.capture)(
username, "started", properties=metadata
)
try:
if current_issue.state == "closed":
fire_and_forget_wrapper(posthog.capture)(
username,
"issue_closed",
properties={
**metadata,
"duration": round(time() - on_ticket_start_time),
},
)
return {"success": False, "reason": "Issue is closed"}
fire_and_forget_wrapper(add_emoji)(current_issue, comment_id)
fire_and_forget_wrapper(remove_emoji)(
current_issue, comment_id, content_to_delete="rocket"
)
fire_and_forget_wrapper(remove_emoji)(
current_issue, comment_id, content_to_delete="confused"
)
fire_and_forget_wrapper(current_issue.edit)(body=summary)
replies_text = ""
summary = summary if summary else ""
fire_and_forget_wrapper(delete_old_prs)(repo, issue_number)
if not sandbox_mode:
progress_headers = [
None,
"Step 1: 🔎 Searching",
"Step 2: ⌨️ Coding",
"Step 3: 🔁 Code Review",
]
else:
progress_headers = [
None,
"📖 Reading File",
"🛠️ Executing Sandbox",
]
issue_comment = None
payment_message, payment_message_start = get_payment_messages(
chat_logger
)
ticket_progress.context.payment_context = PaymentContext(
use_faster_model=use_faster_model,
pro_user=is_paying_user,
daily_tickets_used=(
chat_logger.get_ticket_count(use_date=True)
if chat_logger
else 0
),
monthly_tickets_used=(
chat_logger.get_ticket_count() if chat_logger else 0
),
)
ticket_progress.save()
config_pr_url = None
user_settings = UserSettings.from_username(username=username)
user_settings_message = user_settings.get_message()
cloned_repo = ClonedRepo(
repo_full_name,
installation_id=installation_id,
token=user_token,
repo=repo,
branch=overrided_branch_name,
)
# check that repo's directory is non-empty
if not os.listdir(cloned_repo.cached_dir):
logger.info("Empty repo")
first_comment = (
"Sweep is currently not supported on empty repositories. Please add some"
f" code to your repository and try again.\n{sep}##"
f" {progress_headers[1]}\n{bot_suffix}{discord_suffix}"
)
if issue_comment is None:
issue_comment = current_issue.create_comment(
first_comment + BOT_SUFFIX
)
else:
issue_comment.edit(first_comment + BOT_SUFFIX)
fire_and_forget_wrapper(add_emoji)(
current_issue, comment_id, reaction_content="confused"
)
fire_and_forget_wrapper(remove_emoji)(
current_issue, comment_id, content_to_delete="eyes"
)
return {"success": False}
indexing_message = (
"I'm searching for relevant snippets in your repository. If this is your first"
" time using Sweep, I'm indexing your repository. You can monitor the progress using the progress dashboard"
)
first_comment = (
f"{get_comment_header(0, g, repo_full_name, user_settings, progress_headers, tracking_id, payment_message_start, user_settings_message)}\n{sep}I am currently looking into this ticket! I"
" will update the progress of the ticket in this comment. I am currently"
f" searching through your code, looking for relevant snippets.\n{sep}##"
f" {progress_headers[1]}\n{indexing_message}{bot_suffix}{discord_suffix}"
)
# Find Sweep's previous comment
comments = []
for comment in current_issue.get_comments():
comments.append(comment)
if comment.user.login == CURRENT_USERNAME:
issue_comment = comment
break
if issue_comment is None:
issue_comment = current_issue.create_comment(first_comment)
else:
fire_and_forget_wrapper(issue_comment.edit)(first_comment)
old_edit = issue_comment.edit
issue_comment.edit = lambda msg: old_edit(msg + BOT_SUFFIX)
past_messages = {}
current_index = 0
table = None
initial_sandbox_response = -1
initial_sandbox_response_file = None
def refresh_token():
user_token, g = get_github_client(installation_id)
repo = g.get_repo(repo_full_name)
return user_token, g, repo
def edit_sweep_comment(
message: str,
index: int,
pr_message="",
done=False,
add_bonus_message=True,
):
nonlocal current_index, user_token, g, repo, issue_comment, initial_sandbox_response, initial_sandbox_response_file
message = sanitize_string_for_github(message)
if pr_message:
pr_message = sanitize_string_for_github(pr_message)
# -1 = error, -2 = retry
# Only update the progress bar if the issue generation errors.
errored = index == -1
if index >= 0:
past_messages[index] = message
current_index = index
agg_message = None
# Include progress history
# index = -2 is reserved for retries (see the index legend above)
for i in range(
current_index + 2
): # go to next header (for Working on it... text)
if i == 0 or i >= len(progress_headers):
continue # skip None header
header = progress_headers[i]
if header is not None:
header = "## " + header + "\n"
else:
header = "No header\n"
msg = header + (past_messages.get(i) or "Working on it...")
if agg_message is None:
agg_message = msg
else:
agg_message = agg_message + f"\n{sep}" + msg
suffix = bot_suffix + discord_suffix
if errored:
agg_message = (
"## ❌ Unable to Complete PR"
+ "\n"
+ message
+ (
"\n\nFor bonus GPT-4 tickets, please report this bug on"
f" **[Discord](https://discord.gg/invite/sweep)** (tracking ID: `{tracking_id}`)."
if add_bonus_message
else ""
)
)
if table is not None:
agg_message = (
agg_message
+ f"\n{sep}Please look at the generated plan. If something looks"
f" wrong, please add more details to your issue.\n\n{table}"
)
suffix = bot_suffix # don't include discord suffix for error messages
# Update the issue comment
msg = f"{get_comment_header(current_index, g, repo_full_name, user_settings, progress_headers, tracking_id, payment_message_start, user_settings_message, errored=errored, pr_message=pr_message, done=done, initial_sandbox_response=initial_sandbox_response, initial_sandbox_response_file=initial_sandbox_response_file, config_pr_url=config_pr_url)}\n{sep}{agg_message}{suffix}"
try:
issue_comment.edit(msg)
except BadCredentialsException:
logger.error(
f"Bad credentials, refreshing token (tracking ID: `{tracking_id}`)"
)
user_token, g = get_github_client(installation_id)
repo = g.get_repo(repo_full_name)
issue_comment = None
for comment in comments:
if comment.user.login == CURRENT_USERNAME:
issue_comment = comment
current_issue = repo.get_issue(number=issue_number)
if issue_comment is None:
issue_comment = current_issue.create_comment(msg)
else:
issue_comment = [
comment
for comment in current_issue.get_comments()
if comment.user.login == CURRENT_USERNAME
][0]
issue_comment.edit(msg)
if use_faster_model:
edit_sweep_comment(
FASTER_MODEL_MESSAGE, -1, add_bonus_message=False
)
posthog.capture(
username,
"ran_out_of_tickets",
properties={
**metadata,
"duration": round(time() - on_ticket_start_time),
},
)
fire_and_forget_wrapper(add_emoji)(
current_issue, comment_id, reaction_content="confused"
)
fire_and_forget_wrapper(remove_emoji)(
current_issue, comment_id, content_to_delete="eyes"
)
return {
"success": False,
"error_message": "We deprecated supporting GPT 3.5.",
}
error_message = validate_issue(title + summary)
if error_message:
logger.warning(f"Validation error: {error_message}")
edit_sweep_comment(
(
f"The issue was rejected with the following response:\n\n{blockquote(error_message)}"
),
-1,
)
fire_and_forget_wrapper(add_emoji)(
current_issue, comment_id, reaction_content="confused"
)
fire_and_forget_wrapper(remove_emoji)(
current_issue, comment_id, content_to_delete="eyes"
)
posthog.capture(
username,
"invalid_issue",
properties={
**metadata,
"duration": round(time() - on_ticket_start_time),
},
)
return {"success": True}
prs_extracted = PRReader.extract_prs(repo, summary)
message_summary = summary
if prs_extracted:
message_summary += "\n\n" + prs_extracted
edit_sweep_comment(
create_collapsible(
"I found that you mentioned the following Pull Requests that might be important:",
blockquote(
prs_extracted,
),
),
1,
)
try:
# search/context manager
logger.info("Searching for relevant snippets...")
if image_contents: # doing it here to avoid editing the original issue
message_summary += ImageDescriptionBot().describe_images(text=title + message_summary, images=image_contents)
snippets, tree, _, repo_context_manager = fetch_relevant_files(
cloned_repo,
title,
message_summary,
replies_text,
username,
metadata,
on_ticket_start_time,
tracking_id,
is_paying_user,
is_consumer_tier,
issue_url,
chat_logger,
ticket_progress,
images=image_contents
)
cloned_repo = repo_context_manager.cloned_repo
except Exception as e:
edit_sweep_comment(
(
"It looks like an issue has occurred around fetching the files."
f" The exception was {str(e)}. If this error persists"
f" contact team@sweep.dev.\n\n> @{username}, editing this issue description to include more details will automatically make me relaunch. Please join our Discord server for support (tracking_id={tracking_id})"
),
-1,
)
raise Exception("Failed to fetch files") from e
# refresh_token() already re-creates the client and re-fetches the repo
user_token, g, repo = refresh_token()
cloned_repo.token = user_token
ticket_progress.search_progress.indexing_progress = (
ticket_progress.search_progress.indexing_total
)
ticket_progress.status = TicketProgressStatus.PLANNING
ticket_progress.save()
# Fetch git commit history
if not repo_description:
repo_description = "No description provided."
message_summary += replies_text
get_documentation_dict(repo)
docs_results = ""
sweep_bot = construct_sweep_bot(
repo=repo,
repo_name=repo_name,
issue_url=issue_url,
repo_description=repo_description,
title=title,
message_summary=message_summary,
cloned_repo=cloned_repo,
ticket_progress=ticket_progress,
chat_logger=chat_logger,
snippets=snippets,
tree=tree,
comments=comments,
)
# Check repository for sweep.yaml file.
sweep_yml_exists = False
sweep_yml_failed = False
for content_file in repo.get_contents(""):
if content_file.name == "sweep.yaml":
sweep_yml_exists = True
# Check if YAML is valid
yaml_content = content_file.decoded_content.decode("utf-8")
sweep_yaml_dict = {}
try:
sweep_yaml_dict = yaml.safe_load(yaml_content)
except Exception:
logger.error(f"Failed to load YAML file: {yaml_content}")
# yaml.safe_load returns None for an empty file, so test truthiness instead of len()
if sweep_yaml_dict:
break
linter_config = yamllint_config.YamlLintConfig(custom_config)
problems = list(linter.run(yaml_content, linter_config))
if problems:
errors = [
f"Line {problem.line}: {problem.desc} (rule: {problem.rule})"
for problem in problems
]
error_message = "\n".join(errors)
markdown_error_message = f"**There is something wrong with your [sweep.yaml](https://github.com/{repo_full_name}/blob/main/sweep.yaml):**\n```\n{error_message}\n```"
sweep_yml_failed = True
logger.error(markdown_error_message)
edit_sweep_comment(markdown_error_message, -1)
else:
logger.info("The YAML file is valid. No errors found.")
break
# If sweep.yaml does not exist, then create a new PR that simply creates the sweep.yaml file.
if not sweep_yml_exists:
try:
logger.info("Creating sweep.yaml file...")
config_pr = create_config_pr(sweep_bot, cloned_repo=cloned_repo)
config_pr_url = config_pr.html_url
edit_sweep_comment(message="", index=-2)
except Exception as e:
logger.error(
f"Failed to create new branch for sweep.yaml file.\n{e}\n{traceback.format_exc()}"
)
else:
logger.info("sweep.yaml file already exists.")
try:
# ANALYZE SNIPPETS
newline = "\n"
edit_sweep_comment(
"I found the following snippets in your repository. I will now analyze"
" these snippets and come up with a plan."
+ "\n\n"
+ create_collapsible(
"Some code snippets I think are relevant in decreasing order of relevance (click to expand). If some file is missing from here, you can mention the path in the ticket description.",
"\n".join(
[
f"https://github.com/{organization}/{repo_name}/blob/{repo.get_commits()[0].sha}/{snippet.file_path}#L{max(snippet.start, 1)}-L{min(snippet.end, snippet.content.count(newline) - 1)}\n"
for snippet in snippets
]
),
)
+ (
create_collapsible(
"I also found that you mentioned the following Pull Requests that may be helpful:",
blockquote(prs_extracted),
)
if prs_extracted
else ""
)
+ (f"\n\n{docs_results}\n\n" if docs_results else ""),
1,
)
logger.info("Fetching files to modify/create...")
file_change_requests, plan = get_files_to_change(
relevant_snippets=repo_context_manager.current_top_snippets,
read_only_snippets=repo_context_manager.read_only_snippets,
problem_statement=f"{title}\n\n{message_summary}",
repo_name=repo_full_name,
cloned_repo=cloned_repo,
images=image_contents
)
validate_file_change_requests(file_change_requests, cloned_repo)
ticket_progress.planning_progress.file_change_requests = (
file_change_requests
)
ticket_progress.coding_progress.file_change_requests = (
file_change_requests
)
ticket_progress.coding_progress.assistant_conversations = [
AssistantConversation() for fcr in file_change_requests
]
ticket_progress.status = TicketProgressStatus.CODING
ticket_progress.save()
if not file_change_requests:
if len(title + summary) < 60:
edit_sweep_comment(
(
"Sorry, I could not find any files to modify, can you please"
" provide more details? Please make sure that the title and"
" summary of the issue are at least 60 characters."
),
-1,
)
else:
edit_sweep_comment(
(
"Sorry, I could not find any files to modify, can you please"
" provide more details?"
),
-1,
)
raise Exception("No files to modify.")
file_change_requests: list[
FileChangeRequest
] = sweep_bot.validate_file_change_requests(
file_change_requests,
)
ticket_progress.planning_progress.file_change_requests = (
file_change_requests
)
ticket_progress.coding_progress.assistant_conversations = [
AssistantConversation() for fcr in file_change_requests
]
ticket_progress.save()
table = tabulate(
[
[
file_change_request.entity_display,
file_change_request.instructions_display.replace(
"\n", "<br/>"
).replace("```", "\\```"),
]
for file_change_request in file_change_requests
if file_change_request.change_type != "check"
],
headers=["File Path", "Proposed Changes"],
tablefmt="pipe",
)
logger.info("Generating PR...")
pull_request = PullRequest(
title="Sweep: " + title,
branch_name="sweep/" + to_branch_name(title),
content="",
)
logger.info("Making PR...")
ticket_progress.context.branch_name = pull_request.branch_name
ticket_progress.save()
files_progress: list[tuple[str, str, str, str]] = [
(
file_change_request.entity_display,
file_change_request.instructions_display,
"⏳ In Progress",
"",
)
for file_change_request in file_change_requests
]
checkboxes_progress: list[tuple[str, str, str]] = [
(
file_change_request.entity_display,
file_change_request.instructions_display,
" ",
)
for file_change_request in file_change_requests
if not file_change_request.change_type == "check"
]
checkboxes_contents = "\n".join(
[
create_checkbox(
f"`{filename}`", blockquote(instructions), check == "X"
)
for filename, instructions, check in checkboxes_progress
]
)
create_collapsible("Checklist", checkboxes_contents, opened=True)
file_change_requests[0].status = "running"
condensed_checkboxes_contents = "\n".join(
[
create_checkbox(f"`{filename}`", "", check == "X").strip()
for filename, instructions, check in checkboxes_progress
]
)
condensed_checkboxes_collapsible = create_collapsible(
"Checklist", condensed_checkboxes_contents, opened=True
)
current_issue = repo.get_issue(number=issue_number)
current_issue.edit(
body=summary + "\n\n" + condensed_checkboxes_collapsible
)
delete_branch = False
generator = create_pr_changes(
file_change_requests,
pull_request,
sweep_bot,
username,
installation_id,
issue_number,
chat_logger=chat_logger,
base_branch=overrided_branch_name,
additional_messages=[],
)
edit_sweep_comment(checkboxes_contents, 2)
if not file_change_requests:
raise NoFilesException()
response = {
"error": Exception(
f"Sweep failed to generate any file change requests! This could mean that Sweep failed to find the correct lines of code to modify or that GPT-4 did not respond in our specified format. Sometimes, retrying will fix this error. Otherwise, reach out to our Discord server for support (tracking_id={tracking_id})."
)
}
changed_files = []
for item in generator:
if isinstance(item, dict):
response = item
break
(
new_file_contents,
_,
commit,
file_change_requests,
) = item
# append all files that have been changed
if new_file_contents:
for file_name, _ in new_file_contents.items():
changed_files.append(file_name)
commit_hash: str = (
commit
if isinstance(commit, str)
else (
commit.sha
if commit is not None
else repo.get_branch(
pull_request.branch_name
).commit.sha
)
)
commit_url = (
f"https://github.com/{repo_full_name}/commit/{commit_hash}"
)
commit_url_display = (
f"<a href='{commit_url}'><code>{commit_hash[:7]}</code></a>"
)
create_error_logs(
commit_url_display,
None,
status=(
"✓"
),
)
checkboxes_progress = [
(
file_change_request.display_summary
+ " "
+ file_change_request.status_display
+ " "
+ (file_change_request.commit_hash_url or "")
+ f" [Edit]({file_change_request.get_edit_url(repo.full_name, pull_request.branch_name)})",
file_change_request.instructions_ticket_display
+ f"\n\n{file_change_request.diff_display}",
(
"X"
if file_change_request.status
in ("succeeded", "failed")
else " "
),
)
for file_change_request in file_change_requests
]
checkboxes_contents = "\n".join(
[
checkbox_template.format(
check=check,
filename=filename,
instructions=blockquote(instructions),
)
for filename, instructions, check in checkboxes_progress
]
)
collapsible_template.format(
summary="Checklist",
body=checkboxes_contents,
opened="open",
)
condensed_checkboxes_contents = "\n".join(
[
checkbox_template.format(
check=check,
filename=filename,
instructions="",
).strip()
for filename, instructions, check in checkboxes_progress
if not instructions.lower().startswith("run")
]
)
condensed_checkboxes_collapsible = collapsible_template.format(
summary="Checklist",
body=condensed_checkboxes_contents,
opened="open",
)
try:
current_issue = repo.get_issue(number=issue_number)
except BadCredentialsException:
user_token, g, repo = refresh_token()
cloned_repo.token = user_token
current_issue.edit(
body=summary + "\n\n" + condensed_checkboxes_collapsible
)
logger.info(files_progress)
edit_sweep_comment(checkboxes_contents, 2)
if not response.get("success"):
raise Exception(f"Failed to create PR: {response.get('error')}")
checkboxes_contents = "\n".join(
[
checkbox_template.format(
check=check,
filename=filename,
instructions=blockquote(instructions),
)
for filename, instructions, check in checkboxes_progress
]
)
condensed_checkboxes_contents = "\n".join(
[
checkbox_template.format(
check=check,
filename=filename,
instructions="",
).strip()
for filename, instructions, check in checkboxes_progress
if not instructions.lower().startswith("run")
]
)
condensed_checkboxes_collapsible = collapsible_template.format(
summary="Checklist",
body=condensed_checkboxes_contents,
opened="open",
)
for _ in range(3):
try:
current_issue.edit(
body=summary + "\n\n" + condensed_checkboxes_collapsible
)
break
except Exception:
from time import sleep
sleep(1)
edit_sweep_comment(checkboxes_contents, 2)
pr_changes = response["pull_request"]
# change the body here
diff_text = get_branch_diff_text(
repo=repo,
branch=pull_request.branch_name,
base_branch=overrided_branch_name,
)
new_description = PRDescriptionBot().describe_diffs(
diff_text,
pull_request.title,
)
# TODO: update the title as well
if new_description:
pr_changes.body = (
f"{new_description}\n\nFixes"
f" #{issue_number}.\n\n---\n\n{UPDATES_MESSAGE}\n\n---\n\n{INSTRUCTIONS_FOR_REVIEW}{BOT_SUFFIX}"
)
edit_sweep_comment(
"I have finished coding the issue. I am now reviewing it for completeness.",
3,
)
change_location = f" [`{pr_changes.pr_head}`](https://github.com/{repo_full_name}/commits/{pr_changes.pr_head}).\n\n"
review_message = (
"Here are my self-reviews of my changes at" + change_location
)
try:
fire_and_forget_wrapper(remove_emoji)(
current_issue, comment_id, content_to_delete="eyes"
)
except Exception:
pass
changes_required, review_message = False, ""
if changes_required:
edit_sweep_comment(
review_message
+ "\n\nI finished incorporating these changes.",
3,
)
else:
edit_sweep_comment(
f"I have finished reviewing the code for completeness. I did not find errors for {change_location}",
3,
)
revert_buttons = []
for changed_file in set(changed_files):
revert_buttons.append(
Button(label=f"{RESET_FILE} {changed_file}")
)
revert_buttons_list = ButtonList(
buttons=revert_buttons, title=REVERT_CHANGED_FILES_TITLE
)
# delete failing sweep yaml if applicable
if sweep_yml_failed:
try:
repo.delete_file(
"sweep.yaml",
"Delete failing sweep.yaml",
branch=pr_changes.pr_head,
sha=repo.get_contents("sweep.yaml").sha,
)
except Exception:
pass
# create draft pr, then convert to regular pr later
pr: GithubPullRequest = repo.create_pull(
title=pr_changes.title,
body=pr_changes.body,
head=pr_changes.pr_head,
base=overrided_branch_name or SweepConfig.get_branch(repo),
# removed draft PR
draft=False,
)
try:
pr.add_to_assignees(username)
except Exception as e:
logger.error(
f"Failed to add assignee {username}: {e}, probably a bot."
)
ticket_progress.status = TicketProgressStatus.COMPLETE
ticket_progress.context.done_time = time()
ticket_progress.context.pr_id = pr.number
ticket_progress.save()
if revert_buttons:
pr.create_issue_comment(
revert_buttons_list.serialize() + BOT_SUFFIX
)
# add comments before labelling
pr.add_to_labels(GITHUB_LABEL_NAME)
current_issue.create_reaction("rocket")
heres_pr_message = f'<h1 align="center">🚀 Here\'s the PR! <a href="{pr.html_url}">#{pr.number}</a></h1>'
progress_message = f'<div align="center"><b>See Sweep\'s progress at <a href="{PROGRESS_BASE_URL}/issues/{tracking_id}">the progress dashboard</a>!</b></div>'
edit_sweep_comment(
review_message + "\n\nSuccess! 🚀",
4,
pr_message=(
f"{center(heres_pr_message)}\n{center(progress_message)}\n{center(payment_message_start)}"
),
done=True,
)
user_settings = UserSettings.from_username(username=username)
user = g.get_user(username)
full_name = user.name or user.login
name = full_name.split(" ")[0]
files_changed = []
for fcr in file_change_requests:
if fcr.change_type in ("create", "modify"):
diff = list(
difflib.unified_diff(
(fcr.old_content or "").splitlines() or [],
(fcr.new_content or "").splitlines() or [],
lineterm="",
)
)
added = sum(
1
for line in diff
if line.startswith("+") and not line.startswith("+++")
)
removed = sum(
1
for line in diff
if line.startswith("-") and not line.startswith("---")
)
files_changed.append(
f"<code>{fcr.filename}</code> (+{added}/-{removed})"
)
user_settings.send_email(
subject=f"Sweep Pull Request Complete for {repo_name}#{issue_number} {title}",
html=email_template.format(
name=name,
pr_url=pr.html_url,
issue_number=issue_number,
repo_full_name=repo_full_name,
pr_number=pr.number,
progress_url=f"{PROGRESS_BASE_URL}/issues/{tracking_id}",
summary=markdown.markdown(pr_changes.body),
files_changed="\n".join(
[f"<li>{item}</li>" for item in files_changed]
),
sweeping_gif=sweeping_gif,
),
)
# poll for github to check when gha are done
total_poll_attempts = 0
total_edit_attempts = 0
SLEEP_DURATION_SECONDS = 15
GITHUB_ACTIONS_ENABLED = get_gha_enabled(repo=repo) and DEPLOYMENT_GHA_ENABLED
GHA_MAX_EDIT_ATTEMPTS = 5 # max number of times to edit PR
current_commit = pr.head.sha
while GITHUB_ACTIONS_ENABLED:
logger.info(
f"Polling to see if Github Actions have finished... {total_poll_attempts}"
)
# we wait at most 60 minutes
if total_poll_attempts * SLEEP_DURATION_SECONDS // 60 >= 60:
break
else:
# wait SLEEP_DURATION_SECONDS (15 seconds) between check attempts
total_poll_attempts += 1
from time import sleep
sleep(SLEEP_DURATION_SECONDS)
runs = list(repo.get_workflow_runs(branch=pr.head.ref, head_sha=current_commit))
# if all runs have succeeded, break
if all([run.conclusion == "success" for run in runs]):
break
# if any of them have failed we retry
if any([run.conclusion == "failure" for run in runs]):
failed_runs = [
run for run in runs if run.conclusion == "failure"
]
failed_gha_logs: list[str] = get_failing_gha_logs(
failed_runs,
installation_id,
)
if failed_gha_logs:
# make edits to the PR
# TODO: look into rollbacks so we don't continue adding onto errors
cloned_repo = ClonedRepo( # reinitialize cloned_repo to avoid conflicts
repo_full_name,
installation_id=installation_id,
token=user_token,
repo=repo,
branch=pr.head.ref,
)
diffs = get_branch_diff_text(repo=repo, branch=pr.head.ref, base_branch=pr.base.ref)
problem_statement = f"{title}\n{message_summary}\n{replies_text}"
all_information_prompt = GHA_PROMPT.format(
problem_statement=problem_statement,
github_actions_logs=failed_gha_logs,
changes_made=diffs,
)
repo_context_manager = prep_snippets(cloned_repo=cloned_repo, query=(title + message_summary + replies_text).strip("\n"), ticket_progress=ticket_progress) # need to do this, can use the old query for speed
sweep_bot: SweepBot = construct_sweep_bot(
repo=repo,
repo_name=repo_name,
issue_url=issue_url,
repo_description=repo_description,
title="Fix the following errors to complete the user request.",
message_summary=all_information_prompt,
cloned_repo=cloned_repo,
ticket_progress=ticket_progress,
chat_logger=chat_logger,
snippets=snippets,
tree=tree,
comments=comments,
)
file_change_requests, plan = get_files_to_change_for_gha(
relevant_snippets=repo_context_manager.current_top_snippets,
read_only_snippets=repo_context_manager.read_only_snippets,
problem_statement=all_information_prompt,
updated_files=new_file_contents,
cloned_repo=cloned_repo,
chat_logger=chat_logger,
)
validate_file_change_requests(file_change_requests, cloned_repo)
previous_modify_files_dict: dict[str, dict[str, str | list[str]]] | None = None
_, commit, _ = sweep_bot.handle_modify_file_main(
branch=pr.head.ref,
assistant_conversation=None,
additional_messages=[],
previous_modify_files_dict=previous_modify_files_dict,
file_change_requests=file_change_requests,
username=username
)
current_commit = commit.sha
pr = repo.get_pull(pr.number) # IMPORTANT: resync PR otherwise you'll fetch old GHA runs
total_edit_attempts += 1
if total_edit_attempts >= GHA_MAX_EDIT_ATTEMPTS:
logger.info(f"Tried to edit PR {GHA_MAX_EDIT_ATTEMPTS} times, giving up.")
break
# if none of the runs have completed we wait and poll github
logger.info(
f"No Github Actions have failed yet and not all have succeeded yet, waiting for {SLEEP_DURATION_SECONDS} seconds before polling again..."
)
# break from main for loop
convert_pr_draft_field(pr, is_draft=False, installation_id=installation_id)
except MaxTokensExceeded as e:
logger.info("Max tokens exceeded")
ticket_progress.status = TicketProgressStatus.ERROR
ticket_progress.error_message = "Max tokens exceeded. Feel free to add more details to the issue description for Sweep to better address it, or alternatively, reach out to Kevin or William for help at https://discord.gg/sweep."
ticket_progress.save()
log_error(
is_paying_user,
is_consumer_tier,
username,
issue_url,
"Max Tokens Exceeded",
str(e) + "\n" + traceback.format_exc(),
priority=2,
)
if chat_logger and chat_logger.is_paying_user():
edit_sweep_comment(
(
f"Sorry, I could not edit `{e.filename}` as this file is too long."
" We are currently working on improved file streaming to address"
" this issue.\n"
),
-1,
)
else:
edit_sweep_comment(
(
f"Sorry, I could not edit `{e.filename}` as this file is too"
" long.\n\nIf this file is incorrect, please describe the desired"
" file in the prompt. However, if you would like to edit longer"
" files, consider upgrading to [Sweep Pro](https://sweep.dev/) for"
" longer context lengths.\n"
),
-1,
)
delete_branch = True
raise e
except NoFilesException as e:
ticket_progress.status = TicketProgressStatus.ERROR
ticket_progress.error_message = "Sweep could not find files to modify to address this issue. Feel free to add more details to the issue description for Sweep to better address it, or alternatively, reach out to Kevin or William for help at https://discord.gg/sweep."
ticket_progress.save()
logger.info("Sweep could not find files to modify")
log_error(
is_paying_user,
is_consumer_tier,
username,
issue_url,
"Sweep could not find files to modify",
str(e) + "\n" + traceback.format_exc(),
priority=2,
)
edit_sweep_comment(
(
"Sorry, Sweep could not find any appropriate files to edit to address"
" this issue. If this is a mistake, please provide more context and Sweep"
f" will retry!\n\n@{username}, please edit the issue description to"
" include more details. You can also ask for help on our community"
" forum: https://community.sweep.dev/"
),
-1,
)
delete_branch = True
raise e
except openai.BadRequestError as e:
ticket_progress.status = TicketProgressStatus.ERROR
ticket_progress.error_message = "Sorry, it looks like there was an error communicating with OpenAI. If this error persists, reach out to Kevin or William for help at https://discord.gg/sweep."
ticket_progress.save()
logger.error(traceback.format_exc())
logger.error(e)
edit_sweep_comment(
(
"I'm sorry, but it looks like our model has run out of context length. We're"
" trying to make this happen less, but one way to mitigate this is to"
" code smaller files. If this error persists, report it at"
" https://discord.gg/sweep."
),
-1,
)
log_error(
is_paying_user,
is_consumer_tier,
username,
issue_url,
"Context Length",
str(e) + "\n" + traceback.format_exc(),
priority=2,
)
posthog.capture(
username,
"failed",
properties={
"error": str(e),
"trace": traceback.format_exc(),
"reason": "Invalid request error / context length",
**metadata,
"duration": round(time() - on_ticket_start_time),
},
)
delete_branch = True
raise e
except AssistantRaisedException as e:
if ticket_progress is not None:
ticket_progress.status = TicketProgressStatus.ERROR
ticket_progress.error_message = f"Sweep raised an error with the following message: {e.message}. Feel free to add more details to the issue description for Sweep to better address it, or alternatively, reach out to Kevin or William for help at https://discord.gg/sweep."
ticket_progress.save()
logger.exception(e)
edit_sweep_comment(
f"Sweep raised an error with the following message:\n{blockquote(e.message)}",
-1,
)
log_error(
is_paying_user,
is_consumer_tier,
username,
issue_url,
"Workflow",
str(e) + "\n" + traceback.format_exc(),
priority=1,
)
raise e
except Exception as e:
ticket_progress.status = TicketProgressStatus.ERROR
ticket_progress.error_message = f"Internal server error: {str(e)}. Feel free to add more details to the issue description for Sweep to better address it, or alternatively, reach out to Kevin or William for help at https://discord.gg/sweep."
ticket_progress.save()
logger.error(traceback.format_exc())
logger.error(e)
# title and summary are defined elsewhere
if len(title + summary) < 60:
edit_sweep_comment(
(
"I'm sorry, but it looks like an error occurred due to"
f" a planning failure. The error message is {str(e)}. Feel free to add more details to the issue description"
" so Sweep can better address it. Alternatively, post on our community forum"
" for assistance: https://community.sweep.dev/"
),
-1,
)
else:
edit_sweep_comment(
(
"I'm sorry, but it looks like an error has occurred due to"
+ f" a planning failure. The error message is {str(e)}. Feel free to add more details to the issue description"
+ " so Sweep can better address it. Alternatively, reach out to Kevin or William for help at"
+ " https://discord.gg/sweep."
),
-1,
)
log_error(
is_paying_user,
is_consumer_tier,
username,
issue_url,
"Workflow",
str(e) + "\n" + traceback.format_exc(),
priority=1,
)
raise e
else:
try:
fire_and_forget_wrapper(remove_emoji)(
current_issue, comment_id, content_to_delete="eyes"
)
fire_and_forget_wrapper(add_emoji)(
current_issue, comment_id, reaction_content="rocket"
)
except SystemExit:
raise  # re-raise the original SystemExit so its exit code is preserved
except Exception as e:
logger.error(e)
if delete_branch:
try:
if pull_request.branch_name.startswith("sweep"):
repo.get_git_ref(
f"heads/{pull_request.branch_name}"
).delete()
else:
raise Exception(
f"Branch name {pull_request.branch_name} does not start with sweep/"
)
except Exception as e:
logger.error(e)
logger.error(traceback.format_exc())
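# Interpolate the branch name; a bare positional argument is dropped when the message has no {} placeholder
logger.info(f"Deleted branch {pull_request.branch_name}")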
except Exception as e:
posthog.capture(
username,
"failed",
properties={
**metadata,
"error": str(e),
"trace": traceback.format_exc(),
"duration": round(time() - on_ticket_start_time),
},
)
raise e
posthog.capture(
username,
"success",
properties={**metadata, "duration": round(time() - on_ticket_start_time)},
)
logger.info("on_ticket success in " + str(round(time() - on_ticket_start_time)))
return {"success": True}
def handle_sandbox_mode(
title, repo_full_name, repo, ticket_progress, edit_sweep_comment
):
logger.info("Running in sandbox mode")
sweep_bot = SweepBot(repo=repo, ticket_progress=ticket_progress)
logger.info("Getting file contents")
file_name = title.split(":")[1].strip()
file_contents = sweep_bot.get_contents(file_name).decoded_content.decode("utf-8")
try:
ext = file_name.split(".")[-1]
except Exception:
ext = ""
file_contents.replace("```", "\`\`\`")
sha = repo.get_branch(repo.default_branch).commit.sha
permalink = f"https://github.com/{repo_full_name}/blob/{sha}/{file_name}#L1-L{len(file_contents.splitlines())}"
logger.info("Running sandbox")
edit_sweep_comment(
f"Running sandbox for {file_name}. Current Code:\n\n{permalink}",
1,
)
updated_contents, sandbox_response = sweep_bot.check_sandbox(
file_name, file_contents
)
logger.info("Sandbox finished")
logs = (
(
"<br/>"
+ create_collapsible(
"Sandbox logs",
blockquote(
"\n\n".join(
[
create_collapsible(
f"<code>{output}</code> {i + 1}/{len(sandbox_response.outputs)} {format_sandbox_success(sandbox_response.success)}",
f"<pre>{clean_logs(output)}</pre>",
i == len(sandbox_response.outputs) - 1,
)
for i, output in enumerate(sandbox_response.outputs)
if len(sandbox_response.outputs) > 0
]
)
),
opened=True,
)
)
if sandbox_response
else ""
)
updated_contents = updated_contents.replace("```", "\`\`\`")
diff = generate_diff(file_contents, updated_contents).replace("```", "\`\`\`")
diff_display = (
f"Updated Code:\n\n```{ext}\n{updated_contents}```\nDiff:\n```diff\n{diff}\n```"
if diff
else f"Sandbox made no changes to {file_name} (formatters were not configured or Sweep didn't make changes)."
)
edit_sweep_comment(
f"{logs}\n{diff_display}",
2,
)
edit_sweep_comment("N/A", 3)
logger.info("Sandbox comments updated")
def get_branch_diff_text(repo, branch, base_branch=None):
base_branch = base_branch or SweepConfig.get_branch(repo)
comparison = repo.compare(base_branch, branch)
file_diffs = comparison.files
pr_diffs = []
for file in file_diffs:
diff = file.patch
if (
file.status == "added"
or file.status == "modified"
or file.status == "removed"
):
pr_diffs.append((file.filename, diff))
else:
logger.info(
f"File status {file.status} not recognized"
) # TODO(sweep): We don't handle renamed files
return "\n".join([f"{filename}\n{diff}" for filename, diff in pr_diffs])
def get_payment_messages(chat_logger: ChatLogger):
if chat_logger:
is_paying_user = chat_logger.is_paying_user()
is_consumer_tier = chat_logger.is_consumer_tier()
use_faster_model = chat_logger.use_faster_model()
else:
is_paying_user = True
is_consumer_tier = False
use_faster_model = False
tracking_id = chat_logger.data["tracking_id"] if MONGODB_URI is not None else None
# Find the first comment made by the bot
tickets_allocated = 5
if is_consumer_tier:
tickets_allocated = 15
if is_paying_user:
tickets_allocated = 500
purchased_ticket_count = (
chat_logger.get_ticket_count(purchased=True) if chat_logger else 0
)
ticket_count = (
max(tickets_allocated - chat_logger.get_ticket_count(), 0)
+ purchased_ticket_count
if chat_logger
else 999
)
daily_ticket_count = (
(3 - chat_logger.get_ticket_count(use_date=True) if not use_faster_model else 0)
if chat_logger
else 999
)
model_name = "GPT-4"
single_payment_link = "https://buy.stripe.com/00g3fh7qF85q0AE14d"
pro_payment_link = "https://buy.stripe.com/00g5npeT71H2gzCfZ8"
daily_message = (
f" and {daily_ticket_count} for the day"
if not is_paying_user and not is_consumer_tier
else ""
)
user_type = "💎 <b>Sweep Pro</b>" if is_paying_user else "⚡ <b>Sweep Basic Tier</b>"
gpt_tickets_left_message = (
f"{ticket_count} GPT-4 tickets left for the month"
if not is_paying_user
else "unlimited GPT-4 tickets"
)
purchase_message = f"<br/><br/> For more GPT-4 tickets, visit <a href={single_payment_link}>our payment portal</a>. For a one week free trial, try <a href={pro_payment_link}>Sweep Pro</a> (unlimited GPT-4 tickets)."
payment_message = (
f"{user_type}: I used {model_name} to create this ticket. You have {gpt_tickets_left_message}{daily_message}. (tracking ID: <code>{tracking_id}</code>)"
+ (purchase_message if not is_paying_user else "")
)
payment_message_start = (
f"{user_type}: I'm using {model_name}. You have {gpt_tickets_left_message}{daily_message}. (tracking ID: <code>{tracking_id}</code>)"
+ (purchase_message if not is_paying_user else "")
)

import base64
import os
from dotenv import load_dotenv
from loguru import logger
logger.print = logger.info
load_dotenv(dotenv_path=".env", override=True, verbose=True)
os.environ["GITHUB_APP_PEM"] = os.environ.get("GITHUB_APP_PEM") or base64.b64decode(
os.environ.get("GITHUB_APP_PEM_BASE64", "")
).decode("utf-8")
if os.environ["GITHUB_APP_PEM"]:
os.environ["GITHUB_APP_ID"] = (
(os.environ.get("GITHUB_APP_ID") or os.environ.get("APP_ID"))
.replace("\\n", "\n")
.strip('"')
)
TEST_BOT_NAME = "sweep-nightly[bot]"
ENV = os.environ.get("ENV", "dev")
BOT_TOKEN_NAME = "bot-token"
# goes under Modal 'discord' secret name (optional, can leave env var blank)
DISCORD_WEBHOOK_URL = os.environ.get("DISCORD_WEBHOOK_URL")
DISCORD_MEDIUM_PRIORITY_URL = os.environ.get("DISCORD_MEDIUM_PRIORITY_URL")
DISCORD_LOW_PRIORITY_URL = os.environ.get("DISCORD_LOW_PRIORITY_URL")
DISCORD_FEEDBACK_WEBHOOK_URL = os.environ.get("DISCORD_FEEDBACK_WEBHOOK_URL")
SWEEP_HEALTH_URL = os.environ.get("SWEEP_HEALTH_URL")
DISCORD_STATUS_WEBHOOK_URL = os.environ.get("DISCORD_STATUS_WEBHOOK_URL")
# goes under Modal 'github' secret name
GITHUB_APP_ID = os.environ.get("GITHUB_APP_ID", os.environ.get("APP_ID"))
# deprecated: old logic transfer so upstream can use this
if GITHUB_APP_ID is None:
if ENV == "prod":
GITHUB_APP_ID = "307814"
elif ENV == "dev":
GITHUB_APP_ID = "324098"
elif ENV == "staging":
GITHUB_APP_ID = "327588"
GITHUB_BOT_USERNAME = os.environ.get("GITHUB_BOT_USERNAME")
# deprecated: left to support old logic
if not GITHUB_BOT_USERNAME:
if ENV == "prod":
GITHUB_BOT_USERNAME = "sweep-ai[bot]"
elif ENV == "dev":
GITHUB_BOT_USERNAME = "sweep-nightly[bot]"
elif ENV == "staging":
GITHUB_BOT_USERNAME = "sweep-canary[bot]"
elif not GITHUB_BOT_USERNAME.endswith("[bot]"):
GITHUB_BOT_USERNAME = GITHUB_BOT_USERNAME + "[bot]"
GITHUB_LABEL_NAME = os.environ.get("GITHUB_LABEL_NAME", "sweep")
GITHUB_LABEL_COLOR = os.environ.get("GITHUB_LABEL_COLOR", "9400D3")
GITHUB_LABEL_DESCRIPTION = os.environ.get(
"GITHUB_LABEL_DESCRIPTION", "Sweep your software chores"
)
GITHUB_APP_PEM = os.environ.get("GITHUB_APP_PEM")
GITHUB_APP_PEM = GITHUB_APP_PEM or os.environ.get("PRIVATE_KEY")
if GITHUB_APP_PEM is not None:
GITHUB_APP_PEM = GITHUB_APP_PEM.strip(' \n"') # Remove whitespace and quotes
GITHUB_APP_PEM = GITHUB_APP_PEM.replace("\\n", "\n")
GITHUB_CONFIG_BRANCH = os.environ.get("GITHUB_CONFIG_BRANCH", "sweep/add-sweep-config")
GITHUB_DEFAULT_CONFIG = os.environ.get(
"GITHUB_DEFAULT_CONFIG",
"""# Sweep AI turns bugs & feature requests into code changes (https://sweep.dev)
# For details on our config file, check out our docs at https://docs.sweep.dev/usage/config
# This setting contains a list of rules that Sweep will check for. If any of these rules are broken in a new commit, Sweep will create a pull request to fix the broken rule.
rules:
{additional_rules}
# This is the branch that Sweep will develop from and make pull requests to. Most people use 'main' or 'master' but some users also use 'dev' or 'staging'.
branch: 'main'
# By default Sweep will read the logs and outputs from your existing Github Actions. To disable this, set this to false.
gha_enabled: True
# This is the description of your project. It will be used by sweep when creating PRs. You can tell Sweep what's unique about your project, what frameworks you use, or anything else you want.
#
# Example:
#
# description: sweepai/sweep is a python project. The main api endpoints are in sweepai/api.py. Write code that adheres to PEP8.
description: ''
# This sets whether to create pull requests as drafts. If this is set to True, then all pull requests will be created as drafts and GitHub Actions will not be triggered.
draft: False
# This is a list of directories that Sweep will not be able to edit.
blocked_dirs: []
""",
)
MONGODB_URI = os.environ.get("MONGODB_URI", None)
IS_SELF_HOSTED = os.environ.get("IS_SELF_HOSTED", "true").lower() == "true"
REDIS_URL = os.environ.get("REDIS_URL")
if not REDIS_URL:
REDIS_URL = os.environ.get("redis_url", "redis://0.0.0.0:6379/0")
ORG_ID = os.environ.get("ORG_ID", None)
POSTHOG_API_KEY = os.environ.get(
"POSTHOG_API_KEY", "phc_CnzwIB0W548wN4wEGeRuxXqidOlEUH2AcyV2sKTku8n"
)
SUPPORT_COUNTRY = os.environ.get("GDRP_LIST", "").split(",")
WHITELISTED_REPOS = os.environ.get("WHITELISTED_REPOS", "").split(",")
BLACKLISTED_USERS = os.environ.get("BLACKLISTED_USERS", "").split(",")
# Default OpenAI
OPENAI_API_KEY = os.environ.get("OPENAI_API_KEY", None) # this may be none, and it will use azure
OPENAI_API_TYPE = os.environ.get("OPENAI_API_TYPE", "anthropic")
assert OPENAI_API_TYPE in ["anthropic", "azure", "openai"], "Invalid OPENAI_API_TYPE"
OPENAI_EMBEDDINGS_API_TYPE = os.environ.get("OPENAI_EMBEDDINGS_API_TYPE", "openai")
AZURE_API_KEY = os.environ.get("AZURE_API_KEY", None)
OPENAI_API_BASE = os.environ.get("OPENAI_API_BASE", None)
OPENAI_API_VERSION = os.environ.get("OPENAI_API_VERSION", None)
AZURE_OPENAI_DEPLOYMENT = os.environ.get("AZURE_OPENAI_DEPLOYMENT", None)
OPENAI_EMBEDDINGS_API_TYPE = os.environ.get("OPENAI_EMBEDDINGS_API_TYPE", "openai")
OPENAI_EMBEDDINGS_AZURE_ENDPOINT = os.environ.get(
"OPENAI_EMBEDDINGS_AZURE_ENDPOINT", None
)
OPENAI_EMBEDDINGS_AZURE_DEPLOYMENT = os.environ.get(
"OPENAI_EMBEDDINGS_AZURE_DEPLOYMENT", None
)
OPENAI_EMBEDDINGS_AZURE_API_VERSION = os.environ.get(
"OPENAI_EMBEDDINGS_AZURE_API_VERSION", None
)
OPENAI_API_ENGINE_GPT35 = os.environ.get("OPENAI_API_ENGINE_GPT35", None)
OPENAI_API_ENGINE_GPT4 = os.environ.get("OPENAI_API_ENGINE_GPT4", None)
MULTI_REGION_CONFIG = os.environ.get("MULTI_REGION_CONFIG", None)
if isinstance(MULTI_REGION_CONFIG, str):
MULTI_REGION_CONFIG = MULTI_REGION_CONFIG.strip("'").replace("\\n", "\n")
MULTI_REGION_CONFIG = [item.split(",") for item in MULTI_REGION_CONFIG.split("\n")]
WHITELISTED_USERS = os.environ.get("WHITELISTED_USERS", None)
if WHITELISTED_USERS:
WHITELISTED_USERS = WHITELISTED_USERS.split(",")
WHITELISTED_USERS.append(GITHUB_BOT_USERNAME)
DEFAULT_GPT4_MODEL = os.environ.get("DEFAULT_GPT4_MODEL", "gpt-4-0125-preview")
RESEND_API_KEY = os.environ.get("RESEND_API_KEY", None)
LOKI_URL = None
DEBUG = os.environ.get("DEBUG", "false").lower() == "true"
ENV = "prod" if GITHUB_BOT_USERNAME != TEST_BOT_NAME else "dev"
PROGRESS_BASE_URL = os.environ.get(
"PROGRESS_BASE_URL", "https://progress.sweep.dev"
).rstrip("/")
DISABLED_REPOS = os.environ.get("DISABLED_REPOS", "").split(",")
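# Parse the flags explicitly: os.environ.get returns a string, so even "false" would otherwise be truthy.
GHA_AUTOFIX_ENABLED: bool = os.environ.get("GHA_AUTOFIX_ENABLED", "false").lower() == "true"
MERGE_CONFLICT_ENABLED: bool = os.environ.get("MERGE_CONFLICT_ENABLED", "false").lower() == "true"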
INSTALLATION_ID = os.environ.get("INSTALLATION_ID", None)
AWS_ACCESS_KEY=os.environ.get("AWS_ACCESS_KEY")
AWS_SECRET_KEY=os.environ.get("AWS_SECRET_KEY")
AWS_REGION=os.environ.get("AWS_REGION")
ANTHROPIC_AVAILABLE = AWS_ACCESS_KEY and AWS_SECRET_KEY and AWS_REGION
ANTHROPIC_API_KEY = os.environ.get("ANTHROPIC_API_KEY", None)
COHERE_API_KEY = os.environ.get("COHERE_API_KEY", None)
VOYAGE_API_KEY = os.environ.get("VOYAGE_API_KEY", None)
VOYAGE_API_AWS_ACCESS_KEY=os.environ.get("VOYAGE_API_AWS_ACCESS_KEY_ID")
VOYAGE_API_AWS_SECRET_KEY=os.environ.get("VOYAGE_API_AWS_SECRET_KEY")
VOYAGE_API_AWS_REGION=os.environ.get("VOYAGE_API_AWS_REGION")
VOYAGE_API_AWS_ENDPOINT_NAME=os.environ.get("VOYAGE_API_AWS_ENDPOINT_NAME", "voyage-code-2")
VOYAGE_API_USE_AWS = VOYAGE_API_AWS_ACCESS_KEY and VOYAGE_API_AWS_SECRET_KEY and VOYAGE_API_AWS_REGION
PAREA_API_KEY = os.environ.get("PAREA_API_KEY", None)
# TODO: we need to make this dynamic + backoff
BATCH_SIZE = int(
os.environ.get("BATCH_SIZE", 64 if VOYAGE_API_KEY else 256) # Voyage only allows 128 items per batch and 120000 tokens per batch
)
DEPLOYMENT_GHA_ENABLED = os.environ.get("DEPLOYMENT_GHA_ENABLED", "true").lower() == "true"
JIRA_USER_NAME = os.environ.get("JIRA_USER_NAME", None)
JIRA_API_TOKEN = os.environ.get("JIRA_API_TOKEN", None)
JIRA_URL = os.environ.get("JIRA_URL", None)

import re
from sweepai.core.chat import ChatGPT
issue_validator_instructions_prompt = """# Instructions
A good issue for Sweep is actionable and it is clear how to resolve it. Here is what Sweep is currently capable of:
- Access to the entire codebase, with a high-quality search engine to find specific code snippets. Sweep is able to pinpoint the exact location of the code that needs to be changed based on vague descriptions.
- Making code changes to fix bugs or add features.
- Reading the GitHub Action logs to run tests and check the results.
- Ability to read images such as screenshots and charts.
Here are some examples of things Sweep does not currently support:
- Large-scale changes like migrations and large version upgrades.
- Tasks requiring accessing outside information like AWS consoles or retrieving API keys.
- Tasks requiring fixes outside of code changes
- Issues that have an existing fix or duplicate issues
Respond in the following format:
<thinking>
Provide an analysis of why it is a good or bad issue to pass on to Sweep. If it is a bad issue, suggest how the issue could be improved or clarified to make it more suitable for Sweep.
</thinking>
<pass>True or False</pass>
If False, respond to the user:
<response_to_user>
Response to user with justification on why the issue is unclear.
</response_to_user>"""
issue_validator_system_prompt = """You are an AI assistant tasked with determining whether an issue reported by customer support should be passed on to be resolved by Sweep, an AI-powered software engineer.
""" + issue_validator_instructions_prompt
issue_validator_user_prompt = """<issue>
{issue}
</issue>\n\n""" + issue_validator_instructions_prompt
def validate_issue(issue: str) -> str:
"""
Somehow haiku and GPT-4 can't do this consistently.
"""
chat_gpt = ChatGPT.from_system_message_string(
prompt_string=issue_validator_system_prompt,
)
response = chat_gpt.chat_anthropic(
issue_validator_user_prompt.format(
issue=issue
),
model="claude-3-opus-20240229",
temperature=0.0,
)
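if "<pass>False</pass>" in response:
pattern = "<response_to_user>(.*)</response_to_user>"
match = re.search(pattern, response, re.DOTALL)
# Fall back to the raw response if the model omitted the <response_to_user> block
return match.group(1).strip() if match else response.strip()
return ""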
if __name__ == "__main__":


Step 2: ⌨️ Coding

  • sweepai/config/server.py
Modify sweepai/config/server.py with contents: Add a new environment variable for the user's Slack API key.

<original_code>
JIRA_USER_NAME = os.environ.get("JIRA_USER_NAME", None)
JIRA_API_TOKEN = os.environ.get("JIRA_API_TOKEN", None)
JIRA_URL = os.environ.get("JIRA_URL", None)

LICENSE_KEY = os.environ.get("LICENSE_KEY", None)
ALTERNATE_AWS = os.environ.get("ALTERNATE_AWS", "none").lower() == "true"
</original_code>

<new_code>
JIRA_USER_NAME = os.environ.get("JIRA_USER_NAME", None)
JIRA_API_TOKEN = os.environ.get("JIRA_API_TOKEN", None)
JIRA_URL = os.environ.get("JIRA_URL", None)

SLACK_API_KEY = os.environ.get("SLACK_API_KEY", None)

LICENSE_KEY = os.environ.get("LICENSE_KEY", None)
ALTERNATE_AWS = os.environ.get("ALTERNATE_AWS", "none").lower() == "true"
</new_code>
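
Because the key is optional, downstream code needs to tell "not configured" apart from a usable token. A minimal sketch of that check, assuming the variable name `SLACK_API_KEY` from the plan above; `read_optional_secret` is a hypothetical helper, not something that exists in `sweepai/config/server.py`:

```python
import os
from typing import Mapping, Optional


def read_optional_secret(name: str, environ: Mapping[str, str] = os.environ) -> Optional[str]:
    """Return the stripped value of an environment variable, or None if unset or blank."""
    value = environ.get(name)
    if value is None:
        return None
    # Tolerate surrounding whitespace and quotes, as .env files often include them.
    value = value.strip().strip('"')
    return value or None


# Hypothetical usage: None means the Slack integration should be skipped entirely.
SLACK_API_KEY = read_optional_secret("SLACK_API_KEY")
```

Treating a blank value as `None` lets callers guard the Slack lookup with a single truthiness check.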

  • sweepai/handlers/on_ticket.py
Modify sweepai/handlers/on_ticket.py with contents: Import the Slack SDK client (`WebClient`) and its error type (`SlackApiError`), which are used to fetch Slack message threads.

<original_code>
import re
import traceback
from typing import Any
import zipfile
from time import time

import markdown
import openai
import requests
import yaml
import yamllint.config as yamllint_config
from github import BadCredentialsException, Github, Repository
from github.Issue import Issue
from github.PullRequest import PullRequest as GithubPullRequest
from loguru import logger
from tabulate import tabulate
from tqdm import tqdm
from yamllint import linter
</original_code>

<new_code>
import re
import traceback
from typing import Any
import zipfile
from time import time

import markdown
import openai
import requests
import yaml
import yamllint.config as yamllint_config
from github import BadCredentialsException, Github, Repository
from github.Issue import Issue
from github.PullRequest import PullRequest as GithubPullRequest
from loguru import logger
from tabulate import tabulate
from tqdm import tqdm
from yamllint import linter

from slack_sdk import WebClient
from slack_sdk.errors import SlackApiError
</new_code>

  • sweepai/handlers/on_ticket.py
Modify sweepai/handlers/on_ticket.py with contents: In the `on_ticket` function, check if the issue description contains a Slack link. If found, authenticate to Slack and fetch the message thread.

<original_code>
summary = summary or ""
summary = re.sub(
"<details (open)?>(\r)?\n

Checklist.",
"",
summary,
flags=re.DOTALL,
).strip()
summary = re.sub(
"---\s+Checklist:(\r)?\n(\r)?\n- [[ X]].
",
"",
summary,
flags=re.DOTALL,
).strip()
summary = re.sub(
"### Details\n\n_No response_", "", summary, flags=re.DOTALL
)
summary = re.sub("\n\n", "\n", summary, flags=re.DOTALL)
repo_name = repo_full_name
user_token, g = get_github_client(installation_id)
repo = g.get_repo(repo_full_name)
current_issue: Issue = repo.get_issue(number=issue_number)
assignee = current_issue.assignee.login if current_issue.assignee else None
if assignee is None:
assignee = current_issue.user.login
</original_code>

<new_code>
summary = summary or ""
summary = re.sub(
"<details (open)?>(\r)?\n

Checklist.",
"",
summary,
flags=re.DOTALL,
).strip()
summary = re.sub(
"---\s+Checklist:(\r)?\n(\r)?\n- [[ X]].
",
"",
summary,
flags=re.DOTALL,
).strip()
summary = re.sub(
"### Details\n\n_No response_", "", summary, flags=re.DOTALL
)
summary = re.sub("\n\n", "\n", summary, flags=re.DOTALL)

# SLACK_API_KEY must also be imported from sweepai.config.server.
# A permalink already encodes the channel ID and the message timestamp (without its decimal point),
# so both are parsed from the URL; chat.getPermalink works the other way and takes no link argument.
slack_link_match = re.search(r'https://[\w-]+\.slack\.com/archives/(\w+)/p(\d+)', summary)
if slack_link_match and SLACK_API_KEY:
    slack_channel_id, slack_ts_digits = slack_link_match.groups()
    # p1714680000123456 -> "1714680000.123456"
    slack_message_ts = f"{slack_ts_digits[:-6]}.{slack_ts_digits[-6:]}"
    slack_client = WebClient(token=SLACK_API_KEY)

    try:
        slack_thread_replies = slack_client.conversations_replies(
            channel=slack_channel_id,
            ts=slack_message_ts
        )

        slack_thread_messages = [message.get('text', '') for message in slack_thread_replies['messages']]
        slack_thread_text = '\n'.join(slack_thread_messages)

        summary += f"\n\nSlack Thread:\n{slack_thread_text}"

    except SlackApiError as e:
        logger.error(f"Error fetching Slack thread: {e}")

repo_name = repo_full_name
user_token, g = get_github_client(installation_id)
repo = g.get_repo(repo_full_name)
current_issue: Issue = repo.get_issue(number=issue_number)
assignee = current_issue.assignee.login if current_issue.assignee else None
if assignee is None:
    assignee = current_issue.user.login

</new_code>
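
The link-handling step above hinges on turning a Slack permalink into the channel ID and message timestamp that `conversations.replies` expects. A minimal sketch of that conversion, assuming the usual permalink shape `https://<workspace>.slack.com/archives/<channel>/p<digits>`, where the last six digits are the fractional part of the timestamp; `parse_slack_permalink` is a hypothetical helper name, not part of Sweep or the Slack SDK:

```python
import re
from typing import Optional, Tuple

# Assumed permalink shape: https://<workspace>.slack.com/archives/<channel>/p<digits>
SLACK_PERMALINK_RE = re.compile(
    r"https://[\w-]+\.slack\.com/archives/(\w+)/p(\d{7,})"
)


def parse_slack_permalink(text: str) -> Optional[Tuple[str, str]]:
    """Return (channel_id, message_ts) for the first Slack permalink in text, or None."""
    match = SLACK_PERMALINK_RE.search(text)
    if match is None:
        return None
    channel_id, digits = match.groups()
    # The permalink drops the decimal point; the last six digits are microseconds.
    message_ts = f"{digits[:-6]}.{digits[-6:]}"
    return channel_id, message_ts
```

Parsing locally avoids an extra API round trip and keeps the only network call inside the `try` block that handles `SlackApiError`.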


Step 3: 🔁 Code Review

Working on it...



💡 To recreate the pull request edit the issue title or description.
Something wrong? Let us know.

This is an automated message generated by Sweep AI.

Contributor

sweep-nightly bot commented May 3, 2024



❌ Unable to Complete PR

I'm sorry, but it looks like an error has occurred due to a planning failure. The error message is 'NoneType' object has no attribute 'sha'. Feel free to add more details to the issue description so Sweep can better address it. Alternatively, reach out to Kevin or William for help at https://discord.gg/sweep.

For bonus GPT-4 tickets, please report this bug on Discord (tracking ID: 8f2764305e).


Please look at the generated plan. If something looks wrong, please add more details to your issue.

File Path Proposed Changes
sweepai/config/server.py Modify sweepai/config/server.py with contents:
Add a new environment variable for the user's Slack API key.

<original_code>
JIRA_USER_NAME = os.environ.get("JIRA_USER_NAME", None)
JIRA_API_TOKEN = os.environ.get("JIRA_API_TOKEN", None)
JIRA_URL = os.environ.get("JIRA_URL", None)

LICENSE_KEY = os.environ.get("LICENSE_KEY", None)
ALTERNATE_AWS = os.environ.get("ALTERNATE_AWS", "none").lower() == "true"
</original_code>

<new_code>
JIRA_USER_NAME = os.environ.get("JIRA_USER_NAME", None)
JIRA_API_TOKEN = os.environ.get("JIRA_API_TOKEN", None)
JIRA_URL = os.environ.get("JIRA_URL", None)

SLACK_API_KEY = os.environ.get("SLACK_API_KEY", None)

LICENSE_KEY = os.environ.get("LICENSE_KEY", None)
ALTERNATE_AWS = os.environ.get("ALTERNATE_AWS", "none").lower() == "true"
</new_code>
sweepai/handlers/on_ticket.py Modify sweepai/handlers/on_ticket.py with contents:
Import the Slack SDK client (WebClient) and its error type (SlackApiError), which are used to fetch Slack message threads.

<original_code>
import re
import traceback
from typing import Any
import zipfile
from time import time

import markdown
import openai
import requests
import yaml
import yamllint.config as yamllint_config
from github import BadCredentialsException, Github, Repository
from github.Issue import Issue
from github.PullRequest import PullRequest as GithubPullRequest
from loguru import logger
from tabulate import tabulate
from tqdm import tqdm
from yamllint import linter
</original_code>

<new_code>
import re
import traceback
from typing import Any
import zipfile
from time import time

import markdown
import openai
import requests
import yaml
import yamllint.config as yamllint_config
from github import BadCredentialsException, Github, Repository
from github.Issue import Issue
from github.PullRequest import PullRequest as GithubPullRequest
from loguru import logger
from tabulate import tabulate
from tqdm import tqdm
from yamllint import linter

from slack_sdk import WebClient
from slack_sdk.errors import SlackApiError
</new_code>
sweepai/handlers/on_ticket.py Modify sweepai/handlers/on_ticket.py with contents:
In the on_ticket function, check if the issue description contains a Slack link. If found, authenticate to Slack and fetch the message thread.

<original_code>
summary = summary or ""
summary = re.sub(
"<details (open)?>(\r)?\nChecklist.",
"",
summary,
flags=re.DOTALL,
).strip()
summary = re.sub(
"---\s+Checklist:(\r)?\n(\r)?\n- [[ X]].
",
"",
summary,
flags=re.DOTALL,
).strip()
summary = re.sub(
"### Details\n\n_No response_", "", summary, flags=re.DOTALL
)
summary = re.sub("\n\n", "\n", summary, flags=re.DOTALL)
repo_name = repo_full_name
user_token, g = get_github_client(installation_id)
repo = g.get_repo(repo_full_name)
current_issue: Issue = repo.get_issue(number=issue_number)
assignee = current_issue.assignee.login if current_issue.assignee else None
if assignee is None:
assignee = current_issue.user.login
</original_code>

<new_code>
summary = summary or ""
summary = re.sub(
"<details (open)?>(\r)?\nChecklist.",
"",
summary,
flags=re.DOTALL,
).strip()
summary = re.sub(
"---\s+Checklist:(\r)?\n(\r)?\n- [[ X]].
",
"",
summary,
flags=re.DOTALL,
).strip()
summary = re.sub(
"### Details\n\n_No response_", "", summary, flags=re.DOTALL
)
summary = re.sub("\n\n", "\n", summary, flags=re.DOTALL)

# SLACK_API_KEY must also be imported from sweepai.config.server.
# A permalink already encodes the channel ID and the message timestamp (without its decimal point),
# so both are parsed from the URL; chat.getPermalink works the other way and takes no link argument.
slack_link_match = re.search(r'https://[\w-]+\.slack\.com/archives/(\w+)/p(\d+)', summary)
if slack_link_match and SLACK_API_KEY:
    slack_channel_id, slack_ts_digits = slack_link_match.groups()
    # p1714680000123456 -> "1714680000.123456"
    slack_message_ts = f"{slack_ts_digits[:-6]}.{slack_ts_digits[-6:]}"
    slack_client = WebClient(token=SLACK_API_KEY)

    try:
        slack_thread_replies = slack_client.conversations_replies(
            channel=slack_channel_id,
            ts=slack_message_ts
        )

        slack_thread_messages = [message.get('text', '') for message in slack_thread_replies['messages']]
        slack_thread_text = '\n'.join(slack_thread_messages)

        summary += f"\n\nSlack Thread:\n{slack_thread_text}"

    except SlackApiError as e:
        logger.error(f"Error fetching Slack thread: {e}")

repo_name = repo_full_name
user_token, g = get_github_client(installation_id)
repo = g.get_repo(repo_full_name)
current_issue: Issue = repo.get_issue(number=issue_number)
assignee = current_issue.assignee.login if current_issue.assignee else None
if assignee is None:
    assignee = current_issue.user.login
</new_code>
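
Once the replies are fetched, the plan joins each message's `text` field into the issue summary. A small sketch of that step, assuming each message is a dict shaped like the entries in a `conversations.replies` response (`user` and `text` keys, either of which may be absent); `format_slack_thread` and `append_thread_to_summary` are hypothetical helpers, not existing Sweep functions:

```python
from typing import Dict, List


def format_slack_thread(messages: List[Dict[str, str]]) -> str:
    """Render Slack messages as plain text, skipping entries with no text."""
    lines = []
    for message in messages:
        text = message.get("text", "").strip()
        if not text:
            continue  # e.g. file-only or system messages
        author = message.get("user", "unknown")
        lines.append(f"{author}: {text}")
    return "\n".join(lines)


def append_thread_to_summary(summary: str, messages: List[Dict[str, str]]) -> str:
    """Append the rendered thread under a 'Slack Thread:' heading, if non-empty."""
    thread_text = format_slack_thread(messages)
    if not thread_text:
        return summary
    return f"{summary}\n\nSlack Thread:\n{thread_text}"
```

Prefixing each line with the author ID is a design choice of this sketch; the plan itself only concatenates the raw message text.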


Contributor

sweep-nightly bot commented May 3, 2024

🚀 Here's the PR! #3668

See Sweep's progress at the progress dashboard!
💎 Sweep Pro: I'm using GPT-4. You have unlimited GPT-4 tickets. (tracking ID: d584755c9d)


Step 1: 🔎 Searching

I found the following snippets in your repository. I will now analyze these snippets and come up with a plan.

Some code snippets I think are relevant in decreasing order of relevance (click to expand). If some file is missing from here, you can mention the path in the ticket description.

"""
on_ticket is the main function that is called when a new issue is created.
It is only called by the webhook handler in sweepai/api.py.
"""
import difflib
import io
import os
import re
import traceback
from typing import Any
import zipfile
from time import time
import markdown
import openai
import requests
import yaml
import yamllint.config as yamllint_config
from github import BadCredentialsException, Github, Repository
from github.Issue import Issue
from github.PullRequest import PullRequest as GithubPullRequest
from loguru import logger
from tabulate import tabulate
from tqdm import tqdm
from yamllint import linter
from sweepai.core.sweep_bot import GHA_PROMPT
from sweepai.agents.pr_description_bot import PRDescriptionBot
from sweepai.agents.image_description_bot import ImageDescriptionBot
from sweepai.config.client import (
RESET_FILE,
RESTART_SWEEP_BUTTON,
REVERT_CHANGED_FILES_TITLE,
SweepConfig,
get_documentation_dict,
get_gha_enabled,
)
from sweepai.config.server import (
DEPLOYMENT_GHA_ENABLED,
ENV,
GITHUB_LABEL_NAME,
IS_SELF_HOSTED,
MONGODB_URI,
PROGRESS_BASE_URL,
)
from sweepai.core.entities import (
AssistantRaisedException,
FileChangeRequest,
MaxTokensExceeded,
NoFilesException,
PullRequest,
SandboxResponse,
)
from sweepai.core.entities import create_error_logs as entities_create_error_logs
from sweepai.core.pr_reader import PRReader
from sweepai.core.sweep_bot import SweepBot, get_files_to_change, get_files_to_change_for_gha, validate_file_change_requests
from sweepai.handlers.create_pr import (
create_config_pr,
create_pr_changes,
safe_delete_sweep_branch,
)
from sweepai.handlers.on_check_suite import clean_gh_logs
from sweepai.utils.image_utils import get_image_contents_from_urls, get_image_urls_from_issue
from sweepai.utils.issue_validator import validate_issue
from sweepai.utils.validate_license import validate_license
from sweepai.utils.buttons import Button, ButtonList, create_action_buttons
from sweepai.utils.chat_logger import ChatLogger
from sweepai.utils.diff import generate_diff
from sweepai.utils.event_logger import posthog
from sweepai.utils.github_utils import (
CURRENT_USERNAME,
ClonedRepo,
convert_pr_draft_field,
get_github_client,
get_token,
sanitize_string_for_github,
)
from sweepai.utils.progress import (
AssistantConversation,
PaymentContext,
TicketContext,
TicketProgress,
TicketProgressStatus,
)
from sweepai.utils.prompt_constructor import HumanMessagePrompt
from sweepai.utils.str_utils import (
BOT_SUFFIX,
FASTER_MODEL_MESSAGE,
UPDATES_MESSAGE,
blockquote,
bot_suffix,
checkbox_template,
clean_logs,
collapsible_template,
create_checkbox,
create_collapsible,
discord_suffix,
format_sandbox_success,
get_hash,
sep,
stars_suffix,
strip_sweep,
to_branch_name,
)
from sweepai.utils.ticket_utils import (
center,
fetch_relevant_files,
fire_and_forget_wrapper,
log_error,
prep_snippets,
)
from sweepai.utils.user_settings import UserSettings
# from sandbox.sandbox_utils import Sandbox
sweeping_gif = """<a href="https://github.com/sweepai/sweep"><img class="swing" src="https://raw.githubusercontent.com/sweepai/sweep/main/.assets/sweeping.gif" width="100" style="width:50px; margin-bottom:10px" alt="Sweeping"></a>"""
custom_config = """
extends: relaxed
rules:
line-length: disable
indentation: disable
"""
INSTRUCTIONS_FOR_REVIEW = """\
### 💡 To get Sweep to edit this pull request, you can:
* Comment below, and Sweep can edit the entire PR
* Comment on a file, Sweep will only modify the commented file
* Edit the original issue to get Sweep to recreate the PR from scratch"""
email_template = """Hey {name},
<br/><br/>
🚀 I just finished creating a pull request for your issue ({repo_full_name}#{issue_number}) at <a href="{pr_url}">{repo_full_name}#{pr_number}</a>!
<br/><br/>
You can view how I created this pull request <a href="{progress_url}">here</a>.
<h2>Summary</h2>
<blockquote>
{summary}
</blockquote>
<h2>Files Changed</h2>
<ul>
{files_changed}
</ul>
{sweeping_gif}
<br/>
Cheers,
<br/>
Sweep
<br/>"""
FAILING_GITHUB_ACTION_PROMPT = """\
The following Github Actions failed on a previous attempt at fixing this issue.
Propose a fix to the failing github actions. You must edit the source code, not the github action itself.
{github_action_log}
"""
# Add :eyes: emoji to ticket
def add_emoji(issue: Issue, comment_id: int = None, reaction_content="eyes"):
item_to_react_to = issue.get_comment(comment_id) if comment_id else issue
item_to_react_to.create_reaction(reaction_content)
# If SWEEP_BOT reacted to item_to_react_to with "rocket", then remove it.
def remove_emoji(issue: Issue, comment_id: int = None, content_to_delete="eyes"):
item_to_react_to = issue.get_comment(comment_id) if comment_id else issue
reactions = item_to_react_to.get_reactions()
for reaction in reactions:
if (
reaction.content == content_to_delete
and reaction.user.login == CURRENT_USERNAME
):
item_to_react_to.delete_reaction(reaction.id)
def create_error_logs(
commit_url_display: str,
sandbox_response: SandboxResponse,
status: str = "✓",
):
return (
(
"<br/>"
+ create_collapsible(
f"Sandbox logs for {commit_url_display} {status}",
blockquote(
"\n\n".join(
[
create_collapsible(
f"<code>{output}</code> {i + 1}/{len(sandbox_response.outputs)} {format_sandbox_success(sandbox_response.success)}",
f"<pre>{clean_logs(output)}</pre>",
i == len(sandbox_response.outputs) - 1,
)
for i, output in enumerate(sandbox_response.outputs)
if len(sandbox_response.outputs) > 0
]
)
),
opened=True,
)
)
if sandbox_response
else ""
)
# takes in a list of workflow runs and returns a single string containing the cleaned logs of the failing steps
def get_failing_gha_logs(runs, installation_id) -> str:
    token = get_token(installation_id)
    all_logs = ""
    for run in runs:
        # jobs_url
        jobs_url = run.jobs_url
        jobs_response = requests.get(
            jobs_url,
            headers={
                "Accept": "application/vnd.github+json",
                "Authorization": f"Bearer {token}",
                "X-GitHub-Api-Version": "2022-11-28",
            },
        )
        if jobs_response.status_code == 200:
            failed_jobs = []
            jobs = jobs_response.json()["jobs"]
            for job in jobs:
                if job["conclusion"] == "failure":
                    failed_jobs.append(job)
            failed_jobs_name_list = []
            for job in failed_jobs:
                # add failed steps
                for step in job["steps"]:
                    if step["conclusion"] == "failure":
                        failed_jobs_name_list.append(
                            f"{job['name']}/{step['number']}_{step['name']}"
                        )
        else:
            logger.error(
                "Failed to get jobs for failing GitHub Actions, possibly a credentials issue"
            )
            return all_logs
        # make sure the run actually has jobs
        if jobs_response.json()["total_count"] == 0:
            logger.error(f"no jobs for this run: {run}, continuing...")
            continue
        # logs url
        logs_url = run.logs_url
        logs_response = requests.get(
            logs_url,
            headers={
                "Accept": "application/vnd.github+json",
                "Authorization": f"Bearer {token}",
                "X-GitHub-Api-Version": "2022-11-28",
            },
            allow_redirects=True,
        )
        # Check if the request was successful
        if logs_response.status_code == 200:
            zip_data = io.BytesIO(logs_response.content)
            zip_file = zipfile.ZipFile(zip_data, "r")
            zip_file_names = zip_file.namelist()
            for file in failed_jobs_name_list:
                if f"{file}.txt" in zip_file_names:
                    logs = zip_file.read(f"{file}.txt").decode("utf-8")
                    logs_prompt = clean_gh_logs(logs)
                    all_logs += logs_prompt + "\n"
        else:
            logger.error(
                "Failed to get logs for failing GitHub Actions, likely a credentials issue"
            )
    return all_logs
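# Illustrative sketch only (not part of the original module). The lookup above
# appears to assume that the downloaded log archive stores each step's log at
# "<job name>/<step number>_<step name>.txt". The helper names
# read_failed_step_logs and build_sample_archive, and the sample entries, are
# hypothetical; the sketch builds a small archive in memory and reads back only
# the entries that were requested and are actually present.
import io
import zipfile
from typing import List


def read_failed_step_logs(archive_bytes: bytes, failed_steps: List[str]) -> str:
    """Concatenate the logs of the given steps, skipping names absent from the archive."""
    collected = ""
    with zipfile.ZipFile(io.BytesIO(archive_bytes), "r") as archive:
        names = set(archive.namelist())
        for step in failed_steps:
            entry = f"{step}.txt"
            if entry in names:
                collected += archive.read(entry).decode("utf-8") + "\n"
    return collected


def build_sample_archive() -> bytes:
    """Create an in-memory zip that mimics the assumed log layout."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("build/1_Set up job.txt", "setup ok")
        archive.writestr("build/2_Run tests.txt", "AssertionError: boom")
    return buffer.getvalue()


# Possible usage: read_failed_step_logs(build_sample_archive(), ["build/2_Run tests"])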
def delete_old_prs(repo: Repository, issue_number: int):
    logger.info("Deleting old PRs...")
    prs = repo.get_pulls(
        state="open",
        sort="created",
        direction="desc",
        base=SweepConfig.get_branch(repo),
    )
    for pr in tqdm(prs.get_page(0)):
        # Check if this issue is mentioned in the PR, and the PR is owned by the bot
        # This is done in create_pr, (pr_description = ...)
        # pr.body is None when the PR has no description
        if pr.user.login == CURRENT_USERNAME and f"Fixes #{issue_number}.\n" in (pr.body or ""):
            safe_delete_sweep_branch(pr, repo)
            break
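# Illustrative sketch only (not part of the original module). delete_old_prs
# relies on the PR description containing the exact marker "Fixes #<issue_number>."
# followed by a newline. The helper name pr_fixes_issue is hypothetical; it
# isolates that check and treats a missing description (None) as "no match".
from typing import Optional


def pr_fixes_issue(pr_body: Optional[str], issue_number: int) -> bool:
    """Return True when the PR body contains the 'Fixes #N.' marker line."""
    return f"Fixes #{issue_number}.\n" in (pr_body or "")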
def construct_sweep_bot(
    repo: Repository,
    repo_name: str,
    issue_url: str,
    repo_description: str,
    title: str,
    message_summary: str,
    cloned_repo: ClonedRepo,
    ticket_progress: TicketProgress,
    chat_logger: ChatLogger,
    snippets: Any = None,
    tree: Any = None,
    comments: Any = None,
) -> SweepBot:
    human_message = HumanMessagePrompt(
        repo_name=repo_name,
        issue_url=issue_url,
        repo_description=repo_description.strip(),
        title=title,
        summary=message_summary,
        snippets=snippets,
        tree=tree,
    )
    sweep_bot = SweepBot.from_system_message_content(
        human_message=human_message,
        repo=repo,
        is_reply=bool(comments),
        chat_logger=chat_logger,
        cloned_repo=cloned_repo,
        ticket_progress=ticket_progress,
    )
    return sweep_bot
def get_comment_header(
    index: int,
    g: Github,
    repo_full_name: str,
    user_settings: UserSettings,
    progress_headers: list[None | str],
    tracking_id: str | None,
    payment_message_start: str,
    user_settings_message: str,
    errored: bool = False,
    pr_message: str = "",
    done: bool = False,
    initial_sandbox_response: int | SandboxResponse = -1,
    initial_sandbox_response_file=None,
    config_pr_url: str | None = None,
):
    config_pr_message = (
        "\n"
        + f"<div align='center'>Install Sweep Configs: <a href='{config_pr_url}'>Pull Request</a></div>"
        if config_pr_url is not None
        else ""
    )
    actions_message = create_action_buttons(
        [
            RESTART_SWEEP_BUTTON,
        ]
    )
    sandbox_execution_message = "\n\n## GitHub Actions failed\n\nThe sandbox appears to be unavailable or down.\n\n"
    # -1 means no sandbox run was attempted, so no sandbox section is shown
    if initial_sandbox_response == -1:
        sandbox_execution_message = ""
    elif initial_sandbox_response is not None:
        repo = g.get_repo(repo_full_name)
        commit_hash = repo.get_commits()[0].sha
        success = initial_sandbox_response.outputs and initial_sandbox_response.success
        status = "✓" if success else "X"
        sandbox_execution_message = (
            "\n\n## GitHub Actions "
            + status
            + "\n\nHere are the GitHub Actions logs prior to making any changes:\n\n"
        )
        sandbox_execution_message += entities_create_error_logs(
            f'<a href="https://github.com/{repo_full_name}/commit/{commit_hash}"><code>{commit_hash[:7]}</code></a>',
            initial_sandbox_response,
            initial_sandbox_response_file,
        )
        if success:
            sandbox_execution_message += f"\n\nSandbox passed on the latest `{repo.default_branch}`, so sandbox checks will be enabled for this issue."
        else:
            sandbox_execution_message += "\n\nSandbox failed, so all sandbox checks will be disabled for this issue."
    if index < 0:
        index = 0
    # index 4 is the final "done" state: show the PR message instead of a progress bar
    if index == 4:
        return (
            pr_message
            + config_pr_message
            + f"\n\n---\n{user_settings.get_message(completed=True)}"
            + f"\n\n---\n{actions_message}"
            + sandbox_execution_message
        )
    # convert the step index into a percentage for the progress bar
    total = len(progress_headers)
    index += 1 if done else 0
    index *= 100 / total
    index = int(index)
    index = min(100, index)
    if errored:
        pbar = f"\n\n<img src='https://progress-bar.dev/{index}/?&title=Errored&width=600' alt='{index}%' />"
        return (
            f"{center(sweeping_gif)}<br/>{center(pbar)}\n\n"
            + f"\n\n---\n{actions_message}"
            + sandbox_execution_message
        )
    pbar = f"\n\n<img src='https://progress-bar.dev/{index}/?&title=Progress&width=600' alt='{index}%' />"
    return (
        f"{center(sweeping_gif)}"
        + (
            center(
                f'\n\n<h2>✨ Track Sweep\'s progress on our <a href="{PROGRESS_BASE_URL}/issues/{tracking_id}">progress dashboard</a>!</h2>'
            )
            if MONGODB_URI is not None
            else ""
        )
        + f"<br/>{center(pbar)}"
        + ("\n" + stars_suffix if index != -1 else "")
        + "\n"
        + center(payment_message_start)
        + f"\n\n---\n{user_settings_message}"
        + config_pr_message
        + f"\n\n---\n{actions_message}"
        + sandbox_execution_message
    )
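# Illustrative sketch only (not part of the original module). The progress-bar
# value in get_comment_header is the step index scaled to a 0-100 percentage.
# The helper name progress_percent is hypothetical; it mirrors the arithmetic
# used above under the assumption that total_headers is len(progress_headers).
def progress_percent(index: int, total_headers: int, done: bool = False) -> int:
    """Scale a step index to a percentage, clamped to the range 0-100."""
    index = max(index, 0)  # negative indices (error/retry) are displayed as 0%
    if done:
        index += 1  # a finished step counts as fully completed
    return min(100, int(index * 100 / total_headers))


# Possible usage: progress_percent(1, 4) gives the value for step 1 of 4 headers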
def on_ticket(
title: str,
summary: str,
issue_number: int,
issue_url: str, # purely for logging purposes
username: str,
repo_full_name: str,
repo_description: str,
installation_id: int,
comment_id: int = None,
edited: bool = False,
tracking_id: str | None = None,
):
if not os.environ.get("CLI"):
assert validate_license(), "License key is invalid or expired. Please contact us at team@sweep.dev to upgrade to an enterprise license."
with logger.contextualize(
tracking_id=tracking_id,
):
if tracking_id is None:
tracking_id = get_hash()
on_ticket_start_time = time()
logger.info(f"Starting on_ticket with title {title} and summary {summary}")
(
title,
slow_mode,
do_map,
subissues_mode,
sandbox_mode,
fast_mode,
lint_mode,
) = strip_sweep(title)
# fetch images from body of issue
image_urls = get_image_urls_from_issue(issue_number, repo_full_name, installation_id)
image_contents = get_image_contents_from_urls(image_urls)
summary = summary or ""
summary = re.sub(
"<details (open)?>(\r)?\n<summary>Checklist</summary>.*",
"",
summary,
flags=re.DOTALL,
).strip()
summary = re.sub(
r"---\s+Checklist:(\r)?\n(\r)?\n- \[[ X]\].*",
"",
summary,
flags=re.DOTALL,
).strip()
summary = re.sub(
"### Details\n\n_No response_", "", summary, flags=re.DOTALL
)
summary = re.sub("\n\n", "\n", summary, flags=re.DOTALL)
repo_name = repo_full_name
user_token, g = get_github_client(installation_id)
repo = g.get_repo(repo_full_name)
current_issue: Issue = repo.get_issue(number=issue_number)
assignee = current_issue.assignee.login if current_issue.assignee else None
if assignee is None:
assignee = current_issue.user.login
ticket_progress = TicketProgress(
tracking_id=tracking_id,
username=username,
context=TicketContext(
title=title,
description=summary,
repo_full_name=repo_full_name,
issue_number=issue_number,
is_public=repo.private is False,
start_time=int(time()),
),
)
branch_match = re.search(
r"([Bb]ranch:) *(?P<branch_name>.+?)(\s|$)", summary
)
overrided_branch_name = None
if branch_match and "branch_name" in branch_match.groupdict():
overrided_branch_name = (
branch_match.groupdict()["branch_name"].strip().strip("`\"'")
)
# TODO: this code might be finicky, might have missed edge cases
if overrided_branch_name.startswith("https://github.com/"):
overrided_branch_name = overrided_branch_name.split("?")[0].split(
"tree/"
)[-1]
SweepConfig.get_branch(repo, overrided_branch_name)
chat_logger = (
ChatLogger(
{
"repo_name": repo_name,
"title": title,
"summary": summary,
"issue_number": issue_number,
"issue_url": issue_url,
"username": (
username if not username.startswith("sweep") else assignee
),
"repo_full_name": repo_full_name,
"repo_description": repo_description,
"installation_id": installation_id,
"type": "ticket",
"mode": ENV,
"comment_id": comment_id,
"edited": edited,
"tracking_id": tracking_id,
},
active=True,
)
if MONGODB_URI
else None
)
if chat_logger and not IS_SELF_HOSTED:
is_paying_user = chat_logger.is_paying_user()
is_consumer_tier = chat_logger.is_consumer_tier()
use_faster_model = chat_logger.use_faster_model()
else:
is_paying_user = True
is_consumer_tier = False
use_faster_model = False
if use_faster_model:
raise Exception(FASTER_MODEL_MESSAGE)
if fast_mode:
use_faster_model = True
if not comment_id and not edited and chat_logger and not sandbox_mode:
fire_and_forget_wrapper(chat_logger.add_successful_ticket)(
gpt3=use_faster_model
)
organization, repo_name = repo_full_name.split("/")
metadata = {
"issue_url": issue_url,
"repo_full_name": repo_full_name,
"organization": organization,
"repo_name": repo_name,
"repo_description": repo_description,
"username": username,
"comment_id": comment_id,
"title": title,
"installation_id": installation_id,
"function": "on_ticket",
"edited": edited,
"model": "gpt-3.5" if use_faster_model else "gpt-4",
"tier": "pro" if is_paying_user else "free",
"mode": ENV,
"slow_mode": slow_mode,
"do_map": do_map,
"subissues_mode": subissues_mode,
"sandbox_mode": sandbox_mode,
"fast_mode": fast_mode,
"is_self_hosted": IS_SELF_HOSTED,
"tracking_id": tracking_id,
}
fire_and_forget_wrapper(posthog.capture)(
username, "started", properties=metadata
)
try:
if current_issue.state == "closed":
fire_and_forget_wrapper(posthog.capture)(
username,
"issue_closed",
properties={
**metadata,
"duration": round(time() - on_ticket_start_time),
},
)
return {"success": False, "reason": "Issue is closed"}
fire_and_forget_wrapper(add_emoji)(current_issue, comment_id)
fire_and_forget_wrapper(remove_emoji)(
current_issue, comment_id, content_to_delete="rocket"
)
fire_and_forget_wrapper(remove_emoji)(
current_issue, comment_id, content_to_delete="confused"
)
fire_and_forget_wrapper(current_issue.edit)(body=summary)
replies_text = ""
summary = summary if summary else ""
fire_and_forget_wrapper(delete_old_prs)(repo, issue_number)
if not sandbox_mode:
progress_headers = [
None,
"Step 1: 🔎 Searching",
"Step 2: ⌨️ Coding",
"Step 3: 🔁 Code Review",
]
else:
progress_headers = [
None,
"📖 Reading File",
"🛠️ Executing Sandbox",
]
issue_comment = None
payment_message, payment_message_start = get_payment_messages(
chat_logger
)
ticket_progress.context.payment_context = PaymentContext(
use_faster_model=use_faster_model,
pro_user=is_paying_user,
daily_tickets_used=(
chat_logger.get_ticket_count(use_date=True)
if chat_logger
else 0
),
monthly_tickets_used=(
chat_logger.get_ticket_count() if chat_logger else 0
),
)
ticket_progress.save()
config_pr_url = None
user_settings = UserSettings.from_username(username=username)
user_settings_message = user_settings.get_message()
cloned_repo = ClonedRepo(
repo_full_name,
installation_id=installation_id,
token=user_token,
repo=repo,
branch=overrided_branch_name,
)
# check that repo's directory is non-empty
if os.listdir(cloned_repo.cached_dir) == []:
logger.info("Empty repo")
first_comment = (
"Sweep is currently not supported on empty repositories. Please add some"
f" code to your repository and try again.\n{sep}##"
f" {progress_headers[1]}\n{bot_suffix}{discord_suffix}"
)
if issue_comment is None:
issue_comment = current_issue.create_comment(
first_comment + BOT_SUFFIX
)
else:
issue_comment.edit(first_comment + BOT_SUFFIX)
fire_and_forget_wrapper(add_emoji)(
current_issue, comment_id, reaction_content="confused"
)
fire_and_forget_wrapper(remove_emoji)(current_issue, comment_id, content_to_delete="eyes")
return {"success": False}
indexing_message = (
"I'm searching for relevant snippets in your repository. If this is your first"
" time using Sweep, I'm indexing your repository. You can monitor the progress using the progress dashboard"
)
first_comment = (
f"{get_comment_header(0, g, repo_full_name, user_settings, progress_headers, tracking_id, payment_message_start, user_settings_message)}\n{sep}I am currently looking into this ticket! I"
" will update the progress of the ticket in this comment. I am currently"
f" searching through your code, looking for relevant snippets.\n{sep}##"
f" {progress_headers[1]}\n{indexing_message}{bot_suffix}{discord_suffix}"
)
# Find Sweep's previous comment
comments = []
for comment in current_issue.get_comments():
comments.append(comment)
if comment.user.login == CURRENT_USERNAME:
issue_comment = comment
break
if issue_comment is None:
issue_comment = current_issue.create_comment(first_comment)
else:
fire_and_forget_wrapper(issue_comment.edit)(first_comment)
old_edit = issue_comment.edit
issue_comment.edit = lambda msg: old_edit(msg + BOT_SUFFIX)
past_messages = {}
current_index = 0
table = None
initial_sandbox_response = -1
initial_sandbox_response_file = None
def refresh_token():
user_token, g = get_github_client(installation_id)
repo = g.get_repo(repo_full_name)
return user_token, g, repo
def edit_sweep_comment(
message: str,
index: int,
pr_message="",
done=False,
add_bonus_message=True,
):
nonlocal current_index, user_token, g, repo, issue_comment, initial_sandbox_response, initial_sandbox_response_file
message = sanitize_string_for_github(message)
if pr_message:
pr_message = sanitize_string_for_github(pr_message)
# -1 = error, -2 = retry
# Only update the progress bar if the issue generation errors.
errored = index == -1
if index >= 0:
past_messages[index] = message
current_index = index
agg_message = None
# Include progress history
# index = -2 is reserved for retries (see the index legend above)
for i in range(
current_index + 2
): # go to next header (for Working on it... text)
if i == 0 or i >= len(progress_headers):
continue # skip None header
header = progress_headers[i]
if header is not None:
header = "## " + header + "\n"
else:
header = "No header\n"
msg = header + (past_messages.get(i) or "Working on it...")
if agg_message is None:
agg_message = msg
else:
agg_message = agg_message + f"\n{sep}" + msg
suffix = bot_suffix + discord_suffix
if errored:
agg_message = (
"## ❌ Unable to Complete PR"
+ "\n"
+ message
+ (
"\n\nFor bonus GPT-4 tickets, please report this bug on"
f" **[Discord](https://discord.gg/invite/sweep)** (tracking ID: `{tracking_id}`)."
if add_bonus_message
else ""
)
)
if table is not None:
agg_message = (
agg_message
+ f"\n{sep}Please look at the generated plan. If something looks"
f" wrong, please add more details to your issue.\n\n{table}"
)
suffix = bot_suffix # don't include discord suffix for error messages
# Update the issue comment
msg = f"{get_comment_header(current_index, g, repo_full_name, user_settings, progress_headers, tracking_id, payment_message_start, user_settings_message, errored=errored, pr_message=pr_message, done=done, initial_sandbox_response=initial_sandbox_response, initial_sandbox_response_file=initial_sandbox_response_file, config_pr_url=config_pr_url)}\n{sep}{agg_message}{suffix}"
try:
issue_comment.edit(msg)
except BadCredentialsException:
logger.error(
f"Bad credentials, refreshing token (tracking ID: `{tracking_id}`)"
)
user_token, g = get_github_client(installation_id)
repo = g.get_repo(repo_full_name)
issue_comment = None
for comment in comments:
if comment.user.login == CURRENT_USERNAME:
issue_comment = comment
current_issue = repo.get_issue(number=issue_number)
if issue_comment is None:
issue_comment = current_issue.create_comment(msg)
else:
issue_comment = [
comment
for comment in current_issue.get_comments()
if comment.user.login == CURRENT_USERNAME
][0]
issue_comment.edit(msg)
if use_faster_model:
edit_sweep_comment(
FASTER_MODEL_MESSAGE, -1, add_bonus_message=False
)
posthog.capture(
username,
"ran_out_of_tickets",
properties={
**metadata,
"duration": round(time() - on_ticket_start_time),
},
)
fire_and_forget_wrapper(add_emoji)(
current_issue, comment_id, reaction_content="confused"
)
fire_and_forget_wrapper(remove_emoji)(current_issue, comment_id, content_to_delete="eyes")
return {
"success": False,
"error_message": "We deprecated supporting GPT 3.5.",
}
error_message = validate_issue(title + summary)
if error_message:
logger.warning(f"Validation error: {error_message}")
edit_sweep_comment(
(
f"The issue was rejected with the following response:\n\n{blockquote(error_message)}"
),
-1,
)
fire_and_forget_wrapper(add_emoji)(
current_issue, comment_id, reaction_content="confused"
)
fire_and_forget_wrapper(remove_emoji)(current_issue, comment_id, content_to_delete="eyes")
posthog.capture(
username,
"invalid_issue",
properties={
**metadata,
"duration": round(time() - on_ticket_start_time),
},
)
return {"success": True}
prs_extracted = PRReader.extract_prs(repo, summary)
message_summary = summary
if prs_extracted:
message_summary += "\n\n" + prs_extracted
edit_sweep_comment(
create_collapsible(
"I found that you mentioned the following Pull Requests that might be important:",
blockquote(
prs_extracted,
),
),
1,
)
try:
# search/context manager
logger.info("Searching for relevant snippets...")
if image_contents: # doing it here to avoid editing the original issue
message_summary += ImageDescriptionBot().describe_images(text=title + message_summary, images=image_contents)
snippets, tree, _, repo_context_manager = fetch_relevant_files(
cloned_repo,
title,
message_summary,
replies_text,
username,
metadata,
on_ticket_start_time,
tracking_id,
is_paying_user,
is_consumer_tier,
issue_url,
chat_logger,
ticket_progress,
images=image_contents
)
cloned_repo = repo_context_manager.cloned_repo
except Exception as e:
edit_sweep_comment(
(
"It looks like an issue has occurred around fetching the files."
f" The exception was {str(e)}. If this error persists"
f" contact team@sweep.dev.\n\n> @{username}, editing this issue description to include more details will automatically make me relaunch. Please join our Discord server for support (tracking_id={tracking_id})"
),
-1,
)
raise Exception("Failed to fetch files") from e
user_token, g, repo = refresh_token()
cloned_repo.token = user_token
ticket_progress.search_progress.indexing_progress = (
ticket_progress.search_progress.indexing_total
)
ticket_progress.status = TicketProgressStatus.PLANNING
ticket_progress.save()
# Fall back to a placeholder when the repository has no description
if not repo_description:
repo_description = "No description provided."
message_summary += replies_text
get_documentation_dict(repo)
docs_results = ""
sweep_bot = construct_sweep_bot(
repo=repo,
repo_name=repo_name,
issue_url=issue_url,
repo_description=repo_description,
title=title,
message_summary=message_summary,
cloned_repo=cloned_repo,
ticket_progress=ticket_progress,
chat_logger=chat_logger,
snippets=snippets,
tree=tree,
comments=comments,
)
# Check repository for sweep.yaml file.
sweep_yml_exists = False
sweep_yml_failed = False
for content_file in repo.get_contents(""):
if content_file.name == "sweep.yaml":
sweep_yml_exists = True
# Check if YAML is valid
yaml_content = content_file.decoded_content.decode("utf-8")
sweep_yaml_dict = {}
try:
sweep_yaml_dict = yaml.safe_load(yaml_content) or {}  # safe_load returns None for an empty file
except Exception:
logger.error(f"Failed to load YAML file: {yaml_content}")
if len(sweep_yaml_dict) > 0:
break
linter_config = yamllint_config.YamlLintConfig(custom_config)
problems = list(linter.run(yaml_content, linter_config))
if problems:
errors = [
f"Line {problem.line}: {problem.desc} (rule: {problem.rule})"
for problem in problems
]
error_message = "\n".join(errors)
markdown_error_message = f"**There is something wrong with your [sweep.yaml](https://github.com/{repo_full_name}/blob/main/sweep.yaml):**\n```\n{error_message}\n```"
sweep_yml_failed = True
logger.error(markdown_error_message)
edit_sweep_comment(markdown_error_message, -1)
else:
logger.info("The YAML file is valid. No errors found.")
break
# If sweep.yaml does not exist, then create a new PR that simply creates the sweep.yaml file.
if not sweep_yml_exists:
try:
logger.info("Creating sweep.yaml file...")
config_pr = create_config_pr(sweep_bot, cloned_repo=cloned_repo)
config_pr_url = config_pr.html_url
edit_sweep_comment(message="", index=-2)
except Exception as e:
logger.error(
f"Failed to create new branch for sweep.yaml file.\n{e}\n{traceback.format_exc()}"
)
else:
logger.info("sweep.yaml file already exists.")
try:
# ANALYZE SNIPPETS
newline = "\n"
edit_sweep_comment(
"I found the following snippets in your repository. I will now analyze"
" these snippets and come up with a plan."
+ "\n\n"
+ create_collapsible(
"Some code snippets I think are relevant in decreasing order of relevance (click to expand). If some file is missing from here, you can mention the path in the ticket description.",
"\n".join(
[
f"https://github.com/{organization}/{repo_name}/blob/{repo.get_commits()[0].sha}/{snippet.file_path}#L{max(snippet.start, 1)}-L{min(snippet.end, snippet.content.count(newline) - 1)}\n"
for snippet in snippets
]
),
)
+ (
create_collapsible(
"I also found that you mentioned the following Pull Requests that may be helpful:",
blockquote(prs_extracted),
)
if prs_extracted
else ""
)
+ (f"\n\n{docs_results}\n\n" if docs_results else ""),
1,
)
logger.info("Fetching files to modify/create...")
file_change_requests, plan = get_files_to_change(
relevant_snippets=repo_context_manager.current_top_snippets,
read_only_snippets=repo_context_manager.read_only_snippets,
problem_statement=f"{title}\n\n{message_summary}",
repo_name=repo_full_name,
cloned_repo=cloned_repo,
images=image_contents
)
validate_file_change_requests(file_change_requests, cloned_repo)
ticket_progress.planning_progress.file_change_requests = (
file_change_requests
)
ticket_progress.coding_progress.file_change_requests = (
file_change_requests
)
ticket_progress.coding_progress.assistant_conversations = [
AssistantConversation() for fcr in file_change_requests
]
ticket_progress.status = TicketProgressStatus.CODING
ticket_progress.save()
if not file_change_requests:
if len(title + summary) < 60:
edit_sweep_comment(
(
"Sorry, I could not find any files to modify, can you please"
" provide more details? Please make sure that the title and"
" summary of the issue are at least 60 characters."
),
-1,
)
else:
edit_sweep_comment(
(
"Sorry, I could not find any files to modify, can you please"
" provide more details?"
),
-1,
)
raise Exception("No files to modify.")
file_change_requests: list[
FileChangeRequest
] = sweep_bot.validate_file_change_requests(
file_change_requests,
)
ticket_progress.planning_progress.file_change_requests = (
file_change_requests
)
ticket_progress.coding_progress.assistant_conversations = [
AssistantConversation() for fcr in file_change_requests
]
ticket_progress.save()
table = tabulate(
[
[
file_change_request.entity_display,
file_change_request.instructions_display.replace(
"\n", "<br/>"
).replace("```", "\\```"),
]
for file_change_request in file_change_requests
if file_change_request.change_type != "check"
],
headers=["File Path", "Proposed Changes"],
tablefmt="pipe",
)
logger.info("Generating PR...")
pull_request = PullRequest(
title="Sweep: " + title,
branch_name="sweep/" + to_branch_name(title),
content="",
)
logger.info("Making PR...")
ticket_progress.context.branch_name = pull_request.branch_name
ticket_progress.save()
files_progress: list[tuple[str, str, str, str]] = [
(
file_change_request.entity_display,
file_change_request.instructions_display,
"⏳ In Progress",
"",
)
for file_change_request in file_change_requests
]
checkboxes_progress: list[tuple[str, str, str]] = [
(
file_change_request.entity_display,
file_change_request.instructions_display,
" ",
)
for file_change_request in file_change_requests
if not file_change_request.change_type == "check"
]
checkboxes_contents = "\n".join(
[
create_checkbox(
f"`{filename}`", blockquote(instructions), check == "X"
)
for filename, instructions, check in checkboxes_progress
]
)
create_collapsible("Checklist", checkboxes_contents, opened=True)
file_change_requests[0].status = "running"
condensed_checkboxes_contents = "\n".join(
[
create_checkbox(f"`{filename}`", "", check == "X").strip()
for filename, instructions, check in checkboxes_progress
]
)
condensed_checkboxes_collapsible = create_collapsible(
"Checklist", condensed_checkboxes_contents, opened=True
)
current_issue = repo.get_issue(number=issue_number)
current_issue.edit(
body=summary + "\n\n" + condensed_checkboxes_collapsible
)
delete_branch = False
generator = create_pr_changes(
file_change_requests,
pull_request,
sweep_bot,
username,
installation_id,
issue_number,
chat_logger=chat_logger,
base_branch=overrided_branch_name,
additional_messages=[],
)
edit_sweep_comment(checkboxes_contents, 2)
if not file_change_requests:
raise NoFilesException()
response = {
"error": Exception(
f"Sweep failed to generate any file change requests! This could mean that Sweep failed to find the correct lines of code to modify or that GPT-4 did not respond in our specified format. Sometimes, retrying will fix this error. Otherwise, reach out to our Discord server for support (tracking_id={tracking_id})."
)
}
changed_files = []
for item in generator:
if isinstance(item, dict):
response = item
break
(
new_file_contents,
_,
commit,
file_change_requests,
) = item
# append all files that have been changed
if new_file_contents:
for file_name, _ in new_file_contents.items():
changed_files.append(file_name)
commit_hash: str = (
commit
if isinstance(commit, str)
else (
commit.sha
if commit is not None
else repo.get_branch(
pull_request.branch_name
).commit.sha
)
)
commit_url = (
f"https://github.com/{repo_full_name}/commit/{commit_hash}"
)
commit_url_display = (
f"<a href='{commit_url}'><code>{commit_hash[:7]}</code></a>"
)
create_error_logs(
commit_url_display,
None,
status=(
"✓"
),
)
checkboxes_progress = [
(
file_change_request.display_summary
+ " "
+ file_change_request.status_display
+ " "
+ (file_change_request.commit_hash_url or "")
+ f" [Edit]({file_change_request.get_edit_url(repo.full_name, pull_request.branch_name)})",
file_change_request.instructions_ticket_display
+ f"\n\n{file_change_request.diff_display}",
(
"X"
if file_change_request.status
in ("succeeded", "failed")
else " "
),
)
for file_change_request in file_change_requests
]
checkboxes_contents = "\n".join(
[
checkbox_template.format(
check=check,
filename=filename,
instructions=blockquote(instructions),
)
for filename, instructions, check in checkboxes_progress
]
)
collapsible_template.format(
summary="Checklist",
body=checkboxes_contents,
opened="open",
)
condensed_checkboxes_contents = "\n".join(
[
checkbox_template.format(
check=check,
filename=filename,
instructions="",
).strip()
for filename, instructions, check in checkboxes_progress
if not instructions.lower().startswith("run")
]
)
condensed_checkboxes_collapsible = collapsible_template.format(
summary="Checklist",
body=condensed_checkboxes_contents,
opened="open",
)
try:
current_issue = repo.get_issue(number=issue_number)
except BadCredentialsException:
user_token, g, repo = refresh_token()
cloned_repo.token = user_token
current_issue.edit(
body=summary + "\n\n" + condensed_checkboxes_collapsible
)
logger.info(files_progress)
edit_sweep_comment(checkboxes_contents, 2)
if not response.get("success"):
raise Exception(f"Failed to create PR: {response.get('error')}")
checkboxes_contents = "\n".join(
[
checkbox_template.format(
check=check,
filename=filename,
instructions=blockquote(instructions),
)
for filename, instructions, check in checkboxes_progress
]
)
condensed_checkboxes_contents = "\n".join(
[
checkbox_template.format(
check=check,
filename=filename,
instructions="",
).strip()
for filename, instructions, check in checkboxes_progress
if not instructions.lower().startswith("run")
]
)
condensed_checkboxes_collapsible = collapsible_template.format(
summary="Checklist",
body=condensed_checkboxes_contents,
opened="open",
)
for _ in range(3):
try:
current_issue.edit(
body=summary + "\n\n" + condensed_checkboxes_collapsible
)
break
except Exception:
from time import sleep
sleep(1)
edit_sweep_comment(checkboxes_contents, 2)
pr_changes = response["pull_request"]
# change the body here
diff_text = get_branch_diff_text(
repo=repo,
branch=pull_request.branch_name,
base_branch=overrided_branch_name,
)
new_description = PRDescriptionBot().describe_diffs(
diff_text,
pull_request.title,
)
# TODO: update the title as well
if new_description:
pr_changes.body = (
f"{new_description}\n\nFixes"
f" #{issue_number}.\n\n---\n\n{UPDATES_MESSAGE}\n\n---\n\n{INSTRUCTIONS_FOR_REVIEW}{BOT_SUFFIX}"
)
edit_sweep_comment(
"I have finished coding the issue. I am now reviewing it for completeness.",
3,
)
change_location = f" [`{pr_changes.pr_head}`](https://github.com/{repo_full_name}/commits/{pr_changes.pr_head}).\n\n"
review_message = (
"Here are my self-reviews of my changes at" + change_location
)
try:
fire_and_forget_wrapper(remove_emoji)(current_issue, comment_id, content_to_delete="eyes")
except Exception:
pass
changes_required, review_message = False, ""
if changes_required:
edit_sweep_comment(
review_message
+ "\n\nI finished incorporating these changes.",
3,
)
else:
edit_sweep_comment(
f"I have finished reviewing the code for completeness. I did not find errors for {change_location}",
3,
)
revert_buttons = []
for changed_file in set(changed_files):
revert_buttons.append(
Button(label=f"{RESET_FILE} {changed_file}")
)
revert_buttons_list = ButtonList(
buttons=revert_buttons, title=REVERT_CHANGED_FILES_TITLE
)
# delete failing sweep yaml if applicable
if sweep_yml_failed:
try:
repo.delete_file(
"sweep.yaml",
"Delete failing sweep.yaml",
branch=pr_changes.pr_head,
sha=repo.get_contents("sweep.yaml").sha,
)
except Exception:
pass
# create the PR directly as a regular (non-draft) PR
pr: GithubPullRequest = repo.create_pull(
title=pr_changes.title,
body=pr_changes.body,
head=pr_changes.pr_head,
base=overrided_branch_name or SweepConfig.get_branch(repo),
# removed draft PR
draft=False,
)
try:
pr.add_to_assignees(username)
except Exception as e:
logger.error(
f"Failed to add assignee {username}: {e}, probably a bot."
)
ticket_progress.status = TicketProgressStatus.COMPLETE
ticket_progress.context.done_time = time()
ticket_progress.context.pr_id = pr.number
ticket_progress.save()
if revert_buttons:
pr.create_issue_comment(
revert_buttons_list.serialize() + BOT_SUFFIX
)
# add comments before labelling
pr.add_to_labels(GITHUB_LABEL_NAME)
current_issue.create_reaction("rocket")
heres_pr_message = f'<h1 align="center">🚀 Here\'s the PR! <a href="{pr.html_url}">#{pr.number}</a></h1>'
progress_message = f'<div align="center"><b>See Sweep\'s progress at <a href="{PROGRESS_BASE_URL}/issues/{tracking_id}">the progress dashboard</a>!</b></div>'
edit_sweep_comment(
review_message + "\n\nSuccess! 🚀",
4,
pr_message=(
f"{center(heres_pr_message)}\n{center(progress_message)}\n{center(payment_message_start)}"
),
done=True,
)
user_settings = UserSettings.from_username(username=username)
user = g.get_user(username)
full_name = user.name or user.login
name = full_name.split(" ")[0]
files_changed = []
for fcr in file_change_requests:
if fcr.change_type in ("create", "modify"):
diff = list(
difflib.unified_diff(
(fcr.old_content or "").splitlines() or [],
(fcr.new_content or "").splitlines() or [],
lineterm="",
)
)
added = sum(
1
for line in diff
if line.startswith("+") and not line.startswith("+++")
)
removed = sum(
1
for line in diff
if line.startswith("-") and not line.startswith("---")
)
files_changed.append(
f"<code>{fcr.filename}</code> (+{added}/-{removed})"
)
user_settings.send_email(
subject=f"Sweep Pull Request Complete for {repo_name}#{issue_number} {title}",
html=email_template.format(
name=name,
pr_url=pr.html_url,
issue_number=issue_number,
repo_full_name=repo_full_name,
pr_number=pr.number,
progress_url=f"{PROGRESS_BASE_URL}/issues/{tracking_id}",
summary=markdown.markdown(pr_changes.body),
files_changed="\n".join(
[f"<li>{item}</li>" for item in files_changed]
),
sweeping_gif=sweeping_gif,
),
)
# poll for github to check when gha are done
total_poll_attempts = 0
total_edit_attempts = 0
SLEEP_DURATION_SECONDS = 15
GITHUB_ACTIONS_ENABLED = get_gha_enabled(repo=repo) and DEPLOYMENT_GHA_ENABLED
GHA_MAX_EDIT_ATTEMPTS = 5 # max number of times to edit PR
current_commit = pr.head.sha
while GITHUB_ACTIONS_ENABLED:
logger.info(
f"Polling to see if Github Actions have finished... {total_poll_attempts}"
)
# we wait at most 60 minutes
if total_poll_attempts * SLEEP_DURATION_SECONDS // 60 >= 60:
break
else:
# wait SLEEP_DURATION_SECONDS between check attempts
total_poll_attempts += 1
from time import sleep
sleep(SLEEP_DURATION_SECONDS)
runs = list(repo.get_workflow_runs(branch=pr.head.ref, head_sha=current_commit))
# if all runs have succeeded, break
if all([run.conclusion == "success" for run in runs]):
break
# if any of them have failed we retry
if any([run.conclusion == "failure" for run in runs]):
failed_runs = [
run for run in runs if run.conclusion == "failure"
]
failed_gha_logs: list[str] = get_failing_gha_logs(
failed_runs,
installation_id,
)
if failed_gha_logs:
# make edits to the PR
# TODO: look into rollbacks so we don't continue adding onto errors
cloned_repo = ClonedRepo( # reinitialize cloned_repo to avoid conflicts
repo_full_name,
installation_id=installation_id,
token=user_token,
repo=repo,
branch=pr.head.ref,
)
diffs = get_branch_diff_text(repo=repo, branch=pr.head.ref, base_branch=pr.base.ref)
problem_statement = f"{title}\n{message_summary}\n{replies_text}"
all_information_prompt = GHA_PROMPT.format(
problem_statement=problem_statement,
github_actions_logs=failed_gha_logs,
changes_made=diffs,
)
repo_context_manager = prep_snippets(cloned_repo=cloned_repo, query=(title + message_summary + replies_text).strip("\n"), ticket_progress=ticket_progress) # need to do this, can use the old query for speed
sweep_bot: SweepBot = construct_sweep_bot(
repo=repo,
repo_name=repo_name,
issue_url=issue_url,
repo_description=repo_description,
title="Fix the following errors to complete the user request.",
message_summary=all_information_prompt,
cloned_repo=cloned_repo,
ticket_progress=ticket_progress,
chat_logger=chat_logger,
snippets=snippets,
tree=tree,
comments=comments,
)
file_change_requests, plan = get_files_to_change_for_gha(
relevant_snippets=repo_context_manager.current_top_snippets,
read_only_snippets=repo_context_manager.read_only_snippets,
problem_statement=all_information_prompt,
updated_files=new_file_contents,
cloned_repo=cloned_repo,
chat_logger=chat_logger,
)
validate_file_change_requests(file_change_requests, cloned_repo)
previous_modify_files_dict: dict[str, dict[str, str | list[str]]] | None = None
_, commit, _ = sweep_bot.handle_modify_file_main(
branch=pr.head.ref,
assistant_conversation=None,
additional_messages=[],
previous_modify_files_dict=previous_modify_files_dict,
file_change_requests=file_change_requests,
username=username
)
current_commit = commit.sha
pr = repo.get_pull(pr.number) # IMPORTANT: resync PR otherwise you'll fetch old GHA runs
total_edit_attempts += 1
if total_edit_attempts >= GHA_MAX_EDIT_ATTEMPTS:
logger.info(f"Tried to edit PR {GHA_MAX_EDIT_ATTEMPTS} times, giving up.")
break
# if none of the runs have completed we wait and poll github
logger.info(
f"No Github Actions have failed yet and not all have succeeded yet, waiting for {SLEEP_DURATION_SECONDS} seconds before polling again..."
)
# break from main for loop
convert_pr_draft_field(pr, is_draft=False, installation_id=installation_id)
except MaxTokensExceeded as e:
logger.info("Max tokens exceeded")
ticket_progress.status = TicketProgressStatus.ERROR
ticket_progress.error_message = "Max tokens exceeded. Feel free to add more details to the issue description for Sweep to better address it, or alternatively, reach out to Kevin or William for help at https://discord.gg/sweep."
ticket_progress.save()
log_error(
is_paying_user,
is_consumer_tier,
username,
issue_url,
"Max Tokens Exceeded",
str(e) + "\n" + traceback.format_exc(),
priority=2,
)
if chat_logger and chat_logger.is_paying_user():
edit_sweep_comment(
(
f"Sorry, I could not edit `{e.filename}` as this file is too long."
" We are currently working on improved file streaming to address"
" this issue.\n"
),
-1,
)
else:
edit_sweep_comment(
(
f"Sorry, I could not edit `{e.filename}` as this file is too"
" long.\n\nIf this file is incorrect, please describe the desired"
" file in the prompt. However, if you would like to edit longer"
" files, consider upgrading to [Sweep Pro](https://sweep.dev/) for"
" longer context lengths.\n"
),
-1,
)
delete_branch = True
raise e
except NoFilesException as e:
ticket_progress.status = TicketProgressStatus.ERROR
ticket_progress.error_message = "Sweep could not find files to modify to address this issue. Feel free to add more details to the issue description for Sweep to better address it, or alternatively, reach out to Kevin or William for help at https://discord.gg/sweep."
ticket_progress.save()
logger.info("Sweep could not find files to modify")
log_error(
is_paying_user,
is_consumer_tier,
username,
issue_url,
"Sweep could not find files to modify",
str(e) + "\n" + traceback.format_exc(),
priority=2,
)
edit_sweep_comment(
(
"Sorry, Sweep could not find any appropriate files to edit to address"
" this issue. If this is a mistake, please provide more context and Sweep"
f" will retry!\n\n@{username}, please edit the issue description to"
" include more details. You can also ask for help on our community"
" forum: https://community.sweep.dev/"
),
-1,
)
delete_branch = True
raise e
except openai.BadRequestError as e:
ticket_progress.status = TicketProgressStatus.ERROR
ticket_progress.error_message = "Sorry, it looks like there is an error with communicating with OpenAI. If this error persists, reach out to Kevin or William for help at https://discord.gg/sweep."
ticket_progress.save()
logger.error(traceback.format_exc())
logger.error(e)
edit_sweep_comment(
(
"I'm sorry, but it looks like our model has run out of context length. We're"
" trying to make this happen less, but one way to mitigate this is to"
" code smaller files. If this error persists report it at"
" https://discord.gg/sweep."
),
-1,
)
log_error(
is_paying_user,
is_consumer_tier,
username,
issue_url,
"Context Length",
str(e) + "\n" + traceback.format_exc(),
priority=2,
)
posthog.capture(
username,
"failed",
properties={
"error": str(e),
"trace": traceback.format_exc(),
"reason": "Invalid request error / context length",
**metadata,
"duration": round(time() - on_ticket_start_time),
},
)
delete_branch = True
raise e
except AssistantRaisedException as e:
if ticket_progress is not None:
ticket_progress.status = TicketProgressStatus.ERROR
ticket_progress.error_message = f"Sweep raised an error with the following message: {e.message}. Feel free to add more details to the issue description for Sweep to better address it, or alternatively, reach out to Kevin or William for help at https://discord.gg/sweep."
ticket_progress.save()
logger.exception(e)
edit_sweep_comment(
f"Sweep raised an error with the following message:\n{blockquote(e.message)}",
-1,
)
log_error(
is_paying_user,
is_consumer_tier,
username,
issue_url,
"Workflow",
str(e) + "\n" + traceback.format_exc(),
priority=1,
)
raise e
except Exception as e:
ticket_progress.status = TicketProgressStatus.ERROR
ticket_progress.error_message = f"Internal server error: {str(e)}. Feel free to add more details to the issue description for Sweep to better address it, or alternatively, reach out to Kevin or William for help at https://discord.gg/sweep."
ticket_progress.save()
logger.error(traceback.format_exc())
logger.error(e)
# title and summary are defined elsewhere
if len(title + summary) < 60:
edit_sweep_comment(
(
"I'm sorry, but it looks like an error occurred due to"
f" a planning failure. The error message is {str(e)}. Feel free to add more details to the issue description"
" so Sweep can better address it. Alternatively, post on our community forum"
" for assistance: https://community.sweep.dev/"
),
-1,
)
else:
edit_sweep_comment(
(
"I'm sorry, but it looks like an error has occurred due to"
+ f" a planning failure. The error message is {str(e)}. Feel free to add more details to the issue description"
+ " so Sweep can better address it. Alternatively, reach out to Kevin or William for help at"
+ " https://discord.gg/sweep."
),
-1,
)
log_error(
is_paying_user,
is_consumer_tier,
username,
issue_url,
"Workflow",
str(e) + "\n" + traceback.format_exc(),
priority=1,
)
raise e
else:
try:
fire_and_forget_wrapper(remove_emoji)(content_to_delete="eyes")
fire_and_forget_wrapper(add_emoji)("rocket")
except SystemExit:
raise SystemExit
except Exception as e:
logger.error(e)
if delete_branch:
try:
if pull_request.branch_name.startswith("sweep"):
repo.get_git_ref(
f"heads/{pull_request.branch_name}"
).delete()
else:
raise Exception(
f"Branch name {pull_request.branch_name} does not start with sweep/"
)
except Exception as e:
logger.error(e)
logger.error(traceback.format_exc())
logger.info(f"Deleted branch {pull_request.branch_name}")
except Exception as e:
posthog.capture(
username,
"failed",
properties={
**metadata,
"error": str(e),
"trace": traceback.format_exc(),
"duration": round(time() - on_ticket_start_time),
},
)
raise e
posthog.capture(
username,
"success",
properties={**metadata, "duration": round(time() - on_ticket_start_time)},
)
logger.info("on_ticket success in " + str(round(time() - on_ticket_start_time)))
return {"success": True}
def handle_sandbox_mode(
title, repo_full_name, repo, ticket_progress, edit_sweep_comment
):
logger.info("Running in sandbox mode")
sweep_bot = SweepBot(repo=repo, ticket_progress=ticket_progress)
logger.info("Getting file contents")
file_name = title.split(":")[1].strip()
file_contents = sweep_bot.get_contents(file_name).decoded_content.decode("utf-8")
try:
ext = file_name.split(".")[-1]
except Exception:
ext = ""
file_contents.replace("```", "\`\`\`")
sha = repo.get_branch(repo.default_branch).commit.sha
permalink = f"https://github.com/{repo_full_name}/blob/{sha}/{file_name}#L1-L{len(file_contents.splitlines())}"
logger.info("Running sandbox")
edit_sweep_comment(
f"Running sandbox for {file_name}. Current Code:\n\n{permalink}",
1,
)
updated_contents, sandbox_response = sweep_bot.check_sandbox(
file_name, file_contents
)
logger.info("Sandbox finished")
logs = (
(
"<br/>"
+ create_collapsible(
"Sandbox logs",
blockquote(
"\n\n".join(
[
create_collapsible(
f"<code>{output}</code> {i + 1}/{len(sandbox_response.outputs)} {format_sandbox_success(sandbox_response.success)}",
f"<pre>{clean_logs(output)}</pre>",
i == len(sandbox_response.outputs) - 1,
)
for i, output in enumerate(sandbox_response.outputs)
if len(sandbox_response.outputs) > 0
]
)
),
opened=True,
)
)
if sandbox_response
else ""
)
updated_contents = updated_contents.replace("```", "\`\`\`")
diff = generate_diff(file_contents, updated_contents).replace("```", "\`\`\`")
diff_display = (
f"Updated Code:\n\n```{ext}\n{updated_contents}```\nDiff:\n```diff\n{diff}\n```"
if diff
else f"Sandbox made no changes to {file_name} (formatters were not configured or Sweep didn't make changes)."
)
edit_sweep_comment(
f"{logs}\n{diff_display}",
2,
)
edit_sweep_comment("N/A", 3)
logger.info("Sandbox comments updated")
def get_branch_diff_text(repo, branch, base_branch=None):
    base_branch = base_branch or SweepConfig.get_branch(repo)
    comparison = repo.compare(base_branch, branch)
    file_diffs = comparison.files
    pr_diffs = []
    for file in file_diffs:
        diff = file.patch
        if (
            file.status == "added"
            or file.status == "modified"
            or file.status == "removed"
        ):
            pr_diffs.append((file.filename, diff))
        else:
            logger.info(
                f"File status {file.status} not recognized"
            )  # TODO(sweep): We don't handle renamed files
    return "\n".join([f"{filename}\n{diff}" for filename, diff in pr_diffs])
def get_payment_messages(chat_logger: ChatLogger):
if chat_logger:
is_paying_user = chat_logger.is_paying_user()
is_consumer_tier = chat_logger.is_consumer_tier()
use_faster_model = chat_logger.use_faster_model()
else:
is_paying_user = True
is_consumer_tier = False
use_faster_model = False
tracking_id = chat_logger.data["tracking_id"] if MONGODB_URI is not None else None
# Find the first comment made by the bot
tickets_allocated = 5
if is_consumer_tier:
tickets_allocated = 15
if is_paying_user:
tickets_allocated = 500
purchased_ticket_count = (
chat_logger.get_ticket_count(purchased=True) if chat_logger else 0
)
ticket_count = (
max(tickets_allocated - chat_logger.get_ticket_count(), 0)
+ purchased_ticket_count
if chat_logger
else 999
)
daily_ticket_count = (
(3 - chat_logger.get_ticket_count(use_date=True) if not use_faster_model else 0)
if chat_logger
else 999
)
model_name = "GPT-4"
single_payment_link = "https://buy.stripe.com/00g3fh7qF85q0AE14d"
pro_payment_link = "https://buy.stripe.com/00g5npeT71H2gzCfZ8"
daily_message = (
f" and {daily_ticket_count} for the day"
if not is_paying_user and not is_consumer_tier
else ""
)
user_type = "💎 <b>Sweep Pro</b>" if is_paying_user else "⚡ <b>Sweep Basic Tier</b>"
gpt_tickets_left_message = (
f"{ticket_count} GPT-4 tickets left for the month"
if not is_paying_user
else "unlimited GPT-4 tickets"
)
purchase_message = f"<br/><br/> For more GPT-4 tickets, visit <a href={single_payment_link}>our payment portal</a>. For a one week free trial, try <a href={pro_payment_link}>Sweep Pro</a> (unlimited GPT-4 tickets)."
payment_message = (
f"{user_type}: I used {model_name} to create this ticket. You have {gpt_tickets_left_message}{daily_message}. (tracking ID: <code>{tracking_id}</code>)"
+ (purchase_message if not is_paying_user else "")
)
payment_message_start = (
f"{user_type}: I'm using {model_name}. You have {gpt_tickets_left_message}{daily_message}. (tracking ID: <code>{tracking_id}</code>)"
+ (purchase_message if not is_paying_user else "")
)

import base64
import os
from dotenv import load_dotenv
from loguru import logger
logger.print = logger.info
load_dotenv(dotenv_path=".env", override=True, verbose=True)
os.environ["GITHUB_APP_PEM"] = os.environ.get("GITHUB_APP_PEM") or base64.b64decode(
os.environ.get("GITHUB_APP_PEM_BASE64", "")
).decode("utf-8")
if os.environ["GITHUB_APP_PEM"]:
os.environ["GITHUB_APP_ID"] = (
(os.environ.get("GITHUB_APP_ID") or os.environ.get("APP_ID"))
.replace("\\n", "\n")
.strip('"')
)
TEST_BOT_NAME = "sweep-nightly[bot]"
ENV = os.environ.get("ENV", "dev")
BOT_TOKEN_NAME = "bot-token"
# goes under Modal 'discord' secret name (optional, can leave env var blank)
DISCORD_WEBHOOK_URL = os.environ.get("DISCORD_WEBHOOK_URL")
DISCORD_MEDIUM_PRIORITY_URL = os.environ.get("DISCORD_MEDIUM_PRIORITY_URL")
DISCORD_LOW_PRIORITY_URL = os.environ.get("DISCORD_LOW_PRIORITY_URL")
DISCORD_FEEDBACK_WEBHOOK_URL = os.environ.get("DISCORD_FEEDBACK_WEBHOOK_URL")
SWEEP_HEALTH_URL = os.environ.get("SWEEP_HEALTH_URL")
DISCORD_STATUS_WEBHOOK_URL = os.environ.get("DISCORD_STATUS_WEBHOOK_URL")
# goes under Modal 'github' secret name
GITHUB_APP_ID = os.environ.get("GITHUB_APP_ID", os.environ.get("APP_ID"))
# deprecated: old logic transfer so upstream can use this
if GITHUB_APP_ID is None:
if ENV == "prod":
GITHUB_APP_ID = "307814"
elif ENV == "dev":
GITHUB_APP_ID = "324098"
elif ENV == "staging":
GITHUB_APP_ID = "327588"
GITHUB_BOT_USERNAME = os.environ.get("GITHUB_BOT_USERNAME")
# deprecated: left to support old logic
if not GITHUB_BOT_USERNAME:
if ENV == "prod":
GITHUB_BOT_USERNAME = "sweep-ai[bot]"
elif ENV == "dev":
GITHUB_BOT_USERNAME = "sweep-nightly[bot]"
elif ENV == "staging":
GITHUB_BOT_USERNAME = "sweep-canary[bot]"
elif not GITHUB_BOT_USERNAME.endswith("[bot]"):
GITHUB_BOT_USERNAME = GITHUB_BOT_USERNAME + "[bot]"
GITHUB_LABEL_NAME = os.environ.get("GITHUB_LABEL_NAME", "sweep")
GITHUB_LABEL_COLOR = os.environ.get("GITHUB_LABEL_COLOR", "9400D3")
GITHUB_LABEL_DESCRIPTION = os.environ.get(
"GITHUB_LABEL_DESCRIPTION", "Sweep your software chores"
)
GITHUB_APP_PEM = os.environ.get("GITHUB_APP_PEM")
GITHUB_APP_PEM = GITHUB_APP_PEM or os.environ.get("PRIVATE_KEY")
if GITHUB_APP_PEM is not None:
GITHUB_APP_PEM = GITHUB_APP_PEM.strip(' \n"') # Remove whitespace and quotes
GITHUB_APP_PEM = GITHUB_APP_PEM.replace("\\n", "\n")
GITHUB_CONFIG_BRANCH = os.environ.get("GITHUB_CONFIG_BRANCH", "sweep/add-sweep-config")
GITHUB_DEFAULT_CONFIG = os.environ.get(
"GITHUB_DEFAULT_CONFIG",
"""# Sweep AI turns bugs & feature requests into code changes (https://sweep.dev)
# For details on our config file, check out our docs at https://docs.sweep.dev/usage/config
# This setting contains a list of rules that Sweep will check for. If any of these rules are broken in a new commit, Sweep will create a pull request to fix the broken rule.
rules:
{additional_rules}
# This is the branch that Sweep will develop from and make pull requests to. Most people use 'main' or 'master' but some users also use 'dev' or 'staging'.
branch: 'main'
# By default Sweep will read the logs and outputs from your existing Github Actions. To disable this, set this to false.
gha_enabled: True
# This is the description of your project. It will be used by sweep when creating PRs. You can tell Sweep what's unique about your project, what frameworks you use, or anything else you want.
#
# Example:
#
# description: sweepai/sweep is a python project. The main api endpoints are in sweepai/api.py. Write code that adheres to PEP8.
description: ''
# This sets whether to create pull requests as drafts. If this is set to True, then all pull requests will be created as drafts and GitHub Actions will not be triggered.
draft: False
# This is a list of directories that Sweep will not be able to edit.
blocked_dirs: []
""",
)
MONGODB_URI = os.environ.get("MONGODB_URI", None)
IS_SELF_HOSTED = os.environ.get("IS_SELF_HOSTED", "true").lower() == "true"
REDIS_URL = os.environ.get("REDIS_URL")
if not REDIS_URL:
REDIS_URL = os.environ.get("redis_url", "redis://0.0.0.0:6379/0")
ORG_ID = os.environ.get("ORG_ID", None)
POSTHOG_API_KEY = os.environ.get(
"POSTHOG_API_KEY", "phc_CnzwIB0W548wN4wEGeRuxXqidOlEUH2AcyV2sKTku8n"
)
SUPPORT_COUNTRY = os.environ.get("GDRP_LIST", "").split(",")
WHITELISTED_REPOS = os.environ.get("WHITELISTED_REPOS", "").split(",")
BLACKLISTED_USERS = os.environ.get("BLACKLISTED_USERS", "").split(",")
# Default OpenAI
OPENAI_API_KEY = os.environ.get("OPENAI_API_KEY", None) # this may be none, and it will use azure
OPENAI_API_TYPE = os.environ.get("OPENAI_API_TYPE", "anthropic")
assert OPENAI_API_TYPE in ["anthropic", "azure", "openai"], "Invalid OPENAI_API_TYPE"
OPENAI_EMBEDDINGS_API_TYPE = os.environ.get("OPENAI_EMBEDDINGS_API_TYPE", "openai")
AZURE_API_KEY = os.environ.get("AZURE_API_KEY", None)
OPENAI_API_BASE = os.environ.get("OPENAI_API_BASE", None)
OPENAI_API_VERSION = os.environ.get("OPENAI_API_VERSION", None)
AZURE_OPENAI_DEPLOYMENT = os.environ.get("AZURE_OPENAI_DEPLOYMENT", None)
OPENAI_EMBEDDINGS_API_TYPE = os.environ.get("OPENAI_EMBEDDINGS_API_TYPE", "openai")
OPENAI_EMBEDDINGS_AZURE_ENDPOINT = os.environ.get(
"OPENAI_EMBEDDINGS_AZURE_ENDPOINT", None
)
OPENAI_EMBEDDINGS_AZURE_DEPLOYMENT = os.environ.get(
"OPENAI_EMBEDDINGS_AZURE_DEPLOYMENT", None
)
OPENAI_EMBEDDINGS_AZURE_API_VERSION = os.environ.get(
"OPENAI_EMBEDDINGS_AZURE_API_VERSION", None
)
OPENAI_API_ENGINE_GPT35 = os.environ.get("OPENAI_API_ENGINE_GPT35", None)
OPENAI_API_ENGINE_GPT4 = os.environ.get("OPENAI_API_ENGINE_GPT4", None)
MULTI_REGION_CONFIG = os.environ.get("MULTI_REGION_CONFIG", None)
if isinstance(MULTI_REGION_CONFIG, str):
MULTI_REGION_CONFIG = MULTI_REGION_CONFIG.strip("'").replace("\\n", "\n")
MULTI_REGION_CONFIG = [item.split(",") for item in MULTI_REGION_CONFIG.split("\n")]
WHITELISTED_USERS = os.environ.get("WHITELISTED_USERS", None)
if WHITELISTED_USERS:
WHITELISTED_USERS = WHITELISTED_USERS.split(",")
WHITELISTED_USERS.append(GITHUB_BOT_USERNAME)
DEFAULT_GPT4_MODEL = os.environ.get("DEFAULT_GPT4_MODEL", "gpt-4-0125-preview")
RESEND_API_KEY = os.environ.get("RESEND_API_KEY", None)
LOKI_URL = None
DEBUG = os.environ.get("DEBUG", "false").lower() == "true"
ENV = "prod" if GITHUB_BOT_USERNAME != TEST_BOT_NAME else "dev"
PROGRESS_BASE_URL = os.environ.get(
"PROGRESS_BASE_URL", "https://progress.sweep.dev"
).rstrip("/")
DISABLED_REPOS = os.environ.get("DISABLED_REPOS", "").split(",")
GHA_AUTOFIX_ENABLED: bool = os.environ.get("GHA_AUTOFIX_ENABLED", False)
MERGE_CONFLICT_ENABLED: bool = os.environ.get("MERGE_CONFLICT_ENABLED", False)
INSTALLATION_ID = os.environ.get("INSTALLATION_ID", None)
AWS_ACCESS_KEY=os.environ.get("AWS_ACCESS_KEY")
AWS_SECRET_KEY=os.environ.get("AWS_SECRET_KEY")
AWS_REGION=os.environ.get("AWS_REGION")
ANTHROPIC_AVAILABLE = AWS_ACCESS_KEY and AWS_SECRET_KEY and AWS_REGION
ANTHROPIC_API_KEY = os.environ.get("ANTHROPIC_API_KEY", None)
COHERE_API_KEY = os.environ.get("COHERE_API_KEY", None)
VOYAGE_API_KEY = os.environ.get("VOYAGE_API_KEY", None)
VOYAGE_API_AWS_ACCESS_KEY=os.environ.get("VOYAGE_API_AWS_ACCESS_KEY_ID")
VOYAGE_API_AWS_SECRET_KEY=os.environ.get("VOYAGE_API_AWS_SECRET_KEY")
VOYAGE_API_AWS_REGION=os.environ.get("VOYAGE_API_AWS_REGION")
VOYAGE_API_AWS_ENDPOINT_NAME=os.environ.get("VOYAGE_API_AWS_ENDPOINT_NAME", "voyage-code-2")
VOYAGE_API_USE_AWS = VOYAGE_API_AWS_ACCESS_KEY and VOYAGE_API_AWS_SECRET_KEY and VOYAGE_API_AWS_REGION
PAREA_API_KEY = os.environ.get("PAREA_API_KEY", None)
# TODO: we need to make this dynamic + backoff
BATCH_SIZE = int(
os.environ.get("BATCH_SIZE", 64 if VOYAGE_API_KEY else 256) # Voyage only allows 128 items per batch and 120000 tokens per batch
)
DEPLOYMENT_GHA_ENABLED = os.environ.get("DEPLOYMENT_GHA_ENABLED", "true").lower() == "true"
JIRA_USER_NAME = os.environ.get("JIRA_USER_NAME", None)
JIRA_API_TOKEN = os.environ.get("JIRA_API_TOKEN", None)
JIRA_URL = os.environ.get("JIRA_URL", None)

import re
from sweepai.core.chat import ChatGPT
issue_validator_instructions_prompt = """# Instructions
A good issue for Sweep is actionable and it is clear how to resolve it. Here is what Sweep is currently capable of:
- Access to the entire codebase, with a high-quality search engine to find specific code snippets. Sweep is able to pinpoint the exact location of the code that needs to be changed based on vague descriptions.
- Making code changes to fix bugs or add features.
- Reading the GitHub Action logs to run tests and check the results.
- Ability to read images such as screenshots and charts.
Here are some examples of things Sweep does not currently support:
- Large-scale changes like migrations and large version upgrades.
- Tasks requiring accessing outside information like AWS consoles or retrieving API keys.
- Tasks requiring fixes outside of code changes
- Issues that have an existing fix or duplicate issues
Respond in the following format:
<thinking>
Provide an analysis of why it is a good or bad issue to pass on to Sweep. If it is a bad issue, suggest how the issue could be improved or clarified to make it more suitable for Sweep.
</thinking>
<pass>True or False</pass>
If False, respond to the user:
<response_to_user>
Response to user with justification on why the issue is unclear.
</response_to_user>"""
issue_validator_system_prompt = """You are an AI assistant tasked with determining whether an issue reported by customer support should be passed on to be resolved by Sweep, an AI-powered software engineer.
""" + issue_validator_instructions_prompt
issue_validator_user_prompt = """<issue>
{issue}
</issue>\n\n""" + issue_validator_instructions_prompt
def validate_issue(issue: str) -> str:
    """
    Somehow haiku and GPT-4 can't do this consistently.
    """
    chat_gpt = ChatGPT.from_system_message_string(
        prompt_string=issue_validator_system_prompt,
    )
    response = chat_gpt.chat_anthropic(
        issue_validator_user_prompt.format(
            issue=issue
        ),
        model="claude-3-opus-20240229",
        temperature=0.0,
    )
    if "<pass>False</pass>" in response:
        pattern = "<response_to_user>(.*)</response_to_user>"
        return re.search(pattern, response, re.DOTALL).group(1).strip()
    return ""
if __name__ == "__main__":


Step 2: ⌨️ Coding

Modify sweepai/config/server.py with contents: Add a new environment variable for the user's Slack API key.

<original_code>
JIRA_USER_NAME = os.environ.get("JIRA_USER_NAME", None)
JIRA_API_TOKEN = os.environ.get("JIRA_API_TOKEN", None)
JIRA_URL = os.environ.get("JIRA_URL", None)

LICENSE_KEY = os.environ.get("LICENSE_KEY", None)
ALTERNATE_AWS = os.environ.get("ALTERNATE_AWS", "none").lower() == "true"
</original_code>

<new_code>
JIRA_USER_NAME = os.environ.get("JIRA_USER_NAME", None)
JIRA_API_TOKEN = os.environ.get("JIRA_API_TOKEN", None)
JIRA_URL = os.environ.get("JIRA_URL", None)

SLACK_API_KEY = os.environ.get("SLACK_API_KEY", None)

LICENSE_KEY = os.environ.get("LICENSE_KEY", None)
ALTERNATE_AWS = os.environ.get("ALTERNATE_AWS", "none").lower() == "true"
</new_code>
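
A minimal sketch of how the new setting might be guarded, assuming the Slack lookup should be skipped when no key is configured. `slack_enabled` is a hypothetical helper and is not part of `server.py`.

```python
import os


def slack_enabled(env=os.environ) -> bool:
    """Return True only when a non-empty SLACK_API_KEY is configured.

    `env` is any mapping of environment variables; it defaults to the
    real process environment and is a parameter only to ease testing.
    """
    return bool((env.get("SLACK_API_KEY") or "").strip())
```

A caller could check `slack_enabled()` before constructing a Slack client, so that issues containing Slack links still work on deployments without a key.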

  • Modify sweepai/handlers/on_ticket.py (3af4b58)
Modify sweepai/handlers/on_ticket.py with contents: Import the necessary modules for making HTTP requests and parsing Slack message threads.

<original_code>
import re
import traceback
from typing import Any
import zipfile
from time import time

import markdown
import openai
import requests
import yaml
import yamllint.config as yamllint_config
from github import BadCredentialsException, Github, Repository
from github.Issue import Issue
from github.PullRequest import PullRequest as GithubPullRequest
from loguru import logger
from tabulate import tabulate
from tqdm import tqdm
from yamllint import linter
</original_code>

<new_code>
import re
import traceback
from typing import Any
import zipfile
from time import time

import markdown
import openai
import requests
import yaml
import yamllint.config as yamllint_config
from github import BadCredentialsException, Github, Repository
from github.Issue import Issue
from github.PullRequest import PullRequest as GithubPullRequest
from loguru import logger
from tabulate import tabulate
from tqdm import tqdm
from yamllint import linter

from slack_sdk import WebClient
from slack_sdk.errors import SlackApiError
</new_code>
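
Since this change makes `slack_sdk` a hard import for `on_ticket.py`, one option for deployments that do not install it is a guarded import. This is a sketch of that design choice under that assumption, not what the commit does; `SLACK_SDK_AVAILABLE` is a hypothetical flag.

```python
try:
    from slack_sdk import WebClient
    from slack_sdk.errors import SlackApiError
except ImportError:
    # slack_sdk is not installed: disable the integration instead of
    # failing at import time.
    WebClient = None
    SlackApiError = Exception

# Hypothetical flag that callers can check before touching Slack.
SLACK_SDK_AVAILABLE = WebClient is not None
```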

  • Modify sweepai/handlers/on_ticket.py (3af4b58)
Modify sweepai/handlers/on_ticket.py with contents: In the `on_ticket` function, check if the issue description contains a Slack link. If found, authenticate to Slack and fetch the message thread.

<original_code>
summary = summary or ""
summary = re.sub(
"<details (open)?>(\r)?\n<summary>Checklist</summary>.*",
"",
summary,
flags=re.DOTALL,
).strip()
summary = re.sub(
"---\s+Checklist:(\r)?\n(\r)?\n- \[[ X]\].*",
"",
summary,
flags=re.DOTALL,
).strip()
summary = re.sub(
"### Details\n\n_No response_", "", summary, flags=re.DOTALL
)
summary = re.sub("\n\n", "\n", summary, flags=re.DOTALL)
repo_name = repo_full_name
user_token, g = get_github_client(installation_id)
repo = g.get_repo(repo_full_name)
current_issue: Issue = repo.get_issue(number=issue_number)
assignee = current_issue.assignee.login if current_issue.assignee else None
if assignee is None:
assignee = current_issue.user.login
</original_code>

<new_code>
summary = summary or ""
summary = re.sub(
"<details (open)?>(\r)?\n<summary>Checklist</summary>.*",
"",
summary,
flags=re.DOTALL,
).strip()
summary = re.sub(
"---\s+Checklist:(\r)?\n(\r)?\n- \[[ X]\].*",
"",
summary,
flags=re.DOTALL,
).strip()
summary = re.sub(
"### Details\n\n_No response_", "", summary, flags=re.DOTALL
)
summary = re.sub("\n\n", "\n", summary, flags=re.DOTALL)

slack_link_match = re.search(r'(https://\w+\.slack\.com/archives/\w+/p\d+)', summary)
if slack_link_match:
    slack_link = slack_link_match.group(1)
    slack_client = WebClient(token=SLACK_API_KEY)
    
    try:
        slack_permalink_data = slack_client.chat_getPermalink(
            link=slack_link
        )
        slack_channel_id = slack_permalink_data['channel']
        slack_message_ts = slack_permalink_data['message_ts']

        slack_thread_replies = slack_client.conversations_replies(
            channel=slack_channel_id,
            ts=slack_message_ts
        )

        slack_thread_messages = [message['text'] for message in slack_thread_replies['messages']]
        slack_thread_text = '\n'.join(slack_thread_messages)

        summary += f"\n\nSlack Thread:\n{slack_thread_text}"

    except SlackApiError as e:
        logger.error(f"Error fetching Slack thread: {e}")

repo_name = repo_full_name
user_token, g = get_github_client(installation_id)
repo = g.get_repo(repo_full_name)
current_issue: Issue = repo.get_issue(number=issue_number)
assignee = current_issue.assignee.login if current_issue.assignee else None
if assignee is None:
    assignee = current_issue.user.login

</new_code>
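
A note on the lookup step above: as far as I know, Slack's `chat.getPermalink` method takes a `channel` and a `message_ts` and returns a permalink, so it may not accept a `link` argument the way the snippet uses it. One alternative is to read the channel ID and timestamp straight out of the URL. The sketch below assumes the usual permalink shape (`/archives/<channel>/p<digits>`, where the last six digits are the fractional part of the timestamp, plus an optional `thread_ts` query parameter on links to replies). `parse_slack_permalink` is a hypothetical helper, not something in the repository.

```python
import re
from typing import Optional, Tuple
from urllib.parse import parse_qs, urlparse

# Assumed path shape: /archives/C024BE91L/p1714672345123456
_PERMALINK_PATH = re.compile(r"^/archives/(?P<channel>[A-Z0-9]+)/p(?P<ts>\d{7,})$")


def parse_slack_permalink(url: str) -> Optional[Tuple[str, str]]:
    """Return (channel_id, thread_ts) for a Slack permalink, or None."""
    parsed = urlparse(url)
    if not (parsed.hostname or "").endswith(".slack.com"):
        return None
    match = _PERMALINK_PATH.match(parsed.path)
    if match is None:
        return None
    digits = match.group("ts")
    # Re-insert the decimal point that the permalink format drops.
    message_ts = f"{digits[:-6]}.{digits[-6:]}"
    # Links to a reply carry the parent message's timestamp in thread_ts.
    thread_ts = parse_qs(parsed.query).get("thread_ts", [message_ts])[0]
    return match.group("channel"), thread_ts
```

The returned pair could then be passed to `slack_client.conversations_replies(channel=..., ts=...)`, which the snippet above already calls. Using `urlparse` also avoids the `\w+` workspace pattern, which would miss workspace names containing hyphens.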


Step 3: 🔁 Code Review

I have finished reviewing the code for completeness. I did not find errors for sweep/augment_on_ticket_so_that_when_a_user_ad_3c33d.


💡 To recreate the pull request, edit the issue title or description.

This is an automated message generated by Sweep AI.

wwzeng1 added a commit that referenced this issue May 3, 2024
…tomatically unroll the thread and extract the information (#3668)

# Description
This pull request introduces enhancements to the `on_ticket` handler
within the SweepAI application. It augments the existing functionality
by automatically unrolling Slack threads when a Slack link is included
in a ticket summary. The extracted conversation from the Slack thread is
then appended to the ticket summary, providing additional context and
information directly within the ticket.

# Summary
- Added `SLACK_API_KEY` environment variable to
`sweepai/config/server.py` for Slack integration.
- Imported `WebClient` and `SlackApiError` from `slack_sdk` in
`sweepai/handlers/on_ticket.py` to enable communication with the Slack
API.
- Implemented a new feature in `on_ticket` that:
  - Detects a Slack link in the ticket summary.
  - Uses the Slack API to fetch the permalink data and retrieve the thread
    replies.
  - Appends the text of the Slack thread messages to the ticket summary.
  - Handles potential `SlackApiError` exceptions and logs errors
    accordingly.
- The changes ensure that relevant Slack conversations are automatically
included in the ticket details, improving the ticket resolution process.
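
The append step in the list above can be sketched as a small pure function. This is a minimal illustration, not the committed code: `append_slack_thread` is a hypothetical name, and it assumes each message is a dict that may or may not carry a `text` key (the committed snippet indexes `message['text']` directly).

```python
def append_slack_thread(summary: str, messages: list) -> str:
    """Append non-empty Slack message texts to an issue summary."""
    # Tolerate messages without a "text" field instead of raising KeyError.
    texts = [(message.get("text") or "").strip() for message in messages]
    texts = [text for text in texts if text]
    if not texts:
        # Nothing useful in the thread: leave the summary untouched.
        return summary
    return f"{summary}\n\nSlack Thread:\n" + "\n".join(texts)
```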

Fixes #3656.

---

### 💡 To get Sweep to edit this pull request, you can:
* Comment below, and Sweep can edit the entire PR
* Comment on a file, Sweep will only modify the commented file
* Edit the original issue to get Sweep to recreate the PR from scratch

*This is an automated message generated by [Sweep
AI](https://sweep.dev).*

---------

Co-authored-by: sweep-nightly[bot] <131841235+sweep-nightly[bot]@users.noreply.github.com>
Co-authored-by: wwzeng1 <william@sweep.dev>