Admin Tools Enhancement and Cost Optimization #68

Open
10 tasks
gchhablani opened this issue Feb 9, 2024 · 5 comments

Comments

@gchhablani
Collaborator

gchhablani commented Feb 9, 2024

Project Title: Admin Tools Enhancement and Cost Optimization

Description: The goal of this project is to improve the admin experience on EvalAI and to reduce the cost of maintaining EvalAI.

One primary focus is enhancing the existing automation that cancels submissions whose messages on the SQS queues have expired. The second is identifying underutilized or overutilized ECS instances from AWS health metrics so that the required compute can be determined automatically. Other improvements include Django administration actions for starting, stopping, and restarting the EC2 instance workers, and automated deletion of code-upload infrastructure when a challenge is unapproved.
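As a rough illustration of the first focus, the sketch below selects pending submissions that are older than a queue's retention period and marks them as cancelled. It is not EvalAI's auto-cancel script: the `Submission` class, the status names, and both function names are hypothetical, and the retention period is passed in as a plain number of seconds rather than read from SQS.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass
class Submission:
    # Hypothetical stand-in for a submission record; not EvalAI's model.
    id: int
    status: str
    submitted_at: datetime


def find_expired_submissions(submissions, retention_seconds, now=None):
    """Return pending submissions older than the queue retention period."""
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(seconds=retention_seconds)
    pending = {"submitted", "queued"}  # assumed status names
    return [
        s for s in submissions
        if s.status in pending and s.submitted_at < cutoff
    ]


def cancel_expired(submissions, retention_seconds, now=None):
    """Mark expired submissions as cancelled and return their ids."""
    expired = find_expired_submissions(submissions, retention_seconds, now)
    for s in expired:
        s.status = "cancelled"
    return [s.id for s in expired]
```

A submission whose message has already been dropped by the queue can never be picked up by a worker, so cancelling it keeps its status from staying pending indefinitely.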

These features, along with the others listed in the deliverables, will make the EvalAI administrative experience smoother and will save costs in the long run.

Deliverables:

  • Admin Enhancements:
    • Create admin actions for EC2 worker start/stop/create/restart.
    • Implement an approval button directly on Slack request notifications for challenge approval.
    • Add a feature to automatically delete code-upload infrastructure on unapproving challenges via Django administration.
  • Cost Optimization Measures:
    • Use the custom SQS queue retention time to automatically cancel submissions and save costs.
    • Enhance the auto-cancel script so that cancellations are reflected in the Prometheus metrics, keeping the metrics accurate.
    • Add improvements for retention of Prometheus metrics on container restart.
    • Identify and stop excessive instances and EC2 clones running on AWS for cost-saving.
    • Identify and remove old ECR repositories (and other avenues) for cost reduction on AWS.
  • Infrastructure Monitoring and Automation:
    • Automate ECS monitoring to detect and adjust CPU and memory consumption based on challenge requirements.
    • Address challenges requiring frequent restarts on both EC2 and ECS instances. Migrate problematic workers from ECS to EC2 and improve efficiency of the instances.
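One way the ECS monitoring item could be approached is sketched below: classify a worker from its recent CPU and memory utilization samples, then suggest a new CPU allocation. This is only an illustration under stated assumptions. The 20% and 80% thresholds, the doubling and halving rule, the 256 to 4096 CPU-unit bounds, and the function names are all hypothetical, and fetching the samples from AWS is left out.

```python
from statistics import mean


def classify_utilization(cpu_percent, memory_percent, low=20.0, high=80.0):
    """Classify a worker from lists of utilization samples (in percent)."""
    if not cpu_percent or not memory_percent:
        raise ValueError("need at least one sample of each metric")
    cpu_avg = mean(cpu_percent)
    mem_avg = mean(memory_percent)
    # Either resource running hot is enough to call the worker overutilized.
    if cpu_avg > high or mem_avg > high:
        return "overutilized"
    # Only shrink when both resources are mostly idle.
    if cpu_avg < low and mem_avg < low:
        return "underutilized"
    return "ok"


def recommend_cpu_units(current_units, verdict, minimum=256, maximum=4096):
    """Suggest a CPU allocation: double, halve, or keep, within assumed bounds."""
    if verdict == "overutilized":
        return min(current_units * 2, maximum)
    if verdict == "underutilized":
        return max(current_units // 2, minimum)
    return current_units
```

For example, `recommend_cpu_units(1024, classify_utilization([5, 10, 15], [10, 12, 8]))` suggests halving the allocation, because both averages fall below the low threshold.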

Mentor: @gchhablani, Rahul Singh, @gautamjajoo, @RishabhJain2018

Skills: Python, Django, AngularJS, AWS

Skill Level: Medium

Get started: Try to fix some issues in EvalAI (note that there are some issues labeled with GSoC-2024).

Important Links:

@mridul45

Hi @gchhablani, I am new to GSoC and was wondering how to get started. Do I have to present a solution to the above project, or should I download the code, study it, and then prepare a detailed proposal for my own solution?

@saif-hacker

Hi @gchhablani, @gautamjajoo, @RishabhJain2018. I am new to GSoC and an absolute beginner in open source contribution. I need some assistance getting started so that I can contribute effectively to this issue.

@Antoniocolapso

Interested in the project.

A little bit about me :

I'm Omm Prakash Sahoo, a 3rd-year B.Tech student at IIT Bhilai. I recently moved into ML after doing CP and development in my first 2 years. I was an ICPC regionalist and an Expert on Codeforces, have done a decent amount of DevOps, developed both web and Android apps, topped the System Design and ML courses at our college, built LIP-Reader (predicting sentences from lip movement alone), and led the Inter-IIT Tech Meet team for the Adobe behaviour simulation challenge, in which we developed 2 LLMs to ease the process of posting new marketing content for companies in every segment, along with a lot of other interesting work.

After achieving my personal goals in CP in my first 2 years, I now want to contribute to real-world problems. I have been a member of GDSC and OpenLake (the club for open source enthusiasts at IIT Bhilai) and the coordinator of Ingenuity (the CP club).

Here is my resume link : https://drive.google.com/file/d/1LbGBW9veH75x7IqABSk9ui03JIzZv7cV/view?usp=drive_link

I will be there to solve it soon.

@abdulhameed04

Hello. I'm Abdul Hameed, pursuing Statistics Honours at Delhi University. I'm new to GSoC, and I'm an avid learner who adapts to different work environments. I would love to help with this project.

@Aisikhue

Aisikhue commented Apr 2, 2024

Good day,

My name is Ayemhenre Isikhuemhen, I am a student at the University of North Carolina at Charlotte, studying Computer Science with a concentration in Artificial Intelligence. I am looking for hands-on experience with Machine Learning and would love to have the opportunity to contribute to this project.

You can access my resume using the following link:
https://drive.google.com/file/d/1UcbCRY0mC9AqjcN49DwYwSBeKdZoMavT/view?usp=sharing

Kindly,
Ayemehenre Isikhuemhen
aisikhue@uncc.edu


6 participants