Skip to content

Chaos-Camp/ai-test-challenge

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

2 Commits
 
 
 
 
 
 

Repository files navigation

OpenAI API Testing Challenge

Setup:

You have been provided an endpoint to the OpenAI API. Your goal is to ensure this AI system is robust, timely, and produces quality results across diverse inputs.

To begin:

  1. Sign up for an OpenAI API account here.
  2. Once you've signed up and have your API key, set it up as an environment variable on your machine. This helps ensure the security of your API key by not hardcoding it in your scripts.
    export OPENAI_API_KEY='YOUR_API_KEY'

Replace 'YOUR_API_KEY' with the actual key provided to you by OpenAI.

Challenges:

  1. Task-Specific Accuracy:

    • Write a set of tests that send specific prompts to the API and compare the results to a predefined answer set.
  2. Diverse Input Scenarios:

    • Develop tests that generate a wide variety of prompts to examine how well the API handles diverse input scenarios.
  3. Long Conversations:

    • Design a tests that simulates a prolonged interaction with the AI, ensuring the AI maintains context over a series of related prompts.
  4. Response Time:

    • Measure the response time for various queries. Provide a report on average, min, and max response times.
  5. Ethical and Safety Checks:

    • Analyze the sentiment of AI's outputs on potentially sensitive prompts. Alert if the AI provides inappropriate responses.
  6. Multilingual Capability:

    • Send prompts in various languages and validate the correctness of translations or responses.
  7. Generalization Ability:

    • Create tests to evaluate the AI's ability to generalize from the data it was trained on.
  8. Memory and Context Evaluation:

    • Send a sequence of prompts/questions to the AI to see if it maintains context over the course of the conversation.
  9. Feedback Loops:

    • Design a feedback interface and store feedback. Ensure that the AI's outputs can be incorporated for feedback.
  10. Input Validation and Sanitization:

    • Validate and sanitize inputs to the API. Use tools to test how well the system rejects or handles malicious inputs.

Submission:

Provide the following:

  • Code for your tests and scripts.
  • A detailed report of your findings, including any vulnerabilities or issues discovered.
  • Recommendations for improvements or fixes.
  • To submit your solutions, fork this repo and create a pull request with the naming convention: feature/YOUR_GITHUB_USERNAME. We'll review your submission and provide feedback!