Refactoring

Kuznetsov Denis edited this page Dec 27, 2021 · 5 revisions

General concepts

  • drop useless directories, files, and SLOC
  • keep the standard file structure (see Service directory structure)
  • keep SLOC clean (a tutorial with best practices) and use the auto formatter
  • document SLOC if they are not obvious
  • if SLOC are kept but not used at service runtime, document why they are kept
  • keep the license of the service (you need to create NOTICE.md if the service has a license)
  • freeze versions in Dockerfile and requirements.txt

Service directory structure

data/ (OPTIONAL)
FOLDER_FOR_LOCAL_PYTHON_MODULES/ (OPTIONAL)
tests/
NOTICE.md (OPTIONAL)
Dockerfile 
README.md 
requirements.txt
server.py
test_server.py
test.sh

data/

An optional directory used for small text files.

FOLDER_FOR_LOCAL_PYTHON_MODULES/

An optional directory used for local Python modules, e.g. utils.py, model.py, etc. Rename the folder to something appropriate for the service.

tests/

A directory used for test files (*_in.json and *_out.json).
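
The naming convention above pairs each input file with its expected output by a shared prefix. A minimal sketch of how such pairs could be collected is shown below; `load_test_pairs` is a hypothetical helper written for illustration and is not the implementation of `common.test_utils.get_dataset`, which may behave differently.

```python
import json
from pathlib import Path


def load_test_pairs(tests_dir="tests"):
    """Return {test_name: (input, expected_output)} for each *_in.json / *_out.json pair."""
    pairs = {}
    for in_path in sorted(Path(tests_dir).glob("*_in.json")):
        # "lets_talk_in.json" -> "lets_talk"
        test_name = in_path.name[: -len("_in.json")]
        out_path = in_path.with_name(f"{test_name}_out.json")
        if not out_path.exists():
            raise FileNotFoundError(f"missing expected output for test '{test_name}': {out_path}")
        with in_path.open(encoding="utf-8") as f_in, out_path.open(encoding="utf-8") as f_out:
            pairs[test_name] = (json.load(f_in), json.load(f_out))
    return pairs
```

Failing loudly on a missing *_out.json is a design choice of this sketch: an input without an expected output usually means the test pair was only half recorded.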

NOTICE.md

An optional file used for additional license documentation. Create it if the service has its own license.

Dockerfile

Use .dockerignore if needed. Prefer ADD over wget/curl, except where it is better to use a single RUN instruction to download a package with curl or wget, extract it, and remove the original file in one step, which reduces the number of layers. Try to optimize your Dockerfile to speed up the build: reorder Dockerfile directives so that a rebuild invalidates as few cached layers as possible.

README.md

Describe the purpose of the service, its resource consumption (RAM for CPU and GPU), and its average execution time for both.
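
One way to obtain rough numbers for the README is to time the handler over several calls. The sketch below is an assumption about how this could be done, not a prescribed procedure: `profile_handler` and `dummy_handler` are hypothetical names, and `tracemalloc` only tracks memory allocated by Python itself, so memory used by native libraries or the GPU has to be measured with other tools.

```python
import statistics
import time
import tracemalloc


def profile_handler(handler, payload, runs=10):
    """Call handler(payload) `runs` times; return (mean seconds, peak traced MiB)."""
    tracemalloc.start()
    timings = []
    try:
        for _ in range(runs):
            start = time.perf_counter()
            handler(payload)
            timings.append(time.perf_counter() - start)
        _, peak_bytes = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return statistics.mean(timings), peak_bytes / 2**20


if __name__ == "__main__":
    # Hypothetical stand-in for a real service handler.
    def dummy_handler(payload):
        return [len(text) for text in payload["sentences"]]

    mean_s, peak_mib = profile_handler(dummy_handler, {"sentences": ["hello"] * 1000})
    print(f"average exec time = {mean_s:.6f}s, peak Python memory = {peak_mib:.3f} MiB")
```

Tracing allocations slows the code down, so timings taken this way are an upper estimate; run the timing loop without `tracemalloc` when precise timing matters.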

requirements.txt

The minimal set of required dependencies, with frozen versions.
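
A quick way to check that every dependency is frozen is to scan the file for lines without an exact `==` pin. The sketch below is a simple heuristic with a hypothetical helper name (`find_unpinned`); it does not parse the full requirements syntax, so URL or VCS requirements are reported as unpinned.

```python
def find_unpinned(requirements_text):
    """Return requirement lines that do not pin an exact version with '=='."""
    unpinned = []
    for raw_line in requirements_text.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # drop comments and surrounding whitespace
        if not line or line.startswith("-"):  # skip blanks and pip options such as -r or --index-url
            continue
        spec = line.split(";", 1)[0]  # ignore environment markers, which may contain '=='
        if "==" not in spec:
            unpinned.append(line)
    return unpinned


if __name__ == "__main__":
    sample = "flask==1.1.1\nrequests>=2.0  # too loose\nsentry-sdk\n"
    print(find_unpinned(sample))  # ['requests>=2.0', 'sentry-sdk']
```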

server.py

#!/usr/bin/env python
# basic import
import logging
import time
import os
import random
...

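# external import
import sentry_sdk
from flask import Flask, jsonify, request
from sentry_sdk.integrations.logging import ignore_logger
...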

# local modules import
...

import test_server

ignore_logger("root")

sentry_sdk.init(os.getenv("SENTRY_DSN"))
SERVICE_NAME = os.getenv("SERVICE_NAME")
SERVICE_PORT = int(os.getenv("SERVICE_PORT"))
RANDOM_SEED = int(os.getenv("RANDOM_SEED", 2718))

logging.basicConfig(format="%(asctime)s - %(pathname)s - %(lineno)d - %(levelname)s - %(message)s", level=logging.INFO)
logger = logging.getLogger(__name__)
logging.getLogger("werkzeug").setLevel("WARNING")
# init code (a Flask app is assumed, as implied by @app.route and jsonify below)
app = Flask(__name__)
...


def handler(requested_data, random_seed=None):
    st_time = time.time()
    random_seed = requested_data.get("random_seed", random_seed)  # for tests
    # get data from requested_data
    ...
    responses = []
    for (...) in zip(...): # iteration per dialog if batch is not needed
        try:
            # for tests
            if random_seed:
                random.seed(int(random_seed))
            # code
            ...
            responses.append(...)
        except Exception as exc:
            sentry_sdk.capture_exception(exc)
            logger.exception(exc)
            responses.append(...)

    total_time = time.time() - st_time
    logger.info(f"{SERVICE_NAME} exec time = {total_time:.3f}s")
    return responses


try:
    test_server.run_test(handler)
    logger.info("test query processed")
except Exception as exc:
    sentry_sdk.capture_exception(exc)
    logger.exception(exc)
    raise exc

logger.info(f"{SERVICE_NAME} is loaded and ready")


@app.route("/respond", methods=["POST"])
def respond():
    # import common.test_utils as t_utils; t_utils.save_to_test(request.json,"tests/lets_talk_in.json",indent=4)  # TEST
    # responses = handler(request.json, RANDOM_SEED)  # TEST
    # import common.test_utils as t_utils; t_utils.save_to_test(responses,"tests/lets_talk_out.json",indent=4)  # TEST
    responses = handler(request.json)
    return jsonify(responses)


if __name__ == "__main__":
    app.run(debug=False, host="0.0.0.0", port=SERVICE_PORT)

The lines below are the part of the code above that is used for test creation. To create a new test pair (*_in.json and *_out.json files), uncomment them and comment out responses = handler(request.json).

    # import common.test_utils as t_utils; t_utils.save_to_test(request.json,"tests/lets_talk_in.json",indent=4)  # TEST
    # responses = handler(request.json, RANDOM_SEED)  # TEST
    # import common.test_utils as t_utils; t_utils.save_to_test(responses,"tests/lets_talk_out.json",indent=4)  # TEST

You can find a full example in the file skills/dff_movie_skill/server.py
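
The TEST lines pass RANDOM_SEED to the handler so that the recorded *_out.json can be reproduced later. A minimal sketch of the idea follows; `CANDIDATES` and `pick_response` are hypothetical and only mirror the seeding pattern from the handler template.

```python
import random

CANDIDATES = ["Hi!", "Hello there.", "Nice to meet you.", "How are you?"]  # hypothetical replies


def pick_response(random_seed=None):
    """Pick a reply; reseed first when a seed is given, as the handler template does."""
    if random_seed:
        random.seed(int(random_seed))
    return random.choice(CANDIDATES)


first = pick_response(2718)
second = pick_response(2718)
assert first == second  # same seed, same choice
```

Note that the `if random_seed:` check in the template treats a seed of 0 as "no seed"; supporting 0 would require `if random_seed is not None:`.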

test_server.py

It has to be identical to the example below. If you need to change your file, ask us about it.

import requests
import os

import common.test_utils as test_utils


SERVICE_PORT = int(os.getenv("SERVICE_PORT"))
RANDOM_SEED = int(os.getenv("RANDOM_SEED", 2718)) # for reproducible testing
URL = f"http://0.0.0.0:{SERVICE_PORT}/respond"


def handler(requested_data, random_seed):
    hypothesis = requests.post(URL, json={**requested_data, "random_seed": random_seed}).json()
    return hypothesis


def run_test(handler):
    in_data, out_data = test_utils.get_dataset()
    for test_name in in_data:
        hypothesis = handler(in_data[test_name], RANDOM_SEED)
        print(f"test name: {test_name}")
        is_equal_flag, msg = test_utils.compare_structs(out_data[test_name], hypothesis)
        if msg and len(msg.split("`")) == 5:
            _, ground_truth_text, _, hypothesis_text, _ = msg.split("`")
            is_equal_flag, ratio = test_utils.compare_text(ground_truth_text, hypothesis_text, 0.80)
            if not is_equal_flag:
                msg = f"{msg} ratio = {ratio}"
        assert is_equal_flag, msg
        print("Success")


if __name__ == "__main__":
    run_test(handler)
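
When the structures differ, run_test falls back to a fuzzy text comparison: it splits the message on backticks, expects exactly five parts (two quoted texts), and accepts the result if the texts are similar enough at a threshold of 0.80. The sketch below illustrates that flow with `difflib` from the standard library. `compare_text_sketch` and the message format are assumptions made for illustration; the real `test_utils.compare_text` and `test_utils.compare_structs` may work differently.

```python
from difflib import SequenceMatcher


def compare_text_sketch(ground_truth, hypothesis, threshold):
    """Return (is_similar, ratio), where ratio is difflib's similarity in [0, 1]."""
    ratio = SequenceMatcher(None, ground_truth, hypothesis).ratio()
    return ratio >= threshold, ratio


# A message with two backtick-quoted texts splits into exactly five parts.
msg = "texts differ: `I like movies a lot` != `I like movies a lot!`"  # hypothetical format
parts = msg.split("`")
assert len(parts) == 5
_, ground_truth_text, _, hypothesis_text, _ = parts

is_equal_flag, ratio = compare_text_sketch(ground_truth_text, hypothesis_text, 0.80)
print(is_equal_flag, round(ratio, 3))
```

A similarity threshold lets tests tolerate small wording differences in generated text while still failing on substantially different responses.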

test.sh

#!/bin/bash

python test_server.py

Example

The service skills/dff_coronovirus_skill has good code, although it is not perfect.

Code style

Just run black --line-length=120 . and your code will be nicely formatted.

Links