-
Windows/macOS: install Docker Desktop from Docker Hub (you need to register an account if you are new to Docker).
-
Linux: install Docker.
$ git clone https://github.com/Alwaysproblem/TFServing-setup-review.git
$ cd TFServing-setup-review
-
Try the simple example from the TensorFlow Serving documentation.
# Download the TensorFlow Serving Docker image and repo
$ docker pull tensorflow/serving
$ git clone https://github.com/tensorflow/serving
# Location of demo models
TESTDATA="$(pwd)/serving/tensorflow_serving/servables/tensorflow/testdata"
# Start TensorFlow Serving container and open the REST API port
$ docker run -it --rm -p 8501:8501 \
    -v "$TESTDATA/saved_model_half_plus_two_cpu:/models/half_plus_two" \
    -e MODEL_NAME=half_plus_two \
    tensorflow/serving &
# Query the model using the predict API (in a new terminal)
$ curl -d '{"instances": [1.0, 2.0, 5.0]}' \
    -X POST http://localhost:8501/v1/models/half_plus_two:predict
# Returns => { "predictions": [2.5, 3.0, 4.5] }
-
Common Docker commands.
# Kill all running containers.
$ docker kill $(docker ps -q)
# Stop all running containers.
$ docker stop $(docker ps -q)
# Remove all non-running containers.
$ docker rm $(docker ps -aq)
# List all containers.
$ docker ps -a
# List running containers.
$ docker ps
# Run a serving image as a daemon with a readable name.
$ docker run -d --name serving_base tensorflow/serving
# Execute a command in the container; substitute your own container name for ${docker_container_name}.
$ docker exec -it ${docker_container_name} sh -c "cd /tmp"
# Enter the container's bash shell.
$ docker exec -it ${docker_container_name} bash -l
-
Make sure your model directory looks like this:
---save
  |---<Model Name>
    |---1
      |---assets
      |---variables
      |---saved_model.pb
-
Substitute ${user_define_model_name} with your own model name and ${path_to_your_own_models} with the directory path of your own model.
# Run the server; mount the model directory (which contains the version subdirectories).
$ docker run -it --rm -p 8501:8501 \
    -v "$(pwd)/${path_to_your_own_models}:/models/${user_define_model_name}" \
    -e MODEL_NAME=${user_define_model_name} tensorflow/serving &
# Run the client.
$ curl -d '{"instances": [[1.0, 2.0]]}' \
    -X POST http://localhost:8501/v1/models/${user_define_model_name}:predict
-
You can also use the tensorflow_model_server command after entering the container's bash shell.
$ docker exec -it ${docker_container_name} bash -l
$ tensorflow_model_server --port=8500 --rest_api_port=8501 \
    --model_name=${MODEL_NAME} --model_base_path=${MODEL_BASE_PATH}/${MODEL_NAME}
-
example
-
Save the model by running LinearKeras.py, then:
$ docker run -it --rm -p 8501:8501 -v "$(pwd)/save/Toy:/models/Toy" \
    -e MODEL_NAME=Toy tensorflow/serving &
$ curl -d '{"instances": [[1.0, 2.0]]}' -X POST http://localhost:8501/v1/models/Toy:predict
# {
#     "predictions": [[0.999035]
#     ]
# }
-
-
bind your own model to the server
-
Bind a host path to the model.
$ docker run -p 8501:8501 \
    --mount type=bind,source=/path/to/my_model/,target=/models/my_model \
    -e MODEL_NAME=my_model -it tensorflow/serving
-
-
example
$ docker run -p 8501:8501 \
    --mount type=bind,source=$(pwd)/save/Toy,target=/models/Toy \
    -e MODEL_NAME=Toy -it tensorflow/serving
$ curl -d '{"instances": [[1.0, 2.0]]}' -X POST http://localhost:8501/v1/models/Toy:predict
# {
#     "predictions": [[0.999035]
#     ]
# }
-
The data looks like:
a    b    c    d    e    f
390  25   1    1    1    2
345  34   45   2    34   3456
-
instances
means one row of data per list element:

{"instances": [
    { "a": [390], "b": [25], "c": [1], "d": [1], "e": [1], "f": [2] },
    { "a": [345], "b": [34], "c": [45], "d": [2], "e": [34], "f": [3456] }
]}
-
inputs
means one column of data per key:

{"inputs": {
    "a": [[390], [345]],
    "b": [[25], [34]],
    "c": [[1], [45]],
    "d": [[1], [2]],
    "e": [[1], [34]],
    "f": [[2], [3456]]
}}
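The two formats carry the same data, just transposed. A small stdlib-only sketch converting row-wise `instances` records into the column-wise `inputs` form (the keys and values match the data above; the helper name is mine):

```python
import json

# The same two rows from the data above, in row-wise "instances" form.
rows = [
    {"a": [390], "b": [25], "c": [1], "d": [1], "e": [1], "f": [2]},
    {"a": [345], "b": [34], "c": [45], "d": [2], "e": [34], "f": [3456]},
]

def rows_to_inputs(rows):
    """Convert row-wise "instances" records into the column-wise "inputs" form."""
    columns = {}
    for row in rows:
        for key, value in row.items():
            columns.setdefault(key, []).append(value)
    return columns

# Request bodies for the two REST styles.
instances_payload = json.dumps({"instances": rows})
inputs_payload = json.dumps({"inputs": rows_to_inputs(rows)})
```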
-
set up the configuration file named Toy.config
model_config_list: {
  config: {
    name: "Toy",
    base_path: "/models/save/Toy/",
    model_platform: "tensorflow"
  },
  config: {
    name: "Toy_double",
    base_path: "/models/save/Toy_double/",
    model_platform: "tensorflow"
  }
}
-
Substitute ${Config Path} with the path to your own configuration file.
$ docker run -it --rm -p 8501:8501 -v "$(pwd):/models/" tensorflow/serving \
    --model_config_file=/models/${Config Path} \
    --model_config_file_poll_wait_seconds=60
-
example
$ docker run -it --rm -p 8501:8501 -v "$(pwd):/models/" tensorflow/serving \
    --model_config_file=/models/config/Toy.config
$ curl -d '{"instances": [[1.0, 2.0]]}' -X POST http://localhost:8501/v1/models/Toy_double:predict
# {
#     "predictions": [[6.80301666]
#     ]
# }
$ curl -d '{"instances": [[1.0, 2.0]]}' -X POST http://localhost:8501/v1/models/Toy:predict
# {
#     "predictions": [[0.999035]
#     ]
# }
-
Bind your own paths to the TF Serving container. The model target paths must match the base_path entries in the configuration file.
$ docker run --rm -p 8500:8500 -p 8501:8501 \
    --mount type=bind,source=${/path/to/my_model/},target=/models/${my_model} \
    --mount type=bind,source=${/path/to/my/models.config},target=/models/${models.config} \
    -it tensorflow/serving --model_config_file=/models/${models.config}
-
example
$ docker run --rm -p 8500:8500 -p 8501:8501 \
    --mount type=bind,source=$(pwd)/save/,target=/models/save \
    --mount type=bind,source=$(pwd)/config/Toy.config,target=/models/Toy.config \
    -it tensorflow/serving --model_config_file=/models/Toy.config
$ curl -d '{"instances": [[1.0, 2.0]]}' -X POST http://localhost:8501/v1/models/Toy_double:predict
# {
#     "predictions": [[6.80301666]
#     ]
# }
$ curl -d '{"instances": [[1.0, 2.0]]}' -X POST http://localhost:8501/v1/models/Toy:predict
# {
#     "predictions": [[0.999035]
#     ]
# }
-
Set up a single-version control configuration file.
model_config_list: {
  config: {
    name: "Toy",
    base_path: "/models/save/Toy/",
    model_platform: "tensorflow",
    model_version_policy: {
      specific {
        versions: 1
      }
    }
  },
  config: {
    name: "Toy_double",
    base_path: "/models/save/Toy_double/",
    model_platform: "tensorflow"
  }
}
-
Set up a multiple-version control configuration file.
model_config_list: {
  config: {
    name: "Toy",
    base_path: "/models/save/Toy/",
    model_platform: "tensorflow",
    model_version_policy: {
      specific {
        versions: 1
        versions: 2
      }
    }
  },
  config: {
    name: "Toy_double",
    base_path: "/models/save/Toy_double/",
    model_platform: "tensorflow"
  }
}
-
example
$ docker run --rm -p 8500:8500 -p 8501:8501 \
    --mount type=bind,source=$(pwd)/save/,target=/models/save \
    --mount type=bind,source=$(pwd)/config/versionctrl.config,target=/models/versionctrl.config \
    -it tensorflow/serving --model_config_file=/models/versionctrl.config \
    --model_config_file_poll_wait_seconds=60
-
for POST
$ curl -d '{"instances": [[1.0, 2.0]]}' -X POST http://localhost:8501/v1/models/Toy/versions/1:predict
# {
#     "predictions": [[10.8054295]
#     ]
# }
$ curl -d '{"instances": [[1.0, 2.0]]}' -X POST http://localhost:8501/v1/models/Toy/versions/2:predict
# {
#     "predictions": [[0.999035]
#     ]
# }
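The same versioned predict calls can be issued from Python with only the standard library. A sketch (the function names are mine; host and port are taken from the examples above):

```python
import json
from urllib import request as urlrequest

def predict_url(model, version=None, host="localhost:8501"):
    """Build the REST predict URL, optionally pinned to a version."""
    url = f"http://{host}/v1/models/{model}"
    if version is not None:
        url = f"{url}/versions/{version}"
    return f"{url}:predict"

def rest_predict(model, instances, version=None):
    """POST a JSON predict request, mirroring the curl calls above."""
    body = json.dumps({"instances": instances}).encode("utf-8")
    req = urlrequest.Request(predict_url(model, version), data=body, method="POST")
    with urlrequest.urlopen(req) as resp:
        return json.loads(resp.read())

# e.g. rest_predict("Toy", [[1.0, 2.0]], version=1)
```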
-
for gRPC
$ python3 grpcRequest.py -m Toy -v 1
# outputs {
#   key: "output_1"
#   value {
#     dtype: DT_FLOAT
#     tensor_shape {
#       dim {
#         size: 2
#       }
#       dim {
#         size: 1
#       }
#     }
#     float_val: 10.805429458618164
#     float_val: 14.010123252868652
#   }
# }
# model_spec {
#   name: "Toy"
#   version {
#     value: 1
#   }
#   signature_name: "serving_default"
# }
$ python3 grpcRequest.py -m Toy -v 2
# outputs {
#   key: "output_1"
#   value {
#     dtype: DT_FLOAT
#     tensor_shape {
#       dim {
#         size: 2
#       }
#       dim {
#         size: 1
#       }
#     }
#     float_val: 0.9990350008010864
#     float_val: 0.9997349381446838
#   }
# }
# model_spec {
#   name: "Toy"
#   version {
#     value: 2
#   }
#   signature_name: "serving_default"
# }
-
Set an alias label for each version. This is only available via gRPC.
model_config_list: {
  config: {
    name: "Toy",
    base_path: "/models/save/Toy/",
    model_platform: "tensorflow",
    model_version_policy: {
      specific {
        versions: 1
        versions: 2
      }
    },
    version_labels {
      key: 'stable',
      value: 1
    },
    version_labels {
      key: 'canary',
      value: 2
    }
  },
  config: {
    name: "Toy_double",
    base_path: "/models/save/Toy_double/",
    model_platform: "tensorflow"
  }
}
-
refer to https://www.tensorflow.org/tfx/serving/serving_config
Please note that labels can only be assigned to model versions that are loaded and available for serving. Once a model version is available, one may reload the model config on the fly, to assign a label to it (can be achieved using HandleReloadConfigRequest RPC endpoint).
You may need to delete the label-related part first, start TensorFlow Serving, and then add the label-related part back to the config file on the fly.
-
set flag
--allow_version_labels_for_unavailable_models
to true to be able to add version labels at the first run.

$ docker run --rm -p 8500:8500 -p 8501:8501 \
    --mount type=bind,source=$(pwd)/save/,target=/models/save \
    --mount type=bind,source=$(pwd)/config/versionlabels.config,target=/models/versionctrl.config \
    -it tensorflow/serving --model_config_file=/models/versionctrl.config \
    --model_config_file_poll_wait_seconds=60 \
    --allow_version_labels_for_unavailable_models
$ python3 grpcRequest.py -m Toy -l stable
# outputs {
#   key: "output_1"
#   value {
#     dtype: DT_FLOAT
#     tensor_shape {
#       dim {
#         size: 2
#       }
#       dim {
#         size: 1
#       }
#     }
#     float_val: 10.805429458618164
#     float_val: 14.010123252868652
#   }
# }
# model_spec {
#   name: "Toy"
#   version {
#     value: 1
#   }
#   signature_name: "serving_default"
# }
$ python3 grpcRequest.py -m Toy -l canary
# outputs {
#   key: "output_1"
#   value {
#     dtype: DT_FLOAT
#     tensor_shape {
#       dim {
#         size: 2
#       }
#       dim {
#         size: 1
#       }
#     }
#     float_val: 0.9990350008010864
#     float_val: 0.9997349381446838
#   }
# }
# model_spec {
#   name: "Toy"
#   version {
#     value: 2
#   }
#   signature_name: "serving_default"
# }
-
Batch configuration: you need to set
--enable_batching=true
and pass a configuration file to
--batching_parameters_file
. For more details, see the TensorFlow Serving batching guide.
-
CPU-only: One Approach
If your system is CPU-only (no GPU), then consider starting with the following values: num_batch_threads equal to the number of CPU cores; max_batch_size to infinity; batch_timeout_micros to 0. Then experiment with batch_timeout_micros values in the 1-10 millisecond (1000-10000 microsecond) range, while keeping in mind that 0 may be the optimal value.
-
GPU: One Approach
If your model uses a GPU device for part or all of its inference work, consider the following approach:
Set num_batch_threads to the number of CPU cores.
Temporarily set batch_timeout_micros to infinity while you tune max_batch_size to achieve the desired balance between throughput and average latency. Consider values in the hundreds or thousands.
For online serving, tune batch_timeout_micros to rein in tail latency. The idea is that batches normally get filled to max_batch_size, but occasionally when there is a lapse in incoming requests, to avoid introducing a latency spike it makes sense to process whatever's in the queue even if it represents an underfull batch. The best value for batch_timeout_micros is typically a few milliseconds, and depends on your context and goals. Zero is a value to consider; it works well for some workloads. (For bulk processing jobs, choose a large value, perhaps a few seconds, to ensure good throughput but not wait too long for the final (and likely underfull) batch.)

batch.config:

max_batch_size { value: 1 }
batch_timeout_micros { value: 0 }
max_enqueued_batches { value: 1000000 }
num_batch_threads { value: 8 }
-
example
-
server
$ docker run --rm -p 8500:8500 -p 8501:8501 \
    --mount type=bind,source=$(pwd),target=/models \
    --mount type=bind,source=$(pwd)/config/versionctrl.config,target=/models/versionctrl.config \
    -it tensorflow/serving --model_config_file=/models/versionctrl.config \
    --model_config_file_poll_wait_seconds=60 \
    --enable_batching=true \
    --batching_parameters_file=/models/batch/batchpara.config
-
client
-
returns the error
"Task size 2 is larger than maximum batch size 1"
$ python3 grpcRequest.py -m Toy -v 1
# Traceback (most recent call last):
#   File "grpcRequest.py", line 58, in <module>
#     resp = stub.Predict(request, timeout_req)
#   File "/Users/yongxiyang/opt/anaconda3/envs/tf2cpu/lib/python3.7/site-packages/grpc/_channel.py", line 824, in __call__
#     return _end_unary_response_blocking(state, call, False, None)
#   File "/Users/yongxiyang/opt/anaconda3/envs/tf2cpu/lib/python3.7/site-packages/grpc/_channel.py", line 726, in _end_unary_response_blocking
#     raise _InactiveRpcError(state)
# grpc._channel._InactiveRpcError: <_InactiveRpcError of RPC that terminated with:
#   status = StatusCode.INVALID_ARGUMENT
#   details = "Task size 2 is larger than maximum batch size 1"
#   debug_error_string = "{"created":"@1591246233.042335000","description":"Error received from peer ipv4:0.0.0.0:8500","file":"src/core/lib/surface/call.cc","file_line":1056,"grpc_message":"Task size 2 is larger than maximum batch size 1","grpc_status":3}"
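This error is the batching scheduler's admission rule at work: with `max_batch_size { value: 1 }` as configured above, a request carrying 2 rows exceeds the maximum batch size and is rejected outright rather than split (splitting is a separate, non-default option). A toy stdlib-only model of that greedy batching rule, just to illustrate the behavior:

```python
def schedule(tasks, max_batch_size):
    """Toy model of the server-side rule behind the error above:
    a task (request) whose size exceeds max_batch_size is rejected,
    otherwise tasks are packed greedily into batches."""
    batches, current, rejected = [], [], []
    size = 0
    for task in tasks:
        if task > max_batch_size:
            rejected.append(task)  # e.g. task size 2 vs. max_batch_size 1
            continue
        if size + task > max_batch_size:
            batches.append(current)
            current, size = [], 0
        current.append(task)
        size += task
    if current:
        batches.append(current)
    return batches, rejected
```

Raising `max_batch_size` in batchpara.config (or sending one row per request) avoids the rejection.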
-
-
monitor: pass file path to
--monitoring_config_file
monitor.config
prometheus_config {
  enable: true,
  path: "/models/metrics"
}
-
request through RESTful API
- example
-
server
$ docker run --rm -p 8500:8500 -p 8501:8501 -v "$(pwd):/models" \
    -it tensorflow/serving --model_config_file=/models/config/versionlabels.config \
    --model_config_file_poll_wait_seconds=60 \
    --allow_version_labels_for_unavailable_models \
    --monitoring_config_file=/models/monitor/monitor.config
-
client
$ curl -X GET http://localhost:8501/monitoring/prometheus/metrics
# # TYPE :tensorflow:api:op:using_fake_quantization gauge
# # TYPE :tensorflow:cc:saved_model:load_attempt_count counter
# :tensorflow:cc:saved_model:load_attempt_count{model_path="/models/save/Toy/1",status="success"} 1
# :tensorflow:cc:saved_model:load_attempt_count{model_path="/models/save/Toy/2",status="success"} 1
# ...
# # TYPE :tensorflow:cc:saved_model:load_latency counter
# :tensorflow:cc:saved_model:load_latency{model_path="/models/save/Toy/1"} 54436
# :tensorflow:cc:saved_model:load_latency{model_path="/models/save/Toy/2"} 45230
# ...
# # TYPE :tensorflow:mlir:import_failure_count counter
# # TYPE :tensorflow:serving:model_warmup_latency histogram
# # TYPE :tensorflow:serving:request_example_count_total counter
# # TYPE :tensorflow:serving:request_example_counts histogram
# # TYPE :tensorflow:serving:request_log_count counter
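The response uses the Prometheus text exposition format: `# TYPE` comment lines followed by `name{labels} value` sample lines. A minimal stdlib-only parser, just to show the line structure (real deployments would scrape with Prometheus itself or use a client library):

```python
def parse_metrics(text):
    """Minimal parser for the Prometheus text format shown above.
    Returns {metric_name_with_labels: value}, skipping comment lines."""
    out = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        name, _, value = line.rpartition(" ")
        out[name] = float(value)
    return out

# Sample taken from the curl output above.
sample = """\
# TYPE :tensorflow:cc:saved_model:load_latency counter
:tensorflow:cc:saved_model:load_latency{model_path="/models/save/Toy/1"} 54436
:tensorflow:cc:saved_model:load_latency{model_path="/models/save/Toy/2"} 45230
"""
```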
-
- example
-
show monitoring data in the Prometheus Docker container
-
modify your own Prometheus configuration file
# my global config
global:
  scrape_interval: 15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
  evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
  # scrape_timeout is set to the global default (10s).

# Alertmanager configuration
alerting:
  alertmanagers:
  - static_configs:
    - targets:
      # - alertmanager:9093

# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
rule_files:
  # - "first_rules.yml"
  # - "second_rules.yml"

# A scrape configuration containing exactly one endpoint to scrape:
# Here it's Prometheus itself.
scrape_configs:
  - job_name: 'tensorflow'
    scrape_interval: 5s
    metrics_path: '/monitoring/prometheus/metrics'
    static_configs:
      - targets: ['docker.for.mac.localhost:8501'] # for Mac users
      # - targets: ['127.0.0.1:8501']
-
start prometheus docker server
$ docker run --rm -ti --name prometheus -p 127.0.0.1:9090:9090 -v "$(pwd)/monitor:/tmp" prom/prometheus --config.file=/tmp/prometheus/prome.yaml
- access prometheus on the webUI
- check target and status
- webUI on localhost:9090
-
get the model's metadata (signature information)
$ curl http://localhost:8501/v1/models/Toy/metadata
-
get the information data structure with gRPC
$ python grpcMetadata.py -m Toy -v 2
# model_spec {
#   name: "Toy"
#   version {
#     value: 2
#   }
# }
# metadata {
#   key: "signature_def"
#   value {
#     type_url: "type.googleapis.com/tensorflow.serving.SignatureDefMap"
#     value: "\n\253\001\n\017serving_default\022\227\001\n;\n\007input_1\0220\n\031serving_default_input_1:0\020\001\032\021\022\013\010\377\377\377\377\377\377\377\377\377\001\022\002\010\002\022<\n\010output_1\0220\n\031StatefulPartitionedCall:0\020\001\032\021\022\013\010\377\377\377\377\377\377\377\377\377\001\022\002\010\001\032\032tensorflow/serving/predict\n>\n\025__saved_model_init_op\022%\022#\n\025__saved_model_init_op\022\n\n\004NoOp\032\002\030\001"
#   }
# }
-
Pull the TensorFlow Serving GPU image from Docker Hub.
docker pull tensorflow/serving:latest-gpu
-
Clone the serving repository if you haven't done it already.
git clone https://github.com/tensorflow/serving
-
Set
--runtime=nvidia
and use the tensorflow/serving:latest-gpu image.
docker run --runtime=nvidia -p 8501:8501 \
    -v "$(pwd)/${path_to_your_own_models}:/models/${user_define_model_name}" \
    -e MODEL_NAME=${user_define_model_name} tensorflow/serving:latest-gpu &
-
example
docker run --runtime=nvidia -p 8501:8501 -v "$(pwd)/save/Toy:/models/Toy" \
    -e MODEL_NAME=Toy tensorflow/serving:latest-gpu &
or
nvidia-docker run -p 8501:8501 -v "$(pwd)/save/Toy:/models/Toy" \
    -e MODEL_NAME=Toy tensorflow/serving:latest-gpu &
or
docker run --gpus all -p 8501:8501 -v "$(pwd)/save/Toy:/models/Toy" \
    -e MODEL_NAME=Toy tensorflow/serving:latest-gpu &
(use `--gpus 1` instead of `--gpus all` to expose a single GPU)
-
setup environment
pip install numpy tensorflow tensorflow-serving-api grpcio
-
grpc API for python
-
predict.proto
syntax = "proto3";

package tensorflow.serving;
option cc_enable_arenas = true;

import "tensorflow/core/framework/tensor.proto";
import "tensorflow_serving/apis/model.proto";

// PredictRequest specifies which TensorFlow model to run, as well as
// how inputs are mapped to tensors and how outputs are filtered before
// returning to user.
message PredictRequest {
  // Model Specification. If version is not specified, will use the latest
  // (numerical) version.
  ModelSpec model_spec = 1;  # for python `request.model_spec`

  // Input tensors.
  // Names of input tensor are alias names. The mapping from aliases to real
  // input tensor names is stored in the SavedModel export as a prediction
  // SignatureDef under the 'inputs' field.
  map<string, TensorProto> inputs = 2;  # for python `request.inputs` dictionary

  // Output filter.
  // Names specified are alias names. The mapping from aliases to real output
  // tensor names is stored in the SavedModel export as a prediction
  // SignatureDef under the 'outputs' field.
  // Only tensors specified here will be run/fetched and returned, with the
  // exception that when none is specified, all tensors specified in the
  // named signature will be run/fetched and returned.
  repeated string output_filter = 3;  # for python `request.output_filter` list; append values to it
}

// Response for PredictRequest on successful run.
message PredictResponse {
  // Effective Model Specification used to process PredictRequest.
  ModelSpec model_spec = 2;

  // Output tensors.
  map<string, TensorProto> outputs = 1;
}
-
model.proto
syntax = "proto3";

package tensorflow.serving;
option cc_enable_arenas = true;

import "google/protobuf/wrappers.proto";

// Metadata for an inference request such as the model name and version.
message ModelSpec {
  // Required servable name.
  string name = 1;

  // Optional choice of which version of the model to use.
  //
  // Recommended to be left unset in the common case. Should be specified only
  // when there is a strong version consistency requirement.
  //
  // When left unspecified, the system will serve the best available version.
  // This is typically the latest version, though during version transitions,
  // notably when serving on a fleet of instances, may be either the previous or
  // new version.
  # for python `request.model_spec.version.value`
  # or `request.model_spec.version_label`
  oneof version_choice {
    // Use this specific version number.
    google.protobuf.Int64Value version = 2;

    // Use the version associated with the given label.
    string version_label = 4;
  }

  // A named signature to evaluate. If unspecified, the default signature will
  // be used.
  # for python `request.model_spec.signature_name`
  string signature_name = 3;
}
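The repo's grpcRequest.py is not reproduced in this README, so the sketch below shows how the proto fields annotated above map onto Python client code. The server address, the "input_1" tensor alias, and the example instances are assumptions taken from the outputs shown in this document; imports are deferred into the function so the file can be read without tensorflow-serving-api installed:

```python
SERVER = "localhost:8500"  # assumed gRPC port from the docker run examples

def predict(model_name, version=None, label=None, instances=((1.0, 2.0),)):
    # Deferred imports: require `pip install tensorflow tensorflow-serving-api grpcio`.
    import grpc
    import tensorflow as tf
    from tensorflow_serving.apis import predict_pb2, prediction_service_pb2_grpc

    channel = grpc.insecure_channel(SERVER)
    stub = prediction_service_pb2_grpc.PredictionServiceStub(channel)

    request = predict_pb2.PredictRequest()
    request.model_spec.name = model_name            # ModelSpec.name
    if version is not None:
        request.model_spec.version.value = version  # oneof version_choice: version
    elif label is not None:
        request.model_spec.version_label = label    # oneof version_choice: version_label
    request.model_spec.signature_name = "serving_default"
    # "input_1" must match an input alias in the model's SignatureDef.
    request.inputs["input_1"].CopyFrom(
        tf.make_tensor_proto([list(row) for row in instances], dtype=tf.float32))
    return stub.Predict(request, 10.0)  # 10-second timeout
```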
-
run grpcRequest.py
$ python3 grpcRequest.py
# outputs {
#   key: "dense"
#   value {
#     dtype: DT_FLOAT
#     tensor_shape {
#       dim {
#         size: 2
#       }
#       dim {
#         size: 1
#       }
#     }
#     float_val: 0.9901617765426636
#     float_val: 0.9934704303741455
#   }
# }
# model_spec {
#   name: "Toy"
#   version {
#     value: 3
#   }
#   signature_name: "serving_default"
# }
-
get model status and reload model using grpc
-
run server
$ docker run --rm -p 8500:8500 -p 8501:8501 \
    --mount type=bind,source=$(pwd)/save/,target=/models/save \
    --mount type=bind,source=$(pwd)/config/versionlabels.config,target=/models/versionctrl.config \
    -it tensorflow/serving --model_config_file=/models/versionctrl.config \
    --model_config_file_poll_wait_seconds=60 \
    --allow_version_labels_for_unavailable_models
-
obtain model status info
$ python grpcModelStatus.py -m Toy -v 1
# model_version_status {
#   version: 1
#   state: AVAILABLE
#   status {
#   }
# }
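The repo's grpcModelStatus.py is not reproduced here; a hedged sketch of what such a script does, using the ModelService's GetModelStatus RPC. Imports are deferred so the file can be read without tensorflow-serving-api installed (server address assumed from the examples above):

```python
def get_model_status(model_name, version=None, server="localhost:8500"):
    """Query TensorFlow Serving's ModelService for a model's status."""
    # Deferred imports: require `pip install tensorflow-serving-api grpcio`.
    import grpc
    from tensorflow_serving.apis import get_model_status_pb2, model_service_pb2_grpc

    channel = grpc.insecure_channel(server)
    stub = model_service_pb2_grpc.ModelServiceStub(channel)

    request = get_model_status_pb2.GetModelStatusRequest()
    request.model_spec.name = model_name
    if version is not None:
        request.model_spec.version.value = version
    # Returns a GetModelStatusResponse containing model_version_status
    # entries (state: START / LOADING / AVAILABLE / UNLOADING / END).
    return stub.GetModelStatus(request, 10.0)  # 10-second timeout
```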
-
reload config file
$ python grpcReloadModel.py -m Toy
# model Toy reloaded successfully
-
from the server logs
# 2020-05-28 10:38:51.057588: I tensorflow_serving/model_servers/model_service_impl.cc:47]
# Config entry
#   index : 0
#   path : /models/save/Toy/
#   name : Toy
#   platform : tensorflow
# 2020-05-28 10:38:51.057775: I tensorflow_serving/model_servers/server_core.cc:462] Adding/updating models.
# 2020-05-28 10:38:51.057843: I tensorflow_serving/model_servers/server_core.cc:573] (Re-)adding model: Toy
# 2020-05-28 10:38:51.156301: I tensorflow_serving/core/loader_harness.cc:138] Quiescing servable version {name: Toy_double version: 1}
# 2020-05-28 10:38:51.156406: I tensorflow_serving/core/loader_harness.cc:145] Done quiescing servable version {name: Toy_double version: 1}
# 2020-05-28 10:38:51.156442: I tensorflow_serving/core/loader_harness.cc:120] Unloading servable version {name: Toy_double version: 1}
# ...
# 2020-05-28 10:38:52.083807: I tensorflow_serving/core/loader_harness.cc:128] Done unloading servable version {name: Toy version: 2}
-
-
run POSTreq.py
$ python3 POSTreq.py
# this request is based on instances
# True
# {
#     "predictions": [[0.990161777], [0.99347043]
#     ]
# }
# time consumption: 47.346710999999985ms
# this request is based on inputs
# True
# {
#     "outputs": [
#         [
#             0.985201657
#         ],
#         [
#             0.99923408
#         ]
#     ]
# }
# time consumption: 6.932738000000049ms
- SavedModel Warmup
- please see grpcRequestLog.py
--enable_model_warmup
: Enables model warmup using user-provided PredictionLogs in assets.extra/ directory