
Add possibility to export environment variables with .env file #1147

Closed
RobertLucian opened this issue Jun 20, 2020 · 14 comments · Fixed by #1154
Labels: enhancement (New feature or request)
RobertLucian (Member) commented Jun 20, 2020

Description

Add support for exporting environment variables from an .env file placed in the root directory of a Cortex project.

Motivation

Some users may not want to export environment variables through the predictor.env field in cortex.yaml, for example to keep the cortex.yaml deployment spec clean.
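For illustration, a hypothetical .env file at the project root might look like this (the variable names are made up; the format is plain KEY=value lines):

```shell
# .env at the root of the Cortex project (variable names are illustrative)
MY_API_KEY=abc123
LOG_LEVEL=debug
```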

@RobertLucian RobertLucian added the enhancement New feature or request label Jun 20, 2020
@RobertLucian RobertLucian self-assigned this Jun 20, 2020
@RobertLucian RobertLucian added this to Prioritize in Cortex Jun 20, 2020
@RobertLucian RobertLucian moved this from Prioritize to Done in Cortex Jun 23, 2020
@RobertLucian RobertLucian moved this from Done to In progress in Cortex Jun 23, 2020
Cortex automation moved this from In progress to Done Jun 23, 2020
@deliahu deliahu added this to the v0.18 milestone Nov 26, 2020
lminer commented Jan 5, 2021

Does this expose the environment variables to the tensorflow-serving image as well? I'm trying to read directly from S3 inside tensorflow-serving, and it isn't seeing my credentials.

RobertLucian (Member, Author) commented Jan 6, 2021

@lminer only the predictor.image container gets the variables exported from .env. So the answer is no; they aren't exported to the predictor.tensorflow_serving_image container.

In our view, predictor.tensorflow_serving_image is only for TFS; everything else is done in the API container (predictor.image). Could you instead start whatever service you need from dependencies.sh? dependencies.sh sees the exported environment variables, and it runs before the API, inside the API container. Here's more on dependencies.sh.
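As a sketch, a minimal dependencies.sh could look like the following (the background-service command is hypothetical, and MY_VAR stands in for any variable exported from .env):

```shell
#!/usr/bin/env bash
# dependencies.sh runs inside the API container, before the API server
# starts, and sees the environment variables exported from .env.
set -eu

# A hypothetical background service could be launched here, e.g.:
#   my-service --port 8500 &
msg="dependencies.sh sees MY_VAR=${MY_VAR:-unset}"
echo "$msg"
```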

Also, your API will inherit the cluster's AWS credentials (provided it's an AWS cluster; read this for more), so you don't have to export them again, unless these credentials are for S3 buckets that fall outside the scope of the API's credentials. Can you tell us more about this use case?

I see that you want to read from an S3 bucket. If it's a model you want to download, Cortex already handles that well; more on that here. Is this what you want? If not, can you tell us more about your case?

lminer commented Jan 6, 2021

@RobertLucian because of the gRPC limit on the amount of data I can send to tensorflow-serving, I thought I would instead send a link to an object in an S3 bucket, then read from and write to S3 directly within the model itself. However, to do this, I need some way to get the AWS credentials to the model, and it appears this can only be done via environment variables.
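That workaround could be sketched roughly as follows: the client sends an S3 URI instead of raw tensors, and the serving side fetches the object itself. The bucket/key names are hypothetical, and the boto3 call is shown commented out since it needs live credentials:

```python
# Sketch: parse an S3 URI sent over gRPC, then fetch the object server-side.
from urllib.parse import urlparse

def parse_s3_uri(uri):
    """Split s3://bucket/key into (bucket, key)."""
    parsed = urlparse(uri)
    if parsed.scheme != "s3":
        raise ValueError(f"not an S3 URI: {uri}")
    return parsed.netloc, parsed.path.lstrip("/")

# boto3 reads credentials from env vars such as AWS_ACCESS_KEY_ID and
# AWS_SECRET_ACCESS_KEY, which is why the container must have them set:
# import boto3
# bucket, key = parse_s3_uri("s3://my-bucket/inputs/sample.npy")
# body = boto3.client("s3").get_object(Bucket=bucket, Key=key)["Body"].read()

print(parse_s3_uri("s3://my-bucket/inputs/sample.npy"))
```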

RobertLucian (Member, Author):

@lminer I see. That would not be an optimal architecture.

We have increased the priority of #1740, and it's the next thing for me to work on. Since this seems urgent for you, we will probably make a patch release or create a customised image that works for you. Would that be acceptable?

lminer commented Jan 6, 2021

That would be great. Thank you!

deliahu (Member) commented Jan 7, 2021

@lminer I'd just like to add that the AWS credential environment variables are populated in the TF serving container. In addition, you can add custom env vars in both the predictor container and the TF serving container by specifying env in your API configuration, e.g.

- name: iris-classifier
  kind: RealtimeAPI
  predictor:
    type: tensorflow
    path: predictor.py
    models:
      path: s3://cortex-examples/tensorflow/iris-classifier/nn/
    env:
      MY_VAR: my-value

That said, I agree with @RobertLucian that the best approach will be to address #1740.

lminer commented Jan 7, 2021

Ah. Does this hold if you use a .env file too? I tried it that way, and tensorflow-serving wasn't seeing my AWS creds.

RobertLucian (Member, Author):

@lminer it doesn't hold for the .env file, though; that only applies to the env vars specified in the API spec.

lminer commented Jan 7, 2021

@deliahu Are you sure the env vars from the yaml file are propagated to the tensorflow-serving image? I'm still getting the error that the AWS creds aren't properly set. Moreover, when I attach to the tensorflow-serving Docker container and run printenv, I don't see the environment variables.

Here's my cortex.yaml

- name: pasta
  kind: RealtimeAPI
  predictor:
    type: tensorflow
    path: serving/cortex_server.py
    models:
      path: ../pasta_bucket/
      signature_key: serving_default
    image: quay.io/cortexlabs/tensorflow-predictor:0.25.0
    tensorflow_serving_image: quay.io/robertlucian/cortex-tensorflow-serving-gpu-tf2.4:0.25.0
    env:
      AWS_ACCESS_KEY_ID: foo
      AWS_SECRET_ACCESS_KEY: bar 

Here's the error:

2021-01-07 17:22:25.306428: W external/org_tensorflow/tensorflow/core/framework/op_kernel.cc:1763] OP_REQUIRES failed at whole_file_read_ops.cc:116 : Failed precondition: AWS Credentials have not been set properly. Unable to access the specified S3 location
<_InactiveRpcError of RPC that terminated with:
        status = StatusCode.FAILED_PRECONDITION
        details = "2 root error(s) found.
  (0) Failed precondition: AWS Credentials have not been set properly. Unable to access the specified S3 location
         [[{{node ReadFile}}]]
         [[StatefulPartitionedCall/StatefulPartitionedCall/StatefulPartitionedCall/while/body/_1860/while/StatefulPartitionedCall/StatefulPartitionedCall/up_sample_layer_2/batch_normalization_177/FusedBatchNormV3/truediv/_3347]]
  (1) Failed precondition: AWS Credentials have not been set properly. Unable to access the specified S3 location
         [[{{node ReadFile}}]]
0 successful operations.
0 derived errors ignored."
        debug_error_string = "{"created":"@1610040145.309085820","description":"Error received from peer ipv4:172.17.0.2:9000","file":"src/core/lib/surface/call.cc","file_line":1061,"grpc_message":"2 root error(s) found.\n  (0) Failed precondition: AWS Credentials have not been set properly. Unable to access the specified S3 location\n\t [[{{node ReadFile}}]]\n\t [[StatefulPartitionedCall/StatefulPartitionedCall/StatefulPartitionedCall/while/body/_1860/while/StatefulPartitionedCall/StatefulPartitionedCall/up_sample_layer_2/batch_normalization_177/FusedBatchNormV3/truediv/_3347]]\n  (1) Failed precondition: AWS Credentials have not been set properly. Unable to access the specified S3 location\n\t [[{{node ReadFile}}]]\n0 successful operations.\n0 derived errors ignored.","grpc_status":9}"

deliahu (Member) commented Jan 8, 2021

@lminer that is weird, it seemed to work for me when I tried last night. What command are you using to connect to the container?

I did something like:

kubectl exec api-iris-classifier-6c5556595c-mkv4l -c serve -it -- /bin/bash

And then, when I ran echo $AWS_ACCESS_KEY_ID, the value showed up.

lminer commented Jan 8, 2021

I used something like the following.

docker exec -it elegant_hertz /bin/bash

The image I'm using right now is non-standard so maybe that's it: quay.io/robertlucian/cortex-tensorflow-serving-gpu-tf2.4:0.25.0

deliahu (Member) commented Jan 8, 2021

Oh, I see. I think it's because I checked the container while it was running in the cluster, whereas you are checking locally. If you deploy your API to the cluster, the env vars should be there (they get populated at runtime and are not baked into the image).

lminer commented Jan 10, 2021

@deliahu It would be nice, if possible, to have the env vars show up locally as well; it makes testing a lot easier. I have found that the env vars show up locally in the API image, just not in the tensorflow-serving image.

deliahu (Member) commented Jan 10, 2021

@lminer yes, that makes sense. However, we have actually removed support for running Cortex locally in our latest release (v0.26). We've found that the best way to develop and test a predictor implementation locally is to import it in a separate Python file and call its __init__() and predict() functions directly. The best way to test an API is to deploy it to an actual dev/test cluster, since that most closely resembles the production environment.
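A rough sketch of that local-testing approach. The predictor class is inlined here as a stand-in (normally it would be imported from your predictor.py), and the injected tensorflow_client is replaced with a fake; all names and the canned response are illustrative:

```python
# Local test harness: exercise the predictor class directly instead of
# deploying it. FakeTensorFlowClient stands in for the client Cortex injects.

class FakeTensorFlowClient:
    """Stand-in for the tensorflow_client that Cortex passes to __init__."""
    def predict(self, model_input):
        return {"scores": [0.1, 0.9]}  # canned response for testing

# Normally: from predictor import TensorFlowPredictor
class TensorFlowPredictor:  # inline stand-in for the real predictor.py class
    def __init__(self, tensorflow_client, config):
        self.client = tensorflow_client
        self.config = config

    def predict(self, payload):
        return self.client.predict(payload)

predictor = TensorFlowPredictor(FakeTensorFlowClient(), config={})
print(predictor.predict({"input": [1.0, 2.0]}))  # → {'scores': [0.1, 0.9]}
```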
