Unable to import numpy in AWS Lambda function #13465

Closed
ljames1 opened this issue May 4, 2019 · 18 comments

@ljames1

ljames1 commented May 4, 2019

  • how you installed Python
    deleted python3 on my mac and then brew install python to install python 3.7.3
  • how you installed numpy
    Since I am installing to run on AWS Lambda, pip install numpy --target .
  • your operating system
    macOS Sierra 10.12.6
  • whether or not you have multiple versions of Python installed
    the native python 2.7.10 is still installed on my mac
  • if you built from source, your compiler versions and ideally a build log
    N/A

I am able to run my code locally with no issues. I am using pandas, matplotlib, boto3, and mpld3 to organize and display data in an AWS DDB table with matplotlib graphs that mpld3 turns into HTML. numpy appears to be needed for pandas, and whenever I install these libraries to a target directory so they can run in a Lambda function, according to https://docs.aws.amazon.com/lambda/latest/dg/lambda-python-how-to-create-deployment-package.html, I get the following error:


Importing the multiarray numpy extension module failed.  Most
likely you are trying to import a failed build of numpy.
Here is how to proceed:
- If you're working with a numpy git repository, try `git clean -xdf`
  (removes all files not under version control) and rebuild numpy.
- If you are simply trying to use the numpy version that you have installed:
  your installation is broken - please reinstall numpy.
- If you have already reinstalled and that did not fix the problem, then:
  1. Check that you are using the Python you expect (you're using /var/lang/bin/python3.6),
     and that you have no directories in your PATH or PYTHONPATH that can
     interfere with the Python and numpy versions you're trying to use.
  2. If (1) looks fine, you can open a new issue at
     https://github.com/numpy/numpy/issues.  Please include details on:
     - how you installed Python
     - how you installed numpy
     - your operating system
     - whether or not you have multiple versions of Python installed
     - if you built from source, your compiler versions and ideally a build log

     Note: this error has many possible causes, so please don't comment on
     an existing issue about this - open a new one instead.

Original error was: No module named 'numpy.core._multiarray_umath'


END RequestId: 2fc65f50-420d-441b-930c-665b1c8ab3ea
REPORT RequestId: 2fc65f50-420d-441b-930c-665b1c8ab3ea	Duration: 0.85 ms	Billed Duration: 100 ms 	Memory Size: 128 MB	Max Memory Used: 40 MB
  
@ljames1
Author

ljames1 commented May 4, 2019

To isolate the issue, I retried the above with:

def main(event, context):

    a = 100
    print(a)
    return

and

import numpy as np

def main(event, context):

    a = np.array(100)
    print(a)
    return

Both files were zipped with a package directory containing a targeted installation of numpy, following https://docs.aws.amazon.com/lambda/latest/dg/lambda-python-how-to-create-deployment-package.html. After zipping, the archives were ~16MB, so they did contain the numpy installation. The first example, which does not import numpy, runs fine when testing the Lambda function, but the second fails with the original error message.
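One way to narrow this down further is to look at which compiled extension files ended up in the archive. The following is a minimal sketch, assuming CPython's usual file naming (for example a `-darwin.so` suffix on macOS builds and `.pyd` on Windows builds); the function names are hypothetical and not part of numpy or any AWS tool.

```python
import zipfile


def looks_foreign(path):
    """Guess whether a compiled extension was built for a non-Linux OS.

    Assumption: macOS builds end in "-darwin.so" (or ".dylib") and
    Windows builds end in ".pyd"; Linux builds end in ".so" without "darwin".
    """
    lowered = path.lower()
    if lowered.endswith((".pyd", ".dylib")):
        return True
    return lowered.endswith(".so") and "darwin" in lowered


def find_foreign_extensions(zip_file):
    """Return archive entries that look like non-Linux compiled extensions."""
    with zipfile.ZipFile(zip_file) as archive:
        return [name for name in archive.namelist() if looks_foreign(name)]
```

If `find_foreign_extensions("pony.zip")` returns any entries, the package was most likely installed for the local OS rather than for Linux, which would be consistent with the missing `numpy.core._multiarray_umath` module.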

@ljames1 changed the title from "Numpy breaking when running pandas in lambda. Original error was: No module named 'numpy.core._multiarray_umath'" to "Unable to import numpy in AWS Lambda function" May 4, 2019
@rgommers
Member

rgommers commented May 4, 2019

That AWS guide isn't telling you the whole story. The Amazon Linux distribution isn't compatible with a regular NumPy install like from PyPI or conda-forge.

I suggest you follow one of these guides or use a zip file from one of these repos:
https://medium.com/@samme/setting-up-python-3-6-aws-lambda-deployment-package-with-numpy-scipy-pillow-and-scikit-image-de488b2afca6
https://medium.com/@korniichuk/lambda-with-pandas-fd81aa2ff25e
https://blog.orikami.nl/building-scipy-pandas-and-numpy-for-aws-lambda-python-3-6-cba9355b44e9
https://github.com/pbegle/aws-lambda-py3.6-pandas-numpy
https://github.com/vitolimandibhrata/aws-lambda-numpy

This is not a NumPy bug, so I'll close the issue.

@rgommers rgommers closed this as completed May 4, 2019
@ljames1
Author

ljames1 commented May 6, 2019 via email

@iceback

iceback commented Oct 22, 2019

@rgommers Do you know of any python3.7-related tacks on this? I've tried a couple from your list for 3.6 and always end up with the same gripe about the _multiarray lib. I need to make an AWS "layer" since the combination I need (numpy, scipy, pandas) exceeds the size allowance of a single function.

@leehagoodjames

@iceback if your error was similar to mine, which was No module named 'numpy.core._multiarray_umath', it is caused by installing numpy on a different OS from the one it will run on.

AWS Lambda runs AWS Linux at runtime, so any target-installed packages must be installed for that OS. If you target-install numpy to a package directory via pip install --target ./package numpy while running another OS (such as macOS), the import will fail in Lambda, because pip selects binaries built for your native OS.

To get around this, you have two options:

  1. Perform the target installation from a machine running AWS Linux, such as an EC2 instance. This worked great for me.
  2. Figure out how to specify the runtime OS in your target install command. If you find a good way to do this, please share 😄

@iceback

iceback commented Oct 22, 2019

I'll have to revisit all the things I tried last week which included the @korniichuk and github/pbegle and lord knows what else but always ended up in same spot. Building numpy on an EC2 is not going well. Perhaps wrong Cython version (only 0.27 available on aws linux). Can you elaborate on your #1. How do you get from an installed numpy on one instance to a layer on Lambda?

@leehagoodjames

@iceback Ya this was frustrating when I first did it, but it is definitely possible. I currently run Lambda functions with Python 3.7 with the 3 packages you mentioned.

I performed step 1 above by following the steps outlined in Updating a Function with Additional Dependencies, described here. For simplicity's sake, I am going to assume that your local python file is named pony.py and that the Lambda function is invoked through the ride_pony function.

  1. Launch and connect to an EC2 instance, such as an EC2 micro instance. Find instructions here. Ensure that the instance runs AWS Linux.
  2. Check that python3 is installed. If it is not installed, install it with sudo yum install python3 -y
  3. Make a directory for your local packages. mkdir package
  4. Install the libraries you want to use in Lambda. FYI I always call pip as a module to avoid confusion between Python2 and Python3 pips. Also, some installation-order may be preferred in these installations and that may be worth researching.
       • python3 -m pip install --target ./package numpy
       • python3 -m pip install --target ./package pandas
       • python3 -m pip install --target ./package scipy
  5. Move into the installation directory with cd package
  6. Zip the installation directory with zip -r9 ../pony.zip .
  7. Now, from your local machine, scp pony.zip from your EC2 instance to your local machine with the directions here.
  8. From your local machine, add pony.py to pony.zip with zip -g pony.zip pony.py. Your zip file now contains your lambda code along with the python modules needed, which were installed to target with an AWS Linux OS.
  9. Upload the zip file to S3 (which allows for larger lambda functions, up to 50 MB). If your code is small, you can skip to step 10.
  10. Go into your lambda function (or update from the CLI) to use the code in pony.zip for that lambda function

Note: ensure that your Lambda function's handler points to your file name and main function, pony.ride_pony, if it is not the default lambda_function.lambda_handler.
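The two zip steps above (zipping the contents of the package directory, then adding the handler file) can also be sketched in Python. This is only an illustration of the resulting archive layout, assuming the packages in the directory were already installed on a Linux machine as described; `build_deployment_zip` is a hypothetical helper name, not an AWS tool.

```python
import zipfile
from pathlib import Path


def build_deployment_zip(package_dir, handler_file, zip_path):
    """Zip the contents of package_dir plus handler_file at the archive root.

    Assumption: zip_path lies outside package_dir, so the archive does not
    try to include itself.
    """
    package_dir = Path(package_dir)
    handler_file = Path(handler_file)
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for path in sorted(package_dir.rglob("*")):
            if path.is_file():
                # Store paths relative to package_dir so numpy/ sits at the root,
                # like running "zip -r9 ../pony.zip ." from inside the directory.
                archive.write(path, path.relative_to(package_dir).as_posix())
        # Equivalent of "zip -g pony.zip pony.py": handler goes at the root too.
        archive.write(handler_file, handler_file.name)
    return zip_path
```

For example, `build_deployment_zip("package", "pony.py", "pony.zip")` would produce an archive with `numpy/`, `pandas/`, `scipy/` and `pony.py` side by side at the top level.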

@iceback

iceback commented Oct 22, 2019

Thank you very much. I believe I have a shot! (My understanding is that the top of the zip has to be "python" for a Lambda layer so I'll use that instead of "package")
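A quick way to sanity-check that layout before uploading is to list any archive entries that do not sit under a top-level "python" directory. This is a minimal sketch based on the understanding stated above; the function name is hypothetical.

```python
import zipfile


def entries_outside_python_dir(zip_file):
    """Return archive entries that are not under the top-level python/ directory."""
    with zipfile.ZipFile(zip_file) as archive:
        return [name for name in archive.namelist()
                if not name.startswith("python/")]
```

An empty list means every file in the layer zip lives under `python/`; anything returned is a path that would need to be moved.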

@iceback

iceback commented Oct 22, 2019

Shout that how-to out loud and proud! Not sure where I went afoul of the other suggestion (though they were for python3.6) but I'm now back to working on my function code. Thanks a ton.

@rgommers
Member

AWS has now also published a layer that includes NumPy and SciPy. From https://aws.amazon.com/blogs/aws/new-for-aws-lambda-use-any-programming-language-and-share-common-components/:

Based on our customer feedback, and to provide an example of how to use Lambda Layers, we are publishing a public layer which includes NumPy and SciPy, two popular scientific libraries for Python. This prebuilt and optimized layer can help you start very quickly with data processing and machine learning applications.

@iceback

iceback commented Oct 23, 2019 via email

@ajbenz18

What worked for me was using a Linux version of the numpy library (I use macOS). I went to https://pypi.org/project/numpy/#files and downloaded the .whl file for the version I was looking for (for me, it was numpy-1.19.0-cp37-cp37m-manylinux1_x86_64.whl). Next, go to the terminal and unzip it with unzip numpy-1.19.0-cp37-cp37m-manylinux1_x86_64.whl. This should give you the numpy version that will work on Lambda. Then zip everything up as you were doing before and upload it. In the end, an incredibly frustrating problem was solved pretty simply.
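The wheel filename itself says which platform a build targets, and a wheel is an ordinary zip archive, so both steps can be sketched in Python. This is a rough illustration, assuming the standard wheel naming scheme (name-version[-build]-python-abi-platform.whl); the helper names are hypothetical.

```python
import zipfile


def parse_wheel_filename(filename):
    """Split a wheel filename into its name, version and compatibility tags."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel filename: {filename}")
    parts = filename[: -len(".whl")].split("-")
    if len(parts) not in (5, 6):  # a 6th part is the optional build tag
        raise ValueError(f"unexpected wheel filename: {filename}")
    python_tag, abi_tag, platform_tag = parts[-3:]
    return {"name": parts[0], "version": parts[1], "python": python_tag,
            "abi": abi_tag, "platform": platform_tag}


def runs_on_linux(filename):
    """True if the wheel is tagged for Linux or is platform-independent."""
    tags = parse_wheel_filename(filename)["platform"].split(".")
    return any(tag == "any" or "linux" in tag for tag in tags)


def unpack_wheel(wheel_path, target_dir):
    """Extract a wheel into target_dir, like running unzip on it."""
    with zipfile.ZipFile(wheel_path) as wheel:
        wheel.extractall(target_dir)
```

With the file named above, `runs_on_linux("numpy-1.19.0-cp37-cp37m-manylinux1_x86_64.whl")` is true, while a wheel whose platform tag starts with `macosx` would be rejected.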

@kp101994

I had the same problem. The solution that worked for me was to uninstall numpy from my PC (Windows 7).
Then I added a layer in the AWS Lambda function, something called AWSlambda scipy, and that's it.

@tuomastik

tuomastik commented Nov 4, 2020

I ran into the same issue with Windows but was able to solve it by either of the following approaches:

  • Build the Lambda with AWS SAM (Serverless Application Model) running on Ubuntu that runs on WSL (Windows Subsystem for Linux)
  • Build the Lambda with AWS SAM using the --use-container flag, which has the following explanation in the docs:

    If your functions depend on packages that have natively compiled dependencies, use this flag to build your function inside an AWS Lambda-like Docker container.

@DylanAlbertazzi

AWS provides a layer with Numpy and Scipy. Inside of the lambda console just click Add Layer then AWS Layers and it will come up at the top.

@ahulist

ahulist commented Jun 21, 2021

I'm having the same issue with AWS Data Wrangler + Numpy/Scipy combination. I need Data Wrangler to read Excel files from S3. The problem here is probably the fact that Data Wrangler incorporates Pandas which in turn incorporates its own Numpy... which is incompatible with Numpy from AWS Numpy/Scipy official Layer.
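To see which copy of numpy Python would actually pick up, it may help to look up where the module resolves from without importing it. The following is only a diagnostic sketch, not a fix; it assumes the usual Lambda locations (layers extracted under /opt, function code under /var/task), and the function names are hypothetical.

```python
import importlib.util


def classify_origin(origin):
    """Label a module file path by where Lambda would have loaded it from."""
    if origin is None:
        return "not found"
    if origin.startswith("/opt/"):
        return "layer"  # assumption: layers are extracted under /opt
    if origin.startswith("/var/task/"):
        return "function package"  # assumption: function code lives in /var/task
    return "other"


def describe_module_source(module_name):
    """Report where a top-level module would be loaded from, without importing it."""
    spec = importlib.util.find_spec(module_name)
    origin = spec.origin if spec is not None else None
    return f"{module_name}: {classify_origin(origin)} ({origin})"
```

Printing `describe_module_source("numpy")` from the handler would show whether the numpy on the path comes from the layer or from the package bundled with the function.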
How do I get around that?

@rkochar

rkochar commented Dec 3, 2023

Use manylinux.

Example: pip install --platform=manylinux_2_17_x86_64 --only-binary=:all: -r requirements.txt -t .
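If the build is driven from a script, the same pip invocation can be assembled in Python. This is a small sketch assuming pip's `--platform`, `--only-binary`, `--python-version` and `--target` options; the helper names and the default platform tag (copied from the example above) are illustrative choices, not requirements.

```python
import subprocess
import sys


def build_pip_command(requirements, target_dir,
                      platform="manylinux_2_17_x86_64", python_version=None):
    """Build a pip command that fetches Linux wheels regardless of the host OS."""
    command = [sys.executable, "-m", "pip", "install",
               "--platform", platform,
               "--only-binary=:all:",  # never fall back to building from source
               "--target", str(target_dir)]
    if python_version:
        # Match the Lambda runtime, e.g. "3.7", instead of the local interpreter.
        command += ["--python-version", python_version]
    return command + list(requirements)


def install(requirements, target_dir, **options):
    """Run the command; raises CalledProcessError if pip fails."""
    subprocess.run(build_pip_command(requirements, target_dir, **options),
                   check=True)
```

For instance, `install(["numpy", "pandas"], "package", python_version="3.7")` would download manylinux wheels into `package` even when run from macOS or Windows.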

@Nephthys76

Use manylinux.

Example: pip install --platform=manylinux_2_17_x86_64 --only-binary=:all: -r requirements.txt -t .

This solved the error for me. Thank you for replying to such an old thread with your solution!
