Model Loading #246

Open
ArijitSinghEDA opened this issue Feb 6, 2024 · 0 comments

I am loading the model like this:

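import sparknlp
import nlu

# Start a Spark session with Spark NLP on the classpath.
spark = sparknlp.start()
# header=True so the "text" column can be selected by name.
df = spark.read.csv("nlp_data.csv", header=True)
# Collect the "text" column into a plain Python list of strings.
texts = df.select("text").rdd.flatMap(lambda x: x).collect()
res = nlu.load("pos").predict(texts)
print(res)
spark.stop()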

Each time I get the following messages in my console:

com.johnsnowlabs.nlp#spark-nlp_2.12 added as a dependency
:: resolving dependencies :: org.apache.spark#spark-submit-parent-3f17e4b8-0bdf-40c5-9879-d62f9c2dc974;1.0
        confs: [default]
        found com.johnsnowlabs.nlp#spark-nlp_2.12;5.2.3 in central
        found com.typesafe#config;1.4.2 in central
        found org.rocksdb#rocksdbjni;6.29.5 in central
        found com.amazonaws#aws-java-sdk-s3;1.12.500 in central
        found com.amazonaws#aws-java-sdk-kms;1.12.500 in central
        found com.amazonaws#aws-java-sdk-core;1.12.500 in central
        found commons-logging#commons-logging;1.1.3 in central
        found commons-codec#commons-codec;1.15 in central
        found org.apache.httpcomponents#httpclient;4.5.13 in central
        found org.apache.httpcomponents#httpcore;4.4.13 in central
        found software.amazon.ion#ion-java;1.0.2 in central
        found com.fasterxml.jackson.dataformat#jackson-dataformat-cbor;2.12.6 in central
        found joda-time#joda-time;2.8.1 in central
        found com.amazonaws#jmespath-java;1.12.500 in central
        found com.github.universal-automata#liblevenshtein;3.0.0 in central
        found com.google.protobuf#protobuf-java-util;3.0.0-beta-3 in central
        found com.google.protobuf#protobuf-java;3.0.0-beta-3 in central
        found com.google.code.gson#gson;2.3 in central
        found it.unimi.dsi#fastutil;7.0.12 in central
        found org.projectlombok#lombok;1.16.8 in central
        found com.google.cloud#google-cloud-storage;2.20.1 in central
        found com.google.guava#guava;31.1-jre in central
        found com.google.guava#failureaccess;1.0.1 in central
        found com.google.guava#listenablefuture;9999.0-empty-to-avoid-conflict-with-guava in central
        found com.google.errorprone#error_prone_annotations;2.18.0 in central
        found com.google.j2objc#j2objc-annotations;1.3 in central
        found com.google.http-client#google-http-client;1.43.0 in central
        found io.opencensus#opencensus-contrib-http-util;0.31.1 in central
        found com.google.http-client#google-http-client-jackson2;1.43.0 in central
        found com.google.http-client#google-http-client-gson;1.43.0 in central
        found com.google.api-client#google-api-client;2.2.0 in central
        found com.google.oauth-client#google-oauth-client;1.34.1 in central
        found com.google.http-client#google-http-client-apache-v2;1.43.0 in central
        found com.google.apis#google-api-services-storage;v1-rev20220705-2.0.0 in central
        found com.google.code.gson#gson;2.10.1 in central
        found com.google.cloud#google-cloud-core;2.12.0 in central
        found io.grpc#grpc-context;1.53.0 in central
        found com.google.auto.value#auto-value-annotations;1.10.1 in central
        found com.google.auto.value#auto-value;1.10.1 in central
        found javax.annotation#javax.annotation-api;1.3.2 in central
        found com.google.cloud#google-cloud-core-http;2.12.0 in central
        found com.google.http-client#google-http-client-appengine;1.43.0 in central
        found com.google.api#gax-httpjson;0.108.2 in central
        found com.google.cloud#google-cloud-core-grpc;2.12.0 in central
        found io.grpc#grpc-alts;1.53.0 in central
        found io.grpc#grpc-grpclb;1.53.0 in central
        found org.conscrypt#conscrypt-openjdk-uber;2.5.2 in central
        found io.grpc#grpc-auth;1.53.0 in central
        found io.grpc#grpc-protobuf;1.53.0 in central
        found io.grpc#grpc-protobuf-lite;1.53.0 in central
        found io.grpc#grpc-core;1.53.0 in central
        found com.google.api#gax;2.23.2 in central
        found com.google.api#gax-grpc;2.23.2 in central
        found com.google.auth#google-auth-library-credentials;1.16.0 in central
        found com.google.auth#google-auth-library-oauth2-http;1.16.0 in central
        found com.google.api#api-common;2.6.2 in central
        found io.opencensus#opencensus-api;0.31.1 in central
        found com.google.api.grpc#proto-google-iam-v1;1.9.2 in central
        found com.google.protobuf#protobuf-java;3.21.12 in central
        found com.google.protobuf#protobuf-java-util;3.21.12 in central
        found com.google.api.grpc#proto-google-common-protos;2.14.2 in central
        found org.threeten#threetenbp;1.6.5 in central
        found com.google.api.grpc#proto-google-cloud-storage-v2;2.20.1-alpha in central
        found com.google.api.grpc#grpc-google-cloud-storage-v2;2.20.1-alpha in central
        found com.google.api.grpc#gapic-google-cloud-storage-v2;2.20.1-alpha in central
        found com.fasterxml.jackson.core#jackson-core;2.14.2 in central
        found com.google.code.findbugs#jsr305;3.0.2 in central
        found io.grpc#grpc-api;1.53.0 in central
        found io.grpc#grpc-stub;1.53.0 in central
        found org.checkerframework#checker-qual;3.31.0 in central
        found io.perfmark#perfmark-api;0.26.0 in central
        found com.google.android#annotations;4.1.1.4 in central
        found org.codehaus.mojo#animal-sniffer-annotations;1.22 in central
        found io.opencensus#opencensus-proto;0.2.0 in central
        found io.grpc#grpc-services;1.53.0 in central
        found com.google.re2j#re2j;1.6 in central
        found io.grpc#grpc-netty-shaded;1.53.0 in central
        found io.grpc#grpc-googleapis;1.53.0 in central
        found io.grpc#grpc-xds;1.53.0 in central
        found com.navigamez#greex;1.0 in central
        found dk.brics.automaton#automaton;1.11-8 in central
        found com.johnsnowlabs.nlp#tensorflow-cpu_2.12;0.4.4 in central
        found com.microsoft.onnxruntime#onnxruntime;1.16.3 in central
:: resolution report :: resolve 1966ms :: artifacts dl 54ms
        :: modules in use:
        com.amazonaws#aws-java-sdk-core;1.12.500 from central in [default]
        com.amazonaws#aws-java-sdk-kms;1.12.500 from central in [default]
        com.amazonaws#aws-java-sdk-s3;1.12.500 from central in [default]
        com.amazonaws#jmespath-java;1.12.500 from central in [default]
        com.fasterxml.jackson.core#jackson-core;2.14.2 from central in [default]
        com.fasterxml.jackson.dataformat#jackson-dataformat-cbor;2.12.6 from central in [default]
        com.github.universal-automata#liblevenshtein;3.0.0 from central in [default]
        com.google.android#annotations;4.1.1.4 from central in [default]
        com.google.api#api-common;2.6.2 from central in [default]
        com.google.api#gax;2.23.2 from central in [default]
        com.google.api#gax-grpc;2.23.2 from central in [default]
        com.google.api#gax-httpjson;0.108.2 from central in [default]
        com.google.api-client#google-api-client;2.2.0 from central in [default]
        com.google.api.grpc#gapic-google-cloud-storage-v2;2.20.1-alpha from central in [default]
        com.google.api.grpc#grpc-google-cloud-storage-v2;2.20.1-alpha from central in [default]
        com.google.api.grpc#proto-google-cloud-storage-v2;2.20.1-alpha from central in [default]
        com.google.api.grpc#proto-google-common-protos;2.14.2 from central in [default]
        com.google.api.grpc#proto-google-iam-v1;1.9.2 from central in [default]
        com.google.apis#google-api-services-storage;v1-rev20220705-2.0.0 from central in [default]
        com.google.auth#google-auth-library-credentials;1.16.0 from central in [default]
        com.google.auth#google-auth-library-oauth2-http;1.16.0 from central in [default]
        com.google.auto.value#auto-value;1.10.1 from central in [default]
        com.google.auto.value#auto-value-annotations;1.10.1 from central in [default]
        com.google.cloud#google-cloud-core;2.12.0 from central in [default]
        com.google.cloud#google-cloud-core-grpc;2.12.0 from central in [default]
        com.google.cloud#google-cloud-core-http;2.12.0 from central in [default]
        com.google.cloud#google-cloud-storage;2.20.1 from central in [default]
        com.google.code.findbugs#jsr305;3.0.2 from central in [default]
        com.google.code.gson#gson;2.10.1 from central in [default]
        com.google.errorprone#error_prone_annotations;2.18.0 from central in [default]
        com.google.guava#failureaccess;1.0.1 from central in [default]
        com.google.guava#guava;31.1-jre from central in [default]
        com.google.guava#listenablefuture;9999.0-empty-to-avoid-conflict-with-guava from central in [default]
        com.google.http-client#google-http-client;1.43.0 from central in [default]
        com.google.http-client#google-http-client-apache-v2;1.43.0 from central in [default]
        com.google.http-client#google-http-client-appengine;1.43.0 from central in [default]
        com.google.http-client#google-http-client-gson;1.43.0 from central in [default]
        com.google.http-client#google-http-client-jackson2;1.43.0 from central in [default]
        com.google.j2objc#j2objc-annotations;1.3 from central in [default]
        com.google.oauth-client#google-oauth-client;1.34.1 from central in [default]
        com.google.protobuf#protobuf-java;3.21.12 from central in [default]
        com.google.protobuf#protobuf-java-util;3.21.12 from central in [default]
        com.google.re2j#re2j;1.6 from central in [default]
        com.johnsnowlabs.nlp#spark-nlp_2.12;5.2.3 from central in [default]
        com.johnsnowlabs.nlp#tensorflow-cpu_2.12;0.4.4 from central in [default]
        com.microsoft.onnxruntime#onnxruntime;1.16.3 from central in [default]
        com.navigamez#greex;1.0 from central in [default]
        com.typesafe#config;1.4.2 from central in [default]
        commons-codec#commons-codec;1.15 from central in [default]
        commons-logging#commons-logging;1.1.3 from central in [default]
        dk.brics.automaton#automaton;1.11-8 from central in [default]
        io.grpc#grpc-alts;1.53.0 from central in [default]
        io.grpc#grpc-api;1.53.0 from central in [default]
        io.grpc#grpc-auth;1.53.0 from central in [default]
        io.grpc#grpc-context;1.53.0 from central in [default]
        io.grpc#grpc-core;1.53.0 from central in [default]
        io.grpc#grpc-googleapis;1.53.0 from central in [default]
        io.grpc#grpc-grpclb;1.53.0 from central in [default]
        io.grpc#grpc-netty-shaded;1.53.0 from central in [default]
        io.grpc#grpc-protobuf;1.53.0 from central in [default]
        io.grpc#grpc-protobuf-lite;1.53.0 from central in [default]
        io.grpc#grpc-services;1.53.0 from central in [default]
        io.grpc#grpc-stub;1.53.0 from central in [default]
        io.grpc#grpc-xds;1.53.0 from central in [default]
        io.opencensus#opencensus-api;0.31.1 from central in [default]
        io.opencensus#opencensus-contrib-http-util;0.31.1 from central in [default]
        io.opencensus#opencensus-proto;0.2.0 from central in [default]
        io.perfmark#perfmark-api;0.26.0 from central in [default]
        it.unimi.dsi#fastutil;7.0.12 from central in [default]
        javax.annotation#javax.annotation-api;1.3.2 from central in [default]
        joda-time#joda-time;2.8.1 from central in [default]
        org.apache.httpcomponents#httpclient;4.5.13 from central in [default]
        org.apache.httpcomponents#httpcore;4.4.13 from central in [default]
        org.checkerframework#checker-qual;3.31.0 from central in [default]
        org.codehaus.mojo#animal-sniffer-annotations;1.22 from central in [default]
        org.conscrypt#conscrypt-openjdk-uber;2.5.2 from central in [default]
        org.projectlombok#lombok;1.16.8 from central in [default]
        org.rocksdb#rocksdbjni;6.29.5 from central in [default]
        org.threeten#threetenbp;1.6.5 from central in [default]
        software.amazon.ion#ion-java;1.0.2 from central in [default]
        :: evicted modules:
        commons-logging#commons-logging;1.2 by [commons-logging#commons-logging;1.1.3] in [default]
        commons-codec#commons-codec;1.11 by [commons-codec#commons-codec;1.15] in [default]
        com.google.protobuf#protobuf-java-util;3.0.0-beta-3 by [com.google.protobuf#protobuf-java-util;3.21.12] in [default]
        com.google.protobuf#protobuf-java;3.0.0-beta-3 by [com.google.protobuf#protobuf-java;3.21.12] in [default]
        com.google.code.gson#gson;2.3 by [com.google.code.gson#gson;2.10.1] in [default]
        ---------------------------------------------------------------------
        |                  |            modules            ||   artifacts   |
        |       conf       | number| search|dwnlded|evicted|| number|dwnlded|
        ---------------------------------------------------------------------
        |      default     |   85  |   0   |   0   |   5   ||   80  |   0   |
        ---------------------------------------------------------------------
:: retrieving :: org.apache.spark#spark-submit-parent-3f17e4b8-0bdf-40c5-9879-d62f9c2dc974
        confs: [default]
        0 artifacts copied, 80 already retrieved (0kB/27ms)


pos_anc download started this may take some time.
Approximate size to download 3.9 MB
[ / ]pos_anc download started this may take some time.
Approximate size to download 3.9 MB
[ — ]Download done! Loading the resource.
[OK!]
sentence_detector_dl download started this may take some time.
Approximate size to download 354.6 KB
[ | ]sentence_detector_dl download started this may take some time.
Approximate size to download 354.6 KB
[ / ]Download done! Loading the resource.
[ — ]2024-02-06 14:43:45.340048: I external/org_tensorflow/tensorflow/core/platform/cpu_feature_guard.cc:151] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
[OK!]

Does this indicate that I am downloading the model(s) from the internet again and again, or are they being loaded from the jar files?
I assume the jar files are now on my local system, since the first install of spark-nlp took some time, and now the jar information is printed almost immediately when I run the code.
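
One rough way to check what is already stored locally is to look at the on-disk caches directly. The sketch below uses only the Python standard library and reports how many entries and how many bytes sit in two directories. Both paths are assumptions, not taken from the log above: `~/cache_pretrained` is assumed to be the default Spark NLP model cache and `~/.ivy2/jars` the default Ivy jar location, and either may be configured differently on a given machine. `summarize_dir` is a hypothetical helper name used for illustration.

```python
from pathlib import Path


def summarize_dir(path):
    """Return (top_level_entry_count, total_bytes) for a directory,
    or None if the directory does not exist."""
    root = Path(path).expanduser()
    if not root.is_dir():
        return None
    entry_count = sum(1 for _ in root.iterdir())
    total_bytes = sum(f.stat().st_size for f in root.rglob("*") if f.is_file())
    return entry_count, total_bytes


# Assumed default locations; both can be changed through configuration.
CANDIDATES = {
    "Spark NLP model cache (assumed path)": "~/cache_pretrained",
    "Ivy jar cache (assumed path)": "~/.ivy2/jars",
}

for label, location in CANDIDATES.items():
    summary = summarize_dir(location)
    if summary is None:
        print(f"{label}: {location} not found")
    else:
        count, size = summary
        print(f"{label}: {location} has {count} entries, {size} bytes")
```

If the model directory already contains entries such as `pos_anc` and `sentence_detector_dl` and their modification times do not change between runs, that would suggest the models are being read from disk rather than fetched again, although this check alone does not prove it. For the jars, the line `0 artifacts copied, 80 already retrieved` in the log above already points to local copies being reused.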
