
TRACE log level causes an application to fail #3524

Open
progxaker opened this issue Jan 26, 2024 · 11 comments

@progxaker
Contributor

Expected behavior

Don't fail the main application; probably log at the WARN level instead.

Actual behavior

Enabling the JVM agent with the TRACE log level causes the Java application to fail.

To Reproduce

  1. Attach the JVM agent (-javaagent:/path/to/applicationinsights-agent-3.4.17.jar).
  2. Provide a connection string.
  3. Set the environment variable APPLICATIONINSIGHTS_SELF_DIAGNOSTICS_LEVEL to TRACE.
  4. Run a Java application (a minimal equivalent command line is sketched below).
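
For illustration, a minimal sketch of an equivalent invocation; the agent jar path and the application jar below are placeholders, not values from the original report:

export APPLICATIONINSIGHTS_CONNECTION_STRING="<real-connection-string>"
export APPLICATIONINSIGHTS_SELF_DIAGNOSTICS_LEVEL="TRACE"
java -javaagent:/path/to/applicationinsights-agent-3.4.17.jar -jar my-application.jar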

System information

  • SDK Version: 3.4.17
  • OS type and version: RHEL 8.9
  • Application Server type and version (if applicable): Apache Spark 3.4.
  • Using spring-boot? No.
  • Additional relevant libraries (with version, if applicable): N/A.

Logs

...
2024-01-26 17:09:03.339Z ERROR c.m.applicationinsights.agent - Error related to the execution of [keytool, -list, -rfc, -keystore, /opt/java/jdk-17.0.9+9-jre/lib/security/cacerts]
java.io.IOException: Cannot run program "keytool": error=2, No such file or directory
        at java.base/java.lang.ProcessBuilder.start(Unknown Source)
        at java.base/java.lang.ProcessBuilder.start(Unknown Source)
        at com.microsoft.applicationinsights.agent.internal.init.AppInsightsCertificate.executeWithoutException(AppInsightsCertificate.java:96)
        at com.microsoft.applicationinsights.agent.internal.init.AppInsightsCertificate.loadCertificates(AppInsightsCertificate.java:83)
        at com.microsoft.applicationinsights.agent.internal.init.AppInsightsCertificate.isInJavaKeystore(AppInsightsCertificate.java:74)
        at com.microsoft.applicationinsights.agent.internal.init.FirstEntryPoint.init(FirstEntryPoint.java:131)
        at io.opentelemetry.javaagent.tooling.AgentStarterImpl.start(AgentStarterImpl.java:95)
        at io.opentelemetry.javaagent.bootstrap.AgentInitializer$1.run(AgentInitializer.java:53)
        at io.opentelemetry.javaagent.bootstrap.AgentInitializer$1.run(AgentInitializer.java:47)
        at io.opentelemetry.javaagent.bootstrap.AgentInitializer.execute(AgentInitializer.java:64)
        at io.opentelemetry.javaagent.bootstrap.AgentInitializer.initialize(AgentInitializer.java:46)
        at io.opentelemetry.javaagent.OpenTelemetryAgent.startAgent(OpenTelemetryAgent.java:57)
        at io.opentelemetry.javaagent.OpenTelemetryAgent.premain(OpenTelemetryAgent.java:45)
        at com.microsoft.applicationinsights.agent.Agent.premain(Agent.java:38)
        at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
        at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
        at java.base/java.lang.reflect.Method.invoke(Unknown Source)
        at java.instrument/sun.instrument.InstrumentationImpl.loadClassAndStartAgent(Unknown Source)
        at java.instrument/sun.instrument.InstrumentationImpl.loadClassAndCallPremain(Unknown Source)
Caused by: java.io.IOException: error=2, No such file or directory
        at java.base/java.lang.ProcessImpl.forkAndExec(Native Method)
        at java.base/java.lang.ProcessImpl.<init>(Unknown Source)
        at java.base/java.lang.ProcessImpl.start(Unknown Source)
        ... 20 common frames omitted
2024-01-26 17:09:03.340Z TRACE c.m.applicationinsights.agent - Application Insights root certificate in the Java keystore: false
...
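
For context, the stack trace above comes from the agent shelling out to keytool to check whether the Application Insights root certificate is in the Java keystore; on a JRE-only image the keytool binary can be absent, which yields the logged IOException. A rough, hypothetical sketch of such a probe (not the agent's actual code):

import java.io.IOException;
import java.nio.file.Path;
import java.nio.file.Paths;

public class KeytoolProbeSketch {
  // Hypothetical stand-in for the agent's keystore check: run
  // `keytool -list -rfc -keystore <cacerts>` and treat any IOException
  // (e.g. keytool missing from a JRE-only image) as a failed probe.
  static boolean keytoolListSucceeds(Path cacerts) {
    ProcessBuilder pb =
        new ProcessBuilder("keytool", "-list", "-rfc", "-keystore", cacerts.toString());
    pb.redirectErrorStream(true);
    try {
      Process process = pb.start(); // throws IOException when keytool is absent
      process.getInputStream().readAllBytes(); // drain the output
      return process.waitFor() == 0;
    } catch (IOException | InterruptedException e) {
      return false; // the agent logs this error and continues startup
    }
  }

  public static void main(String[] args) {
    Path cacerts = Paths.get(System.getProperty("java.home"), "lib", "security", "cacerts");
    System.out.println("keytool probe succeeded: " + keytoolListSucceeds(cacerts));
  }
}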
@trask
Member

trask commented Jan 29, 2024

hi @progxaker! the error you mention is only logged and shouldn't cause any other issues. can you provide the full applicationinsights.log, and also include what exception your application startup is failing with? thx

@progxaker
Contributor Author

progxaker commented Jan 30, 2024

Hi @trask. My bad, the error confused me and it's not related to the main problem.

shouldn't cause any other issues

That's what I thought too, but the main code doesn't execute.

can you provide the full applicationinsights.log

applicationinsights.log

what exception your application startup is failing with?

No exceptions at all :/ Latest logs:

2024-01-30 12:33:23.463Z INFO  c.m.applicationinsights.agent - Java version: 17.0.10, vendor: Red Hat, Inc., home: /usr/lib/jvm/java-17-openjdk-17.0.10.0.7-2.el8.x86_64
/usr/lib/jvm/java-17/bin/java -cp /tmp/spark/conf/:/tmp/spark/jars/* -Xmx1g -XX:+IgnoreUnrecognizedVMOptions --add-opens=java.base/java.lang=ALL-UNNAMED --add-opens=java.base/java.lang.invoke=ALL-UNNAMED --add-opens=java.base/java.lang.reflect=ALL-UNNAMED --add-opens=java.base/java.io=ALL-UNNAMED --add-opens=java.base/java.net=ALL-UNNAMED --add-opens=java.base/java.nio=ALL-UNNAMED --add-opens=java.base/java.util=ALL-UNNAMED --add-opens=java.base/java.util.concurrent=ALL-UNNAMED --add-opens=java.base/java.util.concurrent.atomic=ALL-UNNAMED --add-opens=java.base/sun.nio.ch=ALL-UNNAMED --add-opens=java.base/sun.nio.cs=ALL-UNNAMED --add-opens=java.base/sun.security.action=ALL-UNNAMED --add-opens=java.base/sun.util.calendar=ALL-UNNAMED --add-opens=java.security.jgss/sun.security.krb5=ALL-UNNAMED -Djdk.reflect.useDirectMethodHandle=false org.apache.spark.deploy.SparkSubmit --master local[4] --conf spark.jars.ivy=/tmp/.ivy --class SimpleApp target/simple-project-1.0.jar
2024-01-30 12:33:23.513Z DEBUG i.o.s.m.export.PeriodicMetricReader - No metric data to export - skipping export.

I've prepared a Dockerfile to reproduce the problem more easily:

  • Build the image: docker build -t reproduce-trace-issues:v1 .
  • Runs as expected: docker run --env APPLICATIONINSIGHTS_CONNECTION_STRING="<real-connection-string>" --env APPLICATIONINSIGHTS_SELF_DIAGNOSTICS_LEVEL="DEBUG" reproduce-trace-issues:v1 (a Lines: line appears in the output).
  • Fails: docker run --env APPLICATIONINSIGHTS_CONNECTION_STRING="<real-connection-string>" --env APPLICATIONINSIGHTS_SELF_DIAGNOSTICS_LEVEL="TRACE" reproduce-trace-issues:v1 (no logs after No metric data to export).
SimpleApp.java
import org.apache.spark.sql.SparkSession;
import org.apache.spark.sql.Dataset;

public class SimpleApp {
  public static void main(String[] args) {
    String logFile = "/tmp/spark/README.md"; // Should be some file on your system
    SparkSession spark = SparkSession.builder().appName("Simple Application").getOrCreate();
    Dataset<String> logData = spark.read().textFile(logFile).cache();

    long length = logData.count();

    System.out.println("Lines: " + length);

    spark.stop();
  }
}
pom.xml
<project>
  <groupId>edu.berkeley</groupId>
  <artifactId>simple-project</artifactId>
  <modelVersion>4.0.0</modelVersion>
  <name>Simple Project</name>
  <packaging>jar</packaging>
  <version>1.0</version>
  <properties>
     <maven.compiler.source>17</maven.compiler.source>
     <maven.compiler.target>17</maven.compiler.target>
  </properties>
  <dependencies>
    <dependency> <!-- Spark dependency -->
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-sql_2.12</artifactId>
      <version>3.5.0</version>
      <scope>provided</scope>
    </dependency>
  </dependencies>
</project>
Dockerfile
FROM registry.access.redhat.com/ubi8/openjdk-17:1.18-2.1705573234

USER 0

WORKDIR /tmp/

RUN microdnf -y install gzip procps
RUN curl -fsSLo /tmp/spark.tgz https://dlcdn.apache.org/spark/spark-3.5.0/spark-3.5.0-bin-hadoop3.tgz
RUN tar -C /tmp/ -xzf spark.tgz
RUN mv /tmp/spark-3.5.0-bin-hadoop3 /tmp/spark/
RUN curl -fsSLo /tmp/spark/jars/applicationinsights-agent.jar https://repo1.maven.org/maven2/com/microsoft/azure/applicationinsights-agent/3.4.19/applicationinsights-agent-3.4.19.jar

WORKDIR /tmp/project/

COPY SimpleApp.java ./src/main/java/SimpleApp.java
COPY pom.xml ./pom.xml

RUN mvn package

ENV APPLICATIONINSIGHTS_SELF_DIAGNOSTICS_LEVEL="TRACE"
ENV JAVA_TOOL_OPTIONS="-javaagent:/tmp/spark/jars/applicationinsights-agent.jar"
ENV CLASSPATH="/tmp/spark/jars/"
ENV APPLICATIONINSIGHTS_CONNECTION_STRING="InstrumentationKey=00000000-0000-0000-0000-0FEEDDADBEEF;IngestionEndpoint=http://host.testcontainers.internal:6060/;LiveEndpoint=http://host.testcontainers.internal:6060/"

CMD ["/tmp/spark/bin/spark-submit", "--class", "SimpleApp", "--master", "local[4]", "--conf", "spark.jars.ivy=/tmp/.ivy", "target/simple-project-1.0.jar"]

@progxaker
Contributor Author

progxaker commented Feb 28, 2024

While trying to solve a higher-priority problem via the CET, I tried printing the OpenTelemetry logs while bypassing your configuration:

} else if (name.startsWith("io.opentelemetry")) {
// OpenTelemetry instrumentation debug log has lots of things that look like errors
// which has been confusing customers, so only enable it when user configures "trace" level
loggerLevel = getDefaultLibraryLevel(level);
} else if (name.equals("com.azure.core.implementation.jackson.MemberNameConverterImpl")) {

and found that if the loggerLevel value is changed to Level.TRACE, the problem is reproduced; if I set it to Level.INFO, the problem goes away. I hope this is helpful.

Reproduces the problem:

- loggerLevel = getAtLeastInfoLevel(level);
+ loggerLevel = Level.TRACE;

Avoids the problem:

- loggerLevel = getAtLeastInfoLevel(level);
+ loggerLevel = Level.INFO;

UPD: If I add -Dotel.sdk.disabled=true, the problem also goes away.
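
For anyone who wants to try that flag with the Dockerfile above, one way (assuming the same agent path as before) is to append it to JAVA_TOOL_OPTIONS:

ENV JAVA_TOOL_OPTIONS="-javaagent:/tmp/spark/jars/applicationinsights-agent.jar -Dotel.sdk.disabled=true"

Keep in mind that disabling the SDK also disables telemetry export, so this is a diagnostic datapoint rather than a usable workaround.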

@progxaker
Contributor Author

I kept digging into it.

TL;DR: The problem occurs when the logger name starts with io.opentelemetry.sdk.metrics but goes no deeper than io.opentelemetry.sdk.metrics.internal.

I got the list of configured loggers

Code block
// Originally published at https://mailman.qos.ch/pipermail/logback-user/2008-November/000751.html
// and modified by me to return a string.

import ch.qos.logback.classic.Logger;
import ch.qos.logback.classic.LoggerContext;
import org.slf4j.LoggerFactory;

String findNamesOfConfiguredAppenders() {
  LoggerContext lc = (LoggerContext) LoggerFactory.getILoggerFactory();
  String strList = "";
  for (Logger log : lc.getLoggerList()) {
    strList = strList + ";" + log.getName();
  }
  return strList;
}

added a call to it before

+  System.out.println("Loggers: " + findNamesOfConfiguredAppenders());
   loggerLevel = getDefaultLibraryLevel(level);

and pulled all the logger names from the log file:

# grep 'Loggers: ' /tmp/temp.log | uniq | sed 's/;/\n/g'
Loggers:
ROOT
com
com.azure
com.azure.core
com.azure.core.util
com.azure.core.util.CoreUtils
com.azure.monitor
com.azure.monitor.opentelemetry
com.azure.monitor.opentelemetry.exporter
com.azure.monitor.opentelemetry.exporter.implementation
com.azure.monitor.opentelemetry.exporter.implementation.utils
com.azure.monitor.opentelemetry.exporter.implementation.utils.HostName
com.azure.monitor.opentelemetry.exporter.implementation.utils.PropertyHelper
io
io.grpc
io.grpc.Context
io.netty
io.netty.handler
io.netty.handler.ssl
io.opentelemetry
io.opentelemetry.exporter
io.opentelemetry.exporter.logging
io.opentelemetry.javaagent
io.opentelemetry.javaagent.tooling
io.opentelemetry.javaagent.tooling.VersionLogger
io.opentelemetry.sdk
io.opentelemetry.sdk.autoconfigure
io.opentelemetry.sdk.metrics
io.opentelemetry.sdk.metrics.internal
io.opentelemetry.sdk.metrics.internal.state
io.opentelemetry.sdk.metrics.internal.state.DeltaMetricStorage
io.opentelemetry.sdk.resources
oshi
reactor
reactor.netty
reactor.util

and started experimenting with io.opentelemetry.sdk.autoconfigure, io.opentelemetry.sdk.metrics, and io.opentelemetry.sdk.resources. As a result, I can set Level.ALL for the autoconfigure and resources loggers, but if I set it for the metrics one, the problem is reproduced.

LoggingLevelConfigurator.java
@@ -33,6 +36,13 @@ public class LoggingLevelConfigurator {
     updateLoggerLevel(loggerContext.getLogger("io.opentelemetry.exporter.logging"));
     updateLoggerLevel(
         loggerContext.getLogger("io.opentelemetry.sdk.metrics.internal.state.DeltaMetricStorage"));
+    updateLoggerLevel(loggerContext.getLogger("io.opentelemetry.sdk.autoconfigure"));
+    updateLoggerLevel(loggerContext.getLogger("io.opentelemetry.sdk.resources"));
+    updateLoggerLevel(loggerContext.getLogger("io.opentelemetry.sdk.extension"));
+    updateLoggerLevel(loggerContext.getLogger("io.opentelemetry.sdk.metrics"));
+    updateLoggerLevel(loggerContext.getLogger("io.opentelemetry.sdk.metrics.internal"));
+    //updateLoggerLevel(loggerContext.getLogger("io.opentelemetry.sdk.metrics.internal.state"));
+    updateLoggerLevel(loggerContext.getLogger("io.opentelemetry.sdk"));
     updateLoggerLevel(loggerContext.getLogger("io.opentelemetry"));
     updateLoggerLevel(loggerContext.getLogger("muzzleMatcher"));
     updateLoggerLevel(
@@ -69,10 +80,26 @@ public class LoggingLevelConfigurator {
     loggerLevel = level;
    } else if (name.equals("io.opentelemetry.sdk.metrics.internal.state.DeltaMetricStorage")) {
       loggerLevel = getDeltaMetricStorageLoggerLevel(level);
+    } else if (name.startsWith("io.opentelemetry.sdk.autoconfigure")) {
+      loggerLevel = Level.ALL;
+    } else if (name.startsWith("io.opentelemetry.sdk.resources")) {
+      loggerLevel = Level.ALL;
+    } else if (name.startsWith("io.opentelemetry.sdk.extension")) {
+      loggerLevel = Level.ALL;
+    //} else if (name.startsWith("io.opentelemetry.sdk.metrics.internal.state")) {
+    //  loggerLevel = Level.INFO;
+    } else if (name.startsWith("io.opentelemetry.sdk.metrics.internal")) {
+      loggerLevel = Level.ERROR;
+    } else if (name.startsWith("io.opentelemetry.sdk.metrics")) {
+      loggerLevel = Level.ALL;
+    } else if (name.startsWith("io.opentelemetry.sdk")) {
+      loggerLevel = Level.ALL;
+    } else if (name.startsWith("io.opentelemetry")) {
       // OpenTelemetry instrumentation debug log has lots of things that look like errors
       // which has been confusing customers, so only enable it when user configures "trace"
       // level
-      loggerLevel = getDefaultLibraryLevel(level);
+      //loggerLevel = getDefaultLibraryLevel(level);
+      loggerLevel = Level.ALL;
     } else if (name.equals("com.azure.core.implementation.jackson.MemberNameConverterImpl")) {
       // never want to log at trace or debug, as it logs confusing stack trace that
       // looks like error but isn't
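
For readers unfamiliar with the class, here is a simplified, hypothetical sketch of the mechanism being patched; the real LoggingLevelConfigurator differs, and this assumes logback is the bound SLF4J backend. Each logger name is mapped to a level derived from the configured self-diagnostics level, then applied:

import ch.qos.logback.classic.Level;
import ch.qos.logback.classic.Logger;
import ch.qos.logback.classic.LoggerContext;
import org.slf4j.LoggerFactory;

public class LevelConfiguratorSketch {
  // Hypothetical simplification: cap noisy library loggers at INFO
  // unless the user configured something stricter.
  static Level getLoggerLevel(String name, Level configured) {
    if (name.startsWith("io.opentelemetry")) {
      return configured.isGreaterOrEqual(Level.INFO) ? configured : Level.INFO;
    }
    return configured;
  }

  static void updateLoggerLevel(Logger logger, Level configured) {
    logger.setLevel(getLoggerLevel(logger.getName(), configured));
  }

  public static void main(String[] args) {
    LoggerContext context = (LoggerContext) LoggerFactory.getILoggerFactory();
    Logger metrics = context.getLogger("io.opentelemetry.sdk.metrics");
    updateLoggerLevel(metrics, Level.TRACE); // capped to INFO by the sketch above
    System.out.println(metrics.getName() + " -> " + metrics.getLevel());
  }
}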

@trask
Member

trask commented Feb 29, 2024

hi @progxaker, can you provide a runnable repro? (as a github repository would be easiest for us to consume and try it out)

@progxaker
Contributor Author

Hi @trask. I've consolidated the shared files (LoggingLevelConfigurator.java and the application files) in the progxaker/reproduce-trace-issues repository.

P.S. I tested the workaround for v3.4.19, but it doesn't work for v3.5.0.

@progxaker
Contributor Author

As for v3.5.0, the problem occurs not only with the TRACE log level, but also with DEBUG.

@trask
Member

trask commented Mar 14, 2024

hi @progxaker, I tried to run your repro locally but got an error b/c I'm running on Windows. is there any chance you could simplify the repro, maybe along the lines of https://github.com/trask/apache-spark-extension-test? that would also make it easier for us to run in Intellij under a debugger (instead of having to remote debug inside of a docker image). thanks!

@progxaker
Contributor Author

progxaker commented Apr 1, 2024

Hi @trask. I hope you're doing well.

Sorry for the long delay, been busy implementing OpenTelemetry (Python) into a project.

TL;DR:
The problem is reproduced when setting the JAVA_TOOL_OPTIONS variable with the value -javaagent:<absolute-path-to-the-agent>.
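
To make the difference concrete, these are the two attachment styles compared in the steps below (agent path taken from the earlier Dockerfile; note that JAVA_TOOL_OPTIONS is read by every JVM that starts, including any JVMs spark-submit spawns, which may be relevant):

# reproduces the hang in my tests
export JAVA_TOOL_OPTIONS="-javaagent:/tmp/spark/jars/applicationinsights-agent.jar"
spark-submit --class SimpleApp --master "local[4]" target/simple-project-1.0.jar

# behaves correctly in my tests
spark-submit --conf "spark.driver.extraJavaOptions=-javaagent:/tmp/spark/jars/applicationinsights-agent.jar" --class SimpleApp --master "local[4]" target/simple-project-1.0.jar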


  1. I tried to reproduce the problem on Windows as you asked. As a result, the agent and the application worked correctly. But in my attempts to run Spark, I discarded the JAVA_TOOL_OPTIONS variable and used the spark.driver.extraJavaOptions parameter instead. Actually, both of them gave the error below, but I got the final version when using spark.driver.extraJavaOptions (I removed the tmpdir parameter and ran from the folder containing the JVM agent).
InvalidPathException stack trace
# spark-submit --conf "spark.driver.extraJavaOptions='-Djava.io.tmpdir=C:\Users\progxaker\Downloads\temp\ -javaagent:C:\Users\progxaker\Downloads\applicationinsights-agent-3.5.1.jar'" --class SimpleApp --master local[4] C:\Users\progxaker\Downloads\simple-project-1.0.jar
Exception in thread "main" java.nio.file.InvalidPathException: Illegal char <:> at index 31: C:UsersprogxakerDownloadstemp -javaagent:C:UsersDownloadsapplicationinsights-agent-3.5.1.jar\spark-fb7f738b-cffc-44a3-9e04-2b76598eaf30
        at java.base/sun.nio.fs.WindowsPathParser.normalize(WindowsPathParser.java:182)
        at java.base/sun.nio.fs.WindowsPathParser.parse(WindowsPathParser.java:153)
        at java.base/sun.nio.fs.WindowsPathParser.parse(WindowsPathParser.java:77)
        at java.base/sun.nio.fs.WindowsPath.parse(WindowsPath.java:92)
        at java.base/sun.nio.fs.WindowsFileSystem.getPath(WindowsFileSystem.java:229)
        at java.base/java.io.File.toPath(File.java:2329)
        at org.apache.spark.network.util.JavaUtils.createDirectory(JavaUtils.java:396)
        at org.apache.spark.util.SparkFileUtils.createDirectory(SparkFileUtils.scala:95)
        at org.apache.spark.util.SparkFileUtils.createDirectory$(SparkFileUtils.scala:94)
        at org.apache.spark.util.Utils$.createDirectory(Utils.scala:94)
        at org.apache.spark.util.Utils$.createTempDir(Utils.scala:241)
        at org.apache.spark.util.SparkFileUtils.createTempDir(SparkFileUtils.scala:103)
        at org.apache.spark.util.SparkFileUtils.createTempDir$(SparkFileUtils.scala:102)
        at org.apache.spark.util.Utils$.createTempDir(Utils.scala:94)
        at org.apache.spark.deploy.SparkSubmit.prepareSubmitEnvironment(SparkSubmit.scala:372)
        at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:964)
        at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:194)
        at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:217)
        at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:91)
        at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1120)
        at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1129)
        at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
  2. Tried the Java 11 Docker image. The application failed.
  3. Tried the original OpenJDK Docker image. The application failed.
  4. Tried the Ubuntu Docker image with OpenJDK installed. The application failed.
  5. Tried the Ubuntu Docker image with Temurin JDK installed (since I noticed it was installed on Windows). The application failed.
  6. Tried to reproduce the problem in a Linux VM. It worked as expected :)
  7. At this point I noticed that I hadn't set JAVA_TOOL_OPTIONS on the Linux VM... Once I did, the problem was reproduced.
  8. Tried my Docker image: if I unset the variable, everything works; if I set it again, it doesn't.
  9. Tried it on Windows: it works both with and without the Spark parameter, but if I set JAVA_TOOL_OPTIONS, for some reason (and only on Windows) I get the following error:
# SET JAVA_TOOL_OPTIONS=-javaagent:applicationinsights-agent-3.5.1.jar
# spark-submit --class SimpleApp --master local[4] C:\Users\progxaker\Downloads\simple-project-1.0.jar
Picked up JAVA_TOOL_OPTIONS: "-javaagent:applicationinsights-agent-3.5.1.jar"
'2024-04-01' is not recognized as an internal or external command,
operable program or batch file.

As a result, the problem is reproducible on:

  • OpenJDK 11 and 17
    • Docker: RHEL, Debian and Ubuntu.
  • Temurin JDK 11
    • Docker: Ubuntu.
    • OS: Windows and Linux.

I'll prepare a Windows version by the end of the day (~5 p.m. UTC).

@progxaker
Contributor Author

Hello @trask. Have you had time to look at this?
