HugeGraph service crashes #2484

2867021781 · 2024-03-18T08:08:46Z

Bug Type

None

Before submit

I have confirmed and searched that there are no similar problems in the historical issues and documents

Environment

Server Version: 0.12.0 (standalone)
Backend: RocksDB
OS: CentOS, kernel 3.10.0-957.el7.x86_64; Mem: 32 GB (10 GB used)
Data Size: 67 graphs, 300 GB of data

Expected & Actual behavior

This is a standalone server running version 0.12.0. The hugegraph service frequently goes down without leaving any error message. How should I troubleshoot this?

Vertex/Edge example

No response

Schema [VertexLabel, EdgeLabel, IndexLabel]

No response

dosubot · 2024-03-18T08:08:59Z

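Hi @2867021781! Good to see you asking here again. I hope all is well!

Based on the information you provided, here are some suggestions for troubleshooting the frequent HugeGraph service crashes:

Check the Java version: Make sure the Java version meets the minimum requirement. HugeGraph 0.12.0 requires at least Java 8. You can check the Java version manually by running java -version in a terminal.
Check the log files: HugeGraph server logs are normally written to the $LOGS directory, and the main log file is hugegraph-server.log. Look in this file for error messages or stack traces that may point to the cause of the crash.
Memory configuration: Insufficient server memory can cause the service to crash. Check the memory allocation settings, and consider raising them if the server has more memory available. The maximum and minimum heap sizes the service may use are set by the MAX_MEM and MIN_MEM variables.
Garbage collector (GC) options: If you have specified a garbage collector option (G1GC or ZGC), make sure it suits your Java version and server configuration. A misconfigured or unsuitable garbage collector can cause performance problems or crashes.
Check the OpenTelemetry tooling: If you have enabled OpenTelemetry (OPEN_TELEMETRY="true"), make sure the OpenTelemetry Java agent was downloaded correctly and that its MD5 checksum matches the expected value. A broken setup can interfere with normal operation of the application.
Security manager: If the security manager is enabled (OPEN_SECURITY_CHECK="true"), make sure the security policy is not so strict that it prevents the service from running normally.
Review custom user options: If you supplied any custom user options (USER_OPTION), check whether they are causing the crash. Incorrect JVM options can cause instability.
External dependencies: Make sure all external dependencies, such as the database backend and any services HugeGraph needs, are running and reachable. Connection problems or misconfiguration of external services can cause the application to crash.

By following these troubleshooting steps, you should be able to identify and resolve the cause of the frequent HugeGraph service crashes.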
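The log check suggested above can be scripted. What follows is a minimal sketch rather than anything shipped with HugeGraph: scan_server_log is a hypothetical helper, and the path in the usage comment assumes the $LOGS directory and hugegraph-server.log file name mentioned in the list. It prints the most recent lines that mention an error, an exception, or a JVM OutOfMemoryError, each prefixed with its line number.

```bash
# Hypothetical helper (not part of HugeGraph): print the most recent lines in
# a log file that look like errors.
# $1 = log file, $2 = maximum number of lines to print (default 20).
scan_server_log() {
    local log_file=$1
    local max_lines=${2:-20}
    if [ ! -r "$log_file" ]; then
        echo "cannot read log file: $log_file" >&2
        return 1
    fi
    # -n prefixes each match with its line number in the log file.
    grep -n -E 'ERROR|Exception|OutOfMemoryError' "$log_file" | tail -n "$max_lines"
    return 0
}

# Usage (the path is an assumption; adjust it to the actual install):
# scan_server_log "$LOGS/hugegraph-server.log" 50
```

If this prints nothing around the time of a crash, the process was probably terminated from outside the JVM, which is worth checking separately.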

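References:

hugegraph-server/hugegraph-dist/src/assembly/static/bin/hugegraph-server.sh
pom.xml
hugegraph-server/hugegraph-api/src/main/java/org/apache/hugegraph/api/metrics/MetricsAPI.java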

About Dosu

This response is meant to be useful and save you time. It is not meant to be a precise solution, but rather a starting point for your own research.

Help me be more useful! Please leave a 👍 if this is helpful and 👎 if it is irrelevant.

If you want to continue the conversation, start your reply with @dosu-bot.

imbajin · 2024-03-18T09:48:23Z

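@dosu-bot The reply is good overall, but note that 0.12 is an older release, not the master branch (the version in the pom file).

Some of the configuration options mentioned exist only in the latest code; please verify this and improve it in future. Also, when answering similar questions later, you can refer to the reply below (remind the user to upgrade, check the system logs for the kill reason, and so on).

> Standalone server running version 0.12.0. The hugegraph service frequently goes down without leaving any error message. How should I troubleshoot this?

@2867021781 For standalone RocksDB, I recommend upgrading to the latest community release (at least 1.2.0). The early non-Apache versions are no longer maintained or updated. Exporting everything with tools in one step and then re-importing it is also reasonably fast. (Also check whether RocksDB used so much memory that the process was OOM-killed, which would explain the absence of any exception message; you need to look in the system logs to confirm.)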
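The system-log check suggested above might look like the following sketch. find_oom_kills is a hypothetical helper, not a HugeGraph or system tool; it only filters text for the phrases the Linux kernel typically logs when the OOM killer runs. The log sources in the usage comments (dmesg, /var/log/messages, journalctl -k) are the usual ones on CentOS 7, but which of them still holds the relevant entries depends on the host's log retention.

```bash
# Hypothetical helper: filter kernel log text for OOM-killer activity.
# Reads the files given as arguments, or standard input when none are given.
find_oom_kills() {
    grep -i -E 'out of memory|oom-killer|killed process' "$@"
}

# Usage on a CentOS 7 host (reading these sources may require root):
# dmesg -T | find_oom_kills
# find_oom_kills /var/log/messages
# journalctl -k | find_oom_kills
```

A matching line that names the java process around the time the service disappeared would support the OOM-kill explanation.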

PS: Please use English to describe your problems in the Apache community (thanks~)

dosubot · 2024-03-18T09:50:41Z

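Hello @2867021781!

Thank you for your feedback and patience. Following imbajin's suggestion, I would like to remind you to upgrade HugeGraph to the latest version (at least 1.2.0), because 0.12.0 is an old release and the early non-Apache versions are no longer maintained or updated. Upgrading to the latest version may resolve the problem you are seeing.

In addition, if your standalone server uses RocksDB, check whether an OOM (Out of Memory) condition occurred: RocksDB may have used so much memory that the system OOM-killed the process, which would leave no exception message. You can look in the system logs to confirm whether this happened.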
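One way to see whether memory outside the Java heap is growing is to compare the resident memory of the server process with its -Xmx limit. The sketch below is illustrative only: rss_mb and beyond_heap_mb are hypothetical helpers, and the pgrep pattern in the usage comment is an assumption about how the server process appears in the process list.

```bash
# Hypothetical helper: resident memory (RSS) of a process in MB.
# ps reports RSS in kilobytes. $1 = process ID.
rss_mb() {
    local rss_kb
    rss_kb=$(ps -o rss= -p "$1" 2>/dev/null | tr -d ' ')
    [ -n "$rss_kb" ] || return 1
    echo $(( $rss_kb / 1024 ))
}

# Hypothetical helper: MB of resident memory beyond the configured heap limit.
# $1 = RSS in MB, $2 = -Xmx value in MB. Prints 0 when RSS is within the limit.
beyond_heap_mb() {
    local extra=$(( $1 - $2 ))
    [ "$extra" -gt 0 ] || extra=0
    echo "$extra"
}

# Usage (the pgrep pattern is an assumption about the process command line):
# pid=$(pgrep -f HugeGraphServer | head -n 1)
# beyond_heap_mb "$(rss_mb "$pid")" 6254
```

A value that keeps climbing toward the machine's total memory suggests native allocations, which is the situation the OOM-kill explanation above describes.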

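I hope these suggestions help you resolve the frequent HugeGraph service crashes. If you have any other questions or need further help, feel free to let me know.

Good luck!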

2867021781 · 2024-03-20T05:52:11Z

Could this be related to the Xmx6254m setting? The value of this parameter differs from machine to machine. Can it be configured dynamically?

2867021781 · 2024-03-20T05:54:22Z

Thank you very much. I will investigate and see.

dosubot · 2024-03-20T06:01:16Z

Based on the information you provided, it seems like the HugeGraph service might be crashing due to insufficient memory resources. The Xmx6254m setting you mentioned is the maximum heap size for the Java Virtual Machine (JVM) running HugeGraph. If the heap size is not sufficient for the data and operations, it could lead to the service crashing.

To dynamically configure the Xmx6254m setting in HugeGraph version 0.12.0, you can modify the hugegraph-server.sh script to accept an environment variable for setting the maximum heap size (Xmx). This approach allows you to change the heap size without modifying the script each time. Here's how you can modify the script:

# Before setting JAVA_OPTIONS
if [ -n "$HUGEGRAPH_MAX_MEM" ]; then
    XMX=$HUGEGRAPH_MAX_MEM
else
    XMX=$(calc_xmx $MIN_MEM $MAX_MEM)
    if [ $? -ne 0 ]; then
        echo "Failed to start HugeGraphServer, requires at least ${MIN_MEM}MB free memory" >> "${OUTPUT}"
        exit 1
    fi
fi

JAVA_OPTIONS="-Xms${MIN_MEM}m -Xmx${XMX}m -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=${LOGS} ${USER_OPTION}"

With this modification, you can dynamically set the Xmx value by exporting the HUGEGRAPH_MAX_MEM environment variable before starting the HugeGraph server. For example, to set the maximum heap size to 6254 MB, you can do the following:

export HUGEGRAPH_MAX_MEM=6254
./hugegraph-server.sh <GREMLIN_SERVER_CONF> <REST_SERVER_CONF> <OPEN_SECURITY_CHECK>

This approach lets you adjust the maximum heap size per machine without editing the script each time. The script in question is hugegraph-server.sh.
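Because the override above would pass whatever is in HUGEGRAPH_MAX_MEM straight into -Xmx, it may be worth validating the value first. The following is a minimal sketch: valid_heap_mb is a hypothetical helper, and the usage comment assumes it is called inside the branch that reads HUGEGRAPH_MAX_MEM, with MIN_MEM and MAX_MEM as the bounds.

```bash
# Hypothetical helper: succeed only if $1 is a whole number of MB within
# the inclusive range [$2, $3].
valid_heap_mb() {
    case $1 in
        ''|*[!0-9]*) return 1 ;;   # empty, or contains a non-digit such as "m"
    esac
    [ "$1" -ge "$2" ] && [ "$1" -le "$3" ]
}

# Usage inside the branch that reads HUGEGRAPH_MAX_MEM:
# if ! valid_heap_mb "$HUGEGRAPH_MAX_MEM" "$MIN_MEM" "$MAX_MEM"; then
#     echo "HUGEGRAPH_MAX_MEM must be an integer between ${MIN_MEM} and ${MAX_MEM} (MB)" >&2
#     exit 1
# fi
```

Rejecting a value such as 6254m early avoids building an option like -Xmx6254mm, which the JVM would refuse at startup.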

As for the recommended memory settings for HugeGraph version 0.12.0 when running on a machine with 32GB of memory, the script sets the maximum heap memory (-Xmx) dynamically based on the available system memory but is capped at a maximum (MAX_MEM) of 32GB (32768MB). The minimum heap memory (-Xms) is set to 512MB.
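If the heap limit is derived from free memory at startup as described, that would also explain why a value such as Xmx6254m differs between machines. The sketch below illustrates one such policy. It is not copied from hugegraph-server.sh: sketch_calc_xmx is a hypothetical function, and the rule it applies (half of the free memory, clamped between the minimum and the maximum) is an assumption made for illustration.

```bash
# Illustrative sketch only; NOT copied from hugegraph-server.sh.
# Assumed policy: use half of the free memory, clamped to [min, max],
# and fail when free memory is below the minimum.
# $1 = minimum MB, $2 = maximum MB, $3 = free memory in MB.
sketch_calc_xmx() {
    local min_mem=$1
    local max_mem=$2
    local free_mb=$3
    local half=$(( $free_mb / 2 ))
    if [ "$free_mb" -lt "$min_mem" ]; then
        return 1
    elif [ "$half" -ge "$max_mem" ]; then
        echo "$max_mem"
    elif [ "$half" -lt "$min_mem" ]; then
        echo "$min_mem"
    else
        echo "$half"
    fi
}

# Example under the assumed policy: 12508 MB free gives a heap limit of 6254 MB.
# sketch_calc_xmx 512 32768 12508   # prints 6254
```

Under a policy like this, the same script yields a different -Xmx on every start, depending on how much memory happens to be free at that moment.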

If the service is still crashing without any error messages, it could be due to a variety of reasons such as insufficient system resources, configuration issues, port conflicts, file permissions, daemon mode failures, external dependencies, script errors, or logging configuration. You can review the startup script and configuration files, check system resources and file permissions, and ensure the log file is accessible and monitored to diagnose and fix such issues.

Unfortunately, I couldn't find specific system requirements in the repository for running HugeGraph version 0.12.0 with RocksDB as the backend on a CentOS server (kernel 3.10.0-957.el7.x86_64). However, ensuring that your system has sufficient resources (CPU, memory, disk space) and that HugeGraph and RocksDB are correctly configured should help in running the service smoothly.

I hope this helps! If you have any more questions or need further clarification, feel free to ask.

2867021781 added the bug label on Mar 18, 2024

dosubot bot added the question and rocksdb labels on Mar 18, 2024
