
Quick install of the full version: installation fails #235

Open
lkeai2007 opened this issue Aug 24, 2023 · 16 comments

Comments

@lkeai2007

I ran the following command; my environment is CentOS 8.6:
helm install sreworks ./ --create-namespace --namespace sreworks --set global.accessMode="nodePort" --set appmanager.home.url="http://127.0.0.1:30767" --set appmanager.server.jwtSecretKey="123321"
(screenshot)
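
A quick first step for an install failure like this is to see which pods in the release are unhealthy (a minimal sketch, standard kubectl; the namespace comes from the helm command above):

    # list all pods and their states in the sreworks namespace
    kubectl get pods -n sreworks
    # recent events often show scheduling, image-pull, or crash reasons
    kubectl get events -n sreworks --sort-by=.lastTimestamp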

@lkeai2007 (Author)

I started minikube with: minikube start --image-mirror-country=cn --cpus=4 --memory=15gb, and this command completed successfully.
(screenshot)

@lkeai2007 (Author)

Error log: kubectl logs sreworks-appmanager-cluster-initjob-xf829 -n sreworks

python /app/sbin/cluster_init.py
Traceback (most recent call last):
  File "/usr/local/lib/python3.9/site-packages/urllib3/connection.py", line 203, in _new_conn
    sock = connection.create_connection(
  File "/usr/local/lib/python3.9/site-packages/urllib3/util/connection.py", line 85, in create_connection
    raise err
  File "/usr/local/lib/python3.9/site-packages/urllib3/util/connection.py", line 73, in create_connection
    sock.connect(sa)
ConnectionRefusedError: [Errno 111] Connection refused

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/local/lib/python3.9/site-packages/urllib3/connectionpool.py", line 790, in urlopen
    response = self._make_request(
  File "/usr/local/lib/python3.9/site-packages/urllib3/connectionpool.py", line 496, in _make_request
    conn.request(
  File "/usr/local/lib/python3.9/site-packages/urllib3/connection.py", line 395, in request
    self.endheaders()
  File "/usr/local/lib/python3.9/http/client.py", line 1280, in endheaders
    self._send_output(message_body, encode_chunked=encode_chunked)
  File "/usr/local/lib/python3.9/http/client.py", line 1040, in _send_output
    self.send(msg)
  File "/usr/local/lib/python3.9/http/client.py", line 980, in send
    self.connect()
  File "/usr/local/lib/python3.9/site-packages/urllib3/connection.py", line 243, in connect
    self.sock = self._new_conn()
  File "/usr/local/lib/python3.9/site-packages/urllib3/connection.py", line 218, in _new_conn
    raise NewConnectionError(
urllib3.exceptions.NewConnectionError: <urllib3.connection.HTTPConnection object at 0x7f803e342730>: Failed to establish a new connection: [Errno 111] Connection refused

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/local/lib/python3.9/site-packages/requests/adapters.py", line 486, in send
    resp = conn.urlopen(
  File "/usr/local/lib/python3.9/site-packages/urllib3/connectionpool.py", line 844, in urlopen
    retries = retries.increment(
  File "/usr/local/lib/python3.9/site-packages/urllib3/util/retry.py", line 515, in increment
    raise MaxRetryError(_pool, url, reason) from reason  # type: ignore[arg-type]
urllib3.exceptions.MaxRetryError: HTTPConnectionPool(host='sreworks-appmanager', port=80): Max retries exceeded with url: /oauth/token (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f803e342730>: Failed to establish a new connection: [Errno 111] Connection refused'))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/app/sbin/cluster_init.py", line 165, in <module>
    init_cluster(AppManagerClient(ENDPOINT, CLIENT_ID, CLIENT_SECRET, USERNAME, PASSWORD).client)
  File "/app/sbin/cluster_init.py", line 75, in init
    self._token = self._fetch_token()
  File "/app/sbin/cluster_init.py", line 86, in _fetch_token
    return oauth.fetch_token(
  File "/usr/local/lib/python3.9/site-packages/requests_oauthlib/oauth2_session.py", line 341, in fetch_token
    r = self.request(
  File "/usr/local/lib/python3.9/site-packages/requests_oauthlib/oauth2_session.py", line 521, in request
    return super(OAuth2Session, self).request(
  File "/usr/local/lib/python3.9/site-packages/requests/sessions.py", line 589, in request
    resp = self.send(prep, **send_kwargs)
  File "/usr/local/lib/python3.9/site-packages/requests/sessions.py", line 703, in send
    r = adapter.send(request, **kwargs)
  File "/usr/local/lib/python3.9/site-packages/requests/adapters.py", line 519, in send
    raise ConnectionError(e, request=request)
requests.exceptions.ConnectionError: HTTPConnectionPool(host='sreworks-appmanager', port=80): Max retries exceeded with url: /oauth/token (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f803e342730>: Failed to establish a new connection: [Errno 111] Connection refused'))
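
The traceback shows the init job failing to reach http://sreworks-appmanager:80/oauth/token. A minimal sketch for checking whether that service has any ready backends (the service name comes from the log above; the commands are standard kubectl):

    # does the service exist, and does it have endpoints?
    kubectl get svc sreworks-appmanager -n sreworks
    kubectl get endpoints sreworks-appmanager -n sreworks
    # if the endpoint list is empty, find and inspect the backing pod
    kubectl get pods -n sreworks | grep appmanager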

@Twwy (Collaborator) commented Aug 25, 2023

The key point is that sreworks-mysql-0 and sreworks-redis-master-0 are not starting properly.
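
A sketch for digging into those two pods (pod names from the comment above; everything else is standard kubectl):

    kubectl describe pod sreworks-mysql-0 -n sreworks
    kubectl describe pod sreworks-redis-master-0 -n sreworks
    # the previous container's log usually shows why it crashed
    kubectl logs sreworks-mysql-0 -n sreworks --previous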

@Twwy (Collaborator) commented Aug 25, 2023

That is what keeps the other pods from running properly.

@lkeai2007 (Author)

> That is what keeps the other pods from running properly.

Still not working.
(screenshot)
(screenshot)

@Twwy (Collaborator) commented Sep 1, 2023

From the current screenshot, mysql has already restarted twice. Could it be that the current resources (memory, for example) are not quite sufficient, causing repeated OOM restarts? That would also keep every service that depends on mysql from working.
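
If OOM is the suspicion, the pod's last terminated state records it. A minimal check (standard kubectl; an OOM kill shows up as Reason: OOMKilled):

    kubectl describe pod sreworks-mysql-0 -n sreworks | grep -A 4 'Last State'
    # live memory usage per pod (requires metrics-server to be enabled)
    kubectl top pods -n sreworks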

@lkeai2007 (Author)

> From the current screenshot, mysql has already restarted twice. Could it be that the current resources (memory, for example) are not quite sufficient, causing repeated OOM restarts? That would also keep every service that depends on mysql from working.

I gave the VM 26 GB of memory.

@lkeai2007 (Author)

> From the current screenshot, mysql has already restarted twice. Could it be that the current resources (memory, for example) are not quite sufficient, causing repeated OOM restarts? That would also keep every service that depends on mysql from working.

Do I need to set up a private docker registry to get it running?

@Twwy (Collaborator) commented Sep 1, 2023

A private docker registry is not required. In the quick-install flow, all images are already published to publicly accessible docker registry paths.

@Twwy (Collaborator) commented Sep 1, 2023

> > From the current screenshot, mysql has already restarted twice. Could it be that the current resources (memory, for example) are not quite sufficient, causing repeated OOM restarts? That would also keep every service that depends on mysql from working.
>
> I gave the VM 26 GB of memory.

Use kubectl describe to check the reason for mysql's last few restarts.

@Twwy (Collaborator) commented Sep 1, 2023

https://www.yuque.com/sreworks-doc/docs/rr5g10
Single-node full (data intelligence edition) deployment: at least 8 cores / 32 GB memory / 300 GB disk is recommended.
The memory may indeed be falling a bit short.
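
With minikube, CPU and memory are fixed when the profile is created, so reaching the recommended size generally means recreating it (a sketch; note that minikube delete wipes the existing cluster):

    minikube delete
    minikube start --image-mirror-country=cn --cpus=8 --memory=32gb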

@lkeai2007 (Author)

> https://www.yuque.com/sreworks-doc/docs/rr5g10 Single-node full (data intelligence edition) deployment: at least 8 cores / 32 GB memory / 300 GB disk is recommended. The memory may indeed be falling a bit short.

[root@smu1 sreworks-chart]# kubectl describe pod sreworks-mysql-0 -n sreworks
Name:             sreworks-mysql-0
Namespace:        sreworks
Priority:         0
Service Account:  sreworks-mysql
Node:             minikube/192.168.58.2
Start Time:       Thu, 31 Aug 2023 16:18:00 +0800
Labels:           app.kubernetes.io/component=primary
                  app.kubernetes.io/instance=sreworks
                  app.kubernetes.io/managed-by=Helm
                  app.kubernetes.io/name=mysql
                  controller-revision-hash=sreworks-mysql-7cc549768b
                  helm.sh/chart=mysql-8.2.3
                  statefulset.kubernetes.io/pod-name=sreworks-mysql-0
Annotations:      checksum/configuration: 7c4d261d8711bdefbd47de7a4939ed26a18abae82038d582783afe9c2b6cb39d
Status:           Running
IP:               10.244.1.176
IPs:
  IP:  10.244.1.176
Controlled By:    StatefulSet/sreworks-mysql
Containers:
  mysql:
    Container ID:   docker://94f5868b20d3a4ab5b13e9356733b1c2ff5187247318e6881198a1946e918a89
    Image:          sreworks-registry.cn-beijing.cr.aliyuncs.com/hub/mysql:v1.0
    Image ID:       docker-pullable://sreworks-registry.cn-beijing.cr.aliyuncs.com/hub/mysql@sha256:3ffd066da0331310857607aef3f025f688648fc36c3a0b46df0bda3081666dc8
    Port:           3306/TCP
    Host Port:      0/TCP
    State:          Running
      Started:      Fri, 01 Sep 2023 15:19:55 +0800
    Last State:     Terminated
      Reason:       Error
      Exit Code:    255
      Started:      Thu, 31 Aug 2023 16:25:00 +0800
      Finished:     Fri, 01 Sep 2023 15:18:08 +0800
    Ready:          True
    Restart Count:  3
    Liveness:       exec [/bin/bash -ec password_aux="${MYSQL_ROOT_PASSWORD:-}"
                      if [[ -f "${MYSQL_ROOT_PASSWORD_FILE:-}" ]]; then
                          password_aux=$(cat "$MYSQL_ROOT_PASSWORD_FILE")
                      fi
                      mysqladmin status -uroot -p"${password_aux}"
                    ] delay=120s timeout=1s period=10s #success=1 #failure=3
    Readiness:      exec [/bin/bash -ec password_aux="${MYSQL_ROOT_PASSWORD:-}"
                      if [[ -f "${MYSQL_ROOT_PASSWORD_FILE:-}" ]]; then
                          password_aux=$(cat "$MYSQL_ROOT_PASSWORD_FILE")
                      fi
                      mysqladmin status -uroot -p"${password_aux}"
                    ] delay=30s timeout=1s period=10s #success=1 #failure=3
    Environment:
      BITNAMI_DEBUG:        false
      MYSQL_ROOT_PASSWORD:  <set to the key 'mysql-root-password' in secret 'sreworks-mysql'>  Optional: false
      MYSQL_DATABASE:       my_database
      MYSQL_EXTRA_FLAGS:    --max-connect-errors=1000 --max_connections=10000
    Mounts:
      /bitnami/mysql from data (rw)
      /opt/bitnami/mysql/conf/my.cnf from config (rw,path="my.cnf")
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-splgn (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             True
  ContainersReady   True
  PodScheduled      True
Volumes:
  data:
    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  data-sreworks-mysql-0
    ReadOnly:   false
  config:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      sreworks-mysql
    Optional:  false
  kube-api-access-splgn:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
QoS Class:        BestEffort
Node-Selectors:   <none>
Tolerations:      node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                  node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type     Reason          Age                   From               Message
  ----     ------          ----                  ----               -------
  Normal   Scheduled       23h                   default-scheduler  Successfully assigned sreworks/sreworks-mysql-0 to minikube
  Normal   Created         23h                   kubelet            Created container mysql
  Normal   Started         23h                   kubelet            Started container mysql
  Warning  Unhealthy       23h                   kubelet            Readiness probe failed: mysqladmin: [Warning] Using a password on the command line interface can be insecure.
                                                                    mysqladmin: connect to server at 'localhost' failed
                                                                    error: 'Access denied for user 'root'@'localhost' (using password: YES)'
  Normal   Killing         23h                   kubelet            Container mysql failed liveness probe, will be restarted
  Normal   Pulled          23h (x2 over 23h)     kubelet            Container image "sreworks-registry.cn-beijing.cr.aliyuncs.com/hub/mysql:v1.0" already present on machine
  Warning  Unhealthy       22h (x40 over 23h)    kubelet            Liveness probe failed: command "/bin/bash -ec password_aux="${MYSQL_ROOT_PASSWORD:-}"\nif [[ -f "${MYSQL_ROOT_PASSWORD_FILE:-}" ]]; then\n password_aux=$(cat "$MYSQL_ROOT_PASSWORD_FILE")\nfi\nmysqladmin status -uroot -p"${password_aux}"\n" timed out
  Warning  Unhealthy       22h (x68 over 23h)    kubelet            Readiness probe failed: command "/bin/bash -ec password_aux="${MYSQL_ROOT_PASSWORD:-}"\nif [[ -f "${MYSQL_ROOT_PASSWORD_FILE:-}" ]]; then\n password_aux=$(cat "$MYSQL_ROOT_PASSWORD_FILE")\nfi\nmysqladmin status -uroot -p"${password_aux}"\n" timed out
  Normal   SandboxChanged  23m                   kubelet            Pod sandbox changed, it will be killed and re-created.
  Normal   Pulled          22m                   kubelet            Container image "sreworks-registry.cn-beijing.cr.aliyuncs.com/hub/mysql:v1.0" already present on machine
  Normal   Created         22m                   kubelet            Created container mysql
  Normal   Started         22m                   kubelet            Started container mysql
  Warning  Unhealthy       21m (x3 over 22m)     kubelet            Readiness probe failed: mysqladmin: [Warning] Using a password on the command line interface can be insecure.
                                                                    mysqladmin: connect to server at 'localhost' failed
                                                                    error: 'Can't connect to local MySQL server through socket '/opt/bitnami/mysql/tmp/mysql.sock' (2)'
                                                                    Check that mysqld is running and that the socket: '/opt/bitnami/mysql/tmp/mysql.sock' exists!
  Warning  Unhealthy       12m (x10 over 20m)    kubelet            Liveness probe failed: command "/bin/bash -ec password_aux="${MYSQL_ROOT_PASSWORD:-}"\nif [[ -f "${MYSQL_ROOT_PASSWORD_FILE:-}" ]]; then\n password_aux=$(cat "$MYSQL_ROOT_PASSWORD_FILE")\nfi\nmysqladmin status -uroot -p"${password_aux}"\n" timed out
  Warning  Unhealthy       3m11s (x17 over 22m)  kubelet            Readiness probe failed: command "/bin/bash -ec password_aux="${MYSQL_ROOT_PASSWORD:-}"\nif [[ -f "${MYSQL_ROOT_PASSWORD_FILE:-}" ]]; then\n password_aux=$(cat "$MYSQL_ROOT_PASSWORD_FILE")\nfi\nmysqladmin status -uroot -p"${password_aux}"\n" timed out
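
One detail that stands out in the output above: both probes run with timeout=1s, and several of the failures are probe timeouts rather than real crashes. On a slow VM, raising the probe timeouts may reduce the restarts. A hedged sketch, assuming the bundled bitnami mysql subchart exposes its values under the mysql key (check the chart's values.yaml before relying on these paths):

    helm upgrade sreworks ./ --namespace sreworks --reuse-values \
      --set mysql.primary.livenessProbe.timeoutSeconds=5 \
      --set mysql.primary.readinessProbe.timeoutSeconds=5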

@stdnt-xiao (Contributor)

The storage volume mounted in your cluster may have a compatibility problem. As a test, you can try removing mysql's data volume mount.
That is not the final fix, though; in principle the software should handle different storage systems.
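
A hedged sketch of that test, assuming the mysql subchart follows the standard bitnami values layout and is exposed under the mysql key (with persistence off, all mysql data is lost on restart, so this is for diagnosis only):

    helm upgrade sreworks ./ --namespace sreworks --reuse-values \
      --set mysql.primary.persistence.enabled=false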

@sixinshuier commented Dec 17, 2023

(screenshot)
(screenshot)
Same problem here.

@Twwy (Collaborator) commented Dec 17, 2023

> Same problem here.

Your problem is not quite the same: your mysql is running normally and has not restarted. Take a look at the logs of the sreworks-appmanager-postrun-8nnnv pod to see which part of the postrun did not complete successfully.
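
For reference, reading that pod's logs (pod name taken from the comment above):

    kubectl logs sreworks-appmanager-postrun-8nnnv -n sreworks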

@sixinshuier commented Dec 18, 2023

(screenshot)
Same thing: the connection to sreworks-appmanager still times out. sreworks-appmanager-server has a problem; the database initialization is failing.
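
A sketch for pulling the server's own logs to see why initialization fails (the pod name placeholder below is hypothetical; substitute whatever kubectl get pods actually shows):

    kubectl get pods -n sreworks | grep appmanager
    # then, using the actual pod name from the listing:
    kubectl logs <appmanager-server-pod-name> -n sreworks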
