
wrong form. #679

Open
mikeatform opened this issue Apr 20, 2021 · 1 comment

mikeatform commented Apr 20, 2021

All the DNS names are in the hosts files, and I can ssh between the nodes.
Yesterday I just reset the offline nodes and things came back online.
Today, the same errors again.

[2021-04-20 16:03:47.685059] E [name.c:266:af_inet_client_get_remote_sockaddr] 0-glusterfs: DNS resolution failed on host srv-3:srv-4
[2021-04-20 16:03:50.685442] E [name.c:266:af_inet_client_get_remote_sockaddr] 0-glusterfs: DNS resolution failed on host srv-3:srv-4
[2021-04-20 16:03:50.725539] W [fuse-bridge.c:1276:fuse_attr_cbk] 0-glusterfs-fuse: 6885431: LOOKUP() / => -1 (Transport endpoint is not connected)
[2021-04-20 16:03:50.753676] I [fuse-bridge.c:6083:fuse_thread_proc] 0-fuse: initiating unmount of /shared
The message "E [MSGID: 101075] [common-utils.c:505:gf_resolve_ip6] 0-resolver: getaddrinfo failed (family:2) (Name or service not known)" repeated 17 times between [2021-04-20 16:02:59.633395] and [2021-04-20 16:03:50.685439]
[2021-04-20 16:03:50.753827] W [glusterfsd.c:1596:cleanup_and_exit] (-->/lib64/libpthread.so.0(+0x7e65) [0x7f14965bae65] -->/usr/sbin/glusterfs(glusterfs_sigwaiter+0xe5) [0x5626eff99625] -->/usr/sbin/glusterfs(cleanup_and_exit+0x6b) [0x5626eff9948b] ) 0-: received signum (15), shutting down
[2021-04-20 16:03:50.753846] I [fuse-bridge.c:6871:fini] 0-fuse: Unmounting '/shared'.
[2021-04-20 16:03:50.753852] I [fuse-bridge.c:6876:fini] 0-fuse: Closing fuse connection to '/shared'.
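Both failures are in the client's name-resolution path (gf_resolve_ip6 / af_inet_client_get_remote_sockaddr), so a quick sanity check is whether each node can still resolve the peer hostnames through the same NSS path glibc uses. A minimal sketch, assuming the srv-* names from the log above:

```sh
# getent consults /etc/hosts as well as DNS (same lookup order glibc uses),
# unlike dig/nslookup, which query DNS directly.
for h in srv-3 srv-4; do
  getent hosts "$h" || echo "no resolution for $h"
done
```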

```
# gluster volume status
Status of volume: hpc-admin
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick serv-2:/DATA/hpc-admin/brick1         49152     0          Y       9181
Brick serv-3:/DATA/hpc-admin/brick1         49152     0          Y       10828
Brick serv-4:/DATA/hpc-admin/brick1         49152     0          Y       9264
Self-heal Daemon on localhost               N/A       N/A        Y       15218
Self-heal Daemon on serv-2                  N/A       N/A        Y       18495
Self-heal Daemon on serv-3                  N/A       N/A        Y       48312

Task Status of Volume hpc-admin
------------------------------------------------------------------------------
There are no active volume tasks

Status of volume: shared
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick serv-2:/DATA/shared/brick1            N/A       N/A        N       N/A
Brick serv-3:/DATA/shared/brick1            49153     0          Y       36391
Brick serv-4:/DATA/shared/brick1            N/A       N/A        N       N/A
Self-heal Daemon on localhost               N/A       N/A        Y       15218
Self-heal Daemon on serv-3                  N/A       N/A        Y       48312
Self-heal Daemon on serv-2                  N/A       N/A        Y       18495

Task Status of Volume shared
------------------------------------------------------------------------------
There are no active volume tasks
```
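Only the serv-3 brick of `shared` is online. One common recovery step (a sketch of the usual first move, not a fix for the underlying cause) is to force-start the volume, which respawns only the brick processes that are down, then check what is left to heal:

```sh
gluster volume start shared force   # restarts only the offline bricks
gluster volume status shared        # confirm the serv-2/serv-4 bricks are back
gluster volume heal shared info     # list entries still pending self-heal
```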

mikeatform (Author) commented:
The two nodes that aren't playing along both log this:
DATA-shared-brick1[16996]: [2021-04-20 17:48:28.186345] C [MSGID: 113081] [posix-common.c:639:posix_init] 0-shared-posix: Extended attribute not supported, exiting.
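That message comes from posix_init's startup probe: the brick process sets a test extended attribute on the brick root and exits if the filesystem refuses it. The same probe can be run by hand on the affected nodes (brick paths taken from the status output above; needs root):

```sh
# If this fails with "Operation not supported", the brick filesystem or its
# mount options reject trusted.* xattrs, which is what makes the brick
# process exit at startup.
setfattr -n trusted.glusterfs.test -v working /DATA/shared/brick1 && \
  getfattr -n trusted.glusterfs.test /DATA/shared/brick1
setfattr -x trusted.glusterfs.test /DATA/shared/brick1   # remove the test attr
```

If the probe fails, checking what the brick path is actually mounted on (e.g. `findmnt /DATA`) often explains it: a brick directory whose backing mount disappeared falls through to a filesystem without xattr support.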

@mikeatform mikeatform changed the title Working cluster with 2 volumes. spontaneously one volume can't resolve it's peer servers. wrong form. Apr 20, 2021