[BUG] mfsbdev and map + unmap + map on /dev/nbd0 = input/output error #551

Open
asyslinux opened this issue Sep 17, 2023 · 1 comment

asyslinux commented Sep 17, 2023

Have you read through available documentation, open GitHub issues and GitHub Q&A Discussions?

Yes

System information

Debian 11.7 (amd64) + Proxmox 7.4 / MooseFS Pro 3.0.117

root@srv:~# uname -a
Linux srv 5.15.116-1-pve #1 SMP PVE 5.15.116-1 (2023-08-29T13:46Z) x86_64 GNU/Linux

Hardware / network configuration, and underlying filesystems on master, chunkservers, and clients.

Underlying fs: ext4

2 x master servers
2 x chunkservers
2 x clients

How much data is tracked by moosefs master (order of magnitude)?

Empty, new MooseFS.

Describe the problem you observed.

I am trying to use mfsbdev for virtual machine images. Before using it for VMs, I ran into a problem:
When I start mfsbdev and then 1. map a file, 2. unmap the file, 3. map the file again, the nbd device no longer works properly.

If I stop and start mfsbdev, mapping works normally again. But with several VMs I cannot stop the mfsbdev service on a production system, because that would require stopping all virtual machines.

  1. The VM is stopped (the VM itself is not relevant here).

  2. map + unmap + map uses only /dev/nbd0.

  3. Additional settings:

root@srv:~# cat /etc/modprobe.d/nbd.conf
options nbd nbds_max=64

Can you reproduce it? If so, describe how. If not, describe troubleshooting steps you took before opening the issue.

Yes:

  1. map + unmap + map
root@srv:~# modprobe nbd

root@srv:~# /usr/sbin/mfsbdev start -H mfsmaster.lan
mfsmaster accepted connection with parameters: read-write,restricted_ip,admin ; root mapped to root:root

root@srv:~# /usr/sbin/mfsbdev map -f vz/nbd/100/vm-100.bin -s 32GB -n vm-wire-lv
started block device: (/dev/mfs/vm-wire-lv->/dev/nbd0 : MFS:/vz/nbd/100/vm-100.bin : 29.802GiB)

root@srv:~# fdisk /dev/mfs/vm-wire-lv

Welcome to fdisk (util-linux 2.36.1).
Changes will remain in memory only, until you decide to write them.
Be careful before using the write command.

Command (m for help): p
Disk /dev/mfs/vm-wire-lv: 29.8 GiB, 32000000000 bytes, 7812500 sectors
Units: sectors of 1 * 4096 = 4096 bytes
Sector size (logical/physical): 4096 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disklabel type: dos
Disk identifier: 0x33c0c8a3

Device               Boot Start      End  Sectors   Size Id Type
/dev/mfs/vm-wire-lv1 *     2048 62498815 62496768 238.4G 83 Linux

Command (m for help): q

root@srv:~# mfsbdev unmap -f vz/nbd/100/vm-100.bin
stop block device: (/dev/mfs/vm-wire-lv->/dev/nbd0 : MFS:/vz/nbd/100/vm-100.bin : 29.802GiB)

root@srv:~# /usr/sbin/mfsbdev map -f vz/nbd/100/vm-100.bin -s 32GB -n vm-wire-lv
started block device: (/dev/mfs/vm-wire-lv->/dev/nbd0 : MFS:/vz/nbd/100/vm-100.bin : 29.802GiB)

root@srv:~# fdisk /dev/mfs/vm-wire-lv

Welcome to fdisk (util-linux 2.36.1).
Changes will remain in memory only, until you decide to write them.
Be careful before using the write command.

fdisk: cannot open /dev/mfs/vm-wire-lv: Input/output error
root@srv:~# cfdisk /dev/mfs/vm-wire-lv
cfdisk: cannot open /dev/mfs/vm-wire-lv: Input/output error

Also note the strange partition size: 238.4G on a 32GB disk.
/dev/mfs/vm-wire-lv1 * 2048 62498815 62496768 238.4G 83 Linux
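
One possible explanation for the 238.4G figure, offered as an assumption and not confirmed anywhere in this report: the partition table may have been written while the image was addressed in 512-byte sectors, whereas the nbd device here reports 4096-byte logical sectors. The sketch below only redoes the arithmetic with the sector counts from the fdisk output; the helper names `to_bytes` and `to_gib` are made up for illustration.

```shell
#!/bin/sh
# Sector counts copied from the fdisk output in this report.
disk_sectors=7812500     # sectors fdisk reports for the whole device
part_sectors=62496768    # sectors recorded for partition 1

# Convert a sector count ($1) to bytes for a given sector size ($2).
to_bytes() { echo $(( $1 * $2 )); }

# Format a byte count ($1) as GiB with one decimal place.
to_gib() { awk -v b="$1" 'BEGIN { printf "%.1f", b / (1024 * 1024 * 1024) }'; }

echo "disk, 4096-byte sectors:      $(to_gib "$(to_bytes "$disk_sectors" 4096)") GiB"
echo "partition, 4096-byte sectors: $(to_gib "$(to_bytes "$part_sectors" 4096)") GiB"
# ASSUMPTION: the partition table was created with 512-byte sectors.
echo "partition, 512-byte sectors:  $(to_gib "$(to_bytes "$part_sectors" 512)") GiB"
```

The three lines come out as 29.8, 238.4 and 29.8 GiB. Under that assumption the partition would fit the 32GB image exactly as expected, and the odd size would be a sector-size mismatch, separate from the input/output error.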

  2. map + unmap + map + unmap + map (without restarting the mfsbdev service)
    After map + unmap + map produces the input/output error, one additional unmap + map works again, without restarting the mfsbdev service.
root@srv:~# mfsbdev unmap -f vz/nbd/100/vm-100.bin
stop block device: (/dev/mfs/vm-wire-lv->/dev/nbd1 : MFS:/vz/nbd/100/vm-100.bin : 29.802GiB)

root@srv:~# /usr/sbin/mfsbdev map -f vz/nbd/100/vm-100.bin -s 32GB -n vm-wire-lv
started block device: (/dev/mfs/vm-wire-lv->/dev/nbd1 : MFS:/vz/nbd/100/vm-100.bin : 29.802GiB)

root@srv:~# fdisk /dev/mfs/vm-wire-lv

Welcome to fdisk (util-linux 2.36.1).
Changes will remain in memory only, until you decide to write them.
Be careful before using the write command.

Command (m for help):

Repeating unmap + map once more gives the input/output error again.

In summary:

map = ok
map + unmap + map = input/output error
map + unmap + map + unmap + map = ok
map + unmap + map + unmap + map + unmap + map = input/output error
map + unmap + map + unmap + map + unmap + map + unmap + map = ok
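
The alternating pattern above could be reproduced with a loop along these lines. This is a minimal sketch: it reuses only the `map -f/-s/-n` and `unmap -f` invocations shown in this report, assumes mfsbdev is already started and the script runs as root, and uses a one-block `dd` read as a stand-in for opening the device with fdisk. The function names and environment variables are hypothetical.

```shell
#!/bin/sh
# Defaults taken from the commands in this report; override via environment.
MFSBDEV=${MFSBDEV:-/usr/sbin/mfsbdev}
IMAGE=${IMAGE:-vz/nbd/100/vm-100.bin}
NAME=${NAME:-vm-wire-lv}
SIZE=${SIZE:-32GB}

# Return 0 if one 4096-byte block can be read from the device at $1.
device_readable() {
    dd if="$1" of=/dev/null bs=4096 count=1 2>/dev/null
}

# Map, probe and unmap the image $1 times, printing one result per cycle.
run_cycles() {
    i=1
    while [ "$i" -le "$1" ]; do
        "$MFSBDEV" map -f "$IMAGE" -s "$SIZE" -n "$NAME" >/dev/null || return 1
        if device_readable "/dev/mfs/$NAME"; then
            echo "cycle $i: ok"
        else
            echo "cycle $i: read failed"
        fi
        "$MFSBDEV" unmap -f "$IMAGE" >/dev/null || return 1
        i=$((i + 1))
    done
}

# Usage (not run automatically): run_cycles 5
```

If the behaviour matches the summary above, odd-numbered cycles would report `ok` and even-numbered cycles `read failed`.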

Additionally:

I also tried the -i option to ignore locking, but that does not work either. My idea is to map a single image on two different servers with -i, as a hot-standby scenario for VMs. But if I stop the VM on the first server and unmap its image file there, then fdisk/cfdisk on the second server also report an input/output error.

I have two servers in a MooseFS Pro cluster with mfsbdev and want a hot-standby setup for virtual machines, either with -i (ideally with the image files pre-mapped on the second server) or without -i, if a single image cannot be mapped as an nbd device on two servers in the cluster at the same time. In the latter case I can manually map the nbd devices on the second server before starting the VMs there if the first server goes down.

But right now, a simple map + unmap + map breaks the nbd device even on a single server.

After stopping mfsbdev and removing the nbd module from the kernel, I also see stuck connections:

root@srv:~# ps aux | grep "nbd\|bdev" | grep -v grep ; lsmod | grep nbd
root         883  0.0  0.0      0     0 ?        I<   Sep10   1:36 [kworker/u196:0-nbd0-recv]
root     1474426  0.0  0.0      0     0 ?        I<   07:25   0:00 [kworker/u196:1-nbd1-recv]

chogata (Member) commented Oct 10, 2023

Thank you for the report. We've found the issue and we will post a fix soon!
