
nvme create defaults to port_id=0, doesn't link in new subsystems. #12

Open
Smithx10 opened this issue Jun 30, 2022 · 3 comments

@Smithx10

Currently, nvme creates after the first one do not get their subsystems linked into the port.

The reason is that the resource agent responsible for creating the port and linking the subsystems to that port never reaches that code, because of nvmet_port_monitor():
https://github.com/ClusterLabs/resource-agents/blob/main/heartbeat/nvmet-port#L137

The health check only checks whether the port directory exists; if it does, nvmet_port_start() returns early and never iterates over the nqns:
https://github.com/ClusterLabs/resource-agents/blob/main/heartbeat/nvmet-port#L148

nvmet_port_start() {
	nvmet_port_monitor
	if [ $? =  $OCF_SUCCESS ]; then
		return $OCF_SUCCESS
	fi

	mkdir ${portdir}
	echo ${OCF_RESKEY_addr} > ${portdir}/addr_traddr
	echo ${OCF_RESKEY_type} > ${portdir}/addr_trtype
	echo ${OCF_RESKEY_svcid} > ${portdir}/addr_trsvcid
	echo ${OCF_RESKEY_addr_fam} > ${portdir}/addr_adrfam

	for subsystem in ${OCF_RESKEY_nqns}; do
		ln -s /sys/kernel/config/nvmet/subsystems/${subsystem} \
		   ${portdir}/subsystems/${subsystem}
	done

	nvmet_port_monitor
}
nvmet_port_monitor() {
	[ -d ${portdir} ] || return $OCF_NOT_RUNNING
	return $OCF_SUCCESS
}

I noticed that we don't populate port_id in "/etc/drbd-reactor.d/linstor-gateway-nvmeof-$name.toml"

We only populate:

"ocf:heartbeat:nvmet-port port addr=10.91.230.214 nqns=linbit:nvme:zoo type=tcp"

Desired behavior?
If the user provides a different service address, it probably should just automatically take the next available port (see the sketch below). This probably should be fixed in linstor-gateway.

If the user provides the same service address, it should link it in. This probably should be fixed in resource-agents.

port_id could potentially be exposed to the user, but that's probably not necessary.
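As a rough sketch of what "take the next available port" could mean (illustrative shell only, assuming the standard configfs layout under /sys/kernel/config/nvmet; linstor-gateway would of course implement this in its own code):

# Hypothetical helper: pick one higher than the highest existing port id.
next_free_portid() {
	last=$(ls /sys/kernel/config/nvmet/ports 2>/dev/null | sort -n | tail -n 1)
	echo $(( ${last:-0} + 1 ))
}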

@chrboe
Collaborator

chrboe commented Jul 1, 2022

Thanks for the report.

If the user provides a different service address it probably should just automatically take the next available port

Hm, yes. We would have to read back all the already created targets and check for the highest port_id... Probably not impossible, but I don't think we have precedent for that kind of logic yet. I will look at it.

If the user provides the same service address it should link it in. This probably should be fixed in resource-agents.

I don't think I fully understand this point. Right now I guess it would create a new portdir and symlink the subsystem in there. Does the backend not accept this? How would we fix this in the resource agents?

I guess if anything linstor-gateway should look up whether or not there is already a target with the same addr and assign the same port_id if there is...
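Roughly, that lookup could work like this (illustrative shell only; the function name is made up, and linstor-gateway would do this in its own code):

# Hypothetical helper: print the port id of an existing nvmet port
# that already serves the given address, if any.
find_portid_for_addr() {
	addr="$1"
	for p in /sys/kernel/config/nvmet/ports/*; do
		[ -d "$p" ] || continue
		if [ "$(cat "$p"/addr_traddr)" = "$addr" ]; then
			echo "${p##*/}"
			return 0
		fi
	done
	return 1
}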

@Smithx10
Author

Smithx10 commented Jul 1, 2022

Sorry if I wasn't clear.

I guess if anything linstor-gateway should look up whether or not there is already a target with the same addr and assign the same port_id if there is...

Even if we use the same port_id for the same service address, with the current nvmet-port heartbeat code we will never symlink in the subsystem.

nvmet_port_start() runs nvmet_port_monitor(), which only checks whether $portdir exists. Since we created a port earlier, it does exist, so the monitor returns 0 and we never hit the following:

	for subsystem in ${OCF_RESKEY_nqns}; do
		ln -s /sys/kernel/config/nvmet/subsystems/${subsystem} \
		   ${portdir}/subsystems/${subsystem}
	done

The health check:

nvmet_port_monitor() {
	[ -d ${portdir} ] || return $OCF_NOT_RUNNING
	return $OCF_SUCCESS
}

Perhaps we should run the symlink loop even when the portdir already exists.
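Something along these lines, maybe (untested sketch; it drops the early return and only adds a guard so existing symlinks aren't re-created):

nvmet_port_start() {
	if [ ! -d ${portdir} ]; then
		mkdir ${portdir}
		echo ${OCF_RESKEY_addr} > ${portdir}/addr_traddr
		echo ${OCF_RESKEY_type} > ${portdir}/addr_trtype
		echo ${OCF_RESKEY_svcid} > ${portdir}/addr_trsvcid
		echo ${OCF_RESKEY_addr_fam} > ${portdir}/addr_adrfam
	fi

	# Link any subsystems that are not linked yet, even if the port
	# directory already existed.
	for subsystem in ${OCF_RESKEY_nqns}; do
		[ -e ${portdir}/subsystems/${subsystem} ] || \
			ln -s /sys/kernel/config/nvmet/subsystems/${subsystem} \
			   ${portdir}/subsystems/${subsystem}
	done

	nvmet_port_monitor
}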

@Smithx10
Author

Smithx10 commented Jul 5, 2022

After going through a PoC implementation of this behavior, I discovered that when you have 1 VIP with 4 subsystems, it's possible for reactor to promote the VIP on separate Primaries.

For example:
nvme create -r nvme_group linbit:nvme:demo0 10.91.230.214/32 10G
nvme create -r nvme_group linbit:nvme:demo1 10.91.230.214/32 10G
nvme create -r nvme_group linbit:nvme:demo2 10.91.230.214/32 10G

This can result in demo0 and demo1 on NodeA and demo2 on NodeB, with both nodes holding the VIP 10.91.230.214.

Is there a way to make sure that Reactor can co-locate things like this?

Perhaps preferred-nodes? https://github.com/LINBIT/drbd-reactor/blob/master/doc/promoter.md#preferred-nodes
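If preferred-nodes works the way the linked doc describes, the generated promoter config might only need something like this added (node names and the resource name are placeholders, and the surrounding structure is abbreviated):

[[promoter]]
[promoter.resources.demo0]
# ... start = [ ... ] entries as generated by linstor-gateway ...
preferred-nodes = ["NodeA", "NodeB"]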
