Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

libxl blktap support #394

Open
wants to merge 2 commits into
base: master
Choose a base branch
from
Open

Conversation

jandryuk
Copy link
Contributor

These two patches are the minimal needed for integration with libxl.

The InitWait handling needs a little modification to set it earlier. libxl needs to see InitWait to run hotplug scripts which will write hotplug-status = "connected". Setting InitWait earlier won't allow transitioning to Connected any earlier, so I think it should be okay with XAPI. Howver, it has not been tested under XAPI.

The deleting of the xenstore backend is no longer performed. libxl expects to find the backend to run the hotplug scripts to perform cleanup. Later on, libxl will delete the entire domain sub-tree, so xenstore will get cleaned up that way. Again, untested under XAPI. Maybe XAPI does similar sub-tree cleanup so this isn't necessary?

If this isn't acceptable as-is, would a command line option & global flag for a "libxl mode" be acceptable to change behavior?

This is needed for libxl integration.

tapdisk waits to transition to InitWait until after "hotplug-status" is
connected.  However libxl doesn't run its hotplug scripts, which write
"hotplug-status", until the backend transitions to InitWait.  With both
sides waiting, progress is not made and connecting the blktap device
times out and fails.

Make tapback transition to InitWait earlier to resolve this issue under
libxl.  Place the transition to InitWait in
tapback_backend_create_device() after the xenstore feature nodes have
been written.  InitWait is a signal to the frontend that such nodes have
been written.  This matches blkback's behaviour.  It should also be fine
since tapback still won't advance to Connected without the other setup
like physical-device-path and hotplug-status.

VBDs can be reconnected.  When pv-grub is used, it connects the VBD,
loads the kernel, disconnects the VBD.  It then re-sets the frontend
state to XenbusStateInitialising so that the new kernel will find and
connect the VBD.

tapback and blkback handle this case differently.  When blkback observes
the frontend transition to XenbusStateInitialising, and the backend is
XenbusStateClosed, the backend transitions to XenbusStateInitWait.

When tapback observes the frontend transition to
XenbusStateInitialising, the backend checks for hotplug_status_connected
to be true before switching XenbusStateInitWait.  For tapback, this
serves a second purpose for setting XenbusStateInitWait initially as
well.

With tapback setting XenbusStateInitWait earlier, follow the blkback
model to transition to XenbusStateInitWait when the backend was
previously in XenbusStateClosed.  The end result shouldn't change, but
it will remove an extra write and watch trigger for XenbusStateInitWait.

Signed-off-by: Jason Andryuk <jandryuk@gmail.com>
libxl doesn't clean up tapdisks because it doesn't call the hotplug
cleanup scripts:
libxl: debug: libxl_event.c:1043:devstate_callback: backend /local/domain/0/backend/vbd3/5/2048/state wanted state 6 but it was removed
libxl: debug: libxl_event.c:849:libxl__ev_xswatch_deregister: watch w=0xf82ba0 wpath=/local/domain/0/backend/vbd3/5/2048/state token=1/2: deregister slotnum=1
libxl: debug: libxl_device.c:1156:device_backend_callback: Domain 5:calling device_backend_cleanup
libxl: debug: libxl_event.c:863:libxl__ev_xswatch_deregister: watch w=0xf82ba0: deregister unregistered
libxl: error: libxl_device.c:1169:device_backend_callback: Domain 5:unable to remove device with path /local/domain/0/backend/vbd3/5/2048 - rc -6

The backend state cannot be found because tapback deleted the entire
backend subtree.

tapback shouldn't remove the backend nodes when the frontend is removed,
because the nodes contain the information on how to clean up.  Leave the
nodes and allowed them to be removed by the toolstack.

Signed-off-by: Jason Andryuk <jandryuk@gmail.com>
@MarkSymsCtx
Copy link
Contributor

I'll run this through our test systems and check nothing goes bang,

@MarkSymsCtx
Copy link
Contributor

OK, as this stands right now it causes Xenopsd to fail with timeouts so it can't be immediately merged in this form. I'll get somebody to look at the xenopsd failures and see if that can be made to work compatibly.

@jandryuk
Copy link
Contributor Author

@MarkSymsCtx Thanks so much for testing it - sorry it didn't work...

@MarkSymsCtx
Copy link
Contributor

@jandryuk no problems, we'll look into whether we can get xenopsd to function compatibly with this. It looked to be timing out waiting on a state change, which possibly means that it's missed that the transition has already occurred.

@jandryuk
Copy link
Contributor Author

This branch puts the changes from this PR behind a new -l/--libxl flag jandryuk@63f3883

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants