-
Notifications
You must be signed in to change notification settings - Fork 45
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add a small_cell
indexer to dials.index
#2477
base: main
Are you sure you want to change the base?
Conversation
…l, and convert orientation to Crystal
cctbx.xfel.small_cell_process working with this branch, not the change in 4499187
is achieved by passing the params as None, as then the Strategy __init__ will parse the phil params. Weird setup, but that's the way it is.
Ok, looking at this today. I found that at the top of the log for the original version it says this:
But for the dials.index version it says this:
This means that fewer spots are landing at the right d-spacings. This step happens way before any graph/pair matching and after spotfinding. To me implies there is either something wrong with how the geometry is being passed in or something wrong with the spotfinding. Looking at these options next. |
Ok, I've found the source of the 10 vs. 12 spots on rings, the wavelength. Use
The next problem seems to be dials.index mapping the input unit cell to a different setting. If I print out
While this does generate the same list of d-spacings, it does not generate the same list of operators. C222 generates a set of operators where h, k, and l are all in the same order, which this code relies on. However the operators generated by "C m m m (x-y,x+y,z)" include operators with transpositions of h, k, and l, which the above code doesn't consider. This diff re-writes that section to support more complex operators:
With that, I get the same graph/connections as I do on the cctbx.xfel.small_cell_process side. My next bug was due to resetting the imageset_id incorrectly, fixed with this diff in small_cell.py:
With these changes we get a crystal model successfully, but dials.index fails in the refinement step. So the next thing to think about is that dials.index will invoke the indexer or the stills indexer, but neither are appropriate for small cell indexing. They both do complicated refinements. My first thought is to bypass the refinement steps in the indexer or the stills indexer, but I haven't got that working yet. |
Got it! indexing.refinement_protocol.mode=None was part of the answer, but the full command that worked is
This requires cctbx/cctbx_project#914 which has the above code edits. Hm, some of these extra parameters are the defaults for cctbx.xfel.small_cell_process/dials.stills_process: |
Ah, nice one! For reference, this worked for me, with DIALS on the
End result looks like a sensible solution:
|
I'm thinking about the parameter defaults. I have an ulterior motive here, which is to try |
Oh that's a good point. In fact, dials.stills_process fixes the detector even for protein still images. I'd fix the wavelength, cell, and detector, and only refine the orientation (3 parameters total). For the above image, that would be 3 reflections per parameter. That said, for X-ray small molecule, I still think |
I was wondering about how the One possibility though is that we instead just include a step of orientation-only refinement as part of the Another issue I'm thinking about is that the use case of indexing a single still pattern while refining detector parameters is indeed not very sensible, but it could be reasonable to refine detector parameters against a set of indexed cells. I suppose that is more a job for |
Codecov ReportAttention:
Additional details and impacted files@@ Coverage Diff @@
## main #2477 +/- ##
==========================================
- Coverage 78.52% 78.43% -0.09%
==========================================
Files 609 610 +1
Lines 75096 75145 +49
Branches 10723 10277 -446
==========================================
- Hits 58971 58943 -28
- Misses 13953 14024 +71
- Partials 2172 2178 +6 |
Agreed
That sounds doable. One trick is that the reason I had to set refinement parameters (
Agreed |
Ah yes, thanks! I was looking for that, but couldn't find it. It looks like this refinement should have a fallback mode when there are very few reflections, like in this case. At the moment it just gives up with the solution. The simplest approach may be to have another try here: dials/src/dials/algorithms/indexing/model_evaluation.py Lines 339 to 340 in 93c1364
before finally giving up only in the case that small_cell -appropriate refinement does not succeed. I'll take a look at this now.
|
Just a note about adding a step of orientation-only refinement at the end of dials/src/dials/algorithms/indexing/lattice_search/low_res_spot_match.py Lines 533 to 542 in 93c1364
If I can remember the details I'll try to add this too. That way you get a better |
Hi, I think |
Yes, probably similar. I'm trying to work out why I'm the test example the small_cell algorithm ends with 6 indexed spots, but when the crystal model gets passed to AssignIndices only 3 actually get indexed |
Co-authored-by: Aaron Brewster <asbrewster@lbl.gov>
Ok, I think I've worked out the weirdness. If you add these lines to the end of the
then we are doing the index assignment like dials.index will do, but we do it twice. The results are a bit surprising:
The first time round, only 3 reflections are indexed, which is exactly what I've been seeing in The problem is that
But If instead of the above edit I reset the
then run
|
I haven't quite decided what to do about this yet. We could copy the reflection list before calling |
Hi, just chiming in, it looks like you are chasing the same leads we are. From @dwpaley: we should be measuring the tolerance in reciprocal angstroms, but we use fractional coordinates. For small cells, this actually puts us closer to the volumes checked for protein data. For this image there could be multiple crystals in the shot, so it's impossible to know which tolerance is better. A tighter tolerance might be more realistic. |
Hi Aaron, away from the computer now, but just wanted to add that I'm not worried about the tolerance but the non-idempotent behaviour of the index assignment. Looks like there are bugs to me |
Indeed if you keep calling the index assignment it ends up toggling between two outputs! |
Ha, wow. Definitely a bug. Digging deeper with @dwpaley. |
orientation by superposing ideal relps over the observed relp positions. This is seen to improve the model going into the refinement done by ModelEvaluation.
@phyy-nx @dwpaley I've put in a workaround for #2485, but it may be something we can remove if that issue is fixed within Also, I wanted to bring 3761b4a to your attention. To test this I had a look at the output of refinement done by
With the additional polishing step I saw this:
The models end up at basically the same place, but including the I have not tested this any more extensively than the single image example in #2477 (comment), but if it is something that proves generally useful then perhaps it should move from DIALS indexing's |
Hi, I think what you are showing is that there is a improvement when using |
@dagewa There's something weird going on here or thereabouts:
Do you understand this code? Currently this is selecting the unindexed reflections and attempting to index them. If we don't index any of those, or if there were no unindexed reflections in the first place, then |
Ok, @phyy-nx and I figured this out. The small_cell lattice search was breaking convention by marking reflections as indexed. If we reset |
As a test I did that (inefficiently) with this change: diff --git a/xfel/small_cell/solve_orientation.py b/xfel/small_cell/solve_orientation.py
index eb87513672..2b677a837d 100644
--- a/xfel/small_cell/solve_orientation.py
+++ b/xfel/small_cell/solve_orientation.py
@@ -49,6 +49,21 @@ class small_cell_orientation:
print("Summed squared residuals of x,y,z for %d spots in 1/angstroms: %.7f, %.7f, %.7f"%(N,self.residuals[0],self.residuals[1],self.residuals[2]))
Amatrix = sqr(solution.flatten()).transpose()
+ # Recalculate using superpose
+ from scitbx import matrix
+ from scitbx.math import superpose
+ Bmat = matrix.sqr(self.sym.unit_cell().fractionalization_matrix()).transpose()
+ observed_relps = [(0.0, 0.0, 0.0)]
+ predicted_relps = [(0.0, 0.0, 0.0)]
+ for hkl, xyz in zip(self.miller_indices, self.u_vectors):
+ predicted_relps.append(Bmat * hkl)
+ observed_relps.append(xyz)
+ fit = superpose.least_squares_fit(
+ flex.vec3_double(observed_relps), flex.vec3_double(predicted_relps)
+ )
+ Umat = fit.r
+ Amatrix = Umat * Bmat
+
from cctbx import crystal_orientation
ori = crystal_orientation.crystal_orientation(Amatrix, crystal_orientation.basis_type.reciprocal)
return ori It's not quite the same though. Without looking at it more closely, when it is called here there are 8 reflections in the fit, whereas when it was called at the end of Here is the output of
Now the start point looks a bit worse in If I do the
Good start point in These results might be all much of a muchness. It doesn't actually affect the end result - we get the same number of indexed reflections and basically the same crystal model any which way. |
The reason the results are different between the approaches is interesting though. As I see it, the By contrast the My intuition is that the |
Yes, thanks, I should have kept up with this thread better :) From your 9 indexed reflections and your large-ish Δψ values I think you're looking at the first shot in the ten_cbfs dataset. I would caution that 1-2 of those 9 appear to be bogus. My final orientations appeared better with outlier rejection enabled for candidate refinement:
Then RMSD_DeltaPsi and RMSD_Y are better by ~half (for the refined subset), but more importantly the integration shoeboxes are in the right places. We have a lot of progress and I'm putting together a summary, but for now just wanted to mention the outlier issue for this specific image. We're really excited about this here! We are tentatively planning a retirement party for |
Thanks @dwpaley! Yes, I really need to look at the wider picture with more images. I've been stuck in the weeds with the details so far... |
@dagewa Are you ok with 578a4ce? If so, then with the following phil:
I can process all of the ten_cbfs like this:
|
Hi @dwpaley yes, that looks fine to me at this stage (though in general I think we'd prefer to float the import). Longer term I think we might want a wider conversation about convergence of workflows and the data model. The fact that we have things like different classes for various mosaicity models is, IMHO, an indication of a problematic design. Honestly I think the base Another target on my long range radar are things like Ultimately I'd like the software just to do the Right Thing in each situation, governed by short and directly meaningful user parameters. It's an idealistic goal, but I think getting the feeling that is going on is one reason why people love programs like XDS. Anyway, that's probably a much longer reply than you expected 😉 (you can tell I'm getting ready to put down tools for my summer break). I put it here because I think the |
@dagewa Agree with everything here and thanks for the collaboration on this. One of my last points would be to eliminate the phil-juggling here: dials/src/dials/algorithms/indexing/lattice_search/small_cell.py Lines 54 to 59 in 578a4ce
because it fails to pass along any other small_cell params that may have been provided. I tentatively think this means deprecating I know your month is ending in a couple hours; while I haven't discussed it with @phyy-nx, I don't necessarily think the above needs to prevent this PR from getting merged. |
Hi @dwpaley agreed sounds sensible. I think this PR shouldn't go in until the |
Just wanted to mention that for whatever reason (not yet investigated) this branch provided 4x indexing rate compared to |
@dagewa do you know what the state of this is? |
WIP.
Requires cctbx/cctbx_project#912.
Currently, it runs, but indexing fails to assign indices for the
cctbx.xfel.small_cell_process
test cases. I'm not sure what the problem is.Test procedure
First run the
cctbx.xfel.small_cell_process
testNow split the output to get reference geometry for each of the individual images
Then import the first image like this, ensuring the detector geometry and wavelength come from the reference
Find spots using the same parameters as in
params_1.phil
Now try to index
We get some output from
small_cell
, showing that a cell model has been found, but unfortunately indices can't be assigned :-(I'm wondering if @phyy-nx or @dwpaley might know what is wrong here?