DIALS σ(I) estimates are considerably smaller than those from XDS #2625
This is very interesting! I looked at the thaumatin_i04 data with a slightly different approach.
high res cutoff much more generous than you used as I/σ(I) was ~20 at 2 Å for this dataset.
Some observations:
The plot of σ(I) has a slope of 1.182 which exactly matches the estimated I/σ(I) asymptotic limits reported by
Most interesting are the error model parameters...
I think you need to take the square root of both
so:
So of course I did:
This gives:
The estimated I/σ(I) asymptotic limit is now identical to that reported by XSCALE and the graphs are as you'd expect from this
So I would concentrate the investigation on the differences in error model refinement between XDS and dials.scale.
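As a rough illustration of the square-root relationship discussed above, the sketch below converts a pair of XDS-style error model parameters to the DIALS convention and checks that both give the same corrected σ and the same asymptotic I/σ(I). It assumes the commonly quoted forms of the two models, σ'² = a(σ² + bI²) for XDS/XSCALE and σ'² = a²(σ² + (bI)²) for dials.scale. The function names and the numbers are made up for the example and are not output from either program.

```python
import math

def xds_to_dials(a_xds, b_xds):
    """Convert XDS-style (a, b) to DIALS-style (a, b) by taking square roots.

    Assumes XDS:   sigma'^2 = a * (sigma^2 + b * I^2)
            DIALS: sigma'^2 = a^2 * (sigma^2 + (b * I)^2)
    """
    return math.sqrt(a_xds), math.sqrt(b_xds)

def sigma_xds(intensity, sigma, a, b):
    # Corrected sigma in the assumed XDS convention
    return math.sqrt(a * (sigma**2 + b * intensity**2))

def sigma_dials(intensity, sigma, a, b):
    # Corrected sigma in the assumed DIALS convention
    return a * math.sqrt(sigma**2 + (b * intensity) ** 2)

def isa_xds(a, b):
    # Asymptotic I/sigma(I) for strong reflections, XDS convention
    return 1.0 / math.sqrt(a * b)

def isa_dials(a, b):
    # Asymptotic I/sigma(I) for strong reflections, DIALS convention
    return 1.0 / (a * b)

# Hypothetical XDS-style parameters, chosen only to give round numbers
a_x, b_x = 4.0, 0.0004
a_d, b_d = xds_to_dials(a_x, b_x)

print(a_d, b_d)
print(isa_xds(a_x, b_x), isa_dials(a_d, b_d))
```

Comparing the raw a and b values from the two programs without this conversion would make the models look more different than they are.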
Some thoughts:
I will dig out my scripts for the second of these: when I discussed this with Randy at a GRC many years ago he did suggest that it wasn't really horrible - I think I called it "semi-synthetic" or something at the time
I remember Andrew Leslie (at Garib's suggestion?) performed the third analysis on MOSFLM by doubling the unit cell lengths.
Yes, you can do that with DIALS as well with
The reflections from dials with I=0 appear to be a xia2 thing. I cannot reproduce with 'normal' dials workflow. They have I=0 σ(I)=0:
processed using
@dagewa suggest you filter these out in your script. Culprit seems to be
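A filter of the kind suggested here might look like the following sketch. It uses plain Python tuples rather than a DIALS reflection table, and the function name drop_unmeasured is hypothetical. It removes any reflection whose σ(I) is not strictly positive, which catches the I=0, σ(I)=0 entries while keeping weak or negative intensities that have a valid σ(I).

```python
def drop_unmeasured(reflections):
    """Keep only reflections with a strictly positive sigma.

    `reflections` is an iterable of (hkl, intensity, sigma) tuples.
    """
    return [(hkl, i, s) for hkl, i, s in reflections if s > 0.0]

# Made-up reflections: one normal, one I=0/sigma=0 placeholder, one weak negative
refl = [
    ((1, 0, 0), 120.5, 8.1),
    ((2, 0, 0), 0.0, 0.0),
    ((0, 2, 0), -3.2, 4.0),
]
kept = drop_unmeasured(refl)
print(kept)
```

Filtering on σ(I) rather than on I avoids discarding genuinely weak measurements, which matter for the comparison of error estimates.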
Thanks @huwjenkins, I will update the script. @graeme-winter investigating this with a "semi-synthetic" dataset (or a few) sounds like an excellent idea. As for the systematic absences, I still need to really dig into it for the data set mentioned at the top of the thread. I suspect the absence-breaking to be "true", as this is ED on an inorganic crystal. Olex2 produces a plot of the intensity distribution of absence-breaking reflections, which looks like this:
For the small molecule ED dataset what does the table
in the
Also how are you going from XDS_ASCII -> SHELX HKLF format - it looks from the plot from Olex2 that the XDS intensities are much larger than the ones from DIALS?
Thanks Huw, yes I reduced
Scaling and conversion for XDS:
cat <<EOF > XSCALE.INP
OUTPUT_FILE=temp.ahkl
INPUT_FILE=../Data1/XDS_ASCII.HKL
INPUT_FILE=../Data3/XDS_ASCII.HKL
INPUT_FILE=../Data4/XDS_ASCII.HKL
SPACE_GROUP_NUMBER= 22
UNIT_CELL_CONSTANTS= 18.374 18.669 6.767 90 90 90
RESOLUTION_SHELLS=4 3 2 1 0.61
EOF
xscale
cat <<EOF > XDSCONV.INP
INPUT_FILE=temp.ahkl
OUTPUT_FILE=xds.hkl SHELX
FRIEDEL'S_LAW=FALSE
EOF
xdsconv
NB the space group is
Maybe just an effect of the overall scale factor being arbitrary and XDS adjusting to best use a fixed-width format?
In my experience XSCALE always results in very large intensities - there'll be a line in
but I'm intrigued that these are getting through XDSCONV. This must have changed because I'm sure it used to scale so intensities fitted into the 8.2F format but now maybe it doesn't and takes advantage of the fact that SHELX programs should be able to read 9999999. from this format. What are the maximum and minimum in
For this case,
The five-number summary of the I column in
I guess it is scaled so that the maximum intensity is 999999.00?
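The fixed-width concern can be checked directly. The sketch below assumes the SHELX HKLF 4 layout of three 4-character integers followed by two 8.2F fields, and tests whether a value still fits in 8 characters once formatted with two decimals. All helper names are hypothetical, and the rescaling shown is only one possible way to bring values into range; it is not a description of what XDSCONV does.

```python
def fits_f8_2(value):
    """True if `value` occupies at most 8 characters with two decimals."""
    return len(f"{value:.2f}") <= 8

def hklf4_line(h, k, l, intensity, sigma):
    """Format one reflection in the assumed HKLF 4 layout (3I4, 2F8.2)."""
    return f"{h:4d}{k:4d}{l:4d}{intensity:8.2f}{sigma:8.2f}"

def rescale_to_fit(intensities, sigmas, limit=99999.99):
    """Apply one common scale so the largest magnitude does not exceed `limit`.

    I and sigma share the scale, so I/sigma is unchanged. Strongly negative
    intensities need an extra character for the sign and are not handled here.
    """
    largest = max(abs(v) for v in list(intensities) + list(sigmas))
    scale = min(1.0, limit / largest) if largest > 0 else 1.0
    return [i * scale for i in intensities], [s * scale for s in sigmas], scale

# Made-up values with a maximum like the one quoted in the thread
I = [999999.0, 10.0]
sig = [1000.0, 1.0]
I_scaled, sig_scaled, scale = rescale_to_fit(I, sig)

print(fits_f8_2(999999.00), fits_f8_2(99999.99))
print(hklf4_line(1, 2, 3, 123.456, 7.8))
```

A value of 999999.00 needs nine characters with two decimals, so it only fits an 8-character field if the decimals are dropped, which is consistent with the 9999999. form mentioned above.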
Probably I should change
possibly the file contains
The shape of those I(XDS) vs I(dials) plots is quite surprising though. For those are you using the
@dagewa I think your script should really filter the reflection table by
Yes, that seems sensible. Ok, I updated the script. I think maybe the robust fit line should only be calculated on the stronger intensities too. Plus sometimes calculating that fails when there are too many points. After I improve the script a bit more I may re-calculate plots for the examples above.
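One way to address both points, restricting the fit to stronger reflections and coping with very many points, is sketched below. The thread does not say which robust fit the script uses, so this swaps in a Theil-Sen style median of pairwise slopes with random subsampling of pairs as an assumption. The function name, the intensity cutoff and the pair limit are all hypothetical.

```python
import random
import statistics

def robust_slope(x, y, intensity, min_intensity=0.0, max_pairs=10000, seed=0):
    """Median of pairwise slopes of y against x for reflections with
    intensity >= min_intensity. Pairs are randomly subsampled when
    there are more than `max_pairs` of them."""
    pts = [(xi, yi) for xi, yi, ii in zip(x, y, intensity) if ii >= min_intensity]
    n = len(pts)
    if n < 2:
        raise ValueError("need at least two points above the intensity cutoff")
    rng = random.Random(seed)
    if n * (n - 1) // 2 <= max_pairs:
        pairs = ((i, j) for i in range(n) for j in range(i + 1, n))
    else:
        pairs = (tuple(rng.sample(range(n), 2)) for _ in range(max_pairs))
    slopes = []
    for i, j in pairs:
        dx = pts[j][0] - pts[i][0]
        if dx != 0:  # skip vertical pairs
            slopes.append((pts[j][1] - pts[i][1]) / dx)
    if not slopes:
        raise ValueError("all selected points share the same x value")
    return statistics.median(slopes)

# Made-up sigma values: y = 1.2 * x with one gross outlier
sig_dials = [float(v) for v in range(1, 11)]
sig_xds = [1.2 * v for v in sig_dials]
sig_xds[4] = 100.0
strong = [100.0] * 10  # all above the cutoff in this toy case

slope = robust_slope(sig_dials, sig_xds, strong, min_intensity=50.0)
print(slope)
```

Capping the number of pairs keeps the cost bounded for large reflection files, at the price of a slope that depends slightly on the random seed.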
XSCALE places the intensities on an absolute scale assuming 50% solvent. Output from CORRECT is on image scale.
@dagewa - I looked a bit more at this with our biotin ED datasets on Zenodo. I think there are 2 different issues here. The MX cases, where you are comparing one dataset processed with XDS and DIALS, are different to the ED case with 3 datasets. When you process the ED data are you scaling all three datasets together in DIALS and refining 1 set of error model parameters for all 3 datasets (
Does that explain the plots? Could you try adding
Here's my analysis. The data are
Process with this script. Colour series is orange, blue, green, pink, yellow for datasets 0-4 (xtal1 to xtal5)
Thanks @huwjenkins. I got caught up with some deadlines so haven't been able to look at this for a while, but I will pick it up again eventually.
I had a case where I could not solve a small molecule 3D ED structure with shelxt 2018/2, but it solved fine with data from XDS. The cause appears to be that in that version of shelxt:
The "absences" were slightly present for this data set, and due to rather low error estimates from DIALS, the correct space group was rejected.
This ties in with anecdotal evidence that DIALS error estimates generally seem too small, or equivalently, reported I/σ(I) seems too high.
All this sent me on an investigation into error estimation in DIALS compared to XDS, using the following script:
xia2 pipeline=dials and then with xia2 pipeline=3dii
compare_errors.py script
What follows are results from some of the data sets available in DIALS data. In summary, the XDS error estimates are always higher than the DIALS ones, but this varies between 1.2 times for thaumatin_i04 and 4.2 times for x4wide. No claim is made as to which is more "correct", but the discrepancies and some of the features in the plots seem surprising.
I would appreciate if anyone could double-check my analysis and comment on the observations.
fumarase
insulin
mpro_x0692
spring8_ccp4_2018
small_molecule_example
NB xia2 pipeline=3dii failed at pointless, but we had got far enough anyway
thaumatin_i04
x4wide