Download and reformat benchmark data set from NASA/WMAP page #93

Open
bnord opened this issue Jul 5, 2023 · 34 comments

@bnord
Contributor

bnord commented Jul 5, 2023

https://lambda.gsfc.nasa.gov/product/foreground/fg_sz_cluster.html

  1. Download the simulated SZ halo catalogs
  2. Do they connect to submaps that are also at that link?
@bnord bnord added the enhancement New feature or request label Jul 5, 2023
@kbanker
Contributor

kbanker commented Jul 17, 2023

I've been working on this today: I downloaded the simulated SZ halo catalogs and have been playing around with them. I think the catalogs connect to the submaps at the same link, but I haven't confirmed it independently yet. The submap covers only an octant of the sky, but my code keeps treating it as a full-sky map, so I need to convert the long/lat positions first.

@bnord
Contributor Author

bnord commented Jul 17, 2023

Could you post some stats of the catalog? E.g., distributions/histograms of masses, redshifts, and SZ data

@bnord
Contributor Author

bnord commented Jul 17, 2023

Does it look like the octant of the sky will need to be sliced around individual objects?

@bnord
Contributor Author

bnord commented Jul 17, 2023

We did a paper on classification using this data in 2021: https://arxiv.org/abs/2102.13123

There may be some good info on data processing there.

@bnord
Contributor Author

bnord commented Jul 17, 2023

Could we plan to put this into an h5 data format for posterity?

@kbanker
Contributor

kbanker commented Jul 18, 2023

Screen Shot 2023-07-18 at 6 48 56 PM
Screen Shot 2023-07-18 at 6 49 03 PM
Screen Shot 2023-07-18 at 6 49 11 PM
Screen Shot 2023-07-18 at 6 49 20 PM
These are a few histograms showing parts of the dataset. Notice the log scale for all of them other than the redshift.
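
For anyone reproducing histograms like these, here is a minimal sketch of log-spaced binning with NumPy. The masses are random stand-in values, not the catalog, and the name `m200` and the bin range are assumptions for illustration only.

```python
import numpy as np

# Stand-in masses drawn uniformly in log10(M); NOT the real catalog values.
rng = np.random.default_rng(1)
m200 = 10 ** rng.uniform(13.5, 15.0, size=1000)

# Log-spaced bin edges, so each bin spans an equal width in log10(M).
edges = np.logspace(13.5, 15.0, 16)
counts, _ = np.histogram(m200, bins=edges)

# Geometric bin centers are the natural x-positions on a log axis.
centers = np.sqrt(edges[:-1] * edges[1:])
```

Redshift would presumably use plain linear bins instead, matching the note about the log scale.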

@kbanker
Contributor

kbanker commented Jul 18, 2023

> Does it look like the octant of the sky will need to be sliced around individual objects?

What exactly do you mean by this? I don't believe the octant needs to be sliced at all. In fact, there might be no need to slice the full-sky map either: from some more reading, I believe the full-sky map is generated from the single octant via reflections, so we can probably just work with that one octant. I'm still trying to verify this with the data, though.

@bnord
Contributor Author

bnord commented Jul 18, 2023

We'll need a little cut-out image of each individual halo/cluster to use as input to the inference process. Is that available already, or will we need to make cut-outs from already-existing larger maps?

@kbanker
Contributor

kbanker commented Jul 19, 2023

We would need to do cut-outs from the larger octant maps, since the maps are all full-size.
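
A minimal sketch of what such a cut-out could look like, assuming the octant map has already been projected onto a flat 2D pixel grid and the halo position converted to a (row, col) pixel. The function name `cutout` and its parameters are hypothetical, and this is not the code used in the thread.

```python
import numpy as np

def cutout(flat_map, row, col, half_width):
    """Return a square (2*half_width + 1) stamp centred on pixel (row, col).

    Pixels that fall outside the map are filled with NaN.
    """
    size = 2 * half_width + 1
    out = np.full((size, size), np.nan)
    r0, r1 = row - half_width, row + half_width + 1
    c0, c1 = col - half_width, col + half_width + 1
    # Clip the requested window to the map boundaries.
    rs0, rs1 = max(r0, 0), min(r1, flat_map.shape[0])
    cs0, cs1 = max(c0, 0), min(c1, flat_map.shape[1])
    out[rs0 - r0:rs1 - r0, cs0 - c0:cs1 - c0] = flat_map[rs0:rs1, cs0:cs1]
    return out

# Toy 10x10 "map" whose pixel values encode their own position.
demo_map = np.arange(100, dtype=float).reshape(10, 10)
stamp = cutout(demo_map, row=5, col=5, half_width=2)
edge_stamp = cutout(demo_map, row=0, col=0, half_width=1)
```

Padding with NaN rather than zeros keeps edge clusters from silently biasing later averages.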

@kbanker
Contributor

kbanker commented Jul 27, 2023

I managed to figure out how to do cut-outs and put the maps in a comparable state to our sims. Here's an example:
Screen Shot 2023-07-27 at 4 41 03 PM

A couple of things to note: since these are simulated maps, this specific cutout is from the map that describes only the SZ effect, so it has no noise or contamination from galactic dust or even from the CMB. However, it does include the kSZ effect, which our simulations do not include. We may want to use a summed map of the tSZ effect + CMB + noise for a better comparison to our simulations.

@bnord
Contributor Author

bnord commented Jul 27, 2023

I can imagine there being kSZ-related physics in the simulation, but shouldn't the kSZ signal itself be a different signal and thus not in the map? Maybe I'm forgetting some of my SZ physics.

@kbanker
Contributor

kbanker commented Jul 27, 2023

> I can imagine there being kSZ-related physics in the simulation, but shouldn't the kSZ signal itself be a different signal and thus not in the map? Maybe I'm forgetting some of my SZ physics.

It is a different signal, but the map that I had downloaded and worked with was the tSZ map + the kSZ map + relativistic corrections, which is why the map shown has both kSZ and tSZ signals. The map that I'm switching to now is just the tSZ map.

@bnord
Contributor Author

bnord commented Jul 27, 2023

ah cool cool. Yeah, let's go with the pure tSZ map for now.

Later, we can add in kSZ when we want to get spicy and upgrade.

@kbanker
Contributor

kbanker commented Jul 28, 2023

I downloaded the new pure tSZ map, and here's a comparison of a similar-sized cluster from this data:
Screen Shot 2023-07-28 at 4 48 10 PM
vs from our simulation
Screen Shot 2023-07-28 at 4 49 08 PM

@bnord
Contributor Author

bnord commented Jul 31, 2023

very cool.

I'm guessing ours with noise will look similar.

@evavagiakis What do you think of the following for a diagnostic plot (for benchmark comparisons): choose haloes in our simulation and the NASA sim that are comparable in some variables (e.g., mass and redshift), and then subtract the two images and divide by one of the images (pixel by pixel)? This would give a residual map for each of our objects. We could potentially further summarize by taking the average over the pixels in the plot, or looking at the distribution over the pixels in the residual image.
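
A minimal sketch of that pixel-by-pixel diagnostic, assuming both stamps share the same shape and units. The sign convention (ours minus reference, divided by the reference) and the `floor` guard against near-zero pixels are assumptions; the proposal above does not fix either.

```python
import numpy as np

def fractional_residual(ours, reference, floor=1e-12):
    """Pixel-by-pixel (ours - reference) / reference.

    Pixels where |reference| <= floor are set to NaN instead of dividing by ~0.
    """
    ours = np.asarray(ours, dtype=float)
    reference = np.asarray(reference, dtype=float)
    resid = np.full(reference.shape, np.nan)
    ok = np.abs(reference) > floor
    resid[ok] = (ours[ok] - reference[ok]) / reference[ok]
    return resid

# Toy 2x2 stamps standing in for one matched halo pair.
ours = np.array([[-40.0, -20.0], [-10.0, 0.0]])
ref = np.array([[-50.0, -20.0], [-8.0, 0.0]])
resid = fractional_residual(ours, ref)

# Single-number summary that ignores the masked pixels.
mean_resid = np.nanmean(resid)
```

Because a pure tSZ stamp falls toward zero in the outskirts, a fractional residual can blow up there, so the masked pixels and the choice of denominator are worth deciding explicitly.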

@evavagiakis
Collaborator

Sounds good to me. If we can do a component-by-component residual plot (just y, or just CMB, or just noise, for example), that might help in diagnostics as well. An average over the pixels sounds like the aperture photometry filter to me, which we can also use for comparison.
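
For reference, a minimal sketch of one common form of aperture photometry: the mean inside a disk minus the mean in the surrounding equal-area annulus (outer radius sqrt(2) times the disk radius). Whether this matches the filter meant here is an assumption, and `aperture_photometry` is a hypothetical name.

```python
import numpy as np

def aperture_photometry(stamp, radius_pix):
    """Mean inside a disk minus mean in the annulus out to sqrt(2) * radius."""
    ny, nx = stamp.shape
    yy, xx = np.indices(stamp.shape)
    # Radial distance of every pixel from the geometric center of the stamp.
    r = np.hypot(yy - (ny - 1) / 2.0, xx - (nx - 1) / 2.0)
    disk = r <= radius_pix
    ring = (r > radius_pix) & (r <= np.sqrt(2.0) * radius_pix)
    return stamp[disk].mean() - stamp[ring].mean()

# Toy stamp: a uniform -10 disk of radius 4 pixels on a zero background.
yy, xx = np.indices((21, 21))
blob = np.where(np.hypot(yy - 10, xx - 10) <= 4, -10.0, 0.0)
signal = aperture_photometry(blob, radius_pix=4)

# A constant map should give zero, since the annulus subtracts the background.
flat = aperture_photometry(np.full((21, 21), 3.0), radius_pix=4)
```

Subtracting the annulus removes any large-scale offset, which is one reason this statistic is less sensitive to a differently generated CMB than a plain pixel average.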

@kbanker
Contributor

kbanker commented Aug 1, 2023

Here's another cluster comparison + residuals. I've been comparing clusters with just the tSZ signal from the data, and just the dT map from our sims, but I'm wondering if I should add noise/beam convolution? I don't think it makes sense to compare with the CMB included, as it's always generated differently and could cause the residuals to be really large.
Screen Shot 2023-08-01 at 5 04 01 PM
Screen Shot 2023-08-01 at 5 04 07 PM
Screen Shot 2023-08-01 at 5 04 12 PM

@kbanker
Contributor

kbanker commented Aug 1, 2023

The same cluster at 148 GHz
Screen Shot 2023-08-01 at 5 24 35 PM
Screen Shot 2023-08-01 at 5 24 44 PM
Screen Shot 2023-08-01 at 5 25 33 PM

@kbanker
Contributor

kbanker commented Aug 1, 2023

Another cluster at 148 GHz
Screen Shot 2023-08-01 at 5 30 15 PM
Screen Shot 2023-08-01 at 5 30 24 PM
Screen Shot 2023-08-01 at 5 30 31 PM

@kbanker
Contributor

kbanker commented Aug 1, 2023

I'm still going to continue looking at more examples, but based on these 2 tests there doesn't seem to be a consistent high or low bias in our profile: our sim came out low in the first one but high in the second.

@evavagiakis
Collaborator

We could start looking at sample averages, so for example take a set of sources with masses within one of the mass bins in Battaglia 2012, generate our sims for the same (z, M) sample, and either stack (average) the maps together and look at the residual between those two stacked maps, or look at the average of the residuals between each of those submap pairs (and plot a histogram of the residuals, either from a central value or from aperture photometry). Since the B12 profiles are fits to average profiles, it wouldn't surprise me too much if there's some scatter as long as the scatter is around 0.
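
A minimal sketch of the two summaries described above, using random stand-in stamps rather than real B12 or benchmark maps. The array names, stamp size, and noise levels are all hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
n, size = 200, 15

# Stand-in stamps: "ours" plus extra scatter plays the role of the benchmark.
ours = rng.normal(-30.0, 1.0, size=(n, size, size))
ref = ours + rng.normal(0.0, 5.0, size=(n, size, size))

# Option 1: stack (average) each sample, then take the residual of the stacks.
resid_of_stacks = ours.mean(axis=0) - ref.mean(axis=0)

# Option 2: take the residual of each submap pair, then average.
mean_of_resids = (ours - ref).mean(axis=0)

# Histogram of the per-pair residual at the central pixel.
central = (ours - ref)[:, size // 2, size // 2]
counts, edges = np.histogram(central, bins=20)
```

For a plain difference the two options agree by linearity, so they only differ if the residual is fractional (divided by one of the maps) before averaging. The per-pair histogram is where the scatter around 0 would show up.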

I'd also like to understand better what is present in the Sehgal Compton-y maps (any instrument noise modeled in? beam assumptions?) to better advise on whether we should be beam convolving/adding noise ourselves in this comparison. There's probably a paper describing what is included; could you link that here if you have it?

@kbanker
Contributor

kbanker commented Aug 2, 2023

I think the Sehgal sims used N-body simulations, but I don't see any mention of instrument noise/beam effects in the paper: https://ui.adsabs.harvard.edu/abs/2010ApJ...709..920S/abstract

I think getting an average of the residuals is doable, so I think I'll do that for one of the mass bins today.

@bnord
Contributor Author

bnord commented Aug 2, 2023

When we used this data set in the past, I think we added our own noise and other stuff. I think there's some code for that here: https://github.com/deepskies/deepsz

@kbanker
Contributor

kbanker commented Aug 2, 2023

Here's the average of 100 loops, where each makes a map of the residuals for that (z, M, R). This is the average of those residual maps. Specifically, this is for the mass bin 1.1e14 < M200 < 1.7e14 solar masses.
Screen Shot 2023-08-02 at 3 38 40 PM

@kbanker
Contributor

kbanker commented Aug 2, 2023

Here it is with 1000 loops, but I think this makes it obvious that there is a centering error: the Sehgal sims are centered but ours are off by 1 pixel. I'm going to work on fixing that now.
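
One way to check for a whole-pixel offset like this is to compare the peak positions of the two stacks. The sketch below is an illustration with toy Gaussian blobs, not the fix that was applied here, and `peak_offset` and `recenter` are hypothetical helpers.

```python
import numpy as np

def peak_offset(stack, reference_stack):
    """(d_row, d_col) of the |signal| peak in `stack` relative to the reference."""
    p = np.unravel_index(np.argmax(np.abs(stack)), stack.shape)
    q = np.unravel_index(np.argmax(np.abs(reference_stack)), reference_stack.shape)
    return int(p[0] - q[0]), int(p[1] - q[1])

def recenter(stack, offset):
    """Shift `stack` by whole pixels so its peak lines up with the reference.

    np.roll wraps around the edges, so this is only sensible for small offsets.
    """
    return np.roll(stack, shift=(-offset[0], -offset[1]), axis=(0, 1))

# Toy stacks: identical negative blobs, ours displaced by one column.
yy, xx = np.indices((21, 21))
ref_stack = -np.exp(-((yy - 10) ** 2 + (xx - 10) ** 2) / (2 * 2.0 ** 2))
our_stack = -np.exp(-((yy - 10) ** 2 + (xx - 11) ** 2) / (2 * 2.0 ** 2))

offset = peak_offset(our_stack, ref_stack)
fixed = recenter(our_stack, offset)
```

A dipole pattern in the averaged residual map is the usual visual signature of this kind of offset; fixing it at the source (where the cut-out center is computed) is preferable to shifting after the fact.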
Screen Shot 2023-08-02 at 4 03 32 PM

@bnord
Contributor Author

bnord commented Aug 2, 2023

I'm glad to see the smoothing (reduction in noise)

@kbanker
Contributor

kbanker commented Aug 3, 2023

After fixing the centering, I ran the loops again for the mass bin 1.1e14 < M200 < 1.7e14 solar masses, and it seems as though the signal from the Sehgal sims is generally higher by an average of 5.6 uK, and the Sehgal sims have a larger spread, likely due to the substructures included.
Screen Shot 2023-08-03 at 3 51 43 PM

@bnord
Contributor Author

bnord commented Aug 3, 2023

How big a difference do we expect for haloes in that mass range?
Have we figured out if $dT$ and $\mu\mathrm{K}$ are the right variables and units to be using here?
Could you remind me what to expect when it comes to the absolute units for clusters at a given mass?
Should we also plot our scaling relations of mass for TSZ to make sure we know what to expect?

@bnord
Contributor Author

bnord commented Aug 3, 2023

Since this issue is about downloading and formatting the benchmark data set, we should probably move this analysis to a different issue or to a GitHub discussion.

@kbanker
Contributor

kbanker commented Aug 3, 2023

I think the central tSZ signal that we expect to see in that mass range is approximately 30-50 uK, so an average of 5.6 uK is not too bad. I do think this would still be dT and uK, since we are talking about the difference in temperature that the tSZ signal creates from the background, but I'm not sure?
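
On the units question, here is a minimal sketch of the standard non-relativistic conversion from Compton-y to a temperature shift, dT = y * T_CMB * (x coth(x/2) - 4) with x = h nu / (k_B T_CMB). The function names are hypothetical, and the T_CMB value of 2.7255 K is an assumption about which value to adopt.

```python
import math

H = 6.62607015e-34   # Planck constant [J s]
K_B = 1.380649e-23   # Boltzmann constant [J/K]
T_CMB = 2.7255       # CMB temperature [K] (assumed value)

def tsz_spectral_factor(freq_hz):
    """Non-relativistic tSZ factor f(x) = x * coth(x / 2) - 4."""
    x = H * freq_hz / (K_B * T_CMB)
    return x / math.tanh(x / 2.0) - 4.0

def y_to_dT_uK(y, freq_hz):
    """Temperature shift in microkelvin for Compton parameter y."""
    return y * tsz_spectral_factor(freq_hz) * T_CMB * 1e6

# Example: a central y of 1e-5 observed at 148 GHz.
dT_148 = y_to_dT_uK(1e-5, 148e9)
```

At 148 GHz the factor is negative, so the tSZ signal is a decrement, and a central y of order 1e-5 gives a few tens of uK in magnitude. That is the same order as the 30-50 uK quoted above, read as an absolute value.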

I'm unclear on what you mean by the absolute units for clusters at a given mass. Also, how would we plot our scaling relations for mass vs tSZ given that the profile is a function of mass, radius and redshift? I could plot mass vs tSZ signal at a specific z, R200, and radius, if that's what you mean?

@kbanker
Contributor

kbanker commented Aug 3, 2023

> Since this issue is about downloading and formatting the benchmark data set, we should probably move this analysis to a different issue or to a GitHub discussion.

I just saw this but have to agree. Do you know of a good way to get all the comments from here into a new issue/discussion, or should we maybe just rename this issue to something that encompasses downloading + analyzing the data set?

@bnord
Contributor Author

bnord commented Aug 3, 2023

I think it's okay to create a new issue and reference this one.
We could also re-name some things. I don't think I have a big preference.

But we should probably split these tasks up so that we can keep track a little better.

@bnord
Contributor Author

bnord commented Aug 3, 2023

You answered my question about the absolute value being 30-50.

The scaling relation would have to be something like

  1. for a given redshift range, plot mass vs TSZ at the center of the cluster (so literally central pixel)
  2. for a given redshift range, plot mass vs TSZ within some finite aperture (this is what Elaine and Eve are working on; aperture photometry)
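
A minimal sketch of how either version could be tabulated before plotting: select a redshift range, bin by mass, and take the median signal per bin. The `signal` array can hold either the central-pixel value or an aperture measurement. The function name, the toy catalog values, and the bin edges are all hypothetical.

```python
import numpy as np

def binned_median(mass, signal, z, z_range, mass_edges):
    """Median signal per mass bin for objects with z_range[0] <= z < z_range[1]."""
    sel = (z >= z_range[0]) & (z < z_range[1])
    # np.digitize returns 1 for the first bin, so shift to 0-based indices.
    idx = np.digitize(mass[sel], mass_edges) - 1
    out = np.full(len(mass_edges) - 1, np.nan)
    for i in range(len(out)):
        in_bin = idx == i
        if in_bin.any():
            out[i] = np.median(signal[sel][in_bin])
    return out

# Toy catalog: five haloes, one of which falls outside the redshift range.
mass = np.array([1.0e14, 1.2e14, 3.0e14, 4.0e14, 2.0e14])
z = np.array([0.20, 0.25, 0.30, 0.90, 0.22])
signal = np.array([-10.0, -12.0, -30.0, -40.0, -20.0])  # e.g. central dT in uK

medians = binned_median(mass, signal, z,
                        z_range=(0.1, 0.5),
                        mass_edges=np.array([1e14, 2e14, 5e14]))
```

Running the same function on both our sims and the benchmark catalog would give two curves to overlay, one per redshift range.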

@kbanker
Contributor

kbanker commented Aug 4, 2023

We can continue in #123. I thought a discussion might be better than a new issue, since that seems more appropriate for this sort of broad analysis.
