forked from AaronParsons/fringerate
-
Notifications
You must be signed in to change notification settings - Fork 0
/
refReport.txt
516 lines (421 loc) · 23.8 KB
/
refReport.txt
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
Thank you to the referee for a very detailed read-through of our
paper. All the comments were very helpful, and we have made
alterations to the text to address them. Details of our responses are
provided below.
----------------------------------------------------------------------------
On page 4, in the description of the FIR filter: A crucial
detail for fringe-rate filtering is the requirement to both
have a broad window function (function gamma) for most
effective filtering and to prevent mode mixing, while
simultaneously desiring a small window to satisfy the linear
time-dependence assumption (for Eq 3), which is also
desirable for other practical issues like calibration and
RFI mitigation. While these assumptions are mostly mentioned
during the derivations, the effect of the window size is not
discussed, but has some implications.
One trivial implication is visible in Fig. 3, which shows a
time kernel which is over 2 hours in time. Since in Ali et
al. 2015 night-time observations of 8.30 hour were used, at
face value it seems that in such a case almost 25% of the
data had to be discarded to properly filter the data. How
is this handled? Moreover, could the authors describe what
kind of windowing function is used, and how its size was
chosen?
Although 8.3 hrs of data were used in the final analysis, this time
range was selected after the application of a fringe-rate filter to a
longer (~12 hr) data set. Hence, the 8.3 hr window was not limited by
the boundary conditions to which the reviewer is referring. The
windowing function was chosen to be a blackman-harris, with width
chosen to suppress the time-domain kernel where the weighting falls
below 5% of the peak response. This windowing has a minor effect of
the shape of the effective fringe-domain response. This deviation
from optimal weighting has only a negligible effect on sensitivity,
and is balanced by the other merits of having a compact time-domain
kernel. Changes were made to the caption and the end of S3 to
clarify.
----------------------------------------------------------------------------
Some quantitative results are presented on Page 8 and 12. On
Page 8 this is the signal loss assuming a Gaussian random
field and the point-source gain of the sculpted beam. Page
12 describes the sensitivity improvement of the filtering,
which is convincing. However, none of these tests quantify
how much signal is mixed from one mode to another because of
the finite filter size. This could e.g. be visualized by
simulating a non-uniform (e.g. single mode) power spectrum
and measuring the reconstructed power spectrum. Since this
method is quite crucial for the EoR analyses, I think this
effect should be extensively & quantitatively validated.
The reviewer is correct that fringe-rate filtering creates
correlations between what would otherwise be independent modes sampled
as a function of time. The number of independent modes preserved for
power spectrum analysis corresponds directly to the number and
weighting of fringe-rate modes by the fringe-rate filter. Even for
short time-domain integrations, quantifying the number of sampled
modes is very tricky, as the time dependence of a baselines projection
(and hence, uv mode sampling) toward a patch of sky varies
substantially over the wide fields of view of low-frequency
interferometers such as PAPER and the MWA. Indeed, optimal
fringe-rate filtering addresses this spatially varying time dependence
more directly than standard square time-domain integrations might.
This trickiness is addressed in Parsons et al. (2014), where we
determine that to accurately determine the statistical interdependence
of these modes in either the fringe-rate filtering or square
time-domain integration case, it is necessary to use bootstrapping.
We echo that finding in a new paragraph after equation 35 in the text.
----------------------------------------------------------------------------
Related to this, on Page 8, col 1, 11th line before bottom
the authors write: "These compare well to the results shown
in Figure 7." This is arguable; Fig 6 is clearly different
from Fig. 7. The technique significantly enhances the
sensitivity of the power spectrum, as shown later, but by
itself having a beam which acts like Fig. 6 looks actually
worrying. This needs to be elaborated and should be
quantified.
We assume the reviewer refers to the artifacts along the vertical
axis. We mentioned in the text that these are artifacts associated
with reconstructing beams from sources spaced every 5 degrees in
declination, and are not associated with fringe rate filtering. To
avoid confusion, we echo that statement in the caption of Figure 6 as
well. Other than this reconstruction artifact, Figures 6 and 7 really
do compare well.
----------------------------------------------------------------------------
My final issue is that the paper is a bit long for what it
presents. Parts that were in particular somewhat distracting
from the main points of the paper were the discussion of the
normalization of the power spectrum estimator (Page 7 last
paragraph, and 8 up to the paragraph ending with "...with
the effective primary beam.") and the discussion showing
that optimal map making would filter high fringe rates (Page
16, under Eq 59), which of course is not surprising, as well
as that common uv-gridding employed for map making also
includes a low-pass filter, which is similar to fringe-rate
filtering. Hence, I suggest removing both parts, possibly
putting the PS normalization into an Appendix if the authors
feel it is relevant, and I recommend further reduction of
size if possible.
In response to the referee's suggestion, we had a critical look at the
material the referee pointed to, and discussed possible cuts. However,
we found that it was difficult to do so without compromising the
structure and comprehensibility of the paper. In particular: i) The
normalization of the power spectrum from a visibility-based estimator
has often been mistreated in the literature, partially due to early
forecasting papers that used top-hat beams in their calculations.
These mistreatments were only pointed out explicitly in the relatively
recent work of Parsons et al. (2014). Especially with the subtleties
of heterogeneous primary beams and fringe-rate filtered data (which is
what the normalization derivation in the present paper adds to the
literature), we think it is better to err on the side of being
explicit. ii) Indeed, the referee is correct in saying that Section 5
is in principle entirely encapsulated in Eq. (45), which in turn is
equivalent optimal mapmaking and uv-gridded imaging procedures. Thus,
one may in principle stop with Eq. (45), which really just states that
the data is being fit to a linear model. However, the point of Sec. 5
is that this rather opaque matrix expression can be understood more
intuitively in terms of fringe-rate filtering, which arises naturally
from optimal mapmaking. Sec. 5 is therefore an integral part of the
paper.
----------------------------------------------------------------------------
I have some further smaller issues which I detail below:
Page 1, col 1, first paragraph: "These differences ...
interferometers that only have moderate angular
resolution...", and later LOFAR is mentioned as one such
interferometer. This should be reworded a bit, since I would
not say that LOFAR has 'moderate' angular resolution, better
to say 'with less focus' or something similar.
We changed "that broadly fit this description" to "that broadly fit
some or all of this description"
Page 1, col 2, par 2: "In this paper, we extend ideas..."
Probably the paper by D. Roshi and R. Perley ("A New
Technique to improve RFI suppression in Radio
Interferometers", 2003,
http://arxiv.org/abs/astro-ph/0304493 ) is quite relevant as
well.
Citation has been added to the text.
Page 2, col 1, par 1: "...providing cleaner views of
systematics in the data" "Cleaner" is somewhat subjective --
they can be helpful for some systematics, but can actually
hide others. I would thus rather say "a different view"
DONE
Fig 1: It seems the arrows do not scale properly compared to
the lines of longitude. Caption: delta not defined.
DONE
Eq. 2: (and all Eq following) It might be useful to change
f(t'-t) in (t'-t)f, as the first looks like a function
evaluation.
DONE
4th line below Eq. 2: "...our transform at time t'=t." The
window function here is centred here, not the transform.
Also, it is centred on t, not on t'=t."
DONE. "thereby centering our transform at time $t^\prime = t$" -> "in
essence shifting the origin of our transform to time $t$"
Next line: "...the time-dependence of the visibility will
likely be...through the celestial sphere." The filter size
does not influence the time-dependence of the visibilities.
Reword. Also see my comments on the window size at this
point.
DONE. "If the characteristic width of $\gamma$ is relatively short,
one is effectively examining the visibility over short timescales,
during which its time-dependence will likely be dominated by features
on the sky moving relative to fringes, and not the movement of the
primary beam through the celestial sphere."
Below Eq. 3; worth mentioning explicitly that this assumes a
relatively small window size.
We agree that the caveat of a relatively small window size is
important to state. However, this assumption is discussed in the text
preceding Eq. (3), and we feel that it is probably not necessary to
mention the approximation both before and after the equation.
Eq. 4: It took me some time figuring out why the term
"nu/c(b_t x omega)r" is negative; the authors might help the
reader by already reversing the omega x b term in Eq. 3. The
'x' meaning 'multiply' on the beginning of the 2nd line is
somewhat confusing with cross product or convolution. Below
Eq 4: as it is written, gamma-tilde is the *inverse* FT of
gamma.
DONE.
Page 3, col 2, first word: "as advertised", somewhat awkward
word for a paper.
DONE. "as advertised" -> "as claimed"
Page 3, col 2, line 3: "shows contours". There are no
contours drawn...
DONE. "contours" -> "shaded bands"
Page 4, line 1: "that essentially averages together data." I
wouldn't say filtering is essentially averaging, they're
different operations. The authors could say something like
that it reduces entropy of the data making it better
compressible, if that is what the authors are trying to say.
DONE. Elaborating a little bit: "For example, if $w(f)$ is chosen in a
way that suppresses high fringe rates, the effect in the time domain
will be a low-pass filter that (among other features) has the rough
effect of averaging together data."
Page 4, col 1, par 2, line 3: "be used in for" bad sentence.
DONE. "be used in for" -> "be used for"
Sect 3, line 5: "its origins targeting" awkward sentence.
DONE. "In keeping with its origins targeting observations from the
PAPER array" -> "keeping with its origins as a tool for analyzing data
from the PAPER array"
Figure 3. Can the authors explain why the XX/YY matching
line has such a different shape, and is cut off at ~-0.7?
(possibly on P 12 where it is discussed) I assume this
implies that the fringe response shown is only for XX or YY?
Please specify. Is the high frequency in this fringe-rate
filter also a reason for the insufficient beam correction in
Fig. 8, or is it just the 1D nature of the fringe filter
which causes this?
DONE. XX/YY only aims to normalize the YY beam to the XX beam. This
ratio does not include the absolute beam response (which the other
curves include as optimal weighting), and so the curve has a
substantially different shape. The fringe responses are indeed XX,
and we now note this in the caption and on page 12. The cutoff at
-0.66 is the maximum negative fringe rate for the chosen array
latitude, but the point is taken that it cuts out slightly before the
other curves. This difference was a result of catching the numerical
divergence of a denominator going to zero. We have tweaked the
thresholds to push the cut closer to the actual maximal negative
fringe rate. The "high frequency", which we interpret to mean the
variation of this function with fringe rate, is not a limiting factor
in the beam matching; it is, as the reviewer speculates, just the 1D
nature of the fringe filter.
Page 5, col 2, line 8: "primary beam A_i(r)" and three lines
later: I think it is more logical to write A_i(r, nu).
DONE.
5 lines under Eq 8: "it is implicitly assumed that the
baseline length is given by b=u lambda", I don't understand
why defined this way, why not 'explicitly' say "and u =
b/lambda..." ?
DONE.
Footnote 6; I could not find this (easily) in Liu et al.
(2014b), could the authors specify a location in the paper
for this approximation?
DONE. Turns out to be Section III A.
Page 5, col 2, 9th line to last: "as an integral over
angles" -> solid angle
DONE
Page 6, line under Eq 14: "where nu21 := 1420", use approx symbol.
DONE.
Page 7, before Eq. 24: might clarify this is combining (17),
(22) and (23).
DONE.
Page 7, col 1, under Eq 26: "This is a rather remarkable
result, ..." -- I'm not sure why the authors find this
remarkable... Indeed, it can be described as somewhat
'counter-intuitive' when thinking about 'normal' visibility
averaging.
We only meant "remarkable" in the sense of "worthy of a remark because
it is counterintuitive", and indeed, in the next sentence we state
that the result is counterintuitive. However, we can see how
"remarkable" may sound strange in this context, and thus we have
changed to the blander description of our result as an "interesting"
one.
Page 8, col 1, par 2, line 3: "derviations" -> derivations.
DONE.
Page 8, col 1, 7th line before bottom: "As these simulations
illustrate...point-source simulations". I do not understand
this sentence, clarify. Are the authors saying the
difference between Fig 6 and 7 can be explained by the
'binning'? Isn't the binning equal in both Fig 6 and 7?
DONE. We perhaps mischaracterized this as "binning". We mean that the
binning and interpolation of the simulated source tracks, separated in
5-degree intervals in declination, are the dominant source of error.
We have revised this sentence to clarify.
Page 9, col 2, par 2, "The typical technique for..." Typical
for dishes, but I don't think this is 'typical' for aperture
arrays with a fixed element beam, which involve more
typically a multiplication of an inverse Mueller matrix
(e.g. as is done in AW-projection for LOFAR by Tasse et al.,
2012, http://arxiv.org/abs/1212.6178, or for the MWA as done
in the RTS as in Mitchel et al. 2008
http://arxiv.org/abs/0807.1912 and Offringa et al. 2014
http://arxiv.org/abs/1407.1943 ).
DONE. "The typical technique for..." -> "A commonly used technique..."
Page 9, col 2, par 3, line 10-13: "As such, for a chosen uv
coordinate, ... before summing." I am not sure what the
authors mean here: convolving the XX and YY visibilities in
uv-space (in 2d) with the Fourier transform of the full
inverse Mueller matrix image would completely correct that
particular mode for the beam, resulting in zero leakage on
average. This does not depend on overlap of kernels with
other visibilities. Clarify or remove.
We have added the following clarifying footnote to page 9:
In image domain, the baseline fringe pattern has been integrated
across the sky weighted by two different (XX, YY) beams. This means
that an interferometric baseline for these two polarizations samples
the uv plane convolved by two different kernels --- the Fourier
transforms of the XX, YY beams, respectively. Because the resulting
V_XX, V_YY visibilities no longer retain any direction-dependent
information (they were integrated over angle) V_XX and V_YY cannot be
re-weighted in a way that produces pure Stokes I response in all
directions, except in the trivial case that the YY beam differs from
the XX beam by a direction-independent multiplicative factor. Said
another way, given a single degree of freedom in the relative
weighting of V_XX and V_YY, it is not possible to guarantee a pure
Stokes I response in more than one direction.
It is common practice in synthesis imaging to grid visibilities in the
uv plane with convolving kernels that contain direction-dependent
information. Examples include W-projection kernels, the Fourier
transform of the primary beam used in optimal map making, and
(relevant here) the Fourier transform of the inverse of
direction-dependent Mueller matrices. Indeed, in the case of a
completely sampled uv plane, the inverse Mueller beam kernel would
perfectly match the XX and YY beams. However, as the sampling of the
uv plane becomes sparser, the removal of samples removes degrees of
freedom in weighting as a function of direction. In the limiting case
of a very sparse array where the convolving kernels of uv samples do
not overlap, we revert to the single direction-independent weighting
factor above.
Another way of thinking about this is to realize that, by gridding
visibilities in the uv plane with a convolving kernel, we are
attempting to reassign sky intensity from the visibility, which was
integrated over in equation 1, back to its original location so that
it may be re-weighted to match the XX and YY polarization responses.
This inversion is only well posed in the case of a completely sampled
uv plane. As we lose uv samples, we lose the ability to reconstruct
the original sky intensity, and hence, to perfectly match the XX and
YY polarization responses as a function of direction. In the case of
a sparsely sampled uv plane, it is only possible to construct a pure
Stokes I response as a function of direction if one has sufficient
additional information about the sky (for example, if it sparsely
populated by point sources) that one can reconstruct the missing modes
and hence, fully invert the uv plane.
Page 9, col 2, par 4, line 11-..: "Typically, earth-rotation
synthesis.." somewhat same comment as above; I don't know
what the authors mean, but arrays like MWA and LOFAR can
perform adequate beam correction of snapshots, so this
statement seems not correct.
The reviewer is correct to point out that our statement should be
restricted to diffuse structure (which is more localized in the uv
plane). Synthesis arrays do commonly correct for beams responses
toward point sources. We have added a footnote with this
clarification.
Fig 8. caption: "XX-YY/2" -> "(XX-YY)/2"
DONE
Page 11, col 1, line 8: "In the case of... products" this is
dependent on the above points, and thus needs to be changed
too if above points can't be clarified.
This point has been addressed in the clarifications above.
Table 1: what is meant with "Effective integration time" ?
(in particular how the last number can be 5570 with less
sensitivity)
We have clarified in the text that 'effective' means the noise
equivalent integration time. As for the second question, narrower
fringe-rate weightings require longer integration times, but
restricting the fringe-rate response to a narrower region does not
necessarily improve sensitivity. In fact, any deviation from optimal
weighting, including longer integration times, by definition will
reduce sensitivity. Clarifying text has been added on page 12 to this
effect.
Page 14, last line of par 1: "...the most natural". It is
mentioned quite often in the paper that the fringe domain is
'the (most) natural domain'. One could equally argue that
the uv plane or some kind of weighted image domain is a
natural domain for averaging. I don't object to "a natural
domain", but "the most natural" seems subjective. Also, it
is mentioned a bit too often.
DONE. (But really only just that one instance)
Eq. 45: define adjoint operator.
DONE
Eq. 49: the deltas here should be called delta^D to
distinguish them from declination.
DONE, but only for the second copy, since the first is a Kronecker
delta.
Eq. 50: Why is sigma within the integral and summation?
Simply to save space.
Under Eq. 52: "This is still not...and all t." While this is
true, the non-optimality this causes for telescopes with a
small tracking beam (~a few deg) is negligible. In
particular, in 3rd line under Eq. 53: "...is an optimal
way..only if"; the above proof shows that it is an optimal
way 'if', and not 'only if', so remove 'only'. Also, it
would be helpful to mention that a small beam is sufficient
too.
We have removed the "only", and have added that the small primary beam
will automatically allow the flat-sky approximation (and hence the
time-averaging procedure) to be valid.
Sect 5.3 (as well as the abstract): I think it should be
stressed that these derivations are much more applicable to
drift-scan observations (or with a highly varying beam, if
you wish). In tracking observations, the improvement that
can be made with such weighting is insignificant. This
comment also applies to Sec 6, line 7: "...are almost always
sub-optimal..." ; I think that should specify 'for
drift-scan observations'. Moreover, these weightings could
also be applied by the a-projection technique, which would
actually be more accurate, because it can weight(/correct)
in two dimensions, so the authors should make some remark
about a-projection.
Thank you to the referee for pointing out that we neglected to
mention/cite A-projection. With its similarities to the optimal
mapmaking approach, we have added some citations (Bhatnagar et al.
2008, 2013, Tasse et al. 2013) to the discussion where we introduce
optimal mapmaking. We have also added the phrase "particularly for
wide-field, drift-scan instruments" in the portion of Sec. 6 referred
to by the referee. We have decided to leave the abstract unchanged
because it remains true that "fringe-rate filtering arises naturally
in minimum variance treatments", and there we make no reference to the
(possible) non-optimality of the other techniques. (We mention
potential systematics, but not any statistically non-optimalities in
the sense of an increased variance in the final results). Note that
the two-dimensional weighting that the referee mentions is implemented
in our fringe-rate framework. In Eq. (61), for example, one direction
of the weighting is implemented by the fringe-rate weighting
(everything within the summation), while the other direction of the
weighting is performed in image space (via the B(delta) term near the
beginning of the expression).
Page 17, col 2, line 9, "..while avoid [should be avoiding]
many...". Same comment to 'cleaner analysis' as before ->
'different analysis'. Also, I think that 'analysis-induced
systematics associated with gridding' is too strong. It is
true to say that it is beneficial to avoid gridding, but not
that gridding induces many systematics. Radio astronomers
know well how to grid, and dominating systematics come from
other things.
We have softened the language, switching to saying that we avoid
"potential" analysis-induced systematics associated with gridding.
Page 17, col 2, final sentence: "we anticipate that
fringe-filtering is likely ... [important for] ... the SKA".
Currently, fringe filtering does not seem applicable to the
current SKA low configuration, and the planned technique
seems to be a-projection (e.g. Cornwell 2012). So, it is
currently not likely that fringe filtering will be used for
the SKA.
DONE. "likely be an important analysis tool" -> "may be an important
analysis tool"