This code base was developed in VSCode (on Ubuntu 22.04) within the context of an Anaconda 3 environment named debingest. This environment is configured to support Python 3.8 and the libraries on which the code is dependent.
To set up the debingest environment, having first cloned the GitHub repo, open a Terminal, navigate to this directory and run the following command:
$ conda env create -f environment.yml
You will need to activate this environment whenever you wish to use the ingest pipeline, with the following command:
$ conda activate debingest
The role of the pipeline is to prepare data for dEB analysis and fitting with JKTEBOP.
The entry point for this pipeline is ingest.py, with example usage shown below. The first example shows running a target ingest based on the configuration given in an existing config json file.
$ python3 ingest.py examples/cw_eri.json
This second example shows how to generate a new default ingest configuration json file. Once generated you will need to edit this file to set up the configuration specific to your target.
$ python3 ingest.py -n examples/new_sys.json
If you first run chmod +x ingest.py (or equivalent) in the terminal you remove the need to specify python3 whenever you run ingest.py.
An example of a target's configuration json is shown below with the parameters broadly listed in the order they are used.
{
"target": "CW Eri",
"sys_name": "CW Eridani",
"prefix": "cw_eri",
"output_dir": "drop/cw_eri",
"sectors": [
4,
31
],
"flux_column": "sap_flux",
"exp_time": "short",
"quality_bitmask": "hardest",
"quality_masks": [
[58420.00, 58423.00]
],
"bin_time": null,
"period": 2.72837,
"plot_lc": true,
"plot_fold": true,
"polies": [
{ "term": "sf", "degree": 1, "gap_threshold": 0.5 }
],
"trim_masks":[
],
"fitting_params": {
"qphot": 0.836,
"L3": 0.080,
"LD_A": "pow2",
"LD_B": "pow2",
"LD_A1": 0.64,
"LD_B1": 0.64,
"LD_A1_fit": 1,
"LD_B1_fit": 1,
"LD_A2": 0.47,
"LD_B2": 0.50,
"LD_A2_fit": 0,
"LD_B2_fit": 0
}
}
Many of these configuration parameters are optional and may be removed or set to null if the default behaviour is required.
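As a minimal sketch of treating "omitted" and "null" the same way when reading a config, the snippet below uses a hypothetical helper, get_setting, which is not taken from ingest.py. The sap_flux fallback reflects the flux_column default described in this document; everything else is illustrative.

```python
import json


def get_setting(config, key, default=None):
    """Return config[key], or the default if the key is missing or null."""
    value = config.get(key)
    return default if value is None else value


# A cut-down config: flux_column is omitted and bin_time is explicitly null.
config = json.loads('{"target": "CW Eri", "bin_time": null}')

flux_column = get_setting(config, "flux_column", "sap_flux")  # key omitted
bin_time = get_setting(config, "bin_time")                    # key set to null
```

Both cases fall through to the default, so a config file only needs to list the values that differ from the default behaviour.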
The time values for quality_masks, trim_masks or polies date ranges are interpreted as BTJD (if < 40 000), reduced JD (< 2.4e6) or JD (>= 2.4e6), all with the scale matching the corresponding light-curve.
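The thresholds above could be applied as in the following sketch, which converts any of the three forms to a full JD value. The function name to_jd is hypothetical and the pipeline's own conversion code may differ; the offsets are the standard definitions (BTJD = BJD - 2457000, reduced JD = JD - 2400000). Only the time format is handled here, not the time scale.

```python
BTJD_OFFSET = 2_457_000.0        # BTJD = BJD - 2457000
REDUCED_JD_OFFSET = 2_400_000.0  # reduced JD = JD - 2400000


def to_jd(value):
    """Interpret a config time value as BTJD, reduced JD or JD and return full JD."""
    if value < 40_000:
        return value + BTJD_OFFSET        # treated as BTJD
    if value < 2.4e6:
        return value + REDUCED_JD_OFFSET  # treated as reduced JD
    return value                          # already a full JD


# The same instant expressed in each of the three forms.
as_btjd = to_jd(1420.0)
as_reduced = to_jd(58420.0)
as_full = to_jd(2458420.0)
```

All three calls return 2458420.0, which is why a mask written as [58420.00, 58423.00] can be matched against a light-curve whose times are held as BTJD.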
This section describes the ingest pipeline's stages and how the parameters shown above are used. Each major stage, 1 to 6, is applied to all of the target's matching sectors/light-curves before moving on to the next stage.
The target is a compulsory search identifier suitable for locating your target in the MAST portal (object name or TIC are known to work). The sys_name is an optional name for use in plots and diagnostic messages which will default to the target value if omitted.
The optional prefix and output_dir values are used to identify where output files are written and how they're named. The prefix is prepended to the name of every output file. If omitted, the prefix will be derived from the sys_name and the output dir will be 'staging/prefix/'.
The target, sectors and exptime are used when querying MAST for available timeseries data assets. The sectors and exptime are both optional and, if not given, are assumed to be equivalent to 'any'. Suitable values for exptime are long, short, fast or a numeric value in seconds, with 'short' being appropriate for TESS's 120 s cadence light-curve data.
The flux_column may be set to sap_flux (the default value) or pdcsap_flux to indicate the source of the flux data to be used.
The ingest pipeline makes extensive use of the Lightkurve library. For more information on the target, sectors, exptime, flux_column and quality_bitmask values see the Lightkurve search and download documentation.
The optional quality_bitmask and quality_masks are used to mask out poor quality data from a light-curve prior to processing. The quality_bitmask may be set to none, default, hard, hardest or a numeric bitmask to be applied against the light-curves' QUALITY flag. The quality_masks are time ranges (from, to) over which all data will be masked from subsequent processing.
The quality_masks and trim_masks both take zero or more two-item arrays, each giving the start and end of a time range. For example, the following defines a pair of ranges from JD 2451005 to 2451007 and 2451020 to 2451022 (inclusive):
[[51005.0, 51007.0], [51020.0, 51022.0]]
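The effect of such a list of ranges can be sketched as below. The helper apply_time_masks is hypothetical and works on plain lists, whereas the pipeline itself operates on Lightkurve light-curves. It also assumes the times and the mask ranges are already in the same time format.

```python
def apply_time_masks(times, fluxes, masks):
    """Drop every point whose time lies within any (start, end) range, inclusive."""
    kept = [
        (t, f)
        for t, f in zip(times, fluxes)
        if not any(start <= t <= end for start, end in masks)
    ]
    return [t for t, _ in kept], [f for _, f in kept]


times = [51004.0, 51005.0, 51006.5, 51008.0, 51021.0, 51023.0]
fluxes = [1.00, 0.99, 0.98, 1.01, 0.97, 1.00]
masks = [[51005.0, 51007.0], [51020.0, 51022.0]]

kept_times, kept_fluxes = apply_time_masks(times, fluxes, masks)
```

Only the points at 51004.0, 51008.0 and 51023.0 survive; the point at 51005.0 is removed because the range ends are inclusive.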
The optional bin_time parameter may be set to a time value (in seconds) to which the light-curve data will be (re)binned. This will be ignored if not set or it is given a value which is less than or equal to the exptime of the data as downloaded.
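The rule for when binning is skipped can be written as a small guard. This is a sketch only: effective_bin_time is a hypothetical name, and both arguments are assumed to be in seconds.

```python
def effective_bin_time(bin_time, exptime):
    """Return the bin width in seconds, or None when binning should be skipped."""
    if bin_time is None or bin_time <= exptime:
        return None
    return bin_time


not_set = effective_bin_time(None, 120)   # bin_time omitted or null
too_short = effective_bin_time(120, 120)  # no coarser than the downloaded data
rebinned = effective_bin_time(600, 120)   # 120 s data rebinned to 600 s
```

Only the last call returns a bin width; the first two leave the light-curve at its downloaded cadence.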
Now that each light-curve is downloaded, opened, masked and optionally binned, the fluxes are detrended and used to derive relative magnitudes.
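The flux to magnitude step can be illustrated with the usual relation m = -2.5 log10(F / F_ref). In this sketch the reference flux is taken to be the median, which is an assumption; the detrending step is not shown and the pipeline's own zero point may be chosen differently.

```python
import math
import statistics


def relative_magnitudes(fluxes):
    """Convert fluxes to magnitudes relative to the median flux (assumed zero point)."""
    reference = statistics.median(fluxes)
    return [-2.5 * math.log10(flux / reference) for flux in fluxes]


# Two out-of-eclipse points and one where the flux has halved.
mags = relative_magnitudes([100.0, 100.0, 50.0])
```

Points at the reference flux have a relative magnitude of zero, while the fainter in-eclipse point has a larger (positive) magnitude of roughly 0.75.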
We now have the light-curves' data in a usable state for processing. First the primary_epoch is located by selecting the 'most prominent' eclipse in the light-curve. With the period (in days), this defines the dEB's ephemeris. The period will be estimated using a periodogram of the light-curve if not given.
The plot_lc flag controls whether each light-curve, with its primary_epoch highlighted, is plotted to a png file.
The ephemeris is used to phase fold the light-curve data, and 1024 point single phase reduced light-curves are derived for subsequent inspection for system parameter estimation. The plot_fold flag controls whether each folded light-curve, overlaid with the model, is plotted to a png file.
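Phase folding with an ephemeris, followed by a reduction to a fixed number of phase bins, might look like the sketch below. Both function names are hypothetical. Averaging within equal-width bins is an assumption, as the method used to derive the 1024 point light-curves is not described here.

```python
def phase_fold(times, primary_epoch, period):
    """Return orbital phases in the range 0 to 1, with the primary eclipse at phase 0."""
    return [((t - primary_epoch) / period) % 1.0 for t in times]


def reduce_folded(phases, mags, num_bins=1024):
    """Average magnitudes in equal-width phase bins; empty bins are returned as None."""
    sums = [0.0] * num_bins
    counts = [0] * num_bins
    for phase, mag in zip(phases, mags):
        # Guard against a phase that rounds up to exactly 1.0.
        index = min(int(phase * num_bins), num_bins - 1)
        sums[index] += mag
        counts[index] += 1
    return [s / c if c else None for s, c in zip(sums, counts)]


phases = phase_fold([10.0, 11.0, 12.5, 9.5], primary_epoch=10.0, period=2.0)
reduced = reduce_folded([0.0, 0.1, 0.6], [1.0, 3.0, 5.0], num_bins=2)
```

Times before the primary_epoch fold correctly because Python's % operator returns a non-negative result for a positive divisor.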
The reduced folded light-curves are passed to a Machine-Learning model, trained to characterize folded dEB light-curves, for parameter estimation. This gives us estimates of the following fitting parameters:
rA_plus_rB (sum of the relative radii)
k (ratio of the relative radii)
bA (primary impact parameter)
inc (orbital inclination)
ecosw and esinw (combined eccentricity and argument of periastron)
J (surface brightness ratio)
L3 (amount of third light)
During fitting JKTEBOP may alter the light-curve data by fitting polynomials to chosen terms, given as poly instructions. Generally we instruct it to fit low order polynomials to the sf (scale factor) term to normalize the data. Each poly instruction applies over a time range and as a rule of thumb each should be a contiguous region of light-curve.
The polies config parameter defines what, if any, poly instructions are generated. There are two types: time_range (manual poly) configs apply over a user defined time range and gap_threshold configs (auto poly) give a gap size (in days) which acts as a boundary between one or more automatically generated time ranges. In both cases a term and polynomial degree may be specified (they default to 'sf' and 1 if omitted).
"polies": [
{ "term": "sf", "degree": 1, "time_range": [58410.00, 58420.00] },
{ "term": "sf", "degree": 1, "time_range": [58424.00, 58434.00] },
{ "term": "sf", "degree": 1, "gap_threshold": 0.5 }
]
The resulting poly instructions, to be written to the JKTEBOP '.in' file, may look similar to those shown below. This shows the result of the two manual polies above. In the absence of these, the auto-poly would generate a similar output but the date values would be automatically generated from detecting gaps in the data.
poly sf 58415.00 0.0 0.0 0.0 0.0 0.0 0.0 1 1 0 0 0 0 58410.00 58420.00
poly sf 58429.00 0.0 0.0 0.0 0.0 0.0 0.0 1 1 0 0 0 0 58424.00 58434.00
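The layout of these lines can be reproduced with the sketch below. It is inferred solely from the two example lines above, not from the JKTEBOP documentation or from ingest.py: the pivot is assumed to be the midpoint of the time range, the six coefficients are assumed to start at 0.0, and the first degree + 1 fit flags are assumed to be set. The name format_poly is hypothetical.

```python
def format_poly(term, degree, start, end):
    """Build a poly instruction line in the layout of the examples above."""
    pivot = (start + end) / 2  # assumed: pivot at the middle of the range
    coeffs = " ".join(["0.0"] * 6)
    # Assumed: fit the constant term and each power up to the requested degree.
    flags = " ".join("1" if power <= degree else "0" for power in range(6))
    return f"poly {term} {pivot:.2f} {coeffs} {flags} {start:.2f} {end:.2f}"


line = format_poly("sf", 1, 58410.0, 58420.0)
```

With these assumptions the call reproduces the first example line exactly.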
Poly configs are processed in order, with the supported pattern being zero or more manual polies followed by an optional auto-poly (as shown above). For each of the target's light-curves, the manual polies will generate an instruction if there is an overlap with the light-curve on the time axis. The auto-poly will be used only where no manual polies were applied. The two types of poly config are mutually exclusive: once one type has been triggered for a given light-curve, those of the other type are subsequently ignored.
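The selection rules described above can be sketched for a single light-curve as follows. Both function names are hypothetical, the times are assumed to be sorted and in the same format as the config ranges, and manual ranges are assumed to be passed through unclipped. The actual logic in ingest.py may differ in its details.

```python
def find_segments(times, gap_threshold):
    """Split sorted times into (start, end) ranges wherever a gap exceeds the threshold."""
    segments = []
    start = previous = times[0]
    for t in times[1:]:
        if t - previous > gap_threshold:
            segments.append((start, previous))
            start = t
        previous = t
    segments.append((start, previous))
    return segments


def select_poly_ranges(times, polies):
    """Return (term, degree, start, end) tuples for one light-curve."""
    lc_start, lc_end = times[0], times[-1]
    selected = []
    for poly in polies:
        term = poly.get("term", "sf")
        degree = poly.get("degree", 1)
        if "time_range" in poly:
            # Manual poly: used only if it overlaps the light-curve in time.
            start, end = poly["time_range"]
            if start <= lc_end and end >= lc_start:
                selected.append((term, degree, start, end))
        elif "gap_threshold" in poly and not selected:
            # Auto poly: used only where no manual poly has been applied.
            for start, end in find_segments(times, poly["gap_threshold"]):
                selected.append((term, degree, start, end))
    return selected


times = [58411.0, 58411.5, 58412.0, 58413.0, 58413.4]
manual_and_auto = [
    {"term": "sf", "degree": 1, "time_range": [58410.00, 58420.00]},
    {"term": "sf", "degree": 1, "time_range": [58424.00, 58434.00]},
    {"term": "sf", "degree": 1, "gap_threshold": 0.5},
]
with_manual = select_poly_ranges(times, manual_and_auto)
auto_only = select_poly_ranges(times, [{"gap_threshold": 0.5}])
```

In the first call only the manual poly covering 58410 to 58420 is selected, as the second does not overlap the data and the auto-poly is suppressed. In the second call the one day gap splits the data into two automatically generated ranges.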
The optional trim_masks config parameter controls what data, if any, is now trimmed from the light-curves. Unlike quality_masks which are used to mask poor quality data prior to processing, the trim_masks are applied towards the end of the pipeline to reduce the data passed on to JKTEBOP for fitting.
If plot_lc is set the trimmed light-curves will be plotted to png files.
Finally the processed and reduced light-curve time and magnitude data is written to a JKTEBOP compatible '.dat' file. The parameters for fitting these data are built up from a set of default values and ephemeris, overlaid with the estimated fitting parameters from the ML model and again with any user specified overrides given in the fitting_params config. These, along with the poly instructions previously generated, are written to a '.in' file which is the input instruction and parameters for JKTEBOP to fit the light-curve (the template for this file is found in library/task3.in.template). The default fitting params can be seen at the foot of ingest.py.
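The layering of fitting parameters can be pictured as successive dictionary overlays, where later sources win. In this sketch the default and estimated values are made up for illustration and are not those found in ingest.py or produced by the ML model; only the override values are taken from the example fitting_params above.

```python
# Hypothetical defaults and ML estimates, for illustration only.
defaults = {"L3": 0.0, "LD_A": "quad", "rA_plus_rB": 0.3}
estimates = {"rA_plus_rB": 0.31, "L3": 0.05}

# User overrides, as given in the fitting_params config.
overrides = {"L3": 0.080, "LD_A": "pow2"}

# Later dictionaries take precedence over earlier ones.
params = {**defaults, **estimates, **overrides}
```

Here L3 and LD_A end up with the user's values, while rA_plus_rB keeps the estimate because it has no override.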