Baseline information #159

Open
miguelcarcamov opened this issue May 5, 2021 · 19 comments

@miguelcarcamov

  • dask-ms version: 0.2.6
  • Python version: 3.9
  • Operating System: manjaro Linux

Description

Hello everyone, I would like to partition or group my MS dataset based on FIELD_ID, DATA_DESC_ID and BASELINE (which is not a column, but can be calculated from ANTENNA1 and ANTENNA2). Is it possible to do this? Also, I would like to get the length of each baseline. However, for that we would need to query the entire dataset instead of the list of partitions.

Anyone know how to do this?

This library is awesome, keep up the good work. Best regards!
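
One way to derive the BASELINE index mentioned above from ANTENNA1 and ANTENNA2 is sketched below with plain NumPy. The function name `baseline_index` and the `n_ant` parameter are hypothetical, and the encoding (a row-major index into an antenna grid) is only one of several possible conventions, not something dask-ms provides.

```python
import numpy as np

def baseline_index(ant1, ant2, n_ant):
    """Map antenna pairs to one integer per unordered baseline (hypothetical helper)."""
    ant1 = np.asarray(ant1)
    ant2 = np.asarray(ant2)
    lo = np.minimum(ant1, ant2)
    hi = np.maximum(ant1, ant2)
    # Row-major index into an n_ant x n_ant grid: unique per unordered pair
    return lo * n_ant + hi

# Two timesteps of a 3-antenna array, cross-correlations only
ant1 = np.array([0, 0, 1, 0, 0, 1])
ant2 = np.array([1, 2, 2, 1, 2, 2])
bl = baseline_index(ant1, ant2, n_ant=3)
print(bl)  # [1 2 5 1 2 5]
```

The resulting array has one entry per row, so it could be attached to a dataset alongside ANTENNA1 and ANTENNA2.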

@sjperkins
Member

sjperkins commented May 10, 2021

Hi @miguelcarcamov, I think it depends on how far you want to take it.

1. The easy, but probably less efficient approach

You could set up your datasets with the following:

datasets = xds_from_ms(ms, group_cols=["FIELD_ID", "DATA_DESC_ID", "ANTENNA1", "ANTENNA2"])

This will create a unique dataset per combination of FIELD_ID, DATA_DESC_ID and BASELINE. Unfortunately, Measurement Sets are frequently monotonically ordered in TIME, rather than ANTENNA1, ANTENNA2, so the resulting datasets will be backed by reads of non-contiguous rows, which results in inefficient disk read patterns. But it should be fairly easy to calculate the max baseline length per dataset as follows:

import dask
import dask.array as da
from daskms import xds_from_ms

datasets = xds_from_ms(ms, group_cols=["FIELD_ID", "DATA_DESC_ID", "ANTENNA1", "ANTENNA2"])

bl_lengths = []

for ds in datasets:
  # Each UVW row is already the baseline vector for that row,
  # so the baseline length is its Euclidean norm
  uvw = ds.UVW.data
  bl_lengths.append(da.sqrt((uvw**2).sum(axis=1)).max())

dask.compute(bl_lengths)

If you want to do more with the baseline length (i.e. process visibility data), then non-contiguous disk access will hurt performance.
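
To illustrate why grouping by baseline leads to non-contiguous reads on a time-ordered Measurement Set, here is a small NumPy sketch. The array layout (4 antennas, 3 timesteps, rows ordered by time and then baseline) is a synthetic assumption, not data read from an MS.

```python
import numpy as np

n_ant, n_time = 4, 3

# All cross-correlation pairs for 4 antennas: 6 baselines
a1, a2 = np.triu_indices(n_ant, k=1)

# Time-major row ordering: the full baseline list repeats for every timestep
ant1 = np.tile(a1, n_time)
ant2 = np.tile(a2, n_time)

# Rows that belong to baseline (0, 1)
rows = np.flatnonzero((ant1 == 0) & (ant2 == 1))
print(rows)           # [ 0  6 12]
print(np.diff(rows))  # [6 6]
```

Each baseline's rows are separated by a stride equal to the number of baselines, so reading one baseline touches rows scattered across the whole table.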

2. The harder, but more efficient approach

A second approach requires (1) some knowledge of dask internals and (2) the ability to process your baseline data on a per-chunk basis.

from __future__ import print_function

import argparse

import dask
import dask.array as da
from daskms import xds_from_ms
import numpy as np

def create_parser():
    p = argparse.ArgumentParser()
    p.add_argument("ms")
    return p

def _process(ant1, ant2, uvw):
    uvw = uvw[0]  # Contraction over the uvw3 axis
    # Identify unique  baselines in this chunk
    baselines = np.stack((ant1, ant2), axis=1)
    ubl, inv = np.unique(baselines, return_inverse=True, axis=0)

    # Determine their lengths
    bl_length = np.empty(ubl.shape[0], dtype=uvw.dtype)

    for i, (a1, a2) in enumerate(ubl):
        # Largest Euclidean norm of the UVW rows belonging to baseline i
        bl_length[i] = np.sqrt((uvw[inv == i, :]**2).sum(axis=1)).max()

    print(bl_length)
    # Further processing required beyond this point

if __name__ == "__main__":
    args = create_parser().parse_args()
    ds = xds_from_ms(args.ms)
    ds = ds[0]    # Just demonstrate on the first dataset

    # Map _process function on input arrays to produce an output array
    # A good understanding of dask.array.blockwise is advised
    process = da.blockwise(_process, ("row",),
                           ds.ANTENNA1.data, ("row",),
                           ds.ANTENNA2.data, ("row",),
                           ds.UVW.data, ("row", "uvw3"),
                           concatenate=False,
                           meta=np.empty((), dtype=object))  # np.object was removed in NumPy 1.24

    dask.compute(process)

Conclusion

I suspect the approach you take will depend on whether you want to crunch the larger visibility data. What are your thoughts?



@miguelcarcamov
Author

miguelcarcamov commented Jun 5, 2021

I ended up using itertools.combinations. Since I am very new to dask, it might be less efficient than your approach, so I would like to hear what you think.

antennas = xds_from_table(self.ms_name_dask + "ANTENNA", taql_where=taql_query)[0]
antenna_obj = Antenna(dataset=antennas)

When the Antenna object is created, it runs this:

self.max_diameter = 0.0 * u.m
self.min_diameter = 0.0 * u.m
if dataset is not None:
    self.max_diameter = self.dataset.DISH_DIAMETER.data.max().compute() * u.m
    self.min_diameter = self.dataset.DISH_DIAMETER.data.min().compute() * u.m

Then I run:

# Creating baseline object
baseline_obj = antenna_obj.create_baseline_dataset()

This function runs:

def create_baseline_dataset(self):
     ids = self.dataset.ROWID.data.compute()
     combs = np.array(list(combinations(ids, 2)))

     antenna1 = self.dataset.sel(row=combs[:, 0])
     antenna2 = self.dataset.sel(row=combs[:, 1])

     baseline = antenna1.POSITION - antenna2.POSITION
     baseline_length = xarrfunc.sqrt(
         xarrfunc.square(baseline[:, 0]) + xarrfunc.square(baseline[:, 1]) + xarrfunc.square(baseline[:, 2]))
     baseline_length = baseline_length.data.persist()

     row_id = np.arange(len(combs[:, 0]))
     ant1_id = da.from_array(combs[:, 0])
     ant2_id = da.from_array(combs[:, 1])
     row_id = da.from_array(row_id)

     ds = xarray.Dataset(
         data_vars=dict(
             ANTENNA1=(["row"], ant1_id),
             ANTENNA2=(["row"], ant2_id),
             BASELINE_LENGTH=(["row"], baseline_length)
         ),
         coords=dict(
             ROWID=(["row"], row_id)
         ))

     return Baseline(dataset=ds)

Since the baseline lengths are in a xarray dataset we can get the maximum using:

self.max_baseline = self.dataset.BASELINE_LENGTH.max().data.compute() * u.m
self.min_baseline = self.dataset.BASELINE_LENGTH.min().data.compute() * u.m

Let me know if this is not efficient, I would like to use the blockwise function though
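
For comparison, the same pairwise computation can be sketched with plain NumPy and no xarray or dask. The antenna positions below are made-up values chosen so the lengths are easy to check by hand. They do not come from a real ANTENNA table.

```python
from itertools import combinations

import numpy as np

# Hypothetical antenna positions in metres (x, y, z), one row per antenna
positions = np.array([[0.0, 0.0, 0.0],
                      [3.0, 4.0, 0.0],
                      [0.0, 0.0, 12.0]])

# All unordered antenna pairs, as an (n_baselines, 2) index array
pairs = np.array(list(combinations(range(len(positions)), 2)))

# Baseline vector per pair and its Euclidean length
diff = positions[pairs[:, 0]] - positions[pairs[:, 1]]
lengths = np.linalg.norm(diff, axis=1)

print(pairs.tolist())    # [[0, 1], [0, 2], [1, 2]]
print(lengths.tolist())  # [5.0, 12.0, 13.0]
```

Since the ANTENNA table is small (one row per antenna), computing this eagerly in NumPy is likely cheap compared with anything that touches the main table.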

Cheers

@miguelcarcamov
Author

miguelcarcamov commented Aug 6, 2021

@sjperkins OK, I have tested your code and the only downside is that the dask array returned from process is bigger than expected. For example, if we want to return one entry per baseline, such as (id, antenna1_id, antenna2_id), passing row as the first dimension gives a much bigger dask array. By the way, what do you mean by crunching the visibility data? I would like two things. The first, which I have already seen raised as an issue, is to have antenna1, antenna2 and baseline_id as coordinates in the datasets. The second is to loop over my datasets per baseline and work on each one of them. My idea is to write a function that takes a non-gridded dataset and returns a gridded dataset. For that we need to do the gridding for each field, spw and baseline, so that all the ids in the main table fit.
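
One possible way to end up with one entry per baseline rather than one per row is to reduce each chunk to a small dictionary and merge the dictionaries afterwards. The sketch below uses plain NumPy on two hand-made chunks. The helper names `chunk_max_lengths` and `merge` are hypothetical, and wiring them into dask (for instance with one delayed call per chunk) is left as an assumption rather than shown.

```python
import numpy as np

def chunk_max_lengths(ant1, ant2, uvw):
    """Per chunk: largest |uvw| seen for each (ant1, ant2) pair."""
    out = {}
    lengths = np.sqrt((uvw ** 2).sum(axis=1))
    for a1, a2, length in zip(ant1.tolist(), ant2.tolist(), lengths.tolist()):
        key = (a1, a2)
        out[key] = max(out.get(key, 0.0), length)
    return out

def merge(results):
    """Combine per-chunk dictionaries into one maximum per baseline."""
    merged = {}
    for result in results:
        for key, length in result.items():
            merged[key] = max(merged.get(key, 0.0), length)
    return merged

# Two synthetic chunks of (ANTENNA1, ANTENNA2, UVW) rows
chunk_a = chunk_max_lengths(np.array([0, 0]), np.array([1, 2]),
                            np.array([[3.0, 4.0, 0.0], [1.0, 0.0, 0.0]]))
chunk_b = chunk_max_lengths(np.array([0, 1]), np.array([1, 2]),
                            np.array([[6.0, 8.0, 0.0], [0.0, 2.0, 0.0]]))

result = merge([chunk_a, chunk_b])
print(result)  # {(0, 1): 10.0, (0, 2): 1.0, (1, 2): 2.0}
```

The merged result grows with the number of baselines, not with the number of rows.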

@miguelcarcamov
Author

miguelcarcamov commented Jul 8, 2022

A follow up to this @sjperkins: I've seen the documentation of CASA ngi, and I was wondering how they get to order their data by baseline if the data is not contiguous by baseline... If you convert the data to zarr then you don't get any problem ordering the data by baseline?

@sjperkins
Member

sjperkins commented Jul 11, 2022

A follow up to this @sjperkins: I've seen the documentation of CASA ngi, and I was wondering how they get to order their data by baseline if the data is not contiguous by baseline... If you convert the data to zarr then you don't get any problem ordering the data by baseline?

I don't want to speak too much for the casangi team, but it looks like they enforce a (time, baseline, chan, corr) shape for their zarr representation.

The MSv2.0 (and MSv3.0) spec specifies a (row, chan, corr) shape but there is no constraint that the data should be ordered by TIME, ANTENNA1, ANTENNA2, i.e. (time, baseline, chan, corr). This ordering is optimal for certain applications like calibration, but ANTENNA1, ANTENNA2, TIME, i.e. (baseline, time, chan, corr), can be better for imaging and flagging. I believe wsclean orders data like this prior to imaging.

Thus, in my opinion, enforcing a (time, baseline, chan, corr) order deviates from the full generality of the MSv{2,3} spec if we're being very precise and splitting hairs, but in practice, most instruments will output data in this ordering.

I've also only mentioned the TIME, ANTENNA1 and ANTENNA2 columns in this comment. Technically all the columns in the MAIN Table key are used to impose an ordering: https://casa.nrao.edu/Memos/229.html#SECTION00061000000000000000, so columns like FEED1 and FEED2 are also relevant here.
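
The two row orderings discussed here can be illustrated with `np.lexsort` on a tiny synthetic set of key columns. This is only a sketch of the ordering idea on in-memory arrays, not how dask-ms or wsclean reorder data on disk.

```python
import numpy as np

# Synthetic key columns: 3 baselines observed at 2 timesteps, stored time-major
time = np.array([0, 0, 0, 1, 1, 1])
ant1 = np.array([0, 0, 1, 0, 0, 1])
ant2 = np.array([1, 2, 2, 1, 2, 2])

# np.lexsort treats the LAST key as the primary sort key
time_major = np.lexsort((ant2, ant1, time))      # TIME, ANTENNA1, ANTENNA2
baseline_major = np.lexsort((time, ant2, ant1))  # ANTENNA1, ANTENNA2, TIME

print(time_major)      # [0 1 2 3 4 5]
print(baseline_major)  # [0 3 1 4 2 5]
```

Applying `baseline_major` as a row index gathers all timesteps of each baseline together, which is the (baseline, time) layout described above.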

@miguelcarcamov
Author

@sjperkins right, makes sense. Although I think that for self-calibration which can be considered as calibration+imaging ordering by (time,baseline,chan,corr) is also useful.

However, what I want to do is this: let's say I calculate a baseline_id for each row in my dask-ms dataset which has already been grouped by ["DATA_DESC"]. Let's say that now I want to regroup the datasets such that they are ordered by ["BASELINE_ID", "FIELD_ID", "DATA_DESC_ID"]. Then my questions are:

  1. Is it possible to do this? Is it done using xarray groupby, or is there a more efficient way to do this?
  2. Would this ordering cause performance issues from non-contiguous access? I guess not if using zarr?

Cheers

@sjperkins
Member

@sjperkins right, makes sense. Although I think that for self-calibration which can be considered as calibration+imaging ordering by (time,baseline,chan,corr) is also useful.

However, what I want to do is this: let's say I calculate a baseline_id for each row in my dask-ms dataset which has already been grouped by ["DATA_DESC"]. Let's say that now I want to regroup the datasets such that they are ordered by ["BASELINE_ID", "FIELD_ID", "DATA_DESC_ID"]. Then my questions are:

  1. Is it possible to do this? Is it done using xarray groupby, or is there a more efficient way to do this?

It may be possible to do this via xarray groupby but I'm wary of this approach since it'll create a dask graph for each group (baseline) with a lot of cross-communication between chunks. I think this'll work but will either require:

  1. evaluating each group separately with dask, resulting in accessing the entire dataset multiple times.

    • This might be avoided by persisting each dataset into cluster memory, but then you'll need
      sufficient memory to hold each dataset on your single node/cluster...
  2. or expecting the dask (distributed?) scheduler to perfectly handle cross-communication when work for all groups is submitted at once. This is hard: https://coiled.io/blog/better-shuffling-in-dask-a-proof-of-concept/

Having said that I haven't tried this approach in a long time, so the underlying functionality might have improved.

  2. Would this ordering cause performance issues from non-contiguous access? I guess not if using zarr?

One can't really get around this issue, regardless of the storage backend: it's a matter of spatial locality. If I were to use database terminology, accessing data on the primary key is always more efficient than accessing data via a secondary key, because data is usually ordered by primary key on disk.

You might want to try reordering your MS as follows:

dask-ms convert ~/data/input.ms -g "FIELD_ID,DATA_DESC_ID,SCAN_NUMBER" -i "ANTENNA1,ANTENNA2,TIME,FEED1,FEED2" -o ~/data/output.ms --format ms --force

If you've created a BASELINE_ID column, you could probably substitute that for ANTENNA1,ANTENNA2.

@miguelcarcamov
Author

Thank you very much @sjperkins. I will try what you have suggested and I will let you know. Last question - Is the convert function part of the dask-ms? That is, can I call it as a function from a python file?

Cheers

@sjperkins
Member

Thank you very much @sjperkins. I will try what you have suggested and I will let you know.

Note there were some fixes pushed to master this morning, but I don't think there would have been an issue with MS to MS conversion.

Last question - Is the convert function part of the dask-ms? That is, can I call it as a function from a python file?

It's a class in daskms/apps/convert.py. There are no plans to make this into a generic function.

@miguelcarcamov
Author

miguelcarcamov commented Oct 12, 2022

@sjperkins Hi again! Quick question: how can I use convert from a piece of code, directly with the Convert class and without using os.system? Or would I need to write my own wrapper in order to use it as a function to convert a measurement set file? I want to do this because, depending on the stage of the software that I'm currently building, I will need different orderings (a specific ordering for imaging, gridding and de-gridding, and another ordering for calibration and self-cal). Since I know which ordering I need for each case, I would really like to call convert inside my code as a function rather than using os.system. I was wondering if you could please help with that.

@sjperkins
Member

@sjperkins Hi again! Quick question: how can I use convert from a piece of code, directly with the Convert class and without using os.system? Or would I need to write my own wrapper in order to use it as a function to convert a measurement set file? I want to do this because, depending on the stage of the software that I'm currently building, I will need different orderings (a specific ordering for imaging, gridding and de-gridding, and another ordering for calibration and self-cal). Since I know which ordering I need for each case, I would really like to call convert inside my code as a function rather than using os.system. I was wondering if you could please help with that.

I'd just instantiate Convert with the relevant command line arguments and a python logger. Something like the following (I haven't run this!)

import logging

# Convert is the class in daskms/apps/convert.py
from daskms.apps.convert import Convert

log = logging.getLogger(__file__)

args = ["input.ms", "--output", "output.ms", "--group-cols", "FIELD_ID,DATA_DESC_ID", "--index-cols", "TIME,ANTENNA1,ANTENNA2"]

convert = Convert(args, log)
convert.execute()
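
Since different stages need different orderings, the argument list could be built by a small helper, as in the sketch below. `build_convert_args` is a hypothetical function, and it only uses the flags already shown in this thread (`--output`, `--group-cols`, `--index-cols`). Whether the resulting list is accepted by Convert depends on the installed dask-ms version, so treat that part as an assumption.

```python
def build_convert_args(input_ms, output_ms, group_cols, index_cols):
    """Assemble a command-line style argument list for one row ordering."""
    return [
        input_ms,
        "--output", output_ms,
        "--group-cols", ",".join(group_cols),
        "--index-cols", ",".join(index_cols),
    ]

group_cols = ["FIELD_ID", "DATA_DESC_ID"]

# (time, baseline) ordering, as suggested for calibration
calibration_args = build_convert_args(
    "input.ms", "output_time.ms", group_cols, ["TIME", "ANTENNA1", "ANTENNA2"])

# (baseline, time) ordering, as suggested for imaging
imaging_args = build_convert_args(
    "input.ms", "output_baseline.ms", group_cols, ["ANTENNA1", "ANTENNA2", "TIME"])

print(calibration_args)
print(imaging_args)
```

Each list could then be passed in place of `args` in the snippet above.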

@miguelcarcamov
Author

miguelcarcamov commented Nov 2, 2022

I'm getting this error when running the command @sjperkins :

2022-11-02 10:34:29,585 - dask-ms - WARNING - Ignoring 'FLAG_CATEGORY': Unable to infer shape of column 'FLAG_CATEGORY' due to:
'Table DataManager error: Invalid operation: TSM: no array in row 21032324 of column FLAG_CATEGORY in /home/vicente/Documentos/Ayudantia/complete_data/HLTau_B6cont.calavg/table.f18'
2022-11-02 10:34:29,592 - dask-ms - WARNING - Ignoring 'WEIGHT_SPECTRUM': Unable to infer shape of column 'WEIGHT_SPECTRUM' due to:
'Table DataManager error: Invalid operation: TSM: no array in row 21037373 of column WEIGHT_SPECTRUM in /home/vicente/Documentos/Ayudantia/complete_data/HLTau_B6cont.calavg/table.f19'
2022-11-02 10:34:29,597 - dask-ms - WARNING - Ignoring 'FLAG_CATEGORY': Unable to infer shape of column 'FLAG_CATEGORY' due to:
'Table DataManager error: Invalid operation: TSM: no array in row 21037373 of column FLAG_CATEGORY in /home/vicente/Documentos/Ayudantia/complete_data/HLTau_B6cont.calavg/table.f18'
2022-11-02 10:34:29,602 - dask-ms - WARNING - Ignoring 'WEIGHT_SPECTRUM': Unable to infer shape of column 'WEIGHT_SPECTRUM' due to:
'Table DataManager error: Invalid operation: TSM: no array in row 21042422 of column WEIGHT_SPECTRUM in /home/vicente/Documentos/Ayudantia/complete_data/HLTau_B6cont.calavg/table.f19'
2022-11-02 10:34:29,607 - dask-ms - WARNING - Ignoring 'FLAG_CATEGORY': Unable to infer shape of column 'FLAG_CATEGORY' due to:
'Table DataManager error: Invalid operation: TSM: no array in row 21042422 of column FLAG_CATEGORY in /home/vicente/Documentos/Ayudantia/complete_data/HLTau_B6cont.calavg/table.f18'
2022-11-02 10:34:29,613 - dask-ms - WARNING - Ignoring 'WEIGHT_SPECTRUM': Unable to infer shape of column 'WEIGHT_SPECTRUM' due to:
'Table DataManager error: Invalid operation: TSM: no array in row 21047471 of column WEIGHT_SPECTRUM in /home/vicente/Documentos/Ayudantia/complete_data/HLTau_B6cont.calavg/table.f19'
2022-11-02 10:34:29,617 - dask-ms - WARNING - Ignoring 'FLAG_CATEGORY': Unable to infer shape of column 'FLAG_CATEGORY' due to:
'Table DataManager error: Invalid operation: TSM: no array in row 21047471 of column FLAG_CATEGORY in /home/vicente/Documentos/Ayudantia/complete_data/HLTau_B6cont.calavg/table.f18'
2022-11-02 10:34:29,624 - dask-ms - WARNING - Ignoring 'WEIGHT_SPECTRUM': Unable to infer shape of column 'WEIGHT_SPECTRUM' due to:
'Table DataManager error: Invalid operation: TSM: no array in row 21088424 of column WEIGHT_SPECTRUM in /home/vicente/Documentos/Ayudantia/complete_data/HLTau_B6cont.calavg/table.f19'
2022-11-02 10:34:29,629 - dask-ms - WARNING - Ignoring 'FLAG_CATEGORY': Unable to infer shape of column 'FLAG_CATEGORY' due to:
'Table DataManager error: Invalid operation: TSM: no array in row 21088424 of column FLAG_CATEGORY in /home/vicente/Documentos/Ayudantia/complete_data/HLTau_B6cont.calavg/table.f18'
2022-11-02 10:34:29,634 - dask-ms - WARNING - Ignoring 'WEIGHT_SPECTRUM': Unable to infer shape of column 'WEIGHT_SPECTRUM' due to:
'Table DataManager error: Invalid operation: TSM: no array in row 21092912 of column WEIGHT_SPECTRUM in /home/vicente/Documentos/Ayudantia/complete_data/HLTau_B6cont.calavg/table.f19'
2022-11-02 10:34:29,640 - dask-ms - WARNING - Ignoring 'FLAG_CATEGORY': Unable to infer shape of column 'FLAG_CATEGORY' due to:
'Table DataManager error: Invalid operation: TSM: no array in row 21092912 of column FLAG_CATEGORY in /home/vicente/Documentos/Ayudantia/complete_data/HLTau_B6cont.calavg/table.f18'
2022-11-02 10:34:29,646 - dask-ms - WARNING - Ignoring 'WEIGHT_SPECTRUM': Unable to infer shape of column 'WEIGHT_SPECTRUM' due to:
'Table DataManager error: Invalid operation: TSM: no array in row 21097400 of column WEIGHT_SPECTRUM in /home/vicente/Documentos/Ayudantia/complete_data/HLTau_B6cont.calavg/table.f19'
2022-11-02 10:34:29,651 - dask-ms - WARNING - Ignoring 'FLAG_CATEGORY': Unable to infer shape of column 'FLAG_CATEGORY' due to:
'Table DataManager error: Invalid operation: TSM: no array in row 21097400 of column FLAG_CATEGORY in /home/vicente/Documentos/Ayudantia/complete_data/HLTau_B6cont.calavg/table.f18'
2022-11-02 10:34:29,657 - dask-ms - WARNING - Ignoring 'WEIGHT_SPECTRUM': Unable to infer shape of column 'WEIGHT_SPECTRUM' due to:
'Table DataManager error: Invalid operation: TSM: no array in row 21101888 of column WEIGHT_SPECTRUM in /home/vicente/Documentos/Ayudantia/complete_data/HLTau_B6cont.calavg/table.f19'
2022-11-02 10:34:29,661 - dask-ms - WARNING - Ignoring 'FLAG_CATEGORY': Unable to infer shape of column 'FLAG_CATEGORY' due to:
'Table DataManager error: Invalid operation: TSM: no array in row 21101888 of column FLAG_CATEGORY in /home/vicente/Documentos/Ayudantia/complete_data/HLTau_B6cont.calavg/table.f18'
2022-11-02 10:34:29,666 - dask-ms - WARNING - Ignoring 'WEIGHT_SPECTRUM': Unable to infer shape of column 'WEIGHT_SPECTRUM' due to:
'Table DataManager error: Invalid operation: TSM: no array in row 21160232 of column WEIGHT_SPECTRUM in /home/vicente/Documentos/Ayudantia/complete_data/HLTau_B6cont.calavg/table.f19'
2022-11-02 10:34:29,672 - dask-ms - WARNING - Ignoring 'FLAG_CATEGORY': Unable to infer shape of column 'FLAG_CATEGORY' due to:
'Table DataManager error: Invalid operation: TSM: no array in row 21160232 of column FLAG_CATEGORY in /home/vicente/Documentos/Ayudantia/complete_data/HLTau_B6cont.calavg/table.f18'
2022-11-02 10:34:29,678 - dask-ms - WARNING - Ignoring 'WEIGHT_SPECTRUM': Unable to infer shape of column 'WEIGHT_SPECTRUM' due to:
'Table DataManager error: Invalid operation: TSM: no array in row 21164720 of column WEIGHT_SPECTRUM in /home/vicente/Documentos/Ayudantia/complete_data/HLTau_B6cont.calavg/table.f19'
2022-11-02 10:34:29,683 - dask-ms - WARNING - Ignoring 'FLAG_CATEGORY': Unable to infer shape of column 'FLAG_CATEGORY' due to:
'Table DataManager error: Invalid operation: TSM: no array in row 21164720 of column FLAG_CATEGORY in /home/vicente/Documentos/Ayudantia/complete_data/HLTau_B6cont.calavg/table.f18'
2022-11-02 10:34:29,689 - dask-ms - WARNING - Ignoring 'WEIGHT_SPECTRUM': Unable to infer shape of column 'WEIGHT_SPECTRUM' due to:
'Table DataManager error: Invalid operation: TSM: no array in row 21169208 of column WEIGHT_SPECTRUM in /home/vicente/Documentos/Ayudantia/complete_data/HLTau_B6cont.calavg/table.f19'
2022-11-02 10:34:29,694 - dask-ms - WARNING - Ignoring 'FLAG_CATEGORY': Unable to infer shape of column 'FLAG_CATEGORY' due to:
'Table DataManager error: Invalid operation: TSM: no array in row 21169208 of column FLAG_CATEGORY in /home/vicente/Documentos/Ayudantia/complete_data/HLTau_B6cont.calavg/table.f18'
2022-11-02 10:34:29,699 - dask-ms - WARNING - Ignoring 'WEIGHT_SPECTRUM': Unable to infer shape of column 'WEIGHT_SPECTRUM' due to:
'Table DataManager error: Invalid operation: TSM: no array in row 21173696 of column WEIGHT_SPECTRUM in /home/vicente/Documentos/Ayudantia/complete_data/HLTau_B6cont.calavg/table.f19'
2022-11-02 10:34:29,704 - dask-ms - WARNING - Ignoring 'FLAG_CATEGORY': Unable to infer shape of column 'FLAG_CATEGORY' due to:
'Table DataManager error: Invalid operation: TSM: no array in row 21173696 of column FLAG_CATEGORY in /home/vicente/Documentos/Ayudantia/complete_data/HLTau_B6cont.calavg/table.f18'
2022-11-02 10:34:29,711 - dask-ms - WARNING - Ignoring 'WEIGHT_SPECTRUM': Unable to infer shape of column 'WEIGHT_SPECTRUM' due to:
'Table DataManager error: Invalid operation: TSM: no array in row 21232040 of column WEIGHT_SPECTRUM in /home/vicente/Documentos/Ayudantia/complete_data/HLTau_B6cont.calavg/table.f19'
2022-11-02 10:34:29,716 - dask-ms - WARNING - Ignoring 'FLAG_CATEGORY': Unable to infer shape of column 'FLAG_CATEGORY' due to:
'Table DataManager error: Invalid operation: TSM: no array in row 21232040 of column FLAG_CATEGORY in /home/vicente/Documentos/Ayudantia/complete_data/HLTau_B6cont.calavg/table.f18'
2022-11-02 10:34:29,721 - dask-ms - WARNING - Ignoring 'WEIGHT_SPECTRUM': Unable to infer shape of column 'WEIGHT_SPECTRUM' due to:
'Table DataManager error: Invalid operation: TSM: no array in row 21236528 of column WEIGHT_SPECTRUM in /home/vicente/Documentos/Ayudantia/complete_data/HLTau_B6cont.calavg/table.f19'
2022-11-02 10:34:29,726 - dask-ms - WARNING - Ignoring 'FLAG_CATEGORY': Unable to infer shape of column 'FLAG_CATEGORY' due to:
'Table DataManager error: Invalid operation: TSM: no array in row 21236528 of column FLAG_CATEGORY in /home/vicente/Documentos/Ayudantia/complete_data/HLTau_B6cont.calavg/table.f18'
2022-11-02 10:34:29,732 - dask-ms - WARNING - Ignoring 'WEIGHT_SPECTRUM': Unable to infer shape of column 'WEIGHT_SPECTRUM' due to:
'Table DataManager error: Invalid operation: TSM: no array in row 21241016 of column WEIGHT_SPECTRUM in /home/vicente/Documentos/Ayudantia/complete_data/HLTau_B6cont.calavg/table.f19'
2022-11-02 10:34:29,736 - dask-ms - WARNING - Ignoring 'FLAG_CATEGORY': Unable to infer shape of column 'FLAG_CATEGORY' due to:
'Table DataManager error: Invalid operation: TSM: no array in row 21241016 of column FLAG_CATEGORY in /home/vicente/Documentos/Ayudantia/complete_data/HLTau_B6cont.calavg/table.f18'
2022-11-02 10:34:29,743 - dask-ms - WARNING - Ignoring 'WEIGHT_SPECTRUM': Unable to infer shape of column 'WEIGHT_SPECTRUM' due to:
'Table DataManager error: Invalid operation: TSM: no array in row 21245504 of column WEIGHT_SPECTRUM in /home/vicente/Documentos/Ayudantia/complete_data/HLTau_B6cont.calavg/table.f19'
2022-11-02 10:34:29,748 - dask-ms - WARNING - Ignoring 'FLAG_CATEGORY': Unable to infer shape of column 'FLAG_CATEGORY' due to:
'Table DataManager error: Invalid operation: TSM: no array in row 21245504 of column FLAG_CATEGORY in /home/vicente/Documentos/Ayudantia/complete_data/HLTau_B6cont.calavg/table.f18'
2022-11-02 10:34:29,753 - dask-ms - WARNING - Ignoring 'WEIGHT_SPECTRUM': Unable to infer shape of column 'WEIGHT_SPECTRUM' due to:
'Table DataManager error: Invalid operation: TSM: no array in row 21303848 of column WEIGHT_SPECTRUM in /home/vicente/Documentos/Ayudantia/complete_data/HLTau_B6cont.calavg/table.f19'
2022-11-02 10:34:29,758 - dask-ms - WARNING - Ignoring 'FLAG_CATEGORY': Unable to infer shape of column 'FLAG_CATEGORY' due to:
'Table DataManager error: Invalid operation: TSM: no array in row 21303848 of column FLAG_CATEGORY in /home/vicente/Documentos/Ayudantia/complete_data/HLTau_B6cont.calavg/table.f18'
2022-11-02 10:34:29,764 - dask-ms - WARNING - Ignoring 'WEIGHT_SPECTRUM': Unable to infer shape of column 'WEIGHT_SPECTRUM' due to:
'Table DataManager error: Invalid operation: TSM: no array in row 21308336 of column WEIGHT_SPECTRUM in /home/vicente/Documentos/Ayudantia/complete_data/HLTau_B6cont.calavg/table.f19'
2022-11-02 10:34:29,769 - dask-ms - WARNING - Ignoring 'FLAG_CATEGORY': Unable to infer shape of column 'FLAG_CATEGORY' due to:
'Table DataManager error: Invalid operation: TSM: no array in row 21308336 of column FLAG_CATEGORY in /home/vicente/Documentos/Ayudantia/complete_data/HLTau_B6cont.calavg/table.f18'
2022-11-02 10:34:29,775 - dask-ms - WARNING - Ignoring 'WEIGHT_SPECTRUM': Unable to infer shape of column 'WEIGHT_SPECTRUM' due to:
'Table DataManager error: Invalid operation: TSM: no array in row 21312824 of column WEIGHT_SPECTRUM in /home/vicente/Documentos/Ayudantia/complete_data/HLTau_B6cont.calavg/table.f19'
2022-11-02 10:34:29,780 - dask-ms - WARNING - Ignoring 'FLAG_CATEGORY': Unable to infer shape of column 'FLAG_CATEGORY' due to:
'Table DataManager error: Invalid operation: TSM: no array in row 21312824 of column FLAG_CATEGORY in /home/vicente/Documentos/Ayudantia/complete_data/HLTau_B6cont.calavg/table.f18'
2022-11-02 10:34:29,786 - dask-ms - WARNING - Ignoring 'WEIGHT_SPECTRUM': Unable to infer shape of column 'WEIGHT_SPECTRUM' due to:
'Table DataManager error: Invalid operation: TSM: no array in row 21317312 of column WEIGHT_SPECTRUM in /home/vicente/Documentos/Ayudantia/complete_data/HLTau_B6cont.calavg/table.f19'
2022-11-02 10:34:29,792 - dask-ms - WARNING - Ignoring 'FLAG_CATEGORY': Unable to infer shape of column 'FLAG_CATEGORY' due to:
'Table DataManager error: Invalid operation: TSM: no array in row 21317312 of column FLAG_CATEGORY in /home/vicente/Documentos/Ayudantia/complete_data/HLTau_B6cont.calavg/table.f18'
2022-11-02 10:34:31,247 - dask-ms - INFO - Input: 'measurementset' file:///home/vicente/Documentos/Ayudantia/complete_data/HLTau_B6cont.calavg
2022-11-02 10:34:31,247 - dask-ms - INFO - Output: 'measurementset' file:///home/vicente/Documentos/Ayudantia/complete_data/HLTau_B6cont.calavg_time
2022-11-02 10:35:21,005 - dask-ms - WARNING - The shape of column 'ASSOC_SPW_ID' is unconstrained (ndim == -1). Assuming shape is (31,) from exemplar
2022-11-02 10:35:21,011 - dask-ms - WARNING - The shape of column 'ASSOC_NATURE' is unconstrained (ndim == -1). Assuming shape is (31,) from exemplar
2022-11-02 10:35:21,015 - dask-ms - WARNING - The shape of column 'ASSOC_SPW_ID' is unconstrained (ndim == -1). Assuming shape is (31,) from exemplar
2022-11-02 10:35:21,021 - dask-ms - WARNING - The shape of column 'ASSOC_NATURE' is unconstrained (ndim == -1). Assuming shape is (31,) from exemplar
2022-11-02 10:35:21,025 - dask-ms - WARNING - The shape of column 'ASSOC_SPW_ID' is unconstrained (ndim == -1). Assuming shape is (31,) from exemplar
2022-11-02 10:35:21,031 - dask-ms - WARNING - The shape of column 'ASSOC_NATURE' is unconstrained (ndim == -1). Assuming shape is (31,) from exemplar
2022-11-02 10:35:21,036 - dask-ms - WARNING - The shape of column 'ASSOC_SPW_ID' is unconstrained (ndim == -1). Assuming shape is (31,) from exemplar
2022-11-02 10:35:21,042 - dask-ms - WARNING - The shape of column 'ASSOC_NATURE' is unconstrained (ndim == -1). Assuming shape is (31,) from exemplar
2022-11-02 10:35:21,521 - dask-ms - WARNING - Ignoring SOURCE
2022-11-02 10:35:21,525 - dask-ms - WARNING - Ignoring 'TARGET': Unable to infer shape of column 'TARGET' due to:
'TableProxy::getCell: no such row'
2022-11-02 10:35:21,526 - dask-ms - WARNING - Ignoring 'ENCODER': Unable to infer shape of column 'ENCODER' due to:
'TableProxy::getCell: no such row'
2022-11-02 10:35:21,527 - dask-ms - WARNING - Ignoring 'POINTING_OFFSET': Unable to infer shape of column 'POINTING_OFFSET' due to:
'TableProxy::getCell: no such row'
2022-11-02 10:35:21,527 - dask-ms - WARNING - Ignoring 'DIRECTION': Unable to infer shape of column 'DIRECTION' due to:
'TableProxy::getCell: no such row'
Traceback (most recent call last):
  File "/home/vicente/anaconda3/envs/pyralysis2/bin/dask-ms", line 8, in <module>
    sys.exit(main())
  File "/home/vicente/anaconda3/envs/pyralysis2/lib/python3.8/site-packages/daskms/apps/entrypoint.py", line 9, in main
    return EntryPoint(sys.argv[1:]).execute()
  File "/home/vicente/anaconda3/envs/pyralysis2/lib/python3.8/site-packages/daskms/apps/entrypoint.py", line 33, in execute
    cmd.execute()
  File "/home/vicente/anaconda3/envs/pyralysis2/lib/python3.8/site-packages/daskms/apps/convert.py", line 415, in execute
    writes = self.convert_table(self.args)
  File "/home/vicente/anaconda3/envs/pyralysis2/lib/python3.8/site-packages/daskms/apps/convert.py", line 500, in convert_table
    writes.append(writer(datasets, out_store))
  File "/home/vicente/anaconda3/envs/pyralysis2/lib/python3.8/site-packages/daskms/dask_ms.py", line 102, in xds_to_table
    out_ds = write_datasets(
  File "/home/vicente/anaconda3/envs/pyralysis2/lib/python3.8/site-packages/daskms/writes.py", line 760, in write_datasets
    tp = _updated_table(table, datasets, columns, descriptor)
  File "/home/vicente/anaconda3/envs/pyralysis2/lib/python3.8/site-packages/daskms/writes.py", line 338, in _updated_table
    table_proxy.addcols(_table_desc, dminfo=_dminfo).result()
  File "/home/vicente/anaconda3/envs/pyralysis2/lib/python3.8/concurrent/futures/_base.py", line 444, in result
    return self.__get_result()
  File "/home/vicente/anaconda3/envs/pyralysis2/lib/python3.8/concurrent/futures/_base.py", line 389, in __get_result
    raise self._exception
  File "/home/vicente/anaconda3/envs/pyralysis2/lib/python3.8/concurrent/futures/thread.py", line 57, in run
    result = self.fn(*self.args, **self.kwargs)
  File "/home/vicente/anaconda3/envs/pyralysis2/lib/python3.8/site-packages/daskms/table_proxy.py", line 114, in _impl
    return getattr(table, method)(*args, **kwargs)
  File "/home/vicente/anaconda3/envs/pyralysis2/lib/python3.8/site-packages/casacore/tables/table.py", line 1226, in addcols
    self._addcols(tdesc, dminfo, addtoparent)
RuntimeError: Invalid Table operation: Data manager name StandardStMan is already used in table /home/vicente/Documentos/Ayudantia/complete_data/HLTau_B6cont.calavg_time/POINTING

@miguelcarcamov
Author

I'm worried that this is not working, while casangi can re-order their xarray dataset by (time, baseline). I have noticed that this ordering has a very high impact, at least for self-calibration.

@bennahugo
Collaborator

bennahugo commented Nov 4, 2022

Actually I think this ordering is possibly only good for calibration itself. For imaging one would need to repack by baseline x time instead (like wsclean does when it reorders by w or when ddfacet computes bda ordering). Typically imaging takes a lot longer than the calibration routines so I wonder if it should not be packed like that instead?

@miguelcarcamov
Author

@bennahugo Yes, I agree: ordering by (time, baseline) is only good for calibration, and for imaging the best ordering is (baseline, time). This is where self-cal comes in, because it needs both orderings: (time, baseline) when calibrating and (baseline, time) when imaging. Given that I'm developing software that will do both, my idea would be to re-order the dataset according to what the code is doing (calibration, imaging, or self-cal, which needs both). However, the convert script is not able to do that, as you can see above, so I haven't been able to test anything so far.

@sjperkins
Member

I'm getting this error when running the command @sjperkins :

RuntimeError: Invalid Table operation: Data manager name StandardStMan is already used in table /home/vicente/Documentos/Ayudantia/complete_data/HLTau_B6cont.calavg_time/POINTING

I can't tell exactly what's happening from your stack trace. Which command line arguments are you using?

It looks like you're writing to an existing table, given the call to _updated_table? This probably won't work. The --force argument will remove any existing output dataset.
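
For illustration only, the "remove any existing output before writing" behaviour can be sketched in plain Python. This is not dask-ms's actual implementation; `prepare_output` and its `force` parameter are hypothetical names, and the sketch assumes the output dataset is a directory on the local filesystem.

```python
import shutil
from pathlib import Path


def prepare_output(path, force=False):
    """Return a clean output path, deleting an existing directory if force is set.

    Hypothetical helper: it only mimics the idea behind a --force style flag.
    """
    out = Path(path)
    if out.exists():
        if not force:
            # Refuse to touch an existing dataset unless explicitly asked to.
            raise FileExistsError(f"{out} already exists; pass force=True to replace it")
        # Remove the stale output tree so the writer starts from scratch
        # instead of updating tables that are already there.
        shutil.rmtree(out)
    return out
```

With a guard like this, a writer never ends up adding columns to a half-written or pre-existing output table.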

@sjperkins
Member

I'm worried that this is not working and that casang can re-order their xarray dataset by (time,baseline). I have noticed that this has a very high impact at least for self-calibration.

As discussed earlier in #159 (comment), we don't impose specific orderings on data because different applications benefit from different orderings.

It's the user's responsibility to reorder their dataset into a format that is convenient for their application. This is possible via dask-ms convert, although it is still undocumented: #226.

@miguelcarcamov
Author

miguelcarcamov commented Nov 8, 2022

@sjperkins

Maybe if I add the link to the ms here you can traceback the error?

The command line that I'm currently using is:

dask-ms convert HLTau_B6cont.calavg.tav300s -g "FIELD_ID,DATA_DESC_ID,SCAN_NUMBER" -i "ANTENNA1,ANTENNA2,TIME,FEED1,FEED2" -o output.ms --format ms --force

I'm not creating any folder before that.

@sjperkins
Member

sjperkins commented Nov 9, 2022

Thanks for the linked MS.

I can reproduce this error on my side. I'll try to block off some time to look at the issue this week.
