Pandas DataFrame Reader, additional example. #971

apiszcz · 2024-03-23T04:49:32Z

Is it possible to document an example for the Pandas DataFrame reader that reads files from a folder to which new data files are continuously added?
https://stonesoup.readthedocs.io/en/v1.2/auto_examples/Custom_Pandas_Dataloader.html#dataframe-detection-reader

In the DataFrame Detection Reader section, can the instantiation be changed from dataframe=truth_df to a DataFrame generator, i.e. dataframe=dataframe_generator? Here dataframe_generator would read the csv files and return each DataFrame object.
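To make the idea concrete, here is a minimal sketch of what such a generator could look like. The name dataframe_generator comes from the question above; the parameters pattern, load, poll_interval and max_idle_polls are hypothetical names chosen for illustration, and nothing here is part of Stone Soup. It also assumes a file is complete by the time it shows up in the folder, and the reader from the linked example would still need changing to iterate over a generator instead of a single DataFrame.

```python
import pathlib
import time


def _read_csv(path):
    # pandas is imported lazily so the polling logic itself only needs the stdlib.
    import pandas as pd
    return pd.read_csv(path)


def dataframe_generator(folder, pattern="*.csv", load=_read_csv,
                        poll_interval=1.0, max_idle_polls=None):
    """Yield one loaded DataFrame per new file found in ``folder``.

    Files are visited in sorted name order and each file is loaded once.
    With ``max_idle_polls=None`` the generator polls forever; otherwise it
    stops after that many consecutive polls that found no new file.
    """
    seen = set()
    idle_polls = 0
    while max_idle_polls is None or idle_polls < max_idle_polls:
        new_paths = [p for p in sorted(pathlib.Path(folder).glob(pattern))
                     if p not in seen]
        if not new_paths:
            idle_polls += 1
            time.sleep(poll_interval)
            continue
        idle_polls = 0
        for path in new_paths:
            seen.add(path)
            yield load(path)
```

Sorting by file name only gives time order if the files are named so that they sort chronologically, which is an assumption about how the data is produced.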

apiszcz · 2024-03-23T13:38:56Z

I added an endless loop inside detections_gen and am testing it now.
There is a new attribute, datapath, and the new section runs from while True down to the for row in line.
In this case the DataFrames are stored as pkl; csv etc. should work too, though more slowly.

import pathlib
import pickle
import time

import numpy as np
from filelock import FileLock

from stonesoup.base import Property
from stonesoup.buffered_generator import BufferedGenerator
from stonesoup.reader import DetectionReader
from stonesoup.types.detection import Detection

# _DataFrameReader is the helper base class defined in the linked example.


class DataFrameDetectionReader2(DetectionReader, _DataFrameReader):
    """A custom detection reader for a folder of pickled DataFrames.

    Each DataFrame must have headers with the appropriate fields needed to generate
    the detection. Detections at the same time are yielded together, so each file is
    assumed to be in time order.

    Parameters
    ----------
    """
    # dataframe: pd.DataFrame = Property(doc="DataFrame containing the detection data.")
    datapath: str = Property(doc="Path to the folder of pickled DataFrames.")

    @BufferedGenerator.generator_method
    def detections_gen(self):
        processed = set()  # files already read, so later passes only pick up new ones
        while True:
            for ipath in sorted(str(p) for p in pathlib.Path(self.datapath).glob('*.pkl')):
                if ipath in processed:
                    continue
                # Assumes the writer holds the matching .lock file while it writes.
                data_lock = FileLock(ipath.replace('.pkl', '.lock'))
                with data_lock.acquire():
                    with open(ipath, 'rb') as ipkl:
                        self.dataframe = pickle.load(ipkl)
                processed.add(ipath)

                detections = set()
                previous_time = None

                for row in self.dataframe.to_dict(orient="records"):
                    # Named row_time so that the time module is not shadowed.
                    row_time = self._get_time(row)
                    if previous_time is not None and previous_time != row_time:
                        yield previous_time, detections
                        detections = set()
                    previous_time = row_time

                    detections.add(Detection(
                        np.array([[row[col_name]] for col_name in self.state_vector_fields],
                                 dtype=np.float64),
                        timestamp=row_time,
                        metadata=self._get_metadata(row)))

                # Yield remaining (skipped if the DataFrame was empty)
                if previous_time is not None:
                    yield previous_time, detections
            # Wait before polling the folder again for new files.
            time.sleep(1)
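
The step that batches detections sharing a timestamp can be looked at on its own. Below is a minimal sketch of just that grouping, using plain dicts in place of DataFrame rows and itertools.groupby in place of the manual previous_time bookkeeping. The function name group_rows_by_time and the column name "time" are made up for illustration. Like the reader, it assumes the rows are already in time order.

```python
from itertools import groupby
from operator import itemgetter


def group_rows_by_time(rows, time_key="time"):
    """Yield (time, rows_at_that_time) for consecutive rows sharing a time."""
    for timestamp, group in groupby(rows, key=itemgetter(time_key)):
        yield timestamp, list(group)


rows = [
    {"time": 0, "x": 1.0},
    {"time": 0, "x": 2.0},
    {"time": 1, "x": 3.0},
]
batches = list(group_rows_by_time(rows))
# batches holds two entries: both rows at time 0, then the single row at time 1.
```

groupby only merges adjacent rows, so rows that are out of time order would produce several batches for the same timestamp, which is the same limitation the reader has.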

sdhiscocks · 2024-03-25T12:00:09Z

More examples are always welcome. Data readers can be challenging because the format and structure of data vary so much, so you often need bespoke code for your use case.

apiszcz mentioned this issue May 29, 2024

CSVDetectionReader not parsing csv successfully #1030

Open
