reading the BRO data from XML files #105

dbrakenhoff · 2023-03-02T16:15:52Z

Separate the reading of XML files from the API request so users can read manually downloaded XML files.

Discussed in #104

Originally posted by rt84ro March 2, 2023
Hi everyone,
I have downloaded some wells from the BROloket website, but they are in .xml format. I want to read them with HydroPandas, but I do not know how to open these files. Could you please let me know how to read them?

HMEUW · 2023-03-15T12:23:54Z

I would like to implement this. My starting point is to add an elif branch to from_bro, as in the code below, plus a new function get_obs_list_from_file. Do you agree?

if bro_id is None and (extent is not None):
    obs_list = get_obs_list_from_extent(
        extent,
        [..]
    )
    meta = {}
elif bro_id is not None:
    obs_list, meta = get_obs_list_from_gmn(
        [..]
    )
    name = meta.pop("name")
elif (extent is None) and (bro_id is None) and (fn is not None):
    obs_list = get_obs_list_from_file(
        fn,
    )
    meta = {}
else:
    raise ValueError("specify bro_id or extent")
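The precedence in the branches above (bro_id first, then extent, then fn) can be checked in isolation. The following is only a minimal sketch: `select_bro_source` is a hypothetical helper name, not an existing HydroPandas function, and it calls nothing from the library. The error message also mentions fn, which is an assumption about how the final message might read.

```python
def select_bro_source(bro_id=None, extent=None, fn=None):
    """Return which data source the arguments select.

    Hypothetical helper (not part of HydroPandas). It mirrors the
    precedence of the if/elif chain in the proposal:
    bro_id wins over extent, and extent wins over fn.
    """
    if bro_id is not None:
        return "gmn"      # monitoring network requested via the API
    if extent is not None:
        return "extent"   # all wells within an extent via the API
    if fn is not None:
        return "file"     # manually downloaded XML file
    raise ValueError("specify bro_id, extent or fn")


print(select_bro_source(fn="well.xml"))
```

Keeping the decision in one small function makes it easy to test the new file branch without touching the API requests.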

dbrakenhoff · 2023-03-15T14:50:42Z

Yes, seems logical to me. And in io_bro the code to parse the XML file would then be separated and called in each of the get_obs_list_from_* methods?

Thanks for picking this up!

HMEUW · 2023-04-05T06:59:17Z

Yesterday I started on this issue. I think we need two extra arguments to implement it: origin and local_path.

My own use case is in the last row of the table.

| Use case | Comment | Value of `origin` | Value of `local_path` |
| --- | --- | --- | --- |
| Get data from Broloket.nl, within extent | Like the current behaviour of the function | internet | None |
| Get data from Broloket.nl, within extent, and save BRO XMLs for future use | Like above, but save the downloaded data | internet | path where zip will be created |
| Use downloaded data from Broloket.nl, read all data | User has downloaded data available, via manual download or the case in the row above | local | path to zip |
| Local GMW files that have to be uploaded to bronhouderportaal-bro, read all data | To check the data in these files before submission | local_bronhouder | path to zip |

I added bronhouder to origin in the last use case because these files differ slightly from the broloket.nl files. For example, the BROid is not available, because the file has not yet been sent to broloket.nl. Using a separate value is the easiest way to handle this.
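To make the four rows of the table concrete, here is a rough sketch of how the two arguments could be validated and turned into a plan. `plan_bro_read` and the keys of the returned dict are hypothetical names chosen for illustration; only the values of origin and local_path come from the table.

```python
from pathlib import Path

# Values of `origin` taken from the table above.
VALID_ORIGINS = ("internet", "local", "local_bronhouder")


def plan_bro_read(origin="internet", local_path=None):
    """Translate origin/local_path into a simple plan (hypothetical helper).

    Returns a dict with:
      download      -> True if data must be requested from the internet
      save_to       -> zip path to store downloaded XMLs, or None
      read_from     -> zip path to read local XMLs from, or None
      expect_bro_id -> False for bronhouder files, which have no BROid yet
    """
    if origin not in VALID_ORIGINS:
        raise ValueError(f"origin must be one of {VALID_ORIGINS}")
    path = None if local_path is None else Path(local_path)
    if origin == "internet":
        return {"download": True, "save_to": path,
                "read_from": None, "expect_bro_id": True}
    if path is None:
        raise ValueError("local_path is required when origin is not 'internet'")
    return {"download": False, "save_to": None, "read_from": path,
            "expect_bro_id": origin == "local"}


print(plan_bro_read("local_bronhouder", "wells.zip"))
```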

What do you think about this approach @dbrakenhoff?

ArtesiaWater#105

dbrakenhoff · 2023-04-14T08:59:57Z

Thanks for the clear overview!

I think we should create two routes for getting data from the BRO, one API (internet) route and one local file route. So both ObsCollection and GroundwaterObs should get a from_bro() method for downloading data from the internet through the BRO API. I like your suggestion for storing this downloaded data, so these methods should accept some sort of directory or filename for storing the downloaded data.

Then I would suggest a separate route for reading the local files, from_bro_file/dir/local(), not sure what the name should be yet, but something along those lines. These methods accept a directory/zip (in the case of ObsCollection), or a filename (in the case of Observation).

I think separating these two is probably clearest and makes the code less complicated.

Then bro.py should contain something like the following functions:

read_bro_xml() --> reads a single BRO XML file
read_bro_dir() --> reads a directory or zipfile with one or more XML files; basically calls read_bro_xml() in a loop.
Then replace the XML parsing in the current API functions with the read functions listed above.
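As a rough illustration of how these two functions could fit together, here is a sketch using only the Python standard library. It is not the HydroPandas implementation: it assumes nothing about the BRO XML schema and simply returns the parsed root elements, leaving the conversion to observations out of scope.

```python
import xml.etree.ElementTree as ET
import zipfile
from pathlib import Path


def read_bro_xml(source):
    """Parse one XML document and return its root element.

    `source` may be a path or an open binary file object.
    """
    return ET.parse(source).getroot()


def read_bro_dir(path):
    """Read all XML files in a directory or zip archive.

    Returns a dict mapping file name to parsed root element.
    Only top-level '*.xml' files of a directory are read.
    """
    path = Path(path)
    roots = {}
    if path.is_dir():
        for fn in sorted(path.glob("*.xml")):
            roots[fn.name] = read_bro_xml(fn)
    elif zipfile.is_zipfile(path):
        with zipfile.ZipFile(path) as zf:
            for name in sorted(zf.namelist()):
                if name.lower().endswith(".xml"):
                    # Parse straight from the archive, no extraction needed.
                    with zf.open(name) as f:
                        roots[name] = read_bro_xml(f)
    else:
        raise ValueError(f"{path} is neither a directory nor a zip file")
    return roots
```

Because read_bro_xml() also accepts a file object, the same function could in principle be reused by the API route, for example by wrapping downloaded bytes in io.BytesIO.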

@HMEUW, let me know what you think about this?

PS. I realize we're probably not very consistent across data sources in how we expose local vs API routes, but we should probably address that in a separate issue. At this moment I'd vote for separating the two routes for each data source.

HMEUW · 2023-06-13T10:12:23Z

I just completed a first version for reading XMLs of newly constructed wells. These are submitted to https://www.bronhouderportaal-bro.nl. @OnnoEbbens please have a look at this code. The full_meta function is not working yet. The code is in the 'add-bronh' branch.

HMEUW · 2023-07-07T11:57:47Z

I started on reading local BROloket files in the branch import-broloket-from file. I cannot make a direct link here.

HMEUW · 2023-07-17T17:33:33Z

Just committed my work. I am on holiday after this week. I cannot work on it before then, and I expect to have no time in the two weeks after my holiday either.

If someone else wants to pick this up in July or August, that is okay.

dbrakenhoff added the enhancement New feature or request label Mar 2, 2023

HMEUW added a commit to HMEUW/hydropandas that referenced this issue Apr 14, 2023

share current scripts about BRO import from file

6d2fbe4

ArtesiaWater#105

martinvonk mentioned this issue Mar 18, 2024

Cannot read BRO uitgiftedocument #195

Closed
