reading the BRO data from XML files #105

Open
dbrakenhoff opened this issue Mar 2, 2023 Discussed in #104 · 7 comments
Labels
enhancement New feature or request

Comments

@dbrakenhoff
Collaborator

Separate the reading of XML files from the API request so users can read manually downloaded XML files.

Discussed in #104

Originally posted by rt84ro March 2, 2023
Hi everyone,
I have downloaded some wells from the BROloket website, but they are in .xml format. I want to read them with HydroPandas, but I do not know how to open these files. Could you please let me know how to read them?

@dbrakenhoff dbrakenhoff added the enhancement New feature or request label Mar 2, 2023
@HMEUW
Collaborator

HMEUW commented Mar 15, 2023

I would like to implement this. My starting point is to add an elif branch to from_bro, as in the code below, together with a new function get_obs_list_from_file. Do you agree?

if bro_id is None and (extent is not None):
    obs_list = get_obs_list_from_extent(
        extent,
        [..]
    )
    meta = {}
elif bro_id is not None:
    obs_list, meta = get_obs_list_from_gmn(
        [..]
    )
    name = meta.pop("name")
elif (extent is None) and (bro_id is None) and (fn is not None):
    obs_list = get_obs_list_from_file(
        fn,
    )
    meta = {}
else:
    raise ValueError("specify bro_id, extent or fn")
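
The branch order in the if/elif chain above can be isolated in a small helper that only decides which route to take. This is a minimal sketch, not HydroPandas code: select_bro_route and its return labels are hypothetical names, and fn is assumed to be the path of a local XML file.

```python
def select_bro_route(bro_id=None, extent=None, fn=None):
    """Return the route the arguments select: 'extent', 'gmn' or 'file'.

    Hypothetical helper that mirrors the precedence of the if/elif chain:
    an extent without bro_id comes first, then bro_id, then a local file.
    """
    if bro_id is None and extent is not None:
        return "extent"
    if bro_id is not None:
        return "gmn"
    if fn is not None:
        return "file"
    # none of the three inputs was given
    raise ValueError("specify bro_id, extent or fn")
```

Note that with this ordering fn is silently ignored when bro_id or extent is also passed, so it may be worth deciding whether that combination should raise instead.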

@dbrakenhoff
Collaborator Author

dbrakenhoff commented Mar 15, 2023

Yes, seems logical to me. And in io_bro the code to parse the XML file would then be separated and called in each of the get_obs_list_from_* methods?

Thanks for picking this up!

@HMEUW
Collaborator

HMEUW commented Apr 5, 2023

Yesterday I started on this issue. I think we need two extra arguments to implement it: origin and local_path.

My own use case is in the last row of the table.

| Use case | Comment | Value of origin | Value of local_path |
| --- | --- | --- | --- |
| Get data from Broloket.nl, within extent | Like current behaviour of function | internet | None |
| Get data from Broloket.nl, within extent, and save BRO XMLs for future use | Like above, but save downloaded data | internet | path where zip will be created |
| Use downloaded data from Broloket.nl, read all data | User has downloaded data available, via manual download or case in row above | local | path to zip |
| Local GMW files that have to be uploaded to bronhouderportaal-bro, read all data | To check data in these files before submission | local_bronhouder | path to zip |

I added bronhouder to origin in the last use case, because these files differ slightly from the broloket.nl files. For example, the BRO id is not available, because the file has not yet been sent to broloket.nl. The easiest way to handle this is a separate origin value.
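
To make the four rows of the table concrete, here is a minimal sketch of how the two arguments could be validated together. The function name resolve_origin and its return convention are assumptions for illustration only; the origin values and the meaning of local_path are taken from the table.

```python
from pathlib import Path

VALID_ORIGINS = ("internet", "local", "local_bronhouder")


def resolve_origin(origin="internet", local_path=None):
    """Validate (origin, local_path) and return a (read_from, save_to) tuple.

    Hypothetical helper: each element is a Path or None.
    """
    if origin not in VALID_ORIGINS:
        raise ValueError(f"origin must be one of {VALID_ORIGINS}, got {origin!r}")
    path = None if local_path is None else Path(local_path)
    if origin == "internet":
        # download through the API; optionally save the XMLs to a zip at `path`
        return None, path
    if path is None:
        raise ValueError(f"origin={origin!r} requires local_path")
    # both local origins read from an existing zip and save nothing
    return path, None
```

With this convention, resolve_origin("internet", "bro.zip") covers the second row of the table, and both local origins share the same path handling, so only the XML parsing needs to differ for bronhouder files.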

What do you think about this approach @dbrakenhoff?

HMEUW added a commit to HMEUW/hydropandas that referenced this issue Apr 14, 2023
@dbrakenhoff
Collaborator Author

Thanks for the clear overview!

I think we should create two routes for getting data from the BRO, one API (internet) route and one local file route. So both ObsCollection and GroundwaterObs should get a from_bro() method for downloading data from the internet through the BRO API. I like your suggestion for storing this downloaded data, so these methods should accept some sort of directory or filename for storing the downloaded data.

Then I would suggest a separate route for reading the local files, from_bro_file/dir/local(); I am not sure what the name should be yet, but something along those lines. These methods accept a directory/zip (in the case of ObsCollection), or a filename (in the case of Observation).

I think separating these two is probably clearest and makes the code less complicated.

Then bro.py should contain something like the following functions:

  • read_bro_xml() --> reads single BRO XML file
  • read_bro_dir() --> reads directory or zipfile with one or multiple XML files, basically calls the read_bro_xml() method in a loop.
  • replace the XML parsing in the current API functions with the read functions listed above.
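
A minimal sketch of the first two functions in the list above, using only the standard library. It stops at parsing: each file is returned as an xml.etree.ElementTree root element, and converting those elements into observations is left out because it depends on the BRO schema. The return type and the handling of non-XML files are assumptions, not the HydroPandas implementation.

```python
import xml.etree.ElementTree as ET
import zipfile
from pathlib import Path


def read_bro_xml(source):
    """Parse one XML file (a path or an open binary file) and return its root element."""
    return ET.parse(source).getroot()


def read_bro_dir(path):
    """Read every XML file in a directory or zip file.

    Returns a dict that maps each file name to its parsed root element.
    Files without an .xml extension are skipped.
    """
    path = Path(path)
    roots = {}
    if path.is_dir():
        for fname in sorted(path.glob("*.xml")):
            roots[fname.name] = read_bro_xml(fname)
    elif zipfile.is_zipfile(path):
        with zipfile.ZipFile(path) as zf:
            for name in sorted(zf.namelist()):
                if name.lower().endswith(".xml"):
                    # parse straight from the archive without extracting to disk
                    with zf.open(name) as f:
                        roots[name] = read_bro_xml(f)
    else:
        raise ValueError(f"{path} is not a directory or zip file")
    return roots
```

Because read_bro_xml accepts either a path or a file object, the API functions could pass it a downloaded response wrapped in io.BytesIO, which would cover the third bullet.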

@HMEUW, let me know what you think about this?

PS. I realize we're probably not very consistent across data sources in how we expose local vs API routes, but we should probably address that in a separate issue. At this moment I'd vote for separating the two routes for each data source.

@HMEUW
Collaborator

HMEUW commented Jun 13, 2023

I just completed a first version that reads XMLs for newly constructed wells. These are submitted to https://www.bronhouderportaal-bro.nl. @OnnoEbbens, please have a look at this code. The full_meta function is not working yet. The code is in the 'add-bronh' branch.

@HMEUW
Collaborator

HMEUW commented Jul 7, 2023

I started on reading local BROloket files in the branch import-broloket-from file. I cannot link to it directly here.

@HMEUW
Collaborator

HMEUW commented Jul 17, 2023

I just committed my work. My holiday starts after this week. I cannot work on this before then, and I expect to have no time in the two weeks after my holiday either.

If someone else wants to pick this up in July or August, that is fine.
