Decide whether or not organizational units are in scope #61

Open
augusto-herrmann opened this issue Apr 1, 2014 · 9 comments
Labels
Data model Changes to schema and how to represent data

Comments

@augusto-herrmann
Collaborator

Which of these is in scope for this project's data:

a) only organizations (as in org:Organization); or
b) organizations and their respective hierarchy of organizational units (as in org:OrganizationalUnit)?

@augusto-herrmann
Collaborator Author

Some countries' data (e.g. Switzerland) already seem to include organizational units.

If it's in scope I would also include this data from Brazil, as it is available.

@rufuspollock
Member

@augusto-herrmann @hannesgassert I think it would be good to add these, but we should first agree on the column name and its meaning and add it to datapackage.json ...

@augusto-herrmann
Collaborator Author

How about this?

          {
            "id": "type",
            "type": "string",
            "description": "Type of entry: 'o' for organization level, 'ou' for organizational unit level"
          },

@rufuspollock
Member

@augusto-herrmann seems sensible though I dislike "type" as it is so overloaded. Perhaps "organizationType" or "organization-level" might be better.

/cc @hannesgassert

@augusto-herrmann
Collaborator Author

Agreed.

But we should use "organization_level" (with an underscore) in order to be consistent with the word separation scheme used in the rest of the column names.
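
For illustration, the renamed field could be appended to each resource's schema in datapackage.json roughly like this. It is only a sketch: the helper name `add_field`, the assumption that every resource keeps its columns under `schema.fields`, and the example path `data/br.csv` are hypothetical; the descriptor keys simply follow the snippet proposed above.

```python
import json

# Field descriptor from the proposal above, using the agreed column name.
ORGANIZATION_LEVEL_FIELD = {
    "id": "organization_level",
    "type": "string",
    "description": "Type of entry: 'o' for organization level, "
                   "'ou' for organizational unit level",
}


def add_field(datapackage: dict, field: dict) -> dict:
    """Append `field` to every resource schema that does not already list it.

    Assumption: each resource stores its columns in schema["fields"].
    The descriptor is modified in place and also returned.
    """
    for resource in datapackage.get("resources", []):
        fields = resource.setdefault("schema", {}).setdefault("fields", [])
        if not any(f.get("id") == field["id"] for f in fields):
            fields.append(dict(field))
    return datapackage


if __name__ == "__main__":
    # Toy descriptor with a hypothetical path; the real file has more keys.
    package = {
        "resources": [
            {"path": "data/br.csv",
             "schema": {"fields": [{"id": "name", "type": "string"}]}}
        ]
    }
    print(json.dumps(add_field(package, ORGANIZATION_LEVEL_FIELD), indent=2))
```

Running the helper twice leaves a single copy of the field, so it could be applied safely to files that were already updated.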

@augusto-herrmann augusto-herrmann added the Data model Changes to schema and how to represent data label Jan 19, 2016
@augusto-herrmann augusto-herrmann self-assigned this Jan 19, 2016
@augusto-herrmann
Collaborator Author

If no one opposes this change to the data model, I should add this soon-ish.

Existing data should be updated with the new organization_level column, with its cells kept blank until they can be filled in from official sources.
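
A minimal sketch of that migration step, assuming the data files are plain comma-separated CSV with a header row. The function name `add_blank_column` and the sample column names in the usage line are hypothetical.

```python
import csv
import io


def add_blank_column(csv_text: str, column: str = "organization_level") -> str:
    """Return `csv_text` with `column` appended and left empty in every row.

    If the column already exists, the data is written back unchanged.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    fieldnames = list(reader.fieldnames or [])
    if column not in fieldnames:
        fieldnames.append(column)

    out = io.StringIO()
    # restval="" fills the new column with a blank cell for every row.
    writer = csv.DictWriter(out, fieldnames=fieldnames, restval="",
                            lineterminator="\n")
    writer.writeheader()
    for row in reader:
        writer.writerow(row)
    return out.getvalue()


if __name__ == "__main__":
    print(add_blank_column("id,name\nbr-1,Some Ministry\n"))
```

Blank cells keep "not yet known" distinct from either level, which matches the idea of filling them in later from official sources.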

@todrobbins
Contributor

I think we should be verbose in the values and list the organization_level as Organization or Organizational Unit.

Or another proposal would be:

  • "organization_level": "https://www.w3.org/ns/org#Organization"
  • "organization_level": "https://www.w3.org/ns/org#OrganizationalUnit"
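
To compare the options side by side, here is a rough sketch that maps the short codes from the earlier proposal to either the verbose labels or the URIs listed above. The function name `normalize_level` and the choice to pass blank cells through unchanged are assumptions for illustration, not something agreed in this thread.

```python
# Short code -> (verbose label, URI), using the values proposed in this thread.
LEVELS = {
    "o": ("Organization", "https://www.w3.org/ns/org#Organization"),
    "ou": ("Organizational Unit",
           "https://www.w3.org/ns/org#OrganizationalUnit"),
}


def normalize_level(value: str, as_uri: bool = False) -> str:
    """Convert a short code, label, or URI to the chosen representation.

    Blank input stays blank (cells not yet filled in); anything
    unrecognized raises ValueError instead of being guessed.
    """
    cleaned = value.strip()
    if not cleaned:
        return ""
    for code, (label, uri) in LEVELS.items():
        if cleaned.lower() in (code, label.lower(), uri.lower()):
            return uri if as_uri else label
    raise ValueError(f"unknown organization_level: {value!r}")


if __name__ == "__main__":
    print(normalize_level("ou"))
    print(normalize_level("o", as_uri=True))
```

Whichever representation is chosen, keeping the mapping in one place would make it cheap to switch later.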

@augusto-herrmann
Collaborator Author

Here's another idea: add organizational units in a different CSV file.

Organizational units can be very numerous, on the order of several thousand per country. Their structure also tends to change much more often. Putting them in a separate file will make downloading easier for people who are only interested in the main organizations. It would also be possible to have a different update schedule for them.

The main organizations would remain where they are, at /data. The complete files, with main organizations and units together, could be put in a subfolder named /data/organizational_units, so we would have a new folder of CSV files with the same names and the same schema as the main ones, but much larger.
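
As a rough sketch of how the two folders could relate: the complete file would be the source, and the smaller main file could be derived from it by filtering. This assumes the complete file carries an organization_level column with 'o' marking main organizations; the helper names and the example file name `br.csv` are hypothetical.

```python
import csv
import io
from pathlib import PurePosixPath


def units_path(main_path: str) -> str:
    """Map a main file path to its counterpart in organizational_units/."""
    p = PurePosixPath(main_path)
    return str(p.parent / "organizational_units" / p.name)


def main_organizations_only(complete_csv: str,
                            level_column: str = "organization_level",
                            org_value: str = "o") -> str:
    """Keep only the rows of the complete file that are main organizations."""
    reader = csv.DictReader(io.StringIO(complete_csv))
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=list(reader.fieldnames or []),
                            lineterminator="\n")
    writer.writeheader()
    for row in reader:
        if row.get(level_column) == org_value:
            writer.writerow(row)
    return out.getvalue()


if __name__ == "__main__":
    print(units_path("data/br.csv"))
    complete = "id,organization_level\nmin-a,o\nmin-a-unit-1,ou\n"
    print(main_organizations_only(complete))
```

Deriving the main file this way would keep both files on the same schema while letting the larger one follow its own update schedule.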

@augusto-herrmann
Collaborator Author

That file would contain the full structure of government down to the smallest internal unit. This data tends to get very large very quickly and update very frequently.

We recently started publishing a daily CSV of this for Brazil, and it's a 124 MB file. That is not so large in itself, but tracking its changes in Git may make the repository a lot slower and unwieldy.

I'm open to discussing other alternatives, or whether it is really OK to store a file this large, and this frequently updated, in a Git repo.

Your thoughts, @todrobbins, @rufuspollock, @hannesgassert?
