Initial proof-of-concept of open source development (OSD) status dashboard with data-mining & visualisation components

OPEN-NEXT/wp2.2_dev

OSD status dashboard (wp2.2_dev)

Demonstrator data-mining backend for an open source development status dashboard

Targeted at hosters of version control platforms (such as Wikifactory, GitLab, or GitHub), this Python backend program mines open source hardware repositories for metadata and calculates metrics based on it. This backend exposes a representational state transfer (REST) application programming interface (API) where requests for those metrics can be made.

This software is not for general consumers to just "double click" on and install on their devices.

Please see the Install and Usage sections to get up and running with this tool.

Background

Today’s industrial product creation is expensive, risky and unsustainable. At the same time, the process is highly inaccessible to consumers who have very little input in the design and distribution of the finished product. Presently, SMEs and maker communities across Europe are coming together to fundamentally change the way we create, produce, and distribute products.

OPENNEXT is a collaboration between 19 industry and academic partners across Europe. Funded by the European Union's Horizon 2020 programme, this project seeks to enable small and medium enterprises (SMEs) to work with consumers, makers, and other communities in rethinking how products are designed and produced. Open source hardware is a key enabler of this goal where the design of a physical product is released with the freedoms for anyone to study, modify, share, and redistribute copies. These essential freedoms are based on those of open source software, which is itself derived from free software where the word free refers to freedom, not free-of-charge. When put in practice, these freedoms could potentially not only reduce proprietary vendor lock-in, planned obsolescence, or waste but also stimulate novel – even disruptive – business models. The SME partners in OPENNEXT are experimenting with producing open source hardware and even opening up the development process to wider community participation. They produce diverse products ranging from desks, cargo bike modules, to a digital scientific instrument platform (and more).

Work package 2 (WP2) of OPENNEXT is gathering theoretical and practical insights on best practices for company-community collaboration when developing open source hardware. This includes running Delphi studies to develop a maturity model to describe the collaboration and developing a precise definition for what the "source" is in open source hardware. In particular, task 2.2 in this work package is developing a demonstration project status dashboard with "health" indicators showing the evolution of a project within the maturity model; design activities; or progress towards success based on project goals. Details of the dashboard's technical architecture are described in the deliverable 2.5 (D2.5) report.

This repository contains the backend code for D2.5. To be clear, this deliverable is:

  • Designed to be deployed on a server operated by version control platforms such as Wikifactory or GitHub.

This deliverable is not:

  • For general end-users to install on consumer devices and "double click" to open.

In addition, this repository aims to follow international standards and good practices in open source development such as, but not limited to:

Install

This section assumes knowledge of Python, Git, and using a GNU/Linux-based server including installing software from package managers and running a terminal session.

Note: This software is designed to be deployed on a server by system administrators or developers, not on generic consumer devices.

This project requires Python version 3.10 or later on your server. Running it in a Python virtual environment is optional but recommended. Detailed external library dependencies are listed in the standard-conformant requirements.txt file.

In addition to Python and the dependencies listed above, the following programs must be installed and accessible from the command line:

  • git (version 2.7.4 or later)
  • pip (version 19.3.1 or later)

A GitHub personal access token is required to be available as an environment variable, because the Python scripts use it for GitHub API queries. This token is an alphanumeric string in the form of "ghp_2D5TYFikFsQ4U9KPfzHyvigMycePCPqkPgWc".
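As a minimal sketch (not part of this repository's code), a deployment script could check that the token is present before starting the server. The variable name GITHUB_TOKEN is the one used in the container deployment examples of this README; the helper name token_from_env and the format check are assumptions for illustration. The check only covers the "ghp_" form shown above, so drop it if you use a different kind of token.

```python
import os
import re

# Token form described in this README: "ghp_" followed by 36 alphanumeric
# characters (40 characters in total). Other token formats are not covered.
CLASSIC_TOKEN = re.compile(r"ghp_[A-Za-z0-9]{36}")


def token_from_env(env=os.environ, name="GITHUB_TOKEN"):
    """Return the token stored under `name`; raise if missing or malformed."""
    token = env.get(name, "")
    if not token:
        raise RuntimeError(f"{name} is not set")
    if not CLASSIC_TOKEN.fullmatch(token):
        raise RuntimeError(f"{name} does not look like a 'ghp_' token")
    return token
```

Calling token_from_env() with no arguments reads the real environment; passing a plain dictionary instead makes the check easy to try out.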

Running from source

The code can be run from source and has been tested on updated versions of GNU/Linux server operating systems including Red Hat Enterprise Linux 8.7. While effort has been made to keep the Python scripts platform-agnostic, they have not been tested under other operating systems such as BSD-derivatives, Apple macOS or Microsoft Windows as they - especially the latter two - are rarely used for hosting code such as this.

On your server, with the tools git and pip installed, run the following commands in a terminal session to retrieve the latest version of this repository and prepare it for development and running locally (usually for testing):

git clone https://github.com/OPEN-NEXT/wp2.2_dev.git
cd wp2.2_dev
pip install --user -r requirements.txt

The git command will download the files in this repository onto your server into a directory named wp2.2_dev, and pip installs the Python dependencies listed in requirements.txt.

In a terminal window at the root directory of this repository, start the server with the uvicorn Asynchronous Server Gateway Interface (ASGI) server by running this command:

uvicorn oshminer.main:app --reload

There will be some commandline output which ends with something like the following line:

INFO:     Application startup complete.

This means the server API is up and running, and should be accessible on your local machine on port 8000 at 127.0.0.1.

Deploy as container

There is a Dockerfile in this repository that defines a container within which this code can run.

To build and use the container, you need to have programs like Podman or Docker installed.

With the repository cloned by git onto your system, navigate to it and build the container with this command:

podman build -t wp22dev ./ --format=docker

Replace the command podman with docker depending on which one is available (this project has been tested with Podman 4.0.2), and wp22dev can be replaced with any other name. --format=docker is needed to explicitly build this as a Docker-formatted container that will be accepted by cloud services like Heroku.

Then, run the container on port 8000 at 127.0.0.1 with this command:

podman run --env PORT=8000 --env GITHUB_TOKEN=[token] -p 127.0.0.1:8000:8000 -d wp22dev

Where token is the 40 character alphanumeric string of your GitHub API personal access token. It is in the form of "ghp_2D5TYFikFsQ4U9KPfzHyvigMycePCPqkPgWc".

Heroku deployment example

The image built this way can be pushed to cloud hosting providers such as Heroku. With Heroku as an example:

  1. Set up an empty app from your Heroku dashboard.

  2. In the Settings page for your Heroku app, set a Config Var with Key "GITHUB_TOKEN" and Value being your GitHub API personal access token.

  3. With the Heroku commandline interface installed, first login from your terminal:

heroku container:login

  4. Push the container image built above to your Heroku app:

podman push wp22dev registry.heroku.com/[your app name]/web

  5. Release the pushed container into production:

heroku container:release web --app=[your app name]

Fly.io example

Similar to Heroku, the container image created above can be deployed to an app on Fly.io. Assuming a Fly.io account has already been created:

  1. Log in to Fly.io in a terminal session:

flyctl auth login

  2. Launch a new app. Run the following command, which will ask for an app name. Enter [your app name], replacing it with whatever name you'd like:

flyctl launch

  3. Authorise pushing a container image to the Fly.io image registry:

flyctl auth docker

  4. Push the locally built image to the remote Fly.io image registry:

podman push wp22dev registry.fly.io/[your app name]

  5. Deploy the app:

flyctl deploy --image registry.fly.io/[your app name]

  6. Set GitHub API personal access token as environmental variable:

flyctl secrets set GITHUB_TOKEN=[token]

Where token is the 40 character alphanumeric string of your GitHub API personal access token. It is in the form of "ghp_2D5TYFikFsQ4U9KPfzHyvigMycePCPqkPgWc".

A demo of this is hosted on Fly.io with this API endpoint:

https://wp22dev.fly.dev/data

This demo instance will go into a sleep state after a period of inactivity (approximately 30 minutes at time of writing). If your API calls to this endpoint are taking more than a few seconds, it might be the demo waking from that state.
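Because the demo may need time to wake, a client can allow a generous timeout and retry a few times. The following is a sketch only: call_with_retries and its parameters are hypothetical names, and fetch stands for whatever function performs the actual HTTP request.

```python
import time


def call_with_retries(fetch, attempts=3, delay_seconds=5.0, sleep=time.sleep):
    """Call `fetch()` until it succeeds or `attempts` calls have failed.

    `fetch` is any zero-argument callable that raises OSError on network
    failure (urllib.error.URLError and timeouts are OSError subclasses).
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return fetch()
        except OSError as error:
            last_error = error
            if attempt < attempts:
                sleep(delay_seconds)  # give a sleeping instance time to wake
    raise last_error
```

The sleep parameter exists only so the pause can be replaced in tests; in normal use the default time.sleep is fine.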

Usage

The backend server listens to requests for information about a list of open source hardware (and software) repositories hosted on Wikifactory or GitHub.

Making requests to the REST API

Requests to the API are GET requests to the /data endpoint, with a JSON payload in the request body.

There are two components to each request:

  1. repo_urls: An array of strings of repository URLs, such as https://wikifactory.com/+elektricworks/pikon-telescope. Currently, metadata retrieval is implemented for Wikifactory project and GitHub repository URLs. Each Wikifactory URL is composed of the Wikifactory domain (wikifactory.com), space (e.g. +elektricworks), and project (e.g. pikon-telescope).

  2. requested_data: An array of strings representing the types of repository metrics desired for each repository. Currently, the following are implemented for Wikifactory projects:

    1. files_info: The numbers and proportions of mechanical and electronic computer-assisted design (CAD), image, data, document, and other file types in the repository.
    2. files_editability: Basic information about how "editable" the CAD files are in this repository.
    3. license: The license for the repository.
    4. tags: Aggregated tags for the repository and any associated with the maintainers of that repository.
    5. commits_level: The hash identifier (contribution id for Wikifactory projects) and timestamp of each commit to the repository. This can be used to graph the commit activity level in a frontend visualisation. Note: This will be based on commits from the first three detected branches in the repository, including the default branch. This is because requesting commits across many branches takes a long time, and APIs might time out. Also note that branches are not implemented by Wikifactory, so it will behave as if there is only one branch.
    6. issues_level: Similar to commits_level, but for all issues in the repository.
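The URL composition described for repo_urls above (domain, space, project) can be illustrated with a short Python sketch. The function name split_wikifactory_url is hypothetical and this is not necessarily how the backend parses URLs internally.

```python
from urllib.parse import urlparse


def split_wikifactory_url(url):
    """Split a Wikifactory project URL into (domain, space, project)."""
    parsed = urlparse(url)
    # Path segments after the domain, ignoring empty strings from slashes.
    parts = [part for part in parsed.path.split("/") if part]
    if not parsed.hostname or len(parts) < 2:
        raise ValueError(f"Not a Wikifactory project URL: {url}")
    return parsed.hostname, parts[0], parts[1]
```

For https://wikifactory.com/+elektricworks/pikon-telescope this yields the domain wikifactory.com, the space +elektricworks, and the project pikon-telescope.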

The following is an example request that could be sent to the API for three Wikifactory projects:

{
    "repo_urls": [
        "https://wikifactory.com/+dronecoria/dronecoria-frame", 
        "https://wikifactory.com/@luzleanne/community-composter", 
        "https://wikifactory.com/+elektricworks/pikon-telescope"
    ], 
    "requested_data": [
        "files_info", 
        "files_editability", 
        "license", 
        "tags",
        "commits_level", 
        "issues_level"
    ]
}
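A request like this could be built with the Python standard library as sketched below. The function names are hypothetical, and the base URL assumes a server running locally on port 8000 as in the Install section. The sketch also assumes the server accepts a JSON body on a GET request, as described above.

```python
import json
import urllib.request


def build_data_request(base_url, repo_urls, requested_data):
    """Build (but do not send) a GET request to /data with a JSON body."""
    payload = {"repo_urls": repo_urls, "requested_data": requested_data}
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/data",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="GET",
    )


def send(request, timeout=60):
    """Send the request and return the decoded JSON response."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


request = build_data_request(
    "http://127.0.0.1:8000",
    ["https://wikifactory.com/+elektricworks/pikon-telescope"],
    ["license", "tags"],
)
# metrics = send(request)  # uncomment once the server is running
```

Building the request separately from sending it makes it possible to inspect the payload without a running server.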

API response format

The API will respond with a JSON array containing the requested_data for each repository in repo_urls.

Specifically, for each repository, the response will include:

  • repository: String containing the repository URL.
  • platform: String, only Wikifactory for now.
  • requested_data: Object containing the following:
    • files_editability: Object containing the following:
      • files_count: Integer number of (presumed to be) CAD files that are not text documents or data files (like CSV).
      • files_openness: Object containing the following:
        • open: Integer number of files using open formats.
        • closed: Integer number of files using closed/proprietary formats.
        • other: Integer number of files not categorised in either of the above.
      • files_encoding: Object containing the following:
        • binary: Integer number of files using binary formats.
        • text: Integer number of files using text-based formats.
        • other: Integer number of files not categorised in either of the above.
    • files_info: Object containing the following:
      • total_files: Integer of total number of files in the repository.
      • ecad_files: Integer number of electronic CAD files.
      • mcad_files: Integer number of mechanical CAD files.
      • image_files: Integer number of image files.
      • data_files: Integer number of data files.
      • document_files: Integer number of documentation files.
      • other_files: Integer number of other types of files.
      • ecad_proportion: Floating point proportion of electronic CAD files.
      • mcad_proportion: Floating point proportion of mechanical CAD files.
      • image_proportion: Floating point proportion of image files.
      • data_proportion: Floating point proportion of data files.
      • document_proportion: Floating point proportion of documentation files.
      • other_proportion: Floating point proportion of other types of files.
    • license: Object containing license information:
      • key: String of license identifier. Currently the same as spdx_id.
      • name: Full name of license.
      • spdx_id: String of the SPDX license identifier.
      • url: URL to license text.
      • node_id: For some licenses, this will be an identifier in GitHub's license list.
      • html_url: URL to license information.
      • permissions: Array of strings containing the permissions given by the license, which could include:
        • commercial-use: This work and derivatives may be used for commercial purposes.
        • modifications: This work may be modified.
        • distribution: This work may be distributed.
        • private-use: This work may be used and modified in private.
        • patent-use: This license provides an express grant of patent rights from contributors.
      • conditions: Array of strings expressing the conditions under which the work could be used, which could include a combination of:
        • include-copyright: A copy of the license and copyright notice must be included with the work.
        • include-copyright--source: A copy of the license and copyright notice must be included with the work when distributed in source form.
        • document-changes: Changes made to the source/documentation must be documented.
        • disclose-source: Source code/documentation must be made available when the work is distributed.
        • network-use-disclose: Users who interact with software via network are given the right to receive a copy of the source code.
        • same-license: Modifications must be released under the same license when distributing the work. In some cases a similar or related license may be used.
        • same-license--file: Modifications of existing files must be released under the same license when distributing the work. In some cases a similar or related license may be used.
        • same-license--library: Modifications must be released under the same license when distributing software. In some cases a similar or related license may be used, or this condition may not apply to works that use the software as a library.
      • limitations: Limitations of the license, which could include a combination of:
        • trademark-use: This license explicitly states that it does NOT grant trademark rights, even though licenses without such a statement probably do not grant any implicit trademark rights.
        • liability: This license includes a limitation of liability.
        • patent-use: This license explicitly states that it does NOT grant any rights in the patents of contributors.
        • warranty: The license explicitly states that it does NOT provide any warranty.
    • tags: Aggregated array of strings representing the tags associated with the repository, and tags associated with users who are maintainers/owners of the repository. The implementation of this might change as Wikifactory implements their skill-based matchmaking features.
      • Examples: open-source, raspberry-pi, space, 3d-printing
    • commits_level: Array of objects representing commits (contributions in Wikifactory), where each one would contain:
      • hash: A string, where for Git-based repositories, the unique hash identifier for the commit. For Wikifactory, this is the id field of the contribution.
      • committed: String containing the timestamp for the commit in ISO 8601 format, e.g. 2018-04-25T20:35:59.614973+00:00.
    • issues_level: Array of objects representing issues, where each one would contain:
      • id: String containing the URL to the issue.
      • published: String containing the creation date of the issue in ISO 8601 format, e.g. 2018-04-25T20:35:59.614973+00:00.
      • isResolved: Boolean (true or false) of whether the issue has been marked as closed or resolved.
      • resolved: String containing ISO 8601 formatted timestamp representing the last time there was activity in the issue (such as comments), or if the issue isResolved, the time it happened.

Notes:

  • For files_editability above, filetypes are identified by file extensions. The categories and mapping are documented in oshminer/filetypes.py, and can be traced to the osh-file-types list by Open Source Ecology Germany.
  • For files_info above, filetypes are identified by file extensions. The categories and mapping are located in oshminer/filetypes.py.
  • The license information and formatting is largely based on that from the GitHub-managed choosealicense.com repository, with the exception of some open source hardware licenses which were manually added.
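As an illustration of how a frontend might consume this response, the sketch below summarises commits_level and issues_level for one repository entry. The function names and the sample entry are made up for illustration; only the field names come from the format documented above. It assumes timestamps carry an explicit offset such as +00:00, as in the examples.

```python
from collections import Counter
from datetime import datetime


def commits_per_month(repo_entry):
    """Count commits per "YYYY-MM" month for one repository entry."""
    commits = repo_entry["requested_data"].get("commits_level", [])
    months = (
        datetime.fromisoformat(commit["committed"]).strftime("%Y-%m")
        for commit in commits
    )
    return dict(Counter(months))


def unresolved_issues(repo_entry):
    """Count issues whose isResolved flag is false."""
    issues = repo_entry["requested_data"].get("issues_level", [])
    return sum(1 for issue in issues if not issue["isResolved"])


# Made-up entry for illustration only; not real data from any repository.
sample_entry = {
    "repository": "https://wikifactory.com/+elektricworks/pikon-telescope",
    "platform": "Wikifactory",
    "requested_data": {
        "commits_level": [
            {"hash": "a1", "committed": "2018-04-25T20:35:59.614973+00:00"},
            {"hash": "b2", "committed": "2018-04-27T09:00:00+00:00"},
            {"hash": "c3", "committed": "2018-05-02T12:30:00+00:00"},
        ],
        "issues_level": [
            {
                "id": "https://example.com/issue/1",
                "published": "2018-04-25T20:35:59.614973+00:00",
                "isResolved": False,
                "resolved": "2018-04-26T08:00:00+00:00",
            },
        ],
    },
}

print(commits_per_month(sample_entry))  # {'2018-04': 2, '2018-05': 1}
print(unresolved_issues(sample_entry))  # 1
```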

Custom Wikifactory URLs

By default, this tool will:

  1. Identify a provided repository URL in the JSON request body as a Wikifactory project if it is under the domain wikifactory.com
  2. Use the public Wikifactory GraphQL API endpoint at https://wikifactory.com/api/graphql

Both can be customised with the following environmental variables during deployment:

  1. WIF_BASE_URL - (default: wikifactory.com) The base domain used for pattern-matching and identifying Wikifactory project URLs in the JSON request body in the form of example.com. If this is customised, then the requested Wikifactory project URLs passed to this tool should also use that domain instead of wikifactory.com. Otherwise, a "Repository URL domain not supported" error will be returned.
  2. WIF_API_URL - (default: https://wikifactory.com/api/graphql) The full URL of the GraphQL API endpoint to make queries regarding Wikifactory projects in the form of https://example.com[:port]/foo/bar.
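The effect of these two variables can be sketched in Python as follows. This is an illustration under stated assumptions rather than the backend's actual code: the function names are hypothetical, and the real pattern-matching may differ (for example in how subdomains are treated).

```python
import os
from urllib.parse import urlparse

# Defaults as documented above.
DEFAULT_BASE_URL = "wikifactory.com"
DEFAULT_API_URL = "https://wikifactory.com/api/graphql"


def wikifactory_settings(env=os.environ):
    """Read WIF_BASE_URL and WIF_API_URL, falling back to the defaults."""
    return {
        "base_url": env.get("WIF_BASE_URL", DEFAULT_BASE_URL),
        "api_url": env.get("WIF_API_URL", DEFAULT_API_URL),
    }


def is_wikifactory_url(url, base_url):
    """Return True if the URL's host is the base domain or a subdomain of it."""
    host = urlparse(url).hostname or ""
    return host == base_url or host.endswith("." + base_url)
```

With WIF_BASE_URL set to example.com, a URL under wikifactory.com would no longer be recognised as a Wikifactory project, which corresponds to the error case described above.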

Maintainers

Dr Pen-Yuan Hsing (@penyuan) is the current maintainer.

Dr Jérémy Bonvoisin (@jbon) was a previous maintainer who contributed greatly to this repository during the first year of the OPENNEXT project and is now an external advisor.

Contributing

Thank you in advance for your contribution. Please open an issue or submit a GitHub pull request. For more details, please look at CONTRIBUTING.md.

This project is released with a Contributor Code of Conduct. By participating in this project you agree to abide by the Contributor Covenant Code of Conduct 2.0.

Acknowledgements

The maintainer would like to gratefully acknowledge:

  • Dr Jérémy Bonvoisin (@jbon) not only for the initial contributions to this work, but also for continued practical and theoretical insight, generosity, and guidance.
  • Dr Elies Dekoninck (@elies30) and Rafaella Antoniou (@rafaellaantoniou) for valuable feedback and support.
  • Max Kampik (@mkampik), Diego Vaquero, and Andrés Barreiro from Wikifactory for close collaboration, design insights, and technical support throughout the project.
  • OPENNEXT internal reviewers Dr Jean-François Boujut (@boujut) and Martin Häuer (@moedn) for constructive criticism.
  • OPENNEXT project researchers Robert Mies (@MIE5R0), Mehera Hassan (@meherrahassan), and Sonika Gogineni (@GoSFhg) for useful feedback and extensive administrative support.
  • The Linux Foundation CHAOSS group for insights on open source community health metrics.
  • The following people for their valuable feedback via a survey (see D2.5 report for details) (in alphabetical order of last name): Jean-François Boujut (@boujut), Martin Häuer (@moedn), James Jones (CubeSpawn), Max Kampik (@mkampik), Johannes Střelka-Petz.

The work in this repository is supported by a European Union Horizon 2020 programme grant (agreement ID 869984).

License

The Python code in this repository is licensed under the GNU AGPLv3 or any later version © 2022 Pen-Yuan Hsing

This README is licensed under the Creative Commons Attribution-ShareAlike 4.0 International license (CC BY-SA 4.0) © 2022 Pen-Yuan Hsing

Details on other files are in the REUSE specification dep5 file.