Need some deployment instructions #74

Open
horsburgh opened this issue Jan 4, 2021 · 6 comments

Comments

@horsburgh
Member

We plan to continue using this system to manage the ODM2 controlled vocabularies. However, others may want to use this system to manage their own vocabularies. We need some brief instructions on how to deploy this system. There should be a link to this on the main repository readme and then a separate markdown file in a "doc" folder that has deployment instructions.

@amabdallah

Thanks for your willingness to accommodate our needs. Here are three ideas to improve the deployment of the app and generalize it for use beyond ODM purposes.

  1. I'm not familiar with how the current system is deployed. It would be very helpful if the updated system used Docker and Ansible so that it can be deployed and redeployed easily. A developer helped me set this up some 4-5 years ago.
    Check out how this works for our WaDE and my WaMDaM systems. I've deployed both apps this way many times, and I can deploy either of them on an AWS EC2 instance from scratch with a few commands.
    WaDE
    http://vocabulary.westernstateswater.org/
    https://github.com/WSWCWaterDataExchange/WaDEControlledVocabularies (Updated instructions)
    WaMDaM
    http://vocabulary.wamdam.org/
    https://github.com/WamdamProject/WaMDaM_ControlledVocabularies (older)

  2. Related to deployment, it would also be helpful if you could incorporate these functions/commands to populate (or empty) the database from an Excel workbook or, better yet, from Google Sheets.
    https://github.com/WSWCWaterDataExchange/WaDEControlledVocabularies/tree/master/cvservices/management/commands
    I know the system is set up to allow others to add new terms, but we have another use case. Our system is growing slowly as we add data and vocabularies for one state at a time. Whenever we add terms (across the tables) for a new state, we reset the db and run a mass upload from all the tables. Ideally, instead of resetting the database, we would just add the new terms and leave the existing ones as-is. Once we have all the states (by the end of this year), we can sit back and rely on the moderated system for any future minor changes.

  3. Hopefully the first two suggestions are doable, given that they worked fine for my two apps. Since I needed to modify the ODM app twice for two different applications (WaDE and WaMDaM), I thought about generalizing how the app is structured so that it can be adapted more easily to future applications, with much less work and much less knowledge of how the app works (from the user's perspective). This repo implements that idea (see the cv_models.json file, where new app tables would be defined):
    https://github.com/amabdallah/ControlledVocabulariesTemplate
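
The "add only the new terms and leave the existing ones as-is" workflow from item 2 can be sketched in plain Python. This is a minimal illustration, not the code in either repository: the `merge_terms` function and the `(vocabulary, term, definition)` row layout are assumptions made for the example.

```python
# Hypothetical sketch: rows are (vocabulary, term, definition) tuples,
# e.g. as read from one workbook sheet per vocabulary table.
def merge_terms(existing, incoming):
    """Add rows whose (vocabulary, term) key is new; never overwrite existing ones.

    Returns the merged mapping and the list of keys that were added.
    """
    merged = dict(existing)
    added = []
    for vocabulary, term, definition in incoming:
        key = (vocabulary, term)
        if key not in merged:
            merged[key] = definition
            added.append(key)
    return merged, added


existing = {("units", "meter"): "SI unit of length"}
incoming = [
    ("units", "meter", "a different definition"),  # already present: left as-is
    ("units", "acre-foot", "volume of water"),     # new: added
]
merged, added = merge_terms(existing, incoming)
print(added)
```

In a Django management command, the same idea would roughly correspond to calling `get_or_create` once per row instead of resetting the tables first.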

Supporting these three suggestions along with the new technology updates should make the app useful to broader applications and probably to federal agencies like USGS. The intent here is to make the app usable for non-IT folks like myself. Let me know if you have questions.

@jcaraballo17
Member

Hello!

I'm going to start working on these changes as soon as I can. Thank you so much, @amabdallah, for providing a guide on how you made this work with Ansible and Docker. We absolutely need this to make the web app easier for other people to deploy and access. So far, this application has been deployed manually by setting up a virtual environment, pulling the source code from git, and setting up the web server configuration.

The new commands you suggest are very easy to adapt to the new system, and it should be just as easy to make them work for every implementation without asking the users who deploy the system to change anything. It would be great if you could review these when I'm done, so let me know if this is something you would be willing to do and I'll tag you in the pull request as a reviewer.

Also, I would like to invite you to look at the version-update branch where I've been working on the updates, since I did some work along the lines of what you mention about generalizing the app structure. I've modified it so that, to make a new implementation, you just need to add a name and a description of each vocabulary in a Python dictionary, along with other optional fields to customize many aspects of the vocabulary. The point is to make it as easy as possible for anyone to implement their own CV system without having to write any models, views, templates, or API endpoints, while keeping the option to do so. This was the original idea when we started working on this project, but with my knowledge of Django 6 years ago I couldn't abstract as much as I would have liked.

With this update, the only file that needs to change to create a new implementation is odm2cvs/controlled_vocabularies.py, and if there's a vocabulary with extra fields, those can be created in odm2cvs/cv_specific_fields.py. With this information, all of the models and API endpoints are generated automatically, and you can use them without ever working with the models explicitly.
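
As a rough illustration of generating classes from a vocabulary dictionary, here is a minimal sketch that uses standard-library dataclasses in place of Django models. The dictionary layout, the `extra_fields` key, and the `build_models` function are assumptions for the example; the actual structure of odm2cvs/controlled_vocabularies.py may differ.

```python
from dataclasses import field, fields, make_dataclass

# Hypothetical layout: one entry per vocabulary, with optional extra fields.
VOCABULARIES = {
    "units": {"name": "Units", "description": "Units of measurement"},
    "medium": {
        "name": "Medium",
        "description": "Medium that was sampled",
        "extra_fields": {"category": str},
    },
}


def build_models(vocabularies):
    """Create one class per vocabulary: shared base fields plus any extras."""
    models = {}
    for key, spec in vocabularies.items():
        # Fresh field definitions per class, so no Field object is shared.
        spec_fields = [
            ("term", str),
            ("name", str),
            ("definition", str, field(default="")),
        ]
        for extra_name, extra_type in spec.get("extra_fields", {}).items():
            spec_fields.append((extra_name, extra_type, field(default=None)))
        model = make_dataclass(spec["name"], spec_fields)
        model.__doc__ = spec["description"]
        models[key] = model
    return models


models = build_models(VOCABULARIES)
air = models["medium"](term="air", name="Air", category="gas")
print(air)
```

Adding a vocabulary then means adding one dictionary entry; no class has to be written by hand.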

I think it would be good to talk and share ideas about how to abstract the implementation even more at some point, so we could set up a Zoom call to go over the details of how the app changed with this new update and how to make it more accessible and easier to deploy in the future.

Let me know if you have any questions or suggestions!

@amabdallah

amabdallah commented Jan 13, 2021

Sounds great! Thanks, @jcaraballo17 and @horsburgh, for considering the suggestions.
Sure, let's have a Zoom call to discuss the next steps. My schedule is flexible next week on Tuesday the 19th, or on the 21st or 22nd.

@amabdallah

amabdallah commented Jan 14, 2021

@jcaraballo17 FYI

I just redeployed the app from scratch and ran into these two new issues, which are related to outdated code. Something to keep in mind for the next update:

  1. Loading the database from an Excel file with all the sheets. It used to work fine with xlsx files, but now it only works with xls.
  File "/usr/local/lib/python2.7/site-packages/xlrd/__init__.py", line 170, in open_workbook
    raise XLRDError(FILE_FORMAT_DESCRIPTIONS[file_format]+'; not supported')
xlrd.biffh.XLRDError: Excel xlsx file; not supported
  2. I got this email from GitHub about this deprecated call:

Hi @amabdallah,

You recently used a password to access the repository at amabdallah/WaDE2.0_CVs with git using git/2.17.1.

Basic authentication using a password to Git is deprecated and will soon no longer work. Visit https://github.blog/2020-12-15-token-authentication-requirements-for-git-operations/ for more information around suggested workarounds and removal dates.

Thanks,
The GitHub Team

@jcaraballo17
Member

jcaraballo17 commented Jan 15, 2021

@amabdallah Great, see you next week!

> I just redeployed the app from scratch and I faced these two new issues which are related to outdated old code. Something to keep in mind for the next update
>
>   1. loading the database from an Excel file with all the sheets. It used to work fine with xlsx files but now it only works with xls.
>   File "/usr/local/lib/python2.7/site-packages/xlrd/__init__.py", line 170, in open_workbook
>     raise XLRDError(FILE_FORMAT_DESCRIPTIONS[file_format]+'; not supported')
> xlrd.biffh.XLRDError: Excel xlsx file; not supported

This is an issue with xlrd, the library you're using in that Django command: newer versions of it (2.0 and later) no longer support reading XLSX files. If you need it to work right now, you can install the version of xlrd that you're using in your other projects. To fix the issue without relying on old library versions, the best solution I found online is to use another library called openpyxl:
https://stackoverflow.com/questions/65254535/xlrd-biffh-xlrderror-excel-xlsx-file-not-supported
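
A minimal sketch of what that switch could look like, assuming Python 3 and that openpyxl is installed: it picks a reader by file extension, so .xls files keep going through xlrd while .xlsx files go through openpyxl. The function names `pick_reader` and `read_sheets` are made up for the example and are not taken from the existing management command.

```python
import os


def pick_reader(path):
    """Choose a library by file extension: openpyxl for .xlsx, xlrd for .xls."""
    extension = os.path.splitext(path)[1].lower()
    if extension == ".xlsx":
        return "openpyxl"
    if extension == ".xls":
        return "xlrd"
    raise ValueError("Unsupported workbook format: {!r}".format(extension))


def read_sheets(path):
    """Return {sheet_name: [row tuples]} for an .xls or .xlsx workbook."""
    if pick_reader(path) == "openpyxl":
        # Third-party dependency, imported lazily: pip install openpyxl
        from openpyxl import load_workbook

        workbook = load_workbook(path, read_only=True, data_only=True)
        return {
            sheet.title: [tuple(row) for row in sheet.iter_rows(values_only=True)]
            for sheet in workbook.worksheets
        }
    # Third-party dependency, imported lazily: pip install xlrd
    import xlrd

    book = xlrd.open_workbook(path)
    return {
        sheet.name: [tuple(sheet.row_values(i)) for i in range(sheet.nrows)]
        for sheet in book.sheets()
    }


print(pick_reader("vocabularies.xlsx"))
```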

>   2. I got this email from Github about this deprecated call
>
> Hi @amabdallah,
> You recently used a password to access the repository at amabdallah/WaDE2.0_CVs with git using git/2.17.1.
> Basic authentication using a password to Git is deprecated and will soon no longer work. Visit https://github.blog/2020-12-15-token-authentication-requirements-for-git-operations/ for more information around suggested workarounds and removal dates.

In December of last year, GitHub announced that it is deprecating password-based (basic) authentication for Git operations. From now on, everyone will have to create a token and use it in place of a password to authenticate with Git.
Go to https://docs.github.com/en/free-pro-team@latest/github/authenticating-to-github/creating-a-personal-access-token and follow the instructions to get a personal access token. You have to switch every server that uses Git over to that token before August of this year, which is when GitHub stops accepting basic authentication altogether.
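
For scripted deployments, one possible way to use the token is to put it where the password used to go in the HTTPS remote URL. The sketch below is only an illustration under that assumption: the `GITHUB_TOKEN` environment variable name and both helper functions are hypothetical, and a credential helper or deploy key may be a better fit than embedding a token in a URL.

```python
import os
from urllib.parse import quote


def authenticated_remote(owner, repo, username, token):
    """Build an HTTPS remote URL that carries the token in place of a password."""
    return "https://{}:{}@github.com/{}/{}.git".format(
        quote(username, safe=""), quote(token, safe=""), owner, repo
    )


def redact(url, token):
    """Hide the token so the URL can be written to a deployment log."""
    return url.replace(quote(token, safe=""), "***")


# Hypothetical variable name; falls back to a placeholder for demonstration.
token = os.environ.get("GITHUB_TOKEN") or "example-token"
url = authenticated_remote("amabdallah", "WaDE2.0_CVs", "amabdallah", token)
print(redact(url, token))
```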

@amabdallah

Thanks, @jcaraballo17, for your help on these two issues. Ideally, the GitHub one needs to be incorporated into the Ansible deployment.

By the way, it would be a plus if you could add a few helpful error messages for things that may go wrong during deployment. It took me a while to understand and debug problems in the past.
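
One way such messages could look is a small pre-flight check that runs before deployment and says exactly which setting is missing and what it is for. This is only a sketch: the setting names and the `check_settings` function are hypothetical, not taken from the app.

```python
import os

# Hypothetical setting names; substitute whatever the deployment really needs.
REQUIRED_SETTINGS = {
    "DATABASE_URL": "connection string for the vocabulary database",
    "SECRET_KEY": "Django secret key",
}


def check_settings(environ, required=REQUIRED_SETTINGS):
    """Return one readable message per missing or empty setting."""
    problems = []
    for name, purpose in required.items():
        if not environ.get(name):
            problems.append(
                "Missing {}: set it to the {} before deploying.".format(name, purpose)
            )
    return problems


for message in check_settings(os.environ):
    print(message)
```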


3 participants