WIP: Add workspace packages #1892

Draft
wants to merge 5 commits into
base: develop

JimMadge
Member

@JimMadge JimMadge commented May 16, 2024

✅ Checklist

  • You have given your pull request a meaningful title (e.g. Enable foobar integration rather than 515 foobar).
  • You are targeting the appropriate branch. If you're not certain which one this is, it should be develop.
  • Your branch is up-to-date with the target branch (it probably was when you started, but it may have changed since then).

🚦 Depends on

#1909 - Shouldn't dive too far into specifying packages/configuration until we have decided on the distro.

⤴️ Summary

  • Add ansible playbook for desired state
    • Install apt packages
    • Configure shell/prompts
    • ClamAV configuration
    • Auditd configuration

🌂 Related issues

Closes #1574
Closes #1783

🔬 Tests


github-actions bot commented May 16, 2024

Coverage report

Lines missing coverage (new statements), by file:

  • data_safe_haven/infrastructure/common/transformations.py: 16, 31, 39, 47, 80, 88, 96, 104, 112
  • data_safe_haven/infrastructure/programs/sre/application_gateway.py: 93
  • data_safe_haven/infrastructure/programs/sre/data.py: 467

This report was generated by python-coverage-comment-action

@JimMadge JimMadge mentioned this pull request May 17, 2024
5 tasks
@JimMadge
Member Author

I'm looking at options for getting the files onto the workspace VMs. The constraints are:

  • Needs to happen early after deployment
    • Will have Linux utils and some basics from cloud-init (curl, bash, sed, but not az)
  • Should happen on all workspaces
    • Easier to avoid authentication if possible
  • Data is non-sensitive
    • No need for authentication for read access

The good solutions I can see are:

  1. Blob container with anonymous read access
    • Use the existing data_configuration storage account (is that appropriate, @jemrobinson?)
    • Could create a new storage account if we are worried about enabling anonymous access
    • Enable anonymous read access
    • Networking rules only allow access from inside the SRE
    • Upload a manifest file (as you need list permissions to list all files)
    • Script to fetch the manifest, then fetch every file in the manifest using curl
  2. Blob container with SFTP access
    • Storage account needs hierarchical namespace
    • Enable SFTP
    • Create a local user with read/list permissions
      • with a machine password shared by all workspaces, or
      • with SSH key pairs generated on each workspace
    • Script to scp from SFTP using the local account/password/pubkey
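To make option 1 concrete, here is a minimal sketch of the "fetch the manifest, then fetch every file" step. It is written in Python for illustration only; on a freshly deployed workspace the same logic would more likely be a short bash/curl script, since only basic utilities are available there. The manifest format (one relative path per line, `#` for comments), the name `manifest.txt`, and the function names are all assumptions, not anything decided in this PR.

```python
from pathlib import Path
from urllib.parse import quote
from urllib.request import urlopen


def manifest_urls(container_url: str, manifest_text: str) -> list[tuple[str, str]]:
    """Return (relative_path, url) pairs for each usable manifest line.

    Blank lines and lines starting with '#' are skipped (assumed format).
    """
    base = container_url.rstrip("/") + "/"
    pairs = []
    for line in manifest_text.splitlines():
        path = line.strip()
        if not path or path.startswith("#"):
            continue
        # quote() keeps '/' intact, so nested paths map onto blob names
        pairs.append((path, base + quote(path)))
    return pairs


def fetch_all(container_url: str, destination: Path,
              manifest_name: str = "manifest.txt") -> None:
    """Download the manifest anonymously, then every file it lists."""
    base = container_url.rstrip("/") + "/"
    with urlopen(base + manifest_name) as response:
        manifest_text = response.read().decode("utf-8")
    for relative_path, url in manifest_urls(container_url, manifest_text):
        target = destination / relative_path
        target.parent.mkdir(parents=True, exist_ok=True)
        with urlopen(url) as response:
            target.write_bytes(response.read())
```

No credentials appear anywhere, which is the appeal of this option; the cost is that any file missing from the manifest is silently never fetched.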

Non-options:

  1. NFS: No easy way for us/admins to put files into NFS share
  2. BlobFuse or SMB: more pre-requisites

Using an anonymous-access blob container over HTTP is nice because it needs no authentication. However, you have to keep the manifest up to date, which could be error-prone.
SFTP might be the best option, at the cost of storing the SFTP password and writing it to each workspace VM.
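One way to reduce the risk of a stale manifest would be to generate it from the directory being uploaded rather than maintaining it by hand. A minimal sketch, assuming the same hypothetical format as above (one relative POSIX path per line, in a file called `manifest.txt` at the top level):

```python
from pathlib import Path


def build_manifest(root: Path, manifest_name: str = "manifest.txt") -> str:
    """List every file under root, one relative path per line, sorted.

    The top-level manifest file itself is excluded so that regenerating
    the manifest does not make it list itself.
    """
    paths = sorted(
        relative
        for relative in (
            p.relative_to(root).as_posix() for p in root.rglob("*") if p.is_file()
        )
        if relative != manifest_name
    )
    return "".join(f"{path}\n" for path in paths)
```

Running this as part of whatever step uploads the files would keep the manifest and the container contents in sync, although it does not remove the need for the manifest itself.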

Thoughts, @jemrobinson @craddm?

@jemrobinson
Member

jemrobinson commented May 21, 2024

Is the cloud-init Ansible support (https://cloudinit.readthedocs.io/en/latest/reference/modules.html#ansible) any help here?

Some thoughts on your suggestions:

  1. data_configuration storage account
    • Access is allowed from known data_configuration_ip_addresses
    • Currently doesn't have any blob containers (only file shares)
    • See e.g. the way we mount ingress and egress from blob storage over NFSv3
  2. Enabling SFTP
    • This is a bit of a mess, but @craddm got it working for a DSG in the past (hopefully notes on how to do it are somewhere)
    • Not sure I would recommend it

How/when would the file get updated? Is it reasonable to imagine e.g. an Azure Function that would do the updating? If so, perhaps it could write directly to the VM and you wouldn't need a storage volume at all?

@craddm
Contributor

craddm commented May 21, 2024

I do have some notes on SFTP, yes. It wasn't enormously tricky, IIRC. Creating a local user was easy through the portal, and it seems it was also easy to create one from PowerShell with a password rather than an SSH key.

I'm not seeing much about Azure Functions writing directly to VMs; you can trigger them on uploads to blob storage, but so far I've only seen people suggest that the function copies files to a File Share.

@JimMadge
Member Author

@jemrobinson How much experience with key pairs in Azure/Pulumi do you have?

Looks like you can't generate a local user with a password using Pulumi (at least, the password isn't an output).
You can specify an existing key pair though. That would mean we would need to be able to fetch the private key for each workspace.

@jemrobinson
Member

What Pulumi resource are you using to generate a local user? I assume you don't mean in cloud-init?

@JimMadge
Member Author

azure_native.storage.LocalUser

Local user for SFTP, not for the workspaces.

@JimMadge
Member Author

@jemrobinson Looks like Azure expects the private key of a key pair to be kept in a Key Vault.

I think the best option would be to use the TLS package to generate a key, then store that as a secret.
(I'm now vaguely recalling using the TLS package before to do something like this).

@jemrobinson
Member

I'm still a bit confused about what advantage we get for using SFTP over one of the other options. Is it just that we can do so earlier in the cloud-init order of operations? If so, why does this matter?

N.B. if you want to generate a Key in a Keyvault, I think you should probably use the Key resource

@JimMadge
Member Author

The SFTP approach means:

  • No requirements beyond what is already on a bare Ubuntu VM image
  • No need to have a manifest (you can just scp -r user@host:* /)
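For comparison with the manifest approach, the recursive copy could be wrapped roughly as below. This is a sketch only: the function names are hypothetical, the user, host, and key path are placeholders, and it assumes an `scp` client on the workspace and an SFTP endpoint that accepts it. The only `scp` options used are `-r` (recursive) and `-i` (identity file).

```python
import subprocess


def build_scp_command(user: str, host: str, identity_file: str,
                      destination: str = "/") -> list[str]:
    """Build the argument list for a recursive scp of everything on the remote."""
    return [
        "scp",
        "-r",                 # copy directories recursively
        "-i", identity_file,  # private key for the storage local user
        f"{user}@{host}:*",   # remote glob; not expanded locally (no shell involved)
        destination,
    ]


def fetch_over_sftp(user: str, host: str, identity_file: str,
                    destination: str = "/") -> None:
    """Run the copy and raise CalledProcessError if scp exits non-zero."""
    subprocess.run(
        build_scp_command(user, host, identity_file, destination),
        check=True,
    )
```

There is no manifest to maintain here, but every workspace needs access to `identity_file`, which is exactly the key-sharing question raised in this thread.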

The current downside is it means sharing a private key between workspaces (or creating a new local account with key pair for each workspace).

Successfully merging this pull request may close these issues.

  • Add ClamAV to the workspaces
  • Minimal required workspace software
3 participants