snap confine Overview

jdstrand edited this page Aug 11, 2017 · 5 revisions

Overview

The snap run program launches snappy applications under confinement, restricting what they can access. It uses AppArmor and seccomp to do this.

Run with:

$ snap run snapname.command

AppArmor

The AppArmor part is similar to aa-exec -p: it launches the application under the specified AppArmor profile. AppArmor profiles are found in /var/lib/snapd/apparmor/profiles.

Seccomp

The seccomp filter profile is expected to be located in /var/lib/snapd/seccomp/bpf/*.src (formerly /var/lib/snapd/seccomp/profiles).

The filter file contains lines with syscall names, comments that start with "#", or special directives that start with "@". Lines with syscall names may optionally specify additional arguments. The syntax is:

    RULE = ( <syscall> [ARGS] | DIRECTIVE )

    DIRECTIVE = @unrestricted

    ARGS = ( - | [CONDITIONAL]VALUE )*

    CONDITIONAL = ( '!', '>', '>=', '<', '<=', '|' )

    VALUE = ( UNSIGNED INT | KEY )

    KEY = ( SOCKET DOMAIN | SOCKET TYPE | PRCTL | PRIO | CLONE | TIO |
    QUOTA | MKNOD | NETLINK )

    SOCKET DOMAIN = ( AF_UNIX | AF_LOCAL | AF_INET | AF_INET6 | AF_IPX |
    AF_NETLINK | AF_X25 | AF_AX25 | AF_ATMPVC | AF_APPLETALK | AF_PACKET |
    AF_ALG | AF_CAN | AF_BRIDGE | AF_NETROM | AF_ROSE | AF_NETBEUI |
    AF_SECURITY | AF_KEY | AF_ASH | AF_ECONET | AF_SNA | AF_IRDA |
    AF_PPPOX | AF_WANPIPE | AF_BLUETOOTH | AF_RDS | AF_LLC | AF_TIPC |
    AF_IUCV | AF_RXRPC | AF_ISDN | AF_PHONET | AF_IEEE802154 | AF_CAIF |
    AF_NFC | AF_VSOCK | AF_IB | AF_MPLS | PF_* synonyms of the AF_* names )

    SOCKET TYPE = ( SOCK_STREAM | SOCK_DGRAM | SOCK_SEQPACKET | SOCK_RAW |
    SOCK_RDM | SOCK_PACKET )

    PRCTL = ( PR_CAP_AMBIENT | PR_CAP_AMBIENT_RAISE |
    PR_CAP_AMBIENT_LOWER | PR_CAP_AMBIENT_IS_SET |
    PR_CAP_AMBIENT_CLEAR_ALL | PR_CAPBSET_READ | PR_CAPBSET_DROP |
    PR_SET_CHILD_SUBREAPER | PR_GET_CHILD_SUBREAPER | PR_SET_DUMPABLE |
    PR_GET_DUMPABLE | PR_SET_ENDIAN | PR_GET_ENDIAN | PR_SET_FPEMU |
    PR_GET_FPEMU | PR_SET_FPEXC | PR_GET_FPEXC | PR_SET_KEEPCAPS |
    PR_GET_KEEPCAPS | PR_MCE_KILL | PR_MCE_KILL_GET | PR_SET_MM |
    PR_SET_MM_START_CODE | PR_SET_MM_END_CODE | PR_SET_MM_START_DATA |
    PR_SET_MM_END_DATA | PR_SET_MM_START_STACK | PR_SET_MM_START_BRK |
    PR_SET_MM_BRK | PR_SET_MM_ARG_START | PR_SET_MM_ARG_END |
    PR_SET_MM_ENV_START | PR_SET_MM_ENV_END | PR_SET_MM_AUXV |
    PR_SET_MM_EXE_FILE | PR_MPX_ENABLE_MANAGEMENT |
    PR_MPX_DISABLE_MANAGEMENT | PR_SET_NAME | PR_GET_NAME |
    PR_SET_NO_NEW_PRIVS | PR_GET_NO_NEW_PRIVS | PR_SET_PDEATHSIG |
    PR_GET_PDEATHSIG | PR_SET_PTRACER | PR_SET_SECCOMP | PR_GET_SECCOMP |
    PR_SET_SECUREBITS | PR_GET_SECUREBITS | PR_SET_THP_DISABLE |
    PR_TASK_PERF_EVENTS_DISABLE | PR_TASK_PERF_EVENTS_ENABLE |
    PR_GET_THP_DISABLE | PR_GET_TID_ADDRESS | PR_SET_TIMERSLACK |
    PR_GET_TIMERSLACK | PR_SET_TIMING | PR_GET_TIMING | PR_SET_TSC |
    PR_GET_TSC | PR_SET_UNALIGN | PR_GET_UNALIGN )

    PRIO = ( PRIO_PROCESS | PRIO_PGRP | PRIO_USER )

    CLONE = ( CLONE_NEWIPC | CLONE_NEWNET | CLONE_NEWNS |
    CLONE_NEWPID | CLONE_NEWUSER | CLONE_NEWUTS )

    TIO = TIOCSTI

    QUOTA = ( Q_SYNC | Q_QUOTAON | Q_QUOTAOFF | Q_GETFMT |
    Q_GETINFO | Q_SETINFO | Q_GETQUOTA | Q_SETQUOTA | Q_XQUOTAON |
    Q_XQUOTAOFF | Q_XGETQUOTA | Q_XSETQLIM | Q_XGETQSTAT |
    Q_XQUOTARM )

    MKNOD = ( S_IFREG | S_IFCHR | S_IFBLK | S_IFIFO | S_IFSOCK )

    NETLINK = ( NETLINK_ROUTE | NETLINK_USERSOCK | NETLINK_FIREWALL |
    NETLINK_SOCK_DIAG | NETLINK_NFLOG | NETLINK_XFRM | 
    NETLINK_SELINUX | NETLINK_ISCSI | NETLINK_AUDIT |
    NETLINK_FIB_LOOKUP | NETLINK_CONNECTOR | NETLINK_NETFILTER |
    NETLINK_IP6_FW | NETLINK_DNRTMSG | NETLINK_KOBJECT_UEVENT |
    NETLINK_GENERIC | NETLINK_SCSITRANSPORT | NETLINK_ECRYPTFS |
    NETLINK_RDMA | NETLINK_CRYPTO | NETLINK_INET_DIAG )

See man 2 socket for details on SOCKET DOMAIN and SOCKET TYPE, man 2 prctl for PRCTL, man 2 getpriority for PRIO, man 2 setns for CLONE, man 4 tty_ioctl for TIO, man 2 quotactl for QUOTA, man 2 mknod for MKNOD and man 7 netlink for NETLINK.

Specifying '-' as the argument skips filtering for that argument. Not specifying a conditional means use exact match. The syntax is meant to reflect how seccomp_rule_add(3) is used.
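
To make the grammar concrete, here is a minimal Python sketch of a parser for a single filter line. It is an illustration only, not the launcher's own parser: the names parse_line, Rule and Arg are hypothetical, comments are assumed to occupy a whole line, and values are kept as text rather than resolved to numbers.

```python
import re
from typing import List, NamedTuple, Optional

# Conditionals from the grammar. Two-character operators are listed first so
# that ">=" is not read as ">" followed by "=".
_ARG_RE = re.compile(r"^(>=|<=|!|>|<|\|)?([A-Za-z0-9_]+)$")
MAX_ARGS = 6  # a rule may specify up to 6 arguments


class Arg(NamedTuple):
    op: str     # "==" when no conditional is given (exact match)
    value: str  # unsigned integer or symbolic KEY, kept as text


class Rule(NamedTuple):
    name: str                  # syscall name, or the directive itself
    args: List[Optional[Arg]]  # None marks a skipped ("-") argument


def parse_line(line: str) -> Optional[Rule]:
    """Parse one filter line; return None for blank lines and comments."""
    text = line.strip()
    if not text or text.startswith("#"):
        return None
    if text.startswith("@"):
        return Rule(text, [])
    name, *raw_args = text.split()
    if len(raw_args) > MAX_ARGS:
        raise ValueError(f"too many arguments in {text!r}")
    args: List[Optional[Arg]] = []
    for raw in raw_args:
        if raw == "-":
            args.append(None)  # no filtering for this position
            continue
        match = _ARG_RE.match(raw)
        if match is None:
            raise ValueError(f"bad argument {raw!r} in {text!r}")
        args.append(Arg(match.group(1) or "==", match.group(2)))
    return Rule(name, args)
```

For instance, parse_line("mknod - |S_IFREG") yields a rule whose first argument is skipped and whose second argument carries the '|' conditional with the value S_IFREG.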

Examples:

  • The unrestricted profile looks like this:

      # Unrestricted profile
      @unrestricted
    
  • A very strict profile might look like this:

      # Super strict profile
      read
      write
    
  • Use of seccomp argument filtering:

      # allow any socket types for AF_UNIX and AF_LOCAL
      socket AF_UNIX
      socket AF_LOCAL
    
      # Only allow SOCK_STREAM and SOCK_DGRAM for AF_INET
      socket AF_INET SOCK_STREAM
      socket AF_INET SOCK_DGRAM
    
      # Allow renicing of one's own process (arg2 is '0') to higher nice values
      setpriority PRIO_PROCESS 0 >=0
    
      # Allow dropping privileges to uid/gid '1' and raising back again
      setuid <=1
      setgid <=1
      seteuid <=1
      setegid <=1
    
      # Allow mknod for regular files
      mknod - |S_IFREG
    

Limitations

  • seccomp argument filtering currently only allows specifying unsigned integers as arguments, which means you may not dereference pointers, etc.
  • up to 6 arguments may be specified
  • '|' currently can only be used to check a single bit. Checking for OR'd bits may be implemented in the future
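
The comparison semantics can be modelled in a few lines of Python. This is a sketch under stated assumptions, not the filter's actual code: arg_matches is a hypothetical helper, and '|' is assumed to be a masked-equality test, i.e. (argument & VALUE) == VALUE. The S_IF* constants are the Linux values from <sys/stat.h>.

```python
from typing import Callable, Dict

# File-type bits from <sys/stat.h> on Linux.
S_IFREG = 0o100000
S_IFIFO = 0o010000
S_IFSOCK = 0o140000

# "==" stands for a rule argument written without a conditional.
COMPARATORS: Dict[str, Callable[[int, int], bool]] = {
    "==": lambda actual, value: actual == value,
    "!": lambda actual, value: actual != value,
    ">": lambda actual, value: actual > value,
    ">=": lambda actual, value: actual >= value,
    "<": lambda actual, value: actual < value,
    "<=": lambda actual, value: actual <= value,
    # Assumption: '|' tests that every bit of VALUE is set in the argument.
    "|": lambda actual, value: (actual & value) == value,
}


def arg_matches(op: str, value: int, actual: int) -> bool:
    """Return True if a syscall argument `actual` satisfies `op VALUE`."""
    if value < 0 or actual < 0:
        raise ValueError("only unsigned integers can be compared")
    return COMPARATORS[op](actual, value)
```

In this model, arg_matches("|", S_IFREG, S_IFREG | 0o644) is true and arg_matches("|", S_IFREG, S_IFIFO | 0o644) is false. Because the test only looks at the bits set in VALUE, a mode built from S_IFSOCK, which contains the S_IFREG bit, also passes; this is one reason the single-bit caveat above matters.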

devices cgroup

It works like this:

  • when an interface is connected that uses the UDev backend, udev rules are generated that add tags to matching hardware. These assign rules are added to udev via /etc/udev/rules.d/70-snap.... for each snap. The tags are of the form snap_<snap name>_<app>.
  • when an application is launched, the launcher queries udev to detect if any devices are tagged for this application. If no devices are tagged for this application, a device cgroup is not set up
  • if there are devices tagged for this application, the launcher creates a device cgroup in /sys/fs/cgroup/devices/snap.<snap name>.<app> and adds itself to this cgroup. It then sets the cgroup as deny-all by default, adds some common devices (eg, /dev/null, /dev/zero, etc) and any devices tagged for use by this application using /lib/udev/snappy-app-dev
  • the app is executed and now the normal device permissions/apparmor rules apply
  • udev match rules in /lib/udev/rules.d/80-snappy-assign.rules are in place to run /lib/udev/snappy-app-dev to handle device events for devices tagged with snap_*.

Note that /sys/fs/cgroup/devices/snap.<snap name>.<app> is not (currently) removed on unassignment, and the contents of the cgroup for the app are managed entirely by the launcher. When an application is started, the cgroup is reset by removing all previously added devices, and then the list of assigned devices is built back up before launch. In this manner, devices can be assigned, changed, and unassigned and the app will always get the correct devices added to the cgroup, but what is in /sys/fs/cgroup/devices/snap.<snap name>.<app> will not reflect assignment/unassignment until after the application is started.
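
The launch-time decision described above can be sketched as a small in-memory model in Python. Nothing here talks to udev or the cgroup filesystem: the helper names are hypothetical, the `tagged` dictionary stands in for a udev query, and COMMON_DEVICES contains only the two devices named in the text, not the launcher's full list.

```python
from typing import Dict, List, Optional, Set

# Only the devices named in the text; the real list is longer ("etc").
COMMON_DEVICES = ("/dev/null", "/dev/zero")


def udev_tag(snap: str, app: str) -> str:
    """Tag that the generated udev rules attach to matching hardware."""
    return f"snap_{snap}_{app}"


def cgroup_path(snap: str, app: str) -> str:
    """Device cgroup directory the launcher uses for this application."""
    return f"/sys/fs/cgroup/devices/snap.{snap}.{app}"


def plan_device_cgroup(
    snap: str, app: str, tagged: Dict[str, Set[str]]
) -> Optional[List[str]]:
    """Return the devices to allow, or None if no cgroup should be set up.

    `tagged` maps a device path to the set of udev tags on that device.
    """
    tag = udev_tag(snap, app)
    assigned = sorted(dev for dev, tags in tagged.items() if tag in tags)
    if not assigned:
        return None  # nothing tagged for this app: no device cgroup
    # The cgroup is rebuilt on every launch: deny-all first, then the
    # common devices, then whatever is tagged right now.
    return list(COMMON_DEVICES) + assigned
```

Because the allow list is recomputed from the current tags each time, a device that was unassigned since the previous run simply does not appear in the next plan, which mirrors the reset-and-rebuild behaviour in the note above.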

private /tmp

The launcher will create a private mount namespace for the application and mount a per-app /tmp directory under it.
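
For readers unfamiliar with the mechanism, the following Python sketch shows one way a process can get a private /tmp using the Linux unshare(2) and mount(2) calls through ctypes. It is an illustration under assumptions, not the launcher's code: enter_private_tmp and its per_app_tmp parameter are hypothetical, the bind-mount approach is assumed, and calling the function requires root (CAP_SYS_ADMIN) on Linux.

```python
import ctypes
import os

# Constants from <sched.h> and <sys/mount.h> on Linux.
CLONE_NEWNS = 0x00020000
MS_BIND = 4096
MS_REC = 16384
MS_PRIVATE = 1 << 18


def enter_private_tmp(per_app_tmp: str) -> None:
    """Unshare the mount namespace and bind per_app_tmp over /tmp.

    per_app_tmp is a hypothetical, already existing per-app directory.
    """
    libc = ctypes.CDLL(None, use_errno=True)
    libc.unshare.argtypes = [ctypes.c_int]
    libc.mount.argtypes = [
        ctypes.c_char_p, ctypes.c_char_p, ctypes.c_char_p,
        ctypes.c_ulong, ctypes.c_void_p,
    ]

    def check(ret: int, what: str) -> None:
        if ret != 0:
            err = ctypes.get_errno()
            raise OSError(err, f"{what}: {os.strerror(err)}")

    # New mount namespace for this process and its future children.
    check(libc.unshare(CLONE_NEWNS), "unshare")
    # Stop mount events from propagating back to the parent namespace.
    check(libc.mount(b"none", b"/", None, MS_REC | MS_PRIVATE, None),
          "make / private")
    # Place the per-app directory on top of /tmp inside the namespace.
    check(libc.mount(os.fsencode(per_app_tmp), b"/tmp", None, MS_BIND, None),
          "bind mount /tmp")
```

After such a call, anything the process (and whatever it executes afterwards) writes to /tmp lands in the per-app directory, while other processes keep seeing the system /tmp.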

devpts newinstance

The launcher will set up a new devpts instance for each application.