A couple of FUSE wrappers around MongoDB GridFS using Python 3 and pyfuse3.
This work builds on https://github.com/axiros/py_gridfs_fuse and https://github.com/Liam-Deacon/py_gridfs_fuse.
There are two implementations:
- The naive one, done by me (jmfernandez). It is fully compatible with existing GridFS collections, but some write scenarios are not supported. Directories are simulated by using the directory separator symbol in the filenames, as cloud filesystems such as s3fs or rclone do.
- The classical one from axiros and Liam Deacon. It is a full filesystem, with subdirectories, but it is not compatible with existing GridFS collections.
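As a rough illustration of how directories can be simulated on top of flat filenames, the sketch below derives a directory listing from a list of names that contain a separator. It is not taken from the project's code: the function name `list_dir` and the sample filenames are hypothetical, and the real implementation may treat edge cases differently.

```python
from typing import Iterable, List, Tuple


def list_dir(filenames: Iterable[str], directory: str = "",
             sep: str = "/") -> Tuple[List[str], List[str]]:
    """Return (subdirectories, files) found directly under `directory`.

    Hypothetical helper: directories are never stored, they are inferred
    from the separator inside each flat filename.
    """
    prefix = directory.strip(sep)
    if prefix:
        prefix += sep
    subdirs = set()
    files = set()
    for name in filenames:
        name = name.lstrip(sep)
        if not name.startswith(prefix):
            continue
        rest = name[len(prefix):]
        head, found, _tail = rest.partition(sep)
        if not head:
            # Empty component (e.g. a doubled separator): ignore it.
            continue
        if found:
            # There is more path after `head`, so `head` acts as a directory.
            subdirs.add(head)
        else:
            files.add(head)
    return sorted(subdirs), sorted(files)


# Sample flat filenames, as they might be stored in a GridFS collection.
names = ["readme.txt", "data/a.csv", "data/raw/b.csv", "data/raw/c.csv"]
print(list_dir(names))
print(list_dir(names, "data"))
print(list_dir(names, "data/raw"))
```

Under this model an empty directory has no file to infer it from, so an implementation needs some extra mechanism to keep one around.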
naive_gridfs_fuse --mongodb-uri="mongodb://127.0.0.1:27017" --database="gridfs_fuse" --mount-point="/mnt/gridfs_fuse" # --options=allow_other
naive_gridfs_fuse --mongodb-uri="mongodb://127.0.0.1:27017/" -c specialcoll --database="gridfs_fuse" --mount-point="/mnt/gridfs_fuse" --show-versions # --options=allow_other
mongodb://127.0.0.1:27017/gridfs_fuse.fs /mnt/gridfs_fuse gridfs_naive defaults,allow_other 0 0
Note: this assumes that you have the mount.gridfs_naive program (or mount_gridfs_naive on Mac OS X) symlinked into /sbin/, e.g.
sudo ln -s $(which mount.gridfs_naive) /sbin/
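To make the fields of the fstab entry above concrete, here is a small sketch that splits such a line into its parts. It assumes the device field has the form `mongodb://host:port/database.collection`, with `fs` (the default GridFS collection prefix) used when no collection is given. The function names are hypothetical, and this is not the parser used by the mount helper.

```python
from urllib.parse import urlsplit


def split_device(device: str):
    """Split 'mongodb://host:port/db.coll' into (server URI, db, collection).

    Assumption: the database and collection are separated by the first dot
    in the path component of the URI.
    """
    parts = urlsplit(device)
    path = parts.path.lstrip("/")
    database, _, collection = path.partition(".")
    server_uri = f"{parts.scheme}://{parts.netloc}"
    return server_uri, database, collection or "fs"


def parse_fstab_line(line: str) -> dict:
    """Parse a single six-field fstab line into a dictionary."""
    device, mount_point, fs_type, options, _dump, _passno = line.split()
    uri, database, collection = split_device(device)
    return {
        "uri": uri,
        "database": database,
        "collection": collection,
        "mount_point": mount_point,
        "fs_type": fs_type,
        "options": options.split(","),
    }


entry = parse_fstab_line(
    "mongodb://127.0.0.1:27017/gridfs_fuse.fs /mnt/gridfs_fuse "
    "gridfs_naive defaults,allow_other 0 0"
)
print(entry)
```

The resulting values correspond to the `--mongodb-uri`, `--database`, `-c` and `--mount-point` arguments of the command-line form.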
gridfs_fuse --mongodb-uri="mongodb://127.0.0.1:27017" --database="gridfs_fuse" --mount-point="/mnt/gridfs_fuse" # --options=allow_other
mongodb://127.0.0.1:27017/gridfs_fuse.fs /mnt/gridfs_fuse gridfs defaults,allow_other 0 0
Note: this assumes that you have the mount.gridfs program (or mount_gridfs on Mac OS X) symlinked into /sbin/, e.g.
sudo ln -s $(which mount.gridfs) /sbin/
- pymongo
- pyfuse3
Ubuntu 16.04:
sudo apt-get install libfuse python3-pip
sudo -H pip3 install git+https://github.com/jmfernandez/py_gridfs_fuse.git@v0.4.0
Mac OS X:
brew install osxfuse
sudo -H pip3 install git+https://github.com/jmfernandez/py_gridfs_fuse.git@v0.4.0
- create/list/delete directories => folder support (albeit permissions and ownership are not persisted).
- Show all file versions (through mount flag).
- read files (any of their versions).
- delete files (all their versions at once).
- open and write once (like HDFS).
- rename
- modify an existing file (only when it is opened as O_WRONLY; this creates a new version of the file in GridFS).
- create/list/delete directories => folder support.
- read files.
- delete files.
- open and write once (like HDFS).
- rename
- resize an existing file.
- hardlink
- symlink
- statfs
- modify an existing file.
- resize an existing file.
- hardlink
- symlink
- statfs
- AWS d2.xlarge machine.
- 4 CPUs @ 2.40 GHz (E5-2676)
- 30 gigabyte RAM
- filesystem: ext4
- block device: three instance storage disks combined with lvm.
lvcreate -L 3T -n mongo -i 3 -I 4096 ax /dev/xvdb /dev/xvdc /dev/xvdd
- mongodb 3.0.1
- mongodb storage engine WiredTiger
- mongodb compression: snappy
- mongodb cache size: 10 gigabyte
- sequential write performance: ~46 MB/s
- sequential read performance: ~90 MB/s
Write performance was tested by copying 124 files, each 9 gigabytes in size and with different content. The compression factor was about three. Files were copied one by one, with no parallel execution.
Read performance was tested by randomly picking 10 files out of the 124. Files were read one by one, with no parallel execution.
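For a sense of scale, the figures above can be turned into approximate run times. This is only back-of-the-envelope arithmetic, assuming 1 gigabyte = 1000 MB and a constant transfer rate; the README does not report the actual durations.

```python
MB_PER_GB = 1000  # assumption: decimal gigabytes


def transfer_seconds(total_mb: float, rate_mb_per_s: float) -> float:
    """Time needed to move `total_mb` megabytes at a constant rate."""
    return total_mb / rate_mb_per_s


file_size_mb = 9 * MB_PER_GB

# Write test: 124 files copied sequentially at about 46 MB/s.
write_total_mb = 124 * file_size_mb
write_seconds = transfer_seconds(write_total_mb, 46)

# Read test: 10 files read sequentially at about 90 MB/s.
read_total_mb = 10 * file_size_mb
read_seconds = transfer_seconds(read_total_mb, 90)

# With a compression factor of about three, the data stored on disk
# is roughly a third of what was written.
stored_gb = write_total_mb / MB_PER_GB / 3

print(f"write: {write_total_mb / MB_PER_GB:.0f} GB in about {write_seconds / 3600:.1f} hours")
print(f"read: {read_total_mb / MB_PER_GB:.0f} GB in about {read_seconds / 60:.1f} minutes")
print(f"stored after compression: about {stored_gb:.0f} GB")
```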
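# Simple illustration of the commands used (not the full script).
# Write: pv prints progress and rate while streaming the file into the mount.
pv -pr /tmp/big_file${file_number} > /mnt/gridfs_fuse/big_file${file_number}
# Read: stream the file back from the mount and discard the data.
pv -pr /mnt/gridfs_fuse/big_file${file_number} > /dev/null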