klieret/RandomFileTree

Generate random files and folders for testing purposes

Description

Create a random file and directory tree/structure for testing purposes.

This is done in an iterative fashion: For every iteration, files and folders are created based on set probabilities in all subfolders of the target folder.
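The iterative scheme can be pictured with a small standalone sketch. This is not the library's code: the function name `sketch_iterative_tree`, the `sigma` parameter, the Gaussian draw clipped at zero, and the UUID-based names are all assumptions made for illustration.

```python
import os
import tempfile
import random
import uuid
from pathlib import Path


def sketch_iterative_tree(basedir, nfiles=2.0, nfolders=0.5, repeat=2,
                          sigma=1.0, seed=None):
    """Illustrative sketch only; names and distribution are assumptions."""
    rng = random.Random(seed)
    created_dirs, created_files = [], []
    for _ in range(repeat):
        # Snapshot the directories first, so folders created during this
        # iteration are only populated in the next one.
        existing = [Path(root) for root, _dirs, _files in os.walk(basedir)]
        for directory in existing:
            # Assumption: counts are Gaussian around the mean, clipped at 0.
            n_dirs = max(0, round(rng.gauss(nfolders, sigma)))
            n_files = max(0, round(rng.gauss(nfiles, sigma)))
            for _ in range(n_dirs):
                new_dir = directory / uuid.uuid4().hex[:8]
                new_dir.mkdir()
                created_dirs.append(new_dir)
            for _ in range(n_files):
                new_file = directory / (uuid.uuid4().hex[:8] + ".txt")
                new_file.touch()
                created_files.append(new_file)
    return created_dirs, created_files


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        dirs, files = sketch_iterative_tree(tmp, nfiles=3, nfolders=2, repeat=2)
        print(len(dirs), "folders and", len(files), "files created")
```

Because every iteration visits all directories that exist at that point, the tree grows quickly with the number of repeats.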

Installation

randomfiletree can be installed with pip, the Python package manager:

pip3 install randomfiletree

For a per-user installation, add pip's --user switch. To update an existing installation, run pip3 install --upgrade randomfiletree.

For the latest development version you can also work from a cloned version of this repository:

git clone https://github.com/klieret/randomfiletree/
cd randomfiletree
pip3 install --user .

Usage

Take a look at the documentation for the complete picture.

Command line interface

The basic invocation is:

randomfiletree <output folder> \
    -f <average number of files> \
    -d <average number of folders> \
    -r <repeat>

For example, using the options -f 3 -d 2 -r 2, on average 2 folders and 3 files are created in the first iteration, and another 2 folders and 3 files are added to the output directory and each of its subdirectories in the second iteration.
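The arithmetic behind this example can be checked with a few lines of Python. The helper `expected_counts` is hypothetical and not part of randomfiletree. It assumes that every existing directory, the output folder included, gains the average number of folders and files in each iteration, and that no depth limit applies.

```python
def expected_counts(nfiles, nfolders, repeat):
    """Expected (new folders, new files) under the assumptions stated above."""
    dirs = 1.0          # the output folder itself
    new_folders = 0.0
    new_files = 0.0
    for _ in range(repeat):
        # Every directory that exists at the start of the iteration
        # receives files and folders.
        new_files += dirs * nfiles
        added = dirs * nfolders
        new_folders += added
        dirs += added
    return new_folders, new_files


print(expected_counts(3, 2, 2))  # (8.0, 12.0)
```

With -f 3 -d 2 -r 2 this gives 2 + 6 = 8 folders and 3 + 9 = 12 files on average, since the second iteration acts on the output folder and its 2 subfolders.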

Type randomfiletree -h to see all supported arguments.

Python API

import randomfiletree

randomfiletree.iterative_gaussian_tree(
    "/path/to/basedir",
    nfiles=2.0,
    nfolders=0.5,
    maxdepth=5,
    repeat=4
)

randomfiletree now crawls through all directories in /path/to/basedir and creates new files and folders there, with counts governed by the nfiles and nfolders arguments.
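To see what ended up on disk after such a call, you can walk the base directory yourself. The helper below is a hypothetical convenience that uses only the standard library and is not part of randomfiletree. The demo builds a tiny tree by hand so that it runs without the package installed.

```python
import tempfile
from pathlib import Path


def summarize_tree(basedir):
    """Return (number of folders, number of files) below basedir."""
    n_dirs = n_files = 0
    for path in Path(basedir).rglob("*"):
        if path.is_dir():
            n_dirs += 1
        elif path.is_file():
            n_files += 1
    return n_dirs, n_files


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        # Hand-made stand-in for a generated tree
        (Path(tmp) / "a" / "b").mkdir(parents=True)
        (Path(tmp) / "a" / "x.txt").touch()
        print(summarize_tree(tmp))  # (2, 1)
```

In practice you would pass the same path that was given to iterative_gaussian_tree.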

Advanced examples

You can pass an optional function via the filename argument to generate the random file names yourself:

import random
import string

import randomfiletree


def fname():
    # Random name of 5 to 10 uppercase letters/digits with a .docx extension
    length = random.randint(5, 10)
    return "".join(
        random.choice(string.ascii_uppercase + string.digits)
        for _ in range(length)
    ) + ".docx"

randomfiletree.core.iterative_gaussian_tree(
    "/path/to/basedir",
    nfiles=100,
    nfolders=10,
    maxdepth=2,
    filename=fname
)

The optional payload argument can be used to generate file contents together with their names. For example, it can be used to replicate some template files with randomized names:

import itertools
import pathlib
from typing import Iterator

import randomfiletree


def callback(target_dir: pathlib.Path) -> Iterator[pathlib.Path]:
    # Load every template file once as a (suffix, content) pair
    sourcedir = pathlib.Path("/path/to/templates/")
    sources = []
    for srcfile in sourcedir.iterdir():
        if srcfile.is_file():
            sources.append((srcfile.suffix, srcfile.read_bytes()))
    # Cycle through the templates, writing each one under a random name
    for suffix, content in itertools.cycle(sources):
        path = target_dir / (randomfiletree.core.random_string() + suffix)
        path.write_bytes(content)
        yield path

randomfiletree.core.iterative_gaussian_tree(
    "/path/to/basedir",
    nfiles=10,
    nfolders=10,
    maxdepth=5,
    repeat=4,
    payload=callback
)

If both filename and payload are passed, filename is ignored.
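A payload does not have to copy templates. As a further sketch, the generator below writes short random text files. It assumes, as in the template example, that the payload callable receives the target directory and yields the path of each file it creates. The name `random_text_payload` and the choice of content are hypothetical.

```python
import itertools
import random
import string
import tempfile
from pathlib import Path
from typing import Iterator


def random_text_payload(target_dir: Path) -> Iterator[Path]:
    """Hypothetical payload: endlessly create small text files in target_dir."""
    alphabet = string.ascii_lowercase + string.digits
    while True:
        # Random 8-character file name with a .txt extension
        name = "".join(random.choices(alphabet, k=8)) + ".txt"
        path = target_dir / name
        # Between 1 and 5 lines of 20 random characters each
        n_lines = random.randint(1, 5)
        lines = ["".join(random.choices(alphabet, k=20)) for _ in range(n_lines)]
        path.write_text("\n".join(lines) + "\n")
        yield path


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        # Take three files from the infinite generator
        for created in itertools.islice(random_text_payload(Path(tmp)), 3):
            print(created.name, created.stat().st_size, "bytes")
```

It would be passed as payload=random_text_payload, in the same way as callback above.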

Contributors ✨

Thanks go to these wonderful people (emoji key):

Heshanthaka 💻
Nicholas Bollweg 💻
Vadym Markov 💻

This project follows the all-contributors specification. Contributions of any kind welcome!