A simple Python script that parses the frontmatter section of files and outputs their metadata as a machine-readable CSV file.
Frontmatter is a YAML-style section of key-value pairs, delimited by a triple-dashed line above and below:
---
date: 2024-02-03
topics:
- data
- intelligence
---
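To make the format concrete, here is a minimal sketch of how such a block could be split from the rest of a file. It is not the script's actual implementation: the function name `parse_frontmatter` is hypothetical, and the parser only handles flat `key: value` pairs and simple `- item` lists, which is enough for the example above. A real YAML library would be the safer choice for anything more complex.

```python
def parse_frontmatter(text):
    """Return (metadata, body) for text that may start with a frontmatter block.

    Hypothetical helper, stdlib only: supports flat `key: value` pairs and
    simple `- item` lists, not full YAML.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, text  # no opening delimiter: no frontmatter
    try:
        end = next(i for i in range(1, len(lines)) if lines[i].strip() == "---")
    except StopIteration:
        return {}, text  # opening delimiter is never closed

    metadata, current_key = {}, None
    for line in lines[1:end]:
        stripped = line.strip()
        if not stripped:
            continue
        if stripped.startswith("- ") and current_key is not None:
            # List item: attach it to the most recent key
            if not isinstance(metadata[current_key], list):
                metadata[current_key] = []
            metadata[current_key].append(stripped[2:].strip())
        elif ":" in stripped:
            key, _, value = stripped.partition(":")
            current_key = key.strip()
            metadata[current_key] = value.strip()

    body = "\n".join(lines[end + 1:])
    return metadata, body


sample = "---\ndate: 2024-02-03\ntopics:\n- data\n- intelligence\n---\nsome content"
metadata, body = parse_frontmatter(sample)
print(metadata)
print(body)
```

Values are kept as plain strings in this sketch, so `date` comes back as the text `2024-02-03` rather than a date object.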
Install the project dependencies with pipenv:
pipenv install
Start a shell within the virtual environment:
pipenv shell
Run the script, passing the directory to scan:
python main.py "example_files"
As long as a file contains a frontmatter section, whatever its type, metadata_to_csv parses it and prints a CSV containing the metadata of every file in the directory:
author,country,file_type
Kid Kubernetes,USA,null
Jacques Derrida,France,markdown
Heraclitus,ionia,null
max,canada,txt
gunther,germany,python
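The sample output suggests that a file lacking a given key gets `null` in that column. The sketch below shows one way to produce such a CSV from a list of metadata dictionaries using only the standard library. The function name `rows_to_csv` and the `missing` parameter are hypothetical, and the `null` filler is an assumption read off the output above, not confirmed behavior of the script.

```python
import csv
import io


def rows_to_csv(rows, missing="null"):
    """Serialize a list of metadata dicts to CSV text.

    Hypothetical helper: columns are the union of all keys, in order of
    first appearance; absent values are filled with `missing`.
    """
    fieldnames = []
    for row in rows:
        for key in row:
            if key not in fieldnames:
                fieldnames.append(key)

    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=fieldnames, restval=missing, lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


rows = [
    {"author": "max", "country": "canada", "file_type": "txt"},
    {"author": "Heraclitus", "country": "ionia"},  # no file_type key
]
print(rows_to_csv(rows), end="")
```

Using `csv.DictWriter` with `restval` keeps the missing-value handling in one place instead of patching each row by hand.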
A Python file:
---
author: gunther
country: germany
file_type: python
---
print("hello world")
...
A Markdown file:
---
author: Jacques Derrida
country: France
file_type: markdown
---
# the pharmakon hon hon
## first section
...
A YAML file:
---
author: Kid Kubernetes
country: USA
file_type: yaml
---
kind: ultraNice
api: basic/v1
grocery:
- milk
- bread
- yogurt
A text file:
---
author: max
country: canada
file_type: txt
---
abcdefg
If you don't want to bother setting up Python on your local machine, you can run the script in a Docker container:
- edit the docker.sh variables to fit your situation
- execute the script:
chmod +x docker.sh
./docker.sh
- retrieve your metadata dataset as a CSV file in this directory: metadata.csv