Releases · jim-schwoebel/allie

fixed bug in visualize.py script affecting feature labels and the general size of labels
fixed bug in OpenSMILE feature set that deleted audio files (which enables multiple audio featurizations with featurization API).
edited audiotext feature array in audio features API / cleaned up code
fixed bug in csv file featurizations / optimized so that folders of files that have already been featurized are not re-featurized each time
general updates to documentation
allowed for creation of regression model outputs in the model2csv.py script
enabled CLI access to model API for regression modeling
enabled multiple regression model training based on spreadsheets with the regression_all.py script and create_dataset.py script
added in RadViz plots and correlation plots with the visualize.py script
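
The "no re-featurization" optimization noted above can be illustrated with a minimal sketch. It assumes each input file gets a sidecar .json file holding its features, and that an existing sidecar means the file can be skipped; the function names, the sidecar naming, and the stand-in featurizer are hypothetical and not taken from Allie's code.

```python
import json
import tempfile
from pathlib import Path


def featurize_folder(folder, featurize, suffix=".csv"):
    """Featurize each file in `folder` once, caching results as sidecar .json files.

    Returns the names of the files that were actually featurized on this call.
    """
    processed = []
    for path in sorted(Path(folder).glob(f"*{suffix}")):
        cache = path.with_suffix(".json")  # hypothetical sidecar naming
        if cache.exists():
            continue  # already featurized on an earlier run, so skip it
        cache.write_text(json.dumps({"features": featurize(path)}))
        processed.append(path.name)
    return processed


def row_count(path):
    # Stand-in featurizer (assumption): number of non-empty lines in the file.
    return [sum(1 for line in path.read_text().splitlines() if line.strip())]


with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "a.csv").write_text("x,y\n1,2\n")
    (Path(tmp) / "b.csv").write_text("x,y\n3,4\n5,6\n")
    first_run = featurize_folder(tmp, row_count)   # featurizes both files
    second_run = featurize_folder(tmp, row_count)  # finds both sidecars, does nothing
```

Checking for the sidecar file keeps the skip logic independent of any in-memory state, so a second session over the same folder also avoids repeated work.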

Changelog:

fix labels on axes for visualization scripts
get Dockerfile to pass all unit tests
write docker.py script to call from docker as a next-step in downloading nltk models
improved readmes and documentation
refactored transformer script to use folder paths instead of tdir1 and tdir2... (made CLI a little more user-friendly)
Added in a CLI tutorial in the wiki.
CLI now supports adding various settings or printing them out.
minor bug fixes in model loading with new CSV load feature
edited cleaning/augmentation scripts to take and return files as lists so they can be iterated sequentially without errors
edited project boards to be up-to-date
solved a problem with loading regression machine learning models and making predictions (from spreadsheets)
added in new ability to featurize .CSV spreadsheets using the standard Allie Features API and default_features
created a nice CLI interface to use all the API functions of Allie
extensive documentation of the entire repository with readmes, updated the wiki, and individual files to make it clearer what all the sections of the repository mean and how to use them
added rename.py helper script to rename files to prevent naming conflicts after annotation.
added new cleaning feature in renaming files to avoid any file naming issues (with spaces or whatnot) during featurization for audio, text, image, and video files
made it so the visualization API does not error out on regression problems by disabling it for regression problems in version 1.0
made create-csv.py script to prepare folders of files into a regression or classification format
documentation of the repository / video examples for research paper
improved documentation for cleaning and augmentation techniques
added in text, image, video, and audio cleaning techniques (in new format)
added in text, image, video, and audio augmentation techniques (in new format)
add error handling into all of Allie's featurizations + error array into feature array itself ("error" form of column on features)
kept create_readme setting for making readmes in the repositories themselves (deleted create_YAML setting)
deleted the production folder schema within Allie
added component numbers for both dimensionality reducers and feature selectors in settings.json
fixed a small bug in .JSON files for model files.
added 'pip3 freeze > requirements.txt' to machine learning model training systems to reproduce environments on different CPUs
added audio_features/loudness_features.py using pyloudnorm (in dB)
cleaned up audio_features/sa_feature array to use fewer lines (and made it a fixed-length array)
fixed bug in loading AutoGluon models for making predictions with the load.py script in the ./models/ directory (and loading model_type variable generally)
added ['zscore','isolationforest'] options to remove outliers (https://stackoverflow.com/questions/51390196/how-to-calculate-cooks-distance-dffits-using-python-statsmodel) - toggled with remove_outliers == True / False.
added a sample validation script in the ./models directory to quickly assess how well machine learning models generalize to new datasets
added Figlet for cool text renderings / messages when loading modeling scripts (http://www.figlet.org/)
bug fix - minor bug fix in visualize.py script; fixed loading broken .JSON files during featurization (broke the visualization script during model training)
bug fix - transforms are now named by the common name rather than by all the classes trained; with >30 classes, naming by all classes caused the transform to fail at saving / loading
added option in modeling script to create csv files (if create_csv == True, then creates .CSV files during training) - note the reason for this is for very large files it can take a long time to create them, so model training sessions can be sped up by setting create_csv == False.
added annotate.py script to annotate files (beta version) - need to add to .JSON schema (in labels (regression))
added the ability to train regression models by a class and value
add in single model prediction mode in ./load.py script (-audio (sampletype) -c_autokeras (folder) -directory)
add in all model loaders from the model trainers
fixed cvopt and autokaggle training script bugs
added in the ability to quickly visualize ML models trained in a spreadsheet with the model2csv.py script
bug fix - minor bug fixes associated with transcription during featurization for audio, image, video, and .CSV files
add notion of "tabular" data instead of .CSV to tie to audio, video, and image data (e.g. for loading datasets) - as laid out in the d3m-schema - did this in the featurize_csv script where .CSV files can contain audio, text, image, video, numerical, and categorical data.
test and validate model compression works for all training scripts / can load compressed models and make predictions (w/ production)
finish up model trainers and clean them up with standard metrics for accuracy
add in version to Allie (to assess deprecation issues into the future)
add in deepspeech functionality to transcription for open source (and other open source audio transcribers)
add in transcriber settings as a list ['pocketsphinx', 'deepspeech', 'google', 'aws'], etc.
added in transcribers as lists (can be adapted into future)
created a version 2 trainer for machine learning models (as part of Allie release 1.0.0)
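
As an illustration of the 'zscore' outlier option listed above, here is a minimal standard-library sketch. It is not Allie's implementation: the function name, the threshold parameter, and the sample data are assumptions, and the 'isolationforest' option is not shown here.

```python
from statistics import mean, pstdev


def remove_outliers_zscore(rows, threshold=3.0):
    """Drop rows where any column's absolute z-score exceeds `threshold`."""
    columns = list(zip(*rows))
    means = [mean(col) for col in columns]
    stds = [pstdev(col) for col in columns]
    kept = []
    for row in rows:
        # A constant column has zero spread, so it contributes a z-score of 0.
        zs = [abs(v - m) / s if s > 0 else 0.0
              for v, m, s in zip(row, means, stds)]
        if max(zs) <= threshold:
            kept.append(row)
    return kept


# Hypothetical feature rows; the last one is far from the others in column 0.
rows = [[1.0, 10.0], [1.1, 10.0], [0.9, 10.0], [1.0, 10.0], [50.0, 10.0]]

# With only five samples the largest possible z-score is 2, so a lower
# threshold than the usual 3.0 is used for this small example.
cleaned = remove_outliers_zscore(rows, threshold=1.5)
```

A z-score filter of this kind is cheap and easy to reason about, but it assumes roughly bell-shaped feature columns, which may not hold for every feature set.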

Allie Version 1.0.1

Allie Version 1.0