Releases: grailbio/reflow
reflow1.16.0
Highlights from this release:
- Reflow can now make use of multiple AWS EC2 availability zones when launching spot instances
- Instances launched by Reflow will now use `gp3` EBS volumes instead of `gp2`
- Improvements to the handling of unavailable instance types
- Retry eval errors if underlying causes are restartable
- Scheduler unloads only loaded data (bug fix for a deadlock)
- If an exec exits with code `75`, it is retried up to 3 times
- Improved flaky unit tests
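The retry-on-exit-code-75 behaviour noted above can be sketched roughly as follows. This is a minimal illustration, not Reflow's actual implementation: `runWithRetry` and `exTempFail` are hypothetical names, and the sketch assumes "retried up to 3 times" means one initial attempt plus at most three retries. Exit code 75 corresponds to `EX_TEMPFAIL` in `sysexits.h`, which conventionally marks a temporary failure.

```go
package main

import "fmt"

// exTempFail is exit code 75 (EX_TEMPFAIL in sysexits.h), conventionally
// used to signal a temporary failure that may succeed when tried again.
const exTempFail = 75

// runWithRetry is a hypothetical helper, not Reflow's code. It calls run
// once and, while the exit code is 75, calls it again up to maxRetries
// more times. It returns the final exit code and the total attempt count.
func runWithRetry(run func() int, maxRetries int) (code, attempts int) {
	for {
		code = run()
		attempts++
		if code != exTempFail || attempts > maxRetries {
			return code, attempts
		}
	}
}

func main() {
	calls := 0
	// Simulated exec: exits with 75 twice, then succeeds.
	flaky := func() int {
		calls++
		if calls <= 2 {
			return exTempFail
		}
		return 0
	}
	code, attempts := runWithRetry(flaky, 3)
	fmt.Printf("exit code %d after %d attempts\n", code, attempts)
}
```

Any other non-zero exit code is returned immediately without a retry, which keeps permanent failures from being repeated.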
Reflow 1.15.0
- TaskDB: From this release onwards, users can set up a taskdb if desired.
- Optimized storage format for results pointers (filesets)
  - significantly lowers client memory usage during execution
  - reduces S3 usage, lowers cost
- Predictor: improved quality of predictions
- `dir` equality: can now compare resolved and unresolved dirs directly
- `reflow info` now shows failed attempts for the same flow
- New `c6i`/`m6i` instances available to use
- Built with golang `v1.17.1`
Reflow 1.14.0
- Many improvements to reduce (local) memory usage of reflow (i.e., local to the `reflow run`)
- Various bug fixes and test improvements
Reflow 1.13.0
- Significant reduction in memory usage while using reflow bundles
- Predictor: significant reduction of transient memory usage
- Incorporate AWS Spot Advisor data to select instances with a lower probability of eviction
- Better detection of reflowlet OOMs via node metrics
- Reflow `runbatch` users can significantly reduce memory requirements by using `reflow runbatch <your_module.rfx>` instead of `reflow runbatch <your_module.rf>` (modules can be bundled using `reflow bundle`)
- Bug fixes:
Reflow 1.9.0
- Cost information about runs/tasks/cluster is available via `reflow info` and `reflow ps`. For more information, see `reflow info -help` and `reflow ps -help`.
- The `reflow logs` command now allows a `-reflowlet` option. The `-s` option is removed; see `reflow logs -help` for details.
- The `reflow pred` command allows one to view profiles of a particular exec (using its identifier) across various run invocations (under the same AWS account). Example usage: `reflow pred <ident>` (the idents of execs are usually printed at the end of a run log).
- Several internal changes were made to improve scheduling and reduce API usage, memory usage and logging.
reflow1.8.0
Highlights from this release:
- Improvements to Reflow’s AWS API usage to reduce throttling and errors
- Improvements to the `reflow logs` tool to enable streaming log events directly from CloudWatch
- Concurrency limits are no longer applied to small filesets, which will speed up cache lookups
- Removal of legacy "worker-stealer" code
- Lots of under the hood changes/refactors preparing for upcoming cost computation feature
- New "task draining" logic allows the scheduler to combine resource requirements and request larger allocs
- Corrected the exit status for `reflow test`, which was incorrectly flipped (i.e. 1 for success and 0 for failure)
- Changed default tracer implementation to `nopTracer` and updated tracing.md to describe how to enable tracing
reflow1.6.0
Highlights from this release:
- Multiple changes to reduce http calls to reflowlets, reducing the likelihood of keepalive failures and zombie execs.
- Cluster manager improvements to group simultaneously submitted tasks.
- Refactoring and unit test improvements in cluster and pool logic.
reflow1.5.0
Highlights from this release:
- fixed data races that could result in concurrent map access errors when using `localtracer`
- fixed issue which resulted in non-ASCII characters being printed in the logs
- add tracing for cache operations
- make `Assertions` immutable and reduce memory footprint
- `s3blob` will now delete in batches when more than 1000 keys are specified
- added a user guide for tracing Reflow runs
- fixed `1000align.rf` sample program
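The batched deletion mentioned for `s3blob` can be illustrated with a small chunking sketch. It is an assumption about the general technique rather than the `s3blob` code itself: `batchKeys` and `maxDeleteBatch` are hypothetical names, and the limit of 1000 reflects the maximum number of keys S3 accepts in a single multi-object delete request.

```go
package main

import "fmt"

// maxDeleteBatch is the maximum number of keys S3 accepts in one
// multi-object delete request.
const maxDeleteBatch = 1000

// batchKeys is a hypothetical helper, not s3blob's code. It splits keys
// into consecutive slices holding at most size elements each.
func batchKeys(keys []string, size int) [][]string {
	if size <= 0 {
		return nil
	}
	var batches [][]string
	for len(keys) > size {
		batches = append(batches, keys[:size])
		keys = keys[size:]
	}
	if len(keys) > 0 {
		batches = append(batches, keys)
	}
	return batches
}

func main() {
	keys := make([]string, 2500)
	for i := range keys {
		keys[i] = fmt.Sprintf("key-%d", i)
	}
	for i, b := range batchKeys(keys, maxDeleteBatch) {
		// A real client would issue one delete request per batch here.
		fmt.Printf("batch %d: %d keys\n", i, len(b))
	}
}
```

With 2500 keys this yields three batches of 1000, 1000 and 500 keys, so no single request exceeds the limit.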
reflow1.4.0
Highlights from this release:
- added a flag to make dot graph generation optional (if you experience reflow getting “stuck” before starting any execs, try re-running with `--dotgraph=false`)
- an easier to use and more detailed tracing implementation for Reflow runs (see doc/tracing.md to learn more)
- significant memory usage improvements
- a new localcluster implementation for local mode runs (`reflow run -local`) which uses the same scheduler logic as remote runs
Other improvements
- if an exec’s requirements cannot be allocated by the cluster, the run fails immediately
- added flags to filter EC2 instances by minimum CPU or minimum memory
- fixed a bug where Reflow might become deadlocked in cases where the cluster’s `maxInstances` threshold was reached
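Filtering instance types by minimum CPU or minimum memory, as described in the list above, might look roughly like the sketch below. The release notes do not give the flag names or the internal data structures, so `instanceType`, `filterInstances` and the example values are all hypothetical and purely illustrative.

```go
package main

import "fmt"

// instanceType is a hypothetical, simplified description of an instance
// type; the names and sizes used in main are made up for illustration.
type instanceType struct {
	Name   string
	VCPU   int
	MemGiB float64
}

// filterInstances keeps the types that satisfy both minimums. Passing 0
// for a minimum effectively disables that check.
func filterInstances(types []instanceType, minCPU int, minMemGiB float64) []instanceType {
	var out []instanceType
	for _, t := range types {
		if t.VCPU >= minCPU && t.MemGiB >= minMemGiB {
			out = append(out, t)
		}
	}
	return out
}

func main() {
	types := []instanceType{
		{Name: "small", VCPU: 2, MemGiB: 4},
		{Name: "medium", VCPU: 4, MemGiB: 16},
		{Name: "large", VCPU: 16, MemGiB: 64},
	}
	// Keep only types with at least 4 vCPUs; no memory minimum.
	for _, t := range filterInstances(types, 4, 0) {
		fmt.Println(t.Name)
	}
}
```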
reflow1.3.1
Reflow 1.3.1 is our first binary release in a long time and contains too many changes to list individually.
All users are recommended to upgrade to this release. Notably, this release includes a migration from CoreOS (deprecated) to Flatcar, which will fix any launch errors or AMI errors you may have seen.