15 May 18:34

imrajdas

8c3d20a

2.0.0-Beta6

Major Updates

Added a MongoDB Go interface and refactored the database operations and structure to accommodate test cases more easily.
Support for adding a custom container image registry to the chaos workflow manifest.
Enhanced the performance of the analytics APIs with memory caching and added APIs to fetch labels and values for a Prometheus series.
Added support for reordering the workflow steps by drag and drop, with the changes reflected live in the DAG.
Enhanced the workflow graph to show other node phases such as Omitted, Skipped, and Error for a better user experience.
Enhanced the verify and commit page to allow users to review and edit their workflow details one last time before scheduling the workflow.
Fixed bugs in some user management operations and refactored the teaming APIs to improve performance.
Enhanced the litmusportal user interface to speed up the onboarding process.

Minor Updates

Added support for a liveness check of the dependent applications in the agent plane before going active.
Air-gapped support for the pre-defined workflows by moving the fetching logic to the backend.
Added an instance-id label in the chaos workflow manifest to avoid duplicate scheduling in clusters with multiple Argo servers.
Added validations for workflow name, GitHub URL, and different probe inputs.

30 Apr 21:30

imrajdas

2.0.0-Beta5

59904a4

2.0.0-Beta5 Pre-release

Minor SA fix in eventtracker (namespace) (#2760)

Signed-off-by: Raj Das <mail.rajdas@gmail.com>

20 Apr 19:49

ksatchit

2.0.0-Beta4

7494b0b

2.0.0-Beta4

Major Updates

Fixes the inability to successfully register the agents/targets when litmus portal server is brought up with loadbalancer/nodeport service type
Makes MyHub source configurable by branch so that latest stable versions of experiments are pulled for custom & predefined workflows
Updates the chaos operator dependencies on the subscriber to make use of the latest api changes for chaos resources
Updates the chaos operator, runner & exporter image tunables/ENVs in the subscriber so that the latest stable versions are installed on the targets
Updates the Okteto dev setup instructions to reflect the latest image versions and changes in the specification (env)
Updates the chaosengine CRD validation schema for annotation injection in the manifests maintained & installed by the subscriber

Minor Updates

Improves the icons for revert chaos and workflow scheduling
Optimizes the teaming code to remove redundant conditions
Improved styling & background adopted from litmus-ui

15 Apr 18:48

imrajdas

2.0.0-Beta3

aff0fef

2.0.0-Beta3

Litmus 2.0.0-Beta3

Major Updates

Support for policy-based control of the event tracker: users can define their own policy using a JMESPath query, and the event tracker reacts to application changes based on that policy.
Enhanced the UI for workflow scheduling, giving users the ability to tune annotations, target application details (application namespace, labels, and kind), and probe data through the user interface.
New UI for workflow visualization that presents information about the workflow and its nodes more clearly.
Made the onboarding process simpler and easier for users through the new UI.
Enhanced the homepage to show information like Recent workflow runs, Agent details, and Project details.
Shifted project switching from a Redux-based technique to a URL-based technique to avoid caching problems.
Migrated CircleCI to GitHub workflow and enhanced the continuous integration of the project.
Enhanced the analytics module in terms of UI and computation.
Enhanced the browse workflows table to show resilience score and the total number of experiments passed for the listed workflows.
Support role-based access control in the backend for handling authorization for all requests.
Support for storing scheduled workflow templates and adding some new podtato-head predefined workflow templates

Minor Updates

Improved the Better Code Hub (BCH) score.
Optimized the frontend by shifting the resiliency score calculation to the backend.
Restructured the directory structure for settings in the frontend to modularise the code.
Support for reinstalling litmus agents by making the litmus-portal-config configmap independent of the subscriber.
Support for Ingress and Load balancer network type for connecting external agents with Litmus Portal. Based on the server service type, it will generate the endpoint for the external agent.

30 Mar 08:35

imrajdas

2.0.0-Beta2

b9fa74c

2.0.0-Beta2 Pre-release

Added beta2 fixes for auth and teaming (#2612)

Signed-off-by: Saranya-jena <saranya.jena@mayadata.io>

15 Mar 18:17

imrajdas

2.0.0-Beta1

ae16761

2.0.0-Beta1

Major Updates

Support for in-built analytics, where users can connect their data sources and generate dashboard panels.
Support for Git as a single source of truth for workflow artifacts. This enables users to have their workflows synced between the portal and Git source.
Introduces the event-tracker microservice to trigger chaos workflows automatically upon change to application images. This feature works in tandem with GitOps frameworks that rollout changes to applications upon manual changes in the Git source or upon image push to registries.
Support for re-running of existing chaos workflow from the litmus portal.
Adding a command-line tool called litmusctl to manage litmus portal services. The key role of litmusctl is to connect the external cluster with the litmus server and install the external agents.
Redesigning the teaming user interface and adding some significant features such as leave project, decline invitation.
Recreating litmus docs for litmus 2.0.x. For more information, visit https://litmusdocs-beta.netlify.app/
Integration of Litmus-UI with litmus portal components
Major directory restructuring of litmus portal’s server for database handlers

Minor Updates

Changing MongoDB kind from Deployment to StatefulSet
Adding chaos-exporter as a default external cluster agent for litmusportal
Refactoring authentication server to accommodate new teaming integration
Removing some unnecessary inputs from the welcome modal and predefined chaos workflow

05 Mar 15:14

ksatchit

2.0.0-Beta0

7cd9a5a

2.0.0-Beta0

Fixed default error state for password fields and fixed modal padding (#2505)

* Fixed default error state for password fields and fixed modal padding

Signed-off-by: SarthakJain26 <sarthak.jain@mayadata.io>

* added text to translation

Signed-off-by: SarthakJain26 <sarthak.jain@mayadata.io>

15 Jul 21:43

ksatchit

1.13.8

dc086b3

1.13.8

New Features & Enhancements

Introduces upgraded pod-cpu-hog & pod-memory-hog experiments that inject stress-ng based chaos stressors into the target container's PID namespace (non-exec model).
Supports multi-arch images for chaos-scheduler controller
Supports CIDR apart from destination IPs/hostnames in the network chaos experiments
Refactors the litmus-python repository structure to match the litmus-go & litmus-ansible repos. Introduces a sample python-based pod-delete experiment with the same flow/constructs as its go-equivalent to help establish a common flow for future additions. Also adds a BYOC folder/category to hold non-litmus native experiment patterns.
Refactors the litmus-ansible repo to remove the stale experiments (which have been migrated and improved in litmus-go). Retains (improves) samples to help establish a common flow for future additions
Adds GCP chaos experiments (GCP VM stop, GPD detach) in technical-preview mode
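
As a rough illustration of the CIDR support mentioned in the list above, the following sketch checks whether an address falls inside a destination list that mixes single IPs and CIDR blocks. It uses only Python's standard ipaddress module. The function name and the sample addresses are hypothetical and are not taken from the litmus-go code, which may handle destinations differently.

```python
import ipaddress

def matches_destination(addr, destinations):
    """Return True if addr falls inside any destination IP or CIDR block."""
    ip = ipaddress.ip_address(addr)
    for dest in destinations:
        # strict=False accepts a bare IP (treated as a single-host network)
        # as well as CIDR strings that have host bits set.
        network = ipaddress.ip_network(dest, strict=False)
        if ip in network:
            return True
    return False

# Hypothetical destination list: one plain IP and one CIDR block.
destinations = ["10.0.0.5", "192.168.1.0/24"]
print(matches_destination("192.168.1.42", destinations))
print(matches_destination("10.0.0.6", destinations))
```

Treating a plain IP as a single-host network keeps one code path for both forms of input.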

Major Bug Fixes

Fixes erroneous logs in the chaos-operator seen while attempting to remove finalizer on chaosengine
Fixes a condition where the chaos revert information is present in both annotations as well as the status of chaosresult CR (the inject/revert status is typically maintained/updated as an annotation on the chaosresult before it is updated into the status and cleared/removed from annotations)
Removes the hardcoded experiment job entrypoint and instead picks it from the ChaosExperiment CR’s .spec.definition.command
Fixes a scheduler bug that interpreted a minChaosInterval specified in hours (ex: 1h) as minutes
Improves the scheduler reconcile so that it no longer floods the logs on every reconcile, irrespective of the minChaosInterval
Enables the scheduler to start off with the chaos injection immediately upon application of the ChaosSchedule CR without waiting for the first installment of minChaosInterval period - in repeat mode with only the minChaosInterval specified
Handles edge/boundary conditions where chaos StartTime is behind CreationTimeStamp of ChaosSchedule OR next iteration of chaos as per minChaosInterval is beyond the EndTime
Adds a check to ignore chaos pods (operator, runner, experiment/helper/probe pods) and blacklist them from being chaos candidates (esp. needed when appinfo.applabel is configured with exclusion patterns such as: !keys OR <key> notin <value>)
Removes hostIPC, hostNetwork permissions for pod stress chaos experiments
Fixes an incorrect env key for TOTAL_CHAOS_DURATION in pod-dns experiments
Fixes a regression introduced in 1.13.6 wherein the experiment expected the parent workloads (deployment, statefulset et al) to carry labels specified in appinfo.applabel, apart from just the pods even when .spec.annotationCheck was set to false in the ChaosEngine. Prior to this, the parent workloads needed to have the label only when .spec.annotationCheck was set to true. This has been re-corrected as per earlier expectations.
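
The scheduler fixes in the list above concern interval units and start/end boundaries. A minimal sketch of that kind of logic follows, assuming single-unit interval strings such as "1h" or "30m". The function names and the simplified rules are hypothetical and do not reproduce the chaos-scheduler's actual Go implementation.

```python
from datetime import datetime, timedelta
from typing import Optional

# Seconds per unit suffix; only single-unit values such as "1h" are handled.
_UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600}

def parse_interval(value: str) -> timedelta:
    """Parse a single-unit interval string like "1h" into a timedelta."""
    number, unit = value[:-1], value[-1:]
    if unit not in _UNIT_SECONDS or not number.isdigit():
        raise ValueError(f"unsupported interval: {value!r}")
    return timedelta(seconds=int(number) * _UNIT_SECONDS[unit])

def next_run(last_run: Optional[datetime], interval: timedelta,
             start: datetime, end: datetime) -> Optional[datetime]:
    """Return the next injection time, or None if it would fall after end."""
    # With no previous run, inject at the start time rather than
    # waiting for one full interval to elapse first.
    candidate = start if last_run is None else last_run + interval
    return candidate if candidate <= end else None

start = datetime(2021, 7, 15, 10, 0)
end = datetime(2021, 7, 15, 12, 0)
interval = parse_interval("1h")
print(next_run(None, interval, start, end))
print(next_run(datetime(2021, 7, 15, 11, 30), interval, start, end))
```

Carrying the unit through to a timedelta, instead of passing a bare number around, is one way to avoid an hours value being read as minutes.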

Limitations

Chaos abort (via .spec.engineState set to stop OR via chaosengine deletion) operation is known to have an issue with the namespace scoped chaos-operator in 1.13.8, i.e., an operator running with WATCH_NAMESPACE env set to a specific value and using role permissions. In such cases, the finalizer on the ChaosEngine needs to be removed manually and the resource deleted to ensure the operator functions properly.

This is not necessary for cluster scoped operators, which is the default mode of usage (the WATCH_NAMESPACE env is set to an empty string to cover all namespaces, and the operator leverages clusterrole permissions).

The fix for correcting the behavior of namespace scoped operators will be added in the next patch.

Installation

kubectl apply -f https://litmuschaos.github.io/litmus/litmus-operator-v1.13.8.yaml

Verify your installation

Verify if the chaos operator is running
kubectl get pods -n litmus
Verify if chaos CRDs are installed
kubectl get crds | grep chaos

For more details refer to the documentation at Docs

15 Jun 21:35

ksatchit

1.13.6

dc086b3

1.13.6

New Features & Enhancements

Supports automated rollback/abort of chaos depending upon predefined conditions (defined in the probes). The probes can now be configured with a StopOnFailure property set to true or false to control the execution flow of the experiment.
Enhances the ChaosResult status schema to provide details of (a) the target resource impacted (b) success of the chaos revert operation.
Introduces additional labels for the “interleaved” chaos metrics (litmus_awaited_experiments & litmus_experiment_verdict) to indicate workflow name & chaos injection timestamp. This is expected to help in the construction of more meaningful dashboards to track app behavior under chaos.
Adds the golang chaoslib and experiment logic for docker-service-kill (from ansible)
Introduces the tech-preview of a new category (aws-ssm) of chaos experiments that can inject common resource and network chaos in EC2 instances (whether part of a kubernetes cluster or standalone/vanilla instances).
Introduces the tech-preview of refactored pod-cpu-hog & pod-memory-hog chaos experiments that can inject resource chaos on target apps externally (non-exec mode) via cgroup operations.
Improves/dockerizes the build process for most components (removes vendor packages stored on the repo and migrates to github workflows)
Reduces the size of the experiment (go-runner) image by creating a single chaos helper component that takes specific chaos operations as flags
Extends the StatusCheckTimeout property to the helper pods (earlier releases had this only for pre/post chaos checks), thereby helping the flexible evaluation of application availability/readiness during the chaos
Adds a new event for “Abort” on the ChaosResult
Increases coverage in the commit-based e2e runs on the litmus-go repo with the addition of node chaos tests
Adds a new helm chart for kube-aws (chaos experiment bundle) in the litmus-helm repository.
Enhances the litmus-sdk to (a) create a highly generic experiment scaffolding that can trigger and kill chaos via shell commands passed as environment variables (change from an earlier sample of pod-delete) and (b) push all non-code files (CR yamls) into a dedicated directory that can be directly copied/committed to the chaos-charts repo.
Cuts the first tagged release on the test-tools repository and sets up downloadable artifacts for the dependent chaos utils (nsutil, pauseutil, promql, dns-interceptor).
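
To make the StopOnFailure behaviour from the first item of the list above more concrete, here is a minimal sketch of an execution loop in which a failing probe may or may not halt the run. The Probe class, its field names, and the return shape are hypothetical stand-ins. They are not the Litmus probe schema or the litmus-go implementation.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

@dataclass
class Probe:
    name: str
    check: Callable[[], bool]      # returns True when the probe passes
    stop_on_failure: bool = False  # hypothetical stand-in for StopOnFailure

def run_probes(probes: List[Probe]) -> Tuple[bool, List[str]]:
    """Run probes in order and return (stopped_early, failed_names).

    A failing probe with stop_on_failure=True halts the loop immediately;
    any other failure is only recorded and execution continues.
    """
    failed: List[str] = []
    for probe in probes:
        if probe.check():
            continue
        failed.append(probe.name)
        if probe.stop_on_failure:
            return True, failed
    return False, failed

probes = [
    Probe("http-check", lambda: False),
    Probe("latency-check", lambda: False, stop_on_failure=True),
    Probe("never-reached", lambda: True),
]
print(run_probes(probes))
```

In this sketch the third probe is never evaluated, because the second one fails with stop_on_failure set.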

Major Bug Fixes

Adds missing environment variables for kill sequence and pod affected percentage in the kafka-broker-pod-failure experiment
Fixes the missing environment variable for defining the spoof map within the dns-spoof experiment.
Fixes the ChaosScheduler to work with the latest versions of the chaos-operator and updates documentation with missing mandatory properties in the .spec.engineTemplate

Installation

kubectl apply -f https://litmuschaos.github.io/litmus/litmus-operator-v1.13.6.yaml

Verify your installation

Verify if the chaos operator is running
kubectl get pods -n litmus
Verify if chaos CRDs are installed
kubectl get crds | grep chaos

For more details refer to the documentation at Docs

15 May 18:30

ksatchit

1.13.5

dc086b3

1.13.5

New Features & Enhancements

Introduces category for VMWare chaos with VM power-off experiment (supported for vCenter 6.x)
Adds chaos experiments for simulating DNS errors (inability to resolve hosts) and redirection to incorrect/faulty services (using a spoof map that can redirect specific requests)
Makes the chaos annotationCheck against applications “false” by default, making it simpler for users to get started with chaos without any instrumentation step for the application targets.
Updates the CRD version to v1; the minimum supported Kubernetes version moves to 1.15
Enhances the disk fill experiment with a tunable to specify write block size for quicker capacity use and fs aligned writes.
Supports label-based selection of node targets for (node-level) chaos injection.
Adds chaos abort routines for AWS chaos experiments
Adds the ability to target EBS volumes by tag, with sequential or parallel injection of chaos, and with support for both simple and EKS persistent volumes.
Places non-litmus core images (dependencies such as argo, MongoDB for portal driven chaos) into litmuschaos image registry, while maintaining image names and release tags to simplify the user experience for those who need to set up local mirrors or are in air-gapped environments
Adds support for Openshift Route in the litmus helm charts
Refactors and optimizes chaos libraries for code reuse and simplified flow. Updates the litmus-sdk to generate refactored experiment templates
Adds GitHub actions based workflow/pipeline for node-level chaos experiments in e2e suite

Major Bug Fixes

Uses the “preserve-unknown-fields” option to fix the inability to define certain attributes within the ChaosEngines for which the OpenAPI validation was missing (due to the migration of the CRD version to v1). Also adds validations for a number of properties/attributes.
Fixes a panic encountered in the chaos-runner upon the inability to access the ChaosEngine resource
Fixes the node restart experiment to perform the right verification checks on helper pods executing the chaos
Fixes a behavior where helper pods that complete quickly (run for short durations) were treated as failed; the check now also verifies the “succeeded” state.
Removes ambiguity in filtering/accessing helper pods by assigning standard label format
Fixes an erroneous decision in pod-cpu & memory hog experiments which considered a non-zero response (137) upon chaos process kill (SIGKILL) as failure to revert/rollback
Adds a check to verify the status of application target containers before attempting an exec operation to perform the desired chaos action
Fixes the ec2-terminate-by-tag experiment to consider only the running instances for stop/termination
Adds the missing PORTAL_ENDPOINT environment to facilitate namespaced mode of execution of the litmus-portal
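
The 137 response mentioned in the pod-cpu & memory hog fix above follows the common POSIX shell convention of reporting 128 plus the signal number for a process terminated by a signal, and SIGKILL is signal 9. The sketch below shows that arithmetic and one way a revert check could accept it. The helper names are hypothetical and are not part of the litmus-go code.

```python
import signal

def shell_exit_code(sig: signal.Signals) -> int:
    """Exit status a POSIX shell reports for a process terminated by sig."""
    return 128 + int(sig)

def is_expected_kill(exit_code: int) -> bool:
    """Accept a clean exit or the SIGKILL status as a successful stop."""
    return exit_code in (0, shell_exit_code(signal.SIGKILL))

print(shell_exit_code(signal.SIGKILL))
print(is_expected_kill(137), is_expected_kill(1))
```

Any other non-zero code is still treated as a failure in this sketch.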

Installation

kubectl apply -f https://litmuschaos.github.io/litmus/litmus-operator-v1.13.5.yaml

Verify your installation

Verify if the chaos operator is running
kubectl get pods -n litmus
Verify if chaos CRDs are installed
kubectl get crds | grep chaos

For more details refer to the documentation at Docs

Releases: litmuschaos/litmus
