Releases: apache/skywalking

9.7.0

01 Dec 18:19

Download

https://skywalking.apache.org/downloads/

Notice

Don't download the source code from this page.
Please follow the build document if you want to build from source yourself.

Dark Mode

The default style mode is now dark mode; light mode is still available.

New Design Log View

A new design for the log view is now available. It makes logs easier to locate and leaves more space for the raw text.

Project

  • Bump Java agent to 9.1-dev in the e2e tests.
  • Bump up netty to 4.1.100.
  • Update Groovy 3 to 4.0.15.
  • Support packaging the project in JDK21. Compiler source and target remain in JDK11.

OAP Server

  • ElasticSearchClient: Add deleteById API.
  • Fix custom alarm rules being overwritten by 'resource/alarm-settings.yml'.
  • Support Kafka Monitoring.
  • Support Pulsar server and BookKeeper server Monitoring.
  • [Breaking Change] Elasticsearch storage merges all management data indices into one index, management,
    including ui_template, ui_menu, and continuous_profiling_policy.
  • Add a release mechanism for expired alarm windows to avoid OOM.
  • Fix Zipkin trace receiver response: change the HTTP status code from 200 to 202.
  • Update BanyanDB Java Client to 0.5.0.
  • Fix getInstances query in the BanyanDB Metadata DAO.
  • BanyanDBStorageClient: Add keepAliveProperty API.
  • Fix table exists check in the JDBC Storage Plugin.
  • Enhance extensibility of HTTP Server library.
  • Adjust AlarmRecord alarmMessage column length to 512.
  • Fix EventHookCallback build event: build the layer from Service's Layer.
  • Fix AlarmCore doAlarm: catch exception for each callback to avoid interruption.
  • Optimize queryBasicTraces in TraceQueryEsDAO.
  • Fix WebhookCallback sending incorrect messages, and catch exceptions for each callback HTTP POST.
  • Fix AlarmRule expression validation: add labeled metrics mock data for check.
  • Support collecting ZGC memory pool metrics.
  • Add a component ID for Netty-http (ID=151).
  • Add a component ID for Fiber (ID=5021).
  • BanyanDBStorageClient: Add define(Property property, PropertyStore.Strategy strategy) API.
  • Correct the file format and fix typos in the filenames for monitoring Kafka's e2e tests.
  • Support extracting a timestamp from a patterned datetime string in LAL.
  • Support outputting key parameters in the booting logs.
  • Fix being unable to query Zipkin traces with the annotationQuery parameter in JDBC-related storage.
  • Fix limit not working for the findEndpoint API in ES storage.
  • Isolate MAL CounterWindow cache by metric name.
  • Fix JDBC Log query order.
  • Change the DataCarrier IF_POSSIBLE strategy to use ArrayBlockingQueue implementation.
  • Change the policy of the queue(DataCarrier) in the L1 metric aggregate worker to IF_POSSIBLE mode.
  • Add self-observability metric metrics_aggregator_abandon to count the number of abandoned metrics.
  • Support Nginx monitoring.
  • Fix BanyanDB Metadata Query: make query single instance/process return full tags to avoid NPE.
  • Replace go2sky E2E with the Go agent.
  • Replace Metrics v2 protocol with MQE in UI templates and E2E Test.
  • Fix incorrect apisix metrics otel rules.
  • Support Scratch The OAP Config Dump.
  • Support increase/rate function in the MQE query language.
  • Group service endpoints into _abandoned when endpoints have high cardinality.
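
The DataCarrier items in the list above describe a non-blocking buffer: in IF_POSSIBLE mode a producer hands data to a bounded queue, and when the queue is full the data is abandoned rather than the producer being blocked, with metrics_aggregator_abandon counting what was abandoned. The following is a minimal Python sketch of that idea only. The class `IfPossibleBuffer` and its methods are hypothetical names for illustration; SkyWalking's real implementation is Java code built on ArrayBlockingQueue, and this sketch does not mirror its API.

```python
import queue


class IfPossibleBuffer:
    """Bounded buffer that drops new items instead of blocking when full.

    Hypothetical illustration only; not SkyWalking's DataCarrier API.
    """

    def __init__(self, capacity: int) -> None:
        self._queue = queue.Queue(maxsize=capacity)
        # Plays the role of a counter such as metrics_aggregator_abandon.
        self.abandoned = 0

    def produce(self, item) -> bool:
        """Try to enqueue without blocking; return False if the item was dropped."""
        try:
            self._queue.put_nowait(item)
            return True
        except queue.Full:
            self.abandoned += 1
            return False

    def drain(self) -> list:
        """Remove and return everything currently buffered."""
        items = []
        while True:
            try:
                items.append(self._queue.get_nowait())
            except queue.Empty:
                return items


# Four items offered to a buffer that holds two: the last two are abandoned.
buffer = IfPossibleBuffer(capacity=2)
results = [buffer.produce(n) for n in range(4)]
```

The trade-off in this style of buffer is that producers never stall under load, at the cost of losing data, which is why a counter for abandoned items is useful to watch.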

UI

  • Add new menu for kafka monitoring.
  • Fix independent widget duration.
  • Fix the display height of the link tree structure.
  • Replace the name by shortName on service widget.
  • Refactor: update pagination style. No visualization style change.
  • Apply MQE on K8s layer UI-templates.
  • Fix icons display in trace tree diagram.
  • Fix: update tooltip style to support multiple metrics scrolling view in a metrics graph.
  • Add a new widget to show jvm memory pool detail.
  • Fix: avoid querying data with empty parameters.
  • Add a title and a description for trace segments.
  • Add Netty icon for Netty HTTP plugin.
  • Add Pulsar menu i18n files.
  • Refactor Logs view.
  • Implement the Dark Theme.
  • Change UI templates for Text widgets.
  • Add Nginx menu i18n.
  • Fix the height for trace widget.
  • Polish list style.
  • Fix Log associate with Trace.
  • Enhance layout for broken Topology widget.
  • Fix calls metric with call type for Topology widget.
  • Fix changing metrics config for Topology widget.
  • Fix routes for Tab widget.
  • Remove OpenFunction(FAAS layer) related UI templates and menu item.
  • Fix: change colors to match dark theme for Network Profiling.
  • Remove the description of OpenFunction in the UI i18n.
  • Reduce component chunks to improve page loading resource time.

Documentation

  • Separate storage docs into different files, and add an estimated timeline for BanyanDB(end of 2023).
  • Add topology configuration in UI-Grafana doc.
  • Add missing metrics to the OpenTelemetry Metrics doc.
  • Polish docs of Concepts and Designs.
  • Fix incorrect notes of slowCacheReadThreshold.
  • Update OAP setup and cluster coordinator docs to explain the new booting parameters table in the logs, and how to set up
    cluster mode.

All issues and pull requests are here

9.6.0

03 Sep 14:33

New Alerting Kernel

  • MQE(Metrics Query Expression) and a new notification mechanism are supported.

Support Loki LogQL

  • Newly added support for Loki LogQL and Grafana Loki Dashboard for SkyWalking collected logs

WARNING

  • ElasticSearch 6 storage-related tests are removed. It has worked so far, but is no longer guaranteed because it has officially reached end of life.

Project

  • Bump up Guava to 32.0.1 to avoid the lib listed as vulnerable due to CVE-2020-8908. This API is never used.
  • Maven artifact skywalking-log-recevier-plugin is renamed to skywalking-log-receiver-plugin.
  • Bump up cli version 0.11 to 0.12.
  • Bump up the version of ASF parent pom to v30.
  • Make builds reproducible for automatic releases CI.

OAP Server

  • Add Neo4j component ID(112) language: Python.
  • Add Istio ServiceEntry registry to resolve unknown IPs in ALS.
  • Wrap deleteProperty API to the BanyanDBStorageClient.
  • [Breaking change] Remove matchedCounter from HttpUriRecognitionService#feedRawData.
  • Remove patterns from HttpUriRecognitionService#feedRawData and add max 10 candidates of raw URIs for each pattern.
  • Add component ID for WebSphere.
  • Fix AI Pipeline uri caching NullPointer and IllegalArgument Exceptions.
  • Fix NPE in metrics query when the metric does not exist.
  • Remove E2E tests for Istio < 1.15, ElasticSearch < 7.16.3, they might still work but are not supported as planned.
  • Scroll all results in ElasticSearch storage and refactor scrolling logics, including Service, Instance, Endpoint,
    Process, etc.
  • Improve Kubernetes coordinator to remove Terminating OAP Pods in cluster.
  • Support SW_CORE_SYNC_PERIOD_HTTP_URI_RECOGNITION_PATTERN and SW_CORE_TRAINING_PERIOD_HTTP_URI_RECOGNITION_PATTERN
    to control the period of training and sync HTTP URI recognition patterns. And shorten the default period to 10s for
    sync and 60s for training.
  • Fix ElasticSearch scroller bug.
  • Add component ID for Aerospike(ID=149).
  • Packages with name recevier are renamed to receiver.
  • BanyanDBMetricsDAO handles storeIDTag in multiGet for BanyanDBModelExtension.
  • Fix endpoint grouping-related logic and enhance the performance of PatternTree retrieval.
  • Fix metric session cache saving after batch insert when using mysql-connector-java.
  • Support dynamic UI menu query.
  • Add comment for docker/.env to explain the usage.
  • Fix the wrong environment variable name SW_OTEL_RECEIVER_ENABLED_OTEL_RULES, which is corrected to SW_OTEL_RECEIVER_ENABLED_OTEL_METRICS_RULES.
  • Fix instance query in JDBC implementation.
  • Set the SW_QUERY_MAX_QUERY_COMPLEXITY default value to 3000(was 1000).
  • Accept length=4000 parameter value of the event. It was 2000.
  • Tolerate parameter value in illegal JSON format.
  • Update BanyanDB Java Client to 0.4.0
  • Support aggregate Labeled Value Metrics in MQE.
  • [Breaking change] Change the default label name in MQE from label to _.
  • Bump up grpc version to 1.53.0.
  • [Breaking change] Removed '&' symbols from shell scripts to avoid OAP server process running as a background process.
  • Revert part of #10616 to fix the unexpected changes: if there is no data we should return an array with 0s,
    but in #10616, an empty array is returned.
  • Cache all service entity in memory for query.
  • Bump up jackson version to 2.15.2.
  • Increase the default memory size to avoid OOM.
  • Bump up graphql-java to 21.0.
  • Add Echo component ID(5015) language: Golang.
  • Fix index out of bounds exception in aggregate_labels MQE function.
  • Support MongoDB Server/Cluster monitoring powered by OTEL.
  • Do not print configuration values in logs to avoid leaking sensitive information.
  • Create the latest index before retrieving indexes by aliases to avoid the 404 exception. This just prevents some interference from manual operations.
  • Add more Go VM metrics, as new skywalking-go agent provided since its 0.2 release.
  • Add component ID for Lock (ID=5016).
  • [Breaking change] Adjust the structure of hooks in the alarm-settings.yml. Support multiple configs for each hook type and specifying the hooks in the alarm rule.
  • Bump up Armeria to 1.24.3.
  • Fix BooleanMatch and BooleanNotEqualMatch doing Boolean comparison.
  • Support LogQL HTTP query APIs.
  • Add Mux Server component ID(5017) language: Golang.
  • Remove ElasticSearch 6.3.2 from our client lib tests.
  • Bump up ElasticSearch server 8.8.1 to 8.9.0 for latest e2e testing. 8.1.0, 7.16.3 and 7.17.10 are still tested.
  • Add OpenSearch 2.8.0 to our client lib tests.
  • Use listening mode for apollo implementation of dynamic configuration.
  • Add view_as_seq function in MQE for listing metrics in the given prioritized sequence.
  • Fix the wrong default value of k8sServiceNameRule if it's not explicitly set.
  • Improve PromQL to allow for multiple metric operations within a single query.
  • Fix MQE Binary Operation between labeled metrics and other type of value result.
  • Add component ID for Nacos (ID=150).
  • Support Compare Operation in MQE.
  • Fix the Kubernetes resource cache not refreshed.
  • Fix wrong classpath that might cause OOM in startup.
  • Enhance the serviceRelation in MAL by adding settings for the delimiter and component fields.
  • [Breaking change] Support MQE in the Alerting. In the Alarm Rules configuration(alarm-settings.yml),
    the expression field is added, the metrics-name/count/threshold/op/only-as-condition fields are removed, and the composite-rules configuration is removed.
  • Check results in ALS as per downstream/upstream instead of per log.
  • Fix GraphQL query listInstances not using endTime query
  • Do not start server and Kafka consumer in init mode.
  • Add Iris component ID(5018).
  • Add OTLP Tracing support as a Zipkin trace input.
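
The breaking change to alarm rules in the list above replaces separate metrics-name/count/threshold/op fields with a single expression field. As a rough illustration of why one expression can carry the same information, the Python sketch below evaluates a rule of the shape `sum(metric > 1000) >= 3` over a window of per-minute values and compares it with the old field-based form. The function names, the sample window, and the exact expression are assumptions made for this example; this is not the OAP alerting kernel or an MQE parser.

```python
from typing import List


def greater_than(values: List[float], threshold: float) -> List[int]:
    """Element-wise comparison: 1 where the value exceeds the threshold, else 0."""
    return [1 if v > threshold else 0 for v in values]


def old_style_alarm(values: List[float], threshold: float, count: int) -> bool:
    """Old rule shape: separate threshold, op ('>' here) and count fields."""
    return sum(1 for v in values if v > threshold) >= count


def expression_style_alarm(values: List[float]) -> bool:
    """The same rule written as one expression: sum(metric > 1000) >= 3."""
    return sum(greater_than(values, 1000)) >= 3


# One value per minute over a hypothetical 10-minute window (milliseconds).
window = [200, 1500, 900, 1200, 300, 2500, 100, 400, 800, 950]
old_result = old_style_alarm(window, threshold=1000, count=3)
new_result = expression_style_alarm(window)
```

Both forms agree on this window because three of the ten values exceed 1000. The expression form has the advantage that the comparison, the aggregation, and the trigger condition can each be changed without adding new configuration fields.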

UI

  • Fix metric name browser_app_error_rate in Browser-Root dashboard.
  • Fix display name of endpoint_cpm for endpoint list in General-Service dashboard.
  • Implement customize menus and marketplace page.
  • Fix minTraceDuration and maxTraceDuration types.
  • Fix init minTime to Infinity.
  • Bump dependencies to fix vulnerabilities.
  • Add scss variables.
  • Fix the title of instance list and notices in the continuous profiling.
  • Add a link to explain the expression metric, add units in the continuous profiling widget.
  • Calculate string width to set Tabs name width.
  • [Breaking change] Removed '&' symbols from shell scripts to avoid web application server process running as a background process.
  • Reset chart label.
  • Fix service associates instances.
  • Remove node-sass.
  • Fix commit error on Windows.
  • Apply MQE on MYSQL, POSTGRESQL, REDIS, ELASTICSEARCH and DYNAMODB layer UI-templates.
  • Apply MQE on Virtual-Cache layer UI-templates
  • Apply MQE on APISIX, AWS_EKS, AWS_GATEWAY and AWS_S3 layer UI templates.
  • Apply MQE on RabbitMQ Dashboards.
  • Apply MQE on Virtual-MQ layer UI-templates
  • Apply MQE on Infra-Linux layer UI-templates
  • Apply MQE on Infra-Windows layer UI-templates
  • Apply MQE on Browser layer UI-templates.
  • Implement MQE on topology widget.
  • Fix getEndpoints keyword blank.
  • Implement a breadcrumb component as navigation.

Documentation

  • Add Go agent into the server agent documentation.
  • Add data unit description in the configuration of continuous profiling policy.
  • Remove storage extension doc, as it is expired.
  • Remove how to add menu doc, as SkyWalking supports marketplace and new backend-based setup.
  • Separate contribution docs to a new menu structure.
  • Add a doc to explain how to manage i18n.
  • Add a doc to explain OTLP Trace support.
  • Fix typo in dynamic-config-configmap.md.
  • Fix out-dated docs about Kafka fetcher.
  • Remove 3rd-party fetchers from the docs, as they are not maintained anymore.

All issues and pull requests are here

9.5.0

18 Jun 16:20

New Topology Layout

Elasticsearch Server Monitoring

Project

  • Fix Duplicate class found due to the delombok goal.

OAP Server

  • Fix wrong layer of metric user error in DynamoDB monitoring.
  • ElasticSearch storage does not check field types when OAP running in no-init mode.
  • Support to bind TLS status as a part of component for service topology.
  • Fix component ID priority bug.
  • Fix component ID of topology overlap due to storage layer bugs.
  • [Breaking Change] Enhance JDBC storage through merging tables and managing day-based table rolling.
  • [Breaking Change] Sharding-MySQL implementations and tests are removed because the day-based rolling mechanism is now the default.
  • Fix otel k8s-cluster rule: add the namespace dimension for MAL aggregation calculation (Deployment Status, Deployment Spec Replicas).
  • Support continuous profiling feature.
  • Support collecting process-level metrics.
  • Fix K8sRetag reads the wrong k8s service from the cache due to a possible namespace mismatch.
  • [Breaking Change] Support cross-thread trace profiling. The data structure and query APIs are changed.
  • Fix PromQL HTTP API /api/v1/labels response missing service label.
  • Fix possible NPE when initializing IntList.
  • Support parsing PromQL expressions that have empty labels in the braces for metadata queries.
  • Support alarm metric OP !=.
  • Support indicating in metrics queries whether value == 0 represents an actual zero or no data.
  • Fix NPE when querying non-existent series indexes in ElasticSearch storage.
  • Support collecting memory buff/cache metrics in VM monitoring.
  • PromQL: Remove empty values from the query result, fix /api/v1/metadata param limit could cause out of bound.
  • Support monitoring the total number metrics of k8s StatefulSet and DaemonSet.
  • Support Amazon API Gateway monitoring.
  • Bump up graphql-java to fix cve.
  • Bump up Kubernetes Java client.
  • Support Redis Monitoring.
  • Add component ID for amqp, amqp-producer and amqp-consumer.
  • Support no-proxy mode for aws-firehose receiver
  • Bump up armeria to 1.23.1
  • Support Elasticsearch Monitoring.
  • Fix PromQL HTTP API /api/v1/series response missing service label when matching metric.
  • Support ServerSide TopN for BanyanDB.
  • Add component ID for Jersey.
  • Remove OpenCensus support, the related codes and docs as it's sunsetting.
  • Support dynamic configuration of searchableTracesTags
  • Support exportErrorStatusTraceOnly to export only the error status trace segments through the Kafka channel.
  • Add component ID for Grizzly.
  • Fix potential NPE in Zipkin receiver when the Span is missing some fields.
  • Filter out unknown_cluster metric data.
  • Support RabbitMQ Monitoring.
  • Support Redis slow logs collection.
  • Fix data loss when querying continuous profiling task records.
  • Adapt the continuous profiling task query GraphQL.
  • Support Metrics Query Expression(MQE), allowing users to do simple query-stage calculations through the expression.
  • Deprecate metrics query v2 protocol.
  • Deprecate record query protocol.
  • Add component ID for go-redis.
  • Add OpenSearch 2.8.0 to test case.
  • Add ai-pipeline module.
  • Support HTTP URI formatting through ai-pipeline to do pattern recognition.
  • Add new HTTP URI grouping engine with benchmark.
  • [Breaking Change] Use the new HTTP URI grouping engine to replace the old regex based mechanism.
  • Support sumLabeled in MAL.
  • Migrate from kubernetes-client/java to fabric8 client.
  • Envoy ALS generated relation metrics consider HTTP status codes >= 400 as errors at the client side.
  • Add cause message field when querying continuous profiling tasks.

UI

  • Revert: cpm5d function. This feature is cancelled from backend.
  • Fix: alerting link breaks on the topology.
  • Refactor Topology widget to make it more hierarchical.
    1. Choose User as the first node.
    2. If User node is absent, choose the busiest node(which has the most calls of all).
    3. Do a left-to-right flow process.
    4. At the same level, list nodes from top to bottom in alphabetical order.
  • Fix filter ID when ReadRecords metric associates with trace.
  • Add AWS API Gateway menu.
  • Change trace profiling protocol.
  • Add Redis menu.
  • Optimize data types.
  • Support isEmptyValue flag for metrics query.
  • Add elasticsearch menu.
  • [Clean UI templates before upgrade] Set showSymbol: true, so that data points are shown on the Line graph.
    Please clean the ui_template index in Elasticsearch storage or the table in JDBC storage.
  • [Clean UI templates before upgrade] UI templates: Simplify metric name with the label.
  • Add MQ menu.
  • Add Jersey icon.
  • Fix: set endpoint and instance selectors with url parameters correctly.
  • Bump up dependencies versions icons-vue 1.1.4, element-plus 2.1.0, nanoid 3.3.6, postcss 8.4.23
  • Add OpenTelemetry log protocol support.
  • [Breaking Change] Configuration key enabledOtelRules is renamed to enabledOtelMetricsRules and
    the corresponding environment variable is renamed to SW_OTEL_RECEIVER_ENABLED_OTEL_METRICS_RULES.
  • Add grizzly icon.
  • Fix: the Instance List data display error.
  • Fix: set topN type to Number.
  • Support Metrics Query Expression(MQE), allowing users to do simple query-stage calculations through the expression.
  • Bump up zipkin ui dependency to 2.24.1.
  • Bump up vite to 4.0.5.
  • Apply MQE on General and Virtual-Database layer UI-templates.
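
The four numbered steps of the Topology widget refactor in the list above amount to a small layered-layout procedure. The Python sketch below is one way to read those steps, not the UI's actual code: `layout_levels` is a hypothetical function, "busiest" is assumed to mean the node touched by the most call edges, and nodes unreachable from the chosen first node are simply left out.

```python
from collections import defaultdict
from typing import List, Tuple


def layout_levels(calls: List[Tuple[str, str]]) -> List[List[str]]:
    """Group nodes into left-to-right levels following the four steps above."""
    nodes = sorted({name for edge in calls for name in edge})
    if not nodes:
        return []

    outgoing = defaultdict(list)
    call_count = defaultdict(int)
    for source, target in calls:
        outgoing[source].append(target)
        call_count[source] += 1
        call_count[target] += 1

    # Steps 1 and 2: start from User, otherwise from the busiest node.
    if "User" in nodes:
        root = "User"
    else:
        root = max(nodes, key=lambda name: call_count[name])

    # Step 3: walk the call direction breadth-first, one level per column.
    levels = []
    seen = {root}
    current = [root]
    while current:
        # Step 4: alphabetical order within a level (top to bottom).
        levels.append(sorted(current))
        next_level = set()
        for node in current:
            for target in outgoing[node]:
                if target not in seen:
                    seen.add(target)
                    next_level.add(target)
        current = list(next_level)
    return levels


calls = [
    ("User", "gateway"),
    ("gateway", "orders"),
    ("gateway", "cart"),
    ("orders", "db"),
    ("cart", "db"),
]
levels = layout_levels(calls)
```

With the sample calls, each inner list is one column of the diagram, read from left to right.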

Documentation

  • Add Profiling related documentation.
  • Add SUM_PER_MIN to MAL documentation.
  • Make the log-related docs clearer and easier to extend to more formats.
  • Update the cluster management and advanced deployment docs.

All issues and pull requests are here

9.4.0

09 Mar 04:01

PromQL and Grafana Support

Zipkin Lens UI Bundled

AWS S3 and DynamoDB monitoring

Project

  • Bump up Zipkin and Zipkin lens UI dependency to 2.24.0.
  • Bump up Apache parent pom version to 29.
  • Bump up Armeria version to 1.21.0.
  • Clean up maven pom.xmls.
  • Bump up Java version to 11.
  • Bump up snakeyaml to 2.0.

OAP Server

  • Add ServerStatusService in the core module to provide a new way to expose booting status to other modules.
  • Add Micrometer as a new component (ID=141).
  • Refactor session cache in MetricsPersistentWorker.
  • Cache enhancement - don't read new metrics from database in minute dimensionality.
    // When
    // (1) the time bucket of the server's latest stability status is provided
    //     1.1 the OAP has booted successfully
    //     1.2 the current dimensionality is in minute.
    //     1.3 the OAP cluster is rebalanced due to scaling
    // (2) the metrics are from the time after the timeOfLatestStabilitySts
    // (3) the metrics don't exist in the cache
    // the kernel should NOT try to load it from the database.
    //
    // Notice, about condition (2),
    // for the specific minute of booted successfully, the metrics are expected to load from database when
    // it doesn't exist in the cache.
  • Remove the offset of metric session timeout according to worker creation sequence.
  • Correct MetricsExtension annotations declarations in manual entities.
  • Support component IDs' priority in process relation metrics.
  • Remove abandon logic in MergableBufferedData, which caused unexpected no-update.
  • Fix LastUpdateTimestamp not being set, which caused the metrics session to expire.
  • Rename MAL rule spring-sleuth.yaml to spring-micrometer.yaml.
  • Fix memory leak in Zipkin API.
  • Remove the dependency of refresh_interval of ElasticSearch indices from elasticsearch/flushInterval config. Now,
    it uses core/persistentPeriod + 5s as refresh_interval for all indices instead.
  • Change elasticsearch/flushInterval to 5s(was 15s).
  • Optimize flushInterval of ElasticSearch BulkProcessor to avoid extra periodical flush in the continuous bulk streams.
  • An unexpected dot is added when exp is a pure metric name and expPrefix != null.
  • Support monitoring MariaDB.
  • Remove measure/stream specific interval settings in BanyanDB.
  • Add global-specific settings used to override global configurations (e.g segmentIntervalDays, blockIntervalHours) in BanyanDB.
  • Use TTL-driven interval settings for the measure-default group in BanyanDB.
  • Fix wrong group of non time-relative metadata in BanyanDB.
  • Refactor StorageData#id to the new StorageID object from a String type.
  • Support multiple component IDs in the service topology level.
  • Add ElasticSearch.Keyword annotation to declare the target field type as keyword.
  • [Breaking Change] Column component_id of service_relation_client_side and service_relation_server_side have been replaced by component_ids.
  • Support priority definition in the component-libraries.yml.
  • Enhance service topology query. When there are multiple components detected from the server side,
    the component type of the node would be determined by the priority, which was random in the previous release.
  • Remove component_id from service_instance_relation_client_side and service_instance_relation_server_side.
  • Make the satellite E2E test more stable.
  • Add Istio 1.16 to test matrix.
  • Register ValueColumn as Tag for Record in BanyanDB storage plugin.
  • Bump up Netty to 4.1.86.
  • Remove unnecessary additional columns when storage is in logical sharding mode.
  • The cluster coordinator supports a watch mechanism for notifying RemoteClientManager and ServerStatusService.
  • Fix ServiceMeshServiceDispatcher overwriting the ServiceDispatcher debug file when SW_OAL_ENGINE_DEBUG is enabled.
  • Use groupBy and in operators to optimize topology query for BanyanDB storage plugin.
  • Support server status watcher for MetricsPersistentWorker to check whether the metrics require initialization.
  • Fix incorrect meter values when using the sumPerMinLabeld or sumHistogramPercentile MAL function.
  • Fix attached events not being displayed when querying traces with the Zipkin Lens UI.
  • Remove time_bucket for both Stream and Measure kinds in BanyanDB plugin.
  • Merge TIME_BUCKET of Metrics and Record into StorageData.
  • Support no layer in the listServices query.
  • Fix time_bucket of ServiceTraffic not set correctly in slowSql of MAL.
  • Correct the TopN record query DAO of BanyanDB.
  • Tweak interval settings of BanyanDB.
  • Support monitoring AWS Cloud EKS.
  • Bump BanyanDB Java client to 0.3.0-rc1.
  • Remove id tag from measures.
  • Add BanyanDB.MeasureField to mark a column as a BanyanDB Measure field.
  • Add BanyanDB.StoreIDTag to store a process's id for searching.
  • [Breaking Change] The supported version of ShardingSphere-Proxy is upgraded from 5.1.2 to 5.3.1. Due to the changes of ShardingSphere's API, versions before 5.3.1 are not compatible.
  • Add the eBPF network profiling E2E Test in the per storage.
  • Fix TCP service instances lacking instance properties like pod and namespace, which caused Pod log not to work for TCP workloads.
  • Add Python HBase happybase module component ID(94).
  • Fix gRPC alarm cannot update settings from dynamic configuration source.
  • Add batchOfBytes configuration to limit the size of bulk flush.
  • Add Python Websocket module component ID(7018).
  • [Optional] Optimize single trace query performance by customizing routing in ElasticSearch. SkyWalking trace segments and Zipkin spans are using trace ID for routing. This is OFF by default, controlled by storage/elasticsearch/enableCustomRouting.
  • Enhance OAP HTTP server to support HTTPS
  • Remove handler scan in otel receiver, manual initialization instead
  • Add aws-firehose-receiver to support collecting AWS CloudWatch metrics (OpenTelemetry format). Notice: HTTPS/TLS setup is
    not supported. Because AWS Firehose sends proxy-style requests
    (https://... instead of /aws/firehose/metrics), there must be a proxy (Nginx, Envoy, etc.) in front of the receiver.
  • Avoid Antlr dependency versions differing between compile time and runtime.
  • PrometheusMetricConverter#escapedName now also supports converting / to _.
  • Add missing TCP throughput metrics.
  • Refactor @Column annotation, swap Column#name and ElasticSearch.Column#columnAlias and rename ElasticSearch.Column#columnAlias to ElasticSearch.Column#legacyName.
  • Add Python HTTPX module component ID(7019).
  • Migrate tests from junit 4 to junit 5.
  • Refactor http-based alarm plugins and extract common logic to HttpAlarmCallback.
  • Support Amazon Simple Storage Service (Amazon S3) metrics monitoring
  • Support process Sum metrics with AGGREGATION_TEMPORALITY_DELTA case
  • Support Amazon DynamoDB monitoring.
  • Support prometheus HTTP API and promQL.
  • Scope in the Entity of Metrics query v1 protocol is no longer required and is corrected automatically. The scope is determined based on the metric itself.
  • Add explicit ReadTimeout for ConsulConfigurationWatcher to avoid IllegalArgumentException: Cache watchInterval=10sec >= networkClientReadTimeout=10000ms.
  • Fix DurationUtils.getDurationPoints exceed, when startTimeBucket equals endTimeBucket.
  • Support process OpenTelemetry ExponentialHistogram metrics
  • Add FreeRedis component ID(3018).
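
The "Cache enhancement" item in the list above spells out, in its comment block, when the kernel should skip the database read for a metric that is missing from the cache. A minimal Python sketch of that decision follows. The function name, its parameters, and the use of integer minute time buckets are assumptions for illustration; the caller is assumed to be handling minute-dimensionality metrics only, and this is not the code of MetricsPersistentWorker.

```python
from typing import Optional, Set


def should_load_from_database(
    metric_id: str,
    metric_time_bucket: int,
    cache: Set[str],
    latest_stability_time_bucket: Optional[int],
) -> bool:
    """Decide whether a minute-dimension metric must be read from storage."""
    if metric_id in cache:
        # Already cached, so there is nothing to load.
        return False
    if latest_stability_time_bucket is None:
        # Condition (1) is not met: no stability status known yet.
        return True
    # Condition (2) is strict: metrics from the stability minute itself
    # are still expected to be loaded when they miss the cache.
    return metric_time_bucket <= latest_stability_time_bucket


# Hypothetical minute time buckets, written as yyyyMMddHHmm integers.
stable_at = 202303090401
after_stable = should_load_from_database("m1", 202303090402, set(), stable_at)
same_minute = should_load_from_database("m1", 202303090401, set(), stable_at)
```

The point of the rule is that once the server is stable, a cache miss for a later minute means the metric is genuinely new, so the read would return nothing and can be skipped.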

UI

  • Add Zipkin Lens UI to webapp, and proxy it to context path /zipkin.
  • Migrate the build tool from vue cli to Vite4.
  • Fix Instance Relation and Endpoint Relation dashboards show up.
  • Add Micrometer icon.
  • Update MySQL UI to support MariaDB.
  • Add AWS menu for supporting AWS monitoring.
  • Add missing FastAPI logo.
  • Update the log details page to support the formatted display of JSON content.
  • Fix build config.
  • Avoid being unable to drag process nodes for the first time.
  • Add node folder into ignore list.
  • Add ElPopconfirm to component types.
  • Add an iframe widget for zipkin UI.
  • Optimize graph tooltips to make them more friendly.
  • Bump json5 from 1.0.1 to 1.0.2.
  • Add websockets icon.
  • Implement independent mode for widgets.
  • Bump http-cache-semantics from 4.1.0 to 4.1.1.
  • Update menus for OpenFunction.
  • Add auto fresh to widgets independent mode.
  • Fix: clear trace ID on the Log and Trace widgets after using association.
  • Fix: reset duration for query conditions after time range changes.
  • Add AWS S3 menu.
  • Refactor: optimize side bar component to make it more friendly.
  • Fix: remove duplicate popup message for query result.
  • Add logo for HTTPX.
  • Refactor: optimize the attached events visualization in the trace widget.
  • Update BanyanDB client to 0.3.1.
  • Add AWS DynamoDB menu.
  • Fix: add auto period to the independent mode for widgets.
  • Optimize menus and add Windows monitoring menu.
  • Add a calculation for the cpm5dAvg.
  • Add a cpm5d calculation.
  • Fix data processing error in the eBPF profiling widget.
  • Support for double quotes in SlowSQL statements.
  • Fix: the wrong position of the menu when clicking the topology node.

##...

9.3.0

04 Dec 03:44

Metrics Association

Dashboard Pop-up Trace Query

APISIX Dashboard

Use Sharding MySQL as the Database

Virtual Cache Performance

Virtual MQ Performance

Project

  • Bump up the embedded swctl version in OAP Docker image.

OAP Server

  • Add component ID(133) for impala JDBC Java agent plugin and component ID(134) for impala server.
  • Use prepareStatement in H2SQLExecutor#getByIDs (no functional change).
  • Bump up snakeyaml to 1.32 for fixing CVE.
  • Fix DurationUtils.convertToTimeBucket missing date format verification.
  • Enhance LAL to support converting LogData to DatabaseSlowStatement.
  • [Breaking Change] Change the LAL script format(Add layer property).
  • Adapt ElasticSearch 8.1+, migrate from removed APIs to recommended APIs.
  • Support monitoring MySQL slow SQLs.
  • Support analyzing cache related spans to provide metrics and slow commands for cache services from client side
  • Optimize virtual database, fix dynamic config watcher NPE when default value is null
  • Remove physical index existing check and keep template existing check only to avoid meaningless retry wait
    in no-init mode.
  • Make sure the instance list is ordered in the TTL processor to avoid the TTL timer never running.
  • Support monitoring PostgreSQL slow SQLs.
  • [Breaking Change] Support sharding MySQL database instances and tables by Shardingsphere-Proxy. SQL-Database requires removing tables log_tag/segment_tag/zipkin_query before OAP starts when upgrading from previous releases.
  • Fix meter functions avgHistogram, avgHistogramPercentile, avgLabeled, sumHistogram having data conflict when
    downsampling.
  • Force sorting of the readLabeledMetricsValues result in case the storage(database) doesn't return data consistent
    with the parameter list.
  • Fix the wrong watch semantics in Kubernetes watchers, which caused heavy traffic to the API server in some Kubernetes clusters. We should use the Get State and Start at Most Recent semantic instead of Start at Exact because we don't need the change history events; see https://kubernetes.io/docs/reference/using-api/api-concepts/#semantics-for-watch.
  • Unify query services and DAOs codes time range condition to Duration.
  • [Breaking Change]: Remove prometheus-fetcher plugin, please use OpenTelemetry to scrape Prometheus metrics and
    set up SkyWalking OpenTelemetry receiver instead.
  • BugFix: histogram metrics sent to MAL should be treated as OpenTelemetry style, not Prometheus style:
    (-infinity, explicit_bounds[i]] for i == 0
    (explicit_bounds[i-1], explicit_bounds[i]] for 0 < i < size(explicit_bounds)
    (explicit_bounds[i-1], +infinity) for i == size(explicit_bounds)
    
  • Support Golang runtime metrics analysis.
  • Add APISIX metrics monitoring
  • Support skywalking-client-js report empty service version and page path , set default version as latest and
    default page path as /(root). Fix the
    error fetching data (/browser_app_page_pv0) : Can't split endpoint id into 2 parts.
  • [Breaking Change] Limit the max length of trace/log/alarm tag's key=value, set the max length of column tags
    in tables log_tag/segment_tag/alarm_record_tag and column query in zipkin_query and column tag_value in tag_autocomplete to 256.
    SQL-Database requires altering these columns' length or removing these tables before OAP starts when upgrading from previous releases.
  • Optimize the creation conditions of profiling task.
  • Lazy load the Kubernetes metadata and switch from event-driven to polling. Previously, watchers were set up to watch Kubernetes metadata changes. This works well when deployments change, since SkyWalking can react in real time. However, when the cluster has many events (such as in a large cluster or a special Kubernetes engine like OpenShift), the requests sent from SkyWalking become unpredictable, i.e. SkyWalking might send massive requests to the Kubernetes API server, causing heavy load on the API server. This PR switches from the watcher mechanism to a polling mechanism: SkyWalking polls the metadata at a specified interval, so the requests sent to the API server are predictable (~10 requests every interval, 3 minutes) and the request count is constant regardless of the cluster's changes. With this change SkyWalking can't react to cluster changes immediately, but the delay is acceptable in our case.
  • Optimize the query time of tasks in ProfileTaskCache.
  • Fix metrics being put into the wrong slot of the window in the alerting kernel.
  • Support sumPerMinLabeled in MAL.
  • Bump up jackson databind, snakeyaml, grpc dependencies.
  • Support export Trace and Log through Kafka.
  • Add new config initialization mechanism of module provider. This is a ModuleManager lib kernel level change.
  • [Breaking Change] Support new records query protocol, rename the column named service_id to entity_id to support different entities.
    Please re-create top_n_database_statement index/table.
  • Remove improper self-obs metrics in JvmMetricsHandler(for Kafka channel).
  • gRPC stream canceling code is not logged as an error when the client cancels the stream. The client
    cancels the stream when the pod is terminated.
  • [Breaking Change] Change the way of loading MAL rules(support pattern).
  • Move k8s related MAL files into /otel-rules/k8s.
  • [Breaking Change] Refactor service mesh protobuf definitions and split TCP-related metrics to individual definition.
  • Add TCP{Service,ServiceInstance,ServiceRelation,ServiceInstanceRelation} sources and split TCP-related entities out from
    original Service,ServiceInstance,ServiceRelation,ServiceInstanceRelation.
  • [Breaking Change] TCP-related source names are changed, fields of TCP-related sources are changed, please refer to the latest oal/tcp.oal file.
  • Do not log error logs when failed to create ElasticSearch index because the index is created already.
  • Add virtual MQ analysis for native traces.
  • Support Python runtime metrics analysis.
  • Support sampledTrace in LAL.
  • Support multiple rules with different names under the same layer of LAL script.
  • (Optimization) Reduce the buffer size(queue) of MAL(only) metric streams. Set L1 queue size as 1/20, L2 queue size as 1/2.
  • Support monitoring MySQL/PostgreSQL in the cluster mode.
  • [Breaking Change] Migrate to BanyanDB v0.2.0.
    • Adopt new OR logical operator for,
      1. MeasureIDs query
      2. BanyanDBProfileThreadSnapshotQueryDAO query
      3. Multiple Event conditions query
      4. Metrics query
    • Simplify Group check and creation
    • Partially apply UITemplate changes
    • Support index_only
    • Return CompletableFuture<Void> directly from BanyanDB client
    • Optimize data binary parse methods in *LogQueryDAO
    • Support different indexType
    • Support configuration for TTL and (block|segment) intervals
  • Elasticsearch storage: Provide system environment variable(SW_STORAGE_ES_SPECIFIC_INDEX_SETTINGS) and support specify the settings (number_of_shards/number_of_replicas) for each index individually.
  • Elasticsearch storage: Support update index settings (number_of_shards/number_of_replicas) for the index template after rebooting.
  • Optimize MQ Topology analysis. Use entry span's peer from the consumer side as source service when no producer instrumentation(no cross-process reference).
  • Refactor JDBC storage implementations to reuse logics.
  • Fix ClassCastException in LoggingConfigWatcher.
  • Support span attached event concept in Zipkin and SkyWalking trace query.
  • Support span attached events on Zipkin lens UI.
  • Force UTF-8 encoding in JsonLogHandler of kafka-fetcher-plugin.
  • Fix max length to 512 of entity, instance and endpoint IDs in trace, log, profiling, topN tables(JDBC storages). The value was 200 by default.
  • Add component IDs(135, 136, 137) for EventMesh server and client-side plugins.
  • Bump up Kafka client to 2.8.1 to fix CVE-2021-38153.
  • Remove lengthEnvVariable for Column as it never works as expected.
  • Add LongText to support longer logs persistent as a text type in ElasticSearch, instead of a keyword, to avoid length limitation.
  • Fix wrong system variable name SW_CORE_ENABLE_ENDPOINT_NAME_GROUPING_BY_OPENAPI. It was opaenapi.
  • Fix not-time-series model blocking OAP boots in no-init mode.
  • Fix ShardingTopologyQueryDAO.loadServiceRelationsDetectedAtServerSide invoke backend miss parameter serviceIds.
  • Changed system variable SW_SUPERDATASET_STORAGE_DAY_STEP to SW_STORAGE_ES_SUPER_DATASET_DAY_STEP to be consistent with other ES storage related variables.
  • Fix ESEventQueryDAO missing metric_table boolQuery criteria.
  • Add default entity name(_blank) if absent ...
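
The OpenTelemetry-style histogram buckets listed in the BugFix item above (upper bounds inclusive, with one final overflow bucket) can be illustrated with a minimal sketch. It assumes `explicit_bounds` is sorted in ascending order; the function name `bucket_index` is hypothetical and is not part of MAL or any SkyWalking API.

```python
from bisect import bisect_left


def bucket_index(value, explicit_bounds):
    """Return the bucket index i for value (hypothetical helper).

    Buckets follow the OpenTelemetry-style layout from the release note:
      i == 0:                      (-infinity, explicit_bounds[0]]
      0 < i < len(bounds):         (explicit_bounds[i-1], explicit_bounds[i]]
      i == len(bounds):            (explicit_bounds[-1], +infinity)
    bisect_left returns the first position whose bound is >= value,
    which is exactly the bucket with an inclusive upper bound.
    """
    return bisect_left(explicit_bounds, value)


bounds = [10, 50, 100]
for v in (5, 10, 10.5, 100, 101):
    print(v, "-> bucket", bucket_index(v, bounds))
```

A value equal to a bound lands in the bucket that ends at that bound, which is the behavior the inclusive upper bounds in the item above imply.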

9.2.0

01 Sep 12:54

Download

https://skywalking.apache.org/downloads/

Notice

Don't download source codes from this page.
Please follow build document, if you want to build source codes by yourself.

eBPF Network Profiling for K8s Pod

image

Event and Metrics Association

image

MySQL Server Monitoring

image

PostgreSQL Server Monitoring

image

Project

  • [Critical] Fix a low performance issue of metrics persistence in the ElasticSearch storage implementation. A single
    metric could have to wait for an unnecessary 7~10s(System Env Variable SW_STORAGE_ES_FLUSH_INTERVAL) in the 8.8.0 -
    9.1.0 releases.
  • Upgrade Armeria to 1.16.0, Kubernetes Java client to 15.0.1.

OAP Server

  • Add more entities for Zipkin to improve performance.
  • ElasticSearch: scroll id should be updated when scrolling as it may change.
  • Mesh: fix only last rule works when multiple rules are defined in metadata-service-mapping.yaml.
  • Support sending alarm messages to PagerDuty.
  • Support Zipkin kafka collector.
  • Add VIRTUAL detect type to Process for Network Profiling.
  • Add component ID(128) for Java Hutool plugin.
  • Add Zipkin query exception handler, response error message for illegal arguments.
  • Fix a NullPointerException in the endpoint analysis, which would cause missing MQ-related LocalSpan in the trace.
  • Add forEach, processRelation function to MAL expression.
  • Add expPrefix, initExp in MAL config.
  • Add component ID(7015) for Python Bottle plugin.
  • Remove legacy OAL percentile functions, p99, p95, p90, p75, p50 func(s).
  • Revert #8066. Keep all metrics persistent even if the value is the default value.
  • Skip loading UI templates if folder is empty or doesn't exist.
  • Optimize ElasticSearch query performance by using _mGet and physical index name rather than alias in these
    scenarios, (a) Metrics aggregation (b) Zipkin query (c) Metrics query (d) Log query
  • Support the NETWORK type of eBPF Profiling task.
  • Support sumHistogram in MAL.
  • [Breaking Change] Make the eBPF Profiling task support the service instance level,
    index/table ebpf_profiling_task is required to be re-created when upgrading from previous releases.
  • Fix race condition in BanyanDB storage.
  • Support SUM_PER_MIN downsampling in MAL.
  • Support sumHistogramPercentile in MAL.
  • Add VIRTUAL_CACHE to Layer, to fix the conjectured Redis server whose icon can't show on the topology.
  • [Breaking Change] Elasticsearch storage merge all metrics/meter and records(without super datasets) indices into one
    physical index template metrics-all and records-all on the default setting.
    Provide system environment variable(SW_STORAGE_ES_LOGIC_SHARDING) to shard metrics/meter indices into
    multi-physical indices as the previous versions(one index template per metric/meter aggregation function).
    In the current one index mode, users still could choose to adjust ElasticSearch's shard
    number(SW_STORAGE_ES_INDEX_SHARDS_NUMBER) to scale out.
    More details please refer to New ElasticSearch storage option explanation in 9.2.0
    and backend-storage doc
  • [Breaking Change] Index/table ebpf_profiling_schedule added a new column ebpf_profiling_schedule_id,
    H2/MySQL/TiDB/Postgres storage users are required to re-create it when upgrading from previous releases.
  • Fix Zipkin trace query the max size of spans.
  • Add tls and https component IDs for Network Profiling.
  • Support Elasticsearch column alias for the compatibility between storage logicSharding model and no-logicSharding model.
  • Support MySQL monitoring.
  • Support PostgreSQL monitoring.
  • Fix query services by serviceId error when Elasticsearch storage SW_STORAGE_ES_QUERY_MAX_SIZE > 10000.
  • Support sending alarm messages to Discord.
  • Fix query history process data failure.
  • Optimize TTL mechanism for Elasticsearch storage, skip executed indices in one TTL rotation.
  • Add Kubernetes support module to share codes between modules and reduce calls to Kubernetes API server.
  • Bump up Kubernetes Java client to fix CVE.
  • Adapt OpenTelemetry native metrics protocol.
  • [Breaking Change] rename configuration folder from otel-oc-rules to otel-rules.
  • [Breaking Change] rename configuration field from enabledOcRules to enabledOtelRules and
    environment variable name from SW_OTEL_RECEIVER_ENABLED_OC_RULES to SW_OTEL_RECEIVER_ENABLED_OTEL_RULES.
  • [Breaking Change] Fix JDBC TTL to delete additional tables data.
    SQL Database requires removing segment, segment_tag, logs, logs_tag, alarms, alarms_tag, zipkin_span, zipkin_query before OAP starts.
  • SQL Database: add @SQLDatabase.ExtraColumn4AdditionalEntity to support add an extra column from parent to an additional table.
  • Add component ID(131) for Java Micronaut plugin
  • Add component ID(132) for Nats java client plugin

UI

  • Fix query conditions for the browser logs.
  • Implement a url parameter to activate tab index.
  • Fix clear interval fail when switch autoRefresh to off.
  • Optimize log tables.
  • Fix log detail pop-up page doesn't work.
  • Optimize table widget to hide the whole metric column when no metric is set.
  • Implement the Event widget. Remove event menu.
  • Fix span detail text overlap.
  • Add Python Bottle Plugin Logo.
  • Implement an association between widgets(line, bar, area graphs) with time.
  • Fix tag dropdown style.
  • Hide the copy button when db.statement is empty.
  • Fix legend metrics for topology.
  • Dashboard: Add metrics association.
  • Dashboard: Fix FaaS-Root document link and topology service relation dashboard link.
  • Dashboard: Fix Mesh-Instance metric Throughput.
  • Dashboard: Fix Mesh-Service-Relation metric Throughput
    and Proxy Sidecar Internal Latency in Nanoseconds (Client Response).
  • Dashboard: Fix Mesh-Instance-Relation metric Throughput.
  • Enhance associations for the Event widget.
  • Add event widgets in dashboard where applicable.
  • Fix dashboard list search box not working.
  • Fix short time range.
  • Fix event widget incompatibility in Safari.
  • Refactor the tags component to support searching for tag keys and values.
  • Implement the log widget and the trace widget associate with each other, remove log tables on the trace widget.
  • Add log widget to general service root.
  • Associate the event widget with the trace and log widget.
  • Add the MYSQL layer and update layer routers.
  • Fix query order for trace list.
  • Add a calculation to convert seconds to days.
  • Add Spring Sleuth dashboard to general service instance.
  • Support the process dashboard and create the time range text widget.
  • Fix picking calendar with a wrong time range and setting a unique value for dashboard grid key.
  • Add PostgreSQL to Database sub-menu.
  • Implement the network profiling widget.
  • Add Micronaut icon for Java plugin.
  • Add Nats icon for Java plugin.
  • Bump moment and @vue/cli-plugin-e2e-cypress.
  • Add Network Profiling for Service Mesh DP instance and K8s pod panels.

Documentation

  • Fix invalid links in release docs.
  • Clean up doc about event metrics.
  • Add a table for metric calculations in the ui doc.
  • Add an explanation for alerting kernel and its in-memory window mechanism.
  • Add more docs for widget details.
  • Update alarm doc to introduce the configuration property key.
  • Fix dependency license's NOTICE and binary jar included issues in the source release.
  • Add eBPF CPU profiling doc.

All issues and pull requests are here

9.1.0

10 Jun 03:01

Download

https://skywalking.apache.org/downloads/

Notice

Don't download source codes from this page.
Please follow build document, if you want to build source codes by yourself.

eBPF profiling

image

On-demand Pod Log

image

Project

  • [IMPORTANT] Remove InfluxDB 1.x and Apache IoTDB 0.X as storage options, check details
    here. Remove converter-moshi 2.5.0, influx-java 2.15,
    iotdb java 0.12.5, thrift 0.14.1, moshi 1.5.0, msgpack 0.8.16 dependencies. Remove InfluxDB and IoTDB related code
    and E2E tests.
  • Upgrade OAP dependencies zipkin to 2.23.16, H2 to 2.1.212, Apache Freemarker to 2.3.31, gRPC-java 1.46.0, netty to
    4.1.76.
  • Upgrade Webapp dependencies, spring-cloud-dependencies to 2021.0.2, logback-classic to 1.2.11
  • [IMPORTANT] Add BanyanDB storage implementation. Notice BanyanDB is currently under active development
    and SHOULD NOT be used in production cluster.

OAP Server

  • Add component definition(ID=127) for Apache ShenYu (incubating).
  • Fix Zipkin receiver: Decode spans error, missing Layer for V9 and wrong time bucket for generate Service and
    Endpoint.
  • [Refactor] Move SQLDatabase(H2/MySQL/PostgreSQL), ElasticSearch and BanyanDB specific configurations out of column.
  • Support BanyanDB global index for entities. Log and Segment record entities declare this new feature.
  • Remove unnecessary analyzer settings in columns of templates. Many were added due to analyzer's default value.
  • Simplify the Kafka Fetch configuration in cluster mode.
  • [Breaking Change] Update the eBPF Profiling task to the service level, please delete
    index/table: ebpf_profiling_task, process_traffic.
  • Fix event can't split service ID into 2 parts.
  • Fix OAP Self-Observability metric GC Time calculation.
  • Set SW_QUERY_MAX_QUERY_COMPLEXITY default value to 1000
  • Webapp module (for UI) enabled compression.
  • [Breaking Change] Add layer field to event, reporting an event without a layer is not allowed.
  • Fix ES flush thread stops when flush schedule task throws exception, such as ElasticSearch flush failed.
  • Fix ES BulkProcessor in BatchProcessEsDAO was initialized multiple times and created multiple ES flush schedule tasks.
  • HTTPServer support the handler register with allowed HTTP methods.
  • [Critical] Revert Enhance DataCarrier#MultipleChannelsConsumer to add
    priority
    to avoid consuming issues.
  • Fix the problem that some configurations (such as group.id) did not take effect due to the override order when using
    the kafkaConsumerConfig property to extend the configuration in Kafka Fetcher.
  • Remove build time from the OAP version.
  • Add data-generator module to run OAP in testing mode, generating mock data for testing.
  • Support receive Kubernetes processes from gRPC protocol.
  • Fix the problem that es index(TimeSeriesTable, eg. endpoint_traffic, alarm_record) didn't create even after rerun with
    init-mode. This problem caused the OAP server to fail to start when the OAP server was down for more than a day.
  • Support autocomplete tags in traces query.
  • [Breaking Change] Replace all configurations **_JETTY_** to **_REST_**.
  • Add the support eBPF profiling field into the process entity.
  • E2E: fix log test miss verify LAL and metrics.
  • Enhance Converter mechanism in kernel level to make BanyanDB native feature more effective.
  • Add TermsAggregation properties collect_mode and execution_hint.
  • Add "execution_hint": "map", "collect_mode": "breadth_first" for aggregation and topology query to improve 5-10x
    performance.
  • Clean up scroll contexts after used.
  • Support autocomplete tags in logs query.
  • Enhance Deprecated MetricQuery(v1) getValues querying to asynchronous concurrency query
  • Fix the pod match error when the service has multiple selectors in a Kubernetes environment.
  • VM monitoring adapts the 0.50.0 of the opentelemetry-collector.
  • Add Envoy internal cost metrics.
  • Remove Layer concept from ServiceInstance.
  • Remove unnecessary onCompleted on gRPC onError callback.
  • Remove Layer concept from Process.
  • Update to list all eBPF profiling schedulers without duration.
  • Storage(ElasticSearch): add search options to tolerate nonexistent indices.
  • Fix the problem that MQ has the wrong Layer type.
  • Fix NoneStream model has wrong downsampling(was Second, should be Minute).
  • SQL Database: provide @SQLDatabase.AdditionalEntity to support create additional tables from a model.
  • [Breaking Change] SQL Database: remove SQL Database config maxSizeOfArrayColumn and numOfSearchableValuesPerTag.
  • [Breaking Change] SQL Database: move Tags list from Segment,Logs,Alarms to their additional table.
  • [Breaking Change] Remove total field in Trace, Log, Event, Browser log, and alarm list query.
  • Support OFF_CPU eBPF Profiling.
  • Fix SumAggregationBuilder#build should use the SumAggregation rather than MaxAggregation.
  • Add TiDB, OpenSearch, Postgres storage options to Trace and eBPF Profiling E2E testing.
  • Add OFF CPU eBPF Profiling E2E Testing.
  • Fix searchableTag as rpc.status_code and http.status_code. status_code had been removed.
  • Fix scroll query failure exception.
  • Add profileDataQueryBatchSize config in Elasticsearch Storage.
  • Add APIs to query Pod log on demand.
  • Remove OAL for events.
  • Simplify the index name formatting logic in ES storage.
  • Add instance properties extractor in MAL.
  • Support Zipkin traces collect and zipkin traces query API.
  • [Breaking Change] Zipkin receiver mechanism changes and traces do not stream into OAP Segment anymore.
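
The `execution_hint` and `collect_mode` properties mentioned in the TermsAggregation items above are standard options of the Elasticsearch terms aggregation. The following is a minimal sketch of a request body that sets them; the aggregation name `by_entity`, the field `entity_id`, and the helper `terms_aggregation` are hypothetical and do not reflect the actual queries the OAP server builds.

```python
import json


def terms_aggregation(field, size):
    """Build a terms aggregation with the two hints named in the release note.

    The field and size values are illustrative assumptions.
    """
    return {
        "terms": {
            "field": field,
            "size": size,
            # Aggregate on field values directly instead of global ordinals.
            "execution_hint": "map",
            # Collect top-level buckets first, then sub-aggregations.
            "collect_mode": "breadth_first",
        }
    }


# "size": 0 skips returning hits, since only the aggregation result is needed.
body = {"size": 0, "aggs": {"by_entity": terms_aggregation("entity_id", 10)}}
print(json.dumps(body, indent=2))
```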

UI

  • General service instance: move Thread Pool from JVM to Overview, fix JVM GC Count calculation.
  • Add Apache ShenYu (incubating) component LOGO.
  • Show more metrics on service/instance/endpoint list on the dashboards.
  • Support average values of metrics on the service/list/endpoint table widgets, with pop-up linear graph.
  • Fix viewLogs button query no data.
  • Fix UTC when page loads.
  • Implement the eBPF profile widget on dashboard.
  • Optimize the trace widget.
  • Avoid invalid query for topology metrics.
  • Add the alarm and log tag tips.
  • Fix spans details and task logs.
  • Verify query params to avoid invalid queries.
  • Mobile terminal adaptation.
  • Fix: set dropdown for the Tab widget, init instance/endpoint relation selectors, update sankey graph.
  • Add eBPF Profiling widget into General service, Service Mesh and Kubernetes tabs.
  • Fix jump to endpoint-relation dashboard template.
  • Fix set graph options.
  • Remove the Layer field from the Instance and Process.
  • Fix date time picker display when set hour to 0.
  • Implement tags auto-complete for Trace and Log.
  • Support multiple trees for the flame graph.
  • Fix the page doesn't need to be re-rendered when the url changes.
  • Remove unexpected data for exporting dashboards.
  • Fix duration time.
  • Remove the total field from query conditions.
  • Fix minDuration and maxDuration for the trace filter.
  • Add Log configuration for the browser templates.
  • Fix query conditions for the browser logs.
  • Add Spanish Translation.
  • Visualize the OFF CPU eBPF profiling.
  • Add Spanish language to UI.
  • Sort spans with startTime or spanId in a segment.
  • Visualize an on-demand log widget.
  • Fix activate the correct tab index after renaming a Tabs name.
  • FaaS dashboard support on-demand log (OpenFunction/functions-framework-go version > 0.3.0).

Documentation

  • Add eBPF agent into probe introduction.

All issues and pull requests are here

9.0.0

09 Apr 12:13

Download

https://skywalking.apache.org/downloads/

Notice

Don't download source codes from this page.
Please follow build document, if you want to build source codes by yourself.


Project

  • Upgrade log4j2 to 2.17.1 for CVE-2021-44228, CVE-2021-45046, CVE-2021-45105 and CVE-2021-44832. These CVEs only affect
    JDKs on which JNDI is enabled by default. Notice, using the JVM option -Dlog4j2.formatMsgNoLookups=true or setting
    the LOG4J_FORMAT_MSG_NO_LOOKUPS="true" environment variable also avoids the CVEs.
  • Upgrade maven-wrapper to 3.1.0, maven to 3.8.4 for performance improvements and better native ARM support.
  • Exclude unnecessary libs when building under JDK 9+.
  • Migrate base Docker image to eclipse-temurin as adoptopenjdk is deprecated.
  • Add E2E test under Java 17.
  • Upgrade protoc to 3.19.2.
  • Add Istio 1.13.1 to E2E test matrix for verification.
  • Upgrade Apache parent pom version to 25.
  • Use the plugin version defined by the Apache maven parent.
    • Upgrade maven-dependency-plugin to 3.2.0.
    • Upgrade maven-assembly-plugin to 3.3.0.
    • Upgrade maven-failsafe-plugin to 2.22.2.
    • Upgrade maven-surefire-plugin to 2.22.2.
    • Upgrade maven-jar-plugin to 3.2.2.
    • Upgrade maven-enforcer-plugin to 3.0.0.
    • Upgrade maven-compiler-plugin to 3.10.0.
    • Upgrade maven-resources-plugin to 3.2.0.
    • Upgrade maven-source-plugin to 3.2.1.
  • Update codeStyle.xml to fix incompatibility on M1's IntelliJ IDEA 2021.3.2.
  • Update frontend-maven-plugin to 1.12 and npm to 16.14.0 for booster UI build.
  • Improve CI with the GHA new feature "run failed jobs".
  • Fix ./mvnw compile not working if ./mvnw install is not executed at least once.
  • Add JD_PRESERVE_LINE_FEEDS=true in official code style file.
  • Upgrade OAP dependencies gson(2.9.0), guava(31.1), jackson(2.13.2), protobuf-java(3.18.4), commons-io(2.7),
    postgresql(42.3.3).
  • Remove commons-pool and commons-dbcp from OAP dependencies(Not used before).
  • Upgrade webapp dependencies gson(2.9.0), spring boot(2.6.6), jackson(2.13.2.2), spring cloud(2021.0.1), Apache
    httpclient(4.5.13).

OAP Server

  • Fix potential NPE in OAL string match and a bug when right-hand-side variable includes double quotes.
  • Bump up Armeria version to 1.14.1 to fix CVE.
  • Polish ETCD cluster config environment variables.
  • Add the analysis of metrics in Satellite MetricsService.
  • Fix Can't split endpoint id into 2 parts bug for endpoint ID. In the TCP in service mesh observability, endpoint
    name doesn't exist in TCP traffic.
  • Upgrade H2 version to 2.0.206 to fix CVE-2021-23463 and GHSA-h376-j262-vhq6.
  • Extend column name override mechanism working for ValueColumnMetadata.
  • Introduce the new concept Layer and remove NodeType. For more details, refer
    to v9-version-upgrade.
  • Fix query sort metrics failure in H2 Storage.
  • Bump up grpc to 1.43.2 and protobuf to 3.19.2 to fix CVE-2021-22569.
  • Add source layer and dest layer to relation.
  • Follow protocol grammar fix GCPhrase -> GCPhase.
  • Set layer to mesh relation.
  • Add FAAS to SpanLayer.
  • Adjust e2e case for V9 core.
  • Support ZGC GC time and count metric collecting.
  • Sync proto buffers files from upstream Envoy (Related to envoyproxy/envoy#18955).
  • Bump up GraphQL related dependencies to latest versions.
  • Add normal to V9 service meta query.
  • Support scope=ALL catalog for metrics.
  • Bump up H2 to 2.1.210 to fix CVE-2022-23221.
  • E2E: Add normal field to Service.
  • Add FreeSql component ID(3017) of dotnet agent.
  • E2E: verify OAP cluster model data aggregation.
  • Fix SelfRemoteClient self observing metrics.
  • Add env variables SW_CLUSTER_INTERNAL_COM_HOST and SW_CLUSTER_INTERNAL_COM_PORT for cluster selectors zookeeper,
    consul, etcd and nacos.
  • Doc update: configuration-vocabulary,backend-cluster about env variables SW_CLUSTER_INTERNAL_COM_HOST
    and SW_CLUSTER_INTERNAL_COM_PORT.
  • Add Python MysqlClient component ID(7013) with mapping information.
  • Support Java thread pool metrics analysis.
  • Fix IoTDB Storage Option insert null index value.
  • Set the default value of SW_STORAGE_IOTDB_SESSIONPOOL_SIZE to 8.
  • Bump up iotdb-session to 0.12.4.
  • Bump up PostgreSQL driver to fix CVE.
  • Add Guava EventBus component ID(123) of Java agent.
  • Add OpenFunction component ID(5013).
  • Expose configuration responseTimeout of ES client.
  • Support datasource metric analysis.
  • [Breaking Change] Keep the endpoint avg resp time meter name the same as in other scopes. (This may break 3rd party
    integration and existing alarm rule settings)
  • Add Python FastAPI component ID(7014).
  • Support all metrics from MAL engine in alarm core, including Prometheus, OC receiver, meter receiver.
  • Allow updating non-metrics templates when structure changed.
  • Set default connection timeout of ElasticSearch to 3000 milliseconds.
  • Support ElasticSearch 8 and add it into E2E tests.
  • Disable indexing for field alarm_record.tags_raw_data of binary type in ElasticSearch storage.
  • Fix Zipkin receiver wrong condition for decoding gzip.
  • Add a new sampler (possibility) in LAL.
  • Unify module name receiver_zipkin to receiver-zipkin, remove receiver_jaeger from application.yaml.
  • Introduce the entity of Process type.
  • Set the length of event#parameters to 2000.
  • Limit the length of Event#parameters.
  • Support large service/instance/networkAddressAlias list query by using ElasticSearch scrolling API,
    add metadataQueryBatchSize to configure scrolling page size.
  • Change default value of metadataQueryMaxSize from 5000 to 10000
  • Replace deprecated Armeria API BasicToken.of with AuthToken.ofBasic.
  • Implement v9 UI template management protocol.
  • Implement process metadata query protocol.
  • Expose more ElasticSearch health check related logs to help to
    diagnose Health check fails. reason: No healthy endpoint.
  • Add source event generated metrics to SERVICE_CATALOG_NAME catalog.
  • [Breaking Change] Deprecate All from OAL source.
  • [Breaking Change] Remove SRC_ALL: 'All' from OAL grammar tree.
  • Remove all_heatmap and all_percentile metrics.
  • Fix ElasticSearch normal index couldn't apply mapping and update.
  • Enhance DataCarrier#MultipleChannelsConsumer to add priority for the channels, which gives the OAP server better
    performance when all analyzers are activated by default.
  • Activate receiver-otel#enabledOcRules receiver with k8s-node,oap,vm rules on default.
  • Activate satellite,spring-sleuth for agent-analyzer#meterAnalyzerActiveFiles on default.
  • Activate receiver-zabbix receiver with agent rule on default.
  • Replace HTTP server (GraphQL, agent HTTP protocol) from Jetty with Armeria.
  • [Breaking Change] Remove configuration restAcceptorPriorityDelta (env var: SW_RECEIVER_SHARING_JETTY_DELTA,
    SW_CORE_REST_JETTY_DELTA).
  • [Breaking Change] Remove configuration graphql/path (env var: SW_QUERY_GRAPHQL_PATH).
  • Add storage column attribute indexOnly, support ElasticSearch only index and not store some fields.
  • Add indexOnly=true to SegmentRecord.tags, AlarmRecord.tags, AbstractLogRecord.tags, to reduce unnecessary
    storage.
  • [Breaking Change] Remove configuration restMinThreads (env var: SW_CORE_REST_JETTY_MIN_THREADS,
    SW_RECEIVER_SHARING_JETTY_MIN_THREADS).
  • Refactor the core Builder mechanism, new storage plugin could implement their own converter and get rid of hard
    requirement of using HashMap to communicate between data object and database native structure.
  • [Breaking Change] Break all existing 3rd-party storage extensions.
  • Remove hard requirement of BASE64 encoding for binary field.
  • Add complexity limitation for GraphQL query to avoid malicious query.
  • Add Column.shardingKeyIdx for column definition for BanyanDB.
    Sharding key is used to group time series data per metric of one entity in one place (same sharding and/or same
    row for column-oriented database).
    For example,
    ServiceA's traffic gauge, service call per minute, includes following timestamp values, then it should be sharded by service ID
    [ServiceA(encoded ID): 01-28 18:30 values-1, 01-28 18:31 values-2, 01-28 18:32 values-3, 01-28 18:32 values-4]

    BanyanDB is the 1st storage implementation supporting this. It would make continuous time series metrics stored closely and compressed better.

    NOTICE, this sharding concept is NOT just for splitting data into different database instances or physical files.
  • Support ElasticSearch template mappings properties parameters and _source update.
  • Implement the eBPF profiling query and data collect protocol.
  • [Breaking Change] Remove Deprecated responseCode from sources, including Service, ServiceInstance, Endpoint
  • Enhance endpoint dependency analysis to support cross threads cases. Refactor span analysis code structures.
  • Remove isNotNormal service requirement when using alias to merge service topology from client side. All RPCs' peer
    services from client side are always normal services. This requirement caused the topology not to be merged correctly.
  • Fix event type of export data is incorrect, it was EventType.TOTAL always.
  • Reduce redundant ThreadLocal in MAL core. Improve MAL performance.
  • Trim tag's key and value in log query.
  • Refactor IoTDB storage plugin, add IoTDBDataConverter and fix ModifyCollectionInEnhancedForLoop bug.
  • Bump up iotdb-session to 0.12.5.
  • Fix the configuration of Aggregation and GC Count metrics for oap self observability
  • E2E: Add verify OA...
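
The sharding key idea described for Column.shardingKeyIdx above, keeping all time series points of one entity's metric together, can be sketched as a simple grouping step. This is only an illustration of the concept, not BanyanDB's storage layout; the function `group_by_sharding_key` and the sample values are hypothetical.

```python
from collections import defaultdict


def group_by_sharding_key(points):
    """Group (entity_id, timestamp, value) points by entity_id.

    entity_id plays the role of the sharding key: every point of the same
    entity ends up in one time-ordered series, stored "in one place".
    """
    series = defaultdict(list)
    for entity_id, timestamp, value in points:
        series[entity_id].append((timestamp, value))
    for samples in series.values():
        samples.sort()  # keep each series ordered by timestamp
    return dict(series)


# Hypothetical per-minute call counts arriving interleaved and out of order.
points = [
    ("ServiceA", "01-28 18:31", 2),
    ("ServiceB", "01-28 18:30", 7),
    ("ServiceA", "01-28 18:30", 1),
    ("ServiceA", "01-28 18:32", 3),
]
grouped = group_by_sharding_key(points)
for entity_id, samples in grouped.items():
    print(entity_id, samples)
```

Contiguous, time-ordered values for one entity are what makes the closer storage and better compression mentioned above plausible.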

8.9.1

11 Dec 13:57

Download

https://skywalking.apache.org/downloads/

Notice

Don't download source codes from this page.
Please follow build document, if you want to build source codes by yourself.

Project

  • Upgrade log4j2 to 2.15.0 for CVE-2021-44228. This CVE only affects JDK versions below 6u211, 7u201, 8u191 and 11.0.1 according to the post. Notice, using the JVM option -Dlog4j2.formatMsgNoLookups=true also avoids the CVE if your JRE enables JNDI by default.

8.9.0

05 Dec 07:33

Download

https://skywalking.apache.org/downloads/

Notice

Don't download source codes from this page.
Please follow build document, if you want to build source codes by yourself.

Project

  • Migrate E2E tests to e2e-v2.
  • Support JDK 16 and 17.
  • Add Docker images for arm64 architecture.

OAP Server

  • Add component definition for Jackson.
  • Fix that zipkin-receiver plugin is not packaged into dist.
  • Upgrade Armeria to 1.12, upgrade OpenSearch test version to 1.1.0.
  • Add component definition for Apache-Kylin.
  • Enhance get generation mechanism of OAL engine, support map type of source's field.
  • Add tag(Map) into All, Service, ServiceInstance and Endpoint sources.
  • Fix funcParamExpression and literalExpression can't be used in the same aggregation function.
  • Support cast statement in the OAL core engine.
  • Support (str->long) and (long) for string to long cast statement.
  • Support (str->int) and (int) for string to int cast statement.
  • Support Long literal number in the OAL core engine.
  • Support literal string as parameter of aggregation function.
  • Add attributeExpression and attributeExpressionSegment in the OAL grammar tree to support map type for the
    attribute expression.
  • Refactor the OAL compiler context to improve readability.
  • Fix wrong generated codes of hashCode and remoteHashCode methods for numeric fields.
  • Support != null in OAL engine.
  • Add Message Queue Consuming Count metric for MQ consuming service and endpoint.
  • Add Message Queue Avg Consuming Latency metric for MQ consuming service and endpoint.
  • Support -Inf as bucket in the meter system.
  • Fix setting wrong field when combining Events.
  • Support search browser service.
  • Add getProfileTaskLogs to profile query protocol.
  • Set SW_KAFKA_FETCHER_ENABLE_NATIVE_PROTO_LOG, SW_KAFKA_FETCHER_ENABLE_NATIVE_JSON_LOG default true.
  • Fix unexpected deleting due to TTL mechanism bug for H2, MySQL, TiDB and PostgreSQL.
  • Add a GraphQL query to get OAP version, display OAP version in startup message and error logs.
  • Fix the missing TimeBucket bug in H2, MySQL, TiDB and PostgreSQL, which caused TTL not to work for service_traffic.
  • Fix TimeBucket missing in ElasticSearch and provide compatible storage2Entity for previous versions.
  • Fix the ElasticSearch implementation of queryMetricsValues and readLabeledMetricsValues not filling default values
    when no data is available in the ElasticSearch server.
  • Fix config YAML data type conversion bug when a value contains a special character like !.
  • Optimize persistence of minute-dimensionality metrics. When a metric declares a default value and its current
    value logically equals that default, the whole row is not pushed into the database.
  • Fix max function in OAL not supporting negative long values.
  • Add MicroBench module to make it easier for developers to write JMH test.
  • Upgrade Kubernetes Java client to 14.0.0, which supports GCP token refreshing and fixes some bugs.
  • Change SO11Y metric envoy_als_in_count to calculate the ALS message count.
  • Support Istio 1.10.3, 1.11.4 and 1.12.0 releases (tested through e2e).
  • Add filter mechanism in MAL core to filter metrics.
  • Fix concurrency bug in MAL increase-related calculation.
  • Fix a null pointer bug when building SampleFamily.
  • Fix incorrect so11y persistence execution latency in ElasticSearch storage.
  • Add MeterReportService collectBatch method.
  • Add OpenSearch 1.2.0 to test and verify it works.
  • Upgrade grpc-java to 1.42.1 and protoc to 3.17.3 to allow using native Mac osx-aarch_64 artifacts.
  • Fix TopologyQuery.loadEndpointRelation bug.
  • Support using IoTDB as a new storage option.
  • Add a customized Envoy ALS protocol receiver for Satellite to transmit batch data.
  • Remove logback dependencies in IoTDB plugin.
  • Fix StorageModuleElasticsearchProvider not watching trustStorePath.
  • Fix a wrong entity check in GraphQL at the endpoint relation level.
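
The persistence optimization listed above (a row is not pushed to the database when the metric declares a default value and the current value equals it) can be sketched roughly as follows. This is only an illustration of the idea under simple assumptions: the class, field and method names are hypothetical and do not reflect the actual OAP implementation.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; names are not taken from the OAP code base.
final class DefaultValueFilterSketch {

    // A simplified metric row. defaultValue is null when no default is declared.
    static final class Metric {
        final String id;
        final long value;
        final Long defaultValue;

        Metric(String id, long value, Long defaultValue) {
            this.id = id;
            this.value = value;
            this.defaultValue = defaultValue;
        }
    }

    // Keeps only the rows worth persisting: rows without a declared default,
    // or rows whose current value differs from the declared default.
    static List<Metric> rowsToPersist(List<Metric> batch) {
        List<Metric> result = new ArrayList<>();
        for (Metric metric : batch) {
            boolean equalsDefault = metric.defaultValue != null
                    && metric.defaultValue.longValue() == metric.value;
            if (!equalsDefault) {
                result.add(metric);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<Metric> batch = new ArrayList<>();
        batch.add(new Metric("a", 0L, 0L));   // equals declared default, skipped
        batch.add(new Metric("b", 5L, 0L));   // differs from default, kept
        batch.add(new Metric("c", 0L, null)); // no declared default, kept
        for (Metric metric : rowsToPersist(batch)) {
            System.out.println(metric.id);
        }
    }
}
```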

UI

  • Optimize endpoint dependency.
  • Show service name by hovering nodes in the sankey chart.
  • Add Apache Kylin logo.
  • Add ClickHouse logo.
  • Optimize the style and add tips for log conditions.
  • Fix the condition for trace table.
  • Optimize profile functions.
  • Implement a reminder to clear cache for dashboard templates.
  • Support +/- hh:mm in TimeZone setting.
  • Optimize global settings.
  • Fix current endpoint for endpoint dependency.
  • Add version in the global settings popup.
  • Optimize Log page style.
  • Avoid some abnormal settings.
  • Fix query condition of events.

Documentation

  • Enhance documents about the data report and query protocols.
  • Restructure documents about receivers and fetchers.
    1. Remove general receiver and fetcher docs.
    2. Add more specific menus with docs to help users find documents more easily.
  • Add a guidance doc about the logic endpoint.
  • Link Satellite as Load Balancer documentation and compatibility with satellite.

All issues and pull requests are here