[inputs.modbus] Error in plugin: modbus: response data size '5' does not match count '4' #11099

Closed
branimirborisov opened this issue May 13, 2022 · 8 comments · Fixed by #15276
Labels: area/iot, area/modbus, bug, platform/windows, upstream

Comments

@branimirborisov

branimirborisov commented May 13, 2022

Relevant telegraf.conf

# Configuration for telegraf agent
[agent]
  ## Default data collection interval for all inputs
  interval = "10s"
  ## Rounds collection interval to 'interval'
  ## ie, if interval="10s" then always collect on :00, :10, :20, etc.
  round_interval = true

  ## Telegraf will send metrics to outputs in batches of at most
  ## metric_batch_size metrics.
  ## This controls the size of writes that Telegraf sends to output plugins.
  metric_batch_size = 1000

  ## Maximum number of unwritten metrics per output.  Increasing this value
  ## allows for longer periods of output downtime without dropping metrics at the
  ## cost of higher maximum memory usage.
  metric_buffer_limit = 10000

  ## Collection jitter is used to jitter the collection by a random amount.
  ## Each plugin will sleep for a random time within jitter before collecting.
  ## This can be used to avoid many plugins querying things like sysfs at the
  ## same time, which can have a measurable effect on the system.
  collection_jitter = "0s"

  ## Default flushing interval for all outputs. Maximum flush_interval will be
  ## flush_interval + flush_jitter
  flush_interval = "10s"
  ## Jitter the flush interval by a random amount. This is primarily to avoid
  ## large write spikes for users running a large number of telegraf instances.
  ## ie, a jitter of 5s and interval 10s means flushes will happen every 10-15s
  flush_jitter = "0s"

  ## By default or when set to "0s", precision will be set to the same
  ## timestamp order as the collection interval, with the maximum being 1s.
  ##   ie, when interval = "10s", precision will be "1s"
  ##       when interval = "250ms", precision will be "1ms"
  ## Precision will NOT be used for service inputs. It is up to each individual
  ## service input to set the timestamp at the appropriate precision.
  ## Valid time units are "ns", "us" (or "µs"), "ms", "s".
  precision = ""

  ## Log at debug level.
  debug = true
  ## Log only error level messages.
  # quiet = false

  ## Log target controls the destination for logs and can be one of "file",
  ## "stderr" or, on Windows, "eventlog".  When set to "file", the output file
  ## is determined by the "logfile" setting.
  # logtarget = "file"

  ## Name of the file to be logged to when using the "file" logtarget.  If set to
  ## the empty string then logs are written to stderr.
  # logfile = ""

  ## The logfile will be rotated after the time interval specified.  When set
  ## to 0 no time based rotation is performed.  Logs are rotated only when
  ## written to, if there is no log activity rotation may be delayed.
  # logfile_rotation_interval = "0d"

  ## The logfile will be rotated when it becomes larger than the specified
  ## size.  When set to 0 no size based rotation is performed.
  # logfile_rotation_max_size = "0MB"

  ## Maximum number of rotated archives to keep, any older logs are deleted.
  ## If set to -1, no archives are removed.
  # logfile_rotation_max_archives = 5

  ## Pick a timezone to use when logging or type 'local' for local time.
  ## Example: America/Chicago
  # log_with_timezone = ""

  ## Override default hostname, if empty use os.Hostname()
  hostname = ""
  ## If set to true, do not set the "host" tag in the telegraf agent.
  omit_hostname = false
[[outputs.influxdb_v2]]
  ## The URLs of the InfluxDB cluster nodes.
  ##
  ## Multiple URLs can be specified for a single cluster, only ONE of the
  ## urls will be written to each interval.
  ##   ex: urls = ["https://us-west-2-1.aws.cloud2.influxdata.com"]
  urls = ["${INFLUXDB_HOST}:${INFLUXDB_PORT}"]

  ## Token for authentication.
  token = "${DOCKER_INFLUXDB_INIT_ADMIN_TOKEN}"

  ## Organization is the name of the organization you wish to write to; must exist.
  organization = "${DOCKER_INFLUXDB_INIT_ORG}"

  ## Destination bucket to write into.
  bucket = "${DOCKER_INFLUXDB_INIT_BUCKET}"

  ## The value of this tag will be used to determine the bucket.  If this
  ## tag is not set the 'bucket' option is used as the default.
  # bucket_tag = ""

  ## If true, the bucket tag will not be added to the metric.
  # exclude_bucket_tag = false

  ## Timeout for HTTP messages.
  # timeout = "5s"

  ## Additional HTTP headers
  # http_headers = {"X-Special-Header" = "Special-Value"}

  ## HTTP Proxy override; if unset, the standard proxy environment
  ## variables are consulted to determine which proxy, if any, should be used.
  # http_proxy = "http://corporate.proxy:3128"

  ## HTTP User-Agent
  # user_agent = "telegraf"

  ## Content-Encoding for write request body, can be set to "gzip" to
  ## compress body or "identity" to apply no encoding.
  # content_encoding = "gzip"

  ## Enable or disable uint support for writing uints to InfluxDB 2.0.
  # influx_uint_support = false

  ## Optional TLS Config for use on HTTP connections.
  tls_ca = "${INFLUXD_TLS_CERT}"
  tls_cert = "${INFLUXD_TLS_CERT}"
  tls_key = "${INFLUXD_TLS_KEY}"
  ## Use TLS but skip chain & host verification
  # insecure_skip_verify = false
[[inputs.cpu]]
  ## Whether to report per-cpu stats or not
  percpu = true
  ## Whether to report total system cpu stats or not
  totalcpu = true
  ## If true, collect raw CPU time metrics
  collect_cpu_time = false
  ## If true, compute and report the sum of all non-idle CPU states
  report_active = false
[[inputs.disk]]
  ## By default stats will be gathered for all mount points.
  ## Setting mount_points will restrict the stats to only the specified mount points.
  # mount_points = ["/"]
  ## Ignore mount points by filesystem type.
  ignore_fs = ["tmpfs", "devtmpfs", "devfs", "iso9660", "overlay", "aufs", "squashfs"]
[[inputs.diskio]]
  ## By default, telegraf will gather stats for all devices including
  ## disk partitions.
  ## Setting devices will restrict the stats to the specified devices.
  # devices = ["sda", "sdb", "vd*"]
  ## Uncomment the following line if you need disk serial numbers.
  # skip_serial_number = false
  #
  ## On systems which support it, device metadata can be added in the form of
  ## tags.
  ## Currently only Linux is supported via udev properties. You can view
  ## available properties for a device by running:
  ## 'udevadm info -q property -n /dev/sda'
  ## Note: Most, but not all, udev properties can be accessed this way. Properties
  ## that are currently inaccessible include DEVTYPE, DEVNAME, and DEVPATH.
  # device_tags = ["ID_FS_TYPE", "ID_FS_USAGE"]
  #
  ## Using the same metadata source as device_tags, you can also customize the
  ## name of the device via templates.
  ## The 'name_templates' parameter is a list of templates to try and apply to
  ## the device. The template may contain variables in the form of '$PROPERTY' or
  ## '${PROPERTY}'. The first template which does not contain any variables not
  ## present for the device is used as the device name tag.
  ## The typical use case is for LVM volumes, to get the VG/LV name instead of
  ## the near-meaningless DM-0 name.
  # name_templates = ["$ID_FS_LABEL","$DM_VG_NAME/$DM_LV_NAME"]
[[inputs.mem]]
  # no configuration
[[inputs.net]]
  ## By default, telegraf gathers stats from any up interface (excluding loopback)
  ## Setting interfaces will tell it to gather these explicit interfaces,
  ## regardless of status.
  ##
  # interfaces = ["eth0"]
  ##
  ## On linux systems telegraf also collects protocol stats.
  ## Setting ignore_protocol_stats to true will skip reporting of protocol metrics.
  ##
  # ignore_protocol_stats = false
  ##
[[inputs.processes]]
  # no configuration
[[inputs.swap]]
  # no configuration
[[inputs.system]]
  ## Uncomment to remove deprecated metrics.
  # fielddrop = ["uptime_format"]

# Retrieve data from MODBUS slave devices
[[inputs.modbus]]
  ## Trace the connection to the modbus device as debug messages
  ## Note: You have to enable telegraf's debug mode to see those messages!
  debug_connection = true
  ## Define the configuration schema
  ##  |---register -- define fields per register type in the original style (only supports one slave ID)
  ##  |---request  -- define fields on a per-request basis
  configuration_type = "register"

  ## Connection Configuration
  ##
  ## The plugin supports connections to PLCs via MODBUS/TCP or
  ## via serial line communication in binary (RTU) or readable (ASCII) encoding
  ##
  ## Device name
  name = "SIEMENS SICAM P"

  ## Slave ID - addresses a MODBUS device on the bus
  ## Range: 0 - 255 [0 = broadcast; 248 - 255 = reserved]
  slave_id = 1

  ## Timeout for each request
  timeout = "1s"

  ## Maximum number of retries and the time to wait between retries
  ## when a slave-device is busy.
  # busy_retries = 0
  # busy_retries_wait = "100ms"

  # TCP - connect via Modbus/TCP
  controller = "tcp://${MODBUS_HOST}:${MODBUS_PORT}"

  ## Serial (RS485; RS232)
  # controller = "file:///dev/ttyUSB0"
  # baud_rate = 9600
  # data_bits = 8
  # parity = "N"
  # stop_bits = 1
  # transmission_mode = "RTU"


  ## Measurements
  ##

  ## Digital Variables, Discrete Inputs and Coils
  ## measurement - the (optional) measurement name, defaults to "modbus"
  ## name        - the variable name
  ## address     - variable address

  discrete_inputs = [
  ]
  coils = [
  ]

  ## Analog Variables, Input Registers and Holding Registers
  ## measurement - the (optional) measurement name, defaults to "modbus"
  ## name        - the variable name
  ## byte_order  - the ordering of bytes
  ##  |---AB, ABCD   - Big Endian
  ##  |---BA, DCBA   - Little Endian
  ##  |---BADC       - Mid-Big Endian
  ##  |---CDAB       - Mid-Little Endian
  ## data_type  - INT16, UINT16, INT32, UINT32, INT64, UINT64, FLOAT32-IEEE, FLOAT64-IEEE (the IEEE 754 binary representation)
  ##              FLOAT32 (deprecated), FIXED, UFIXED (fixed-point representation on input)
  ## scale      - the final numeric variable representation
  ## address    - variable address

  holding_registers = [
    { name = "active_energy_demand", byte_order = "ABCD", data_type = "FLOAT32", scale=1.0, address = [40807, 40808] },
  ]
  input_registers = [
  ]
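
The byte_order and data_type options above decide how the two 16-bit registers are combined into one value. The following is a minimal sketch, not Telegraf's implementation, that decodes the four data bytes `4d 4b bb a1` from the response logged in this issue under each listed byte order. It assumes the register pair holds an IEEE 754 float, which the config comment associates with FLOAT32-IEEE rather than the deprecated FLOAT32; the helper name `decode_float32` is hypothetical.

```python
import struct


def decode_float32(raw: bytes, byte_order: str = "ABCD") -> float:
    """Decode four register bytes as an IEEE 754 float32 (hypothetical helper).

    The letters name the significance of each byte as it arrives on the wire,
    with A the most significant. "CDAB" therefore means the low word is sent
    first, and "DCBA" is plain little endian.
    """
    if len(raw) != 4 or sorted(byte_order) != ["A", "B", "C", "D"]:
        raise ValueError("need 4 bytes and a permutation of 'ABCD'")
    # Map each wire byte to its letter, then rebuild the big-endian sequence.
    by_letter = dict(zip(byte_order, raw))
    big_endian = bytes(by_letter[letter] for letter in "ABCD")
    return struct.unpack(">f", big_endian)[0]


if __name__ == "__main__":
    # Data bytes taken from the "recv" line in the logs of this issue.
    wire = bytes.fromhex("4d 4b bb a1")
    for order in ("ABCD", "DCBA", "BADC", "CDAB"):
        print(order, decode_float32(wire, order))
```

Only one of the four orders yields a plausible reading for a given device, which is a quick way to confirm the byte_order setting once the read itself succeeds.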

Logs from Telegraf

adex-iot-telegraf | 2022-05-13T18:47:40Z D! [inputs.modbus] trying to read holding@40807[2]...
adex-iot-telegraf | 2022-05-13T18:47:40Z D! [inputs.modbus] modbus: send 02 34 00 00 00 06 01 03 9f 67 00 02
adex-iot-telegraf | 2022-05-13T18:47:40Z D! [inputs.modbus] modbus: recv 02 34 00 00 00 08 01 03 04 4d 4b bb a1 2e
adex-iot-telegraf | 2022-05-13T18:47:40Z E! [inputs.modbus] Error in plugin: modbus: response data size '5' does not match count '4'

System info

Telegraf 1.22-alpine, InfluxDB 2.2-alpine, Ubuntu 18.04.5 LTS, Docker version 19.03.12

Docker

version: '3.7'
services:
  influxdb:
    image: influxdb:2.2-alpine
    container_name: ${CONTAINER_NAME}-influxdb
    volumes:
      - influxdb-data:/var/lib/influxdb2:rw
      - ./influxdb/ssl/influxdb-selfsigned.crt:${INFLUXD_TLS_CERT}:ro
      - ./influxdb/ssl/influxdb-selfsigned.key:${INFLUXD_TLS_KEY}:ro
    env_file:
      - .env
    ports:
      - 8086:8086
  telegraf:
    image: telegraf:1.22-alpine
    container_name: ${CONTAINER_NAME}-telegraf
    volumes:
      - ./influxdb/telegraf.conf:/etc/telegraf/telegraf.conf:ro
      - ./influxdb/ssl/influxdb-selfsigned.crt:${INFLUXD_TLS_CERT}:ro
      - ./influxdb/ssl/influxdb-selfsigned.key:${INFLUXD_TLS_KEY}:ro
    env_file:
      - .env
    depends_on:
      - influxdb
volumes:
  influxdb-data:
    name: ${CONTAINER_NAME}-influxdb-data

Steps to reproduce

  1. Configure all env vars from telegraf.conf on your local environment
  2. Start both the InfluxDB & Telegraf containers from the docker-compose.yml file
  3. Wait until Telegraf starts reading the MODBUS holding registers and check the debug output

Expected behavior

The data from the holding registers is read and published to InfluxDB

Actual behavior

The following error is reported by the Telegraf service/container:

Error in plugin: modbus: response data size '5' does not match count '4'

No data is sent/stored to InfluxDB.

Additional info

The device is a SIEMENS SICAM P with Modbus RTU translated over TCP.
Reading the same holding registers with other tools works perfectly, for example using modpoll:

modpoll -p ${MODBUS_PORT} -a 1 -f -t 4:float -r 40807 ${MODBUS_HOST} 

results in:

[40807]: 213209248.000000
...
branimirborisov added the bug label May 13, 2022
telegraf-tiger bot added the area/iot and platform/windows labels May 13, 2022
@powersj
Contributor

powersj commented May 17, 2022

@srebhan is this something you could look at and see what the next steps are?

@srebhan
Contributor

srebhan commented May 19, 2022

The error originates from the underlying grid-x library and basically means that the received modbus packet says it has length 4 but actually contains 5 bytes (+1 byte length field). So your device somehow sends one byte more than expected...
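
The mismatch described above can be checked by hand against the `recv` line in the logs. The following is a minimal sketch, assuming standard Modbus/TCP framing (a 7-byte MBAP header followed by the PDU); the function name `parse_mbap_frame` and the dictionary keys are hypothetical and are not taken from grid-x/modbus.

```python
import struct


def parse_mbap_frame(frame: bytes) -> dict:
    """Split a Modbus/TCP read response into header and PDU fields."""
    # MBAP header: transaction id (2), protocol id (2), length (2), unit id (1).
    transaction_id, protocol_id, length, unit_id = struct.unpack(">HHHB", frame[:7])
    pdu = frame[7:]
    return {
        "transaction_id": transaction_id,
        "protocol_id": protocol_id,
        "length": length,        # counts the unit id plus every PDU byte
        "unit_id": unit_id,
        "function": pdu[0],      # 0x03 = read holding registers
        "byte_count": pdu[1],    # number of data bytes the device announces
        "data": pdu[2:],         # data bytes that actually follow
    }


# Response bytes copied from the log line in this issue.
RESPONSE = bytes.fromhex("02 34 00 00 00 08 01 03 04 4d 4b bb a1 2e")

if __name__ == "__main__":
    fields = parse_mbap_frame(RESPONSE)
    print("announced byte count:", fields["byte_count"])
    print("actual data bytes:   ", len(fields["data"]), fields["data"].hex(" "))
```

For this frame the announced byte count is 4 while five data bytes follow, so the trailing `2e` is the extra byte the error message refers to.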

@branimirborisov, are you sure that the device is ModbusTCP or does it do RTUoverTCP or similar? Can you maybe try modbus-cli to check whether the error happens in Telegraf or in the underlying library...

srebhan self-assigned this May 19, 2022
powersj added the waiting for response label May 19, 2022
@branimirborisov
Author

branimirborisov commented May 22, 2022

@srebhan Thanks for looking into the issue.

Indeed the error is coming from the underlying grid-x/modbus library.
Same error occurs when running the modbus-cli, example:

./modbus-cli.amd64 -address tcp://{HOST}:{PORT} -register 40807 -quantity 2

Error message:

modbus: response data size '5' does not match count '4'

In terms of the device - it is actually an RTU device routed via TCP. I've also tried setting the transmission mode in Telegraf to transmission_mode = "RTUoverTCP". In this case, though, the connection does not work at all and I'm getting a timeout error from the plugin:

Error in plugin: read tcp 172.24.0.3:52840->{HOST}:{PORT}: i/o timeout
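
For context on the two modes mentioned above, here is a minimal sketch of how the same read request is framed in each. It assumes the usual conventions: Modbus/TCP prefixes the PDU with an MBAP header, while RTU over TCP sends the serial frame (unit id, PDU, CRC-16) unchanged over the socket. All function names are hypothetical. The logged response starts with the echoed transaction id `02 34`, which suggests the gateway answers with Modbus/TCP framing; that may be why RTUoverTCP times out, though this is an inference from the trace and not a confirmed diagnosis.

```python
import struct


def crc16_modbus(data: bytes) -> int:
    """CRC-16/MODBUS: initial value 0xFFFF, reflected polynomial 0xA001."""
    crc = 0xFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            if crc & 1:
                crc = (crc >> 1) ^ 0xA001
            else:
                crc >>= 1
    return crc


def read_holding_pdu(address: int, quantity: int) -> bytes:
    """PDU for function 0x03 (read holding registers)."""
    return struct.pack(">BHH", 0x03, address, quantity)


def tcp_frame(transaction_id: int, unit_id: int, pdu: bytes) -> bytes:
    """Modbus/TCP: MBAP header + PDU, no checksum."""
    # The length field counts the unit id plus the PDU.
    return struct.pack(">HHHB", transaction_id, 0, len(pdu) + 1, unit_id) + pdu


def rtu_frame(unit_id: int, pdu: bytes) -> bytes:
    """RTU framing: unit id + PDU + CRC-16, with the CRC low byte first."""
    body = bytes([unit_id]) + pdu
    return body + struct.pack("<H", crc16_modbus(body))


if __name__ == "__main__":
    pdu = read_holding_pdu(40807, 2)
    print("tcp:", tcp_frame(0x0234, 1, pdu).hex(" "))
    print("rtu:", rtu_frame(1, pdu).hex(" "))
```

The `tcp` line reproduces the `send` bytes from the Telegraf log, so the request in the trace uses Modbus/TCP framing.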

Is there any additional configuration that needs to be added when using the RTUoverTCP mode?

Also does it make sense to open an issue to the grid-x/modbus library for this case?

P.S. I've also opened an issue in the grid-x/modbus repo to see if the issue can be resolved there: grid-x/modbus#52

Thanks!

telegraf-tiger bot removed the waiting for response label May 22, 2022
@srebhan
Contributor

srebhan commented May 23, 2022

@branimirborisov yeah, probably you want just tcp as this is what your other tool is also doing... Anyway I guess your device is strange as it is sending additional data. I guess the underlying library could just relax the safety check and say if count <= length, but this is out of our hands...
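
The relaxed check suggested above could look roughly like the following. This is an illustrative sketch only, not the grid-x/modbus code and not necessarily what any later fix does; the function name `extract_register_bytes` and the `strict` flag are hypothetical. The error text is modelled on the message quoted in this issue.

```python
def extract_register_bytes(data: bytes, strict: bool = True) -> bytes:
    """Return the register bytes of a read response.

    `data` is everything after the function code: one byte-count byte
    followed by the payload.
    """
    count = data[0]
    payload = data[1:]
    if strict:
        # Strict variant: the payload must match the announced count exactly.
        mismatch = len(payload) != count
    else:
        # Relaxed variant: accept trailing bytes, reject only short payloads.
        mismatch = len(payload) < count
    if mismatch:
        raise ValueError(
            f"modbus: response data size '{len(payload)}' "
            f"does not match count '{count}'"
        )
    # Ignore anything beyond the announced count.
    return payload[:count]


if __name__ == "__main__":
    # Byte count and data bytes copied from the logged response.
    data = bytes.fromhex("04 4d 4b bb a1 2e")
    print(extract_register_bytes(data, strict=False).hex(" "))
    try:
        extract_register_bytes(data, strict=True)
    except ValueError as exc:
        print(exc)
```

The trade-off is that the relaxed variant silently drops the extra byte, so a genuinely corrupted frame with trailing garbage would no longer be reported.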

Please drop me a note if your upstream issue is fixed and I will bump the library so telegraf can pick up the fix.

powersj added the waiting for response label Oct 12, 2022
@telegraf-tiger
Contributor

Hello! I am closing this issue due to inactivity. I hope you were able to resolve your problem, if not please try posting this question in our Community Slack or Community Page. Thank you!

srebhan reopened this Nov 4, 2022
@srebhan
Contributor

srebhan commented Nov 4, 2022

Please keep this open as it tracks an upstream bug...

telegraf-tiger bot removed the waiting for response label Nov 4, 2022
@powersj
Contributor

powersj commented Nov 4, 2022

> Please keep this open as it tracks an upstream bug...

FYI use the "upstream" tag to make it clear it is waiting on something other than us or the reporter

@srebhan
Contributor

srebhan commented May 2, 2024

@branimirborisov please test the binary in PR #15276, available as soon as CI has finished the tests, and let me know if this works with your device!

3 participants