I have a model ensemble in which some models produce outputs with different shapes. I have multiple clients, but each one sends a single frame per inference request. For instance, I have a bytetracker that produces variable-shaped outputs. However, it looks like Triton is not batching: it sends a single input/output per inference request and keeps caching.
My questions are:
1. Should I remove dynamic batching, since each client sends a single frame per request, or keep both dynamic_batching and ragged_batching?
2. The batch_input config should only apply to model inputs, right? For instance, if a model produces variable shapes, I shouldn't add something like batch_input (or a batch_output) for it? If so, should I keep a -1 in that output's dims?
3. Should each model output a batch of 1 so Triton can concatenate? For instance, should the bytetracker output (batch_size, num_det, 56), or should it output (num_det, 56) and let Triton handle the batching? (See the client sketch below for what each request currently looks like.)
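For context, here is a minimal client sketch showing how each client sends a single frame per request. The model name ("ensemble_pipeline") and tensor names ("IMAGE", "TRACKS") are placeholders, not the actual names in my pipeline:

```python
# Hypothetical client sketch: one frame per request, with a leading batch
# dimension of 1 that the server-side dynamic batcher could merge across clients.
import numpy as np
import tritonclient.grpc as grpcclient

client = grpcclient.InferenceServerClient(url="localhost:8001")

# Single frame, batch of 1 (placeholder input size).
frame = np.random.rand(1, 3, 640, 640).astype(np.float32)

infer_input = grpcclient.InferInput("IMAGE", list(frame.shape), "FP32")
infer_input.set_data_from_numpy(frame)

result = client.infer(model_name="ensemble_pipeline", inputs=[infer_input])
tracks = result.as_numpy("TRACKS")  # e.g. (1, num_det, 56) from the bytetracker stage
print(tracks.shape)
```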
Triton version: FROM nvcr.io/nvidia/tritonserver:24.01-py3
This is a sample from the bytetracker, for instance:
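A minimal sketch of what such a config could look like; the backend, tensor names, batch size, and dims below are assumptions for illustration, not the actual config:

```
# Illustrative config.pbtxt for a bytetracker-style model (names/dims assumed)
name: "bytetracker"
backend: "python"
max_batch_size: 8

input [
  {
    name: "DETECTIONS"
    data_type: TYPE_FP32
    dims: [ -1, 56 ]            # variable number of detections per frame
    allow_ragged_batch: true    # lets the dynamic batcher combine mismatched shapes
  }
]

batch_input [
  {
    kind: BATCH_ELEMENT_COUNT
    target_name: "DETECTIONS_ELEMENT_COUNT"
    data_type: TYPE_INT32
    source_input: "DETECTIONS"
  }
]

output [
  {
    name: "TRACKS"
    data_type: TYPE_FP32
    dims: [ -1, 56 ]            # variable number of tracks per frame
  }
]

dynamic_batching {
  max_queue_delay_microseconds: 5000
}
```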
This is a sample from the post-processing model, where the outputs do not use ragged_batching:
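Again, only an illustrative sketch with assumed names and dims rather than the real config:

```
# Illustrative config.pbtxt for a post-processing model (names/dims assumed)
name: "postprocessing"
backend: "python"
max_batch_size: 8

input [
  {
    name: "TRACKS"
    data_type: TYPE_FP32
    dims: [ -1, 56 ]    # dims exclude the batch dimension when max_batch_size > 0
  }
]

output [
  {
    name: "RESULTS"
    data_type: TYPE_FP32
    dims: [ -1 ]        # variable-length output, no ragged batching
  }
]

dynamic_batching { }
```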