Ability to change the playback rate of an AudioBufferSourceNode without affecting the pitch #2487

p-himik · 2022-04-27T20:03:28Z

Describe the feature
Basically the title - it would be nice to have an easy way to change the playback rate of a source node without affecting the pitch.

Is there a prototype?
Alas, I have no clue how to implement something like this at this point.

Describe the feature in more detail
In my particular case, I'm working on a web app that needs synthesized audio alignment and preview. By itself it's very simple - there's no audio know-how involved at all. But users have requested an ability to change the playback rate so they could align and preview the audio data much quicker. Initially I tried simply changing sourceNode.playbackRate.value - after all, changing the value of an attribute with the same name on <audio> and <video> works just fine. But as you can guess, it didn't produce the desired result.

There are also quite a few other instances when people ask for something like this:

This issue that shows that at least some interest has been shown here for such functionality: Interdependence between playbackRate and detune #723
A StackOverflow question with a decent amount of votes: Changing Speed of Audio Using the Web Audio API Without Changing Pitch (and even though there are answers, they are quite far from being simple - especially when you have to work with third-party audio libraries that don't have that ability built into them)
A request for MDN to provide an example of how it could be done: How to preserve an audio's pitch after changing AudioBufferSourceNode.playbackRate? mdn/webaudio-examples#53
An issue in an audio-related project that I'm using: Alter playback speed? naomiaro/waveform-playlist#19 (has been open for years because there's no easy way to do this)
A Chromium bug with the WontFix resolution because "Changing this will require a change in the spec": Implement "preserve pitch" on Web Audio API when changing playbackRate
A relevant Chrome Web Audio FAQ entry, albeit about the reverse problem: Can I change pitch without changing speed?
A 6 year old issue in the old W3C bugtracker: Enable in WebAudio to change the tempo of audio without changing the pitch

One of those is from my own experience, the others are all from just the first page of a Google search. I'm sure I'd be able to find quite a few more if I spent more time on it.

Given that there already is such a functionality when it comes to playbackRate of <audio> and <video>, I'd think that it shouldn't be that hard from the implementation perspective to add it to AudioBufferSourceNode as well.

agrathwohl · 2022-05-05T16:29:26Z

Given that there already is such a functionality when it comes to playbackRate of <audio> and <video>, I'd think that it shouldn't be that hard from the implementation perspective to add it to AudioBufferSourceNode as well.

At the risk of seeming pedantic, I always found the playbackRate property of the audio element to be a bit of a half-measure, since many in the audio/DSP spaces are opinionated about time stretching algorithms and desire finer control over those algorithms based upon the type of audio content being processed. (For example, see Ardour's "Stretching" page in their documentation.)

A common use case for changing playback rate without changing pitch is to slow down or speed up voice recordings. This use case has WCAG implications, so the type of algorithm deployed can be pretty important for businesses. There are particular time stretching algorithms that address this need, by ensuring greater vocal intelligibility at the expense of lower dynamic & frequency ranges.

For music and sound effects use cases, the granularity of control is crucial since the algorithm deployed has a very noticeable impact upon the way the resulting audio will sound. This means the chosen algorithm has direct aesthetic/artistic consequences.

Initially I tried simply changing sourceNode.playbackRate.value - after all, changing the value of an attribute with the same name on <audio> and <video> works just fine.

The playback rate of an audio buffer is conventionally understood to be a multiplier applied to the source's reported sampling frequency, which results in a change in the output pitch when no other DSP is applied. Given this, the current behavior of the AudioBufferSourceNode.playbackRate property seems correct to me.
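To make that relationship concrete, here is a minimal sketch of the arithmetic, assuming plain resampling with no other DSP: a rate multiplier r scales every frequency by r, which corresponds to 12·log2(r) semitones, and divides the played duration by r. The helper name describeRateChange is made up for illustration and is not part of any API.

```javascript
// Hypothetical helper (not part of the Web Audio API): what a plain
// playbackRate change does to pitch and duration when no pitch
// correction is applied.
function describeRateChange(rate, bufferDurationSeconds) {
  if (!(rate > 0)) {
    throw new RangeError('rate must be a positive number');
  }
  return {
    // Every frequency is multiplied by `rate`; expressed in semitones.
    pitchShiftSemitones: 12 * Math.log2(rate),
    // The same samples are consumed `rate` times faster.
    playedDurationSeconds: bufferDurationSeconds / rate,
  };
}

console.log(describeRateChange(2, 10));
```

Doubling the rate raises the pitch by an octave and halves the duration, which is the behavior the original post ran into.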

My team and I have built an audio player that uses the media element's playbackRate property to achieve this kind of time stretching, keeping everything else strictly within the Web Audio API. However, out of a desire to achieve finer-grained control over the time stretching algorithm and overcome the constraints imposed by using media elements, we have been working on implementing a solution to address exactly this ticket's concern.
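For readers unfamiliar with the media element route, the following is a rough sketch of the general idea, not our actual player code. It assumes the caller supplies an AudioContext (ctx) and an <audio> or <video> element (mediaElement); the function name routeStretchedMedia is hypothetical.

```javascript
// Sketch: let the media element do the time stretching, then route its
// output into a Web Audio graph for any further processing.
function routeStretchedMedia(ctx, mediaElement, rate) {
  // preservesPitch defaults to true where supported; set it explicitly
  // so the intent is visible.
  mediaElement.preservesPitch = true;
  mediaElement.playbackRate = rate;

  // The element's (already time-stretched) audio enters the graph here.
  const source = ctx.createMediaElementSource(mediaElement);
  source.connect(ctx.destination);
  return source;
}
```

The time stretching algorithm itself stays inside the browser's media pipeline, which is exactly the constraint described above: there is no way to choose or tune it from script.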

Our approach is to create an AudioWorklet that uses message port and audio parameters to set a connected source node's playbackRate property. Once the new rate has been sent from the worklet to the source node, the source node sends a message back to the worklet confirming that new value. This then kicks off logic to calculate the appropriate pitch change necessary to maintain a consistent pitch upon output.
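The pitch calculation itself is small. As a hedged sketch, assuming the pitch shifter takes a frequency ratio (as the pitchFactor parameter in the code below appears to), the compensating factor is the reciprocal of the playback rate. The helper name compensatingPitchFactor is hypothetical, and the default 0.1 to 2.0 limits are taken from the updatePitch doc comment rather than from the processor itself.

```javascript
// Hypothetical helper: the pitch ratio a pitch shifter must apply to undo
// the pitch change caused by playing a buffer at `rate`.
function compensatingPitchFactor(rate, min = 0.1, max = 2.0) {
  if (!(rate > 0)) {
    throw new RangeError('rate must be a positive number');
  }
  // Playing at `rate` multiplies every frequency by `rate`,
  // so the shifter has to multiply by 1 / rate.
  const ideal = 1 / rate;
  // Clamp to the range the pitch shifter is assumed to accept.
  return Math.min(max, Math.max(min, ideal));
}

console.log(compensatingPitchFactor(2));    // half the pitch to cancel a 2x speed-up
console.log(compensatingPitchFactor(0.25)); // clamped: full compensation would need 4
```

With an upper limit of 2.0, rates below 0.5 cannot be fully compensated under these assumptions.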

Here's some code -- would love others' thoughts on this approach and happy to answer any questions folks might have:

/**
 * Timestretch Worklet
 * ES6 class abstraction for a phaseVocoder worklet via the AudioWorkletNode. This
 * worklet is capable of shifting pitch without affecting the playback speed. Using
 * this in combination with adjusting playback speed, it can be used for a
 * timestretch effect in which audio playback speed changes without affecting pitch. An
 * example implementation of both can be found in www.js
 * @public
 * @class
 */
class TimestretchWorklet {

  /**
   * Create a new instance of the TimestretchWorklet class along with a new AudioWorkletNode,
   * which is available on the instance's .workletNode property.
   * @param {AudioContext} ctx - Web audio context to be used
   * @param {AudioBufferSourceNode} bufferSource - (optional) bufferSource to automatically connect the new AudioWorkletNode to
   * @param {string} modulePath - Path to the module (for use when custom paths to local assets are needed, ie: vue.js)
   * @param {opts} opts - (optional) Options to pass directly to the AudioWorkletNode
   * @param {float} pitch - (optional) Initial pitch shift for the AudioWorkletNode
   * @returns {TimestretchWorklet}
   */
  static async createWorklet({
    ctx,
    bufferSource,
    modulePath,
    opts={},
    pitch,
  }) {
    const worklet = new TimestretchWorklet(ctx)

    try {
      await ctx.audioWorklet.addModule(modulePath || 'phaseVocoder.js')
    } catch (err) {
      throw new Error(`Error adding module: ${err}`)
    }

    try {
      worklet.workletNode = new AudioWorkletNode(
        ctx,
        'phase-vocoder-processor',
        opts
      )

      if (pitch) {
        worklet.updatePitch(pitch)
      }

      if (bufferSource) {
        worklet.workletNode.parameters.get('playbackRate').value = bufferSource.playbackRate.value
        bufferSource.connect(worklet.workletNode);
        worklet.bufferSource = bufferSource;
      }

      // update playbackRate via message to ensure they stay in sync
      worklet.workletNode.port.onmessage = (e) => {
        const { data } = e
        // Skip the update when no buffer source has been connected yet
        if (data.type === 'updatePlaybackRate' && worklet.bufferSource) {
          worklet.bufferSource.playbackRate.value = data.rate
        }
      }
    } catch (err) {
      throw new Error(`Error creating worklet node: ${err}`)
    }

    return worklet
  }

  /**
   * Meant for interior use only via the static method createWorklet()
   * @param {AudioContext} ctx - Web audio context to be used
   */
  constructor(ctx) {
    this.bufferSource = null;
    this.ctx = ctx;
    this.pitch = 1.0;
    this.playbackRate = 1.0;
    this.workletNode = null;
  }


  /**
   * Connects an audio bufferSource (AudioBufferSourceNode) to the existing AudioWorkletNode
   * @param {AudioBufferSourceNode} bufferSource - bufferSource to connect to the AudioWorkletNode
   */
  connectBufferSource(bufferSource) {
    if (!this.workletNode) {
      throw new Error('No worklet created. Call createWorklet() first')
    }

    this.workletNode.parameters.get('playbackRate').value = bufferSource.playbackRate.value
    bufferSource.connect(this.workletNode)
    this.bufferSource = bufferSource;
  }

  /**
   * Updates the pitch of the worklet via an {AudioParam} of the AudioWorkletNode's processor
   * @param {float} pitch - Value of the pitch to set (0.1 to 2.0)
   */
  updatePitch(pitch) {
    this.workletNode.parameters.get('pitchFactor').value = parseFloat(pitch)
  }

  /**
   * Updates the playback rate parameter of the AudioWorkletNode. The processor
   * keeps adjusting the pitch so that it stays the same despite the speed change.
   * @param {float} rate - Value of the playback rate to set
   */
  updateSpeed(rate) {
    let parsedRate = parseFloat(rate)
    this.workletNode.parameters.get('playbackRate').value = parsedRate
  }

}

module.exports = TimestretchWorklet

hoch · 2022-05-05T16:34:43Z

WG agreed that adding a preservePitch property on AudioBufferSourceNode can be useful, but there are more areas that we want to explore (e.g. quality, algorithm, complexity).

chrisguttandin · 2022-05-24T19:52:41Z

The code example provided by @agrathwohl above made me think that it could be enough to add a separate PitchShiftNode (called 'phase-vocoder-processor' above) to the spec. Such a node could also be used independently of an AudioBufferSourceNode.

Let's say a separate PitchShiftNode exists and for the sake of simplicity it also has a playbackRate param. This param is used to shift the pitch back to the original pitch. To achieve the preservePitch effect one could build a graph like this.

┌──────┐ ┌────────────┐    ┌──────┐
│ ABSN │-│playbackRate│ ━━ │  CSN │
└──────┘ └────────────┘    └──────┘
   ┃                          ┃
┌──────┐ ┌────────────┐       ┃
│  PSN │-│playbackRate│ ━━━━━━┛
└──────┘ └────────────┘
   ┃
┌──────┐
│  DST │
└──────┘

The audio signal would be routed from the AudioBufferSourceNode through the PitchShiftNode into the destination. A ConstantSourceNode could be used to control both playbackRate AudioParams at the same time. This would make the back-and-forth messaging implemented above unnecessary.
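A rough sketch of wiring that graph, under stated assumptions: PitchShiftNode does not exist, so pitchShiftNode here stands in for any node exposing a 'playbackRate' AudioParam through a parameters map, for example an AudioWorkletNode whose processor declares that parameter. The function name buildPreservePitchGraph is made up.

```javascript
// Sketch of the graph above. `pitchShiftNode` is a stand-in for the
// hypothetical PitchShiftNode (e.g. an AudioWorkletNode with a
// 'playbackRate' parameter that shifts the pitch back by that rate).
function buildPreservePitchGraph(ctx, buffer, pitchShiftNode, rate = 1) {
  const source = ctx.createBufferSource();
  source.buffer = buffer;

  const rateControl = ctx.createConstantSource();
  rateControl.offset.value = rate;

  // An AudioParam's computed value is its own value plus any connected
  // inputs, so zero the intrinsic values and let the CSN supply the rate.
  source.playbackRate.value = 0;
  const shiftRate = pitchShiftNode.parameters.get('playbackRate');
  shiftRate.value = 0;
  rateControl.connect(source.playbackRate);
  rateControl.connect(shiftRate);

  // Audio path: ABSN -> PSN -> DST
  source.connect(pitchShiftNode);
  pitchShiftNode.connect(ctx.destination);

  return { source, rateControl };
}
```

The caller would start both returned nodes (rateControl.start() and source.start()), since an unstarted ConstantSourceNode outputs silence and the rate would stay at 0. Changing the speed afterwards is then a single write to rateControl.offset.value.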

While this approach is a little more complex than adding a simple preservePitch property it could be used for other sources which aren't an AudioBuffer, too.

But all of this could already be implemented using an AudioWorkletProcessor without any changes to the spec. The example above could be modified to do exactly that.

chrisguttandin · 2022-05-24T19:55:47Z

After typing all of the above I realized that this is almost the same as the summary of last year's meeting. #2443 (comment)

🤦

hoch · 2022-09-14T22:35:58Z

We think of two paths:

If a user wants simple and cheap time stretching, an ABSN.preservePitch switch can support that. The behavior/sonic characteristics of the time stretching should match those of the UA's HTMLMediaElement counterpart.
If a more sophisticated approach is required, a custom AudioWorkletNode can be used.

That leads to:

partial interface AudioBufferSourceNode {
  boolean preservePitch = false;
}
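
Until something like this ships, application code would have to feature-detect it. A minimal sketch, assuming the attribute keeps the proposed name preservePitch (which is only a proposal in this thread); preservesPitch is the existing HTMLMediaElement property, and the prefixed names are legacy variants. The function name enablePitchPreservation is hypothetical.

```javascript
// Sketch: switch on pitch preservation on whatever object supports it and
// report which property was used, or null if none is available.
function enablePitchPreservation(target) {
  const candidates = [
    'preservePitch',        // proposed here for AudioBufferSourceNode
    'preservesPitch',       // HTMLMediaElement
    'mozPreservesPitch',    // legacy prefixed variants
    'webkitPreservesPitch',
  ];
  for (const name of candidates) {
    if (name in target) {
      target[name] = true;
      return name;
    }
  }
  // Nothing available: the caller falls back to the second path,
  // e.g. a custom AudioWorkletNode.
  return null;
}
```

A null return corresponds to the second path above, where a custom AudioWorkletNode does the work.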

mdjp · 2023-01-12T17:42:24Z

Will review on next call after grouping all related issues/requests.

Tenpi · 2023-06-28T03:57:38Z

Please add this, in 99% of cases you don't want to change the pitch. And HTML Audio elements already have preservesPitch to toggle this behavior, so it would be consistent to also support it in Web Audio API.

As a workaround I have been using this AudioWorklet to correct the pitch, but it doesn't really sound that great and you can notice a lot of artifacts. Hopefully a native solution would sound better. https://github.com/olvb/phaze

hoch added the status: needs discussion label May 5, 2022

beefchimi mentioned this issue Mar 19, 2023

[Sound] Add speed accessor beefchimi/earwurm#30

Closed

beefchimi mentioned this issue Dec 29, 2023

[Sound] New pitch accessor beefchimi/earwurm#31

Open

chrisguttandin mentioned this issue Jan 8, 2024

PreservePitch in web Audio API #2564

Closed
