Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

docs: Add initial proposal for V2 recording & playback API. (WIP) #791

Draft
wants to merge 25 commits into
base: v2-docs
Choose a base branch
from

Conversation

microbit-carlos
Copy link
Collaborator

@microbit-carlos microbit-carlos commented May 15, 2023

Docs preview:

This initial proposal has been discussed in:

But we have some open question that will likely result and a rework of some of this.

Initial proposal

The initial proposal in this PR was to create a new AudioBuffer class to contain the audio data and sampling rate.
The AudioBuffer.rate property could then be used by microphone.record() and audio.play() to configure recording and playback rates.
This was done to avoid introducing a new parameter to audio.play() to configure the sampling rate, when it could only work with a single type of sound input (as it might not be possible to change the rate of the SoundExpressions or AudioFrames).

Disadvantages

However, changing the rate in a buffer type to change the playback rate in real-time is a bit awkward:

my_recording = audio.AudioBuffer(duration=5000, rate=5500)
microphone.record_into(my_recording)
audio.play(my_recording, wait=False)
while audio.is_playing():
    x = accelerometer.get_x()
    my_recording.rate = scale(x, (-1000, 1000), (2250, 11000))
    sleep(50)

An alternative we considered was to have the playback sampling rate modified via the audio module itself:

audio.play(my_recording, wait=False)
while audio.is_playing():
    x = accelerometer.get_x()
    audio.set_rate(scale(x, (-1000, 1000), (2250, 11000)))
    sleep(50)

However, this would have to set the same rate to everything played via the audio module, and Sound Expression have a different default rate (44K) than recordings (11K). So audio.set_rate(22000) should slow down Sound Expression and speed up recordings.

Alternatively, if we wanted to change the playback rate via the audio module, we could set a ratio instead. Something equivalent to audio.set_speed(100%) (with different semantics). But a disadvantage would be that it's removing some of math/physics learning opportunity to directly relate the sampling rate value with the effects that it has in playback speed.

Alternative proposal: bytearray as the buffer type

In this case a byte array would be returned by microphone.record() and used withmicrophone.record_into().

As this data type does not include info about the rate, we depend on the audio.play() adding an extra argument that might not work with other sound types like Sound Expressions and Audio Frames.

However, we still have the issue of updating the playback rate in real time during playback, which means we might would have to use use a similar approach to the previously mentioned audio.set_speed(100%):

sound_in_byte_array = microphone.record(duration=3000, rate=5500)
audio.play(sound_in_byte_array, rate=5500 wait=False)
while audio.is_playing():
    x = accelerometer.get_x()
    audio.set_speed(scale(x, (-1000, 1000), (50, 200)))
    sleep(50)
DURATION_SECONDS = 3
SAMPLE_RATE = 5500
recording = bytearray(DURATION_SECONDS * SAMPLE_RATE)
microphone.record_into(recording, rate=SAMPLE_RATE)
audio.play(recording, rate=SAMPLE_RATE)

Alternative proposal: AudioFrames as the buffer type

This would be the same as the bytearray proposal, but using the existing AudioFrames instead.

We might need to tweak the AudioFrame class to let us user larger buffers, as the default is 32 samples. As audio.play() can consume an iterable as well, we would need to figure out a good balance between AudioFrame size and number of AudioFrames in a recording buffer.

@microbit-carlos microbit-carlos force-pushed the docs-recording branch 3 times, most recently from 52f230c to e635d92 Compare May 15, 2023 10:53
@microbit-carlos microbit-carlos marked this pull request as ready for review May 19, 2023 11:27
@microbit-carlos microbit-carlos marked this pull request as draft May 19, 2023 11:27
@microbit-carlos microbit-carlos changed the title docs: Add initial proposal for recording & playback API. (WIP) docs: Add initial proposal for V2 recording & playback API. (WIP) May 19, 2023
@microbit-carlos
Copy link
Collaborator Author

microbit-carlos commented Aug 3, 2023

Based on the latest discussion we have agreed that we'd prefer to avoid introducing a new data type for his feature.
In that case we have two options:

Use a byte array and change sampling via function in the audio module

sound_in_byte_array = microphone.record(duration=3000, rate=5500)
audio.play(sound_in_byte_array, wait=False)
while audio.is_playing():
    x = accelerometer.get_x()
    playback_sampling_rate = scale(x, (-1000, 1000), (2_200, 11_000))
    audio.set_rate(playback_sampling_rate)
    sleep(50)

One previous suggestion was to create a function in the lines of audio.set_speed() that used a percentage value (0 to 100) as a the input range. This could solve the issue with different pipelines in the sound mixer having different playback sampling rates, however we believe it's important to use real numbers to be able to directly compare and understand how changing the sampling rate during recording vs playback affects sound.

To be able to implement something like audio.set_rate() we would probably need to ensure everything that is inputted to audio.play() uses the same default sampling rate. Right now these are the types of input audio.play() takes:

  • Sound effects via user-created audio.SoundEffect() instances
    • CODAL SoundExpressions, default sampling rate: 44_100
  • Built-in sounds via microbit.Sound pre-generated instances
    • CODAL SoundExpressions, default sampling rate: 44_100
  • audio.AudioFrame
    • MicroPython data type, default sampling rate: 8_000?
      • Not sure about this one, the docs list as single frame of 32 samples to take a bit over 4ms.

We'd have to check how it'd affect sound quality, but we could consider decreasing the sampling rate for SoundExpressions to 11K. However, we would still have the isse that we won't be able to increase the default AudioFrame sampling rate as that would how old programmes sound.

While this would be the cleanest way to do this for the user API, I'm not seeing a way in which it can be achieved with the current requirements? Does anybody have any ideas to overcomes this issues?

Expand AudioFrame to include sampling rate

This option is similar to the original proposals about creating a new AudioBuffer data type, but instead we can expand AudioFrames to be able to have different sizes and different sampling rates.

By default they should still behave as they do in micro:bit V1 (and older MicroPython versions for V2), which is 32 samples at 8K (?) sampling rate.

But this could be changed via constructor parameters to have any size buffer at any sample rate.

Unfortunately, that still leaves us with the awkward case of changing playback sampling rate by changing a variable from the samples class, instead of a method in the audio module.

my_recording = audio.AudioFrames(length=11000, rate=5500)
my_recording = microphone.record_into(my_recording)
# Or
my_recording = microphone.record(duration=2000, rate=5500)

audio.play(my_recording, wait=False)
while audio.is_playing():
    x = accelerometer.get_x()
    my_recording.rate = scale(x, (-1000, 1000), (2250, 11000))
    sleep(50)

@microbit-carlos
Copy link
Collaborator Author

@dpgeorge the docs have been updated, let me know if something doesn't match our previous conversation.

@microbit-carlos
Copy link
Collaborator Author

@dpgeorge there are couple of issues related to setting the sampling rate, but shouldn't affect the MicroPython implementation and should be fixed (without changes in the API) in the next CODAL release: https://github.com/lancaster-university/codal-microbit-v2/issues?q=is%3Aopen+milestone%3Av0.2.60+label%3Ap0

docs/audio.rst Outdated
@@ -69,6 +72,13 @@ Functions

Stops all audio playback.

.. py:function:: set_rate(sample_rate)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps this could be made into a static-method of the AudioFrame class? That way the scope of it is clear, it only acts on these objects. And it would then be possible to write:

AudioFrame.set_rate(8000)
...
my_frame = AudioFrame(...)
my_frame.set_rate(8000)
``

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, as a static method that would affect all AudioFrame playback instead of individual instances?

I agree that having it as part of AudioFrame makes it a lot more clear that it only affects playback of this type, but might be confusing when accessing the function via instances:

frame_one = AudioFrame(...)
frame_two = AudioFrame(...)
frame_one.set_rate(8000)
frame_two.set_rate(8000)
...
while audio.is_playing():
    frame_one.set_rate(scale(accelerometer values...))

# At this point, I'd expect playback for frame_two to still be 8000?
audio.play(frame_two) 

Would it'd be more intuitive if each instance hold its playback rate?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, as a static method that would affect all AudioFrame playback instead of individual instances?

Yes, as a static method calling it once affects all AudioFrame instances.

If we went this way, probably best to document it as AudioFrame.set_rate(...) and never refer to it as frame.set_rate(...). Then it's clear it sets the global rate for all audio frames.

Would it'd be more intuitive if each instance hold its playback rate?

I'm not sure... maybe? But then it would be an instance method and you would always call frame.set_rate(...) to set the playback rate of that instance.

I think that's possible to implement.

docs/audio.rst Outdated
AudioFrame
==========

.. py:class::
AudioFrame
AudioFrame(size=32)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps this could be extended to take optional keyword arguments duration and rate (with a default), to make it easy to create an AudioFrame of a given duration.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think in this case, similar to the previous comment about having an AudioFrame.set_rate(), if rate was added to the constructor then we'd need to have set_rate() as a method changing each instance value, instead of a class static function.

I do like the idea to be able to make the arguments of the AudioFrame constructor simpler to work out if the user just wants a specific "size in time".

What would happen if the constructor is provided an argument for size and for duration? Would that throw an exception?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that there's a difference between record and play rates. The rate here is related to the recording rate, although the actual recording rate is set when you call microphone.record_into(). And that's different again to the playback rate set by .set_rate().

What would happen if the constructor is provided an argument for size and for duration? Would that throw an exception?

Yes it could throw an exception.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

docs/audio.rst Outdated
The ``audio.play()`` function can consume an instance or iterable
(sequence, like list or tuple, or generator) of ``AudioFrame`` instances.
Its default playback rate is 7812 Hz, and uses linear interpolation to output
a PWM signal at 32.5 kHz.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should get rid of this linear interpolation, and just output the 7812 Hz signal directly. I did some tests, and the audio quality is better without the interpolation.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good, in that case let's do that and remove this from the docs 👍

@@ -70,11 +119,61 @@ Functions
* **return**: a representation of the sound pressure level in the range 0 to
255.

.. py:function:: record(duration=3000, rate=7812, wait=True)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we can remove this function altogether, and just have record_into(). That way memory management is explicit, it avoids situations where there's a lot of heap fragmentation, and also avoids any difficulties with this function returning an AudioFrame object that is growing over time.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I forgot the context in which AudioFrame would be growing. I take it that was the case when wait=False, but if we have to have a known/default value for duration and rate, wouldn't the AudioFrame be allocated at the start before the recording occurs?

I still quite like the idea of being able to do audio.play(microphone.record(duration=3000)).
Would it be simpler if wait was removed from this version and only available in record_into()?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I take it that was the case when wait=False, but if we have to have a known/default value for duration and rate, wouldn't the AudioFrame be allocated at the start before the recording occurs?

When wait=False the returned-AudioFrame will be gradually filling up. If we go with the current implementation where the whole buffer is preallocated at the start of the recording, then I guess it's not too bad, the user just sees blank data that's gradually filling up.

So, we can keep this function. And even support wait=False. Under the hood it will simply allocate an AudioFrame of the desired size, then pass it to record_into().


Example
=======
.. py:function:: record_into(buffer, rate=7812, wait=True)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps this function can set some new internal state in an AudioFrame object which indicates the length of the recording (as opposed to the total allocated length of the buffer). Then audio.play() would use this state to only play the amount that was recorded.

Could then add a method to get/set this length, eg AudioFrame.get_recording_length(), and AudioFrame.set_recording_length().

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I completely agree here, playing with the current branch and recording a couple of seconds into a 5 second "buffer" results in a few seconds of "silent playback".

Would record_into() then also use this marker to continue recording where it left off?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would record_into() then also use this marker to continue recording where it left off?

I think that would be confusing. It would require the user to explicitly reset the marker to the beginning to reuse the buffer. Maybe instead record_into() could have an additional argument which lets the user specify the starting location in the buffer.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

microbit-mark and others added 13 commits February 26, 2024 18:09
* Add microphone API

* include V2 on index

* update music and V2 pins

* update speech

* update audio

* update i2c

* add description to built-in sounds

* format sounds

* format pins

* format pins

* spelling

* Update docs/audio.rst

Co-authored-by: Carlos Pereira Atencio <carlos@microbit.org>

* Update docs/audio.rst

Co-authored-by: Carlos Pereira Atencio <carlos@microbit.org>

* Update docs/i2c.rst

Co-authored-by: Carlos Pereira Atencio <carlos@microbit.org>

* Update docs/microbit_micropython_api.rst

Co-authored-by: Carlos Pereira Atencio <carlos@microbit.org>

* Update docs/microbit_micropython_api.rst

Co-authored-by: Carlos Pereira Atencio <carlos@microbit.org>

* changes based on initial feedback

* update return pin

* sound is in microbit module

* Update docs/audio.rst

Co-authored-by: Carlos Pereira Atencio <carlos@microbit.org>

* Update docs/microbit_micropython_api.rst

Co-authored-by: Carlos Pereira Atencio <carlos@microbit.org>

* Update docs/microbit_micropython_api.rst

Co-authored-by: Carlos Pereira Atencio <carlos@microbit.org>

* more feedback updates

* spacing

* Update docs/audio.rst

Co-authored-by: Carlos Pereira Atencio <carlos@microbit.org>

* does this comment work?

* Revert "does this comment work?"

This reverts commit 4846113.

* update audio

* update audio

* add Python Editor

* add references to sound

* update mic image

* Update docs/microbit_micropython_api.rst

Co-authored-by: Carlos Pereira Atencio <carlos@microbit.org>

* multiple fixes

* use Param in docs

* update SouneEvent

* update microphone module

* update param

* update image list

* update parameters

* remove note

* update param

* add line break

* add mic and make clases a-z

* format parameters

* Update docs/microbit_micropython_api.rst

* Update docs/microbit_micropython_api.rst

Co-authored-by: Carlos Pereira Atencio <carlos@microbit.org>

* more updates

* tidy up

* move SoundEvent

* Revert "move SoundEvent"

This reverts commit 8f52c97.

* move soundEvent

* update image list

* remove param reference

* more feedback updates

* move audio to modules

* Update docs/music.rst

Co-authored-by: Carlos Pereira Atencio <carlos@microbit.org>

* Update docs/microphone.rst

Co-authored-by: Carlos Pereira Atencio <carlos@microbit.org>

* Update docs/microbit_micropython_api.rst

Co-authored-by: Carlos Pereira Atencio <carlos@microbit.org>

* Update docs/microphone.rst

Co-authored-by: Carlos Pereira Atencio <carlos@microbit.org>

* Update docs/music.rst

Co-authored-by: Carlos Pereira Atencio <carlos@microbit.org>

* Update docs/audio.rst

Co-authored-by: Carlos Pereira Atencio <carlos@microbit.org>

* more updates

Co-authored-by: Carlos Pereira Atencio <carlos@microbit.org>
* Docs: update for RGBW neopixel

* Docs: RGBW neopixels

* Update docs/neopixel.rst

Co-authored-by: Carlos Pereira Atencio <carlos@microbit.org>

* Update docs/neopixel.rst

Co-authored-by: Carlos Pereira Atencio <carlos@microbit.org>

* change to bpp

* change to bpp

* add image attribution

* Update docs/neopixel.rst

Co-authored-by: Carlos Pereira Atencio <carlos@microbit.org>

* Add V2 methods

* Docs: Update neopixel

Co-authored-by: Carlos Pereira Atencio <carlos@microbit.org>
* speaker pin changes

* Docs: update speaker pin info

* Add speaker page

* grammar

* Link to speaker page

* add speaker to navigation

* add speaker example

* Update docs/speaker.rst

Co-authored-by: Carlos Pereira Atencio <carlos@microbit.org>

* Update docs/speaker.rst

Co-authored-by: Carlos Pereira Atencio <carlos@microbit.org>

* Update docs/music.rst

Co-authored-by: Carlos Pereira Atencio <carlos@microbit.org>

* Docs: Update speaker pin

* remove build

* gitignore _build directory

* update music

* Update docs/music.rst

* Update docs/speech.rst

* Update docs/speech.rst

* Update docs/speech.rst

* Update docs/music.rst

Co-authored-by: Carlos Pereira Atencio <carlos@microbit.org>
* V2 docs: add set_touch_mode()

* V2 docs: add set_touch_mode() to micropython API doc
…bcmicrobit#705)

* Add layout table and partial flashing

* deprecated appending script

* add build info for v2

* Add section on BLE DFU for V2

* Carlos' feedback

* fix filesystem link

* fixup! fix filesystem link

* fixup! fixup! fix filesystem link

* formatting

* formatting

* fix filesystem link

* Update docs/devguide/hexformat.rst

Co-authored-by: Carlos Pereira Atencio <carlos@microbit.org>
* docs: update-tutorials for V2

* docs: update images

* remove pin functions reference link

* reduce size of image

* Update docs/pin.rst

* Update docs/pin.rst

* Update docs/tutorials/io.rst

Co-authored-by: Carlos Pereira Atencio <carlos@microbit.org>
Completes and merges build and flash pages.
Adds V2 info.
Removes outdated FAQ page.
Adds WebUSB info to REPL page.

Rebase update: Keeps and updates the flashing paragraph in
flashfirmware page.
Includes microbit.run_every function/decorator.
* docs: Add Power Management documentation.

* docs: Merge the wake_source() function into deep_sleep().

* docs: Merge power.deep_sleep() `pins` & `buttons` params into `wake_on`.

* docs: Move power examples to Python files, and the tech details to the end of the page.
* docs: Add Sound Effects documentation.

* docs: Change proposed audio.Effect() to audio.SoundEffect()

* docs: Update the proposed default values for audio.SoundEffect().

* docs: Move the SoundEffects constants from `audio` to `audio.SoundEffect`.

* docs: Remove SoundEffect `preset` constructor parameter, and add `copy()` method.

* docs: Change SoundEffect() `interporlation` parameter to `shape`.

* docs: Remove mentions of SoundEffect presets, add ranges, example to file.
* docs: Update Power Mgm to change run_every behaviour.

From waking up the board when the next function is scheduled, to
allow the scheduled functions to continue running indefinitely
while the boards sleeps.

* docs: Tweak Power Mgm run_every description & indicate UART wakeup.

.. py:function:: stop_recording()

Stops an a recording running in the background.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Stops an a recording running in the background.
Stops a recording running in the background.

Fixing typo but that also got me wondering what happens if you record more than once with wait=False (before the first has finished).
Do they all work? If so does this function stop them all?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only one can be recorded at once, so a previously running recording is aborted at the point you call record() or record_into() again (as though you called stop_recording() first).

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants