Convert AudioData stream to arrays of floats, and convert arrays of floats to AudioData #84

lucasw · 2016-08-03T14:45:05Z

I think this needs some work before merging, but I'd like to draw attention to it to get some advice.

The audio_to_float node converts gstreamer-encoded AudioData packets into ChannelFloat32 messages that are easy to process and understand.

I have a viewer that plots the ChannelFloat32 messages. It uses a lot of CPU, however, and it probably ought to live elsewhere as a generic utility for plotting array topics (of more types) in a way that assumes they are actually 1-D and in sequence. Maybe this is already solved elsewhere?

ChannelFloat32 was used because it was the first message type I found already installed on my system that had floats in an array, but maybe there is something else I ought to use?

The float_to_audio node is resistant to dying with Ctrl-C; it would be nice to fix that.

The audio_to_float node makes assumptions about the data format when it converts it. I think I understand gstreamer well enough now to have gstreamer convert to floating point for me.
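To make the format assumption concrete, here is a minimal sketch of the kind of conversion involved. It assumes the decoded buffer holds signed 16-bit little-endian mono samples (the S16LE format that appears in the gstreamer caps later in this thread); the function name `s16le_to_floats` is hypothetical and not taken from the PR code.

```python
import struct


def s16le_to_floats(data):
    """Convert raw S16LE PCM bytes to a list of floats in [-1.0, 1.0).

    Assumption: `data` contains only signed 16-bit little-endian samples.
    Any other format (8-bit, 32-bit float, big-endian) needs different handling.
    """
    if len(data) % 2 != 0:
        raise ValueError("S16LE data must contain an even number of bytes")
    count = len(data) // 2
    # '<' forces little-endian with standard sizes, 'h' is a signed 16-bit integer.
    ints = struct.unpack("<%dh" % count, data)
    # Dividing by 32768 maps -32768..32767 onto -1.0..(just under) 1.0.
    return [sample / 32768.0 for sample in ints]


if __name__ == "__main__":
    raw = struct.pack("<3h", 0, 16384, -32768)
    print(s16le_to_floats(raw))
```

If gstreamer is asked to deliver floating point samples directly, a step like this would not be needed in the node at all.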

I'd really like to reformat the code to be roslint compliant, though both cpp files are adapted from code here and mostly the existing style hasn't been changed.

Possibly this should just be a standalone package?

trainman419 · 2016-08-08T16:57:51Z

I like the goal here; it aligns closely with one of the overall goals for the package, to provide a more general audio transport within ROS: #12

I don't work with audio data frequently so I don't know what the preferred format is, but choosing ChannelFloat32 as the message type seems a bit limiting. I think a new message type for formatted audio data would be fine, and there is already the audio_common_msgs package for it.

The comments in ChannelFloat32 indicate that it is used for 3D point cloud data, so if you do want to use it for audio you should probably request an update to the comments.

Some of the most common requests I get for audio_capture and audio_play are for a more understandable message format, so I think it would make sense to integrate this directly into those nodes. (it would probably decrease the latency too)

I'm not aware of a viewer like this elsewhere. I'd be happy to accept it here.

jack-oquin · 2016-08-11T15:19:36Z

My experience (long ago) working with the JACK Audio Connection Kit suggests that an array of 32-bit floats for each channel is a very good way to exchange audio data between different programs.

However, I feel that we should not repurpose ChannelFloat32 for that, because it already exists with different semantics. We can define a new audio-specific message instead.
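As a rough illustration of the "array of floats for each channel" layout, the sketch below splits interleaved samples into one list per channel. `FloatAudioFrame` and `deinterleave` are hypothetical names used only for this example; they are not an existing audio_common_msgs type or a proposed message definition.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class FloatAudioFrame:
    """Hypothetical container: one list of float samples per channel."""
    sample_rate: int
    channels: List[List[float]]


def deinterleave(samples, num_channels, sample_rate):
    """Split interleaved samples (L, R, L, R, ...) into per-channel lists."""
    if num_channels < 1:
        raise ValueError("num_channels must be at least 1")
    if len(samples) % num_channels != 0:
        raise ValueError("sample count is not a multiple of num_channels")
    # Channel c owns every num_channels-th sample, starting at offset c.
    channels = [list(samples[c::num_channels]) for c in range(num_channels)]
    return FloatAudioFrame(sample_rate=sample_rate, channels=channels)


if __name__ == "__main__":
    frame = deinterleave([0.1, -0.1, 0.2, -0.2], num_channels=2, sample_rate=16000)
    print(frame.channels)
```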

…format, currently getting

[ INFO] [/audio_capture] [/home/lucasw/catkin_ws/src/audio_common/float_to_audio/src/float_to_audio.cpp]:[208] [emitted push]
[ERROR] [/audio_capture] [/home/lucasw/catkin_ws/src/audio_common/float_to_audio/src/float_to_audio.cpp]:[194] [gstreamer: Internal data flow error.]
terminate called after throwing an instance of 'boost::exception_detail::clone_impl<boost::exception_detail::error_info_injector<boost::lock_error> >'
what(): boost: mutex

Not sure if I'm specifying the float format correctly, x-raw-float is 0.10 and not 1.0.

0:00:00.016236728 17257 0x19a4720 WARN basetransform gstbasetransform.c:1414:gst_base_transform_setcaps:<filter> transform could not transform audio/x-raw, format=(int)28, channels=(int)1, width=(int)32, depth=(int)32, endianness=(int)1234, rate=(int)16000 in anything we support
[ INFO] [/audio_capture] [/home/lucasw/catkin_ws/src/audio_common/float_to_audio/src/float_to_audio.cpp]:[208] [emitted push 8]
0:00:00.258644946 17257 0x19a4720 WARN basetransform gstbasetransform.c:1414:gst_base_transform_setcaps:<filter> transform could not transform audio/x-raw, format=(int)28, channels=(int)1, width=(int)32, depth=(int)32, endianness=(int)1234, rate=(int)16000 in anything we support
0:00:00.258680231 17257 0x19a4720 WARN basetransform gstbasetransform.c:1414:gst_base_transform_setcaps:<filter> transform could not transform audio/x-raw, format=(int)28, channels=(int)1, width=(int)32, depth=(int)32, endianness=(int)1234, rate=(int)16000 in anything we support
0:00:00.258704660 17257 0x19a4720 WARN basesrc gstbasesrc.c:2948:gst_base_src_loop:<source> error: Internal data flow error.
0:00:00.258712146 17257 0x19a4720 WARN basesrc gstbasesrc.c:2948:gst_base_src_loop:<source> error: streaming task paused, reason not-negotiated (-4)
0:00:00.258770927 17257 0x19a4720 WARN basetransform gstbasetransform.c:1414:gst_base_transform_setcaps:<filter> transform could not transform audio/x-raw, format=(int)28, channels=(int)1, width=(int)32, depth=(int)32, endianness=(int)1234, rate=(int)16000 in anything we support
0:00:00.258792146 17257 0x19a4720 WARN basetransform gstbasetransform.c:1414:gst_base_transform_setcaps:<filter> transform could not transform audio/x-raw, format=(int)28, channels=(int)1, width=(int)32, depth=(int)32, endianness=(int)1234, rate=(int)16000 in anything we support
[ERROR] [/audio_capture] [/home/lucasw/catkin_ws/src/audio_common/float_to_audio/src/float_to_audio.cpp]:[194] [gstreamer: Internal data flow error.]
terminate called after throwing an instance of 'boost::exception_detail::clone_impl<boost::exception_detail::error_info_injector<boost::lock_error> >'
what(): boost: mutex lock failed in pthread_mutex_lock: Invalid argument
Aborted (core dumped)

… 16823 0x224a000 WARN basetransform gstbasetransform.c:1414:gst_base_transform_setcaps:<filter> transform could not transform audio/x-raw, format=(string)S16LE, channels=(int)1, layout=(int)0, width=(int)16, depth=(int)16, rate=(int)16000, signed=(boolean)true in anything we support

…s in it results in internal data error. Also trying to make noise sound more like online examples- I think when sample rate exceeds output sample rate there needs to be averaging. Also it is possible the mp3 compression is making the sound much different (could increase bitrate?), which is why I want wave to work.
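The averaging idea mentioned in this note could look roughly like the sketch below: when the generated signal has more samples than the output rate can carry, each block of consecutive samples is replaced by its mean. It assumes an integer ratio between the two rates, and `downsample_by_averaging` is a hypothetical helper, not code from this branch. A real resampler (for example gstreamer's audioresample element) filters more carefully than a plain block average.

```python
def downsample_by_averaging(samples, factor):
    """Reduce the sample count by `factor`, averaging each block of samples.

    Assumption: the source rate is an integer multiple (`factor`) of the
    output rate. Trailing samples that do not fill a whole block are dropped.
    """
    if factor < 1:
        raise ValueError("factor must be at least 1")
    usable = len(samples) - (len(samples) % factor)
    return [
        sum(samples[i:i + factor]) / factor
        for i in range(0, usable, factor)
    ]


if __name__ == "__main__":
    print(downsample_by_averaging([1.0, 3.0, 2.0, 4.0, 5.0], 2))
```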

…udio_from_float to have the same prefix as the rest of the projects? Also deleted a lot of commented out code. Inability to kill float_to_audio easily remains, also other major issue is that audio_to_float assumes format to convert to float, I think I understand gstreamer enough to make it do the floating point conversion.

lucasw added 28 commits May 19, 2023 07:12

x-raw allows my mic to work- needs to be parameter

46e1c7f

Not compiling yet, want to be able to decode audio and then process arrays of samples.

a740590

py example doesn't work

dba1563

This compiles but results in glib/gstreamer errors when it runs.

e830583

It looks like only one buffer is received, but I'm not handling it correctly so the callback gets triggered repeatedly with the same buffer. It seems like I need something else in the pipeline to receive the app_buffer.

2f2f32b

This may be the raw data- but what about 8-bit vs. 16-bit samples? Do I have to know that and cast the data appropriately, or can I force conversion to expected format?

16c637d

I thought I was getting good decoded data from the microphone, but now I only see five good samples and everything else is zero.

ad36501

Now seeing reasonable waveform, and it is 16-bit, which the viewer has to handle

e65a238

Passing data as -1.0 to 1.0 float now, so receiving node doesn't have to understand it (should be 0 to 1.0?)

3c51d1e

More visually pleasing but perhaps not super useful fade out of old waveforms

c87d14f

Generating a float array signal, now need to get it into gst encoded AudioData.

9095bed

Trying out paused, but the next problem seems to be with pads? gstpad.c gst_pad_push_data error pushing events, return not-negotiated (Seen on debug level 6)

4677fbe

Trying out audio resample, no success yet

6a0592b

Finall got it! The whole problem was not specifying the layout as a string. The input and output rates do look different though

c8c1653

Making messages debug level, also adjusting visualization. Next step is to approximate nes sound chip with A-D voices, have four ros nodes in control of each, and then they can be played individually.

4fe1db6

Trying out nes style noise, but the waveform doesn't sound right

bba66bd

flac seems to work, though I got a crash once (maybe a different node though), and can't get rqt to load the config file

18b4d29

disabling wave. Flac works but the waveform is really different, it seems like there ought to be an auditory difference (or maybe I need the conversion to float to understand the decoded format better)

a4c7f70

The launch file doesn't work when I launch it- something is wrong with float_to_audio, but when I run it manually it does work: rosrun float_to_audio float_to_audio audio:=audio2 samples:=decoded __name:=float_to_audio __ns:=/audio

2fc59f5

Forgot to add this headers

5b7fb73

Add options to disable parts of the float to audio launch file

4584078

View a spectrogram of the audio

8bae90d

Bigger nperseg shows more detail in the spectrogram

f6a80dc
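On the nperseg commit above: `nperseg` is the segment-length parameter name used by `scipy.signal.spectrogram`, and this note assumes that is what the viewer calls (the thread does not say). Under that assumption, a longer segment gives finer frequency bins at the cost of coarser time resolution. The sketch below only does the arithmetic; `spectrogram_resolution` is a hypothetical helper, and 16000 Hz is the rate shown in the gstreamer caps earlier in this thread.

```python
def spectrogram_resolution(sample_rate_hz, nperseg):
    """Return (frequency bin spacing in Hz, segment duration in seconds).

    Assumes one FFT per segment of `nperseg` samples with no zero padding,
    so bins are spaced sample_rate / nperseg apart.
    """
    if sample_rate_hz <= 0 or nperseg < 1:
        raise ValueError("sample_rate_hz must be positive and nperseg at least 1")
    freq_bin_hz = sample_rate_hz / nperseg
    segment_seconds = nperseg / sample_rate_hz
    return freq_bin_hz, segment_seconds


if __name__ == "__main__":
    for nperseg in (256, 1024):
        print(nperseg, spectrogram_resolution(16000, nperseg))
```

Going from 256 to 1024 samples per segment at 16000 Hz narrows each frequency bin by a factor of four, which is one way to read "more detail", while each column of the spectrogram then covers four times as much audio.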

lucasw force-pushed the appsink branch from b94ee6a to f6a80dc on May 19, 2023 14:14

github-actions bot added audio_capture audio_play labels May 19, 2023

lucasw added 3 commits May 19, 2023 07:17

flake8 compliance

497bba7

flake8 compliance for view.py

d61ee03

flake8 compliance in float_to_audio scripts

e0a3be8
