Convert AudioData stream to arrays of floats, and convert arrays of floats to AudioData #84

Open
wants to merge 31 commits into
base: master

lucasw
Contributor

@lucasw lucasw commented Aug 3, 2016

I think this needs some work before merging, but I'd like to draw attention to it to get some advice.

The audio_to_float node converts gstreamer encoded AudioData packets into easy to process and understand ChannelFloat32 messages.

I have a viewer that plots the ChannelFloat32 messages. It takes up a lot of CPU, however, and it probably ought to live elsewhere as a generic utility for plotting array topics (of more types), on the assumption that they are actually 1-D and in sequence. Maybe this is already solved elsewhere?

ChannelFloat32 was used because it was the first message type I found already installed on my system that had floats in an array, but maybe there is something else I ought to use?

The float_to_audio node is resistant to dying with ctrl-c; it would be nice to fix that.
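The cause of the ctrl-c problem is not stated here, so the following is only a generic sketch of one common pattern: a SIGINT handler sets a flag, and the processing loop checks that flag between steps instead of blocking forever. The names `install_sigint_handler`, `shutdown_requested`, and `run_until_shutdown` are hypothetical and do not come from this PR, ROS, or gstreamer.

```cpp
#include <cassert>
#include <csignal>

namespace
{
// Written from the signal handler. sig_atomic_t is the type the standard
// guarantees can be assigned safely inside a handler.
volatile std::sig_atomic_t g_shutdown_requested = 0;

void on_sigint(int)
{
  g_shutdown_requested = 1;
}
}  // namespace

// Hypothetical helper: route ctrl-c (SIGINT) to the flag above.
void install_sigint_handler()
{
  const auto previous = std::signal(SIGINT, on_sigint);
  assert(previous != SIG_ERR);
  (void)previous;  // silence unused-variable warnings when NDEBUG is set
}

bool shutdown_requested()
{
  return g_shutdown_requested != 0;
}

// Hypothetical loop: run one unit of work at a time and re-check the flag,
// so a pending ctrl-c is noticed after at most one step.
template <typename Step>
int run_until_shutdown(Step step)
{
  int iterations = 0;
  while (!shutdown_requested())
  {
    step();
    ++iterations;
  }
  return iterations;
}
```

In a real node the step would have to be something that returns regularly (for example, a bounded wait rather than an unbounded blocking call); otherwise the flag is never re-checked.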

The audio_to_float node makes assumptions about the data format when it converts it; I think I understand gstreamer well enough now to have gstreamer convert to floating point for me.
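To make the kind of assumption concrete, here is a minimal sketch (not the code from this PR) of a manual conversion that only works if the byte payload is already decoded, mono, signed 16-bit little-endian PCM. The helper name `s16le_to_float` is hypothetical; any other format (a different width, endianness, or a compressed stream such as mp3) would produce garbage, which is why letting gstreamer do the conversion is attractive.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper, not taken from this PR: decode raw bytes that are
// ASSUMED to hold signed 16-bit little-endian mono PCM into floats in [-1, 1).
std::vector<float> s16le_to_float(const std::vector<std::uint8_t>& bytes)
{
  // A 16-bit sample needs two bytes; an odd length means the assumption is wrong.
  assert(bytes.size() % 2 == 0);

  std::vector<float> samples;
  samples.reserve(bytes.size() / 2);
  for (std::size_t i = 0; i + 1 < bytes.size(); i += 2)
  {
    // Assemble the sample low byte first, independent of host endianness.
    int value = static_cast<int>(bytes[i]) | (static_cast<int>(bytes[i + 1]) << 8);
    // Reinterpret the 16-bit pattern as two's complement.
    if (value >= 32768)
    {
      value -= 65536;
    }
    // Scale by 2^15 so that -32768 maps to exactly -1.0.
    samples.push_back(static_cast<float>(value) / 32768.0f);
  }
  return samples;
}
```

For example, the four bytes `00 40 00 80` decode to the two samples 0.5 and -1.0 under this assumption.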

I'd really like to reformat the code to be roslint compliant, though both cpp files are adapted from code here and mostly the existing style hasn't been changed.

Possibly this should just be a standalone package?

@trainman419
Contributor

I like the goal here; it aligns closely with one of the overall goals for the package, to provide a more general audio transport within ROS: #12

I don't work with audio data frequently so I don't know what the preferred format is, but choosing ChannelFloat32 as the message type seems a bit limiting. I think a new message type for formatted audio data would be fine, and there is already the audio_common_msgs package for it.

The comments in ChannelFloat32 indicate that it is used for 3D point cloud data, so if you do want to use it for audio you should probably request an update to the comments.

Some of the most common requests I get for audio_capture and audio_play are for a more understandable message format, so I think it would make sense to integrate this directly into those nodes. (It would probably decrease the latency too.)

I'm not aware of a viewer like this elsewhere. I'd be happy to accept it here.

@jack-oquin
Member

My experience (long ago) working with the JACK Audio Connection Kit suggests that an array of 32-bit floats for each channel is a very good way to exchange audio data between different programs.

However, I feel that we should not repurpose ChannelFloat32 for that, because it already exists with different semantics. We can define a new, audio-specific message instead.
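As a rough illustration of the "array of 32-bit floats for each channel" layout, the sketch below splits an interleaved float buffer into one array per channel. The `FloatAudioFrame` struct and its field names are hypothetical stand-ins for whatever new message might be defined; nothing like it is specified in this thread.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a dedicated audio message; the field names are
// illustrative only and are not defined by this PR.
struct FloatAudioFrame
{
  unsigned int sample_rate_hz;               // e.g. 16000
  std::vector<std::vector<float>> channels;  // one float array per channel
};

// Split interleaved samples (L R L R ...) into per-channel arrays.
FloatAudioFrame deinterleave(const std::vector<float>& interleaved,
                             std::size_t channel_count,
                             unsigned int sample_rate_hz)
{
  assert(channel_count > 0);
  // Every frame must contain exactly one sample per channel.
  assert(interleaved.size() % channel_count == 0);

  const std::size_t frames = interleaved.size() / channel_count;
  FloatAudioFrame out;
  out.sample_rate_hz = sample_rate_hz;
  out.channels.assign(channel_count, std::vector<float>(frames));
  for (std::size_t f = 0; f < frames; ++f)
  {
    for (std::size_t c = 0; c < channel_count; ++c)
    {
      out.channels[c][f] = interleaved[f * channel_count + c];
    }
  }
  return out;
}
```

Carrying the sample rate alongside the samples is a design choice in this sketch: without it, a subscriber cannot interpret the arrays as time series.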

lucasw added 28 commits May 19, 2023 07:12
…rrectly so the callback gets triggered repeatedly with the same buffer. It seems like I need something else in the pipeline to receive the app_buffer.
… I have to know that and cast the data appropriately, or can I force conversion to expected format?
…w I only see five good samples and everything else is zero.
…format, currently getting

[ INFO] [/audio_capture] [/home/lucasw/catkin_ws/src/audio_common/float_to_audio/src/float_to_audio.cpp]:[208] [emitted push]
[ERROR] [/audio_capture] [/home/lucasw/catkin_ws/src/audio_common/float_to_audio/src/float_to_audio.cpp]:[194] [gstreamer: Internal data flow error.]
terminate called after throwing an instance of 'boost::exception_detail::clone_impl<boost::exception_detail::error_info_injector<boost::lock_error> >'
  what():  boost: mutex

Not sure if I'm specifying the float format correctly, x-raw-float is 0.10 and not 1.0.
0:00:00.016236728 17257      0x19a4720 WARN           basetransform gstbasetransform.c:1414:gst_base_transform_setcaps:<filter> transform could not transform audio/x-raw, format=(int)28, channels=(int)1, width=(int)32, depth=(int)32, endianness=(int)1234, rate=(int)16000 in anything we support
[ INFO] [/audio_capture] [/home/lucasw/catkin_ws/src/audio_common/float_to_audio/src/float_to_audio.cpp]:[208] [emitted push 8]
0:00:00.258644946 17257      0x19a4720 WARN           basetransform gstbasetransform.c:1414:gst_base_transform_setcaps:<filter> transform could not transform audio/x-raw, format=(int)28, channels=(int)1, width=(int)32, depth=(int)32, endianness=(int)1234, rate=(int)16000 in anything we support
0:00:00.258680231 17257      0x19a4720 WARN           basetransform gstbasetransform.c:1414:gst_base_transform_setcaps:<filter> transform could not transform audio/x-raw, format=(int)28, channels=(int)1, width=(int)32, depth=(int)32, endianness=(int)1234, rate=(int)16000 in anything we support
0:00:00.258704660 17257      0x19a4720 WARN                 basesrc gstbasesrc.c:2948:gst_base_src_loop:<source> error: Internal data flow error.
0:00:00.258712146 17257      0x19a4720 WARN                 basesrc gstbasesrc.c:2948:gst_base_src_loop:<source> error: streaming task paused, reason not-negotiated (-4)
0:00:00.258770927 17257      0x19a4720 WARN           basetransform gstbasetransform.c:1414:gst_base_transform_setcaps:<filter> transform could not transform audio/x-raw, format=(int)28, channels=(int)1, width=(int)32, depth=(int)32, endianness=(int)1234, rate=(int)16000 in anything we support
0:00:00.258792146 17257      0x19a4720 WARN           basetransform gstbasetransform.c:1414:gst_base_transform_setcaps:<filter> transform could not transform audio/x-raw, format=(int)28, channels=(int)1, width=(int)32, depth=(int)32, endianness=(int)1234, rate=(int)16000 in anything we support
[ERROR] [/audio_capture] [/home/lucasw/catkin_ws/src/audio_common/float_to_audio/src/float_to_audio.cpp]:[194] [gstreamer: Internal data flow error.]
terminate called after throwing an instance of 'boost::exception_detail::clone_impl<boost::exception_detail::error_info_injector<boost::lock_error> >'
  what():  boost: mutex lock failed in pthread_mutex_lock: Invalid argument
Aborted (core dumped)
….c gst_pad_push_data error pushing events, return not-negotiated (Seen on debug level 6)
… 16823 0x224a000 WARN basetransform gstbasetransform.c:1414:gst_base_transform_setcaps:<filter> transform could not transform audio/x-raw, format=(string)S16LE, channels=(int)1, layout=(int)0, width=(int)16, depth=(int)16, rate=(int)16000, signed=(boolean)true in anything we support
…tring. The input and output rates do look different though
…is to approximate the NES sound chip with A-D voices, have four ROS nodes in control of each, and then they can be played individually.
…s in it results in internal data error. Also trying to make noise sound more like online examples; I think when sample rate exceeds output sample rate there needs to be averaging. Also it is possible the mp3 compression is making the sound much different (could increase bitrate?), which is why I want wave to work.
… though), and can't get rqt to load the config file
…eems like there ought to be an auditory difference (or maybe I need the conversion to float to understand the decoded format better)
…udio_from_float to have the same prefix as the rest of the projects? Also deleted a lot of commented out code. Inability to kill float_to_audio easily remains, also other major issue is that audio_to_float assumes format to convert to float, I think I understand gstreamer enough to make it do the floating point conversion.
…h float_to_audio, but when I run it manually it does work:

rosrun float_to_audio float_to_audio audio:=audio2 samples:=decoded __name:=float_to_audio __ns:=/audio
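
The "could not transform" warnings in the commit log above show caps where `format` appears as an integer next to 0.10-style fields (`width`, `depth`, `endianness`). In GStreamer 1.0, raw audio caps use the media type `audio/x-raw` with `format` given as a string such as `F32LE`, plus `layout`, `rate`, and `channels`. The sketch below only builds such a caps description as text; the helper name `float_caps_string` is hypothetical, and whether this resolves the negotiation failure in this PR is an assumption, not something the log confirms.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: build a GStreamer 1.0 style caps description for raw
// 32-bit little-endian float audio. The format is written as a string
// (F32LE), not as an integer enum value.
std::string float_caps_string(int rate_hz, int channels)
{
  assert(rate_hz > 0 && channels > 0);
  return "audio/x-raw,format=F32LE,layout=interleaved,rate=" +
         std::to_string(rate_hz) + ",channels=" + std::to_string(channels);
}
```

A string like the one returned by `float_caps_string(16000, 1)` could then be parsed with `gst_caps_from_string` and set on a capsfilter or appsrc; that step is left out here so the sketch stays free of GStreamer dependencies.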
3 participants