Convert AudioData stream to arrays of floats, and convert arrays of floats to AudioData #84

Open
wants to merge 31 commits into
base: master
31 commits
46e1c7f
x-raw allows my mic to work- needs to be parameter
lucasw Jul 16, 2016
a740590
Not compiling yet, want to be able to decode audio and then process a…
lucasw Jul 16, 2016
dba1563
py example doesn't work
lucasw Jul 19, 2016
e830583
This compiles but results in glib/gstreamer errors when it runs.
lucasw Jul 23, 2016
2f2f32b
It looks like only one buffer is received, but I'm not handling it co…
lucasw Jul 23, 2016
16c637d
This may be the raw data- but what about 8-bit vs. 16-bit samples? D…
lucasw Jul 24, 2016
ad36501
I thought I was getting good decoded data from the microphone, but no…
lucasw Jul 24, 2016
e65a238
Now seeing reasonable waveform, and it is 16-bit, which the viewer ha…
lucasw Jul 24, 2016
3c51d1e
Passing data as -1.0 to 1.0 float now, so receiving node doesn't have…
lucasw Jul 25, 2016
c87d14f
More visually pleasing but perhaps not super useful fade out of old w…
lucasw Jul 25, 2016
9095bed
Generating a float array signal, now need to get it into gst encoded …
lucasw Jul 29, 2016
094a920
Trying to convert ros messages of floating point arrays to gstreamer …
lucasw Jul 29, 2016
88f69cc
Should be able to trace problem with debug on:
lucasw Jul 29, 2016
4677fbe
Trying out paused, but the next problem seems to be with pads? gstpa…
lucasw Jul 30, 2016
9e7d9a1
Trying out S16LE as the input type, but now back to 0:00:00.276624022…
lucasw Jul 30, 2016
6a0592b
Trying out audio resample, no success yet
lucasw Jul 30, 2016
c8c1653
Finall got it! The whole problem was not specifying the layout as a …
lucasw Jul 30, 2016
4fe1db6
Making messages debug level, also adjusting visualization. Next step…
lucasw Jul 30, 2016
bba66bd
Trying out nes style noise, but the waveform doesn't sound right
lucasw Aug 1, 2016
c642cbd
Wave encoding doesn't work, trying to fix that- if I put too many cap…
lucasw Aug 1, 2016
18b4d29
flac seems to work, though I got a crash once (maybe a different node…
lucasw Aug 2, 2016
a4c7f70
disabling wave. Flac works but the waveform is really different, it …
lucasw Aug 2, 2016
52fe48c
Rename projects for consistency, but maybe float_to_audio should be a…
lucasw Aug 3, 2016
2fc59f5
The launch file doesn't work when I launch it- something is wrong wit…
lucasw Aug 4, 2016
5b7fb73
Forgot to add this headers
lucasw Aug 12, 2018
4584078
Add options to disable parts of the float to audio launch file
lucasw Sep 1, 2018
8bae90d
View a spectrogram of the audio
lucasw Sep 5, 2018
f6a80dc
Bigger nperseg shows more detail in the spectrogram
lucasw Sep 5, 2018
497bba7
flake8 compliance
lucasw May 19, 2023
d61ee03
flake8 compliance for view.py
lucasw May 19, 2023
e0a3be8
flake8 compliance in float_to_audio scripts
lucasw May 19, 2023
9 changes: 6 additions & 3 deletions audio_capture/launch/capture.launch
@@ -1,16 +1,19 @@
 <launch>
 <!-- arecord -l will show available input devices, use the car number as
 the first number and the subdevice number as the second in a string
-like hw:1,0 -->
+like hw:1,0
+run pacmd list-sources to input specs
+(but not all possible ones?)
+-->
 <arg name="dst" default="appsink"/>
 <arg name="device" default=""/>
-<arg name="format" default="mp3"/>
+<arg name="format" default="mp3" doc="can only be mp3 or wave"/>
 <arg name="bitrate" default="128"/>
 <arg name="channels" default="1"/>
 <arg name="depth" default="16"/>
 <arg name="sample_rate" default="16000"/>
 <arg name="sample_format" default="S16LE"/>
-<arg name="ns" default="audio"/>
+<arg name="ns" default="audio" doc="namespace to run node in"/>
 <arg name="audio_topic" default="audio"/>

 <group ns="$(arg ns)">
69 changes: 69 additions & 0 deletions audio_capture/scripts/audio_capture.py
@@ -0,0 +1,69 @@
#!/usr/bin/env python
# https://adnanalamkhan.wordpress.com/2015/03/01/using-gstreamer-1-0-with-python/
import gi
# import rospy

gi.require_version('Gst', '1.0')
# gi.require_version('Gtk', '3.0')
# from gi.repository import Gtk
from gi.repository import GObject
from gi.repository import Gst as gst

GObject.threads_init()
gst.init(None)
# rospy.init_node('audio_capture')

# Create the pipeline for our elements.
pipeline = gst.Pipeline()
# Create the elements for our project.

audio_source = gst.ElementFactory.make('filesrc', 'audio_source')
# audio_source = gst.ElementFactory.make('alsasrc', 'audio_source')
decode = gst.ElementFactory.make('mad', 'decode')
convert = gst.ElementFactory.make('audioconvert', 'convert')
equalizer = gst.ElementFactory.make('equalizer-3bands', 'equalizer')
audio_sink = gst.ElementFactory.make('autoaudiosink', 'audio_sink')

# Ensure all elements were created successfully.
if (not pipeline or not audio_source or not decode or
        not convert or not equalizer or not audio_sink):
    print('Not all elements could be created.')
    exit(-1)

# Configure our elements.
filename = 'blah' # 'Kevin_MacLeod_-_05_-_Impact_Allegretto.mp3'
audio_source.set_property('location', filename)
equalizer.set_property('band1', -24.0)
equalizer.set_property('band2', -24.0)

# Add our elements to the pipeline.
pipeline.add(audio_source)
pipeline.add(decode)
pipeline.add(convert)
pipeline.add(equalizer)
pipeline.add(audio_sink)

# Link our elements together.
audio_source.link(decode)
decode.link(convert)
convert.link(equalizer)
equalizer.link(audio_sink)

# Set our pipeline's state to Playing.
# check the following documentation whenever you get
# some AttributeError.
# link: http://lazka.github.io/pgi-docs/#Gst-1.0/flags.html
pipeline.set_state(gst.State.PLAYING)

# Wait until error or EOS.
bus = pipeline.get_bus()

while True:  # not rospy.is_shutdown():
    # msg = bus.timed_pop_filtered(gst.CLOCK_TIME_NONE, gst.MessageType.ERROR | gst.MessageType.EOS)
    # Poll with a one second timeout (in nanoseconds); None means nothing arrived yet.
    msg = bus.timed_pop_filtered(gst.SECOND, gst.MessageType.ERROR | gst.MessageType.EOS)
    if msg is None:
        continue
    print(msg)
    break

# Free resources.
pipeline.set_state(gst.State.NULL)
16 changes: 16 additions & 0 deletions audio_capture/src/audio_capture.cpp
@@ -127,6 +127,22 @@ namespace audio_transport
gst_bin_add_many( GST_BIN(_pipeline), _source, _filter, _sink, NULL);
link_ok = gst_element_link_many( _source, _filter, _sink, NULL);
}
#if 0
GstCaps *caps;
// caps = gst_caps_new_simple("audio/x-raw-int",
caps = gst_caps_new_simple("audio/x-raw",
"channels", G_TYPE_INT, _channels,
"width", G_TYPE_INT, _depth,
"depth", G_TYPE_INT, _depth,
"rate", G_TYPE_INT, _sample_rate,
"signed", G_TYPE_BOOLEAN, TRUE,
NULL);

g_object_set( G_OBJECT(_sink), "caps", caps, NULL);
gst_caps_unref(caps);
gst_bin_add_many( GST_BIN(_pipeline), _source, _sink, NULL);
link_ok = gst_element_link_many( _source, _sink, NULL);
#endif
} else {
ROS_ERROR_STREAM("format must be \"wave\" or \"mp3\"");
exitOnMainThread(1);
2 changes: 1 addition & 1 deletion audio_play/src/audio_play.cpp
@@ -140,7 +140,7 @@ namespace audio_transport
gst_buffer_unref(buffer);
}

-static void cb_newpad (GstElement *decodebin, GstPad *pad,
+static void cb_newpad (GstElement *decodebin, GstPad *pad,
gpointer data)
{
RosGstPlay *client = reinterpret_cast<RosGstPlay*>(data);
1 change: 1 addition & 0 deletions audio_to_float/.gitignore
@@ -0,0 +1 @@
build
88 changes: 88 additions & 0 deletions audio_to_float/CHANGELOG.rst
@@ -0,0 +1,88 @@
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Changelog for package audio_play
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Forthcoming
-----------
* Changed message level to warning
* Fixed problem that CMake uses gstreamer-0.1 instead of gstreamer-1.0
* Fixed underflow.
Before the sink buffer underflows the pipeline is paused. When data is received again the pipeline is set to playing again.
* Added gstreamer 1.0 dependencies
* Ported to gstreamer 1.0
package.xml dependencies still missing
* Contributors: Benny

0.2.11 (2016-02-16)
-------------------
* Add changelogs
* Contributors: trainman419

0.2.10 (2016-01-21)
-------------------
* Add changelogs
* Contributors: trainman419

0.2.9 (2015-12-02)
------------------
* Add changelogs
* Contributors: trainman419

0.2.8 (2015-10-02)
------------------
* Changed message level to warning
* Fixed underflow.
Before the sink buffer underflows the pipeline is paused. When data is received again the pipeline is set to playing again.
* Change audio sink to autoaudiosink
* Update maintainer email
* Contributors: Benny, Hans Gaiser, trainman419

0.2.7 (2014-07-25)
------------------

0.2.6 (2014-02-26)
------------------
* audio_capture and play _require\_ gstreamer, it's not optional
* Contributors: v4hn

0.2.5 (2014-01-23)
------------------
* "0.2.5"
* Contributors: trainman419

0.2.4 (2013-09-10)
------------------

0.2.3 (2013-07-15)
------------------
* Fix dependencies and install rules.
* Contributors: Austin Hendrix

0.2.2 (2013-04-10)
------------------

0.2.1 (2013-04-08 13:59)
------------------------

0.2.0 (2013-04-08 13:49)
------------------------
* Finish catkinizing audio_common.
* Catkinize audio_play.
* Fix typo in package.xml
* Versions and more URLs.
* Convert manifests to package.xml
* Ditch old makefiles.
* Updates manifest
* Updated manifests for rosdep2
* oneiric build fixes, bump version to 0.1.6
* Removed another duplicate thread::thread
* Added a rosdep.yaml file
* Fixed to use audio_common_msgs
* Added ability to use different festival voices
* Updated documentation
* Update to audio_play
* Fixed ignore files
* Added hgignore files
* Audio_capture and audio_play working
* Making separate audio_capture and audio_play packages
* Contributors: Austin Hendrix, Brian Gerkey, Nate Koenig, nkoenig
35 changes: 35 additions & 0 deletions audio_to_float/CMakeLists.txt
@@ -0,0 +1,35 @@
cmake_minimum_required(VERSION 2.8.3)

project(audio_to_float)

find_package(catkin REQUIRED COMPONENTS cv_bridge roscpp audio_common_msgs)

find_package(PkgConfig)
pkg_check_modules(GST1.0 gstreamer-1.0 REQUIRED)
pkg_check_modules(GSTAPP1.0 gstreamer-app-1.0 REQUIRED)

find_package(Boost REQUIRED COMPONENTS thread)

include_directories(
${catkin_INCLUDE_DIRS}
${Boost_INCLUDE_DIRS}
${GST1.0_INCLUDE_DIRS}
${GSTAPP1.0_INCLUDE_DIRS}
)

catkin_package()

add_executable(audio_to_float src/audio_to_float.cpp)
target_link_libraries(audio_to_float
${catkin_LIBRARIES}
${GST1.0_LIBRARIES}
${GSTAPP1.0_LIBRARIES}
${Boost_LIBRARIES}
)
add_dependencies(audio_to_float ${catkin_EXPORTED_TARGETS})

install(TARGETS audio_to_float
DESTINATION ${CATKIN_PACKAGE_BIN_DESTINATION})

install(DIRECTORY launch
DESTINATION ${CATKIN_PACKAGE_SHARE_DESTINATION})
13 changes: 13 additions & 0 deletions audio_to_float/launch/audio_to_float.launch
@@ -0,0 +1,13 @@
<?xml version="1.0"?>
<launch>
<arg name="ns" default="audio"/>

<include file="$(find audio_capture)/launch/capture.launch">
</include>
<group ns="$(arg ns)">
<node name="audio_to_float" pkg="audio_to_float" type="audio_to_float"
output="screen">
</node>
<node name="view" pkg="audio_to_float" type="view.py" />
</group>
</launch>
22 changes: 22 additions & 0 deletions audio_to_float/launch/spectrogram.launch
@@ -0,0 +1,22 @@
<?xml version="1.0"?>
<launch>

<arg name="float_sample_rate" default="16000"/>
<node name="gen_float" pkg="float_to_audio" type="gen_float.py"
output="screen" >
<param name="sample_rate" value="$(arg float_sample_rate)"/>
</node>

<node name="spectrogram" pkg="audio_to_float" type="spectrogram.py"
output="screen" >
<param name="sample_rate" value="$(arg float_sample_rate)"/>
</node>

<node name="view_input" pkg="audio_to_float" type="view.py">
<remap from="decoded" to="samples"/>
<remap from="image" to="image"/>
<param name="fade1" value="0.5"/>
<param name="fade2" value="0.5"/>
</node>

</launch>
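
The launch file above passes a 16000 Hz sample rate to spectrogram.py, which buffers 2**16 samples and calls signal.spectrogram with nperseg=512. The arithmetic below is a rough sketch of what those numbers imply for the published image. The function name is hypothetical, and the segment count assumes SciPy's default overlap of nperseg // 8 with no padding, which is an assumption rather than something stated in this PR.

```python
def spectrogram_dimensions(sample_rate, buffer_len, nperseg, noverlap=None):
    """Return buffer duration, bin width, and expected spectrogram shape."""
    if noverlap is None:
        # Assumed SciPy default for signal.spectrogram().
        noverlap = nperseg // 8
    step = nperseg - noverlap
    return {
        'buffer_seconds': buffer_len / float(sample_rate),
        'bin_width_hz': sample_rate / float(nperseg),
        # One-sided spectrum of a real-valued signal.
        'num_freq_bins': nperseg // 2 + 1,
        # Whole segments that fit in the buffer without padding.
        'num_segments': (buffer_len - noverlap) // step,
    }


dims = spectrogram_dimensions(sample_rate=16000, buffer_len=2**16, nperseg=512)
print(dims)
```

With these defaults each image covers roughly 4.1 seconds of audio, each row spans 31.25 Hz, and there are 257 rows. A larger nperseg narrows the rows in frequency at the cost of fewer columns in time.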
22 changes: 22 additions & 0 deletions audio_to_float/mainpage.dox
@@ -0,0 +1,22 @@
/**
\mainpage
\htmlinclude manifest.html

\b audio_to_float is a package that converts a stream of AudioData messages into a stream of floating point arrays.


\section codeapi Code API

<!--
Provide links to specific auto-generated API documentation within your
package that is of particular interest to a reader. Doxygen will
document pretty much every part of your code, so do your best here to
point the reader to the actual API.

If your codebase is fairly large or has different sets of APIs, you
should use the doxygen 'group' tag to keep these APIs together. For
example, the roscpp documentation has 'libros' group.
-->


*/
32 changes: 32 additions & 0 deletions audio_to_float/package.xml
@@ -0,0 +1,32 @@
<package>
<name>audio_to_float</name>
<version>0.2.7</version>
<description>
Converts a stream of gstreamer AudioData messages to a stream of floating point arrays.
</description>
<maintainer email="[email protected]">Lucas Walter</maintainer>
<author>Lucas Walter</author>
<license>BSD</license>
<url type="website">http://ros.org/wiki/audio_play</url>
<url type="repository">https://github.com/ros-drivers/audio_common</url>
<url type="bugtracker">https://github.com/ros-drivers/audio_common/issues</url>

<buildtool_depend>catkin</buildtool_depend>

<build_depend>cv_bridge</build_depend>
<build_depend>roscpp</build_depend>
<build_depend>audio_common_msgs</build_depend>
<build_depend>libgstreamer1.0-dev</build_depend>
<build_depend>libgstreamer-plugins-base1.0-dev</build_depend>

<run_depend>cv_bridge</run_depend>
<run_depend>roscpp</run_depend>
<run_depend>audio_common_msgs</run_depend>
<run_depend>libgstreamer1.0-0</run_depend>
<run_depend>libgstreamer-plugins-base1.0-0</run_depend>
<run_depend>gstreamer1.0-plugins-ugly</run_depend>
<run_depend>gstreamer1.0-plugins-good</run_depend>

</package>
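
The conversion named in the package description can be sketched in a few lines of Python. This is an illustration only, not the node's code (that lives in src/audio_to_float.cpp, which is not part of this excerpt). It assumes the decoded bytes are mono, signed 16-bit little-endian (S16LE) samples, as the commit messages suggest, and both function names are hypothetical.

```python
import struct


def s16le_to_floats(data):
    """Convert raw S16LE bytes to a list of floats in [-1.0, 1.0)."""
    count = len(data) // 2  # two bytes per 16-bit sample
    samples = struct.unpack('<%dh' % count, data[:count * 2])
    return [s / 32768.0 for s in samples]


def floats_to_s16le(values):
    """Convert floats to raw S16LE bytes, clamping to the 16-bit range."""
    # Scale by 32767 so that +1.0 maps to the largest positive sample.
    ints = [max(-32768, min(32767, int(round(v * 32767.0)))) for v in values]
    return struct.pack('<%dh' % len(ints), *ints)
```

Clamping on the way back keeps out-of-range floats from wrapping around, which would otherwise be heard as loud clicks.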


52 changes: 52 additions & 0 deletions audio_to_float/scripts/spectrogram.py
@@ -0,0 +1,52 @@
#!/usr/bin/env python

import collections
import numpy as np
import rospy

# from audio_common_msgs.msg import AudioData
from cv_bridge import CvBridge
from scipy import signal
from sensor_msgs.msg import ChannelFloat32, Image


class View():
    def __init__(self):
        self.bridge = CvBridge()
        self.buffer_len = rospy.get_param("~buffer_len", 2**16)
        self.buffer = collections.deque(maxlen=self.buffer_len)
        self.sample_rate = rospy.get_param("~sample_rate", 44100)
        # self.window = 256
        self.im = None
        self.pub = rospy.Publisher("image_spectrogram", Image, queue_size=1)
        self.sub = rospy.Subscriber("samples", ChannelFloat32,
                                    self.audio_callback, queue_size=1)
        self.timer = rospy.Timer(rospy.Duration(0.2), self.update)

    def audio_callback(self, msg):
        # The deque drops the oldest samples once maxlen is reached.
        self.buffer.extend(msg.values)

    def update(self, event):
        if len(self.buffer) < self.buffer_len:
            return
        samples = np.asarray(self.buffer)
        # TODO(lucasw) this is hugely inefficient if it is re-calculating
        # for samples that were processed in previous update.
        f, t, Sxx = signal.spectrogram(samples, self.sample_rate, nperseg=512)
        # TODO(lucasw) is there a standard spectrogram conversion?
        Sxx = np.log(1.0 + Sxx * 2**16)
        mins = np.min(Sxx)
        maxs = np.max(Sxx)
        Sxx -= mins
        print(Sxx.shape, mins, maxs)
        self.im = (Sxx * 50).astype(np.uint8)
        # self.im[y0:y1+1, i, :] = 255
        self.pub.publish(self.bridge.cv2_to_imgmsg(self.im, "mono8"))
        # rospy.signal_shutdown("")


if __name__ == '__main__':
    rospy.init_node('spectrogram')
    view = View()
    rospy.spin()
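
In update(), the fixed `Sxx * 50` scale can exceed 255 for loud input, and the cast to uint8 would then wrap around instead of saturating. One possible alternative, sketched here with a hypothetical helper name and not taken from the PR, rescales the log-compressed values to the full 0..255 range using their own minimum and maximum.

```python
import numpy as np


def to_mono8(sxx):
    """Log-compress a power spectrogram and rescale it to 0..255 (uint8)."""
    # Same compression as np.log(1.0 + sxx * 2**16) in spectrogram.py.
    log_sxx = np.log1p(np.asarray(sxx, dtype=np.float64) * 2**16)
    lo = log_sxx.min()
    span = log_sxx.max() - lo
    if span <= 0.0:
        # Flat input (for example, silence): avoid dividing by zero.
        return np.zeros(log_sxx.shape, dtype=np.uint8)
    scaled = (log_sxx - lo) * (255.0 / span)
    # Round and clip before casting so values cannot wrap around.
    return np.clip(np.rint(scaled), 0, 255).astype(np.uint8)
```

The trade-off is that brightness becomes relative to each buffer, so quiet and loud passages look equally bright; a fixed scale with np.clip would keep absolute levels comparable between frames.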