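I think I see what you're attempting. From your encoder script, I assume each bit is converted to 10 ms of audio in the wave file, followed by a 5 ms 1600 Hz tone that acts as a delimiter. If those durations are fixed, you can simply use scipy and numpy to split the audio into segments and decode each segment.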
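Suppose the encoder script above generates a 105 ms (7 * 15 ms) mono output.wav for the bit string 1001011. Ignoring the delimiting frequency, we should get back a list with the frequency that represents each bit: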
[800, 200, 200, 800, 200, 800, 800]
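The encoder script is not reproduced here, so a comparable signal can be synthesized to test the decoder. The following is only a sketch under assumed parameters: a 48 kHz sample rate, 800 Hz for a 1 bit, 200 Hz for a 0 bit, and each 10 ms bit tone followed by the 5 ms 1600 Hz delimiter. The names `tone`, `encode_bits` and `RATE` are illustrative and not taken from the original encoder.

```python
import numpy as np

RATE = 48000                      # assumed sample rate in Hz
BIT_MS, DELIM_MS = 10, 5          # tone lengths described in the answer
FREQ_ONE, FREQ_ZERO, FREQ_DELIM = 800, 200, 1600  # assumed bit-to-tone mapping


def tone(freq_hz, duration_ms, rate=RATE):
    """Return a sine tone as a float array, using an integer sample count."""
    n = rate * duration_ms // 1000
    t = np.arange(n) / rate
    return np.sin(2 * np.pi * freq_hz * t)


def encode_bits(bits, rate=RATE):
    """Concatenate one bit tone plus one delimiter tone per character in bits."""
    parts = []
    for b in bits:
        parts.append(tone(FREQ_ONE if b == "1" else FREQ_ZERO, BIT_MS, rate))
        parts.append(tone(FREQ_DELIM, DELIM_MS, rate))
    return np.concatenate(parts)


signal = encode_bits("1001011")
duration_ms = 1000 * len(signal) / RATE
print(len(signal), duration_ms)
```

With these assumptions the signal is 7 * 15 ms long. To get a file on disk, the array could be scaled to 16-bit integers and passed to scipy.io.wavfile.write.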
We can read the audio with scipy, then run an FFT over each audio segment with numpy to get that segment's frequency:
from scipy.io import wavfile as wav
import numpy as np
rate, data = wav.read('./output.wav')
# 15ms chunk includes delimiting 5ms 1600hz tone
duration = 0.015
# calculate the length of our chunk in the np.array using sample rate
chunk = int(rate * duration)
# length of delimiting 1600hz tone
offset = int(rate * 0.005)
# number of bits in the audio data to decode
bits = int(len(data) / chunk)
def get_freq(bit):
    # start position of the current bit
    strt = (chunk * bit)
    # remove the delimiting 1600hz tone
    end = (strt + chunk) - offset
    # slice the array for each bit
    sliced = data[strt:end]
    w = np.fft.fft(sliced)
    freqs = np.fft.fftfreq(len(w))
    # Find the peak in the coefficients
    idx = np.argmax(np.abs(w))
    freq = freqs[idx]
    # fftfreq returns cycles per sample; scale by the sample rate to get Hz
    freq_in_hertz = abs(freq * rate)
    return freq_in_hertz
decoded_freqs = [get_freq(bit) for bit in range(bits)]
yields
[800.0, 200.0, 200.0, 800.0, 200.0, 800.0, 800.0]
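The values come out exact because the FFT bin spacing equals the sample rate divided by the slice length, which is 100 Hz for a 10 ms slice, so both 200 Hz and 800 Hz fall on a bin centre. A small sketch of that arithmetic, assuming a 48 kHz rate and a synthetic 800 Hz tone standing in for one bit (both are illustrative choices). It uses np.fft.rfft and passes the sample spacing to np.fft.rfftfreq so the frequencies come back in hertz directly:

```python
import numpy as np

rate = 48000                          # assumed sample rate in Hz
n = rate * 10 // 1000                 # samples in one 10 ms slice
t = np.arange(n) / rate
sliced = np.sin(2 * np.pi * 800 * t)  # synthetic stand-in for one bit tone

# rfft keeps only the non-negative frequencies of a real-valued signal
spectrum = np.abs(np.fft.rfft(sliced))
freqs = np.fft.rfftfreq(n, d=1 / rate)  # bin centres in Hz

bin_spacing = rate / n                # frequency resolution of the slice
peak_hz = freqs[np.argmax(spectrum)]
print(bin_spacing, peak_hz)
```

A tone that does not sit on a bin centre would be reported as the nearest multiple of the bin spacing, which is worth keeping in mind if the encoder frequencies or durations change.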
Converting to bits/bytes:
# 800 Hz decodes to 1, anything else to 0
bitsarr = [1 if freq == 800 else 0 for freq in decoded_freqs]
byte_array = bytearray(bitsarr)
decoded = bytes(byte_array)
print(decoded, type(decoded))
yields
b'\x01\x00\x00\x01\x00\x01\x01' <class 'bytes'>
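The bytes object above stores one bit per byte. If what you want is the value those bits spell out, they can be packed into an integer instead. A hedged sketch, assuming the bits arrive most significant first (the bit order is an assumption here) and using a nearest-frequency comparison rather than exact equality, so a small deviation in the FFT result does not flip a bit. `classify` and `pack_bits` are hypothetical helper names:

```python
def classify(freq_hz, one_hz=800.0, zero_hz=200.0):
    """Map a measured frequency to 1 or 0 by whichever tone is closer."""
    return 1 if abs(freq_hz - one_hz) < abs(freq_hz - zero_hz) else 0


def pack_bits(bits):
    """Pack a most-significant-first list of 0/1 values into an integer."""
    value = 0
    for b in bits:
        value = (value << 1) | b
    return value


# Frequencies as returned by the decoding step
decoded_freqs = [800.0, 200.0, 200.0, 800.0, 200.0, 800.0, 800.0]
bits = [classify(f) for f in decoded_freqs]
value = pack_bits(bits)
print(bits, value, bin(value))
```

For longer messages the same idea applies per group of eight bits, provided the encoder and decoder agree on the bit order.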
For more on deriving the peak frequency, see this question: https://stackoverflow.com/questions/3694918/how-to-extract-frequency-associated-with-fft-values-in-python