Google TTS in Django: Create Audio File in Javascript from base64 String - javascript

I am currently using Google's TTS Python API "synthesize_text" function in one of my Django views.
def synthesize_text(text):
    """Synthesizes speech from the input string of text."""
    from google.cloud import texttospeech
    client = texttospeech.TextToSpeechClient()
    input_text = texttospeech.types.SynthesisInput(text=text)
    # Note: the voice can also be specified by name.
    # Names of voices can be retrieved with client.list_voices().
    voice = texttospeech.types.VoiceSelectionParams(
        language_code='en-US',
        ssml_gender=texttospeech.enums.SsmlVoiceGender.FEMALE)
    audio_config = texttospeech.types.AudioConfig(
        audio_encoding=texttospeech.enums.AudioEncoding.MP3)
    response = client.synthesize_speech(input_text, voice, audio_config)
    # The response's audio_content is binary.
    # Removing this because I do not care about writing the audio file
    # ----------------------------------------------------
    '''
    with open('output.mp3', 'wb') as out:
        out.write(response.audio_content)
        print('Audio content written to file "output.mp3"')
    '''
    # ----------------------------------------------------
    # instead return the encoded audio_content to decode and play in Javascript
    return response.audio_content
def my_view(request):
    test_audio_content = synthesize_text('Test audio.')
    return render(request, 'my_template.html', {'test_audio_content': test_audio_content})
The only change I made to the "synthesize_text" function is that I return the audio_content instead of writing it out to an audio file. This is because I don't care about storing the file, and instead just want to play it in my template using Javascript. Google claims they encode the audio_content in base64: "Cloud Text-to-Speech API allows you to convert words and sentences into base64 encoded audio data of natural human speech. You can then convert the audio data into a playable audio file like an MP3 by decoding the base64 data." So I tried creating and playing the audio file with the following code as suggested here:
<!-- my_template.html -->
<script>
var audio_content = "{{ test_audio_content }}";
var snd = new Audio("data:audio/mp3;base64," + audio_content);
console.log(snd);
snd.play();
</script>
But I get the following error:
Uncaught (in promise) DOMException: Failed to load because no supported source was found.
I logged out the audio_content, and it starts as b'ÿóDÄH.. not sure if that is base64 or not.
Also I tried to decode the audio_content by doing:
var decoded_content = window.atob(audio_content);
And that gave me an error as well, claiming it isn't base64.

From your example:
The response's audio_content is binary
This means that you'll need to encode the result as base64 first before you can use it:
import base64
...
return base64.b64encode(response.audio_content).decode('ascii')
Then this should work with your JS snippet exactly as you intended.
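As a minimal sketch of that step in isolation: the helper name to_audio_data_uri and the sample bytes below are made up for illustration, and only the standard library is used. It also shows why the template received something starting with b'...': rendering a bytes object as text gives its Python repr, not base64.

```python
import base64


def to_audio_data_uri(audio_bytes, mime_type="audio/mp3"):
    """Build a data URI that can be passed to `new Audio(...)` in the browser.

    Hypothetical helper: `audio_bytes` stands in for response.audio_content.
    """
    # b64encode returns bytes; decode to str so the template gets plain text.
    encoded = base64.b64encode(audio_bytes).decode("ascii")
    return "data:%s;base64,%s" % (mime_type, encoded)


# Stand-in for the first bytes of an MP3 payload (not real audio).
fake_audio = b"\xff\xf3D\xc4H"

# Rendering the raw bytes as text yields the repr, which is not valid base64.
raw_as_text = str(fake_audio)

uri = to_audio_data_uri(fake_audio)
```

With this, the view could pass the base64 string (or the whole URI) to the template instead of the raw bytes.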

Related

How to read binary data response from AWS when doing a GET directly to an S3 URI in browser?

Some general context: This is an app that uses the MERN stack, but the question is more specific to AWS S3 data.
I have an S3 bucket set up and I store images and files from my app there. I usually generate signed URLs with the server and do a direct upload from the browser.
Within my app DB I store the object URIs as strings, and an image, for example, I can render with an <img/> tag no problem. So far so good.
However, when they are PDFs and I want to let the user download the PDF I stored in S3, doing an <a href={s3Uri} download> just causes the PDF to be opened in another window/tab instead of prompting the user to download. I believe this is due to the download attribute being dependent on same-origin, and you cannot download a file from an external resource (correct me if I'm wrong, please).
So my next attempt is to do an HTTP fetch of the resource directly using axios. It looks something like this:
axios.create({
baseURL: attachment.fileUrl,
headers: {common: {Authorization: ''}}
})
.get('')
.then(res => {
console.log(res)
console.log(typeof res.data)
console.log(new Buffer.from(res.data).toString())
})
By doing this I am successfully reading the response headers (useful because I can then handle images and files differently), BUT when I try to read the binary data returned I have been unsuccessful at parsing it or even determining how it is encoded. It looks like this:
%PDF-1.3
3 0 obj
<</Type /Page
/Parent 1 0 R
/Resources 2 0 R
/Contents 4 0 R>>
endobj
4 0 obj
<</Filter /FlateDecode /Length 1811>>
stream
x�X�R�=k=E׷�������Na˅��/���� �[�]��.�,��^ �wF0�.��Ie�0�o��ݧO_IoG����p��4�BJI���g��d|��H�$�12(R*oB��:%먺�����:�R�Ф6�Xɔ�[:�[��h�(�MQ���>���;l[[��VN�hK/][�!�mJC
.... and so on
I have another function I use to allow users to download PDFs that i store directly in my database as strings in base64. These are PDF's my app generates and are fairly small so i store them directly in the DB, as opposed to the ones i store in AWS S3 which are user-submitted and can be several MBs in size (the ones in my db are just a few KB)
The function I use to process my base64 pdfs and provide a downloadable link to the users looks like this
export const makePdfUrlFromBase64 = (base64) => {
const binaryImg = atob(base64);
const binaryImgLength = binaryImg.length;
const arrayBuffer = new ArrayBuffer(binaryImgLength);
const uInt8Array = new Uint8Array(arrayBuffer);
for (let i = 0; i < binaryImgLength; i++) {
uInt8Array[i] = binaryImg.charCodeAt(i);
}
const outputBlob = new Blob([uInt8Array], {type: 'application/pdf'});
return URL.createObjectURL(outputBlob)
}
HOWEVER, when i try to apply this function to the data returned from AWS i get this error:
DOMException: Failed to execute 'atob' on 'Window': The string to be decoded contains characters outside of the Latin1 range.
So what kind of binary data encoding do i have here from AWS?
Note: I am able to render an image with this binary data by passing the src in the img tag like this:
<img src={`data:${res.headers['Content-Type']};base64,${res.data}`} />
which is my biggest hint that this is some form of base64?
PLEASE! If anyone has a clue how I can achieve my goal here, I'm all ears! The goal is to be able to prompt the user to download the resource which I have in an S3 URI. I can link to it and they can open it in the browser and then download manually, but I want to force the prompt.
Does anybody know what kind of data is being returned here? Is there any way to parse it as a stream? A buffer?
I have tried to stringify it with JSON and to log it to the console as a string. I'm open to all suggestions at this point.
You're doing all kinds of unneeded conversions. When you do the GET request, you already have the data in the desired format.
const response = await fetch(attachment.fileUrl, {
    headers: {Authorization: ''}
});
const blob = await response.blob();
return URL.createObjectURL(blob);

Send Audio data represent as numpy array from python to Javascript

I have a TTS (text-to-speech) system that produces audio in numpy-array form whose data type is np.float32. This system is running in the backend and I want to transfer the data from the backend to the frontend to be played when a certain event happens.
The obvious solution for this problem is to write the audio data on disk as a wav file and then pass the path to the frontend to be played. This worked fine, but I don't want to do that for administrative reasons. I just want to transfer only the audio data (numpy array) to the frontend.
What I have done till now is the following:
backend
text = "Hello"
wav, sr = tts_model.synthesize(text)
data = {"snd": wav.tolist()}
flask_response = app.response_class(response=flask.json.dumps(data),
                                    status=200,
                                    mimetype='application/json')
# then return flask_response
frontend
// gets wav from backend
let arrayData = new Float32Array(wav);
let blob = new Blob([ arrayData ]);
let url = URL.createObjectURL(blob);
let snd = new Audio(url);
snd.play()
That is what I have done so far, but the JavaScript throws the following error:
Uncaught (in promise) DOMException: Failed to load because no supported source was found.
This is the gist of what I'm trying to do. I'm sorry that you can't reproduce the error, as you don't have the TTS system, so this is an audio file generated by it which you can use to see what I'm doing wrong.
Other things I tried:
Changed the audio datatype to np.int8 and np.int16, to be cast in JavaScript by Int8Array() and Int16Array() respectively.
Tried different types when creating the blob, such as {"type": "application/text;charset=utf-8;"} and {"type": "audio/ogg; codecs=opus;"}.
I have been struggling with this issue for so long, so any help is appreciated!
Convert wav array of values to bytes
Right after synthesis, you can convert the numpy array of wav samples to a bytes object and then encode it with base64.
import base64
import io
from scipy.io.wavfile import write
bytes_wav = bytes()
byte_io = io.BytesIO(bytes_wav)
# write() fills the in-memory buffer with a complete .wav file
write(byte_io, sr, wav)
wav_bytes = byte_io.read()
audio_data = base64.b64encode(wav_bytes).decode('UTF-8')
This can be used directly to create html audio tag as source (with flask):
<audio controls src="data:audio/wav;base64, {{ audio_data }}"></audio>
So all you need is to convert wav and sr into audio_data, which represents the raw .wav file, and pass it as a parameter to render_template in your Flask app (a solution without a separate request).
Or, if you send audio_data in a response, use it in the .js file that receives the response to construct the URL (the same value you would place in the src attribute in HTML):
// get audio_data from response
let snd = new Audio("data:audio/wav;base64, " + audio_data);
snd.play()
because:
Audio(url) Return value:
A new HTMLAudioElement object, configured to be used for playing back the audio from the file specified by url. The new object's preload property is set to auto and its src property is set to the specified URL or null if no URL is given. If a URL is specified, the browser begins to asynchronously load the media resource before returning the new object.
Your sample as is does not work out of the box. (Does not play)
However with:
StarWars3.wav: OK. retrieved from cs.uic.edu
your sample encoded in PCM16 instead of PCM32: OK (check the wav metadata)
Flask
from flask import Flask, render_template, json
import base64
app = Flask(__name__)
with open("sample_16.wav", "rb") as binary_file:
    # Read the whole file at once
    data = binary_file.read()
    wav_file = base64.b64encode(data).decode('UTF-8')
@app.route('/wav')
def hello_world():
    data = {"snd": wav_file}
    res = app.response_class(response=json.dumps(data),
                             status=200,
                             mimetype='application/json')
    return res
@app.route('/')
def stat():
    return render_template('index.html')
if __name__ == '__main__':
    app.run(debug = True)
js
<audio controls></audio>
<script>
;(async _ => {
const res = await fetch('/wav')
let {snd: b64buf} = await res.json()
document.querySelector('audio').src="data:audio/wav;base64, "+b64buf;
})()
</script>
Original Poster Edit
So, what I ended up doing before (using this solution) that solved my problem is to:
First, change the datatype from np.float32 to np.int16:
wav = (wav * np.iinfo(np.int16).max).astype(np.int16)
Write the numpy array into a temporary wav file using scipy.io.wavfile:
from scipy.io import wavfile
wavfile.write(".tmp.wav", sr, wav)
Read the bytes from the tmp file:
# read the bytes
with open(".tmp.wav", "rb") as fin:
    wav = fin.read()
Delete the temporary file
import os
os.remove(".tmp.wav")
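The temporary file in the steps above can probably be avoided by writing into an in-memory buffer. The following is only a sketch under stated assumptions: the function name floats_to_wav_base64 is made up, the input is taken to be mono samples in the range [-1.0, 1.0] (for a numpy array, wav.tolist() would produce such a list), and it uses the standard-library wave module rather than scipy.

```python
import base64
import io
import struct
import wave


def floats_to_wav_base64(samples, sample_rate):
    """Encode mono float samples as a base64 string of a PCM16 .wav file.

    Hypothetical helper; assumes samples are floats in [-1.0, 1.0].
    """
    # Clip out-of-range values, then scale to the signed 16-bit range.
    pcm = [int(max(-1.0, min(1.0, s)) * 32767) for s in samples]

    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav_file:
        wav_file.setnchannels(1)            # mono
        wav_file.setsampwidth(2)            # 2 bytes per sample = 16-bit
        wav_file.setframerate(sample_rate)
        # Pack as little-endian signed 16-bit integers.
        wav_file.writeframes(struct.pack("<%dh" % len(pcm), *pcm))

    # The buffer now holds a complete .wav file (header plus samples).
    return base64.b64encode(buffer.getvalue()).decode("ascii")


audio_data = floats_to_wav_base64([0.0, 0.5, -1.0, 2.0], 22050)
```

The resulting string can be sent in the JSON response and used on the frontend in a "data:audio/wav;base64," URL, as in the answer above.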

How do I save base64 image from html5 canvas as ImageField in Django?

On my form, I have an ImageField as well as a hidden CharField to accept an image as base64. When a user selects an image on the frontend form my JS script on the frontend auto resizes the image as a canvas then sets the generated dataUrl as the content for the CharField. I would like that when the form is submitted I can convert the base64 text to an image and override the saving of the original Image in the ImageField so that the decoded base64 image is saved in its place instead (filename and all).
I have followed several tutorials online which have helped me formulate the code below. I have overwritten the clean() function in my forms.py so that the base64 text is decoded into a PIL Image. Now I would like to set it in place of the actual ImageField's content so that it is saved and handled as if the base64 decoded image was the file sent in POST.
Other answers suggest saving the decoded base64 image manually using a with open() command but that approach opens other headaches. Especially since Django already handles image renaming if an image is submitted with a name that already exists. I would rather not have to remake the wheel for this. Replacing the ImageField file with the generated one just before save through Django's built-in methods seems like the best approach.
Is there any way my desired approach is feasible? I have not found any answers which attempt to handle this scenario without having to manually save the file myself and update the model afterward.
from PIL import Image
from io import BytesIO
import os
import sys
import re
import base64
def clean(self):
    cleaned_data = super().clean()
    dataUrlPattern = re.compile('data:image/(png|jpeg);base64,(.*)$')
    ImageData = cleaned_data.get("image_base64_text")
    ImageData = dataUrlPattern.match(ImageData).group(2)
    ogImg = cleaned_data.get("image")
    # If none or len 0, means illegal image data
    if (ImageData == None or len(ImageData) == 0):
        # PRINT ERROR MESSAGE HERE
        print('corrupt or illegal image data')
    # Decode the 64 bit string into 32 bit
    ImageData = base64.b64decode(ImageData)
    ImageData = Image.open(BytesIO(ImageData))
Define a clean_image method in the form and return an instance of InMemoryUploadedFile from it. P.S. I encourage you to follow PEP 8 style guidelines for variable naming and other code style matters.
from django import forms
from django.core.files.uploadedfile import InMemoryUploadedFile
def clean_image(self):
    # Fields are cleaned in declaration order, so image_base64_text
    # must be declared before image for its value to be available here.
    dataUrlPattern = re.compile('data:image/(png|jpeg);base64,(.*)$')
    ImageData = self.cleaned_data.get("image_base64_text")
    match = dataUrlPattern.match(ImageData or '')
    # No match or an empty payload means illegal image data
    if match is None or len(match.group(2)) == 0:
        raise forms.ValidationError('corrupt or illegal image data')
    # Decode the base64 text into the raw image bytes
    ImageData = base64.b64decode(match.group(2))
    # Let PIL check that the bytes really are an image
    Image.open(BytesIO(ImageData)).verify()
    # 'filename' is a placeholder name, as in the original snippet
    return InMemoryUploadedFile(BytesIO(ImageData),
                                'image', 'filename',
                                'image/' + match.group(1),
                                len(ImageData),
                                None)
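The parsing and decoding step can also be tried on its own, outside Django. This is a rough sketch using only the standard library: the helper name decode_image_data_url is hypothetical, and the pattern is the one from the question. It returns the image subtype and the raw bytes, which is what a file wrapper such as BytesIO would then be built from.

```python
import base64
import binascii
import re

# Same pattern as in the question: captures the subtype and the payload.
DATA_URL_PATTERN = re.compile(r"data:image/(png|jpeg);base64,(.*)$")


def decode_image_data_url(data_url):
    """Return (subtype, raw_bytes) for a canvas data URL, or raise ValueError.

    Hypothetical helper for illustration only.
    """
    match = DATA_URL_PATTERN.match(data_url or "")
    if match is None or not match.group(2):
        raise ValueError("corrupt or illegal image data")
    try:
        # validate=True rejects characters outside the base64 alphabet.
        raw = base64.b64decode(match.group(2), validate=True)
    except binascii.Error as exc:
        raise ValueError("payload is not valid base64") from exc
    return match.group(1), raw


# The 8-byte PNG signature stands in for a real image payload.
png_signature = b"\x89PNG\r\n\x1a\n"
sample_url = "data:image/png;base64," + base64.b64encode(png_signature).decode("ascii")
subtype, raw_bytes = decode_image_data_url(sample_url)
```

Checking the match before calling group() avoids the AttributeError that would otherwise occur when the hidden field does not contain a data URL at all.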

JPEG data obtained from FileReader doesn't match data in file

I'm trying to select a local JPEG file in the web browser via the HTML5 FileReader so I can submit it to a server without reloading the page. All the mechanics are working and I think I'm transferring and saving the exact data that JavaScript gave me, but the result is an invalid JPEG file on the server. Here's the basic code that demonstrates the problem:
<form name="add_photos">
<input type="file" name="photo" id="photo" /><br />
<input type="button" value="Upload" onclick="upload_photo();" />
</form>
<script type="text/javascript">
function upload_photo() {
    file = document.add_photos.photo.files[0];
    if (file) {
        fileReader = new FileReader();
        fileReader.onload = upload_photo_ready;
        fileReader.readAsBinaryString(file);
    }
}
function upload_photo_ready(event) {
    data = event.target.result;
    // alert(data);
    URL = "submit.php";
    ajax = new XMLHttpRequest();
    ajax.open("POST", URL, 1);
    ajax.setRequestHeader("Ajax-Request", "1");
    ajax.send(data);
}
</script>
Then my PHP script does this:
$data = file_get_contents("php://input");
$filename = "test.jpg";
file_put_contents($filename, $data);
$result = imagecreatefromjpeg($filename);
That last line throws a PHP error "test.jpg is not a valid JPEG file." If I download the data back to my Mac and try to open it in Preview, Preview says the file "may be damaged or use a file format that Preview doesn’t recognize."
If I open both the original file on my desktop and the uploaded file on the server in text editors to inspect their contents, they are almost but not quite the same. The original file starts like this:
ˇÿˇ‡JFIFˇ˛;CREATOR: gd-jpeg v1.0 (using IJG JPEG v62), quality = 90
But the uploaded file starts like this:
ÿØÿàJFIFÿþ;CREATOR: gd-jpeg v1.0 (using IJG JPEG v62), quality = 90
Interestingly, if I view the data in a JavaScript alert with the commented-out line above, it looks just like the uploaded file's data, so it seems as if the FileReader isn't giving the correct data at the very beginning, as opposed to a problem that is introduced while transferring or saving the data on the server. Can anyone explain this?
I'm using Safari 6 and I also tried Firefox 14.
UPDATE: I just figured out that if I skip the FileReader code and change ajax.send(data) to ajax.send(file), the image is transferred and saved correctly on the server. So my problem is basically solved, but I'll award the answer points to anyone who can explain why my original approach with readAsBinaryString didn't work.
Your problem lies with readAsBinaryString. This will transfer the binary data byte-for-byte into a string, so that you will send a text string to your PHP file. Now a text string always has an encoding; and when you use XMLHttpRequest to upload a string, by default it will use UTF-8.
So each character, which was originally supposed to represent one byte, will be encoded as UTF-8... which uses multiple bytes for each character with a code point above 127!
Your best bet is to use readAsArrayBuffer instead of readAsBinaryString. This will avoid all the character set conversions (that are necessary when dealing with strings).
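The effect can be reproduced outside the browser. The sketch below uses Python rather than JavaScript, purely as an illustration: decoding as Latin-1 stands in for readAsBinaryString (one character per byte), and encoding as UTF-8 stands in for what happens when the string is sent as text. The Mac Roman line is an assumption about how the text editor displayed the original file, not something stated in the question.

```python
# First four bytes of a typical JPEG file: SOI marker followed by APP0.
jpeg_header = bytes([0xFF, 0xD8, 0xFF, 0xE0])

# One character per byte, with code points 0-255 (like readAsBinaryString).
binary_string = jpeg_header.decode("latin-1")

# Sending the string as text re-encodes every character as UTF-8;
# each code point above 127 becomes two bytes.
uploaded = binary_string.encode("utf-8")

# Assumed explanation for the "original" view in the question: the same
# four bytes shown by an editor that interprets them as Mac Roman text.
original_view = jpeg_header.decode("mac_roman")

print(len(jpeg_header), jpeg_header.hex())
print(len(uploaded), uploaded.hex())
```

The four header bytes turn into eight, so the file on the server no longer starts with the JPEG marker FF D8, which is consistent with the "not a valid JPEG file" error.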

Base64 encode an image in Adobe Air

I am writing an Adobe Air app in HTML/JavaScript and I am trying to base64 encode an image so I can add it to an XML-RPC request. I have tried many methods and nothing seems to work.
I see that ActionScript has a Base64Encoder class that looks like it would work. Is there any way to utilize this in JavaScript?
Thanks @some for the link.
I used the btoa() function to base64 encode image data like this:
var loader = new air.URLLoader();
loader.dataFormat = air.URLLoaderDataFormat.BINARY;
loader.addEventListener(air.Event.COMPLETE,function(e){
var base64image = btoa(loader.data);
});
var req = new air.URLRequest('file://your_path_here');
loader.load(req);
I was trying to upload an image using metaWeblog.newMediaObject, but it turns out that the data doesn't need to be base64 encoded, so the binary value was all that was needed.
