speech synthesis - How to change gender? - javascript

I've created a simple text-to-speech web app that uses window.speechSynthesis, and it's working well. I'd like to add the ability to set certain text in female, and other text in male voice.
It seems that the list returned by getVoices() from the Web Speech API varies from browser to browser, and there doesn't seem to be a "gender" property. Some of the Chrome voice names contain 'Male' and some don't; others are simply female or male first names.
Google Chrome doesn't seem to list a male US speaker. Am I missing something?
If I knew what all the names possibly could be, then I could store the list, and just go through the list until I find one that works. Is there a way to know if a voice has been enabled on that device?
Or, is there a way to download (with my app) a male and a female voice that will work in all browsers?
I need this to work on all devices.

If I were you, I'd create an array of the desired voice names (male and female), then iterate through the getVoices array to find a match. For example:
window.speechSynthesis.onvoiceschanged = () => {
  const maleVoices = [
    'Google US English Male',
    'Microsoft David Desktop - English (United States)',
  ];
  const foundVoice = speechSynthesis.getVoices()
    .find(({ name }) => maleVoices.includes(name));
  console.log('speaking');
  speechSynthesis.cancel(); // sometimes needed due to Chrome's buggy implementation
  const utter = new SpeechSynthesisUtterance('foo bar');
  if (foundVoice) utter.voice = foundVoice;
  else console.log('no voice found, using default');
  speechSynthesis.speak(utter);
};
The voices available are determined by the browser and the user's computer; if the user doesn't have the desired voices, there's no way to download additional ones through JavaScript.
My Google Chrome 66 does have both 'Google US English Male' and 'Microsoft David Desktop - English (United States)'.
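Because voice names differ between browsers and operating systems, the name list above can be paired with a language fallback. The following is only a minimal sketch: `pickVoice` and `speakPreferred` are hypothetical helper names, and since the API exposes no gender property, matching on names remains a heuristic.

```javascript
// Hypothetical helper (not part of the Web Speech API): returns the first
// voice whose name appears in preferredNames (in preference order), else the
// first voice whose lang starts with fallbackLang, else null.
function pickVoice(voices, preferredNames, fallbackLang) {
  for (const name of preferredNames) {
    const match = voices.find((voice) => voice.name === name);
    if (match) return match;
  }
  const prefix = fallbackLang.toLowerCase();
  // Some platforms report "en_US" rather than "en-US", so normalize first.
  const byLang = voices.find((voice) =>
    voice.lang.replace(/_/g, '-').toLowerCase().startsWith(prefix));
  return byLang || null;
}

// Browser-only usage sketch: call this from a click handler once the voices
// have loaded. When nothing matches, the browser default voice is used.
function speakPreferred(text, preferredNames, fallbackLang) {
  const utter = new SpeechSynthesisUtterance(text);
  const voice = pickVoice(speechSynthesis.getVoices(), preferredNames, fallbackLang);
  if (voice) utter.voice = voice;
  speechSynthesis.speak(utter);
}
```

Returning null instead of throwing lets the caller fall through to the default voice, which matches the "no voice found, using default" branch in the answer.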

Related

Edge browser SpeechSynthesis stops working after calling speak() with an empty paragraph when using the Xiaoxiao voice

I'm using the SpeechSynthesis APIs in the Microsoft Edge browser, but something is going wrong.
Here is what I have so far (already reduced to the minimum that reproduces the result).
I have the Chinese language pack installed on Windows, and the Edge browser also has some online voices available. You may need the same environment to make the following snippet work.
const speak = (p, voice) => {
  p.split('\n').forEach(text => {
    const ssu = new SpeechSynthesisUtterance(text);
    ssu.lang = 'zh-CN';
    ssu.voice = voice;
    speechSynthesis.speak(ssu);
    console.log(ssu);
  });
};
let voices = [];
const getVoice = voiceURI => {
  const voice = voices.find(voice => voice.voiceURI === voiceURI);
  return voice;
};
const huihuiURI = "Microsoft Huihui - Chinese (Simplified, PRC)";
const xiaoxiaoURI = "Microsoft Xiaoxiao Online (Natural) - Chinese (Mainland)";
const voicesChanged = () => {
  voices = speechSynthesis.getVoices();
  if (getVoice(huihuiURI) && getVoice(xiaoxiaoURI)) {
    mainarea.hidden = false;
  }
};
speechSynthesis.addEventListener('voiceschanged', voicesChanged);
voicesChanged();
huihui.addEventListener('click', () => {
  speak('第一段\n……\n第二段', getVoice(huihuiURI));
});
xiaoxiao.addEventListener('click', () => {
  speak('第一段\n……\n第二段', getVoice(xiaoxiaoURI));
});
<main id=mainarea hidden>
<button id=huihui>Microsoft Huihui</button>
<button id=xiaoxiao>Microsoft Xiaoxiao Online</button>
</main>
The script first waits until both voices are available, then shows the two buttons. When a button is clicked, it tries to speak the text with the corresponding voice.
When I click the Huihui button, it works correctly. But when I try the Xiaoxiao voice, only the first paragraph is spoken. The Xiaoxiao voice refuses to speak the utterance that contains no words, and it simply stops working instead of skipping it and continuing with the next one. I'm not sure why this happens. (You may need to reload or reopen the page to test the other button.)
In my project, the text to be spoken comes from user input, which is out of my control, so I don't think I can strip the empty paragraphs before sending them to the SpeechSynthesis APIs.
I want to know what's wrong here and how I can fix it, so that I can use the Xiaoxiao voice to speak the whole text.
In case it matters: I'm using Microsoft Edge Version 92.0.902.67 (Official build) (64-bit) on Microsoft Windows [Version 10.0.19043.1151].
I ran some tests and found that the issue happens on some versions of Windows 10. On Windows 10 version 20H2, OS build 19042.630, it works well with both voices. But on Windows 10 version 1909, OS build 18363.1679, I can reproduce the same issue. The Edge version is the same on both machines, 92.0.902.67 (Official build) (64-bit), so I think the issue may be related to the OS build.
In the scenario where the Xiaoxiao voice fails, I observed that it can't speak a paragraph that contains only symbols, such as "......", and that it then stops speaking everything that follows. Based on this, the only workaround I found is to speak the whole article at once instead of paragraph by paragraph.
In the code, that means you don't need to split the text by \n. You can edit the first part of the JS code as shown below, and it can then speak the whole text with the Xiaoxiao voice:
const speak = (p, voice) => {
  const ssu = new SpeechSynthesisUtterance(p);
  ssu.lang = 'zh-CN';
  ssu.voice = voice;
  speechSynthesis.speak(ssu);
  console.log(ssu);
};
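If speaking paragraph by paragraph is still preferred, another option is to drop symbol-only paragraphs before queuing them. This is a sketch under assumptions: `speakableParagraphs` and `speakParagraphs` are hypothetical helpers, and it assumes that skipping paragraphs with no letters or digits is acceptable for the user's text. Whether it avoids the stall on the affected Windows builds has not been verified.

```javascript
// Hypothetical helper: split text into paragraphs and keep only those that
// contain at least one letter or digit in any script. Paragraphs made only
// of punctuation or whitespace (e.g. "……") are dropped.
function speakableParagraphs(text) {
  return text
    .split('\n')
    .map((paragraph) => paragraph.trim())
    .filter((paragraph) => /[\p{L}\p{N}]/u.test(paragraph));
}

// Browser-only usage sketch: queue one utterance per remaining paragraph.
function speakParagraphs(text, voice) {
  for (const paragraph of speakableParagraphs(text)) {
    const ssu = new SpeechSynthesisUtterance(paragraph);
    ssu.lang = 'zh-CN';
    ssu.voice = voice;
    speechSynthesis.speak(ssu);
  }
}
```

The filter relies on Unicode property escapes (`\p{L}`, `\p{N}`), so it treats Chinese characters as speakable without listing scripts explicitly.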

Why Can't I Choose Female "Microsoft Zira" Voice in SpeechSynthesisUtterance()?

The following HTML page speaks the text correctly, but not in the female voice Microsoft Zira Desktop - English (United States). Question: what am I missing here, and how can I make it speak in a female voice?
Remark: I tried this HTML in MS Edge and Google Chrome multiple times, with and without refreshing the page, but it keeps speaking with the same male voice. It seems to ignore the msg.voice value in the JavaScript below. I am using Windows 10, though that probably should not matter.
<!DOCTYPE html>
<html>
<head>
<script>
function myFunction() {
var msg = new SpeechSynthesisUtterance();
msg.voice = speechSynthesis.getVoices().filter(function(voice) {
return voice.name == "Microsoft Zira Desktop - English (United States)"})[0];
msg.text = document.getElementById("testDiv").textContent;
window.speechSynthesis.speak(msg);
}
</script>
</head>
<body>
<h2>JavaScript in Head</h2>
<div id="testDiv">SQL Managed Instance gives you an instance of SQL Server but removes much of the <b>overhead</b> of managing a <u>virtual machine</u>. Most of the features available in SQL Server are available in SQL Managed Instance. This option is ideal for customers who want to use instance-scoped features and want to move to Azure without rearchitecting their applications.</div>
<button type="button" onclick="myFunction()">Try it</button>
</body>
</html>
UPDATE
Per a suggestion from user #Frazer, I ran speechSynthesis.getVoices() in my google chrome console and got the following results - that does contain Microsoft Zira .... voice:
Observation:
Following this advice, I moved the <script> block to the end of the body block (just before </body>), but I still get the same male voice. HOWEVER, when I replaced the voice Microsoft Zira Desktop - English (United States) with Google UK English Female, the following happens: on the first click of the Try it button, the speaker is still the default male, but on every subsequent click I correctly get the Google UK English Female voice. Note: Microsoft Zira Desktop - English (United States) does nothing in the above scenario. This leads me to believe that this technology is still experimental, as mentioned here.
Why Does it Work for Some Browsers?
I have an answer to a similar question here, Why is the voiceschanged event fired on page load?, but I think your situation is sufficiently different to merit a new answer.
First, why does it work sometimes? Because the voice list, including "Microsoft Zira Desktop - English (United States)", is loaded asynchronously, and it may not be available yet by the time the next line executes. Basically, you should wait until onvoiceschanged is called before actually calling getVoices() to get the voices.
To quote the docs...
With Chrome however, you have to wait for the event to fire before populating the list, hence the bottom if statement seen below. (Source: MDN WebDocs: SpeechSynthesis.onvoiceschanged) (Emphasis mine.)
If the list hasn't populated yet, the female voice is not available and the default male voice plays instead.
Because `getVoices()` May Return Before the Voices Load, Treat It as Asynchronous
Try running your code like so...
var msg = new SpeechSynthesisUtterance();
var voices = window.speechSynthesis.getVoices();
window.speechSynthesis.onvoiceschanged = function() {
  voices = window.speechSynthesis.getVoices();
};
function myFunction() {
  console.log(voices);
  msg.voice = voices.filter(function(voice) {
    return voice.name == "Microsoft Zira - English (United States)"})[0];
  console.log(msg.voice);
  msg.text = document.getElementById("testDiv").textContent;
  window.speechSynthesis.speak(msg);
}
P.S. Here is my own coding example of how I handle the voices loading on a text-to-audio reader: GreenGluon CMS: text-audio.js Also here: PronounceThat.com pronounce-that.js
Add the 'disabled' attribute to your button, then try this just before the </body> tag with either Zira or the Chrome voice.
let voices = speechSynthesis.getVoices()
const enableButton = () => {
  if (voices.length) document.querySelector("button").disabled = false
}
// The list may already be populated in some browsers, so check once up front
enableButton()
speechSynthesis.onvoiceschanged = () => {
  voices = speechSynthesis.getVoices()
  enableButton()
}
const myFunction = () => {
  const msg = new SpeechSynthesisUtterance();
  msg.voice = voices.filter(voice => {
    return voice.name === "Microsoft Zira Desktop - English (United States)"
  })[0];
  msg.text = document.getElementById("testDiv").textContent;
  speechSynthesis.speak(msg);
}
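Both answers come down to the same pattern: use the voice list only once it is non-empty. A minimal sketch of that pattern as a reusable function follows; `whenVoicesReady` is a hypothetical name, and the synthesizer is passed in as a parameter only so the logic can be exercised outside a browser.

```javascript
// Hypothetical helper: calls callback(voices) exactly once, either
// immediately (list already populated) or after the first 'voiceschanged'
// event that yields a non-empty list.
function whenVoicesReady(synth, callback) {
  const voices = synth.getVoices();
  if (voices.length > 0) {
    callback(voices);
    return;
  }
  const onChange = () => {
    const loaded = synth.getVoices();
    if (loaded.length === 0) return; // still empty, keep waiting
    synth.removeEventListener('voiceschanged', onChange);
    callback(loaded);
  };
  synth.addEventListener('voiceschanged', onChange);
}

// Browser-only usage sketch: enable the button once voices are available.
if (typeof speechSynthesis !== 'undefined') {
  whenVoicesReady(speechSynthesis, () => {
    const button = document.querySelector('button');
    if (button) button.disabled = false;
  });
}
```

Checking the list before subscribing covers browsers where the voices are ready immediately, and removing the listener prevents the callback from running again on later changes.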

SpeechSynthesis in Android-Chrome: cannot change English voice from US English

I'm using the speech synthesis API on Android Chrome. The issue is that although 4 English voices are available, the browser always uses US English, no matter what the code specifies. I can use other languages, e.g. French, just not other English voices, e.g. en-AU, en-GB, or en-IN.
This code filters British English voice objects from the getVoices array and uses the first one to utter the word 'tomato'. The problem is that the word is always pronounced "to-may-to", not "to-mar-to", which means my text doesn't rhyme as it should.
The voice object that was used is displayed and (on the phones I've tried) is a GB one.
The html...
<!DOCTYPE html>
<html lang="en-GB">
<head>
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>Let's not call the whole thing off</title>
<script src="tomato.js" defer></script>
</head>
<body>
<div id="tomato" lang="en-GB">Tomato</div>
<div id="platform"></div>
</body>
</html>
And the script...
var platform = document.getElementById("platform");
var tomato = document.getElementById("tomato");
var voices = [];
var voicesGB = [];
voices = speechSynthesis.getVoices();
speechSynthesis.onvoiceschanged = function() {
  voices = speechSynthesis.getVoices();
  voicesGB = voices.filter(voice => /en(-|_)GB/.test(voice.lang));
};
function speak(word) {
  var msg = new SpeechSynthesisUtterance();
  msg.default = false;
  msg.text = word;
  msg.voice = voicesGB[0];
  msg.lang = voicesGB[0].lang;
  window.speechSynthesis.speak(msg);
  for (var p in msg.voice) {
    platform.innerHTML += p + ': ' + msg.voice[p] + '<br>\n';
  }
}
tomato.addEventListener("click",() => {speak('tomato');});
And a jsbin: https://jsbin.com/xefukemuga/edit?html,js,output
Run this in Android Chrome and tap the word 'tomato'.
I have searched all over and tried various fixes. How do you control what voice Android-Chrome uses?
The only way to work around this issue on Android version 5.0.2 is to change the default voice in the Android settings and then restart the device. That will let you use the voice you want, but the other English ones will then be unavailable. Here is some more detail:
SpeechSynthesis.getVoices() will return several options for English
(United States, Australia, Nigeria, India, and United Kingdom) but
only one is available at a time. You can pick which one by going to
the Settings app, then Controls->Language and input->Text-to-speech
options. Select the gear icon next to Google Text-to-speech Engine,
then under Language you can update the exact locale you want to use.
If you select "Install voice data" you can even select from a sample
of different voices for some locales. You need to restart the device
after changing this setting for it to take effect.
The voice used on an Android device when you play a
SpeechSynthesisUtterance will depend on what you have selected in the
Android settings. You can choose which language you want to play from
javascript (see below for details) but you have no control over the
locale or exact voice used.
This problem occurs on Chrome and Firefox, so it is likely a problem
with the Android platform's implementation of the speechSynthesis API.
It's unlikely that a browser update will fix this, but different
versions of Android might. (My test device is on Android 5.0.2, so if
this is fixed in a future update, please let me know).
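On the JavaScript side, the most that can be done is to request the locale cleanly and avoid failing when no matching voice is listed. The sketch below uses hypothetical helpers (`normalizeLang`, `voicesForLocale`, `configureUtterance`) and assumes, as the regex in the question suggests, that language tags may arrive with either a hyphen or an underscore. Per the answer above, Android may still ignore the requested voice.

```javascript
// Normalize a language tag so "en_GB" and "en-GB" compare equal.
function normalizeLang(tag) {
  return tag.replace(/_/g, '-').toLowerCase();
}

// Return all voices whose lang matches the requested locale exactly.
function voicesForLocale(voices, locale) {
  const wanted = normalizeLang(locale);
  return voices.filter((voice) => normalizeLang(voice.lang) === wanted);
}

// Set voice and lang on an utterance-like object. When no voice matches,
// only lang is set, and the engine chooses the voice.
function configureUtterance(utterance, voices, locale) {
  const [voice] = voicesForLocale(voices, locale);
  if (voice) utterance.voice = voice;
  utterance.lang = locale;
  return utterance;
}
```

In a browser this would be called as `configureUtterance(new SpeechSynthesisUtterance('tomato'), speechSynthesis.getVoices(), 'en-GB')` before passing the result to `speechSynthesis.speak()`.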

Polish TextToSpeech in Xamarin - Android

I am writing an application for Android with Xamarin and I need to implement text-to-speech in Polish language.
My first step was, of course, to google it and I've found out that text-to-speech is implemented already in Xamarin (link to developer.xamarin.com).
Unfortunately, not in Polish (there is a way to change the language, but I wasn't able to change it to Polish). Is there a way to do this?
I've found a nice website with text-to-speech in many languages and free non-commercial api: https://responsivevoice.org/api/
But it works in JS, and I don't know how to use JS in a Xamarin Android app. Is there a way to do this?
There are some other free text-to-speech APIs, but they don't sound great (maybe 3/10), and since this is my engineering project, I don't want to use something that weak.
Android supports Polish and a couple of dozen other languages. In the OnInit method (from TextToSpeech.IOnInitListener), you can review all the available languages and set the one you want:
public void OnInit([GeneratedEnum] OperationResult status)
{
    if (status.Equals(OperationResult.Success))
    {
        foreach (var locale in speaker.AvailableLanguages)
        {
            Log.Debug(TAG, locale.Language); // review all the languages available
            if (locale.Language == "pl")
                speaker.SetLanguage(locale);
        }
        speaker.Speak("jak się masz?", QueueMode.Flush, null, null);
    }
    else
        Log.Error(TAG, status.ToString());
}
iOS also supports a couple of dozen languages, including Polish (pl-PL). You can review all the supported languages via AVSpeechSynthesisVoice.GetSpeechVoices() and assign one via AVSpeechSynthesisVoice.FromLanguage to the AVSpeechUtterance.Voice property:
foreach (var voice in AVSpeechSynthesisVoice.GetSpeechVoices())
{
    Console.WriteLine(voice.Language); // review all the languages available
}
var speechSynthesizer = new AVSpeechSynthesizer();
var speechUtterance = new AVSpeechUtterance("jak się masz?")
{
    Voice = AVSpeechSynthesisVoice.FromLanguage("pl-PL"),
    Volume = 1.0f,
    PitchMultiplier = 1.0f
};
speechSynthesizer.SpeakUtterance(speechUtterance); // actually speak the utterance

Smart card selection for digital signature

I am maintaining a VB6 Windows application which digitally signs PDF documents by launching a JS file located in the Javascripts subfolder of Acrobat 9.0. Now my customer wants to plug a second smart card reader into the PC that hosts the application, with its own smart card containing the certificates of a second person who will sign certain types of documents.
My question is: how can I programmatically choose, from my JavaScript code, the smart card reader I want?
In my JavaScript code I do the following:
//Initialize the signature handler
var myEngine = security.getHandler("Adobe.PPKLite");
//Obtain the available certificates
var ids = myEngine.digitalIDs;
var myCerts = ids.certs;
//Find the certificate I want to use to sign
for(var j=0; j<myCerts.length; j++)
{
    if(myCerts[j].subjectCN == "SMITH JOHN")
    {
        oCert = myCerts[j];
        break;
    }
}
//Log to the signature engine by passing the certificate I want to use
//and the slot where the corresponding smart card reader is plugged
myEngine.login( { oParams: { cDIPath: ACROSDK.sigDigitalIDPath,
                             cPassword: ACROSDK.sigUserPwd,
                             iSlotID: 1,
                             oEndUserSignCert: oCert
                           }
                } );
//Digitally sign the document with the certificate I chose
sigField.signatureSign({oSig: myEngine,
                        bUI: false,
                        oInfo: { password: ACROSDK.sigUserPwd,
                                 location: ACROSDK.sigLocation,
                                 reason: ACROSDK.sigReason,
                                 contactInfo: ACROSDK.sigContactInfo,
                                 appearance: "FirmaRPPR"
                               }
                       });
Why do I receive a General Error when executing signatureSign? Which is the correct way to assign the iSlotID parameter when logging to the signature engine or, alternatively, the cTokenLabel parameter?
Thanks in advance for your help and suggestions!
Mind you, I have no experience with Acrobat scripting, but in PKCS#11 the slot ID refers to the ID of the smart card reader connected to the computer, and the token label is the label assigned to the smart card in that slot/reader, which can vary from one PKCS#11 implementation to another.
The easiest way to find the label of the PKCS#11 token is to configure the PKCS#11 DLL you're using as a security device in the Firefox browser and look at the label field in the configuration. But that would just get you going in the right direction.
You can write a short C program against PKCS#11 and use C_GetSlotList and C_GetSlotInfo to find the slot IDs and token labels; here is an example of that. It should not be a problem to port that code over to VB. There is also NCryptoki, which you can use to interface with the PKCS#11 DLL.
