This feature would add support for SSML tags in the text editor, allowing users to change voice tone, speed, and similar properties.
It would look something like this:
<pitch=low> Here is some text that is spoken <volume=low> at a low volume </volume=low> and this is text at the normal volume. </pitch=low>
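
The tags above are an illustrative shorthand rather than standard SSML, which sets pitch, volume, and speaking rate through attributes on the `<prosody>` element. As a rough sketch of how an editor might translate the shorthand before sending text to a speech engine, something like the following could work. The function name `shorthand_to_ssml`, the `speed` tag, and the mapping of "low" volume to SSML's "soft" are assumptions made for illustration, not part of any existing editor or API.

```python
import re

# Hypothetical shorthand tag names mapped to the SSML <prosody> attribute they set.
PROSODY_ATTRS = {"pitch": "pitch", "volume": "volume", "speed": "rate"}

# SSML volume labels are silent, x-soft, soft, medium, loud, x-loud, so the
# shorthand values "low" and "high" are mapped to the closest labels (assumption).
VOLUME_ALIASES = {"low": "soft", "high": "loud"}

OPEN_TAG = re.compile(r"<(pitch|volume|speed)=([\w-]+)>")
CLOSE_TAG = re.compile(r"</(pitch|volume|speed)=[\w-]+>")


def shorthand_to_ssml(text: str) -> str:
    """Convert <pitch=low>...</pitch=low> style markup into SSML prosody tags."""

    def open_repl(match: re.Match) -> str:
        name, value = match.group(1), match.group(2)
        if name == "volume":
            value = VOLUME_ALIASES.get(value, value)
        return f'<prosody {PROSODY_ATTRS[name]}="{value}">'

    body = OPEN_TAG.sub(open_repl, text)
    body = CLOSE_TAG.sub("</prosody>", body)
    # This sketch does not XML-escape the text or check that tags are balanced.
    return f"<speak>{body}</speak>"


if __name__ == "__main__":
    sample = "<pitch=low>Low pitch <volume=low>and quiet</volume=low> here.</pitch=low>"
    print(shorthand_to_ssml(sample))
```

A real implementation would also need to escape characters such as `&` and `<` in the spoken text, and reject unbalanced tags, before passing the result to a speech engine.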