TEXT TO SPEECH SYNTHESIZER FOR TAMIL LANGUAGE
Dr. S. Veera Alagiri
Dept. of CEN, Amrita University, Coimbatore
Email: alagiri.bagath@gmail.com
1 Introduction
Incorporating human faculties such as speech and vision into machines is a fundamental goal of artificial intelligence research. The capability of a computer to generate speech output is termed speech synthesis; it requires an in-depth understanding of speech production and perception. There is widespread interest in improving the human interface to the computer: people no longer want to sit and type in data or read it from a monitor, which is painstaking work that strains the eyes. Speech synthesis is therefore becoming one of the most important steps towards improving the human interface to the computer. The main aim of this paper is to discuss the development of a text-to-speech (TTS) synthesis system for Tamil.
The implementation of this TTS system is shown in Figure 1. Our system uses the concatenation method, an approach adopted by many popular speech engines available today. It will be very useful to visually impaired people and to those who want digital books read aloud. Nowadays many books are being digitized, and many foreign books are translated into Tamil and then digitized. This system will also help people who cannot read Tamil and people who do not want to spend time reading.
The field of Tamil TTS has remained largely untouched for a long time for various reasons; a few of them are listed below.
1. The complexity of the Tamil language.
2. The problems posed by Tamil grammar.
3. Limited knowledge of pure Tamil.
4. The large gap between spoken Tamil, which is full of slang, and the pure Tamil that is written.
Even though some attempts have been made to develop a TTS system for Tamil, the required naturalness has not yet been achieved. The system described here aims to reach that level.
Figure 1: Block Diagram of TTS System
2 Text Processing
The entered Tamil text cannot be given directly to the unit selection process; the raw text must first be processed before being passed on. The text-to-speech system needs its input as syllables, but the system does not know where to split a given word into a sequence of syllables, so for this syllabification we must tell the machine where to break each word or sentence.
For example, மார்ச் 31 needs to be pronounced மார்ச் முப்பத்தி ஒன்று, not மார்ச் மூன்று ஒன்று, and ரூ.1 should be expanded to ரூபாய் ஒன்று, not pronounced as ரூ ஒன்று. The second step in text normalization is normalizing non-standard words: tokens such as numbers or abbreviations, which need to be expanded into sequences of Tamil words before they can be pronounced.
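A minimal sketch of this expansion step. The digit and abbreviation tables below are illustrative stand-ins, not the paper's actual data:

```python
# Hypothetical sketch of non-standard-word expansion for Tamil TTS.
TAMIL_DIGITS = {
    "1": "ஒன்று", "2": "இரண்டு", "3": "மூன்று", "4": "நான்கு",
    "5": "ஐந்து", "6": "ஆறு", "7": "ஏழு", "8": "எட்டு", "9": "ஒன்பது",
}
ABBREVIATIONS = {"ரூ.": "ரூபாய்"}  # illustrative abbreviation table

def expand_token(token: str) -> str:
    """Expand one non-standard token into pronounceable Tamil words."""
    for abbr, full in ABBREVIATIONS.items():
        if token.startswith(abbr):
            rest = token[len(abbr):]
            return (full + " " + expand_token(rest)).strip()
    if token in TAMIL_DIGITS:
        return TAMIL_DIGITS[token]
    return token

print(expand_token("ரூ.1"))  # ரூபாய் ஒன்று
```

A real system would need many more entries and context rules, but the lookup-then-expand structure is the same.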
The TTS system comprises five fundamental components:
1. Text Analysis and Detection
2. Text Normalization and Linearization
3. Phonetic Analysis
4. Prosodic Modeling and Intonation
5. Acoustic Processing
The input text is passed through these phases to obtain the speech.
Input Text
↓
Text Normalization & Text Linearization
↓
Text Analysis & Text Detection
↓
Phonetic Analysis
↓
Prosodic Modeling & Intonation
↓
Acoustic Processing
↓
Speech as output

Figure 2: System Overview of TTS
2.1 Text Analysis and Detection
The text analysis part is a preprocessing stage which analyses the input text and organizes it into a manageable list of words. It identifies numbers, abbreviations, acronyms and idiomatic expressions and transforms them into full text when needed. An important problem is encountered at the character level: punctuation ambiguity (sentence-end detection). It can be solved, to some extent, with elementary regular grammars.
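A small sketch of such a regular-grammar approach to sentence-end detection, assuming a hypothetical list of known abbreviations whose internal full stops must not end a sentence:

```python
import re

# Illustrative abbreviation list; a real system would load this from a database.
KNOWN_ABBREVIATIONS = {"தா.நா.", "ரூ."}

def split_sentences(text: str) -> list:
    """Split at full stops that are not part of a known abbreviation."""
    # Shield the dots inside abbreviations with a sentinel character.
    for abbr in KNOWN_ABBREVIATIONS:
        text = text.replace(abbr, abbr.replace(".", "\u0000"))
    parts = [p.strip() for p in re.split(r"(?<=\.)\s+", text) if p.strip()]
    # Restore the shielded dots.
    return [p.replace("\u0000", ".") for p in parts]

print(split_sentences("மகா தா.நா. சென்றாள். மகா வீட்டுக்குச் சென்றாள்."))
```

The lookbehind `(?<=\.)` keeps the full stop attached to its sentence while splitting on the following whitespace.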
Text detection is localizing the text areas in any kind of printed document. Most previous research concentrated on extracting text from video; we aim at developing a technique that works for all kinds of documents, such as newspapers and books.
2.2 Text Normalization and Linearization
Text normalization is the transformation of text into a pronounceable form. It is often performed before the text is processed in some other way, such as generating synthesized speech or automated language translation. The main objective of this process is to identify punctuation marks and pauses between words. Usually, normalization converts all letters to lowercase or uppercase and removes punctuation, accent marks, stopwords ("too common" words) and other diacritics from letters.
Text normalization is useful, for example, for comparing two sequences of characters which are represented differently but mean the same.
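A minimal sketch of this comparison-oriented normalization, assuming an illustrative punctuation set:

```python
import unicodedata

PUNCTUATION = set(":,;'\"`$!?()")  # illustrative set of marks to strip

def normalize(text: str) -> str:
    """Reduce text to a canonical pronounceable/comparable form."""
    text = unicodedata.normalize("NFC", text)              # unify Unicode encodings
    text = "".join(ch for ch in text if ch not in PUNCTUATION)
    return " ".join(text.lower().split())                  # lowercase, collapse spaces

# Two differently written strings compare equal after normalization.
assert normalize("Hello,  World!") == normalize("hello world")
```

The NFC step matters for Tamil as well, since the same visible letter can be encoded as different codepoint sequences.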
2.3 Phonetic Analysis
Phonetic analysis converts the orthographic symbols into phonological ones using a phonetic alphabet. This is known as "grapheme-to-phoneme" conversion. A phone is the smallest sound unit, a sound with a definite shape as a sound wave; a collection of phones that constitutes a minimal distinctive phonetic unit is called a phoneme. There are two approaches to deriving the pronunciation of a word from its spelling:
(a) the dictionary-based approach, and
(b) the rule-based approach.
In the dictionary-based approach, a dictionary stores all words with their correct pronunciation, and producing a pronunciation is a matter of looking up each word. This approach is quick and accurate, and the pronunciation quality is better, but its major drawback is that it needs a large database to store all the words, and the system fails when a word is not found in the dictionary.
In the rule-based approach, the letter sounds of a word are blended together to form a pronunciation according to rules. The main advantage is that it requires no database and works on any type of input; correspondingly, the complexity grows for irregular inputs.
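The two approaches are commonly combined: dictionary lookup first, rule-based blending as a fallback. A sketch with illustrative (not real) dictionary and rule entries:

```python
PRON_DICT = {"tamil": "th a m i zh"}   # hypothetical pronunciation dictionary
LETTER_RULES = {"t": "t", "a": "a", "m": "m", "i": "i", "l": "l"}  # letter-to-sound rules

def pronounce(word: str) -> str:
    """Dictionary lookup with rule-based fallback."""
    if word in PRON_DICT:              # fast and accurate, but needs a big database
        return PRON_DICT[word]
    # Rule-based fallback: blend per-letter sounds; degrades on irregular words.
    return " ".join(LETTER_RULES.get(ch, ch) for ch in word)

assert pronounce("tamil") == "th a m i zh"   # found in dictionary
assert pronounce("mat") == "m a t"           # built by rules
```

This hybrid avoids the hard failure of a pure dictionary system on out-of-vocabulary words.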
2.4 Prosodic Modeling and Intonation
Prosody is the combination of stress pattern, rhythm and intonation in speech. Prosodic modeling describes the speaker's emotion; recent investigations suggest that identifying the vocal features which signal emotional content may help to create very natural synthesized speech.
Intonation is the variation of pitch while speaking. All languages use pitch as intonation, for instance to express happiness or to raise a question. Modelling intonation is an important task that affects the intelligibility and naturalness of the speech; to achieve high-quality text-to-speech conversion, a good model of intonation is needed.
Generally, intonations are distinguished as follows:
(i) Rising intonation (the pitch of the voice increases)
(ii) Falling intonation (the pitch of the voice decreases)
(iii) Dipping intonation (the pitch falls and then rises)
(iv) Peaking intonation (the pitch rises and then falls)
2.5 Acoustic Processing
The speech is spoken according to the voice characteristics of a person. Three types of acoustic synthesis are available:
(i) Concatenative synthesis
(ii) Formant synthesis
(iii) Articulatory synthesis
Concatenative synthesis joins prerecorded human voice; it needs a database holding all the prerecorded units. Natural-sounding speech is its main advantage, and its main drawback is building and using a large database.
Formant-synthesized speech can be consistently intelligible and needs no database of speech samples, but the speech sounds artificial and robotic. The speech organs are called articulators; articulatory synthesis develops techniques for synthesizing speech based on models of the human vocal tract. It produces a completely synthetic output, typically based on mathematical models.
2.6 Preprocessing
Before the syllabification process, some preprocessing must be done, namely tokenization. The first task in text normalization is sentence tokenization. In order to segment a given Tamil paragraph into separate utterances for synthesis, we need to know that the first sentence ends at the sentence-final full stop, not at the period inside தா.நா. It is fairly easy to tokenize when a sentence ends with a full stop, as most sentences do, but there are other cases where the period occurs inside an abbreviation, or where a sentence ends with a semicolon or some other punctuation, as in the previous case. This problem can be solved by expanding the abbreviation and removing the unwanted punctuation. For example:
• மகா வீட்டுக்குச் சென்றாள்.
There is no problem tokenizing this sentence, because it can be tokenized by reference to the full stop. But in a sentence like
• மகா தா.நா. சென்றாள்.
an abbreviation appears, and its full stops come in the middle of the sentence. This problem is solved by expanding the abbreviation so that no full stop remains inside the sentence, making it as easy to tokenize as the previous case. Not all Tamil abbreviations can be expanded; it is difficult to add every abbreviation to the database, so only the most frequently used ones are stored in a separate database. When an abbreviation appears in the text, the system searches the database; if it is present, the system replaces the text, and if not, the original text is left as it is. For example:
• தா.நா. → தமிழ் நாடு
• ரூ. → ரூபாய்
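The abbreviation-expansion step above can be sketched as a simple database lookup; the table here holds only the two examples from the text:

```python
# Frequently used abbreviations and their expansions (examples from the text).
ABBREV_DB = {"தா.நா.": "தமிழ் நாடு", "ரூ.": "ரூபாய்"}

def expand_abbreviations(sentence: str) -> str:
    """Replace known abbreviations; unknown text is left as it is."""
    for abbr, expansion in ABBREV_DB.items():
        sentence = sentence.replace(abbr, expansion)
    return sentence

print(expand_abbreviations("மகா தா.நா. சென்றாள்."))  # மகா தமிழ் நாடு சென்றாள்.
```

After expansion, the only remaining full stop is the sentence-final one, so tokenization becomes straightforward.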
Then unwanted punctuation such as (: , ; ' ` $) is removed from the given Tamil paragraph, to avoid confusion and to prevent any disturbance of the naturalness of the speech. Every token in the input must be assigned a sound file for concatenation, so keeping this punctuation and assigning speech files to it would lead to unnaturalness in the final output, since the input text contains many punctuation marks. The algorithm does all of this and assigns the @ symbol to the spaces in the input text. If a passage contains extra spaces and punctuation, the punctuation is turned into blank space and runs of spaces are collapsed into a single space. Finally, every space is replaced by the @ symbol, which is assigned a short stretch of silence in MATLAB. This helps in further processing.
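A sketch of this cleanup step, using an illustrative set of unwanted marks:

```python
import re

UNWANTED = ":,;'\"`$"   # illustrative unwanted punctuation

def mark_silences(text: str) -> str:
    """Strip punctuation, collapse space runs, mark each space with @ (silence)."""
    text = "".join(" " if ch in UNWANTED else ch for ch in text)  # punctuation -> space
    text = re.sub(r"\s+", " ", text).strip()                      # many gaps -> one space
    return text.replace(" ", "@")                                 # @ = short silence

assert mark_silences("a ,  b ; c") == "a@b@c"
```

Downstream, the @ markers map to a fixed short silence rather than to a recorded unit.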
Number converter
A number is pronounced differently in different situations. The second step in text normalization is normalizing non-standard words: tokens like numbers or abbreviations, which need to be expanded into sequences of Tamil words before they can be pronounced. What makes these non-standard words difficult is that they are often very ambiguous. For example, the number 1983 can be spoken in at least three different ways, depending on the context:
• பத்தொண்பது என்பத்து மூன்று
• ஒன்று ஒன்பது எட்டு மூன்று
• ஆயிரத்தி தொல்லாயிரத்தி என்பத்து மூன்று
This problem can be handled by choosing a single methodology for all cases. A number system for Tamil has already been developed in Python and gives one hundred percent accuracy, but incorporating it makes the TTS system somewhat slow. So for the time being the number system has been left out of the current TTS system; it will come under future work. The present algorithm removes all numbers from the text, leaving normal text without numbers or extra punctuation.
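One of the three readings above, the digit-by-digit one, can be sketched directly; the other readings (year, quantity) would need contextual rules:

```python
TAMIL_DIGITS = ["பூஜ்ஜியம்", "ஒன்று", "இரண்டு", "மூன்று", "நான்கு",
                "ஐந்து", "ஆறு", "ஏழு", "எட்டு", "ஒன்பது"]

def read_digit_by_digit(number: str) -> str:
    """Read a numeral the way a phone number is spoken: one digit at a time."""
    return " ".join(TAMIL_DIGITS[int(d)] for d in number)

print(read_digit_by_digit("1983"))  # ஒன்று ஒன்பது எட்டு மூன்று
```

Choosing between this reading and the year or quantity readings is exactly the context-dependence that makes non-standard words hard.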
Syllabification
Syllabification can be done in many ways. For example, one syllabification algorithm breaks a word so that there is a minimum number of breaks, since a minimum number of joins produces fewer artifacts. The algorithm dynamically looks for the polysyllabic units making up the word, cross-checks the database for the availability of those units, and then breaks the word accordingly. If polysyllabic units are not available, the algorithm naturally picks smaller units. This means that if the database is populated with all available phones of the language along with syllable units, the algorithm falls back on phones when bigger units are not available.
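This longest-unit-first strategy with fallback to smaller units can be sketched as follows; the unit database here is illustrative:

```python
# Illustrative unit database: syllables plus one polysyllabic unit and a phone.
UNIT_DB = {"kA", "ma", "rA", "ja", "r", "kAma"}

def syllabify(word: str, units=UNIT_DB) -> list:
    """Greedy longest-match against the unit database, falling back to smaller units."""
    out, i = [], 0
    while i < len(word):
        for j in range(len(word), i, -1):     # try the longest candidate first
            if word[i:j] in units:
                out.append(word[i:j])
                i = j
                break
        else:
            out.append(word[i]); i += 1       # no unit found: emit single character
    return out

print(syllabify("kAmarAjar"))  # ['kAma', 'rA', 'ja', 'r']
```

With "kAma" in the database the algorithm prefers it over "kA" + "ma", minimizing the number of joins.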
In this system, however, we do not follow that methodology; instead we use a mapping file which contains each Tamil letter and its corresponding Romanized letter. For each input Tamil letter, the corresponding Romanized letters are taken and formed into a syllable. The important point about this mapping file is that the letters must be arranged by the length of the Tamil letter; if they are not arranged in the required order, there will be a system error and we will not get the output.
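The length ordering can be enforced in code by sorting the mapping keys longest first before matching; the mapping entries here are a small illustrative subset:

```python
# Illustrative subset of the Tamil-to-Roman mapping file.
MAPPING = {"கா": "kA", "க": "ka", "ம": "ma", "ரா": "rA", "ஜ": "ja", "ர்": "r"}

def romanize(word: str) -> list:
    """Match compound letters before their shorter prefixes (longest first)."""
    keys = sorted(MAPPING, key=len, reverse=True)
    out, i = [], 0
    while i < len(word):
        for k in keys:
            if word.startswith(k, i):
                out.append(MAPPING[k]); i += len(k)
                break
        else:
            i += 1    # skip an unmapped character rather than failing

    return out

print(romanize("காமராஜர்"))  # ['kA', 'ma', 'rA', 'ja', 'r']
```

Sorting by length means the two-codepoint letter கா is matched before the bare க, which is exactly the ordering requirement described above.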
For example, if the input is a Tamil word containing a base letter such as ம, there is no problem as long as its compound forms are listed first; but if the base letter ம is listed before its compound forms, the algorithm will match ம alone, leaving the remaining vowel sign as an unrecognized character. The system then cannot process that character for unit selection, which is why the arrangement of the letters is compulsory.
Input Text: காமராஜர்
↓
Mapping File
↓
kA \ ma \ rA \ ja \ r

Figure 3: Syllabification of the Tamil word kamarajar.
3 Grammatical Rules
After text processing we get the processed Tamil text, but it cannot be given directly to the conversion process, which converts the Tamil text into Romanized text similar to English. The computer cannot process the Tamil data as such, so this conversion is compulsory. In Tamil, the same letter is pronounced differently in different places: word-level ambiguity of the English kind is not possible in Tamil, but letter-level ambiguity exists.
Ambiguity-wise, building a Tamil system is somewhat easier than an English TTS. Tamil has too few letters to cover the pronunciation of all Tamil words, so for some letters one or two different pronunciations are possible. This problem can be handled by writing rules: the grammar of the Tamil language gives it structure, and using that grammar we can predict which letters have multiple possible pronunciations and in which positions. This work is done in the grammatical block.
In Tamil, the same letter in text form has different pronunciations depending on where it occurs. This can be solved in many ways, either by writing rules or by a machine learning technique such as an SVM or HMM. In this project, phonetic rules of the Tamil language are used instead of a machine learning approach, as a lot of previous work has already been done with the latter. Tamil is somewhat weak in the number of alphabets needed to cover all letter pronunciations, whereas other south Indian languages such as Malayalam and Kannada have many alphabets to fill this gap in pronunciation change.
Making a TTS system for Malayalam is therefore easy compared to Tamil, because for Tamil we need to know the complete grammatical and phonetic structure of the language. English has many ambiguous words; this is not the case in Tamil, where the only problem is that the same letter is pronounced differently in different places. For example, in English:
• Do you live (/l ih v/) near a zoo with live (/l ay v/) animals?
• I prefer bass (/b ae s/) fishing to playing the bass (/b ey s/) guitar.
From these examples we can see that ambiguous words are common in English sentences. This does not happen in Tamil:
• காக்கா நிறம் கருப்பு
• காகம் கூட்டில் இருக்கிறது.
In the first sentence க is pronounced as 'ka', but in the second sentence the same க is pronounced as 'gha'. Ambiguous letters of this kind are common in the Tamil language.
• பாப்பா இங்கே வா.
• பாபா படம் கண்டேன்.
In the first sentence பா is pronounced as 'pA', but in the second sentence the same பா is pronounced as 'bhA'. Problems of this kind can be handled by acquiring complete knowledge of Tamil phonetic grammar. Some of the rules are shown below.
• அவன் ஆசை காட்டினான்
• அவளுக்கு பச்சை வண்ணம் பிடிக்கும்.
In the first sentence சை is pronounced as 'sai', but in the second sentence the same letter is pronounced as 'chai'. Rules can handle this: letters of the 'sa' series (sai, sa, sA, si, etc.) are normally pronounced as written, but when the same 'sa' terms come after 's' they are pronounced in the 'cha' series (chai, cha, chA, etc.).
• அழகிரி பாடம் படித்தான்.
• மணி படம் பார்த்தான்.
In the first sentence ட is pronounced as 'da', but in the second sentence the same ட is pronounced as 'ta'. Rules can handle this as well: letters of the 'da' series (dai, da, dA, di, etc.) are normally pronounced as written, but when the same 'da' terms come after 't' they are pronounced in the 'ta' series (tai, ta, tA, etc.).
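The two context rules above share the same shape: a syllable shifts series when it follows a trigger consonant. A sketch, using illustrative romanized syllable spellings:

```python
# 'sa'-series after 's' -> 'cha' series; 'da'-series after 't' -> 'ta' series.
SHIFT = {"sa": "cha", "sA": "chA", "sai": "chai",
         "da": "ta", "dA": "tA", "dai": "tai"}
TRIGGER = {"sa": "s", "sA": "s", "sai": "s", "da": "t", "dA": "t", "dai": "t"}

def apply_rules(syllables: list) -> list:
    """Shift a syllable's series when the previous unit ends in its trigger."""
    out = []
    for syl in syllables:
        prev = out[-1] if out else ""
        if syl in SHIFT and prev.endswith(TRIGGER[syl]):
            out.append(SHIFT[syl])   # geminate context: shift the series
        else:
            out.append(syl)
    return out

print(apply_rules(["pa", "s", "sai"]))  # ['pa', 's', 'chai']  (பச்சை)
print(apply_rules(["A", "sai"]))        # ['A', 'sai']          (ஆசை)
```

The real system would carry many more rules, but each has this local, context-conditioned form.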
Besides this, some Tamil letters have long, short and medium pronunciations depending on where they appear. For example, து occurring in different positions of a word takes these long, short and medium pronunciations:
• பந்து (long pronunciation of து)
• துவங்கினான் (short pronunciation of து)
• பாதுகாப்பு (medium pronunciation of து)
Almost 60% of the letters have this property, so to capture it in the TTS system it is necessary to record three variants of the same letter. Our algorithm is designed to fix the appropriate pronunciation of each letter in the appropriate place. The process of applying the phonetic rules to the input text is shown in Figure 4.
Input Text: அவனுக்கு பாதுகாப்பு வழங்கப்பட்டது
↓
Text Processing
↓
Romanized Form: avanukku pAthukAppu vazangkappattathu
↓
Grammatical Rules
↓
avanukku pAthu(M)kAppu vazangkappattathu(L)

Figure 4: Process of Phonetic Rules Applied to the Input Text.
4 Speech Database
All the letters of Tamil are recorded and kept as a separate database. After text processing, grammatical change and conversion to Romanized text, the system takes the corresponding speech file from this speech database; the selection of the speech file for each text unit is handled in the unit selection block. Recording the speech sounds of the Tamil letters has some requirements with further signal processing in mind.
The recording should be done in a soundproof room, the pronunciation of the Tamil letters should be good, and a good-quality microphone should be used to avoid noise in the speech files. These requirements contribute greatly to the naturalness of the output speech. In text-to-speech synthesis, the accuracy of the system is measured by the naturalness of the output speech, so instead of recording only single Tamil letters we also record some combinations of letters, that is, diphones. This improves the naturalness of the system considerably.
Figure 5: Speech Database
A main part of this project is recording all the letters of the Tamil language, that is, 247 letters plus some of the Sanskrit (Grantha) letters which occur rarely in Tamil; the resulting database contains up to 350 speech files. The grammatically changed letters must also have their corresponding speech files, so these are recorded and stored in the database. Most Tamil letters have three forms of pronunciation, which is very important for the naturalness of the output speech:
• பருப்பு (long pronunciation of பு)
• புலவர் (short pronunciation of பு)
• அன்புடையீர் (medium pronunciation of பு)
Apart from the normal Tamil letters, the three forms of each such letter are recorded and stored in the database. Since the same letter is pronounced differently in different places, the grammatically changed letters are recorded separately from the basic alphabet. With this basic speech database alone, however, we cannot achieve good accuracy, that is, high naturalness.
Recording every word of the Tamil language would give very high naturalness, but it is not feasible. At least recording the most frequently used words would improve naturalness, but it would also take time for processing. Instead, we use all possible combinations of vowels and consonants. Adding these to the database does not take much space, and the processing time is also less compared to the previous methodology.
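Enumerating the consonant-vowel units is a simple cross product; the romanized inventories below follow the chart later in this section but are abbreviated, and the real database holds 4457 files in total:

```python
from itertools import product

# Romanized Tamil consonant and vowel inventories (illustrative transliteration).
CONSONANTS = ["k", "ng", "ch", "Gn", "t", "N", "th", "n", "p", "m",
              "y", "r", "l", "v", "zh", "L", "R", "n2"]
VOWELS = ["a", "A", "i", "I", "u", "U", "e", "E", "ai", "o", "O", "au"]

# Every consonant combined with every vowel sign.
cv_units = ["".join(pair) for pair in product(CONSONANTS, VOWELS)]
print(len(cv_units))  # 18 consonants x 12 vowels = 216 CV units
```

The full database additionally holds the bare consonants, the vowels, the long/short/medium variants and the diphone combinations, which is how the count grows into the thousands.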
Yet this gives output speech with naturalness similar to recording the complete word. Recording and adding the most frequently used words would involve inflection, which can appear in many forms and would require a morphology process, so it is somewhat time-consuming. The combination of consonants and vowels gives good results in this TTS system; it contributes 4457 speech files to the database. For example, part of the combination table is shown below.
Tamil Alphabets Chart (excerpt)

அ a | ...
ங nga | ஙா ngA | ஙி ngi | ஙீ ngI | ஙு ngu | ஙூ ngU | ஙெ nge | ஙே ngE | ஙை ngai | ஙொ ngo | ஙோ ngO | ஙௌ ngau
ஞ Gna | ஞா GnA | ஞி Gni | ஞீ GnI | ஞு Gnu | ஞூ GnU | ஞெ Gne | ஞே GnE | ஞை Gnai | ஞொ Gno | ஞோ GnO | ஞௌ Gnau
ண Na | ணா NA | ணி Ni | ணீ NI | ணு Nu | ணூ NU | ணெ Ne | ணே NE | ணை Nai | ணொ No | ணோ NO | ணௌ Nau
த் th | ...
யௌ yau | ...
ழ zha | ழா zhA | ழி zhi | ழீ zhI | ழு zhu | ழூ zhU | ழெ zhe | ழே zhE | ழை zhai | ழொ zho | ழோ zhO | ழௌ zhau | ழ் zh

Sanskrit Characters - வடமொழி (கிரந்த) எழுத்துக்கள்
... | க்ஷோ kshO
Once the database is ready, the output of the grammatical rules section is a Romanized sentence which has already undergone the syllabification process. The unit selection part then picks the correct speech files from the speech database that has already been created: its main task is to take the speech file for each unit of the syllabified sentence and group these speech units together.
5 Concatenation
The final stage is the concatenation process. All the arranged speech units are concatenated using a concatenation algorithm. The concatenation of the speech files is done in MATLAB, which would also be useful for further signal processing if needed, although no signal processing techniques are used in this project. The main problem in the concatenation process is that there can be glitches at the joins. Previous projects used signal processing methods to solve this problem; in this project the glitches are avoided by careful recording, so that no glitches appear. The concatenation process combines all the speech files output by the unit selection process into a single speech file, which can be played and stopped anywhere needed. The main aim of this project is to achieve good naturalness in the output speech.
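The paper performs this step in MATLAB; as a language-neutral sketch of the same idea, here is the unit-joining step using Python's standard-library wave module. The file names are hypothetical:

```python
import wave

def concatenate(unit_files: list, out_path: str) -> None:
    """Append the frames of each unit file into one output WAV file."""
    with wave.open(out_path, "wb") as out:
        for i, path in enumerate(unit_files):
            with wave.open(path, "rb") as unit:
                if i == 0:
                    out.setparams(unit.getparams())  # copy rate/width/channels
                out.writeframes(unit.readframes(unit.getnframes()))

# Hypothetical usage with unit files produced by unit selection:
# concatenate(["kA.wav", "ma.wav", "rA.wav", "ja.wav", "r.wav"], "out.wav")
```

This raw appending is exactly where join glitches can appear; smoothing techniques such as overlap-add would go here if signal processing were used.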
6 Conclusion
This paper has given a clear and simple step-by-step overview of the working of a text-to-speech (TTS) system. Many TTS systems are available in the market, and much research is under way to make synthesized speech more effective and natural, with stress and emotion. We expect synthesizers to continue to improve through research in prosodic phrasing, in the quality of speech, voice, emotion and expressiveness, and through simplification of the conversion process to avoid complexity in the program.