ABOUT THE GAME
xVASynth is an AI-based app for creating new voice lines using neural speech synthesis. The app loads models individually trained on character voice data from games. It gives users control over details such as the pitch and duration of individual letters, which in turn lets them shape emotion and emphasis. To see it in action, watch the short intro/tutorial videos, narrated by various supported voices.

Neural speech synthesis produces natural-sounding voices, which is very difficult to achieve with more traditional methods that concatenate existing recordings. It also means new vocabulary can be generated, beyond what the voice actors have already read out.

ARPAbet pronunciation

You can specify the exact pronunciation of a word by writing it in ARPAbet notation between { } brackets in the input, or by managing words in your own (or other people's) dictionaries. CMUdict is included, with 135k words in American-English pronunciation.

Batch Mode

For larger projects, where you need to synthesize a large number of lines, you can use the Batch synthesis mode instead. You can use either a .txt file or a .csv file to batch generate hundreds or even thousands of lines in one go, with parallelization. The pitch/duration/energy editor is sometimes needed to get a line sounding just right, but sometimes it is not, and batch mode is a good way to get an initial pass on lines. Using the GPU is especially recommended here, as it lets you generate many more lines in parallel (limited by VRAM). You should also check the various settings, such as multi-threading, to get the best possible speed for your system.

3D Voice embeddings visualizer

The 3D voice embeddings visualizer is an interactive panel where you can explore all the voices in the app in 3D. Each voice is placed according to how an AI representation learning model sees it, projected down to three dimensions. There are no axes; the panel serves purely as a visualization to enable voice discovery.
You can colour the...