A community for electronic music production, modular synthesis, sdiy and more.
jp

this is the second attempt to recreate my voice with a recurrent neural network (3 layers instead of 2 in the previous attempt) trained on a sample of my voice. the audio quality is quite low (8-bit, 12 kHz) because of processing power restrictions, but it sounds more like speaking this time.
Tags: Soundcloud, diy, Programming, AI

Sep 2016

Comments

vate

Interesting, how are you creating this?

Sep 2016
jp

it's made with a recurrent neural network (RNN), using karpathy's char-rnn (on github). char-rnn learns to predict characters from a text corpus. i simply converted a wav file of a recording of me speaking into an 8-bit 12 kHz raw file. with one byte per sample, that is basically a text file. i fed that into the RNN and had it train on the data. that took 36 hours on my old macbook pro. then i had the RNN predict text = samples.
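a rough sketch of the conversion step described above, in python with only the standard library. this is not the exact tooling used for the recording; the function names, the 440 Hz test tone and the naive "pick the nearest earlier sample" resampling (no anti-aliasing filter) are all assumptions made for illustration.

```python
import math

def to_8bit_12k(samples_16bit, src_rate, dst_rate=12000):
    """Naively resample signed 16-bit mono samples and quantize to unsigned 8-bit.

    Hypothetical helper: it simply picks the nearest earlier source sample,
    so there is no anti-aliasing filter.
    """
    n_out = int(len(samples_16bit) * dst_rate / src_rate)
    out = bytearray()
    for i in range(n_out):
        s = samples_16bit[int(i * src_rate / dst_rate)]
        # shift -32768..32767 into 0..65535, then keep only the top 8 bits
        out.append((s + 32768) >> 8)
    return bytes(out)

def from_8bit(raw):
    """Map unsigned 8-bit bytes back to (coarse) signed 16-bit samples."""
    return [(b - 128) << 8 for b in raw]

# one second of a 440 Hz tone at 44.1 kHz stands in for the voice recording
src_rate = 44100
tone = [int(20000 * math.sin(2 * math.pi * 440 * t / src_rate))
        for t in range(src_rate)]

raw = to_8bit_12k(tone, src_rate)   # 12000 bytes, one byte per sample
restored = from_8bit(raw)           # what a player would get back
```

each byte in `raw` can take one of 256 values, so a character-level model can treat it as one "character" from a 256-symbol alphabet.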
so basically the neural network learned to predict which sample = character comes after the previous samples = characters. i was inspired by google's WaveNet paper (on deepmind), but my attempt is much simpler. there was also a video a few months ago doing the same thing i did.
i think the results are astonishing. read the WaveNet-paper and listen to the samples. it's crazy:
https://deepmind.com/blog/wavenet-generative-model-raw-audio/
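to make the "predict the next sample from the previous samples" idea concrete, here is a toy sketch in python. it is not char-rnn and not an RNN at all: it swaps in a simple bigram count table that only looks at one previous sample, and the training bytes are made up. the generation loop is the same idea though: sample a next value, append it, repeat.

```python
import random
from collections import Counter, defaultdict

def train_bigram(data):
    """Count, for every byte value, which byte values followed it."""
    counts = defaultdict(Counter)
    for prev, nxt in zip(data, data[1:]):
        counts[prev][nxt] += 1
    return counts

def generate(counts, seed, length, rng):
    """Generate bytes one at a time, each sampled given the previous one."""
    out = [seed]
    for _ in range(length - 1):
        followers = counts.get(out[-1])
        if not followers:
            break  # this value never had a successor in the training data
        values = list(followers.keys())
        weights = list(followers.values())
        out.append(rng.choices(values, weights=weights)[0])
    return bytes(out)

# made-up 8-bit "audio": a crude repeating wave shape
training = bytes([128, 140, 150, 140, 128, 116, 106, 116] * 50)

model = train_bigram(training)
generated = generate(model, seed=training[0], length=32, rng=random.Random(0))
```

a real RNN conditions on a much longer history than one sample, which is why it can pick up speech-like structure that a table like this cannot.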

Sep 2016
n9

man, is it just me or is there something really creepy about this? reminds me of Borges's "The Library of Babel". eventually this algorithm will sound like your voice, telling you something that you need to know.

Sep 2016
jp

yeah, this is creepy. and it will be our future. google, facebook and all the rest are working on this at full speed. so let's make art with this stuff. i think it's a way to better understand what's going on.

Sep 2016