The first generation of scribepod!
Methodology
Wrote a script to collect every arXiv link that @_akhaliq tweeted in the past week
Follow him on Twitter and check out his Patreon! patreon.com/akhaliq
Wrote a script to download the raw LaTeX source of those papers from arxiv.org
Wrote a script to parse the introduction & conclusion out of the raw LaTeX of each paper
Wrapped the ChatGPT website with web browser automation
I could have done it manually; I just like to script stuff (I promise I didn’t break the TOS!!)
Wrote a script to generate dialogue
Gives ChatGPT the introduction and asks it to simulate a podcast dialogue, with a prompt telling the simulated hosts to sound excited.
Does the same with the conclusion.
Dumps the result to disk.
Wrote a script to take the generated dialogue and generate speech using tortoise-tts by jbetker.
Check out jbetker’s blog at nonint.com (warning, it’s impressive to the point of being an infohazard)
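For the curious, the introduction & conclusion parsing step can be sketched roughly like this. This is not the actual scribepod script, just a minimal illustration. It assumes each paper marks its sections with plain \section{...} headings whose titles contain words like "Introduction" or "Conclusion"; real papers often split content across \input files or use custom macros, which this does not handle. The function name extract_section is made up for the example.

```python
import re
from typing import Optional

# Matches top-level \section{...} and \section*{...} headings.
# \subsection is not matched because the backslash must directly precede "section".
SECTION_RE = re.compile(r"\\section\*?\{([^}]*)\}")


def extract_section(latex: str, keyword: str) -> Optional[str]:
    """Return the body of the first section whose title contains `keyword`.

    Hypothetical helper: the matching rule (case-insensitive substring of the
    title) is an assumption, not something taken from the original pipeline.
    """
    headings = list(SECTION_RE.finditer(latex))
    for i, heading in enumerate(headings):
        if keyword.lower() not in heading.group(1).lower():
            continue
        start = heading.end()
        if i + 1 < len(headings):
            # Body ends where the next top-level section begins.
            end = headings[i + 1].start()
        else:
            # Last section: stop at \end{document} if present, else end of file.
            end_doc = latex.find(r"\end{document}", start)
            end = end_doc if end_doc != -1 else len(latex)
        return latex[start:end].strip()
    return None
```

Calling extract_section(source, "introduction") and extract_section(source, "conclusion") on a paper's main .tex file gives the two chunks of text that get handed to the dialogue step. Papers that title their last section "Discussion" or "Summary" would need extra keywords.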
What I’ll change for the next generation
Problem: The dialogue is a bit repetitive. It’s also redundantly excited.
I’ll figure out a way to include both the introduction & conclusion in the same GPT output.
Problem: The dialogue is always surface-level.
I’m going to figure out a way to methodically summarize the whole paper, and then turn that summary into a dialogue.
Problem: Ferris Prime’s voice isn’t as good as Joe Prime’s voice.
I’ll experiment with having one person in the podcast, instead of two people.
Problem: Generation takes a really long time (~6h for 1.5h of audio).
I’m thinking of building orchestration software over runpod.io or vast.ai, and then “map reducing” the text-to-speech inference. If I rent 10 consumer GPUs, I should be able to speed it up by roughly 10x.
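Here is a rough sketch of what “map reducing” the text-to-speech step could look like. Nothing here talks to runpod.io or vast.ai; it only shows the splitting and re-ordering logic, under the assumption that each line of dialogue can be synthesized independently. All names (shard_jobs, run_shard, merge_clips, ideal_hours) are hypothetical, and the synthesize argument stands in for whatever actually calls the TTS model on a worker.

```python
from typing import Callable, List, Tuple

Job = Tuple[int, str]   # (position in the script, line of dialogue)
Clip = Tuple[int, str]  # (position in the script, path of the rendered audio)


def shard_jobs(lines: List[str], num_workers: int) -> List[List[Job]]:
    """Split the script round-robin so every worker gets a similar load."""
    if num_workers < 1:
        raise ValueError("num_workers must be at least 1")
    shards: List[List[Job]] = [[] for _ in range(num_workers)]
    for index, line in enumerate(lines):
        shards[index % num_workers].append((index, line))
    return shards


def run_shard(shard: List[Job], synthesize: Callable[[str], str]) -> List[Clip]:
    """Map step: one worker renders every line in its shard."""
    return [(index, synthesize(line)) for index, line in shard]


def merge_clips(per_worker: List[List[Clip]]) -> List[str]:
    """Reduce step: put the clips back into the original script order."""
    ordered = sorted(clip for worker in per_worker for clip in worker)
    return [path for _, path in ordered]


def ideal_hours(single_gpu_hours: float, num_workers: int) -> float:
    """Best-case wall-clock time, ignoring startup cost and uneven shards."""
    return single_gpu_hours / num_workers
```

With the numbers above, ideal_hours(6, 10) comes out to 0.6 hours, or about 36 minutes. Real speedup would be lower, since machines take time to boot and load the model, and some lines take longer to render than others.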
Requests?
Do you have anything that you’d like me to try to throw into this pipeline? Do you have any feedback or suggestions? Would you actually listen to this?
Donate? Link.
Paper links
grep 'expanded.*http.*arxiv' twitterData.json | sort -u
"expanded_url": "https://arxiv.org/abs/2212.09802",
"expanded_url": "https://arxiv.org/abs/2212.09877",
"expanded_url": "https://arxiv.org/abs/2212.09898",
"expanded_url": "https://arxiv.org/abs/2212.10465",
"expanded_url": "https://arxiv.org/abs/2212.10544",
"expanded_url": "https://arxiv.org/abs/2212.10550",
"expanded_url": "https://arxiv.org/abs/2212.10554",
"expanded_url": "https://arxiv.org/abs/2212.10559",
"expanded_url": "https://arxiv.org/abs/2212.10560",
"expanded_url": "https://arxiv.org/abs/2212.10562",
"expanded_url": "https://arxiv.org/abs/2212.10622",
"expanded_url": "https://arxiv.org/abs/2212.10699",
"expanded_url": "https://arxiv.org/abs/2212.10770",
"expanded_url": "https://arxiv.org/abs/2212.10846",
"expanded_url": "https://arxiv.org/abs/2212.10923",
"expanded_url": "https://arxiv.org/abs/2212.10947",
"expanded_url": "https://arxiv.org/abs/2212.11263",
"expanded_url": "https://arxiv.org/abs/2212.11270",
"expanded_url": "https://arxiv.org/abs/2212.11377",
"expanded_url": "https://arxiv.org/abs/2212.11419",
"expanded_url": "https://arxiv.org/abs/2212.11565",
"expanded_url": "https://arxiv.org/abs/2212.11685",
"expanded_url": "https://arxiv.org/abs/2212.11696",
"expanded_url": "https://arxiv.org/abs/2212.11715",
"expanded_url": "https://arxiv.org/abs/2212.11972",
"expanded_url": "https://arxiv.org/abs/2212.11984",
"expanded_url": "https://arxiv.org/abs/2212.12017",
"expanded_url": "https://arxiv.org/abs/2212.12249",
"expanded_url": "https://arxiv.org/abs/2212.12294",
"expanded_url": "https://arxiv.org/abs/2212.12552",
"expanded_url": "https://arxiv.org/abs/2212.12652",
"expanded_url": "https://arxiv.org/abs/2212.12952",
"expanded_url": "https://arxiv.org/abs/2212.13138",
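If you would rather do the link extraction in Python than with grep, something like the following should give the same deduplicated, sorted list. It is only a sketch: it runs a regex over the raw text instead of parsing the JSON, so the only thing it assumes about twitterData.json is that links appear as "expanded_url": "..." entries like the ones listed above. The function name unique_arxiv_links is made up for the example.

```python
import re
from typing import List

# Captures the URL inside "expanded_url": "https://arxiv.org/abs/...".
ARXIV_RE = re.compile(r'"expanded_url":\s*"(https?://arxiv\.org/abs/[^"]+)"')


def unique_arxiv_links(raw_text: str) -> List[str]:
    """Return every distinct arXiv abstract URL in the text, sorted."""
    return sorted(set(ARXIV_RE.findall(raw_text)))


# Example usage (assumes twitterData.json is in the current directory):
#
#     with open("twitterData.json", encoding="utf-8") as f:
#         for url in unique_arxiv_links(f.read()):
#             print(url)
```

Unlike the grep version, this returns bare URLs instead of whole matching lines, which makes it easier to feed straight into a download script.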