World: DALL-E Speech Gallery

Owner: spandana
API: Webpage (r2)
World type: No Mind

Text to image to caption to audio, with DALL-E and Azure.

(1) The user enters a prompt, and the World calls OpenAI's DALL-E text-to-image API. DALL-E generates the images on a server, and the web page can include them directly. The World then calls the Microsoft Azure Vision API to request a description of the image: https://learn.microsoft.com/en-us/azure/ai-services/computer-vision/. Finally, it speaks the caption aloud using the text-to-speech part of the Web Speech API: https://developer.mozilla.org/en-US/docs/Web/API/Web_Speech_API.

(2) The prompt can also be entered by speech, again using the Web Speech API. Enable the microphone for Ancient Brain (and perhaps disable it afterwards). Click the small microphone button, speak, then click the button again, and the recognized text is filled into the prompt field.

Clone and Edit to enter your own API keys. Author: Spandana Devaramani.
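The prompt-to-image-to-caption-to-speech chain described above could be sketched roughly as follows. This is an illustrative outline, not the World's actual source. The helper names (`buildImageRequest`, `buildCaptionRequest`, `extractCaption`, `speak`, `promptToSpeech`) are hypothetical, and the Azure Image Analysis v3.2 `analyze` route, the single 1024x1024 image, and the shape of the `keys` object are assumptions, since the description does not say which API versions or settings the World uses.

```javascript
// Illustrative sketch only: helper names, API versions and sizes are assumptions.

// Build the fetch() arguments for OpenAI's image generation endpoint.
function buildImageRequest(prompt, openaiKey) {
  return {
    url: "https://api.openai.com/v1/images/generations",
    options: {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        "Authorization": "Bearer " + openaiKey
      },
      // One image at 1024x1024 is an assumed setting.
      body: JSON.stringify({ prompt: prompt, n: 1, size: "1024x1024" })
    }
  };
}

// Build the fetch() arguments for an Azure Vision description request.
// azureEndpoint is your own resource URL, e.g. "https://<name>.cognitiveservices.azure.com".
function buildCaptionRequest(imageUrl, azureEndpoint, azureKey) {
  const base = azureEndpoint.replace(/\/+$/, ""); // drop trailing slashes
  return {
    url: base + "/vision/v3.2/analyze?visualFeatures=Description",
    options: {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        "Ocp-Apim-Subscription-Key": azureKey
      },
      body: JSON.stringify({ url: imageUrl })
    }
  };
}

// Pull the first caption out of the analysis response, with a fallback string.
function extractCaption(analysis) {
  const captions = (analysis.description && analysis.description.captions) || [];
  return captions.length > 0 ? captions[0].text : "No caption returned.";
}

// Speak text with the Web Speech API; returns false where it is unavailable.
function speak(text) {
  if (typeof window === "undefined" || !("speechSynthesis" in window)) return false;
  window.speechSynthesis.speak(new SpeechSynthesisUtterance(text));
  return true;
}

// Chain the three steps: generate an image, caption it, speak the caption.
async function promptToSpeech(prompt, keys) {
  const img = buildImageRequest(prompt, keys.openai);
  const imgJson = await (await fetch(img.url, img.options)).json();
  const imageUrl = imgJson.data[0].url;

  const cap = buildCaptionRequest(imageUrl, keys.azureEndpoint, keys.azure);
  const capJson = await (await fetch(cap.url, cap.options)).json();
  const caption = extractCaption(capJson);

  speak(caption);
  return { imageUrl: imageUrl, caption: caption };
}
```

In a browser, a call might look like `promptToSpeech("a red fox in snow", { openai: "...", azureEndpoint: "...", azure: "..." })`, with the returned `imageUrl` assigned to an `<img>` element. Keeping the request builders separate from the `fetch` calls makes it possible to inspect the requests without spending API credits.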

Created: 23 Nov 2023
Modified: 11 Sep 2024

Type: Public. Plain JS.

295 runs
