Using generative artificial intelligence, a team of researchers at The University of Texas at Austin has converted sounds from audio recordings into street-view images. The visual accuracy of these generated images demonstrates that machines can replicate human connection between audio and visual perception of environments. The research team describes training a soundscape-to-image AI model using audio and visual data gathered from a variety of urban and rural streetscapes and then using that model to generate images from audio recordings.
Leave a reply