Heavily inspired by the algorithm used in 2FA made by Samip Regmi, Raju Bhetwal, and Nayan Nembang
The following is the algorithm I used to convert an image to sound:
- Resizes the source image to the specified image size
- Converts the resized image to grayscale, so that each pixel value lies in the range `0-255`
- Each pixel value from `0-255` is mapped to a frequency from `200-1000` Hz using linear mapping
- Each mapped frequency is written to its own `0.01` second slice of the audio file, holding `441` samples each. The total number of frequencies equals the number of pixels in the resized image
  - default: 50 x 50 = 2500
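
The encoding steps above can be sketched in a few lines of Python. This is a minimal sketch and not the repository's code. It uses only the standard library, assumes the pixels have already been resized and converted to grayscale, assumes a 44100 Hz sample rate (derived from 441 samples per 0.01 second), and assumes each tone restarts at phase zero. The names `pixel_to_frequency`, `tone`, and `encode_pixels` are hypothetical.

```python
import math

SAMPLE_RATE = 44100        # assumption: 441 samples / 0.01 s = 44100 Hz
SAMPLES_PER_CHUNK = 441    # one 0.01 second slice per pixel
FREQ_MIN, FREQ_MAX = 200.0, 1000.0


def pixel_to_frequency(pixel):
    """Linearly map a grayscale value in 0-255 to a frequency in 200-1000 Hz."""
    return FREQ_MIN + (pixel / 255.0) * (FREQ_MAX - FREQ_MIN)


def tone(frequency):
    """Return 441 float samples of a sine wave at the given frequency."""
    return [
        math.sin(2 * math.pi * frequency * n / SAMPLE_RATE)
        for n in range(SAMPLES_PER_CHUNK)
    ]


def encode_pixels(pixels):
    """Concatenate one 0.01 second tone per pixel into a single sample list."""
    samples = []
    for pixel in pixels:
        samples.extend(tone(pixel_to_frequency(pixel)))
    return samples
```

With the default 50 x 50 image this would produce 2500 tones, or 25 seconds of audio. Writing the samples to an actual audio file is left out of the sketch.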
To convert the sound back to an image, the steps are reversed:
- As the duration per frequency is known, we split the audio into chunks of that duration.
- Each chunk holds `441` samples, for a total of `2500` chunks with the default image size.
- We then use librosa to find the frequency of each chunk
- Using linear mapping, we convert the frequencies back to pixel values
- Finally, all the pixel data is saved back into an image
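
The decoding steps above can be sketched the same way. This is a rough illustration under stated assumptions and not the repository's code. Instead of librosa, it estimates each chunk's frequency from the identity x[n-1] + x[n+1] = 2 cos(w) x[n], which holds for a pure sine wave, so it is only expected to work on clean, single-tone chunks. The 44100 Hz sample rate is again an assumption, and the names `split_chunks`, `estimate_frequency`, `frequency_to_pixel`, and `decode_samples` are hypothetical.

```python
import math

SAMPLE_RATE = 44100        # assumption: 441 samples / 0.01 s = 44100 Hz
SAMPLES_PER_CHUNK = 441
FREQ_MIN, FREQ_MAX = 200.0, 1000.0


def split_chunks(samples):
    """Split the sample list into consecutive full chunks of 441 samples."""
    last_start = len(samples) - SAMPLES_PER_CHUNK
    return [
        samples[i:i + SAMPLES_PER_CHUNK]
        for i in range(0, last_start + 1, SAMPLES_PER_CHUNK)
    ]


def estimate_frequency(chunk):
    """Estimate the frequency of a clean sine chunk (stand-in for librosa)."""
    inner = range(1, len(chunk) - 1)
    num = sum(chunk[n] * (chunk[n - 1] + chunk[n + 1]) for n in inner)
    den = 2.0 * sum(chunk[n] ** 2 for n in inner)
    if den == 0.0:
        return 0.0  # silent chunk: no frequency to recover
    cos_w = max(-1.0, min(1.0, num / den))  # least-squares estimate of cos(w)
    return math.acos(cos_w) * SAMPLE_RATE / (2 * math.pi)


def frequency_to_pixel(frequency):
    """Inverse linear mapping from 200-1000 Hz back to a 0-255 pixel value."""
    value = round((frequency - FREQ_MIN) / (FREQ_MAX - FREQ_MIN) * 255)
    return max(0, min(255, value))  # clamp in case of estimation error


def decode_samples(samples):
    """Recover one pixel value per 0.01 second chunk."""
    return [frequency_to_pixel(estimate_frequency(c)) for c in split_chunks(samples)]
```

The recovered list has one value per chunk, so it can be reshaped to the resized image dimensions (50 x 50 by default) before saving. Real recordings with noise or compression would likely need a more robust estimator, which is presumably why the project relies on librosa.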
Demo video: 2025-09-22.23-27-28.mp4
To run the program:
- Clone the repo using `git clone <remote-url>` (you can use SSH or HTTPS)
- After cloning, install the required dependencies using `pip install -r requirements.txt`
- Add and correct the required paths in `main.py`
- Run the program using `python3 main.py`