Augment Data Visualization with Sonification for Better Data Storytelling

Nagendra Nukala
Mar 9, 2022
various wave forms

For the last few months, I have been talking to Jeff McSpadden, CEO and co-founder of Composure, about his approach to using music for healing; his current focus is on seniors in specialized care facilities. I love this idea: in my opinion, catering to seniors is an excellent proposition and it serves a great cause. During our discussions about his business strategy for building out sonic journeys based on preset business rules, he set up a call with the folks at Sonify, who have created an open-source application called TwoTone for creating music compositions from data as input. The concept is called data sonification, analogous to data visualization. This blog is about my experience with data sonification.

Being at Adobe, I hear about various forms of data visualization all day. There is Adobe Analytics Workspace, a web-based data visualization solution that is very useful for online data collected through the Adobe SDK; it offers quick visualizations of massive amounts of data and is a world-class solution. There is Customer Journey Analytics, which shares a similar UI with Workspace but lets users join the online data collected via Analytics with other offline sources, enabling enterprise-wide, journey-based reporting. There are also general-purpose visualization solutions such as Looker, Power BI, Tableau, and a whole host of others. Data visualization is a mainstream activity in the data analysis process.

So when I first heard about data sonification, I started to wonder why there is not much emphasis on the sound aspect of data analysis; after all, hearing is one of our five senses. Here are some points that came to mind when comparing the two mediums for data analysis:

  • Let’s compare the optic and auditory pathways to the brain. The ear takes more energy to perceive a sound than the eye takes to perceive light, since the middle and inner ear are mostly mechanical. Comparing the frequency ranges of seeing and hearing, the eye can sense light from roughly 460 THz to 750 THz, about 0.7 octaves, whereas the ear can hear sounds from 20 Hz to 20 kHz, around 10 octaves (the octave arithmetic is sketched after this list). Also, detecting a sound is only a small part of audition: the entire process our brain follows to make sense of the world from sound goes far beyond the sound itself, and it is complex and sophisticated compared to the process of making sense of light. For example, did you know we are blind at the optic disc? The optic disc, where the optic nerve exits the retina, has no photoreceptors, so there is a blind spot in each eye’s visual field, but the brain fills it in with projections based on the surrounding image.
  • Higher availability of analysts trained in visualization techniques: we have all been taught to understand patterns in data using graphs and other visual techniques from secondary school onwards, so spotting a pattern by looking at a graph is easy for a lot of people and comes naturally to some. Understanding a data pattern from a sonification journey, however, needs a trained ear; I’ll share some examples in this article to further clarify this point.
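
As a quick check on the octave figures above: an octave is a doubling of frequency, so the number of octaves spanned by a range is log2(high / low). A minimal sketch:

```python
import math

# Octaves spanned by a frequency range = log2(high / low),
# since each octave is a doubling of frequency.
def octaves(low_hz: float, high_hz: float) -> float:
    return math.log2(high_hz / low_hz)

print(octaves(460e12, 750e12))  # visible light: ~0.7 octaves
print(octaves(20, 20_000))      # audible sound: ~10 octaves
```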

I used a free dataset available on Kaggle, The Movie Database, to answer the question: is there any trend in movie budgets over time? Below is a scatter plot answering this question, drawn using seaborn in Python. We can see clearly that there is a positive correlation between time and budget; basically, movie budgets have steadily increased over time. It took me around 10 minutes to get the data from Kaggle, load it, clean it, and visualize it; understanding the pattern in the data took a few seconds.

scatter chart with budget in the y axis and release year in the x axis
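
For reference, here is a minimal sketch of the kind of script I used; the file name and column names (`release_date`, `budget`) are assumptions based on the Kaggle TMDB export, so adjust them to match your copy of the data.

```python
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

# Load the Kaggle TMDB export (file/column names assumed; adjust to your copy)
movies = pd.read_csv("tmdb_5000_movies.csv")

# Basic cleaning: parse the release year and drop rows with zero/missing budgets
movies["release_year"] = pd.to_datetime(movies["release_date"], errors="coerce").dt.year
movies = movies.dropna(subset=["release_year", "budget"])
movies = movies[movies["budget"] > 0]

# Scatter plot: budget on the y axis, release year on the x axis
sns.scatterplot(data=movies, x="release_year", y="budget", s=10)
plt.show()
```
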
  • The same dataset was sonified using the no-code, UI-based application TwoTone; here is the file: soundcloud link
  • Explanation of the above audio: the violin tones represent release_year and the piano tones represent budget. You’ll notice that the piano tones become significantly more audible as the composition progresses (after the 6-minute mark; there is a brief period of silence around the 4-minute mark), indicating a positive correlation. I was a little lost, however, when I built another journey to answer the question ‘do larger budget movies produce larger profits?’ and didn’t get a coherent output. While creating this dashboard, the initial version was not really musical, so I took help from Jeff to make the composition more meaningful and melodious; with over three decades of music composition experience, it took him only a few minutes to make some quick adjustments. I do think sonification could help in some form with two-column correlation problems in exploratory data analysis. Thanks to the no-code application, I was able to generate the audio file from the processed data in minimal time, but to understand the correlation, I had to hear the entire 10-minute clip. The point I am trying to make is that there will be significant ETL involved in generating the audio file, and the time to conduct the analysis will be at least equal to the play time of the audio file. (A rough sketch of a programmatic version of this column-to-instrument mapping follows this list.)
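
TwoTone itself is no-code, but to make the idea concrete, here is a rough sketch of how a similar mapping could be done programmatically: scale each column into a MIDI pitch range and write one track per column. The library (MIDIUtil), the instrument choices, and the pitch ranges are my own assumptions for illustration, not how TwoTone works internally.

```python
import pandas as pd
from midiutil import MIDIFile  # pip install MIDIUtil

def to_pitch(series: pd.Series, low: int = 48, high: int = 84) -> pd.Series:
    """Linearly scale a numeric column into a MIDI pitch range."""
    scaled = (series - series.min()) / (series.max() - series.min())
    return (low + scaled * (high - low)).round().astype(int)

movies = movies.sort_values("release_year")  # from the earlier cleaning step

midi = MIDIFile(2)                      # two tracks: release_year and budget
midi.addTempo(track=0, time=0, tempo=120)
midi.addProgramChange(0, 0, 0, 40)      # track 0: violin (General MIDI program 41)
midi.addProgramChange(1, 1, 0, 0)       # track 1: acoustic grand piano

for i, (year_pitch, budget_pitch) in enumerate(
        zip(to_pitch(movies["release_year"]), to_pitch(movies["budget"]))):
    midi.addNote(0, 0, year_pitch, time=i * 0.25, duration=0.25, volume=80)
    midi.addNote(1, 1, budget_pitch, time=i * 0.25, duration=0.25, volume=80)

with open("budget_over_time.mid", "wb") as f:
    midi.writeFile(f)
```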

I couldn’t stop with this dataset, so I tried another one. Since I deal with retail clients, I created a test dataset for a product funnel report (micro-conversions on the various steps a user takes from seeing a product on a page to buying it on the site) for a fictitious retail company, and created a musical dashboard; take a listen. I don’t know what can be made out of it, but I did like the groove of the composition. However, if you have to analyze the data from this piece, you’ll need to recognize the sound of each instrument and also remember which instrument corresponds to which column of data. For this audio file, below is the mapping (a sketch of generating such a test dataset follows the list):

  • Product views on piano
  • Cart additions on double bass
  • Orders on church organ
  • Units on electric guitar
  • Revenue on mandolin
  • Orders/Visits on marimba
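
For context, here is a rough sketch of how such a fictitious funnel dataset could be generated before loading it into TwoTone; every column name and conversion rate below is invented for illustration and has no relation to any real client’s data.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(42)
days = pd.date_range("2022-01-01", periods=90, freq="D")

# Each funnel step is a noisy fraction of the previous one (all numbers invented)
visits = rng.integers(10_000, 15_000, size=len(days))
product_views = (visits * rng.uniform(0.6, 0.9, len(days))).astype(int)
cart_additions = (product_views * rng.uniform(0.10, 0.20, len(days))).astype(int)
orders = (cart_additions * rng.uniform(0.25, 0.40, len(days))).astype(int)
units = (orders * rng.uniform(1.1, 1.6, len(days))).astype(int)
revenue = (units * rng.uniform(20, 60, len(days))).round(2)

funnel = pd.DataFrame({
    "date": days,
    "visits": visits,
    "product_views": product_views,
    "cart_additions": cart_additions,
    "orders": orders,
    "units": units,
    "revenue": revenue,
})
# Derived metric that maps to the marimba track in the composition
funnel["orders_per_visit"] = funnel["orders"] / funnel["visits"]
```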

After trying a few combinations of datasets, tempos, BPMs, octaves, and arpeggios, and feeling on top of the world, I started to wonder where data sonification could fit in real-world scenarios. Here are the ones that come to mind:

  • Exploratory data analysis in the sonic medium will open the data analysis process to a whole host of users with vision challenges; along with inclusion, it will bring in new perspectives from visually challenged analysts. In fact, I would love to see this as a browser extension that converts the graphs on the screen into musical notes, making visual reports accessible to all.
  • I have developed a huge respect for audiobooks lately, mainly because I can listen while doing something completely unrelated, like mundane household chores. So if we can create sonic dashboards for data observability and monitoring, then a network operations center or a similar team in charge of monitoring data pipelines could have one less screen to watch; instead, the room could be playing a musical composition and anomalies could be spotted as an erratic beat. It would certainly be useful in monitoring and observability scenarios (a rough sketch of this idea follows this list).
  • Data storytelling: we have all heard this term more than we need to, but in today’s parlance data storytelling has no musical component other than the voice of the storyteller. What if we could design self-sustaining stories with audio-visual components to present to an audience? We have a lot of qualified musicians and sound engineers with deep knowledge of music and how it can relate to a given dataset, so if we create opportunities for data analysts and music professionals to get together, I am sure we can come up with intuitive ways to use the auditory medium to speak the language of data. Most of the dominant species in the animal kingdom use audio-visual cues to make sense of their surroundings and make the right calls, so using two of our senses should give the brain a better sense of a dataset than visualization alone.
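
To illustrate the monitoring idea from the second bullet, here is a rough sketch that turns a metric stream into a steady tone and jumps to a jarring pitch whenever a value falls outside three standard deviations; the threshold, the frequencies, and the use of a plain WAV file are all my own illustrative choices, not an established tool.

```python
import numpy as np
from scipy.io import wavfile

def sonify_metric(values, sample_rate=44_100, note_seconds=0.2):
    """Render one short tone per data point; anomalies get a high, jarring pitch."""
    values = np.asarray(values, dtype=float)
    mean, std = values.mean(), values.std()
    samples = []
    for v in values:
        anomalous = std > 0 and abs(v - mean) > 3 * std
        freq = 1200.0 if anomalous else 330.0        # erratic beep vs. steady hum
        t = np.linspace(0, note_seconds, int(sample_rate * note_seconds), endpoint=False)
        samples.append(0.3 * np.sin(2 * np.pi * freq * t))
    audio = np.concatenate(samples)
    wavfile.write("pipeline_monitor.wav", sample_rate, (audio * 32767).astype(np.int16))

# Example: a healthy metric with one injected spike
metric = np.concatenate([np.random.normal(100, 5, 200), [400], np.random.normal(100, 5, 50)])
sonify_metric(metric)
```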

To make sense of sonified data, one needs to understand the basics of music composition, such as tempo, melody, timbre, panning, and rhythm, and a primer on MIDI is also absolutely required. MIDI is like the HTTP of music: a format in which electronic music is encoded and communicated.
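
To make the analogy concrete: a MIDI note-on event is just three bytes, a status byte (0x90 means note-on on channel 1) followed by a note number and a velocity, each between 0 and 127. A tiny sketch:

```python
# A raw MIDI "note on" message: status byte, note number, velocity
NOTE_ON_CH1 = 0x90          # note-on, channel 1
MIDDLE_C = 60               # MIDI note number for middle C
VELOCITY = 100              # how hard the note is struck (0-127)

message = bytes([NOTE_ON_CH1, MIDDLE_C, VELOCITY])
print(message.hex(" "))     # -> "90 3c 64"
```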

All in all, I was really intrigued by the learning experience of data sonification, and I can’t wait to see stories and musicals being told based on data. Won’t that be music to the ears!

I would recommend trying out the beta version of the application on the Sonify website to get the hang of this concept.

If you are interested in further reading, below are some recommendations:

