
Science blog: Machine learning and whale song

by Ann Allen, NOAA Fisheries 2 Nov 2018 15:13 GMT
A humpback whale surfaces near Maui Island © Hawaiian Islands Humpback Whale National Marine Sanctuary / Jason Moore

What do you do when you have too much data? It's not a question I've ever had to ask myself in my scientific career, and is rarely a problem when working with hard-to-study animals like whales and dolphins.

But this is exactly the question I found myself asking shortly after starting my new job with the Pacific Islands Fisheries Science Center's Cetacean Research Program in Hawaii in 2017. I was describing my job to my father, who is not a scientist, and trying to convey what we want to do.

We conduct research to understand the health and status of whale and dolphin species around U.S. Pacific Islands and across the Pacific Ocean, a VAST area. One tool we use to monitor elusive whales across this large region is a network of seafloor-mounted recorders called high-frequency acoustic recording packages (HARPs).

The HARPs have hydrophones (underwater microphones) that can record the calls of whales and dolphins, as well as any environmental or human-made noise. We deploy HARPs for months to years at a time and we have to retrieve them to obtain data.

So far, our network extends to 13 different recording sites, several of which have been monitored for a decade or more. This means we now have more than 200 TB of data, or almost 170,000 hours of recordings. If you were to sit and listen to all of that audio straight through, it would take you more than 19 years. So if we monitor the whales by listening to their calls in the recordings, how in the world do we ever listen to that much data?
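For anyone who wants to check that listening-time figure, here is a quick back-of-the-envelope sketch. The only input taken from the text is the 170,000 hours; the 365.25-day year is my own assumption for the conversion.

```python
# Rough check of the "more than 19 years" figure.
HOURS_OF_AUDIO = 170_000        # "almost 170,000 hours", from the text
HOURS_PER_YEAR = 24 * 365.25    # assumption: average year length, listening nonstop

years = HOURS_OF_AUDIO / HOURS_PER_YEAR
print(f"About {years:.1f} years of continuous listening")
```

And that is with no breaks for sleep, food, or fieldwork.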

After telling my father all of this, he asked me, "Well why don't you just get those Shazam or Google people to help you? They teach computers to recognize human songs, I bet they can teach it whale song."

I paused. "Well, because it's just not that simple..."

"That's just not how you..."

"Because people have been..."

"Um well y'know, I don't know."

Thus challenged by my dad, I figured I'd find out if those Google people would be interested in teaching their computers to speak whale.

A friend who works for Google put me in contact with their Artificial Intelligence group. They responded almost immediately and enthusiastically; an ocean's worth of data sounded like it would fit in well with their current projects. They already have a lot of machine-learning techniques they've developed for other purposes that they would be excited to put to use for conservation. Machine learning attempts to mimic the learning process of the human brain by teaching a computer to carry out a task, rather than programming it to do the task step by step.

Automatic detection of whale calls isn't a new approach, but designing a good detector can be quite difficult. The ocean is a noisy place, with lots of sounds from wind, waves, fish, and even the recording equipment itself. And humans contribute an ever-increasing amount of noise to the ocean, from things like container ships, oil and gas prospecting, military activities, and construction. The whale vocalizations themselves can also sound very different depending on how far away the whale is from the recorder, how the call echoes off the ocean floor and surface, and whether it has a local variation, or dialect, so to speak.

While some whales, such as blue whales and fin whales, make consistent, stereotyped calls that are relatively straightforward to recognize once you've seen some examples, humpback whales have vocalizations that are extremely variable. These calls make up phrases and themes that form songs. The songs change over time, as the whales incorporate new notes or swap the order of phrases, and many populations of humpbacks sing different songs. So how do you teach a computer to recognize whale song when the song keeps changing?

Researchers have been trying to solve this problem for quite a while. So far, all the methods developed come with caveats: poor detection rates, a lot of user input to fine-tune the detector for each call type and situation, or a high rate of false detections that requires someone to go through by hand and pick out the correct detections from the bad ones.

However, there have been a lot of advancements in the last few years in artificial intelligence and deep machine learning. You may have noticed that your phone is much better about not making tie bows, I mean typos, when you use voice recognition these days. This is due to advances in machine learning, some made at Google. So with this in mind, I figured I would throw Google in the deep end and see if they could come up with a solution to the complex humpback song problem. I gave them a subset of our audio data, with a collection of 'training labels', or humpback whale calls that are marked by a human (me). This training data is how the computer learns what humpback calls look like.
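To give a feel for what 'training labels' can look like in practice, here is a minimal sketch. It is not Google's actual method. The function name `label_windows`, the 2-second window length, and the `(start, end)` format for the human-marked calls are all hypothetical choices made for illustration. The idea is to chop a recording into short windows and flag each window that overlaps a call a human has marked, which gives the computer labelled examples to learn from.

```python
import math

def label_windows(duration_s, labels, window_s=2.0):
    """Split a recording into fixed-length windows and flag the ones that
    overlap any human-marked call interval (start_s, end_s).
    Hypothetical helper for illustration only."""
    n_windows = math.ceil(duration_s / window_s)
    windows = []
    for i in range(n_windows):
        start = i * window_s
        end = min(start + window_s, duration_s)
        # A window is positive if any labelled call overlaps it at all.
        has_call = any(s < end and e > start for s, e in labels)
        windows.append((start, end, has_call))
    return windows

# One made-up humpback call marked from 3.0 s to 5.5 s in a 10 s clip.
human_labels = [(3.0, 5.5)]
windows = label_windows(10.0, human_labels)
for start, end, has_call in windows:
    print(f"{start:4.1f}-{end:4.1f} s  humpback={has_call}")
```

In a real pipeline each window would be turned into something like a spectrogram before being handed to the model; the sketch only covers the labelling step.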

And it turns out that Google's computers are pretty good at recognizing humpback song! In less time than I thought possible (practically gale force speed by academic standards), they sent me timestamps of humpback whale song in our data. The computer isn't perfect yet; we still need to teach it to better recognize different kinds of noise from sites that didn't have as much training data, but it is already doing far better than expected.

There is so much new, important information we can learn from just knowing when there is humpback whale song throughout our dataset. In the Pacific Islands region, very little is known about humpback whale presence, seasonality, daily singing behavior, or population structure outside of the main Hawaiian Islands. We do not even know if humpback whales ever journey to some of the smaller and more remote uninhabited islands. With this data we can start to address these questions. Additionally, because our dataset spans more than a decade, knowing when and where humpback whales are singing will give us information on whether or not the animals have changed their distribution over the years, especially in relation to increasing human activity in some areas.

The possibilities don't stop there. We're now interested in teaching the computer to find other species of whales and dolphins in our recordings. The hope is that if we give the computer enough examples of different whale calls, it will learn to generally recognize 'whale' sounds and get better at distinguishing between whale species and also between 'whale' and 'ship' or other noise. My hope is that in the future Google will provide us with a tool that scientists can use directly: one we can give new data from a new location, along with training labels for a new type of whale call, and that can then tell us where those calls are in the rest of the data.

So, if you're a scientist with acoustic data and labelled whale calls and want to help teach Google how to speak whale, whale then give us a call!

Check out more details of the story from the Google side:
