humpback_whale
TensorFlow 2
Model Details
The model classifies 3.92-second context windows of audio as containing or not containing humpback whale sounds. It is intended to be applied as a detector by scoring every context window in a set of underwater passive acoustic monitoring data.
The model feeds a PCEN-normalized spectrogram through a ResNet-50 core to a single logistic output unit. All of the training data comes from HARP recording platforms deployed at depths in the hundreds of meters in humpback whale winter breeding grounds in the Hawaiian archipelago. Further details, including performance metrics, can be found in:
A. Allen et al., "A convolutional neural network for automated detection of humpback whale song in a diverse, long-term passive acoustic dataset", Front. Mar. Sci., 2021, doi: 10.3389/fmars.2021.607321.
M. Harvey, "Acoustic Detection of Humpback Whales Using a Convolutional Neural Network," Google AI Blog, Oct. 29, 2018.
Use cases
This model is suitable for:
- Predicting the presence of a humpback whale call in a given audio sample.
- Analyzing acoustic data collected by deep-water deployments.
This model is not suitable for:
- Detecting species of whales other than humpback whales.
- Counting how many whales are present.
- Localizing whales.
- Analyzing acoustic data with high levels of surface or platform noise.
Usage
Main Inference Signature: score
The default signature, score, is recommended for inference-only use cases. It scores batches of waveforms at once, framing each waveform in the batch into multiple context windows before outputting per-window scores.
Examples
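A minimal TF2 sketch of calling score on a batch of silence, assuming the model is loaded through TensorFlow Hub; the hub handle and the one-window hop value are illustrative, not prescribed by this card.

```python
# Minimal sketch (TF2). The hub handle is illustrative; point it at your
# own copy of the SavedModel if you have one locally.
import numpy as np
import tensorflow as tf
import tensorflow_hub as hub

model = hub.load('https://tfhub.dev/google/humpback_whale/1')

# One minute of placeholder audio: [batch_size, num_samples, num_channels]
# with num_channels = 1, float32 PCM at the expected 10 kHz sample rate.
waveform = np.zeros([1, 10_000 * 60, 1], dtype=np.float32)

# Hop by one full 3.92 s context window (3.92 s at 10 kHz).
outputs = model.signatures['score'](
    waveform=tf.constant(waveform),
    context_step_samples=tf.constant(int(10_000 * 3.92), tf.int64))
print(outputs['scores'].shape)  # [batch_size, num_windows, 1]
```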
Inputs
- waveform, a float32 Tensor of shape [batch_size, num_samples, num_channels], where it is required that num_channels = 1, but where batch_size and num_samples may take the caller's preferred values on each call.
- Each audio channel (slice [i, :, 0] for batch element i) should contain 10 kHz PCM float32 audio.
- The training data left plenty of headroom; the level of clips with humpback present was typically 0.003 RMS, 0.02 peak, much "quieter" than consumer digital audio.
- Although the model is relatively insensitive to input gain variations as wide as ±20 dB, users may wish to apply linear scaling to match the levels the model saw in training (see the sketch after this list).
- context_step_samples, an int64 Tensor of shape [], the hop length, in samples, at which to slide the scoring context window over waveform.
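Where such level matching is wanted, here is a hedged sketch of the linear scaling mentioned above; the helper name is hypothetical, and the target value is taken from the headroom note.

```python
# Hypothetical helper: linearly scale a clip toward the ~0.003 RMS level
# typical of the training data (see the headroom note above).
import numpy as np

def scale_to_training_rms(waveform: np.ndarray, target_rms: float = 0.003) -> np.ndarray:
    rms = np.sqrt(np.mean(np.square(waveform)))
    if rms == 0.0:
        return waveform  # silent clip; nothing to scale
    return waveform * (target_rms / rms)
```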
Outputs
- scores, a float32 Tensor of shape [batch_size, num_windows, num_classes], where it will always be true that num_classes = 1, where batch_size will equal the one from the input, and where num_windows is determined by num_samples and context_step_samples.
Advanced Usage
Model attributes allow isolated reuse of parts of the model, in accord with the Reusable SavedModels interface. The callable attributes exposed are:
- front_end, which can be called on a waveform Tensor as described in the score signature inputs to produce a PCEN-normalized spectrogram of shape [batch_size, num_stft_frames, num_channels], where num_channels = 64 is fixed and where num_stft_frames depends on the number of input samples.
- features, which when called on a PCEN spectrogram slice of shape [batch_size, 128, 64] produces feature vectors of shape [batch_size, 2048]. (These might be useful for detecting other audio event types in the HARP data or similar underwater passive acoustic monitoring datasets, but the model developers have not yet validated this through experiment.)
- logits, which, when called on the same type of input as features, outputs the log odds of the input spectrogram containing humpback vocalization.
Examples
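A TF2 sketch of the three attributes, under the same illustrative hub handle as in the score example; the 128-frame window matches the features input shape described above.

```python
# Sketch of the reusable attributes (TF2); the hub handle is illustrative.
import numpy as np
import tensorflow as tf
import tensorflow_hub as hub

model = hub.load('https://tfhub.dev/google/humpback_whale/1')

waveform = np.zeros([1, 10_000 * 60, 1], dtype=np.float32)  # 60 s at 10 kHz
spectrogram = model.front_end(tf.constant(waveform))  # [1, num_stft_frames, 64]

# Slice one 128-frame context window, then compute embeddings and log odds.
window = spectrogram[:, :128, :]
embedding = model.features(window)    # [1, 2048] feature vector
logits = model.logits(window)         # log odds of humpback vocalization
probability = tf.nn.sigmoid(logits)   # squash the log odds to [0, 1]
```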
Introspection
The metadata signature returns the sample rate of the audio the model expects as input and the duration of the context window to which each score applies. This signature is a bit of future-proofing so that batch inference systems can support models for which these values differ.
Examples
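A short TF2 sketch; rather than assuming the metadata output key names, it prints the signature's structured outputs before calling it. The hub handle is again illustrative.

```python
# Sketch of the `metadata` signature (TF2); the hub handle is illustrative.
import tensorflow_hub as hub

model = hub.load('https://tfhub.dev/google/humpback_whale/1')
metadata_fn = model.signatures['metadata']

# Print the output structure rather than guessing at key names, then call it.
print(metadata_fn.structured_outputs)
print(metadata_fn())
```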
Dataset
Through a partnership with the NOAA National Centers for Environmental Information, the audio and labels are available as part of the Google Cloud Platform Passive Acoustic Monitoring public dataset under the pifsc/ path prefix, as well as through NCEI at DOI:10.25921/Z787-9Y54.
Acknowledgements
The model developers thank NOAA Pacific Islands Fisheries Science Center for collecting and sharing the data and for their partnership in the model development process, which included providing the initial training labels as well as labels for several rounds of active learning that improved candidate models.
Regarding the dataset, please also refer to the funding and acknowledgement sections in Allen et al.
Terms of Use
This model was developed as part of the AI for Social Good program at Google. The developers request that users adhere to Google's AI Principles, in particular #1, "Be socially beneficial," by pursuing only applications with societal and/or environmental benefit, such as wildlife conservation, not-for-profit decision-making, education, or research. (The official license remains Apache 2.0.) If you have any questions about appropriate use cases for this model, please contact [email protected].
Limitations
Limits to Generalization
There is wide variation in underwater acoustic conditions and recording equipment, and the training set for this model covered only one type of recording platform, only one geographic area, and only deep-water deployments. Anecdotally, the developers have observed both successes and failures of the model to generalize to mismatched conditions. Specific examples include:
- (success) predictions consistent with real-time human observations from towed arrays and sonobuoys at depths in the tens of meters
- (success) predictions consistent with manual analysis of historical data on an ocean-bottom cabled hydrophone in a different geographic region
- (failure) a high rate of false positives on a dataset from a mobile surface platform, which therefore had higher levels of platform and flow noise
- (failure) loss of recall when masking by vessel noise is especially severe
Users are encouraged to manually verify the performance of the model on a sample of audio representative of their own inference conditions before drawing conclusions based on its output.
Lack of Output Calibration
The logits represent the model's level of confidence in a binary decision, not true probabilities of humpback presence. (For example, in our eval set, where about 10% of the examples were positives, precision was already 90% at a "probability" threshold of 0.1, or equivalently a logit threshold of about -2.2.) Applications needing true probabilities should calibrate the output on a sample of their own data, for example as sketched below.
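One common approach is Platt scaling: fit a logistic regression from the model's logits to human-verified labels on a representative sample. A hedged sketch, where logits.npy and labels.npy are hypothetical files holding that sample:

```python
# Hedged sketch of Platt-scaling calibration; the .npy file names are
# hypothetical placeholders for your own verified sample.
import numpy as np
from sklearn.linear_model import LogisticRegression

logits = np.load('logits.npy').reshape(-1, 1)  # raw per-window logits
labels = np.load('labels.npy')                 # 0/1 human-verified labels

calibrator = LogisticRegression().fit(logits, labels)
calibrated = calibrator.predict_proba(logits)[:, 1]  # calibrated probabilities
```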