Perch

TensorFlow 2

Model Details

This model is trained on Xeno-Canto recordings of bird vocalizations. It provides output logits over more than 10k bird species, and also creates embedding vectors which can be used for other tasks.

Note that the embedding is the real goal here; the model's logit outputs are uncalibrated, which might make interpretation difficult in some cases.


Model Quality


We have evaluated the model on an array of soundscape datasets. For evaluation, we restrict the logits to a feasible set of classes for the target dataset. (This prevents misclassification as out-of-domain species.) For each dataset, we select 5s segments for evaluation with a peak-detector, and associate ground-truth labels with each audio segment. We evaluate classifier performance on these windowed segments.
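
Restricting the logits to a feasible set of classes amounts to masking out all other species before scoring. A minimal sketch of the idea (the species codes and class list here are hypothetical, not the model's actual label set):

import numpy as np

# Hypothetical class list for the model, and the subset of species
# known to occur at the evaluation site.
model_classes = np.array(['amecro', 'amerob', 'bkcchi', 'norcar'])
feasible = {'amecro', 'bkcchi'}

# Fake (num_segments, num_classes) logits standing in for model outputs.
logits = np.random.randn(8, len(model_classes))

# Mask out-of-domain species so they can never be predicted.
mask = np.array([c in feasible for c in model_classes])
restricted_logits = np.where(mask, logits, -np.inf)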

Note that all of these metrics are threshold-free: they do not depend on any choice of score threshold, so the uncalibrated logits can be used directly.

We have found that the model is sensitive to the normalization of the input audio. We recommend peak-normalizing each 5s segment to 0.25, as we have done in running the following evaluations. This normalization is handled automatically by the Perch library.
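
If you prepare audio outside the library, the normalization is simple to reproduce. A minimal sketch (not the library's exact implementation):

import numpy as np

def peak_normalize(waveform: np.ndarray, peak: float = 0.25) -> np.ndarray:
    """Scale the waveform so its maximum absolute sample equals `peak`."""
    max_abs = np.max(np.abs(waveform))
    if max_abs == 0.0:
        # Leave silent segments untouched to avoid dividing by zero.
        return waveform
    return waveform * (peak / max_abs)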


Metrics


  1. MAP is the mean average precision, computed per example and then averaged over examples. It is sensitive to class imbalance.
  2. cMAP5 computes the mAP of each class independently over the entire dataset, and then averages over classes. Classes with fewer than five examples in the evaluation set are excluded, to reduce metric noise. (A sketch of this computation follows the list.)
  3. Top-1 accuracy computes the accuracy of the highest logit, per example. (Note that this provides little insight into examples with multiple vocalizations.)
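
As a concrete illustration of cMAP5, here is a rough sketch using scikit-learn's average_precision_score (the function name cmap5 and the array layout are our own conventions, not the evaluation code's actual API):

import numpy as np
from sklearn.metrics import average_precision_score

def cmap5(labels: np.ndarray, scores: np.ndarray, min_examples: int = 5) -> float:
    """Class-averaged mAP over classes with at least `min_examples` positives.

    labels: (num_segments, num_classes) binary ground truth.
    scores: (num_segments, num_classes) logits or probabilities.
    """
    # Exclude classes with too few positives, to reduce metric noise.
    keep = np.where(labels.sum(axis=0) >= min_examples)[0]
    per_class_ap = [average_precision_score(labels[:, c], scores[:, c]) for c in keep]
    return float(np.mean(per_class_ap))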


Datasets


  1. Caples is an unreleased dataset collected by the California Academy of Sciences at the Caples Creek area in the central Californian Sierra Nevadas. Work is underway to open-source this dataset.
  2. High Sierras is an unreleased dataset of birds from high altitudes in the Sierra Nevadas in California, previously used as part of the test set for the Cornell Birdcall Identification challenge. Recordings typically contain sparse vocalizations with very low SNR due to wind noise. Work is underway to open-source this dataset.
  3. Sapsucker Woods (SSW) contains soundscape recordings from the Sapsucker Woods bird sanctuary in Ithaca, NY, USA.
  4. Sierra Nevada (Kahl et al., 2022b) contains soundscape recordings from the Sierra Nevadas in California, USA.
  5. Hawai’i (Navine et al., 2022) contains soundscape recordings from Hawai’i, USA. Many species, particularly endangered honeycreepers, are endemic to Hawai’i and many are under-represented in the Xeno-Canto training set.
  6. Powdermill (Chronister et al., 2021) contains high-activity dawn chorus recordings captured over four days in Pennsylvania, USA.
  7. Colombia is an unreleased dataset, previously used as part of the test set for the BirdCLEF 2019 competition.
  8. Peru is a dataset of recordings from the Amazon rainforest.


Model Description

The current version of the model uses an EfficientNet-B1 architecture.

The model is trained on Xeno-Canto recordings. We have excluded recordings with a no-derivative license, and recordings for species on the IUCN Red List.


License

Copyright 2023 Google, LLC

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.


Example using TFHub Lib


import numpy as np
import tensorflow_hub as hub

# Load the model.
model = hub.load('/models/google/bird-vocalization-classifier/TensorFlow2/bird-vocalization-classifier/8')

# Input: 5 seconds of silence as mono 32 kHz waveform samples.
waveform = np.zeros(5 * 32000, dtype=np.float32)

# Run the model, check the output.
logits, embeddings = model.infer_tf(waveform[np.newaxis, :])
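
The model consumes fixed-size 5s windows, so longer recordings must be framed before inference. A minimal sketch of framing plus the recommended peak normalization (the non-overlapping windowing here is an assumption for illustration, not the library's exact pipeline):

sample_rate = 32000
window = 5 * sample_rate

# Hypothetical longer recording: 60 seconds of noise stands in for real audio.
audio = np.random.randn(60 * sample_rate).astype(np.float32)

# Frame into non-overlapping 5s windows.
num_windows = len(audio) // window
frames = audio[: num_windows * window].reshape(num_windows, window)

# Peak-normalize each window to 0.25, as recommended above.
peaks = np.max(np.abs(frames), axis=1, keepdims=True)
frames = frames * (0.25 / np.maximum(peaks, 1e-8))

# Batched inference (the SavedModel supports batching as of version 1.2).
logits, embeddings = model.infer_tf(frames)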


Example using Chirp lib


import numpy as np
from chirp.inference import models

# Input: 5 seconds of silence as mono 32 kHz waveform samples.
waveform = np.zeros(5 * 32000, dtype=np.float32)

# Args: path to the SavedModel, then window size and hop size in seconds.
model = models.TaxonomyModelTF(SAVED_MODEL_PATH, 5.0, 5.0)
outputs = model.embed(waveform)
# Do something with outputs.embeddings and outputs.logits['label'].


Changelog


  1. 1.4 - (version 5 on tfhub) The model now outputs a dictionary, including the melspectrogram (useful for debugging) and additional taxonomic outputs for bird genus, family, and order (see the sketch after this list). The underlying checkpoint is identical to versions 1.1, 1.2, and 1.3, so all embeddings should match.
  2. 1.3 - (version 4 on tfhub) Version 1.2 did not actually include the batched SavedModel; this release is otherwise identical but adds it.
  3. 1.2 - (version 3 on tfhub) Exact same model as 1.1, but with batch processing enabled, which gives a ~2x per-example speedup on CPU and a ~3x speedup on a 3090 GPU. We have checked that the model outputs match exactly on various examples, so there is no need to re-run the model to take advantage of the speedups. We also updated the reported metrics, switching from peak detection to sliding windows for evaluation; because there are no other model changes in this release, comparing against the previous release's metrics shows the impact of this change (mostly the reported scores went down a bit). We additionally now report class-averaged ROC-AUC metrics for each eval dataset, after finding that cMAP is subtly sensitive to the positive/negative balance within each class and can be strongly degraded by missing positive ground-truth labels.
  4. 1.1 - (version 2 on tfhub) Updated spectrogram ops, reducing kernel size dramatically. This yields a ~2x speedup on CPU, with no impact on model quality. Replaced some unpublished datasets (Colombia, High Sierras) with their newly published versions, and updated the model stats table. Note that some stats which appear to decrease relative to the previous version were affected by the switch to the published dataset versions. In these cases (like SSW), we re-ran evaluation for both the new and old models and confirmed there were minimal changes.
  5. 1.0 - Initial release.
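
For the dictionary outputs introduced in 1.4, here is a hedged sketch of reading the taxonomic heads. The key names below are assumptions based on the description above, not a confirmed API; check the released documentation for the authoritative names, and note that pre-1.4 versions return a (logits, embeddings) tuple as in the example further up.

# Hypothetical key names for the 1.4+ dictionary outputs.
outputs = model.infer_tf(waveform[np.newaxis, :])
species_logits = outputs['label']   # per-species logits
genus_logits = outputs['genus']     # taxonomic roll-ups
family_logits = outputs['family']
order_logits = outputs['order']
melspec = outputs['frontend']       # melspectrogram, useful for debugging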



