---
pretty_name: Perch
license: apache-2.0
tags:
- audio
- bird
- nature
- science
- vocalization
- bio
- birds-classification
- bioacoustics
base_model:
- cgeorgiaw/Perch
---
# Perch
TFLite and manually optimized ONNX versions of the Perch v2 model.
Source: https://www.kaggle.com/models/google/bird-vocalization-classifier/
- `perch_v2_no_dft.onnx`: ONNX model with the DFT node converted to MatMul using `scripts/convert_dft_to_matmul.py` for additional speedup. It is slightly less accurate: its outputs match the TFLite model within a tolerance of 2e-4, versus 1e-5 for `perch_v2.onnx`.
- `perch_v2.onnx`: The converted ONNX model.
- `perch_v2.tflite`: The TFLite model.
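To illustrate the idea behind the `perch_v2_no_dft.onnx` variant, the following is a minimal NumPy sketch of how a real DFT can be written as two matrix multiplications. It is not the contents of `scripts/convert_dft_to_matmul.py`; the function names and the frame length of 512 are hypothetical and chosen only for the demonstration.

```python
import numpy as np


def real_dft_matrices(n_fft: int) -> tuple[np.ndarray, np.ndarray]:
    """Build float32 matrices so that `frames @ cos_m` and `frames @ sin_m`
    give the real and imaginary parts of the one-sided DFT."""
    n = np.arange(n_fft)[:, None]            # time index, shape (n_fft, 1)
    k = np.arange(n_fft // 2 + 1)[None, :]   # frequency bin, shape (1, n_fft // 2 + 1)
    angle = 2.0 * np.pi * n * k / n_fft
    # X[k] = sum_n x[n] * (cos(angle) - i * sin(angle))
    return np.cos(angle).astype(np.float32), (-np.sin(angle)).astype(np.float32)


def dft_via_matmul(frames: np.ndarray, cos_m: np.ndarray, sin_m: np.ndarray):
    """Apply the DFT to each row of `frames` using plain MatMul operations."""
    return frames @ cos_m, frames @ sin_m


# Hypothetical frame length; the value used inside the Perch graph is not stated here.
N_FFT = 512
rng = np.random.default_rng(0)
frames = rng.standard_normal((4, N_FFT)).astype(np.float32)

cos_m, sin_m = real_dft_matrices(N_FFT)
real, imag = dft_via_matmul(frames, cos_m, sin_m)

# Compare against NumPy's FFT computed in float64.
reference = np.fft.rfft(frames.astype(np.float64), axis=-1)
max_err = max(np.abs(real - reference.real).max(), np.abs(imag - reference.imag).max())
print(f"max abs error vs np.fft.rfft: {max_err:.2e}")
```

A float32 MatMul accumulates rounding error differently from an FFT, which is consistent with the looser tolerance noted for this variant, although the card does not state the exact cause.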
## Model information
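The ONNX input reported in the log below has shape `['batch', 160000]` and type float32. As a minimal sketch of preparing such a batch, the helper below splits a mono waveform into fixed-length windows and zero-pads the last one. The helper name `frame_audio` is hypothetical, and the choice of non-overlapping windows with zero padding is an assumption, not something this card prescribes.

```python
import numpy as np

# Taken from the model's input shape ['batch', 160000].
WINDOW_SAMPLES = 160_000


def frame_audio(waveform: np.ndarray, window: int = WINDOW_SAMPLES) -> np.ndarray:
    """Split a mono waveform into non-overlapping windows of `window` samples,
    zero-padding the final window. Always returns at least one window."""
    waveform = np.asarray(waveform, dtype=np.float32).ravel()
    n_windows = max(1, -(-waveform.size // window))  # ceiling division
    padded = np.zeros(n_windows * window, dtype=np.float32)
    padded[: waveform.size] = waveform
    return padded.reshape(n_windows, window)


# Random data stands in for real audio in this demonstration.
batch = frame_audio(np.random.default_rng(0).standard_normal(400_000))
print(batch.shape, batch.dtype)  # (3, 160000) float32
```

With ONNX Runtime, a batch like this could be passed as `session.run(None, {"inputs": batch})`, where `session` is an `onnxruntime.InferenceSession` loaded from one of the `.onnx` files. The TFLite model reports a fixed batch size of 1, so windows would be fed one at a time there. The sample rate expected by the model is not stated in this card; 160000 samples would correspond to 5 s at 32 kHz, which should be confirmed against the upstream model documentation.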
```
ONNX Model Information:
Inputs:
- Name: inputs, Shape: ['batch', 160000], Type: tensor(float)
Outputs:
- Name: embedding, Shape: ['batch', 1536], Type: tensor(float)
- Name: spatial_embedding, Shape: ['batch', 16, 4, 1536], Type: tensor(float)
- Name: spectrogram, Shape: ['batch', 500, 128], Type: tensor(float)
- Name: label, Shape: ['batch', 14795], Type: tensor(float)
TFLite Model Information:
Inputs:
- Name: serving_default_inputs:0, Shape: [ 1 160000], Type: <class 'numpy.float32'>
Outputs:
- Name: StatefulPartitionedCall:0, Shape: [ 1 1536], Type: <class 'numpy.float32'>
- Name: StatefulPartitionedCall:2, Shape: [ 1 16 4 1536], Type: <class 'numpy.float32'>
- Name: StatefulPartitionedCall:3, Shape: [ 1 500 128], Type: <class 'numpy.float32'>
- Name: StatefulPartitionedCall:1, Shape: [ 1 14795], Type: <class 'numpy.float32'>
Generating random inputs:
- inputs: shape=(1, 160000), dtype=float32
Running ONNX model inference...
Running TFLite model inference...
================================================================================
COMPARISON RESULTS
================================================================================
Output 0:
ONNX Runtime shape: (1, 1536), dtype: float32
TFLite shape: (1, 1536), dtype: float32
ONNX Runtime vs TFLite:
Max difference: 0.0000007208
Mean difference: 0.0000001543
Relative tolerance: 1e-05
Absolute tolerance: 1e-05
✅ Outputs match within tolerance
Output 1:
ONNX Runtime shape: (1, 16, 4, 1536), dtype: float32
TFLite shape: (1, 16, 4, 1536), dtype: float32
ONNX Runtime vs TFLite:
Max difference: 0.0000131130
Mean difference: 0.0000005482
Relative tolerance: 1e-05
Absolute tolerance: 1e-05
✅ Outputs match within tolerance
Output 2:
ONNX Runtime shape: (1, 500, 128), dtype: float32
TFLite shape: (1, 500, 128), dtype: float32
ONNX Runtime vs TFLite:
Max difference: 0.0000005960
Mean difference: 0.0000000100
Relative tolerance: 1e-05
Absolute tolerance: 1e-05
✅ Outputs match within tolerance
Output 3:
ONNX Runtime shape: (1, 14795), dtype: float32
TFLite shape: (1, 14795), dtype: float32
ONNX Runtime vs TFLite:
Max difference: 0.0000152588
Mean difference: 0.0000014861
Relative tolerance: 1e-05
Absolute tolerance: 1e-05
✅ Outputs match within tolerance
================================================================================
✅ ALL OUTPUTS MATCH!
================================================================================
Benchmarking ONNX model (10 warmup + 100 test runs)...
Benchmarking TFLite model (10 warmup + 100 test runs)...
================================================================================
BENCHMARK RESULTS
================================================================================
ONNX Model:
Mean: 66.350 ms
Median: 66.339 ms
Std: 2.160 ms
Min: 61.801 ms
Max: 74.614 ms
TFLite Model:
Mean: 608.777 ms
Median: 606.753 ms
Std: 11.304 ms
Min: 602.735 ms
Max: 684.807 ms
Comparison:
ONNX Runtime is 9.18x faster than TFLite
Difference: 542.427 ms
================================================================================
```
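The comparison and benchmark steps in the log above can be sketched roughly as follows. This is not the script that produced the log: the function names are hypothetical, and it assumes the tolerance check follows `numpy.allclose` semantics, where a difference passes if it is at most `atol + rtol * |reference|`. Under that assumption, a maximum difference slightly above 1e-5 (as for outputs 1 and 3) can still pass when the reference values are large enough.

```python
import statistics
import time

import numpy as np


def compare_outputs(candidate, reference, rtol=1e-5, atol=1e-5):
    """Report max/mean absolute difference and an allclose-style match flag."""
    diff = np.abs(candidate.astype(np.float64) - reference.astype(np.float64))
    return {
        "max_diff": float(diff.max()),
        "mean_diff": float(diff.mean()),
        # Passes where |candidate - reference| <= atol + rtol * |reference|.
        "match": bool(np.allclose(candidate, reference, rtol=rtol, atol=atol)),
    }


def benchmark(fn, warmup=10, runs=100):
    """Time `fn` in milliseconds after discarding `warmup` untimed calls."""
    for _ in range(warmup):
        fn()
    times_ms = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        times_ms.append((time.perf_counter() - start) * 1e3)
    return {
        "mean": statistics.fmean(times_ms),
        "median": statistics.median(times_ms),
        "std": statistics.pstdev(times_ms),
        "min": min(times_ms),
        "max": max(times_ms),
    }


# Synthetic stand-ins for two runtimes' outputs: the difference exceeds atol,
# but the relative term rtol * |reference| makes the check pass.
reference = np.array([2.0, 4.0], dtype=np.float32)
candidate = reference + np.float32(1.3e-5)
result = compare_outputs(candidate, reference)
print(result)

# A trivial workload stands in for a model's inference call.
stats = benchmark(lambda: np.dot(reference, reference), warmup=2, runs=20)
print({name: round(value, 4) for name, value in stats.items()})
```

In practice `fn` would wrap a single inference call for each runtime, and the speedup would be the ratio of the two mean times.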