Hi, I am using the yamnet.tflite file from the repository for testing in Python. The input audio frame size should be 15600 samples, which corresponds to about 0.975 seconds at 16 kHz. However, with that frame length the VAD detection time is too long. Could you explain how the Valid Frame Sizes (243, 487, 731, 975) you provided are processed?
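
For reference, here is the arithmetic behind my numbers, as a minimal sketch. I am assuming a 16 kHz sample rate and that the Valid Frame Sizes are given in milliseconds; the second assumption is only my guess (975 ms happens to match 15600 samples), so please correct me if the unit is different. The helper names are my own and not from the repository.

```python
# Assumption: the model expects 16 kHz mono audio.
SAMPLE_RATE_HZ = 16_000

# Assumption (my guess, not confirmed): the Valid Frame Sizes are in milliseconds.
VALID_FRAME_SIZES_MS = [243, 487, 731, 975]


def samples_to_seconds(num_samples: int) -> float:
    """Duration in seconds of a frame with the given number of samples."""
    return num_samples / SAMPLE_RATE_HZ


def ms_to_samples(duration_ms: int) -> int:
    """Number of samples in a frame of the given duration in milliseconds."""
    return round(duration_ms * SAMPLE_RATE_HZ / 1000)


if __name__ == "__main__":
    print(f"15600 samples = {samples_to_seconds(15600)} s")
    for ms in VALID_FRAME_SIZES_MS:
        print(f"{ms} ms = {ms_to_samples(ms)} samples")
```

If the milliseconds reading is right, the largest size (975) is the same 15600-sample frame I am already feeding in, and the smaller sizes would be shorter frames.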