How is a “detection” defined by BirdNET?

The system records for 15 seconds, then slices that recording into five intervals of three seconds each. Each three-second interval is analyzed, yielding a species name and a confidence value for that species. A confidence value exceeding the default threshold of 0.7 qualifies as a “detection”. The detection with the highest confidence value is recorded. A spectrogram and the corresponding recording of the detection are saved for later verification. While the system processes one 15-second recording, it is already recording the next.
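The selection logic described above can be sketched in a few lines of Python. This is only an illustration of the description, not BirdNET's actual code: the names `Prediction`, `interval_starts`, and `pick_detection` are hypothetical, the example species and confidence values are made up, and it assumes that "highest confidence" means the highest among the five intervals of one recording and that a value exactly at the threshold does not count.

```python
from dataclasses import dataclass
from typing import List, Optional

RECORDING_SECONDS = 15   # length of one recording, per the description
INTERVAL_SECONDS = 3     # length of each analyzed slice
DEFAULT_THRESHOLD = 0.7  # default confidence threshold


@dataclass
class Prediction:
    """Hypothetical container for the result of analyzing one interval."""
    start_s: int        # offset of the interval within the recording
    species: str        # predicted species name
    confidence: float   # confidence value for that species


def interval_starts(recording_s: int = RECORDING_SECONDS,
                    interval_s: int = INTERVAL_SECONDS) -> List[int]:
    """Start offsets of the slices: 0, 3, 6, 9, 12 for the defaults."""
    return list(range(0, recording_s, interval_s))


def pick_detection(predictions: List[Prediction],
                   threshold: float = DEFAULT_THRESHOLD) -> Optional[Prediction]:
    """Return the highest-confidence prediction above the threshold, if any."""
    # Assumption: "exceeding" is read as strictly greater than the threshold.
    qualifying = [p for p in predictions if p.confidence > threshold]
    if not qualifying:
        return None  # nothing recorded for this recording
    return max(qualifying, key=lambda p: p.confidence)


# Made-up predictions for one 15-second recording.
example = [
    Prediction(0, "Species A", 0.55),
    Prediction(3, "Species B", 0.81),
    Prediction(6, "Species B", 0.93),
    Prediction(9, "Species C", 0.70),
    Prediction(12, "Species A", 0.20),
]
print(interval_starts())
print(pick_detection(example))
```

With these made-up values, only the intervals starting at 3 and 6 seconds clear the threshold, and the one at 6 seconds is kept.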

Adjusting the knobs

Several factors influence how well this process works. Distance from the device has a major effect on performance. More distant birds may be identified correctly but receive a lower confidence value, and so fall short of the threshold. I haven’t seen a case where this happened and another call scored a higher confidence value instead; there simply was no detection recorded (a false negative).

The quality and type of mic influence performance as well. I initially used a “shotgun” mic and later acquired an “omnidirectional” mic. When I tested the two with recorded calls, the shotgun mic produced higher confidence values even when facing away from the speakers. The omnidirectional mic may have identified the species correctly, but with confidence values well below the threshold.

Site selection also comes into play. Siting the device near brush may improve detection of some groups of birds, but if grassland birds are of interest, brush would confound the results.

What is the appropriate threshold value?

Adjusting the threshold level may compensate for the above issues in a particular setting, but I’ve yet to investigate that.
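One way to start that investigation would be to count how many interval results clear the threshold at several candidate values. The sketch below is hypothetical: `count_detections` and `sweep` are illustrative names, and the confidence values are made up rather than taken from any real recording. It again assumes a value must be strictly greater than the threshold to count.

```python
from typing import Dict, Iterable, List


def count_detections(confidences: Iterable[float], threshold: float) -> int:
    """Count the confidence values that exceed the given threshold."""
    return sum(1 for c in confidences if c > threshold)


def sweep(confidences: List[float],
          thresholds: Iterable[float]) -> Dict[float, int]:
    """Map each candidate threshold to the number of qualifying values."""
    return {t: count_detections(confidences, t) for t in thresholds}


# Made-up confidence values, e.g. from calls played back at a distance.
sample = [0.92, 0.74, 0.66, 0.55, 0.31]

for threshold, count in sweep(sample, [0.5, 0.6, 0.7]).items():
    print(f"threshold {threshold}: {count} detections")
```

A count like this only shows how many results pass, not whether they are correct, so any lowered threshold would still need to be checked against the saved spectrograms and recordings.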