Evaluating Map Stability Across Repeated Entry-Light Surveys
How Much of an Entry-Light Survey Swing Is Real
USFS dual-frame sampling research on 82 bat-hibernaculum structures with five or more repeat counts documented up to 22 percent annual fluctuation in sites where the population was known to be stable. Some of that is real year-to-year variation in how many bats are present; much of it is an artifact of method. The 2015 PLOS ONE paper on the efficacy of visual surveys for white-nose syndrome (WNS) response showed that visual surveys have known detection limits, especially for small scattered clusters on complex karst ceilings.
The Journal of Fish and Wildlife Management review of long-term opportunistic surveys at Lava Beds National Monument from 1991 through 2012 catalogs the sources of bias: observer rotation, lamp-angle drift, ceiling-landmark choice, and cluster-boundary judgment. Over two decades, those biases compound into a trend line that may not reflect Townsend's big-eared bat (Corynorhinus townsendii) population reality. Rodhouse and colleagues' 2019 Ecology and Evolution paper addresses this with Bayesian occupancy models that handle repeat-survey uncertainty explicitly. An improved analysis of long-term monitoring, published in PMC, emphasizes that detecting marked regional declines requires careful handling of the repeat-survey structure.
Biologists already know this. The practical question is how to evaluate whether a given site's multi-year trajectory is signal or noise.
An EchoQuilt Stability Evaluation Against Entry-Light Surveys
EchoQuilt's answer is to register every winter's quilt to the same ceiling landmark set, then compute three stability metrics against each year's limited entry-light survey. The metrics compare the passive-acoustic reconstruction with what a biologist saw during a single winter visit.
The first metric is ceiling-landmark registration error. We log the 3D offset between each winter's reconstructed stalactite positions and the reference stalactite positions from the initial deployment. An offset under 3 centimeters per landmark means the sensor array has not drifted and the quilt's geometry is directly comparable to prior years.
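As a rough illustration of the registration check, the sketch below computes a per-landmark 3D offset and applies the 3 cm limit from the text. The function names, the landmark IDs, and the coordinates are hypothetical; it assumes both surveys are already expressed in the same cave coordinate frame, in meters.

```python
import math

THRESHOLD_CM = 3.0  # per-landmark limit quoted in the text


def landmark_offsets_cm(reference, current):
    """Euclidean 3D offset per landmark, in centimeters.

    Both arguments map a landmark ID to an (x, y, z) position in meters.
    """
    return {
        landmark_id: math.dist(ref_xyz, current[landmark_id]) * 100.0
        for landmark_id, ref_xyz in reference.items()
    }


def registration_report(reference, current, threshold_cm=THRESHOLD_CM):
    """Summarize drift and list landmarks at or over the threshold."""
    offsets = landmark_offsets_cm(reference, current)
    drifted = sorted(k for k, v in offsets.items() if v >= threshold_cm)
    return {
        "mean_cm": sum(offsets.values()) / len(offsets),
        "max_cm": max(offsets.values()),
        "drifted": drifted,
        "passes": not drifted,
    }


# Illustrative positions only (meters); not measurements from any site.
reference = {"S1": (0.0, 0.0, 2.0), "S2": (1.5, 0.4, 2.2), "S3": (3.0, 1.0, 1.9)}
winter = {"S1": (0.01, 0.0, 2.0), "S2": (1.5, 0.42, 2.2), "S3": (3.0, 1.0, 1.95)}
report = registration_report(reference, winter)
print(report["drifted"], round(report["mean_cm"], 2))
```

Reporting the drifted landmarks by ID, rather than only a mean, keeps one badly displaced stalactite from hiding behind several well-registered ones.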
The second is cluster-boundary agreement. The EchoQuilt cluster outline — the 95 percent occupancy contour derived from wing-beat and stirring density — gets compared against the biologist's hand-drawn outline from the entry-light survey. Digital photography work on hibernacula counts found photo-based cluster outlines varied under 1.5 percent between observers, while visual estimates without a reference swung 76 to 142 percent. The quilt's 3D reconstruction plays the role photographs play in 2D — a durable artifact that can be re-measured.
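The text does not say how boundary agreement is scored. One plausible choice, shown here purely as an assumption, is intersection over union after both outlines have been rasterized onto a shared ceiling grid. The function name and the grid cells are hypothetical.

```python
def boundary_agreement(quilt_cells, survey_cells):
    """Intersection over union of two sets of occupied ceiling-grid cells.

    Each cell is a (row, col) tuple on a grid shared by both outlines.
    Returns a value in [0, 1]; 1.0 means the outlines coincide.
    """
    quilt_cells, survey_cells = set(quilt_cells), set(survey_cells)
    union = quilt_cells | survey_cells
    if not union:
        return 1.0  # both outlines empty: trivially in agreement
    return len(quilt_cells & survey_cells) / len(union)


# Illustrative outlines: two 4x5 patches offset by one column.
quilt = {(r, c) for r in range(4) for c in range(0, 5)}
survey = {(r, c) for r in range(4) for c in range(1, 6)}
print(round(boundary_agreement(quilt, survey), 3))
```

A grid-based score depends on cell size, so the grid resolution would need to be fixed across years for the numbers to be comparable.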
The third metric is cluster-count stability. Here we compare EchoQuilt's density-derived count against the entry-light visual count. When the two agree within 10 percent, the quilt is trustworthy for year-over-year comparison. When they diverge by more, the failure mode is almost always visual: cryptic clusters on bedding-plane overhangs that the observer could not see from the floor, or a visual over-count of a loose scatter that the acoustic reconstruction reveals is actually a single small cluster.
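The count check reduces to a relative difference against a tolerance. The sketch below assumes the visual count is the denominator, which the text does not specify, and uses made-up counts.

```python
def count_deviation(acoustic_count, visual_count):
    """Relative difference between counts, relative to the visual count."""
    if visual_count <= 0:
        raise ValueError("visual count must be positive")
    return abs(acoustic_count - visual_count) / visual_count


def counts_agree(acoustic_count, visual_count, tolerance=0.10):
    """True when the two counts fall within the agreement tolerance."""
    return count_deviation(acoustic_count, visual_count) <= tolerance


# Illustrative counts only.
print(counts_agree(214, 200))  # deviation 0.07
print(counts_agree(150, 200))  # deviation 0.25
```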
The quilt metaphor matters to the evaluation. Each winter's reconstruction is a patchwork of acoustic arrivals stitched along ceiling seams, and the stability question is whether the seams still line up year over year. A quilt whose seams shift 10 centimeters between winters tells you the sensors moved, not that the cave did. Anchoring discipline is what makes stability defensible, and it is the same discipline behind multi-year archives — you cannot stitch across decades if each winter's patch floats in its own coordinate frame.
This same framing applies to planetary analog work. Map-fidelity evaluation under flight-like power constraints treats registration error as a first-class metric for exactly the same reason: without it, the comparison between timepoints is a comparison of rigs, not terrain.

The mockup shows three side-by-side panels — January 2022, January 2023, January 2024 — for the same hibernaculum. Each panel overlays the EchoQuilt ceiling reconstruction (gray wireframe) with that winter's entry-light cluster outlines (colored contours). The bottom strip reports per-year registration error, cluster-boundary agreement, and count deviation. Across the three years, the reconstruction held to under 2.8 cm mean landmark drift, and cluster counts agreed with entry-light surveys within 7 percent.
Advanced Tactics for Stability Evaluation
Nine tactics sharpen the evaluation. First, deploy a subset of fixed reference nodes, three to five per hibernaculum, that stay in place for multi-year campaigns rather than being removed and redeployed. Multi-year continuity is the single largest lever on registration stability. Second, schedule entry-light surveys in the same phenological window each year (first two weeks of February, for instance); bat positional variance is smaller inside a fixed window than across the whole winter, and the stability evaluation tightens as a result. Third, compute registration error at every redeployment, not just at the end of a field season; a node that drifts at the September deployment is a node you want to reseat before December, not flag in April.
The non-intrusive trends piece of this stack is directly connected: as entries get rarer, stability evaluation has to carry more of the weight of trend defensibility. A quilt that has been shown to register within 3 cm year over year holds up to federal review even when the biologist visits once per winter instead of twice.
Fourth, build a per-cluster stability dashboard that shows registration error, boundary agreement, and count deviation as a 5-year time series rather than as a single-year snapshot. The trend in stability metrics matters as much as the absolute values: a site whose registration error has crept from 1.8 cm to 4.2 cm over five winters was on a degradation arc that called for intervention well before it crossed the 3 cm threshold, even though each of the early years would have passed in isolation. EchoQuilt's stability dashboard surfaces these trends with explicit early-warning thresholds.
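One way to turn that creep into an early warning is a least-squares slope over the winters seen so far. This is a sketch under stated assumptions: the 0.3 cm per winter slope limit and the function names are hypothetical, and only the 1.8 cm and 4.2 cm endpoints come from the text; the intermediate values are invented for illustration.

```python
def drift_slope(errors_cm):
    """Ordinary least-squares slope of error against winter index (cm/winter)."""
    n = len(errors_cm)
    if n < 2:
        raise ValueError("need at least two winters")
    mean_x = (n - 1) / 2
    mean_y = sum(errors_cm) / n
    num = sum((i - mean_x) * (e - mean_y) for i, e in enumerate(errors_cm))
    den = sum((i - mean_x) ** 2 for i in range(n))
    return num / den


def first_warning(errors_cm, slope_limit=0.3):
    """Index of the first winter whose running slope exceeds the limit, or None."""
    for k in range(2, len(errors_cm) + 1):
        if drift_slope(errors_cm[:k]) > slope_limit:
            return k - 1
    return None


# Endpoints from the text; the middle three values are hypothetical.
series = [1.8, 2.4, 3.0, 3.6, 4.2]
print(round(drift_slope(series), 2), first_warning(series))
```

With this series the warning fires in the second winter, while the error is still 2.4 cm and under the 3 cm limit, which is the point of tracking the trend rather than the snapshot.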
Fifth, calibrate the count-stability metric per species. A 10 percent agreement threshold is appropriate for clustered Indiana bat populations where boundary detection is straightforward, but tri-colored bat (Perimyotis subflavus) clusters are smaller, more diffuse, and inherently harder to count even with photographs; a 15-20 percent agreement threshold is more realistic for tri-colored bat sites. EchoQuilt's stability evaluation surfaces species-specific agreement bands so the threshold reflects actual count difficulty rather than aspirational uniformity.
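Species-specific bands can be kept as a small lookup table. In the sketch below the band values come from the text, but the three-way classification (agree, marginal, diverge) is one interpretation of the 15-20 percent range, not a documented EchoQuilt behavior, and all names are hypothetical.

```python
# (lower, upper) agreement bands as fractions; values taken from the text.
AGREEMENT_BANDS = {
    "Myotis sodalis": (0.10, 0.10),        # Indiana bat: single 10% threshold
    "Perimyotis subflavus": (0.15, 0.20),  # tri-colored bat: 15-20% band
}
DEFAULT_BAND = (0.10, 0.10)  # assumption: fall back to the general threshold


def classify_count(species, acoustic_count, visual_count):
    """Classify count agreement against the species-specific band."""
    if visual_count <= 0:
        raise ValueError("visual count must be positive")
    deviation = abs(acoustic_count - visual_count) / visual_count
    lower, upper = AGREEMENT_BANDS.get(species, DEFAULT_BAND)
    if deviation <= lower:
        return "agree"
    if deviation <= upper:
        return "marginal"
    return "diverge"


# The same 12% deviation reads differently for the two species.
print(classify_count("Myotis sodalis", 112, 100))
print(classify_count("Perimyotis subflavus", 112, 100))
```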
Sixth, distinguish observer-drift error from sensor-drift error in the failure attribution. When the count deviation exceeds the threshold, the diagnostic question is whether the EchoQuilt reconstruction or the visual count is responsible. Cross-referencing against a paired photograph reference (when available at a single accessible cluster) attributes the deviation. EchoQuilt's evaluation report flags whether the failure mode is acoustic, visual, or both — which directly informs the corrective action.
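The attribution step can be sketched as a comparison of each count against the photograph reference at the paired cluster. The rule below is an assumed reading of the paragraph, not EchoQuilt's actual report logic; the function name, the 10 percent tolerance, and the counts are illustrative.

```python
def attribute_failure(acoustic, visual, photo=None, tolerance=0.10):
    """Attribute a count deviation using a paired photograph count.

    Returns "acoustic", "visual", "both", "neither", or "unattributed"
    when no usable photograph reference is available for the cluster.
    """
    if photo is None or photo <= 0:
        return "unattributed"
    acoustic_off = abs(acoustic - photo) / photo > tolerance
    visual_off = abs(visual - photo) / photo > tolerance
    if acoustic_off and visual_off:
        return "both"
    if acoustic_off:
        return "acoustic"
    if visual_off:
        return "visual"
    return "neither"


# Illustrative counts at one accessible cluster with a photo count of 100.
print(attribute_failure(acoustic=98, visual=140, photo=100))
print(attribute_failure(acoustic=70, visual=102, photo=100))
```

Returning "unattributed" when no photograph exists keeps the report from guessing at a cause it cannot support.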
Seventh, archive every stability evaluation report alongside the year's quilt as a versioned record. A 2024 cluster-count trend that cited 2023's stability report should be reproducible in 2027 from the same archived stability report, not from a re-run that may use updated metrics or thresholds. Versioned stability reports are the audit substrate that lets multi-year trend papers withstand peer review across the years between submission and acceptance.
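One simple way to make an archived report verifiable later is to store a content hash next to it. This is an assumption about how versioning could work, not a description of EchoQuilt's archive; the report fields and function names are hypothetical.

```python
import hashlib
import json


def freeze_report(report):
    """Serialize a report canonically and return (text, SHA-256 digest)."""
    # sort_keys makes the serialization independent of dict insertion order.
    canonical = json.dumps(report, sort_keys=True, separators=(",", ":"))
    digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()
    return canonical, digest


def verify_report(canonical, digest):
    """True when the archived text still matches its recorded digest."""
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest() == digest


# Hypothetical report fields and values, for illustration only.
report_2023 = {
    "winter": 2023,
    "registration_error_cm": 2.1,
    "boundary_agreement": 0.93,
    "count_deviation": 0.06,
    "metric_version": "v1",
}
text, digest = freeze_report(report_2023)
print(verify_report(text, digest))
```

A trend paper can then cite the digest, so a reader in a later year can confirm they are looking at the same report rather than a re-run under newer thresholds.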
Eighth, integrate stability evaluation into NABat data-submission workflow. NABat's current ingestion accepts cluster counts but does not surface their stability metrics. EchoQuilt's NABat export includes a stability sidecar so the receiving analyst sees not just the count but also its registration error, boundary agreement, and count deviation. The pooled multi-state trend models can then weight observations by stability rather than treating all counts as equally credible.
Ninth, schedule a multi-site stability audit as a state-level annual exercise, paralleling the USGS Bird Banding Lab band-recovery quality assurance review. A state coordinator reviews the per-site stability metrics across all 40 Priority 1 hibernacula in one annual pass, identifying sites whose stability has degraded and triaging them for next winter's redeployment priority list.
Evaluate Your Hibernaculum's Quilt Stability Before You Publish Trends
State DNR crews and NABat partners with multi-year EchoQuilt deployments should not publish cluster-count trend lines until the stability evaluation has closed. We will walk you through the three-metric protocol — registration error, cluster-boundary agreement, count deviation — against your own site's entry-light record, so the trend you publish is population change rather than observer drift. The protocol scales from a single sentinel hibernaculum to a 40-site state network with the same metric definitions and reporting templates, so a state coordinator can deploy stability evaluation as a standardized program-wide quality assurance practice. Sites that pass the stability evaluation become the trend-defensible core of your WNS response, while sites that fail go onto a redeployment priority list with explicit corrective actions documented for each failure mode.
Join the Waitlist for Hibernacula Biologists and bring a site with at least two winters of quilt data — we will turn around a stability report you can attach to your next WNS response submission, with explicit confidence statements that anticipate the questions a federal reviewer or a peer-reviewer will ask. The stability evaluation is the foundation that subsequent population trend analyses, gate-placement decisions, and disturbance-budget allocations all depend on, so investing one winter cycle in the evaluation pays dividends across every downstream use of the quilt data.