Robots deployed in real-world environments, such as homes, must not only navigate safely but also understand their surroundings and adapt to changes in the environment. To perform tasks efficiently, they must build and maintain a semantic map that accurately reflects the current state of the environment. Existing research on semantic exploration largely focuses on static scenes without persistent object-level instance tracking. In this work, we propose an open-vocabulary semantic exploration system for semi-static environments. Our system maintains a consistent map by building a probabilistic model of object instance stationarity, systematically tracking semi-static changes, and actively exploring areas that have not been visited for an extended period. In addition to active map maintenance, our approach leverages the map's semantic richness with large language model (LLM)-based reasoning for open-vocabulary object-goal navigation. This enables the robot to search more efficiently by prioritizing contextually relevant areas. We compare our approach against state-of-the-art baselines using publicly available object navigation and mapping datasets, and we further demonstrate transferability in three real-world environments. Our approach outperforms the compared baselines in both success rate and search efficiency for object-navigation tasks and handles changes more reliably when mapping semi-static environments. In real-world experiments, our system detects 95% of map changes on average, improving efficiency by more than 29% compared to random and patrol strategies.
We extract object candidates from the current pose and RGB-D frame (green). These are associated with objects in the semantic map, which is updated based on a probabilistic consistency estimate (red). Based on the scene belief, we build a semantic exploration priority map that indicates which map regions are relevant to the current task, either maintaining an up-to-date map or object-goal navigation (orange). Finally, the robot uses the priority map to select and navigate to sampled positions (blue).
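The update-then-prioritize loop described above can be illustrated with a minimal sketch. This is not the system's actual model: the binary Bayes update, the `Region` structure, the detection likelihoods `p_hit` and `p_false`, and the `time_scale` constant are all assumptions chosen for illustration. It shows one plausible way to turn a stationarity belief and the time since a region was last visited into sampling weights for the next navigation goal.

```python
import math
import random
from dataclasses import dataclass


def update_stationarity(prior: float, seen_in_place: bool,
                        p_hit: float = 0.9, p_false: float = 0.2) -> float:
    """Binary Bayes update of the belief that an object is still where the map says.

    p_hit and p_false are assumed detection likelihoods, not values from the paper.
    """
    like_stationary = p_hit if seen_in_place else 1.0 - p_hit
    like_moved = p_false if seen_in_place else 1.0 - p_false
    evidence = like_stationary * prior + like_moved * (1.0 - prior)
    return like_stationary * prior / evidence


@dataclass
class Region:
    """Hypothetical map region tracked by the explorer."""
    name: str
    seconds_since_visit: float  # time since the robot last observed this region
    p_changed: float            # belief that an object here has moved, in [0, 1]


def region_priority(region: Region, time_scale: float = 3600.0) -> float:
    """Priority is high if the region is stale OR likely to have changed."""
    staleness = 1.0 - math.exp(-region.seconds_since_visit / time_scale)
    return 1.0 - (1.0 - staleness) * (1.0 - region.p_changed)


def sample_goal(regions: list, rng: random.Random) -> Region:
    """Sample the next region to visit with probability proportional to priority."""
    weights = [region_priority(r) for r in regions]
    if sum(weights) == 0.0:
        # Nothing is stale or suspicious: fall back to a uniform choice.
        return rng.choice(regions)
    return rng.choices(regions, weights=weights, k=1)[0]


if __name__ == "__main__":
    p_stationary = update_stationarity(0.5, seen_in_place=False)
    regions = [
        Region("kitchen", seconds_since_visit=7200.0, p_changed=1.0 - p_stationary),
        Region("hallway", seconds_since_visit=60.0, p_changed=0.05),
    ]
    print(sample_goal(regions, random.Random(0)).name)
```

Sampling rather than always picking the highest-priority region keeps some exploration of low-priority areas; a greedy `max` over the same weights would be an equally valid design under these assumptions.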
We curate 60 object search tasks using scenes from the InteriorAgent dataset. The tasks are divided into three categories based on how the target object has changed since the initial mapping phase: (1) the target object has not changed, (2) the target object was not present during initial mapping, and (3) the target object has changed location.
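As a minimal sketch of how the three categories could be assigned, the snippet below labels a task from the target's position during initial mapping and its position at search time. The `TaskCategory` names, the `categorize` helper, and the distance tolerance `tol` are hypothetical and are not taken from the released code.

```python
import math
from enum import Enum
from typing import Optional, Tuple

Position = Tuple[float, float, float]


class TaskCategory(Enum):
    UNCHANGED = 1    # (1) target object has not changed
    NEWLY_ADDED = 2  # (2) target object was not present during initial mapping
    RELOCATED = 3    # (3) target object has changed location


def categorize(initial: Optional[Position], current: Position,
               tol: float = 0.05) -> TaskCategory:
    """Label a search task; `initial` is None if the object was absent during mapping.

    `tol` (in the same units as the positions) is an assumed threshold below
    which a displacement is treated as no change.
    """
    if initial is None:
        return TaskCategory.NEWLY_ADDED
    if math.dist(initial, current) > tol:
        return TaskCategory.RELOCATED
    return TaskCategory.UNCHANGED


if __name__ == "__main__":
    print(categorize((1.0, 2.0, 0.8), (1.0, 2.0, 0.8)))
    print(categorize(None, (0.5, 0.5, 0.9)))
    print(categorize((1.0, 2.0, 0.8), (3.0, 0.5, 0.8)))
```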
[Result image grids; images not recovered. Each of the three grids compares Ours, DynaMem, and Random-Navigation (one row per method) on 20 search tasks (one column per target object).]

| Grid | Scenes | Target objects (in column order) |
|---|---|---|
| 1 | kujiale_0004, kujiale_0008, kujiale_0020, kujiale_0021, kujiale_0022, kujiale_0024, kujiale_0026 | book, bowl, plate, wineglass, book, cookingpot, bottle, lamp, book, bowl, cup-plant, bowl, cup, knife, stove, book, cup, washingmachine, laptop, plate |
| 2 | kujiale_0004, kujiale_0008, kujiale_0020, kujiale_0021, kujiale_0022, kujiale_0024, kujiale_0026 | book, bowl, plate, wineglass, book, cookingpot, bottle, lamp, book, bowl, cup-plant, bowl, cup, knife, stove, book, cup, washingmachine, laptop, plate |
| 3 | kujiale_0004, kujiale_0008, kujiale_0020, kujiale_0021, kujiale_0022, kujiale_0024, kujiale_0026, kujiale_0030 | book, bowl, plate, wineglass, book, cookingpot, lamp, plate, bottle, cup, potted-plant, chair, coffee_table, potted_plant, cup, pillow, chair, pillow, bowl, cup |
@ARTICLE{semi-static-semantic-exploration,
author={Bogenberger, Benjamin and Harrison, Oliver and Dahanaggamaarachchi, Orrin and Brunke, Lukas and Qian, Jingxing and Zhou, Siqi and Schoellig, Angela P.},
journal={IEEE Robotics and Automation Letters},
title={Where Did I Leave My Glasses? Open-Vocabulary Semantic Exploration in Real-World Semi-Static Environments},
year={2026},
volume={11},
number={3},
pages={3342-3349},
doi={10.1109/LRA.2026.3656790}
}