Planning, Decision Making, and Learning in Spatially & Temporally Varying Environments

  • Understanding the surrounding environment is extremely important to autonomous systems, because by taking advantage of the learned (perfect or imperfect) model of the environment, the robots are then able to plan meaningful actions and make appropriate decisions. For quite a long period robotics research has been conducted in indoor laboratories where environments are typically static. Increasingly, the demands of modern applications and technologies require that robots move to outdoor space, such as in the air or under water, with constantly prolonged task durations (i.e., long-term or life-long autonomy). In the meantime, these practical scenarios have raised a critical challenge: the robots’ habitats are oftentimes non-static and uncertain. In other words, it is very common that the environmental parameters change not only spatially, but also temporally, with possibly high uncertainty in every dimension. One focus of my research has been developing principles that (1) use data-driven methods to guide robots to learn the spatio-temporal and stochastic environment model, and (2) utilize the learned model to seek for more efficient and accurate planning and decision-making solutions, which in turn benefits future environmental exploration and exploitation.

  • Informative Planning and Online Learning for Environmental Monitoring

  • Harmful matters in water can be excessive, which needs to be monitored

    We humans are able to gain a greater understanding of the environmental processes (e.g., physical, chemical or biological attributes) through environmental sensing and monitoring. I am interested in the problem of collecting data about a scalar field of important environmental attributes, and learning a model to best describe the environment. However, the unknown environmental phenomena that we are interested in are usually time-varying. In order to provide a good estimate of the state of the environment and maintain the prediction model at any time, the environmental sensing (information gathering) needs to be carried out persistently to catch up to possible variations.

  • Persistent environment monitoring with an autonomous surface vehicle

    My student and I have designed planning and learning methods that enable an autonomous marine vehicle to perform persistent ocean monitoring tasks by learning and refining an underlying unknown environmental model. We use Gaussian Processes (GPs) to model the environment, and use data-driven methods to refine the model. To alleviate the computational bottleneck caused by large-scale data accumulated, we Persistent environment monitoring with an propose a framework that iterates between two components: a planning component aimed at collecting the most information-rich data, and a sparse Gaussian Process learning component where the environmental model and hyperparameters are learned online by taking advantage of only a subset of data that provides the greatest contribution. In addition, we also developed a long-term autonomy planning method that integrates information-theoretic and decision-theoretic planning frameworks, so that both the mutual-information-based path informativeness and turbulence-induced action uncertainty are taken into account simultaneously. Such a planning framework allows the autonomous vehicles to adapt to the spatio-temporal variations of both information (entropy) and disturbances, which leads very high efficiency for environmental sensing and monitoring.

  • Left: planned informative path (yellow dots are saved sampling data -- here is salinity); Middle: predicted environment model; Right: model prediction variance.

  • Time-Dependent Decision-Making

  • Ocean currents that a marine vehicle needs to fight against. The method also applies to air turbulences.

    I have been working on decision-making problems. The decision-making solutions allow robots to effectively reject external disturbances caused by uncertain environments. Unfortunately, the aforementioned non-stationary environmental properties are usually simplified as certain locally static disturbances (noises) in prevalent robotic planning research, and such an assumption and simplification may lead to high sub-optimality from the global view in the long run. I have developed a decision- theoretic planning framework called time-dependent Markov Decision Process. The new method that I designed generalizes the well-known MDP by considering an extra continuous dimension — the time evolution. Specifically, the time-dependent MDP is built upon an upgraded transition model that varies both spatially and temporally, with an underlying computing mechanism that can be imagined as value iterations combining both spatial "expansion" and temporal "evolution", which occur simultaneously in two dimensions. Such a framework is able to integrate a future horizon of environmental dynamics and produce highly accurate action policies under the spatio-temporal disturbances that are caused by, e.g, ocean currents and air turbulences.

  • Left: ocean current observed by SCCOOS. Middle: glider dead-reckoning trajectory makes unnecessary detours under time-varying disturbances. Right: Trajectory produced by a decision-making strategy that has integrated ocean wave prediction.

Multi-vehicle Coordination

  • My broader research goal focuses on exploiting cooperative decision-making methodologies (stochastic and deterministic, centralized and decentralized) with which the system behaviors are robust and the system benefits are the best. There are several open challenges that complicate designing—especially analyzing—decentralized algorithms. For example, the information attained by individual robots may be incomplete and inaccurate, resources can be limited, communication can be constrained, and action decisions may have to be conservative due to sub-optimal or uncertain results. Within the distributed context, I have been particularly interested in improving performance in terms of computation and communication complexities, solution quality, robustness, hierarchical flexibility, etc.

  • Micro Aerial Vehicles Collective Decision-Making

  • Action uncertainty is ubiquitous across autonomous robots. This is particularly true for light-weight micro aerial vehicles (MAVs), which are very likely to deviate from the planned paths due to various disturbances. I have been working on motion/path/trajectory planning and goal assignment for multi-MAV systems while considering stochasticity resulting from disturbed actions.

    ⇦ Left figure: a miniature multi-MAV system involving two micro aerial vehicles navigating towards designated goal states and avoiding a set of virtual obstacles.

  • I worked on developing collective decision making methods which can guide a team of homogeneous MAVs with action uncertainty to avoid obstacles and eventually reach mutually exclusive goals. One of our objectives is to design a light-weight approximation method that only needs to exploit partial state and/or action spaces, and quickly respond (e.g., replan) to changes or updates of states or rewards (e.g., emerging new tasks).

  • More formally, the efficiency is achieved by first abstracting the stochastic problem into a deterministic counterpart and obtaining an initial solution to decompose the problem. Then the sub-problems are transformed back to stochastic domain and solved by individual agents. In this way, computational complexity is maintained at the level of single agent planning case.

  • Motivated by a persistent surveillance task, my colleagues and I also developped an adaptive planning architecture for multi-vehicle teams subject to an uncertain, spatially-varying disturbance force. The planning architecture is designed with three hierarchical levels. The highest level generates interference-free routes for the entire team to monitor areas of interest that have higher uncertainty. The lower-level planners compute trajectories that can be tracked accurately along these routes by anticipating the effects of the disturbance force. To this end, the vehicles maintain an online estimate of the disturbance force, which drives adaptation at all planning levels.

  • I also worked on energy-constrained planning problems, where a team of MAVs is deployed to pursue an objective such as surveillance, monitoring, or exploration of an indoor environment with cognizance of available onboard energy resources and pre-deployed recharging platforms. The method can be extended to other applications, for example, drone pacakge deliveries. My work focuses on deployments which seek to balance individual and cooperative vehicle task requirements, with overall travel and energy costs and charging station availability, toward enabling extended duration operation.

  • Matching Graph based Formation

  • Fish school maintains certain formation

    Part of formation control research involves manoeuvring a spatially dispersed system from one formation to another. We consider the problem of changing smoothly between formations of spatially deployed swarm systems. I am interested in addressing scenarios in which gradual and seamless formation transitions are needed, a problem which we term "formation morphing". We show that this can be achieved by routing agents on a Euclidean graph that corresponds to paths computed on - and projected from - an underlying three-dimensional matching graph. The three-dimensional matching graph is advantageous in that it simultaneously represents a logical assignment problem (for which an optimal solution must be sought) and metric information that comprises the spatial aspects of the Euclidean graph.

  • Task Allocation Uncertainty Assessment

  • I have developed an extremely fast sensitivity analysis method for classic assignment problem (O(n^2) time for each variable). It is achieved through modifying the matching graph operations of the well-known Hungarian algorithm, so that the standard computation is relaxed and bounds are produced. Different from existing uncertainty analysis approaches most of which utilize the stochastic means, this algorithm is a deterministic method with bounded time complexity.

    iRobots with particle filters used to analyze sensitivity of task allocation.

    The robots we used are the create type made by iRobot, as shown right figure. A Hokuyo laser range sensor is mounted on each robot which records range data up to ∼5 m and an ASUS netbook is carried by each robot to compute all data including the communication, utility estimation, interval analysis, logging, etc.

  • Distributed Multi-robot Task Allocation

  • Task-Swap-based Anytime Algorithm: I developed a fully decentralized task allocation approach that allows local search processes to execute concurrently while minimizing interactions amongst the processes, needing neither global broadcast nor a multi-hop communication protocol. The formulation is analyzed in a novel way using tools from group theory and optimization duality theory to show that the convergence of local searching processes is related to a shortest path routing problem on a graph subject to the network topology.

  • Optimal Market-based Method: Rather than adopting a buyer's "selfish" bidding perspective as in existing auction/market-based approaches, we developed an auctioning mechanism from a merchant’s point of view, producing a pricing policy that responds to cliques of customers and their preferences. Our algorithm uses price escalation to clear a market of all its items, producing a state of equilibrium that satisfies both the merchant and customers.

    Right figure: Visualization as an equilibrium flow. We wish to manipulate the four striped balls so they flow into four wired spherical containers that are connected with tunnels among each other. The way to remove balls from a wired container is to create a perturbation, i.e., add energy to a container to make the enclosed balls unstable (imagine the balls behave like heated microscopic particles). The essence of the algorithm is that it always adds the minimum energy required to push exactly one ball out of a crowded container; and it guarantees that a ball will never flow back into an already heated sphere.

  • Matrix Coarsening and Partition: We observe that an assignment can be computed through coarsening and partitioning operations on the standard utility matrix via a set of mature partitioning techniques and programs. We designed an adaptive algorithm that mixes centralized and decentralized approaches dynamically at different scales to produce a fast, robust method that is accurate and scalable, and reduces both the global communication and unnecessary repeated computation.

  • Other Past Work (To be added)