Abstract
Large and diverse datasets, with associated ground truth, can now be simulated to train and evaluate AI/ML algorithms. This convergence of readily accessible simulation (SIM) tools, real-time high performance computing, and large repositories of high-quality, free-to-inexpensive photorealistic scanned assets is a potential artificial intelligence (AI) and machine learning (ML) game changer. While this feat is now within our grasp, what SIM data should be generated, how should it be generated, and how can this be achieved in a controlled and scalable fashion? First, we discuss a formal procedural language for specifying scenes (LSCENE) and collecting sampled datasets (LCAP). Second, we discuss specifics regarding our production and storage of data, ground truth, and metadata. Last, two LSCENE/LCAP examples are discussed and three unmanned aerial vehicle (UAV) AI/ML use cases are provided to demonstrate the range and behavior of the proposed ideas. Overall, this article is a step toward closed-loop automated AI/ML design and evaluation.