Building document treatment chains using reinforcement learning and intuitive feedback

1 Equipe MAD - Laboratoire GREYC - UMR6072 GREYC - Groupe de Recherche en Informatique, Image, Automatique et Instrumentation de Caen
2 Cordon Electronics DS2i
3 DECISION LIP6 - Laboratoire d'Informatique de Paris 6
4 Airbus Defence & Space Elancourt Airbus group

Abstract: We model a document treatment chain as a Markov Decision Process and use reinforcement learning to allow the agent to learn to construct custom-made chains "on the fly" and to improve them continuously. We build a platform, BIMBO (Benefiting from Intelligent and Measurable Behaviour Optimisation), which enables us to measure the impact of various models, algorithms, parameters, etc. on learning. We apply this in an industrial setting, specifically to a document treatment chain that extracts events from massive volumes of web pages and other open-source documents. Our emphasis is on minimising the burden on the human analysts, whose feedback on the extracted events guides the agent's learning. To this end, we investigate different types of feedback, from numerical feedback, which requires a lot of user effort and tuning, to partially and even fully qualitative feedback, which is much more intuitive and demands little to no user intervention. We carry out experiments, first with numerical feedback, and then demonstrate that intuitive feedback still allows the agent to learn effectively.
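To make the idea of a treatment chain as a Markov Decision Process concrete, here is a minimal sketch in Python. It is an illustration only, not the BIMBO platform: the abstract does not name the learning algorithm, the services, or the reward scheme, so the tabular Q-learning used here, the three service names, and the `simulated_feedback` rule standing in for an analyst's numerical feedback are all assumptions. The state is the sequence of services applied so far, an action applies one more service or stops, and the reward arrives only when the chain is finished.

```python
import random
from collections import defaultdict

# Hypothetical services: the abstract does not list the real chain's steps.
SERVICES = ["detect_language", "translate", "extract_events"]
STOP = "stop"


def simulated_feedback(chain):
    """Assumed stand-in for an analyst's numerical feedback on a finished chain."""
    reward = -0.2 * len(chain)  # every service call has a cost
    if "extract_events" in chain:
        reward += 1.0  # events were extracted
        if ("translate" in chain
                and chain.index("translate") < chain.index("extract_events")):
            reward += 0.5  # translating first gives better events (toy rule)
    return reward


def available_actions(state):
    """Services not yet applied, plus the option to stop."""
    return [s for s in SERVICES if s not in state] + [STOP]


def q_learning(episodes=5000, alpha=0.5, gamma=1.0, epsilon=0.2, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration."""
    rng = random.Random(seed)
    # Optimistic initial values push the agent to try every action at least once.
    q = defaultdict(lambda: 2.0)
    for _ in range(episodes):
        state = ()
        while True:
            actions = available_actions(state)
            if rng.random() < epsilon:
                action = rng.choice(actions)
            else:
                action = max(actions, key=lambda a: q[(state, a)])
            if action == STOP:
                # Terminal step: the only reward is the feedback on the chain.
                target = simulated_feedback(state)
                q[(state, action)] += alpha * (target - q[(state, action)])
                break
            next_state = state + (action,)
            # Intermediate steps carry no reward; bootstrap from the next state.
            best_next = max(q[(next_state, a)] for a in available_actions(next_state))
            q[(state, action)] += alpha * (gamma * best_next - q[(state, action)])
            state = next_state
    return q


def greedy_chain(q):
    """Follow the highest-valued action from the empty chain until stop."""
    state = ()
    while True:
        action = max(available_actions(state), key=lambda a: q[(state, a)])
        if action == STOP:
            return state
        state = state + (action,)


q_table = q_learning()
print(greedy_chain(q_table))
```

The sketch covers only the numerical-feedback case. The partially and fully qualitative feedback that the abstract investigates would replace the scalar returned by `simulated_feedback` with comparisons or rankings, which requires a different update rule that is not shown here.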

Keywords: Artificial intelligence; Reinforcement learning; Extraction and knowledge management; Man-machine interaction; Open source intelligence (OSINT)

Authors: Esther Nicart, Bruno Zanuttini, Hugo Gilbert, Bruno Grilhères, Frédéric Praca


