Automatic Generation of Regular Expressions from Examples with Genetic Programming
Alberto Bartoli, Giorgio Davanzo, Andrea De Lorenzo, Marco Mauri, Enrico Sorio
ACM Genetic and Evolutionary Computation Conference (GECCO), held in Philadelphia (USA)
We explore the practical feasibility of a system based on genetic programming (GP) for the automatic generation of regular expressions. The user describes the desired task by providing a set of labeled examples in the form of text lines. The system uses these examples to drive the evolutionary search towards a regular expression suited to the specified task. Using the system should require familiarity with neither GP nor regular expression syntax. In our GP implementation, each individual represents a syntactically correct regular expression. We performed an experimental evaluation on two different extraction tasks applied to real-world datasets and obtained promising results in terms of precision and recall, even in comparison to an earlier state-of-the-art proposal.
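To make the idea of scoring a candidate against labeled examples concrete, here is a minimal Python sketch of how precision and recall could be computed for one regular expression on an extraction task. It is an illustration under stated assumptions, not the fitness function used in the paper: the example format (a text line paired with the substring to extract, or `None`), the function name `precision_recall`, and the sample data are all hypothetical.

```python
import re
from typing import List, Optional, Tuple

# Hypothetical labeled example: a text line plus the substring that should be
# extracted from it (None when nothing should be extracted from the line).
Example = Tuple[str, Optional[str]]


def precision_recall(pattern: str, examples: List[Example]) -> Tuple[float, float]:
    """Score one candidate regular expression against labeled examples.

    Assumption for this sketch: only the first match on each line counts,
    and an extraction is correct only if it equals the label exactly.
    """
    regex = re.compile(pattern)
    true_pos = false_pos = false_neg = 0
    for line, expected in examples:
        match = regex.search(line)
        extracted = match.group(0) if match else None
        if extracted is not None and extracted == expected:
            true_pos += 1
        else:
            if extracted is not None:
                false_pos += 1  # extracted something wrong or unwanted
            if expected is not None:
                false_neg += 1  # missed the desired substring
    precision = true_pos / (true_pos + false_pos) if (true_pos + false_pos) else 0.0
    recall = true_pos / (true_pos + false_neg) if (true_pos + false_neg) else 0.0
    return precision, recall


# Made-up examples, for illustration only.
examples: List[Example] = [
    ("call 555-1234 today", "555-1234"),
    ("fax: 555-9876", "555-9876"),
    ("no number on this line", None),
    ("room 12 is free", None),
]

# Two hand-written candidates standing in for individuals of a GP population.
for candidate in (r"\d+", r"\d{3}-\d{4}"):
    print(candidate, precision_recall(candidate, examples))
```

In an evolutionary search, scores of this kind could be used to rank the individuals of a population. How the actual system combines such measures into a fitness value is not specified in this abstract.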