This article describes a method for reverse engineering web sites. It comprises five processes: Web page classification, HTML cleaning, Semantic enrichment, Data/schema extraction and Schemas (...)