Enterprises increasingly need tools to manage the volume and variety of their data effectively and to process it quickly. At CETIC we explore existing tools that provide solutions to this problem. We (...)
Developed as an activity of the Walloon region project CETIC-CEIQS, Retroweb is a tool for data extraction from the Internet. Now that the Internet has become one of the main sources of information, this kind of tool is (...)
The search engine market is dominated by three big players, but these giants also have weaknesses. In many cases, a personalized search solution can be more efficient than the generalist search (...)
This article describes a method for reverse engineering web sites. It is composed of five processes: Web page classification, HTML cleaning, Semantic enrichment, Data/schema extraction and Schemas (...)