Indexing Digital Content

CETIC offers services for content indexing and knowledge extraction from unstructured data. This expertise enables us to offer tailored search engines, automated website migration processes, and Web Intelligence tools. Looking for a technology partner in these domains? Feel free to contact us!

Semantic extraction from unstructured data

CETIC can automatically extract content from Web pages using reverse engineering techniques, while preserving the meaning of the data. This simplifies Web-based data migration and makes it easier to track Web content such as press releases, product catalogues, and news.
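
As an illustration only, the idea of extracting content while keeping its meaning can be sketched as a mapping from page markup to named fields. This is not CETIC's actual method: the class names (`pr-title`, `pr-date`, `pr-body`), the sample HTML, and the `extract` helper are all hypothetical, and a real page template would first have to be analysed to find such a mapping.

```python
from html.parser import HTMLParser

# Elements that never have a closing tag; they are ignored by the stack below.
VOID_TAGS = {"br", "img", "hr", "input", "meta", "link"}


class FieldExtractor(HTMLParser):
    """Collect the text of elements whose CSS class maps to a field name."""

    def __init__(self, class_to_field):
        super().__init__()
        self.class_to_field = class_to_field
        self.record = {}
        self._stack = []  # field name (or None) for each open element

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        classes = (dict(attrs).get("class") or "").split()
        field = next(
            (self.class_to_field[c] for c in classes if c in self.class_to_field),
            None,
        )
        self._stack.append(field)

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return
        if self._stack:
            self._stack.pop()

    def handle_data(self, data):
        # Text belongs to the innermost enclosing element that maps to a field.
        field = next((f for f in reversed(self._stack) if f), None)
        text = data.strip()
        if field and text:
            self.record[field] = (self.record.get(field, "") + " " + text).strip()


def extract(html, class_to_field):
    """Return a dict of named fields extracted from an HTML fragment."""
    parser = FieldExtractor(class_to_field)
    parser.feed(html)
    parser.close()
    return parser.record


# Hypothetical press-release markup and field mapping (made up for this sketch).
SAMPLE_HTML = (
    '<div class="pr"><h2 class="pr-title">New release</h2>'
    '<span class="pr-date">2024-01-15</span>'
    '<p class="pr-body">Version 2 is <b>out</b></p></div>'
)
MAPPING = {"pr-title": "title", "pr-date": "date", "pr-body": "body"}

record = extract(SAMPLE_HTML, MAPPING)
```

The resulting record carries field names rather than layout, which is what makes it reusable for a migration or for tracking changes over time. The sketch assumes reasonably well-formed markup.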

CETIC can act as a technology partner during a migration process, or for the implementation of tailored dashboards.

Creating tailored search engines

CETIC is able to develop tailored search tools.

Customisation makes it possible to build more comprehensive and up-to-date page databases, to handle specific file types, and to implement semantic search capabilities.

CETIC has strong expertise in all the steps involved in search engine development:
- crawling, i.e. discovering and collecting web pages or files
- indexing, i.e. transforming information into a searchable structure
- defining and implementing search user interfaces
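
The three steps above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not a description of CETIC's tools: the `PAGES` dictionary is a hypothetical in-memory stand-in for the Web (so no network access is needed), and the function names `crawl`, `build_index`, and `search` are invented for the example. The index is a plain inverted index, and search returns the pages containing every query term.

```python
import re
from collections import defaultdict, deque

# Hypothetical in-memory "web": URL -> (page text, outgoing links).
PAGES = {
    "a": ("Search engines index web pages", ["b", "c"]),
    "b": ("Crawling discovers web pages", ["c"]),
    "c": ("Product catalogues list products", []),
}


def crawl(start):
    """Crawling: discover pages breadth-first by following links from start."""
    seen = {start}
    queue = deque([start])
    order = []
    while queue:
        url = queue.popleft()
        order.append(url)
        for link in PAGES[url][1]:
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return order


def tokenize(text):
    """Split text into lowercase alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(urls):
    """Indexing: build an inverted index mapping each term to the pages containing it."""
    index = defaultdict(set)
    for url in urls:
        for term in tokenize(PAGES[url][0]):
            index[term].add(url)
    return index


def search(index, query):
    """Search: return the sorted pages that contain every term of the query."""
    terms = tokenize(query)
    if not terms:
        return []
    results = set(index.get(terms[0], set()))
    for term in terms[1:]:
        results &= index.get(term, set())
    return sorted(results)


urls = crawl("a")
index = build_index(urls)
hits = search(index, "web pages")
```

A production engine would add politeness rules and scheduling to the crawler, ranking to the search step, and a user interface on top. The sketch only shows how the steps feed into one another.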

Applications

- focused search engines
- intranet search engines
- search engines for product catalogues
- general-purpose public search engines (several million pages)