AgentMat: Framework for Data Scraping and Semantization
The result's identifiers
Result code in IS VaVaI
RIV/00216208:11320/09:00207431 - isvavai.cz (https://www.isvavai.cz/riv?ss=detail&h=RIV%2F00216208%3A11320%2F09%3A00207431)
Result on the web
—
DOI - Digital Object Identifier
—
Alternative languages
Result language
English
Original language name
AgentMat: Framework for Data Scraping and Semantization
Original language description
Most of the enormous amount of information on the internet is available only as web pages made for a human reader. These pages don't have any common interface for accessing, searching or browsing the data. Hence, it's hard to extract semantic data from the web, categorize it and keep it updated. For this purpose we have designed and implemented a system called AgentMat. This system is designed for efficient extraction of large amounts of data from web pages. AgentMat processing is based on an XML-based language describing the given extraction task in a declarative way. Thanks to this scraping system, the raw contents of irregularly updated and unstructured web pages can be kept categorized and accessed together with the semantic metadata.
Czech name
—
Czech description
—
Classification
Type
D - Article in proceedings
CEP classification
JC - Computer hardware and software
OECD FORD branch
—
Result continuities
Project
—
Continuities
Z - Research plan (with a link to CEZ)
Others
Publication year
2009
Confidentiality
S - Complete and true data on the project are not subject to protection under special legal regulations
Data specific for result type
Article name in the collection
3rd International Conference on Research Challenges in Information Science
ISBN
978-1-4244-2864-9
ISSN
—
e-ISSN
—
Number of pages
12
Pages from-to
—
Publisher name
IEEE Computer Society Press
Place of publication
Fez, Morocco
Event location
Fez, Morocco
Event date
Jan 1, 2009
Type of event by nationality
WRD - Worldwide event
UT code for WoS article
000271860800025