Twitter as a source of big spatial data
The result's identifiers
Result code in IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F61989100%3A27350%2F16%3A86098563" target="_blank" >RIV/61989100:27350/16:86098563 - isvavai.cz</a>
Result on the web
<a href="http://dx.doi.org/10.5593/SGEM2016/B21/S08.116" target="_blank" >http://dx.doi.org/10.5593/SGEM2016/B21/S08.116</a>
DOI - Digital Object Identifier
<a href="http://dx.doi.org/10.5593/SGEM2016/B21/S08.116" target="_blank" >10.5593/SGEM2016/B21/S08.116</a>
Alternative languages
Result language
English
Original language name
Twitter as a source of big spatial data
Original language description
Social networks represent a valuable source of information about personal attitudes, behaviour and activities. Various studies have shown interesting relationships to stock market development, tourist activities, the spread of infectious diseases, etc. The dynamic development of social networks enables the integration of many potentially valuable data sources. Progress in sensor networks and the Internet of Things, the expanding coverage of internet access in developing countries, and the rise of smart mobile devices make data sources more heterogeneous, with data quantity and quality varying significantly over space and time. Handling such big data at a global or continental extent in real time is a current challenge. Appropriate algorithmic processing is necessary because large-volume data streams require special treatment for proper data extraction and data fusion. The issues and possible solutions are discussed using a data sample from one day of worldwide Twitter activity containing more than 4 million tweets. Twitter's REST and Streaming APIs are compared and the issues discovered are discussed; in particular, the limits on data harvesting are explored. Data streamed from social networks contain not only textual but also spatial and temporal information. A spatio-temporal exploratory data analysis verifies data consistency and integrity and shows that appropriate data pre-processing is a key step in building a relevant database. The spatial location of messages can be expressed by point coordinates or by place names with a bounding box. Important differences were found between the place identification and the point coordinates of tweets, which indicates a need for verification. The time of a tweet also has to be carefully reconstructed using the time of creation, the user's time zone, the UTC offset and the location of the tweet. The results make it possible to formulate recommendations on how to process such big data. The study brings a new point of view on ...
Czech name
—
Czech description
—
Classification
Type
D - Article in proceedings
CEP classification
IN - Informatics
OECD FORD branch
—
Result continuities
Project
—
Continuities
S - Specific research at universities
Others
Publication year
2016
Confidentiality
S - Complete and truthful data on the project are not subject to protection under special legal regulations
Data specific for result type
Article name in the collection
16th International Multidisciplinary Scientific GeoConference : SGEM 2016 : energy and clean technologies : conference proceedings : 30 June - 6 July, 2016, Albena, Bulgaria. Volume II, Recycling, air pollution and climate change
ISBN
978-619-7105-64-3
ISSN
1314-2704
e-ISSN
—
Number of pages
7
Pages from-to
921-928
Publisher name
STEF92 Technology Ltd.
Place of publication
Sofia
Event location
Albena
Event date
Jun 30, 2016
Type of event by nationality
WRD - Worldwide event
UT code for WoS article
—