Twitter as a source of big spatial data
Result identifiers
Result code in IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F61989100%3A27350%2F16%3A86098563" target="_blank" >RIV/61989100:27350/16:86098563 - isvavai.cz</a>
Result on the web
<a href="http://dx.doi.org/10.5593/SGEM2016/B21/S08.116" target="_blank" >http://dx.doi.org/10.5593/SGEM2016/B21/S08.116</a>
DOI - Digital Object Identifier
<a href="http://dx.doi.org/10.5593/SGEM2016/B21/S08.116" target="_blank" >10.5593/SGEM2016/B21/S08.116</a>
Alternative languages
Result language
English
Title in the original language
Twitter as a source of big spatial data
Result description in the original language
Social networks are a valuable source of information about personal attitudes, behaviour and activities. Various studies have demonstrated relationships between social network activity and stock market developments, tourist activity, the spread of infectious diseases and other phenomena. The dynamic development of social networks makes it possible to integrate many potentially valuable data sources. Progress in sensor networks and the Internet of Things, expanding internet coverage in developing countries and the rise of smart mobile devices make data sources more heterogeneous, with data quantity and quality varying significantly over space and time. Handling such big data at a global or continental extent in real time is a current challenge. Large-volume data streams require appropriate algorithmic processing and special treatment for proper data extraction and data fusion. The issues and possible solutions are discussed using a data sample from one day of worldwide Twitter activity containing more than 4 million tweets. Twitter's REST and Streaming APIs are compared and the issues discovered are discussed, in particular the limits on data harvesting. Data streamed from social networks contain not only textual but also spatial and temporal information. A spatio-temporal exploratory data analysis verifies data consistency and integrity and shows that appropriate data pre-processing is a key step in building a relevant database. The spatial location of a message can be expressed by point coordinates or by a place name with a bounding box. Important differences were found between the place identification and the point coordinates of tweets, which indicates a need for verification. The time of a tweet also has to be carefully reconstructed using the time of creation, the user's time zone, the UTC offset and the location of the tweet. The results make it possible to formulate recommendations on how to process such big data. The study brings a new point of view on ...
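The two verification steps mentioned in the description (comparing point coordinates against the place bounding box, and reconstructing local time from the creation time and UTC offset) can be illustrated with a minimal sketch. It is not the authors' procedure. It assumes tweets are dictionaries shaped like the Twitter API v1.1 tweet object (`coordinates`, `place.bounding_box`, `created_at`, `user.utc_offset`); the function names and the sample values are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def point_in_place_bbox(tweet):
    """Check whether the tweet's point lies inside its place bounding box.

    Returns True or False, or None when the point or the box is missing.
    Assumption: GeoJSON order [longitude, latitude]; boxes crossing the
    antimeridian are not handled in this sketch.
    """
    point = tweet.get("coordinates")
    place = tweet.get("place")
    if not point or not place or not place.get("bounding_box"):
        return None
    lon, lat = point["coordinates"]
    ring = place["bounding_box"]["coordinates"][0]  # outer ring of the box
    lons = [vertex[0] for vertex in ring]
    lats = [vertex[1] for vertex in ring]
    return min(lons) <= lon <= max(lons) and min(lats) <= lat <= max(lats)


def local_created_time(tweet):
    """Shift the UTC creation time by the user's UTC offset (in seconds).

    Returns None when the offset is missing. Assumption: `created_at` uses
    the v1.1 format, e.g. "Thu Jun 30 12:00:00 +0000 2016", and the process
    runs with an English/C locale so day and month names parse.
    """
    created_utc = datetime.strptime(tweet["created_at"], "%a %b %d %H:%M:%S %z %Y")
    offset = (tweet.get("user") or {}).get("utc_offset")
    if offset is None:
        return None
    return created_utc.astimezone(timezone(timedelta(seconds=offset)))


# Hypothetical sample record, not taken from the study's data set.
sample = {
    "created_at": "Thu Jun 30 12:00:00 +0000 2016",
    "coordinates": {"type": "Point", "coordinates": [18.16, 49.83]},
    "place": {
        "bounding_box": {
            "type": "Polygon",
            "coordinates": [[[18.1, 49.7], [18.4, 49.7], [18.4, 49.9], [18.1, 49.9]]],
        }
    },
    "user": {"utc_offset": 7200},
}

if __name__ == "__main__":
    print(point_in_place_bbox(sample))
    print(local_created_time(sample))
```

Returning `None` for missing fields keeps "cannot be verified" separate from "verified as inconsistent", which matters when only a fraction of tweets carry point coordinates.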
Classification
Type
D - Article in proceedings
CEP field
IN - Informatics
OECD FORD field
—
Result linkages
Project
—
Linkages
S - Specific research at universities
Other
Year of implementation
2016
Data confidentiality code
S - Complete and truthful data on the project are not subject to protection under special legal regulations
Data specific to the result type
Title of the article in the proceedings
16th International Multidisciplinary Scientific GeoConference : SGEM 2016 : energy and clean technologies : conference proceedings : 30 June -6 July, 2016, Albena, Bulgaria. Volume II, Recycling, air pollution and climate change
ISBN
978-619-7105-64-3
ISSN
1314-2704
e-ISSN
—
Number of pages
7
Pages from-to
921-928
Publisher name
STEF92 Technology Ltd.
Place of publication
Sofia
Event location
Albena
Event date
30 June 2016
Event type by nationality
WRD - Worldwide event
UT WoS code of the article
—