During the development of Sento Crawler I faced several technical challenges and took important steps towards defining and implementing the data pipeline.

The technical goals

At the start of development, an initial set of requirements was defined:

Easily configurable: the user should not need to read or modify the code in order to set up an instance.

Asynchronous: the tool was expected to make heavy use of network communications, so asynchronous programming was a strong requirement, as we never want to lose time on blocked I/O operations.
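To illustrate why the asynchronous requirement matters, here is a minimal sketch using Python's standard `asyncio` module. It is not taken from Sento Crawler: the function names (`fake_request`, `fetch_all`, `run_demo`) are hypothetical, and `asyncio.sleep` stands in for a real network call. The point is only that three waits of 0.2 seconds each finish in roughly 0.2 seconds overall when they run concurrently, instead of 0.6 seconds one after another.

```python
import asyncio
import time
from typing import List, Tuple


async def fake_request(name: str, delay: float) -> str:
    # Hypothetical stand-in for a network call: while this coroutine
    # waits, the event loop is free to run the other coroutines.
    await asyncio.sleep(delay)
    return f"{name}: done"


async def fetch_all() -> List[str]:
    # Schedule the three simulated requests concurrently and collect
    # their results in the order they were passed in.
    return await asyncio.gather(
        fake_request("trends", 0.2),
        fake_request("tweets", 0.2),
        fake_request("geocoding", 0.2),
    )


def run_demo() -> Tuple[List[str], float]:
    # Return the results together with the elapsed wall-clock time.
    start = time.perf_counter()
    results = asyncio.run(fetch_all())
    return results, time.perf_counter() - start
```

Calling `run_demo()` returns the three result strings and an elapsed time close to the longest single delay, which is the behaviour we want from a tool that spends most of its time waiting on remote APIs.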
In this post I will introduce you to the architecture of Sento. The project consists of the following components:

Crawler

This piece of software communicates with the Twitter and OpenStreetMap Nominatim APIs and performs the following tasks:

Extracts the text content of tweets belonging to the currently available trends.

Extracts metadata from the trends themselves*, such as: how the trend ranked during its lifetime in a certain location.
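As a rough illustration of the trend metadata described above, the ranking of a trend over its lifetime in a given location could be modelled as a series of timestamped observations. The sketch below is an assumption made for illustration only: the `TrendSnapshot` class, its fields, and the `best_rank` helper are hypothetical names and do not reflect Sento's actual data model.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional


@dataclass(frozen=True)
class TrendSnapshot:
    # One observation of a trend's position in a location's ranking.
    # All field names are hypothetical.
    topic: str
    location: str
    rank: int  # 1 means the top trend
    observed_at: datetime


def best_rank(
    snapshots: List[TrendSnapshot], topic: str, location: str
) -> Optional[int]:
    # Highest position (lowest number) the topic reached in the location,
    # or None if it was never observed there.
    ranks = [
        s.rank
        for s in snapshots
        if s.topic == topic and s.location == location
    ]
    return min(ranks) if ranks else None


history = [
    TrendSnapshot("#example", "Seville", 7, datetime(2019, 1, 1, 10, 0)),
    TrendSnapshot("#example", "Seville", 3, datetime(2019, 1, 1, 11, 0)),
    TrendSnapshot("#example", "Madrid", 12, datetime(2019, 1, 1, 11, 0)),
]
```

With this sample history, `best_rank(history, "#example", "Seville")` returns the best position the trend reached in that location, and a location with no observations yields `None`.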
Welcome! My name is Roberto García Calero. I’m a Software Engineering student at the University of Seville and also a proud member of Geographica, an awesome location intelligence company! I want to introduce you to Sento, my Undergraduate Thesis Project.

The idea

Sento, “sentiment” in Esperanto, will combine what I will learn about sentiment analysis during this year with the professional experience I’m gaining as a GIS (Geographic Information System) engineer at my job.