10 / 26 / 2021

Crawling as a tool to help create effective storytelling

Article by CERTH

In the digital era of information abundance, people face choices every day, however insignificant or life-changing they may initially seem. Consequently, it is important to be able to gather and aggregate useful data from public online sources and to understand their underlying semantics, in order to extract further knowledge and insights.

In the case of the SO-Close project, retrieving freely available online multimedia fuels the creation and the potency of experiences meant to connect the past with the future and the local with the foreign. Websites and social media such as YouTube and Twitter are channels of communication for people who wish to honor their ancestry, share their struggles and so on. With experienced users and specialized tools like those that the Multimodal Data Fusion and Analytics Group (M4D) at CERTH is currently developing, focused and targeted crawling becomes more feasible than ever. Given suitable collections of keywords and/or user accounts, the crawlers can fetch relevant content spanning different time periods. That way, cultural institutions can acquire ample material to exploit in pursuing more immersive approaches to effective storytelling. These modules are only some of the sources of information at the disposal of the user group, which most of the time also holds content of its own in private repositories.
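As a rough illustration of the keyword- and account-based targeting described above, the following Python sketch filters a small list of already collected posts by keyword, account and time period. It is not the code of the M4D crawlers: the `Post` record, the `matches` function and all sample data are hypothetical, and a real crawler would first obtain the content from each platform.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Post:
    """Hypothetical record for one piece of crawled content."""
    account: str
    text: str
    published: date


def matches(post, keywords, accounts, start, end):
    """Keep a post if it lies in the period and hits a keyword or a followed account."""
    if not (start <= post.published <= end):
        return False
    text = post.text.lower()
    keyword_hit = any(k.lower() in text for k in keywords)
    account_hit = post.account in accounts
    return keyword_hit or account_hit


# Made-up sample data, for illustration only.
posts = [
    Post("museum_a", "Letters from the exile camps", date(2019, 5, 3)),
    Post("random_user", "My lunch today", date(2020, 1, 10)),
    Post("family_archive", "Grandmother's journey, 1939", date(2015, 7, 21)),
    Post("museum_a", "Opening hours update", date(2010, 2, 1)),
]

selected = [
    p for p in posts
    if matches(p, keywords={"exile", "journey"}, accounts={"museum_a"},
               start=date(2012, 1, 1), end=date(2021, 12, 31))
]
for p in selected:
    print(p.published, p.account, p.text)
```

In this sketch a post is kept when it matches either a keyword or a followed account; requiring both would make the crawl narrower but would miss relevant posts from unknown accounts.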

The numerous streams of incoming information call for a central knowledge representation system. Semantic technologies, through the adoption of ontologies, facilitate the knowledge management of the entire project. With this approach, the same resource can be understood correctly by both humans and machines in any part of the world, as long as it is properly annotated. Not only are such technologies, which rely on dense semantic graphs, currently state of the art, but they also offer efficient ways to infer additional knowledge on top of existing, well-established knowledge, to reveal patterns that a human eye would most likely miss, and to arrive at the wisdom lurking inside the SO-Close knowledge graph. To reach such conclusions, one only needs to explore and investigate the graph by asking the right questions, which grow more intricate over time and yield more complex and fulfilling answers. The final step is then to share the resources and findings with the rest of the world by simply linking entities.
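To make the idea of inferring knowledge and questioning a graph more concrete, here is a minimal Python sketch of a toy triple store. It is an assumption-laden illustration and not the SO-Close ontology: every class, resource and predicate name is invented, and a production system would typically rely on standards such as RDF and SPARQL rather than plain Python sets.

```python
# Triples are (subject, predicate, object); every name below is invented.
triples = {
    ("Letter", "subClassOf", "Document"),
    ("Document", "subClassOf", "CulturalObject"),
    ("Photograph", "subClassOf", "CulturalObject"),
    ("letter_42", "type", "Letter"),
    ("photo_7", "type", "Photograph"),
    ("letter_42", "mentions", "Marseille"),
    ("photo_7", "mentions", "Marseille"),
}


def infer_types(triples):
    """Add implied 'type' triples by following 'subClassOf' until nothing new appears."""
    inferred = set(triples)
    changed = True
    while changed:
        changed = False
        for s, p, o in list(inferred):
            if p != "type":
                continue
            for s2, p2, o2 in list(inferred):
                if p2 == "subClassOf" and s2 == o:
                    new = (s, "type", o2)
                    if new not in inferred:
                        inferred.add(new)
                        changed = True
    return inferred


def query(triples, s=None, p=None, o=None):
    """Return the triples matching a pattern; None acts as a wildcard."""
    return sorted(
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    )


graph = infer_types(triples)

# Question 1: which resources are cultural objects? (only answerable after inference)
cultural = [s for s, _, _ in query(graph, p="type", o="CulturalObject")]

# Question 2, more intricate: which cultural objects mention Marseille?
about_marseille = [
    s for s, _, _ in query(graph, p="mentions", o="Marseille") if s in cultural
]
print(cultural)
print(about_marseille)
```

Neither resource is annotated directly as a `CulturalObject`; the answer to the first question exists only because the class hierarchy is followed, which is the kind of additional knowledge the paragraph above refers to.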

Ultimately, we are lucky to live through the information revolution, in which all these data, combined with intelligence, enable technologies and tools that make everyday life more inclusive and vibrant.