From Big Data to Smart Data: The Smart Data Principle
Intelligent Solutions for the Flood of Data
The term Big Data has become widely known outside of IT in recent years due to the advancing digitalization of numerous areas of life. In the course of digital communication, huge amounts of information of all kinds are exchanged every day: documents, chats and messages, internal files and external news feeds, editorial and user-generated content. Collecting this information is not enough, however: most companies need solutions based on the smart data principle, i.e. a transformation from big data to smart data.
Competitive Advantages Through Big Data?
More and more providers are aiming to use this constantly growing mountain of digital information for their own purposes. In most cases, the aim is to gain a competitive edge by knowing as much as possible about the current needs of potential customers. On the hardware side, there are virtually no limits to the collection of data thanks to cost-effective mass storage technologies. The real challenge lies in analyzing this largely unstructured and often incoherent flood of data. For computers to evaluate such volumes of data meaningfully, they must first understand them.
Why Smart Data?
This is where bridging the gap between big data and smart data becomes crucial. What matters is not primarily the amount of communication data collected from your target group, but its relevance. The aim of smart data analyses is to identify the right data, quickly filter out what is really important and place it in the right context for your own tasks and goals. For customer-oriented communication, it is crucial to use analyses that also take linguistic characteristics into account. Such analyses identify the particular language and expressions of selected target groups, so that external communication can be optimized accordingly.
Thanks to enormous progress in the development of technologies for natural language processing in conjunction with machine learning methods (in particular so-called deep learning), it is now possible to automatically decipher the essential meaning of high-quality content.
Smart data analyses recognize what is important in a document and automatically enrich it with additional metadata. Among other things, the documents are automatically indexed and classified.
Automated Metadata Enrichment with CONTEXTSUITE
Thanks to a pre-trained understanding of human language and of important concepts and their contexts, smart data platforms such as MORESOPHY's CONTEXTSUITE are in principle capable of interpreting and enriching content from a wide variety of technical domains without any human intervention. By default, enrichment according to the smart data principle involves recognizing the following types of objects or data in the content:
- Topics or concepts: e.g. sustainability, risk management, service
- Persons: by name, e.g. Richard Müller or Baron von Stauffenberg
- Places and regions: e.g. Canada, Ontario, Bonn-Beuel, both by name and by zip code or other codes
- Organizations: e.g. Siemens AG, FC St. Pauli Hamburg, Ministry of Education and Science
- Specialist topics: e.g. society, nutrition, healthcare, …
- Moods: in the sense of a rather positive or negative mood in a document
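To make the idea of enrichment more concrete, the following is a minimal sketch of what an enriched document record could look like. It is not CONTEXTSUITE's actual data model or API: the class `EnrichedDocument`, its field names, the mood scale from -1.0 to 1.0, the helper `filter_by_topic` and the two sample documents are all hypothetical and chosen only to mirror the object types listed above.

```python
from dataclasses import dataclass, field


@dataclass
class EnrichedDocument:
    """Hypothetical container for a document plus its enrichment metadata."""
    text: str
    topics: list[str] = field(default_factory=list)
    persons: list[str] = field(default_factory=list)
    places: list[str] = field(default_factory=list)
    organizations: list[str] = field(default_factory=list)
    specialist_topics: list[str] = field(default_factory=list)
    mood: float = 0.0  # assumed scale: -1.0 (negative) to 1.0 (positive)


def filter_by_topic(docs, topic):
    """Keep only documents whose metadata lists the given topic (case-insensitive)."""
    wanted = topic.lower()
    return [d for d in docs if wanted in (t.lower() for t in d.topics)]


# Made-up sample records; in practice the metadata would come from the analysis.
docs = [
    EnrichedDocument(
        text="Siemens AG presents its sustainability report in Bonn-Beuel.",
        topics=["sustainability"],
        places=["Bonn-Beuel"],
        organizations=["Siemens AG"],
        mood=0.4,
    ),
    EnrichedDocument(
        text="FC St. Pauli Hamburg loses at home.",
        organizations=["FC St. Pauli Hamburg"],
        mood=-0.6,
    ),
]

relevant = filter_by_topic(docs, "Sustainability")
print([d.organizations for d in relevant])
```

Once metadata of this kind is attached to each document, selecting the relevant subset becomes a simple query over structured fields instead of a search through raw text.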
It is important to understand that the methods are not purely keyword-based, but work on a semantic level. Different spellings, synonyms, generalizations or specializations of a term are taken into account, as are the different contexts in which terms occur. The machine is therefore also able to recognize whether a text is about a lens in the optical sense or about lentils, the starchy foodstuff (the German word "Linse" covers both meanings).
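The role of context can be illustrated with a deliberately simple sketch. It assumes the ambiguous word is the German "Linse", which can mean both an optical lens and a lentil, and it picks a sense by counting overlapping cue words in the surrounding text. This is a toy heuristic, not the deep learning approach referred to above: the cue-word sets and the function `disambiguate` are hypothetical and hand-written for illustration.

```python
import re

# Hand-written cue words per sense; purely illustrative assumptions.
SENSE_CUES = {
    "optical lens": {"camera", "focus", "aperture", "zoom", "photo", "glass"},
    "lentil": {"soup", "cook", "recipe", "protein", "starch", "boil"},
}


def disambiguate(text, sense_cues=SENSE_CUES):
    """Return the sense whose cue words overlap most with the text, or None."""
    # Lowercase and split into word tokens (including German umlauts).
    tokens = set(re.findall(r"[a-zäöüß]+", text.lower()))
    scores = {sense: len(tokens & cues) for sense, cues in sense_cues.items()}
    best = max(scores, key=scores.get)
    # Without any matching cue word there is no evidence for either sense.
    return best if scores[best] > 0 else None


print(disambiguate("The Linse needs a wider aperture for this photo."))
print(disambiguate("Boil the Linse for the soup recipe."))
print(disambiguate("The word Linse appears here without context."))
```

A production system would replace the fixed cue lists with learned representations of words and their contexts, but the principle is the same: the surrounding text, not the keyword alone, determines the meaning.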
Natural language processing and machine learning methods can therefore be used to build up a comprehensive understanding of language. This understanding is needed to filter the data relevant for specific use cases from huge amounts of unstructured text. After all, most companies want to filter and organize relevant information, not collect as much information as possible. In this way, big data can be successfully transformed into smart data.
Another article in our blog deals with smart data from the perspective of a machine learning engineer: machine learning with customer-specific data.
Project manager
Andreas studied Technology & Media Communication and is primarily responsible for internal and external communication and documentation within the company. This gives him an optimal overview of the various technologies, applications and customers of MORESOPHY.