In a world increasingly centralized around information technology, huge amounts of data are produced and stored each day. Often these data come from automatic detection systems, sensors, and scientific instrumentation, or you produce them daily and unconsciously every time you make a withdrawal from the bank or make a purchase, when you record on various blogs, or even when you post on social networks. But what are the data? The data actually are not information, at least in terms of their form. In the formless stream of bytes, at first glance it is difficult to understand their essence if not strictly the number, word, or time that they report. Information is actually the result of processing, which taking into account a certain set of data, extracts some conclusions that can be used in various ways. This process of extracting information from the raw data is precisely data science. The purpose of data science is precisely to extract information that is not easily deducible but that, when understood, leads to the possibility of carrying out studies on the mechanisms of the systems that have produced them, thus allowing the possibility of making forecasts of possible responses of these systems and their evolution in time. Starting from a simple methodical approach on data protection, data analysis has become a real discipline leading to the development of real methodologies generating models.
The model is in fact the translation into a mathematical form of a system placed under study. Once there is a mathematical or logical form able to describe system responses under different levels of precision, you can then make predictions about its development or response to certain inputs. Thus the aim of data analysis is not the model, but the goodness of its predictive power.