Terms like machine learning, big data, and data science are the new trend. In fact, the profession of data scientist has been rated among the most desirable jobs of the 21st century. Many people talk about the data revolution and artificial intelligence, but what is machine learning really, and why is it talked about so much?
Let's try to understand what data science is and why it matters.
Tom Mitchell defines machine learning in one of his books as: “The study of computer algorithms that automatically improve their performance thanks to experience. It is said that a computer program learns about a set of tasks, thanks to experience and using a performance measure, if its performance in these tasks improves with experience.”
Wikipedia defines machine learning as follows:
“Machine learning (ML) is the study of computer algorithms that improve automatically through experience. It is seen as a subset of artificial intelligence. Machine learning algorithms build a mathematical model based on sample data, known as “training data”, in order to make predictions or decisions without being explicitly programmed to do so. Machine learning algorithms are used in a wide variety of applications, such as email filtering and computer vision, where it is difficult or infeasible to develop conventional algorithms to perform the needed tasks.”
In other words, these are algorithms that learn and improve “on their own” thanks to experience. “On their own” is in quotes because they do so using data, that is, past experience. Unlike models in which a business expert writes rules based on their own knowledge (their past experience), statistical and machine learning models let the data do the talking and extract the relationships automatically.
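To make the contrast concrete, here is a minimal Python sketch. It assumes a toy spam-filtering task; the data values and the names `expert_rule`, `learn_threshold`, and `learned_rule` are invented for illustration and do not come from any library. The expert's rule uses a cutoff chosen by hand, while the learned rule picks the cutoff that best fits the examples.

```python
# Toy data, made up for illustration:
# (number of suspicious words in an email, whether it was spam).
training_data = [
    (0, False), (1, False), (2, False), (3, False),
    (4, True), (5, True), (7, True), (9, True),
]


def expert_rule(count):
    """Hand-written rule: an expert guesses the cutoff from their own experience."""
    return count >= 2


def accuracy(predict, data):
    """Performance measure: fraction of examples classified correctly."""
    return sum(predict(x) == label for x, label in data) / len(data)


def learn_threshold(data):
    """'Learning': try every observed count as a cutoff and keep the best one."""
    candidates = sorted({x for x, _ in data})
    return max(candidates, key=lambda t: accuracy(lambda x: x >= t, data))


learned_cutoff = learn_threshold(training_data)


def learned_rule(count):
    """Rule whose cutoff was extracted from the data rather than written by hand."""
    return count >= learned_cutoff


print("expert rule accuracy: ", accuracy(expert_rule, training_data))
print("learned cutoff:       ", learned_cutoff)
print("learned rule accuracy:", accuracy(learned_rule, training_data))
```

On this toy data the learned cutoff fits the examples better than the hand-picked one. Note that accuracy here is measured on the same examples used for learning; a real evaluation would use separate, held-out data.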
Why is data science so important?
Many of the methods used in machine learning and statistical modeling have been with us for several decades. Algorithms such as neural networks and support vector machines (SVMs) were devised a long time ago, and some of them even fell into disuse.
The main reasons for the current demand for these techniques are:
- On the one hand, the computational capacity of computers has been increasing, and it is now possible to tackle problems that could not be tackled before. This increase has been both vertical (improvement in the computing capacity of individual machines) and horizontal (increase in computing capacity from working with several computers at the same time, using the Big Data paradigm).
- On the other hand, the data revolution, driven by digitization, has led to a huge increase in the amount of data that can be processed and modeled to gain knowledge. Years ago there was much less data, so it was common to see statistical models built on only a few hundred records.
Today we live in an exciting period in which data and the application of techniques that extract value from it will be strategic for many countries and sectors.