Buzzwords like Machine Learning and Deep Learning have been around for quite some time. We’ve long known that intelligent systems are a promising technology that enables us to search through vast amounts of information quickly and effectively, facilitating the discovery and application of knowledge. Over the last few decades, technology has successfully taken the reins of numerous tasks that require different degrees of intelligence. In fact, many of the services offered by today’s biggest companies are based on Artificial Intelligence, such as Apple’s Siri or the recommendation engines of Amazon and Netflix.
However, it’s important that we learn to distinguish between the two technologies; Deep Learning (DL) and Machine Learning (ML) are two different concepts.
Machine Learning is a broad subfield of computer science that aims to provide computers with “intelligence”. It strives to develop a machine’s ability to learn, so that it can find the correct solution to a problem without being explicitly programmed for it. Thus, in Machine Learning, systems learn to solve problems by themselves.
Researchers in the field of Machine Learning are concerned with developing mathematical approaches and determining the parameters that can be used to solve different problems. At present, we can choose from a range of Machine Learning algorithm families, such as classification, regression, dimensionality reduction or clustering. Our choice will of course depend on the type of problem being dealt with. In short:
- Classification algorithms assign categories to unseen samples
- Regression algorithms predict numeric values from samples
- Dimensionality reduction algorithms search for alternative mathematical representations of the data
- Clustering algorithms group samples depending on their similarity
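To make the first two families above more concrete, here is a minimal sketch in plain Python. The toy data, the function names and the choice of a nearest-neighbour classifier and a least-squares line are assumptions made purely for illustration; real projects would normally rely on a dedicated library.

```python
# Toy data, made up for illustration: (hours of study, exam result).
labeled = [(1.0, "fail"), (2.0, "fail"), (6.0, "pass"), (7.0, "pass")]

def classify_1nn(x, samples):
    """Classification: assign the category of the closest known sample."""
    nearest = min(samples, key=lambda s: abs(s[0] - x))
    return nearest[1]

def fit_line(points):
    """Regression: least-squares fit of y = slope * x + intercept."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in points)
    var = sum((x - mean_x) ** 2 for x, _ in points)
    slope = cov / var
    return slope, mean_y - slope * mean_x

# Classification returns a category for an unseen sample.
print(classify_1nn(6.5, labeled))

# Regression returns a numeric value for an unseen sample.
slope, intercept = fit_line([(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)])
print(slope * 4.0 + intercept)
```

The classifier answers with a label, while the regressor answers with a number; that difference in output type is what separates the two families.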
Both Machine Learning and Deep Learning have experienced some ups and downs along the way. At times, ML attracted more interest than DL, and at others it was the other way round. It all depended on the computing power available and on each technology’s ability to meet expectations at a given point in time.
In ML, the term “intelligence” refers to a specific type of intelligence. Unlike an all-purpose, general AI, ML intelligence enables a system to provide a degree of assistance to the user; a helping hand that supplements the human skills or wisdom with knowledge automatically extracted from datasets via mathematical and computational techniques. This implies knowledge is obtained not through programming, but through “training”.
To this end, a model is built so that the system can make predictions on the basis of an input dataset; the mathematics behind the model is what drives the entire learning process. This learning process usually involves adjusting weights (called parameters) during the training phase to ensure that predictions are good in terms of accuracy, mean error, or inertia, depending on the nature of the data and the algorithm. Further fine-tuning, aimed at improving statistical quality, is carried out by choosing values for the so-called hyperparameters. Unlike parameters, these values are not learned during training: the user has to specify candidate values in a predefined grid, and the best-performing combination is kept.
Models are then iteratively defined, trained, evaluated and tested with different portions of the data to make parameter adjustments. Overfitting is to be avoided here: this is a phenomenon in which a model sticks too closely to the training data, so that it generalizes poorly to unseen samples and becomes less useful. Likewise, the information contained in the dataset will have to be pre-processed to ensure the desired degree of accuracy and interpretability. If learning is supervised, the algorithm will compare its results with labeled data, which makes it possible to determine whether the model was right or wrong for every sample (this requires manually labeling all the data, which is costly, cumbersome and virtually impossible at Big Data scale). Conversely, if no labeled data is available, we’ll resort to unsupervised learning algorithms, such as those for clustering, feature extraction and dimensionality reduction, to extract information from our datasets.
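The difference between parameters and hyperparameters can be sketched in a few lines of plain Python. Everything here is an assumption chosen for illustration: the one-weight model y = w * x, the toy data, the use of gradient descent, and the three candidate learning rates. For brevity the candidates are scored on the training data itself; in practice a separate validation portion would be used.

```python
# Toy data following y = 3 * x (made up for illustration).
data = [(1.0, 3.0), (2.0, 6.0), (3.0, 9.0)]

def mean_squared_error(w, samples):
    """Quality measure: average squared difference between prediction and target."""
    return sum((w * x - y) ** 2 for x, y in samples) / len(samples)

def train(learning_rate, samples, steps=50):
    """Adjust the single parameter w by gradient descent on the mean squared error."""
    w = 0.0
    for _ in range(steps):
        grad = sum(2 * (w * x - y) * x for x, y in samples) / len(samples)
        w -= learning_rate * grad
    return w

# Hyperparameter grid specified by the user; it is not learned from the data.
grid = [0.001, 0.01, 0.1]
results = {lr: mean_squared_error(train(lr, data), data) for lr in grid}
best_lr = min(results, key=results.get)

print(best_lr, train(best_lr, data))
```

The weight w is found automatically by the training loop, whereas the learning rate is only ever picked from the values the user listed in the grid.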
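Overfitting is easier to see with numbers. The sketch below is only an illustration under assumed conditions: the data is made up (one training label is deliberately noisy), and the two models, a nearest-neighbour lookup and a single threshold, are stand-ins for "very flexible" and "very simple" models.

```python
# Made-up data: the true rule is "label 1 when x >= 5"; (3.0, 1) is a noisy label.
train_set = [(1.0, 0), (2.0, 0), (3.0, 1), (6.0, 1), (7.0, 1), (8.0, 1)]
test_set = [(2.6, 0), (3.4, 0), (4.0, 0), (5.5, 1), (9.0, 1)]

def nearest_neighbour(x):
    """Flexible model: copies the label of the closest training sample."""
    return min(train_set, key=lambda s: abs(s[0] - x))[1]

def class_mean(label):
    """Average x value of the training samples that carry the given label."""
    xs = [x for x, y in train_set if y == label]
    return sum(xs) / len(xs)

# Simple model: one threshold halfway between the two class means.
threshold = (class_mean(0) + class_mean(1)) / 2

def threshold_rule(x):
    return 1 if x >= threshold else 0

def accuracy(model, samples):
    """Fraction of samples for which the model returns the tagged label."""
    return sum(model(x) == y for x, y in samples) / len(samples)

for name, model in [("nearest neighbour", nearest_neighbour),
                    ("threshold", threshold_rule)]:
    print(name, accuracy(model, train_set), accuracy(model, test_set))
```

The nearest-neighbour model is perfect on the training portion because it has memorized it, noise included, yet it does worse than the simple threshold on the held-out portion. Comparing scores on data the model has not seen is what reveals the problem.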
Neural networks can roughly be considered a subset of these Machine Learning techniques. They are particularly useful when it comes to problems involving unlabeled datasets or Big Data, making it possible to automatically extract valuable information from patterns.
Approaching Neural Networks: Deep Learning
Deep Learning itself extends Machine Learning, focusing on Big Data and GPU processing (the latter is convenient but not strictly necessary). Neither of them is a one-size-fits-all tool for all problems.
Software neurons are simple processing units which simulate, to some extent, the work of their biological counterparts. A neuron has a number of weighted inputs; an activation function is applied to their weighted sum to produce the neuron’s output. Neurons are grouped in layers, and the outputs of one layer are linked to the inputs of the following one, although there are different variants of this structure. A layer can contain any number of neurons.
Neural networks are composed of a number of layers, each one performing simple operations which make up a complex “reasoning” process when combined. These layers fall into three categories: input, output, and hidden (the ones in between). Optimization algorithms are applied to infer adequate weights for each neuron; hence the computationally demanding nature of these processes.
Taking the widely used examples of handwritten digit recognition or image classification, each of the layers would be responsible for identifying a detail such as a border or a particular shape pattern, each with a specific degree of accuracy. To put things in perspective, these problems can also be tackled by other Machine Learning algorithms, such as Support Vector Machines (SVMs), which follow a different approach.
The term Deep Learning refers to training Neural Networks with more than two hidden layers, regardless of how many additional layers the network has.
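A software neuron of the kind described above fits in a few lines of plain Python. This is a minimal sketch, not a reference implementation: the sigmoid activation, the bias term and the function names are assumptions that the text does not prescribe.

```python
import math

def sigmoid(z):
    """A common activation function; squashes any number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def neuron(inputs, weights, bias, activation=sigmoid):
    """Weighted sum of the inputs plus a bias, passed through an activation."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return activation(z)

def layer(inputs, weight_rows, biases):
    """A layer is a list of neurons that all read the same inputs."""
    return [neuron(inputs, w, b) for w, b in zip(weight_rows, biases)]

# One neuron with two weighted inputs.
print(neuron([1.0, 2.0], [0.5, -0.25], 0.0))

# A layer of two neurons; its outputs would feed the next layer's inputs.
print(layer([1.0, 2.0], [[0.5, -0.25], [1.0, 1.0]], [0.0, -3.0]))
```

Chaining calls to `layer`, with each result passed in as the next call's inputs, gives the layered structure the paragraph describes.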
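The way simple layers combine into something more capable can be illustrated with the classic XOR function, which no single neuron can compute but a network with one hidden layer can. In this sketch the weights are picked by hand rather than found by an optimization algorithm, and the step activation is an assumption chosen to keep the arithmetic easy to follow.

```python
def step(z):
    """Step activation: the neuron fires (1) only for a positive weighted sum."""
    return 1 if z > 0 else 0

def forward(x1, x2):
    # Hidden layer: two neurons with hand-picked weights (not learned).
    h_or = step(1.0 * x1 + 1.0 * x2 - 0.5)   # fires if at least one input is 1
    h_and = step(1.0 * x1 + 1.0 * x2 - 1.5)  # fires only if both inputs are 1
    # Output layer: "OR but not AND", which is exactly XOR.
    return step(1.0 * h_or - 1.0 * h_and - 0.5)

for a in (0, 1):
    for b in (0, 1):
        print(a, b, forward(a, b))
```

In a real network the numbers written by hand here are exactly what the optimization step has to discover, which is where the computational cost comes from.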
A vast variety of Neural Network configurations is available nowadays, but these are the most popular ones:
Perceptron
This is the simplest model: a single neuron applies the activation function to its weighted inputs and turns the result directly into an output. A multi-layer version built from perceptrons (input layer + one hidden layer + output layer), often called a Vanilla Neural Network, extends this behavior by adding a layer of densely interconnected neurons. Training such a network is made possible by the backpropagation algorithm, which computes how the loss of the network (the function that has to be minimized to improve the quality of the predictions) changes with respect to each weight, so that the weights can be adjusted accordingly.
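How the simplest model above learns its weights can be sketched with the classic perceptron learning rule. Note that this is not backpropagation: it is the simpler update that works for a single neuron, shown here on a toy AND gate. The learning rate, the number of epochs and the function names are assumptions made for the example.

```python
def predict(weights, bias, inputs):
    """Single neuron with a step activation."""
    total = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1 if total > 0 else 0

def train_perceptron(samples, learning_rate=1.0, epochs=20):
    """Perceptron rule: nudge the weights whenever a prediction is wrong."""
    weights, bias = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for inputs, target in samples:
            error = target - predict(weights, bias, inputs)  # -1, 0 or +1
            weights = [w + learning_rate * error * x
                       for w, x in zip(weights, inputs)]
            bias += learning_rate * error
    return weights, bias

# Toy dataset: the logical AND of two binary inputs.
and_gate = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
weights, bias = train_perceptron(and_gate)

print([predict(weights, bias, x) for x, _ in and_gate])
```

Backpropagation generalizes this idea to networks with hidden layers, where the error of the output has to be traced back through every layer to decide how each weight should change.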
Convolutional Neural Networks
These Neural Networks typically take an image as input and return a prediction about it, such as a category or, in some cases, another image. A common example is object identification in images. They decompose the problem into simpler ones by applying filters to the image’s color channels (RGB). Recent applications in the field of malicious code identification have had impressive results.
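The filtering operation at the heart of these networks can be sketched in plain Python. This is a simplified illustration with a single channel, a square filter and no padding; the tiny image and the hand-written vertical-edge filter are assumptions, since in a real network the filter values are learned during training.

```python
def convolve2d(image, kernel):
    """Slide a square filter over the image (no padding) and sum the products."""
    k = len(kernel)
    rows = len(image) - k + 1
    cols = len(image[0]) - k + 1
    return [[sum(image[r + i][c + j] * kernel[i][j]
                 for i in range(k) for j in range(k))
             for c in range(cols)]
            for r in range(rows)]

# A 4x6 single-channel image: dark on the left, bright on the right.
image = [[0, 0, 0, 9, 9, 9] for _ in range(4)]

# Hand-written vertical-edge filter; a real CNN would learn these values.
kernel = [[-1, 0, 1],
          [-1, 0, 1],
          [-1, 0, 1]]

feature_map = convolve2d(image, kernel)
print(feature_map)
```

The resulting feature map is zero over the flat regions and large where the brightness changes, which is how a filter turns raw pixels into a simpler signal ("there is a vertical edge here") for later layers to build on.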
Recurrent Neural Networks
These networks focus on identifying patterns in sequences of data. Some of them are given a small amount of “memory”: their neurons are capable of retaining information about prior states (as in Long Short-Term Memory networks, LSTM). The most popular application of this type of Neural Network is Natural Language Processing (NLP).
Several other groups deserve a mention, such as Recursive Neural Networks (image processing and NLP) or Unsupervised Pretrained Networks (data generation and unsupervised learning).
A range of tools can be used to develop Neural Networks. Luckily, some of them work at quite a high level of abstraction, such as the widespread Keras or PyTorch. Other, lower-level tools include Google’s popular TensorFlow, which offers Graphics Processing Unit (GPU) support.
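The idea of a network that carries a state along a sequence can be sketched with a single recurrent unit. This is a plain recurrent cell, much simpler than an LSTM, and the fixed weights, the tanh activation and the function names are assumptions chosen for illustration.

```python
import math

def rnn_step(state, x, w_state=0.5, w_input=1.0):
    """One recurrent step: the new state mixes the previous state and the input."""
    return math.tanh(w_state * state + w_input * x)

def run(sequence):
    """Feed a sequence through the cell, one element at a time."""
    state = 0.0  # the network's "memory", empty at the start
    for x in sequence:
        state = rnn_step(state, x)
    return state

# The same values in a different order lead to different final states.
print(run([1.0, 0.0, 0.0]), run([0.0, 0.0, 1.0]))
```

Because each step depends on the state left behind by the previous one, the order of the elements matters, which is what makes this family suitable for text and other sequential data.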
How can Deep Learning and Machine Learning help SmartCLIDE?
Developing software can be frustrating and messy; boilerplate code is repetitive and it may be difficult to reuse previously created items, especially in large company environments. Services, however, are a useful programming paradigm which enhances scalability and resource control. Thanks to their atomic functionality, they can be maintained by small independent teams, for example through zero-downtime updates in continuous integration environments. Hence, SmartCLIDE proposes an assistant that will:
- Help users develop services based on BPMN (Business Process Model and Notation) schemes
- Help developers create quality code through suggestions, syntax highlighting and easy access to documentation
- Help users/developers reuse already existing services
Apart from this, a non-technical user should be able to define a functionality and be guided through the software composition process, making use of existing and previously classified services.
SmartCLIDE will research Machine Learning and Deep Learning techniques, testing their advantages over simpler approaches when faced with challenges in quality assessment, service composition, service classification and service discovery, along with code suggestions.
In short, a Deep Learning Engine will be designed to support ML and DL techniques at the core of SmartCLIDE. This is a big challenge in terms of both the project’s aims and the number of techniques to be tested.
Let’s get started!!