If you attended our NextStep 2020 product session about AI infusion in your Apps, you already got a sneak peek of Machine Learning Builder (in short, ML Builder). In a nutshell, it’s a tool that allows you to use existing data to solve highly personalized scenarios through machine learning.
In the past, I provided some background on the importance of predictive capabilities and on pragmatic uses of artificial intelligence across different scenarios and industries. Now, I’ll dive deep into the machine learning lifecycle, offering guidance and best practices that you can apply in ML Builder and that will help you understand what’s under the hood.
The Traditional Machine Learning Lifecycle
Traditionally, to build and deploy a machine learning model, you must complete several steps commonly known as the machine learning lifecycle.
You start by setting the business goal, identifying the problem you want to solve. Define what you want to predict, and then understand the data you have available and ensure that it aligns with your business goal. Additionally, it’s best to guarantee that the required data is accessible.
After getting through the first two stages, it’s time to prepare the data. This step ensures that your data is relevant and usable for training a machine learning model, and it may include additional sub-steps, such as:
- Set up your infrastructure to consume the needed data.
- Detect and correct corrupt or inaccurate records, identifying unusable or irrelevant data and replacing or modifying it (data cleansing).
- Transform and map data from raw data into a more appropriate format, valuable for analytics purposes (data wrangling).
- Extract features from raw data through data mining techniques. These are all the attributes that feed the model.
- Manually curate data (data labeling).
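To make these sub-steps more concrete, here is a minimal sketch in plain Python. The records, field names (`age`, `monthly_spend`, `churned`), and cleaning rules are all hypothetical, chosen only for illustration; this is not how ML Builder prepares data internally.

```python
from statistics import median

# Hypothetical raw customer records. None and negative values stand in
# for the corrupt or missing entries described above.
raw_records = [
    {"customer_id": "c1", "age": "34", "monthly_spend": "120.50", "churned": "yes"},
    {"customer_id": "c2", "age": None, "monthly_spend": "80.00", "churned": "no"},
    {"customer_id": "c3", "age": "-5", "monthly_spend": "200.00", "churned": "no"},
    {"customer_id": "c4", "age": "51", "monthly_spend": None, "churned": None},
]


def to_number(value):
    """Wrangling: convert a raw string into a float, or None if unusable."""
    try:
        return float(value)
    except (TypeError, ValueError):
        return None


def is_valid_age(age):
    return age is not None and 0 < age < 120


def clean(records):
    """Cleansing: drop unlabeled rows and replace invalid ages with the median."""
    labeled = [r for r in records if r["churned"] in ("yes", "no")]
    ages = [to_number(r["age"]) for r in labeled]
    fallback_age = median(a for a in ages if is_valid_age(a))
    cleaned = []
    for record, age in zip(labeled, ages):
        cleaned.append({
            "age": age if is_valid_age(age) else fallback_age,
            "monthly_spend": to_number(record["monthly_spend"]),
            "churned": record["churned"] == "yes",
        })
    return cleaned


def extract_features(row):
    """Feature extraction: build the attributes that would feed a model."""
    is_senior = 1.0 if row["age"] >= 50 else 0.0  # an assumed derived feature
    return [row["age"], row["monthly_spend"], is_senior]


cleaned_rows = clean(raw_records)
features = [extract_features(r) for r in cleaned_rows]
labels = [r["churned"] for r in cleaned_rows]
```

Record `c4` is dropped because it has no label, and the missing or negative ages are replaced with the median of the valid ones. Real projects need rules that reflect the actual data, which is the kind of decision auto-ML tools try to take off your hands.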
After preparing all your data, you should find the model that fits it best. In this step, you’ll train your model through several experiments, each combining a set of features with a machine learning algorithm. Then, evaluate each experiment’s KPIs and metrics, making sure that your model’s performance on the test dataset matches your business needs.
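As a rough illustration of training on one part of the data and evaluating on a held-out test set, consider the sketch below. The synthetic data, the one-feature threshold "model", and the `required_accuracy` value are assumptions made up for this example.

```python
import random

# Synthetic, hypothetical data: one numeric feature and a boolean label.
rows = [(x, x >= 10) for x in range(20)]


def split(all_rows, test_ratio=0.25, seed=42):
    """Shuffle reproducibly, then hold out a fraction of rows for testing."""
    shuffled = all_rows[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * (1 - test_ratio))
    return shuffled[:cut], shuffled[cut:]


def accuracy(scored_rows, threshold):
    """Share of rows where 'feature >= threshold' matches the label."""
    correct = sum((x >= threshold) == label for x, label in scored_rows)
    return correct / len(scored_rows)


def train_threshold(train_rows):
    """Training: pick the threshold that maximizes accuracy on the training set."""
    candidates = sorted({x for x, _ in train_rows})
    return max(candidates, key=lambda t: accuracy(train_rows, t))


train_rows, test_rows = split(rows)
threshold = train_threshold(train_rows)

# Evaluation uses only rows the model never saw during training.
test_accuracy = accuracy(test_rows, threshold)
required_accuracy = 0.8  # assumed business requirement, not a general rule
meets_business_need = test_accuracy >= required_accuracy
print(f"threshold={threshold}, test accuracy={test_accuracy:.2f}, ok={meets_business_need}")
```

The key point is the separation: the threshold is chosen from `train_rows` only, and the decision about whether the model is acceptable is based on `test_rows`.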
All that’s left in this traditional process is to deploy the ML model so that it can be consumed (for instance, as a REST API). Afterward, keep monitoring your model in production, evaluating its performance and retraining if necessary.
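To give a feel for what "consumed as a REST API" means, here is a minimal sketch that wraps a stand-in model in an HTTP endpoint using only Python's standard library and then calls it. The `predict` rule, the `/predict` path, and the JSON payload shape are hypothetical; a production deployment would also need authentication, input validation, and error handling.

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


def predict(features):
    """Stand-in for a trained model: flags high spenders. Purely illustrative."""
    return {"prediction": features["monthly_spend"] > 100}


class PredictionHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Read the JSON request body and return the prediction as JSON.
        length = int(self.headers.get("Content-Length", 0))
        payload = json.loads(self.rfile.read(length))
        body = json.dumps(predict(payload)).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the example quiet


def call_endpoint(url, features):
    """Client side: POST the features as JSON and parse the JSON answer."""
    request = urllib.request.Request(
        url,
        data=json.dumps(features).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    # An empty ProxyHandler makes sure the local call bypasses any system proxy.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    with opener.open(request, timeout=5) as response:
        return json.loads(response.read())


server = HTTPServer(("127.0.0.1", 0), PredictionHandler)  # port 0: any free port
threading.Thread(target=server.serve_forever, daemon=True).start()

url = f"http://127.0.0.1:{server.server_port}/predict"
result = call_endpoint(url, {"monthly_spend": 120.5})
print(result)

server.shutdown()
server.server_close()
```

Any application that can send an HTTP request can consume a model exposed this way, which is why REST is a common choice for serving predictions.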
Paradigm Change with ML Builder
To change the traditional paradigm, we’re introducing ML Builder. Its main goal is to automate several of the traditional steps described above with processing capabilities and mathematical techniques to infer and replace data scientists' manual decisions. This technology is commonly called auto-ML.
McKinsey reports that many organizations have found that between 60 and 80 percent of a data scientist’s time is spent preparing data for modeling. Automating this work significantly boosts a data scientist’s productivity and frees up time to focus on other complex problems.
The introduction of auto-ML implies a massive transformation in the field of machine learning, one from which companies without data scientists or data engineers would also benefit.
A Simplified Version of the Machine Learning Lifecycle
At OutSystems, we aim to reduce the skill level required to carry out the machine learning lifecycle, so that you don’t need to be an expert data scientist. With this goal in mind, ML Builder automates most processes, optimizing user decisions and guarding against factors that would negatively impact the model’s performance.
This is how we reimagine the machine learning lifecycle when using ML Builder.
The first phase, solution design, consists mainly of defining your business goal and understanding what you want to predict.
Of the four steps (solution design, data preparation, model training, and deploying), this is the one that requires the most knowledge about your business and about whether machine learning is the best way to achieve your goals. Here you should be able to identify the target data and the machine learning use case that will be used to make that prediction.
The second phase, data preparation, covers the previously mentioned steps of data cleansing, data wrangling, and feature engineering. This is where the automation starts, but understanding the data you’re dealing with is still key to success.
We’re continuously developing ML Builder to address the most common use cases by detecting them proactively and automating them. Specific use cases may still need human supervision, but ML Builder will always guide the end user.
The third phase, model training, is where you compose the experiments of features and algorithms, train the model, and evaluate its performance.
It’s where you’ll find the most automation, such as building several experiments (a combination of features and algorithms), running and training processes, and measuring the results. Everything is processed much faster than if performed manually.
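The idea of an experiment as a combination of features and algorithms can be sketched as a small loop. The two toy "algorithms", the feature names, and the data below are assumptions invented for illustration; they are not the algorithms ML Builder runs.

```python
from itertools import combinations, product

# Hypothetical labeled data, already split into training and validation rows.
train = [
    ({"age": 22, "spend": 30}, False),
    ({"age": 35, "spend": 150}, True),
    ({"age": 41, "spend": 40}, False),
    ({"age": 58, "spend": 210}, True),
]
validation = [
    ({"age": 29, "spend": 180}, True),
    ({"age": 63, "spend": 25}, False),
]


def majority_baseline(rows, feature_names):
    """Toy algorithm A: always predict the most common training label."""
    positives = sum(label for _, label in rows)
    answer = positives * 2 >= len(rows)
    return lambda features: answer


def mean_threshold(rows, feature_names):
    """Toy algorithm B: predict True when the feature sum exceeds the training mean."""
    def total(features):
        return sum(features[name] for name in feature_names)

    cutoff = sum(total(f) for f, _ in rows) / len(rows)
    return lambda features: total(features) > cutoff


def accuracy(model, rows):
    return sum(model(f) == label for f, label in rows) / len(rows)


# Every non-empty subset of the available features.
feature_sets = [
    subset
    for size in (1, 2)
    for subset in combinations(["age", "spend"], size)
]
algorithms = {"majority_baseline": majority_baseline, "mean_threshold": mean_threshold}

# One experiment per (feature set, algorithm) pair.
experiments = []
for feature_names, (algo_name, trainer) in product(feature_sets, algorithms.items()):
    model = trainer(train, feature_names)
    experiments.append({
        "features": feature_names,
        "algorithm": algo_name,
        "score": accuracy(model, validation),
    })

best = max(experiments, key=lambda e: e["score"])
print(best)
```

Even with two features and two algorithms there are already six experiments to run and compare, which hints at why automating this stage saves so much time on real datasets.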
The most prominent challenge in this stage is the precise mathematics (and a lot of trial and error) behind model validation, which involves hyperparameter tuning and model testing. The evaluation process is also critical because, ultimately, it’s what defines whether a model’s performance is good enough. And simply defining “good” is a challenge on its own.
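A small worked example shows why defining "good" is hard. The numbers below are hypothetical: 100 cases, of which only 5 are positive, scored by a "model" that always predicts the negative class.

```python
def evaluate(y_true, y_pred):
    """Compute accuracy, precision, and recall from true and predicted labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t and p for t, p in pairs)          # true positives
    fp = sum((not t) and p for t, p in pairs)    # false positives
    fn = sum(t and (not p) for t, p in pairs)    # false negatives
    tn = sum((not t) and (not p) for t, p in pairs)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }


# Imbalanced, made-up dataset: 5 positive cases out of 100.
y_true = [True] * 5 + [False] * 95
always_negative = [False] * 100

scores = evaluate(y_true, always_negative)
print(scores)
```

This model reaches 95 percent accuracy while finding none of the positive cases (its recall is zero). Whether that counts as "good" depends entirely on the business goal, so the metric has to be chosen with that goal in mind.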
The final phase, deploying, is when the machine learning model becomes available to serve predictions, and when you evaluate its results and monitor its performance in production.
Deploying a model as a REST API may seem like a purely operational process. However, traditionally, to use the model’s predictions, you’d have to involve different roles, such as scientists, engineers, and even designers. With ML Builder, that complexity is hidden from the user.
Additionally, monitoring the model is key to success. Detecting model degradation and recognizing when relevant business changes may affect the model are crucial to understanding when it should be retrained or even rebuilt.
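One simple way to watch for degradation is to compare accuracy over a recent window of predictions with the accuracy measured at training time. The sketch below assumes you eventually learn the true outcome of each prediction; the class name, `tolerance`, and `window` values are hypothetical choices, not ML Builder settings.

```python
from collections import deque


class PerformanceMonitor:
    """Tracks recent prediction outcomes and flags a drop below the baseline."""

    def __init__(self, baseline_accuracy, tolerance=0.05, window=50):
        self.baseline = baseline_accuracy
        self.tolerance = tolerance
        self.outcomes = deque(maxlen=window)  # keeps only the latest results

    def record(self, predicted, actual):
        self.outcomes.append(predicted == actual)

    def rolling_accuracy(self):
        if not self.outcomes:
            return None
        return sum(self.outcomes) / len(self.outcomes)

    def needs_retraining(self):
        # Wait for a full window so a few early errors don't trigger an alarm.
        if len(self.outcomes) < self.outcomes.maxlen:
            return False
        return self.rolling_accuracy() < self.baseline - self.tolerance


monitor = PerformanceMonitor(baseline_accuracy=0.90, tolerance=0.05, window=10)

# Ten correct predictions: the model behaves as it did at training time.
for _ in range(10):
    monitor.record(predicted=True, actual=True)
healthy = monitor.needs_retraining()

# Three wrong predictions push the rolling accuracy down to 0.7.
for _ in range(3):
    monitor.record(predicted=True, actual=False)
degraded = monitor.needs_retraining()

print(healthy, degraded)
```

A check like this only catches degradation that shows up in measured outcomes. Business changes that haven't yet affected the data still need a human to decide whether the model should be revisited.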
Automate Your Processes with Machine Learning
Building a machine learning model is never a simple task. However, at OutSystems, we’re trying to reduce the effort needed by enabling you to use your data and leverage the latest technologies, massively reducing the data science knowledge required to prepare data and train models.