Automated machine learning, or “AutoML,” is getting a lot of attention today, and for obvious reasons. Just about every company across every industry is trying to use machine learning and artificial intelligence (AI) to boost productivity, optimize operations, and accelerate revenue.
But with all the attention, there are also various takes on what constitutes an AutoML system. In this post, we explain how Ople defines a true AutoML solution and why “brute force” AutoML falls short of the requirements of an end-to-end data science process.
There are a number of brute force AutoML offerings available, but for the most part, they operate as follows:
- Data Preparation: Before you can start using a brute force AutoML platform, you need to prepare your data just as if you were going to build models by hand. Data preparation, which can take as much as 80% of a data scientist’s time, requires gathering data from multiple sources, normalizing and extrapolating it, and eventually getting it into a single flat file. Most brute force AutoML tools have no capabilities for data prep, so you must rely on specialized software and lots of data wrangling.
- Data Ingestion: This is where brute force AutoML systems start to deliver value. Once you import a single data file and specify the target variable to predict, these systems analyze the data and make a few basic adjustments, like normalization and simple transformations, to make sure the data is suitable for the machine learning challenge at hand.
- Model Building: Building machine learning models is the core of brute force AutoML. The system contains a library of dozens or even hundreds of machine learning models, chooses the ones that are appropriate for the data challenge at hand, and runs a competition among them. This is the real “brute force” part of the equation. The system runs all of the candidate models and ranks them by performance metrics, producing a leaderboard of potential models to use. The best brute force AutoML solutions let you tweak and tune hyperparameters and then re-run the competition for better results.
- Model Exploration: Once an acceptable model has been identified, a data scientist is required to validate that the model operates as it should. Depending on the brute force technology, there may be very little visibility into why the model worked as it did. Sometimes called “black boxes,” these solutions offer little in the way of explainability, so data scientists are needed to make the call on whether a model is working as it should or is overfitting the data. In fact, a lack of model explainability is the single biggest reason why companies do not deploy models into production – if the data science team cannot explain the model in a way that is easily understood, leadership is rarely comfortable enough to risk their production systems, and their business, on that model.
- Deployment: Once a model is chosen and the data science team is comfortable with its operation, the next step is deployment. Most brute force AutoML systems have very few options for deployment. Most often, they simply provide the code for the final model in R or Python, which is then handed to the development team to re-code in the C++ or Java languages used in their production systems. This is a lengthy process that often introduces errors, so the re-coded model needs to be tested and refined.
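To make the model-building step above concrete, here is a minimal sketch of the "run every candidate and rank the results" idea, written in plain Python. It is an illustration only and does not represent how any particular AutoML product works. The three toy models, the function names (`fit_majority`, `fit_stump`, `fit_nearest_neighbor`, `run_leaderboard`), and the tiny dataset are all assumptions invented for this example.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Row = Tuple[float, ...]
Predictor = Callable[[Row], int]
Trainer = Callable[[Sequence[Row], Sequence[int]], Predictor]


def fit_majority(X: Sequence[Row], y: Sequence[int]) -> Predictor:
    """Baseline: always predict the most common training label."""
    labels = list(y)
    majority = max(sorted(set(labels)), key=labels.count)
    return lambda row: majority


def fit_stump(X: Sequence[Row], y: Sequence[int]) -> Predictor:
    """Decision stump on the first feature: predict 1 when row[0] > threshold."""
    def train_accuracy(t: float) -> float:
        return sum((row[0] > t) == bool(label) for row, label in zip(X, y)) / len(y)

    # Try every observed value as a threshold and keep the best one.
    threshold = max((row[0] for row in X), key=train_accuracy)
    return lambda row: int(row[0] > threshold)


def fit_nearest_neighbor(X: Sequence[Row], y: Sequence[int]) -> Predictor:
    """1-nearest-neighbor using squared Euclidean distance."""
    def predict(row: Row) -> int:
        def distance(i: int) -> float:
            return sum((a - b) ** 2 for a, b in zip(X[i], row))
        return y[min(range(len(X)), key=distance)]
    return predict


def run_leaderboard(
    candidates: Dict[str, Trainer],
    X_train: Sequence[Row], y_train: Sequence[int],
    X_test: Sequence[Row], y_test: Sequence[int],
) -> List[Tuple[str, float]]:
    """Train every candidate, score it on held-out data, and rank by accuracy."""
    scores = []
    for name, trainer in candidates.items():
        predict = trainer(X_train, y_train)
        correct = sum(predict(row) == label for row, label in zip(X_test, y_test))
        scores.append((name, correct / len(y_test)))
    return sorted(scores, key=lambda item: item[1], reverse=True)


# Hypothetical toy data: one feature, binary target.
X_TRAIN = [(1.0,), (2.0,), (3.0,), (4.0,), (8.0,), (9.0,), (10.0,)]
Y_TRAIN = [0, 0, 0, 0, 1, 1, 1]
X_TEST = [(1.5,), (2.5,), (5.0,), (8.5,), (9.5,)]
Y_TEST = [0, 0, 0, 1, 1]

CANDIDATES: Dict[str, Trainer] = {
    "majority_baseline": fit_majority,
    "decision_stump": fit_stump,
    "nearest_neighbor": fit_nearest_neighbor,
}

for name, accuracy in run_leaderboard(CANDIDATES, X_TRAIN, Y_TRAIN, X_TEST, Y_TEST):
    print(f"{name}: {accuracy:.2f}")
```

Note what the sketch leaves out: it assumes the data already arrives as one clean table, it says nothing about why the top model wins, and it produces Python functions rather than something a production system can call. Those are the same gaps in data prep, explainability, and deployment described in the list above.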
Brute force AutoML systems offer a host of productivity benefits for companies looking to accelerate the data science process, but they need data scientists to operate them effectively. Brute force systems can deliver models quickly, but without knowledge of math, statistics, and programming, there is no way to ensure the results are sound, and that requires expensive data science talent. The challenge is that data scientists are difficult to find, hire, and retain.
What brute force AutoML lacks is true automation across the data science process – from data prep, to model building, to explainability, and finally, deployment. Moreover, brute force approaches lack the intelligence to enable business analysts and other users to be effective in developing machine learning and AI.
A true AutoML technology, like the Ople Platform, delivers an end-to-end experience that allows anyone who is familiar with your business and your data to be highly effective in creating high-performing machine learning models and AI projects.
Offering the best of brute force systems – speed and automation – while delivering advanced intelligence and ease-of-use, a true AutoML solution provides the following capabilities:
- Engineered for Business Users: True AutoML systems enable business users, such as business analysts, BI specialists, and others who understand the business and its data, to build models without a traditional data science background. True AutoML delivers value in minutes with end-to-end automation, taking care of all the complex operations: data prep, feature engineering, model creation, optimization, and deployment. True AutoML solutions push AI and machine learning to every corner of your organization.
- Fully Transparent: By offering a complete explanation of how a model is built and which variables impact every prediction, true AutoML provides the transparency and explainability needed to build trust. When business users are developing machine learning models, there is a tendency to be cautious about the results. With a true AutoML solution, full explainability helps everyone understand how a model makes its predictions and why, building trust in the results. Without trust, there is no adoption.
- Deliver Business Value in Real-Time: Building machine learning models is great, but the value is only derived when you can deliver the predictions on new data. This requires deployment, and a true AutoML system is designed with deployment as the end goal. With hooks into your analytical tools or business applications, or the ability to export results to a simple file like CSV/XLS, true AutoML systems make the last mile, which is historically the most complex, extremely easy.
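As a rough illustration of the last two capabilities, the sketch below scores new records with a simple linear model, reports how much each variable pushed a given prediction up or down, and writes the results as CSV text. It is not the Ople Platform's method or API. The weights, baseline values, feature names, and the functions `explain` and `export_predictions` are hypothetical and chosen only to keep the arithmetic easy to follow.

```python
import csv
import io
from typing import Dict, List, Tuple

# Hypothetical linear scoring model; all numbers below are made up.
WEIGHTS = {"monthly_spend": 0.5, "support_tickets": -2.0, "tenure_years": 1.0}
BASELINE = {"monthly_spend": 10.0, "support_tickets": 1.0, "tenure_years": 2.0}
INTERCEPT = 0.0


def explain(record: Dict[str, float]) -> Tuple[float, Dict[str, float]]:
    """Return the score and each feature's contribution relative to the baseline."""
    contributions = {
        name: WEIGHTS[name] * (record[name] - BASELINE[name]) for name in WEIGHTS
    }
    baseline_score = INTERCEPT + sum(WEIGHTS[name] * BASELINE[name] for name in WEIGHTS)
    return baseline_score + sum(contributions.values()), contributions


def export_predictions(records: List[Dict[str, float]]) -> str:
    """Score new records and return CSV text naming the top driver of each score."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["record", "score", "top_driver", "top_driver_contribution"])
    for index, record in enumerate(records):
        score, contributions = explain(record)
        # The top driver is the feature with the largest absolute contribution.
        driver = max(contributions, key=lambda name: abs(contributions[name]))
        writer.writerow([index, score, driver, contributions[driver]])
    return buffer.getvalue()


# Hypothetical new data to score.
NEW_RECORDS = [
    {"monthly_spend": 20.0, "support_tickets": 4.0, "tenure_years": 3.0},
    {"monthly_spend": 30.0, "support_tickets": 2.0, "tenure_years": 1.0},
]

print(export_predictions(NEW_RECORDS))
```

A linear model is used here because its per-variable contributions can be read off directly. Explaining more complex models generally calls for other techniques, which this sketch does not attempt to cover.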
As with most things in business, finding the right tools is critical to delivering value for your company. And with AI and machine learning, the number of tools and solutions available can be overwhelming. There’s much riding on your company’s ability to deliver advanced analytics, above all its ability to remain competitive in an increasingly analytics-driven world.
Brute force AutoML solutions offer a boost in productivity for organizations that have a fully staffed data science team. Brute force systems enable these companies to accelerate their model building for some of the more practical data science challenges they work with. But there are still gaps in the process, especially in data prep, explainability, and deployment. And these systems require data science experience to fully realize the value.
With a true AutoML solution like the Ople Platform, your company will empower business users to deliver machine learning and AI projects in a fraction of the time required by traditional data science methods, and with features that allow business users to develop and deploy these projects with ease.