Machine learning has been trending since the grand trio of mega corporations – Amazon, Microsoft, and Google – announced the addition of ML services to their range of digital products. Microsoft Azure ML was introduced to the public in February of 2015, then Amazon released its ML services in April of the same year, and in October of 2015, Google launched a public beta test of Google Cloud Datalab. We already did a comprehensive overview of machine learning services offered by Amazon, and today we’re going to take a closer look at Microsoft Azure and its machine learning capabilities.
The evolution of machine learning demanded new tools to enhance the process of creating ML models that can be implemented in applications. The general idea was to speed up all typical operations in the ML life cycle (preparing and processing data, applying learning algorithms, testing, deploying). Another grand question was: is it possible to do all these things automatically? The answer was positive, and Azure machine learning came into play.
Quick Overview and Notable Features
Azure has four main machine learning components. Until September of 2018 there were five, but Microsoft retired the Azure ML Workbench application in order to improve user experience and architecture. So today, developers can tinker with the Experimentation service, the Model Management service, a set of libraries, and Visual Studio Tools for AI. Anybody can use these tools to train and deploy machine learning models in the cloud, on-premises, or at the Edge. Azure ML fully supports open-source Python libraries, which means you can use TensorFlow or PyTorch to build ML models.
Experiments can be carried out in a managed environment (e.g., a Spark cluster) with plenty of GPUs to accelerate execution. Azure’s custom MMLSpark framework provides integration with Apache Spark. Plus, the Microsoft Cognitive Toolkit and OpenCV are available to enhance the working process. Thanks to these features, developers can quickly create scalable predictive models for large image and text datasets. For additional specifics, we recommend checking Azure’s GitHub repository, which contains extensive documentation on this topic.
Since we mentioned Spark in the previous paragraph and highlighted it as one of the useful integrations for machine learning, we will also briefly explain what it is. Spark is a unified analytics engine for large-scale data processing with built-in machine learning support, and it has been adopted and deployed by leading IT companies on a massive scale. It is one of the best frameworks for processing large amounts of data in parallel. Due to its capabilities, Spark is often used for ad optimization, security monitoring routines, fraud detection, and operational optimization. And for any of these cases, you can make your own ML model with Azure ML.
With Microsoft’s Project Brainwave, you are not limited to running your models on CPUs or GPUs; you can use FPGAs (Field-Programmable Gate Arrays) as well. An FPGA is a highly efficient and flexible circuit that can be reconfigured to suit the workload. A trained neural network can run very quickly on an FPGA and can even be parallelized across several of them to scale a service. This is how you can serve inference requests in real time with low latency.
Execution and Management
When it comes to execution, users are able to run experiments across multiple environments (Local Native, Local Docker container, Remote Docker container, Spark Cluster, etc.). This is where we can clearly see Microsoft’s dedication to the hybrid computing approach. The hybrid cloud takes the best from both private and public clouds. It gives businesses the elasticity of the public cloud together with the customization and control of a dedicated computing infrastructure hosted on-premises. In a few words, hybrid technology is all about high scalability and reliability, backed by security measures that a company can implement using firewalls and other tools from Azure’s arsenal.
When you’re managing a multitude of ML models, you should always keep in mind that they must be updated over time in order to stay effective. You need to re-train them on new data. For checking previous training runs, it is advisable to use Visual Studio, as it logs all your model-related actions. Another tip worth noting: Azure’s model management service lets data scientists use simple CLI commands to containerize models and deploy them to all kinds of computing environments.
The end-to-end workflow includes preparing data in the cloud using Azure Databricks in combination with the Python SDK to train models and keep track of all key metrics. After that, you can register your trained model, deploy it to the cloud or the Edge, and orchestrate the whole thing using Azure Machine Learning pipelines.
Microsoft did an amazing job with Azure to let you scale out model training in the cloud using powerful GPUs. Furthermore, you use these powerful compute clusters on demand, which means that you pay only for the system resources you actually use. Microsoft lets you use the fastest NVIDIA GPUs to train your models, and as an alternative, you can use FPGAs to speed up your models even further.
Types of ML Models Available in Azure
There are several model types you can create within Azure ML. Basically, they represent what you are trying to predict. Here is what you can choose:
Regression model
This is a model for making forecasts, where we’re trying to predict a straight-up numeric value. For example, you can calculate the number of kilometres a car can drive on one litre of fuel or predict the temperature on a particular day using weather patterns and statistical data, and so on.
Classification model
Your choice for situations when you’re expecting a binary outcome or want to categorize something. These models will help you to choose one of two options – whether an email you have received is spam or not, whether a customer will buy a product or leave without a purchase, etc. They also let you create classifications with multiple classes – not just two, but maybe three, four, five, and so on. Such models come in handy for various ratings (e.g., movies, books) – excellent, good, average, bad.
Anomaly Detection model
Good for working with structured data in order to detect everything that falls outside your usual patterns. Perfect for identifying and predicting rare events. Models of this type are often used to check for fraudulent transactions, so a lot of banking institutions are armed with anomaly detection tools.
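To make the idea more tangible, here is a minimal sketch in plain Python that flags values lying far from the mean using a z-score. This is only an illustration of the general technique, not the algorithm Azure ML applies internally; the function name `find_anomalies`, the threshold of 2.0, and the transaction amounts are all made up for the example.

```python
from statistics import mean, pstdev


def find_anomalies(values, threshold=2.0):
    """Return the values whose distance from the mean exceeds
    `threshold` population standard deviations."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # All values are identical, so nothing stands out.
        return []
    return [v for v in values if abs(v - mu) / sigma > threshold]


# Hypothetical transaction amounts: eight ordinary ones and one outlier.
transactions = [20, 22, 19, 21, 23, 20, 18, 22, 500]
suspicious = find_anomalies(transactions)
print(suspicious)
```

A single-threshold rule like this is crude, but it shows the core principle: learn what "normal" looks like, then report whatever deviates from it.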
Clustering model
Clustering is a bit different from the previous examples. This is something we would label as “unsupervised learning.” All previous models on the list are supervised because they learn from examples for which the right answer is already known. When it comes to clustering, we don’t give the model any such answers and often don’t know much about the data ourselves. It is like asking your model: go ahead, look at the patterns and items, and find out which of them belong together.
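As a rough illustration of what "finding groups without labels" means, below is a small k-means sketch in plain Python. It is a simplified teaching version under stated assumptions: the function name `kmeans` and the sample points are invented, the first k points are used as starting centroids to keep the run deterministic, and it is not the implementation used by Azure ML.

```python
def kmeans(points, k, iterations=10):
    """Group 2D points into k clusters with a basic k-means loop."""
    # Deterministic start: take the first k points as initial centroids.
    centroids = [points[i] for i in range(k)]
    clusters = []
    for _ in range(iterations):
        # Assignment step: put each point into the cluster of its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            distances = [(p[0] - c[0]) ** 2 + (p[1] - c[1]) ** 2 for c in centroids]
            clusters[distances.index(min(distances))].append(p)
        # Update step: move each centroid to the mean of its cluster.
        # An empty cluster keeps its previous centroid.
        centroids = [
            (sum(p[0] for p in cl) / len(cl), sum(p[1] for p in cl) / len(cl))
            if cl else centroids[i]
            for i, cl in enumerate(clusters)
        ]
    return centroids, clusters


# Hypothetical unlabeled data: two visually obvious groups.
points = [(1, 1), (9, 9), (1, 2), (2, 1), (8, 9), (9, 8)]
centroids, clusters = kmeans(points, k=2)
print(clusters)
```

Note that the code never receives a "correct" group for any point; the grouping emerges purely from the distances between them.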
Recommendation model
This type is a popular choice for suggesting things a client may like. Based on the data fed to a model, it can come up with multiple options in the format “You may also like…” This is one of the most popular model types as it can suggest a product/place/service that will most likely interest a customer, so there is a potential to increase the conversion rate.
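The "You may also like…" idea can be sketched with a very simple co-occurrence approach: look at customers whose purchases overlap with yours and suggest what they bought that you have not. The snippet below is a toy illustration in plain Python; the `recommend` function, the customer names, and the purchase data are hypothetical, and real recommenders (including the one in Azure ML) use far more sophisticated methods.

```python
from collections import Counter


def recommend(purchases, customer, top_n=2):
    """Suggest items bought by customers who share at least one item
    with `customer`, weighted by how much their baskets overlap."""
    owned = purchases[customer]
    scores = Counter()
    for other, items in purchases.items():
        if other == customer:
            continue
        overlap = len(owned & items)
        if overlap == 0:
            # Nothing in common, so this customer tells us nothing.
            continue
        for item in items - owned:
            scores[item] += overlap
    # Highest score first; ties are broken alphabetically for stable output.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [item for item, _ in ranked[:top_n]]


# Hypothetical purchase history.
purchases = {
    "ann": {"laptop", "mouse"},
    "bob": {"laptop", "mouse", "keyboard"},
    "cat": {"laptop", "monitor"},
    "dan": {"phone", "case"},
}
suggestions = recommend(purchases, "ann")
print(suggestions)
```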
Azure Machine Learning Studio
If we were to name the most exciting feature that makes Azure unique, we would say that it is Machine Learning Studio. It is a browser-based visual drag-n-drop environment, where code is not even necessary. It takes just a few clicks to get from idea to deployment (provided you know what you are doing and have some experience with the tool).
Let’s create a basic experiment to predict the price of a car in the future based on the dataset with current prices. So, the first thing you need to do is to find a relevant dataset – you can use one provided in the ML Studio or import your own data.
On the left, you can see a bunch of datasets and modules. Use the search bar to find the dataset you need and then drag it to the centre area. Datasets and modules have input and output ports visualized as small circles (input at the top, and output at the bottom). When you create a dataflow, you need to connect the output of one module with the input of another.
Next, you can drag in the Clean Missing Data module and link it to the previous module in the chain. Using the “Remove Entire Row” function, you can automatically remove all rows from the dataset that contain missing values. This leaves you with complete records, which makes it easier to decide which features from the dataset are suitable for the model you’re creating.
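For readers who prefer to see the logic spelled out, the effect of removing entire rows with missing values can be approximated in a few lines of plain Python. This is only a sketch of the idea, not what the Studio module runs internally; the helper name `remove_rows_with_missing` and the car records are invented, and "missing" is assumed here to mean `None` or an empty string.

```python
def remove_rows_with_missing(rows):
    """Keep only the rows in which every value is present
    (assumption: None and empty strings count as missing)."""
    return [
        row for row in rows
        if all(value is not None and value != "" for value in row.values())
    ]


# Hypothetical car records; two of them have gaps.
cars = [
    {"make": "audi", "horsepower": 102, "price": 13950},
    {"make": "bmw", "horsepower": None, "price": 16430},
    {"make": "mazda", "horsepower": 68, "price": ""},
    {"make": "volvo", "horsepower": 114, "price": 16845},
]
clean = remove_rows_with_missing(cars)
print([row["make"] for row in clean])
```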
Now you may want to specify what features are relevant for the model. For that purpose, you need to drag and drop the Select Columns in Dataset module and connect it with the previous module. Usually, you would want to keep only the most informative features here. In our car price experiment, we will be adding things like horsepower, engine type, size, body style, and others. After selecting them, you can finally choose the machine learning algorithm to apply. And because we’re predicting price – which is a number, not a category – we will be using a Regression model.
The next step is to add the Split Data module to divide the data into training and testing datasets, and then initialize the Linear Regression module. Go to the Regression submenu in the Initialize Model tab, find Linear Regression, and add it to the experiment in order to find the best fit across the dataset. Now feel free to add the Train Model module.
Once you have clicked Train Model and set your target variable (the price), you can run the experiment. When your model is trained, you can use it to predict the prices of other cars. Any finalized model can be iterated on, improved, altered by changing ML algorithms, and deployed as a predictive web service.
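The split, train, and predict steps described above can be mirrored in a short plain-Python sketch. It makes several simplifying assumptions: only one feature (horsepower) is used, the data is synthetic and perfectly linear so the result is easy to verify, the split simply takes the first 75% of rows instead of sampling randomly, and the helper names `split_data` and `fit_line` are made up for illustration.

```python
def split_data(rows, train_fraction=0.75):
    """Split rows into a training part and a testing part (no shuffling)."""
    cut = int(len(rows) * train_fraction)
    return rows[:cut], rows[cut:]


def fit_line(xs, ys):
    """Ordinary least squares for one feature: returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Synthetic (horsepower, price) pairs following price = 100 * hp + 2000.
cars = [(70, 9000), (90, 11000), (110, 13000), (130, 15000),
        (150, 17000), (100, 12000), (120, 14000), (160, 18000)]

train, test = split_data(cars)
slope, intercept = fit_line([hp for hp, _ in train], [price for _, price in train])

# Score the held-out cars and compare predictions with the known prices.
for hp, actual in test:
    predicted = slope * hp + intercept
    print(f"hp={hp}: predicted {predicted:.0f}, actual {actual}")
```

With real data the points would not sit exactly on a line, so the predictions on the testing set would differ from the actual prices, and that gap is what you evaluate before deploying the model.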
Azure Machine Learning has two main pathways for you to choose from: ML Studio or production Web APIs. If you are planning to use your models as predictive analytics solutions, then you definitely need to invest some funds, as only the Testing tier is available for free. Here are the prices of other tiers, including their limits (in terms of transactions, compute hours, and the number of web services):
Machine Learning Studio pricing is much more modest but offers less variety – only free and standard tiers are available. The first one is severely limited and mostly serves as a trial version for attracting new customers. However, it is enough to demonstrate the power of the instrument. If you are serious about your machine learning endeavours, then just go with the standard edition without wasting time on “free cheese.”
Why Choose Microsoft’s Solution for Machine Learning?
Azure machine learning will help you to simplify the process of building ML models by using automated machine learning capabilities. For developers who would like to build their own models, there is an option to scale training out in the cloud using the Python SDK and any open-source framework. Also, MS Azure users can manage end-to-end workflows with the help of Azure ML pipelines, which is almost like DevOps but for machine learning. Furthermore, Azure allows you to deploy your models to the cloud and the Edge with ease.
There are many different ways to use the power of machine learning tools provided by Microsoft. You can use pre-trained cognitive services that are available via RESTful APIs, or you can use Azure machine learning to train your own custom models with frameworks of your choice. Prefer PyTorch? Here you go. More of a TensorFlow fan? Not a problem! Finalized models can be easily deployed to the powerful infrastructure using the company’s CPUs/GPUs or FPGAs to speed up your inferencing.
Overall, Azure is definitely a useful cloud solution for enterprises as well as for machine learning practitioners. The intuitive drag-n-drop interface of the ML Studio is a truly unique option that lets even entry-level developers easily build ML models. With the flexible pricing models offered by Microsoft, you should not have any problems deciding which option to pick. But if you need help with estimating and calculating parameters for the optimal plan, feel free to contact our team of machine learning experts – we’re always ready to dive into any new fascinating machine learning projects you might have!