Machine learning platforms are among the fastest growing public cloud computing services. Unlike other cloud-based services, ML and AI platforms are available through a variety of delivery models, such as cognitive computing, automated machine learning (AutoML), ML model management, ML model serving, and GPU-based computing.
This article explains the terminology and delivery models adopted by public cloud providers, with the aim of helping business decision-makers choose the right cloud-based ML and AI services.
Like the original cloud delivery models of IaaS, PaaS, and SaaS, the ML and AI spectrum ranges from raw infrastructure to high-level platforms and services exposed as APIs.
Let’s take a closer look at each of these layers.
Cognitive Computing APIs:
Cognitive computing is delivered as a set of APIs offering natural language processing (NLP), computer vision, and speech services. Developers can consume these APIs like any other web service or REST API. They are not expected to know the complex details of machine learning algorithms or data processing pipelines to use these services.
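Consuming a cognitive service typically amounts to a single authenticated HTTP call. The sketch below, which assumes a hypothetical vision endpoint and field names (no specific vendor's API is implied), shows how little ML knowledge the developer needs: just build a JSON body and POST it.

```python
import json

# Hypothetical endpoint and key -- each cognitive service defines its own
# URL and authentication scheme; these names are illustrative only.
API_URL = "https://api.example-cloud.com/vision/v1/analyze"
API_KEY = "YOUR_API_KEY"

def build_analyze_request(image_url, features):
    """Build the headers and JSON body for a typical image-analysis call."""
    headers = {
        "Content-Type": "application/json",
        "Authorization": f"Bearer {API_KEY}",
    }
    body = {"image": {"source": image_url}, "features": features}
    return headers, json.dumps(body)

headers, body = build_analyze_request(
    "https://example.com/car.jpg", ["objects", "tags"]
)
# An actual call would be one HTTP POST, e.g. with the requests library:
# response = requests.post(API_URL, headers=headers, data=body)
print(json.loads(body)["features"])
```

The entire "model" lives behind the endpoint; the client only ever sees inputs and predictions.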
The quality of cognitive services improves as consumption grows: with increasing data and service usage, cloud providers continually refine predictive accuracy.
A newer addition to cognitive computing is automated machine learning (AutoML), where developers consume an API only after first training the service on their own custom data. AutoML offers a middle ground between consuming pre-trained models and training custom models from scratch.
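The AutoML workflow can be sketched as: supply labeled examples, let the service train, then call predict. The `AutoMLClient` class below is a stand-in for a real vendor SDK (its names and its trivial "training" logic are assumptions made for illustration); the point is the shape of the workflow, not the model.

```python
# Hypothetical AutoML workflow: the provider hides model selection and
# tuning; the developer only supplies labeled examples. AutoMLClient is
# a stand-in for a real vendor SDK, not any actual library.

class AutoMLClient:
    def __init__(self):
        self.examples = []

    def add_examples(self, rows):
        # rows: list of (text, label) pairs supplied by the developer.
        self.examples.extend(rows)

    def train(self):
        # Stand-in for managed training: memorize the majority label.
        labels = [label for _, label in self.examples]
        self.model = max(set(labels), key=labels.count)
        return self.model

    def predict(self, text):
        # A real service would score each input; this sketch just
        # returns the trained majority label.
        return self.model

client = AutoMLClient()
client.add_examples([("great car", "positive"),
                     ("awful ride", "negative"),
                     ("love it", "positive")])
client.train()
print(client.predict("nice wheels"))
```

A real AutoML service would replace the `train` body with managed model search and hyperparameter tuning; the developer-facing surface stays this small.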
If you are considering adding AI capabilities to existing or new applications, ask your developers to evaluate the cognitive services of the public cloud. From object detection to sentiment analysis, you will be able to take advantage of the available AI services. Think of these APIs as the SaaS equivalent of AI, where you pay only for what you use.
ML Platform as a Service:
When cognitive APIs do not meet your requirements, you can turn to ML PaaS to build highly customized machine learning models.
For example, while a cognitive API might be able to identify vehicles such as cars, it might not be able to classify cars by make and model. Assuming you have a large dataset of cars labelled with makes and models, your data science team can rely on ML PaaS to train and deploy custom models suited to your business scenario.
Similar to the PaaS delivery model, where developers bring their code and the platform runs it at scale, ML PaaS expects data scientists to bring their own datasets and training code. It frees them from provisioning the compute, storage, and network environments needed for complex machine learning jobs. Data scientists are expected to build and test their code with smaller datasets in their local environment before running it as a job on the public cloud platform.
ML PaaS eliminates the friction involved in setting up and configuring data science environments. It provides a pre-configured environment that data scientists can use to train, tune, and host models. ML PaaS manages the life cycle of a machine learning model end to end, providing tools from the data preparation stage through model hosting. These platforms ship with popular tools, such as Jupyter Notebooks, that are familiar to data scientists. ML PaaS also handles the complexity of running training jobs on a cluster of machines, abstracting the underlying infrastructure behind simple Python or R APIs.
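The local-first workflow described above can be sketched in plain Python: validate the training code on a tiny dataset locally, then hand the same script to the platform's job API. The model here is a deliberately trivial one-parameter linear fit, and the job-submission call at the end is a hypothetical client, not any vendor's real SDK.

```python
# Minimal sketch: validate training code locally on a tiny dataset before
# submitting it as a job to an ML PaaS. The model is a one-parameter
# linear fit (y ~= w * x) trained with gradient descent; no ML libraries.

def train(xs, ys, lr=0.01, epochs=500):
    """Fit y ~= w * x by minimizing mean squared error with gradient descent."""
    w = 0.0
    n = len(xs)
    for _ in range(epochs):
        grad = sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / n
        w -= lr * grad
    return w

# Tiny local dataset: y = 3x, so training should recover w close to 3.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 6.0, 9.0, 12.0]
w = train(xs, ys)
print(round(w, 2))

# Once validated locally, the same script would be handed to the platform's
# job API, e.g. (hypothetical client, illustrative names only):
# job = ml_paas.submit_training_job(script="train.py", dataset="s3://bucket/full")
```

The value of the platform is that the second step — the same code against the full dataset on a managed cluster — requires no infrastructure work from the data scientist.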
Amazon SageMaker, Microsoft Azure ML Service, Google Cloud ML Engine, and IBM Watson Knowledge Studio are examples of ML PaaS offerings in the cloud.
If your business wants to bring agility to the development and deployment of machine learning models, consider ML PaaS. It brings proven CI/CD techniques to machine learning model management.
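Applying CI/CD to model management means gating deployment behind automated checks, just as code deployment is gated behind tests. The sketch below shows that shape with stand-in stages; the stage names, the mean-based "model", and the quality threshold are all illustrative assumptions, not any platform's API.

```python
# Sketch of a CI/CD-style pipeline for ML model management. Stage names
# and the quality threshold are illustrative, not a vendor's API.

def validate_data(dataset):
    # Data check, analogous to linting: reject incomplete records.
    return all(row is not None for row in dataset)

def train_model(dataset):
    # Stand-in for a real training call; the "model" is just the mean.
    return sum(dataset) / len(dataset)

def evaluate(model, holdout, threshold=0.5):
    # Stand-in metric: mean absolute error against a holdout set.
    error = sum(abs(model - y) for y in holdout) / len(holdout)
    return error <= threshold

def pipeline(dataset, holdout):
    """Run validate -> train -> evaluate; deploy only if the gate passes."""
    if not validate_data(dataset):
        return "rejected: bad data"
    model = train_model(dataset)
    if not evaluate(model, holdout):
        return "rejected: quality gate failed"
    return "deployed"

print(pipeline([1.0, 1.2, 0.8], [1.1, 0.9]))
```

On a real ML PaaS, each stage would be a managed step (data validation job, training job, evaluation job), but the gating logic is the same.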
ML Infrastructure Services:
Think of ML infrastructure as the IaaS layer of the machine learning stack. Cloud providers offer raw VMs backed by high-end CPUs and accelerators such as graphics processing units (GPUs) and field-programmable gate arrays (FPGAs).
Developers and data scientists who need access to raw computing power turn to ML infrastructure. They depend on the DevOps team to provision and configure the required environment. The workflow is no different from preparing a VM-based testbed for developing web or mobile applications. From choosing the number of CPU cores to installing specific versions of Python, the DevOps team handles the configuration end to end.
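That end-to-end configuration is often captured declaratively before anything is provisioned. The sketch below shows such a spec as a plain Python dictionary with a sanity check; all field names are hypothetical and do not correspond to any real provider's provisioning API.

```python
# Illustrative ML-infrastructure request: the DevOps team pins everything
# from CPU cores to the Python version. Field names are hypothetical,
# not a real provider's API.
vm_spec = {
    "cpu_cores": 16,
    "memory_gb": 128,
    "accelerators": [{"type": "gpu", "model": "nvidia-v100", "count": 4}],
    "image": "deep-learning-base",
    "software": {"python": "3.10", "cuda": "12.1"},
}

def validate_spec(spec):
    """Return the sorted list of required fields missing from the spec."""
    required = {"cpu_cores", "memory_gb", "accelerators", "image", "software"}
    return sorted(required - spec.keys())

print(validate_spec(vm_spec))  # [] -> spec is complete
```

In practice this role is played by infrastructure-as-code tools, but the idea is the same: the environment is fully specified up front rather than assembled by hand.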
For complex deep learning projects that rely heavily on niche toolkits and libraries, organizations choose ML infrastructure. It gives them granular control over hardware and software configurations that might not be available from ML PaaS offerings.
The latest hardware investments from Amazon, Google, Microsoft, and Facebook make ML infrastructure cheaper and more efficient. Cloud providers now offer specialized hardware that is highly optimized for running ML workloads in the cloud. Google's TPUs and Microsoft's FPGAs are examples of purpose-built hardware accelerators intended specifically for ML jobs. Combined with recent computing trends such as Kubernetes, ML infrastructure is an attractive choice for companies.
Amazon EC2 Deep Learning AMIs backed by NVIDIA GPUs, Google Cloud TPUs, Microsoft Azure Deep Learning VMs based on NVIDIA GPUs, and GPU-based IBM Bare Metal Servers are examples of ML-focused IaaS.
With ML becoming a significant workload, public cloud providers are investing in core infrastructure, platforms, and services to attract enterprise customers.