By Priyal Walpita
This series of articles explains how to use Azure Machine Learning and the different machine learning tools in Azure ML Studio. This first post introduces Azure ML Studio and shows how to upload data to the tool.
The articles in the series are as follows.
- Introduction to Azure ML studio (this article)
- Data preprocessing in Azure ML studio
- Build a prediction model in Azure ML studio
- Principal Component Analysis in Azure ML studio
Introduction to Azure ML Studio
Azure Machine Learning Studio (Azure ML Studio) lets you build, test and deploy machine learning models for your data. It is a drag and drop tool that can publish models as web services, which can then be consumed by custom apps or business intelligence tools such as Excel.
Azure ML Studio gives you an interactive, visual workspace in which you drag and drop data sets and analysis modules onto an interactive canvas. It is a platform for running machine learning workloads in the cloud. The key objective of this article is to give a brief introduction to Azure ML Studio and to show the reader how to upload data by importing it into an Azure ML Studio workspace.
Background of Microsoft Azure ML Studio
Microsoft Azure is a cloud computing service for building, testing, deploying and managing applications and services through Microsoft-managed data centers. It provides software, platform and infrastructure as services and supports numerous programming languages, tools and frameworks, including both Microsoft and third party software and systems.
Azure was announced in October 2008 under the code name “Project Red Dog” and was released on February 1, 2010 as Windows Azure. On March 25, 2014, Windows Azure was renamed Microsoft Azure.
Microsoft provides more than 600 Azure services, including Compute services, Mobile services, Storage services, Data Management services, Messaging services, Media services, Content Delivery Network services, Azure Blockchain Workbench, Azure Functions, etc. The Microsoft Azure Machine Learning service is part of the Cortana Intelligence Suite, which supports predictive analytics and lets you interact with data using speech and language through Cortana. Microsoft introduced the Azure Machine Learning public preview in July 2014.
What is Machine Learning?
Let us first look at what machine learning is and at its history. Machine learning is the scientific study of algorithms and statistical models that computer systems use to perform a specific task by relying on patterns and inference rather than explicit instructions. Machine learning algorithms build a mathematical model from sample data, known as ‘training data’, in order to make predictions or decisions. The field is closely related to computational statistics, which focuses on making predictions using computers.
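To make the idea of building a model from training data concrete, here is a minimal Python sketch. It fits a straight line to a few training points by least squares and then uses the fitted line to predict a new value. The data values and the function names `fit_line` and `predict` are invented for illustration; in Azure ML Studio this kind of work is done by modules on the canvas rather than by hand-written code.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept to the training data by least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and of the x/y cross deviations
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x):
    """Use the fitted (slope, intercept) pair to predict y for a new x."""
    slope, intercept = model
    return slope * x + intercept


# Hypothetical training data: the points lie on the line y = 2x + 1
train_x = [1.0, 2.0, 3.0, 4.0]
train_y = [3.0, 5.0, 7.0, 9.0]

model = fit_line(train_x, train_y)   # "training" step
print(predict(model, 5.0))           # "prediction" step for an unseen input
```

Real models are usually more complex than a single line, but the two steps (learn parameters from training data, then predict for new inputs) stay the same.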
History of Machine Learning
In 1959, Arthur Samuel, an American pioneer in the fields of artificial intelligence and computer gaming, coined the term ‘Machine Learning’. Interest in machine learning related to pattern recognition continued through the 1970s. In the 1990s, machine learning came to be recognized as a separate field from artificial intelligence (AI), and its objective changed to tackling solvable problems of a practical nature. This separation arose because AI’s emphasis on logical, knowledge based approaches had opened a gap between AI and machine learning. As a result, machine learning shifted away from the symbolic approaches it had inherited from AI and toward methods and models carried over from probability theory and statistics.
Use of Machine Learning Studio (ML Studio)
Let us now look at what ML Studio is used for. Some of the benefits for users are as follows.
- Develop a machine learning model using data from one or more sources.
- Transform and analyze data with statistical functions and data manipulation to produce a set of results.
Note: Developing a model in this way is an iterative process. Modifying the parameters changes the results, and this is repeated until the user is satisfied with an effective, trained model.
- Drag and drop data sets and analysis modules onto an interactive canvas, connect them together to form an experiment, and run it in ML Studio.
- Iterate on, edit and save your model design. Note: To iterate on the model design, edit the experiment, save a copy if desired, and run it again.
- Publish a predictive experiment as a web service so that your model can be accessed by others.
- No programming is required in the process.
- Predictive analysis is done by visually connecting data sets and modules.
What are the Applications of Machine Learning?
Limitations of Machine Learning
Some of the limitations identified in machine learning are as follows.
- Lack of Suitable Data
- Data Bias
- Privacy Problems
- Wrong Tools & People
- Lack of Resources
- Evaluation Problems
- Lack of Access to Data
Moving on to Azure ML Studio by Microsoft
Things that Azure ML Studio Enables You to Manage
- Data storage and connectivity to consume the data from a wide range of sources.
- Metrics and monitoring for training experiments, published services and datasets.
- Deployment of models for real time and batch inference.
- Registration and management of models, so that you can track multiple versions of models and the data on which they were trained.
Now let us compare ML Studio (Classic) to Azure ML Studio by Microsoft.
What Features Differentiate Them from Each Other?
A comparison of Machine Learning Studio (Classic) and Azure Machine Learning Studio is shown in Table 1.
Working with Azure ML Studio
If you want to create your own machine learning experiment in Azure ML Studio (classic), take a look at the following flowchart, which shows a typical workflow for an experiment.
As an introduction to Azure ML Studio, this article discusses only the data uploading procedure, that is, importing data into an Azure ML Studio workspace.
What is a Workspace?
A workspace is a boundary for a set of related machine learning assets, such as the data used for experimentation and model training, the models that you have trained, and the experiments, which include the run history with the logged metrics and output. The workspace acts as a context for the experiments, compute targets, data and other assets related to the machine learning workload.
Getting Started in Azure ML Studio
How do you Sign in?
Before uploading the data, you need to create your Azure Machine Learning Studio account and sign in using your Microsoft account. If you have already created your account, clicking the sign in link shown in figure 4 below lets you proceed with the sign in process and then get started as shown in figure 5 below. It is important that you use the Microsoft account that was used to create your workspace.
Once you are signed in with your Microsoft account, you will see the browser based workbench, which is Azure ML Studio, as in figure 6 below.
What is Azure ML Studio in Brief?
ML Studio initially consists of three sections, which can be described as follows.
- Experiments : Offers Experimental Predictions to Analytical Experiments
The Experiments section shows all the experiments in your workspace, as shown in the below figure 5. Another feature of the Experiments section is that your workspace comes prepopulated with samples. Clicking on the samples displays the list of samples that are available to you, and clicking on a sample experiment shows all of its modules.
- Web Services: Contains the list of web services that you have created
- Settings: Where you can change the settings for your workspace and share your workspace with your team members
How to Upload the Data in a Predictive Model?
The first step in building a predictive model is to upload a data set. Data plays a significant role in the machine learning process. The data can be imported from many sources, or sample data sets can be used instead.
What are Sample Data Sets?
If you are using Azure ML Studio (classic), a number of sample data sets and experiments are available by default. Many of these sample data sets are used by the sample models included in the Artificial Intelligence (AI) gallery. Others are provided as examples of the various types of data used in machine learning, and some of the data sets are held in Azure Blob storage. A portion of the sample data sets are available in your workspace under ‘Saved Data sets’, which you can find in the module palette next to the experiment canvas in Machine Learning Studio (classic). Any of these data sets can be dragged onto your experiment canvas.
Take a look at some of these data sets from Azure Blob storage.
How do you Import Data to Azure ML Studio?
One of the main objectives of this article is to show the reader how to import data into an Azure ML Studio workspace. This is how it is done. In ML Studio, the ‘New’ icon at the bottom of the page lets you upload data to your workspace, as shown in the below figure 9.
Click on the ‘data set’ icon and you will get a dialog box where you can import new data sets from local files, as shown below in figure 10.
The following data formats are supported in Azure ML Studio.
- Generic CSV File with a Header (.csv)
- Generic CSV File with no Header (.nh.csv)
- Generic TSV File with a Header (.tsv)
- Generic TSV File with no Header (.nh.tsv)
- Plain Text (.txt)
- SvmLight File (.svmlight)
- Attribute Relation File Format (.arff)
- Zip File (.zip)
- R Object or Workspace (.RData)
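As a quick local check before uploading, the list above can be turned into a small lookup. The Python sketch below is only an illustration: the names `SUPPORTED_FORMATS` and `detect_format` are hypothetical and are not part of Azure ML Studio. It also assumes that `.arff` is the intended extension for the Attribute Relation File Format.

```python
# Extension -> format name, based on the list of supported formats.
# The ".nh." variants come first so that "data.nh.csv" is not
# reported as a CSV file with a header.
SUPPORTED_FORMATS = [
    (".nh.csv", "Generic CSV File with no Header"),
    (".nh.tsv", "Generic TSV File with no Header"),
    (".csv", "Generic CSV File with a Header"),
    (".tsv", "Generic TSV File with a Header"),
    (".txt", "Plain Text"),
    (".svmlight", "SvmLight File"),
    (".arff", "Attribute Relation File Format"),
    (".zip", "Zip File"),
    (".rdata", "R Object or Workspace"),
]


def detect_format(filename):
    """Return the format name for a file name, or None if unsupported."""
    lower = filename.lower()  # extensions are compared case-insensitively
    for extension, name in SUPPORTED_FORMATS:
        if lower.endswith(extension):
            return name
    return None


print(detect_format("bike_buyers.csv"))
print(detect_format("bike_buyers.nh.csv"))
print(detect_format("bike_buyers.xlsx"))
```

An Excel workbook (`.xlsx`) is not in the list, which is why the data sheet is saved as CSV before it is uploaded.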
As an example, we will use a bike buyers data set in the workspace, as shown in the below figure 8. The data set includes fields such as ID, gender, marital status, income and children. The original data set is a combination of different types of data, including numeric, ordinal and categorical data. The great thing about Azure ML Studio is that it can handle any of these data types.
Before the numeric data sheet shown above was obtained, basic regression was carried out on the original data (note: binary variables were kept). After preparing your data sheet, remember to save it in one of the formats listed above.
Here we save our data sheet from Excel in CSV format so that it can be handled in the Azure ML Studio workspace. The following figure 11 shows the save format for the ‘Bike Buyers’ data sheet.
After clicking the ‘Save’ icon, it is important that you select the active data sheet as the one to be saved, as shown below in figure 12.
You can then open the saved data set from the Azure ML Studio workspace using the data set option. The prepared data set is then uploaded to the workspace, as shown in the below figure 11. This is how a data set is initially imported into an Azure ML Studio workspace.
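If the data sheet is produced by a script rather than by Excel, the same kind of CSV file can be written with Python’s standard `csv` module. The sketch below is an illustration only: the column names follow the fields mentioned earlier, while the row values, the helper name `write_csv` and the file name in the closing comment are invented.

```python
import csv
import io

# Column names follow the fields mentioned in the article;
# the two rows are invented sample values.
COLUMNS = ["ID", "Gender", "Marital Status", "Income", "Children"]
ROWS = [
    {"ID": 1, "Gender": "Female", "Marital Status": "Married",
     "Income": 40000, "Children": 1},
    {"ID": 2, "Gender": "Male", "Marital Status": "Single",
     "Income": 30000, "Children": 0},
]


def write_csv(rows, columns, stream, with_header=True):
    """Write rows as CSV to a text stream, optionally with a header row."""
    writer = csv.DictWriter(stream, fieldnames=columns, lineterminator="\n")
    if with_header:
        writer.writeheader()
    writer.writerows(rows)


# Write to an in-memory buffer so the result can be inspected directly
buffer = io.StringIO()
write_csv(ROWS, COLUMNS, buffer)
csv_text = buffer.getvalue()
print(csv_text)

# To save to disk instead (hypothetical file name):
# with open("bike_buyers.csv", "w", newline="") as f:
#     write_csv(ROWS, COLUMNS, f)
```

Passing `with_header=False` produces the headerless layout that corresponds to the ‘.nh.csv’ format in the list of supported formats.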
This concludes the first part of this series, which introduced Azure ML Studio and showed how to upload data to the tool. Stay tuned for the next article: Data preprocessing in Azure ML studio.
Thanks a lot for reading this article. If you have any questions, please ask them here or reach me via email (firstname.lastname@example.org) or on LinkedIn.