What is ML.NET AutoML?

The stable release of the ML.NET AutoML system debuted a couple of weeks ago (July 2019). I was trying to explain to a colleague what AutoML is, and I had a difficult time because AutoML has several different components. In the end I decided that a screenshot of a demo was the best way to start the explanation.

I created a 40-item training data file of employee information: Hourly (True or False), Age, Job (technical, management, sales), Annual Income, and job Satisfaction (low, medium, high). The goal is to predict Satisfaction from the other variables. I also created a 10-item test file with the same format, for example:

False  66  mgmt  52100.00   low
True   35  tech  86100.00   medium
False  24  tech  44100.00   high
True   43  sale  51700.00   medium
. . .
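To make the row layout concrete, here is a minimal sketch of reading one data row into typed fields. It is plain Python rather than anything ML.NET does internally, and the `Employee` class and `parse_row` function are hypothetical names introduced only for illustration. It assumes the fields are separated by tabs or runs of spaces and that a header row, if the file has one, has already been skipped.

```python
from dataclasses import dataclass


@dataclass
class Employee:
    hourly: bool        # True or False
    age: int
    job: str            # mgmt, tech, sale
    income: float       # annual income
    satisfaction: str   # low, medium, high (the value to predict)


def parse_row(line: str) -> Employee:
    # split() with no argument handles tabs and runs of spaces alike.
    hourly, age, job, income, satisfaction = line.split()
    return Employee(hourly == "True", int(age), job, float(income), satisfaction)


# Two of the sample rows shown above.
rows = [parse_row(s) for s in [
    "False  66  mgmt  52100.00   low",
    "True   35  tech  86100.00   medium",
]]
print(rows)
```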

To prepare the demo, I installed 1.) Visual Studio 2019 Community (free) Edition, 2.) the .NET Core Framework SDK v2.2, and 3.) the ML.NET CLI (command line interface) tool v1.2.

I issued the command:

> mlnet.exe auto-train ^
--task multiclass-classification ^
--dataset ".\Data\employees_train.tsv" ^
--test-dataset ".\Data\employees_test.tsv" ^
--label-column-name satisfac ^
--name EmpClassifier ^
--max-exploration-time 50

This command analyzed the training data, used the ML.NET library to explore several variations of five different machine learning algorithms, identified the best algorithm (“FastTreeOva”), and saved that best one as a model.
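The general idea of "try several algorithms and keep the best one" can be sketched in a few lines. This is a conceptual illustration only, not how mlnet.exe or the ML.NET library works internally. The three candidate predictors and all function names are made up for the sketch. Because only four sample rows are shown in this post, the sketch scores each candidate with leave-one-out evaluation on those four rows instead of using a separate test file.

```python
from collections import Counter

# The four sample rows shown in the post:
# (hourly, age, job, income, satisfaction)
ROWS = [
    (False, 66, "mgmt", 52100.00, "low"),
    (True, 35, "tech", 86100.00, "medium"),
    (False, 24, "tech", 44100.00, "high"),
    (True, 43, "sale", 51700.00, "medium"),
]


def majority(train, row):
    # Baseline: always predict the most common label in the training rows.
    return Counter(r[4] for r in train).most_common(1)[0][0]


def nearest_age(train, row):
    # 1-nearest-neighbor using age as the only feature.
    return min(train, key=lambda r: abs(r[1] - row[1]))[4]


def by_hourly(train, row):
    # Most common label among training rows with the same Hourly flag,
    # falling back to all training rows if none match.
    same = [r for r in train if r[0] == row[0]]
    return majority(same or train, row)


# Hypothetical stand-ins for the real algorithms a tool would explore.
CANDIDATES = {
    "majority": majority,
    "nearest_age": nearest_age,
    "by_hourly": by_hourly,
}


def loo_accuracy(predict, rows):
    # Hold out each row in turn, predict it from the others, count the hits.
    hits = 0
    for i, row in enumerate(rows):
        train = rows[:i] + rows[i + 1:]
        hits += predict(train, row) == row[4]
    return hits / len(rows)


scores = {name: loo_accuracy(fn, ROWS) for name, fn in CANDIDATES.items()}
best = max(scores, key=scores.get)  # on a tie, the first candidate wins
print(scores, "best:", best)
```

With so few rows the scores mean very little. The point is only the shape of the loop: score every candidate the same way, then keep the winner.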

So, AutoML is a sub-tool (auto-train) of the mlnet.exe command line tool. The combined tools call into the ML.NET library, running in the .NET Core Framework (multi-platform version of the .NET Framework). Notice there isn’t a specific “AutoML” entity. The AutoML name is left over from earlier versions.

After identifying the best classification algorithm and saving the model, the AutoML tools also created a Visual Studio solution with two projects. The first project is a C# console application that is a template for how to use the trained model to make predictions for new data. Amazingly cool. The second generated VS project is the C# code that was used to train and test the FastTreeOva model. Also amazing.

So, there are a lot of moving parts in the system — .NET Core Framework and SDK, the ML.NET code library, the mlnet.exe + auto-train command line tool, and several others. The key idea is that you can start with data and get a sophisticated machine learning system using C# with relative ease, at least compared to doing things from scratch.

Automated ML in software is relatively new but automated women in cinema have been around for a long time. Kyoko and Ava in “Ex Machina” (2014). The fembots in “Austin Powers” (1997). Maria in “Metropolis” (1927). Rachael in “Blade Runner” (1982).


2 Responses to What is ML.NET AutoML?

Thorsten Kleppe says:

August 1, 2019 at 9:51 am

Did you compare AutoML with your own work?

My goal is to work in the machine learning area, but when I first heard about AutoML I was scared, because it felt like a competitor was taking away my goal.

A friend of a friend works for Google, and I had a conversation with him.
What he told me was sobering: he works for other companies, and after he has all the information, he just runs it through googleML, done.

I really like most parts of ML, but AutoML feels wrong to me.

jamesdmccaffrey says:

August 2, 2019 at 6:43 am

I have mixed feelings about AutoML. I refactored my demo code by starting from scratch and coding with ML.NET directly — it was very difficult. I think that for relatively simple scenarios and for developers who don’t have a lot of experience with ML, AutoML can be useful. But AutoML is still something of a black box, and for complex problems, a custom approach will usually be needed.
