How to train models
Once you have chosen your scenario and the ML task, AI Builder asks you to provide a dataset. The data is used to train, evaluate, and choose the best model for your scenario.
Robot AI Builder supports datasets in .tsv, .csv, and .txt formats, as well as SQL databases. If you have a .txt file, columns should be separated with a comma (,), a semicolon (;), or a tab (\t).
If the dataset is made up of images, the supported file types are .jpg and .png. For more information, see Load training data into AI Builder.
A dataset is a table in which each row is a training example and each column is an attribute. Each row has:
- a label: the attribute that you want to predict
- features: attributes that are used as inputs to predict the label

For the house-price prediction scenario, the features could be:
- the square footage of the house
- the number of bedrooms and bathrooms
- the zip code

The label is the historical house price for that row of square footage, bedroom and bathroom values, and zip code.
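As a concrete sketch of this layout, the snippet below builds a tiny house-price table with pandas; the column names and values are hypothetical, not taken from AI Builder:

```python
import pandas as pd

# Hypothetical training table: each row is one example.
# "price" is the label; the other columns are features.
data = pd.DataFrame({
    "square_feet": [1400, 2300, 1750],
    "bedrooms": [3, 4, 3],
    "bathrooms": [2, 3, 2],
    "zip_code": ["98052", "98101", "98052"],
    "price": [350_000, 620_000, 410_000],  # label
})

features = data.drop(columns=["price"])  # inputs to the model
label = data["price"]                    # value to predict
```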
Once you select your scenario, environment, data, and label, AI Builder trains the model.
Training is an automatic process by which AI Builder teaches your model how to answer questions for your scenario. Once trained, your model can make predictions with input data that it has not seen before. For example, if you are predicting house prices and a new house comes on the market, you can predict its sale price.
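The train-then-predict flow can be sketched with scikit-learn; the model choice (a plain linear regression) and all values are illustrative assumptions, not what AI Builder actually selects:

```python
from sklearn.linear_model import LinearRegression

# Historical examples: square feet, bedrooms, bathrooms -> sale price.
# (Zip code is omitted here because it would need encoding first.)
X_train = [[1400, 3, 2], [2300, 4, 3], [1750, 3, 2], [3000, 5, 4]]
y_train = [350_000, 620_000, 410_000, 780_000]

model = LinearRegression().fit(X_train, y_train)

# A new house comes on the market: predict its sale price from
# attributes the model has never seen before.
print(model.predict([[2000, 3, 2]]))
```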
Because AI Builder uses automated machine learning (AutoML), it does not require any input or tuning from you during training.
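AI Builder does not expose its AutoML internals, but the underlying idea of automatically trying several algorithms and keeping the best performer can be sketched as follows; the candidate list, synthetic data, and R-squared scoring are all illustrative assumptions:

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression, Ridge
from sklearn.model_selection import cross_val_score

# Synthetic value prediction data standing in for a real dataset.
X, y = make_regression(n_samples=200, n_features=4, noise=10.0, random_state=0)

# Try several candidate algorithms and keep the one with the best
# cross-validated R-squared -- a toy version of what AutoML automates.
candidates = [LinearRegression(), Ridge(alpha=1.0), RandomForestRegressor(random_state=0)]
best = max(candidates, key=lambda m: cross_val_score(m, X, y, scoring="r2").mean())
print(type(best).__name__)
```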
Evaluation is the process of measuring how good your model is. AI Builder uses the trained model to make predictions on new test data, and then measures how good those predictions are. To do this, AI Builder splits the training data into a training set and a test set: the training data (80%) is used to train your model, and the test data (20%) is held back to evaluate it.
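The 80/20 hold-out split described above can be reproduced with scikit-learn's train_test_split; only the ratio comes from the text, the data is made up:

```python
from sklearn.model_selection import train_test_split

X = [[1400, 3, 2], [2300, 4, 3], [1750, 3, 2], [3000, 5, 4], [1200, 2, 1]]
y = [350_000, 620_000, 410_000, 780_000, 290_000]

# Hold back 20% of the rows for evaluation; train on the remaining 80%.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
print(len(X_train), len(X_test))  # 4 1
```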
A scenario maps to a machine learning task. Each ML task has its own set of evaluation metrics.
The default metric for value prediction problems is RSquared. RSquared ranges between 0 and 1, where 1 is the best possible value; in other words, the closer RSquared is to 1, the better your model is performing. Other reported metrics, such as absolute loss, squared loss, and RMS loss, can help you understand how your model is performing and compare it against other value prediction models.
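For reference, these metrics can be computed with scikit-learn, where they map to r2_score (RSquared), mean absolute error (absolute loss), mean squared error (squared loss), and its square root (RMS loss); the values below are invented:

```python
import math
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score

y_true = [350_000, 620_000, 410_000]  # actual sale prices (hypothetical)
y_pred = [360_000, 600_000, 425_000]  # model predictions (hypothetical)

print(r2_score(y_true, y_pred))                       # RSquared: closer to 1 is better
print(mean_absolute_error(y_true, y_pred))            # absolute loss
print(mean_squared_error(y_true, y_pred))             # squared loss
print(math.sqrt(mean_squared_error(y_true, y_pred)))  # RMS loss
```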
The default metric for classification problems is accuracy. Accuracy is the proportion of correct predictions your model makes over the test dataset; the closer to 100% (1.0), the better. Another reported metric is AUC (area under the curve), which measures the true positive rate versus the false positive rate; it should be greater than 0.50 for a model to be acceptable. Additional metrics, such as the F1 score, can be used to control the balance between precision and recall.
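A minimal sketch of these classification metrics with scikit-learn, using invented binary labels and scores:

```python
from sklearn.metrics import accuracy_score, f1_score, roc_auc_score

y_true = [1, 0, 1, 1, 0, 1]               # actual classes (hypothetical)
y_pred = [1, 0, 1, 0, 0, 1]               # predicted classes
y_score = [0.9, 0.2, 0.8, 0.4, 0.3, 0.7]  # predicted probability of class 1

print(accuracy_score(y_true, y_pred))  # proportion correct; closer to 1.0 is better
print(roc_auc_score(y_true, y_score))  # AUC; should be greater than 0.50
print(f1_score(y_true, y_pred))        # balances precision and recall
```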
The default metric for multi-class classification is micro-accuracy; the closer micro-accuracy is to 100% (1.0), the better. Another important metric for multi-class classification is macro-accuracy; as with micro-accuracy, the closer to 1.0, the better. A good way to think about these two types of accuracy, using a ticket-routing example:
- Micro-accuracy: how often does an incoming ticket get classified to the right team?
- Macro-accuracy: for an average team, how often are its incoming tickets classified correctly?
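The difference shows up clearly when classes are imbalanced. In the sketch below, scikit-learn's accuracy_score plays the role of micro-accuracy, and balanced_accuracy_score (macro-averaged recall) is used as a stand-in for macro-accuracy; the ticket labels are invented:

```python
from sklearn.metrics import accuracy_score, balanced_accuracy_score

# Ticket routing: most tickets belong to team "a"; teams "b" and "c" are small.
y_true = ["a", "a", "a", "a", "b", "c"]
y_pred = ["a", "a", "a", "a", "c", "c"]

print(accuracy_score(y_true, y_pred))           # micro-accuracy: 5/6, about 0.83
print(balanced_accuracy_score(y_true, y_pred))  # macro-accuracy: (1.0 + 0.0 + 1.0) / 3, about 0.67
```

The single misrouted ticket barely moves micro-accuracy, but it wipes out team "b" entirely, which drags macro-accuracy down.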
If your model performance score is not as good as you want it to be, you can:
- Train for a longer period of time. With more time, the automated machine learning engine experiments with more algorithms and settings.
- Add more data. Sometimes the amount of data is not sufficient to train a high-quality machine learning model. This is especially true with datasets that have a small number of examples.
- Balance your data. For classification tasks, make sure that the training set is balanced across the categories. For example, if you have four classes for 100 training examples, and the first two classes (tag1 and tag2) cover 90 records while the other two (tag3 and tag4) cover only the remaining 10, the lack of balanced data may cause your model to struggle to correctly predict tag3 or tag4. A quick balance check is sketched after this list.
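A minimal balance check with pandas, assuming a label column named tag (a hypothetical name matching the example above):

```python
import pandas as pd

# Hypothetical training labels: tag1 and tag2 dominate; tag3 and tag4 are rare.
labels = pd.Series(["tag1"] * 50 + ["tag2"] * 40 + ["tag3"] * 6 + ["tag4"] * 4, name="tag")

# Per-class share of the training set; very small shares signal imbalance.
print(labels.value_counts(normalize=True))
```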
After the evaluation phase, AI Builder outputs a model file. Once you deploy the model, you can use it directly in a robot workflow.
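AI Builder handles deployment for you, but the general pattern of persisting a trained model to a file and loading it later inside a workflow can be sketched with joblib; the file name and model are illustrative:

```python
import joblib
from sklearn.linear_model import LinearRegression

model = LinearRegression().fit([[1400, 3, 2], [2300, 4, 3]], [350_000, 620_000])

# Persist the trained model to a file...
joblib.dump(model, "house_price_model.joblib")

# ...and later, inside a workflow, load it and make a prediction.
loaded = joblib.load("house_price_model.joblib")
print(loaded.predict([[2000, 3, 2]]))
```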