Training

Machine learning models learn from training data and use what they have learned to make predictions. A good ML model needs a good dataset: the quality of the training data strongly influences how well the model performs.

Steps to train a Machine Learning Model

Overview

  1. Create a dataset using the Create Dataset API.

  2. Create a training job using the Create Job API.

  3. Start the training job using the Start Job API.

  4. Check the status of the job using the Job Status API until the training job completes.

  5. Get the id of the trained machine learning model using the Job Status API.
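The five steps above can be sketched as a single function. This is a minimal sketch under stated assumptions, not the service's client library: the `TrainingClient` interface, its method names, the `"COMPLETED"` status value, and the `status` and `model_id` response fields are all hypothetical stand-ins for the Create Dataset, Create Job, Start Job, and Job Status APIs. An in-memory stub is included so the flow can run without a server.

```python
import time
from typing import Protocol


class TrainingClient(Protocol):
    """Hypothetical interface; method names are assumptions, not the real API."""

    def create_dataset(self, name: str, dataset_type: str) -> str: ...
    def create_job(self, dataset_id: str) -> str: ...
    def start_job(self, job_id: str) -> str: ...
    def job_status(self, job_id: str) -> dict: ...


def train_model(client: TrainingClient, name: str, dataset_type: str,
                poll_seconds: float = 0.0) -> str:
    """Run steps 1-5 and return the id of the trained model."""
    dataset_id = client.create_dataset(name, dataset_type)  # step 1
    job_id = client.create_job(dataset_id)                  # step 2
    job_id = client.start_job(job_id)                       # step 3
    while True:                                             # step 4
        status = client.job_status(job_id)
        if status["status"] == "COMPLETED":                 # assumed status value
            return status["model_id"]                       # step 5 (assumed field)
        time.sleep(poll_seconds)


class StubClient:
    """In-memory stand-in that reports completion on the third status check."""

    def __init__(self) -> None:
        self.checks = 0

    def create_dataset(self, name: str, dataset_type: str) -> str:
        return "dataset-1"

    def create_job(self, dataset_id: str) -> str:
        return "job-1"

    def start_job(self, job_id: str) -> str:
        return job_id

    def job_status(self, job_id: str) -> dict:
        self.checks += 1
        if self.checks < 3:
            return {"status": "RUNNING"}
        return {"status": "COMPLETED", "model_id": "model-1"}


if __name__ == "__main__":
    print(train_model(StubClient(), "invoices", "FORM"))
```

Passing the client in as a parameter keeps the workflow independent of any particular HTTP library, so the same function can be exercised against a stub or a real client.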

1. Create New Dataset

Use the Create Dataset API to create a new dataset. This dataset can be used to train or retrain your own machine learning models. You can create three types of datasets:

  1. FORM: FORM datasets are document-based datasets. They are used to train FORM ML models.

  2. DOC: DOC datasets are document-based datasets. They are used to train DOC ML models.

  3. NER: NER datasets are text-based datasets. They are used to train NER ML models.
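On the client side, the three dataset types can be represented as an enumeration so that an invalid type is caught before any request is sent. This is a small sketch: the `DatasetType` enum and `parse_dataset_type` helper are hypothetical names, and the trimming and upper-casing is a local convenience, not documented behavior of the Create Dataset API.

```python
from enum import Enum


class DatasetType(Enum):
    FORM = "FORM"  # document-based, trains FORM models
    DOC = "DOC"    # document-based, trains DOC models
    NER = "NER"    # text-based, trains NER models


def parse_dataset_type(value: str) -> DatasetType:
    """Normalize user input and reject anything other than the three types."""
    try:
        return DatasetType(value.strip().upper())
    except ValueError:
        allowed = ", ".join(t.value for t in DatasetType)
        raise ValueError(
            f"unknown dataset type {value!r}; expected one of: {allowed}"
        ) from None
```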

2. Create Training Job

Use the Create Job API to create a training job.

3. Start Job

Use the Start Job API to start a particular job. The API returns the job id, which you can use to query the training status.

4. Job Status

The Job Status API checks the current status of a job, given its job id. Until the job completes, the response contains the job's current status. Once the job has completed, the response returns the id and version of the newly created model.
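Because the status call has to be repeated until the job finishes, a polling helper with a timeout is a common pattern. The sketch below takes any callable that returns the status response as a dictionary. The `status` field and the `"COMPLETED"` and `"FAILED"` values are assumptions made for illustration; substitute whatever the Job Status API actually returns.

```python
import time
from typing import Callable


def wait_for_job(get_status: Callable[[str], dict], job_id: str,
                 interval: float = 5.0, timeout: float = 3600.0) -> dict:
    """Poll until the job reports completion; raise on failure or timeout."""
    deadline = time.monotonic() + timeout
    while True:
        response = get_status(job_id)
        state = response.get("status")          # assumed field name
        if state == "COMPLETED":                # assumed status value
            return response                     # carries the model id and version
        if state == "FAILED":                   # assumed status value
            raise RuntimeError(f"job {job_id} failed: {response}")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job {job_id} not finished after {timeout} s")
        time.sleep(interval)
```

Using `time.monotonic()` for the deadline keeps the timeout correct even if the system clock is adjusted while the job is running.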

Retraining/Finetuning a model

Retraining can improve the performance of a machine learning model. It adds data to the training set and then trains the model on the modified training set. When retraining completes, a new version of the ML model is created.

1. Retrain Machine learning model using Job APIs

The first step is to create a retraining job for a particular machine learning model. The advantage of this method is that you can specify the id and version of the dataset manually, so retraining is not restricted to a single dataset. This method is mainly used to analyze how ML models behave with respect to different datasets and tuning parameters.

Steps

  1. Choose the id and version of the dataset with which you want to retrain the ML model, then create a retraining job using the Create Job API.

  2. Start the retraining job using the Start Job API.

  3. Check the status of the job using the Job Status API until the job is completed. After retraining completes, the Job Status API returns the id and version of your retrained model.
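Since this method lets you choose the dataset id and version explicitly, comparing a model across several datasets amounts to preparing one retraining job per dataset. The sketch below only builds the request bodies. The `RetrainJobRequest` type and its field names (`model_id`, `dataset_id`, `dataset_version`) are hypothetical, as is the choice of strings for versions; check the Create Job API for the real schema.

```python
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class RetrainJobRequest:
    """Hypothetical request body for a retraining job."""

    model_id: str
    dataset_id: str
    dataset_version: str


def plan_retraining_jobs(model_id: str,
                         datasets: list[tuple[str, str]]) -> list[dict]:
    """Build one request body per (dataset id, dataset version) pair."""
    return [
        asdict(RetrainJobRequest(model_id, dataset_id, dataset_version))
        for dataset_id, dataset_version in datasets
    ]
```

Each returned dictionary would then be submitted as its own retraining job, started, and polled for status, so the resulting model versions can be compared side by side.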

2. Retrain Machine learning model using Model APIs

To retrain a machine learning model using the Model APIs, dedicate a dataset to that model when you create it. Otherwise, data conflicts can affect retraining.

You can also retrain a model using the Model APIs. This method suits practical use: in production, the data a machine learning model handles may change over time, or new patterns may appear, so retraining is the way to keep the model's performance up. The data originally used to train the model still matters, however. Retraining through the Model APIs adds new data to the training data without losing the previous data, which makes it well suited to retraining deployed models.

Steps

  1. Upload dataitems using the Single Upload API.

  2. Update the uploaded dataitems using the Single Update API.

  3. Upgrade the model using the Upgrade API.
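The three steps above can be sketched as a loop over new items followed by a single upgrade call. Everything named here is hypothetical: the `ModelClient` interface, its method names and parameters, the response fields, and the idea that the update step attaches fields such as labels (the text does not say what is updated). The stub keeps the previously stored items and bumps the version on upgrade, mirroring the behavior described above rather than any documented response format.

```python
from typing import Protocol


class ModelClient(Protocol):
    """Hypothetical interface; method names are assumptions, not the real API."""

    def single_upload(self, model_id: str, content: bytes) -> str: ...
    def single_update(self, item_id: str, fields: dict) -> None: ...
    def upgrade(self, model_id: str) -> dict: ...


def retrain_with_new_items(client: ModelClient, model_id: str,
                           items: list[tuple[bytes, dict]]) -> dict:
    """Upload and update each new item, then upgrade the model once."""
    for content, fields in items:
        item_id = client.single_upload(model_id, content)  # step 1
        client.single_update(item_id, fields)               # step 2
    return client.upgrade(model_id)                         # step 3


class StubModelClient:
    """In-memory stand-in that retains earlier items across upgrades."""

    def __init__(self, existing_items: int, version: int) -> None:
        self.items = {f"item-{i}": {} for i in range(existing_items)}
        self.version = version

    def single_upload(self, model_id: str, content: bytes) -> str:
        item_id = f"item-{len(self.items)}"
        self.items[item_id] = {}
        return item_id

    def single_update(self, item_id: str, fields: dict) -> None:
        self.items[item_id].update(fields)

    def upgrade(self, model_id: str) -> dict:
        self.version += 1  # retraining yields a new model version
        return {
            "model_id": model_id,
            "model_version": self.version,
            "training_items": len(self.items),
        }
```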
