System Information (please complete the following information):
Model Builder Version (available in Manage Extensions dialog): 17.18.2.2415501
Visual Studio Version: 17.9.2
The Train function generated by Model Builder does not work for the "Named Entity Recognition" scenario and throws an exception:
System.ArgumentOutOfRangeException: 'Cannot map column (name: Label, type: Key<UInt32, 0-0>) in data to the user-defined type, Microsoft.ML.Data.VBuffer`1[System.UInt32]. Arg_ParamName_Name'
Using Model Builder, I was able to generate a Named Entity Recognition mlnet model. Model Builder generated a *.training.cs file with a "Train" function:
/// <summary>
/// Train a new model with the provided dataset.
/// </summary>
/// <param name="outputModelPath">File path for saving the model. Should be similar to "C:\YourPath\ModelName.mlnet"</param>
/// <param name="inputDataFilePath">Path to the data file for training.</param>
/// <param name="separatorChar">Separator character for delimited training file.</param>
/// <param name="hasHeader">Boolean if training file has a header.</param>
public static void Train(string outputModelPath, string inputDataFilePath = RetrainFilePath, char separatorChar = RetrainSeparatorChar, bool hasHeader = RetrainHasHeader, bool allowQuoting = RetrainAllowQuoting)
{
    var mlContext = new MLContext();
    var data = LoadIDataViewFromFile(mlContext, inputDataFilePath, separatorChar, hasHeader, true);
    var model = RetrainModel(mlContext, data);
    SaveModel(mlContext, model, data, outputModelPath);
}
Calling this function throws an exception inside RetrainModel, at the pipeline.Fit call:
/// <summary>
/// Retrain model using the pipeline generated as part of the training process.
/// </summary>
/// <param name="mlContext"></param>
/// <param name="trainData"></param>
/// <returns></returns>
public static ITransformer RetrainModel(MLContext mlContext, IDataView trainData)
{
    var pipeline = BuildPipeline(mlContext);
    var model = pipeline.Fit(trainData); // <- THE EXCEPTION IS THROWN HERE
    return model;
}
System.ArgumentOutOfRangeException: 'Cannot map column (name: Label, type: Key<UInt32, 0-0>) in data to the user-defined type, Microsoft.ML.Data.VBuffer`1[System.UInt32]. Arg_ParamName_Name'
Here is an example dataset I made for this post, but every dataset I have tried fails the same way: test data example.txt
The problem originates in the 'LoadIDataViewFromFile' function in *.training.cs, which loads dataset.txt without the tags. I was able to work around it by writing my own training function:
private class Label(string key)
{
    public readonly string Key = key;
}
public static void TrainNER(string outputModelPath, string inputLabelsFilePath, string inputDataFilePath)
{
    // Reads one label (tag) per line from the labels file.
    IEnumerable<Label> GetLabels(string inputLabelsFilePath)
    {
        var lines = File.ReadLines(inputLabelsFilePath);
        return lines.Select(x => new Label(x));
    }
    // Reads tab-separated lines: the first field is the sentence, the remaining fields are its labels.
    IEnumerable<ModelInput> GetLine(string fileName)
    {
        using StreamReader sr = File.OpenText(fileName);
        string? line;
        while ((line = sr.ReadLine()) != null)
        {
            var split = line.Split('\t');
            yield return new ModelInput()
            {
                Sentence = split[0],
                Label = split[1..]
            };
        }
    }
    var mlContext = new MLContext();
    var labels = mlContext.Data.LoadFromEnumerable(GetLabels(inputLabelsFilePath));
    var dataView = mlContext.Data.LoadFromEnumerable(GetLine(inputDataFilePath));
    var chain = new EstimatorChain<ITransformer>();
    var estimator = chain.Append(mlContext.Transforms.Conversion.MapValueToKey("Label", keyData: labels))
        .Append(mlContext.MulticlassClassification.Trainers.NamedEntityRecognition(outputColumnName: "predicted_label", batchSize: 32, maxEpochs: 10))
        .Append(mlContext.Transforms.Conversion.MapKeyToValue("predicted_label"));
    using var transformer = estimator.Fit(dataView);
    // SaveModel is the function automatically generated in *.training.cs
    SaveModel(mlContext, transformer, dataView, outputModelPath);
}
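Since the workaround above splits each line on tabs (first field is the sentence, the rest are labels), it may help to sanity-check the data file before training. The sketch below is only an illustration: the class and method names (NerDataCheck, FindMismatchedLines) are hypothetical, and it assumes the trainer expects one label per space-separated word, which is my reading of the data format rather than something confirmed in this thread.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper, not part of the generated *.training.cs file.
public static class NerDataCheck
{
    // Returns the 1-based numbers of the lines whose label count differs
    // from the number of space-separated words in the sentence.
    // Assumption: each line is "sentence<TAB>label1<TAB>label2...".
    public static List<int> FindMismatchedLines(IEnumerable<string> lines)
    {
        var mismatched = new List<int>();
        int lineNumber = 0;
        foreach (var line in lines)
        {
            lineNumber++;
            var split = line.Split('\t');
            // Same convention as the workaround: field 0 is the sentence.
            var words = split[0].Split(' ', StringSplitOptions.RemoveEmptyEntries);
            var labels = split.Skip(1).ToArray();
            if (words.Length != labels.Length)
            {
                mismatched.Add(lineNumber);
            }
        }
        return mismatched;
    }
}
```

It could be called as NerDataCheck.FindMismatchedLines(File.ReadLines(inputDataFilePath)) before TrainNER, so that malformed lines are reported by number instead of surfacing later as a failure inside Fit.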