Hello!
I often have CSV files with more than 50 float columns, so specifying each of them individually isn't feasible. I have been unable to load them in one shot using a range/sweep specifier. To test this on a smaller scale, I used the Iris example, because it ends with 4 float columns.
Here's the data class; I only added the 2 lines at the end:
public class IrisData
{
    [Column("0")]
    public float Label;
    [Column("1")]
    public float SepalLength;
    [Column("2")]
    public float SepalWidth;
    [Column("3")]
    public float PetalLength;
    [Column("4")]
    public float PetalWidth;
    [Column("1-*", name: "Features")] // New
    public float[] Features; // New
}
Here's the simplified pipeline; I only commented out the usual approach with ColumnConcatenator:
var pipeline = new LearningPipeline();
pipeline.Add(new TextLoader(DataPath).CreateFrom<IrisData>(useHeader: true, separator: '\t'));
//pipeline.Add(new ColumnConcatenator("Features",
// "SepalLength",
// "SepalWidth",
// "PetalLength",
// "PetalWidth"));
pipeline.Add(new KMeansPlusPlusClusterer() { K = 3 });
var model = pipeline.Train<IrisData, ClusterPrediction>();
It works when I load each column individually and then concatenate them in the pipeline, as the sample code does. But the code above always throws an exception:
System.Reflection.TargetInvocationException: 'Exception has been thrown by the target of an invocation.'
Inner Exception:
InvalidOperationException: Column 'Features' is a vector of variable size, which is not supported for normalizers
Please help! Thank you!
=============================================================
System information
- OS version/distro: Windows 10
- .NET Version (eg., dotnet --info): .NET Framework 4.7.1
Issue
- What did you do?: Tried to load multiple float columns from a CSV by specifying a range in the data class's declaration, for example: "1-4"
- What happened?: I got an exception saying that the 'Features' column is a vector of variable size.
- What did you expect?: That concatenating columns by specifying a range would work the same as adding a ColumnConcatenator to the pipeline.
Source code / logs
Please paste or attach the code or logs or traces that would be helpful to diagnose the issue you are reporting.