
Deprecated Wiki: Azure Trainer

Jack Gerrits edited this page Apr 29, 2019 · 1 revision

Vowpal Wabbit can be hosted on Azure as a cloud service that trains models. Data is fed to the trainer through Azure EventHubs as JSON-formatted examples.

Note: As of today the trainer expects contextual bandit-style data, as it performs evaluation. With minor modifications one should be able to train arbitrary models.

Configuration

Settings can be supplied at deployment time in the cscfg file or changed at runtime through the Azure portal. The trainer observes configuration changes and restarts as required.

  • APPINSIGHTS_INSTRUMENTATIONKEY: An Application Insights key used for central logging and performance monitoring.
  • Diagnostics.ConnectionString: Azure storage connection string for startup logging.
  • StorageConnectionString: Azure storage connection string used to output models.
  • JoinedEventHubConnectionString: EventHubs connection string used for data ingestion.
  • EvalEventHubConnectionString: EventHubs connection string used to output evaluation.
  • AdminToken: The authorization token required to invoke the trainer's REST API.
  • CheckpointIntervalOrCount: The trainer checkpoints at regular intervals. If the value contains ':' it is treated as a time span (see format); otherwise it is treated as the number of examples after which a checkpoint is taken. Checkpoint data includes the model, a list of events since the last checkpoint (trackback), and the EventHubs position.
  • EnableExampleTracing: Traces each example in VW string format to AppInsights.
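
The CheckpointIntervalOrCount rule described above can be sketched as follows. This is only an illustration of the "contains ':'" rule, not the trainer's own code: the function name parse_checkpoint_setting is hypothetical, and the sketch assumes a plain hh:mm:ss time span, which may be narrower than the format the trainer actually accepts.

```python
from datetime import timedelta


def parse_checkpoint_setting(value):
    """Classify a CheckpointIntervalOrCount value.

    Returns ("interval", timedelta) if the value contains ':',
    otherwise ("count", int). Hypothetical helper for illustration.
    """
    value = value.strip()
    if ":" in value:
        # Assumption: only the simple hh:mm:ss form is handled here.
        parts = value.split(":")
        if len(parts) != 3:
            raise ValueError("expected hh:mm:ss, got %r" % value)
        hours, minutes, seconds = (int(p) for p in parts)
        return ("interval", timedelta(hours=hours, minutes=minutes, seconds=seconds))
    # No ':' present: the value is a number of examples.
    return ("count", int(value))
```

Under these assumptions, "00:05:00" is read as a five-minute interval and "50000" as a checkpoint every 50000 examples.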

Upgrade an existing deployment

  • Download the cscfg from the Management console using this PowerShell script.
  • Download the cspkg from the GitHub release page. Make sure you select the correct VM size.
  • Navigate to the Azure portal and your trainer instance, then select "Update".
  • Select: Deploy even if one or more roles contain a single instance
  • Select: Allow the update if role sizes change or if the number of roles change
  • Hit start.

REST API

Each request expects an HTTP authorization header: "Authorization: <AdminToken>".

  • /reset: Resets the current model.
  • /status: Returns a status and live performance counter values.

An offline-trained model can be injected (a warm start) using

 curl  -v --request PUT --data-binary @<filename>  "<trainer url>/reset" --header "Authorization: <Insert AdminToken here>"
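
The same requests can be prepared from a script. The sketch below uses only Python's standard library and mirrors the curl command above. The helper name build_trainer_request is hypothetical, and the use of GET for /status is an assumption, since the page does not state the HTTP method for that endpoint.

```python
import urllib.request


def build_trainer_request(trainer_url, path, admin_token, method="GET", body=None):
    """Prepare (but do not send) a request to the trainer's REST API.

    trainer_url and admin_token are placeholders for your deployment's
    URL and the AdminToken configuration value.
    """
    url = trainer_url.rstrip("/") + path
    return urllib.request.Request(
        url,
        data=body,  # raw bytes, like curl's --data-binary
        method=method,
        headers={"Authorization": admin_token},
    )


def build_status_request(trainer_url, admin_token):
    # Assumption: /status is read with a plain GET.
    return build_trainer_request(trainer_url, "/status", admin_token)


def build_warmstart_request(trainer_url, admin_token, model_bytes):
    # Mirrors the curl call: PUT the model file's bytes to /reset.
    return build_trainer_request(
        trainer_url, "/reset", admin_token, method="PUT", body=model_bytes
    )
```

Passing a prepared request to urllib.request.urlopen sends it. For a warm start, model_bytes would be the contents of the offline-trained model file read in binary mode.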


Debugging

To run the debug version on Azure, connect using Remote Desktop and copy the files matching C:\Program Files (x86)\Microsoft Visual Studio 12.0\VC\redist\Debug_NonRedist\x64\Microsoft.VC120.DebugCRT* to D:\Windows\System32.
