Contextual Multi-Armed Bandit Item/Reward Tracker & Model Trainer

The Improve AI Tracker/Trainer is a stack of serverless components that trains updated contextual multi-armed bandit models for scoring, ranking, and decisions. The stack runs on AWS and provides a cheap, easy way to track JSON items and their rewards sent from Improve AI libraries. Each reward is joined with the tracked item it is associated with, and the joined records are used as input for training new scoring and ranking models.
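The join of rewards to tracked items can be pictured with a small sketch. The record fields (`id`, `item_id`, `item`, `reward`) and the function `join_rewards` are hypothetical, chosen only for illustration; the tracker's real schema and join logic may differ.

```python
from collections import defaultdict


def join_rewards(tracked_items, rewards):
    """Sum the rewards for each tracked item and attach the total to it.

    Both record layouts are assumptions for illustration, not the
    tracker's actual schema.
    """
    totals = defaultdict(float)
    for r in rewards:
        totals[r["item_id"]] += r["reward"]
    # Items that never received a reward get a total of 0.0.
    return [
        {"item": t["item"], "reward": totals.get(t["id"], 0.0)}
        for t in tracked_items
    ]


tracked = [
    {"id": "a1", "item": {"headline": "Hello"}},
    {"id": "a2", "item": {"headline": "Welcome"}},
]
rewards = [
    {"item_id": "a1", "reward": 1.0},
    {"item_id": "a1", "reward": 0.5},
]

training_rows = join_rewards(tracked, rewards)
print(training_rows)
```

Each output row pairs an item with its accumulated reward, which is the general shape a regression trainer would need as input.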

Deployment

Fork this repo

Make a private fork of this repo. This way your model configuration is stored in revision control.

Install the Serverless Framework

$ npm install -g serverless

Install NPM Dependencies

$ npm install

Configure Models and Training Parameters

$ nano config/config.yml

Deploy the Stack

Deploy a new dev stage in us-east-1

$ serverless deploy --stage dev

Deploying improveai-acme-demo to stage dev (us-east-1)

✔ Service deployed to stack improveai-acme-demo-dev (111s)

endpoint: https://xxxx.lambda-url.us-east-1.on.aws/

The deployment output lists the track endpoint URL, for example https://xxxx.lambda-url.us-east-1.on.aws/. Client SDKs may use the track endpoint URL directly to track decisions and rewards. Alternatively, a CDN may be configured in front of the track endpoint URL for greater administrative control.

The deployment will also create a models S3 bucket in the form of improveai-{organization}-{project}-{stage}-models. After each round of training, updated models are automatically uploaded to the models bucket.
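As a quick illustration of the naming pattern, the bucket name can be assembled from its three parts. The helper `models_bucket_name` is hypothetical, and the example values assume that `acme` is the organization and `demo` is the project in the sample deployment output above.

```python
def models_bucket_name(organization, project, stage):
    """Build the models bucket name from the documented template."""
    return f"improveai-{organization}-{project}-{stage}-models"


# Assumed values, read off the sample stack name improveai-acme-demo-dev.
bucket = models_bucket_name("acme", "demo", "dev")
print(bucket)
```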

The models bucket is private by default. Make the '/models/latest/' directory public to serve models directly from S3. Alternatively, a CDN may be configured in front of the models S3 bucket.

Model URLs follow the template of https://{modelsBucket}.s3.amazonaws.com/models/latest/{modelName}.{mlmodel|xgb}.gz. The Android and Python SDKs use .xgb.gz models and the iOS SDK uses .mlmodel.gz models.
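The URL template can be expanded in code as follows. This is a minimal sketch: the helper `model_url`, the `sdk` keys, and the model name `greetings` are illustrative assumptions, while the extension per SDK follows the description above.

```python
# File extension per SDK, as described above (keys are illustrative).
MODEL_EXTENSIONS = {
    "ios": "mlmodel",
    "android": "xgb",
    "python": "xgb",
}


def model_url(models_bucket, model_name, sdk):
    """Expand the documented model URL template for a given SDK."""
    extension = MODEL_EXTENSIONS[sdk]
    return (
        f"https://{models_bucket}.s3.amazonaws.com"
        f"/models/latest/{model_name}.{extension}.gz"
    )


# "greetings" is a hypothetical model name.
ios_url = model_url("improveai-acme-demo-dev-models", "greetings", "ios")
python_url = model_url("improveai-acme-demo-dev-models", "greetings", "python")
print(ios_url)
print(python_url)
```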

Integrate a Ranker Library

Improve AI libraries are currently available for Swift, Java, and Python.

Algorithm

The reinforcement learning algorithm is a contextual multi-armed bandit with XGBoost acting as the core regression algorithm. As such, it is ideal for making decisions on structured data, such as JSON or native objects in Swift/Objective-C, Java/Kotlin, and Python. Unlike deep reinforcement learning algorithms, which often require simulator environments and hundreds of millions of decisions, this algorithm performs well with the more modest amounts of data found in real-world applications. Compared to A/B testing, it requires exponentially less data for good results.
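To make the score-then-choose loop of a contextual bandit concrete, here is a minimal epsilon-greedy sketch. It is not the Improve AI implementation: the exploration strategy, the `score` stand-in (where a trained regression model such as XGBoost would sit), and every name and field below are assumptions for illustration.

```python
import random


def score(item, context):
    """Stand-in for a trained regression model's predicted reward.

    A real deployment would call the trained model here instead of
    this hand-written rule.
    """
    bonus = 0.5 if item["language"] == context["language"] else 0.0
    return item["base"] + bonus


def choose(items, context, epsilon=0.1, rng=random):
    """Epsilon-greedy: usually pick the best-scoring item, sometimes explore."""
    ranked = sorted(items, key=lambda it: score(it, context), reverse=True)
    if rng.random() < epsilon:
        return rng.choice(ranked)  # explore: pick any item at random
    return ranked[0]               # exploit: pick the top-ranked item


items = [
    {"text": "Hello", "language": "en", "base": 0.2},
    {"text": "Hola", "language": "es", "base": 0.3},
]

# With epsilon=0.0 the choice is purely greedy and therefore deterministic.
best = choose(items, {"language": "en"}, epsilon=0.0)
print(best["text"])
```

The context changes which item scores highest, which is what distinguishes a contextual bandit from a plain A/B test that picks one winner for everyone.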

