spark-lp

This package offers an implementation of Mehrohra's predictor-corrector interior point algorithm, described in my thesis Distributed linear programming with Apache Spark, to solve large-scale linear programming problems at the lowest cost using Apache Spark.

Linear programming has the following standard form:

minimize c^T x 
subject to Ax=b and x >= 0

where c, b are given vectors ((.)^T is the traspose operation), A is a given m by n matrix and x is the objective vector. We assume that in A the number of rows (equations) is at most equal to the number of columns (unknowns) (m <= n) and A has full row rank, thus AA^T is invertible.

Example

The following is an example of using spark-lp locally to solve a linear programming problem in parallel with 2 cores and 2 partitions:

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.mllib.linalg.{DenseVector, Vector, Vectors}
import org.apache.spark.mllib.optimization.lp.VectorSpace._
import org.apache.spark.mllib.optimization.lp.vs.dvector.DVectorSpace
import org.apache.spark.mllib.optimization.lp.vs.vector.DenseVectorSpace
import org.apache.spark.mllib.optimization.lp.LP

val sparkConf = new SparkConf().setMaster("local[2]").setAppName("TestLPSolver")
val sc = new SparkContext(sparkConf)
val numPartitions = 2
val cArray = Array(2.0, 1.5, 0.0, 0.0, 0.0, 0.0, 0.0)
val BArray = Array(
	Array(12.0, 16.0, 30.0, 1.0, 0.0),
	Array(24.0, 16.0, 12.0, 0.0, 1.0),
	Array(-1.0, 0.0, 0.0, 0.0, 0.0),
	Array(0.0, -1.0, 0.0, 0.0, 0.0),
	Array(0.0, 0.0, -1.0, 0.0, 0.0),
	Array(0.0, 0.0, 0.0, 1.0, 0.0),
	Array(0.0, 0.0, 0.0, 0.0, 1.0))
val bArray = Array(120.0, 120.0, 120.0, 15.0, 15.0)

val c: DVector = sc.parallelize(cArray, numPartitions).glom.map(new DenseVector(_))
val rows: DMatrix = sc.parallelize(BArray, numPartitions).map(Vectors.dense(_))
val b: DenseVector = new DenseVector(bArray)

val (v, x): (Double, DVector) = LP.solve(c, rows, b, sc=sc)
val xx = Vectors.dense(x.flatMap(_.toArray).collect())
println(s"optimial vector is $xx")
println("optimal min value: " + v)

Software Architecture Overview

Detailed descriptions of our design is described in chapter 4 of the thesis.

Advantages

spark-lp is unique because it is open-source and it can solve large-scale LP problems in a distributed way with fault-tolerance over commodity clusters of machines. Thus, it provides the lowest cost opportunity for such applications. See page 42 for cluster results here.
spark-lp is at least ~10X faster and more accurate than spark-tfocs for solving large-scale LP problems. See page 38 for local results here. Our benchmark shows that spark-tfocs is not suitable even for small LP problems.

Future plans:

Add preprocessing to capture more general LP formats.
Add infeasibility detection.
Extend to QP solver.
Add GPU support, as described in page 47 here, using INDArray provided in ND4J library.

Name		Name	Last commit message	Last commit date
Latest commit History 6 Commits
conf		conf
project		project
src		src
.gitignore		.gitignore
LICENSE		LICENSE
README.md		README.md
build.sbt		build.sbt

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

conf

conf

project

project

src

src

.gitignore

.gitignore

LICENSE

LICENSE

README.md

README.md

build.sbt

build.sbt

Repository files navigation

spark-lp

Example

Software Architecture Overview

Advantages

Future plans:

About

Languages

License

ehsanmok/spark-lp

Folders and files

Latest commit

History

Repository files navigation

spark-lp

Example

Software Architecture Overview

Advantages

Future plans:

About

Topics

Resources

License

Stars

Watchers

Forks

Languages