This repository has been archived by the owner on Apr 8, 2021. It is now read-only.

Allow SbtUpdateReport to provide arbitrary artifact files, not just jars #113

Open
NicolasRouquette opened this issue Oct 20, 2016 · 2 comments

Comments

@NicolasRouquette

net.virtualvoid.sbt.graph.backend.SbtUpdateReport.fromConfigurationReport() has a hardcoded rule for selecting a "jar" file for a given ModuleReport:

https://github.com/jrudolph/sbt-dependency-graph/blob/master/src/main/scala/net/virtualvoid/sbt/graph/backend/SbtUpdateReport.scala#L37

It would be helpful to pass an artifact file selector to this function; something like this:

def fromConfigurationReport
(report: ConfigurationReport,
 rootInfo: sbt.ModuleID,
 selector: (Artifact, File) => Boolean)
: ModuleGraph = {
  implicit def id(sbtId: sbt.ModuleID): ModuleId
  = ModuleId(sbtId.organization, sbtId.name, sbtId.revision)

  def moduleEdges(orgArt: OrganizationArtifactReport)
  : Seq[(Module, Seq[Edge])]
  = {
    val chosenVersion = orgArt.modules.find(!_.evicted).map(_.module.revision)
    orgArt.modules.map(moduleEdge(chosenVersion))
  }

  def moduleEdge(chosenVersion: Option[String])(report: ModuleReport)
  : (Module, Seq[Edge]) = {
    val evictedByVersion = if (report.evicted) chosenVersion else None

    val jarFile = report.artifacts.find(selector.tupled).map(_._2)
    (Module(
      id = report.module,
      license = report.licenses.headOption.map(_._1),
      evictedByVersion = evictedByVersion,
      jarFile = jarFile,
      error = report.problem),
      report.callers.map(caller ⇒ Edge(caller.caller, report.module)))
  }

  val (nodes, edges) = report.details.flatMap(moduleEdges).unzip
  val root = Module(rootInfo)

  ModuleGraph(root +: nodes, edges.flatten)
}

A selector function could then be defined like this:

def jarFileSelector
( a: Artifact, f: File)
: Boolean
= a.`type` == "jar" || a.extension == "jar"

def jarOrZipFileSelector
( a: Artifact, f: File)
: Boolean
= a.`type` == "jar" || a.`type` == "zip" || a.extension == "jar" || a.extension == "zip"

I suggest adding a new setting key that holds a selector function and can be customized per project, with jarFileSelector as the default for backwards compatibility with the current behavior.

@jrudolph
Member

@NicolasRouquette Sorry for keeping you waiting for so long. Yes, passing that selector in could be an option. Would it be enough to pass in a Set of file extensions or types instead?

Can you explain when this problem arises? Should we also fix the default behavior?
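
For comparison, a Set of extensions or types can always be turned into a selector of the proposed shape, so the two options are not mutually exclusive. The following is only a minimal sketch: `Artifact` here is a hypothetical stand-in for `sbt.Artifact` reduced to the fields used in this issue, and `selectorFor` is an illustrative name, not part of the plugin.

```scala
import java.io.File

object ExtensionSelectorSketch {
  // Hypothetical stand-in for sbt.Artifact, reduced to the fields used here.
  final case class Artifact(name: String, `type`: String, extension: String)

  // Builds an (Artifact, File) => Boolean selector from a set of accepted
  // names; an artifact matches if either its type or its extension is listed.
  def selectorFor(accepted: Set[String]): (Artifact, File) => Boolean =
    (a, _) => accepted.contains(a.`type`) || accepted.contains(a.extension)

  // Counterparts of jarFileSelector and jarOrZipFileSelector from this issue.
  val jarOnly: (Artifact, File) => Boolean = selectorFor(Set("jar"))
  val jarOrZip: (Artifact, File) => Boolean = selectorFor(Set("jar", "zip"))
}
```

A Set covers the jar/zip cases above; a function is only needed when the decision depends on something other than the type or extension, such as the file itself.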

@NicolasRouquette
Author

@jrudolph

Where does this problem come from?

This problem arises from several use cases where it is necessary to retrieve a filtered set of Maven artifacts from the transitive dependencies of a given project.

  1. Application-specific packaging

    Many applications are designed to support user-defined extensions. We prefer to manage such applications and extensions as Maven artifacts. This way, Maven dependencies record which versions of an extension are compatible with which versions of the other extensions it depends on and, transitively, with which version of the application itself.

    We've recently done this for a COTS tool, NoMagic's MagicDraw, and for one of its extensions, NoMagic's SysML plugin for MagicDraw. Here, the Maven POM is just a proxy specifying where to fetch the actual artifact from NoMagic's new download service with proper credentials.

    In turn, we developed several extensions of this tool, e.g., The OMG Tool-Neutral Interoperability API adapter for MagicDraw and The Dynamic Scripts plugin for MagicDraw.

  2. Provisioning applications + extensions + datasets for workflow execution

    For us, a workflow orchestrates the execution of micro-services, each of which is a kind of technology-specific application with extensions. Some of our ontology modeling workflows involve JVM-based applications (Scala, Java) as well as Ruby and JavaScript. We'd like to use Maven as a way of proxying Ruby GEM and NPM dependencies so that we can use the same SBT machinery for resolving and fetching these dependencies to the appropriate repositories.

Limitations in the current logic

See: https://github.com/jrudolph/sbt-dependency-graph/blob/master/src/main/scala/net/virtualvoid/sbt/graph/backend/SbtUpdateReport.scala#L37

The logic only handles one artifact per module. It doesn't work when the graph needs nodes for multiple artifacts published for a given module.

A peculiar need arose when I had to publish an artifact to Bintray that exceeded the 250 MB limit for OSS repos. In this case, I published the artifact in multiple parts of 250 MB or less; e.g.:
https://bintray.com/jpl-imce/gov.nasa.jpl.imce/imce.dynamic_scripts.magicdraw.plugin/6.18.0

An improved version of fromConfigurationReport that handles multiple artifacts per module is below:

def fromConfigurationReport
(report: ConfigurationReport,
 rootInfo: sbt.ModuleID,
 selector: (Artifact, File) => Boolean)
: net.virtualvoid.sbt.graph.ModuleGraph = {
  implicit def id(sbtId: sbt.ModuleID): net.virtualvoid.sbt.graph.ModuleId
  = net.virtualvoid.sbt.graph.ModuleId(sbtId.organization, sbtId.name, sbtId.revision)

  def moduleEdges(orgArt: OrganizationArtifactReport)
  : Seq[(net.virtualvoid.sbt.graph.Module, Seq[net.virtualvoid.sbt.graph.Edge])]
  = {
    val chosenVersion = orgArt.modules.find(!_.evicted).map(_.module.revision)
    orgArt.modules.flatMap(moduleEdge(chosenVersion))
  }

  def moduleEdge(chosenVersion: Option[String])(report: ModuleReport)
  : Seq[(net.virtualvoid.sbt.graph.Module, Seq[net.virtualvoid.sbt.graph.Edge])] = {
    val evictedByVersion = if (report.evicted) chosenVersion else None

    report
      .artifacts
      .filter(selector.tupled)
      .map { case (artifact, file) =>

        (net.virtualvoid.sbt.graph.Module(
          id = report.module,
          license = report.licenses.headOption.map(_._1),
          evictedByVersion = evictedByVersion,
          jarFile = Some(file),
          error = report.problem),
          report.callers.map(caller ⇒ net.virtualvoid.sbt.graph.Edge(caller.caller, report.module)))
      }
  }

  val (nodes, edges) = report.details.flatMap(moduleEdges).unzip
  val root = net.virtualvoid.sbt.graph.Module(rootInfo)

  net.virtualvoid.sbt.graph.ModuleGraph(root +: nodes, edges.flatten)
}
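
The practical difference between the find-based and the filter-based versions can be seen on stand-in data. This is a minimal sketch, not plugin code: `Artifact` is a hypothetical stand-in for `sbt.Artifact`, and the three part names are made up to mimic a module published in several pieces.

```scala
import java.io.File

object MultiArtifactSketch {
  // Hypothetical stand-in for sbt.Artifact, reduced to the fields used here.
  final case class Artifact(name: String, `type`: String, extension: String)

  // A module published in three parts; names and types are illustrative only.
  val artifacts: Seq[(Artifact, File)] = Seq(
    Artifact("plugin-part1", "zip", "zip") -> new File("plugin-part1.zip"),
    Artifact("plugin-part2", "zip", "zip") -> new File("plugin-part2.zip"),
    Artifact("plugin-part3", "zip", "zip") -> new File("plugin-part3.zip"))

  val zipSelector: (Artifact, File) => Boolean =
    (a, _) => a.`type` == "zip" || a.extension == "zip"

  // One artifact per module: find stops at the first match.
  val firstMatch: Option[File] = artifacts.find(zipSelector.tupled).map(_._2)

  // Multiple artifacts per module: filter keeps every match.
  val allMatches: Seq[File] = artifacts.filter(zipSelector.tupled).map(_._2)
}
```

With find, parts 2 and 3 never reach the graph; with filter, each part yields its own node carrying the same module id.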

Filtering criteria: do we need the Artifact + File, or could we pass file extensions/types?

I think a filter operation predicated on the Artifact + File gives us reasonable flexibility to accommodate various needs. It's unfortunate that the filter doesn't have access to the organization info; this is something that I find particularly useful with low-level filtering with SBT's update.
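
If access to the organization turned out to matter, one conceivable variation is a selector that also receives the module id. This goes beyond what is proposed in this issue and is only a sketch under assumptions: `ModuleID` and `Artifact` are hypothetical stand-ins for the sbt types, and `ModuleSelector`, `ignoringModule`, `fromOrganization` and `select` are made-up names.

```scala
import java.io.File

object ModuleAwareSelectorSketch {
  // Hypothetical stand-ins for sbt.ModuleID and sbt.Artifact.
  final case class ModuleID(organization: String, name: String, revision: String)
  final case class Artifact(name: String, `type`: String, extension: String)

  // A selector that can also inspect the module an artifact belongs to.
  type ModuleSelector = (ModuleID, Artifact, File) => Boolean

  // Lifts a two-argument selector so existing selectors keep working.
  def ignoringModule(selector: (Artifact, File) => Boolean): ModuleSelector =
    (_, a, f) => selector(a, f)

  // Restricts a two-argument selector to modules of one organization.
  def fromOrganization(org: String, selector: (Artifact, File) => Boolean): ModuleSelector =
    (m, a, f) => m.organization == org && selector(a, f)

  // Returns the files of all artifacts of `module` accepted by `selector`.
  def select(module: ModuleID,
             artifacts: Seq[(Artifact, File)],
             selector: ModuleSelector): Seq[File] =
    artifacts.collect { case (a, f) if selector(module, a, f) => f }
}
```

Since moduleEdge already has report.module in scope, passing it along would not require any additional data from the update report.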

Why use dependency graph instead of SBT's update report?

Before using the dependency graph, I was using SBT's update report to retrieve a filtered subset of dependencies I needed. This worked most of the time except when we began using version range dependencies; e.g.:

libraryDependencies += "foo" % "bar" % "1.0.+"

First, we can't use the SBT maven resolver plugin in this case because of known limitations (See: http://www.scala-sbt.org/0.13/docs/sbt-0.13-Tech-Previews.html#Maven+resolver+plugin).

Second, we noticed some odd behavior where SBT's update report doesn't always provide information about the latest artifact available. For example, if the local ivy cache has, say, version 1.0.0 but the repo has version 1.0.1, then I noticed that dependency graph would give us the current up-to-date info whereas SBT's update report didn't. I am really dumbfounded by this difference. I am not sure how I could make a reproducible case of this peculiar behavior. What I do know is that the SBT dependency graph does provide current up-to-date info and that's why I've updated a lot of my SBT scripts to use it instead of the SBT update report.

Proposal

  1. Add a new fromConfigurationReport as described above so that it invokes a filter for each artifact of a given module, i.e.:
def fromConfigurationReport
(report: ConfigurationReport,
 rootInfo: sbt.ModuleID,
 selector: (Artifact, File) => Boolean)
: net.virtualvoid.sbt.graph.ModuleGraph 
= { /* see above */ }
  2. Provide a few examples of file selectors as originally described.
def jarFileSelector
( a: Artifact, f: File)
: Boolean
= a.`type` == "jar" || a.extension == "jar"

def jarOrZipFileSelector
( a: Artifact, f: File)
: Boolean
= a.`type` == "jar" || a.`type` == "zip" || a.extension == "jar" || a.extension == "zip"
  3. For backwards compatibility, change fromConfigurationReport as follows:
def fromConfigurationReport
(report: ConfigurationReport,
 rootInfo: sbt.ModuleID)
: net.virtualvoid.sbt.graph.ModuleGraph 
= fromConfigurationReport(report, rootInfo, jarFileSelector)
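
The delegating-overload pattern in the last step can be tried in isolation. This is a minimal sketch on stand-in types: `Artifact` is a hypothetical reduction of `sbt.Artifact`, and `selectFiles` is a made-up helper that stands in for fromConfigurationReport.

```scala
import java.io.File

object DefaultSelectorSketch {
  // Hypothetical stand-in for sbt.Artifact, reduced to the fields used here.
  final case class Artifact(name: String, `type`: String, extension: String)

  // Default selector mirroring jarFileSelector from this issue.
  val jarFileSelector: (Artifact, File) => Boolean =
    (a, _) => a.`type` == "jar" || a.extension == "jar"

  // New entry point: the caller chooses which artifacts to keep.
  def selectFiles(artifacts: Seq[(Artifact, File)],
                  selector: (Artifact, File) => Boolean): Seq[File] =
    artifacts.filter(selector.tupled).map(_._2)

  // Existing entry point: same signature as before, delegating with the
  // jar-only default so current callers see no change in behavior.
  def selectFiles(artifacts: Seq[(Artifact, File)]): Seq[File] =
    selectFiles(artifacts, jarFileSelector)
}
```

Callers that pass no selector still get only jars, while callers that need zips or other files opt in explicitly.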
