Skip to content

Commit

Permalink
scrooge-generator: cache getCanonicalPath value in DirImporter
Browse files Browse the repository at this point in the history
Problem
Using a large number of `DirImporter` instances within a `MultiImporter` is very slow.
This is because `DirImporter` uses `java.io.File.getCanonicalPath` as part of it's `equals()`
and `hashCode()` methods, which are called frequently from `MultiImporter`, esp. in
`MultiImporter.deduped()`.  `File.getCanonicalPath` makes system IO calls, and is relatively
slow because of it.

Solution
Updates `DirImporter` to cache the result of `getCanonicalPath` so it's only called once.

JIRA Issues: AIPIPE-7239

Differential Revision: https://phabricator.twitter.biz/D523368
  • Loading branch information
Joseph Boyd authored and jenkins committed Jul 31, 2020
1 parent c5f95e4 commit 76504ba
Showing 1 changed file with 7 additions and 5 deletions.
Original file line number Diff line number Diff line change
Expand Up @@ -78,7 +78,9 @@ object Importer {
}

final case class DirImporter(dir: File) extends Importer {
private[scrooge] def canonicalPaths = Seq(dir.getCanonicalPath) // for test
private lazy val dirCanonicalPath = dir.getCanonicalPath

private[scrooge] def canonicalPaths = Seq(dirCanonicalPath) // for test

private[this] def resolve(filename: String): Option[(File, Importer)] = {
val file = {
Expand All @@ -91,7 +93,7 @@ final case class DirImporter(dir: File) extends Importer {
if (file.canRead) {
// if the file does not exactly locate in dir, we need to add a new relative path
val importer =
if (file.getParentFile.getCanonicalPath == dir.getCanonicalPath)
if (file.getParentFile.getCanonicalPath == dirCanonicalPath)
this
else
Importer(file.getParentFile) +: this
Expand All @@ -117,19 +119,19 @@ final case class DirImporter(dir: File) extends Importer {
resolve(filename).map { case (file, _) => file.getCanonicalPath }
}

override def toString: String = s"DirImporter(${dir.getCanonicalPath})"
override def toString: String = s"DirImporter($dirCanonicalPath)"

// We override equals and hashCode because
// the default equality for File does not have the semantics we want.
override def equals(other: Any): Boolean = {
other match {
case that: DirImporter =>
this.dir.getCanonicalPath == that.dir.getCanonicalPath
this.dirCanonicalPath == that.dirCanonicalPath
case _ => false
}
}

override def hashCode: Int = this.dir.getCanonicalPath.hashCode
override def hashCode: Int = this.dirCanonicalPath.hashCode
}

// jar files are just zip files, so this will work with both .jar and .zip files
Expand Down

0 comments on commit 76504ba

Please sign in to comment.