packaging takes very long time #68

Closed
pankajmi opened this issue Jan 31, 2013 · 44 comments

Comments

@pankajmi

I am not sure if I am the only one facing this problem, but packaging takes a very long time. Maybe I am missing a setting or something.
These last three steps are the slowest; the build gets stuck here for a long time.

[info] Merging 'org/apache/http/annotation/Immutable.class' with strategy 'deduplicate'
[info] SHA-1: WrappedArray(26, -125, -113, 64, 37, 94, -6, -79, -79, 41, 34, -92, 42, 7, -72, 31, -97, -57, 55, -2)
[info] Packaging ~/workspace/myproj/xyz/target/xyz-assembly-1.0.jar ..
[info] Done packaging.
[success] Total time: 724 s, completed 31 Jan, 2013 1:51:10 PM

Please let me know if anyone needs more info.

@eed3si9n
Member

I can't work on this issue without having exact reproduction steps and a full log. Do some projects take longer time than the others? Could it be dependent on the library that you are using? The last portion of the log indicates that multiple jars include org/apache/http/annotation/Immutable.class. Is there more merging that's going on?

@pankajmi
Author

pankajmi commented Feb 1, 2013

It's intermittent; sometimes it gets stuck forever.

Full log:

[info] Loading global plugins from /Users/pankajmittal/.sbt/plugins
[info] Loading project definition from /Users/pankajmittal/workspace/workflow/project
[warn] Multiple resolvers having different access mechanism configured with same name 'local'. To avoid conflict, Remove duplicate project resolvers (`resolvers`) or rename publishing resolver (`publishTo`).
[info] Set current project to workflow (in build file:/Users/pankajmittal/workspace/workflow/)
> project taskmanager
[info] Set current project to taskmanager (in build file:/Users/pankajmittal/workspace/workflow/)

> assembly
[info] Packaging /Users/pankajmittal/workspace/workflow/taskmanager/target/scala-2.9.2/taskmanager_2.9.2-1.0-sources.jar ...
[info] Done packaging.
[info] Compiling 10 Scala sources to /Users/pankajmittal/workspace/workflow/taskmanager/target/scala-2.9.2/classes...
[info] Compiling 3 Scala sources to /Users/pankajmittal/workspace/workflow/taskmanager/target/scala-2.9.2/classes...
[warn] /Users/pankajmittal/workspace/workflow/taskmanager/src/main/scala/com/livestream/taskmanager/TaskManagerServiceConfig.scala:7: class ServerConfig in package config is deprecated: no direct replacement
[warn] class TaskManagerServiceConfig extends ServerConfig[TaskManagerService] {
[warn]                                        ^
[warn] one warning found
[info] No tests to run for taskmanager/test:test
[info] Including antlr-2.7.2.jar
[info] Including c3p0-0.9.1.2.jar
[info] Including logback-classic-1.0.1.jar
[info] Including logback-core-1.0.1.jar
[info] Including jsr305-1.3.9.jar
[info] Including guava-13.0.jar
[info] Including h2-1.3.170.jar
[info] Including amqp-client-3.0.0.jar
[info] Including paranamer-2.4.1.jar
[info] Including finagle-core-6.0.3.jar
[info] Including finagle-http-6.0.3.jar
[info] Including finagle-ostrich4-6.0.3.jar
[info] Including ostrich-9.0.4.jar
[info] Including scala-json-3.0.1.jar
[info] Including util-codec-6.0.4.jar
[info] Including util-collection-6.0.4.jar
[info] Including util-core-6.0.4.jar
[info] Including util-eval-6.0.4.jar
[info] Including util-hashing-6.0.4.jar
[info] Including util-jvm-6.0.4.jar
[info] Including util-logging-6.0.4.jar
[info] Including akka-actor-2.0.5.jar
[info] Including config-0.5.0.jar
[info] Including commons-chain-1.1.jar
[info] Including commons-codec-1.6.jar
[info] Including commons-collections-3.2.1.jar
[info] Including commons-digester-1.8.jar
[info] Including commons-lang-2.6.jar
[info] Including commons-logging-1.1.1.jar
[info] Including commons-net-3.1.jar
[info] Including commons-pool-1.6.jar
[info] Including commons-validator-1.3.1.jar
[info] Including dom4j-1.1.jar
[info] Including netty-3.5.5.Final.jar
[info] Including mysql-connector-java-5.1.6.jar
[info] Including jna-3.3.0.jar
[info] Including lift-json_2.9.2-2.5-SNAPSHOT.jar
[info] Including opencsv-1.8.jar
[info] Including httpclient-4.1.3.jar
[info] Including httpcore-nio-4.2.jar
[info] Including httpcore-4.2.jar
[info] Including httpmime-4.1.3.jar
[info] Including struts-core-1.3.8.jar
[info] Including struts-taglib-1.3.8.jar
[info] Including struts-tiles-1.3.8.jar
[info] Including velocity-tools-2.0.jar
[info] Including velocity-1.7.jar
[info] Including quartz-2.1.3.jar
[info] Including scalap-2.9.2.jar
[info] Including slf4j-api-1.6.4.jar
[info] Including squartz_2.9.2-1.0-SNAPSHOT.jar
[info] Including syslog4j-0.9.30.jar
[info] Including oro-2.0.8.jar
[info] Including postgresql-9.1-901.jdbc4.jar
[info] Including jedis-2.1.0.jar
[info] Including sslext-1.2-0.jar
[info] Including scala-compiler.jar
[info] Including scala-library.jar
[info] Including java-gearman-service-0.6.6.jar
[info] Including gdata-client-1.0.jar
[info] Including gdata-core-1.0.jar
[info] Including gdata-spreadsheet-3.0.jar
[info] Including google-oauth-client-1.10.0-beta.jar
[info] Merging 'License.txt' with strategy 'rename'
[info] Merging 'NOTICE.txt' with strategy 'rename'
[info] Merging 'META-INF/NOTICE.txt' with strategy 'rename'
[info] Merging 'META-INF/NOTICE' with strategy 'rename'
[info] Merging 'META-INF/license' with strategy 'rename'
[info] Merging 'META-INF/LICENSE.txt' with strategy 'rename'
[info] Merging 'LICENSE.txt' with strategy 'rename'
[info] Merging 'META-INF/LICENSE' with strategy 'rename'
[info] Merging 'org/apache/http/annotation/ThreadSafe.class' with strategy 'deduplicate'
[info] Merging 'org/apache/http/annotation/NotThreadSafe.class' with strategy 'deduplicate'
[info] Merging 'META-INF/MANIFEST.MF' with strategy 'discard'
[info] Merging 'org/apache/http/annotation/GuardedBy.class' with strategy 'deduplicate'
[info] Merging 'META-INF/INDEX.LIST' with strategy 'discard'
[info] Merging 'reference.conf' with strategy 'concat'
[info] Merging 'META-INF/services/java.sql.Driver' with strategy 'filterDistinctLines'
[info] Merging 'org/apache/http/annotation/Immutable.class' with strategy 'deduplicate'
[info] SHA-1: WrappedArray(9, 3, 98, -116, -112, 53, 88, -58, 19, -63, 22, 57, 99, 107, 15, 63, -114, 113, 53, 87)
[info] Packaging /Users/pankajmittal/workspace/workflow/taskmanager/target/taskmanager-assembly-1.0.jar ...
[info] Done packaging.
[success] Total time: 809 s, completed 1 Feb, 2013 5:56:44 PM

sbt version: 0.12.2
System info:
Software: OS X 10.8.2 (12C60), Processor: 2.5 GHz Intel Core i5, Memory: 4 GB 1333 MHz DDR3

Let me know if you want me to monitor the system during the process. When I monitored it, almost 600 MB of memory was free and the CPU was 97% idle.

@jrudolph
Member

jrudolph commented Feb 1, 2013

Have you tried running jstack <pid> while the process was hung? If you could post its output, one could at least see in which call it is spending its time. (Maybe run it a few times to see if it returns similar results.)

@eed3si9n
Member

eed3si9n commented Feb 1, 2013

A caching feature was requested in #59 and was added in the latest release, 0.8.5. It calculates the SHA-1 of every single *.class file prior to packaging. Try running clean and test first to see if the assembly speeds up.

@antonioalegria

I have the same issue with SBT Assembly 0.8.5, Scala 2.9.2 and SBT 0.12.2. Here is my jstack output: https://gist.github.com/4707228

@pankajmi
Author

pankajmi commented Feb 5, 2013

As I said, it's intermittent; it doesn't always happen. Here are a few jstacks from the last time it happened.

https://gist.github.com/pankajmi/4712919

https://gist.github.com/pankajmi/4712926

@amrnt

amrnt commented Feb 8, 2013

Same here! And I don't know if it will finish soon!

jstack: https://gist.github.com/amrnt/82a716d70b9e1919bbcd

I'm on Macbook Air 4GB - Scala 2.10.0

@eed3si9n
Member

eed3si9n commented Feb 8, 2013

It seems like most of your jstack outputs are in some phase of calculating the hash, which is consistent with my theory that this was introduced as part of #59. I could add a setting to enable/disable the caching behavior and turn it off by default.

@amrnt

amrnt commented Feb 9, 2013

After I updated my sbt from 0.12.0 to 0.12.2, it works fine!

@eed3si9n
Member

eed3si9n commented Feb 9, 2013

Really? @pankajmi Could you please try sbt 0.12.2, and see if the situation changes for you too?

@eed3si9n
Member

0.8.6 is out with default caching turned off.

@pankajmi
Author

I was having the problem with sbt 0.12.2, as I mentioned above.
Let me try sbt-assembly 0.8.6 and see. Is it going to have any side effects?

@pankajmi
Author

I tried sbt-assembly version 0.8.6 with sbt 0.12.2; see the time taken below:

[info] Including antlr-2.7.2.jar
[info] Including c3p0-0.9.1.2.jar
[info] Including logback-classic-1.0.1.jar
[info] Including logback-core-1.0.1.jar
[info] Including jsr305-1.3.9.jar
[info] Including guava-13.0.jar
[info] Including h2-1.3.170.jar
[info] Including paranamer-2.4.1.jar
[info] Including finagle-core-6.0.3.jar
[info] Including finagle-http-6.0.3.jar
[info] Including finagle-ostrich4-6.0.3.jar
[info] Including ostrich-9.0.4.jar
[info] Including scala-json-3.0.1.jar
[info] Including util-codec-6.0.4.jar
[info] Including util-collection-6.0.4.jar
[info] Including util-core-6.0.4.jar
[info] Including util-eval-6.0.4.jar
[info] Including util-hashing-6.0.4.jar
[info] Including util-jvm-6.0.4.jar
[info] Including util-logging-6.0.4.jar
[info] Including akka-actor-2.0.5.jar
[info] Including config-0.5.0.jar
[info] Including commons-chain-1.1.jar
[info] Including commons-codec-1.6.jar
[info] Including commons-collections-3.2.1.jar
[info] Including commons-digester-1.8.jar
[info] Including commons-lang-2.6.jar
[info] Including commons-logging-1.1.1.jar
[info] Including commons-net-3.1.jar
[info] Including commons-pool-1.6.jar
[info] Including commons-validator-1.3.1.jar
[info] Including dom4j-1.1.jar
[info] Including netty-3.5.5.Final.jar
[info] Including mysql-connector-java-5.1.6.jar
[info] Including jna-3.3.0.jar
[info] Including lift-json_2.9.2-2.5-SNAPSHOT.jar
[info] Including opencsv-1.8.jar
[info] Including httpclient-4.1.3.jar
[info] Including httpcore-nio-4.2.jar
[info] Including httpcore-4.2.jar
[info] Including httpmime-4.1.3.jar
[info] Including struts-core-1.3.8.jar
[info] Including struts-taglib-1.3.8.jar
[info] Including struts-tiles-1.3.8.jar
[info] Including velocity-tools-2.0.jar
[info] Including velocity-1.7.jar
[info] Including quartz-2.1.3.jar
[info] Including scalap-2.9.2.jar
[info] Including slf4j-api-1.6.4.jar
[info] Including squartz_2.9.2-1.0-SNAPSHOT.jar
[info] Including syslog4j-0.9.30.jar
[info] Including oro-2.0.8.jar
[info] Including postgresql-9.1-901.jdbc4.jar
[info] Including jedis-2.1.0.jar
[info] Including sslext-1.2-0.jar
[info] Including scala-compiler.jar
[info] Including scala-library.jar
[info] Including java-gearman-service-0.6.6.jar
[info] Including gdata-client-1.0.jar
[info] Including gdata-core-1.0.jar
[info] Including gdata-spreadsheet-3.0.jar
[info] Including google-oauth-client-1.10.0-beta.jar
[info] Merging 'License.txt' with strategy 'rename'
[info] Merging 'NOTICE.txt' with strategy 'rename'
[info] Merging 'META-INF/NOTICE.txt' with strategy 'rename'
[info] Merging 'META-INF/NOTICE' with strategy 'rename'
[info] Merging 'META-INF/license' with strategy 'rename'
[info] Merging 'META-INF/LICENSE.txt' with strategy 'rename'
[info] Merging 'LICENSE.txt' with strategy 'rename'
[info] Merging 'META-INF/LICENSE' with strategy 'rename'
[info] Merging 'org/apache/http/annotation/ThreadSafe.class' with strategy 'deduplicate'
[info] Merging 'org/apache/http/annotation/NotThreadSafe.class' with strategy 'deduplicate'
[info] Merging 'META-INF/MANIFEST.MF' with strategy 'discard'
[info] Merging 'org/apache/http/annotation/GuardedBy.class' with strategy 'deduplicate'
[info] Merging 'META-INF/INDEX.LIST' with strategy 'discard'
[info] Merging 'reference.conf' with strategy 'concat'
[info] Merging 'META-INF/services/java.sql.Driver' with strategy 'filterDistinctLines'
[info] Merging 'org/apache/http/annotation/Immutable.class' with strategy 'deduplicate'
[info] Packaging /Users/pankajmittal/workspace/workflow/taskmanager/target/taskmanager-assembly-1.0.jar ...
[info] Done packaging.
[success] Total time: 342 s, completed 13 Feb, 2013 12:52:58 PM

It is still taking time, and the jstack is as follows:
https://gist.github.com/pankajmi/4942861

@eed3si9n
Member

That's less than half the time of your original report (724 s), and the jstack shows that it's actually making the jar instead of hashing.

Creating assembly takes time because we are unzipping all jars and zipping them back again. If you want to avoid that you could take a look at https://github.com/sbt/sbt-onejar.

@pankajmi
Author

@eed3si9n: Thanks for the reply. I actually tried sbt-onejar just to check the size of the jar it generates, and was surprised to see that it generates an even smaller jar, even though it simply puts everything in one jar without unzipping. Can you please tell me what other advantages sbt-assembly offers, apart from producing a single jar after merging? I had assumed sbt-assembly would generate the smaller jar.

@eed3si9n
Member

sbt-assembly is simpler. See sbt-onejar's page for the comparison.

@aradke

aradke commented Apr 4, 2013

I found sbt-assembly takes much longer to build than one-jar, but the resulting jar is much faster to launch. As an example, one of my one-jar applications takes nearly 30 seconds to launch, whereas the sbt-assembly version of the same application starts almost immediately.

I assume the difference is that class conflicts are resolved at build time with sbt-assembly, whereas with one-jar the specialized class loader does its work at run time.

@jrudolph
Member

jrudolph commented Apr 8, 2013

@aradke this is an interesting observation. Here's another speculation why onejar would be slower: You can access entries of a zip file without extracting the whole archive. However, you can not access entries of a zip file which lies inside another zip file without extracting the inner zip file. So, to find a class (at least the first) with the one-jar approach the JVM has to extract all of the dependency jars once to find out where the class lies and then load it.
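
The difference described above can be illustrated with a minimal sketch using only `java.util.zip`. It is not how one-jar's class loader is actually implemented; the object name `NestedZipSketch` and its helpers are hypothetical and exist only for this illustration. A flat jar can be opened with `ZipFile`, which looks an entry up by name, while a jar stored inside another jar is only available as a stream and has to be scanned entry by entry.

```scala
import java.io.{ByteArrayOutputStream, File, FileOutputStream, InputStream}
import java.util.zip.{ZipEntry, ZipFile, ZipInputStream, ZipOutputStream}

// Hypothetical helper object, for illustration only.
object NestedZipSketch {
  // Read a stream to the end into a byte array.
  def readAll(in: InputStream): Array[Byte] = {
    val out = new ByteArrayOutputStream()
    val chunk = new Array[Byte](8192)
    var n = in.read(chunk)
    while (n != -1) { out.write(chunk, 0, n); n = in.read(chunk) }
    out.toByteArray
  }

  // Build a zip archive in memory from (entry name -> bytes) pairs.
  def zipBytes(entries: Seq[(String, Array[Byte])]): Array[Byte] = {
    val buf = new ByteArrayOutputStream()
    val out = new ZipOutputStream(buf)
    entries.foreach { case (name, data) =>
      out.putNextEntry(new ZipEntry(name))
      out.write(data)
      out.closeEntry()
    }
    out.close()
    buf.toByteArray
  }

  // Write bytes to a temporary file so that ZipFile can open it.
  def writeTemp(bytes: Array[Byte]): File = {
    val f = File.createTempFile("sketch", ".jar")
    f.deleteOnExit()
    val os = new FileOutputStream(f)
    try os.write(bytes) finally os.close()
    f
  }

  // Flat jar: ZipFile looks the entry up by name and reads only that entry.
  // (Assumes the entry exists; getEntry returns null otherwise.)
  def readFlat(jar: File, name: String): Array[Byte] = {
    val zf = new ZipFile(jar)
    try {
      val in = zf.getInputStream(zf.getEntry(name))
      try readAll(in) finally in.close()
    } finally zf.close()
  }

  // Nested jar: the inner jar is just a stream inside the outer jar, so its
  // entries are walked one by one until the wanted name turns up.
  def readNested(outer: File, innerJar: String, name: String): Option[Array[Byte]] = {
    val zf = new ZipFile(outer)
    try {
      val zin = new ZipInputStream(zf.getInputStream(zf.getEntry(innerJar)))
      try {
        var e = zin.getNextEntry
        while (e != null && e.getName != name) e = zin.getNextEntry
        if (e == null) None else Some(readAll(zin))
      } finally zin.close()
    } finally zf.close()
  }
}
```

With many dependency jars, the sequential scan in `readNested` would have to be repeated (or its results cached) per inner jar, which is one plausible reading of the speculation above; it has not been measured here.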

Still, 30 seconds sounds very slow.

@eed3si9n
Member

Just released 0.9.0 incorporating the pull req #83. This caches the jar unzipping results and cuts down the assembly time for second run onwards: http://notes.implicit.ly/post/51259892611/sbt-assembly-0-9-0

@gip

gip commented Jul 18, 2013

Using sbt-assembly 0.9.0, sbt 0.12.3 and scala 2.10, I generate a single 76M jar file in 1612 seconds! Is there a way to cut assembly time further?

@eed3si9n
Member

@gip is that for the first run or the second? caching should improve the performance for the second run forward.

@eed3si9n
Member

eed3si9n commented Nov 2, 2013

Just released 0.10.1 with some more performance improvements (#96) that should make the run time more consistent: http://notes.implicit.ly/post/65751699253/sbt-assembly-0-10-1

@joeroot

joeroot commented Aug 4, 2014

I'm seeing this issue on Ubuntu (run within VirtualBox on OS X). Assembly typically takes 10+ mins, or just crashes the whole machine. When I run it from within OS X, however, assembly time is < 30 seconds.

It always hangs whilst/or just after including dependencies. Each include seems to take slightly longer than the one before it. sbt compile works fine.

Running SBT 0.13.5, Scala 2.10.3 and SBT Assembly 0.11.2.

Not a killer, but odd!

@fblundun

@joeroot I have the same issue of sbt assembly taking much longer in a VirtualBox guest (even with access to all cores) than the host. Did you ever find a way around this?

@iangelov

That's probably because disk i/o is more expensive inside a VM (those files need to be read in order to determine their SHA1 hashes)

@joeroot

joeroot commented May 19, 2015

@fblundun: I never found a way around this; instead I run sbt assembly from within the Mac host, which is slightly frustrating.

@iangelov: It could be i/o, but that seems quite extreme. My projects don't have that many files, and I tend to end up crashing, rather than completing very slowly.

@eed3si9n
Member

One idea that was suggested by @jsuereth was that sbt-assembly try reading into the JAR directly without unzipping first. ymmv/pr welcome on this front.

@ghost

ghost commented Jul 31, 2015

If you're using vboxfs (the Vagrant default) there's no way around this. It's slow. You should use nfs instead.

@AnirudhVyas

AnirudhVyas commented Sep 29, 2017

@eed3si9n I have the exact same problem; my project is barebones, spark, akka, scala 2.11.8 - was there an option to disable calculation of SHA1 that was added? Assembly is taking 774+ seconds

import Dependencies._
resolvers ++= Seq(
  Resolver.sonatypeRepo("releases"),
  Resolver.sonatypeRepo("snapshots")
)
lazy val root = (project in file(".")).
  settings(
    inThisBuild(List(
      organization := "me.free",
      scalaVersion := "2.11.8",
      version := "0.1.0-SNAPSHOT"
    )),
    name := "BondAgent",
    libraryDependencies ++= Seq(akkDeps ++ Seq(scalaCheck)
      ++ Seq(shapelessDep)
      ++ sparkDeps
      ++ Seq(redisScala)
      ++ argonautDeps ++ sparkConflictResolverDeps).flatten
  )
assemblyMergeStrategy in assembly := {
  case PathList("org", "aopalliance", xs@_*) => MergeStrategy.first
  case PathList("javax", "inject", xs@_*) => MergeStrategy.first
  case PathList("javax", "servlet", xs@_*) => MergeStrategy.first
  case PathList("javax", "activation", xs@_*) => MergeStrategy.first
  case PathList("org", "xml-apis", x@_*) => MergeStrategy.first
  case PathList("javax", "stax-api", x@_*) => MergeStrategy.first
  case PathList("org", "commons-collections", x@_*) => MergeStrategy.first
  case PathList("org", "jcl-over-slf4j", x@_*) => MergeStrategy.first
  case PathList("org", "apache", xs@_*) => MergeStrategy.first
  case PathList("com", "google", xs@_*) => MergeStrategy.first
  case PathList("com", "esotericsoftware", xs@_*) => MergeStrategy.first
  case PathList("com", "codahale", xs@_*) => MergeStrategy.first
  case PathList("com", "yammer", xs@_*) => MergeStrategy.first
  case "about.html" => MergeStrategy.rename
  case PathList("META-INF", xs@_*) => MergeStrategy.discard
  case "META-INF/ECLIPSEF.RSA" => MergeStrategy.first
  case "META-INF/mailcap" => MergeStrategy.first
  case "META-INF/mimetypes.default" => MergeStrategy.first
  case "plugin.properties" => MergeStrategy.first
  case "log4j.properties" => MergeStrategy.first
  case "overview.html" => MergeStrategy.first
  case x =>
    val oldStrategy = (assemblyMergeStrategy in assembly).value
    oldStrategy(x)
}
assemblyOption in assembly := (assemblyOption in assembly).value//.copy(includeScala = false)
assemblyJarName in assembly := "BondAgent.jar"

This is my assembly sbt -

import Versions._
import sbt._

object Dependencies {
  lazy val scalaCheck: ModuleID = "org.scalacheck" %% "scalacheck" % "1.12.2" % "test"
  lazy val akkaStream: ModuleID = "com.typesafe.akka" %% "akka-stream" % akkaStreamV
  lazy val akkaActor: ModuleID = "com.typesafe.akka" %% "akka-actor" % akkaV
  lazy val akkaSlf4j: ModuleID = "com.typesafe.akka" %% "akka-slf4j" % akkaV
  lazy val akkDeps = Seq(akkaStream, akkaActor, akkaSlf4j)
  lazy val sparkCore: ModuleID = "org.apache.spark" %% "spark-core" % sparkCoreV
  lazy val sparkSql: ModuleID = "org.apache.spark" %% "spark-sql" % sparkCoreV
  lazy val sparkDeps = Seq(sparkCore.exclude("org.eclipse.jetty.orbit", "javax.servlet").
    exclude("org.eclipse.jetty.orbit", "javax.transaction").
    exclude("org.eclipse.jetty.orbit", "javax.mail").
    exclude("org.eclipse.jetty.orbit", "javax.activation").
    exclude("commons-beanutils", "commons-beanutils-core").
    exclude("commons-collections", "commons-collections").
    exclude("commons-collections", "commons-collections").
    exclude("xml-apis", "xml-apis").
    exclude("javax.xml.stream", "stax-api")
    exclude("com.esotericsoftware.minlog", "minlog"), sparkSql.exclude("org.eclipse.jetty.orbit", "javax.servlet").
    exclude("org.eclipse.jetty.orbit", "javax.transaction").
    exclude("org.eclipse.jetty.orbit", "javax.mail").
    exclude("org.eclipse.jetty.orbit", "javax.activation").
    exclude("commons-beanutils", "commons-beanutils-core").
    exclude("commons-collections", "commons-collections").
    exclude("commons-collections", "commons-collections").
    exclude("xml-apis", "xml-apis").
    exclude("javax.xml.stream", "stax-api")
    exclude("com.esotericsoftware.minlog", "minlog"))
  lazy val funcDeps: ModuleID = "org.typelevel" %% "cats-core" % catsCoreV
  lazy val shapelessDep: ModuleID = "com.chuusai" %% "shapeless" % shapelessV
  lazy val hadoopClientDep: ModuleID = "org.apache.hadoop" % "hadoop-client" % hadoopCV
  lazy val parquetClient: ModuleID = "org.apache.parquet" % "parquet" % parquetV
  lazy val redisScala: ModuleID = "com.github.etaty" %% "rediscala" % redisScalaV
  lazy val argonautDeps: Seq[ModuleID] = Seq("io.argonaut" %% "argonaut").map(_ % argonautV)
  lazy val sparkConflictResolverDeps: Seq[ModuleID] = Seq("com.fasterxml.jackson.core" % "jackson-core" % "2.8.7",
  // test deps
    "com.fasterxml.jackson.core" % "jackson-databind" % "2.8.7",
    "com.fasterxml.jackson.core" % "jackson-annotations" % "2.8.7",
    "com.fasterxml.jackson.module" %% "jackson-module-scala" % "2.8.7",
    // With spark, to resolve java.lang.ClassNotFoundException: org.w3c.dom.ElementTraversal (error)
    //"xml-apis" % "xml-apis" % "1.4.01",
    // With spark, to resolve large stack trace on java.lang.ClassNotFoundException: de.unkrig.jdisasm.Disassembler (warning)
    "org.codehaus.janino" % "janino" % "3.0.7")
}

Not sure how to make it fast but it is affecting my development timelines - any help, suggestions, rants are welcome...
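
A side note on the merge strategy quoted above: Scala tries `case` clauses from top to bottom, so the specific `META-INF/...` cases listed after `PathList("META-INF", xs@_*)` can never match and those files are discarded. This is unlikely to explain the slow packaging, and whether those files should be kept at all is a separate question, but if the specific cases were intended to apply, a sketch of the reordering (same sbt-assembly keys as in the quoted build, trimmed to the relevant cases) might look like this:

```scala
// build.sbt fragment (sketch): specific paths first, catch-all for META-INF after them
assemblyMergeStrategy in assembly := {
  case "META-INF/mailcap"            => MergeStrategy.first
  case "META-INF/mimetypes.default"  => MergeStrategy.first
  case PathList("META-INF", xs @ _*) => MergeStrategy.discard
  case x =>
    val oldStrategy = (assemblyMergeStrategy in assembly).value
    oldStrategy(x)
}
```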

@ghost

ghost commented Sep 29, 2017

@AnirudhVyas are you building in a VM?

@AnirudhVyas

no - build on macbook os x

@eed3si9n
Member

@AnirudhVyas Do you have SSD drive on your MacBook?
See https://github.com/sbt/sbt-assembly#other-things, and try turning off any of the optional features and see if it helps your situation.
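
For reference, the optional caching features mentioned in that README section could be switched off with settings along these lines. This is a sketch assuming an sbt-assembly 0.14.x-era build, where `assemblyOption` has `cacheOutput` and `cacheUnzip` fields; other plugin versions may name or expose these differently.

```scala
// build.sbt fragment (sketch, assuming sbt-assembly 0.14.x)
// Skip the output cache, which hashes *.class files to detect changes,
// and skip caching of unzipped dependency jars.
assemblyOption in assembly := (assemblyOption in assembly).value.copy(
  cacheOutput = false,
  cacheUnzip = false
)
```

Turning off `cacheUnzip` means every run unpacks the dependency jars again, so it is worth timing each option separately rather than assuming both help.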

@AnirudhVyas

I have an ssd, let me see thanks. Will report back...

@AnirudhVyas

AnirudhVyas commented Sep 29, 2017

Not much difference. Well, I gained about 100 seconds, if that's a consolation... Do you know where the bottleneck is? It seems to be taking a while in packaging. Honestly, I have been using this awesome plugin for a while, but I never used it with Spark; I used to tar things up, and thought I'd give it a try along with sbt on a fresh project. Something is missing...

@AnirudhVyas

OK, for now I made my app a spark-submit job with provided Spark dependencies (I had been using a single package for a reason: fewer things to deal with). It completes in 180 s, which is very doable I think. But it would be great to know where the bottleneck was/is so that perhaps it could be improved. I am not implying that you haven't tried already; my apologies if I come across as blunt...
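
A sketch of what "provided Spark dependencies" might look like in this build, assuming the `sparkCoreV` value from the `Versions` object quoted earlier. Dependencies in the `provided` configuration are left out of the assembled jar, so far fewer files need to be unpacked and re-zipped.

```scala
// build.sbt fragment (sketch); sparkCoreV is assumed to come from the
// Versions object shown earlier in the thread.
libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core" % sparkCoreV % "provided",
  "org.apache.spark" %% "spark-sql"  % sparkCoreV % "provided"
)
```

The trade-off is that the jar can then only run where Spark is already on the classpath, for example when launched through spark-submit.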

@eed3si9n
Member

I personally haven't run into performance issues so I haven't really looked at current bottlenecks. If there's some specific issue around, feel free to create a sample project that someone could look into.

@EnverOsmanov

EnverOsmanov commented Jan 17, 2018

I've played a bit with sbt-assembly 0.14.7-SNAPSHOT.
Changing this line:
(for(jar <- libsFiltered.par) yield {
to this line:
(for(jar <- libsFiltered) yield {
made packaging nearly 3x faster (from 1544 seconds to 571 seconds).

My fat jar is about 130 MiB. My computer has an SSD. Could anyone test this solution on their own project?
Required steps:

  1. git clone git@github.com:EnverOsmanov/sbt-assembly.git
  2. cd ./sbt-assembly
  3. sbt publishLocal
  4. Set in your project sbt-assembly plugin's version to 0.14.7-SNAPSHOT
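
Step 4 would typically be a one-line change in the project's plugin definition. The file name below is an assumption (some projects use `project/assembly.sbt` instead):

```scala
// project/plugins.sbt (sketch): pick up the locally published snapshot
addSbtPlugin("com.eed3si9n" % "sbt-assembly" % "0.14.7-SNAPSHOT")
```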

@EnverOsmanov

EnverOsmanov commented Jan 18, 2018

@eed3si9n
I'm sorry, I can't share my work project. I've created sample project using some blogpost from the internet. Assembly takes:
libsFiltered.par => 588-634 seconds
libsFiltered => 320-402 seconds

Fat jar size is 96 MiB.

UPD: on HDD it works much faster, less than 60 seconds.

@jrudolph
Member

I guess creating >60k files while unpacking puts major stress on the file system. Doing that non-sequentially by adding par probably makes things rather worse...

@devmanhinton

devmanhinton commented Mar 16, 2018

FYI, I was having this problem, and the resolution was turning off anti-virus software (macOS).

In case anyone else has to feel this pain.

@noahlz

noahlz commented Aug 6, 2019

Hello from 2019. I had this exact problem (yes, an old project on an old version of sbt, but still), and running sbt clean followed by assembly solved it.

YMMV

@karthicks

I'm on sbt 0.12.01 and macOS. Every once in a while, the assembly goal simply hangs on the packaging jar step. Will upgrading my sbt help?

@Sathiyarajan

Any updates on this thread? It's hanging for a long time on Ubuntu 18.04 LTS. @karthicks @noahlz @devmanhinton

@noahlz

noahlz commented Nov 21, 2019

Wowwow found my way back to this issue via google. I have no insight except trying what past me did (clean assembly)
