Need better control over worker logging #51

Open · quinnj opened this issue Feb 7, 2018 · 6 comments

Comments

@quinnj (Member) commented Feb 7, 2018

Currently, even if you define your own AbstractLogger and set Base.global_logger(mylogger), worker log lines still arrive with "      From worker X:" prepended.
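A minimal reproduction, assuming two local workers (the prefix comes from the master's redirection of worker stdout, not from the logger itself):

using Distributed, Logging
addprocs(2)

# A custom logger on the master does not affect how worker output arrives:
global_logger(ConsoleLogger(stderr))

# Prints something like:
#       From worker 2:    [ Info: hello
remotecall_wait(() -> @info("hello"), 2)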

My current work-around is doing

using Distributed

# Echo worker output verbatim instead of prefixing each line.
# (`@schedule` is the pre-1.0 spelling of `@async`.)
function Distributed.redirect_worker_output(ident, stream)
    @schedule while !eof(stream)
        println(readline(stream))
    end
end

using MyPkg
MyPkg.run()

So hurray for JuliaLang/julia#265 and all, but we really need better controls here.

@newptcai commented

Yeah, this would be very convenient!

@newptcai commented May 10, 2019

Also, the workaround did not work for me at first.

I found that you need to run

using Distributed
function Distributed.redirect_worker_output(ident, stream)
    @schedule while !eof(stream)  # `@async` on Julia ≥ 1.0
        println(readline(stream))
    end
end

before calling addprocs.

Also, if Julia is started with the -p option, this does not work either.
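Concretely, the order that works (a sketch, using the Julia ≥ 1.0 @async spelling):

using Distributed

# The override must exist before any worker starts, because Distributed
# calls redirect_worker_output as each worker connects during addprocs.
function Distributed.redirect_worker_output(ident, stream)
    @async while !eof(stream)
        println(readline(stream))
    end
end

addprocs(2)  # workers added now produce unprefixed output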

@c42f (Member) commented Jul 16, 2019

I think the ideal fix here would need both Distributed workers and master to be aware of the standard logging framework. I have something like the following in mind:

  • All workers get a logger installed which can direct log records back to the master
  • Workers use @info or related macros to generate log records
  • Log records are intercepted on the workers and sent over the network to the master
  • Log records are deserialized on the master and sent to the current global logger

For this to be reasonably scalable, some element of log filtering will be required on the worker nodes, and it might also be necessary to designate a worker for log aggregation and sinking rather than using the master.
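A rough sketch of the forwarding piece, using a RemoteChannel as the transport (names like ForwardLogger are illustrative, not an existing API):

using Distributed, Logging
addprocs(2)

# Buffer on the master (pid 1) for records forwarded from the workers.
const log_channel = RemoteChannel(() -> Channel{Any}(1_000), 1)

@everywhere workers() begin
    using Distributed, Logging

    # Hypothetical logger that ships every record back to the master.
    struct ForwardLogger <: AbstractLogger
        chan::RemoteChannel
    end
    Logging.min_enabled_level(::ForwardLogger) = Logging.Info  # worker-side filtering
    Logging.shouldlog(::ForwardLogger, args...) = true
    Logging.catch_exceptions(::ForwardLogger) = true
    Logging.handle_message(l::ForwardLogger, level, msg, _mod, group, id, file, line; kw...) =
        put!(l.chan, (myid(), level, string(msg), file, line))

    global_logger(ForwardLogger($log_channel))
end

# Drain forwarded records into whatever logger the master has installed.
@async while true
    pid, level, msg, file, line = take!(log_channel)
    @logmsg level "(worker $pid) $msg" _file=file _line=line
end

@everywhere workers() @info "hello"  # arrives via the master's own logger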

@simonbyrne (Contributor) commented

Note that you don't see this unless you manually set the logger, due to JuliaLang/julia#26798 (that doesn't fix the underlying issue, though).

@vchuravy (Member) commented

One thing I have done recently for distributed logging:

@everywhere begin
  import Dates
  using Logging, LoggingExtras
  const date_format = "HH:MM:SS"

  function dagger_logger(logger)
    logger = MinLevelLogger(logger, Logging.Info)
    logger = TransformerLogger(logger) do log
      merge(log, (; message = "$(Dates.format(Dates.now(), date_format)) ($(myid())) $(log.message)"))
    end
    return logger
  end

  # Set the global logger. `stderr isa IOStream` means it was redirected to a file.
  if !(stderr isa IOStream)
    ConsoleLogger(stderr)
  else
    FileLogger(stderr, always_flush=true)
  end |> dagger_logger |> global_logger
end
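Here MinLevelLogger drops records below Info on each process, and TransformerLogger prepends a timestamp and myid() to every message, so interleaved output stays attributable; the IOStream check picks the FileLogger path only when stderr has been redirected to a file.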

But I agree that @c42f's ideas are probably worth exploring.

@c42f (Member) commented Nov 26, 2020

Yeah, I still think it would be nice to have logging "just work" with Distributed by default, in a similar way to the stdout handling.

However, it's also clear that redirecting logging to the master node will fall over with many nodes or a high log volume. So for serious HPC work some distributed solution also seems necessary, such as dumping to a distributed filesystem, if you have one. @kpamnany wrote an interesting comment on that at CliMA/ClimateMachine.jl#134 (comment)
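A minimal sketch of that per-node variant, assuming a shared mount at a hypothetical /scratch/logs (FileLogger is from LoggingExtras, as in the snippet above):

using Distributed
addprocs(2)

@everywhere begin
    using Logging, LoggingExtras
    mkpath("/scratch/logs")  # hypothetical shared filesystem path
    # Each process (including the master, pid 1) sinks to its own file,
    # so no log traffic crosses the network.
    FileLogger("/scratch/logs/worker-$(myid()).log"; always_flush=true) |> global_logger
end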
