Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Number of syntax trees in project sometimes zero #138

Open
duncanawoods opened this issue May 17, 2020 · 8 comments
Open

Number of syntax trees in project sometimes zero #138

duncanawoods opened this issue May 17, 2020 · 8 comments
Labels
⚠ Bug Something isn't working as expected

Comments

@duncanawoods
Copy link

I am often and unpredictably finding no syntax trees in a project. The solution has 10 projects and on around half the executions, 2 or 3 of the projects (different each time) will return no classes.

public static async Task<string> ParseInfo(string solutionFile)
{
    var manager = new AnalyzerManager(solutionFile);
    var workspace = manager.GetWorkspace();

    foreach (var project in workspace
        .CurrentSolution
        .Projects)
    {
        var compilation = await project.GetCompilationAsync();
        var count = 0;

        foreach (var tree in compilation.SyntaxTrees)
        {
            count +=
                tree.GetRoot()
                    .DescendantNodes()
                    .OfType<ClassDeclarationSyntax>()
                    .Count();
        }

        Console.WriteLine($"Project: {project.Name} Class Count: {count}");
    }
}

An example sequence of analyses might find this many total classes depending on which projects failed to produce any syntax trees:

Run Class count
0 854
1 854
2 854
3 854
4 727
5 831
6 693
7 854
8 854
9 853
10 840
11 854
12 854
13 854
14 854
15 831
16 854
17 854
18 854
19 854
  • BuildAlyzer 2.6.0
  • Ubuntu 18.04
  • .NET Core SDK (3.1.202)
@duncanawoods
Copy link
Author

duncanawoods commented May 24, 2020

Ok looks like the same thread safety issue as #134. It's fixed by disabling parallel builds.

I have identified one issue so far. AnalyzerResults is updated from multiple threads and has no concurrency support. I have updated it from Dictionary to ConcurrentDictionary but this has not solved all issues.

@duncanawoods
Copy link
Author

duncanawoods commented Jun 12, 2020

@daveaglick

The cause is that AnalyzerResult.ProcessCscCommandLine is never called because EventProcessor.MessageRaised is never called.

I believe there is a race condition between the AnonymousPipeLoggerServer reading from the buffer and it's CancellationTokenSource being triggered when the build process exits. I think that the buffer reading is cancelled before it has completed.

It appears that it isn't actually necessary to trigger CancellationTokenSource at all. The buffer reading will terminate on it's own when the pipe is closed so the Process.Exited event can just be removed.

private IAnalyzerResults BuildTargets(BuildEnvironment buildEnvironment, string targetFramework, string[] targetsToBuild, AnalyzerResults results)
{
    using (CancellationTokenSource cancellation = new CancellationTokenSource())
    {
        using (AnonymousPipeLoggerServer pipeLogger = new AnonymousPipeLoggerServer(cancellation.Token))
        {
            using (EventProcessor eventProcessor = new EventProcessor(Manager, this, BuildLoggers, pipeLogger, results != null))
            {
                // Run MSBuild
                int exitCode;
                string fileName = GetCommand(buildEnvironment, targetFramework, targetsToBuild, pipeLogger.GetClientHandle(), out string arguments);
                using (ProcessRunner processRunner = new ProcessRunner(fileName, arguments, Path.GetDirectoryName(ProjectFile.Path), GetEffectiveEnvironmentVariables(buildEnvironment), Manager.LoggerFactory))
                {
                    processRunner.Exited = () => cancellation.Cancel(); // <--- RACE CONDITION
                    processRunner.Start();
                    pipeLogger.ReadAll();                               // <----
                    processRunner.WaitForExit();
                    exitCode = processRunner.ExitCode;
                }

                // Collect the results
                results?.Add(eventProcessor.Results, exitCode == 0 && eventProcessor.OverallSuccess);
            }
        }
    }
    return results;
} 

When I comment out the cancellation, I now see reliable results.

@daveaglick daveaglick added the ⚠ Bug Something isn't working as expected label Jun 12, 2020
@daveaglick
Copy link
Collaborator

When I comment out the cancellation, I now see reliable results.

Awesome! Thanks for tracking this down to a root cause - when I initially tried to reproduce I couldn't seem to get the same results so I put it on hold while working on other stuff and intended to come back later. Glad you beat me to it :)

I'll try my best to apply this as a fix and get out a new version this weekend or early next week.

@duncanawoods
Copy link
Author

duncanawoods commented Jun 12, 2020

Thanks @daveaglick

i) also see #141 for a fix for that one.

ii) Changing AnalyzerResults to use a ConcurrentDictionary might fix the missing projects seen in #134.

I initially tried to reproduce I couldn't seem to get the same results

iii)
To help replicate this, I generated a liltte script to create big solutions with lots of tiny projects and I was able to see lots of failures. I only need a dozen or so projects:

var sln = $"bigtest";

Console.WriteLine($"dotnet new sln -n {sln}");

for (int i = 0; i < 50; i++)
{    
    Console.WriteLine($"dotnet new console -n Proj{i}");
    Console.WriteLine($"dotnet sln ./{sln}.sln add ./Proj{i}/Proj{i}.csproj");
}

@daveaglick
Copy link
Collaborator

I'm going to add a test solution using your script to the test suite and it will be awesome :)

@daveaglick
Copy link
Collaborator

This test is now consistently passing for me with a 50 project solution:
image

Hopefully that's an indication this is resolved. Thanks again for the fantastic work narrowing down the multiple concurrency issues here @duncanawoods.

@daveaglick
Copy link
Collaborator

Buildalyzer 3.0.1 is now rolling out to NuGet and should (hopefully) resolve this issue. When you get a chance can you please verify that this problem is resolved? Thanks!

@duncanawoods
Copy link
Author

duncanawoods commented Jul 10, 2020

Hi @daveaglick

I have had a chance to try it out. When noodling around, I was seeing it around 5% of the time. I have also seen some cases where a typically 15s operation took 10 minutes to complete.

I created a batch test with better data collection to share some hard stats with you but this didn't even appear (!) and I saw some different errors.

Test setup

  • a process that ran an analysis of a 14 project solution with 863 classes
  • Ubuntu 18.04
  • run 20 processes at a time
  • repeat 20 times
Result Number %
Success 385 96.25%
Could not find build environment 15 3.75%
Missing classes 0 0

Observations

  • I don't know what environmental conditions might increase/decrease the likelihood of this specific issue
  • Even though they are separate processes, the tests within a batch do not seem entirely independent. The "Could not find build environment" appeared in groups of 5 or so.
  • Trying to do 100 process batches led to lots of errors about "closed pipes". Must be some system resource thing but I am surprised on a 12C/24T, 64GB RAM

Conclusions

I think that while we have improved things, there are still gremlins in the concurrency code and given their rarity we are going to have a terrible time replicating let alone fixing them.

In this sort of situation my approach would be to take a clean-slate approach the concurrency design. Right now pretty much every form of parallelism is used: sub-processes, threads, async/await, TPL loops and concurrent data-structures. One concern is that peppering code with concurrent data-structures is insufficient for operations that touch multiple data-structures. What needs to happen is ensure that entire operations are isolated from each other instead of just fine-grained access to specific data. To make concurrency comprehensible and checkable for errors, I would aim to have the majority of code unaware of concurrency concerns and push all the orchestration up to the top level with crystal clear guarantees about who is accessing what, when things are started/terminated.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
⚠ Bug Something isn't working as expected
Projects
None yet
Development

No branches or pull requests

2 participants