Suggestion to improve pre-processing time 40x #104

Open
maciejwypych opened this issue Dec 14, 2022 · 2 comments

maciejwypych commented Dec 14, 2022

Hi,
I've noticed that displaying the Revit file info is a massive bottleneck for large lists of files. I ran a script on 5200 families, and displaying the file information alone took over 80 minutes. That time goes purely into printing individual lines of information that has already been processed.

This bottleneck is caused by the ShowSupportedRevitFileInfo method.

def ShowSupportedRevitFileInfo(supportedRevitFileInfo, output):
    output()
    if supportedRevitFileInfo.IsCloudModel():
        revitCloudModelInfo = supportedRevitFileInfo.GetRevitCloudModelInfo()
        projectGuidText = revitCloudModelInfo.GetProjectGuid().ToString()
        modelGuidText = revitCloudModelInfo.GetModelGuid().ToString()
        output("\t" + "CLOUD MODEL")
        output("\t" + "Project ID: " + projectGuidText)
        output("\t" + "Model ID: " + modelGuidText)
        revitVersionText = supportedRevitFileInfo.TryGetRevitVersionText()
        revitVersionText = revitVersionText if not str.IsNullOrWhiteSpace(revitVersionText) else "NOT SPECIFIED!"
        output("\t" + "Revit version: " + revitVersionText)
    else:
        revitFileInfo = supportedRevitFileInfo.GetRevitFileInfo()
        revitFilePath = revitFileInfo.GetFullPath()
        fileExists = revitFileInfo.Exists()
        fileSize = revitFileInfo.GetFileSize()
        fileSizeText = str.Format("{0:0.00}MB", fileSize / (1024.0 * 1024.0)) if fileSize is not None else "<UNKNOWN>"
        output("\t" + revitFilePath)
        output("\t" + "File exists: " + ("YES" if fileExists else "NO"))
        output("\t" + "File size: " + fileSizeText)
        if fileExists:
            revitVersionText = revitFileInfo.TryGetRevitVersionText()
            revitVersionText = revitVersionText if not str.IsNullOrWhiteSpace(revitVersionText) else "NOT DETECTED!"
            output("\t" + "Revit version: " + revitVersionText)
    return

My guess is that every line goes through its own call to the output method, and each call creates a new timestamp, which slows things down dramatically.
Removing the calls to output reduces the pre-processing time for the same 5200 families from 80 minutes to 10 seconds.
The logging should presumably stay, though, so I'd suggest changing this method to return a multi-line string instead, which a slightly modified output method can then process.

def ShowSupportedRevitFileInfo(supportedRevitFileInfo):
    message = "\n"
    if supportedRevitFileInfo.IsCloudModel():
        revitCloudModelInfo = supportedRevitFileInfo.GetRevitCloudModelInfo()
        projectGuidText = revitCloudModelInfo.GetProjectGuid().ToString()
        modelGuidText = revitCloudModelInfo.GetModelGuid().ToString()
        message += ("\t" + "CLOUD MODEL\n")
        message += ("\t" + "Project ID: " + projectGuidText + "\n")
        message += ("\t" + "Model ID: " + modelGuidText + "\n")
        revitVersionText = supportedRevitFileInfo.TryGetRevitVersionText()
        revitVersionText = revitVersionText if not str.IsNullOrWhiteSpace(revitVersionText) else "NOT SPECIFIED!"
        message += ("\t" + "Revit version: " + revitVersionText + "\n")
    else:
        revitFileInfo = supportedRevitFileInfo.GetRevitFileInfo()
        revitFilePath = revitFileInfo.GetFullPath()
        fileExists = revitFileInfo.Exists()
        fileSize = revitFileInfo.GetFileSize()
        fileSizeText = str.Format("{0:0.00}MB", fileSize / (1024.0 * 1024.0)) if fileSize is not None else "<UNKNOWN>"
        message += ("\t" + revitFilePath + "\n")
        message += ("\t" + "File exists: " + ("YES" if fileExists else "NO") + "\n")
        message += ("\t" + "File size: " + fileSizeText + "\n")
        if fileExists:
            revitVersionText = revitFileInfo.TryGetRevitVersionText()
            revitVersionText = revitVersionText if not str.IsNullOrWhiteSpace(revitVersionText) else "NOT DETECTED!"
            message += ("\t" + "Revit version: " + revitVersionText + "\n")
    return message
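
As a rough, standalone illustration of the same idea (not the project's code), the sketch below builds the info block for one file as a single string. It collects the lines in a list and joins them once, which is an alternative to repeated `+=`. The function name `format_file_info` and its parameters are hypothetical stand-ins for the values read from the Revit file info object.

```python
def format_file_info(path, exists, size_bytes, version_text=None):
    """Return the indented info lines for one file as a single string.

    All parameters are hypothetical stand-ins for the values that the
    original method reads from the Revit file info object.
    """
    if size_bytes is not None:
        size_text = "{0:.2f}MB".format(size_bytes / (1024.0 * 1024.0))
    else:
        size_text = "<UNKNOWN>"

    lines = [
        "",  # leading blank line, like the initial "\n" in the version above
        "\t" + path,
        "\t" + "File exists: " + ("YES" if exists else "NO"),
        "\t" + "File size: " + size_text,
    ]
    if exists:
        # Treat None, empty and whitespace-only text as "not detected".
        has_version = version_text is not None and version_text.strip() != ""
        lines.append("\t" + "Revit version: " + (version_text if has_version else "NOT DETECTED!"))

    # Join once and terminate the block with a newline.
    return "\n".join(lines) + "\n"


example = format_file_info("C:/families/door.rfa", True, 3 * 1024 * 1024, "Autodesk Revit 2022")
missing = format_file_info("C:/families/gone.rfa", False, None)
```

The caller can then concatenate these blocks and pass the result to the output method in one call.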

And a sample implementation:

        message = "\n"
        if nonExistentCount > 0:
            message += ""
            message += "WARNING: The following Revit Files do not exist (" + str(nonExistentCount) + "):"
            for supportedRevitFileInfo in nonExistentRevitFileList:
                message += batch_rvt_monitor_util.ShowSupportedRevitFileInfo(supportedRevitFileInfo)

        if unsupportedCount > 0:
            message += "\n"
            message += "WARNING: The following Revit Files are of an unsupported version (" + str(unsupportedCount) + "):"
            for supportedRevitFileInfo in unsupportedRevitFileList:
                message += batch_rvt_monitor_util.ShowSupportedRevitFileInfo(supportedRevitFileInfo)

        if unsupportedRevitFilePathCount > 0:
            message += "\n"
            message += "WARNING: The following Revit Files have an unsupported file path (" + str(unsupportedRevitFilePathCount) + "):"
            for supportedRevitFileInfo in unsupportedRevitFilePathRevitFileList:
                message += batch_rvt_monitor_util.ShowSupportedRevitFileInfo(supportedRevitFileInfo)
        Output(message)

The Output method then needs to accept a multi-line string:

def Output(m="", msgId=""):
    timestamp = time_util.GetDateTimeNow().ToString("HH:mm:ss")
    message = ""
    for line in m.split("\n"):
        message += timestamp + " : " + (("[" + str(msgId) + "]" + " ") if msgId != "" else "") + line + "\n"
    if SHOW_OUTPUT:
        ORIGINAL_STDOUT.write(message)
    if logging_util.LOG_FILE[0] is not None:
        logging_util.LOG_FILE[0].WriteMessage({ "msgId" : msgId, "message" : m })
    return
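
For anyone who wants to try the multi-line behaviour outside of the batch processor, here is a minimal standalone sketch under some assumptions: the `stream` and `now` parameters are hypothetical additions for testing, the standard `datetime` module stands in for `time_util`, and the log-file write is left out. It uses `splitlines()` because `split("\n")` returns an extra empty string after a trailing newline, which would print one more timestamp-only line.

```python
import io
import sys
from datetime import datetime


def output(message="", msg_id="", stream=None, now=None):
    """Write `message` to `stream`, prefixing every line with one shared timestamp.

    `stream` and `now` are hypothetical parameters that make the sketch testable.
    """
    stream = sys.stdout if stream is None else stream
    timestamp = (now or datetime.now()).strftime("%H:%M:%S")
    prefix = timestamp + " : " + ("[" + str(msg_id) + "] " if msg_id != "" else "")

    # splitlines() ignores the final newline; an empty message still yields one line.
    lines = message.splitlines() or [""]

    # One write per call, however many lines the message has.
    stream.write("".join(prefix + line + "\n" for line in lines))


buffer = io.StringIO()
output("\n\tfirst\n\tsecond\n", stream=buffer, now=datetime(2022, 12, 14, 9, 30, 0))
print(buffer.getvalue(), end="")
```

Every line of a single call shares the same timestamp, which matches the behaviour of the version above.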

This adds about 2 minutes to the processing compared with removing the output calls entirely, but it keeps the log, so it is still a 40x improvement (80 minutes down to about 2).
The only catch is that the timestamp will be the same for the whole list of files, but since the list is generated within milliseconds, the exact output time doesn't matter.
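
To show where the saving comes from, this small sketch compares how many output calls each approach makes for the same set of lines. It says nothing about actual timings; `CountingOutput`, `info_lines` and the file count of 1000 are all made up for illustration.

```python
class CountingOutput:
    """Hypothetical stand-in for the output function that only counts work done."""

    def __init__(self):
        self.calls = 0
        self.line_count = 0

    def __call__(self, text=""):
        self.calls += 1
        self.line_count += len(text.split("\n"))


def info_lines(index):
    # Four lines per file: a blank line plus three made-up info lines.
    return ["", "\tfile_%d.rfa" % index, "\tFile exists: YES", "\tFile size: 1.00MB"]


FILE_COUNT = 1000  # arbitrary illustrative number

# Per-line approach: one output call for every line of every file.
per_line = CountingOutput()
for i in range(FILE_COUNT):
    for line in info_lines(i):
        per_line(line)

# Batched approach: build one multi-line string, then a single output call.
batched = CountingOutput()
batched("\n".join(line for i in range(FILE_COUNT) for line in info_lines(i)))

print(per_line.calls, batched.calls)
```

Both variants emit the same number of lines; only the number of calls differs.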

I've also changed READ_OUTPUT_INTERVAL_IN_MS in BatchRvtGuiForm.cs to 10 ms so that the text shows up a bit quicker.

You can check out this implementation in my fork - https://github.com/maciejwypych/RevitBatchProcessor

maciejwypych changed the title from "Suggestion to improve pre-processing time" to "Suggestion to improve pre-processing time 40x" on Dec 14, 2022
@petersmithfromengland

Hi @maciejwypych,

Thanks very much for this. I'm verrrrry late on the response, so I apologise for that.

I have merged your fork onto a v1.9.1-beta branch. I'll do some testing from my side just to make sure it's all good and then push it through.

Cheers, Pete


maciejwypych commented Apr 3, 2023 via email
