Separate databases per library and program. #15

Open
ThisIsMyAltAccount opened this issue Dec 30, 2020 · 9 comments

Comments

@ThisIsMyAltAccount

Depending on how much metadata you have pushed into the database, you can get the wrong results when pulling. For example, when pulling metadata for IDA's QT5Gui.dll, I get metadata for CryptoPP and 7-zip that I uploaded into my database over the past week.

If each library and program has its own database, I could compile QT5 and upload it into the QT5 database. Since I'm decompiling IDA's QT5Gui.dll I could select the QT5 database and pull metadata from it, without the possibility of getting metadata from unrelated programs and libraries.

Maybe even create separate databases per OS/Architecture, maybe even compiler versions:

windows/x86/qt5.sql
windows/x64/qt5.sql
linux/arm64/qt5.sql
linux/x86/gcc-6.4.0/qt5.sql
linux/x86/gcc-6.5.0/qt5.sql

As far as switching between databases, I have no idea how it would work.
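The layout proposed above could be generated by a small helper. This is only a sketch of the naming scheme from the list: the function name `database_name` and its parameters are hypothetical and are not part of Lumen or IDA.

```python
from typing import Optional


def database_name(os_name: str, arch: str, library: str,
                  compiler: Optional[str] = None) -> str:
    """Build a path-like identifier such as 'linux/x86/gcc-6.4.0/qt5.sql'.

    Hypothetical helper: it only mirrors the layout proposed in this issue.
    """
    parts = [os_name, arch]
    if compiler:
        # The compiler level is optional, as in the windows/* examples.
        parts.append(compiler)
    parts.append(f"{library}.sql")
    return "/".join(part.lower() for part in parts)


print(database_name("windows", "x64", "qt5"))
print(database_name("linux", "x86", "qt5", compiler="gcc-6.4.0"))
```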

@naim94a
Owner

naim94a commented Dec 30, 2020

The protocol doesn't specify the file hash when pulling metadata, so you would have to switch databases manually to accomplish that.
You could add a dbname=x86_qt5 parameter to the connection string (connection_info) in order to select a different database. That database needs to have schema.sql applied to it too.
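As a rough illustration of the manual switch, the snippet below sets the dbname parameter in a connection string. It assumes connection_info is a space-separated list of key=value pairs with no quoted values; the host, port, and user shown are made-up placeholders, and `with_dbname` is a hypothetical helper, not something Lumen provides.

```python
def with_dbname(connection_info: str, dbname: str) -> str:
    """Return connection_info with its dbname parameter set to `dbname`.

    Assumption: space-separated key=value pairs, no quoted values.
    """
    params = {}
    for pair in connection_info.split():
        key, _, value = pair.partition("=")
        params[key] = value
    # Add dbname, or overwrite it if the string already had one.
    params["dbname"] = dbname
    return " ".join(f"{key}={value}" for key, value in params.items())


base = "host=localhost port=5432 user=lumina"  # placeholder values
print(with_dbname(base, "x86_qt5"))
```

The resulting string would still have to be written into the server configuration by hand, and the target database must already have schema.sql applied.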

Note that when Lumina identifies functions from multiple files, it's because those files contain the same functions. The whole point of Lumina is to make detection faster while reversing. The optimal solution would be to select a more general name for the function, or maybe to increase IDA's LUMINA_MIN_FUNC_SIZE.

@AGG2017

AGG2017 commented Jan 5, 2021

I think this would not be hard to do with a simple Python plugin. When activated, either right after loading a new IDA database or at any later time, it would read ida.cfg to get the Lumina server information and then communicate with the Lumina server. With custom commands it could list all available databases and give you a menu to select the one you want to use from now on, or to create a new database. You could reactivate the plugin at any time to switch to another Lumina database. I did something similar, replicating all of IDA's Lumina functions but able to work with my custom processor modules, which are not supported by IDA's internal Lumina.

@naim94a
Owner

naim94a commented Jan 5, 2021

I think that it would require hooking a few IDA functions... What if someone hits "pull metadata", how would you select the correct database on the server without modifying the protocol on IDA's side?

@AGG2017

AGG2017 commented Jan 5, 2021

I'm talking only about one private server for just one user. Different users can be distinguished by their current IP and their last selected database. If there is no information for a specific IP, those unknown users would all use one default database, or something like that.
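The per-IP selection with a default fallback could look roughly like this on the server side. It is a minimal in-memory sketch: the class `DatabaseSelector`, its methods, and the database names are hypothetical, and nothing here reflects how Lumen is actually implemented.

```python
DEFAULT_DATABASE = "default"  # hypothetical name for the shared fallback


class DatabaseSelector:
    """Remember the last database each client IP selected."""

    def __init__(self, default: str = DEFAULT_DATABASE) -> None:
        self._default = default
        self._by_ip = {}

    def select(self, client_ip: str, database: str) -> None:
        # Called when a client picks a database, e.g. from a plugin menu.
        self._by_ip[client_ip] = database

    def resolve(self, client_ip: str) -> str:
        # Unknown IPs all share the default database.
        return self._by_ip.get(client_ip, self._default)


selector = DatabaseSelector()
selector.select("192.0.2.10", "x86_qt5")
print(selector.resolve("192.0.2.10"))
print(selector.resolve("192.0.2.99"))
```

Keying on the IP alone means two IDA instances on the same machine always share one selection.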

@naim94a
Owner

naim94a commented Jan 5, 2021

I personally have multiple databases open simultaneously on the same PC... But for private use, I guess a way to lock an IP to a file MD5 could be added to the HTTP API.

@AGG2017

AGG2017 commented Jan 5, 2021

During the hello communication I see there is license information with the user name and email, which is enough to identify the user properly. Leaked licenses can be entered in a database and served by IP.
The Lumina server is a great idea, but at this stage of implementation it is just the beginning and I'm still not interested in the original one. Since it is restricted to a few processor modules and has no options to extend anything, the only option I had was to replicate everything from scratch. Now I can do everything, and adding something like users and databases can be done in minutes. I found that having different databases for the versions of each project is very useful. Even if a function is the same in several of them, the comments can be very specific to each one.

@naim94a
Owner

naim94a commented Jan 5, 2021

The license isn't enough unfortunately... A company using floating licenses would result in identical hello messages from all clients (IP is more unique).
If you're interested in sharing databases with version control, and not sharing function signatures across databases, why not use something like https://github.com/idarlingteam/idarling ? It sounds like they accomplish something similar to what you're describing...

@AGG2017

AGG2017 commented Jan 5, 2021

I know about this project, but it is not what I needed. In fact, I started from the Diaphora project and connected the local database file with a remote connection to a server running a MySQL database. Diaphora keeps the clean assembly of each function and many other unique function parameters (not only hashes) so that it can find not only the best match but also functions that are close enough to the unknown one (in my case, a modified function from the previous version of the same firmware). So now I have access to the exact match for all known functions and the best guess for the rest, with the ability to choose manually if more than one is close enough. For just one user I can collect everything I need. Public servers have many limits on what they can collect, and that restricts the end result a lot.
Thanks for your efforts to make a private Lumina server publicly available. It is a great project. I installed one and everything is working fine, but I needed much more than the original idea behind it. Unfortunately, I do not have enough time to make it universal enough for a public release.
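The "exact match first, best guess otherwise" lookup described here can be sketched as follows. This is not Diaphora's algorithm, which uses many more heuristics: the sketch swaps in a plain MD5 comparison and the standard library's `difflib.SequenceMatcher` as stand-ins, and the function names, threshold, and sample assembly are all made up for illustration.

```python
import hashlib
from difflib import SequenceMatcher


def digest(asm: str) -> str:
    """Hash the cleaned assembly text for exact-match lookups."""
    return hashlib.md5(asm.encode("utf-8")).hexdigest()


def find_matches(unknown_asm: str, known: dict, threshold: float = 0.8) -> list:
    """Return (name, score) pairs, best first.

    `known` maps function names to cleaned assembly text. Exact hash
    matches win; otherwise every candidate at or above `threshold` is
    returned so the user can pick one manually.
    """
    target = digest(unknown_asm)
    exact = [(name, 1.0) for name, asm in known.items() if digest(asm) == target]
    if exact:
        return exact
    scored = [(name, SequenceMatcher(None, unknown_asm, asm).ratio())
              for name, asm in known.items()]
    close = [(name, score) for name, score in scored if score >= threshold]
    return sorted(close, key=lambda item: item[1], reverse=True)


known = {
    "copy_bytes_v1": "push ebp\nmov ebp, esp\nrep movsb\npop ebp\nret",
    "string_length": "xor eax, eax\nrepne scasb\nnot ecx\ndec ecx\nret",
}
# A slightly modified function, as in a newer build of the same firmware.
modified = "push ebp\nmov ebp, esp\nrep movsd\npop ebp\nret"
print(find_matches(modified, known))
```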

@naim94a
Owner

naim94a commented Jan 5, 2021

Using Diaphora seems like a really cool idea! I'd like to see your project if you ever decide to publicly release it.
Hopefully HR will add more features to the protocol that would help resolve these issues (Or we can define our own extensions to the protocol and hook parts of IDA)

Thanks for using Lumen. It's nice to see people use your work :)
