RandomForest segfaults once deserialized. #5060

Open
geektoni opened this issue Jun 8, 2020 · 8 comments

@geektoni
Contributor

geektoni commented Jun 8, 2020

When using the Python interface, serializing and then deserializing a RandomForest object yields an object that segfaults as soon as apply_regression is called on it. The code below reproduces the problem.

#!/usr/bin/env python
# coding: utf-8

import shogun as sg
import numpy as np

# Create random features
X_train = np.random.normal(0, 1, (100, 5))
betas = np.random.normal(0, 1, 5)
y_train = np.dot(X_train, betas)

X_test = np.random.normal(0, 1, (10, 5))
y_test = np.dot(X_test, betas)

features_train = sg.create_features(X_train.T)
features_test = sg.create_features(X_test.T)
labels_train = sg.create_labels(y_train)
labels_test = sg.create_labels(y_test)

# Create the random forest object
mean_rule = sg.create_combination_rule("MeanRule")
rand_forest = sg.create_machine("RandomForest", labels=labels_train, num_bags=5,
                                seed=1, combination_rule=mean_rule)

rand_forest.train(features_train)
labels_predict = rand_forest.apply_regression(features_test)

# Serialize the model
model_file_path = './sample_model.json'
sg.serialize(model_file_path, rand_forest, sg.JsonSerializer())

# Deserialize the model and predict again; the apply_regression call below segfaults
deserialized_rand_forest = sg.as_machine(sg.deserialize(model_file_path, sg.JsonDeserializer()))
labels_predict_deserialized = deserialized_rand_forest.apply_regression(features_test)
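
A minimal sketch, assuming a POSIX system, of how a reproduction like the one above could be run in a child interpreter so that the segfault is reported instead of killing the calling process. `run_isolated` is a hypothetical helper and not part of Shogun; the child in the usage example sends itself SIGSEGV as a stand-in for the real script.

```python
import signal
import subprocess
import sys


def run_isolated(script_source):
    """Run Python source in a child interpreter and report how it exited."""
    result = subprocess.run(
        # faulthandler makes the child print a Python traceback on a fatal signal.
        [sys.executable, "-X", "faulthandler", "-c", script_source],
        capture_output=True,
        text=True,
    )
    # On POSIX, a return code of -N means the child was killed by signal N.
    crashed = result.returncode == -signal.SIGSEGV
    return crashed, result.returncode, result.stderr


if __name__ == "__main__":
    # Stand-in for the reproduction script (assumption): the child kills itself
    # with SIGSEGV. Pass the real script's source here to probe the model.
    fake_crash = "import os, signal; os.kill(os.getpid(), signal.SIGSEGV)"
    crashed, code, stderr = run_isolated(fake_crash)
    print("segfault detected:", crashed)
```

The captured stderr would then show which Python line was executing when the child died, which can help narrow things down before reaching for gdb.
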
@gf712
Member

gf712 commented Jun 8, 2020

Hmmm, I guess this is why the random forest meta example sometimes fails locally... Have you tried to run it in gdb yet?

@geektoni
Contributor Author

geektoni commented Jun 8, 2020

> Hmmm, I guess this is why the random forest meta example sometimes fails locally... Have you tried to run it in gdb yet?

Nope, I still need to check properly.

@geektoni
Contributor Author

geektoni commented Jun 8, 2020

The stacktrace is the following:

#0  0x00007ffff58156b4 in std::__shared_count<(__gnu_cxx::_Lock_policy)2>::__shared_count (this=0x7fffce914c38, __r=...)
    at /usr/include/c++/7/bits/shared_ptr_base.h:849
#1  0x00007ffff57cb20d in std::__shared_ptr<shogun::SGObject, (__gnu_cxx::_Lock_policy)2>::__shared_ptr<shogun::SGObject, void> (
    this=0x7fffce914c30, __r=...) at /usr/include/c++/7/bits/shared_ptr_base.h:1147
#2  0x00007ffff576818b in std::shared_ptr<shogun::SGObject>::shared_ptr<shogun::SGObject, void> (this=0x7fffce914c30, __r=...)
    at /usr/include/c++/7/bits/shared_ptr.h:266
#3  0x00007ffff570f315 in std::enable_shared_from_this<shogun::SGObject>::shared_from_this (this=0x8)
    at /usr/include/c++/7/bits/shared_ptr.h:640
#4  0x00007ffff579fe15 in shogun::SGObject::as<shogun::BinaryTreeMachineNode<shogun::CARTreeNodeData> > (this=0x0)
    at /home/gdetoni/Github/shogun/src/shogun/base/SGObject.h:641
#5  0x00007ffff1f2fc00 in shogun::CARTree::apply_regression (this=0x555556162e90, data=...)
    at /home/gdetoni/Github/shogun/src/shogun/multiclass/tree/CARTree.cpp:142
#6  0x00007ffff12bd75f in shogun::Machine::apply (this=0x555556162e90, data=...)
    at /home/gdetoni/Github/shogun/src/shogun/machine/Machine.cpp:128
#7  0x00007ffff1285fdf in shogun::BaggingMachine::apply_outputs_without_combination(std::shared_ptr<shogun::Features>) [clone ._omp_fn.0]
    () at /home/gdetoni/Github/shogun/src/shogun/machine/BaggingMachine.cpp:114
#8  0x00007fffeac166d5 in gomp_thread_start (xdata=<optimized out>)
    at /home/nwani/m3/conda-bld/compilers_linux-64_1560109574129/work/.build/x86_64-conda_cos6-linux-gnu/src/gcc/libgomp/team.c:123
#9  0x00007ffff7bbd6db in start_thread (arg=0x7fffce915700) at pthread_create.c:463
#10 0x00007ffff78e688f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

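It looks like a null pointer dereference: in frame #4, as<BinaryTreeMachineNode<CARTreeNodeData>> is called with this=0x0 from CARTree::apply_regression.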

@gf712
Member

gf712 commented Jun 8, 2020

From what I can tell, m_root contains a nullptr. That seems to be the default value set in the default constructor. The source also notes there that m_root has not been added to the parameter framework, so it is not being serialised. I am not sure why it is not registered as an SGObject, though; that should work fine.
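
To make the mechanism concrete, here is a toy sketch (plain Python, not Shogun's API; `ToyTree` and its `_registered` list are hypothetical names) of what happens when a member is left out of the set of parameters that gets serialized: the restored object is built by the default constructor, and the unregistered member keeps its empty default.

```python
import json


class ToyTree:
    """Hypothetical stand-in for a tree machine; not Shogun's actual API."""

    def __init__(self):
        self.max_depth = 0
        self.root = None  # default value, analogous to a null m_root
        # Only the names listed here are written out by to_json().
        self._registered = ["max_depth"]

    def train(self):
        self.max_depth = 3
        self.root = {"value": 1.5}  # the learned structure lives here

    def apply(self):
        # The C++ code dereferences the root at this point; here we raise instead.
        if self.root is None:
            raise RuntimeError("root is unset: tree structure was never restored")
        return self.root["value"]

    def to_json(self):
        return json.dumps({name: getattr(self, name) for name in self._registered})

    @classmethod
    def from_json(cls, text):
        obj = cls()  # starts from the default constructor
        for name, value in json.loads(text).items():
            setattr(obj, name, value)
        return obj


tree = ToyTree()
tree.train()
restored = ToyTree.from_json(tree.to_json())
print(restored.max_depth)  # registered, so it survives the round trip
print(restored.root)       # unregistered, so it is back to the default
```

In this toy version `restored.apply()` raises a Python exception; the C++ equivalent has no such guard, which would be consistent with the this=0x0 frame in the stack trace.
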

@geektoni
Contributor Author

geektoni commented Jun 8, 2020

Thank you @gf712 for the insight!

@geektoni
Contributor Author

geektoni commented Jun 9, 2020

If we add m_root as a parameter, we get the following messages when we try to deserialize:

[06/09/20 10:45:49 error] Could not create 'BinaryTreeMachineNode' class
[06/09/20 10:45:49 warning] Error while deserializeing RandomCARTree: ShogunException: Could not create 'BinaryTreeMachineNode' class

This is probably caused by the limitation described in the following comment from the source code:

// the problem is "CARTree"/"RandomCARTree" can't be cloned because
// they inherit from TreeMachine which is templated and can't be
// created in class_list
// SG_ADD((std::shared_ptr<SGObject>*)&m_root,"m_root", "tree structure");

I am not very familiar with the serialization framework. Is there a way to make this work again?

@gf712
Member

gf712 commented Jun 9, 2020

Hmm, I see: it needs the template parameter. @vigsterkr I guess we need to replace TreeMachineNode with a non-templated version?
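
As a rough illustration of why a name-based factory struggles with templated classes, here is a toy sketch in plain Python. `CLASS_LIST`, `create`, `PlainNode` and `make_templated_node` are hypothetical names and this is only an analogy, not how Shogun's class_list is implemented: a class that exists only once a type parameter is chosen cannot be built from its bare name.

```python
class PlainNode:
    """Non-templated class: a single concrete type behind a single name."""

    def __init__(self):
        self.children = []


def make_templated_node(data_type):
    """Toy analogue of a class template: yields a distinct class per data type."""

    class TemplatedNode:
        def __init__(self):
            self.data = data_type()
            self.children = []

    return TemplatedNode


# Toy "class list": maps a class name to a zero-argument factory.
# Only classes that can be built from the name alone can be listed.
CLASS_LIST = {"PlainNode": PlainNode}


def create(name):
    """Create an object from its class name, as a deserializer would."""
    factory = CLASS_LIST.get(name)
    if factory is None:
        raise LookupError(f"Could not create '{name}' class")
    return factory()


# A concrete instantiation exists once the data type is known...
FloatNode = make_templated_node(float)
float_node = FloatNode()

node = create("PlainNode")  # works: the name fully identifies the type

try:
    create("TemplatedNode")  # ...but the bare name does not say which data type
except LookupError as err:
    print(err)
```

A non-templated node type sidesteps the problem in this toy model because one name maps to exactly one constructible class.
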

@hasini93

hasini93 commented Jan 6, 2021

I also got a segfault when deserializing a random forest from C++ (both the develop version and release 6.1.4). The same problem was reported in earlier issues, the latter of which discusses a workaround:

#3481
#4242

Is there any update to this issue?
