about the details of the nettack experiment #12

Open
Gmrylbx opened this issue Aug 25, 2021 · 11 comments
@Gmrylbx

Gmrylbx commented Aug 25, 2021

Hi! Thanks for sharing the code. I'd like to ask about the details of the Nettack experiment.
I noticed in the paper that you selected only 10% of the target nodes as the test set for the Nettack experiment on the Pubmed dataset. When I load the Pubmed dataset from the DeepRobust repository and set the parameter 'ptb_rate'=1.0, there are 186 targeted nodes. Do I then just need to sample 10% of them, i.e. 18 nodes, as my test set?

@ChandlerBang
Owner

Hi,

Thanks for your interest in our work. The 186 targeted nodes are already sampled, so you don't need to sample from them again. According to our paper,

> The nodes in test set with degree larger than 10 are set as target nodes. For Pubmed dataset, we only sample 10% of them.

This means we first obtain the test nodes with degree larger than 10 (there are 1860 such nodes), and then we sample 10% of them. Hence, the number of target nodes is 186.
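The two-step selection described above (filter test nodes by degree, then subsample) can be sketched as follows. This is only an illustration, not the authors' code: the function name `sample_target_nodes` and its parameters `min_degree`, `fraction`, and `seed` are hypothetical, and `idx_test` is assumed to be an array of test-node indices.

```python
import numpy as np
import scipy.sparse as sp

def sample_target_nodes(adj, idx_test, min_degree=10, fraction=0.1, seed=0):
    """Hypothetical helper: keep test nodes with degree > min_degree,
    then draw a random fraction of them without replacement."""
    # Node degrees from the (assumed symmetric, binary) adjacency matrix.
    degrees = np.asarray(adj.sum(axis=0)).ravel()
    idx_test = np.asarray(idx_test)
    # Step 1: test nodes whose degree is strictly larger than min_degree.
    candidates = idx_test[degrees[idx_test] > min_degree]
    # Step 2: sample the requested fraction of those candidates.
    n_keep = int(len(candidates) * fraction)
    rng = np.random.default_rng(seed)
    targets = rng.choice(candidates, size=n_keep, replace=False)
    return candidates, targets
```

With 1860 candidates and `fraction=0.1`, step 2 keeps 186 nodes, which matches the count discussed above. The exact nodes chosen depend on the random seed, so this sketch will not reproduce the released target list.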

@Gmrylbx
Author

Gmrylbx commented Aug 26, 2021

thanks a lot

@Gmrylbx
Author

Gmrylbx commented Aug 26, 2021

Hi!

Could you release the code that generated the Nettack attack? I want to use all the target nodes (the nodes in the test set with degree larger than 10) as the test set.

Thanks a lot!

@ChandlerBang
Owner

Basically, we just attack those target nodes sequentially. I modified the example code as follows:

from deeprobust.graph.defense import GCN
from deeprobust.graph.targeted_attack import Nettack
from deeprobust.graph.utils import *
from deeprobust.graph.data import Dataset

# adj, features, labels, surrogate, device and select_nodes() are assumed
# to be defined elsewhere in the script, as in the Nettack example code.
def attack_all():
    degrees = adj.sum(0).A1  # degrees in the clean graph
    node_list = select_nodes()  # obtain the nodes to be attacked
    num = len(node_list)
    print('=== Attacking %s nodes sequentially ===' % num)
    modified_adj = adj
    for target_node in node_list:
        # the perturbation budget equals the target node's clean degree
        n_perturbations = int(degrees[target_node])
        model = Nettack(surrogate, nnodes=modified_adj.shape[0], attack_structure=True, attack_features=False, device=device)
        model = model.to(device)
        model.attack(features, modified_adj, labels, target_node, n_perturbations, verbose=False)
        # carry the perturbed graph over to the next target node
        modified_adj = model.modified_adj
    return modified_adj

Feel free to let me know if you have further questions.

@Gmrylbx
Author

Gmrylbx commented Aug 27, 2021

thanks a lot!

@Gmrylbx
Author

Gmrylbx commented Sep 22, 2021

Hi!

I ran into a problem when using Nettack to attack the Polblogs dataset with n_perturbations=1. The code is as follows:

modified_adj = adj
print('=== [Poisoning] Attacking %s nodes respectively ===' % len(node_list))
for target_node in tqdm(node_list):
    model = Nettack(surrogate, nnodes=modified_adj.shape[0], attack_structure=True, attack_features=False, device=device)
    model = model.to(device)
    model.attack(features, modified_adj, labels, target_node, int(n_perturbations), verbose=False)
    modified_adj = model.modified_adj
    print(modified_adj.nnz)
modified_adj = modified_adj.tocsr()

The original graph has 33430 nonzero elements (nnz), but after sequentially attacking 443 nodes with n_perturbations=1, modified_adj has only 33364 nnz. Is that correct? Why does modified_adj have fewer edges than the original adj?

@ChandlerBang
Owner

Hi, I would suggest you check the changes made on the adjacency matrix for each iteration. It could happen that the attacker deleted some edges.
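One way to follow this suggestion is to diff the adjacency matrix before and after each attack step. The sketch below is an assumption-laden illustration rather than part of DeepRobust: `edge_changes` is a hypothetical helper, and it assumes both matrices are symmetric with 0/1 entries.

```python
import numpy as np
import scipy.sparse as sp

def edge_changes(adj_before, adj_after):
    """Hypothetical helper: list undirected edges added and deleted
    between two symmetric 0/1 adjacency matrices, as (i, j) with i < j."""
    # Keep only the strict upper triangle so each edge is counted once.
    before = sp.triu(sp.csr_matrix(adj_before), k=1).tocsr()
    after = sp.triu(sp.csr_matrix(adj_after), k=1).tocsr()
    # +1 entries are edges that appeared, -1 entries are edges that vanished.
    diff = (after - before).tocoo()
    added = [(int(i), int(j)) for i, j, v in zip(diff.row, diff.col, diff.data) if v > 0]
    deleted = [(int(i), int(j)) for i, j, v in zip(diff.row, diff.col, diff.data) if v < 0]
    return added, deleted
```

Calling `edge_changes(previous_adj, model.modified_adj)` inside the loop would show whether a given step inserted or removed an edge. If the matrix stores every undirected edge twice, a drop from 33430 to 33364 nnz corresponds to a net loss of 33 edges, which is possible when deletions outnumber insertions.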

@Gmrylbx
Author

Gmrylbx commented Nov 4, 2021

When you compare the defense performance of different models under Nettack, do these models use the same dataset?

I mean, I use GCN as the surrogate model to attack the graph structure, and then train the other models on this modified graph. Is this correct?

I would have thought that each model should use itself as the surrogate model when testing defense performance. Is that the case?

@yx606

yx606 commented May 16, 2022

Hi! Thanks for sharing the code. I'd like to ask about the details of the datasets.
For the Citeseer dataset, the LCC has 3668 edges in your paper, but 3757 edges in some other papers.
Why is the number different?

@ChandlerBang
Owner

Sorry for the late reply (I just noticed this message). I am not sure why the difference arises, but according to my experiments the number should be 3668. As I recall, it was also 3668 for Citeseer when I checked the original Nettack code.
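For anyone who wants to compare the two numbers, the edge count of the largest connected component (LCC) can be recomputed directly. The sketch below is not taken from either codebase: `lcc_edge_count` is a hypothetical helper, and its preprocessing (symmetrize, drop self-loops, binarize) is an assumed convention. Different conventions may give different counts, but this thread does not establish the cause of the 3668 versus 3757 gap.

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.csgraph import connected_components

def lcc_edge_count(adj):
    """Hypothetical helper: return (num_nodes, num_undirected_edges)
    of the largest connected component under an assumed convention."""
    adj = sp.csr_matrix(adj)
    adj = adj + adj.T                                # symmetrize
    adj = (adj - sp.diags(adj.diagonal())).tocsr()   # drop self-loops
    adj.eliminate_zeros()
    adj.data[:] = 1                                  # binarize edge weights
    # Label every node with the id of its connected component.
    _, labels = connected_components(adj, directed=False)
    lcc_nodes = np.flatnonzero(labels == np.bincount(labels).argmax())
    sub = adj[lcc_nodes][:, lcc_nodes]
    # Each undirected edge is stored twice in the symmetric matrix.
    return len(lcc_nodes), sub.nnz // 2
```

Running this on the adjacency matrix loaded by each codebase would show whether the difference already exists in the raw data or appears during preprocessing.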

@ChandlerBang
Owner

> When you compare the defense performance of different models under Nettack, do these models use the same dataset?
>
> I mean, I use GCN as the surrogate model to attack the graph structure, and then train the other models on this modified graph. Is this correct?
>
> I would have thought that each model should use itself as the surrogate model when testing defense performance. Is that the case?

Sorry for the late reply (I just noticed this message). I simply used GCN as the surrogate model and generated the attacked graphs. All (defense) models used the same attacked graphs.
