
Binary classification problem using NeuralNetworkClassifier and cross entropy loss #798

Closed
robinshi123 opened this issue May 6, 2024 · 1 comment
Labels
status: time-out (issue creator has not responded within 2 weeks) · type: question 🙋 (question about the code or its uses)

Comments

@robinshi123

What should we add?

Hello,

We're currently attempting a classification task using NeuralNetworkClassifier with the 'cross_entropy' loss function, as outlined in the tutorial found here: https://qiskit-community.github.io/qiskit-machine-learning/tutorials/02_neural_network_classifier_and_regressor.html.

Our target label has two classes. We noticed that the model outputs values in the range [-1, 1] instead of the typical [0, 1]. Consequently, we have adjusted the target class (ground truth) to also range between [-1, 1] to match the model's predictions.

However, we are unsure about one aspect: when using 'cross_entropy' as the loss function, does the model make an automatic adjustment, given that binary cross entropy typically expects the outputs to be probabilities that sum to 1, with classes labeled 0 and 1? The tutorial only uses the default loss, 'squared_error', so it is unclear to us whether any special treatment is needed when using NeuralNetworkClassifier with the 'cross_entropy' loss function.

Thanks in advance!

@edoaltamura added the type: question 🙋 Question about the code or its uses label on May 7, 2024
@oscar-wallis
Collaborator

Hi @robinshi123, I think some wires have been crossed here and I am gonna do my best to untangle them.

Target labels

As you rightly said, in tutorial 02 there is a model that outputs values in the range [-1, 1] - this is EstimatorQNN. It does this because the default observable is the full-qubit Pauli Z operator. As this is one set of commuting observables, the range of the output is [-1, 1], simply because the statevector probabilities have to add to unity. You can check this with a little linear algebra: start with the 2-qubit case and find the expectation value of the Pauli Z operator for a general statevector $\psi$. Since the range is always [-1, 1], it is simplest for the EstimatorQNN setup to designate [-1, 0) as the -1 class and (0, 1] as the +1 class, which is done using np.sign in the NeuralNetworkClassifier (NNC) on line 147.
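If you prefer a numerical check over the linear algebra, here is a small sketch (not from the tutorial; it only uses qiskit.quantum_info) that samples random 2-qubit states, confirms the ZZ expectation value always lies in [-1, 1], and shows the np.sign class assignment:

import numpy as np
from qiskit.quantum_info import SparsePauliOp, random_statevector

zz = SparsePauliOp.from_list([("ZZ", 1.0)])   # full 2-qubit Pauli Z observable
for _ in range(5):
    psi = random_statevector(4)                          # random 2-qubit statevector
    exp_val = float(np.real(psi.expectation_value(zz)))  # always lies in [-1, 1]
    print(f"<ZZ> = {exp_val:+.3f} -> class {int(np.sign(exp_val))}")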

Cross Entropy Loss

Your confusion about cross entropy loss is understandable, as there is some magic going on under the hood that you need to know about. Cross entropy can technically be used with either EstimatorQNN or SamplerQNN; however, care HAS to be taken, otherwise you'll get some funky behaviour (for a taste of this, just throw loss='cross_entropy' into the estimator_classifier on line 99 of tutorial 02 - it does learn and improve, but in my testing the objective function value in the callback graph heads down to an entropy of -4, so a bit strange). Cross entropy *needs one-hot encoded labels to work predictably and consistently; this can be done automatically by the NNC using the one_hot=True argument. Internally the NNC will then convert input labels to one-hot encoded labels, so long as the labels are unique (so 1, -1 works, or 1, 0, or any two unique numbers) and the number of labels matches the number of outputs of your QNN (you can check the number of outputs of the QNN by running qnn.forward(data, weights)). With the SamplerQNN in tutorial 02 this is really easy: simply change line 167 to the code below (where I have converted the notebook to a Python script and used that as a reference for line numbers).

sampler_classifier = NeuralNetworkClassifier(
    neural_network=sampler_qnn, optimizer=COBYLA(maxiter=30), callback=callback_graph, loss='cross_entropy', one_hot=True
)

and you can just set it training. Keep in mind that the reason you can just add one_hot=True and go is because the interpret function in SamplerQNN (line 159) has already been set to parity (line 151) for you. parity collects the SamplerQNN's outputs and changes them from the default $2^{n}$ outputs (for $n$ qubits) to 2 outputs, therefore matching the number of classes (again, check this with sampler_qnn.forward(data, weights) with and without the interpret argument set). Now, again, you can use cross entropy loss with EstimatorQNN; however, you would need to make the EstimatorQNN's output an array of two numbers. This can be done with a second observable: if you set them using the observables argument in EstimatorQNN, then for each observable added the circuit is re-evaluated with that new set (provided they commute), increasing overhead and outputting another number. For example, with observables=[observable1, observable2] the circuit will be evaluated twice and an array of 2 numbers output, which now matches the number of classes in the tutorial. However, as you rightly mentioned, this does pose a problem, as cross entropy works under the assumption that the input probabilities add to unity. This can be handled with a bit of care: set the observables using SparsePauliOp and replace line 73 with the code below.

from qiskit.quantum_info import SparsePauliOp

# op1 + op2 = II, so the two expectation values always sum to 1
op1 = SparsePauliOp.from_list([("II", 0.5), ("ZZ", 0.5)])
op2 = SparsePauliOp.from_list([("II", 0.5), ("ZZ", -0.5)])
estimator_qnn = EstimatorQNN(circuit=qc, observables=[op1, op2])

Here I have essentially created op1 to be the 2-qubit Pauli Z observable rescaled to lie in [0, 1] instead of [-1, 1], and op2 is essentially 1 - op1. Now the output of estimator_qnn is two probabilities that add to unity (note that one_hot=True and loss='cross_entropy' are passed to the NeuralNetworkClassifier, not to the EstimatorQNN itself - see the sketch below). As we say in my home country - bish bash bosh.
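To be explicit (this is my sketch rather than tutorial code): one_hot and loss are NeuralNetworkClassifier options, so the two-observable QNN above still gets wrapped exactly like the sampler case, assuming COBYLA and callback_graph are defined as in tutorial 02:

estimator_classifier = NeuralNetworkClassifier(
    neural_network=estimator_qnn,
    optimizer=COBYLA(maxiter=30),
    callback=callback_graph,
    loss="cross_entropy",
    one_hot=True,   # labels are one-hot encoded internally by the NNC
)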

*needs is a bit strong, because technically you can have 1 qubit for a 2-class problem, which naturally isn't one-hot encoded, but I thought we'd leave that out.

TLDR

If you want to guarantee cross_entropy works properly, make sure that:

  1. your_classifier = NeuralNetworkClassifier(neural_network=your_qnn, optimizer=optimizer, one_hot=True, loss='cross_entropy').
  2. len(your_qnn.forward(data, weights)) == num_classes.
  3. np.sum(your_qnn.forward(data, weights)) == 1.
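As a quick sanity check for points 2 and 3, something like the sketch below works (here data is a single input sample, weights a weight vector of the right length, and num_classes the number of classes - all assumed to already exist in your script):

import numpy as np

outputs = your_qnn.forward(data, weights)      # shape (1, num_outputs) for a single sample
assert outputs.shape[-1] == num_classes        # one QNN output per class
assert np.isclose(np.sum(outputs), 1.0)        # outputs behave like probabilities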
