
Differentiating CVQNN layers with Piquasso and TensorFlow

In Piquasso, one can easily create Continuous-Variable Quantum Neural Network (CVQNN) circuits using the Piquasso CVQNN module. In these circuits, the layers are constructed following the architecture of Continuous-variable quantum neural networks (Killoran et al.).

[8]:
import piquasso as pq

d = 2  # Number of qumodes

layer_count = 10  # Number of CVQNN layers

# Generating random weights
weights = pq.cvqnn.generate_random_cvqnn_weights(layer_count=layer_count, d=d)

# Creating CVQNN layers as a Piquasso subprogram
cvqnn_layers = pq.cvqnn.create_layers(weights)

Piquasso automatically sets up a subprogram containing the instructions of the desired CVQNN layers. Now we can embed this subprogram in any Piquasso program. Let's choose the input state to be a pure displaced vacuum state.

[9]:
# The program definition
with pq.Program() as program:
    pq.Q() | pq.Vacuum()

    for i in range(d):
        pq.Q(i) | pq.Displacement(r=0.1)

    pq.Q() | cvqnn_layers

Now, we need to choose the simulator which executes the instructions. Since a CVQNN layer contains non-linear gates, the simulation has to be performed in Fock space. Moreover, since our initial state is pure, we can use PureFockSimulator.

[10]:
cutoff = 5

simulator = pq.PureFockSimulator(d, pq.Config(cutoff=cutoff))

final_state = simulator.execute(program).state

After obtaining the state, we can calculate several quantities, e.g., the expectation value of the position operator on mode 0.

[11]:
print("Mean position on mode 0:")
final_state.mean_position(mode=0)
Mean position on mode 0:
[11]:
0.08810452589019563
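We can also run a quick sanity check on how much probability mass the truncated state retains at this cutoff. The snippet below is a minimal sketch, assuming the NumPy-backed PureFockState exposes the same state_vector attribute that is used with the TensorFlow calculator later in this notebook:

[ ]:
import numpy as np

# Probability mass retained by the truncated state; expected to be close to 1.
truncated_norm = np.sum(np.abs(final_state.state_vector) ** 2)
print("Norm of the truncated state:", truncated_norm)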

In order to differentiate such quantities, we need to modify the simulation. By itself, PureFockSimulator is unable to perform automatic differentiation. To enable it, we can use TensorflowCalculator, which replaces NumPy with TensorFlow under the hood. As a concrete example, let the loss function be

\[J(w) = || \ket{\psi(w)} - \ket{\psi_*} ||_1,\]

where \(\ket{\psi(w)}\) is the final state of the circuit, and \(\ket{\psi_*}\) is a fixed random target state.
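Written out component-wise, this is exactly the sum of absolute differences that the code below evaluates,

\[J(w) = \sum_{n} \left| \psi_n(w) - \psi_{*,n} \right|,\]

where the index \(n\) runs over the truncated Fock basis.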

[12]:
import numpy as np
from scipy.special import comb

# Dimension of the d-mode Fock space truncated at total photon number below the cutoff
state_vector_size = comb(d + cutoff - 1, cutoff - 1, exact=True)

psi_star = np.random.rand(state_vector_size) + 1j * np.random.rand(state_vector_size)

psi_star /= np.sum(np.abs(psi_star))

Then, by using tf.GradientTape, we can differentiate this loss function.

[13]:
import tensorflow as tf

tf.get_logger().setLevel("ERROR")  # Turns off complex->float casting warnings

simulator = pq.PureFockSimulator(
    d, pq.Config(cutoff=cutoff), calculator=pq.TensorflowCalculator()
)

w = tf.Variable(weights)
psi_star = tf.Variable(psi_star)

with tf.GradientTape() as tape:
    cvqnn_layers = pq.cvqnn.create_layers(w)

    with pq.Program() as program:
        pq.Q() | pq.Vacuum()

        for i in range(d):
            pq.Q(i) | pq.Displacement(r=0.1)

        pq.Q() | cvqnn_layers

    final_state = simulator.execute(program).state

    psi = final_state.state_vector

    loss = tf.math.reduce_sum(tf.math.abs(psi - psi_star))

loss_grad = tape.gradient(loss, w)

print(f"Loss: {loss.numpy()}")
print(f"Loss gradient: {loss_grad.numpy()}")
Loss: 1.9604476480647497
Loss gradient: [[ 2.48370504e-02  6.91253512e-03 -5.17538228e-02  5.91197083e-01
   8.35270137e-01  4.33414809e-02 -1.44455276e-03 -6.10485008e-02
   6.30518617e-01  7.87589392e-01  4.15688561e-04  6.45192857e-03
  -7.05893753e-02 -1.81552492e-02]
 [ 5.51070724e-02  1.14482055e-02 -7.20810177e-02  8.70276885e-01
   6.50958369e-01  4.79129674e-02 -1.05355280e-02 -6.39677085e-02
   7.03435046e-01  7.72832475e-01  6.62571242e-03  1.03948801e-03
  -7.14190194e-02 -1.32908739e-02]
 [ 5.91775056e-02  4.75369308e-03 -6.20956892e-02  8.06572060e-01
   7.06752388e-01  4.33605125e-02 -4.52911124e-04 -6.26476669e-02
   7.98818963e-01  7.21751009e-01  1.01077259e-02 -2.26709094e-03
  -6.75334937e-02 -1.38186903e-02]
 [ 4.01203598e-02  1.99768691e-03 -5.45376280e-02  8.81982242e-01
   6.53387621e-01  5.93868412e-03 -6.30429008e-04 -5.38798294e-02
   9.05422584e-01  7.15949040e-01 -3.64702223e-04 -5.77410545e-05
  -6.82197114e-02  1.74116637e-03]
 [ 1.07959822e-03  1.73058480e-04 -5.44175901e-02  8.66129564e-01
   6.70398558e-01  1.66385446e-02  4.30945595e-04 -5.41922390e-02
   9.00744709e-01  7.08804040e-01  4.50634059e-03  5.91755415e-04
  -6.23273595e-02  5.69621412e-03]
 [ 4.21102694e-03  3.93742722e-03 -5.36233257e-02  1.03238139e+00
   5.26661099e-01 -1.52360597e-03 -3.54072884e-03 -4.91789820e-02
   9.16184120e-01  6.90672795e-01  1.16414634e-03 -2.31269771e-03
  -5.94908198e-02  9.16505767e-03]
 [-2.04315766e-02  5.76912708e-04 -4.85917483e-02  8.56742917e-01
   6.53319067e-01 -1.99088351e-02  8.87190361e-04 -4.30689726e-02
   9.27679869e-01  6.49967198e-01  1.48523330e-04 -1.12318622e-03
  -4.57634914e-02  1.02198966e-02]
 [-2.21043277e-02  1.45639573e-03 -4.43768450e-02  9.80622059e-01
   5.40051984e-01 -4.77245102e-02  2.81496583e-03 -6.23762968e-02
   1.05210311e+00  4.76482110e-01 -8.71547209e-04  6.93330613e-05
  -8.64155789e-02  8.37255149e-03]
 [-4.67122038e-02 -3.90292564e-05 -6.32088147e-02  1.12961323e+00
   4.26270074e-01 -2.25618686e-02 -5.57393577e-03 -5.11763569e-02
   9.44769422e-01  6.50075251e-01 -1.20220424e-03 -2.19911083e-03
  -6.92235563e-02  6.31057580e-03]
 [-3.85113231e-02 -1.01100459e-03 -5.13675566e-02  7.89344621e-01
   6.40349720e-01 -3.04770202e-02 -4.07158955e-03 -4.26573559e-02
   8.30388031e-01  7.82317691e-01  2.13340682e-03  1.06722282e-03
  -4.83288207e-02  5.57853420e-03]]

Piquasso is also written in a way that supports tf.function (see Better performance with tf.function), so one can use tf.function for this task. Refactoring everything into a function, we can apply the tf.function decorator. Note that we have to avoid side effects in any function decorated with tf.function, because side effects are only executed during the tracing step. Therefore, pq.Program should be instantiated by providing the instructions as constructor arguments instead of using the with keyword; the cell below equivalently passes the instructions directly to simulator.execute_instructions.
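As an illustration of the constructor-based form, here is a minimal sketch (it assumes pq.Program accepts the instruction list via an instructions keyword argument, as described above; like the next cell, it omits the initial displacements for brevity):

[ ]:
# Hedged sketch: build the program without the `with` keyword, so that no side
# effects occur when the enclosing function is traced by tf.function.
cvqnn_layers = pq.cvqnn.create_layers(w)

program = pq.Program(instructions=[pq.Vacuum()] + cvqnn_layers.instructions)
final_state = simulator.execute(program).state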

[14]:
def calculate_loss(w, psi_star, cutoff):
    d = pq.cvqnn.get_number_of_modes(w.shape[1])

    simulator = pq.PureFockSimulator(
        d,
        pq.Config(cutoff=cutoff, normalize=False),
        calculator=pq.TensorflowCalculator(decorate_with=tf.function),
    )

    with tf.GradientTape() as tape:
        cvqnn_layers = pq.cvqnn.create_layers(w)

        final_state = simulator.execute_instructions(
            instructions=[pq.Vacuum()] + cvqnn_layers.instructions
        ).state

        psi = final_state.state_vector

        loss = tf.math.reduce_sum(tf.math.abs(psi - psi_star))

    return loss, tape.gradient(loss, w)


improved_calculate_loss = tf.function(calculate_loss)

loss, loss_grad = improved_calculate_loss(w, psi_star, cutoff)

print(f"Loss: {loss.numpy()}")
print(f"Loss gradient: {loss_grad.numpy()}")
Loss: 2.018569979174099
Loss gradient: [[ 0.00000000e+00  0.00000000e+00  0.00000000e+00  5.75160857e-01
   6.83322458e-01  1.28450675e-02 -4.00008041e-05 -6.30548348e-03
  -7.49845629e-01 -8.00313533e-01  7.09929231e-05  1.03634563e-02
  -1.18960333e-02  1.04187002e-02]
 [-1.35800236e-03 -5.49585214e-04 -5.68490534e-03  7.68436902e-01
   5.05490527e-01  6.69074565e-03  6.61089567e-04 -5.98396524e-03
  -7.70615878e-01 -7.65752853e-01  2.44147270e-03  2.03468039e-03
  -4.38080826e-03  1.26999161e-02]
 [ 1.33582634e-02 -1.68655402e-03 -1.85593853e-03  7.07517913e-01
   5.44470265e-01  2.49539746e-02  6.42390173e-05 -1.95025626e-03
  -8.25024061e-01 -6.44185279e-01  6.53653384e-03 -7.65527162e-03
   9.10541653e-03  3.40278572e-03]
 [ 4.26398634e-02 -1.80787941e-03  6.39415698e-03  7.10539771e-01
   5.16317806e-01  4.19027819e-02  3.11151157e-04  6.34437434e-03
  -7.11899830e-01 -5.86043987e-01 -1.78246076e-03 -3.32723452e-04
   1.04650909e-02  4.07298677e-03]
 [ 4.29666003e-02 -5.13803858e-05  4.61329396e-03  6.98424383e-01
   5.25555203e-01  3.78096485e-02 -3.26929949e-04  6.85206072e-03
  -7.86593321e-01 -5.59964717e-01  8.99000290e-03  4.50767474e-03
   2.76895247e-02  7.51784650e-03]
 [ 4.84763174e-02 -5.06656057e-03  2.09086242e-02  7.78957857e-01
   4.11735915e-01  4.47041965e-02  5.57595820e-03  1.66618932e-02
  -7.45897045e-01 -6.70781724e-01  4.18901129e-03 -5.51458453e-03
   3.53793166e-02  2.84834471e-03]
 [ 6.30724855e-02 -6.41540211e-04  2.14924447e-02  6.11241592e-01
   5.05390366e-01  6.89479439e-02 -1.41398304e-03  2.94560107e-02
  -8.62929995e-01 -6.09878053e-01  1.88758840e-04 -3.11469367e-03
   5.47604296e-02 -3.52522121e-03]
 [ 6.68569098e-02 -2.65481869e-03  3.22995882e-02  7.03713041e-01
   4.32114569e-01  4.74536273e-02 -3.67396656e-03  2.02959096e-02
  -8.42787956e-01 -4.25121040e-01 -3.99655314e-03  2.98835285e-04
   1.63926716e-02 -1.21762767e-02]
 [ 4.79882382e-02  6.12290682e-05  1.62381274e-02  8.37636878e-01
   3.50979911e-01  5.30524902e-02  7.52035617e-03  1.63628512e-02
  -8.23920872e-01 -6.25385991e-01 -2.62846126e-03 -5.16820365e-03
   2.70939337e-02 -7.64525657e-03]
 [ 6.78217682e-02  5.96498824e-04  1.31378911e-02  5.59849745e-01
   5.20217392e-01  7.60063612e-02  2.26111302e-03  1.50965198e-02
  -4.33486232e-01 -6.76081337e-01 -5.44203449e-03  4.00677625e-03
   2.97522697e-02  1.53461233e-03]]

The first run is called the tracing step, and it takes some time, because TensorFlow captures a tf.Graph here. The size of this graph can be decreased by passing the decorate_with=tf.function argument to pq.TensorflowCalculator, which also decreases the execution time of the tracing step. After the first run, a significant speedup is observed. We can also compare the runtimes of the compiled and non-compiled functions.
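To verify that tracing indeed only happens once, we can query the tracing counter of the tf.function object (a quick check, assuming a TensorFlow version in which experimental_get_tracing_count is available):

[ ]:
# Expected to print 1: subsequent calls with the same input signature reuse the
# captured tf.Graph instead of retracing.
print(improved_calculate_loss.experimental_get_tracing_count())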

[15]:
import time
import numpy as np

regular_runtimes = []
improved_runtimes = []

for i in range(10):
    w = tf.Variable(pq.cvqnn.generate_random_cvqnn_weights(layer_count, d))

    start_time = time.time()
    calculate_loss(w, psi_star, cutoff)
    end_time = time.time()

    regular_runtimes.append(end_time - start_time)

    start_time = time.time()
    improved_calculate_loss(w, psi_star, cutoff)
    end_time = time.time()

    improved_runtimes.append(end_time - start_time)

print(f"Regular: {np.mean(regular_runtimes)} s (+/- {np.std(regular_runtimes)} s).")
print(f"Improved: {np.mean(improved_runtimes)} s (+/- {np.std(improved_runtimes)} s).")
Regular: 2.4323306560516356 s (+/- 0.25203824522512575 s).
Improved: 0.011875104904174805 s (+/- 0.0002783859661023313 s).

But that is not all: one can also create a similar function with the jit_compile=True flag, since every operation in Piquasso can be JIT-compiled with XLA through tf.function.

[16]:
jit_compiled_calculate_loss = tf.function(jit_compile=True)(calculate_loss)

jit_compiled_calculate_loss(w, psi_star, cutoff)
2024-03-02 09:40:10.970536: I external/local_xla/xla/service/service.cc:168] XLA service 0x23ff2f70 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2024-03-02 09:40:10.970561: I external/local_xla/xla/service/service.cc:176]   StreamExecutor device (0): Host, Default Version
2024-03-02 09:40:14.366437: I tensorflow/compiler/mlir/tensorflow/utils/dump_mlir_util.cc:269] disabling MLIR crash reproducer, set env var `MLIR_CRASH_REPRODUCER_DIRECTORY` to enable.
2024-03-02 09:42:47.120088: E external/local_xla/xla/service/slow_operation_alarm.cc:65]
********************************
[Compiling module a_inference_calculate_loss_184206__XlaMustCompile_true_config_proto_3175580994766145631_executor_type_11160318154034397263_.64111] Very slow compile? If you want to file a bug, run with envvar XLA_FLAGS=--xla_dump_to=/tmp/foo and attach the results.
********************************
2024-03-02 09:45:40.915775: E external/local_xla/xla/service/slow_operation_alarm.cc:133] The operation took 4m53.795791141s

********************************
[Compiling module a_inference_calculate_loss_184206__XlaMustCompile_true_config_proto_3175580994766145631_executor_type_11160318154034397263_.64111] Very slow compile? If you want to file a bug, run with envvar XLA_FLAGS=--xla_dump_to=/tmp/foo and attach the results.
********************************
WARNING: All log messages before absl::InitializeLog() is called are written to STDERR
I0000 00:00:1709394340.917125   34412 device_compiler.h:186] Compiled cluster using XLA!  This line is logged at most once for the lifetime of the process.
2024-03-02 09:45:41.016959: E external/local_xla/xla/stream_executor/stream_executor_internal.h:177] SetPriority unimplemented for this stream.
[16]:
(<tf.Tensor: shape=(), dtype=float64, numpy=1.9352979440546>,
 <tf.Tensor: shape=(10, 14), dtype=float64, numpy=
 array([[ 0.00000000e+00,  0.00000000e+00,  0.00000000e+00,
          2.82974314e-01,  1.00761968e+00, -4.86587480e-04,
          2.35804220e-04, -1.15113732e-04, -4.06551101e-01,
         -1.11506339e+00, -3.28812640e-04,  7.26509004e-03,
         -5.50407493e-04,  1.95176250e-02],
        [-1.27337958e-02, -4.32230679e-04, -1.16956930e-05,
          2.97284980e-01,  9.58032369e-01,  7.80146796e-03,
         -1.15800353e-03, -5.29595553e-04, -6.03508410e-01,
         -1.00960026e+00, -6.00128740e-05, -5.43138082e-03,
          1.13088525e-03, -6.98910091e-03],
        [ 1.50953348e-02,  1.06793890e-04, -6.96402317e-04,
          2.78072391e-01,  8.89425807e-01,  1.65284781e-02,
         -3.26193983e-04, -7.32702009e-04, -4.35469885e-01,
         -1.08220389e+00, -2.19933132e-03, -2.47865138e-03,
         -2.93182299e-03, -1.68765726e-02],
        [ 1.09036764e-02, -1.39943937e-04, -2.79208939e-03,
          2.70519193e-01,  9.50698120e-01,  1.51859239e-02,
         -1.67403741e-04, -2.13834584e-03, -4.22355256e-01,
         -1.12201492e+00,  5.67571939e-04,  1.58293338e-02,
         -2.39030263e-03, -5.86237362e-03],
        [ 8.30080898e-03, -1.82451373e-04, -1.38832253e-03,
          1.76303654e-01,  9.95779073e-01,  4.75550916e-03,
          6.22530734e-03, -8.94953191e-03, -4.22817330e-01,
         -1.02663235e+00,  4.82082440e-03, -1.31465644e-02,
         -1.01452761e-02, -5.70510595e-02],
        [ 2.50325193e-02,  6.29469112e-03, -1.04233986e-02,
          3.23611952e-01,  8.98244843e-01,  3.23882176e-02,
         -2.04379300e-03,  3.59544328e-03, -4.59367176e-01,
         -1.02342603e+00, -2.11534639e-03, -1.15771764e-03,
          3.74358915e-03, -3.37975680e-02],
        [ 2.35414600e-02, -1.02818825e-03,  2.50828514e-03,
          1.33153804e-01,  1.02103626e+00,  2.06230600e-02,
          1.96143023e-03, -5.50583838e-03, -3.60952140e-01,
         -1.03240699e+00, -4.49581492e-04,  8.07466573e-03,
         -1.35877540e-02, -2.15258038e-02],
        [ 1.43627038e-02,  1.12035672e-03, -7.07577659e-03,
          8.26398345e-02,  9.67306681e-01,  1.21847068e-02,
          3.93848334e-03, -1.30241315e-02, -5.96429663e-01,
         -9.72524487e-01,  4.51563428e-03, -2.90174828e-03,
         -2.43580857e-02, -1.92727297e-02],
        [ 3.10054604e-02,  2.81632895e-04, -8.79013015e-03,
          2.34154711e-01,  8.87158522e-01,  2.62144967e-02,
          1.71721715e-03, -3.41037966e-03, -7.37362510e-01,
         -8.97797687e-01,  2.97914053e-03,  4.39680362e-03,
         -1.40056766e-02, -1.96915555e-03],
        [ 2.65714043e-02,  8.56081649e-07, -4.32095216e-04,
          3.61163933e-01,  7.91136183e-01,  2.11602898e-02,
          1.01903539e-03, -1.55889975e-02, -8.24757241e-01,
         -8.52458879e-01,  2.70664645e-03,  2.46375745e-03,
         -4.45933265e-02, -1.36795389e-04]])>)

Compiling the same function takes significantly more time, but the compilation step results in an extra order of magnitude runtime improvement:

[17]:
jit_compiled_runtimes = []

for i in range(10):
    w = tf.Variable(pq.cvqnn.generate_random_cvqnn_weights(layer_count, d))

    start_time = time.time()
    jit_compiled_calculate_loss(w, psi_star, cutoff)
    end_time = time.time()

    jit_compiled_runtimes.append(end_time - start_time)

print(f"Regular:\t{np.mean(regular_runtimes)} s (+/- {np.std(regular_runtimes)} s).")
print(f"Improved:\t{np.mean(improved_runtimes)} s (+/- {np.std(improved_runtimes)} s).")
print(
    f"JIT compiled:\t{np.mean(jit_compiled_runtimes)} s "
    f"(+/- {np.std(jit_compiled_runtimes)} s)."
)
Regular:        2.4323306560516356 s (+/- 0.25203824522512575 s).
Improved:       0.011875104904174805 s (+/- 0.0002783859661023313 s).
JIT compiled:   0.0013802051544189453 s (+/- 0.00014840260989162237 s).

We can use this function for, e.g., quantum state learning. Consider the state

\[\ket{\psi_*} = \frac{1}{\sqrt{2}} \left ( \ket{03} + \ket{30} \right ),\]

which is an example of a so-called NOON state. We can produce this using Piquasso:

[18]:
with pq.Program() as target_state_preparation:
    pq.Q(all) | pq.StateVector([0, 3]) / np.sqrt(2)
    pq.Q(all) | pq.StateVector([3, 0]) / np.sqrt(2)


target_state = simulator.execute(target_state_preparation).state

target_state_vector = target_state.state_vector

psi_star = tf.Variable(target_state_vector)

Now we can demonstrate the speed of the JIT-compiled calculation by creating a simple optimization algorithm as follows:

[19]:
learning_rate = 0.01
iterations = 10000

optimizer = tf.keras.optimizers.Adam(learning_rate=learning_rate)

w = tf.Variable(pq.cvqnn.generate_random_cvqnn_weights(layer_count, d))


for i in range(iterations):
    loss, loss_grad = jit_compiled_calculate_loss(w, psi_star, cutoff)

    optimizer.apply_gradients(zip([loss_grad], [w]))

    if (i + 1) % (iterations // 20) == 0:
        print(f"{i+1}:\t\t", loss.numpy())

print("Final loss:\t", loss.numpy())
500:             0.8607086996549679
1000:            0.8473090910243798
1500:            0.5849377384183669
2000:            0.40879991260231896
2500:            0.29813437623431027
3000:            0.2761913995731654
3500:            0.26621822690380204
4000:            0.23435492471006458
4500:            0.23874428115482763
5000:            0.2205772331178741
5500:            0.2416911247101965
6000:            0.2146732713315681
6500:            0.1993502777279203
7000:            0.22721179422812682
7500:            0.21320290731217184
8000:            0.20555180951647123
8500:            0.21018212032348688
9000:            0.20439647783148554
9500:            0.19759860554592923
10000:           0.18922056395791387
Final loss:      0.18922056395791387

We can use the final weights to compute the final state and its fidelity with the target state.

[20]:
program = pq.cvqnn.create_program(w)

final_state = simulator.execute(program).state

print("Final state fidelity: ", final_state.fidelity(target_state))
Final state fidelity:  0.9994946525680765
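
As a cross-check, the same fidelity can be obtained from the overlap \(|\langle \psi_* | \psi \rangle|^2\) of the two pure states, using only the state vectors already computed above:

[ ]:
psi_final = np.array(final_state.state_vector)
psi_target = np.array(target_state.state_vector)

# For pure states, the fidelity is the squared magnitude of the overlap.
print("Fidelity via overlap:", np.abs(np.vdot(psi_target, psi_final)) ** 2)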