Qiskit Code Assistant
Qiskit Code Assistant aims to make quantum computing more accessible to new Qiskit adopters and to improve the coding experience for current users. It is a generative AI code assistant powered by watsonx. It is trained using millions of text tokens from the Qiskit SDK, years of Qiskit code examples, and IBM Quantum® features. Qiskit Code Assistant can help your quantum development workflow by offering LLM-generated suggestions based on IBM Granite models, which incorporate the latest features and functionality from IBM®.
- This is an experimental feature available to IBM Quantum Premium Plan users registered on the new IBM Quantum Platform.
- Qiskit Code Assistant is in preview release status and is subject to change.
- If you have feedback or want to contact the developer team, use the Qiskit Slack Workspace channel or the related public GitHub repositories.
Features
The following features are included in the Visual Studio Code (VS Code) extension (which also works in compatible editors) and the JupyterLab extension:
- Accelerates Qiskit code generation by leveraging generative AI based on models specialized in generating Qiskit code.
- Allows abstract and specific prompts to generate recommendations.
- Presents suggestions that you can review, accept, or reject.
- Supports Python code and Jupyter notebook files.
- Includes guardrails to avoid answering questions that represent a potential risk for users, such as hateful speech.
To integrate Qiskit Code Assistant directly into your development environment, follow the instructions in the appropriate topic for the VS Code or JupyterLab extension.
The Large Language Model (LLM) behind Qiskit Code Assistant
To provide code suggestions, Qiskit Code Assistant uses a large language model (LLM). It currently relies on the mistral-small-3.2-24b-qiskit model, which improves the Mistral-Small-3.2-24B-Instruct-2506 model's Qiskit code generation capabilities through extended pretraining and fine-tuning on high-quality Qiskit data, as well as Python commits and chat data. For more information about the Mistral AI model family, refer to the Mistral AI documentation. For more details about the .*-qiskit models, see Qiskit Code Assistant: Training LLMs for generating Quantum Computing Code.
Our LLMs specialized for Qiskit are also available as open-source models. Find all of the available models at https://huggingface.co/Qiskit.
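Because the models are published openly, you can experiment with them outside of the assistant itself. The following is a minimal sketch using the Hugging Face transformers library; the prompt and generation settings are illustrative, not a recommended configuration:

```python
# Minimal sketch: load one of the open-source Qiskit models with Hugging Face transformers.
# Assumes `pip install transformers accelerate torch` and enough memory for an 8B-parameter model.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Qiskit/granite-3.3-8b-qiskit"  # any model from https://huggingface.co/Qiskit

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

prompt = "# Build a 2-qubit Bell-state circuit in Qiskit\n"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```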
The Qiskit HumanEval and Qiskit HumanEval Hard benchmarks
To test the mistral-small-3.2-24b-qiskit and other models, we collaborated with Qiskit Advocates and experts to create the execution-based benchmarks Qiskit HumanEval (QHE) and Qiskit HumanEval Hard (QHE Hard), and ran them on the models. Like HumanEval, these benchmarks comprise multiple challenging code problems to solve, all based on the official Qiskit libraries.
Each benchmark is composed of approximately 150 tests, each consisting of a function definition followed by a docstring that details the task the model must solve. Each example also includes a reference canonical solution and unit tests to evaluate the correctness of the generated solutions. Tests come in three levels of difficulty: basic, intermediate, and difficult. The Qiskit HumanEval Hard benchmark is a variation of Qiskit HumanEval that removes information related to code imports, so the LLM must figure out the right method or class imports on its own. According to our tests and initial results, this change makes the dataset much more challenging for LLMs.
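To make the task format concrete, here is a hypothetical example written in the style of a Qiskit HumanEval entry; it is illustrative only, not an actual item from the dataset:

```python
# Hypothetical Qiskit-HumanEval-style task (illustrative; not from the actual dataset).

# --- Prompt given to the model: a signature and docstring to complete ---
from qiskit import QuantumCircuit

def bell_circuit() -> QuantumCircuit:
    """Return a 2-qubit circuit preparing the Bell state (|00> + |11>)/sqrt(2)."""
    # <the model generates the body here>

# --- Reference canonical solution shipped with the task ---
def bell_circuit_solution() -> QuantumCircuit:
    qc = QuantumCircuit(2)
    qc.h(0)
    qc.cx(0, 1)
    return qc

# --- Unit test used to score a generated solution ---
def check(candidate):
    from qiskit.quantum_info import Statevector
    assert Statevector(candidate()).equiv(Statevector(bell_circuit_solution()))

check(bell_circuit_solution)  # the canonical solution must pass its own test
```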
The datasets for Qiskit HumanEval and Qiskit HumanEval Hard are available at these websites: Qiskit HumanEval and Qiskit HumanEval Hard. You can contribute to the development of these benchmarks at the GitHub repository.
More information and citations
To learn more about Qiskit Code Assistant, the Qiskit HumanEval, or Qiskit HumanEval Hard benchmarks, and cite them in your scientific publications, review these recommended citations:
@misc{2405.19495,
Author = {Nicolas Dupuis and Luca Buratti and Sanjay Vishwakarma and Aitana Viudes Forrat and David Kremer and Ismael Faro and Ruchir Puri and Juan Cruz-Benito},
Title = {Qiskit Code Assistant: Training LLMs for generating Quantum Computing Code},
Year = {2024},
Eprint = {arXiv:2405.19495},
}
@misc{2406.14712,
Author = {Sanjay Vishwakarma and Francis Harkins and Siddharth Golecha and Vishal Sharathchandra Bajpe and Nicolas Dupuis and Luca Buratti and David Kremer and Ismael Faro and Ruchir Puri and Juan Cruz-Benito},
Title = {Qiskit HumanEval: An Evaluation Benchmark For Quantum Code Generative Models},
Year = {2024},
Eprint = {arXiv:2406.14712},
}
@misc{2508.20907,
Author = {Nicolas Dupuis and Adarsh Tiwari and Youssef Mroueh and David Kremer and Ismael Faro and Juan Cruz-Benito},
Title = {Quantum Verifiable Rewards for Post-Training Qiskit Code Assistant},
Year = {2025},
Eprint = {arXiv:2508.20907},
}
Use Qiskit Code Assistant in local mode
Learn how to install, configure, and use any of the Qiskit Code Assistant models on your local machine.
- Qiskit Code Assistant is in preview release status and is subject to change.
- If you have feedback or want to contact the developer team, use the Qiskit Slack Workspace channel or the related public GitHub repositories.
Quick start (recommended)
The easiest way to get started with Qiskit Code Assistant in local mode is to use the automated setup scripts for either the VS Code or JupyterLab extension. These scripts will automatically install Ollama to run the LLMs, download the recommended model, and configure the extension for you.
VS Code extension setup
Run the following command in your terminal:
bash <(curl -fsSL https://raw.githubusercontent.com/Qiskit/qiskit-code-assistant-vscode/main/setup_local.sh)

This script performs the following steps:
- Install Ollama (if not already installed)
- Download and configure the recommended Qiskit Code Assistant model
- Set up the VS Code extension to work with your local deployment
JupyterLab extension setup
Run the following command in your terminal:
bash <(curl -fsSL https://raw.githubusercontent.com/Qiskit/qiskit-code-assistant-jupyterlab/main/setup_local.sh)

This script will:
- Install Ollama (if not already installed)
- Download and configure the recommended Qiskit Code Assistant model
- Set up the JupyterLab extension to work with your local deployment
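After either setup script finishes, you can sanity-check that the local Ollama service is reachable and that a model was installed. A minimal sketch using Python's requests library; port 11434 is Ollama's default, and the exact model name depends on what the script installed:

```python
# Minimal sketch: confirm the local Ollama service is up and list installed models.
# Assumes `pip install requests`; 11434 is Ollama's default port.
import requests

resp = requests.get("http://localhost:11434/api/tags", timeout=5)
resp.raise_for_status()

for model in resp.json().get("models", []):
    print(model["name"])  # expect a Qiskit Code Assistant model in this list
```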
Available models
Current models
These are the latest recommended models for use with Qiskit Code Assistant:
- Qiskit/mistral-small-3.2-24b-qiskit - Released October 2025
- Qiskit/qwen2.5-coder-14b-qiskit - Released June 2025
- Qiskit/granite-3.3-8b-qiskit - Released June 2025
- Qiskit/granite-3.2-8b-qiskit - Released June 2025
GGUF models (recommended for personal environments/laptops)
GGUF format models are optimized for local use and require fewer computational resources:
- Qiskit/mistral-small-3.2-24b-qiskit-GGUF – Released October 2025. Trained with Qiskit data up to version 2.1
- Qiskit/qwen2.5-coder-14b-qiskit-GGUF – Released June 2025. Trained with Qiskit data up to version 2.0
- Qiskit/granite-3.3-8b-qiskit-GGUF – Released June 2025. Trained with Qiskit data up to version 2.0
- Qiskit/granite-3.2-8b-qiskit-GGUF – Released June 2025. Trained with Qiskit data up to version 2.0
The open-source Qiskit Code Assistant models are available in Safetensors (a file format designed specifically for storing machine learning model weights and tensors in a secure and efficient manner) or GGUF (a binary format designed for quickly loading and saving models, and for readability), and can be downloaded from Hugging Face as explained below.
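If you are unsure which format a given repository provides, you can inspect its file list programmatically. A minimal sketch using the huggingface_hub library; the repository names are examples from the list above:

```python
# Minimal sketch: list the weight files published in two Qiskit model repositories
# to see whether they ship Safetensors or GGUF files. Assumes `pip install huggingface_hub`.
from huggingface_hub import list_repo_files

for repo in ["Qiskit/granite-3.3-8b-qiskit", "Qiskit/granite-3.3-8b-qiskit-GGUF"]:
    files = list_repo_files(repo)
    weights = [f for f in files if f.endswith((".safetensors", ".gguf"))]
    print(repo, "->", weights)
```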
Qiskit versions used for training
| Model | QiskitHumanEval-Hard | QiskitHumanEval | HumanEval | ASDiv | MathQA | SciQ | MBPP | IFEval | CrowsPairs (English) | TruthfulQA (MC1 acc) | Release date | Trained on Qiskit version |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| mistral-small-3.2-24b-qiskit | 32.45 | 47.02 | 77.49 | 3.77 | 49.68 | 97.50 | 64.00 | 48.44 | 67.08 | 39.41 | January 2026 | 2.2 |
| qwen2.5-coder-14b-qiskit | 25.17 | 49.01 | 91.46 | 4.21 | 53.90 | 97.00 | 77.60 | 49.64 | 65.18 | 37.82 | June 2025 | 2.0 |
| granite-3.3-8b-qiskit | 14.57 | 27.15 | 62.80 | 0.48 | 38.66 | 93.30 | 52.40 | 59.71 | 59.75 | 39.05 | June 2025 | 2.0 |
| granite-3.2-8b-qiskit | 9.93 | 24.50 | 57.32 | 0.09 | 41.41 | 96.30 | 51.80 | 60.79 | 66.79 | 40.51 | June 2025 | 2.0 |
| granite-8b-qiskit-rc-0.10 | 15.89 | 38.41 | 59.76 | — | — | — | — | — | — | — | February 2025 | 1.3 |
| granite-8b-qiskit | 17.88 | 44.37 | 53.66 | — | — | — | — | — | — | — | November 2024 | 1.2 |
Note: All models listed in the benchmark table were evaluated using their respective system prompts, as defined in their Hugging Face model cards.
Deprecated models
These models are no longer actively maintained but remain available:
- Qiskit/granite-8b-qiskit-rc-0.10 - Released February 2025 (deprecated)
- Qiskit/granite-8b-qiskit - Released November 2024 (deprecated)
Advanced setup
If you prefer to manually configure your local setup or need more control over the installation process, expand the sections below.
Follow these steps to download any Qiskit Code Assistant-related model from the Hugging Face website:
- Navigate to the desired Qiskit model page on Hugging Face.
- Go to the Files and Versions tab and download the safetensors or GGUF model files.
To download any of the available Qiskit Code Assistant models using the Hugging Face CLI, follow these steps:
- Install the Hugging Face CLI.
- Log in to your Hugging Face account:
  huggingface-cli login
- Download the model you prefer from the previous list:
  huggingface-cli download <HF REPO NAME> <MODEL PATH> --local-dir <LOCAL PATH>
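If you prefer to stay in Python, the huggingface_hub library offers the same download functionality as the CLI. A minimal sketch, assuming `pip install huggingface_hub`; the repository name is one of the models listed previously:

```python
# Minimal sketch: download a Qiskit Code Assistant model repository with huggingface_hub.
# Assumes `pip install huggingface_hub`; gated or private repos also require a prior login.
from huggingface_hub import snapshot_download

local_path = snapshot_download(
    repo_id="Qiskit/granite-3.3-8b-qiskit-GGUF",  # any repo from https://huggingface.co/Qiskit
    local_dir="./granite-3.3-8b-qiskit-GGUF",     # where to place the downloaded files
)
print("Model files downloaded to:", local_path)
```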
There are multiple ways to deploy and interact with the downloaded Qiskit Code Assistant model. This guide demonstrates two options: the Ollama application (using either the Hugging Face Hub integration or a manually downloaded local model) and the llama-cpp-python package.

Using the Ollama application
The Ollama application provides a simple solution for running LLMs locally. It is easy to use, with a CLI that makes the whole setup process, model management, and interaction fairly straightforward. It's ideal for quick experimentation and for users who want to handle fewer technical details.
Install Ollama

- Download the Ollama application.
- Install the downloaded file.
- Launch the installed Ollama application.
  Info: The application is running successfully when the Ollama icon appears in the desktop menu bar. You can also verify the service is running by going to http://localhost:11434/.
- Try Ollama in your terminal and start running models. For example:
  ollama run hf.co/Qiskit/Qwen2.5-Coder-14B-Qiskit-GGUF
Set up Ollama using the Hugging Face Hub integration
The Ollama/Hugging Face Hub integration provides a way to interact with models hosted on the Hugging Face Hub without needing to create a new modelfile or manually download the GGUF or safetensors files. The default template and params files are already included for the model on the Hugging Face Hub.

- Make sure the Ollama application is running.
- Go to the desired model page and copy the URL. For example, https://huggingface.co/Qiskit/Qwen2.5-Coder-14B-Qiskit-GGUF.
- From your terminal, run the command:
  ollama run hf.co/Qiskit/Qwen2.5-Coder-14B-Qiskit-GGUF

You can use the hf.co/Qiskit/Qwen2.5-Coder-14B-Qiskit-GGUF model or any of the other currently recommended official GGUF models, such as hf.co/Qiskit/mistral-small-3.2-24b-qiskit-GGUF or hf.co/Qiskit/granite-3.3-8b-qiskit-GGUF.
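Once the model is running in Ollama, you can also query it programmatically through Ollama's local REST API, the same service the editor extensions connect to. A minimal sketch with the requests library; the model name must match what ollama list reports:

```python
# Minimal sketch: send a completion request to the local Ollama REST API.
# Assumes `pip install requests` and that the model name below appears in `ollama list`.
import requests

payload = {
    "model": "hf.co/Qiskit/Qwen2.5-Coder-14B-Qiskit-GGUF",
    "prompt": "Generate a quantum circuit with 2 qubits",
    "stream": False,  # return the full response at once instead of streaming
}
resp = requests.post("http://localhost:11434/api/generate", json=payload, timeout=300)
resp.raise_for_status()
print(resp.json()["response"])
```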
Set up Ollama with a manually downloaded Qiskit Code Assistant GGUF model

If you have manually downloaded a GGUF model such as https://huggingface.co/Qiskit/Qwen2.5-Coder-14B-Qiskit-GGUF and you want to experiment with different templates and parameters, you can follow these steps to load it into your local Ollama application.
- Create a Modelfile with the following content, and be sure to update <PATH-TO-GGUF-FILE> to the actual path of your downloaded model.

````
FROM <PATH-TO-GGUF-FILE>
TEMPLATE """{{ if .System }}System:
{{ .System }}

{{ end }}{{ if .Prompt }}Question:
{{ .Prompt }}

{{ end }}Answer:
```python{{ .Response }}
"""
PARAMETER stop "Question:"
PARAMETER stop "Answer:"
PARAMETER stop "System:"
PARAMETER stop "```"
PARAMETER temperature 0
PARAMETER top_k 1
````

- Run the following command to create a custom model instance based on the Modelfile:

  ollama create Qwen2.5-Coder-14B-Qiskit -f ./path-to-model-file

  Note: This process can take some time while Ollama reads the model file, initializes the model instance, and configures it according to the specifications provided.
Run the manually downloaded Qiskit Code Assistant model in Ollama

After the Qwen2.5-Coder-14B-Qiskit model has been set up in Ollama, run the following command to launch the model and interact with it in the terminal (in chat mode):

ollama run Qwen2.5-Coder-14B-Qiskit

Some useful commands:
- ollama list - List models on your computer
- ollama rm Qwen2.5-Coder-14B-Qiskit - Delete the model
- ollama show Qwen2.5-Coder-14B-Qiskit - Show model information
- ollama stop Qwen2.5-Coder-14B-Qiskit - Stop a model that is currently running
- ollama ps - List which models are currently loaded
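If you prefer to script these interactions rather than use the terminal, the ollama Python package can talk to the same local service. A minimal sketch, assuming `pip install ollama` and that the custom model created above is available:

```python
# Minimal sketch: chat with the locally created Ollama model from Python.
# Assumes `pip install ollama` and that the Qwen2.5-Coder-14B-Qiskit model
# created above appears in `ollama list`.
import ollama

response = ollama.chat(
    model="Qwen2.5-Coder-14B-Qiskit",
    messages=[{"role": "user", "content": "Generate a quantum circuit with 2 qubits"}],
)
print(response["message"]["content"])
```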
Use the llama-cpp-python package

An alternative to the Ollama application is the llama-cpp-python package, which is a Python binding for llama.cpp. It gives you more control and flexibility to run the GGUF model locally, and is ideal for users who wish to integrate the local model into their workflows and Python applications.

- Install llama-cpp-python.
- Interact with the model from within your application using llama_cpp. For example:

```python
from llama_cpp import Llama

model_path = "<PATH-TO-GGUF-FILE>"  # path to your downloaded GGUF file
model = Llama(
    model_path,
    seed=17,
    n_ctx=10000,
    n_gpu_layers=37,  # number of layers to offload to the GPU; set to 0 to run entirely on the CPU
)
input = "Generate a quantum circuit with 2 qubits"
raw_pred = model(input)["choices"][0]["text"]
```

You can also add text generation parameters to the model to customize the inference:

```python
generation_kwargs = {
    "max_tokens": 512,
    "echo": False,  # whether to echo the prompt in the output
    "top_k": 1,
}
raw_pred = model(input, **generation_kwargs)["choices"][0]["text"]
```
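llama-cpp-python also exposes a chat-style interface. The following is a minimal sketch of the same model used in chat mode; the system prompt here is a placeholder, not the official one from the model card, and the chat template is assumed to be auto-detected from the GGUF metadata:

```python
# Minimal sketch: chat-style interface in llama-cpp-python.
from llama_cpp import Llama

model = Llama("<PATH-TO-GGUF-FILE>")  # same GGUF file as in the previous example
output = model.create_chat_completion(
    messages=[
        # Placeholder system prompt; see the model's Hugging Face card for the recommended one.
        {"role": "system", "content": "You are a helpful Qiskit coding assistant."},
        {"role": "user", "content": "Generate a quantum circuit with 2 qubits"},
    ],
    max_tokens=512,
)
print(output["choices"][0]["message"]["content"])
```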
Use the llama.cpp library

Another alternative is to use llama.cpp, an open-source library for performing LLM inference on a CPU with minimal setup. It provides low-level control over the model execution and is typically run from the command line, pointing to a local GGUF model file.

There are several ways to install llama.cpp on your machine:
- Install llama.cpp using brew, nix, or winget
- Run with Docker: see the Docker documentation by the llama.cpp team
- Download pre-built binaries from the releases page
- Build from source by cloning the repository

Once installed, you can use llama.cpp to interact with GGUF models in conversation mode as follows:

```
# Use a local model file
llama-cli -m my_model.gguf -cnv

# Or download and run a model directly from Hugging Face
llama-cli -hf Qiskit/Qwen2.5-Coder-14B-Qiskit-GGUF -cnv
```

You can also launch an OpenAI-compatible API server for the model in the following way:

llama-server -hf Qiskit/Qwen2.5-Coder-14B-Qiskit-GGUF
Advanced parameters

With the llama-cli program, you can control the model generation using command-line options. For example, you can provide an initial "system" prompt using the -p/--prompt flag. In conversation mode (-cnv), this initial prompt acts as the system message; otherwise, you can simply prepend any desired instruction to your prompt text. You can also adjust sampling parameters, for instance temperature (--temp), top-k (--top-k), top-p (--top-p), repetition penalty (--repeat-penalty), and the random seed (--seed). The following is an example invocation using these options:

```
llama-cli -hf Qiskit/Qwen2.5-Coder-14B-Qiskit-GGUF \
  -p "You are a friendly assistant." -cnv \
  --temp 0.7 \
  --top-k 50 \
  --top-p 0.95 \
  --repeat-penalty 1.1 \
  --seed 42
```

To ensure proper functionality of our Qiskit models, we recommend using the system prompt provided in our Hugging Face GGUF repositories: mistral-small-3.2-24b-qiskit-GGUF, Qwen2.5-Coder-14B-Qiskit-GGUF, granite-3.3-8b-qiskit-GGUF, and granite-3.2-8b-qiskit-GGUF.
Use the Qiskit Code Assistant VS Code and JupyterLab extensions to prompt the locally deployed Qiskit Code Assistant model. Once the Ollama application is set up with the model, you can configure the extensions to connect to the local service.
Connect with the Qiskit Code Assistant VS Code extension
With the Qiskit Code Assistant VS Code extension, you can interact with the model and perform code completion while writing your code. This can work well for users looking for assistance writing Qiskit code for their Python applications.
- Install the Qiskit Code Assistant VS Code extension.
- In VS Code, go to the User Settings and set the Qiskit Code Assistant: Url to the URL of your local Ollama deployment (for example, http://localhost:11434).
- Reload VS Code by going to View > Command Palette... and selecting Developer: Reload Window.
The Qiskit Code Assistant model configured in Ollama should appear in the status bar and is then ready to use.
Connect with the Qiskit Code Assistant JupyterLab extension
With the Qiskit Code Assistant JupyterLab extension, you can interact with the model and perform code completion directly in your Jupyter Notebook. Users who predominantly work with Jupyter Notebooks can take advantage of this extension to further enhance their experience writing Qiskit code.
- Install the Qiskit Code Assistant JupyterLab extension.
- In JupyterLab, go to the Settings Editor and set the Qiskit Code Assistant Service API to the URL of your local Ollama deployment (for example, http://localhost:11434).
The Qiskit Code Assistant model configured in Ollama should appear in the status bar and is then ready to use.
Next steps
- Install and use the official JupyterLab or VS Code extensions.
- See examples of using Qiskit Code Assistant for creating circuits, configuring error suppression, and transpiling with pass managers.