It runs on my machine 02
October 17, 2025
Introduction
In the previous post, I managed to get a "Hello World!" completion working in my IntelliJ plugin, "completamente". The completion was hardcoded, always returning the same string, but it worked. Progress.
The next logical step is connecting to an actual LLM to get real completions. I'm following the llama.vim approach, which uses the llama.cpp server's infill endpoint. The idea is simple: send the code before the cursor (prefix) and after the cursor (suffix) to the endpoint, and it returns a completion suggestion.
This post documents my attempt at implementing settings, HTTP integration, and error handling for the plugin. I'm sure someone with more experience would develop a better solution, but I'm stuck with my knowledge of the subject, and I'm doing this to learn.
Understanding the llama.cpp infill endpoint
Before writing any Kotlin code, I need to understand what the llama.cpp infill endpoint expects and returns. The llama.vim plugin source code shows it uses a POST request to /infill with input_prefix and input_suffix parameters.
I'm running a llama.cpp server locally on port 8012 with a small model (Qwen2.5-Coder-3B). The command to start the server locally is llama-server --fim-qwen-3b-default --port 8012.
First, a simple test with just a prefix:
curl -X POST http://127.0.0.1:8012/infill \
  -H "Content-Type: application/json" \
  -d '{"input_prefix": "<?php\nfunction hello(", "input_suffix": ""}' \
  -s | jq '.content'
The response comes back quickly enough on my laptop:
")\n{\n echo \"hello world\";\n}\n?>\n"
All fine and dandy. The key field here is content: that's the completion text I need to extract and show to the user. The response also includes a bunch of metadata about the generation settings, which I might use later for debugging and customization of the request, but for now I just need the content field.
Let me test with both prefix and suffix to see how the model handles Fill In the Middle (FIM):
curl -X POST http://127.0.0.1:8012/infill \
  -H "Content-Type: application/json" \
  -d '{"input_prefix": "<?php\nfunction calculateSum($a, $b) {", "input_suffix": "}"}' \
  -s | jq '.content'
The response:
"\n return $a + $b;\n"
Perfect! The model understands that it needs to fill in the middle between the opening brace and the closing brace and that the code is PHP. This is exactly what I need for inline completions in the IDE. Or rather: this is what I need to start my completion work. I will customize the request and the context later.
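To make the wire format of those curl calls concrete, here is a minimal Kotlin sketch that builds the same request body. The names jsonString and infillBody are hypothetical helpers, and the escaping is hand-rolled only to keep the sketch free of dependencies; a JSON library is the safer choice in real code.

```kotlin
// Hypothetical helper: quote and escape a string as a JSON string literal.
fun jsonString(value: String): String {
    val sb = StringBuilder("\"")
    for (ch in value) {
        when (ch) {
            '\\' -> sb.append("\\\\")
            '"' -> sb.append("\\\"")
            '\n' -> sb.append("\\n")
            '\r' -> sb.append("\\r")
            '\t' -> sb.append("\\t")
            // Any other control character becomes a \uXXXX escape
            else -> if (ch < ' ') sb.append("\\u%04x".format(ch.code)) else sb.append(ch)
        }
    }
    return sb.append('"').toString()
}

// Hypothetical helper: the body sent to /infill, using the two parameters seen in the curl tests.
fun infillBody(prefix: String, suffix: String): String = buildString {
    append("{\"input_prefix\":")
    append(jsonString(prefix))
    append(",\"input_suffix\":")
    append(jsonString(suffix))
    append("}")
}
```

Calling infillBody with the prefix from the first curl test and an empty suffix should produce the same JSON payload that was passed to curl with -d, minus the optional whitespace.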
One more test - what happens when the endpoint is unreachable?
curl -X POST http://127.0.0.1:9999/infill \
  -H "Content-Type: application/json" \
  -d '{"input_prefix": "<?php\nfunction greet(", "input_suffix": ""}' \
  -v
As expected, curl fails with:
* Trying 127.0.0.1:9999...
* connect to 127.0.0.1 port 9999 from 127.0.0.1 port 56266 failed: Connection refused
* Failed to connect to 127.0.0.1 port 9999 after 0 ms: Couldn't connect to server
* Closing connection
This means my plugin needs to handle connection failures gracefully - I don't want the whole thing to crash just because the server isn't running.
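A rough sketch of what "gracefully" could mean in Kotlin, using only the JDK. The function name postOrNull and the 2-second timeouts are assumptions made for illustration; the idea is simply that a refused connection becomes a null result instead of an exception travelling up the stack.

```kotlin
import java.io.IOException
import java.net.HttpURLConnection
import java.net.URI

// Hypothetical helper: returns the response body, or null when the server
// cannot be reached or answers with anything other than HTTP 200.
fun postOrNull(endpoint: String, jsonBody: String, timeoutMs: Int = 2000): String? {
    return try {
        val connection = URI(endpoint).toURL().openConnection() as HttpURLConnection
        connection.requestMethod = "POST"
        connection.connectTimeout = timeoutMs // give up if the connection takes too long
        connection.readTimeout = timeoutMs    // give up if the response takes too long
        connection.setRequestProperty("Content-Type", "application/json")
        connection.doOutput = true
        connection.outputStream.use { it.write(jsonBody.toByteArray(Charsets.UTF_8)) }
        if (connection.responseCode == HttpURLConnection.HTTP_OK) {
            connection.inputStream.bufferedReader().use { it.readText() }
        } else {
            null
        }
    } catch (e: IOException) {
        // "Connection refused" and timeouts both end up here
        null
    }
}
```

With nothing listening on the target port, as in the curl test above, this returns null rather than throwing.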
Implementing Settings
Before I can make HTTP requests, I need a way to configure the endpoint URL. I could hardcode it to http://127.0.0.1:8012/infill, but that would make the plugin less flexible. Different users (which I currently do not care about: I'm scratching an itch here) might run their llama.cpp server on different ports, or even on different machines.
I need to implement a settings page where users can configure:
- The endpoint URL (default: http://127.0.0.1:8012/infill)
- An optional API key (for when the server requires authentication; I do not need it yet, but I might someday)
Let me start by looking at how IntelliJ handles settings.
The Settings class
In IntelliJ, settings are typically implemented using the PersistentStateComponent interface. This interface provides automatic persistence: the IDE takes care of loading and saving the settings to disk. I just need to define what data to store.
Here's my Settings class:
package com.github.lucatume.completamente.settings

import com.intellij.openapi.components.PersistentStateComponent
import com.intellij.openapi.components.State
import com.intellij.openapi.components.Storage
import com.intellij.openapi.components.Service
import com.intellij.openapi.application.ApplicationManager

@State(
    name = "com.github.lucatume.completamente.settings.Settings",
    storages = [Storage("CompletamenteSettings.xml")]
)
@Service
class Settings : PersistentStateComponent<Settings.State> {
    // The state class holds the actual settings values
    data class State(
        var endpointUrl: String = "http://127.0.0.1:8012/infill",
        var apiKey: String = ""
    )

    private var myState = State()

    override fun getState(): State {
        return myState
    }

    override fun loadState(state: State) {
        myState = state
    }

    companion object {
        // Get the application-level instance of Settings
        fun getInstance(): Settings {
            return ApplicationManager.getApplication().getService(Settings::class.java)
        }
    }
}
The @State annotation tells IntelliJ where to store the settings (in an XML file called CompletamenteSettings.xml). The @Service annotation makes this a service that can be retrieved using Settings.getInstance().
The State data class is what gets serialized to disk. It's a simple data class with two fields: endpointUrl and apiKey, both with sensible defaults.
The SettingsConfigurable class
Now I need a UI for users to edit these settings. IntelliJ provides the Configurable interface for this:
package com.github.lucatume.completamente.settings

import com.intellij.openapi.options.Configurable
import javax.swing.JComponent
import javax.swing.JPanel
import javax.swing.JLabel
import javax.swing.JTextField
import java.awt.GridBagLayout
import java.awt.GridBagConstraints
import java.awt.Insets

class SettingsConfigurable : Configurable {
    private var endpointUrlField: JTextField? = null
    private var apiKeyField: JTextField? = null

    override fun getDisplayName(): String {
        return "Completamente"
    }

    override fun createComponent(): JComponent {
        val panel = JPanel(GridBagLayout())
        val constraints = GridBagConstraints()

        // Endpoint URL label and field
        constraints.gridx = 0
        constraints.gridy = 0
        constraints.anchor = GridBagConstraints.WEST
        constraints.insets = Insets(0, 0, 5, 10)
        panel.add(JLabel("Endpoint URL:"), constraints)

        endpointUrlField = JTextField(40)
        constraints.gridx = 1
        constraints.gridy = 0
        constraints.fill = GridBagConstraints.HORIZONTAL
        constraints.weightx = 1.0
        panel.add(endpointUrlField, constraints)

        // API Key label and field
        constraints.gridx = 0
        constraints.gridy = 1
        constraints.fill = GridBagConstraints.NONE
        constraints.weightx = 0.0
        panel.add(JLabel("API Key:"), constraints)

        apiKeyField = JTextField(40)
        constraints.gridx = 1
        constraints.gridy = 1
        constraints.fill = GridBagConstraints.HORIZONTAL
        constraints.weightx = 1.0
        panel.add(apiKeyField, constraints)

        return panel
    }

    override fun isModified(): Boolean {
        val settings = Settings.getInstance()
        val state = settings.state ?: return false
        return endpointUrlField?.text != state.endpointUrl ||
            apiKeyField?.text != state.apiKey
    }

    override fun apply() {
        val settings = Settings.getInstance()
        val state = settings.state ?: Settings.State()
        state.endpointUrl = endpointUrlField?.text ?: state.endpointUrl
        state.apiKey = apiKeyField?.text ?: state.apiKey
        settings.loadState(state)
    }

    override fun reset() {
        val settings = Settings.getInstance()
        val state = settings.state ?: return
        endpointUrlField?.text = state.endpointUrl
        apiKeyField?.text = state.apiKey
    }
}
I'm not a Swing expert (or even a Swing beginner, really), so this code is... functional. It uses GridBagLayout to arrange the labels and text fields. The best way I can explain it is: it's like CSS grid, but more verbose and from the 1990s.
The key methods are:
- createComponent(): Creates the UI
- isModified(): Checks if the user changed anything
- apply(): Saves the changes
- reset(): Reverts to the saved values
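The way isModified(), apply() and reset() play together is easier to see without Swing in the way. This is a sketch with plain classes, not IntelliJ API: SettingsForm is a hypothetical stand-in where two strings play the role of the text fields, and State mirrors the data class from the Settings service.

```kotlin
// Stand-in for the persisted settings; field names mirror the State class above.
data class State(
    var endpointUrl: String = "http://127.0.0.1:8012/infill",
    var apiKey: String = ""
)

// Hypothetical form model: plain strings instead of JTextField instances.
class SettingsForm(private val saved: State) {
    var endpointUrlText: String = saved.endpointUrl
    var apiKeyText: String = saved.apiKey

    // True when what was typed differs from what is saved.
    fun isModified(): Boolean =
        endpointUrlText != saved.endpointUrl || apiKeyText != saved.apiKey

    // Copy the typed values into the saved state.
    fun apply() {
        saved.endpointUrl = endpointUrlText
        saved.apiKey = apiKeyText
    }

    // Throw away the typed values and show the saved ones again.
    fun reset() {
        endpointUrlText = saved.endpointUrl
        apiKeyText = saved.apiKey
    }
}
```

Changing endpointUrlText makes isModified() return true, which is presumably what lets the IDE enable the Apply button; after apply() it returns false again, and reset() restores whatever was last saved.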
Registering the settings
Finally, I need to register both the service and the configurable in plugin.xml:
<extensions defaultExtensionNs="com.intellij">
    <inline.completion.provider implementation="com.github.lucatume.completamente.completion.Service"/>
    <applicationService serviceImplementation="com.github.lucatume.completamente.settings.Settings"/>
    <applicationConfigurable
        parentId="tools"
        instance="com.github.lucatume.completamente.settings.SettingsConfigurable"
        id="com.github.lucatume.completamente.settings.SettingsConfigurable"
        displayName="Completamente"/>
</extensions>
The applicationService entry makes the Settings service available, and the applicationConfigurable entry adds a "Completamente" page under "Tools" in the IDE settings dialog.
I ran ./gradlew build and it compiled successfully. The plugin settings section now appears in the IDE in all its brutalist glory.
Implementing HTTP Client Integration
Now that I have settings configured, I need to update the Service class to actually use them.
Instead of returning "Hello World!" every time, the service should:
- Extract the text before and after the cursor (prefix and suffix)
- Make an HTTP POST request to the configured endpoint
- Parse the JSON response and extract the content field
- Return the completion to the user
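The first step is small enough to sketch in isolation. splitAtCursor is a hypothetical helper, not something the plugin or the IntelliJ API provides; clamping the offset is an extra safety assumption so that an out-of-range cursor position cannot throw.

```kotlin
// Hypothetical helper: split the document text into (prefix, suffix) at the cursor offset.
fun splitAtCursor(text: String, offset: Int): Pair<String, String> {
    // Keep the offset inside the text bounds so substring never throws
    val safeOffset = offset.coerceIn(0, text.length)
    return text.substring(0, safeOffset) to text.substring(safeOffset)
}
```

For the text "function hello() {}" with the cursor right after the opening parenthesis (offset 15), the prefix is "function hello(" and the suffix is ") {}".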
The JDK provides HttpURLConnection, which should work. Let me update the Service class:
package com.github.lucatume.completamente.completion

import com.intellij.codeInsight.inline.completion.InlineCompletionEvent
import com.intellij.codeInsight.inline.completion.InlineCompletionProvider
import com.intellij.codeInsight.inline.completion.InlineCompletionProviderID
import com.intellij.codeInsight.inline.completion.InlineCompletionRequest
import com.intellij.codeInsight.inline.completion.suggestion.InlineCompletionSuggestion
import com.intellij.notification.NotificationGroupManager
import com.intellij.notification.NotificationType
import com.intellij.openapi.diagnostic.Logger
import com.github.lucatume.completamente.settings.Settings
import kotlinx.coroutines.Dispatchers
import kotlinx.coroutines.withContext
import org.json.JSONObject
import java.io.OutputStreamWriter
import java.net.HttpURLConnection
import java.net.URI

class Service : InlineCompletionProvider {
    private val logger = Logger.getInstance(Service::class.java)

    override val id: InlineCompletionProviderID
        get() = InlineCompletionProviderID("completamente")

    override suspend fun getSuggestion(request: InlineCompletionRequest): InlineCompletionSuggestion {
        // Get the text before and after the cursor
        val document = request.document
        val offset = request.startOffset
        val text = document.text
        val prefix = text.take(offset)
        val suffix = text.substring(offset)

        // Get the completion from the LLM
        val completion = getCompletion(prefix, suffix)
        return StringSuggestion(completion)
    }

    override fun isEnabled(event: InlineCompletionEvent): Boolean {
        return true
    }

    /**
     * Make an HTTP POST request to the llama.cpp infill endpoint.
     * Returns the completion text, or an empty string if the request fails.
     */
    private suspend fun getCompletion(prefix: String, suffix: String): String {
        return withContext(Dispatchers.IO) {
            try {
                val settings = Settings.getInstance()
                val state = settings.state
                val endpointUrl = state.endpointUrl

                // Create the HTTP connection
                val url = URI(endpointUrl).toURL()
                val connection = url.openConnection() as HttpURLConnection
                connection.requestMethod = "POST"
                connection.setRequestProperty("Content-Type", "application/json")
                connection.doOutput = true

                // Add API key if configured
                if (state.apiKey.isNotEmpty()) {
                    connection.setRequestProperty("Authorization", "Bearer ${state.apiKey}")
                }

                // Build the request body
                val requestBody = JSONObject()
                requestBody.put("input_prefix", prefix)
                requestBody.put("input_suffix", suffix)

                // Send the request, encoding the body as UTF-8 explicitly
                // instead of relying on the platform default charset
                OutputStreamWriter(connection.outputStream, Charsets.UTF_8).use { writer ->
                    writer.write(requestBody.toString())
                }

                // Read the response
                val responseCode = connection.responseCode
                if (responseCode == HttpURLConnection.HTTP_OK) {
                    val response = connection.inputStream.bufferedReader().use { it.readText() }
                    val jsonResponse = JSONObject(response)
                    val content = jsonResponse.optString("content", "")
                    logger.info("Got completion: $content")
                    content
                } else {
                    logger.warn("HTTP request failed with code: $responseCode")
                    showErrorNotification("Failed to get completion: HTTP $responseCode")
                    ""
                }
            } catch (e: Exception) {
                logger.warn("Failed to get completion", e)
                showErrorNotification("Failed to connect to LLM endpoint: ${e.message}")
                ""
            }
        }
    }

    /**
     * Show an error notification to the user.
     */
    private fun showErrorNotification(message: String) {
        NotificationGroupManager.getInstance()
            .getNotificationGroup("Completamente")
            .createNotification(message, NotificationType.ERROR)
            .notify(null)
    }
}
There's a lot going on here, so let me break it down:
- Coroutines and Dispatchers: The body of the getCompletion function is wrapped in withContext(Dispatchers.IO), which tells Kotlin to run this code on a background thread suitable for I/O operations. This is important because HTTP requests can be slow and we don't want to block the UI thread. The JavaScript mantra of not blocking the main thread is just as fundamental here, especially to keep the snappy IDE experience going.
- HTTP Request: I'm using the built-in HttpURLConnection class. It's not the most modern HTTP client (there are libraries like OkHttp that are nicer to use), but it works and doesn't require additional dependencies... wait, I already added org.json:json as a dependency because the JDK doesn't include a JSON parser. I guess I'm halfway to using modern libraries anyway. I will eventually refactor this into a dedicated requests object, so this is not as relevant now.
- JSON Parsing: I'm using org.json.JSONObject to parse the response. The optString method returns an empty string if the field is missing, which is convenient for error handling.
- Error Handling: I'm catching all exceptions and showing a notification to the user. This is important because if the endpoint is unreachable, I don't want the plugin to crash: I want to show a friendly error message to the user. To me.
Error Handling with Notifications
When the HTTP request fails (either due to a connection error or a non-200 status code), the plugin shows a balloon notification in the IDE.
To make this work, I had to register a notification group in plugin.xml:
<notificationGroup id="Completamente" displayType="BALLOON"/>
This creates a notification group called "Completamente" that displays as a balloon in the bottom-right corner of the IDE (the same place where build notifications appear).
Adding the JSON dependency
When I tried to build, I got compilation errors because the JDK doesn't include a JSON parser. I added the org.json library to build.gradle.kts:
dependencies {
    implementation("org.json:json:20240303")
    // ... other dependencies
}
The number after the library name, 20240303, is the version number; org.json appears to use the release date as its version. It works.
After that, ./gradlew build succeeded. The tests still pass (well, I had to update one test that was checking for "Hello World!" because now the service makes HTTP requests).
I have played around with it a bit and it's mostly working.
Next
In the next post I will concentrate on the HTTP request part of the code:
- refactoring to an abstracted API
- handling concurrent requests correctly
- testing it