It runs on my machine 02

Introduction

In the previous post, I managed to get a "Hello World!" completion working in my IntelliJ plugin, "completamente". The completion was hardcoded, always returning the same string, but it worked. Progress.

The next logical step is connecting to an actual LLM to get real completions. I'm following the llama.vim approach, which uses the llama.cpp server's infill endpoint. The idea is simple: send the code before the cursor (prefix) and after the cursor (suffix) to the endpoint, and it returns a completion suggestion.

This post documents my attempt at implementing settings, HTTP integration, and error handling for the plugin. I'm sure someone with more experience would develop a better solution, but I'm stuck with my knowledge of the subject, and I'm doing this to learn.

Understanding the llama.cpp infill endpoint

Before writing any Kotlin code, I need to understand what the llama.cpp infill endpoint expects and returns. The llama.vim plugin source code shows it uses a POST request to /infill with input_prefix and input_suffix parameters.

I'm running a llama.cpp server locally on port 8012 with a small model (Qwen2.5-Coder-3B). The command to start the server locally is llama-server --fim-qwen-3b-default --port 8012.

First, a simple test with just a prefix:

curl -X POST http://127.0.0.1:8012/infill \
  -H "Content-Type: application/json" \
  -d '{"input_prefix": "<?php\nfunction hello(", "input_suffix": ""}' \
  -s | jq '.content'

The response comes back quickly enough on my laptop:

")\n{\n  echo \"hello world\";\n}\n?>\n"

All fine and dandy. The key field here is content - that's the completion text I need to extract and show to the user. The response also includes a bunch of metadata about the generation settings, which I might use later for debugging and customization of the request, but for now I just need the content field.

Let me test with both prefix and suffix to see how the model handles Fill In the Middle (FIM):

curl -X POST http://127.0.0.1:8012/infill \
  -H "Content-Type: application/json" \
  -d '{"input_prefix": "<?php\nfunction calculateSum($a, $b) {", "input_suffix": "}"}' \
  -s | jq '.content'

The response:

"\n    return $a + $b;\n"

Perfect! The model understands that it needs to fill in the middle between the opening brace and the closing brace and that the code is PHP. This is exactly what I need for inline completions in the IDE. Or rather: this is what I need to start my completion work. I will customize the request and the context later.
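The two curl calls above boil down to "split the text at the cursor, then send both halves as JSON". Here is a minimal, standard-library-only sketch of that step. The names splitAtCursor, InfillRequest and jsonEscape are hypothetical helpers made up for illustration; only the input_prefix and input_suffix field names come from the requests shown above.

```kotlin
// Hypothetical helper: escape a string so it can sit inside a JSON string literal.
fun jsonEscape(s: String): String = buildString {
    for (c in s) {
        when (c) {
            '"' -> append("\\\"")
            '\\' -> append("\\\\")
            '\n' -> append("\\n")
            '\r' -> append("\\r")
            '\t' -> append("\\t")
            // Any other control character becomes a \uXXXX escape
            else -> if (c < ' ') append("\\u%04x".format(c.code)) else append(c)
        }
    }
}

// Hypothetical container for the two halves of the document.
data class InfillRequest(val prefix: String, val suffix: String) {
    // Same field names as the curl payloads: input_prefix and input_suffix
    fun toJson(): String =
        "{\"input_prefix\":\"${jsonEscape(prefix)}\",\"input_suffix\":\"${jsonEscape(suffix)}\"}"
}

// Split the text at the cursor; the offset is clamped so a stale offset cannot throw.
fun splitAtCursor(text: String, offset: Int): InfillRequest {
    val safe = offset.coerceIn(0, text.length)
    return InfillRequest(text.substring(0, safe), text.substring(safe))
}

fun main() {
    val text = "<?php\nfunction hello() {}"
    val request = splitAtCursor(text, text.length - 1) // cursor between the braces
    println(request.toJson())
    // prints {"input_prefix":"<?php\nfunction hello() {","input_suffix":"}"}
}
```

Hand-rolled escaping is only here to keep the sketch dependency-free; a JSON library is the safer choice in real code.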

One more test - what happens when the endpoint is unreachable?

curl -X POST http://127.0.0.1:9999/infill \
  -H "Content-Type: application/json" \
  -d '{"input_prefix": "<?php\nfunction greet(", "input_suffix": ""}' \
  -v

As expected, curl fails with:

*   Trying 127.0.0.1:9999...
* connect to 127.0.0.1 port 9999 from 127.0.0.1 port 56266 failed: Connection refused
* Failed to connect to 127.0.0.1 port 9999 after 0 ms: Couldn't connect to server
* Closing connection

This means my plugin needs to handle connection failures gracefully - I don't want the whole thing to crash just because the server isn't running.
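In Kotlin, "handle it gracefully" can be as small as catching the IOException that a refused connection raises and falling back to an empty completion. This is a sketch under assumptions, not the plugin's code: postOrEmpty and unusedLocalPort are hypothetical names, and the 2 second timeouts are an arbitrary guess.

```kotlin
import java.io.IOException
import java.net.HttpURLConnection
import java.net.ServerSocket
import java.net.URI

// Hypothetical helper: POST the body and return the raw response text,
// or an empty string when the server is unreachable or answers with an error.
fun postOrEmpty(endpoint: String, body: String, timeoutMs: Int = 2000): String {
    return try {
        val connection = URI(endpoint).toURL().openConnection() as HttpURLConnection
        connection.requestMethod = "POST"
        connection.connectTimeout = timeoutMs // assumption: tune for your setup
        connection.readTimeout = timeoutMs
        connection.doOutput = true
        connection.setRequestProperty("Content-Type", "application/json")
        connection.outputStream.use { it.write(body.toByteArray(Charsets.UTF_8)) }
        if (connection.responseCode == HttpURLConnection.HTTP_OK) {
            connection.inputStream.bufferedReader(Charsets.UTF_8).use { it.readText() }
        } else {
            ""
        }
    } catch (e: IOException) {
        // "Connection refused" and timeouts both land here
        ""
    }
}

// Ask the OS for a free port, then release it so nothing is listening there.
fun unusedLocalPort(): Int = ServerSocket(0).use { it.localPort }

fun main() {
    val port = unusedLocalPort()
    val result = postOrEmpty("http://127.0.0.1:$port/infill", "{\"input_prefix\":\"\",\"input_suffix\":\"\"}")
    println(if (result.isEmpty()) "no completion, server unreachable" else result)
}
```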

Implementing Settings

Before I can make HTTP requests, I need a way to configure the endpoint URL. I could hardcode it to http://127.0.0.1:8012/infill, but that would make the plugin less flexible. Different users (which I currently do not care about: scratching an itch here) might run their llama.cpp server on different ports, or even on different machines.

I need to implement a settings page where users can configure:

  1. The endpoint URL (default: http://127.0.0.1:8012/infill)
  2. An optional API key (for when the server requires authentication; I do not need it yet, but I might someday)

Let me start by looking at how IntelliJ handles settings.

The Settings class

In IntelliJ, settings are typically implemented using the PersistentStateComponent interface. This interface provides automatic persistence - the IDE takes care of loading and saving the settings to disk. I just need to define what data to store.

Here's my Settings class:

package com.github.lucatume.completamente.settings

import com.intellij.openapi.components.PersistentStateComponent
import com.intellij.openapi.components.State
import com.intellij.openapi.components.Storage
import com.intellij.openapi.components.Service
import com.intellij.openapi.application.ApplicationManager

@State(
    name = "com.github.lucatume.completamente.settings.Settings",
    storages = [Storage("CompletamenteSettings.xml")]
)
@Service
class Settings : PersistentStateComponent<Settings.State> {

    // The state class holds the actual settings values
    data class State(
        var endpointUrl: String = "http://127.0.0.1:8012/infill",
        var apiKey: String = ""
    )

    private var myState = State()

    override fun getState(): State {
        return myState
    }

    override fun loadState(state: State) {
        myState = state
    }

    companion object {
        // Get the application-level instance of Settings
        fun getInstance(): Settings {
            return ApplicationManager.getApplication().getService(Settings::class.java)
        }
    }
}

The @State annotation tells IntelliJ where to store the settings (in an XML file called CompletamenteSettings.xml). The @Service annotation makes this a service that can be retrieved using Settings.getInstance().

The State data class is what gets serialized to disk. It's a simple data class with two fields: endpointUrl and apiKey, both with sensible defaults.

The SettingsConfigurable class

Now I need a UI for users to edit these settings. IntelliJ provides the Configurable interface for this:

package com.github.lucatume.completamente.settings

import com.intellij.openapi.options.Configurable
import javax.swing.JComponent
import javax.swing.JPanel
import javax.swing.JLabel
import javax.swing.JTextField
import java.awt.GridBagLayout
import java.awt.GridBagConstraints
import java.awt.Insets

class SettingsConfigurable : Configurable {

    private var endpointUrlField: JTextField? = null
    private var apiKeyField: JTextField? = null

    override fun getDisplayName(): String {
        return "Completamente"
    }

    override fun createComponent(): JComponent {
        val panel = JPanel(GridBagLayout())
        val constraints = GridBagConstraints()

        // Endpoint URL label and field
        constraints.gridx = 0
        constraints.gridy = 0
        constraints.anchor = GridBagConstraints.WEST
        constraints.insets = Insets(0, 0, 5, 10)
        panel.add(JLabel("Endpoint URL:"), constraints)

        endpointUrlField = JTextField(40)
        constraints.gridx = 1
        constraints.gridy = 0
        constraints.fill = GridBagConstraints.HORIZONTAL
        constraints.weightx = 1.0
        panel.add(endpointUrlField, constraints)

        // API Key label and field
        constraints.gridx = 0
        constraints.gridy = 1
        constraints.fill = GridBagConstraints.NONE
        constraints.weightx = 0.0
        panel.add(JLabel("API Key:"), constraints)

        apiKeyField = JTextField(40)
        constraints.gridx = 1
        constraints.gridy = 1
        constraints.fill = GridBagConstraints.HORIZONTAL
        constraints.weightx = 1.0
        panel.add(apiKeyField, constraints)

        return panel
    }

    override fun isModified(): Boolean {
        val settings = Settings.getInstance()
        val state = settings.state ?: return false

        return endpointUrlField?.text != state.endpointUrl ||
               apiKeyField?.text != state.apiKey
    }

    override fun apply() {
        val settings = Settings.getInstance()
        val state = settings.state ?: Settings.State()

        state.endpointUrl = endpointUrlField?.text ?: state.endpointUrl
        state.apiKey = apiKeyField?.text ?: state.apiKey

        settings.loadState(state)
    }

    override fun reset() {
        val settings = Settings.getInstance()
        val state = settings.state ?: return

        endpointUrlField?.text = state.endpointUrl
        apiKeyField?.text = state.apiKey
    }
}

I'm not a Swing expert (or even a Swing beginner, really), so this code is... functional. It uses GridBagLayout to arrange the labels and text fields. The best way I can explain it is: it's like CSS grid, but more verbose and from the 1990s.

The key methods are:

  • createComponent(): Creates the UI
  • isModified(): Checks if the user changed anything
  • apply(): Saves the changes
  • reset(): Reverts to the saved values
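The interplay between these methods is easier to see without Swing in the way. Below is a plain Kotlin sketch in which SavedState and FormModel are hypothetical stand-ins for the persisted state and the two text fields; it only mirrors the logic of isModified(), apply() and reset() shown above.

```kotlin
// Hypothetical stand-in for the persisted settings state.
data class SavedState(
    var endpointUrl: String = "http://127.0.0.1:8012/infill",
    var apiKey: String = ""
)

// Hypothetical stand-in for the settings form: two editable values plus the saved state.
class FormModel(private val saved: SavedState) {
    // What the text fields currently show
    var endpointUrl: String = saved.endpointUrl
    var apiKey: String = saved.apiKey

    // isModified(): true as soon as the form differs from what is saved
    fun isModified(): Boolean =
        endpointUrl != saved.endpointUrl || apiKey != saved.apiKey

    // apply(): copy the form values into the saved state
    fun apply() {
        saved.endpointUrl = endpointUrl
        saved.apiKey = apiKey
    }

    // reset(): throw away the edits and show the saved values again
    fun reset() {
        endpointUrl = saved.endpointUrl
        apiKey = saved.apiKey
    }
}

fun main() {
    val saved = SavedState()
    val form = FormModel(saved)
    println(form.isModified()) // false: nothing edited yet

    form.endpointUrl = "http://127.0.0.1:9000/infill"
    println(form.isModified()) // true: the form and the saved state differ

    form.reset()
    println(form.isModified()) // false: the edit was discarded

    form.endpointUrl = "http://127.0.0.1:9000/infill"
    form.apply()
    println(saved.endpointUrl) // http://127.0.0.1:9000/infill
}
```

The IDE uses the isModified() result to decide whether the Apply button in the settings dialog should be enabled.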

Registering the settings

Finally, I need to register both the service and the configurable in plugin.xml:

<extensions defaultExtensionNs="com.intellij">
    <inline.completion.provider implementation="com.github.lucatume.completamente.completion.Service"/>
    <applicationService serviceImplementation="com.github.lucatume.completamente.settings.Settings"/>
    <applicationConfigurable
            parentId="tools"
            instance="com.github.lucatume.completamente.settings.SettingsConfigurable"
            id="com.github.lucatume.completamente.settings.SettingsConfigurable"
            displayName="Completamente"/>
</extensions>

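The applicationConfigurable entry adds a "Completamente" page under "Tools" in the IDE settings dialog. The applicationService entry registers the Settings service; since the class already carries the @Service annotation, one of the two registrations should be enough as far as I can tell, so this entry is probably redundant.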

I ran ./gradlew build, it compiled successfully, and the plugin settings section appears in all its brutalist glory:

Completamente settings sections appearing in the IDE Tools

Implementing HTTP Client Integration

Now that I have settings configured, I need to update the Service class to actually use them.

Instead of returning "Hello World!" every time, the service should:

  1. Extract the text before and after the cursor (prefix and suffix)
  2. Make an HTTP POST request to the configured endpoint
  3. Parse the JSON response and extract the content field
  4. Return the completion to the user

The JDK provides HttpURLConnection which should work. Let me update the Service class:

package com.github.lucatume.completamente.completion

import com.intellij.codeInsight.inline.completion.InlineCompletionEvent
import com.intellij.codeInsight.inline.completion.InlineCompletionProvider
import com.intellij.codeInsight.inline.completion.InlineCompletionProviderID
import com.intellij.codeInsight.inline.completion.InlineCompletionRequest
import com.intellij.codeInsight.inline.completion.suggestion.InlineCompletionSuggestion
import com.intellij.notification.NotificationGroupManager
import com.intellij.notification.NotificationType
import com.intellij.openapi.diagnostic.Logger
import com.github.lucatume.completamente.settings.Settings
import kotlinx.coroutines.Dispatchers
import kotlinx.coroutines.withContext
import org.json.JSONObject
import java.io.OutputStreamWriter
import java.net.HttpURLConnection
import java.net.URI

class Service : InlineCompletionProvider {
    private val logger = Logger.getInstance(Service::class.java)

    override val id: InlineCompletionProviderID
        get() = InlineCompletionProviderID("completamente")

    override suspend fun getSuggestion(request: InlineCompletionRequest): InlineCompletionSuggestion {
        // Get the text before and after the cursor
        val document = request.document
        val offset = request.startOffset
        val text = document.text

        val prefix = text.take(offset)
        val suffix = text.substring(offset)

        // Get the completion from the LLM
        val completion = getCompletion(prefix, suffix)

        return StringSuggestion(completion)
    }

    override fun isEnabled(event: InlineCompletionEvent): Boolean {
        return true
    }

    /**
     * Make an HTTP POST request to the llama.cpp infill endpoint.
     * Returns the completion text, or an empty string if the request fails.
     */
    private suspend fun getCompletion(prefix: String, suffix: String): String {
        return withContext(Dispatchers.IO) {
            try {
                val settings = Settings.getInstance()
                val state = settings.state
                val endpointUrl = state.endpointUrl

                // Create the HTTP connection
                val url = URI(endpointUrl).toURL()
                val connection = url.openConnection() as HttpURLConnection
                connection.requestMethod = "POST"
                connection.setRequestProperty("Content-Type", "application/json")
                connection.doOutput = true

                // Add API key if configured
                if (state.apiKey.isNotEmpty()) {
                    connection.setRequestProperty("Authorization", "Bearer ${state.apiKey}")
                }

                // Build the request body
                val requestBody = JSONObject()
                requestBody.put("input_prefix", prefix)
                requestBody.put("input_suffix", suffix)

                // Send the request, encoding the body explicitly as UTF-8
                // (use {} flushes and closes the writer, even if write() throws)
                OutputStreamWriter(connection.outputStream, Charsets.UTF_8).use { writer ->
                    writer.write(requestBody.toString())
                }

                // Read the response
                val responseCode = connection.responseCode
                if (responseCode == HttpURLConnection.HTTP_OK) {
                    val response = connection.inputStream.bufferedReader().use { it.readText() }
                    val jsonResponse = JSONObject(response)
                    val content = jsonResponse.optString("content", "")

                    logger.info("Got completion: $content")
                    content
                } else {
                    logger.warn("HTTP request failed with code: $responseCode")
                    showErrorNotification("Failed to get completion: HTTP $responseCode")
                    ""
                }
            } catch (e: Exception) {
                logger.warn("Failed to get completion", e)
                showErrorNotification("Failed to connect to LLM endpoint: ${e.message}")
                ""
            }
        }
    }

    /**
     * Show an error notification to the user.
     */
    private fun showErrorNotification(message: String) {
        NotificationGroupManager.getInstance()
            .getNotificationGroup("Completamente")
            .createNotification(message, NotificationType.ERROR)
            .notify(null)
    }
}

There's a lot going on here, so let me break it down:

  1. Coroutines and Dispatchers: The body of getCompletion is wrapped in withContext(Dispatchers.IO), which tells Kotlin to run this code on a background thread pool meant for blocking I/O. This is important because HTTP requests can be slow and I don't want to block the UI thread. The JavaScript mantra of not blocking the main thread is just as fundamental here, especially to keep the IDE experience snappy.

  2. HTTP Request: I'm using the built-in HttpURLConnection class. It's not the most modern HTTP client (there are libraries like OkHttp that are nicer to use), but it works and doesn't require additional dependencies... wait, I already added org.json:json as a dependency because the JDK doesn't include a JSON parser. I guess I'm halfway to using modern libraries anyway. I will eventually refactor this into a dedicated requests object, so this is not as relevant now.

  3. JSON Parsing: I'm using org.json.JSONObject to parse the response. The optString method returns an empty string if the field is missing, which is convenient for error handling.

  4. Error Handling: I'm catching all exceptions and showing a notification to the user. This is important because if the endpoint is unreachable, I don't want the plugin to crash - I want to show a friendly error message to the user. To me.
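The Service above returns an empty string for every failure. The two failure modes (a non-200 status and an unreachable server) can also be told apart explicitly. This sketch is one possible shape, not the plugin's code: InfillResult, postInfill and startStub are hypothetical names, and the stub is a throwaway stand-in for llama-server built on the JDK's com.sun.net.httpserver so the example runs without a model.

```kotlin
import com.sun.net.httpserver.HttpServer
import java.io.IOException
import java.net.HttpURLConnection
import java.net.InetSocketAddress
import java.net.URI

// Hypothetical result type: one case per outcome the plugin has to handle.
sealed class InfillResult {
    data class Ok(val body: String) : InfillResult()
    data class HttpError(val code: Int) : InfillResult()
    data class Unreachable(val reason: String) : InfillResult()
}

// POST the JSON and classify what happened; timeouts are an arbitrary guess.
fun postInfill(endpoint: String, json: String): InfillResult = try {
    val connection = URI(endpoint).toURL().openConnection() as HttpURLConnection
    connection.requestMethod = "POST"
    connection.connectTimeout = 2000
    connection.readTimeout = 2000
    connection.doOutput = true
    connection.setRequestProperty("Content-Type", "application/json")
    connection.outputStream.use { it.write(json.toByteArray(Charsets.UTF_8)) }
    val code = connection.responseCode
    if (code == HttpURLConnection.HTTP_OK) {
        InfillResult.Ok(connection.inputStream.bufferedReader(Charsets.UTF_8).use { it.readText() })
    } else {
        InfillResult.HttpError(code)
    }
} catch (e: IOException) {
    InfillResult.Unreachable(e.message ?: e.javaClass.simpleName)
}

// Throwaway stub: /infill answers 200 with a canned body, /broken answers 500.
fun startStub(): HttpServer {
    val server = HttpServer.create(InetSocketAddress("127.0.0.1", 0), 0)
    server.createContext("/infill") { exchange ->
        exchange.requestBody.use { it.readBytes() } // drain the request
        val reply = "{\"content\":\"hello\"}".toByteArray(Charsets.UTF_8)
        exchange.sendResponseHeaders(200, reply.size.toLong())
        exchange.responseBody.use { it.write(reply) }
    }
    server.createContext("/broken") { exchange ->
        exchange.requestBody.use { it.readBytes() }
        exchange.sendResponseHeaders(500, -1) // -1 means "no response body"
        exchange.close()
    }
    server.start()
    return server
}

fun main() {
    val stub = startStub()
    val base = "http://127.0.0.1:${stub.address.port}"
    println(postInfill("$base/infill", "{}")) // the Ok case
    println(postInfill("$base/broken", "{}")) // the HttpError case
    stub.stop(0)
    println(postInfill("$base/infill", "{}")) // the Unreachable case: the stub is gone
}
```

A caller could then pick a different notification text for each case with a when expression over the result.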

Error Handling with Notifications

When the HTTP request fails (either due to a connection error or a non-200 status code), the plugin shows a balloon notification in the IDE.

To make this work, I had to register a notification group in plugin.xml, inside the same <extensions> block as the other entries:

<notificationGroup id="Completamente" displayType="BALLOON"/>

This creates a notification group called "Completamente" that displays as a balloon in the bottom-right corner of the IDE (the same place where build notifications appear).

Notification balloon showing

Adding the JSON dependency

When I tried to build, I got compilation errors because the JDK doesn't include a JSON parser. I added the org.json library to build.gradle.kts:

dependencies {
    implementation("org.json:json:20240303")
    // ... other dependencies
}

The number after it, 20240303, is the version number: org.json uses release dates (YYYYMMDD) as versions, so this is the release from March 3, 2024. It works.

After that, ./gradlew build succeeded. The tests still pass (well, I had to update one test that was checking for "Hello World!" because now the service makes HTTP requests).

I have played around a bit and it's mostly working.

Completions working in a PHP file

Next

In the next post I will concentrate on the HTTP request part of the code:

  • refactoring to an abstracted API
  • handling concurrent requests correctly
  • testing it