Originally published on Medium: Debugging Ktor HTTP Client Performance — OkHttp vs CIO file upload speeds
The problem
I was working on a Kotlin Multiplatform application that needed to upload large files (up to ~100MB) to AWS S3 using a presigned URL. The initial implementation used OkHttp, but it was significantly slower compared to Postman when uploading the same files to the same S3 endpoint.
To isolate variables and test different HTTP client engines, I built a small sample app. The surprising result: CIO (Ktor's native engine) was dramatically faster than OkHttp, roughly 5-10× depending on the run, for the same upload.
That raised the core question: is OkHttp inherently slower here, or is something else going on?
Initial observations
In the sample app:
- CIO: ~3–4 seconds for 15MB uploads
- OkHttp: ~30–40 seconds for the same uploads
Step 1: create a controlled test setup
I set up a simple Kotlin project with two implementations:
- Ktor with OkHttp — Ktor’s OkHttp engine
- Ktor with CIO — Ktor’s native CIO engine
Initial test harness
```kotlin
import kotlinx.coroutines.runBlocking

fun main(args: Array<String>) {
    val implementation = args.getOrNull(0)?.lowercase() ?: "ktor-cio"
    // presignedUrl, file, and checksum are prepared elsewhere in the harness
    when (implementation) {
        "ktor-okhttp" -> runBlocking { uploadWithKtorOkHttp(presignedUrl, file, checksum) }
        "ktor-cio" -> runBlocking { uploadWithKtorCio(presignedUrl, file, checksum) }
    }
}
```
First results
| Implementation | Time | Speed |
|---|---|---|
| Ktor OkHttp | 37.25s | 0.40 MB/s |
| Ktor CIO | 7.55s | 1.99 MB/s |
Key finding: CIO was ~5× faster than OkHttp in this setup.
Step 2: add basic logging
Next, I added basic logging to see what was happening on the wire.
```kotlin
// Ktor OkHttp with basic logging
suspend fun uploadWithKtorOkHttp(presignedUrl: String, file: File, checksum: String) {
    val client = HttpClient(io.ktor.client.engine.okhttp.OkHttp) {
        install(Logging) {
            logger = Logger.DEFAULT
            level = LogLevel.HEADERS
        }
    }
    // ... rest of implementation
}

// Ktor CIO with basic logging
suspend fun uploadWithKtorCio(presignedUrl: String, file: File, checksum: String) {
    val client = HttpClient(io.ktor.client.engine.cio.CIO) {
        install(Logging) {
            logger = Logger.DEFAULT
            level = LogLevel.HEADERS
        }
    }
    // ... rest of implementation
}
```
Logging results
Ktor OkHttp output:
```
Ktor-OkHttp: REQUEST: https://...
Ktor-OkHttp: RESPONSE: 200
Ktor-OkHttp: Protocol: HTTP/2.0
Ktor-OkHttp: Total request time: 37190ms
```
Ktor CIO output:
```
Ktor-CIO: REQUEST: https://...
Ktor-CIO: RESPONSE: 200 OK
Ktor-CIO: Protocol: HTTP/1.1
Ktor-CIO: Total request time: 7486ms
```
Key finding: CIO was using HTTP/1.1, while OkHttp was using HTTP/2.0.
That’s counterintuitive because HTTP/2 is generally considered more efficient—yet CIO still won by a lot. That suggested the bottleneck wasn’t simply “network protocol efficiency”.
Step 3: deep performance monitoring
At this point, I wanted to know exactly where time was being spent. I added instrumentation for:
- time per phase
- memory deltas
- thread count deltas
- file read behavior
Performance monitor
```kotlin
import java.lang.management.ManagementFactory
import java.lang.management.MemoryMXBean
import java.lang.management.ThreadMXBean

class PerformanceMonitor(private val name: String) {
    private val memoryBean: MemoryMXBean = ManagementFactory.getMemoryMXBean()
    private val threadBean: ThreadMXBean = ManagementFactory.getThreadMXBean()
    private val startTime = System.currentTimeMillis()
    private val startMemory = getUsedMemory()
    private val startThreads = threadBean.threadCount

    fun logPhase(phase: String) {
        val elapsed = System.currentTimeMillis() - startTime
        val memoryDelta = getUsedMemory() - startMemory
        val threadDelta = threadBean.threadCount - startThreads
        println("[$name] $phase - Time: ${elapsed}ms, Memory: +${memoryDelta}MB, Threads: +$threadDelta")
    }

    // Heap + non-heap usage, in MB
    private fun getUsedMemory(): Long {
        val heapMemory = memoryBean.heapMemoryUsage.used
        val nonHeapMemory = memoryBean.nonHeapMemoryUsage.used
        return (heapMemory + nonHeapMemory) / (1024 * 1024)
    }
}
```
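The two MX beans the monitor uses come from the standard `java.lang.management` API, so no extra dependencies are needed. A minimal standalone check of those calls:

```kotlin
import java.lang.management.ManagementFactory

fun main() {
    // Same beans PerformanceMonitor relies on, queried directly
    val memoryBean = ManagementFactory.getMemoryMXBean()
    val threadBean = ManagementFactory.getThreadMXBean()

    val usedMb = (memoryBean.heapMemoryUsage.used +
            memoryBean.nonHeapMemoryUsage.used) / (1024 * 1024)
    println("Used memory: ${usedMb}MB, live threads: ${threadBean.threadCount}")
}
```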
File I/O monitoring
```kotlin
import java.io.File
import java.io.FileInputStream

class MonitoredFileInputStream(
    file: File,
    private val monitorName: String
) : FileInputStream(file) {
    private var bytesRead = 0L
    private var readOperations = 0
    private val startTime = System.currentTimeMillis()

    override fun read(b: ByteArray, off: Int, len: Int): Int {
        val result = super.read(b, off, len)
        if (result > 0) {
            bytesRead += result
            readOperations++
        }
        return result
    }

    override fun close() {
        val totalTime = System.currentTimeMillis() - startTime
        println("[$monitorName] FileInputStream Stats - Bytes: $bytesRead, Operations: $readOperations, Time: ${totalTime}ms")
        super.close()
    }
}
```
Enhanced upload (CIO)
```kotlin
suspend fun uploadWithKtorCio(presignedUrl: String, file: File, checksum: String) {
    val monitor = PerformanceMonitor("Ktor-CIO")
    println("=== Ktor CIO Implementation ===")
    monitor.logPhase("Client Creation Start")

    val client = HttpClient(io.ktor.client.engine.cio.CIO) {
        install(HttpTimeout) {
            requestTimeoutMillis = 120_000L
            connectTimeoutMillis = 10_000L
            socketTimeoutMillis = 120_000L
        }
        install(Logging) {
            logger = object : Logger {
                override fun log(message: String) {
                    println("[Ktor-CIO] $message")
                }
            }
            level = LogLevel.ALL
        }
    }

    try {
        monitor.logPhase("Client Creation Complete")
        monitor.logPhase("Request Preparation Start")
        val response: HttpResponse = client.put(presignedUrl) {
            setBody(MonitoredFileInputStream(file, "Ktor-CIO"))
            header("x-amz-checksum-sha256", checksum)
            header("Content-Type", "application/octet-stream")
            // Note: some Ktor versions treat Content-Length as engine-managed
            // and reject setting it explicitly
            header("Content-Length", file.length().toString())
        }
        monitor.logPhase("Request Complete")
        monitor.logPhase("Response Processing Complete")
    } finally {
        client.close()
        monitor.logPhase("Client Cleanup Complete")
    }
}
```
Step 4: the breakthrough — file I/O
With monitoring enabled, the bottleneck became obvious.
Ktor CIO results
```
[Ktor-CIO] Client Creation Start - Time: 3ms, Memory: +0MB, Threads: +0
[Ktor-CIO] Client Creation Complete - Time: 114ms, Memory: +24MB, Threads: +3
[Ktor-CIO] Request Preparation Start - Time: 115ms, Memory: +24MB, Threads: +3
[Ktor-CIO] FileInputStream Stats - Bytes: 15728640, Operations: 3840, Time: 5669ms
[Ktor-CIO] Request Complete - Time: 7486ms, Memory: +68MB, Threads: +22
```
Ktor OkHttp results
```
[Ktor-OkHttp] Client Creation Start - Time: 3ms, Memory: +0MB, Threads: +0
[Ktor-OkHttp] Client Creation Complete - Time: 107ms, Memory: +15MB, Threads: +3
[Ktor-OkHttp] Request Preparation Start - Time: 107ms, Memory: +15MB, Threads: +3
[Ktor-OkHttp] FileInputStream Stats - Bytes: 15728640, Operations: 3840, Time: 36775ms
[Ktor-OkHttp] Request Complete - Time: 37190ms, Memory: +58MB, Threads: +11
```
Root cause
OkHttp wasn’t slow on the network. OkHttp was slow at reading the file.
File I/O time
- Ktor CIO: 5,669ms file read time (~75.7% of total)
- Ktor OkHttp: 36,775ms file read time (~98.9% of total)
Both implementations read:
- Bytes: 15,728,640 (100% of file)
- Read ops: 3,840 operations
- Average chunk: ~4KB per read
Network time (surprise)
- CIO: ~1,817ms network time
- OkHttp: ~415ms network time
OkHttp actually had faster network transfer, but the overall upload looked slower because file I/O dominated end-to-end time.
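The derived figures above come straight from the monitoring output; a quick sanity check of the arithmetic (numbers copied from the logs):

```kotlin
fun main() {
    val totalBytes = 15_728_640L   // bytes read, both engines
    val readOps = 3_840L           // read operations, both engines
    println("Average chunk: ${totalBytes / readOps} bytes")  // 4096

    // Network time = total request time - file read time
    val cioNetworkMs = 7_486L - 5_669L        // 1817
    val okhttpNetworkMs = 37_190L - 36_775L   // 415
    println("CIO network: ${cioNetworkMs}ms, OkHttp network: ${okhttpNetworkMs}ms")
}
```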
Why CIO can be faster (in this scenario)
In this setup, the difference appeared to come from how each engine moves bytes from the file into the request body, rather than from the protocol on the wire:
- Buffer management: CIO reads through Ktor's own buffer pool, tuned for its engine; the OkHttp path appears to add extra per-read overhead on top of the same 4KB chunks.
- Memory copying: every additional copy between the InputStream, Ktor's internal channels, and the engine's buffers is paid thousands of times over on a large payload.
- Coroutine integration: CIO is coroutine-native end to end; the OkHttp engine bridges between coroutine execution and OkHttp's thread-based I/O, which can add handoff cost on every read.
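Given the 4KB read pattern observed above, one cheap experiment (not something this article measured) is to serve the engine's many small reads from a larger in-memory buffer with `BufferedInputStream`. The helper name and the 256KB buffer size here are illustrative choices, not part of the original test:

```kotlin
import java.io.BufferedInputStream
import java.io.File
import java.io.FileInputStream
import java.io.InputStream

// Hypothetical helper: back the engine's small reads with a larger buffer,
// so most 4KB reads hit memory instead of the file.
fun File.bufferedUploadStream(bufferSize: Int = 256 * 1024): InputStream =
    BufferedInputStream(FileInputStream(this), bufferSize)

fun main() {
    val file = File.createTempFile("upload", ".bin").apply {
        writeBytes(ByteArray(64 * 1024))
        deleteOnExit()
    }
    file.bufferedUploadStream().use { input ->
        var total = 0L
        val chunk = ByteArray(4 * 1024) // engine-style 4KB reads
        while (true) {
            val n = input.read(chunk)
            if (n < 0) break
            total += n
        }
        println("read $total bytes") // read 65536 bytes
    }
}
```

Whether this actually helps depends on how the engine consumes the stream, so it would need the same per-phase measurement as above.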
Final performance summary
| Implementation | Total time | Speed | File I/O time | Network time |
|---|---|---|---|---|
| Ktor CIO | 7.55s | 1.99 MB/s | 5,669ms | ~1,817ms |
| Ktor OkHttp | 37.25s | 0.40 MB/s | 36,775ms | ~415ms |
Key takeaways
- File I/O was the bottleneck: nearly all of OkHttp’s time was spent reading the file, not uploading it.
- Engine choice matters: the client engine can materially affect I/O performance, not just “network performance”.
- Measure everything: what looks like a network problem can be an I/O problem.
Practical solution
For Kotlin Multiplatform apps doing high-throughput file uploads via presigned URLs, consider using CIO (especially on JVM) when you observe disproportionate time spent in file I/O.
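As a configuration sketch (Ktor 2.x-style API; the timeout values are the ones used in the test above, not general recommendations), a CIO-backed client for these uploads might look like:

```kotlin
import io.ktor.client.*
import io.ktor.client.engine.cio.*
import io.ktor.client.plugins.*

// Sketch: CIO-backed client for large presigned-URL uploads (JVM).
fun largeUploadClient(): HttpClient = HttpClient(CIO) {
    install(HttpTimeout) {
        requestTimeoutMillis = 120_000L  // large files need generous timeouts
        connectTimeoutMillis = 10_000L
        socketTimeoutMillis = 120_000L
    }
}
```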
Note: This analysis was run on the JVM with a ~15.7MB test file under identical conditions. Results may vary by platform (Android/iOS/native), file size, network, and device characteristics.