Originally published on Medium: Debugging Ktor HTTP Client Performance — OkHttp vs CIO file upload speeds
The problem
I was working on a Kotlin Multiplatform application that needed to upload large files (up to ~100MB) to AWS S3 using a presigned URL. The initial implementation used OkHttp, but it was significantly slower compared to Postman when uploading the same files to the same S3 endpoint.
To isolate variables and test different HTTP client engines, I built a small sample app. The surprising result: CIO (Ktor's native engine) was dramatically faster than OkHttp, roughly 5-10× depending on the run, for the same upload.
That raised the core question: is OkHttp inherently slower here, or is something else going on?
Initial observations
In the sample app:
- CIO: ~3–4 seconds for 15MB uploads
- OkHttp: ~30–40 seconds for the same uploads
Step 1: create a controlled test setup
I set up a simple Kotlin project with two implementations:
- Ktor with OkHttp — Ktor’s OkHttp engine
- Ktor with CIO — Ktor’s native CIO engine
Initial test harness
```kotlin
import kotlinx.coroutines.runBlocking

fun main(args: Array<String>) {
    val implementation = args.getOrNull(0)?.lowercase() ?: "ktor-cio"
    // presignedUrl, file, and checksum are prepared elsewhere in the harness
    when (implementation) {
        "ktor-okhttp" -> runBlocking { uploadWithKtorOkHttp(presignedUrl, file, checksum) }
        "ktor-cio" -> runBlocking { uploadWithKtorCio(presignedUrl, file, checksum) }
    }
}
```
First results
| Implementation | Time | Speed |
|---|---|---|
| Ktor OkHttp | 37.25s | 0.40 MB/s |
| Ktor CIO | 7.55s | 1.99 MB/s |
Key finding: CIO was ~5× faster than OkHttp in this setup.
Step 2: add basic logging
Next, I added basic logging to see what was happening on the wire.
```kotlin
// Ktor OkHttp with basic logging
suspend fun uploadWithKtorOkHttp(presignedUrl: String, file: File, checksum: String) {
    val client = HttpClient(io.ktor.client.engine.okhttp.OkHttp) {
        install(Logging) {
            logger = Logger.DEFAULT
            level = LogLevel.HEADERS
        }
    }
    // ... rest of implementation
}

// Ktor CIO with basic logging
suspend fun uploadWithKtorCio(presignedUrl: String, file: File, checksum: String) {
    val client = HttpClient(io.ktor.client.engine.cio.CIO) {
        install(Logging) {
            logger = Logger.DEFAULT
            level = LogLevel.HEADERS
        }
    }
    // ... rest of implementation
}
```
Logging results
Ktor OkHttp output:
```
Ktor-OkHttp: REQUEST: https://...
Ktor-OkHttp: RESPONSE: 200
Ktor-OkHttp: Protocol: HTTP/2.0
Ktor-OkHttp: Total request time: 37190ms
```
Ktor CIO output:
```
Ktor-CIO: REQUEST: https://...
Ktor-CIO: RESPONSE: 200 OK
Ktor-CIO: Protocol: HTTP/1.1
Ktor-CIO: Total request time: 7486ms
```
Key finding: CIO was using HTTP/1.1, while OkHttp was using HTTP/2.0.
That’s counterintuitive because HTTP/2 is generally considered more efficient—yet CIO still won by a lot. That suggested the bottleneck wasn’t simply “network protocol efficiency”.
Step 3: deep performance monitoring
At this point, I wanted to know exactly where time was being spent. I added instrumentation for:
- time per phase
- memory deltas
- thread count deltas
- file read behavior
Performance monitor
```kotlin
import java.lang.management.ManagementFactory
import java.lang.management.MemoryMXBean
import java.lang.management.ThreadMXBean

class PerformanceMonitor(private val name: String) {
    private val memoryBean: MemoryMXBean = ManagementFactory.getMemoryMXBean()
    private val threadBean: ThreadMXBean = ManagementFactory.getThreadMXBean()
    private val startTime = System.currentTimeMillis()
    private val startMemory = getUsedMemory()
    private val startThreads = threadBean.threadCount

    fun logPhase(phase: String) {
        val elapsed = System.currentTimeMillis() - startTime
        val memoryDelta = getUsedMemory() - startMemory
        val threadDelta = threadBean.threadCount - startThreads
        println("[$name] $phase - Time: ${elapsed}ms, Memory: +${memoryDelta}MB, Threads: +$threadDelta")
    }

    // Heap + non-heap usage, in MB
    private fun getUsedMemory(): Long {
        val heapMemory = memoryBean.heapMemoryUsage.used
        val nonHeapMemory = memoryBean.nonHeapMemoryUsage.used
        return (heapMemory + nonHeapMemory) / (1024 * 1024)
    }
}
```
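The two MX beans the monitor uses come from the standard `java.lang.management` API, so no extra dependencies are needed. A minimal standalone check of those calls:

```kotlin
import java.lang.management.ManagementFactory

fun main() {
    // Same beans PerformanceMonitor relies on, queried directly
    val memoryBean = ManagementFactory.getMemoryMXBean()
    val threadBean = ManagementFactory.getThreadMXBean()

    val usedMb = (memoryBean.heapMemoryUsage.used +
            memoryBean.nonHeapMemoryUsage.used) / (1024 * 1024)
    println("Used memory: ${usedMb}MB, live threads: ${threadBean.threadCount}")
}
```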
File I/O monitoring
```kotlin
import java.io.File
import java.io.FileInputStream

class MonitoredFileInputStream(
    file: File,
    private val monitorName: String
) : FileInputStream(file) {
    private var bytesRead = 0L
    private var readOperations = 0
    private val startTime = System.currentTimeMillis()

    override fun read(b: ByteArray, off: Int, len: Int): Int {
        val result = super.read(b, off, len)
        if (result > 0) {
            bytesRead += result
            readOperations++
        }
        return result
    }

    override fun close() {
        val totalTime = System.currentTimeMillis() - startTime
        println("[$monitorName] FileInputStream Stats - Bytes: $bytesRead, Operations: $readOperations, Time: ${totalTime}ms")
        super.close()
    }
}
```
Enhanced upload (CIO)
```kotlin
suspend fun uploadWithKtorCio(presignedUrl: String, file: File, checksum: String) {
    val monitor = PerformanceMonitor("Ktor-CIO")
    println("=== Ktor CIO Implementation ===")
    monitor.logPhase("Client Creation Start")

    val client = HttpClient(io.ktor.client.engine.cio.CIO) {
        install(HttpTimeout) {
            requestTimeoutMillis = 120_000L
            connectTimeoutMillis = 10_000L
            socketTimeoutMillis = 120_000L
        }
        install(Logging) {
            logger = object : Logger {
                override fun log(message: String) {
                    println("[Ktor-CIO] $message")
                }
            }
            level = LogLevel.ALL
        }
    }

    try {
        monitor.logPhase("Client Creation Complete")
        monitor.logPhase("Request Preparation Start")
        val response: HttpResponse = client.put(presignedUrl) {
            setBody(MonitoredFileInputStream(file, "Ktor-CIO"))
            header("x-amz-checksum-sha256", checksum)
            header("Content-Type", "application/octet-stream")
            // Note: some Ktor versions treat Content-Length as engine-managed
            // and reject setting it explicitly
            header("Content-Length", file.length().toString())
        }
        monitor.logPhase("Request Complete")
        monitor.logPhase("Response Processing Complete")
    } finally {
        client.close()
        monitor.logPhase("Client Cleanup Complete")
    }
}
```
Step 4: the breakthrough — file I/O
With monitoring enabled, the bottleneck became obvious.
Ktor CIO results
```
[Ktor-CIO] Client Creation Start - Time: 3ms, Memory: +0MB, Threads: +0
[Ktor-CIO] Client Creation Complete - Time: 114ms, Memory: +24MB, Threads: +3
[Ktor-CIO] Request Preparation Start - Time: 115ms, Memory: +24MB, Threads: +3
[Ktor-CIO] FileInputStream Stats - Bytes: 15728640, Operations: 3840, Time: 5669ms
[Ktor-CIO] Request Complete - Time: 7486ms, Memory: +68MB, Threads: +22
```
Ktor OkHttp results
```
[Ktor-OkHttp] Client Creation Start - Time: 3ms, Memory: +0MB, Threads: +0
[Ktor-OkHttp] Client Creation Complete - Time: 107ms, Memory: +15MB, Threads: +3
[Ktor-OkHttp] Request Preparation Start - Time: 107ms, Memory: +15MB, Threads: +3
[Ktor-OkHttp] FileInputStream Stats - Bytes: 15728640, Operations: 3840, Time: 36775ms
[Ktor-OkHttp] Request Complete - Time: 37190ms, Memory: +58MB, Threads: +11
```
Root cause
OkHttp wasn’t slow on the network. OkHttp was slow at reading the file.
File I/O time
- Ktor CIO: 5,669ms file read time (~75.7% of total)
- Ktor OkHttp: 36,775ms file read time (~98.9% of total)
Both implementations read:
- Bytes: 15,728,640 (100% of file)
- Read ops: 3,840 operations
- Average chunk: ~4KB per read
Network time (surprise)
- CIO: ~1,817ms network time
- OkHttp: ~415ms network time
OkHttp actually had faster network transfer, but the overall upload looked slower because file I/O dominated end-to-end time.
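The derived figures above come straight from the monitoring output; a quick sanity check of the arithmetic (numbers copied from the logs):

```kotlin
fun main() {
    val totalBytes = 15_728_640L   // bytes read, both engines
    val readOps = 3_840L           // read operations, both engines
    println("Average chunk: ${totalBytes / readOps} bytes")  // 4096

    // Network time = total request time - file read time
    val cioNetworkMs = 7_486L - 5_669L        // 1817
    val okhttpNetworkMs = 37_190L - 36_775L   // 415
    println("CIO network: ${cioNetworkMs}ms, OkHttp network: ${okhttpNetworkMs}ms")
}
```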
Why CIO can be faster (in this scenario)
In this setup, the difference appeared to come from how each engine moves bytes from the file into the request body, rather than from the protocol on the wire:
- Buffer management: CIO reads through Ktor's own buffer pool, tuned for its engine; the OkHttp path appears to add extra per-read overhead on top of the same 4KB chunks.
- Memory copying: every additional copy between the InputStream, Ktor's internal channels, and the engine's buffers is paid thousands of times over on a large payload.
- Coroutine integration: CIO is coroutine-native end to end; the OkHttp engine bridges between coroutine execution and OkHttp's thread-based I/O, which can add handoff cost on every read.
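Given the 4KB read pattern observed above, one cheap experiment (not something this article measured) is to serve the engine's many small reads from a larger in-memory buffer with `BufferedInputStream`. The helper name and the 256KB buffer size here are illustrative choices, not part of the original test:

```kotlin
import java.io.BufferedInputStream
import java.io.File
import java.io.FileInputStream
import java.io.InputStream

// Hypothetical helper: back the engine's small reads with a larger buffer,
// so most 4KB reads hit memory instead of the file.
fun File.bufferedUploadStream(bufferSize: Int = 256 * 1024): InputStream =
    BufferedInputStream(FileInputStream(this), bufferSize)

fun main() {
    val file = File.createTempFile("upload", ".bin").apply {
        writeBytes(ByteArray(64 * 1024))
        deleteOnExit()
    }
    file.bufferedUploadStream().use { input ->
        var total = 0L
        val chunk = ByteArray(4 * 1024) // engine-style 4KB reads
        while (true) {
            val n = input.read(chunk)
            if (n < 0) break
            total += n
        }
        println("read $total bytes") // read 65536 bytes
    }
}
```

Whether this actually helps depends on how the engine consumes the stream, so it would need the same per-phase measurement as above.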
Final performance summary
| Implementation | Total time | Speed | File I/O time | Network time |
|---|---|---|---|---|
| Ktor CIO | 7.55s | 1.99 MB/s | 5,669ms | ~1,817ms |
| Ktor OkHttp | 37.25s | 0.40 MB/s | 36,775ms | ~415ms |
Key takeaways
- File I/O was the bottleneck: nearly all of OkHttp’s time was spent reading the file, not uploading it.
- Engine choice matters: the client engine can materially affect I/O performance, not just “network performance”.
- Measure everything: what looks like a network problem can be an I/O problem.
Practical solution
For Kotlin Multiplatform apps doing high-throughput file uploads via presigned URLs, consider using CIO (especially on JVM) when you observe disproportionate time spent in file I/O.
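As a configuration sketch (Ktor 2.x-style API; the timeout values are the ones used in the test above, not general recommendations), a CIO-backed client for these uploads might look like:

```kotlin
import io.ktor.client.*
import io.ktor.client.engine.cio.*
import io.ktor.client.plugins.*

// Sketch: CIO-backed client for large presigned-URL uploads (JVM).
fun largeUploadClient(): HttpClient = HttpClient(CIO) {
    install(HttpTimeout) {
        requestTimeoutMillis = 120_000L  // large files need generous timeouts
        connectTimeoutMillis = 10_000L
        socketTimeoutMillis = 120_000L
    }
}
```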
Note: This analysis was run on the JVM with a ~15.7MB test file under identical conditions. Results may vary by platform (Android/iOS/native), file size, network, and device characteristics.