
jq alternatives for massive JSON files.

Marek Holub · May 12, 2026 · 10 min read

jq is brilliant on a 5 MB config and miserable on a 5 GB log. Here are five jq alternatives that survive contact with the file you're actually staring at.

Where jq runs out of road

Before we get to the replacements, a respectful word about jq itself. It's been the right answer for a decade. Stephen Dolan's jq manual reads like the manual for a small functional language because that's what jq is. For inputs under a few hundred megabytes, you should keep using it.

The failure mode is specific and well known: jq is slow on large files because of how it parses, not how it queries. Run jq '.' on a 2 GB file and you are paying for a full in-memory parse before the first byte hits stdout. On a modern laptop that's around three minutes for a pretty-print, longer if you ask for transforms. Even something innocent like jq 'length' on a top-level array of 50 million objects takes several seconds — jq has to walk every element to count them, because the parser doesn't keep a structural index.

The streaming mode helps in theory: jq --stream emits [path, value] events instead of building the tree, usually paired with -n and inputs so the filter consumes them one at a time. In practice it breaks every idiom you know. .foo[] becomes a multi-line reduce. Most "jq slow large file" posts on Stack Overflow end with someone giving up and writing Python.
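For reference, here is what the simplest case looks like in both modes. This is a minimal sketch, assuming the input is a single top-level array; sample.json is a throwaway file created for the demo, and the fromstream / truncate_stream pairing is the idiom described in the jq manual.

```shell
# sample.json is a tiny stand-in for the multi-gigabyte file (hypothetical data).
printf '%s' '[{"id":1,"level":"info"},{"id":2,"level":"error"}]' > sample.json

# Tree mode: parse the whole array into memory, then iterate over it.
tree_out=$(jq -c '.[]' sample.json)

# Stream mode: rebuild one top-level element at a time from path/value events,
# so memory tracks the largest element rather than the whole file.
stream_out=$(jq -cn --stream 'fromstream(1 | truncate_stream(inputs))' sample.json)

printf '%s\n' "$stream_out"
```

Both commands print the same two objects. The difference shows up once a filter needs to look across elements: in stream mode that logic has to be rewritten as a reduce or foreach over inputs.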

Two things are true at once: jq is single-threaded with no SIMD, and the filter language is gorgeous. A good jq alternative has to fix the first or keep the second, ideally both.

What you actually need from a jq alternative

Quick checklist before we look at tools. If a candidate fails three of these on a multi-gigabyte file, move on.

Throughput on huge files. 500 MB/s parse or better, ideally with SIMD. Anything slower and you're back to coffee breaks.
Expressive query syntax. Either real jq compatibility, JSONPath, or something close enough that you don't need a manual every time.
Streaming or mmap. If the tool reads a 50 GB file into memory in one go, your laptop is about to swap.
CLI ergonomics. Pipe friendly, useful exit codes, sane stderr.
NDJSON / JSONL support. Logs are the use case. If the tool chokes on one-object-per-line, it's a toy.
Output formats. JSON in, CSV/TSV/table/raw out. Bonus: something LLM-shaped.
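The NDJSON point deserves a quick illustration: because every line is a complete object, a plain text tool can discard most of a log before any JSON parser sees it. This is a minimal sketch, assuming compact serialization with no space after the colon (differently spaced logs need a looser pattern); app.ndjson and its records are made up for the demo.

```shell
# Hypothetical three-line log, one JSON object per line.
cat > app.ndjson <<'EOF'
{"timestamp":"2026-05-12T10:00:00Z","service":"api","level":"info","message":"ok"}
{"timestamp":"2026-05-12T10:00:01Z","service":"api","level":"error","message":"timeout"}
{"timestamp":"2026-05-12T10:00:02Z","service":"db","level":"error","message":"deadlock"}
EOF

# grep -F does a fixed-string scan line by line and never builds a JSON tree,
# so memory stays flat however large the file is.
errors=$(grep -F '"level":"error"' app.ndjson)
error_count=$(printf '%s\n' "$errors" | wc -l | tr -d ' ')

printf '%s\n' "$errors"
echo "matching lines: $error_count"
```

The surviving lines can then be piped into jq or any tool from this post for the real projection. The text match is only a coarse filter: it can produce false positives if the same string happens to appear inside a message value.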

The five tools

1. jaq — a Rust reimplementation of jq

jaq is the closest thing to a drop-in replacement. Same filter syntax, almost the entire stdlib, 2–10× faster on most workloads because it's a real compiler instead of an interpreter. Install with cargo install jaq or grab a release binary. Still single-threaded, still loads the whole file into memory before querying, so it doesn't solve the 50 GB problem — but if your file fits in RAM and you're CPU-bound, this is the cheapest win you'll find all year.

# identical syntax to jq, just faster
jaq '.users[] | select(.age > 30) | .email' users.json

2. jqp — interactive jq playground (TUI)

jqp wraps a jq engine (gojq) in a TUI: an input pane, a live query box, and an output pane that re-evaluates as you type. Brilliant for figuring out a tricky path before you bake it into a script. Not the tool you reach for inside a CI pipeline, and not built for huge inputs — the live-eval loop assumes the file is comfortably in memory. But for exploration on medium files, it's faster than the jq edit-rerun cycle. (jiq is a similar drill-down tool if you prefer narrowing-as-you-type.)

# interactive exploration, jq-flavoured
jqp --query '.users[]' users.json

3. gron — flatten and grep

gron takes a different bet: it flattens JSON into one assignment per line, so you can use grep, awk, and sed like it's 1995. No query language, no surprises, just text. The killer feature is gron --ungron, which rebuilds JSON from filtered output. It streams, so it handles large files fine, but the output explodes — a 2 GB JSON can become a 6 GB grep-able text stream. Right tool when you don't know the shape of the data and just want to search it.

# flatten, then grep
gron huge.json | grep 'email.*@example.com'
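To make the size trade-off concrete, here is what the flattened form looks like. The sample below is written by hand in the format gron emits, so the sketch runs without gron installed; the names and addresses are made up.

```shell
# Hand-written sample in gron's output format (hypothetical data).
cat > flat.txt <<'EOF'
json = {};
json.users = [];
json.users[0] = {};
json.users[0].age = 34;
json.users[0].email = "ana@example.com";
json.users[1] = {};
json.users[1].age = 27;
json.users[1].email = "bo@example.org";
EOF

# Every line carries its full path, which is why grep works on it
# and also why the text is larger than the JSON it came from.
hits=$(grep 'email.*@example\.com' flat.txt)
printf '%s\n' "$hits"
```

Piping the surviving lines to gron --ungron rebuilds a JSON document that contains only the matched paths.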

4. fx — JS expressions in a terminal viewer

fx is closer to a viewer than a query tool, but it has a CLI mode where you pass JavaScript snippets as filters. Great for ad-hoc work where you already think in JS. The interactive mode is excellent on files up to a few hundred MB. Past that, the V8 startup and the in-memory tree become the bottleneck. Don't point it at a 5 GB ndjson and expect smooth scrolling.

# JS expressions instead of jq syntax
fx data.json '.users.filter(u => u.age > 30).map(u => u.email)'

5. JSONBolt jb CLI — SIMD + mmap + multi-path syntax

The jb CLI ships alongside the JSONBolt desktop viewer. It uses the same parser: SIMD structural pass, mmap'd input. Throughput on a cold read is around 2 GB/s, search hits 20M results/sec. Six path syntaxes are accepted as input: jq (.users[0].email), JSONPath ($.users[0].email), JSON Pointer, lodash, canonical, and bracket-key. Output paths always come back in jq style.

The verbs are jb search, jb get, jb extract, jb find, jb schema, jb keys, jb flatten. Where-predicates support comparison, contains / startsWith / endsWith, regex via matches, length tests, and boolean composition. -r/--regex handles regex search. NDJSON is read line-by-line from stdin or globs. Output formats are plain, jsonl, json; CSV/XML conversion is available in the GUI app.

The --ai flag is a shortcut for --format jsonl --envelope --max-output 1M --max-value-bytes 256, producing JSONL envelopes of {path, preview, value} sized for LLM context windows. Install with winget install jsonbolt on Windows or brew install --cask jsonbolt on the macOS beta.

# where-predicate, returns the matching objects
jb search --where '.age > 30 && .role == "admin"' --emit object users.json

# or extract one path across the whole file
jb extract '.users[*].email' users.json

Same query, five tools

One concrete task: given a 5 GB ndjson access log, find every record where level == "error" and print three fields (timestamp, service, message). Numbers below are illustrative ranges based on hands-on use of each tool, not a repeatable benchmark suite — your disk, CPU, and warm-cache state will shift the absolute numbers. The shape of the ranking is what matters.

| Tool | Command (abbreviated) | 5 GB time | Peak RAM |
| --- | --- | --- | --- |
| jq | `jq -c 'select(.level=="error") \| {timestamp,service,message}'` | ~4 m 10 s | ~6.1 GB |
| jaq | `jaq -c 'select(.level=="error") \| {timestamp,service,message}'` | ~50 s | ~5.8 GB |
| gron | `gron file.ndjson \| grep 'level = "error"' \| gron -u` | ~1 m 30 s | ~280 MB |
| fx | chokes: interactive only, not designed for 5 GB | N/A | OOM |
| jb | `jb search --where '.level=="error"' --emit object file.ndjson` | ~7 s | ~190 MB |

In this table, SIMD plus mmap delivers roughly 35 times the throughput of an interpreted in-memory parser (about 250 s against 7 s), and that gap doesn't close with cleverer filters.
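The ratio can be checked directly from the table. This is a quick sketch, assuming 5 GB means 5,000 MB and using the illustrative timings above converted to seconds (4 m 10 s is 250 s, 1 m 30 s is 90 s).

```shell
# Effective throughput = file size / wall time, from the table's illustrative numbers.
report=$(awk 'BEGIN {
  size_mb = 5000                      # assumption: 5 GB taken as 5000 MB
  n = split("jq:250 jaq:50 gron:90 jb:7", rows, " ")
  for (i = 1; i <= n; i++) {
    split(rows[i], f, ":")            # f[1] = tool name, f[2] = seconds
    printf "%-5s %6.1f MB/s\n", f[1], size_mb / f[2]
  }
  printf "jb vs jq: %.1fx\n", 250 / 7
}')
printf '%s\n' "$report"
```

These are end-to-end rates for the whole filter task, so they sit below the raw parse figures quoted for each tool.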

Honest caveat. jaq's 50 s is the most impressive number in that table, because it gets there without changing your syntax. If you've got a 200-line jq script and the file fits in RAM, try jaq first. It might be the only tool you need.

When jq is still the right call

We're not here to dunk on jq. There are jobs where it's unambiguously the right answer:

Small files (under ~200 MB). The parser overhead is invisible and the syntax is famously expressive.
CI pipelines where jq is already installed everywhere. Don't add a dependency to save four seconds.
Stateful transforms. jq's reduce, foreach, and recursive descent (..) cover transforms that JSONPath can't express cleanly.
Teaching and one-liners. Every backend engineer has seen jq syntax before. That's a real network effect.

The honest framing is: jq is a query language with a parser attached. The alternatives in this post are parsers with a query language attached. If your bottleneck is "I can't express this filter," use jq. If your bottleneck is "the parser is the slow part," use one of the others.

When to use the jb CLI specifically

Four cases where reaching for jb is the path of least resistance:

1 GB+ files where you want the result in seconds, not minutes. See also opening large JSON files.
SIMD-fast search across many fields, where --regex and --where together do work that would otherwise take a chain of three Unix tools.
Feeding subsets to LLMs. The --ai flag is a shortcut for --format jsonl --envelope --max-output 1M --max-value-bytes 256: each match becomes a {path, preview, value} JSONL record, with the whole payload capped at 1 MB and individual string values truncated at 256 bytes. Noisy payload, context-window-safe output. More on that trade-off in JSON and the LLM context window.
You prefer JSONPath to jq's filter language. $.users[*].email is fewer keystrokes than .users[] | .email and it's the same syntax your API gateway and your test framework already speak.

If you want the broader picture across GUI tools and other CLIs, the JSON viewer comparison covers that ground.

Pick the smallest tool that fits

Rules of thumb after a year of running these against real logs: under 200 MB, jq. Under 5 GB and fits in RAM, jaq. You don't know the shape, gron. You want to click around, fx or jqp. The file laughs at your RAM, jb.

If you'd like to try jb on your own files, it's a free download for personal use up to 50 MB and $80/year for unlimited size and commercial use. One binary, no agent, no telemetry.
