
How to open a large JSON file (when your editor gives up).

Engineering · Anika Rao · May 12, 2026 · 9 min read

You double-clicked a 400 MB JSON dump. Your editor's spinner is doing the polite "I'm still alive" dance. It isn't. Here's what to do instead — and why the thing you reached for first was never going to work.

Why your editor chokes on a big JSON file

The reason you can't open a large JSON file in VS Code isn't laziness on the editor's part. It's three separate ceilings stacked on top of each other.

The heap. VS Code runs on Electron, which bundles Chromium and Node, and both run JavaScript on V8. V8's default old-generation heap has historically been around 1.5 GB on 64-bit builds (the exact figure varies by version). You can raise it with --max-old-space-size (see the Node CLI docs), but that doesn't fix the problem; it buys you thirty seconds of swapping before the next wall.

The buffer model. Monaco — the editor inside VS Code — keeps the file as a piece tree in memory plus a tokenization cache. Every token, every fold marker, every diagnostic costs bytes. A 200 MB JSON file becomes a 1–2 GB live buffer.
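The same expansion shows up in any runtime that turns JSON text into live objects. As a rough analogy only (this measures CPython's json module, not Monaco's piece tree, and the sample data is synthetic), a minimal sketch that compares the size of a JSON text with the memory its parsed form allocates:

```python
import json
import tracemalloc

def parsed_overhead(text: str) -> float:
    """Return (bytes allocated by the parsed tree) / (bytes of JSON text)."""
    tracemalloc.start()
    before, _ = tracemalloc.get_traced_memory()
    tree = json.loads(text)  # materialize the full tree
    after, _ = tracemalloc.get_traced_memory()  # measured while the tree is still alive
    tracemalloc.stop()
    del tree
    return (after - before) / len(text.encode("utf-8"))

# Synthetic sample: many small records, the shape typical of log dumps.
sample = json.dumps([{"id": i, "type": "event", "ok": True} for i in range(10_000)])
ratio = parsed_overhead(sample)
print(f"in-memory size is about {ratio:.1f}x the text size")
```

The exact ratio depends on the runtime and the data, but small records reliably cost several times their on-disk size once every value is a separate object.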

The syntax pass. JSON syntax highlighting runs over the whole buffer on load. It's incremental afterwards, but the first pass blocks the UI thread. On a 500 MB array of objects, that's the part where you alt-tab and check Slack.

Other tools fail differently. Sublime Text handles big files better than VS Code but doesn't give you a JSON tree, so you're scrolling raw text. Notepad++ opens the file and lets you stare at it. Browser-based viewers like Firefox's built-in JSON formatter build a full DOM — practical ceiling around 500 MB before the tab dies. Web tools like JSONLint or JSON Editor Online upload the whole payload to a server and typically cap out at 5–10 MB.

What "large" even means for JSON

It's worth being concrete. Here are the thresholds where things stop being free:

~10 MB: pretty-print starts to lag. Syntax highlighting is fine.
~200 MB: VS Code becomes painful. Open is slow, scroll stutters, the find dialog times out.
~500 MB: most browsers crash the tab. Sublime survives but you have no tree.
~1 GB: jq . pretty-print is glacial, and plain jq queries need the whole parsed document in RAM. jq --stream queries, which don't materialize the whole tree, are still fine.
Multi-GB: almost everything you'd normally reach for just refuses. cat is one of the few things that still works, and that's not a plan.

The other axis is shape. A 1 GB file that's a single huge array of 10M small records behaves very differently from a 1 GB file that's one deeply nested object. The array can be streamed one record at a time. The nested object has no such record boundary, so even a streaming parser can only hand you low-level events.
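You can find out which shape you are dealing with without parsing anything. A minimal sketch, assuming the file is UTF-8 JSON with no byte-order mark; the function name top_level_kind is made up for this example:

```python
def top_level_kind(path: str) -> str:
    """Classify a JSON file by its first non-whitespace byte."""
    with open(path, "rb") as f:
        while True:
            chunk = f.read(4096)
            if not chunk:
                return "empty"  # nothing but whitespace, or a zero-byte file
            stripped = chunk.lstrip(b" \t\r\n")  # the four JSON whitespace bytes
            if stripped:
                first = stripped[:1]
                break
    if first == b"[":
        return "array"
    if first == b"{":
        return "object"
    return "scalar"  # string, number, true, false, or null
```

This reads at most a few kilobytes, so it costs the same on a 5 KB file as on a 50 GB one. It only looks at the top level; an object that wraps one huge array (like the events.json examples in this post) still reports "object".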

Five ways to open a large JSON file

1. Use a streaming parser

If you control the code consuming the file, a streaming parser is the right answer. Python's ijson, JavaScript's oboe.js, Java's Jackson streaming API — they all parse the file as a sequence of events without ever materializing the full tree.

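import ijson

def handle(evt):
    # placeholder: replace with your own processing
    print(evt)

# binary mode lets ijson decode the bytes itself
with open("events.json", "rb") as f:
    # "events.item" yields each element of the top-level "events" array, one at a time
    for evt in ijson.items(f, "events.item"):
        if evt.get("type") == "error":
            handle(evt)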

This works at any file size, including bigger-than-RAM. The downside: it's terrible for browsing. You can't scroll, can't pivot, can't ask "wait, what does the third record at depth 4 look like". You're committing to a specific extraction.

2. Use jq, carefully

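jq is the standard answer for "I just need to grab something out of this file." The trick is knowing that jq, by default, parses each input document fully into memory before the filter runs. Only --stream mode processes the file incrementally.

# default mode: small output, but the whole document is parsed into memory first.
# fine while the parsed file fits in RAM, painful on a 5 GB file.
jq '.events | length' events.json
jq -c '.events[] | select(.code == 500)' events.json

# --stream mode: rebuilds one element of .events at a time. slower, but memory stays roughly flat.
jq -cn --stream 'fromstream(2|truncate_stream(inputs)) | select(.code == 500)' events.json

# parses the whole file and pretty-prints all of it. don't.
jq '.' events.json > pretty.json

Rule of thumb: jq -c with a filter that selects a slice keeps the output small, and adding --stream keeps the memory small too. Anything that reformats the whole document is going to grind. On files past a gigabyte, even a simple jq pretty-print run is measured in minutes, not seconds.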

3. Use ripgrep on the raw file

For "does this file contain the string user_id: 70414", ripgrep beats every parser on the planet. It's a few hundred MB/s of literal byte scan with no understanding of structure at all.

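The catch is exactly that: no structure. You get a hit and the line it sits on, but not the surrounding object, and if the file is minified onto a single line, that line is the whole file (rg -o prints only the matched text). For confirming a value exists, perfect. For inspecting what's around it, useless.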
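To make "no structure" concrete, here is a minimal sketch of a literal byte scan that returns a fixed window of bytes around each hit. It uses Python's standard mmap module as a stand-in for ripgrep's scan (it is far slower than ripgrep and purely illustrative); the function name and the 40-byte default window are arbitrary choices for this example:

```python
import mmap

def find_with_context(path: str, needle: bytes, context: int = 40):
    """Yield (offset, surrounding bytes) for each literal match of needle.

    Assumes a non-empty file; mmap refuses to map zero bytes.
    """
    with open(path, "rb") as f, mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
        pos = mm.find(needle)
        while pos != -1:
            start = max(0, pos - context)
            end = min(len(mm), pos + len(needle) + context)
            # The window is raw bytes: it may start or end mid-object or mid-string.
            yield pos, mm[start:end]
            pos = mm.find(needle, pos + 1)
```

The window tells you the value is there and roughly what sits next to it, but nothing about which object or path it belongs to. That is the limit of any structure-blind search.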

4. Use a memory-mapped viewer

This is the category most people don't know exists. A memory-mapped viewer doesn't load the file into RAM — it asks the OS to map the file into the process's address space, and pages are faulted in only when accessed. Combined with a flat index of where each key and value lives, you can browse a 50 GB file with the responsiveness of a 50 KB file.

The trade-off is that you need an index pass. The first open of a fresh file pays a parse cost. Subsequent opens are free. If you want to know how the parser hits 2 GB/s, the parsing post covers the structural-mask + tape design that makes that index cheap to produce.
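The idea can be sketched in a few lines. This is a much simplified stand-in, not JSONBolt's design: it assumes a non-empty NDJSON file with no blank lines, and its "index" only records where each record starts rather than where every key and value lives. The class name MappedRecords is made up for this example:

```python
import json
import mmap

class MappedRecords:
    """Random access to an NDJSON file through mmap plus a flat offset index."""

    def __init__(self, path: str):
        self._file = open(path, "rb")
        # Map the file read-only; the OS faults pages in only when they are touched.
        self._mm = mmap.mmap(self._file.fileno(), 0, access=mmap.ACCESS_READ)
        # Index pass: one linear scan recording the byte offset where each record starts.
        self._starts = [0]
        pos = self._mm.find(b"\n")
        while pos != -1:
            if pos + 1 < len(self._mm):  # skip the newline that ends the file
                self._starts.append(pos + 1)
            pos = self._mm.find(b"\n", pos + 1)

    def __len__(self) -> int:
        return len(self._starts)

    def __getitem__(self, i: int):
        if not 0 <= i < len(self._starts):
            raise IndexError(i)
        start = self._starts[i]
        end = self._starts[i + 1] if i + 1 < len(self._starts) else len(self._mm)
        # Only this slice is decoded; the rest of the file is never parsed.
        return json.loads(self._mm[start:end])

    def close(self) -> None:
        self._mm.close()
        self._file.close()
```

After the index pass, records[17_000_000] costs one list lookup and one small json.loads, regardless of file size. The index itself is one integer per record, which is why a flat layout stays cheap even for very large files.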

5. Split the file

If the top level is an array, you can convert to NDJSON and process line-by-line with anything that reads a stream:

jq -c '.[]' big.json > big.ndjson
# now any line-oriented tool works: rg, awk, wc, sort, sed

Cheap and effective. You lose the structural context — once a record is a line, it's hard to ask "what's at .events[1234].payload.user.address" without re-parsing. Worth it for one-off ETL, less useful for exploration.
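Once the data is NDJSON, a few lines of standard-library Python are enough for this kind of one-off pass. A minimal sketch, assuming every line is a JSON object and the chosen field holds a scalar value; the function name count_by_field is made up for this example:

```python
import json
from collections import Counter

def count_by_field(path: str, field: str) -> Counter:
    """Tally the values of one field across an NDJSON file, one line at a time."""
    counts = Counter()
    with open(path, "r", encoding="utf-8") as f:
        for line in f:  # the file object yields one line at a time
            line = line.strip()
            if not line:
                continue  # tolerate blank lines
            record = json.loads(line)
            counts[record.get(field)] += 1  # records missing the field count under None
    return counts
```

For example, count_by_field("big.ndjson", "code") would return a histogram of the code values. Memory use is bounded by the longest single line plus the number of distinct values, not by the file size.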

When to reach for a dedicated viewer

The five techniques above cover most cases. They fall apart in one specific situation: you need to browse. You don't know what you're looking for yet. You want to expand a node, scroll, search a key, jump to a path, see what's in there. That's a job for a viewer designed for it.

JSONBolt was built for exactly this. The relevant features map one-to-one onto the problems above:

Virtual tree. Only the ~60 rows actually visible on screen are rendered, out of files with 100M+ nodes. The DOM doesn't care that the file is 8 GB.
mmap'd file plus tape index. No preload. The published ceiling is 120 GB, with RAM tracking 1:1 with file size (per the homepage stats), and the OS pages in only what you actually look at as you scroll.
SIMD search at 20M results per second. Find a key across the entire file in the time it takes ripgrep to warm up.
JSONPath, regex, and --where predicates. Same expressive power as jq for filtering, without the materialization tax.
Live tailing for NDJSON. Point it at a log file and watch records appear.
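The virtual-tree idea is mostly arithmetic: from the scroll position, work out which rows intersect the viewport and render only those. A minimal sketch of that windowing step, assuming fixed-height rows; the pixel values are invented for illustration and this is not JSONBolt's actual implementation:

```python
import math

def visible_rows(scroll_top: int, viewport_height: int, row_height: int, total_rows: int) -> range:
    """Return the indices of the rows that intersect the viewport."""
    first = max(0, scroll_top // row_height)
    count = math.ceil(viewport_height / row_height) + 1  # +1 covers a partially visible row
    last = min(total_rows, first + count)
    return range(first, last)

# Scrolled a million rows into a 100-million-row tree, with a 1200 px viewport and 20 px rows.
rows = visible_rows(scroll_top=1_000_000 * 20, viewport_height=1200, row_height=20, total_rows=100_000_000)
print(len(rows), rows[0])
```

The number of rendered rows depends only on the viewport and row height (about 60 here), never on total_rows, which is why the rendering cost stays constant as the file grows.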

The numbers we care about. 100 MB file: open in 50 ms. 2 GB/s parse throughput on the cold index pass. ~60 rows rendered out of 100 million. The rest is a feature list; those numbers are the experience.

Personal use up to 50 MB is free. Pro is $80/year for unlimited size and commercial use, with a lifetime license available while the launch offer holds.

A worked example: opening a 4 GB JSON dump

Say you pulled an export of a year of API logs. events.json, 4.2 GB, one top-level object whose events key holds an array of about 18 million objects, each one a request record.

Drop the file on the JSONBolt window. On a first open, the structural pass runs at ~2 GB/s on a modern desktop, so indexing 4.2 GB takes roughly two seconds (4.2 GB ÷ 2 GB/s ≈ 2.1 s). Once the index exists, the tree paints in well under a second, because only virtualized rows are rendered. Press Ctrl + F, type status_code, step through matches with F3. The tree auto-scrolls and expands as you navigate. Select the subtree you want and use File → Export Selection Value As to write it out, or copy the jq-style path with Ctrl + P for the next step.

Or skip the GUI and use the CLI. The jb binary ships in the same installer and uses the same engine:

# pull the matching records as full JSON objects
jb search --where '.status_code >= 500' --emit object events.json

# or get the values at one path across the whole file
jb extract '.events[*].user_id' events.json

The --ai flag is a shortcut for --format jsonl --envelope --max-output 1M --max-value-bytes 256: each match becomes a JSONL envelope of {path, preview, value}, total output is capped at 1 MB, and long string values are truncated to a 256-byte preview. Useful when you want to hand a slice to a model without burning a context window on plumbing.

Quick reference: tool by file size

File size	What to use	What not to use
< 10 MB	Any editor. VS Code is fine.	—
10–100 MB	VS Code (with the JSON extensions disabled for the big buffer), `jq` for queries.	Web tools that upload to a server.
100 MB–1 GB	`jq` for one-shot queries, a dedicated viewer for interactive browsing.	Browser tabs, JSON Editor Online.
1–10 GB	Streaming parser (`ijson`, Jackson) for ETL, JSONBolt for inspection.	`jq .`, anything that pretty-prints the whole file.
10 GB+	JSONBolt, custom Rust/C++ streaming code, or a database load.	Pretty much everything else.

Most of the time the answer isn't "buy a tool", it's "use the right combination of jq, ripgrep, and a streaming parser." When you need to look at a file that's too big to look at, that's when a dedicated large JSON viewer earns its keep.

Download JSONBolt if you want to skip the part where you find out which combination of flags VS Code needs to not crash on your dump.
