I need to scan very large JSONL files efficiently and am considering a parallel grep-style approach over line-delimited text.
Would love to hear how you would design it.
https://jsonltools.com/what-is-jsonl
First time hearing about it, but JSONL looks like CSV with one JSON document per line, so the file as a whole is not really structured.
I don’t see why you couldn’t just use grep on it.
Sorry, I missed that L, and I'd never heard of JSONL before (although I've worked with JSON logs that are effectively JSONL). So, well, you can use grep, but it can be inefficient (depending on the regex engine and how good you are with regexes). It is also easy to make a mistake if you are not very proficient with regexes. So I'd prefer using a JSON parser (jq or another, maybe a lower-level one if performance matters) over grep anyway.
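For what it's worth, here is a rough Python sketch of how the two ideas in this thread could be combined: split the file into newline-aligned byte ranges, do a cheap grep-style substring check on each raw line, and only run the JSON parser on lines that pass. The function names (`chunk_ranges`, `scan_range`, `parallel_scan`) and the "match one key against one value" query are my own assumptions for illustration, not something from the original question. The prefilter also assumes the value appears in the file exactly as `json.dumps` would write it (for example a plain ASCII string), so treat it as an optimisation you can drop if that does not hold.

```python
import json
import os
from concurrent.futures import ProcessPoolExecutor


def chunk_ranges(path, n_chunks):
    """Split a file into byte ranges whose boundaries fall at line starts."""
    size = os.path.getsize(path)
    cuts = [0]
    with open(path, "rb") as f:
        for i in range(1, n_chunks):
            f.seek(size * i // n_chunks)
            f.readline()  # finish the current line; the next one starts a new chunk
            cuts.append(min(f.tell(), size))
    cuts.append(size)
    # Short files can produce repeated cut points, so drop empty ranges.
    return [(a, b) for a, b in zip(cuts, cuts[1:]) if a < b]


def scan_range(path, start, end, key, value):
    """Return the records in [start, end) whose `key` field equals `value`."""
    # Assumption: the value is serialized in the file the way json.dumps writes it.
    needle = json.dumps(value).encode()
    hits = []
    with open(path, "rb") as f:
        f.seek(start)
        while f.tell() < end:
            line = f.readline()
            if needle not in line:  # grep-style prefilter on raw bytes
                continue
            try:
                record = json.loads(line)
            except json.JSONDecodeError:
                continue  # skip malformed lines instead of failing the whole scan
            # The parser confirms the match is in the right field, not just anywhere.
            if isinstance(record, dict) and record.get(key) == value:
                hits.append(record)
    return hits


def parallel_scan(path, key, value, workers=4, executor_cls=ProcessPoolExecutor):
    """Scan every chunk in its own worker and return hits in file order."""
    ranges = chunk_ranges(path, workers)
    with executor_cls(max_workers=workers) as pool:
        futures = [pool.submit(scan_range, path, a, b, key, value) for a, b in ranges]
        return [record for fut in futures for record in fut.result()]


if __name__ == "__main__":
    # "events.jsonl" is a placeholder path.
    for record in parallel_scan("events.jsonl", "level", "error"):
        print(record)
```

The substring check is what keeps this close to grep speed, since most lines never reach `json.loads`, while the final `record.get(key) == value` test avoids the false positives a plain grep would give when the same text shows up in a different field.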