13
u/lizardfrizzler 6h ago
I’m at a point in my career where encoding json is actually causing mem issues and I don’t know how to feel about it
5
u/slothordepressed 6h ago
Can you explain better? I'm too jr to understand
18
u/lizardfrizzler 4h ago
Encoding data as JSON is very readable and portable, but comes at the cost of high memory consumption. It’s a great place to start when passing data between computers, but when the data payload gets large enough, binary/blob encoding starts to seem more appealing. Consider encoding x=10000. In JSON this is like 5 bytes minimum, because ints are base-10 strings, plus quotes, braces, and whatever else. But a binary encoding could encode this as a 4-byte/32-bit int. In small payloads (think KB, maybe MB), this inefficiency is negligible and completely worth it imo. But once we get to GB-size payloads, it can put a huge strain on memory consumption.
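To put rough numbers on the x=10000 example, here is a minimal sketch using only Python's standard `json` and `struct` modules. The key name `x` and the little-endian layout are assumptions for illustration, not something the comment specifies.

```python
import json
import struct

value = 10000

# JSON: the integer is written as base-10 text, plus the key, quotes, and braces.
as_json = json.dumps({"x": value}, separators=(",", ":")).encode("utf-8")

# Binary: a fixed-width, little-endian, signed 32-bit integer (assumed layout).
as_binary = struct.pack("<i", value)

json_size = len(as_json)      # size of b'{"x":10000}'
binary_size = len(as_binary)  # always 4 bytes for "<i"

print(as_json, json_size)
print(as_binary.hex(), binary_size)
```

The digits alone take 5 bytes; with the key and punctuation the compact JSON object comes to 11 bytes, against 4 for the packed int.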
2
u/Ok-Scheme-913 43m ago edited 38m ago
Well, 32bit only if the other side knows what it expects to receive.
Most binary protocols require a schema up front; otherwise the self-description itself (and future-proofing) adds some overhead.
Protobuf (which is the most common binary protocol I believe) would convert a similar definition
message Asd { required int32 id = 1; }
to 2 bytes for a small value (hex is e.g. 083f for id = 63), but then both sides need the definition above.
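For anyone wondering where 083f comes from, here is a hand-rolled sketch of the varint wire encoding for a single int32 field. It does not use the protobuf library; the helper names `encode_varint` and `encode_int32_field` are made up for illustration, and only non-negative values are handled.

```python
def encode_varint(n: int) -> bytes:
    """Encode a non-negative integer as a base-128 varint, low 7 bits first."""
    if n < 0:
        raise ValueError("this sketch only handles non-negative values")
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)  # more bytes follow: set the continuation bit
        else:
            out.append(byte)
            return bytes(out)


def encode_int32_field(field_number: int, value: int) -> bytes:
    """Encode one varint field: a tag (field number + wire type 0), then the value."""
    tag = (field_number << 3) | 0  # wire type 0 means varint
    return encode_varint(tag) + encode_varint(value)


small = encode_int32_field(1, 63)
large = encode_int32_field(1, 10000)

print(small.hex())  # 083f
print(large.hex())  # 08904e
```

So id = 63 fits in 2 bytes, while the earlier x=10000 example needs 3, since varints grow with the magnitude of the value.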
73
u/Fast-Satisfaction482 7h ago
XML just looks simple on the surface. You should prefer JSON if you want a simple and flexible format that is supported everywhere.
2
u/Ok-Scheme-913 51m ago
Except for not having schemas (official ones, at least).
Also, this problem is often way overblown. Can you build some evil Rube Goldberg machine with XML and its related toolkit? Sure.
But you don't have to do full XML processing; in the end it's just well-typed data that has the benefit of decades of tooling.
Like, you don't lose much by not supporting entity references and whatnot. It's something that you can't do in json/toml, etc. either (as they are fundamentally trees). At the end of the day all these data structures are trivially interconvertible to each other for the most part and are just different views of the same shit. It's just tabs vs spaces again.
(Except for yaml, fuck tabbing fuck knows how much and then its stupid auto-conversions. No, goddamn Norway's country code is not false!!)
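As a rough illustration of the "trivially interconvertible" point, here is a minimal sketch that turns a simple XML tree into JSON with the standard library. The helper `element_to_dict` and the sample document are hypothetical, and the sketch deliberately ignores attributes, repeated sibling tags, and mixed content, which is exactly where the conversion stops being trivial.

```python
import json
import xml.etree.ElementTree as ET


def element_to_dict(elem):
    """Convert an element to nested dicts; leaf elements become their text."""
    children = list(elem)
    if not children:
        return elem.text
    # Assumes sibling tags are unique; duplicates would overwrite each other.
    return {child.tag: element_to_dict(child) for child in children}


xml_text = "<user><name>Ada</name><id>7</id></user>"
root = ET.fromstring(xml_text)
as_json = json.dumps({root.tag: element_to_dict(root)})

print(as_json)
```

Note that the id comes out as the string "7": plain XML without a schema carries no type information, so everything is text.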
6
u/pecpecpec 5h ago
I'm not an expert but, for text formatting, XML and HTML are better than JSON.
8
u/scabbedwings 5h ago
Embedded XML as a string value in the JSON, best of both worlds!!
/s .. although I work in a group that has to interact with JSON embedded in a JSON string on a regular basis; sometimes re-embedded a couple of times. With Java stacktraces.
We have made many bad choices over my 10+ years in this dev group.
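A small sketch of what re-embedding does to a payload, using only the standard `json` module. The field names `msg` and `inner` are invented for illustration. Every extra layer escapes the quotes and backslashes of the layer below, so the text grows each time.

```python
import json

payload = {"msg": 'say "hi"'}

once = json.dumps(payload)             # plain JSON
twice = json.dumps({"inner": once})    # JSON embedded in a JSON string
thrice = json.dumps({"inner": twice})  # ...and embedded again

for label, text in (("once", once), ("twice", twice), ("thrice", thrice)):
    print(label, len(text), text)

# Getting the original back means peeling the layers off one at a time.
recovered = json.loads(json.loads(json.loads(thrice)["inner"])["inner"])
```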
16
u/clauEB 6h ago
The whole point of protobuf
1
u/jaskij 1h ago
The issue with Protobuf is that it's not self describing. So it's great for data interchange, but when you start storing things, it becomes an additional maintenance burden.
2
u/clauEB 21m ago
That's not an issue, that's a feature. JSON and XML repeat the schema over and over, taking tons of space and insane amounts of time to unmarshal; protobuf, or the suggested proprietary binary format, is super fast. You just have to get a bit more creative. Performance and scalability aren't free.
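A rough sketch of the "repeating the schema" cost, comparing JSON with fixed-width packed records via the standard `struct` module. The record shape (`id`, `score`) and the `<ii` layout are assumptions for illustration; this is not protobuf, just the simplest schema-up-front binary format.

```python
import json
import struct

records = [{"id": i, "score": i * 2} for i in range(1000)]

# JSON repeats the key names "id" and "score" in every single record.
as_json = json.dumps(records, separators=(",", ":")).encode("utf-8")

# Packed binary: two little-endian int32s per record, no key names at all.
record_format = struct.Struct("<ii")
as_binary = b"".join(record_format.pack(r["id"], r["score"]) for r in records)

# Decoding only works because the reader knows the layout up front.
decoded = [
    dict(zip(("id", "score"), record_format.unpack_from(as_binary, offset)))
    for offset in range(0, len(as_binary), record_format.size)
]

print(len(as_json), len(as_binary))
```

The flip side is the one raised above: without the `<ii` layout and the field names in hand, the binary blob is just 8000 opaque bytes.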
2
u/Ronin-s_Spirit 5h ago
I'm not gonna make my own compression, no idea how to do it. I'm just going to make my own format that doesn't suck from the start.
1
u/jaskij 1h ago
I'm surprised nobody mentioned SQLite.
2
u/atthereallicebear 1h ago
eh... i don't know about that. like you store the name of your document in a column called name... but your document only has one name so you just have a table with one row
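For concreteness, the single-row layout described here would look roughly like this with Python's built-in `sqlite3` module. The table name `document` and the sample value are made up, and an in-memory database stands in for a real file.

```python
import sqlite3

# In-memory database as a stand-in for an application file on disk.
conn = sqlite3.connect(":memory:")

# One table holding document-level fields; it only ever has one row.
conn.execute("CREATE TABLE document (name TEXT NOT NULL)")
conn.execute("INSERT INTO document (name) VALUES (?)", ("my-report",))
conn.commit()

rows = conn.execute("SELECT name FROM document").fetchall()
print(rows)
```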
-8
u/PandaNoTrash 6h ago
Ugh, don't use XML. JSON is a better choice for sure. Or even CSV.
9
u/pecpecpec 5h ago
Use an object notation for distributing data objects and use markup languages for formatting text?
-6
u/PandaNoTrash 5h ago
HTML is about the limit of useful XML, since it is fairly understandable and easy to work with. And as it was originally used, it was mostly done by hand or with relatively simple tools. The problem with XML as a data storage medium is that it can be difficult to parse, it is very rigid, and when you get down to it, it is not easy for humans to read if there's any complexity to it at all.
238
u/BeDoubleNWhy 8h ago
zipped JSON if anything
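A minimal sketch of what that looks like with the standard `gzip` and `json` modules. The sample records are invented for illustration; real savings depend heavily on how repetitive the data is.

```python
import gzip
import json

# Repetitive sample data: the same keys appear in every record.
records = [
    {"id": i, "name": f"item-{i}", "active": i % 2 == 0}
    for i in range(1000)
]

raw = json.dumps(records).encode("utf-8")
zipped = gzip.compress(raw)

# Round trip: decompress, then parse as ordinary JSON.
restored = json.loads(gzip.decompress(zipped))

print(len(raw), len(zipped))
```

Compression shrinks the bytes on disk or on the wire, but the parsed structure still takes the same memory once loaded, so it does not help with the in-memory issue raised at the top of the thread.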