
Null Has Value, Too

If you want to build JSON structures in your programming language, you need a data model. For .Net, that model exists in the System.Text.Json.Nodes namespace as JsonNode and its derived types, JsonObject, JsonArray, and JsonValue.

Importantly, you need to decide how to represent JSON null. The designers of JsonNode decided to make JSON null equivalent to .Net null. This post will explore why I think that was a poor decision.

Much of the content of this post comes from my experience with Manatee.Json and conversations with the .Net engineers on this very topic.

- #66948 - JsonNode/JsonObject not differentiating between missing property and null-value
- #68128 - API Proposal: Add JSON null independent of .Net null

The structure of JSON

To begin, I’d like to cover how JSON is described in its specification. Particularly, I’d like to look at the data model.

There are two structured types, objects and arrays, that are composed of a set of primitives: numbers, strings, and the literals true, false, and null.

I’d like to focus on those literals. The way they’re defined, they’re just names, symbols without any inherent value. Often we relate true and false to a boolean type because the names imply that association, but technically they hold no such meaning.

Similarly, null is often used to mean “no value,” but that’s not the case. In JSON, “no value” is represented by simply not existing. In JSON, null is a value.

.Net’s data model

The data model that JsonNode and family give us represents JSON null as .Net null. Because of this, you get a fairly convenient API. If you want to represent an object with a null under the foo key, you do this:

    var node = new JsonObject
    {
        ["foo"] = null
    };

Pretty straightforward and easy to use, right? Similarly, you get null when querying the object.

    var valueAtFoo = node["foo"];

It all still works.
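The same equivalence shows up when parsing. Here is a minimal sketch, assuming .NET 6 or later (where System.Text.Json.Nodes is available); the variable names `parsed` and `parsedFoo` are mine, chosen for illustration:

```csharp
using System;
using System.Text.Json.Nodes;

// Parse an object whose "foo" property holds JSON null.
JsonObject parsed = JsonNode.Parse("{\"foo\": null}")!.AsObject();

// The JSON null surfaces as a plain .Net null reference.
JsonNode? parsedFoo = parsed["foo"];
Console.WriteLine(parsedFoo is null);       // True

// The key is still counted, and serialization writes the null back out.
Console.WriteLine(parsed.Count);            // 1
Console.WriteLine(parsed.ToJsonString());   // {"foo":null}
```

So the object itself remembers that foo exists, but the value handed back to the caller carries no information at all.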

It begins to go wrong

One of the features of the JsonNode API is that you can find out where in the JSON structure a particular value exists by calling its .GetPath() method. This method returns a JSON Path (BTW, wrong construct) that starts from the root JSON value and leads to the value you have. That can be pretty handy.

Now, what happens when you use this method to find out where a null was? (Note that .GetPath() isn’t an extension method.)

    var location = valueAtFoo.GetPath();

BOOM! Instant null reference exception.

Imagine you’re trying to protect against nulls in your JSON, so you want to walk the structure and report any nulls that you find. You’ve managed to walk the structure, but when you find a null, now you can’t report where it was without manually keeping track of where you’ve been. .GetPath() is supposed to be able to report where a value is from the value itself, but now you don’t have a value.

Differentiating null from “missing”

Now let’s say that we want to check our object for a bar property.

    var barValue = node["bar"];

Most developers would expect that this, like any other dictionary (JsonObject does implement IDictionary<string, JsonNode?>), would throw a KeyNotFoundException. But it doesn’t. It returns null for missing keys.

So now, although we know it’s absolutely not correct, this holds:

    Assert.AreEqual(node["foo"], node["bar"]);

So how are we supposed to determine whether a key exists and holds a null or a key just doesn’t exist? We have to use .TryGetPropertyValue() or .ContainsKey(). These will return true if the key exists and false if it doesn’t.

That means we can’t use the nice indexer syntax; we have to use a clunky method.

    if (node.TryGetPropertyValue("foo", out valueAtFoo))
    {
        // the key exists
    }
    else
    {
        // the key doesn't exist
    }

And for both cases, valueAtFoo still comes out as null.

Other odd side effects

This also has an impact on how developers write their code.

If I want to write a method that returns a JsonNode and I need to also communicate the presence of a null node, then I’m forced to write a Try-pattern method

    public bool TryQuery(JsonNode? node, out JsonNode? result) { ... }

instead of

    public JsonNode? Query(JsonNode? node) { ... }

Lastly, if I have nullable reference types enabled, then I have to have JsonNode? everywhere, even when it’s supposed to represent a legitimate value (i.e. null).

What’s the solution?

In the proposal linked above, I suggested to the .Net team a new JsonValue-derived type called JsonNull, combined with a parsing/deserialization option to use it instead of .Net null. As of this writing the issue is still open. I don’t know if it’ll be accepted or not.

Ideally, I’d like to see a JsonValue that can represent JSON null without itself being null. Sadly, the design decision they’ve made means that changing anything to support an explicit representation for JSON null in this way would be a breaking change, and they’re (understandably) unwilling to do that.

Until my proposal is adopted, or in the event it’s rejected, I’ve created the JsonNull type in my Json.More.Net library that contains a single static field:

    public static readonly JsonValue SignalNode = new JsonValue<JsonNull>();

Although it only partially solves the problem (it doesn’t work for parsing into JsonNode or deserialization), this can be used to communicate that the value exists and is null. I use it extensively in the library suite.

Summary

If you’re building a parser and data model for JSON and your language supports the concept of null, keep it separate from JSON null. On the surface, merging the two may be convenient, but it’ll likely cause problems for someone.

If you like the work I put out, and would like to help ensure that I keep it up, please consider becoming a sponsor!

JSON Path vs JSON Pointer

JSON Path and JSON Pointer are two different syntaxes that serve two different purposes.

- JSON Path is a query syntax that’s used to search JSON data for values that meet specified criteria.
- JSON Pointer is an indicator syntax that’s used to specify a single location within JSON data.

They both have their own strengths, and knowing which to employ for a given use case can be important.

I’m not going to dive too deeply into the syntaxes of each in this post, but I’ll give enough of an overview to lay the foundation to explain their differences.

JSON Pointer

A JSON Pointer is constructed with a series of selectors separated by forward slashes /. The selectors can be either key names or array indices.

A JSON Pointer’s purpose is to identify a single location within JSON data. However, the specific location can depend on the shape of the data that it’s given. To see this, let’s take a look at an example.

    /foo/2/bar

Reading this pointer, you would probably guess the following:

- foo is an object key
- 2 is an array index
- bar is an object key

At first glance, it’s obvious that foo and bar can only be object keys because they are non-numeric. Surprisingly, however, 2 can either be an array index or an object key. The data makes that determination when it’s evaluated.

If the pointer finds an array when evaluating the 2 segment, then the segment is treated like a number and the third (0-based indexing) element in the array is selected (if it exists). However, if the pointer finds an object when evaluating the 2, the segment is treated like a key name, and the object is searched for a "2" key.

Importantly, given some JSON data, a pointer only identifies at most a single location within it.

JSON Path

A JSON Path is a query that operates over JSON data. Like JSON Pointer, it’s constructed using a series of segments, but there are more types of segments, most of which can select multiple values.

A JSON Path’s purpose is generally to find all values within JSON data that meet given criteria. The syntax supports identifying a single location, but that’s not its purpose.

Many implementations of JSON Path not only return the values, but also the locations of those values within the original data. Often that location is also expressed as a JSON Path. It could be argued that a JSON Pointer would be better suited to indicate the single location of a specific value found by a JSON Path, but then users would have to contend with two syntaxes, so JSON Path is generally used for the location indicator.

Converting between the syntaxes

Let’s start with the obvious: JSON Path to JSON Pointer.

A JSON Path can be expressed as a JSON Pointer only when each of its segments can select at most a single node.

    $.foo..bar

The JSON Path above will start with the root’s foo property and then recursively search the result for any values that are under bar properties. Since this can return multiple values, it can’t be represented as a JSON Pointer.

    $.foo[2].bar

This JSON Path has three segments that each identify a single value. Its JSON Pointer representation is the example we had earlier in the post: /foo/2/bar.

But remember that for this JSON Pointer, the 2 could potentially select a "2" key in an object, whereas the [2] in the JSON Path can only select from an array. It would need to be ['2'] to select from an object… but then it couldn’t select from an array.

Therefore, a JSON Pointer can be expressed as a JSON Path only when all of its segments are non-numeric keys.

EDITOR’S NOTE: I realized a few months after posting this that this only considers JSON Path segments with a single selector. If you allow for multiple selectors, you can certainly convert any JSON Pointer to JSON Path by including a numeric selector and a string selector: /foo/2/bar → $.foo[2,'2'].bar. More info here.
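The multi-selector conversion from the editor’s note can be sketched in a few lines of C#. This is only an illustration under stated assumptions: `PointerToPath` is a hypothetical helper of my own, not part of any library, and it always emits bracket notation (`$['foo']` rather than `$.foo`), which selects the same nodes. It also skips the extra escaping that keys containing control characters would need.

```csharp
using System;
using System.Linq;
using System.Text;

// Hypothetical helper (not a library API): converts a JSON Pointer into a
// JSON Path that selects whatever location the pointer could identify.
static string PointerToPath(string pointer)
{
    var path = new StringBuilder("$");
    if (pointer.Length == 0) return path.ToString();   // "" points at the root
    if (pointer[0] != '/')
        throw new FormatException("A non-empty JSON Pointer must start with '/'.");

    foreach (var raw in pointer.Substring(1).Split('/'))
    {
        // Undo JSON Pointer escaping: "~1" is "/", then "~0" is "~".
        var key = raw.Replace("~1", "/").Replace("~0", "~");

        // Quote the key so it can be used as a JSON Path name selector.
        var name = "'" + key.Replace("\\", "\\\\").Replace("'", "\\'") + "'";

        // A segment that could be an array index: "0", or digits with no leading zero.
        var isIndex = key == "0" ||
            (key.Length > 0 && key[0] != '0' && key.All(c => c >= '0' && c <= '9'));

        // Numeric segments get both an index selector and a name selector.
        path.Append(isIndex ? $"[{key},{name}]" : $"[{name}]");
    }
    return path.ToString();
}

Console.WriteLine(PointerToPath("/foo/2/bar"));   // $['foo'][2,'2']['bar']
Console.WriteLine(PointerToPath("/foo/bar"));     // $['foo']['bar']
```

Because a given node is either an array or an object, at most one of the two selectors in `[2,'2']` can match, so the resulting path still identifies at most one location.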

/foo/bar is equivalent to $.foo.bar. In general, however, JSON Paths and JSON Pointers are not interchangeable.

When do I use which?

Whether you use JSON Path or JSON Pointer depends heavily on what you expect to get back.

If you only expect (or can only handle) at most a single value being returned, use JSON Pointer. If you are okay with receiving multiple results, then JSON Path is probably your friend.

JSON Schema’s $ref keyword uses URI-encoded JSON Pointers because only a single value is expected (specifically, a value that can be interpreted as a schema).

Kubernetes generally expects multiple results, so it uses its own custom flavor of JSON Path.

The verdict

While some JSON Pointers and JSON Paths can indicate the same locations, this is not the case in general. Use the right one for your scenario.

I think a lot of confusion on this topic arises because many APIs get this decision wrong. I’ve seen many APIs that define a parameter that accepts JSON Path but whose evaluation must result in only a single value. I figure they think more people are familiar with JSON Path (maybe because of the dot syntax), so they choose to use it for the API. But perhaps familiarity isn’t a sufficient reason to use a tool.

Episode IV: A New Blog

Welcome to the new json-everything blog!

This will be a place where I can write about all things JSON. Most of the time, it’ll be a place where I can answer common questions so I have somewhere to point people when they ask.

I’ll also highlight anything that looks interesting to me. This could be use cases from StackOverflow, other JSON-related projects, or even explanations of decisions that I’m making for one of the json-everything libraries.

But don’t you already write posts for the JSON Schema blog?

I do! Thanks for noticing!

I may copy/re-blog those posts over here, but for the most part, JSON Schema stuff has a primary home there. I mainly wanted this site for non-JSON-Schema JSON-related content. For example, a JSON Path specification is coming soon, and I’d like to cover the differences between it and traditional JSON Path.

Anyway, thanks for tuning in.