jsonref

jsonref is a library for automatic dereferencing of JSON Reference objects for Python (supporting Python 3.7+).

from pprint import pprint
from jsonref import replace_refs

# Sample JSON data, like from json.load
document = {
    "data": ["a", "b", "c"],
    "reference": {"$ref": "#/data/1"}
}

# replace_refs() returns a copy of the document
# with refs replaced by JsonRef objects
pprint(replace_refs(document))
{'data': ['a', 'b', 'c'], 'reference': 'b'}

The replace_refs() function

The primary interface to jsonref is the replace_refs() function. It returns a copy of the object you pass it, with every JSON reference it contains replaced by a JsonRef object. Several other options are available, as described below.

jsonref.replace_refs(obj, base_uri='', loader=<function jsonloader>, jsonschema=False, load_on_repr=True, merge_props=False, proxies=True, lazy_load=True)

Returns a deep copy of obj with all contained JSON reference objects replaced with JsonRef instances.

Parameters:
  • obj – If this is a JSON reference object, a JsonRef instance will be created. If obj is not a JSON reference object, a deep copy of it will be created with all contained JSON reference objects replaced by JsonRef instances

  • base_uri – URI to resolve relative references against

  • loader – Callable that takes a URI and returns the parsed JSON (defaults to the global jsonloader() function)

  • jsonschema – Flag to turn on JSON Schema mode. ‘id’ or ‘$id’ keyword changes the base_uri for references contained within the object

  • load_on_repr – If set to False, calling repr() on a JsonRef object will not cause the reference to be loaded if it has not been loaded already. (defaults to True)

  • merge_props – When True, extra keys (other than ‘$ref’) in a JSON reference object will be merged into the document resolved by the reference (if it is a dictionary). NOTE: This is not part of the JSON Reference spec, and may not behave the same as other libraries.

  • proxies – If True, references will be replaced with transparent proxy objects. Otherwise, they will be replaced directly with the referred data. (defaults to True)

  • lazy_load – When proxy objects are used, and this is True, the references will not be resolved until that section of the JSON document is accessed. (defaults to True)

The different modes

proxies

The default mode (proxies=True) uses JsonRef proxy objects to replace the reference objects in the document. For most purposes, they proxy everything to the referenced document. This can be useful for a few reasons:

  • The original reference object is still available with the JsonRef.__reference__ attribute.

  • dump() and dumps() can be used to output the document again, with the references still intact. (Including changes made.)

If you are using a tool that does not play nicely with the JsonRef proxy objects, they can be turned off completely using proxies=False. This is needed e.g. if you want to pass the data back to the stdlib json.dump() function.

lazy_load and load_on_repr

By default, references are not actually resolved until the data is accessed (lazy_load=True). This can be useful to limit the upfront processing of deeply nested or otherwise complicated reference trees. To limit the lookups even more, the load_on_repr argument can be set to False, so that printing the document will not cause the references to load (this can be especially useful when debugging). The downside of lazy loading is that an exception for a reference that cannot be loaded may be raised from many different places in the code that uses the loaded document. Turning off lazy loading (lazy_load=False) can make catching errors much easier.

merge_props

When using this mode, extra properties from the reference object will be merged into the referenced document. For example:

>>> json = {
    "a": {"$ref": "#/b", "extra": "blah"},
    "b": {"real": "b"}
}
>>> print(replace_refs(json, merge_props=True))
{
    "a": {"real": "b", "extra": "blah"},
    "b": {"real": "b"}
}
>>> print(replace_refs(json))
{
    "a": {"real": "b"},
    "b": {"real": "b"}
}

This is against the JSON reference spec, but some other JSON reference libraries also implement this behavior. It can be useful to e.g. extend common JSON schemas with extra properties. This behavior should not be used if you want your JSON documents to be usable with the widest possible array of tools.

A note on base_uri

A common question is how to reference other documents from the local filesystem. This is easy if you provide the correct base_uri to the replace_refs() function (or to the other utility functions). For example, if you have several files in a folder like this:

file-a.json
file-b.json

If file-a.json has a reference like {"$ref": "file-b.json"}, you could load them like this:

from pathlib import Path
import jsonref

file_a_path = Path("file-a.json").absolute()

with file_a_path.open() as file_a:
    result = jsonref.load(file_a, base_uri=file_a_path.as_uri())

JsonRef Objects

JsonRef objects are used to replace the JSON reference objects within the data structure. They act as proxies to whatever data the reference is pointing to, but only look up that data the first time they are accessed. Once JSON reference objects have been substituted in your data structure, you can use the data as if it does not contain references at all.

class jsonref.JsonRef(refobj, base_uri=None, loader=None, jsonschema=False, load_on_repr=True)

A lazy loading proxy to the dereferenced data pointed to by a JSON Reference object.

JsonRef instances proxy almost all operators and attributes to the referent data, which will be loaded when first accessed. The following attributes are not proxied:

__subject__

Contains the referent data. Accessing this will cause the data to be loaded if it has not already been.

__reference__

Contains the original JSON Reference object. Accessing this attribute will not cause the referent data to be loaded.

Loading a document at a given URI

In order to actually get and parse the JSON at a given URI, JsonRef objects pass the URI to a callable, set with the keyword argument loader. This callable must take the URI as an argument, and return the parsed JSON referred to by that URI.

The jsonloader() function is provided to fill this role. It handles http, https, and file references.

jsonref.jsonloader(uri, **kwargs)

Provides a callable which takes a URI, and returns the loaded JSON referred to by that URI. Uses requests if available for HTTP URIs, and falls back to urllib.

If you wish to pass extra arguments to the json.loads() call, they can be passed through kwargs to this function. For example:

import decimal
import functools

import jsonref


loader = functools.partial(jsonref.jsonloader, parse_float=decimal.Decimal)
result = jsonref.load_uri("https://my_doc_uri", loader=loader)

Custom Loaders

If you want to support custom references, you can define your own loader. For example, here is a complete script to load env:XXX URIs from environment variables:

import os

import jsonref


def loader(uri):
    if uri.startswith("env:"):
        return os.environ[uri[4:]]
    # Fall back to the default loader:
    return jsonref.jsonloader(uri)

json_w_refs = {
    "a": {"$ref": "env:MYENVVAR"}
}

result = jsonref.replace_refs(json_w_refs, loader=loader)

json module drop in replacement functions

Several functions are provided as drop-in replacements for functions from the json module.

load

load() and loads() work just like their json counterparts, except that references will already be replaced in the return values.

jsonref.load(fp, base_uri='', loader=None, jsonschema=False, load_on_repr=True, merge_props=False, proxies=True, lazy_load=True, **kwargs)

Drop-in replacement for json.load(), where JSON references are proxied to their referent data.

Parameters:
  • fp – File-like object containing JSON document

  • **kwargs

    This function takes any of the keyword arguments from replace_refs(). Any other keyword arguments will be passed to json.load()

jsonref.loads(s, base_uri='', loader=None, jsonschema=False, load_on_repr=True, merge_props=False, proxies=True, lazy_load=True, **kwargs)

Drop-in replacement for json.loads(), where JSON references are proxied to their referent data.

Parameters:
  • s – String containing JSON document

  • **kwargs

    This function takes any of the keyword arguments from replace_refs(). Any other keyword arguments will be passed to json.loads()

There is also a convenience function for loading a document from a given URI, using the specified loader, and processing its references:

jsonref.load_uri(uri, base_uri=None, loader=None, jsonschema=False, load_on_repr=True, merge_props=False, proxies=True, lazy_load=True)

Load JSON data from uri with JSON references proxied to their referent data.

Parameters:
  • uri – URI to fetch the JSON from

  • **kwargs

    This function takes any of the keyword arguments from replace_refs()

dump

dump() and dumps() work just like their json counterparts, except they output the original reference objects when encountering JsonRef instances.

jsonref.dump(obj, fp, **kwargs)

Serialize obj, which may contain JsonRef objects, as a JSON formatted stream to file-like fp. JsonRef objects will be dumped as the original reference object they were created from.

Parameters:
  • obj – Object to serialize

  • fp – File-like to output JSON string

  • kwargs – Keyword arguments are the same as to json.dump()

jsonref.dumps(obj, **kwargs)

Serialize obj, which may contain JsonRef objects, to a JSON formatted string. JsonRef objects will be dumped as the original reference object they were created from.

Parameters:
  • obj – Object to serialize

  • kwargs – Keyword arguments are the same as to json.dumps()

When things go wrong

If there is a failure when resolving a JSON reference, a JsonRefError will be raised with the details.

class jsonref.JsonRefError(message, reference, uri='', base_uri='', path=(), cause=None)

message

reference

Contains the original JSON reference object.

uri

The URI from the JSON reference object that was being resolved.

base_uri

If the URI was relative, or a fragment, this is the base URI it was being resolved against.

path

This is the path within the JSON document where the reference was found. This can be useful when the reference was deeply nested within the document.

cause

The exception that caused the resolution to fail.