Use JSON files as if they are Python modules (github.com/kragniz)
267 points by bane on Jan 25, 2016 | 65 comments



The mechanism is called "import hooks":

https://docs.python.org/3/reference/import.html#import-hooks

https://www.python.org/dev/peps/pep-0302/

You can see the import hook added here:

https://github.com/kragniz/json-sempai/blob/master/jsonsempa...

Import hooks are a great feature in Python.

Take the `import` mechanism in the language and reduce it to its barest theoretical formulation - what does it really do? (Name binding.) Think of all the other features in your application that can be reduced to this. (e.g., configuration management) Use the constraints of the import mechanism to guide the design of this feature, and use import hooks to implement it. You'll end up with an implementation that is very "close" to the core language. I would assert that this closeness strongly suggests correctness and composability.

Less philosophically, think of the Python virtual machine as a system with lots of "safety valves" and "escape hatches." Import hooks are one such safety valve. Think about all the things you could do easily by hooking into module importing. (Real use-cases: loading code from bitemporal object databases, deprecation of modules, relocation of modules, configuration management, &c.)

Of course, there are languages that are much more flexible than Python in this respect. Python aims for a practical flexibility, and I find that, in practice, Python strikes a nice balance.
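
To make the mechanism concrete, here is a minimal sketch of a meta path finder (not json-sempai's actual implementation; all names are made up) that exposes .json files found on sys.path as modules:

    import json
    import os
    import sys
    import types
    import importlib.abc
    import importlib.machinery

    class JsonFinder(importlib.abc.MetaPathFinder, importlib.abc.Loader):
        def find_spec(self, fullname, path, target=None):
            # For top-level imports `path` is None, so fall back to sys.path.
            for entry in (path or sys.path):
                candidate = os.path.join(entry, fullname.rpartition('.')[2] + '.json')
                if os.path.isfile(candidate):
                    return importlib.machinery.ModuleSpec(fullname, self, origin=candidate)
            return None

        def create_module(self, spec):
            # Build a plain module object whose attributes are the JSON keys.
            module = types.ModuleType(spec.name)
            with open(spec.origin) as f:
                module.__dict__.update(json.load(f))
            return module

        def exec_module(self, module):
            pass  # nothing left to execute; create_module did the work

    sys.meta_path.append(JsonFinder())

Appending to sys.meta_path (rather than inserting at the front) keeps genuine modules ahead of same-named JSON files.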


Import hooks are awesome; one of the cool things you can do is rewrite the code as you load it. I made a toy loader that automagically inlines (some) Python function calls[1]. There are also cool projects like MacroPy[2] that do much more extensive things.

1. https://tomforb.es/automatically-inline-python-function-call...

2. https://github.com/lihaoyi/macropy
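
A rough sketch of the general idea (this is not the linked inliner's code; names are made up): subclass the standard source loader, transform the AST in source_to_code, and register it for .py files with a path hook:

    import ast
    import sys
    from importlib.machinery import FileFinder, SourceFileLoader, SOURCE_SUFFIXES

    class RewritingLoader(SourceFileLoader):
        def source_to_code(self, data, path, *, _optimize=-1):
            tree = ast.parse(data, filename=path)
            # ... transform `tree` here (inline calls, expand macros, etc.) ...
            return compile(tree, path, 'exec', dont_inherit=True, optimize=_optimize)

    # Route directory-based .py imports through the rewriting loader.
    # (Caveat: this finder only knows about source files, so directories
    # containing extension modules would need extra loader_details.)
    sys.path_hooks.insert(0, FileFinder.path_hook((RewritingLoader, SOURCE_SUFFIXES)))
    sys.path_importer_cache.clear()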


I read your experiment on inlining and it looks quite interesting. Did you benchmark how it affects (improves?) performance? In my line of work I've occasionally found places where inlining would've helped (functions that do masking with 64-bit masks, for example, are a mess to "inline by hand" and kill a lot of readability and clarity). Even with the current limitations of your "toy" implementation, it seems like it would help to avoid the costly function calls.


Awesome, I saw your post a while back and it inspired me to write a full inliner for Python. I also added some IFDEF macros using context managers, e.g.

    with IFDEF("DEBUG"):

I'll have to dig around and find my code. Your inline project was a great help in understanding the import hooks and walking the AST.


Sounds a bit like the Ruby-fication of Python.


A simple example of how import hooks can be used for module relocation/deprecation:

https://gist.github.com/dutc/0f7498451d98e3114268
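
For comparison, the lowest-tech relocation trick doesn't need a hook at all; aliasing the old name in sys.modules is enough for later imports. A sketch with made-up package names (note the warning fires when relocate() runs, not on each import):

    import importlib
    import sys
    import warnings

    def relocate(old_name, new_name):
        warnings.warn('%s has moved to %s' % (old_name, new_name), DeprecationWarning)
        # Later `import old_name` statements find this entry in sys.modules
        # and skip the normal search entirely.
        sys.modules[old_name] = importlib.import_module(new_name)

    relocate('oldpkg', 'newpkg')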


This seems like a huge case of "implicit over explicit". Ideally if you import package foo, it shouldn't affect anything but foo.bar


Python itself never enforced that, though. It's the libraries' responsibility to have a "register_import_hook()" function or similar instead of just doing so unannounced.


For what it's worth - modern JavaScript does it too, it separates the syntax (import/export) and semantics of binding the reference from the actual loading mechanism. Not only can you "hook" into imports, you can replace them altogether if you'd like.


The "loader protocol" (PEP 302) is a related trick that can be used locally, so it only affects imports from packages that opt in. I use this in https://github.com/bdarnell/codegenloader to automagically process protobuf/thrift files and import the generated code.


My favorite is this issue: https://github.com/kragniz/json-sempai/issues/7

> Ever since I saw this I have been unable to sleep. Please fix.


> Works as intended. adds the 'wontfix' label

This is gold.


The 'wontfix' label looks more like a yellow than a gold to me.


IIRC there is a moment of sheer terror in this talk by David Beazley [0] where he shows how to import an XML file as a Python module.

0. https://www.youtube.com/watch?v=0oTh1CXRaQ0


From the examples:

    >>> from jsonsempai import magic
    >>> from python_package import file
    >>> file
    <module 'python_package.file' from 'python_package/file.json'>
I believe it is not a good idea to teach people to import something named "file", as that shadows Python's builtin class "file".

(On the other hand, that class is usually instantiated via calls to "open", so the class name "file" is unused in most programs dealing with files.)


Also, 'file' doesn't exist in Python 3.


Feel free to send a pull request changing it to a name that makes things clearer. Calling it 'file' probably isn't the best idea, but I think it makes it clear you're loading from some arbitrary json file.


Really love the fact that the import is named `from jsonsempai import magic`.

If this were to be used seriously, it would at least make it clear to other devs that the import named magic is doing some crazy stuff.


I did the same with Perl in 10 minutes: https://git.io/vzKDs

It was fun. I will check it tomorrow when I'm back from the pub.


A man from Toronto writing a 10-minute perl module from a pub? This sounds like a Wheat Sheaf sort of affair.


Is there any practical difference between that and just a normal new() function returning a hash based on that json? Does bless change anything?

I mean, for "the same" I'd expect something like:

    ....
    use test
    say test->hello;
    say test->this->could->be->a->bad;


Ok, back from the pub and fixed it. You can now use it like this:

    use BlessJSON qw(test.json);
    say test->hello;
    say test->this->could->be->a->bad;

close enough


Why do all the comments act like this is bad? Seems fine to me. Explanation would be appreciated :)


You're changing the behavior of import. So if somebody else were to remove the seemingly unused `from jsonsempai import magic` the file would stop working as expected.

Non-obvious "magic" code like that tends to be frowned upon in general, and especially in the Python community.


Is it useful? Or are there alternatives for importing JSON as a module as easily as this one does?


For files:

    import json
    with open("foo.json") as f:
        foo = json.load(f)
For strings:

    import json
    bar = '{"foo": "bar"}'
    foo = json.loads(bar)


Does that respect import search paths? Does the original?


There is no good reason to load JSON from import paths. Your import paths should contain code and assets, not configuration.


A JSON document might be an asset, such as static data. In fact, I'd much rather have a Python file as configuration and JSON as data than the opposite.


Does the json.load approach respect import paths: no.

As for whether or not json-sempai does, I looked through the code and I am pretty sure the answer is yes.

Using pkg_resources is probably a better way to include static data files in your package, but hey, radical freedom and all that.


Do you have any good tutorial on how to use pkg_resources? I use pathlib.Path(__file__).absolute().parent, but it's not zip-safe. Problem is, I can't wrap my head around pkg_resources and the docs don't help.


Yeah, docs are a bit fuzzy and I'm not exactly sure what the right answer is.

So there are actually two ways to go about it AFAIK and I'm not entirely sure which one is best. One is to list files in MANIFEST.in and then access them via __file__, but I think this doesn't work if your module gets turned into an egg/wheel (it's for source dists).

The other is pkg_resources, where you declare the list of files in the package_data argument to setup() in your setup.py. You can then get the contents of a file (not its path) through the pkg_resources package, which provides a lookup table of loaded resources. This [apparently] works for binary dists but not source dists (the opposite of MANIFEST).
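
A minimal sketch of that route (package and file names are made up):

    # setup.py
    from setuptools import setup, find_packages

    setup(
        name='mypackage',
        packages=find_packages(),
        package_data={'mypackage': ['data/*.json']},
    )

    # anywhere inside mypackage
    import json
    import pkg_resources

    raw = pkg_resources.resource_string('mypackage', 'data/defaults.json')
    defaults = json.loads(raw.decode('utf-8'))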

I've used pkg_resources without issue myself, but it's possible I've just been lucky. In any event, here are a few links discussing the differences and also how to use pkg_resources:

http://stackoverflow.com/questions/7522250/how-to-include-pa...

http://blog.codekills.net/2011/07/15/lies,-more-lies-and-pyt...

http://peak.telecommunity.com/DevCenter/PythonEggs#accessing...

If you want an example, here's a package I made that uses pkg_resources (I make no promises that this is the right way to do it, but it works for me):

(setup.py, see line 10) https://github.com/jasonmhite/gefry2/blob/master/setup.py

(the data file I want to include) https://github.com/jasonmhite/gefry2/tree/master/gefry2/data

(where the code loads the data file, see line 67) https://github.com/jasonmhite/gefry2/blob/master/gefry2/mate...


Thanks


Cython does the same thing for ".pyx" files and I have to say it is very handy and works well.
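
That hook ships with Cython as pyximport (the module name in the last line is made up):

    import pyximport
    pyximport.install()

    import my_cython_module   # finds my_cython_module.pyx and compiles it on first import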


Because it monkey patches the way imports work. The context manager isn't bad really, although there are probably more natural ways to create a python object with the same structure as a JSON file without abusing the import system.


Right. JSON is almost valid Python to begin with. Off the top of my head, the main differences are that JSON true needs to be True in Python (likewise with false/False, null/None). It's not that difficult.


It's also slightly stricter with syntax; you can't have trailing commas. {'foo' : 1, 'bar' : 2,} is valid Python but not valid JSON.


Yes, I keep running into this. I leave trailing commas in my JSON as it evals fine in Python. Then my C++ code, using boost::property_tree chokes on it. property_tree also needs double quotes, not the single quotes in this example.


There's a module for this: json. The error messages might not be very good, but it does check for trailing commas.
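
To illustrate the differences from the last few comments (trailing commas, quoting, true/null), using only the stdlib:

    import ast
    import json

    def attempt(parse, text):
        try:
            return parse(text)
        except ValueError as exc:
            return 'error: %s' % exc

    print(attempt(ast.literal_eval, "{'foo': 1, 'bar': 2,}"))      # Python accepts single quotes and the trailing comma
    print(attempt(json.loads, "{'foo': 1, 'bar': 2,}"))            # JSON rejects both
    print(attempt(json.loads, '{"flag": true, "x": null}'))        # fine as JSON
    print(attempt(ast.literal_eval, '{"flag": true, "x": null}'))  # true/null aren't Python literals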


To note: Python 3, due to the whole string vs unicode literals thing (\u)


As a JavaScript developer for the past 5 years or so, when jumping into Python I certainly miss the simple require(<json file>), which this seems to replicate in Python pretty well!

Having said that, I don't think I would use it for serious things, since this isn't really the Python way and I would like my code to be easily understood by others. Neat though!


JSON stands for JavaScript Object Notation, not Python Object Notation. Furthermore, "require" is "proprietary" to Node.js; ES2015 has a different module system, which will be the official JS module system.


> JSON stands for JavaScript Object Notation , not Python Object Notation.

That does not mean it shouldn't be used anywhere else.


The require() function was actually part of the old CommonJS standard: http://wiki.commonjs.org/wiki/Modules/1.1

node was originally an implementation of it before eclipsing it. Between the original advocacy by CommonJS and the rising popularity of node, browserify, and Webpack, things like require() leaked out to become the pre-ES2015 de facto standard for importing modules.

require() is so widespread now, and the transform to the ES2015 syntax so trivial, that it's not going away anytime soon.


Yeah, and? I'm not sure what you were trying to get at, but I never said anything contrary to this, and I mentioned I've done lots of JavaScript dev, so this isn't anything new.


> Disclaimer: Only do this if you hate yourself and the rest of the world.

I'm curious why the author would do this. Just for the fun in it?


Likely for the fun of it, and it might just save time when writing one-off scripts or using the interpreter.

I nearly spit my drink out at the name, though.


Yes, it was just for fun.


Fun is the only thing that can justify this blasphemy :)

Why do this instead of:

   myvar = read_json_file_as_object('path/to/file')
??

Ok, you can argue that function would hit the hard drive many times when called from multiple modules (while imports would not), but this can be fixed with simple memoization.
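
For instance (a sketch; the function name is just the parent's placeholder):

    import functools
    import json

    @functools.lru_cache(maxsize=None)
    def read_json_file_as_object(path):
        # Parsed once per path; later callers get the same dict back,
        # much like a module object is shared between importers.
        with open(path) as f:
            return json.load(f)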


The enormous downside to the import approach is that once you want to load a JSON file dynamically, e.g. with a filename that comes from a dynamic string, you're stuck and you have to come up with a completely different approach and rewrite your code.


Not correct.

Like many things in Python, import is actually just syntactic sugar for a dunder function:

https://docs.python.org/3/library/functions.html#__import__
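
In practice you would reach for importlib rather than call __import__ directly; a quick sketch with a made-up module naming scheme:

    import importlib

    def load_config(environment):
        # environment='staging' imports the (hypothetical) module config_staging
        return importlib.import_module('config_' + environment)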


Excepting the cringemarsh that is "import <jsonfile>", this is actually very useful. This looks very unpythonic:

my_json["this"]["can"]["be"] == "nested"

and is impossible to write defensively and cleanly after the first call:

    my_json.get("this", {}).get("no way to not crash here")

Something that could do:

    my_json.this.can.be == "nested"

With a default value would be very useful. Does anyone know something like it, or should I whip a module up in an hour?
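
For reference, a rough sketch of the idea (the class and method names are made up; keys that collide with get() would be shadowed):

    class Node(object):
        _MISSING = object()

        def __init__(self, value=_MISSING):
            self._value = value

        def __getattr__(self, name):
            # Missing keys keep chaining instead of raising.
            if isinstance(self._value, dict):
                return Node(self._value.get(name, Node._MISSING))
            return Node(Node._MISSING)

        def get(self, default=None):
            return default if self._value is Node._MISSING else self._value

    data = Node({'this': {'can': {'be': 'nested'}}})
    print(data.this.can.be.get())              # 'nested'
    print(data.missing.key.get(default=42))    # 42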


I've taken a crack at something similar; I'd love to see what you came up with. I ended up with this super-meh solution: https://github.com/zmap/ztag/blob/master/ztag/transform.py#L...


My thing turned out to be pretty much the same as yours, also equally meh:

https://github.com/skorokithakis/jsane

There's no code there yet, but the underlying class is pretty much the same as yours. I'm still trying to improve the API, but it's not looking great.


What I had in mind sounds pretty close to yours, really. I was thinking of adding a default somewhere along the line (possibly the end), so you could do:

    json.structure.that.i.want.to.access.default(True)

Or something like that. I don't really like the default-value API I've come up with... I'll comment here if I do give it a go; thanks for the snippet!


> Have you ever been kept awake at night, desperately feeling a burning desire to do nothing else but directly import JSON files as if they were python modules

I almost died laughing at this. Feels like the author is aware that this scenario is probably the only one where such a library can be mission-critical.


Funny 'cause it's true. It's really something I desperately sought a few years ago.


... ??? why? What's wrong with json.loads()?


Awesome, although I don't quite have a use for it right now. Next we could:

* Import HTML/CSS/JS/DOM via a URL: https://news.ycombinator.com

* Import XML?

* Import any_structured_data!


Ah. kragniz must be the Damian Conway of Python.


I'm not sure which is the better/worse idea:

"JSONx is an IBM® standard format to represent JSON as XML. The appliance converts JSON messages that are specified as JSON message type to JSONx. The appliance provides a style sheet that you can use to convert JSONx to JSON."

https://www-01.ibm.com/support/knowledgecenter/SS9H2Y_7.1.0/...


... and I was writing SPARQL queries against my Java source code yesterday


These jokes are getting rather sophisticated. For a few (head-scratching) minutes, I thought this was meant seriously.


Some people are saying JSON ≠ Python Object Notation. Here we go! https://github.com/brhs/pon

(I used OP's work as a reference. Amazing you can do this in so few lines of code.)


It's official. PON is now a thing.



