1) Create a virtual environment: python3 -m venv myproject (this is the "get an interpreter" step).
2) Install dependencies: source myproject/bin/activate && pip install -r requirements.txt (download all project dependencies).
3) Start the application: source myproject/bin/activate && python myapp.py
If you can assume that there is an interpreter available on the system, say /usr/bin/python3, you can use that instead of creating a virtual environment.
If you want to embed all the dependencies, you can save all the files created by pip install in the lib/ directory, if I remember the name correctly.
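A minimal sketch of that approach (the vendor/ folder name is just illustrative, not necessarily the lib/ directory mentioned above):
# install the dependencies into a local folder shipped alongside the app
pip install -r requirements.txt --target vendor/
# at run time, put that folder on the import path
PYTHONPATH=vendor/ python myapp.py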
Should I write a full blog post with examples? I used to deploy Python applications in a bank and can explain all the advanced usage: with and without internet access, with and without dependencies.
" you can assume that there is an interpreter available on the system, say /usr/bin/python3, you can use that instead of creating a virtual environment."
Please DO NOT do this. It is so much easier to always create a virtual environment, and then you never have to worry about installing two applications on the same system that may have conflicting versions. Additionally, you don't know what else may be installed into the system's default python environment.
Using the system Python for scripts that only depend on the standard library should be pretty safe. As soon as I need any third-party dependency I create a virtual environment.
Edit: Although if startup performance is important, using a virtual environment might be a good idea even for simple scripts. Running a "hello world" program using the system Python takes twice as long on my system as running it with "python -S" (i.e. ignoring site packages) or from a fresh virtual env.
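A rough way to compare this on your own machine (standard CPython flags; -S skips importing the site module):
time python3 -c "pass"
time python3 -S -c "pass"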
Aaaand, while you're at it, a single .dmg file for macOS with binaries and scripts so us lazy people with MacBooks can start coding right away :) (it might be a bit inaccurate but I guess you know what I mean)
One solution is to use Anaconda / Miniconda. If you are not that tech savvy and don't mind downloading several hundred megabytes, you can get away with this easily.
In my case, and contrary to what one of the other commenters said, I'm using Docker.
It took a while to get it ironed out, but I use a "base" folder where I have already downloaded all the packages I use by default (off the top of my head: pandas, numpy and youtube-dl ;) )
I have all the related configuration in an env file that tells pip to save the packages in the base folder, so they don't disappear when I shut down the container (which I always run with --rm so it gets removed when it dies). To create a new env I just have to copy that folder. As I use a base folder for all of this, I don't need to remember the names of the envs; I just need to list the folders.
Only downside to this - because I'm lazy and it still hasn't bothered me enough to fix it - is that new files & folders are owned by root.
I just use a base image with Python, and for different Python versions this - supposedly - works the same way: I just have to pull a different image.
As someone who's been hanging out in freenode python channels for years helping noobs, I recommend strongly against any kind of conda setup for new programmers. It often has strange issues that noobs have difficulty even understanding how to ask for help with, nevermind actually solve.
Learning how to use a simple barebones venv is extremely easy, saves a ton of time both in the short and long run, and generalises better.
pip install virtualenv
cd your/project/location
which python
virtualenv -p result_of_which_python env
source env/bin/activate
pip install anything_you_like
and do whatever you want from there. Those commands get 100% of the basics out of the way for noobs, and cover like 90% of what you need for more serious work.
> but I use a "base" folder where I have already downloaded all the packages I use by default [...etc...]
Your setup sounds outrageously complex, and I don't understand why you would do any of that.
Also, on Windows, use the Python launcher[1] (that comes with the standard install of Python for Windows) to avoid having to mess with PATH settings as you would have had to historically. Instead of calling "python" or "python3" and hoping for the right version, just call "py" and tell it which version you want:
py -3.6-32 -m venv env
That gets you a venv with 32-bit Python 3.6, for example. Use "py -0" to list available versions.
- It would work the same way on Windows, Linux and macOS
- Most importantly (!), it is not just a Python package manager. That is, if a package needs BLAS or MKL or HDF5 or whatever else, you won't be crossing your fingers and hoping your system-installed version works for all your venvs; instead, those binary libraries are properly managed per environment.
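For example (an illustrative sketch, arbitrary names), the compiled HDF5 library comes from the environment itself rather than from the OS:
conda create -n myproj python=3.10
conda activate myproj
conda install h5py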
Pro tip: use mamba instead of conda to get a free 4x speed boost.
Ah yes, all of those noobs who need weird numpy extensions and highly domain oriented and specific odd data storage formats.
That's the generalisation part I mentioned. The distinction you're making is non-trivial for a wee noobie. Better they get the groundwork in and expand to where they need to go.
One of the things most noobs will install first is pandas. Pandas needs numpy. Numpy needs LAPACK, BLAS etc, even if they're not aware of it. Yes, there are wheels, but that's limited to python-only stuff.
Then they're reading on some blog about protobuf, or hdf5, or arrow or whatever, and want to use it - but using any of these from Python needs the C libraries installed. On Linux/macOS you'll need to dig into your package manager and then hope things are compatible; on Windows it's a complete pain. In conda, they can just 'conda install h5py' or whatever, and get proper hdf5 installed without having to figure out the nitty-gritty details.
Well, already I can tell you're disconnected from what "most noobs" are actually like. Presumably you think "most noobs" are data scientists, which just isn't the case in my experience.
> Presumably you think "most noobs" are data scientists, which just isn't the case in my experience
It is the case in my personal experience though. I don't think I'm disconnected from what 'noobs' are as I've taught a lot of them through my line of work.
Assume you're a noob exploring the Python package universe: fire up a new private tab and google 'most popular python packages'; in my case the first 3 links are:
Conda is one of the most vile pieces of software I have ever used, and recommending that beginners take on that dragon sounds like complete insanity to me.
Please let them actually use python before sending them into tooling hell.
I'm not a huge fan of conda either but your comment is not very constructive without either providing concrete examples of what you mean, or explaining your 'complete insanity' claim. I don't personally see how helping newcomers easily maintain environments with versioned C libraries in them is vile or insane.
If you're just writing "a=b+c"-level Python scripts, or scrambling together tiny flask/django apps, or whatever else, you probably don't care indeed.
If you're doing/learning any kind of data-science-related stuff, that requires tons of C extensions and libraries. And many 'noobs' learn Python in order to use pandas/numpy/torch/whatever else is hot these days.
You are right. My comment is colored by the countless times I have tried conda and it was one of the worst experiences I have ever had with software.
Of all the times I have tried conda, only once was it as easy as following the guide they post. The other times it was completely broken. I also greatly dislike that it pollutes your default environment. Imagine if every piece of software did that.
> Your setup sounds outrageously complex, and I don't understand why you would do any of that
But isn't that similar to what virtualenv does?
The difference is that with virtualenv, the folders are created elsewhere.
Besides that, the friction so far hasn't been enough for me to automate this even more (as in, creating a script that would do the folder creation), but I guess I'll do it eventually.
Also, this way I have a set of python packages already installed.
And it was fun learning how to get pip / python to do things the way I wanted. That, I guess, answers this:
> Also, this way I have a set of python packages already installed.
Isn't this what virtualenv was created to get away from?
If several of your projects rely on one of the "base" packages, upgrading it (which you might want for one project) could break the other projects.
Also, having implicit dependencies on packages might lead to problems with deployment (or for other developers) because they don't know that this package is required (and you might not have realized either, since it's always available on your machine).
(I might have misunderstood your solution, though. I've never used Docker.)
Hi, nope, the reasoning behind having a base set of packages installed is that I more often than not use those.
But as I start Docker off a particular folder, I could decide not to use that base folder, and then have no packages pre-installed, or a completely different set.
This other comment [1] clarifies my whole set up, and sheds a lot more light on how it works
The reasoning here is lost on me. We want independent envs for independent codebases. Why do you have some static base to build off?
What's with the root stuff? Why is any of this taking place outside of the env that actually needs it? You're creating huge amounts of complexity and dependency problems to save yourself a couple of literal ms of download / path lookup per env.
The friction here sounds like a tire fire. I think you are just not aware at the moment of how much less convoluted this can be. You've become used to this complexity.
> We want independent envs for independent codebases
OK. I understand where you are coming from, and I feel your pain, being a developer myself and having been in the "But, why?!" position way more times than I want to admit.
You sound really distressed by this, and at the same time honestly curious (and baffled can probably be shoved in there too), so I'll explain things a little more.
I think all this starts with the fact that I explained myself rather poorly. I _do_ have different envs for different codebases. My set up is like this:
Everything python related is in a folder where I store my "dockerized" projects:
~/projects/docker/python/
In that folder I have a pip.env file that I use when setting up the container. It tells pip where to pick things up from:
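Something roughly along these lines (a hypothetical sketch, not the actual file - pip reads its options from PIP_* environment variables, and /work/packages is just an illustrative path):
PIP_TARGET=/work/packages
PYTHONPATH=/work/packages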
I admit it might be a little messy (and maybe there are some redundant confs in there), but it works. Once I got it working, I didn't bother removing what was not necessary.
then I have a "base" project folder, that I use as a starting point for _most_ of my projects:
base
├── packages
└── notebooks
This folder has a few "default" packages already installed (pandas, numpy, matplotlib, youtube-dl).
So, as you can see, the packages all go in a custom folder. This is because when I start the container, the project folder is shared inside the container, and python / pip pick the packages up from there. If I install new packages, they are persisted in that "env" folder outside the container, so the next time I "start" that environment all those packages are there.
Whenever I want to start a new project, all I need to do is:
python.sh env_name
python.sh is a bash script that checks if env_name is there (as a folder). If not, it copies the base env and all its contents (that, just so it's extra clear, is a barebones folder _except_ for the already mentioned packages installed in it). If it receives a second parameter, it uses that as the starting folder (so I could duplicate an existing project, or use an even emptier base folder).
Once the folder is there, it just starts a container in interactive mode with the (environment) base folder always shared as /work.
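A rough sketch of what such a python.sh could look like (not the actual script; names, paths and the image tag are illustrative):
#!/bin/bash
set -e
env_name="$1"
base="${2:-base}"   # optional second argument: folder to copy from
cd ~/projects/docker/python
# create the env folder from the base folder if it doesn't exist yet
if [ ! -d "$env_name" ]; then
    cp -r "$base" "$env_name"
fi
# throwaway interactive container, env folder shared as /work, pip configured via pip.env
docker run --rm -it --name "$env_name" --env-file pip.env -v "$PWD/$env_name":/work -w /work python:3.7 bash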
In case I need some files to work there, I just copy them.
Again, the _only_ issue so far - the one that annoys me - is that if I create files from inside the container they are owned by root. Eventually I'll grow tired of this and will fix it, but I'm not there yet.
For me, starting a certain env is just one line of code. Obviously you could do the same with your method.
That said, even without the script, creating a new env just means copying the base folder by hand and running the same docker command against it.
What I like about this setup is that all the files related to a certain environment are in one place. Also, moving envs from one machine to another is rather trivial (as long as I didn't install anything not related to Python in that container).
This way, the container is removed every time you exit it, but it has a name so you can log into it from another console should you need it. The only issue with this is that it's a barebones OS, so if you need some program in it you have to install it (and that is lost if you don't keep the container: on my todo list there's a change for python.sh where it would be possible to persist the containers, and run from one already existing if found. The downside to this is that every container you persist is several hundred MB... but space is cheap, right?). BTW, I do have a container I persist for use with youtube-dl, as it depends heavily on ffmpeg.
From my point of view, that works the same as a venv. Your mileage may vary.
Caveat: python:3.7 is not the default name of the python image; I renamed it because the default was too long. I used to do it this way until I got tired of writing the same long command.
Sure, but I'm not teaching here. Even then, Ubuntu will tell you how to install pip (actually it probably comes installed), and closing the terminal is sufficient to get out of the env for most people.
The only time I've appreciated conda is on Windows systems. Recently I've been playing with PyStan, which - after hours of trial and error - I'm not even sure is possible to install correctly without conda.
Recommending Docker in these discussions usually gets me downvotes (?), but I'll do it anyway because it's such an easy way to develop and run Python programs across environments:
1/ add a Dockerfile into your project with the following lines:
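A minimal sketch of such a Dockerfile (the exact lines will vary per project; the image tag, file names and command here are purely illustrative):
FROM python:3.11-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install -r requirements.txt
COPY . .
CMD ["python", "myapp.py"]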
You can run code-server to run VS Code inside the docker container where your files will be. This is the Dockerfile I use, which I got from someone else's link that was posted on HN:
# the base miniconda3 image
FROM continuumio/miniconda3:latest
# load in the environment.yml file - this file controls what Python packages we install
ADD environment.yml /
# install the Python packages we specified into the base environment
# (assumed form of the install step: conda env update applies environment.yml to the base env)
RUN conda env update -n base -f /environment.yml
Building that and running it with: docker run -d -p 127.0.0.1:8443:8080 -p 127.0.0.1:8888:8888 -v $(pwd)/data:/data -v $(pwd)/code:/code --rm -it <image>
from the directory where your code is will put those files into the container, and start a VS Code and a Jupyter Notebook server on your localhost. The password for Jupyter is the default "local-development" and the password for the VS Code instance is in the Docker logs. You can set these via the Dockerfile but I just keep the defaults.
I vastly prefer this to anything else because it means I can install any packages I want without worrying about messing up my environment. You can use virtual envs to make this even better, but I am typically too dumb and lazy for that. Better still, my development is the same on my Mac, on my Linux machine, and on my Windows machine. Same VS Code version, same packages, etc.
Biggest issue here is with certain VS Code plugins. Some, like the vim plugin, can be finicky and depend heavily on the version of code server that you use. Some plugins break completely. However, I mainly hate plugins so this doesn't present much of an issue for me personally. I have the vim plugin, the python plugin, and a terraform plugin installed. Once they are installed, they work perfectly for me.
The way my set up works is I have a repo with that Dockerfile in it as well as the accompanying files such as environment.yml and docker-entrypoint.sh:
#!/bin/bash
set -e
if [ $# -eq 0 ]
then
    # no arguments: start Jupyter Lab in the background, then code-server in the foreground
    jupyter lab --ip=0.0.0.0 --NotebookApp.token='local-development' --allow-root --no-browser &> /dev/null &
    code-server2.1698-vsc1.41.1-linux-x86_64/code-server --allow-http --no-auth --data-dir /data /code
else
    # otherwise run whatever command was passed to the container
    exec "$@"
fi
I actually currently use Python inside Docker (I put that in another comment in this thread). Much cleaner, and it was fun getting python to do what I wanted (where I wanted) inside the container.
I have had this problem every time I try to teach anyone to use Python. I have years of experience with messing around with all of the relevant parts of Windows, Mac, and Linux and it never Just Works. It's super frustrating when others are like 'it's easy, just use x'. It's not. If you're a beginner, setting up a dev environment is far more of an obstacle to learning to code than the syntax or programming concepts.
Repl.it is one way to sidestep these issues but it's not perfect. Making people jump in the deep end and use Linux helps to some extent, but also brings its own set of issues and frustrations.
I don't think "add another layer of abstraction to hide the complexity" is often a good solution. Docker brings it's own problems too.
> I don't think "add another layer of abstraction to hide the complexity" is often a good solution
I wholeheartedly agree here.
> Docker brings its own problems too
This is a rather complex statement to reply to. Even though you might be right, I don't think this applies totally to what is being discussed here.
The biggest issue I have with people advising for or against a certain tool is that they do that from the point of view of the tool, instead of looking at it from the problem you are trying to solve. That, in your case, would be:
> I have years of experience with messing around with all of the relevant parts of Windows, Mac, and Linux and it never Just Works
As long as you manage to install Docker on all those 3 systems, you're set. (I have no experience with Mac because I don't use it, but for both Windows and Linux installing Docker is a no-brainer.)
There's a slight curve when it comes to fetching the right image and running it, but your problem is not that one; your problem is teaching Python. So you can take care of that yourself, and focus on the teaching part.
Supposing that you managed to install Docker, fetch a Python image and run it (something that is a lot easier to do than it sounds), you have Python, in whatever version you want, in an isolated way.
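Something as small as this (the image tag is just an example) already gives you an isolated, throwaway interpreter:
docker run --rm -it python:3.11 python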
I use pipenv, and it works well enough for what I'm doing, but in reality the problem is that Python just isn't designed to be distributed. There are various techniques, each with their own pros and cons, but the only consensus is that it's a pain in the ass.
Docker uses more resources, and is trying to isolate your code from the rest of the computer. It also has a higher learning curve.
Virtual environments are just separate copies of python with their own libraries installed.
Pipenv is basically just a workflow for organizing virtual environments
I can give someone a repo and tell them to type pip install pipenv && pipenv sync and they'll have everything. That assumes they already have the correct version of Python installed, which is one nice thing Docker handles, but it is easy to install Python these days so it hasn't been an issue.
My biggest problem with docker was that I ended up using a ton of storage just learning about it. I have a feeling that was mostly my own fault, maybe using too heavy of a base image. I've been trying it out once or twice a year for about 6 years now, every time my conclusion is "wow this is really cool, I wish i could justify spending more time to get it right"
Though docker and virtual environments share the same problem, in that they are just a way for a developer to distribute code to other developers, and to production environments. Distributing python applications to end users is a totally different issue. I floated the idea of sending out a local data collection* app to Mac and windows users mostly because I think it would be fun to try.
*data collection of troubleshooting information from users within the same company on company hardware, that are actively asking for help. I'm not trying to spy on people.
> What are the tradeoffs vs using docker? Just curious.
Probably some combination of memory usage and complexity, depending on your application. If you're already familiar with using docker as a development environment, definitely go for it.
I don't use pipenv, I'm still using plain old virtualenv for development. Mostly it's just a matter of familiarity. If there's not an itch, why scratch?
Docker has been mentioned a couple of times, but no one has referenced one of the biggest wins of Docker, which is that it lets you quickly get up and running with the things your Python app might be using in addition to Python itself.
If we're talking about web development with Python typically that means PostgreSQL and Redis too, and probably running Celery in addition to a web server such as gunicorn.
It's really nice to be able to just run a single Docker Compose command and be up and running in a way that works the same on Windows, MacOS and Linux.
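A minimal sketch of that kind of Compose file (service names, images and the app command are illustrative, not a recommended production setup):
services:
  web:
    build: .
    command: gunicorn myapp.wsgi:application --bind 0.0.0.0:8000
    ports:
      - "8000:8000"
    depends_on: [db, redis]
  db:
    image: postgres:14
    environment:
      POSTGRES_PASSWORD: example
  redis:
    image: redis:7
Then a single `docker compose up` brings the whole stack up the same way on any of the three platforms.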
Haha yeah I find it easy too, which is why we're here, but I'm trying to also make it easy for people who normally think it isn't.
And yes, setting up Python is a pain, so I think being able to start learning and running code without any setup is a big deal for learners. Get them excited about coding before they have to deal with that!
I've decided to play it simple albeit "old school" (because Docker seemed to add its own set of headaches - also not every project needs to do micro services or work with clusters or scale):
- Vagrant (so you don't have to worry about runtime executables) with the stack matching that of the production server (you can even ask ops to provide you with the provisioning script and remove the parts that you don't need - otherwise learning to provision your dev. VMs won't hurt you)
- Virtualenv
- pip
Then simply point your IDE (I only use Vim under duress) to the remote Python interpreter (the one you installed in the Vagrant VM).
It does add processing overhead but it worked with my 2010 MacBook Pro until it died and still works (only ten times faster) with the 2016 model. Your only limitation would be the RAM (I would recommend at least 8GB and if you plan to run multiple machines communicating together as much as you can afford - I do believe that Docker has less overhead, but again, for my use, not needed).
The best practice is what works for you, not the latest trend.
In a nutshell, use pyenv [0] to manage your Python versions. It's a bash-shell solution, so it won't work on Windows; but if we're talking about learning Python, I highly recommend using WSL [1] anyway.
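For example (version numbers are just illustrative):
# install and select a Python version for the current project
pyenv install 3.12.3
pyenv local 3.12.3
# then create a venv with that interpreter as usual
python -m venv .venv
source .venv/bin/activate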
It's definitely confusing, because there are so many options. I wrote about dependency management best practices here[1]. Planning to write a follow-up on "cross-platform executables" soon
Nice work! I tried to clone the project and run with Docker with the hope of contributing but I'm getting this error message on the "Introducing the Shell" page:
The process died.
Your code probably took too long.
Maybe you have an infinite loop?
Do you have a chatroom or something where project devs can discuss work in realtime?
Is Python the language of choice to teach/learn programming nowadays?
I learned to code with Java like 10 years ago. I haven't touched it again after graduating, and to be honest I was not so fond of the language (too verbose and too much OOP for my taste), but I'm glad I learned the benefits of having static types and a compiler from the start.
Obviously, Python is closer to pseudo code than Java, which is also great when you just start learning.
I've had about 6 people in my work area learn Python. We're not employed as programmers, but we're scientists in an R&D team. My colleagues varied in experience, some had used Matlab in college, but none had programmed extensively.
The scripting aspect is great because you can get useful things done with a few lines and minimal constructs. There's a pile of solved problems on Stackoverflow. Folks can approach it at their own pace and on their own terms, at work or at home. I promise them that if they hate it, I'll refund the price. ;-)
Now, germane to the discussion in this thread about installing the packages and so forth. That's a drawback to Python that I tell people about, but my approach is to provide complete hand holding on installation until they've come up to speed on programming. If they've never programmed, then they've probably never approached a computer from the command line or dug into its file structure. So, learning a bit of programming first is a good way to prepare for doing that other stuff.
I originally learned BASIC without knowing how to install BASIC on the mainframe.
We're a Windows shop, so I help them download and install WinPython. Running into a useful library that WinPython doesn't have is pretty rare, and then a pip install within the WinPython shell usually works.
It's a fantastic scripting language, which makes it excellent for people who want to automate computing tasks. Researchers, journalists, academics, administrative roles, etc.
But it is very high-level, so it is not a great way to teach/learn how something like a computer, database, or operating system works.
I think it's a great way to start teaching or learning programming, since you can keep using it no matter where your career takes you.
Thanks! If you've wanted to make something similar, would you perhaps join me in making this? I'm looking for contributors.
Yes, enumerate is better, but I want to teach concepts one at a time and I want students to understand what's going on in code. Before teaching `for a, b in c` I want to teach `a, b = c`. And to motivate doing that I want to teach `return a, b`. At this point in the course they're only just starting to learn about lists - they haven't seen tuples and they've never defined a function. So it's not time yet.
Besides, it's important that students are intimately familiar with how to index a list and which indices are valid.
This seems to be non-standard, but I make a serious effort to never teach the wrong way to do something. Primacy is just so strong.
It makes teaching harder. You really have to work to come up with great examples. But it makes learning easier and there's less correcting to do later.
In general, I agree, and I think it’s especially important in text.
A teacher is in a position of authoritative trust. As a student, after I learn an example is flawed, I often wonder if I’m just missing some context because I trust the teacher to have gotten it right. In a setting where communication is already established (e.g., a class room) this can be cleared up quickly with a question. In other settings (e.g., reading a text), it can leave me wondering until I reach a much higher level of competence.
When the course reaches the point that students are ready to learn about enumerate (see my reply above) the course will definitely cover it and emphatically point out that it is the better way.
Take it for what it's worth, but that's exactly what I think is a bad idea. It violates the principle of primacy in learning[1]. It also erodes trust in the lessons. (Is this the real way to do it?)
There's a similar strategy of building things up and then refactoring when they get bad (this ifelse is getting too big. We could use a dict). But the difference is every step along the way is valid or immediately corrected.
The student has never seen a tuple, or iterable unpacking of any form. Would you just show them `for index, word in enumerate(words)` and tell them not to worry about what that means?
Quite shortly after this, I ask them to essentially do `zip_longest`. Given two string variables:
string1 = "Goodbye"
string2 = "World"
output:
G W
o o
o r
d l
b d
y
e
Here's what I expect their solution to look like:
length1 = len(string1)
length2 = len(string2)
if length1 > length2:  # one could use max, but I don't expect them to
    length = length1
else:
    length = length2
for i in range(length):
    if i < len(string1):
        char1 = string1[i]
    else:
        char1 = ' '
    if i < len(string2):
        char2 = string2[i]
    else:
        char2 = ' '
    print(char1 + ' ' + char2)
That's not something that can nicely be solved with enumerate. Do you think this exercise is bad because they should just use zip_longest instead?
I'd split the problem into two. You're talking about these problems like they're fixed, but they're not. This is your project.
I'd teach index access and number generation separately, with small incrementing variations. Then I'd show iteration, then enumeration (maybe after tuple unpacking).
When you combine two concepts it doesn't create just a combination. It creates one or more new concepts. Those new concepts have their own idioms and they should be taught properly.
Combining looping over numbers and index access creates two new concepts: looping over items and looping over items with their index. Both of those things have their own idiom in Python and your solution shows neither.
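To make that concrete, a rough sketch of the idioms in question (illustrative, not taken from the course):
from itertools import zip_longest

words = ["Goodbye", "World"]

# looping over items
for word in words:
    print(word)

# looping over items together with their index
for i, word in enumerate(words):
    print(i, word)

# and the pairing exercise above, using zip_longest instead of manual index checks
for char1, char2 in zip_longest("Goodbye", "World", fillvalue=' '):
    print(char1 + ' ' + char2)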
I think telling someone not to worry about the details of how something works because you'll get back to it later is way better than showing them the wrong thing and then correcting it. One is a promise kept, the other a promise broken.
In my opinion, nailing this kind of stuff is like half of the value of the project you're doing. You need to lean way in on it and get it right.
My take is right now you're thinking backwards from the position of someone who knows how to code. You need to think forward like someone who doesn't, at least more often. A student isn't going to be motivated to learn a, b = c by learning return a, b first because they won't know that second concept exists!
In short, if your students aren't ready to use enumerate or zip_longest, don't hand them problems that call for enumerate or zip_longest.
> I think telling someone not to worry about the details of how something works because you'll get back to it later is way better than showing them the wrong thing and then correcting it. One is a promise kept, the other a promise broken.
I'm not sure I'd say teaching someone to do something by hand for which they could use an existing function is "wrong". There's huge pedagogical value in knowing how powerful tools you didn't make work. Going bottom-up is a very effective way to do that (just ask the lisp folks), and a culmination of "nice job, you've implemented something so useful that it mirrors what the language designers/library authors did as well, here's how to use their version to save time in the future" is far from a broken promise.
> I'm not sure I'd say teaching someone to do something by hand for which they could use an existing function is "wrong".
I didn't say it was the wrong way to teach. I said the code was wrong. Like, if I removed myself from teaching and I just saw that in a PR, I'd definitely suggest they use enumerate instead.
I'm a huge fan of reimplementing built in functions as a way to learn. I'm learning Clojure right now (like I stopped to write this comment) and I do it all the time. But, I know I'm doing it.
The other thing that's fine is building up to the abstraction. "Let's get the index, now the item. Okay, there's a better way to do this". But you have to do it immediately.
I'm just not a fan of showing someone something, letting it sink in, and then having to go back and correct it.
And while I hope my arguments stand on their own, and I certainly could be wrong, I'm at least not speaking out of complete inexperience. I've spent a decent amount of time teaching people to code, juggle, work at rope courses, and fly airplanes.
This is a cool idea. In a similar vein, I use Google Colab heavily to induct new students to python. Requires no setup, we can collaboratively edit and the output is instantaneous.
I have struggled with virtual environments and runtime executables across various OSs.
What are current best practices?