I read an interesting article on OpenSource.com titled “Why Python is perfect for startups.” I’ve done the startup thing a couple of times now and I’ve spent a lot of time developing with Python. I just wanted to add a little balance to that article and point out a few things to consider before investing in Python as the foundation for your new business.
Yes, I know, I’m about to unleash a holy war. Putting down someone’s favorite language, tool, whatever, is bound to frustrate people. So let me put this argument to rest before we begin. These are my opinions based on my observations. You are 100% free to have different opinions than I have. And, if you can do so politely, you are absolutely welcome to voice your opinions in the comment section below.
With that out of the way, let’s highlight four big concerns:
1. There are fewer quality Python developers to hire
Python is a brilliant language. It’s easy to learn and it makes you productive quickly. It’s so easy to learn that introductory computer science classes, the kind targeted at business majors, often teach Python. It’s also often taught to younger students. There are a lot of people in the world who know Python and can write a program with it, including many non-programmers.
There’s an 80/20 rule for computer programming. The first 80% is easy and takes you 20% of the development time. The last 20% of the work, things like data validation, dealing with corner cases, and such, takes 80% of the time. Python makes the first 80% nearly effortless, so that anyone can do it. However, in every project, regardless of the programming language you use, the last 20% is hard and it stinks.
It’s because of this last 20% that you want a real programmer who knows what they’re doing. Someone who has already made expensive mistakes. Someone who expects corner cases and knows how to deal with them. Someone who understands the patterns of computer science, and for that matter, who brings the science part of computer science.
Python has fewer professional computer programmers for hire than some other popular languages. This is true worldwide in general, though there are some regions where there are higher concentrations of Python developers. We can go to oDesk and look at the contractors available to hire globally.
Let’s look at a nice chart:
This chart shows how many developers there are on oDesk that have a 4 star or higher rating and list a particular skill. That does not necessarily mean that the developers are skilled with this language, though. It’s probable the actual numbers are lower than what is listed. For example, a developer who has done mostly C# work and has earned a high rating may also list Ruby as a skill, even though they are not a four-star Ruby developer.
The key thing to note here is that there are far fewer developers who list Python as a skill. Keep in mind that with oDesk, you have every incentive to list every skill you possibly can in order to compete for more work.
What does this mean? Well, it means the supply of skilled developers to hire is smaller. That means it will take you longer to find a developer who meets your needs, and because the supply is smaller, the cost will be higher. Compare that to PHP or C#, which have 5-7x more developers available to hire.
2. Python performance is bad
Python is optimized for developer performance, not runtime performance. Its string processing performance is horrible, its threads can’t run Python code on more than one CPU core at a time (the famous “GIL” issue) and popular frameworks such as Django are pre-configured for safe (i.e. bad) performance.
That said, an experienced, skilled Python developer can get some great performance out of Python. Not the person who took introduction to computer science in college, but a person who has spent years working with Python in high-performance environments.
I can give three examples of performance problems I’ve dealt with in Python. All are anecdotal but every experienced Python developer will admit that these are things you have to learn to work around to be successful with Python.
String performance: I created a log processing and reporting application. It was built to pre-process log files and create an optimized, searchable database that could be used to generate reports on demand. The problem was that it was taking 27 hours to process 24 hours of log data. After beating my head on the desk for a while I rewrote it in C# and was able to reduce the time to process 24 hours’ worth of data to 15 minutes. After finding that solution I returned to Python to figure out why it was struggling. The culprit: string concatenation. I eventually found a workaround that got the Python run down to 45 minutes, with lower memory usage than C#, but that’s one big gotcha.
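To make the string concatenation gotcha concrete, here is a minimal sketch. It is not the code from the log processor described above, and the function names are made up for illustration. It contrasts building a large string with repeated `+=` against two common alternatives: collecting the pieces and joining once, or writing into an `io.StringIO` buffer. Repeated concatenation may copy everything built so far on each step; CPython can sometimes optimize this away, but it is not something to rely on.

```python
import io


def build_report_concat(lines):
    """Build the output with repeated +=.

    Each += may copy the whole string built so far, so the total work
    can grow quadratically with the number of lines.
    """
    report = ""
    for line in lines:
        report += line.upper() + "\n"
    return report


def build_report_join(lines):
    """Build the pieces lazily and join them once at the end."""
    return "".join(line.upper() + "\n" for line in lines)


def build_report_buffer(lines):
    """Write into an in-memory text buffer instead of rebuilding a string."""
    buf = io.StringIO()
    for line in lines:
        buf.write(line.upper())
        buf.write("\n")
    return buf.getvalue()
```

All three functions return the same text. The difference only shows up in time and memory on large inputs, so it is worth measuring on your own data before and after a change like this.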
Django is a great tool for helping you quickly build database web applications. However, I noticed that a simple web page, something similar to a super-basic Twitter, was greatly hindered by performance. I was only able to serve a few requests per second. I used the Django Debug Toolbar to inspect the performance on the page and found that Django was doing n+1 queries. So if there were 2 objects on the page, it was doing 3 queries. If there were 30 objects on the page, it was doing 31 queries.
It turns out there is an option you can use (with care) called select_related. Instead of doing n+1 queries, it fetches the related objects in the same query with a SQL join, so it does 1 query. Once I learned this and started implementing it, my application performance jumped noticeably.
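The n+1 pattern is easier to see outside the framework. The sketch below does not use Django at all. It uses the standard library `sqlite3` module with a made-up two-table schema (`author` and `post` are assumptions, not the tables from the app described above) to show the same page data fetched with n+1 queries and then with a single JOIN, which is roughly the kind of query `select_related` asks Django to generate for a foreign key.

```python
import sqlite3


def make_db():
    """Create a tiny in-memory database with a hypothetical schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE author (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE post (
            id INTEGER PRIMARY KEY,
            author_id INTEGER REFERENCES author(id),
            body TEXT
        );
        INSERT INTO author VALUES (1, 'ann'), (2, 'bob');
        INSERT INTO post VALUES (1, 1, 'hello'), (2, 2, 'hi'), (3, 1, 'bye');
        """
    )
    return conn


def timeline_n_plus_one(conn):
    """One query for the posts, then one more query per post for its author."""
    queries = 1
    posts = conn.execute(
        "SELECT id, author_id, body FROM post ORDER BY id"
    ).fetchall()
    rows = []
    for _post_id, author_id, body in posts:
        (name,) = conn.execute(
            "SELECT name FROM author WHERE id = ?", (author_id,)
        ).fetchone()
        queries += 1
        rows.append((name, body))
    return rows, queries


def timeline_joined(conn):
    """Fetch posts and their authors together in a single JOIN query."""
    rows = conn.execute(
        "SELECT author.name, post.body "
        "FROM post JOIN author ON author.id = post.author_id "
        "ORDER BY post.id"
    ).fetchall()
    return rows, 1
```

With three posts, `timeline_n_plus_one` issues four queries and `timeline_joined` issues one, and both return the same rows. In Django, with a hypothetical `Post` model that has an `author` foreign key, the equivalent change would look something like `Post.objects.select_related("author")`.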
Scaling Python is hard. If you want to read an interesting account, check out my post on how Reddit was melting one of my customer’s servers. We were never able to get acceptable performance from the Python app until we replaced the dynamic content with static HTML. A big problem is that a single Python process doesn’t make good use of multiple CPU cores, because the GIL lets only one thread run Python code at a time. We had a beefy server that was mostly idle, even though the site was getting hammered and site visitors were getting 503 errors. Yes, there are solutions to this, but again, you need experience.
In case you’d like to dismiss my claims, I built the CMS that runs (last I checked) Ubuntu.com, a very busy site that handles massive spikes in traffic without batting an eye. The CMS is built with Python. It works by building a static HTML copy of the site, save for a few key pages. Performance is awesome because it’s hard to beat static HTML for performance.
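One of the usual ways around the single-core limit is to run several Python processes instead of several threads, since each process has its own interpreter and its own GIL. The sketch below is not what was done for the site described above (that fix was static HTML). It only shows the general idea with the standard library, using a deliberately CPU-heavy toy function whose name and workload are assumptions for illustration.

```python
from concurrent.futures import ProcessPoolExecutor


def count_primes_below(limit):
    """CPU-bound toy workload: count primes below limit by trial division."""
    count = 0
    for n in range(2, limit):
        if all(n % d for d in range(2, int(n ** 0.5) + 1)):
            count += 1
    return count


def count_primes_parallel(limits, workers=None):
    """Run one count per limit across worker processes.

    Each worker is a separate Python process, so the operating system
    can schedule them on different cores. workers=None lets the
    executor pick a default based on the machine.
    """
    with ProcessPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(count_primes_below, limits))
```

Call `count_primes_parallel` from under an `if __name__ == "__main__":` guard, which is required on platforms that start workers by re-importing the module. Web deployments usually apply the same idea one level up, by running several worker processes of the application behind a web server rather than a process pool inside the app.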
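The static-copy approach can be sketched in a few lines. This is not the CMS mentioned above, just a minimal illustration under stated assumptions: pages live in a plain dictionary, the template is a hypothetical one-liner, and the output is one `.html` file per page that any web server can hand out without running Python on each request.

```python
import html
import tempfile
from pathlib import Path
from string import Template

# Hypothetical page template; a real CMS would use a full template engine.
PAGE = Template(
    "<!doctype html>\n"
    "<html><head><title>$title</title></head>\n"
    "<body><h1>$title</h1><p>$body</p></body></html>\n"
)


def build_site(pages, out_dir):
    """Render every page to a static HTML file and return the paths written."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = []
    for slug, page in pages.items():
        target = out / f"{slug}.html"
        target.write_text(
            PAGE.substitute(
                title=html.escape(page["title"]),
                body=html.escape(page["body"]),
            ),
            encoding="utf-8",
        )
        written.append(target)
    return written


if __name__ == "__main__":
    # Build a one-page demo site into a throwaway directory.
    with tempfile.TemporaryDirectory() as demo_dir:
        for path in build_site(
            {"index": {"title": "Home", "body": "Hello, world"}}, demo_dir
        ):
            print(path.name)
```

The expensive rendering work happens once, when content changes, instead of on every page view. The trade-off is that anything truly dynamic still needs its own handling, which matches the "save for a few key pages" caveat above.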
3. The Python developer community is seriously divided
A few years ago, the Python leadership decided to make a big change. They initially called this change Python 3000. They discussed it for years and planned it far in advance. Finally, the decision was made: Python 2 would cease development and Python 3 would take over. Dates were set and work began.
The problem is that Python 3 was incompatible with Python 2 in subtle (and not so subtle) ways. Much of the code that the Python world depended on was written for 2.x and was going to take a long time to port over to Python 3.
Concessions were made and additional releases were planned to keep Python 2.x users happy. Years went by while the Python leadership pushed and pulled developers to Python 3 and the Python community resisted the change.
To this day, many books, tutorials, videos and, most importantly, critical libraries are still incompatible with Python 3. Python 3 was released 6 years ago on December 3rd, 2008. Maybe 2015 will be the year that Python 2 goes away. Maybe.
4. Deploying Python is hard
This is not a problem unique to Python. Java is hard to deploy. Node.js is hard to deploy. Lots of things are hard to deploy. There are only two things that aren’t hard to deploy: PHP and plain HTML.
Really, this comes back around to the talent pool issue again. The number of people who have deployed a Python app that can scale for bursts of traffic is relatively small. The number of people writing and talking about it is pretty small. I’m one of them; you can read my three-part series at Digital Ocean if you need some help.
This is an important concept to get right. The first time an app I’d built with Python got a little publicity, the website died. There is absolutely nothing worse for a startup than asking the press to check out your product and having them say, “What product? All I see is an error.” Getting a first chance is hard. Getting a second chance is killer hard.
I think this is a major contributor to the popularity of WordPress. The PHP and WordPress communities have worked hard to make deployment drop-dead simple. In most cases, you upload and you’re done.
I don’t think the difficult deployment issue is a deal breaker for a big app, but if you’re a startup (check the title, that’s the topic for this post), fast turn-around and sites that stay running under load are the secrets to success.
Conclusion – what to choose?
OK, Python has problems, you admit, maybe grudgingly. What should you choose, then? Choose the tool you and your development team like the best and that works best for you.
Do you have a very small budget? You can get good PHP developers cheaper than just about anything else. You may find it’s better to do the coding yourself, in which case, use the tool you and your buds know best. Just keep in mind that you may need to scrap your site and start over. (PHP haters, I’ll point out that Facebook is built on PHP, as is WordPress, the CMS that powers about a quarter of the Internet.)
Is performance critical? Java and C# both have proven track records of handling massive loads. Node.js is emerging as a very fast solution. PHP performance is well documented.
What if you really want to use Python? Then do it! Disqus handles a billion visitors per month to their website, which is built with Python. Python has historically been used heavily at Google and YouTube as well. It’s possible to be successful with Python. You just have to know about and accommodate the issues I’ve mentioned above.
The goal of this article is to help you make an informed decision. You’ve been informed. My work here is done.