
Four reasons why Python isn’t right for your startup

Matthew Nuzum —  — 8 Comments

I read an interesting article on OpenSource.com about Why Python is perfect for startups. I’ve done the startup thing a couple times now and I’ve spent a lot of time developing with Python. I just wanted to add a little balance to that article and point out a few things to consider before investing in Python as the foundation for your new business.

Yes, I know, I’m about to unleash a holy war. Putting down someone’s favorite language, tool, whatever, is bound to frustrate people. So let me put this argument to rest before we begin. These are my opinions based on my observations. You are 100% free to have different opinions than I have. And, if you can do so politely, you are absolutely welcome to voice your opinions in the comment section below.

With that out of the way, let’s highlight four big concerns:

1. There are fewer quality Python developers to hire

Python is a brilliant language. It’s easy to learn and it makes you productive quickly. It’s so easy to learn that introductory computer science classes, the kind targeted at business majors, often teach Python. It’s also often taught to younger students. There are a lot of people in the world who know Python and can write a program with it, including many non-programmers.

There’s an 80/20 rule for computer programming. The first 80% is easy and takes you 20% of the development time. The last 20% of the work, things like data validation, dealing with corner cases, and such, takes 80% of the time. Python makes the first 80% nearly effortless, so that anyone can do it. However, in every project, regardless of the programming language you use, the last 20% is hard and it stinks.

It’s because of this last 20% that you want a real programmer who knows what they’re doing. Someone who has already made expensive mistakes. Someone who expects corner cases and knows how to deal with them. Someone who understands the patterns of computer science and, for that matter, who brings the science part of computer science.

Python has fewer professional computer programmers for hire than some other popular languages. This is true worldwide in general, though there are some regions with higher concentrations of Python developers. We can go to oDesk and look at the contractors available to hire globally.

Let’s look at a nice chart:

[Chart: developers-for-hire]

This chart shows how many developers there are on oDesk that have a 4 star or higher rating and list a particular skill. That does not necessarily mean that the developers are skilled with this language, though. It’s probable the actual numbers are lower than what is listed. For example, a developer who has done mostly C# work and has earned a high rating may also list Ruby as a skill, even though they are not a four-star Ruby developer.

The key thing to note here is that there are far fewer developers who list Python as a skill. Keep in mind that with oDesk, you have every incentive to list every skill you possibly can in order to compete for more work.

What does this mean? Well, it means the supply of skilled developers to hire is smaller. That means it will take you longer to find a developer that meets your needs, and because the supply is smaller, the cost will be higher. Compare that to PHP or C# which have 5-7x more developers available to hire.

2. Python performance is bad

Python is optimized for developer performance, not runtime performance. Its string processing performance is horrible, it cannot run threads in parallel across CPU cores (the famous “GIL” issue) and popular frameworks such as Django are pre-configured for safe (i.e. bad) performance.

That said, an experienced, skilled Python developer can get some great performance out of Python. Not the person who took introduction to computer science in college, but a person who has spent years working with Python in high-performance environments.

I can give three examples of performance problems I’ve dealt with in Python. All are anecdotal but every experienced Python developer will admit that these are things you have to learn to work around to be successful with Python.

String performance: I created a log processing and reporting application. It was built to pre-process log files and create an optimized, searchable database that could be used to generate reports on demand. The problem was that it was taking 27 hours to process 24 hours of log data. After beating my head on the desk for a while I rewrote it in C# and was able to reduce the time to process 24 hours’ worth of data to 15 minutes. After finding that solution I returned to Python to figure out why it was struggling. The culprit: string concatenation. I was able to work out a solution that got the processing time down to 45 minutes, with lower memory usage than C#, but that’s one big gotcha.
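The string concatenation gotcha can be illustrated with a minimal sketch. This is not the original log processor: the function names and sample log lines are made up for illustration. Building a large string with repeated `+=` may copy the growing string on each step (CPython can sometimes optimize this in place, so results vary), while collecting the pieces in a list and calling `str.join` once builds the result in a single pass.

```python
def build_report_concat(lines):
    """Build the output by repeated concatenation (can degrade to quadratic time)."""
    out = ""
    for line in lines:
        out += line.upper() + "\n"  # may copy everything accumulated so far
    return out


def build_report_join(lines):
    """Build the output by collecting pieces and joining them once."""
    parts = []
    for line in lines:
        parts.append(line.upper())
        parts.append("\n")
    return "".join(parts)  # single allocation for the final string


# Hypothetical log lines, for illustration only.
sample = ["GET /index.html 200", "GET /missing 404"]
report = build_report_join(sample)
```

Both functions return the same text; only the way the result is assembled differs. Whether the difference matters depends on input size and interpreter version, so it is worth measuring on real data.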

Django is a great tool for helping you quickly build database web applications. However, I noticed that a simple web page, something similar to a super-basic Twitter, was greatly hindered by performance. I was only able to serve a few requests per second. I used the Django Debug Toolbar to inspect the performance on the page and found that Django was doing n+1 queries. So if there were 2 objects on the page, it was doing 3 queries. If there were 30 objects on the page, it was doing 31 queries.

It turns out there is an option you can use (with care) called select_related. Instead of doing n+1 queries it does 1 query. Once I learned this and started implementing it, my application performance jumped noticeably.
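To make the n+1 pattern concrete, here is a small sketch that uses the standard library’s `sqlite3` module instead of Django, so it runs on its own. The `author` and `post` tables, the `run` helper and the query counter are all hypothetical. The first function issues one query for the posts plus one per post for its author; the second fetches the same data with a single JOIN, which is roughly what `select_related` asks the ORM to do for a foreign key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE author (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE post (id INTEGER PRIMARY KEY, author_id INTEGER, body TEXT);
    INSERT INTO author VALUES (1, 'alice'), (2, 'bob');
    INSERT INTO post VALUES (1, 1, 'hello'), (2, 2, 'hi'), (3, 1, 'bye');
""")

query_count = 0


def run(sql, params=()):
    """Execute a query and count it, standing in for a debug toolbar."""
    global query_count
    query_count += 1
    return conn.execute(sql, params).fetchall()


def timeline_n_plus_one():
    """One query for the posts, then one extra query per post for its author."""
    rows = []
    for _post_id, author_id, body in run(
        "SELECT id, author_id, body FROM post ORDER BY id"
    ):
        (name,) = run("SELECT name FROM author WHERE id = ?", (author_id,))[0]
        rows.append((name, body))
    return rows


def timeline_joined():
    """The same data in a single query, using a JOIN."""
    return run(
        "SELECT author.name, post.body FROM post "
        "JOIN author ON author.id = post.author_id ORDER BY post.id"
    )


query_count = 0
slow = timeline_n_plus_one()
slow_queries = query_count  # 1 query for posts + 1 per post

query_count = 0
fast = timeline_joined()
fast_queries = query_count  # a single JOIN query
```

In a Django project the equivalent change would be something along the lines of `Post.objects.select_related("author")`, where `Post` and `author` are assumed model and field names.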

Scaling Python is hard. If you want to read an interesting account, check out my post on how Reddit was melting one of my customer’s servers. We were never able to get acceptable performance from the Python app until we replaced the dynamic content with static HTML. A big problem is that Python doesn’t utilize multiple CPU cores because it’s not multi-threaded. We had a beefy server that was mostly idle, even though the site was getting hammered and site visitors were getting 503 errors. Yes, there are solutions to this, but again, you need experience.

In case you’d like to dismiss my claims, I built the CMS that runs (last I checked) Ubuntu.com, a very busy site that handles massive spikes in traffic without batting an eye. The CMS is built with Python. It works by building a static HTML copy of the site, save for a few key pages. Performance is awesome because it’s hard to beat static HTML for performance.
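The idea of publishing a static copy can be sketched in a few lines. This is not the Ubuntu.com CMS, just an illustration under simple assumptions: the page data, the template and the `publish_static` function are all hypothetical, and the page bodies are assumed to be trusted HTML. Each page is rendered once at publish time and written to disk, so the web server only has to serve files.

```python
import tempfile
from pathlib import Path
from string import Template

# Hypothetical page template; a real CMS would use a richer template engine.
PAGE = Template(
    "<html><head><title>$title</title></head><body>$body</body></html>"
)


def publish_static(pages, out_dir):
    """Render every page once and write it out as a plain .html file."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for slug, page in pages.items():
        target = out_dir / f"{slug}.html"
        html = PAGE.substitute(title=page["title"], body=page["body"])
        target.write_text(html, encoding="utf-8")
        written.append(target)
    return written


# Made-up content; in practice this would come from the CMS database.
pages = {
    "index": {"title": "Home", "body": "<h1>Welcome</h1>"},
    "about": {"title": "About", "body": "<p>About this site.</p>"},
}
out_dir = tempfile.mkdtemp()
files = publish_static(pages, out_dir)
```

The trade-off is that content changes only appear after the next publish run, which is why a few key pages may still need to stay dynamic.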

3. The Python developer community is seriously divided

A few years ago, the Python leadership decided to make a big change. They initially called this change Python 3000. They discussed it for years and planned it far in advance. Finally, the decision was made: Python 2 would cease development and Python 3 would take over. Dates were set and work began.

The problem is that Python 3 was incompatible with Python 2 in subtle (and not so subtle) ways. Much of the code that the Python world depended on was written for 2.x and was going to take a long time to port over to Python 3.
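A few of the well-known differences are enough to show how code written for 2.x can break or silently change behavior. The snippet below is an illustrative sample, not a complete list; it runs on Python 3, and the comments note what Python 2 did with the same expressions.

```python
# 1. Dividing two integers
true_div = 3 / 2    # 1.5 in Python 3; Python 2 returned 1
floor_div = 3 // 2  # 1 in both versions

# 2. Text versus bytes
text = "café"
raw = text.encode("utf-8")  # bytes object, 5 bytes long
same = (b"abc" == "abc")    # False in Python 3; True in Python 2, where str was bytes

# 3. dict.keys() returns a view instead of a list
keys = {"a": 1}.keys()
is_list = isinstance(keys, list)  # False in Python 3; True in Python 2

# 4. print is a function; the Python 2 statement `print "hello"` is a
#    SyntaxError in Python 3
print("hello")
```

The division and bytes changes are the subtle kind: the old code still runs, but produces different results.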

Concessions were made and additional releases were planned to keep Python 2.x users happy. Years went by while the Python leadership pushed and pulled developers to Python 3 and the Python community resisted the change.

To this day, many books, tutorials, videos and most importantly, critical libraries still are incompatible with Python 3. Python 3 was released 6 years ago on December 3rd, 2008. Maybe 2015 will be the year that Python 2 goes away. Maybe.

4. Deploying Python is hard

This is not a problem unique to Python. Java is hard to deploy. Node.js is hard to deploy. Lots of things are hard to deploy. There are only two things that aren’t hard to deploy: PHP and plain HTML.

Really, this comes back around to the talent pool issue again. The number of people who have deployed a Python app that can scale for bursts of traffic is relatively small. The number of people writing and talking about it is pretty small. I’m one of them; you can read my three-part series at Digital Ocean if you need some help.

This is an important concept to get right. The first time one of my apps I’d built with Python got a little publicity, the website died. There is absolutely nothing worse for a startup than asking the press to check out your product and have them say, “What product? All I see is an error.” Getting a first chance is hard. Getting a second chance is killer hard.

I think this is a major contributor to the popularity of WordPress. The PHP and WordPress communities have worked hard to make deployment drop-dead simple. In most cases, you upload and you’re done.

I don’t think the difficult deployment issue is a deal breaker for a big app, but if you’re a startup (check the title, that’s the topic for this post), fast turn-around and sites that stay running under load are the secrets to success.

Conclusion – what to choose?

OK, Python has problems, you admit, maybe grudgingly. What should you choose, then? Choose the tool you and your development team like the best and that works best for you.

Do you have a very small budget? You can get good PHP developers cheaper than just about anything else. You may find it’s better to do the coding yourself, in which case, use the tool you and your buds know best. Just keep in mind that you may need to scrap your site and start over. (PHP haters, I’ll point out that Facebook is built on PHP, as is WordPress, the CMS that powers about a quarter of the Internet.)

Is performance critical? Java and C# both have proven track records of handling massive loads. Node.js is emerging as a very fast solution. PHP performance is well documented.

What if you really want to use Python? Then do it! Disqus handles a billion visitors per month to their website, which is built with Python. Python has historically been used heavily at Google and YouTube as well. It’s possible to be successful with Python. You just have to know about and accommodate the issues I’ve mentioned above.

The goal of this article is to help you make an informed decision. You’ve been informed. My work here is done.


Matthew Nuzum


Web guy, big thinker, loves to talk and write. Front end web, mobile and UX developer for John Deere ISG. My projects: @dsmwebgeeks @tekrs @squaretap ✝
  • newz2000

    I do want to toss out an exception for a couple industries, specifically math/data and life science, where some of the best tools are Python based.

  • sherzberg

    Your first few paragraphs suggest Python is bad for startups in general, but the rest of your bullet points only discuss using Python for building CMSs. CMSs typically have a very different use case than startups seen recently. CMSs are typically read heavy, which of course static HTML sites would be better at. Bullet 2 unfairly throws Python under the bus when comparing it to static HTML. This can be said about any language and framework.

    In bullet 2 you are also saying Python is not performant because of Django. How is it Python’s fault that Django does extra queries? Is Python forcing Django to make this query? No, you use a different Django API to tune the queries. This is totally unrelated to Python and also unfair. I think it’s fair to say “if you are unfamiliar with Django, don’t use it for a startup” but saying Python is not performant because of Django seems wrong. You even state in your linked article (http://www.bearfruit.org/2013/04/19/reddit-is-melting-our-server-heres-what-we-did-nginx-apache-django-and-mysql/) that “It is my opinion that the Django app was not coded properly to handle this load”. Anyone who thinks a single-server, non-static, dynamic-content site will not be taken down by the “reddit hug” will get 503s.

    I would also argue that quantity of developers doesn’t necessarily mean higher quality. I could also say “I’ve seen way more lower quality developers in PHP than I have in Python, therefore I should use Python” which doesn’t make sense either.

    I also take offense to singling out a specific language for saying it’s bad for startups. This article could easily be about nodejs and ruby. I would think you would want your goal to be “here are things startups should avoid” and list things like ease of deployment, fast iterations, performance when it matters.

    Do I work for a startup, yes. Do we use Python, yes. Are there growing pains, yes. I am a firm believer that any startup that needs to deal with any scaling issues will deal with any of these problems with any language. Scaling takes expertise no matter what.

    • newz2000

      Good points, but a couple issues:

      1. If you can’t hire devs, you’re going to have problems at every corner. This is a real issue startups need to consider.

      2. The first performance example was not Django specific, it was a daemon process. This is another issue startups need to consider.

      3. If 20% of Python devs are good and 10% of PHP devs are good, and there are 7x more PHP devs then there are more good PHP devs. These are made-up numbers to illustrate the point that having a larger pool of talent to select from gives startups more choices. In the figures I quoted in the article, these are developers who had been given a good rating by their employers. We don’t know why they were given a good rating, but I think it stands to reason that highly rated PHP developers and highly rated Python developers both bring value to the table that startups want.

      I singled out Python because the source article also singled out Python. You are right, there are good and bad aspects to every choice. I see a lot of startups choose Python because the founder learned a little bit of it in school. That’s not a good reason.

  • ©ameron Tarbell

    So glad I didn’t bother with Python… Thought about it for a while

  • Michal Slonina

    Matthew, I can’t disagree more. I’ll present my viewpoint:

    1. You present biased statistics. Let me present biased ones too: http://langpop.com/. Python comes with a scientific tool set that other languages can only dream about (pandas, numpy, scipy, numba, scikit-learn, pytables). Due to its strong presence in the academic world, every recent CS grad knows a bit about Python.

    2.
    (A) STRING CONCAT SPEED: Your problem is related to your lack of experience with scripting languages. You would hit the same problem in PHP, Ruby, Lua (insert your duck typed language here). This solution is mentioned in https://wiki.python.org/moin/PythonSpeed/PerformanceTips. Please don’t blame the language for your lack of experience. If performance is critical try http://cython.org/ – you will be able to run circles around C# speed with low level access to memory.
    (B) THREADING: Here comes the lack of experience with devops. Your problem was related to web server setup. Forks are cheap – you can fork 100k processes/s on a decent OS. I would suggest avoiding threading at all costs, as it is the shortest path to bringing down production systems. Multithreading is hard to debug, hard to write (synchronization) and complicates simple architectures. If you are in a startup always remember the http://en.wikipedia.org/wiki/KISS_principle.
    (C) SCALABILITY: Computer languages do not scale. Software architectures do! PHP interpreter is slower as well. Yet Facebook was implemented in PHP – and it did scale well. Put a load balancer and a reverse proxy cluster in front if you have a problem with static page performance.
    3. The Python developer community is not divided. Everyone is migrating to Python 3. But there is still a lot of good old code that remains to be ported. (But I do agree that lack of backwards compatibility is inconvenient.)
    4. https://wiki.python.org/moin/ConfigurationAndBuildTools – if you can’t handle deployment in a startup, that means you need to enhance your talent pool. I don’t see a difference between solid python and php deployment systems.

    My personal view is that Python is a perfect match for startups. I have worked with incubating companies for the last 10 years of my career.

    • newz2000

      Thanks for the reply and the comments. I fully understand that there are differing perspectives. Also, I mentioned in the comments that the math, big data and life science fields are exceptions where Python is absolutely one of the best tools to use.

      I am not convinced your other points successfully refute mine. Hiring Python developers is a very real issue. It does vary in different regions, but I have worked with several companies who essentially had to train their own developers because hiring Python devs is too hard.

      By the way, as I mentioned in the closing paragraphs, if someone knows and wants to use Python, I fully support that. I just wanted to highlight some of the problems they’ll face in doing so. I like Python and still use it. I’ve also had to work through many of these challenges on my own and now can cope with them.

      In your second point you mentioned two solutions to the performance issue (B and C) that are very challenging to master. Also, I disagree that all scripting languages have poor concat speed, at least on the scale that Python does. When I last tested this, which admittedly was 2-3 years ago, Python was way worse than PHP. It is very possible things have improved since then.

      I don’t think it’s wise to pretend the division doesn’t exist. You say the community isn’t divided, but then that the community is divided both in the same paragraph. You say you’ve coached startups who use Python – certainly you’ve seen the challenge of this division. There are many docs, tutorials and videos that don’t indicate what version of Python they’re using and it’s easy to get off track because you choose 3.4 and they chose 2.7 or vice-versa.

      It is getting better, but we’re not there yet.

      • Michal Slonina

        BTW, I just ran a few tests on string concat in PHP and Python. Here are the results:

        PHP – 2.70s: php -r '$str=""; for($i=0; $i<300000; $i++) { $str=$str . "."; }'
        Python 2.7 – 0.08s: echo -e "s=''\nfor x in range(0, 300000):\n\ts=s+'.'\n" | python
        Python 3 – 0.10s: echo -e "s=''\nfor x in range(0, 300000):\n\ts=s+'.'\n" | python3
        NodeJS 0.10 – 0.06s: echo "s=''; for(i=0;i<300000; i++) {s=s+'.';}" | time node

        I've made them easy to reproduce, just copy and paste to a UNIX shell.

        I know that microbenchmarks suck but it looks like PHP’s string concat is much slower than Python’s. Can you give us a counterexample?

      • Michal Slonina

        Regarding developer availability: at one company I work for, Python developer training takes 10h for non-CS grads. After this training they are ready to code basic ETL processes as technical account consultants. Anyway, try a simple thought experiment: imagine switching a developer from C# to Python and vice versa, and ask yourself which direction would be easier.