What’s behind Python?

11 April 2019
Python, first released in 1991, is now the main programming language used in Machine Learning, Big Data and Data Science. It makes it possible to automate repetitive tasks where human input adds little value, and to build application prototypes rapidly. But what curious features are hiding behind this widely used language?

An origin worthy of Hollywood.

The creator of the Python language, a Dutch programmer called Guido van Rossum, first worked on a language called ABC. He later needed a scripting language for the Amoeba distributed operating system project. ABC itself never took off: without the internet, it could not benefit from the improvements made possible by sharing and user feedback. Thus, Python was created as its successor.

Python is one of the “Three Ps”, i.e. three scripting languages that were all launched in the same period. The other two are:

  • Perl: a very powerful language for text processing with regular expressions, though its readability is debatable. Perl was designed to extend the possibilities of the Unix shell; its name is commonly expanded as "Practical Extraction and Report Language".
  • PHP: originally a very simple language, which has since changed a lot; it is very tolerant of sloppy code. It was designed for “Personal Home Page” web pages, and the name now stands for “PHP: Hypertext Preprocessor”.

 

Fun fact: the name of the language discussed in this article reflects Guido’s love of “Monty Python’s Flying Circus”, a surreal comedy series created by the British comedy group Monty Python. One of their famous sketches is the origin of the word “spam” in the sense of undesired goods that are forced on us. That’s why small Python programmes use placeholder names like spam, ham and eggs.

Another amusing fact: until 2018, the creator of the language held the title of BDFL (Benevolent Dictator For Life) within the Python community. This meant that he had the final say when it came to decisions concerning the development of the language.

 

… A language based on simplicity

The first and foremost objective of Python is readability through simplicity. Some developers claim they can read Python code as easily as English.

The “Rosetta Code” site compares programmes that perform the same task in several hundred different programming languages. To illustrate the simplicity of Python, we have chosen a short case: implementing Euclid's algorithm, i.e. computing the greatest common divisor (GCD) of two numbers.

 

Python:

def gcd(u, v):
    return gcd(v, u % v) if v else abs(u)

 

Perl 6 (Raku):

sub gcd (Int $a is copy, Int $b is copy) {
    $a & $b == 0 and fail;
    ($a, $b) = ($b, $a % $b) while $b;
    return abs $a;
}

 

C:

int gcd(int u, int v) {
    return (v != 0) ? gcd(v, u % v) : u;
}

 

C++:

#include <boost/math/common_factor.hpp>

int gcd(int n, int m)
{
    // Delegate to Boost's greatest common divisor (not lcm).
    return boost::math::gcd(n, m);
}

 

These snippets suggest that a task can be written more quickly in Python than in C++, with noticeably less boilerplate (no type declarations, headers or braces). The reason why Python is not systematically chosen for IT projects is mainly, but not only, its execution time.
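As a complement to the recursive Python version above, here is a minimal sketch of the same algorithm written as a loop. The name gcd_iterative is chosen for this illustration only; the standard library's math.gcd is used purely as a cross-check.

```python
import math


def gcd_iterative(u, v):
    """Greatest common divisor by Euclid's algorithm, written as a loop."""
    while v:
        # Replace (u, v) with (v, u mod v) until the remainder is zero.
        u, v = v, u % v
    return abs(u)


# Cross-check against the standard library on a few pairs.
pairs = [(48, 18), (17, 5), (0, 9), (-12, 8)]
results = [gcd_iterative(u, v) for u, v in pairs]
print(results)  # [6, 1, 9, 4]
print(all(gcd_iterative(u, v) == math.gcd(u, v) for u, v in pairs))  # True
```

In everyday code, calling math.gcd directly is the simplest option.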

 

Python is also known for its gentle learning curve: the ratio of [skill level obtained] to [time spent learning] is higher than in most other languages. It is no coincidence that this language is gradually being adopted to teach computer science, following BASIC, Pascal and Java.

Python is also very expressive. This characteristic is strengthened by its libraries, in particular “itertools” and “collections”.

A simple example illustrating the simplicity of Python:

# pair up the elements of the two sequences
list(zip([1, 2, 3], ['a', 'b', 'c']))

result: [(1, 'a'), (2, 'b'), (3, 'c')]
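Two common follow-ups to the zip example above, as a small sketch (the variable names are chosen for illustration): turning the pairs into a lookup table, and splitting them back into separate sequences.

```python
numbers = [1, 2, 3]
letters = ['a', 'b', 'c']

# Build a lookup table from the pairs produced by zip.
lookup = dict(zip(numbers, letters))
print(lookup)  # {1: 'a', 2: 'b', 3: 'c'}

# "Unzip": the * operator feeds the pairs back into zip,
# which regroups them into the two original sequences (as tuples).
pairs = list(zip(numbers, letters))
first, second = zip(*pairs)
print(first)   # (1, 2, 3)
print(second)  # ('a', 'b', 'c')
```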

Example of using itertools, a module that provides iterator building blocks, inspired by other programming languages, for efficient looping:

 

# This programme computes the power set (all subsets) of an iterable
import itertools

def powerset(iterable):
    xs = list(iterable)
    return list(itertools.chain.from_iterable(itertools.combinations(xs, n) for n in range(len(xs) + 1)))
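To see what the function returns, here is a short usage sketch. The definition is repeated so that the snippet runs on its own; the input list is an arbitrary example.

```python
import itertools


def powerset(iterable):
    xs = list(iterable)
    # combinations(xs, n) yields every subset of size n;
    # chaining n = 0 .. len(xs) yields every subset.
    return list(itertools.chain.from_iterable(
        itertools.combinations(xs, n) for n in range(len(xs) + 1)))


subsets = powerset([1, 2, 3])
print(subsets)
# [(), (1,), (2,), (3,), (1, 2), (1, 3), (2, 3), (1, 2, 3)]
print(len(subsets))  # 8, i.e. 2 ** 3
```

The number of subsets doubles with each extra element, so this is only practical for small inputs.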

Example of using collections, a module that implements more specialised container types than the built-in ones:

 

# This programme finds the 10 most frequent words in the play Hamlet (it is assumed that the text has been saved in the file hamlet.txt)
import re
import collections

words = re.findall(r'\w+', open('hamlet.txt').read().lower())
collections.Counter(words).most_common(10)
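The same idea can be tried without a file. The sketch below applies Counter to a single quoted line, chosen here only as sample input.

```python
import collections
import re

# Sample input for illustration; any string works.
text = "To be, or not to be, that is the question"

# Lower-case the text and split it into words.
words = re.findall(r"\w+", text.lower())

counts = collections.Counter(words)
top_two = counts.most_common(2)
print(top_two)  # [('to', 2), ('be', 2)]
```

Words with equal counts are listed in the order in which they were first seen.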

 

Python, simple even in terms of access

MATLAB is not free and requires a licence. Java is free, but IBM pays Oracle to integrate Java into its products. Objective-C mainly serves Apple's platforms, while C# (pronounced ‘C-Sharp’) mainly serves Microsoft's. There is a free implementation of C#: Mono.

NB: gcc/g++, a reference compiler for C/C++, is free and very powerful.

Unlike these languages, the rights to Python have been held (since 2001) by the Python Software Foundation, a non-profit organisation. The licence is FLOSS (“Free/Libre and Open Source Software”), like those of Linux, Ubuntu, LibreOffice, Mozilla Firefox, Mono (a clone of the Microsoft .NET platform), Apache Web Server and VLC player.

 

Which languages remain "high level"?

High-level languages are not concerned with the nuts and bolts of the machine running the programme. They focus on the problem without taking into account the technical characteristics of the hardware used. Some developers criticise these languages for taking away control over what the machine actually executes.

A typical example of a very high level (declarative) language is Prolog; let’s look at this simple example:

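% father(Child, Father)
father(louis10, philippe4).
father(jean1, louis10).

% Z is the grandfather of X if the father of X is Y and the father of Y is Z
grandfather(X, Z) :- father(X, Y), father(Y, Z).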

 

The interpreter will be able to work out that Jean's grandfather is Philippe. With additional facts and queries, it also becomes easy to browse the family tree of the kings of France.
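For comparison, here is a rough sketch of the same query in Python. The dictionary father_of and the function grandfather are names invented for this illustration. Unlike the Prolog rule, which only declares the relationship, the Python version has to spell out how the lookup is performed, and it only answers the question in one direction (from child to grandfather).

```python
# father_of[child] = father, mirroring the facts father(Child, Father).
father_of = {
    "louis10": "philippe4",
    "jean1": "louis10",
}


def grandfather(child):
    """Return the paternal grandfather of child, or None if unknown."""
    father = father_of.get(child)
    # Looking up None (unknown father) simply returns None again.
    return father_of.get(father)


print(grandfather("jean1"))    # philippe4
print(grandfather("louis10"))  # None
```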

A typical example of a very low level language is assembly language. Its low-level nature compels the programmer to comment, i.e. to explain, each source line, which would otherwise be incomprehensible.

Python is a “high level” language in that it is much closer to the first example than to the second.

A well-known distinguishing feature among high-level languages is the presence (or absence) of a "garbage collector".

An original way to explain a garbage collector is to imagine using two dishwashers: one has a post-it marked "clean" on it, while the other has a post-it marked "dirty". In the beginning, all the (clean) dishes are in the first machine and the second machine is empty. When you need a knife, you take it from the "clean" dishwasher and, after you have used it, place it in the "dirty" dishwasher. When the "clean" dishwasher is empty, you run the "dirty" machine, which now holds everything, and swap the post-it notes!
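The two-machine picture loosely resembles what is known as a two-space (copying) collector: everything still in use is moved to a fresh space, and the two spaces then swap roles. The toy sketch below illustrates that idea only; the function collect and the dictionary layout are invented for this example, and real collectors, including Python's own, work differently.

```python
def collect(heap, roots):
    """Copy every object reachable from roots into a fresh space.

    heap maps an object name to the list of names it references.
    """
    to_space = {}
    pending = list(roots)
    while pending:
        name = pending.pop()
        if name in to_space:
            continue  # already copied
        to_space[name] = heap[name]
        pending.extend(heap[name])
    return to_space


# "a" references "b"; "c" and "d" reference each other,
# but nothing still in use reaches them.
from_space = {"a": ["b"], "b": [], "c": ["d"], "d": ["c"]}

# The fresh space takes over the role of the old one.
from_space = collect(from_space, roots=["a"])
print(sorted(from_space))  # ['a', 'b']
```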

The advantage of the garbage collector is that it prevents memory leaks caused by forgetting to release memory. The disadvantage is that it can be invoked at unexpected moments, thus delaying the progress of the programme. It is therefore not suitable for systems with strict real-time constraints.

Python, like Java, does have a garbage collector, so the programmer does not have to worry about this problem. C++ does not have a garbage collector: the programmer who allocates memory must make sure to release it explicitly when it is no longer needed. Google's Go language has a garbage collector that is enabled by default and can optionally be switched off.
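Python's collector can be observed through the standard gc module. The sketch below assumes the reference implementation, CPython, where most objects are freed as soon as nothing refers to them and a separate collector handles objects that refer to each other. The class Node and the function make_cycle are invented for this illustration.

```python
import gc
import weakref


class Node:
    """A tiny object that can point at another Node."""

    def __init__(self, name):
        self.name = name
        self.partner = None


def make_cycle():
    a = Node("a")
    b = Node("b")
    a.partner = b
    b.partner = a
    # A weak reference lets us observe "a" without keeping it alive.
    return weakref.ref(a)


gc.disable()  # pause automatic collection so the demo is predictable
probe = make_cycle()
alive_before = probe() is not None  # the cycle keeps both objects alive
gc.collect()                        # run the collector by hand
alive_after = probe() is not None   # the unreachable cycle has been freed
gc.enable()

print(alive_before, alive_after)  # True False (on CPython)
```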

These curiosities are only a first step towards a thorough knowledge of Python. Knowing them will help you choose the language best suited to your project’s specifications and reduce monetary, human and technological costs. Our experts are on hand to take you deeper into the intricacies of this beautiful language and help you choose the language that best meets your expectations!

 

 

 

j.lefrancois

Jérémie Lefrançois, an AUSY consultant since 2004, wrote his first programmes in Basic on a ZX Spectrum back in 1982. Gaining a postgraduate qualification in IT in 1989, he has worked with many computer languages on a personal and professional basis. He loves exploring and comparing the possibilities of different languages - especially Python - which has been his preferred language since 2014. He enjoys monitoring developments in this dynamic language, which he likes for its ease of implementation.

 

 
