Re-introduction to Python - part 4. Function composition.
You've learned a lot about data types, which really are the fundamental units of structured data. Now it is time to become familiar with the fundamental units of Python code itself - functions.
A function is, in a way, a mini-program. Earlier we said a computer program takes some "input" data, applies some transformations to it and produces "output" data. The same is true of functions.
Think of functions as a way of breaking down a complex program into smaller, simpler units. Each function, if designed well, should only really do one thing (but hopefully do it well). That one thing may be a simple thing, or it may be a rather sophisticated thing, and that's ok either way, as long as you are able to briefly describe the entirety of what the function does. Some functions do more work than others. The important thing is that each function encapsulates the complexity of whatever it is that needs to be done, so from the outside it is relatively simple and easy to use.
If designed well, functions really are a bit like Lego blocks, in that you can assemble a bunch of simple ones together to achieve more complex things. This is what we call "function composition" in software engineering. But perhaps a better analogy would be to think of pieces of pipes, chained together. Imagine the data flowing through a function (and getting modified, or "processed" somehow by that function), then the modified data flowing out of that function (being "returned") and into another, which further modifies it in other ways, then possibly another... until it finally turns into the shape we want it to have.
You have already been using some Python functions. The print() function you are now familiar with takes some data and - instead of actually modifying it - just displays it to the user. So it is a bit unusual, in that the data doesn't get modified, and also doesn't get "returned" to be processed by other functions - it's just redirected to the screen (or the "standard output" device - STDOUT).
The other function you saw - input() - is also a bit unusual, but in the opposite way. Typically no other function passes any data into it (at most, it is given a prompt to display). Instead, it reads data from the user (technically, from the "standard input" device - STDIN), doesn't really process it much, and just passes it on ("returns" it) for the next function in the chain to do something with, or simply to be assigned to a variable.
A typical simple computer program often has the shape:
input() --> process_data() --> print()
This process of composing a chain of functions, where each function passes its output as the input to the next function, is implemented in Python via function calls, so the above "conceptual" program would actually look like this in Python:
print(process_data(input()))
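As a rough sketch of that shape: process_data() below is a hypothetical placeholder (the text does not define it) that simply upper-cases its input, and a fixed string stands in for whatever input() would return, so the example runs without anyone typing.

```python
# Hypothetical stand-in for the "process_data" step; any transformation would do.
def process_data(text):
    return text.upper()

# A fixed string stands in for the value input() would normally return.
user_text = "hello"

# Step by step, with an intermediate variable...
processed = process_data(user_text)
print(processed)

# ...or as one chained call, which Python evaluates from the inside out.
print(process_data(user_text))
```

Both print() calls display the same thing. The chained form just skips naming the intermediate result.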
So what is an example of a proper "data processing" function? Python has a built-in function called sorted(), which, as the name suggests, takes as input a list, or a string, or a similar sequence of items, and returns a new list containing all of the items in the input sequence sorted in ascending order. So sorted([3, 2, 1]) returns the list [1, 2, 3]. Similarly, sorted("word") returns the list ['d', 'o', 'r', 'w'].
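Here is a small sketch of those two calls. The variable names are only for illustration. It also shows that sorted() builds a new list and leaves its input alone.

```python
numbers = [3, 2, 1]
sorted_numbers = sorted(numbers)

print(sorted_numbers)  # [1, 2, 3]
print(numbers)         # [3, 2, 1] (the original list is unchanged)

# A string is treated as a sequence of characters.
letters = sorted("word")
print(letters)         # ['d', 'o', 'r', 'w']
```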
The pieces of data we pass into the function as input are called the function's "arguments" (in this example, the sorted() function takes a list or a string as an argument), and the data returned as output is called the "return value". The combination of the function's name, the arguments it accepts and the values it returns is sometimes called the "function's signature".
Now we can see a complete example of this "chaining of functions like pipes in a pipeline" analogy in actual Python code:
print(sorted(input()))
This is a complete Python program. It consists of 3 chained function calls. It outputs (via print()) a list with the sorted (sorted()) characters of the text that is entered by the user (via input()). Pretty neat, eh?
It gets more interesting when you learn to define your own functions. In Python, you can do this by using the def keyword. Here's an example of a new function we can define:
def reverse(text):
    return text[::-1]
Here we are using the def keyword to define a new function. We give our new function a descriptive name, which hints at what the function is meant to be doing - reverse. Our function takes a single argument - a piece of data (presumably a string) and temporarily gives it the name text. So whatever actual value we passed into our function, inside the "body" of our function definition that value will be known by the name text, as if it was a variable. Finally, our function returns the passed in string (text) reversed. (We used the negative step slicing trick from one of the previous exercises to actually perform the reversal.)
So now we have a new function. If we call it with a string argument, e.g. reverse("stressed"), we will get back "desserts".
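A short sketch of calling reverse() and chaining it with other functions, in the spirit of the pipeline idea from earlier. The variable names are only for illustration.

```python
def reverse(text):
    # A slice with a step of -1 walks the string backwards.
    return text[::-1]

result = reverse("stressed")
print(result)  # desserts

# Functions compose: the output of one call becomes the input of the next.
round_trip = reverse(reverse("stressed"))
print(round_trip)  # stressed

# Chaining with the built-in sorted(), which returns a list of characters.
sorted_letters = sorted(reverse("stressed"))
print(sorted_letters)
```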
Take a moment to grab a snack.
Often, when we have a collection of items, we want to be able to go over each of them and do something (e.g. transform each item, or add them together, or count the ones that match a certain pattern). To do this in Python we use a control structure called a for loop. A loop, in general, is a programming construct for repeatedly applying a certain set of operations. A for loop goes over each item in a collection and temporarily assigns it to a variable, which you can do things with. Let's illustrate this by creating a function to count all words in a list which start with a certain letter:
def count_words_with(words, letter):
    occurrences = 0
    for word in words:
        if word.startswith(letter):
            occurrences += 1
    return occurrences
Now we can use our new function like this: count_words_with(['apple', 'banana', 'blueberry'], 'b'), which will return 2.
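To watch the loop at work, here is a sketch of a traced variant of the function. The name count_words_with_trace and the extra print() call are additions for illustration only. The counting logic is the same as above.

```python
def count_words_with_trace(words, letter):
    occurrences = 0
    for word in words:
        # On each pass, "word" refers to the next item in the list.
        print("checking:", word)
        if word.startswith(letter):
            occurrences += 1
    return occurrences

fruits = ["apple", "banana", "blueberry"]

b_count = count_words_with_trace(fruits, "b")
print(b_count)  # 2

# No word starts with "c", and an empty list has nothing to count.
c_count = count_words_with_trace(fruits, "c")
empty_count = count_words_with_trace([], "b")
print(c_count, empty_count)  # 0 0
```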
As you can see, functions are not limited to taking a single argument. They can have multiple arguments. In our new function, the first argument is words and the second one is letter. Because their order matters, these are called "positional arguments". Python also lets you pass arguments explicitly by name. These are called "keyword arguments" and are more flexible, if not as concise as positional arguments. See what you can learn about positional and keyword arguments in Python.
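As a starting point, this sketch calls the same function three ways: by position, by keyword, and with a mix of the two. The variable names are only for illustration.

```python
def count_words_with(words, letter):
    occurrences = 0
    for word in words:
        if word.startswith(letter):
            occurrences += 1
    return occurrences

fruits = ["apple", "banana", "blueberry"]

# Positional: values are matched to parameters by their order.
by_position = count_words_with(fruits, "b")

# Keyword: values are matched by name, so the order no longer matters.
by_keyword = count_words_with(letter="b", words=fruits)

# Mixed: positional arguments must come before keyword arguments.
mixed = count_words_with(fruits, letter="b")

print(by_position, by_keyword, mixed)  # 2 2 2
```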
Here are some more questions to research, think about and discuss:
- Other than the for loop, which is very convenient for iterating over a collection of items, what other types of loops are there in Python?
- What are the rules and conventions for naming Python functions?
- Are variables defined inside a function visible outside of it? (i.e. can you actually access the variable occurrences in our sample function above, outside of the function definition?) Conversely, are variables defined outside of a function visible inside of it?
- Research the concept of "namespaces" in Python.
- What does the built-in function locals() do?
- What are "global" variables and how do they work in Python? How can you access the value of a global variable inside a function? How can you modify the value of a global variable inside a function?
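If you would like something to experiment with while researching the scope questions, here is one possible starting point. The names greeting and describe_scope are made up for illustration. Try predicting what each print() will show before you run it.

```python
greeting = "hello"  # defined outside of any function

def describe_scope():
    occurrences = 3  # defined inside the function
    # locals() returns a dictionary of the names visible in the local scope;
    # sorted() turns its keys into a list of names.
    local_names = sorted(locals())
    # Reading a name defined outside the function works here.
    return greeting, local_names

seen_greeting, local_names = describe_scope()
print(seen_greeting)
print(local_names)

# Is "occurrences" still reachable once the function has returned?
try:
    occurrences
except NameError:
    visible_outside = False
else:
    visible_outside = True
print(visible_outside)
```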