A115 Software Engineering

Bespoke cloud-based software platforms powering UK commerce since 2010

Re-introduction to Python - part 8. Type hints, modules and reusability.

So far, we have been using various data types in Python without explicitly specifying what the types are. We've just been creating variables and assigning them values, and Python works out the data type from the value itself. E.g. if a variable my_number is assigned the value 12, the value 12 is an integer, so my_number currently refers to an integer. You will often hear the term "duck typing" in this context - after the saying "if it looks like a duck and walks like a duck, it must be a duck". Strictly speaking, duck typing means that Python cares about what an object can do (which operations and methods it supports) rather than what type it was declared to be.

Except that nothing forces my_number to stay an integer. If we were to later add, for example, 0.01 to my_number and store the result back in it, the duck starts walking a bit funny: the result is a float, so my_number now refers to a float instead of an int. The technical term for computer languages that do this kind of thing is "dynamically typed" languages: types belong to values and are worked out while the program runs, instead of being declared for each variable up front. You may have also run across examples of "statically typed" languages (such as C, C++ and Java). In those, the programmer is expected to provide the data type of each variable at the time of definition. So in C for example, you might see things like int my_num = 12.
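Here is a minimal sketch of what happens to my_number in the example above. The names type_before and type_after are only there for illustration:

```python
# The name my_number is bound to the integer value 12.
my_number = 12
type_before = type(my_number)

# Adding 0.01 produces a new float value; assigning it back rebinds the name.
my_number = my_number + 0.01
type_after = type(my_number)

print(type_before.__name__, type_after.__name__)  # prints: int float
```

The same name ends up referring to values of two different types, and Python does not complain at any point.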

Python, traditionally, has not provided a way to specify the types of variables and has relied exclusively on its "duck typing" prowess. In modern versions of Python, "duck typing" continues to be the default - and only - behaviour of Python at run time. However, the syntax of the language has been expanded to include support for the so-called "type hints" (also known as "type annotations"): the typing module arrived in version 3.5 and annotations for variables in version 3.6. Type hints in Python look like this: my_number: int = 12. You can also use them in your function definitions, to indicate the data type of each argument as well as the type of the expected return value for your function:

def my_function(nums: List[int], target: int) -> Tuple[int, int]:

This defines a function, which (clearly) takes a list of integers and one other integer and returns a tuple of two integers. (If you did the exercise in the last lesson, this function might look familiar!)

At any rate, you can already see how this is a lot more informative than simply writing the equally valid, more traditional code:

def my_function(nums, target):
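To see the annotated version in a complete, runnable form, here is a sketch with a function body filled in. The body is my own assumption about what such a function might do (find the positions of two numbers that add up to the target) and is not necessarily the solution to the earlier exercise:

```python
from typing import Dict, List, Tuple


def my_function(nums: List[int], target: int) -> Tuple[int, int]:
    """Return the positions of two numbers in nums that add up to target."""
    seen: Dict[int, int] = {}  # value -> position where we saw it
    for position, value in enumerate(nums):
        needed = target - value
        if needed in seen:
            return seen[needed], position
        seen[value] = position
    # Assumption for this sketch: no matching pair is treated as an error.
    raise ValueError("no two numbers add up to the target")


print(my_function([2, 7, 11, 15], 9))  # prints: (0, 1)
```

Even without reading the body, the first line tells you what goes in and what comes out.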

You might think "great, now Python knows for sure what the types of my data are." Sorry to disappoint, but if you did think that, you are in for a surprise. What Python does with type annotations when it runs your code, to this day, is something rather impolite... it IGNORES THEM COMPLETELY. It will not stop you from passing a string where you promised an integer.
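You can check this for yourself. The function double below is a made-up example, not something from the earlier lessons:

```python
def double(value: int) -> int:
    # The annotations promise an int in and an int out...
    return value * 2


# ...but Python happily runs the function with a string instead.
result = double("ha")
print(result)  # prints: haha

# The hints are not thrown away, though: they are stored on the function
# so that other tools can read them.
print(double.__annotations__)
```

No error, no warning. The annotations are recorded, but nothing checks them when the program runs.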

So why on Earth are type hints a thing, and more to the point - why are we even talking about this?

Because readability matters.

Making your code more readable and clearer to understand is important. You might come back to your code a year from now and wonder what you were thinking when you did what you did. Or, if you're anything like me, you might come back to your code 5 minutes from now and wonder what you were thinking when you did what you did. Or, you might actually work with, you know, other beings who have to look at your code and wonder what you were thinking when you did what you did.

So even if the Python interpreter itself refuses your kindness, type hints still have value because they make your code clearer and people looking at it won't have to wonder what the types of things are.

What's more, just because Python itself doesn't care about type annotations, doesn't mean there aren't other tools that do care about them. If you are using the PyCharm IDE for example, you'll notice that once you begin sprinkling type annotations into your code, PyCharm suddenly starts being more helpful and advising you that you can't, for example, just add a number to a string. Or that your function is not actually returning data of the type you said it will.

This helps catch many different errors in your code early on. In fact, it's so useful, there are even stand-alone tools designed specifically to run type checks on your type-annotated Python programs. The most widely used one is called mypy and I strongly advise you to read up a bit on it.
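As a small sketch of the kind of mistake these tools are designed to catch, consider the made-up function describe_age below. Python runs it without complaint, but a type checker such as mypy should report that the value being returned does not match the declared return type (I am not quoting the exact message here, since the wording varies between versions):

```python
def describe_age(age: int) -> int:
    # Bug: the annotation promises an int, but a string is returned.
    return "You are " + str(age) + " years old"


message = describe_age(42)
print(message)  # prints: You are 42 years old
```

Assuming you saved this in a file called ages.py (a hypothetical name) and installed mypy, you would check it by running mypy ages.py from the command line.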

Please note that if you use anything beyond simple types for your annotations (like int, bool, str), you might end up needing to import some names from the typing module into your program before you can use them, like the List with capital L and the Tuple with capital T from the example above. These names are not available by default, so you need to bring them in from a separate module (which does come with Python).

As you might have guessed, modules are ways of packaging up some related functionality together, in order to be used later elsewhere in your code - or in other people's code. For now, you can think of a Python module as just a separate file containing some code. If you want to use any of the code from the module in your own code, you have to import it. For example, if we want to use the data type names List or Tuple, we have to first import them like this, at the top of our code:

from typing import List, Tuple

(Note 1: Import statements don't technically have to be at the top of your code, but it's good practice to keep them there. It helps with readability.)

(Note 2: You might be wondering why you have to import and use these capitalised data type names to begin with. How come int is OK for a type annotation, but we can't just use the normal built-in list and tuple? Well, you can use my_items: list to indicate that your items are a list. Whether you can get more specific than that depends on your Python version. Before Python 3.9, my_items: list[int] was an error, and you had to import that capital-L List from the typing module to describe a list of integers. Since Python 3.9, list[int] and tuple[int, int] work directly, and the capitalised names are kept mainly for compatibility with older code.)
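Here is a short sketch showing both spellings side by side. It assumes you are running Python 3.9 or later, where the built-in list and tuple accept square brackets in annotations; the variable names are made up for illustration:

```python
from typing import List, Tuple

# Capitalised names imported from the typing module.
scores: List[int] = [70, 85, 92]
point: Tuple[int, int] = (3, 4)

# Assumes Python 3.9 or later: the built-in names accept square brackets too.
more_scores: list[int] = [55, 61]
another_point: tuple[int, int] = (0, 0)

print(scores, point, more_scores, another_point)
```

On an older Python, the last two annotations would raise an error, which is why you will still see the imported names in a lot of existing code.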

A very important principle in software engineering is the idea of "reusability". When we write some code once, we don't really want to have to write the same (or nearly the same) code over and over again every time we need it. Remember how we said that less code is better than more code? If you only have your code in one place, there are fewer opportunities to make mistakes. It also makes it easier to read and think about that code. Functions are one way to make reusable code. You define your function once and you do your best to make it somewhat generic (e.g. if it is a function that works on a list, it should work with any list; you should not need a separate function to handle a special case like an empty list). Once you have your generic function defined, you can use it over and over with as many lists as you can throw at it. This is reusability. You will also sometimes see or hear the term DRY (short for "Don't Repeat Yourself"). It means the same thing. Write your code once and make sure you (or other people) can use it over and over in different situations without having to reinvent the wheel.
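As a sketch of what "generic" means here, the made-up function average below works with any list of numbers, including an empty one. Returning 0.0 for an empty list is my own assumption for this example; raising an error would be an equally reasonable choice:

```python
from typing import List


def average(numbers: List[float]) -> float:
    """Return the average of the numbers, or 0.0 for an empty list."""
    if not numbers:
        return 0.0
    return sum(numbers) / len(numbers)


# One definition, reused with as many lists as we like.
print(average([2.0, 4.0, 6.0]))  # prints: 4.0
print(average([10.0]))           # prints: 10.0
print(average([]))               # prints: 0.0
```

The empty list is handled inside the function, so nobody calling it needs a separate special case.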

Modules are another construct that helps with reusability and keeping code DRY. You can put some related functions together into their own file and, if they are well designed, people can import them from your module and re-use them happily.

When you install Python on your computer, it comes with a standard collection of commonly used modules. It is quite an extensive collection and is known as "the standard library". I can't emphasise strongly enough how important it is for you to get familiar with - and to continuously expand your knowledge of - the Python standard library. I have seen too many programmers with years of experience behind them banging their heads against the wall, trying to come up with solutions to problems that have already been solved and are available right there at their fingertips, if they only knew their standard library a bit better. I don't think I've shared many links with you in this series (mostly because I want you to get used to finding things out on your own) - but the best way to expand your knowledge about the Python standard library is the official Python documentation: https://docs.python.org/3/library/

If you learn nothing else from this lesson, getting to know the standard library is enough to set you on the path to Python greatness.
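To give a small taste of a problem the standard library has already solved, here is a sketch that counts how often each word appears in a list, using Counter from the collections module. The word list is made up for the example:

```python
from collections import Counter

words = ["duck", "goose", "duck", "swan", "duck", "goose"]

# Counter tallies how many times each item appears.
counts = Counter(words)

print(counts["duck"])         # prints: 3
print(counts.most_common(1))  # prints: [('duck', 3)]
```

Writing the same thing by hand with a loop and a dictionary is perfectly possible, but there is no need to.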

Some questions for further research and discussion:

  1. Using type annotations, how do you indicate that a function may return "an integer or None"? What do you have to import for that?
  2. How do you calculate the square root of a number n using the Python standard library?
  3. Go back to the latest version of your super-power character program and add type annotations.
  4. Extra credit: download and run the mypy tool on your type-annotated super-power character program. Did it complain about anything?
  5. What is PYTHONPATH?
  6. Split your program into two files - a module containing the definitions of the functions you wrote for it, and a main file which imports and uses those.