Python

1. Python

In this course, we will use the Python programming language. Python is an open source language that is used in many areas of science and engineering. It is very similar to the R language that you might know from your statistics classes, but slightly more general and more widely used outside of psychology.

This first section of the course will provide a very brief introduction to the Python language.

Goals

Using Python as a calculator
Defining variables
Knowing different data types
Running loops
Running functions
Loading modules

1.1. Calculations

Like most programming languages, Python can perform basic arithmetic operations like addition, subtraction, multiplication, and division.

3 + 3

10 / 4

2.5
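Beyond the four basic operations, Python has a few more arithmetic operators and follows the usual rules of operator precedence. The following is a small sketch with numbers chosen purely for illustration; it uses print() so that every result is shown, not only the last one:

```python
print(2 + 3 * 4)     # multiplication binds more tightly than addition: 14
print((2 + 3) * 4)   # round brackets change the order of evaluation: 20
print(2 ** 3)        # exponentiation (2 to the power of 3): 8
print(10 // 4)       # floor division (rounds down to a whole number): 2
print(10 % 4)        # remainder of the division: 2
```

Note that the ordinary division operator (/) always returns a number with a decimal point, even when the result is a whole number (e.g., 8 / 4 gives 2.0).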

Using the hash sign (#), we can add comments to our code as notes for ourselves or others.

# This kind of comment (starting with '#') is ignored by Python

1.2. Variables

Variables are used to store values for later use. They are defined by assigning a value (right-hand side) to a custom variable name (left-hand side), with an equals sign (=) in between (not <-, as in R). In an interactive session or notebook, we can display the value of a variable by simply typing its name as the last line of a cell (or, alternatively, by using the print() function).

a = 3
a

a = 3
b = 4
c = a + b
c
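As a small sketch of the print() alternative mentioned above: a notebook cell only displays the value of its last expression automatically, so print() is the way to show several values from one cell. The example also illustrates that assigning a new value to a variable does not update other variables that were computed from it earlier.

```python
a = 3
b = 4
c = a + b

print(a)   # shows 3
print(c)   # shows 7

# Reassigning a replaces its old value, but c keeps the value
# that was computed before the reassignment.
a = 10
print(a)   # shows 10
print(c)   # still shows 7
```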

1.3. Data types

So far, we have only used single numbers, but Python also supports many other data types. An important distinction is between scalars (single values, such as a single number) and iterables (collections of values, such as a list of numbers).

1.3.1. Scalars

An integer (int) is a whole number (positive or negative).

my_var = 3
type(my_var)

int

A float (float) is a number with a decimal point.

my_var = 3.0
type(my_var)

float

A string (str) is a sequence of characters (letters, numbers, symbols).

my_var = 'Hello world'
type(my_var)

str

A Boolean (bool) is one of the two logical values True or False.

my_var = True
type(my_var)

bool

None is a special value that represents the absence of a value.

my_var = None
type(my_var)

NoneType
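The type of a value matters when values are combined. The sketch below (with variable names made up for illustration) shows what happens when an integer meets a float, and how the built-in functions int() and str() can convert a value from one type to another:

```python
whole = 3       # int
decimal = 3.0   # float
text = '3'      # str

print(type(whole + decimal))   # mixing int and float gives a float
print(int(text) + whole)       # int() turns the string '3' into the number 3: 6
print(str(whole) + text)       # str() turns the number 3 into the string '3': '33'
print(whole == decimal)        # True: equal values, even though the types differ
```

Adding a number to a string directly (e.g., whole + text) raises a TypeError, which is why the explicit conversion is needed.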

1.3.2. Iterables

A list (list) is a collection of values and is defined by square brackets.

my_var = [1, 2, 3]
type(my_var)

list

Elements of a list can be accessed via their index (starting at 0, unlike R or MATLAB) in square brackets.

my_var[0]

This can also be used to change the value of an element. Note that, as shown here, lists can contain elements of different types (unlike vectors in R).

my_var[2] = 'Hooray!'
my_var

[1, 2, 'Hooray!']
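Indexing offers a little more than access to single elements from the front. As a brief sketch (the list and variable names here are illustrative only), negative indices count from the end of the list, and a slice with a colon selects a range of elements:

```python
values = [10, 20, 30]

first = values[0]        # index 0 is the first element: 10
last = values[-1]        # index -1 is the last element: 30
first_two = values[0:2]  # slice from index 0 up to (but excluding) index 2

print(first, last, first_two)
```

The slice returns a new list and leaves the original list unchanged.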

A tuple (tuple) works like a list, but cannot be changed after it has been created (it is “immutable”). It is defined by round brackets.

my_var = (1, 2, 3)
type(my_var)

tuple

my_var[0]

my_var[2] = 'Hooray!'

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[16], line 1
----> 1 my_var[2] = 'Hooray!'

TypeError: 'tuple' object does not support item assignment

A dictionary (dict) is a collection of key-value pairs and is defined by curly brackets. The keys (usually some kind of label) can then be used to access the corresponding values (usually some kind of data).

my_var = {'Name': 'Berger', 'First_name': 'Hans', 'age': 53, 'married': True}
type(my_var)

dict

my_var['First_name']

'Hans'
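Like lists, dictionaries can be changed after they have been created. The following sketch reuses the dictionary from above under a different variable name; the 'city' key and its value are hypothetical additions for illustration:

```python
person = {'Name': 'Berger', 'First_name': 'Hans', 'age': 53, 'married': True}

person['age'] = 54        # assigning to an existing key replaces its value
person['city'] = 'Jena'   # assigning to a new (hypothetical) key adds a pair

print(person['age'])
print('city' in person)     # "in" checks whether a key exists: True
print('salary' in person)   # False: there is no such key
```

Accessing a key that does not exist (e.g., person['salary']) raises a KeyError.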

1.4. Loops

Loops are used to repeat a certain operation multiple times. The most common type of loop is a for loop, which iterates over the elements of an iterable (such as a list).

names = ['Alice', 'Bob', 'Chantoya']
for name in names:
    print('Hello ' + name)

Hello Alice
Hello Bob
Hello Chantoya

Note that the operations inside the loop need to be indented consistently, by convention with four spaces per level. Python, unlike R, uses indentation to decide which lines belong to the loop, so it is very strict about it.
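To see how indentation determines what belongs to the loop, consider this minimal sketch. The counter variables (inside, after) are made up for illustration:

```python
names = ['Alice', 'Bob', 'Chantoya']
inside = 0
after = 0

for name in names:
    print('Hello ' + name)   # indented: part of the loop, runs three times
    inside = inside + 1      # indented: also part of the loop
after = after + 1            # not indented: runs once, after the loop has finished

print(inside, after)         # shows 3 1
```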

It is often useful to store the results of a loop in a new list.

names = ['Alice', 'Bob', 'Chantoya']
greetings = []
for name in names:
    greetings.append('Hello ' + name)
greetings

['Hello Alice', 'Hello Bob', 'Hello Chantoya']
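The same pattern works for numbers. As a sketch, the built-in range() function can generate the sequence of whole numbers to loop over; the variable names are illustrative:

```python
squares = []
for number in range(1, 4):        # range(1, 4) yields 1, 2, 3 (the end value is excluded)
    squares.append(number ** 2)   # add the squared number to the end of the list

print(squares)                    # shows [1, 4, 9]
```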

1.5. Functions

Functions are pre-defined operations that can be applied to some input data. Functions are called by writing their name, followed by round brackets containing the input arguments.

As an example, the sum() function calculates the sum of all elements in a list.

numbers = [1, 2, 3]
sum(numbers)

The len() function returns the length of an iterable.

len(numbers)

We can also define our own custom functions like this:

def square(x):
    """Multiplies a number (x) by itself."""
    return x ** 2


square(3)

The “docstring” (in triple quotes) describing the function is optional but good practice.

Functions can have multiple input arguments. These can be specified by name (keyword arguments) or by position (positional arguments). When both styles are mixed in one call, the positional arguments must come first.

def power(base, exponent):
    """Exponentiates a number (base) to a given power (exponent)."""
    return base ** exponent


power(2, exponent=3)
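To make the difference between the two styles concrete, the sketch below calls the same power() function in several ways. The result variable names are chosen for illustration:

```python
def power(base, exponent):
    """Exponentiates a number (base) to a given power (exponent)."""
    return base ** exponent


by_position = power(2, 3)                    # 2 ** 3
by_keyword = power(base=2, exponent=3)       # same call, arguments given by name
keywords_swapped = power(exponent=3, base=2) # order does not matter for keywords
positions_swapped = power(3, 2)              # order does matter for positions: 3 ** 2

print(by_position, by_keyword, keywords_swapped, positions_swapped)
```

The first three calls all return 8, whereas the last one returns 9, because swapping positional arguments changes which value is used as the base.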

Some types of objects come with their own special functions built in (called methods). These are called via the object name, followed by a dot, the name of the method, and round brackets (optionally containing additional input arguments).

my_string = 'Hello world'
my_string.upper()

'HELLO WORLD'
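Strings have many more methods, some of which take input arguments. A short sketch with a few common ones follows (the result variable names are illustrative). Note that these methods return a new value and leave the original string untouched:

```python
my_string = 'Hello world'

lowered = my_string.lower()                      # 'hello world'
replaced = my_string.replace('world', 'Python')  # 'Hello Python'
words = my_string.split()                        # split at whitespace: ['Hello', 'world']

print(lowered, replaced, words)
print(my_string)                                 # still 'Hello world'
```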

1.6. Modules

Python only comes with a limited set of built-in functions like sum() and len(). Many additional functions are provided by external modules (often loosely called packages).

To use these, we first need to install them (only once) and then import them (at the beginning of each script). As an example, we will install the numpy module for working with numerical data.

# %pip install numpy

We can now import the module and use its functions:

import numpy

numbers = [1, 2, 3]
numpy.mean(numbers)

2.0

We can also import the module under a custom name (alias) to make it easier to use. This is very common for the numpy module, which is usually imported as np.

import numpy as np

np.mean(numbers)

2.0

We can also import only specific functions from a module:

from numpy import mean

mean(numbers)

2.0

NumPy can also be used to create arrays, which are similar to lists but differ in two important ways:

Arrays can only contain elements of the same type (e.g., only integers)
Arrays can be multi-dimensional (e.g., a 2D matrix with two rows and three columns)

my_array = np.array([[1, 2, 3], [4, 5, 6]])
my_array

array([[1, 2, 3],
       [4, 5, 6]])
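Both properties can be inspected directly. The following sketch assumes that numpy is installed as described above; the mixed array is a made-up example. It shows the shape of the 2D array, how a single element is accessed by row and column, and how NumPy converts mixed input to a single common type:

```python
import numpy as np

my_array = np.array([[1, 2, 3], [4, 5, 6]])

print(my_array.shape)   # (2, 3): two rows, three columns
print(my_array[0, 1])   # row 0, column 1: the value 2

# Mixing integers and a float: all elements are stored as floats
mixed = np.array([1, 2.5, 3])
print(mixed.dtype)      # float64
print(mixed)
```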

NumPy provides many useful functions for working with arrays, such as taking the mean across the entire array or along a specific dimension (axis).

np.mean(my_array)  # Mean of all elements

3.5

np.mean(my_array, axis=0)  # Means of each column

array([2.5, 3.5, 4.5])

np.mean(my_array, axis=1)  # Means of each row

array([2., 5.])

1.7. Further reading

Blog post R or Python for Psychologists by Dominique Makowski (2020)
Free online course Programming with Python by Software Carpentry