Homework 1

In this homework you will learn how to use the scripting language called Python.

Learn about Python here.

Python has a library called NumPy, which provides support for multi-dimentional arrays and matrices.

Learn about NumPy here.

This homework has 4 parts.

Part 1) Arrays and copy semantics

Consider the following Python code segment, which uses built-in Python lists and NumPy lists to perform similar operations, albeit with differing results.

Using built-in Python lists:

>>> ra = range(0,10)
>>> ra
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
>>> rb=ra
>>> rb[0]=-1
>>> ra
[-1, 1, 2, 3, 4, 5, 6, 7, 8, 9]
>>> rv=rb[1:3]
>>> rv[1]=10
>>> ra
[-1, 1, 2, 3, 4, 5, 6, 7, 8, 9]

Using NumPy arrays:

>>> import numpy as np
>>> na = np.arange(0,10)
>>> na
array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
>>> nb=na
>>> nb[0]=-1
>>> na
array([-1,  1,  2,  3,  4,  5,  6,  7,  8,  9])
>>> nv=nb[1:3]
>>> nv[1]=10
>>> na
array([-1, 1, 10,  3,  4,  5,  6,  7,  8,  9])

Describe similarities and differences between copying and assignment semantics of built-in Python lists and NumPy arrays. Explain why the code behaves differently for the two.

Part 2) Matrices

NumPy also supports matrices. However, there are some important differences between two-dimentional arrays and matrices. Consider the following two code segments that are similar, but produce different results:

Using NumPy two-dimentional arrays:

>>> aa = np.array([[1,2], [3,4]])
>>> ab = np.array([[2,1], [-1,2]])
>>> aa*ab
array([[ 2,  2],
       [-3,  8]])
>>> aa**3
array([[ 1,  8],
       [27, 64]])

Using NumPy matrices:

>>> ma = np.matrix([[1,2], [3,4]])
>>> mb = np.matrix([[2,1], [-1,2]])
>>> ma*mb
matrix([[ 0,  5],
        [ 2, 11]])
>>> ma**3
matrix([[ 37,  54],
        [ 81, 118]])

Describe similarities and differences between NumPy two-dimentional arrays and matrices. Explain why the code behaves differently for the two.

Part 3) Reading from files

Assume you have a file that contains an undirected graph. The graph is described according to the following BNF grammar:

<graph> -> <graph_header> <graph_body>
<graph_header> -> 'n' NUMBER '\n'
<graph_body> -> <graph_body>, <edge> |  <edge>
<edge> -> NUMBER -- NUMBER '\n'

There is only a single token, which is defined as:

NUMBER: [0-9]+

Here is an example input:

n 5
0 -- 3
0 -- 4
1 -- 2
1 -- 3
2 -- 4
3 -- 3

You can assume that the vertex indices start from 0. The 'n' specifies the total number of vertices. A graph like this can be represented as a matrix, as follows:

0 0 0 1 1
0 0 1 1 0 
0 1 0 0 1 
1 1 0 1 0 
1 0 1 0 0

Note that the matrix is symmetric. We do not need to specify an edge twice, u – v and v–u, as one implies the other.

Implement the following two functions:

For the first part, you need to study working with files in Python. It is described here.

Part 4) Going large-scale

What if your graph is large, say 10000 vertices. The matrix representation will end up eating a lot of space. This seems somewhat unnecessary, as most of the entries in the matrix will be 0. How can you make such matrices take less space in memory? Is there a Python library that you can locate which can handle this case? Do some research and report your findings.

Logistics

It is best if you install Python and NumPy on your own computer. But you could also find them pre-installed at dijkstra.ug.bcc.bilkent.edu.tr

Once you are done, put your code and report under a directory named lastname_name_hw1 and make an archive from that directory. Your report should be named lastname_name_hw1.pdf (or .txt). For example, the following Unix commands could be used:

    mkdir lastname_name_hw1
    cd lastname_name
        ...
        (edit and test your files in this directory)
        ...
    cd ..
    tar -cvzf lastname_name_hw1.tar.gz lastname_name_hw1

Then e-mail this newly generated file (named lastname_name_hw1.tar.gz) to your TA Doğukan Çağatay (dogukan.cagatay@bilkent.edu.tr).

Collaboration on the homework is not allowed. Reports in formats other than .txt and .pdf are not accepted.