NCTU CIS: Data Structures (Fall 2005)


Special Announcements:

1.  1st project due on November 9, 2005.  See below.

2.  2nd project due on December 7, 2005.  See below.

3.  3rd project due on January 11, 2006.  See below.

4.  CURRENT GRADES. Please check carefully about your midterm exam. DS051117.xls.

5.  Answer to the 3rd question in the midterm exam.

6.  Complexity analysis of Prim's algorithm, chapter 6.

7.  Classnotes for Chapter 6.


Course Information

Instructor

Wuu Yang, <wuuyang@cis.nctu.edu.tw>, EC332B, 03-5712121 ext. 56614.   Office Hours: Tu 1:30-2:30.

If you want to talk to me at other time, please make an appointment.

Teaching Assistants

黃信達、劉宸瑋、張志誠 、陳冠志、陳癸夫、黃柏翰、謝文均 at phone 56649.

Lectures

1G 3CD ED027.

Course URL

http://www.cis.nctu.edu.tw/~wuuyang/Lecture/lecture.DataStructure.html

Pre-requisites

The students are expected to have some experiences in programming and with the C programming language.

Course Description

Data structure is the second course in computer programming.  We will study arrays and structures, stacks and queues, lists, trees, graphs, sorting, hashing and heaps.

This course is an introduction to the use, design, and analysis of data structures in computer programs. We will begin with a reflection on the software construction process. Then we will discuss what an algorithm is and the use of data abstraction in software construction. Then we will talk about the evaluation of algorithms. Here we are interested in both the theoretical analyses and practical measurements. This completes our introduction to data structures.

In the following chapters, we will discuss arrays, stacks, queues, lists, trees, and graphs. These are commonly used data structures in almost every computer program. Mastering these basic data structures will greatly enhance our ability to write elegant programs.

Sorting and hashing are important topics in the study of algorithms. They are also closely related to the design of data structures. We will spend one week to discuss various sorting and hashing techniques, respectively. In the last two chapters, we will discuss heap and search structures.

Since there are 10 chapters in the textbook, we will discuss at least one chapter each week. We plan to implement 4 programs and have both a midterm and a final exam. In both classroom discussion and programming, we will use the C programming language. If most people are not familiar with the C language, we may discuss in more details the example programs in the textbook.

Textbook

Ellis Horrowitz, Sartaj Sahni, and Susan Anderson-Freed, Fundamentals of Data Structures In C, Computer Science Press, 1993.

(REFERENCE ONLY) Frank M. Carrano and Walter Savitch, Data Structures and Abstractions with Java, Pearson Education, 2003.

(REFERENCE ONLY) B.W. Kernighan and D.M. Ritchie, The C Programming Language, Prentice-Hall, 1978.

Exams

IT IS RECOMMENDED THAT YOU USE CHINESE TO ANSWER QUESTIONS IN THE EXAMS. YOU MAY ALSO USE ENGLISH IF NECESSARY. DO NOT USE ANY OTHER LANGUAGES.

Midterm: November 9, Wednesday, 10-12 am, 2005.  

Final: January 11, Wednesday, 10-12 am, 2006.

Grading (tentative)

There will be two exams (40% each) and many short homeworks and programming assignments (15%). Classroom discussion counts as 5% toward the final score.   This grading standard may change as the semester goes.

Current Homeworks

  1. Due September 28, Wednesday, 2005. Please analyze the complexity of program 2.9 (sparse matrix multiplication) with the sparse matrix representation discussed in the class. You need to hand in the written homework before class begins on 9/28.
  2. Due October 12, Wednesday, 2005.  Please design an algorithm for the longest common subsequence problem and analyze it.
  3. Due October 31, Monday, 2005. Please design an algorithm for inserting a new node into a threaded binary tree as a left child. See section 5.5.

Past Homeworks

  1. September 29, 2004 Wednesday, (1) Analyze program 2.9 on p. 75. (2) Design and discuss another data structure for sparse matrices.--蔡柏良助教
  2. October 6, 2004, Wednesady, Design an algorithm for the longest common subsequence problem. Analyze your algorithm. --彭垂業助教
  3. October 13, 2004 Wednesday, Modify the algorithms for transforming infix form into postfix form and for evaluating a postfix expression.  We need to take into account the associativity (as well as precedence) of operators. --許志行助教
  4. October 20, 2004, Wednesday, Can you find another LCS algorithm that is better than the one in your homework?  Why not? (CHECK YOUR BOOK AGAIN.) --劉宸瑋助教
  5. October 25, Mo, 3:30, (1) Write down a post-order traversal on a threaded tree. Equivalently, write down the postsucc function which returns the post-order successor of a node on a threaded tree.
    (2) Insertion to the left on threaded trees (slide 21, chap 5). --黃信達助教

Projects

  1. (November 9, 2005, Wednesday, 10 am) In this project, we are going to implement (1) input two sparse matrices, each in a file, (2) compute their product, and (3) output the number of entries that are even integers in the product. Here are data of 5 sparse matrices. The first line of every file indicates the dimension of the matrix.  In the five samples, all matrices are 1000x1000.  But your program should be able to process other dimensions. You may modify the program in the text book for reading a matrix.

    If there are duplicate entries, keep only the last one. Discard all the remaining duplicate entries. Your program should also report the duplicate entries.

    You need to package your program, the outputs of three sample runs, and a 1-page report in a single zip file named isyyyyyHW01, where isyyyyy is your student identity number. Submit the whole file to ftp://140.113.166.31 port 332, name is ds-fall2005 and password is ED117 by November 9, 2005, Wednesday, 10 am.  We do not accept late homework unless you have a legitimate reason.

  2. (December 7, 2005, Wednesday, 10 am) In this project, write TWO programs to compute the minimal spanning trees of a graph, one with the usual transitive closure method (OR ANY OTHER METHOD YOU CAN THINK OF) and the other with the union-find method.  Output of the programs is the total weight of the minimal spanning tree.  Run each program three times and compute the average of the running time of each on the input data.

    The input data is located here. It is a graph, which is shown in this figure

    It is a Delaunay Triangulation. Open the ps file and you can know what it is all about. First line is the number of vertices. 2nd line is the number of edges. Vertices are given by its coordinates. Since they are from image, so the coordinates are int. Edges are specified by two integers i and j that means the line segment has one end the ith point and the other end the jth point. Weight is the Euclidean distance, the length of the line segment.

    As usual, you need to package your program, the outputs of three sample runs, and a 1-page report in a single zip file named isyyyyyHW02, where isyyyyy is your student identity number. Submit the whole file to ftp://140.113.166.31 port 332, name is ds-fall2005 and password is ED117 by December 7, 2005, Wednesday, 10 am.  We do not accept late homework unless you have a legitimate reason.  In the report, you should analyze the running time and give some comments about the programs.

  3. (January 11, 2006, Wednesday, 10 am) In this third project, we want to implement the algorithm for optimal merging of runs in section 7.11.5 (Optimal Merging of Runs).  The input data file is located here.  The output consists of two traversals of the merge tree: both the pre-order and the in-order traversals.  We should be able to reconstruct the optimal merge tree from the two traversals.

    The input consists of n lines, one line for the size of a run. Here is an example input file (with three lines):

    3
    4
    9

    The output also consists many lines. Each line contains two numbers: the first is the size of an run or intermediate run and the second is either 0 or 1. If the second is 0, this line represents an original run. If the the second is 1, this line represents an intermediate run. For the above example, the in-order traversal consists of the following lines:

    3    0
    7    1
    4    0
    16  1
    9    0

    The pre-order traversal has a similar format.

    As usual, you need to package your program, the outputs of three sample runs, and a 1-page report in a single zip file named isyyyyyHW02, where isyyyyy is your student identity number. Submit the whole file to ftp://140.113.166.31 port 332, name is ds-fall2005 and password is ED117 by January 11, 2006, Wednesday, 10 am.  We do not accept late homework unless you have a legitimate reason.

Local On-Line Information

  1. Two figures for threaded binary tree traversals: completeINORDER and completePREORDER.  
  2.  

Information on the Net


Wuu Yang, <wuuyang@cis.nctu.edu.tw>, EC332B, 03-5712121 ext. 56614.