NCTU CIS: Data Structures (Fall 2011)


Special Announcements:

1.  第一次程式作業請寄給助教 楊佳典 marknicht@gmail.com,期限是10/21 星期五中午12 時。

2.  上周五與本周二之作業均為本周二交,題目請問來上課的同學。

3.  第二次程式作業請寄給助教 陳柏亨  zero3william@gmail.com,期限是11/4 星期五中午12 時。

3.  第三次程式作業的相關檔案 (助教歐冠翬  yanagiis@gmail.com  分機 56649):
  1. README
  2. SOCIAL SIMULATION PACKAGE
  3. Programming with Pthread
  4. Introduciton to OpenMP


Course Information

Instructor

Wuu Yang, <wuuyang@cis.nctu.edu.tw>, EC332B, 03-5712121 ext. 56614.   Office Hours: Tu 10-12.

If you want to talk to me at other time, please make an appointment.

Teaching Assistants

劉家倫, 楊佳典,  劉道源, 陳柏亨, 歐冠翬 at phone 56649.

Lectures

2B 5EF,  ED027.

Course URL

http://www.cis.nctu.edu.tw/~wuuyang/Lecture/lecture.DataStructure.html

Pre-requisites

The students are expected to have some experiences in programming and with the C or C++ programming language.

Course Description

Data structure is the second course in computer programming.  We will study arrays and structures, stacks and queues, lists, trees, graphs, sorting, hashing and heaps.

This course is an introduction to the use, design, and analysis of data structures in computer programs. We will begin with a reflection on the software construction process. Then we will discuss what an algorithm is and the use of data abstraction in software construction. Then we will talk about the evaluation of algorithms. Here we are interested in both the theoretical analyses and practical measurements. This completes our introduction to data structures.

In the following chapters, we will discuss arrays, stacks, queues, lists, trees, and graphs. These are commonly used data structures in almost every computer program. Mastering these basic data structures will greatly enhance our ability to write elegant programs.

Sorting and hashing are important topics in the study of algorithms. They are also closely related to the design of data structures. We will spend one week to discuss various sorting and hashing techniques, respectively. In the last two chapters, we will discuss heap and search structures.

Since there are 10 chapters in the textbook, we will discuss at least one chapter each week. We plan to implement 4 programs and have both a midterm and a final exam. In both classroom discussion and programming, we will use the C programming language. If most people are not familiar with the C language, we may discuss in more details the example programs in the textbook.

Textbook

Ellis Horowitz, Sartaj Sahni, and Dinesh P. Mehta, Fundamentals of Data Structures in C++, 2nd ed., Silicon Press, 2007.  (table of contents)

(REFERENCE ONLY) Frank M. Carrano and Walter Savitch, Data Structures and Abstractions with Java, Pearson Education, 2003.

(REFERENCE ONLY) B.W. Kernighan and D.M. Ritchie, The C Programming Language, Prentice-Hall, 1978.

Exams

IT IS RECOMMENDED THAT YOU USE CHINESE TO ANSWER QUESTIONS IN THE EXAMS. YOU MAY ALSO USE ENGLISH IF NECESSARY. DO NOT USE ANY OTHER LANGUAGES.

Midterm: November 11, Friday, 1:30-3:30 pm, 2011.  

Final: January 13, Friday, 1:30-3:30 pm, 2012.

Grading (tentative)

There will be two exams (40% each) and many short homeworks and programming assignments (15%). Classroom discussion counts as 5% toward the final score.   This grading standard may change as the semester goes. 

Classnotes

  1. Chapter 1
  2. Chapter 2
  3. Chapter 3
  4. Chapter 4
  5. Chapter 5
  6. Chapter 6
  7. Chapter 7
  8. Chapter 8
  9. Chapter 9
  10. Chapter 10
 

Current Homeworks

  1. ssss

Past Homeworks

  1. Due September 28, Wednesday, 2005. Please analyze the complexity of program 2.9 (sparse matrix multiplication) with the sparse matrix representation discussed in the class. You need to hand in the written homework before class begins on 9/28.
  2. Due October 12, Wednesday, 2005.  Please design an algorithm for the longest common subsequence problem and analyze it.
  3. Due October 31, Monday, 2005. Please design an algorithm for inserting a new node into a threaded binary tree as a left child. See section 5.5.
  4. September 29, 2004 Wednesday, (1) Analyze program 2.9 on p. 75. (2) Design and discuss another data structure for sparse matrices.--蔡柏良助教
  5. October 6, 2004, Wednesady, Design an algorithm for the longest common subsequence problem. Analyze your algorithm. --彭垂業助教
  6. October 13, 2004 Wednesday, Modify the algorithms for transforming infix form into postfix form and for evaluating a postfix expression.  We need to take into account the associativity (as well as precedence) of operators. --許志行助教
  7. October 20, 2004, Wednesday, Can you find another LCS algorithm that is better than the one in your homework?  Why not? (CHECK YOUR BOOK AGAIN.) --劉宸瑋助教
  8. October 25, Mo, 3:30, (1) Write down a post-order traversal on a threaded tree. Equivalently, write down the postsucc function which returns the post-order successor of a node on a threaded tree.
    (2) Insertion to the left on threaded trees (slide 21, chap 5). --黃信達助教

Projects

  1. (November 9, 2005, Wednesday, 10 am) In this project, we are going to implement (1) input two sparse matrices, each in a file, (2) compute their product, and (3) output the number of entries that are even integers in the product. Here are data of 5 sparse matrices. The first line of every file indicates the dimension of the matrix.  In the five samples, all matrices are 1000x1000.  But your program should be able to process other dimensions. You may modify the program in the text book for reading a matrix.

    If there are duplicate entries, keep only the last one. Discard all the remaining duplicate entries. Your program should also report the duplicate entries.

    You need to package your program, the outputs of three sample runs, and a 1-page report in a single zip file named isyyyyyHW01, where isyyyyy is your student identity number. Submit the whole file to ftp://140.113.166.31 port 332, name is ds-fall2005 and password is ED117 by November 9, 2005, Wednesday, 10 am.  We do not accept late homework unless you have a legitimate reason.

  2. (December 7, 2005, Wednesday, 10 am) In this project, write TWO programs to compute the minimal spanning trees of a graph, one with the usual transitive closure method (OR ANY OTHER METHOD YOU CAN THINK OF) and the other with the union-find method.  Output of the programs is the total weight of the minimal spanning tree.  Run each program three times and compute the average of the running time of each on the input data.

    The input data is located here. It is a graph, which is shown in this figure

    It is a Delaunay Triangulation. Open the ps file and you can know what it is all about. First line is the number of vertices. 2nd line is the number of edges. Vertices are given by its coordinates. Since they are from image, so the coordinates are int. Edges are specified by two integers i and j that means the line segment has one end the ith point and the other end the jth point. Weight is the Euclidean distance, the length of the line segment.

    As usual, you need to package your program, the outputs of three sample runs, and a 1-page report in a single zip file named isyyyyyHW02, where isyyyyy is your student identity number. Submit the whole file to ftp://140.113.166.31 port 332, name is ds-fall2005 and password is ED117 by December 7, 2005, Wednesday, 10 am.  We do not accept late homework unless you have a legitimate reason.  In the report, you should analyze the running time and give some comments about the programs.

  3. (January 11, 2006, Wednesday, 10 am) In this third project, we want to implement the algorithm for optimal merging of runs in section 7.11.5 (Optimal Merging of Runs).  The input data file is located here.  The output consists of two traversals of the merge tree: both the pre-order and the in-order traversals.  We should be able to reconstruct the optimal merge tree from the two traversals.

    The input consists of n lines, one line for the size of a run. Here is an example input file (with three lines):

    3
    4
    9

    The output also consists many lines. Each line contains two numbers: the first is the size of an run or intermediate run and the second is either 0 or 1. If the second is 0, this line represents an original run. If the the second is 1, this line represents an intermediate run. For the above example, the in-order traversal consists of the following lines:

    3    0
    7    1
    4    0
    16  1
    9    0

    The pre-order traversal has a similar format.

    As usual, you need to package your program, the outputs of three sample runs, and a 1-page report in a single zip file named isyyyyyHW02, where isyyyyy is your student identity number. Submit the whole file to ftp://140.113.166.31 port 332, name is ds-fall2005 and password is ED117 by January 11, 2006, Wednesday, 10 am.  We do not accept late homework unless you have a legitimate reason.

Local On-Line Information

  1. Two figures for threaded binary tree traversals: completeINORDER and completePREORDER.  
  2.  

Information on the Net


Wuu Yang, <wuuyang@cis.nctu.edu.tw>, EC332B, 03-5712121 ext. 56614.