NCTU CIS: Data Structures (Fall 2005)
Special Announcements:
1. 1st project due on November 9, 2005. See below.
2. 2nd project due on December 7, 2005. See below.
3. 3rd project due on January 11, 2006. See below.
4. CURRENT GRADES. Please check carefully about your midterm exam.
DS051117.xls.
Course Information
Instructor
Wuu Yang,
<wuuyang@cis.nctu.edu.tw>,
EC332B, 03-5712121 ext. 56614. Office Hours: Tu 1:30-2:30.
If you want to talk to me at other time, please make an appointment.
Teaching Assistants
黃信達、劉宸瑋、張志誠 、陳冠志、陳癸夫、黃柏翰、謝文均 at phone 56649.
Lectures
1G 3CD ED027.
Course URL
http://www.cis.nctu.edu.tw/~wuuyang/Lecture/lecture.DataStructure.html
Pre-requisites
The students are expected to have some experiences in programming and with the C
programming language.
Course Description
Data structure is the second course in computer programming. We will
study arrays and structures, stacks and queues, lists, trees, graphs, sorting,
hashing and heaps.
This course is an introduction to the use, design, and analysis of data
structures in computer programs. We will begin with a reflection on the software
construction process. Then we will discuss what an algorithm is and the use of
data abstraction in software construction. Then we will talk about the
evaluation of algorithms. Here we are interested in both the theoretical
analyses and practical measurements. This completes our introduction to data
structures.
In the following chapters, we will discuss arrays, stacks, queues, lists,
trees, and graphs. These are commonly used data structures in almost every
computer program. Mastering these basic data structures will greatly enhance our
ability to write elegant programs.
Sorting and hashing are important topics in the study of algorithms. They are
also closely related to the design of data structures. We will spend one week to
discuss various sorting and hashing techniques, respectively. In the last two
chapters, we will discuss heap and search structures.
Since there are 10 chapters in the textbook, we will discuss at least one
chapter each week. We plan to implement 4 programs and have both a midterm and a
final exam. In both classroom discussion and programming, we will use the C
programming language. If most people are not familiar with the C language, we
may discuss in more details the example programs in the textbook.
Textbook
Ellis Horrowitz, Sartaj Sahni, and Susan Anderson-Freed, Fundamentals of Data
Structures In C, Computer Science Press, 1993.
(REFERENCE ONLY) Frank M. Carrano and Walter Savitch, Data Structures
and Abstractions with Java, Pearson Education, 2003.
(REFERENCE ONLY) B.W. Kernighan and D.M. Ritchie, The C Programming
Language, Prentice-Hall, 1978.
Exams
IT IS RECOMMENDED THAT YOU USE CHINESE TO ANSWER QUESTIONS IN THE EXAMS.
YOU MAY ALSO USE ENGLISH IF NECESSARY. DO NOT USE ANY OTHER LANGUAGES.
Midterm: November 9, Wednesday, 10-12 am, 2005.
Final: January 11, Wednesday, 10-12 am, 2006.
Grading (tentative)
There will be two exams (40% each) and many short homeworks
and programming assignments (15%). Classroom discussion counts as 5% toward the final score.
This grading standard may change as the semester goes.
Current Homeworks
- Due September 28, Wednesday, 2005.
Please analyze the complexity of program 2.9
(sparse matrix multiplication) with the sparse
matrix representation discussed in the class.
You need to hand in the written homework before
class begins on 9/28.
- Due October 12, Wednesday, 2005. Please
design an algorithm for the longest common
subsequence problem and analyze it.
- Due October 31, Monday, 2005. Please design an
algorithm for inserting a new node into a threaded
binary tree as a left child.
See section 5.5.
Past Homeworks
-
September 29, 2004 Wednesday, (1) Analyze program 2.9 on p. 75. (2) Design
and discuss another data structure for sparse matrices.--蔡柏良助教
-
October 6, 2004, Wednesady, Design an algorithm for the longest common subsequence problem. Analyze your algorithm.
--彭垂業助教
-
October 13, 2004 Wednesday, Modify the algorithms for transforming infix form
into postfix form and for evaluating a postfix expression. We need to
take into account the associativity (as well as precedence) of operators.
--許志行助教
-
October 20, 2004, Wednesday, Can you find another LCS algorithm that is better
than the one in your homework? Why not? (CHECK YOUR BOOK AGAIN.)
--劉宸瑋助教
-
October 25, Mo, 3:30, (1) Write down a post-order traversal on a threaded tree.
Equivalently, write down the postsucc function which returns the
post-order successor of a node on a threaded tree.
(2) Insertion to the left on threaded trees (slide 21, chap 5).
--黃信達助教
Projects
-
(November 9, 2005, Wednesday, 10 am) In this project, we are going to
implement (1) input two sparse matrices, each in a file, (2) compute their
product, and (3) output the number of entries that are even integers in the
product.
Here are data of
5 sparse matrices.
The first line of every file indicates the dimension of the matrix. In
the five samples, all matrices are 1000x1000. But your program should
be able to process other dimensions. You may modify the program in the text
book for reading a matrix.
If there are duplicate entries, keep only the last one. Discard all the remaining duplicate entries.
Your program should also report the duplicate entries.
You need to package your program, the outputs of three sample runs, and
a 1-page report in a single zip file named isyyyyyHW01, where isyyyyy is
your student identity number. Submit the whole file to
ftp://140.113.166.31 port 332, name is ds-fall2005 and password is ED117 by November
9, 2005, Wednesday, 10 am. We do not accept late homework unless
you have a legitimate reason.
-
(December 7, 2005, Wednesday, 10 am) In this project, write TWO programs to
compute the minimal spanning trees of a graph, one with the usual transitive closure method
(OR ANY OTHER METHOD YOU CAN
THINK OF) and
the other with the union-find method. Output of the programs is
the total weight of the minimal spanning tree. Run each program three times and
compute the average of the running time of each on the input data.
The input data is located
here.
It is a graph, which is shown in this
figure
It is a Delaunay Triangulation.
Open the ps file and you can know what it is all about.
First line is the number of vertices.
2nd line is the number of edges.
Vertices are given by its coordinates.
Since they are from image, so the coordinates are int.
Edges are specified by two integers i and j that means the line segment
has one end the ith point and the other end the jth point.
Weight is the Euclidean distance, the length of the line segment.
As usual, you need to package your program, the outputs of three sample runs, and
a 1-page report in a single zip file named isyyyyyHW02, where isyyyyy is
your student identity number. Submit the whole file to
ftp://140.113.166.31 port 332, name is ds-fall2005 and password is ED117 by
December 7, 2005, Wednesday, 10 am. We do not accept late homework
unless you have a legitimate reason.
In the report, you should analyze the running time and give some comments about the programs.
-
(January 11, 2006, Wednesday, 10 am) In this third project, we want to
implement the algorithm for optimal merging of runs in section 7.11.5
(Optimal Merging of Runs). The input data file is located
here. The output consists of two traversals of the merge
tree: both the pre-order and the in-order traversals. We should be
able to reconstruct the optimal merge tree from the two traversals.
The input consists of n lines, one line for the size of a run.
Here is an example input file (with three lines):
3
4
9
The output also consists many lines. Each line contains two numbers:
the first is the size of an run or intermediate run and the second is either 0 or 1.
If the second is 0, this line represents an original run.
If the the second is 1, this line represents an intermediate run.
For the above example, the in-order traversal consists of the following lines:
3 0
7 1
4 0
16 1
9 0
The pre-order traversal has a similar format.
As usual, you need to package your program, the outputs of three sample runs, and
a 1-page report in a single zip file named isyyyyyHW02, where isyyyyy is
your student identity number. Submit the whole file to
ftp://140.113.166.31 port 332, name is ds-fall2005 and password is ED117 by
January 11, 2006, Wednesday, 10 am. We do not accept late homework
unless you have a legitimate reason.
Local On-Line Information
- Two figures for threaded binary tree traversals:
completeINORDER and
completePREORDER.
-
Information on the Net
Wuu Yang, <wuuyang@cis.nctu.edu.tw>, EC332B,
03-5712121 ext. 56614.