Database Systems - Homework Info Page



HW1
HW2
HW3

Note: there is also a software page available.

Homework 1

For questions 10 and 11 you'll need to load extra rows in to the Course and Employee tables. The data to load can be found at the following two files:

course_extra_data.txt
employee_extra_data.txt

The files are tabbed dilimited. MySQL supports this format as a default. If you are using a different database server you might need to specify that a tab is used as a delimiter, while importing the data.

To import the files use the "LOAD DATA" command. Documentation for this command can be found here. For example to load the extra course data you would connect with the mysql client to the database you are using and then run the following command:

load data local infile 'employee_extra_data.txt' into table Employee;

Note: the example above uses the "local" keyword, which indicates to the client to search for the file localy relativly to where the client is executed. For the example above you would need to have the file employee_extra_data.txt placed in the current working directory from which the client was executed.

Data types: When creating the tables all string data types can be defined as varchar(50). The "Salary" and "Grade" fields can be defined as int.

Solution: sql queries, relational algebra

Homework 2

For question 7: the file hispo.xml is part of the Galax distribution. See the software page for instructions on installing Galax. If you don't want to use Galax you can get the file here.
Note: the answer should include both the query and the result. The result should display both the partNum and its associated total sales.

Solution: hw2_solution.txt

Homework 3

Question 1:

All files provided are in the format of:
operation data
One operation per line. Where operation is one of either i for insert or d for delete and data is a consecutive string. For example:
i 3nwCQP59l3KyhDC8nKkP
Note: the data (the second parameter supplied) is to be used as the key for delete operations and for insert operations as both the key and value. The data to load file contains only insert operation. It is available at zip format here or tar.gz format here.
There are two operations files: operations1_data.txt and operations2_data.txt.

Guidelines:
  • You may program the application in either C, C++ or Java. If you choose C or C++ it must be able to compile with a free compiler such as gcc and run on either the department machines, Linux or on a Cygwin environment.
  • Beyond the writeup of your experiments and findings, the source code must be submitted electronically by email. All files should be packaged together in either a zip or tar.gz archive.
  • You should include in the archive a Makefile to use for compiling your program. It should be possible to compile the program by simply using make.
  • You should include in the archive a README.txt file stating how to run the program and any special notes.
  • If you used a non standard library please include it in your archive file.
  • When running your program it should accept 3 parameters: <insert_data file name> <operations file name> <H|T> . The third parameter indicates if the program should use a hash table (H) or the Red-Black Tree (T). If the program is run without the proper parameter the program should print out a usage message. After the program completes it should print out the results.
  • Timing experiments: The timing should be reported in milliseconds. Load the insert data and then for each operations file load the operations into memory and then perform in a loop 10,000 iterations of the operations. Report the results for each of the data structures (hash, red-black tree) and provide your conclusion and analysis for the results. Overall you'll be reporting 4 test cases:
    1. operation1 file - Hash
    2. operation2 file - Hash
    3. operation1 file - Red Black Tree
    4. operation2 file - Red Black Tree

Question 2:

The data files, in mysql tab delimited format, are available here:
course_hw3.txt
employee_hw3.txt

If you find that the above data set is too small (you don't get a significant difference between with and without indexes) you may use the following dataset which is larger.
hw3_large_data.zip (100,000 lines in Employee table)
Or you may choose to use the following data set which is huge:
hw3_huge_data.zip (1,000,000 lines in Employee table)

Note: the question requests: "Show the script ...". This means you need to show all sql statements you execute and their results.

Question 3:

Use the same file employee_hw3.txt from question 2. If you don't see a significant difference you may choose to use either the large or huge data set from question 2.

Question 4:

To generate the data use a scale factor of 0.02. No need to specify any other parameters. Thus, to generate the data use the following command:
dbgen -s 0.02
You can also get the table data here: tpch_table_data.zip.
The data is separated using the "|" character. Thus to load the data in mysql you should use a command similar to:
LOAD DATA LOCAL INFILE 'supplier.tbl' INTO TABLE tpch.SUPPLIER FIELDS TERMINATED BY '|';
Solution: hw3_solution.txt

Database Systems Spring 2006