T. Andrew Yang

Email: yang@uhcl.edu

Web page: http://sce.uhcl.edu/yang/

Tel.: (281) 283-3835

last updated:

3/5: assignment 2 posted

2/17: added part 2 of the hands-on labs

2/11: added the link to part 1 of the hands-on labs

1/21: added detailed requirements of the research project

1/20: first posted

 

 

CSCI 5533 Distributed Information Systems


Assignment 1

Assignment 2

Hands-on Labs

Research Project & Presentations


1.     Assignment 1

Total: 100 points

1.1.   (10 pts) Visit the class discussion group (link available in the syllabus page). Post a message with your full name as the subject line. In your post, briefly introduce yourself (including your full name) and name the one thing you most want to learn in this class. Throughout the semester, you should visit the discussion group regularly to check for recent announcements and reminders and to take part in discussions.

 

1.2.   Answer the following questions. Cite your source(s) when answering each of the questions. Note: Not all information published on the web is correct; verify the validity of the information you use.

a.      (10 pts) What is a ‘distributed information system’?

b.     (10 pts) What is a ‘distributed database system’?

c.      (10 pts) What is the relationship between ‘distributed information system’ and ‘distributed database system’?

d.     (10 pts) What is the relationship between ‘distributed information system’ and ‘parallel computing system’?

e.      (10 pts) What is the relationship between ‘distributed database system’ and ‘multidatabase system’?

f.       (10 pts) Is the Internet a distributed information system? Justify your answer.

g.      (10 pts) Explain the difference between query processing in a centralized database system and query processing in a distributed database system.

1.3.   The following questions are based on the relations Book and Author as shown below.

Relation: Book

Book_ID   Book_Name                            Book_Price   Author_ID
111       An Interesting Book                  35.95        1001
222       A Travel Guide                       11.50        1002
333       How to Build a Shed                  55.90        1003
444       A Nature Lover’s Guide               39.99        1003
555       How to Become an Effective Learner   99.99        1002
666       Here Comes the Magician              123.00       1003

 

Relation: Author

Author_ID   Author_Name
1001        John Doe
1002        A.J. Smith
1003        Peter Wong
1004        Jane Chaudhary

 

h.     (10 pts) Explain what natural join means. Show the result of Author ⋈ Book. (natural join)

i.       (10 pts) Explain what inner join means. Show the result of Author ⟕ Book. (left outer join)
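The two join operators named in questions h and i can be tried out with a short script. The following is a minimal sketch, assuming Python's built-in sqlite3 module; the relations Dept and Emp and their contents are hypothetical and are used in place of Book and Author so that the questions above remain an exercise.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with two small, hypothetical relations."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE Dept (Dept_ID INTEGER, Dept_Name TEXT);
        CREATE TABLE Emp  (Emp_ID INTEGER, Emp_Name TEXT, Dept_ID INTEGER);
        INSERT INTO Dept VALUES (10, 'Sales'), (20, 'Research'), (30, 'Legal');
        INSERT INTO Emp  VALUES (1, 'Ann', 10), (2, 'Bob', 20), (3, 'Cy', 10);
        """
    )
    return conn


def natural_join(conn):
    # NATURAL JOIN matches rows on every column name the two tables share
    # (here only Dept_ID) and keeps a single copy of that column.
    return conn.execute(
        "SELECT * FROM Dept NATURAL JOIN Emp ORDER BY Emp_ID"
    ).fetchall()


def left_outer_join(conn):
    # LEFT OUTER JOIN keeps every Dept row; a department without a matching
    # Emp row is padded with NULL (None in Python).
    return conn.execute(
        "SELECT * FROM Dept LEFT OUTER JOIN Emp USING (Dept_ID) "
        "ORDER BY Dept_ID, Emp_ID"
    ).fetchall()


if __name__ == "__main__":
    conn = build_demo_db()
    for row in natural_join(conn):
        print("natural join:    ", row)
    for row in left_outer_join(conn):
        print("left outer join: ", row)
```

In this hypothetical data, the department 'Legal' has no employees, so it is missing from the natural join but appears once, padded with NULLs, in the left outer join.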

 

Go to the Index

 

2.     Assignment 2

Total: 100 points

2.1.   (10 pts) In designing a multidatabase system architecture, mediators and wrappers are often used. Explain what a mediator is, what a wrapper is, and their respective roles in a multidatabase system.

2.2.   Problem 3.1 from the textbook.

Given relation EMP as in Figure 3.3, let p1: TITLE < "Programmer" and p2: TITLE > "Programmer" be two simple predicates. Assume that character strings are ordered alphabetically.

2.2.1.    (10 pts) Perform a horizontal fragmentation of relation EMP with respect to {p1, p2}.

2.2.2.    (10 pts) Explain why the resulting fragmentation (EMP1, EMP2) does not fulfill the correctness rules of fragmentation.

2.2.3.    (15 pts) Modify the predicates p1 and p2 so that they partition EMP in a way that obeys the correctness rules of fragmentation. To do this, modify the predicates, compose all minterm predicates and deduce the corresponding implications, and then perform a horizontal fragmentation of EMP based on these minterm predicates.

2.2.4.    (15 pts) Finally, show that the result has completeness, reconstruction and disjointness properties.

Fig. 3.3 from the book: Modified Example Database
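The three correctness rules mentioned in 2.2.2 through 2.2.4 (completeness, reconstruction, disjointness) can be checked mechanically for any horizontal fragmentation. The following is a minimal sketch; the function names and the tiny relation of (ID, QTY) tuples are hypothetical and are not taken from the textbook or from Figure 3.3. A relation is assumed to be a list of distinct tuples.

```python
def horizontal_fragments(relation, predicates):
    """Return one fragment (a list of tuples) per selection predicate."""
    return [[t for t in relation if p(t)] for p in predicates]


def is_complete(relation, fragments):
    # Completeness: every tuple of the relation appears in some fragment.
    return all(any(t in f for f in fragments) for t in relation)


def is_disjoint(fragments):
    # Disjointness: no tuple appears in more than one fragment.
    seen = set()
    for fragment in fragments:
        for t in fragment:
            if t in seen:
                return False
            seen.add(t)
    return True


def reconstructs(relation, fragments):
    # Reconstruction: the union of all fragments gives back the relation.
    union = set().union(*map(set, fragments))
    return union == set(relation)


# Hypothetical relation of (ID, QTY) tuples.
ITEMS = [(1, 5), (2, 10), (3, 20)]

# Strict comparisons on both sides leave out the tuple with QTY == 10.
strict = horizontal_fragments(
    ITEMS, [lambda t: t[1] < 10, lambda t: t[1] > 10]
)

# Making one comparison non-strict covers every tuple exactly once.
covering = horizontal_fragments(
    ITEMS, [lambda t: t[1] < 10, lambda t: t[1] >= 10]
)

if __name__ == "__main__":
    for name, frags in (("strict", strict), ("covering", covering)):
        print(name, is_complete(ITEMS, frags), is_disjoint(frags),
              reconstructs(ITEMS, frags))
```

The same three checks can be applied to fragments of EMP once the relation from Figure 3.3 is entered as a list of tuples.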

2.3.   (15 pts, Distributed Concurrency Control) Which of the following histories are conflict equivalent?

H1 = {W2(x);W1(x);R3(x);R1(x);W2(y);R3(y);R3(z);R2(z)}

H2 = {R3(z);R3(y);W2(y);R2(z);W1(x);R3(x);W2(x);R1(x)}

H3 = {R3(z);W2(x);W2(y);R1(x);R3(x);R2(z);R3(y);W1(x)}

H4 = {R2(z);W2(x);W2(y);W1(x);R1(x);R3(x);R3(z);R3(y)}

2.4.   (10 pts, continued from above) Which of the above histories, H1-H4, are serializable?
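Answers to 2.3 and 2.4 can be double-checked with a small script. The following is a minimal sketch, assuming the usual textbook definitions: two operations conflict when they belong to different transactions, access the same data item, and at least one of them is a write; a history is conflict serializable when its precedence graph is acyclic. The function names and the histories HA, HB and HC are hypothetical, and each operation is assumed to occur at most once per history.

```python
import re
from itertools import combinations


def parse(history):
    """Turn 'W2(x);R1(x)' into [('W', 2, 'x'), ('R', 1, 'x')]."""
    return [(op, int(txn), item)
            for op, txn, item in re.findall(r"([RW])(\d+)\((\w+)\)", history)]


def conflicts(ops):
    """Ordered pairs (a, b) of conflicting operations, a preceding b."""
    pairs = set()
    for a, b in combinations(ops, 2):  # combinations keeps history order
        if a[1] != b[1] and a[2] == b[2] and "W" in (a[0], b[0]):
            pairs.add((a, b))
    return pairs


def conflict_equivalent(h1, h2):
    # Same operations, and every conflicting pair ordered the same way.
    o1, o2 = parse(h1), parse(h2)
    return sorted(o1) == sorted(o2) and conflicts(o1) == conflicts(o2)


def conflict_serializable(history):
    """True when the precedence graph of the history has no cycle."""
    ops = parse(history)
    edges = {(a[1], b[1]) for a, b in conflicts(ops)}
    nodes = {op[1] for op in ops}
    # Repeatedly remove transactions that have no incoming edge left.
    while nodes:
        free = {n for n in nodes
                if not any(v == n and u in nodes for u, v in edges)}
        if not free:
            return False  # the remaining transactions contain a cycle
        nodes -= free
    return True


# Hypothetical histories over two transactions.
HA = "W1(x);R2(x);W2(y);R1(y)"
HB = "W1(x);R2(x);R1(y);W2(y)"
HC = "W1(x);R1(y);R2(x);W2(y)"

if __name__ == "__main__":
    print(conflict_equivalent(HB, HC), conflict_equivalent(HA, HB))
    print(conflict_serializable(HA), conflict_serializable(HB))
```

The parser accepts histories written in the notation used above, so H1 through H4 can be passed in as strings.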

2.5.   (15 pts, Distributed DBMS Reliability) Briefly describe the various implementations of the process pairs concept. Comment on how process pairs may be useful in implementing a fault tolerant distributed DBMS.

 

Go to the Index

 

Hands-on Lab

Note: The lab may be completed individually or by a team of two. A team may consist of no more than two persons.

 

Preliminary description:

The project involves the design and implementation of a sample distributed information system, in which data are distributed over multiple sites. Completing the project gives students the opportunity to apply the concepts and algorithms learned in class.

 

Project: Distributed Clusters in ElasticSearch (Click each part to find detailed instructions.)

                        Part 1                          Part 2

NOTE: If the PDF document does not automatically show up in the browser when you click the link above, check your downloads folder or switch to another browser.

 

Go to the Index

 

 

Research Project and Presentations

Note: This is an individual project.

 

The goal of this project is for you to consult refereed publications (as well as some relevant web sites) and perform a detailed investigation of a chosen research topic. The topic should align with your team project topic, so that you investigate literature and resources related to it. If you need suggestions when choosing the topic, feel free to discuss it with the instructor.

 

Each person should create and maintain a distinct discussion thread in the discussion group by responding to the instructor's post named "Research projects should be posted here". Update the progress of your project weekly in that thread. How well you maintain your discussion thread is part of the grading.

A benefit of this approach is that both the instructor and your classmates will be able to view your progress and, if applicable, share their thoughts and comments.

 

Items to be submitted:

 

  1. (Before the midterm) The abstract & literature survey

A preliminary abstract of your presentation topic is due early in the semester. Check the syllabus for the due date. Each student should publish his/her abstract on the class discussion board by the due date.

The abstract should be 1-2 pages long and contain the following sections:

(1)   Class name (i.e., CSCI 5533 Distributed Information Systems)

(2)   Your name and an email address that you check regularly (that is, at least once a day)

(3)   Topic of your investigation

(4)   General description of the topic

(5)   Why is the topic related to distributed information systems?

(6)   Survey of related work

Discuss at least three articles related to your chosen topic.

VERY Important: Make sure you properly cite the work of other researchers or professionals. Visit http://sce.uhcl.edu/yang/citing.htm for more information about cited references.

Warning: Missing or improperly cited references in your abstract and final report will result in a poor score for your research project.

(7)   A tentative outline (agenda) of your final report. That is, the sections/subsections that you plan to include in the final paper.

 

  2. (Last three weeks) Make a 15-minute oral presentation of your completed project. Use the CSE Oral Communication Rubric when making the oral presentation.

 

  3. (Right after the last class meeting) The final written report

1.      The written report should include your findings about the chosen topic.

2.      A draft of the final report should be published in the class discussion group to solicit comments from your classmates and the instructor.

Warning: Missing or improperly cited references in your abstract and final report will result in a poor score for this assignment.

3.      The following is a suggested outline of your final report:

                                          i.     Title

                                        ii.     Your name (and email address)

                                       iii.     An abstract (50-100 words)

                                       iv.     Introduction to the topic

                                         v.     Significance of the chosen topic with respect to distributed information systems

                                       vi.     Survey of related work

                                     vii.     Implemented demonstrations, if applicable.

                                    viii.     Your findings

                                       ix.     Future work: research ideas and projects possibly related to the topic

                                         x.     Conclusion

                                       xi.     Appendix (if any)

 

Go to the Index