\documentclass[11pt]{article}
\usepackage{amssymb}
\usepackage{amsmath}
\usepackage{fullpage}
\usepackage{enumerate}
\usepackage{titlesec}
\usepackage{url}
\setlength{\parindent}{0cm}
\newcommand{\bit}{\begin{itemize}}
\newcommand{\eit}{\end{itemize}}
\newcommand{\ben}{\begin{enumerate}}
\newcommand{\een}{\end{enumerate}}
\newcommand{\join}{\Join}
\newcommand{\wt}[1]{{\bf [#1~pts]}}
%%%%%
\begin{document}
\titleformat*{\section}{\LARGE\bfseries}
\titleformat*{\subsection}{\Large\bfseries}
\titleformat*{\subsubsection}{\large\bfseries}
\titleformat*{\paragraph}{\large\bfseries}
\titleformat*{\subparagraph}{\large\bfseries}
\section*{Comp115 Spring 2017, HW4 (bonus)}
\subsection*{Due April 11th, 2017}
\subsection*{Submit via provide using:}
\texttt {\$ provide comp115 115HW4-QEval }
or online at: \url{https://www.cs.tufts.edu/comp/115/provide.cgi}
\vspace{2mm}
Consider the join R$\join_{R.a=S.b}$S, given the following information about the relations to be joined. The cost metric is the number of page I/Os unless otherwise noted. The cost of writing out the result should be uniformly ignored, but bear in mind that one output buffer should be used for producing the result.
\\
\\
Relation R contains 200,000 tuples and has 20 tuples per page.\\
Relation S contains 4,000,000 tuples and also has 20 tuples per page.\\
Attribute a of relation R is the primary key for R.\\
Each tuple of R joins with exactly 20 tuples of S.\\
Both relations are stored as simple heap files.\\
Neither relation has any indexes built on it.\\
1002 buffer pages are available.\\
\ben
\item What is the cost of joining R and S using a page-oriented simple nested loops join? What is the minimum number of buffer pages required for this cost to remain unchanged? \wt{10}
\item What is the cost of joining R and S using a block nested loops join? What is the minimum number of buffer pages required for this cost to remain unchanged? \wt{10}
\item What is the cost of joining R and S using a sort-merge join? What is the minimum number of buffer pages required for this cost to remain unchanged? \wt{20}
\item What is the cost of joining R and S using a hash join? What is the minimum
number of buffer pages required for this cost to remain unchanged?
\wt{20}
\item What would be the lowest possible I/O cost for joining R and S using any join
algorithm, and how much buffer space would be needed to achieve this cost?
Explain briefly. \wt{15}
\item How many tuples does the join of R and S produce, at most, and how many pages
are required to store the result of the join back on disk?
\wt{15}
\item Would your answers to any of the previous questions in this exercise change if you were told that S.b is the primary key for S? \wt{10}
\een
\end{document}