% Jim Teresco
% Williams College, Mount Holyoke College, Siena College, The College
% of Saint Rose
%
% Last modified: Mon Dec  4 19:03:24 EST 2017
%
%
% Modified by: Pat Baumgardner, Adam Dachenhausen, Shah Syed
%
\documentclass[12pt]{article}
% extra packages to bring in
\usepackage{latexsym}
\usepackage{graphicx}      % extended graphics package
\usepackage{epsfig}        % wrapper for graphicx package
\usepackage{times}
\usepackage{url}
\usepackage{hyperref}
% set some margins, these can be defined as in, cm, pt
\setlength{\topmargin}{-0.5in}
\setlength{\textheight}{9in}
\setlength{\oddsidemargin}{0in}
\setlength{\evensidemargin}{0in}
\setlength{\textwidth}{6.5in}

% a few macros that might be useful -- any time we type \eg it expands
% to the italicized version defined here
\newcommand{\etal}{{\it et al}.$\:$}
\newcommand{\eg}{{\it e}.{\it g}.$\:$}
\newcommand{\cf}{{\it cf}.$\:$}
\newcommand{\ie}{{\it i}.{\it e}.$\:$}

%% to remove page numbers, uncomment this:
%% \pagestyle{empty}

%% Define single-space command
\newcommand{\singlespace}{
  \protect\renewcommand\baselinestretch{1.0}
  \protect\normalsize
}
% use this instead if you want to disable it completely:
%%\newcommand{\singlespace}{}

%% Define double-space command (really more like 1.5 spacing)
%% This is essential for rough drafts, and not a bad idea even for
%%  final submissions
\newcommand{\doublespace}{
  \protect\renewcommand\baselinestretch{1.5}
  \protect\normalsize
}
% use this instead if you want to disable it completely:
%%\newcommand{\doublespace}{}

% This tells latex we're done defining the preamble stuff and we're
% ready to start writing the document
\begin{document}

% this removes the date that is automatically placed in the title.
% comment it out if you want the date
\date{}

% the next few items define things to go onto the title section, like,
% well, the title, and the list of authors
\title{Java Based Memory Latency Simulator}

\author{Pat Baumgardner, Adam Dachenhausen, Shah Syed\\
Department of Computer Science\\
Siena College\\
Loudonville, NY  12211
}

% this tells latex that you're done setting up title stuff and that it
% should go ahead and generate the title here
\maketitle
% leave the page number off but for this page only
\thispagestyle{empty}

% Abstract!
\begin{abstract}

Insert abstract here.
  
\end{abstract}

% turn on double spacing
\doublespace

% Now we write the text of the paper, hopefully breaking it up into
% nice sections and subsections, using figures and tables as
% appropriate, and referring to those sections using labels instead of
% trying to number things by hand
\section{Overview}
\label{sec:overview}

The goal of this project was to create a computer memory latency simulator. 
We chose Java as the implementation language for its ease of use. The 
simulator models data transfer from one system component 
to another, for example from a hard disk drive to main memory. 

In Section~\ref{sec:memlate} we discuss abstractly how the memory latency 
simulator works and why we built it. Section~\ref{sec:build} describes
how to build and run the simulator. In Section~\ref{sec:expstats} we explain
the types of information collected from the simulator. Then, in Section~\ref{sec:data},
we present the data collected from the simulator alongside 
real-world data. Section~\ref{sec:disc} explains any discrepancies 
found between our data and the real-world data. Finally, in Section~\ref{sec:conclusions}
we summarize our results, report what
we have learned, and suggest how this simulator could be
improved or used. 

\section{Simulating memory latency}
\label{sec:memlate}

\subsection{Abstract Definition}
The goal of this project was to create a Java-based simulator that gives the user
fairly low-level access to artificial data, so that the user can gather statistics and trends
from moving this data around the artificial system. The simulator consists of four
primary parts: memSim, which is essentially the controller; a CPU with one level of cache;
one or more sticks of RAM; and one or more hard disk drives.

The storage of each component (CPU, RAM, HDD) is represented by a byte array.
The user can then manipulate the data on any component through the memSim terminal interface.
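
As a rough illustration of the byte-array representation, the following minimal
sketch shows how the storage of one component could be modeled. It is not the
simulator's actual code: the class name \texttt{ByteStore} and its methods are
hypothetical and were chosen only for this example.
{\singlespace
\begin{verbatim}
import java.util.Arrays;

// Hypothetical sketch: a storage component backed by a byte array.
// The names are illustrative and not taken from the simulator.
final class ByteStore {
    private final byte[] cells;

    ByteStore(int sizeInBytes) {
        cells = new byte[sizeInBytes];
    }

    int size() {
        return cells.length;
    }

    // Copy data into this component, starting at address.
    void write(int address, byte[] data) {
        System.arraycopy(data, 0, cells, address, data.length);
    }

    // Return a copy of length bytes, starting at address.
    byte[] read(int address, int length) {
        if (address < 0 || length < 0
                || address + length > cells.length) {
            throw new IndexOutOfBoundsException("bad range");
        }
        return Arrays.copyOfRange(cells, address, address + length);
    }

    // Copy a block from this component into another one.
    void moveTo(ByteStore target, int from, int to, int length) {
        target.write(to, read(from, length));
    }
}
\end{verbatim}
}
With such a class, a simulated hard drive and a RAM stick would simply be two
\texttt{ByteStore} instances of different sizes, and a move between them is a
copy between their arrays.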

\subsection{Primary Audience}
Today, there are many levels of abstraction between the high-level user and their hardware.
We wanted to provide a way to collect data at this low level of hardware without having
to be an electrical engineer working on the circuitry.

In our research, we also found very little publicly available data of this type, so
this simulator could be useful for researchers and students alike to collect
data for a variety of projects.

\section{Building and Running the Simulator}
\label{sec:build}

\subsection{Java}
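The simulator is written in Java, so you will need to have
Java installed to run it. Compiling it from source additionally requires a
Java Development Kit (JDK), which provides the \texttt{javac} compiler.
See \url{https://www.java.com/en/download/} for more information.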

\subsection{Acquiring Source Code}
To download the simulator code, go to \url{osfinal.dachenhausen.org} 
or
\url{https://github.com/adamdachenhausen/cs330FinalProject} 
and clone the repository. See \url{https://docs.github.com/en/free-pro-team@latest/github/creating-cloning-and-archiving-repositories/cloning-a-repository}
for more.

\subsection{Compiling}
\begin{itemize}

\item \textbf{Command Line}
  The source code includes two ways to compile the simulator.
  \begin{itemize}
  \item \textbf{Make}
  The simulator source code includes a Makefile, so if Make is installed, the command
  \begin{verbatim}make\end{verbatim}
  will compile the simulator.
  \item \textbf{Default}
  In the absence of Make, the default way to compile the simulator is to run  
  \begin{verbatim}javac *.java\end{verbatim}
  which will compile all of the Java files so they can be run.
  \end{itemize}
\item \textbf{Using an IDE}
  Given the multitude of IDEs that are available, please see your
  specific IDE's manual for how to compile and run the simulator.
  
\end{itemize}

\subsection{Licensing}
Before running the simulator, we would like to remind you that this
project is released under the MIT License, which provides the software
without warranty, so proceed at your own risk.

\subsection{Running the Simulator}
\begin{itemize}
\item \textbf{Command Line}
  Regardless of how the simulator was compiled, the command
\begin{verbatim}
  java memSim
\end{verbatim}
  will start the simulator.
\item \textbf{IDE}
  Given the multitude of IDEs that are available, please see your
  specific IDE's manual for how to compile and run the simulator.
\end{itemize}
Upon running the simulator, you will be prompted for a variety of 
simulator parameters. Each of these is crucial, and cannot be left blank.
{\singlespace
\begin{verbatim}
How many Hard Drives would you like?
1
How many platters should each hard drive have?
4
And how big (in bytes) should each one be?
1024
How many RAM sticks would you like?
2
And how big (in bytes) should each one be?
128
Finally, how big (in bytes) would you like your CPU cache to be?
32
\end{verbatim}
}
After the simulator is set up, each component will start up, and
you will be prompted to choose an option from the menu. You can select
an option either by its menu number or by its name.
{\singlespace
\begin{verbatim}
Java Based Memory Latency Simulator
Developed by Pat Baumgardner, Adam Dachenhausen, Shah Syed
Supported commands:
move
read
write
help
exit
For help with a specific command type 'help [command]'
\end{verbatim}
}

\section{Explanation of Data Gathered}
\label{sec:expstats}


The primary aim of our study was to gather statistics about the
events, latencies, and delays that occur at the lowest level of memory management.
Memory management, as the name suggests, is the act of managing memory at the low level of
a computer, typically by allocating portions of memory and freeing them once they are
no longer in use.

If we treat memory management as the broad category, then the statistics that we gathered, such
as the latency of memory operations, describe one aspect of it.
Latency is the delay, or the time it takes for data to be read from memory.
Because signals cannot travel faster than the speed of light in
the medium that carries them, there is always some delay, however imperceptible
to a human, and that delay grows with the size of the data.

We gathered the latency, which is the time delay that occurs during data transfer, reading, and
writing at the lowest level of memory management. This was the primary statistic we considered
when building the simulator.

To represent this level of memory management, the simulator was designed around a few key components.
These are the components linked directly to memory usage: the
central processing unit (CPU), the hard drive, and the RAM sticks.

The CPU contains cache. Modern CPUs have at least two levels of cache, L1 and L2,
where L1 is much smaller than L2. L1 typically ranges from 2 KB to 64 KB, whereas L2
is in the range of 256 KB to 2 MB. The CPU uses cache for quicker access to the data that it
uses frequently; keeping that data in cache decreases latency. In the simulator, however, we
implemented a single level of cache with a variable size, which is input by the user.

Further down the memory hierarchy is the RAM, which is slower than the CPU cache but larger
by a wide margin. RAM, which stands for random access memory, is a volatile form of memory
built from many memory cells. There are two types of RAM, DRAM and SRAM, although we did not distinguish
between them in our simulator. Because RAM is slower than the CPU cache, it takes more time to read
and move data from it. In the simulator we kept the size and the number of RAM sticks variable, so
the user inputs a value for each. Because RAM is volatile, it loses all of its data once
power is removed. When the power is turned on, many important sections of the OS are read from
the hard drive and placed in RAM so that important functions can be accessed quickly instead of being read
from the hard drive, which is far slower. This OS data, along with other data, makes up much of the RAM usage while the
computer is running. Running other processes leads to more reading from and writing to RAM, which leads to different
delays.

Finally comes the hard drive (HDD), which is the largest form of permanent storage but also the slowest. It is
extremely large compared to both RAM and cache. Almost all software, including the OS, is stored on
the HDD. HDDs have evolved in many ways over time and have become quicker, but they have
always remained slower than RAM and CPU cache. In our simulator we built the HDD on arrays of bytes,
just like the RAM and cache. 

The movement, reading, and writing of data on each component was timed, which allowed
us to collect the main statistics we needed. The latency, which was the
major finding of the simulation, was measured with a stopwatch implemented
inside our program. The recorded time is a scaled projection of the time the operation would
take at the hardware level; that is, the measured time was reduced by a ratio suited to the
level of detail modeled in the simulator.
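
A stopwatch of this kind can be sketched with \texttt{System.nanoTime()}. The
following is a minimal illustration only, not the simulator's actual code: the
class name \texttt{TransferTimer}, the method \texttt{timedCopy}, and the
scaling constant \texttt{SCALE} are hypothetical, and the value of
\texttt{SCALE} is an assumed placeholder rather than the ratio used in our
measurements.
{\singlespace
\begin{verbatim}
// Hypothetical sketch of timing one simulated transfer.
// SCALE is an assumed ratio, not the simulator's real value.
final class TransferTimer {
    static final double SCALE = 1.0;

    // Copy length bytes from src to dst and return the elapsed
    // wall-clock time in nanoseconds, multiplied by SCALE.
    static double timedCopy(byte[] src, byte[] dst, int length) {
        long start = System.nanoTime();
        System.arraycopy(src, 0, dst, 0, length);
        long elapsed = System.nanoTime() - start;
        return elapsed * SCALE;
    }
}
\end{verbatim}
}
Timing the copy itself, rather than the surrounding menu handling, keeps the
measurement tied to the size of the data being moved.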

Memory, the core of the simulator and the primary part of each component,
was represented by arrays of type \texttt{byte}. As the simulation ran, the arrays in each component
were modified according to the event taking place. These modifications gave us data on how
the latency and delay in transferring, writing, and reading scale with the size of the data. The size
of the data was one of the major fields in the statistics gathered: we studied how much
variance the size of the data introduces in the measured time.

The variables in the simulation were the number of cores in the CPU,
the number of hard drives, the size of each hard drive in bytes, the number of RAM sticks, the size of
each RAM stick, and the size of the CPU cache, and we kept a record of each combination. However,
although the CPU supports a variable number of cores, we kept
it at 1 throughout this study. The simulator was structured to take in a value for each of the variables
except the number of cores. This led to new fields of data in the statistics gathered.
The size of the CPU cache, which is a small form of memory and faster than both RAM and the HDD, was
varied to yield different results, which also went into the table of statistics.
The same was done with the RAM, which is a relatively larger form of memory. We recorded how data access
and transfer times on RAM vary with different sizes, and how much slower or faster
accessing, editing, and transferring data to and from the hard drives was.

The number of RAM sticks and the number of hard drives were also important variables in determining the latency,
the access time, and the writing time, since increasing the number of these components meant more
capacity but a different access time.

\section{Data Gathered}
\label{sec:data}


Insert all the data gathered by the simulator and from sources here.

Latency in all three major components implemented in the simulator varies with their size,
which is part of the reason why cache, the smallest, is faster than both RAM and the HDD.

We start with the hard drive latency for the input and output of data, since the hard drive is the largest of the memories here.
The latency of a hard drive is a combination of its rotational latency (determined by its RPM), seek time, and transfer time. In the simulator that we
implemented, we did not go into enough detail to offer the RPM, seek time, and transfer time as
variables to the user. However, we implemented the HDD on data blocks instead of platters, which acted as
partitions.
According to Eric Shanks of ithollow.com, the following were the hard drive latencies in 2013 for different
RPM values.

\begin{tabular}{|c|c|}
  \hline
  RPM & Rotational Latency (ms)\\
  \hline
  5400 & 11\\
  \hline
  7200 & 8\\
  \hline
  10000 & 6\\
  \hline
  15000 & 4\\
  \hline
\end{tabular}
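
The values in this table are consistent with the time needed for one full
rotation of the platter, which follows directly from the rotation speed. As a
worked check, with the rotation speed given in revolutions per minute:
\[
  T_{\mathrm{full}} = \frac{60{,}000\ \mathrm{ms}}{\mathrm{RPM}}, \qquad
  \frac{60{,}000}{7200} \approx 8.3\ \mathrm{ms}, \qquad
  \frac{60{,}000}{15{,}000} = 4\ \mathrm{ms}.
\]
On average the desired sector is half a rotation away from the head, so the
average rotational delay is commonly estimated as $T_{\mathrm{full}}/2$, for
example about 2 ms at 15,000 RPM.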

Seek time is the time it takes for the read-write arm to move across the platter to the correct track.
The seek time itself consists of a series of phases. First is the acceleration phase, in which
the read-write arm starts to move; then comes the coasting phase, in which the arm moves at full speed. This
is followed by the deceleration phase, in which the read-write arm slows down. Finally comes the
settling phase, in which the read-write head is positioned over the data it needs.
The average seek time of hard drives of size 300 GB to 1 TB ranged from 4 ms to 9 ms
(source: pages.cs.wisc.edu, Operating Systems: Three Easy Pieces, Remzi Arpaci-Dusseau and Andrea Arpaci-Dusseau).

The last part of the total I/O latency for a hard drive is the transfer time, which is the time it takes for the
data to be transferred to or from the read-write head. Compared to rotational latency and seek time, transfer
time is insignificant for small requests, because the mechanical movements take far longer than the
electronic movement of data.

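Putting these together, we arrive at the following equation:

\[
  T_{I/O} = T_{\mathrm{rotation}} + T_{\mathrm{seek}} + T_{\mathrm{transfer}}
\]

where $T_{I/O}$ is the I/O latency of the HDD, $T_{\mathrm{rotation}}$ is the rotational latency,
$T_{\mathrm{seek}}$ is the seek time, and $T_{\mathrm{transfer}}$ is the transfer time.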

In a detailed calculation in Operating Systems: Three
Easy Pieces, Remzi Arpaci-Dusseau and Andrea Arpaci-Dusseau compare the I/O latency of two hard drives, a Cheetah and a Barracuda. The size of
the data being transferred was 100 MB. Another variable in the comparison was whether the reads and writes
were sequential or random. Sequential means that large chunks of contiguous data are read or written
in order, whereas random means that each request targets a different location somewhere on the disk,
which is typical of how applications access data. Note that the data we represented in the simulator was also sequential
and not random, as the hard drive we implemented was built from data blocks.

The I/O latency times, along with the other statistics mentioned above, were as follows (taken from the source): 


\begin{tabular}{|l|r|r|}
  \hline
  HDD & Cheetah 15K.5 & Barracuda\\
  \hline
  Capacity & 300 GB & 1 TB\\
  \hline
  RPM & 15,000 & 7,200\\
  \hline
  Average Seek & 4 ms & 9 ms\\
  \hline
  Platters & 4 & 4\\
  \hline
  $T_{I/O}$ Random & 6 ms & 13.2 ms\\
  \hline
  $T_{I/O}$ Sequential & 800 ms & 900 ms\\
  \hline
\end{tabular}
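
The random I/O figures in this table can be approximately reproduced from the
average seek time and the rotation speed alone. The sketch below assumes that
the average rotational delay is half of a full rotation and neglects the
transfer time, which is small for a small request. The class
\texttt{HddLatency} and its methods are hypothetical names used only for this
illustration; they are not part of the simulator.
{\singlespace
\begin{verbatim}
// Hypothetical helper for estimating random I/O time on an HDD.
// Assumes the average rotational delay is half a full rotation
// and ignores transfer time, which is small for small requests.
final class HddLatency {
    // Time for one full rotation, in milliseconds.
    static double fullRotationMs(double rpm) {
        return 60_000.0 / rpm;
    }

    // Estimated time for one small random request, in ms.
    static double randomIoMs(double avgSeekMs, double rpm) {
        return avgSeekMs + fullRotationMs(rpm) / 2.0;
    }

    public static void main(String[] args) {
        System.out.printf("Cheetah:   %.1f ms%n",
            randomIoMs(4.0, 15_000));
        System.out.printf("Barracuda: %.1f ms%n",
            randomIoMs(9.0, 7_200));
    }
}
\end{verbatim}
}
With the seek times and rotation speeds from the table, this estimate gives
about 6 ms for the Cheetah and about 13.2 ms for the Barracuda, in line with
the random I/O row.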

Insert the results from our simulation here.

We now turn to our second component, RAM, and consider how data transfer works there and what delays occur.
RAM is a much smaller storage device than the HDD and is also volatile, but it is much quicker than
the HDD. The latency of data movement in RAM consists primarily of CAS latency, which stands for
column address strobe latency. Since RAM is organized as rows and columns of memory cells,
CAS latency is the time it takes the RAM to access the addressed memory cells, read the data, and make it
available as output.
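
CAS latency is usually specified as a number of clock cycles, $CL$, so it can
be converted to a time using the memory clock. For DDR memory, the data rate
$R$ in MT/s is twice the clock frequency in MHz, so one clock cycle lasts
$2000/R$ nanoseconds. As an illustrative calculation (the DDR4-3200 module
with $CL = 16$ is an assumed example, not a configuration modeled by the
simulator):
\[
  t_{\mathrm{CAS}} = CL \times \frac{2000}{R}\ \mathrm{ns}, \qquad
  16 \times \frac{2000}{3200} = 10\ \mathrm{ns}.
\]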





Explain any terms as needed (only if not already). 

\section{Discrepancies}
\label{sec:disc}

Explain any discrepancies found here.

\section{Conclusions}
\label{sec:conclusions}

Insert conclusions here.

\singlespace
\bibliographystyle{abbrv}
\bibliography{references}
\begin{itemize}
\item \url{www.ithollow.com/2013/11/18/disk-latency-concepts/}
\item \url{pages.cs.wisc.edu/~remzi/OSTEP/file-disks.pdf}
\item \url{www.tomshardware.com/reviews/cas-latency-ram-cl-timings-glossary-definition,6011.html}
\end{itemize}



% tell latex we're done.  Anything beyond this line will be ignored.
\end{document}
