Commit
Fix spelling errors
ahornace committed Jul 19, 2018
1 parent c8a4e3d commit 090d92f
Showing 4 changed files with 15 additions and 15 deletions.
4 changes: 2 additions & 2 deletions text/chap01.tex
@@ -16,7 +16,7 @@ \section{Overview}
\begin{itemize}
\item Support for multiple projects. \textbf{Project} is a directory containing source files. Most commonly, it is a directory
containing the source files for one software project; thus, the name.
- \item Support for authentization and authorization (\cite{OpengrokAuthLayer}). For instance, by using LDAP
+ \item Support for authentication and authorization (\cite{OpengrokAuthLayer}). For instance, by using LDAP
\footnote{\url{https://en.wikipedia.org/wiki/Lightweight_Directory_Access_Protocol}}.
\item Support for multiple version control systems\footnote{\url{https://en.wikipedia.org/wiki/Version\_control}},
e.g. git, mercurial, etc.
@@ -125,7 +125,7 @@ \subsubsection{Configuration}
\subsubsection{REST API}
\label{opengrok_rest}

- Opengrok provides REST API support. This is a relatively new feature. Before that, OpenGrok had known a concept of
+ OpenGrok provides REST API support. This is a relatively new feature. Before that, OpenGrok had known a concept of
\textit{Messages} – custom serialization of Java objects passed to the Web application via a custom port.
So far, most of the REST API calls can only be made from the machine on which OpenGrok runs.
This is mainly because these REST API calls are meant as a means of communication between the Indexer and Web application
14 changes: 7 additions & 7 deletions text/chap02.tex
@@ -6,7 +6,7 @@ \chapter{Analysis}
\begin{itemize}
\item \ref{general_architecture} \textbf{General Architecture} – explains the chosen suggester architecture and
how it could be combined into the overall OpenGrok architecture.
- \item \ref{opengrok_modifications} \textbf{Opengrok Modifications} – describes the major modifications that had
+ \item \ref{opengrok_modifications} \textbf{OpenGrok Modifications} – describes the major modifications that had
to be made in OpenGrok code to enable suggester functionality.
\item \ref{suggester_module} \textbf{Suggester} – provides a detailed explanation of how the suggester functionality
was implemented.
@@ -72,7 +72,7 @@ \subsubsection{Showing the Suggestions}
\label{showing_suggestions}

The Suggester needs to detect that the user pressed a key while an input for which it is enabled is selected. Upon
- detecting this change, it needs to process the data, send it to the backend part of the software, processs the returned
+ detecting this change, it needs to process the data, send it to the backend part of the software, process the returned
result and show it to
the user. All this should be as quick as possible so the user considers it to be seamless.
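Hedged sketch of the keystroke handling above (the suggester's actual UI code lives in the OpenGrok web application's JavaScript; the `Debouncer` class and its method names are invented for illustration). Coalescing rapid keystrokes so that only the latest prefix triggers a backend request is one common way to keep the interaction seamless:

```java
import java.util.*;
import java.util.concurrent.*;

public class Debouncer {
    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor();
    private ScheduledFuture<?> pending;

    /** Schedules the task; a newer call within delayMs cancels the older one,
     *  so only one backend request is sent per pause in typing. */
    public synchronized void submit(Runnable task, long delayMs) {
        if (pending != null) {
            pending.cancel(false);
        }
        pending = scheduler.schedule(task, delayMs, TimeUnit.MILLISECONDS);
    }

    /** Stops the scheduler, letting the last scheduled task finish first. */
    public void close() {
        scheduler.shutdown();
        try {
            scheduler.awaitTermination(1, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        Debouncer d = new Debouncer();
        List<String> sent = Collections.synchronizedList(new ArrayList<>());
        // Three rapid keystrokes – only the last prefix reaches the backend.
        d.submit(() -> sent.add("su"), 100);
        d.submit(() -> sent.add("sug"), 100);
        d.submit(() -> sent.add("sugg"), 100);
        d.close();
        System.out.println(sent); // only the final prefix survives the debounce
    }
}
```

The delay is a trade-off: too short and intermediate prefixes still hit the backend, too long and the suggestions no longer feel instantaneous.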

@@ -533,12 +533,12 @@ \subsection{Wildcard Query}
The specific case of \textit{prefix*} is covered in the previous Section \ref{prefix_query}. Therefore,
all the other cases of wildcard queries will be covered in this section.
The implementation of WFST cannot be used because of its nature.
- There is no way to efficiently search in WFST tokens for the query of type \textit{*sufffix}. The required result is
+ There is no way to efficiently search in WFST tokens for the query of type \textit{*suffix}. The required result is
the same as for the prefix query: to find the terms which are accepted by the query with the top score. However, the data
structure which could achieve this for a generic wildcard query with the WFST performance is not known to the author.
Nonetheless, the Lucene evaluation of wildcard queries could be leveraged. An automaton specific to the wildcard query is
created using the Lucene automaton implementation, replacing \texttt{?} to accept any character and \texttt{*} to accept
- any string. Then the terms are filtered using this automaton. The implemenation is slower than WFST because all the terms
+ any string. Then the terms are filtered using this automaton. The implementation is slower than WFST because all the terms
need to be filtered once to check whether they are accepted by the automaton, and then filtered a second time based on
their score.
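The two-pass evaluation above can be sketched with the standard library alone. This is a simplification, not the Lucene-based implementation: the automaton is approximated by a regular expression, and `toPattern`, `suggest`, and the in-memory score map are assumptions made for the sketch.

```java
import java.util.*;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

public class WildcardSuggest {
    /** Builds a matcher for a wildcard query: '?' accepts any character, '*' any string. */
    static Pattern toPattern(String wildcard) {
        StringBuilder sb = new StringBuilder();
        for (char c : wildcard.toCharArray()) {
            if (c == '?') sb.append('.');
            else if (c == '*') sb.append(".*");
            else sb.append(Pattern.quote(String.valueOf(c)));
        }
        return Pattern.compile(sb.toString());
    }

    /** Pass 1: keep terms accepted by the pattern; pass 2: order the survivors by score. */
    static List<String> suggest(Map<String, Integer> termScores, String wildcard, int n) {
        Pattern p = toPattern(wildcard);
        return termScores.entrySet().stream()
                .filter(e -> p.matcher(e.getKey()).matches())                  // acceptance pass
                .sorted((a, b) -> Integer.compare(b.getValue(), a.getValue())) // score pass
                .limit(n)
                .map(Map.Entry::getKey)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        Map<String, Integer> terms = new HashMap<>();
        terms.put("prefix", 5);
        terms.put("suffix", 9);
        terms.put("infix", 7);
        terms.put("fixed", 1);
        // Terms ending in "fix", best score first.
        System.out.println(suggest(terms, "*fix", 2));
    }
}
```

The sketch makes the cost structure visible: every term is touched in the first pass regardless of score, which is exactly why this path is slower than the WFST lookup used for plain prefixes.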

@@ -931,7 +931,7 @@ \subsection{Promoting Suggestions Based on the Previous Searches}
\textbf{Chosen solution} – \textit{nearest completion} and thus \textit{hybrid completion} are very intriguing and could
improve the suggestions by a large margin. However, they would need to be adapted to Lucene and the implementation
might not be completely straightforward. Therefore, the basic implementation of the suggester will only include the
- \textit{most popular completion}. Implementation of the \textit{nearest completion} is a very promising canditate for
+ \textit{most popular completion}. Implementation of the \textit{nearest completion} is a very promising candidate for
future extensions.
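A minimal sketch of the chosen \textit{most popular completion} idea, under the assumption that promoting a suggestion simply means ranking candidates by how often they were searched for before (the class and method names are mine; the real implementation persists the counts and folds them into the WFST weights):

```java
import java.util.*;
import java.util.stream.Collectors;

public class MostPopularCompletion {
    // Frequency of past searches; persisted in the real implementation, in-memory here.
    private final Map<String, Integer> searchCounts = new HashMap<>();

    /** Records that a term was searched for, so it can be promoted later. */
    public void onSearch(String term) {
        searchCounts.merge(term, 1, Integer::sum);
    }

    /** Orders candidate completions of a prefix by past popularity, ties alphabetically. */
    public List<String> complete(String prefix, Collection<String> terms, int n) {
        return terms.stream()
                .filter(t -> t.startsWith(prefix))
                .sorted(Comparator.comparingInt((String t) -> searchCounts.getOrDefault(t, 0))
                        .reversed()
                        .thenComparing(Comparator.naturalOrder()))
                .limit(n)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        MostPopularCompletion mpc = new MostPopularCompletion();
        mpc.onSearch("const");
        mpc.onSearch("const");
        mpc.onSearch("container");
        List<String> terms = Arrays.asList("con", "const", "container", "context");
        // "const" comes first because it was searched for most often.
        System.out.println(mpc.complete("con", terms, 3));
    }
}
```

Nearest completion would replace the raw count with a similarity score against previous queries, which is the promising future extension mentioned above.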

\subsubsection{Most Popular Completion – Simple Queries}
@@ -949,7 +949,7 @@ \subsubsection{Most Popular Completion – Simple Queries}
efficiency. There are multiple options for achieving this functionality:
\begin{itemize}
\item \textbf{Java Map} implementation with concurrent access, e.g. \textit{ConcurrentHashMap}. This map could be stored
- on the disk periodically to fulfil the persistency requirement. This solution has a few drawbacks:
+ on the disk periodically to fulfill the persistency requirement. This solution has a few drawbacks:
\begin{itemize}
\item Loss of recent data after restart/crash.
\item The data are held in memory. The size of the data is non-trivial, e.g.
@@ -1113,7 +1113,7 @@ \subsubsection{Most Popular Completion – Simple Queries}
The memory usage increased by approximately $22$ \% for the \textit{English words} dataset. However, it can be
almost doubled, as can be seen in the \textit{Linux kernel} dataset, where an approximately $92$ \% size increase can be noted.
The graph \ref{enc_comp} also shows the case when the encoding would use \textit{long} datatype. Although Lucene's \textit{Lookup}
- interface specificies \textit{long} datatype, WFST implementation supports only \textit{int} so far.
+ interface specifies \textit{long} datatype, WFST implementation supports only \textit{int} so far.

\item Lucene's WFST implementation does not have a notion of nodes – the data are stored only in arcs. The arcs that
start in the root node might be stored in memory directly and therefore are not encoded in a byte array since these
2 changes: 1 addition & 1 deletion text/chap04.tex
@@ -156,7 +156,7 @@ \section{Use as a Separate Library}
\item Remove the \textit{projectsEnabled} parameter from the constructor. It is OpenGrok specific.
\item Overload method \textit{search(List\textless NamedIndexReader\textgreater, SuggesterQuery, Query)} to provide
the possibility to search without the need for the list of \textit{IndexReader} variables. They are provided now to better
- faciliate the resource reuse. However, they are not needed and could be created from the index paths specified in
+ facilitate the resource reuse. However, they are not needed and could be created from the index paths specified in
the \textit{init(Collection\textless NamedIndexDir\textgreater)} method. The overloaded method could have the following signature:
\textit{search(List\textless String\textgreater, SuggesterQuery, Query)} which would only specify index names.
\item Provide a default parser which would be able to create \textit{SuggesterQuery} instances. This could be a
10 changes: 5 additions & 5 deletions text/chap05.tex
@@ -57,11 +57,11 @@ \section{Impact on Hardware Requirements}
is a lookup in the WFST data structure which is optimized for this kind of scenario. However, in other cases,
index searches are performed which can consume a lot of CPU time.
\item \textbf{Memory} – the WFST data structures are held in memory. Although their memory footprint is very low,
- one data structure needs to be created per Lucene field per project which can sum up to a signifcant value.
+ one data structure needs to be created per Lucene field per project which can sum up to a significant value.
Also, data for most popular completion are stored in the Chronicle Map implementation which translates to additional
memory consumption.
\item \textbf{Disk} – the WFST data structures are stored on the disk to provide a quick startup.
- The data for most popular completion need to be stored as well. The comparison of disk consumptions for different datasets
+ The data for most popular completion need to be stored as well. The comparison of disk consumption for different datasets
can be seen in Figure \ref{comp_suggester_size}. The data show what percentage of the index size the suggester
data take. The data were measured on a machine with the operating system
macOS\footnote{\url{https://en.wikipedia.org/wiki/MacOS}} and
@@ -143,7 +143,7 @@ \section{Impact on the Demo Instance}
applications in a Tomcat instance. Of those, the most significant are:
\begin{itemize}
\item \textbf{JMX\footnote{\url{https://en.wikipedia.org/wiki/Java\_Management\_Extensions}} remote} –
- Tomcat provides possiblity to manage and monitor the applications via JMX. Many applications use this functionality
+ Tomcat provides possibility to manage and monitor the applications via JMX. Many applications use this functionality
to their advantage.
\item \textbf{JavaMelody}\footnote{\url{https://github.com/javamelody/javamelody}} – can be added to the project as a
dependency and creates a simple page with monitoring information available at \textit{application\_URI/monitoring}.
@@ -240,7 +240,7 @@ \subsection{Simple prefix query across all projects}
\item Prefix \texttt{c} in \textit{full} field.
\end{enumerate}
The requests specified all 22 projects. The results can be seen in Figure \ref{load_test_prefix_fig}. The slow startup
- can be noted; however, later requests were taking only a few miliseconds on average.
+ can be noted; however, later requests were taking only a few milliseconds on average.

\begin{figure}[htbp]
\centering
@@ -2199,7 +2199,7 @@ \subsection{Simple prefix query across all projects}

\subsection{Worst case across all projects}
\label{load_worst}
- Query \texttt{". $\vert$"} was used where $\vert$ represents a caret positon. Term $.$ occurs in $205,943$ files. All these files
+ Query \texttt{". $\vert$"} was used where $\vert$ represents a caret position. Term $.$ occurs in $205,943$ files. All these files
need to be checked for the $.$ position and then all terms are traversed to check if they occur at the positions next to the
$.$ term. The time threshold was set to the default value of $2$ seconds. The test case sends this request $1,000$ times,
linearly distributed over a $10$ second time period. It can be noted that the system was not able to satisfy the requests;