Update API links in Components.md
SvenGroot committed Nov 13, 2023
1 parent 724267c commit e554428
Showing 4 changed files with 63 additions and 32 deletions.
41 changes: 26 additions & 15 deletions doc/Sandcastle/JumboDoc.shfbproj
@@ -58,25 +58,36 @@
<SdkLinkTarget>Blank</SdkLinkTarget>
<NamespaceSummaries>
<NamespaceSummaryItem name="Ookii.Jumbo" isDocumented="True">Provides utility types used by all Jumbo components.</NamespaceSummaryItem>
<NamespaceSummaryItem name="Ookii.Jumbo.Dfs" isDocumented="True">Provides types used in the implementation of the Jumbo Distributed File System.</NamespaceSummaryItem>
<NamespaceSummaryItem name="Ookii.Jumbo.Dfs.FileSystem" isDocumented="True">Provides types used to access the Jumbo Distributed File System.</NamespaceSummaryItem>
<NamespaceSummaryItem name="Ookii.Jumbo.IO" isDocumented="True">Provides types related to reading and writing input, output and intermediate records in Jumbo.
<NamespaceSummaryItem name="Ookii.Jumbo.Dfs" isDocumented="True">Provides types used in the implementation of the Jumbo Distributed File System.</NamespaceSummaryItem>
<NamespaceSummaryItem name="Ookii.Jumbo.Dfs.FileSystem" isDocumented="True">Provides types used to access the Jumbo Distributed File System.</NamespaceSummaryItem>
<NamespaceSummaryItem name="Ookii.Jumbo.IO" isDocumented="True">Provides types related to reading and writing input, output and intermediate records in Jumbo.

&lt;see cref="T:Ookii.Jumbo.IO.RecordReader`1" /&gt; is the base class for all types that can read input or channel data. &lt;see cref="T:Ookii.Jumbo.IO.RecordWriter`1" /&gt; is the base class for all types that can write output or channel data.

&lt;see cref="T:Ookii.Jumbo.IO.IWritable" /&gt; and &lt;see cref="T:Ookii.Jumbo.IO.IValueWriter`1" /&gt; provide the serialization infrastructure used for Jumbo Jet intermediate record types.</NamespaceSummaryItem>
<NamespaceSummaryItem name="Ookii.Jumbo.Jet" isDocumented="True">Provides types used in the implementation of the Jumbo Jet data processing engine, and basic types related to job operation such as &lt;see cref="T:Ookii.Jumbo.Jet.ITask`2" /&gt;, &lt;see cref="T:Ookii.Jumbo.Jet.Configurable" /&gt; and &lt;see cref="T:Ookii.Jumbo.Jet.TaskContext" /&gt;.</NamespaceSummaryItem>
<NamespaceSummaryItem name="Ookii.Jumbo.Jet.Channels" isDocumented="True">Provides the implementation of File, Pipeline and TCP channels, as well as the sorting implementation used by channels using SpillSort.</NamespaceSummaryItem>
<NamespaceSummaryItem name="Ookii.Jumbo.Jet.IO" isDocumented="True">Provides types that define the input and output of stages and tasks in a job configuration.</NamespaceSummaryItem>
<NamespaceSummaryItem name="Ookii.Jumbo.Jet.Jobs" isDocumented="True">Provides types for creating and reading job configurations, as well as the base types used for JobRunners in JetShell.</NamespaceSummaryItem>
<NamespaceSummaryItem name="Ookii.Jumbo.Jet.Jobs.Builder" isDocumented="True">Provides types for creating a job configuration by defining a sequence of operations in code.</NamespaceSummaryItem>
<NamespaceSummaryItem name="Ookii.Jumbo.Jet.Samples" isDocumented="True">Provides sample jobs for the Jumbo Jet data processing engine.</NamespaceSummaryItem>
<NamespaceSummaryItem name="Ookii.Jumbo.Jet.Samples.FPGrowth" isDocumented="True">Provides a sample job implementing the Parallel FP Growth algorithm.</NamespaceSummaryItem>
<NamespaceSummaryItem name="Ookii.Jumbo.Jet.Samples.IO" isDocumented="True">Provides helper record types for various Jumbo Jet samples.</NamespaceSummaryItem>
<NamespaceSummaryItem name="Ookii.Jumbo.Jet.Scheduling" isDocumented="True">Provides interfaces required for implementing a Jumbo Jet task scheduler, as well as the default scheduler implementation.</NamespaceSummaryItem>
<NamespaceSummaryItem name="Ookii.Jumbo.Jet.Tasks" isDocumented="True">Provides helper types for common types of tasks.</NamespaceSummaryItem>
<NamespaceSummaryItem name="Ookii.Jumbo.Rpc" isDocumented="True">Provides the infrastructure used for communication between the various Jumbo components. Jumbo originally depended on .Net Remoting for this purpose, but because of issues with the Mono implementation of .Net Remoting, a custom RPC mechanism using similar semantics was implemented. These types are internal to Jumbo and should not be used by any clients.</NamespaceSummaryItem>
<NamespaceSummaryItem name="Ookii.Jumbo.Topology" isDocumented="True">Provides types used to define a network topology that can be used to determine data locality for replication and task scheduling.</NamespaceSummaryItem></NamespaceSummaries>
<NamespaceSummaryItem name="Ookii.Jumbo.Jet" isDocumented="True">Provides types used in the implementation of the Jumbo Jet data processing engine, and basic types related to job operation such as &lt;see cref="T:Ookii.Jumbo.Jet.ITask`2" /&gt;, &lt;see cref="T:Ookii.Jumbo.Jet.Configurable" /&gt; and &lt;see cref="T:Ookii.Jumbo.Jet.TaskContext" /&gt;.</NamespaceSummaryItem>
<NamespaceSummaryItem name="Ookii.Jumbo.Jet.Channels" isDocumented="True">Provides the implementation of File, Pipeline and TCP channels, as well as the sorting implementation used by channels using SpillSort.</NamespaceSummaryItem>
<NamespaceSummaryItem name="Ookii.Jumbo.Jet.IO" isDocumented="True">Provides types that define the input and output of stages and tasks in a job configuration.</NamespaceSummaryItem>
<NamespaceSummaryItem name="Ookii.Jumbo.Jet.Jobs" isDocumented="True">Provides types for creating and reading job configurations, as well as the base types used for JobRunners in JetShell.</NamespaceSummaryItem>
<NamespaceSummaryItem name="Ookii.Jumbo.Jet.Jobs.Builder" isDocumented="True">Provides types for creating a job configuration by defining a sequence of operations in code.</NamespaceSummaryItem>
<NamespaceSummaryItem name="Ookii.Jumbo.Jet.Samples" isDocumented="True">Provides sample jobs for the Jumbo Jet data processing engine.</NamespaceSummaryItem>
<NamespaceSummaryItem name="Ookii.Jumbo.Jet.Samples.FPGrowth" isDocumented="True">Provides a sample job implementing the Parallel FP Growth algorithm.</NamespaceSummaryItem>
<NamespaceSummaryItem name="Ookii.Jumbo.Jet.Samples.IO" isDocumented="True">Provides helper record types for various Jumbo Jet samples.</NamespaceSummaryItem>
<NamespaceSummaryItem name="Ookii.Jumbo.Jet.Scheduling" isDocumented="True">Provides interfaces required for implementing a Jumbo Jet task scheduler, as well as the default scheduler implementation.</NamespaceSummaryItem>
<NamespaceSummaryItem name="Ookii.Jumbo.Jet.Tasks" isDocumented="True">Provides helper types for common types of tasks.</NamespaceSummaryItem>
<NamespaceSummaryItem name="Ookii.Jumbo.Rpc" isDocumented="True">Provides the infrastructure used for communication between the various Jumbo components. Jumbo originally depended on .Net Remoting for this purpose, but because of issues with the Mono implementation of .Net Remoting, a custom RPC mechanism using similar semantics was implemented. These types are internal to Jumbo and should not be used by any clients.</NamespaceSummaryItem>
<NamespaceSummaryItem name="Ookii.Jumbo.Topology" isDocumented="True">Provides types used to define a network topology that can be used to determine data locality for replication and task scheduling.</NamespaceSummaryItem>
</NamespaceSummaries>
<PlugInConfigurations>
<PlugInConfig id="Additional Reference Links" enabled="True" xmlns="">
<configuration>
<targets>
<target htmlSdkLinkType="None" helpViewerSdkLinkType="Id" websiteSdkLinkType="None" helpFileProject="..\..\..\Ookii.CommandLine\docs\Ookii.CommandLine.shfbproj" />
<target htmlSdkLinkType="None" helpViewerSdkLinkType="Id" websiteSdkLinkType="None" helpFileProject="..\..\..\Ookii.BinarySize\docs\Ookii.BinarySize.shfbproj" />
</targets>
</configuration>
</PlugInConfig>
</PlugInConfigurations>
</PropertyGroup>
<!-- There are no properties for these groups. AnyCPU needs to appear in order for Visual Studio to perform
the build. The others are optional common platform types that may appear. -->
37 changes: 20 additions & 17 deletions doc/UserGuide/Components.md
@@ -101,11 +101,10 @@ When reading a file, the process is simpler:
next DataServer (there is currently no mechanism to remove the corrupted block and re-replicate
it; see, this is why Jumbo is not production quality code).

Client applications interact with the DFS using the [`Ookii.Jumbo.Dfs.Filesystem.FileSystemClient`](https://www.ookii.org/docs/jumbo-2.0/html/T_Ookii_Jumbo_Dfs_FileSystem_FileSystemClient.htm),
which provides an API for performing these operations. Using it, a client simply creates a file,
gets a [Stream](https://www.ookii.org/docs/jumbo-2.0/html/T_Ookii_Jumbo_Dfs_DfsOutputStream.htm) for
it, and writes to that stream like any normal file. All the steps above are taken care of under the
hood.
Client applications interact with the DFS using the [`FileSystemClient`][] class, which provides an API
for performing these operations. Using it, a client simply creates a file, gets a [`DfsOutputStream`][]
for it, and writes to that stream like any normal file. All the steps above are taken care of under
the hood.

It's probably rare that you'll ever write a DFS client application yourself. If you write Jumbo Jet
jobs, most of this is abstracted through its data input and output models. And as an end user, you
@@ -136,19 +135,23 @@ are run in parallel on multiple systems in a cluster. For stages reading a data
input is divided linearly across tasks (these are called _splits_). For stages reading a channel
input, the channel _partitions_ the data across the tasks.

Tasks run a user-defined piece of code to do their processing. This code doesn’t need to be aware
of most of the details. Regardless of whether the task is reading from or writing to a file or a
channel, the code is the same. Input is provided via a [`RecordReader`](https://www.ookii.org/docs/jumbo-2.0/html/T_Ookii_Jumbo_IO_RecordReader_1.htm)
and output is written to a [`RecordWriter`](https://www.ookii.org/docs/jumbo-2.0/html/T_Ookii_Jumbo_IO_RecordWriter_1.htm),
which take care of the details. Since partitioning (and optionally, things like sorting) are handled
by the Jumbo Jet infrastructure, you don’t have to worry about how to perform those operations; you
simply need to specify you want them to happen.
Tasks run a user-defined piece of code to do their processing. This code doesn’t need to be aware of
most of the details. Regardless of whether the task is reading from or writing to a file or a
channel, the code is the same. Input is provided via a [`RecordReader<T>`][] and output is written to a
[`RecordWriter<T>`][], which take care of the details. Since partitioning (and optionally, things like
sorting) are handled by the Jumbo Jet infrastructure, you don’t have to worry about how to perform
those operations; you simply need to specify you want them to happen.

The typical way to create a job in Jumbo is to use the [`JobBuilder`](https://www.ookii.org/docs/jumbo-2.0/html/T_Ookii_Jumbo_Jet_Jobs_Builder_JobBuilder.htm),
which allows you to specify a sequence of operations which are then translated into a job
configuration of stages and channels. This means you can create jobs without worrying too much
about their actual structure during execution (although of course this is available if you want to
do more complex processing).
The typical way to create a job in Jumbo is to use the [`JobBuilder`][], which allows you to specify a
sequence of operations which are then translated into a job configuration of stages and channels.
This means you can create jobs without worrying too much about their actual structure during
execution (although of course this is available if you want to do more complex processing).

We will go into more details about how jobs are executed [later](JobExecution.md). First, it's time
to learn how to [write your own jobs for Jumbo Jet](Tutorial1.md).

[`DfsOutputStream`]: https://www.ookii.org/docs/jumbo-2.0/html/T_Ookii_Jumbo_Dfs_DfsOutputStream.htm
[`FileSystemClient`]: https://www.ookii.org/docs/jumbo-2.0/html/T_Ookii_Jumbo_Dfs_FileSystem_FileSystemClient.htm
[`JobBuilder`]: https://www.ookii.org/docs/jumbo-2.0/html/T_Ookii_Jumbo_Jet_Jobs_Builder_JobBuilder.htm
[`RecordReader<T>`]: https://www.ookii.org/docs/jumbo-2.0/html/T_Ookii_Jumbo_IO_RecordReader_1.htm
[`RecordWriter<T>`]: https://www.ookii.org/docs/jumbo-2.0/html/T_Ookii_Jumbo_IO_RecordWriter_1.htm
14 changes: 14 additions & 0 deletions doc/refs.json
@@ -0,0 +1,14 @@
{
"#apiPrefix": "https://learn.microsoft.com/dotnet/api/",
"#prefix": "https://www.ookii.org/docs/jumbo-2.0/html/",
"#suffix": ".htm",
"DfsOutputStream": "T_Ookii_Jumbo_Dfs_DfsOutputStream",
"DfsShell": null,
"FileSystemClient": "T_Ookii_Jumbo_Dfs_FileSystem_FileSystemClient",
"JobBuilder": "T_Ookii_Jumbo_Jet_Jobs_Builder_JobBuilder",
"Ookii.Jumbo.Dfs.Filesystem.FileSystemClient": "T_Ookii_Jumbo_Dfs_FileSystem_FileSystemClient",
"RecordReader": "T_Ookii_Jumbo_IO_RecordReader",
"RecordReader<T>": "T_Ookii_Jumbo_IO_RecordReader_1",
"RecordWriter": "T_Ookii_Jumbo_IO_RecordWriter",
"RecordWriter<T>": "T_Ookii_Jumbo_IO_RecordWriter_1"
}
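
As a rough illustration of how the entries in `refs.json` relate to the link definitions added at the bottom of `Components.md`, the sketch below joins `#prefix`, the mapped value, and `#suffix` into a URL. The `resolve` helper is hypothetical. Treating `null` as "no link target" and ignoring `#apiPrefix` are assumptions, since the commit does not show the tool that actually consumes this file.

```python
import json

# Contents of doc/refs.json as added in this commit.
REFS_JSON = """
{
  "#apiPrefix": "https://learn.microsoft.com/dotnet/api/",
  "#prefix": "https://www.ookii.org/docs/jumbo-2.0/html/",
  "#suffix": ".htm",
  "DfsOutputStream": "T_Ookii_Jumbo_Dfs_DfsOutputStream",
  "DfsShell": null,
  "FileSystemClient": "T_Ookii_Jumbo_Dfs_FileSystem_FileSystemClient",
  "JobBuilder": "T_Ookii_Jumbo_Jet_Jobs_Builder_JobBuilder",
  "Ookii.Jumbo.Dfs.Filesystem.FileSystemClient": "T_Ookii_Jumbo_Dfs_FileSystem_FileSystemClient",
  "RecordReader": "T_Ookii_Jumbo_IO_RecordReader",
  "RecordReader<T>": "T_Ookii_Jumbo_IO_RecordReader_1",
  "RecordWriter": "T_Ookii_Jumbo_IO_RecordWriter",
  "RecordWriter<T>": "T_Ookii_Jumbo_IO_RecordWriter_1"
}
"""


def resolve(refs, name):
    """Return the documentation URL for `name`, or None if it has no target.

    Hypothetical helper: keys starting with '#' are treated as settings,
    and a null value is assumed to mean "do not generate a link".
    """
    if name.startswith("#"):
        return None
    target = refs.get(name)
    if target is None:
        return None
    return refs["#prefix"] + target + refs["#suffix"]


refs = json.loads(REFS_JSON)
for name in ("FileSystemClient", "RecordReader<T>", "DfsShell"):
    print(name, "->", resolve(refs, name))
```

Under these assumptions, `RecordReader<T>` resolves to the same URL as the `` [`RecordReader<T>`] `` link definition in `Components.md`, while `DfsShell` produces no link.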
3 changes: 3 additions & 0 deletions src/Ookii.Jumbo/IO/WritableUtility.cs
@@ -31,6 +31,9 @@ public static T GetUninitializedWritable<T>()
/// <summary>
/// Gets an uninitialized object of a type implementing <see cref="IWritable"/>.
/// </summary>
/// <param name="type">
/// The type of the object to create. This type must implement the <see cref="IWritable"/> interface.
/// </param>
/// <returns>An uninitialized instance of <paramref name="type"/>.</returns>
/// <remarks>
/// <para>