How do I run a cluster on Linux(centos) #285

Open
wuyanxing opened this issue Mar 27, 2019 · 35 comments

Comments

@wuyanxing

No description provided.

@wuyanxing
Author

I can't find any explanation or method.

@wuyanxing wuyanxing changed the title How do I run a cluster on Linux How do I run a cluster on Linux(centos) Mar 27, 2019
@jkliss

jkliss commented May 20, 2019

I am having the same problem.
As I understand the documentation, it should be sufficient to add the servers to the config file.
With that, GE at least sends messages between the servers, but it does not run the script. The same script ran successfully in embedded mode. I also used Global.CloudStorage.SaveXYZ to access memory. What can I do to set it up correctly?

This is the config I was trying to use:

<Trinity ConfigVersion="2.0">
  <Local>
    <!-- Add any configuration the client might need -->
  </Local>

    <section name="Application">
     <entry name="ConfigOutputOn">true</entry>
     <entry name="CurrentRunningMode">Distributed</entry>
    </section>

    <section name="Network">
     <entry name="ClientBufferSize">1048576</entry>
     <entry name="ClientMaxBufferSize">134217728</entry>
     <entry name="ServerSocketBufferSize">8192</entry>
     <entry name="ClientSocketBufferSize">8192</entry>
     <entry name="ServerMaxConn">512</entry>
     <entry name="ServerMaxAcceptOps">512</entry>
     <entry name="PreferedNetworkMask"/>
    </section>

    <Cluster>
        <Server Endpoint="IPADRESS_SERVER1:8133" />
        <Server Endpoint="IPADRESS_SERVER2:8133" />
    </Cluster>
</Trinity>
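
For comparison, here is a stripped-down sketch of a two-server file that uses only ConfigVersion 2.0 style elements. The `<section>`/`<entry>` blocks above appear to follow the older config layout, and the thread does not confirm whether they are honored inside a 2.0 file, so they are left out here. The endpoint placeholders are written without angle brackets so the file parses as XML; substitute the real addresses. This is an assumption-laden sketch, not a confirmed fix.

```xml
<?xml version="1.0" encoding="utf-8"?>
<!-- Sketch only: the placeholders stand in for real IP addresses -->
<Trinity ConfigVersion="2.0">
  <Cluster>
    <Server Endpoint="IPADRESS_SERVER1:8133" />
    <Server Endpoint="IPADRESS_SERVER2:8133" />
  </Cluster>
</Trinity>
```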

@yatli
Contributor

yatli commented May 20, 2019

@jkliss could you attach the log for the two instances?

@jkliss

jkliss commented May 20, 2019

[ INFO    ] EchoOnConsole set to ON
[ INFO    ] Log: changing logging directory to /home/jkliss/BFS/BFSClient/bin/Debug/netcoreapp2.2/trinity-log.
[ INFO    ] Loading Graph Engine Extensions.
[ INFO    ] Scanning for TSL storage extension.
[ INFO    ] TSL storage extension loaded.
[ INFO    ] Scanning for MemoryCloud extensions.
[ INFO    ] No MemoryCloud extension found.
[ INFO    ] Scanning for startup tasks.
[ INFO    ] EventLoop: Starting.
[ INFO    ] *****************************************************
[ INFO    ] ServerCount: 2
[ INFO    ]     IPADRESS_SERVER1:8133
[ INFO    ]     IPADRESS_SERVER2:8133
[ INFO    ] ProxyCount: 0
[ INFO    ] *****************************************************

If I replace IPADRESS_SERVER1 with localhost, I get the following additional lines (it starts sending network packets, but the script does not proceed any further):

[ INFO    ] LocalMemoryStorage is initialized in read-write mode
[ INFO    ] Initializing logging facility
[ INFO    ] Reading log file.
[ INFO    ] Write-ahead-log successfully loaded. Recovered 0 records.
[ INFO    ] Creating write-ahead log file /home/jkliss/BFS/BFSClient/bin/Debug/netcoreapp2.2/storage/B/write_ahead_log/primary_storage_log_20.dat

@jkliss

jkliss commented May 20, 2019

@yatli Is there anything I need to add to the code to enable communication between servers? Or is there a function that makes a server wait for another server so that they synchronize?

If you have an example of how to set up and run a distributed system on Linux, it would be very helpful for me and probably for others too.

@ToxicJojo

I'm having the same problem as @jkliss. Documentation on how to set up GraphEngine in a distributed way would be very useful.

@edouardpoitras

I'm also looking for a working example of a distributed GraphEngine with multiple servers.

@TaviTruman
Contributor

TaviTruman commented Aug 30, 2023

@edouardpoitras Hi. Here is an example of how to configure a Graph Engine Availability Group (cluster). This is the XML configuration on my head server (only):

<!--Declare and Define the Head (Primary) Graph Engine Cluster-->
<!-- <Local Template="primary-rub-truespark-sf-cluster-template"/>-->
<!-- <Remote Template="rub-truespark-ontology-taxonomy-cluster-template"/>-->

<!--A Cluster node contains configurations for servers and proxies of a Graph Engine cluster.
    There can be multiple Cluster nodes as long as they have different identifiers.
    A Cluster node can have an optional attribute Id.-->

<Cluster RunningMode="Server">
	<!-- IKW-GE-APPSVR03.inknowworksdev.net -->
	<Server Endpoint="10.1.100.5:7001" AssemblyPath="D:\Trinity TripleStore Server Deployment\">
		<Network HttpPort="-1"
				ClientMaxConn="2"
				ClientSendRetry="5"
				ClientReconnectRetry="5"
				Handshake="TRUE"/>

		<Logging LogDirectory="D:\GraphEngine-Log\"
				LogLevel="Verbose"
				LogToFile="TRUE"
				EchoOnConsole="TRUE"/>

		<Storage TrunkCount="256"
				ReadOnly="FALSE" 
				StorageCapacity="Max16G" 
				StorageRoot="D:\GraphEngine-Storage\" 
				DefragInterval="600"/>

		<LIKQ Timeout="90000" />
	</Server>

	<!-- IKW-GE-APPSVR03.inknowworksdev.net -->
	<Server Endpoint="10.30.10.23:7001" AssemblyPath="D:\Trinity TripleStore Server Deployment\">
		<Network HttpPort="-1"
				ClientMaxConn="2"
				ClientSendRetry="5"
				ClientReconnectRetry="5"
				Handshake="TRUE"/>

		<Logging LogDirectory="D:\GraphEngine-Log\"
				LogLevel="Verbose"
				LogToFile="TRUE"
				EchoOnConsole="TRUE"/>

		<Storage TrunkCount="256"
				ReadOnly="FALSE" 
				StorageCapacity="Max16G" 
				StorageRoot="D:\GraphEngine-Storage\" 
				DefragInterval="600"/>

		<LIKQ Timeout="90000" />
	</Server>

	<!-- IKW-GE-APPSVR03.inknowworksdev.net -->
	<Server Endpoint="10.30.10.34:7001" AssemblyPath="D:\Trinity TripleStore Server Deployment\">
		<Network HttpPort="-1"
				ClientMaxConn="2"
				ClientSendRetry="5"
				ClientReconnectRetry="5"
				Handshake="TRUE"/>

		<Logging LogDirectory="D:\GraphEngine-Log\"
				LogLevel="Verbose"
				LogToFile="TRUE"
				EchoOnConsole="TRUE"/>

		<Storage TrunkCount="256"
				ReadOnly="FALSE" 
				StorageCapacity="Max16G" 
				StorageRoot="D:\GraphEngine-Storage\" 
				DefragInterval="600"/>

		<LIKQ Timeout="90000" />
	</Server>

	<!-- IKW-GE-APPSVR03.inknowworksdev.net -->
	<Server Endpoint="10.30.10.108:7002" AssemblyPath="D:\Trinity TripleStore Server Deployment\">
		<Network HttpPort="-1"
				ClientMaxConn="2"
				ClientSendRetry="5"
				ClientReconnectRetry="5"
				Handshake="TRUE"/>

		<Logging LogDirectory="D:\GraphEngine-Log\"
				LogLevel="Verbose"
				LogToFile="TRUE"
				EchoOnConsole="TRUE"/>

		<Storage TrunkCount="256"
				ReadOnly="FALSE" 
				StorageCapacity="Max16G" 
				StorageRoot="D:\GraphEngine-Storage\" 
				DefragInterval="600"/>

		<LIKQ Timeout="90000" />
	</Server>

	<!-- IKW-GE-APPSVR03.inknowworksdev.net -->
	<Server Endpoint="10.30.10.71:7002" AssemblyPath="D:\Trinity TripleStore Server Deployment\">
		<Network HttpPort="-1"
				ClientMaxConn="2"
				ClientSendRetry="5"
				ClientReconnectRetry="5"
				Handshake="TRUE"/>

		<Logging LogDirectory="D:\GraphEngine-Log\"
				LogLevel="Verbose"
				LogToFile="TRUE"
				EchoOnConsole="TRUE"/>

		<Storage TrunkCount="256"
				ReadOnly="FALSE" 
				StorageCapacity="Max16G" 
				StorageRoot="D:\GraphEngine-Storage\" 
				DefragInterval="600"/>

		<LIKQ Timeout="90000" />
	</Server>
</Cluster>
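
Because the five Server entries above differ only in their Endpoint, the shared settings could probably be factored out with the Template mechanism of the ConfigVersion 2.0 format. The following is a sketch under that assumption. The template id `ge-server-defaults` is a hypothetical name, and the thread does not confirm that this layout was tested. AssemblyPath is kept on each Server because it is unclear whether a template can carry it.

```xml
<?xml version="1.0" encoding="utf-8"?>
<Trinity ConfigVersion="2.0">
	<Cluster RunningMode="Server">
		<!-- Each server keeps only its own endpoint and refers to the shared template -->
		<Server Endpoint="10.1.100.5:7001" AssemblyPath="D:\Trinity TripleStore Server Deployment\" Template="ge-server-defaults" />
		<Server Endpoint="10.30.10.23:7001" AssemblyPath="D:\Trinity TripleStore Server Deployment\" Template="ge-server-defaults" />
		<Server Endpoint="10.30.10.34:7001" AssemblyPath="D:\Trinity TripleStore Server Deployment\" Template="ge-server-defaults" />
		<Server Endpoint="10.30.10.108:7002" AssemblyPath="D:\Trinity TripleStore Server Deployment\" Template="ge-server-defaults" />
		<Server Endpoint="10.30.10.71:7002" AssemblyPath="D:\Trinity TripleStore Server Deployment\" Template="ge-server-defaults" />
	</Cluster>

	<!-- Hypothetical template holding the settings that were repeated in every Server block -->
	<Template Id="ge-server-defaults">
		<Network HttpPort="-1"
				ClientMaxConn="2"
				ClientSendRetry="5"
				ClientReconnectRetry="5"
				Handshake="TRUE"/>
		<Logging LogDirectory="D:\GraphEngine-Log\"
				LogLevel="Verbose"
				LogToFile="TRUE"
				EchoOnConsole="TRUE"/>
		<Storage TrunkCount="256"
				ReadOnly="FALSE"
				StorageCapacity="Max16G"
				StorageRoot="D:\GraphEngine-Storage\"
				DefragInterval="600"/>
		<LIKQ Timeout="90000" />
	</Template>
</Trinity>
```

A server that needs a different value could still declare its own child element next to the Template reference, which keeps per-machine differences visible at a glance.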

image

@TaviTruman
Contributor

TaviTruman commented Aug 30, 2023

For the other machines in the GE Cluster you only need to specify the local machine:

<!--Declare and Define the Head (Primary) Graph Engine Cluster-->
<!-- <Local Template="primary-rub-truespark-sf-cluster-template"/>-->
<!-- <Remote Template="rub-truespark-ontology-taxonomy-cluster-template"/>-->

<!--A Cluster node contains configurations for servers and proxies of a Graph Engine cluster.
    There can be multiple Cluster nodes as long as they have different identifiers.
    A Cluster node can have an optional attribute Id.-->
<Cluster RunningMode="Server">
	<!-- IKW-GE-APPSVR03.inknowworksdev.net -->
	<Server Endpoint="10.30.10.23:7001" AssemblyPath="D:\Trinity TripleStore Server Deployment\">
		<Network HttpPort="-1"
			ClientMaxConn="2"
			ClientSendRetry="5"
			ClientReconnectRetry="5"
			Handshake="TRUE"/>

		<Logging LogDirectory="D:\GraphEngine-Log\"
			LogLevel="Verbose"
			LogToFile="TRUE"
			EchoOnConsole="TRUE"/>

		<Storage TrunkCount="256"
			ReadOnly="FALSE" 
			StorageCapacity="Max16G" 
			StorageRoot="D:\GraphEngine-Storage\" 
			DefragInterval="600"/>

		<LIKQ Timeout="90000" />
	</Server>
</Cluster>

image

@edouardpoitras

edouardpoitras commented Aug 31, 2023

Hey @TaviTruman, thanks for the quick reply.

I've tried your configs and am still scratching my head :)

To simplify things, I have three servers in the cluster (127.0.0.1:700[0-2]). I've stripped out most of the config options, keeping the endpoint values.

I run the first "head" node with the cluster config:

...
[ INFO    ] My IPEndPoint: 127.0.0.1:7000
...
[ INFO    ] ServerCount: 3
[ INFO    ]     127.0.0.1:7000
[ INFO    ]     127.0.0.1:7001
[ INFO    ]     127.0.0.1:7002
[ INFO    ] ProxyCount: 0
...

Great!
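
The actual file was not posted at this point in the thread, but a head-node config reduced to endpoint values would presumably look like the sketch below. The three endpoints are taken from the log above; everything else is left at defaults, which is an assumption.

```xml
<?xml version="1.0" encoding="utf-8"?>
<!-- Sketch of the head-node config: all three cluster members are listed -->
<Trinity ConfigVersion="2.0">
  <Cluster>
    <Server Endpoint="127.0.0.1:7000" />
    <Server Endpoint="127.0.0.1:7001" />
    <Server Endpoint="127.0.0.1:7002" />
  </Cluster>
</Trinity>
```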

Then I run a 2nd instance with a config that only contains the server 127.0.0.1:7001:

[ INFO    ] My IPEndPoint: 127.0.0.1:7001
...
[ INFO    ] ServerCount: 1
[ INFO    ]     127.0.0.1:7001
[ INFO    ] ProxyCount: 0
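
A config containing only the local server, as described for this second instance, might be sketched as follows. The endpoint comes from the log above; the file itself was not posted here, and the thread does not confirm that a node started this way sees the other cluster members.

```xml
<?xml version="1.0" encoding="utf-8"?>
<!-- Sketch of a non-head node config: only the local endpoint is listed -->
<Trinity ConfigVersion="2.0">
  <Cluster>
    <Server Endpoint="127.0.0.1:7001" />
  </Cluster>
</Trinity>
```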

And then the last node:

[ INFO    ] My IPEndPoint: 127.0.0.1:7002
[ INFO    ] ServerCount: 1
[ INFO    ]     127.0.0.1:7002
[ INFO    ] ProxyCount: 0

At this point, all three instances have reported:

...
[ INFO    ] Scanning for MemoryCloud extensions.
[ INFO    ] No MemoryCloud extension found.
...
[ INFO    ] Server 0 is successfully started.

Is that correct? Not sure how to get the MemoryCloud extension working...

Also, when running a client, I re-used the first cluster config and then in the code specified: TrinityConfig.CurrentRunningMode = RunningMode.Client;
That seems to work in that the DistributedHashtable sample works. But it's unclear to me if it's working as expected.

Thanks again for your help!

@TaviTruman
Contributor

@edouardpoitras Hi. Looks like you've made great progress. If you don't mind, please upload your trinity.xml configuration file and your GE server and client code. I'll get back to you ASAP. It looks like you are trying to run the distributed hash demo code, right?

@edouardpoitras

Hey @TaviTruman, thanks again for your time. I created a new repository to create a working minimal cluster example: https://github.com/edouardpoitras/TrinityResearch

It's basically the DistributedHashtable sample with a few code tweaks in the Program.cs. Nothing changed in the DistributedHashtableServer.cs or DistributedHashtable.tsl.

The README.md file has the manual steps I'm trying. I want to eventually get this working with docker-compose.

I've tweaked a few things since we last talked. The closest I've gotten now is to use the same config but shuffle the server endpoint definitions around so the first one chosen is different for each config. Pretty sure I'm doing something wrong.

Let me know what you think/spot.

@TaviTruman
Contributor

Hi, @edouardpoitras, I will take a look at this today and get back to you shortly.

@TaviTruman
Contributor

I will upload a diagram that depicts the GE Cluster along with the trinity configuration for each server. The GE Client configuration is quite simple; typically all you need is to specify TrinityConfig.CurrentRunningMode = RunningMode.Client;

@TaviTruman
Contributor

Hi, @edouardpoitras. I have written a new and improved version of the "Distributed Hash Table" sample, and it works well in the GE Availability Group of three servers. It is late here so I will post it for you in the morning. I have written a new set of documentation as well describing the API sets.

@edouardpoitras

@TaviTruman excellent! Looking forward to it 👍

@TaviTruman
Contributor

image

@TaviTruman
Contributor

GE DHT Server (Head): Running on my Windows 11 Desktop

image

@TaviTruman
Contributor

GE DHT Server (Secondary Server in GE Cluster): Running in Hyper-V Windows Server 2022

image

In the GE log you can see that some of the work has been distributed to this DHT Server Instance

@TaviTruman
Contributor

GE DHT Server (Secondary Server in GE Cluster): Running in Hyper-V Windows Server 2022. This is the 3rd server in the GE Cluster

image

You can see here that this server in my test never received any work.

@TaviTruman
Contributor

Here is my DHT Client doing work!

image

@TaviTruman
Contributor

TaviTruman commented Sep 5, 2023

Here are the trinity.xml config files:

  1. GE DHT Server Cluster Config (see attached file)
  2. 2nd GE DHT Server Instance config
  3. 3rd GE DHT Server Instance config
    primary GE DHT Cluster trinity.zip

image

image

@TaviTruman
Contributor

I will create a PR and submit the new demonstration. In the meantime, I will upload the Visual Studio project for you here. FYI, I do have a new Discord Channel coming up in October; the channel is dedicated to all things Graph Engine, Knowledge Graphs, Ontology Driven Software Design (ODSD) using Graph Engine, and much, much more.

@TaviTruman
Contributor

Here is the VS Solution and Project structure I use:

image

@TaviTruman
Contributor

Here you go!

Distributed Hash Table on GE Cluster.zip

@edouardpoitras

edouardpoitras commented Sep 5, 2023

Amazing @TaviTruman - giving it a shot now.
I noticed that the projects use GraphEngine v4.0 - that doesn't seem available in the repo.
I tried changing the references to use v3.0 instead, but I get an "Unable to find package GraphEngine.Client" error. Presumably GraphEngine.Client is new in v4?

Thanks again!

Edit: Looking over the code and configs - this is exactly what I was looking for and is making a lot of sense to me now. Just need to work out how to get v4.

@TaviTruman
Contributor

Yikes - I forgot about that - sorry! I maintain my own public repo of the Graph Engine and have a lot of updates. Let me upload the new NuGet packages for you. FYI - GraphEngine.Client was removed from this repo, but I have been using it for a few years now.

@TaviTruman
Contributor

Let me get you everything you need and then you can have a lot of fun :-) I have also updated the code so that it saves the Local Memory Cloud and then restores or reloads it.
Distributed Hash Table on GE Cluster -V2.zip

@TaviTruman
Contributor

Here is the Windows Ready Deployment

DHT server Deployment.zip

@TaviTruman
Contributor

Here are the NuGet packages from my 4.x local builds. I will update my public GE Git repo later today so you can use it.

GraphEngine.Client.4.0.12049.symbols.zip

@TaviTruman
Contributor

Oh, you are using Linux, right?

libTrinity.zip

@TaviTruman
Contributor

TaviTruman commented Sep 5, 2023

Here is the link to my repo: https://github.com/InKnowWorks/IKW-GraphEngine

There is a lot of new work here - most of it you may not need.

@edouardpoitras

Success! Thank you very much @TaviTruman, I was able to get it working on Linux as per your screenshots. I really like your project structure and will convert my repo to use that as well. I'm still going to try to get a minimal version working with Docker + compose in my TrinityResearch repo as I think that could be useful for others, but the real gold here is in your comments above 👍

Hopefully this thread will save headaches for others in the future.

@TaviTruman
Contributor

FYI - I am working out the wrinkles for Azure Kubernetes Deployment. I will post it in the Discussion as soon as it is bullet-proof.
