How do I deploy a redis cluster with time series modules and others in redis stack? #1563

Open
krebznet opened this issue Dec 28, 2023 · 7 comments
Labels
question Further information is requested

Comments

@krebznet

I am also trying to do this with Kubernetes. There are Bitnami Helm charts, but I don't understand how I can deploy a cluster with the time series module. Is it a matter of updating the config file to load modules and then ensuring the modules are on the file system of the Kubernetes node? I am loving redis-stack and time series. I am inserting about 300,000 key/values a second and it won't work on a single-node server. Any ideas? If I have to, I can install the Redis cluster outside of Kubernetes, but I can't find documentation on how to add the time series module. Any advice or links would greatly help; I've been spending two days on this. Thanks in advance for any help.

@LiorKogan
Member

LiorKogan commented Dec 29, 2023

We built the Redis Stack Docker image to be a simple, straightforward, single-node environment.

We've seen people modify our images to deploy a cluster (e.g., "Deploying a Redis Stack Cluster using Docker images along with RedisInsight").

Another option is to set up a Redis cluster and then add the relevant modules. See https://github.com/RedisTimeSeries/RedisTimeSeries?tab=readme-ov-file#build-it-yourself for instructions for building RedisTimeSeries from source code and adding it to your redis.conf file.
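
As a rough illustration of the "add it to your redis.conf file" step, here is a minimal sketch that renders the configuration lines for one cluster node. The directives (`port`, `cluster-enabled`, `cluster-config-file`, `loadmodule`) are standard redis.conf directives; the port number and the module path are assumptions for illustration only, and the module file must actually exist at that path inside the container or on the host.

```python
def render_node_conf(port: int, module_path: str) -> str:
    """Return redis.conf text for one cluster node that loads one module.

    module_path is a hypothetical location of the compiled module; adjust it
    to wherever the build output was copied.
    """
    lines = [
        f"port {port}",
        "cluster-enabled yes",                     # run this node in cluster mode
        f"cluster-config-file nodes-{port}.conf",  # node state file managed by Redis itself
        f"loadmodule {module_path}",               # load the module at startup
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical path, not taken from the thread.
    print(render_node_conf(7000, "/opt/redis/modules/redistimeseries.so"))
```

Every node in the cluster would need the same `loadmodule` line, since any node can end up owning time series keys.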

You can also check our Redis Enterprise Software and Redis Cloud.

@LiorKogan LiorKogan added the question Further information is requested label Dec 29, 2023
@krebznet
Author

krebznet commented Dec 29, 2023

Thank you @LiorKogan - I am following the second approach you suggested, and I am able to build Redis without issues. However, when I try to build the time series module, the build creates a RedisTimeSeries directory, but I am not sure whether I should expect a bin folder inside it containing the compiled timeseries.so file, which would be the module I instruct Redis to load. Below is a screenshot of the folder contents and the build commands I am executing. Are there more scripts to run in order to build timeseries.so? Any help would be greatly appreciated; I'm on day 3 of struggling with this.

[Screenshot 2023-12-29 at 7 38 41 AM: folder contents and build commands]

Searching for a file ending in .so, which is the module I would need to load, returns 0 results. Maybe I am being dumb here, not sure.
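
For reference, the search described above can be sketched as a recursive scan for `.so` files under the checkout directory. The directory passed in is an assumption (the current directory here); an empty result means the build did not produce a shared object anywhere under that root.

```python
from pathlib import Path


def find_shared_objects(root: str) -> list[Path]:
    """Return every regular file ending in .so under root, sorted by path."""
    return sorted(p for p in Path(root).rglob("*.so") if p.is_file())


if __name__ == "__main__":
    # "." is a placeholder; point this at the RedisTimeSeries checkout.
    matches = find_shared_objects(".")
    for path in matches:
        print(path)
    if not matches:
        print("no .so files found under this directory")
```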

[Screenshot 2023-12-29 at 7 42 12 AM: search results for .so files]

@LiorKogan
Member

LiorKogan commented Dec 29, 2023

The build process should work fine; I just tested it. Yes, you should have a bin folder.

Try to execute the commands one by one (starting with the apt-get install -y git).

If you didn't build Redis according to the instructions on that page - try running the two apt-get commands specified in the Redis build instructions first.

@krebznet
Author

krebznet commented Dec 30, 2023

@LiorKogan - I still could not get it to build, but instead of wasting your time I'm having some resources within my team work on it. I do have a question for you, however, as I am focused on building the application and you seem to know the time series module pretty well. What would you suggest I do? I have 8,000 stocks coming in from a market feed that I enrich to create about 50 derived variables for each stock. For each of those variables I use a time series key that identifies the stream ID, the entity ID (the stock), and the variable ID, so I have a unique key to look up a variable value or aggregation for a specific stream/stock/variable. My question is: over time, is it best to create new keys for each variable, so that every day I have a key holding that day's values for a specific entity, or is it fine to use just one key for all the values? To give context, every day the stream/stock/variable key will add 3,840,000 (3.8 million) values, and in 5 days it would be 3.8M x 5. Would I be better off with a key per entity/variable per day that has 3.8 million values, or one key per entity/variable that will have millions of time series values? I am hoping for the second, because then I could do better aggregations over multiple days. Does the number of values in a time series key impact performance? Thanks, man.
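
To make the two layouts in this question concrete, here is a small sketch of both key schemes plus the sample-count arithmetic. The `stream:entity:variable` format, the date suffix, and the example IDs are assumptions for illustration; the thread does not show the actual key format. The 3,840,000 figure is the daily count quoted above.

```python
from datetime import date

SAMPLES_PER_DAY = 3_840_000  # daily figure quoted in the comment above


def series_key(stream_id: str, entity_id: str, variable_id: str) -> str:
    """Option 2: one long-lived key per stream/entity/variable."""
    return f"{stream_id}:{entity_id}:{variable_id}"


def daily_series_key(stream_id: str, entity_id: str, variable_id: str, day: date) -> str:
    """Option 1: a separate key per stream/entity/variable per day."""
    return f"{series_key(stream_id, entity_id, variable_id)}:{day:%Y%m%d}"


def samples_after(days: int) -> int:
    """Samples accumulated in a single long-lived key after the given number of days."""
    return SAMPLES_PER_DAY * days


if __name__ == "__main__":
    # Hypothetical IDs, not taken from the thread.
    print(series_key("feed", "AAPL", "vwap"))
    print(daily_series_key("feed", "AAPL", "vwap", date(2023, 12, 29)))
    print(samples_after(5))  # 3,840,000 x 5 = 19,200,000
```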

@LiorKogan
Member

LiorKogan commented Dec 31, 2023

Over time is it best to create new keys for each variable so that maybe every day I have a key for the variable values of a specific entity for that day or is it fine to just create one key for all the values?

Using keys with billions of samples each is perfectly fine. As long as you add new data in order (each new timestamp later than the previous one), the number of samples should not affect performance. Aggregating over historical timeframes (e.g., 1-day timeframes) can become just a tad slower as the series grows.
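
A sketch of what "in order" and "aggregating over 1-day timeframes" look like as RedisTimeSeries commands. It only builds the argument lists for `TS.ADD` and `TS.RANGE ... AGGREGATION`, which could then be handed to whatever client is in use; nothing here talks to a server. The key name and sample values are hypothetical, and timestamps are assumed to be Unix milliseconds.

```python
DAY_MS = 24 * 60 * 60 * 1000  # one day in milliseconds


def ts_add_args(key: str, timestamp_ms: int, value: float) -> list[str]:
    """Arguments for appending one sample: TS.ADD key timestamp value."""
    return ["TS.ADD", key, str(timestamp_ms), str(value)]


def ts_range_daily_args(key: str, from_ms: int, to_ms: int, aggregator: str = "avg") -> list[str]:
    """Arguments for a range query aggregated into 1-day buckets."""
    return ["TS.RANGE", key, str(from_ms), str(to_ms),
            "AGGREGATION", aggregator, str(DAY_MS)]


def is_in_order(timestamps: list[int]) -> bool:
    """True if every timestamp is strictly later than the one before it."""
    return all(a < b for a, b in zip(timestamps, timestamps[1:]))


if __name__ == "__main__":
    key = "feed:AAPL:vwap"  # hypothetical key name
    print(ts_add_args(key, 1703851200000, 101.5))
    print(ts_range_daily_args(key, 1703462400000, 1703894400000))
    print(is_in_order([1703851200000, 1703851200250, 1703851200500]))
```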

That being said, splitting time series by timeframes will allow you to archive old data. Redis Enterprise has an Auto Tiering capability to swap less frequently accessed keys between SSDs and RAM.
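
If the series were split per day, picking which keys are old enough to archive reduces to a date comparison. A minimal sketch, assuming a hypothetical `base:YYYYMMDD` key suffix and a hypothetical retention window; how the archived data is exported is out of scope here.

```python
from datetime import date, timedelta


def keys_to_archive(base_key: str, days_present: list[date],
                    today: date, keep_days: int) -> list[str]:
    """Return the per-day keys whose day is more than keep_days before today."""
    cutoff = today - timedelta(days=keep_days)
    return [f"{base_key}:{d:%Y%m%d}" for d in sorted(days_present) if d < cutoff]


if __name__ == "__main__":
    # Hypothetical key and dates: five trading days, keep the last three.
    days = [date(2023, 12, 27) + timedelta(days=i) for i in range(5)]
    print(keys_to_archive("feed:AAPL:vwap", days, date(2024, 1, 1), keep_days=3))
```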

@krebznet
Author

krebznet commented Jan 1, 2024

@LiorKogan - I feel like I am pressing my luck; I just respect the knowledge you have around this. I was able to build a custom Docker image with the time series module, a good milestone. What you wrote above is incredible; the fact that a time series key can handle that much data is awesome. I'm writing about 300,000 key/values a second across a distributed system pointing to a single redis-stack instance. What I am a bit nervous about is whether time series will work through a Java Jedis client connection to the cluster. From what I read, you need RedisGears installed for time series to work in a cluster, and I am not sure whether RedisGears is a module or not. I am also curious whether, if I do have issues with time series on a cluster, adding a Redis proxy layer would resolve them. Any insight here would be greatly appreciated. Your time series module is truly awesome. The issue on a single node is the persistence snapshot that runs at fixed time intervals and blocks incoming writes. I do want to persist the data, and I am thinking that I could switch to AOF (append-only file) persistence and possibly do persistence on a replica rather than the master node. I hope I can give back with a forked redis-cluster Helm chart that resolves all these issues so people can use it as a reference. Thanks again - Duncan/Krebznet
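
As background for the cluster question: Redis Cluster routes each key to one of 16384 hash slots by taking CRC16 of the key, or of the hash tag between the first `{` and the next `}` when that tag is non-empty. Each time series key therefore lives on exactly one shard, and a cluster-aware client sends single-key commands to that shard. A minimal sketch of the slot calculation, assuming Python's `binascii.crc_hqx` with an initial value of 0 as the CRC16 variant; the key names are hypothetical. It does not settle whether RedisGears is required for cross-shard queries.

```python
from binascii import crc_hqx

HASH_SLOTS = 16384  # number of hash slots in a Redis Cluster


def key_hash_slot(key: str) -> int:
    """Return the cluster hash slot for a key, honoring {hash tags}."""
    data = key.encode()
    start = data.find(b"{")
    if start != -1:
        end = data.find(b"}", start + 1)
        # Only a non-empty tag replaces the full key as the hashed content.
        if end != -1 and end != start + 1:
            data = data[start + 1:end]
    return crc_hqx(data, 0) % HASH_SLOTS


if __name__ == "__main__":
    # Untagged keys land on independent slots.
    print(key_hash_slot("feed:AAPL:vwap"), key_hash_slot("feed:AAPL:volume"))
    # Keys sharing a hash tag land on the same slot.
    print(key_hash_slot("{feed:AAPL}:vwap"), key_hash_slot("{feed:AAPL}:volume"))
```

Hash tags trade spread for locality: tagging by stock would keep all of one stock's series on one shard, while untagged keys distribute the write load more evenly.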

@krebznet
Author

krebznet commented Jan 1, 2024

P.S. - Happy to close the issue; I just want to see if you respond to the last question above.


2 participants