Create Overview #2591

Open
wants to merge 26 commits into
base: latest

Conversation

Contributor

@Loquacity Loquacity commented Aug 2, 2023

Description

[Short summary of why you created this PR]

Links

Writing help

For information about style and word usage, see the style guide

Review checklists

Reviewers: use this section to ensure you have checked everything before approving this PR:

Subject matter expert (SME) review checklist

  • Is the content technically accurate?
  • Is the content complete?
  • Is the content presented in a logical order?
  • Does the content use appropriate names for features and products?
  • Does the content provide relevant links to further information?

Documentation team review checklist

  • Is the content free from typos?
  • Does the content use plain English?
  • Does the content contain clear sections for concepts, tasks, and references?
  • Have any images been uploaded to the correct location, and are resolvable?
  • If the page index was updated, are redirects required
    and have they been implemented?
  • Have you checked the built version of this content?

@github-actions

github-actions bot commented Aug 2, 2023

Allow 10 minutes from last push for the staging site to build. If the link doesn't work, try using incognito mode instead. For internal reviewers, check web-documentation repo actions for staging build status. Link to build for this PR: http://docs-dev.timescale.com/docs-dev-overview-lana

@github-actions

Site build failed. For Timescale internal contributors, check the logs in the web-documentation repo to see the failure reason. For help, contact the docs team.

@Loquacity Loquacity marked this pull request as ready for review September 12, 2023 10:27
@Loquacity Loquacity requested review from a team, mkindahl and iroussos September 12, 2023 10:27
Contributor

@mkindahl mkindahl left a comment

Partial review. I will look at the rest later.

Continuous aggregates require a `time_bucket` on the time partitioning column of
the hypertable.

By default, views are automatically refreshed. You can adjust this by setting

Suggested change
- By default, views are automatically refreshed. You can adjust this by setting
+ By default, views are automatically refreshed when they are created. You can adjust this by using

the [WITH NO DATA](#using-the-with-no-data-option) option. Additionally, the
view can not be a [security barrier view][postgres-security-barrier].

Continuous aggregates use hypertables in the background, which means that they

Suggested change
- Continuous aggregates use hypertables in the background, which means that they
+ Continuous aggregates use hypertables internally, which means that they

Comment on lines +1 to +7
Research has shown that when data is newly ingested, the queries are more likely
to be shallow in time, and wide in columns. Generally, they are debugging
queries, or queries that cover the whole system, rather than specific, analytic
queries. An example of the kind of query more likely for new data is "show the
current CPU usage, disk usage, energy consumption, and I/O for a particular
server". When this is the case, the uncompressed data has better query
performance, so the native PostgreSQL row-based format is the best option.

Hmm... not sure if "debugging queries" is the best categorization of this.

Suggested change
- Research has shown that when data is newly ingested, the queries are more likely
- to be shallow in time, and wide in columns. Generally, they are debugging
- queries, or queries that cover the whole system, rather than specific, analytic
- queries. An example of the kind of query more likely for new data is "show the
- current CPU usage, disk usage, energy consumption, and I/O for a particular
- server". When this is the case, the uncompressed data has better query
- performance, so the native PostgreSQL row-based format is the best option.
+ For newly ingested data, the queries are usually shallow in time, and wide in
+ columns. At this stage, the queries delve into details of the system. An
+ example of the kind of query more likely for new data is "show the
+ current CPU usage, disk usage, energy consumption, and I/O for a particular
+ server". When this is the case, the uncompressed data has better query
+ performance, so the native PostgreSQL row-based format is the best option.

Comment on lines +15 to +17
The result is transparent queries across standard PostgreSQL storage and S3
storage, so your queries fetch the same data as before, with minimal added
latency.

Would focus on the utility, not on performance.

Suggested change
- The result is transparent queries across standard PostgreSQL storage and S3
- storage, so your queries fetch the same data as before, with minimal added
- latency.
+ As a result, you can write queries that seamlessly read both tiered and untiered data.

Comment on lines +1 to +2
When you create and use a hypertable, it automatically partitions data by time,
and optionally by space.

Suggested change
- When you create and use a hypertable, it automatically partitions data by time,
- and optionally by space.
+ Hypertables are used to automatically partition data: traditionally using time, but hypertables can also be used to partition data in other dimensions.

Comment on lines +4 to +7
Each hypertable is made up of child tables called chunks. Each chunk is assigned
a range of time, and only contains data from that range. If the hypertable is
also partitioned by space, each chunk is also assigned a subset of the space
values.

You can partition using multiple time dimensions and multiple space dimensions, so I suggest elaborating a little on this.

Suggested change
- Each hypertable is made up of child tables called chunks. Each chunk is assigned
- a range of time, and only contains data from that range. If the hypertable is
- also partitioned by space, each chunk is also assigned a subset of the space
- values.
+ Each hypertable is made up of child tables called chunks. Each chunk is assigned
+ a range of time, and only contains data from that range. If the hypertable is
+ also partitioned by other dimensions, each chunk is also assigned a subset of the values in that dimension.

Comment on lines +1 to +3
Timescale is the database platform built for developers. It's engineered to
deliver speed and scale for your resource-intensive workloads—like those using
time series, event, and analytics data.

I think that we also want to push more on the ease of use. Hypertables perform well, but they are also hassle-free because they handle automatic partitioning and manage data in different stages of the life cycle.

Suggested change
- Timescale is the database platform built for developers. It's engineered to
- deliver speed and scale for your resource-intensive workloads—like those using
- time series, event, and analytics data.
+ Timescale is the database platform built for developers. It's engineered to
+ deliver speed and scale, without hassle, for your resource-intensive workloads—like those using
+ time series, event, and analytics data.

Comment on lines +5 to +7
* _PostgreSQL++_ - Timescale is the PostgreSQL you know and love, giving you
access to the entire PostgreSQL ecosystem, but Timescale has additional
features like hypertables, compression and continuous aggregates.

I wonder if we want to add a bullet around something like "Designed for Data Intensive Applications".

Comment on lines +13 to +16
Time-series data grows very quickly, and as the data grows, analyzing it gets
slower and uses more resources. Timescale solves the slow-down with continuous
aggregates. Based on PostgreSQL materialized views, continuous aggregates are
incrementally and continuously updated, to make them lightning fast.

Suggested change
- Time-series data grows very quickly, and as the data grows, analyzing it gets
- slower and uses more resources. Timescale solves the slow-down with continuous
- aggregates. Based on PostgreSQL materialized views, continuous aggregates are
- incrementally and continuously updated, to make them lightning fast.
+ For data-intensive applications, the amount of data that needs to be managed grows very quickly, and as the data grows, analyzing it gets
+ slower and uses more resources. Timescale solves the slow-down with continuous
+ aggregates. Based on PostgreSQL materialized views, continuous aggregates are
+ incrementally and continuously updated, to make them lightning fast.

Comment on lines +37 to +38
When you are working with time-series and event data, storage costs can easily
spiral out of control. With Timescale, you never have to worry about hidden

Maybe talk more about "data-intensive applications that collect large amounts of time-series and events data".

@Loquacity Loquacity closed this Sep 15, 2023
@iroussos
Contributor

Reopening this PR as it seems to have been closed by accident

@iroussos iroussos reopened this Sep 15, 2023
@leeshyan leeshyan enabled auto-merge (squash) October 10, 2023 01:02
4 participants