Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

refact(session connection): remove session connection state table #4617

Draft
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

irenarindos
Copy link
Collaborator

WIP

This PR removes the session_connection_state table, replacing it with the new field connected_time_range on session_connection

Comment on lines 27 to 34
drop table session_connection_state;
drop table session_connection_state_enm;

-- If the connected_time_range is null, it means the connection is authorized but not connected.
-- If the upper value of connected_time_range is > now() (upper range is infinity) then the state is connected.
-- If the upper value of connected_time_range is <= now() then the connection is closed.
alter table session_connection
add column connected_time_range tstzrange;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not add the column and migrate the data from session_connection_state?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you think it's worth migrating? I'm assuming that the control plane would go down for maintenance, and if we regularly remove terminated sessions from the db it seems like we should be fine to drop that table because sessions should be in a terminated state.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should migrate the data. We only delete the sessions after they have been terminated for an hour. So once the controllers are back up there will be an hour before the previous sessions are deleted, and I think we should report an accurate connection state for that time period.

@irenarindos irenarindos force-pushed the irindos-session-connection-states branch 5 times, most recently from 993b4b3 to 17b5bc3 Compare April 29, 2024 19:56
@irenarindos irenarindos force-pushed the irindos-session-connection-states branch from 17b5bc3 to ea418af Compare April 30, 2024 16:17
Copy link
Member

@tmessi tmessi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still need to review the go code, but I have a few comments/requests on the sql.

Comment on lines +36 to +40
-- Insert on session_connection creates the connection entry, leaving the connected_time_range to null, indicating the connection is authorized
-- "Connected" is handled by the function ConnectConnection, which sets the connected_time_range lower bound to now() and upper bound to infinity
-- "Closed" is handled by the trigger function, update_connected_time_range_on_closed_reason, which sets the connected_time_range upper bound to now()
-- State transitions are guarded by the trigger function, check_connection_state_transition, which ensures that the state transitions are valid

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this comment block for the function? If so, we should remove this black line between the comment block and the function.

-- "Closed" is handled by the trigger function, update_connected_time_range_on_closed_reason, which sets the connected_time_range upper bound to now()
-- State transitions are guarded by the trigger function, check_connection_state_transition, which ensures that the state transitions are valid

create or replace function check_connection_state_transition() returns trigger
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this a new function? If so we can just use create function not create or replace. If it is not new, we should document which migration this is replacing, and preferably would perform a drop then create instead of create or replace.

I have this same question for the other functions that are defined in this migration.

Comment on lines +71 to +82
perform from
session_connection cs
where
cs.public_id = new.public_id and
-- If the connection is already closed, there's no need to update the connected_time_range
upper(cs.connected_time_range) <= now();
if not found then
update session_connection
set
connected_time_range = tstzrange(lower(connected_time_range), now())
where
public_id = new.public_id;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could this just be an update with the necessary where clause?

Suggested change
perform from
session_connection cs
where
cs.public_id = new.public_id and
-- If the connection is already closed, there's no need to update the connected_time_range
upper(cs.connected_time_range) <= now();
if not found then
update session_connection
set
connected_time_range = tstzrange(lower(connected_time_range), now())
where
public_id = new.public_id;
update session_connection
set connected_time_range = tstzrange(lower(connected_time_range), now())
where public_id = new.public_id
and cs.connected_time_range = tstzrange(lower(cs.connected_time_range), 'infinity'::timestamptz);

Comment on lines +104 to +105
-- open connections will have a connected_time_range with an upper bound of infinity
upper(connected_time_range) > now()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should check if this is equal to 'infinity'::timestamptz.

Comment on lines +96 to +109
perform from
session
where
public_id = new.public_id and
public_id not in (
select session_id
from session_connection
where
-- open connections will have a connected_time_range with an upper bound of infinity
upper(connected_time_range) > now()
);
if not found then
raise 'session %s has open connections', new.public_id;
end if;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this can be done without the subquery and without using not in:

perform
   from session_connection
  where session_id                  = new.public_id
    and upper(connected_time_range) = 'infinity'::timestamptz;
if found then
  raise 'session %s has open connections', new.public_id;
end if;

Comment on lines +135 to +180
with
authorized_timestamp (date_dim_key, time_dim_key, ts) as (
select wh_date_key(create_time), wh_time_key(create_time), create_time
from session_connection
where public_id = new.public_id
and connected_time_range is null
),
session_dimension (host_dim_key, user_dim_key, credential_group_dim_key) as (
select host_key, user_key, credential_group_key
from wh_session_accumulating_fact
where session_id = new.session_id
)
insert into wh_session_connection_accumulating_fact (
connection_id,
session_id,
host_key,
user_key,
credential_group_key,
connection_authorized_date_key,
connection_authorized_time_key,
connection_authorized_time,
client_tcp_address,
client_tcp_port_number,
endpoint_tcp_address,
endpoint_tcp_port_number,
bytes_up,
bytes_down
)
select new.public_id,
new.session_id,
session_dimension.host_dim_key,
session_dimension.user_dim_key,
session_dimension.credential_group_dim_key,
authorized_timestamp.date_dim_key,
authorized_timestamp.time_dim_key,
authorized_timestamp.ts,
new.client_tcp_address,
new.client_tcp_port,
new.endpoint_tcp_address,
new.endpoint_tcp_port,
new.bytes_up,
new.bytes_down
from authorized_timestamp,
session_dimension
returning * into strict new_row;
return null;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here, and elsewhere can you use river formatting, i.e.:

Suggested change
with
authorized_timestamp (date_dim_key, time_dim_key, ts) as (
select wh_date_key(create_time), wh_time_key(create_time), create_time
from session_connection
where public_id = new.public_id
and connected_time_range is null
),
session_dimension (host_dim_key, user_dim_key, credential_group_dim_key) as (
select host_key, user_key, credential_group_key
from wh_session_accumulating_fact
where session_id = new.session_id
)
insert into wh_session_connection_accumulating_fact (
connection_id,
session_id,
host_key,
user_key,
credential_group_key,
connection_authorized_date_key,
connection_authorized_time_key,
connection_authorized_time,
client_tcp_address,
client_tcp_port_number,
endpoint_tcp_address,
endpoint_tcp_port_number,
bytes_up,
bytes_down
)
select new.public_id,
new.session_id,
session_dimension.host_dim_key,
session_dimension.user_dim_key,
session_dimension.credential_group_dim_key,
authorized_timestamp.date_dim_key,
authorized_timestamp.time_dim_key,
authorized_timestamp.ts,
new.client_tcp_address,
new.client_tcp_port,
new.endpoint_tcp_address,
new.endpoint_tcp_port,
new.bytes_up,
new.bytes_down
from authorized_timestamp,
session_dimension
returning * into strict new_row;
return null;
with
authorized_timestamp (date_dim_key, time_dim_key, ts) as (
select wh_date_key(create_time), wh_time_key(create_time), create_time
from session_connection
where public_id = new.public_id
and connected_time_range is null
),
session_dimension (host_dim_key, user_dim_key, credential_group_dim_key) as (
select host_key, user_key, credential_group_key
from wh_session_accumulating_fact
where session_id = new.session_id
)
insert into wh_session_connection_accumulating_fact (
connection_id,
session_id,
host_key,
user_key,
credential_group_key,
connection_authorized_date_key,
connection_authorized_time_key,
connection_authorized_time,
client_tcp_address,
client_tcp_port_number,
endpoint_tcp_address,
endpoint_tcp_port_number,
bytes_up,
bytes_down
)
select new.public_id,
new.session_id,
session_dimension.host_dim_key,
session_dimension.user_dim_key,
session_dimension.credential_group_dim_key,
authorized_timestamp.date_dim_key,
authorized_timestamp.time_dim_key,
authorized_timestamp.ts,
new.client_tcp_address,
new.client_tcp_port,
new.endpoint_tcp_address,
new.endpoint_tcp_port,
new.bytes_up,
new.bytes_down
from authorized_timestamp,
session_dimension
returning * into strict new_row;
return null;

return null;
end if;

if upper(new.connected_time_range) > now() then
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this should check if = 'infinity'::timestamptz

Comment on lines +211 to +221
date_col = 'connection_' || state || '_date_key';
time_col = 'connection_' || state || '_time_key';
ts_col = 'connection_' || state || '_time';

q = format('update wh_session_connection_accumulating_fact
set (%I, %I, %I) = (select wh_date_key(%L), wh_time_key(%L), %L::timestamptz)
where connection_id = %L
returning *',
date_col, time_col, ts_col,
new.update_time, new.update_time, new.update_time,
new.public_id);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we just do these updates within the two separate cases of the if above and avoid the use of format and execute?

Comment on lines +14 to +16
update session_connection
set connected_time_range=tstzrange(now(),'infinity')
where public_id = 's1c1___clare';
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Likewise, can you format blocks like these to form a river?:

Suggested change
update session_connection
set connected_time_range=tstzrange(now(),'infinity')
where public_id = 's1c1___clare';
update session_connection
set connected_time_range = tstzrange(now(), 'infinity')
where public_id = 's1c1___clare';

@psekar psekar modified the milestones: deferred, 0.16.x May 16, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants