Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Create user last connection datetime #935

Open
wants to merge 13 commits into
base: main
Choose a base branch
from

Conversation

CamilleLegeron
Copy link
Collaborator

resolve #924

Each time the a Grist page is reload the last_connection_at of the user is updated

@CamilleLegeron CamilleLegeron requested review from fflorent and removed request for fflorent April 15, 2024 13:22
@@ -49,7 +51,7 @@ const migrations = [Initial, Login, PinDocs, UserPicture, DisplayEmail, DisplayE
CustomerIndex, ExtraIndexes, OrgHost, DocRemovedAt, Prefs,
ExternalBilling, DocOptions, Secret, UserOptions, GracePeriodStart,
DocumentUsage, Activations, UserConnectId, UserUUID, UserUniqueRefUUID,
Forks, ForkIndexes, ActivationPrefs, AssistantLimit, Shares];
Forks, ForkIndexes, ActivationPrefs, AssistantLimit, Shares, UserLastConnection];
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tests fails with error no such column: users.last_connection_at triggered when run migration UserUUID
because it uses the entity User that has no last_connection_at for the moment (the migration come later) : see here
Locally when I put the migration UserLastConnection before UserUUID it works but I think it's not a good fixe.
What about duplicate the entity inside the migration as it has to be at this point ? (Not convince about this approach)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We'll need to update UserUUID and any other affected migrations to not depend on the User entity. I don't know of any workarounds, unfortunately.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How do you imagine updating UserUUID to not depend ont the User entity has it exists to modify this entity ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Replacing the methods called on the User entity with equivalent raw SQL is one idea. (typeorm supports executing raw SQL via the query method.)

The chunk option could be replaced with a for loop that does the UPDATE calls in chunks.

@fflorent
Copy link
Collaborator

After a quick glance, it looks fine to me.

Though I wonder what happens if a user continues to be logged in as their session won't expire? In such a case, even if that's not really probable that their session won't never expire in years (that being said, maybe this can happen?), the data may not reflect correctly the last connection.

Maybe we could take advantage of the Authorizer and specifically of the addUserRequest method to update the field?

export async function addRequestUser(

We may only store the date without the time (so a date type instead of a datetime) in lastConnectionAt, and update only when the current date is not the same as in stored in the database?

And if an admin wants to know the activity of a user to the nearest second, they may take a look at the logs for that.

@CamilleLegeron
Copy link
Collaborator Author

CamilleLegeron commented Apr 15, 2024

We may only store the date without the time (so a date type instead of a datetime)

I'm agree and I can change that

Though I wonder what happens if a user continues to be logged in as their session won't expire?

In the code I made the last connection is updated when the page is loaded or reloaded don't no matter user login. But in your proposal we updated it on each action, it's more precise

Copy link
Collaborator

@fflorent fflorent left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I pushed a comment, but I would need to take time to make this more helpful regarding the timezone.

today.getDate() !== user.lastConnectionAt?.getDate()){
user.lastConnectionAt = today;
needsSave = true;
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You may use moment otherwise:

const today = moment().startOf('day').toDate();
if (today !== moment(user.lastConnectAt).startOf('day')) {
  user.lastConnectAt = today.toDate();

(NB: the tricky part is that it should handle correctly the timezone… Probably needs unit-testing)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

using import moment = require('moment-timezone'); it take timezone in account
It's in fact better to read

@fflorent fflorent self-requested a review April 18, 2024 12:57
@@ -501,6 +501,10 @@ export class Client {
const user = this._profile ? await this._fetchUser(dbManager) : dbManager.getAnonymousUser();
this._user = user ? dbManager.makeFullUser(user) : undefined;
this._firstLoginAt = user?.firstLoginAt || null;
if (this._user) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it important for you to also update lastConnectionAt when a user visits a non-document page? It's uncommon for a user to visit Grist without opening a document, but it could happen.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I think it will be better, it's still an action in Grist, well seen

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In that case, I think @fflorent's suggestion to modify Authorizer.addRequestUser is a reasonable approach.

Here's the relevant snippet of code modified to save the lastConnectionAt in the session object (if it differs from the current date). You'll need to tweak the types in a few other files, and possibly the types of the stored dates themselves. (I haven't run the code, so other changes may be necessary; this is just a rough starting point.)

      ...
      // If we haven't computed a userId yet, check for one using an email address in the profile.
      // A user record will be created automatically for emails we've never seen before.
      if (profile && !mreq.userId) {
        const userOptions: UserOptions = {};
        if (profile?.loginMethod === 'Email + Password') {
          // Link the session authSubject, if present, to the user. This has no effect
          // if the user already has an authSubject set in the db.
          userOptions.authSubject = sessionUser.authSubject;
        }
        const today = moment().startOf('day');
        if (today !== moment(sessionUser.lastConnectionAt).startOf('day')) {
          userOptions.lastConnectionAt = today.toDate();
          const scopedSession = options.gristServer.getSessions().getOrCreateSessionFromRequest(req);
          await scopedSession.updateUser(req, {lastConnectionAt: today});
        }
        const user = await dbManager.getUserByLoginWithRetry(profile.email, {profile, userOptions});
        if (user) {
          mreq.user = user;
          mreq.userId = user.id;
          mreq.userIsAuthorized = true;
        }
      }
      ...

With something like the above, I don't think we'd need the changes to Client.ts anymore, right? (As opening a document should hit a home server endpoint, which will involve calling Authorizer.addRequestUser.)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think in fact it's a better place to modify lastConnectionAt but why modify the scopedSession ? The only place we need the information to be saved is in the table users so we want to use dbManager.updateUser and not scopedSession.updateUser. If the user not exists we don't need to save the information.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That sounds even better.

(I thought having it in the session might come in handy with tracking doc worker activity later, but it looks like Authorizer already fetches and caches a copy of the user there too, so no need to stash anything in the session.)

@@ -358,6 +359,10 @@ export async function addRequestUser(
mreq.user = user;
mreq.userId = user.id;
mreq.userIsAuthorized = true;
const today = moment().startOf('day');
if (today !== moment(user.lastConnectionAt).startOf('day')) {
await dbManager.updateUser(mreq.userId, {lastConnectionAt: today.toDate()});
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we do this from getUserByLoginWithRetry instead? It also (optionally) updates the user.

await queryRunner.manager.save(userList, { chunk: 300 });
const userList = await queryRunner.query("SELECT * FROM users;");
let transaction = "";
for (let i = 0; i < userList.length; i += 1) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we could group the users into sub-arrays of size 300 instead? Then, each group of 300 could run serially in a single transaction, with the 300 updates done in parallel. Something like:

const userChunks = chunk(userList, 300);
for (const users of userChunks) {
  await queryRunner.connection.transaction(async manager => {
    const queries = users.map(user => {
      return manager.query(`UPDATE users SET ref = '${makeId()}' WHERE id = ${user.id}`);
    });
    await Promise.all(queries);
  });
}

Here's how TypeORM chunks entities - doesn't have to be done exactly this way, but it's a useful reference: https://github.com/typeorm/typeorm/blob/e7649d2746f907ff36b1efb600402dedd5f5a499/src/util/OrmUtils.ts#L11

Parameterizing the queries would be safer, but I think there's some nuance to how you do it with sqlite vs. Postgres. I don't see any obvious SQL injection risks, but I haven't looked very hard. If it's not hard to do, I think we should do it.

@berhalak could you also do a review of the changes to the migrations? Thanks.

Copy link
Member

@paulfitz paulfitz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The functionality seems basically ok, thanks @CamilleLegeron . I am grateful that the user record isn't updated excessively. I have a collection of very nitpicky, linty comments.

@@ -2,6 +2,8 @@ import {DatabaseType, QueryRunner, SelectQueryBuilder} from 'typeorm';
import {RelationCountLoader} from 'typeorm/query-builder/relation-count/RelationCountLoader';
import {RelationIdLoader} from 'typeorm/query-builder/relation-id/RelationIdLoader';
import {RawSqlResultsToEntityTransformer} from "typeorm/query-builder/transformer/RawSqlResultsToEntityTransformer";
import {makeId} from 'app/server/lib/idUtils';
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add new imports in alphabetical order by filename.

import {MigrationInterface, QueryRunner, TableColumn} from "typeorm";
import {addRefToUserList} from "../sqlUtils";
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please expand filenames in imports to app/... (helps in grist-electron, grist-static - or rather, it is hard to explain to people why relative paths will sometimes work and sometimes not there)

import {MigrationInterface, QueryRunner} from "typeorm";
import {addRefToUserList} from "../sqlUtils";
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please expand filenames in imports to app/...

@@ -0,0 +1,16 @@
import { MigrationInterface, QueryRunner, TableColumn} from 'typeorm';
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Either matched spaces { X } or no spaces {X}, not unbalanced spaces { X}

const queries = users.map((user: any, _index: number, _array: any[]) => {
return manager.query(
`UPDATE users
SET ref = ${addParamToQuery(dbType, 1)}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall with the addParamToQuery function included this feels quite messy compared to the original queryRunner.manager.save setup? I probably missed it earlier in development, but what was the main reason for the change?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's some context here.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, I understand now. Would it be possible to narrow the scope of the selects, like for example in

public async up(queryRunner: QueryRunner): Promise<any> {
// Get all orgs and add a team member ACL (with group) to each.
const orgs = await queryRunner.manager.createQueryBuilder()
.select("orgs.id")
.from(Organization, "orgs")
.getMany();
for (const org of orgs) {
const groupInsert = await queryRunner.manager.createQueryBuilder()
.insert()
.into(Group)
.values([{name: roles.MEMBER}])
.execute();
const groupId = groupInsert.identifiers[0].id;
await queryRunner.manager.createQueryBuilder()
.insert()
.into(AclRuleOrg)
.values([{
permissions: Permissions.VIEW,
organization: {id: org.id},
group: groupId
}])
.execute();
}
}
? Hmm but then there is the actual updates to do...

Bundling does seem important since there are a lot of users. But I'm afraid if this code had a subtle bug we wouldn't catch it, and previous times we have had subtle bugs in migrations it has been a mess. I think I need to ask for this method's functionality to be given a set of unit tests. The addParamToQuery function that does one thing for Postgres and then ignores an argument for SQLite, relying on the user to call things in the right order, also feels dangerous - does it really need to be exported, can it be at least kept private?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
Status: In Progress
Development

Successfully merging this pull request may close these issues.

Login: store the last connection
4 participants