Accounts in Database + Account Manager #2424

Open
wants to merge 29 commits into
base: develop
from

Conversation

neskk
Contributor

@neskk neskk commented Jan 6, 2018

[Beta/Safe Stage] - Mostly tested, need more feedback.
Reworked pretty much every place where accounts are used and created a single entity responsible for handling all operations: an account manager.

Description

Accounts are now saved in the database, their status is kept updated.
AccountManager handles all the queues and is responsible for keeping account information up to date.
When an error/captcha is encountered accounts will remain allocated to current instance - preserving old behavior.

  • Detects possible duplicate instance launches. This is an attempt to prevent accounts from being allocated to two instances at the same time. See discussion below. Instance ID should be unique to your configuration. When restarting an instance, allow it to cool off for 10 seconds, then the error message won't be displayed.
  • Captcha code refactored into the account manager - generic way of handling captchas and reduced duplicate code.
  • Shadow ban detection and handling. --shadow-ban-scan: true allows these accounts to continue scanning. Note: detection is never disabled and will record no-rares in thread status.
    • Updated common Pokemon list.
  • --hlvl-workers controls the number of spare high-level accounts that account manager will keep allocated. If you had a CSV with 3 accounts, you can achieve the same effect with -hw 3.
  • --hlvl-workers-holding-time Controls how long high-level accounts can remain idle. After this time they will be deallocated and can be picked up by other instances. Defaults to 120 seconds.
  • "On-the-fly" account allocation is enabled by default: --hlvl-workers 0 or -hw 0
    • When we want to encounter a pokemon, account manager will attempt to allocate an account from the database if there's no spare high-level accounts available.
    • --hlvl-workers-holding-time 0: when set to 0, account manager releases the account back to the spare account pool immediately after it's used.
    • Important: If your configuration has bursts of encounters to process, having "on-the-fly" allocation would result in many high-level accounts being allocated and then moved to spare queue. Even with --hlvl-workers-holding-time set to 0, this would be inefficient for setups with large volume of encounters because it would end up recreating the API object every single time, not "respecting" --no-api-store (repeat the whole login simulation process = many requests).
  • --hlvl-workers-max is set by default to 2, and it controls the maximum number of high-level accounts that can be allocated at any given time to the current instance. This setting is only used when --hlvl-workers is 0 ("on-the-fly" allocation) to prevent a burst of encounters from suddenly allocating all the accounts in the database.
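
A rough sketch of how the "on-the-fly" mode described above could behave. This is not the code from this PR: the class `OnTheFlyPool` and its methods are hypothetical, the database is replaced by a plain list, and only the `--hlvl-workers 0` path is modelled.

```python
import time


class OnTheFlyPool:
    """Toy model of on-the-fly high-level account allocation (hypothetical)."""

    def __init__(self, database, max_allocated=2, holding_time=120):
        self.database = list(database)      # stand-in for free accounts in the DB
        self.max_allocated = max_allocated  # plays the role of --hlvl-workers-max
        self.holding_time = holding_time    # --hlvl-workers-holding-time (seconds)
        self.spare = []                     # (account, time it became idle)
        self.in_use = set()

    def acquire(self):
        """Reuse a spare account, else allocate a new one if under the cap."""
        if self.spare:
            account, _ = self.spare.pop()
            self.in_use.add(account)
            return account
        # Spare pool is empty here, so in_use is everything we hold.
        if len(self.in_use) < self.max_allocated and self.database:
            account = self.database.pop(0)
            self.in_use.add(account)
            return account
        return None

    def release(self, account, now=None):
        """Move a used account to the spare pool and remember when."""
        now = time.time() if now is None else now
        self.in_use.discard(account)
        self.spare.append((account, now))

    def expire_idle(self, now=None):
        """Deallocate spare accounts idle for at least holding_time."""
        now = time.time() if now is None else now
        keep = []
        for account, idle_since in self.spare:
            if now - idle_since >= self.holding_time:
                self.database.append(account)  # other instances can pick it up
            else:
                keep.append((account, idle_since))
        self.spare = keep
```

With `holding_time=0` every released account is deallocated on the next `expire_idle()` pass, which is the case where the API object would have to be recreated for each encounter.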

Motivation and Context

Managing CSV files with accounts is kinda boring.
Account handling code was scattered across the project, multiple queues and variables were required, now it should be much simpler to control accounts, allowing more complex logic.

TODOs:

  • Web endpoint to display accounts in the database and provide administration interface:
    • Insert accounts (import accounts from a CSV through web-browser)
    • Export accounts to CSV file.
    • Clear all accounts from the database.
  • (Optional) Provide a configuration parameter to control how account rotation works - currently hard-coded to maximize account re-usage.
  • (Optional) Configurable parameter to allow captchas and failed accounts to be immediately released from the instance. They would no longer remain allocated, so they wouldn't show up on the status printer, but this would allow having a dedicated instance just to solve captchas.
  • (Optional) Stuff like device_info, proxy_url and perhaps asset_download_status (future work - cache downloaded assets in json) could be kept in the database to maximize the consistency of usage.

How Has This Been Tested?

Production environment with MySQL and 40 instances sharing a 10k account pool.

Screenshots (if appropriate):

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)

Checklist:

  • My code follows the code style of this project.
  • My change requires a change to the documentation.
  • I have updated the documentation accordingly.

@neskk neskk force-pushed the pr-accounts-db branch 3 times, most recently from b09eac8 to 9c03040 Compare January 6, 2018 05:40
@sebastienvercammen sebastienvercammen added this to the 4.1.1 milestone Jan 6, 2018
Alderon86 added a commit to Alderon86/RocketMap that referenced this pull request Jan 6, 2018
Alderon86 added a commit to Alderon86/RocketMap that referenced this pull request Jan 7, 2018
@neskk neskk force-pushed the pr-accounts-db branch 7 times, most recently from fd93951 to d8b8f8d Compare January 7, 2018 09:09
Alderon86 added a commit to Alderon86/RocketMap that referenced this pull request Jan 7, 2018
Alderon86 added a commit to Alderon86/RocketMap that referenced this pull request Jan 7, 2018
@Alderon86 Alderon86 mentioned this pull request Jan 7, 2018
5 tasks
@neskk neskk force-pushed the pr-accounts-db branch 7 times, most recently from 0f70f03 to 3a096f4 Compare January 9, 2018 16:56
neskk added 26 commits March 17, 2018 05:45
encounter_pokemon accounts can now solve captcha on the spot.
spin_pokestop uses location from account, no need to pass one more argument.
Removed 'shadowban' field in favor of having a 'banned' field with multiple status.
AccountBanned: 0 (Clear), 1 (Shadowban), 2 (Tempban) and 3 (Permban)
Improved account monitoring/cleanup routine.
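
For reference, the 'banned' statuses listed above could be written as an enum like the one below. This is only an illustration: it assumes Python 3's `enum` module, and the helper `can_scan()` is a hypothetical name for how `--shadow-ban-scan` might be applied.

```python
from enum import IntEnum


class AccountBanned(IntEnum):
    """Values of the 'banned' field as listed in the commit message."""
    Clear = 0
    Shadowban = 1
    Tempban = 2
    Permban = 3


def can_scan(status, shadow_ban_scan=False):
    """Hypothetical helper: decide whether an account may keep scanning."""
    if status == AccountBanned.Clear:
        return True
    if status == AccountBanned.Shadowban:
        # Shadow banned accounts only continue when explicitly allowed.
        return shadow_ban_scan
    return False
```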

Fixed high-level encounter logic, all situations are now correctly handled by the account manager, in particular:
 - When accounts are wrongly inserted as high lvl, they now fail and after resting they are returned to the appropriate queue.
 - Account information wasn't being updated upon account releasing.

Added altitude in some places we were missing it. I don't think it's worth keeping this field in Account database model.
Reverted some odd changes in schedulers.py
parse_accounts_csv will now filter duplicate accounts in files and only declares default account once.
If args.hlvl_workers is set to "0" but encounters are enabled, the Account Manager won't keep the accounts allocated.
This means that the queue of high-level accounts won't hold the account after it is used.
If args.hlvl_kph is set to "0", the speed limit won't be checked and we can maximize the number of encounters an account can do.
Added parameter "--encounter-retries" that controls the maximum number of attempts that an account will retry an encounter.
Some methods inside Account Manager should never be used from "outside" the class - making them "private".
Bug fix on AccountManager::run_manager()
Logging in account manager fixed, no more "duplicate" messages - old debug code.
Account manager thread startup moved outside of class init function. This fixes an issue when starting the scanner without any accounts already in the database: it would only fetch the inserted (new) accounts after a minute (first loop of account manager).
Refactored a bit of code in _allocate_accounts to avoid duplication.
Fixed indentation of all actions that require an account manager - runserver.py
Small bugfixes.
Changed type of Account fields from 'CharField' to 'Utf8mb4CharField'.
Added 'instance_id' to MainWorker and WorkerStatus
A bit of clean up on worker status updates.
Added a check at startup to avoid running duplicate instances.
Fixed some bugs in account.py - reset_account()
When --hlvl-workers is set to "0" it will try to allocate accounts on-the-fly. There was a bug that caused the instance to allocate 5 accounts because it wouldn't break on success.
Overseer stats message now includes more information about the account pool.
Status page "fixed": uses instance_id instead of worker_name to compute MainWorker hash.
WorkerStatus model updated to include "norares" count.
Status page and status printer thread displays the number of scans without rare pokemon (no-rares).
Instance ID generation now takes into account "no-gyms", "no-pokemon", "no-pokestops" and "no-raids" flags - This will help avoid conflicting instance IDs.
Old captcha logic was a bit fuzzy and there was a lot of duplicate code. I think I'm happy with this solution, still needs proper testing.
Reworked logic to unallocate high-level accounts after usage.
"--hlvl-workers-holding-time" is used when "--hlvl-workers" is set to 0 (on-the-fly allocation).
Encounter retry cycles fixed: we don't want to wait for nothing.
Database schema version updated and added schema "migration".
Re-organization of how account manager keeps track of allocated, active and spare accounts.
AccountRecycler: after failing and waiting its holding time in failed queue, banned accounts are released and deallocated from the instance.
AccountKeeper: release excess accounts that aren't being used (-asi << -ari).
Cleaned up handle_captcha() code to be more understandable.
Added separate locks for each account pool and rewrote functions to move accounts between pools - performance improvement.
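
A minimal sketch of the "one lock per pool" idea, not the code from this PR. `Pool` and `move_account()` are hypothetical names, pool names are assumed to be unique, and the locks are always taken in name order so two threads moving accounts in opposite directions cannot deadlock.

```python
import threading
from collections import OrderedDict


class Pool:
    """An account pool guarded by its own lock (hypothetical)."""

    def __init__(self, name):
        self.name = name
        self.lock = threading.Lock()
        self.accounts = OrderedDict()  # username -> account data


def move_account(username, source, target):
    """Move one account between pools; return True if it was moved."""
    if source is target:
        return False
    # Always lock in the same (name) order to avoid deadlocks.
    first, second = sorted((source, target), key=lambda pool: pool.name)
    with first.lock, second.lock:
        account = source.accounts.pop(username, None)
        if account is None:
            return False
        target.accounts[username] = account
        return True
```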
Moved captcha code block to the end of the file - keeping accountManager.py organized.
Reduced timeout to 10 secs on startup duplicate instance launch check. Log message updated to avoid doubts. MainWorker should be updated every 3 secs, so I lowered the cooldown period from 30 secs to 10 secs, which should be enough.
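
The startup check boils down to comparing the last MainWorker update against the cooldown. A minimal sketch, assuming the last update time is available as a timezone-aware datetime; `is_duplicate_instance()` is a hypothetical name, not a function from this PR.

```python
from datetime import datetime, timedelta, timezone

# MainWorker is updated every ~3 secs, so 10 secs of silence is assumed
# to mean the previous process with this instance ID has stopped.
COOLDOWN = timedelta(seconds=10)


def is_duplicate_instance(last_update, now=None, cooldown=COOLDOWN):
    """Return True if the same instance ID was updated within the cooldown."""
    if last_update is None:
        return False  # no record yet: first launch of this instance ID
    now = datetime.now(timezone.utc) if now is None else now
    return now - last_update < cooldown
```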
AccountManager internal methods now all start with '_', e.g. _account_keeper(). Methods that are supposed to be called from outside don't have this '_' prefix.
Tested encounters, with hlvl_kph set to 0, one account can perform several encounters in seconds.
Captcha solver now uses account.py setup_api() - tested.
Many, many, too many changes...
Rebased.
handle_captcha now handles "False" responses from failed requests.
Still have some issues with account allocation in a setup with multiple concurrent instances.
Added a double check that validates which accounts were successfully allocated in case something went wrong with update query.
Removed counters "replenish_count", I used them previously to replace active pool but in the end I figured it's best to keep an OrderedDict with the active account pool.
Refactored and moved all account allocation code to _allocate_accounts and _fetch_accounts.
Reduced the time an account has to wait before retrying to get a spare account. Maybe this can even be lowered further to allow lower excess account holding times.
Turns out account allocation was fine and a bug in WorkerStatus was giving inconsistent results on status page.
WorkerStatus acts as a temporary record keeper for account stats (no rares and captcha count). Other than that there's no more useful information to fetch from previous status since the remaining information (latitude, longitude, last_scan, last_modified) now comes from Account data.
Renamed WorkerStatus field 'last_scan_date' to 'last_scan' to match Account model. Updated and made adjustments where 'last_scan_date' was being used (schedulers.py).
WorkerStatus a.k.a. 'status' field 'last_scan' can now be set to 'None'. This change will make it easier in the future to merge/remove duplicate information from WorkerStatus and use latitude, longitude, last_scan, directly from Account model.
Small adjustments in Speed-Scan and Spawnpoint-Scan to accommodate 'last_scan' changes - I think code is more readable now and everything works the same way.
Renamed WorkerStatus field 'norares' to 'no_rares' to match with 'no_items'. Updated all references of this field.
Created a method that returns the default WorkerStatus dictionary with its stats reset - reduced duplicate code and separates local/runtime data from persistent/database data.
Reordered WorkerStatus fields: Skips > No Items > No Rares > Captchas
Status page will now only display WorkerStatus/MainStatus that were active in the last 30 minutes. Improves consistency of status page for those who don't enable database cleanup.
Account stats information displayed in manual captcha solving page will also only include stats from MainWorker that have been recently active.
Small improvements in some logging messages and formatting in search.py.
Since accounts are shared between instances we needed a way to keep track of the number of failed login attempts.
The solution we had was local/runtime only and wouldn't work for "on-the-fly" account allocation.
Renamed Account model field 'fail' to 'failed' and changed its type to SmallInt in order to have a persistent counter of login failures. The 'fail' field was only used to symbolize that the account was in the failed account pool; it had almost no purpose, now it's an important field.
Added '--account-max-failures' to control the number of consecutive failed login attempts it takes to flag the account as perm banned. Default is 5, meaning the account will have to fail N login-retries, on 5 different occasions to be considered banned. Once it fails the first time, the account will be put to sleep for a small period of time and will attempt to login again later.
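
Roughly, the persistent counter could work like the sketch below. It is an illustration only: accounts are plain dicts, the function names are hypothetical, and resetting the counter on a successful login is my assumption of what "consecutive" implies.

```python
PERMBAN = 3  # 'banned' value for Permban, as listed for AccountBanned


def record_login_failure(account, max_failures=5):
    """Count one failed login occasion; flag as perm banned at the limit."""
    account['failed'] += 1
    if account['failed'] >= max_failures:
        account['banned'] = PERMBAN
    return account


def record_login_success(account):
    """Assumption: a successful login resets the consecutive counter."""
    account['failed'] = 0
    return account
```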
Reverted some changes in '--hlvl-workers-max', it is only used when '--hlvl-workers' is 0 (on-the-fly allocation).
Improved overall robustness of critical code blocks and database operations.
Added back some safety checks to avoid having the same account being used by multiple search workers.
Added database().execution_context() to _allocate_accounts() to make sure account allocation is made within a transaction.
Improved thread safety locks, seemed like Account Manager was losing track of some accounts.
Added a try..except to _account_keeper() to pick up critical errors in account management.
No longer uses a cycle counter, instead using timers to allow more flexibility and to clarify code's logic.
Changed check_login() to output more info and handle exceptions. This function now returns a boolean according to login success.
Removed exceptions no longer used (TooManyLoginAttempts and LoginSequence). All information is still kept for debug.
Login failures now expect a timeout afterwards. This is used to prevent prematurely marking all accounts as perma banned just because something was wrong for some hours (e.g. PoGo login servers offline).
Fixed duplicate 'only-server' test in runserver.py.
Account model now holds the methods that insert and clear accounts in database.
Small changes/additions to code comments.
Save remote configuration assets and templates times in the database (Account model).
Only perform assets requests if database times don't match current remote configuration.
This change should save a lot of RPM and make login times shorter.

Update database schema:
```
ALTER TABLE `<database>`.`account`
ADD COLUMN `time_assets` INT(4) UNSIGNED NOT NULL DEFAULT 0 AFTER `level`,
ADD COLUMN `time_templates` INT(4) UNSIGNED NOT NULL DEFAULT 0 AFTER `time_assets`;
```
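
The check behind these two columns can be pictured as below. This is a sketch, not the PR's code: `needs_download()` and the `remote_config` keys `asset_time` and `template_time` are hypothetical names, only `time_assets` and `time_templates` come from the schema above.

```python
def needs_download(account, remote_config):
    """Compare stored times with the remote configuration (hypothetical keys)."""
    return {
        # Request assets only if the stored time differs from the remote one.
        'assets': account['time_assets'] != remote_config['asset_time'],
        # Same rule for item templates.
        'templates': (account['time_templates']
                      != remote_config['template_time']),
    }
```

A new account has both columns at the default 0, so it performs both requests once and skips them on later logins until the remote configuration changes.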
@neskk neskk force-pushed the pr-accounts-db branch 2 times, most recently from d976e94 to e550a7e Compare March 17, 2018 06:12
This fixes the situation where newly inserted accounts would be picked up instead of older ones due to having a more recent last_modified value.

8 participants