feat(scraping): change lookup impl, add randomize sleep #110

Merged
merged 2 commits into from
Sep 20, 2020
3 changes: 2 additions & 1 deletion .env-example
@@ -11,7 +11,8 @@ PHONE_CARRIER="tmobile"
 PLAY_SOUND="notification.mp3"
 PUSHOVER_TOKEN="123pushover-token456"
 PUSHOVER_USER="123pushover-user-key"
-RATE_LIMIT_TIMEOUT="5000"
+PAGE_SLEEP_MIN="5000"
+PAGE_SLEEP_MAX="10000"
 SHOW_ONLY_BRANDS="evga"
 SLACK_CHANNEL="SlackChannelName"
 SLACK_TOKEN="slack-token"
3 changes: 2 additions & 1 deletion README.md
@@ -75,7 +75,8 @@ Here is a list of variables that you can use to customize your newly copied `.en
 | `PLAY_SOUND` | Play this sound notification if a card is found | E.g.: `path/to/notification.wav`, relative path accepted, valid formats: wav, mp3, flac, [free sounds available](https://notificationsounds.com/) |
 | `PUSHOVER_TOKEN` | Pushover access token | Generate at https://pushover.net/apps/build |
 | `PUSHOVER_USERNAME` | Pushover username |
-| `RATE_LIMIT_TIMEOUT` | Rate limit timeout for each full store cycle | Default: `5000` |
+| `PAGE_SLEEP_MIN` | Minimum sleep time between queries of the same store | Default: `5000` |
+| `PAGE_SLEEP_MAX` | Maximum sleep time between queries of the same store | Default: `10000` |
 | `SHOW_ONLY_BRANDS` | Filter to show specified brands | Comma separated, E.g.: `evga,zotac` |
 | `SLACK_CHANNEL` | Slack channel for posting | E.g., `update`, no need for `#` |
 | `SLACK_TOKEN` | Slack API token |
1 change: 0 additions & 1 deletion package.json
@@ -22,7 +22,6 @@
 	},
 	"homepage": "https://github.com/jef/nvidia-snatcher#readme",
 	"dependencies": {
-		"async": "^3.2.0",
 		"dotenv": "^8.2.0",
 		"messaging-api-telegram": "^1.0.0",
 		"nodemailer": "^6.4.11",
5 changes: 3 additions & 2 deletions src/config.ts
@@ -6,7 +6,8 @@ config({path: resolve(__dirname, '../.env')});
 const browser = {
 	isHeadless: process.env.HEADLESS ? process.env.HEADLESS === 'true' : true,
 	open: process.env.OPEN_BROWSER === 'true',
-	rateLimitTimeout: process.env.RATE_LIMIT_TIMEOUT ? Number(process.env.RATE_LIMIT_TIMEOUT) : 5000
+	minSleep: Number(process.env.PAGE_SLEEP_MIN ?? 5000),
+	maxSleep: Number(process.env.PAGE_SLEEP_MAX ?? 10000)
 };
 
 const logLevel = process.env.LOG_LEVEL ?? 'info';
@@ -53,7 +54,7 @@ const page = {
 	capture: process.env.SCREENSHOT === 'true',
 	width: 1920,
 	height: 1080,
-	navigationTimeout: Number(process.env.PAGE_TIMEOUT) ?? 30000,
+	navigationTimeout: Number(process.env.PAGE_TIMEOUT ?? 30000),
 	userAgent: 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/78.0.3904.108 Safari/537.36'
 };
 
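Note on the `navigationTimeout` change: `Number(undefined)` evaluates to `NaN`, and `NaN` is not nullish, so the old form `Number(process.env.PAGE_TIMEOUT) ?? 30000` never falls back to `30000` when the variable is unset. A minimal sketch of the difference, using a hypothetical `env` object in place of `process.env`:

```typescript
// Hypothetical stand-in for process.env with PAGE_TIMEOUT unset.
const env: {PAGE_TIMEOUT?: string} = {};

// Old form: Number(undefined) is NaN, and NaN is neither null nor
// undefined, so the ?? fallback is never taken.
const oldTimeout = Number(env.PAGE_TIMEOUT) ?? 30000;

// New form: the fallback is applied before the numeric conversion.
const newTimeout = Number(env.PAGE_TIMEOUT ?? 30000);

console.log(Number.isNaN(oldTimeout)); // true
console.log(newTimeout); // 30000
```

An empty string is also not nullish, so a value such as `PAGE_TIMEOUT=""` would still convert to `0` under either form.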
43 changes: 19 additions & 24 deletions src/index.ts
@@ -5,11 +5,28 @@ import {Config} from './config';
 import {Store, Stores} from './store/model';
 import {Logger} from './logger';
 import {lookup} from './store';
-import async from 'async';
+import {Browser} from 'puppeteer';
 
 puppeteer.use(stealthPlugin());
 puppeteer.use(adblockerPlugin({blockTrackers: true}));
 
+function getSleepTime() {
+	return Config.browser.minSleep + (Math.random() * (Config.browser.maxSleep - Config.browser.minSleep));
+}
+
+async function tryLookupAndLoop(browser: Browser, store: Store) {
+	Logger.debug(`[${store.name}] Starting lookup...`);
+	try {
+		await lookup(browser, store);
+	} catch (error) {
+		Logger.error(error);
+	}
+
+	const sleepTime = getSleepTime();
+	Logger.debug(`[${store.name}] Lookup done, next one in ${sleepTime} ms`);
+	setTimeout(tryLookupAndLoop, sleepTime, browser, store);
+}
+
 /**
  * Starts the bot.
  */
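Note on `getSleepTime`: it draws a delay uniformly between `minSleep` and `maxSleep` by scaling `Math.random()` onto that range. A standalone sketch of the same arithmetic follows; the name `randomSleep` and the injectable `rng` parameter are illustrative assumptions, not part of this PR:

```typescript
// Illustrative helper, not part of the PR: same formula as getSleepTime,
// but with explicit bounds and an injectable random source so the
// result can be checked deterministically.
function randomSleep(min: number, max: number, rng: () => number = Math.random): number {
	return min + (rng() * (max - min));
}

// With the defaults from .env-example (5000 and 10000):
console.log(randomSleep(5000, 10000, () => 0)); // 5000
console.log(randomSleep(5000, 10000, () => 0.5)); // 7500

// In normal use the random source is Math.random.
const delay = randomSleep(5000, 10000);
console.log(delay >= 5000 && delay <= 10000); // true
```

Because each store draws its own delay after every lookup, requests to different stores drift apart over time rather than firing in lockstep.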
@@ -22,32 +39,10 @@ async function main() {
 		}
 	});
 
-	const q = async.queue<Store>(async (store: Store, cb) => {
-		setTimeout(async () => {
-			try {
-				Logger.debug(`↗ scraping initialized - ${store.name}`);
-				await lookup(browser, store);
-			} catch (error) {
-				// Ignoring errors; more than likely due to rate limits
-				Logger.error(error);
-			} finally {
-				cb();
-				q.push(store);
-			}
-		}, Config.browser.rateLimitTimeout);
-	}, Stores.length);
-
 	for (const store of Stores) {
 		Logger.debug(store.links);
-		q.push(store);
-		if (Stores.length === 1) {
-			q.push(store);
-		} // Keep from completely draining
+		setTimeout(tryLookupAndLoop, getSleepTime(), browser, store);
 	}
-
-	await q.drain();
-
-	await browser.close();
 }
 
 /**
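Note on the overall design: with `async.queue` gone, each store runs its own timer loop that reschedules itself after every lookup. A minimal sketch of that pattern, under stated assumptions: `Store`, `fakeLookup`, and `runStore` are hypothetical stand-ins for the PR's types and functions, and a run counter is added so the sketch terminates, whereas `tryLookupAndLoop` in this PR reschedules itself indefinitely.

```typescript
// Hypothetical stand-ins for the PR's Store type and lookup() function.
type Store = {name: string};

const visits: string[] = [];

async function fakeLookup(store: Store): Promise<void> {
	visits.push(store.name);
}

// Runs `runs` lookups for one store, waiting `delayMs` before each one.
// Each step schedules the next with setTimeout, as tryLookupAndLoop does.
function runStore(store: Store, runs: number, delayMs: number): Promise<void> {
	return new Promise(resolve => {
		async function step(remaining: number): Promise<void> {
			try {
				await fakeLookup(store);
			} catch (error) {
				// A failed lookup is logged and does not stop the loop.
				console.error(error);
			}

			if (remaining > 1) {
				setTimeout(step, delayMs, remaining - 1);
			} else {
				resolve();
			}
		}

		setTimeout(step, delayMs, runs);
	});
}

// One independent loop per store, started side by side.
const stores: Store[] = [{name: 'store-a'}, {name: 'store-b'}];
const finished = Promise.all(stores.map(async store => runStore(store, 2, 5)));

finished.then(() => {
	console.log(visits.length); // 4
});
```

Pending timers keep the Node.js event loop alive, which is why a function that only schedules such loops can return without ending the process.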