APIRequestStrategy

Base for calculation

  • API quota: 10000 requests per day; each API call returns up to 100 questions (on average about 1 request every 10 s)
  • A major tag gets approx. 1000 questions per day.
  • Typical close-vote (CV) and possible-duplicate (DUP) questions in a major tag: DUP=20, CV1=60, CV2=40, CV3=20, CV4=10, i.e. 150 questions per day
  • Tracking CV / possible-duplicate questions for 20 days (could perhaps be reduced to 15)
  • Tracking 20 major tags
  • The 100 newest questions in a tag cover a time span of approx. 1-2 h (verify during high-traffic periods)
  • The 100 newest questions with no tag filter cover a time span of approx. 10 (verify during high-traffic periods)

New questions

Strategy 1: execute one API call per tag and use only the first page (about 2 h of questions).

Strategy 2: execute frequent API calls (2-4 pages) with no tag filter. This is probably the best option when tracking many tags.

Questions in database

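20 tags x 20 days x 150 questions = 60000 questions, i.e. 600 API calls at 100 questions per call.

New question tracking already covers some of these questions and so removes some API calls (150 x 20 = 3000 questions, i.e. 30 API calls).

Every cherry-picking request made to the bot removes 1 API call (with 4 users per tag: 4 x 20 = 80 API calls).

Total: approx. 500 API calls (600 - 30 - 80 = 490) to update the remaining questions.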
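
The estimate above can be reproduced with a few lines of Python. This is only a sketch of the arithmetic; the constant and function names are made up for illustration, and the figure of 4 users per tag is the assumption stated above.

```python
# Figures taken from "Base for calculation"; all names are illustrative.
CALL_SIZE = 100            # questions returned per API call
TAGS = 20                  # major tags tracked
TRACK_DAYS = 20            # days a cv / pd question is tracked
CV_PER_TAG_PER_DAY = 150   # cv + possible-duplicate questions per tag per day
USERS_PER_TAG = 4          # assumed cherry-picking users per tag


def calls_for(questions: int, per_call: int = CALL_SIZE) -> int:
    """Number of API calls needed for `questions` items, rounded up."""
    return -(-questions // per_call)


tracked_questions = TAGS * TRACK_DAYS * CV_PER_TAG_PER_DAY   # 60000
full_pass = calls_for(tracked_questions)                     # 600
covered_by_new = calls_for(CV_PER_TAG_PER_DAY * TAGS)        # 30
covered_by_users = USERS_PER_TAG * TAGS                      # 80
remaining = full_pass - covered_by_new - covered_by_users    # 490
```

The result, 490, is what the text rounds to approx. 500 calls per full update pass.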

Initial strategy

  • New questions, strategy 2 every 10 minutes: 20 x (24 x 6) = 2880, approx. 3000 API calls per day
  • Updating the CV count of old questions every 2 hours: 500 x 12 = 6000 API calls per day

Total: approx. 9000 API calls per day
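
As a quick check of the daily budget against the quota, here is a small sketch. The names are illustrative, and the figure of 20 calls per new-question run is read off the formula above rather than stated separately in the plan.

```python
DAILY_QUOTA = 10_000  # API quota from "Base for calculation"


def daily_calls(calls_per_run: int, interval_minutes: int) -> int:
    """API calls per day for a job that runs every `interval_minutes`."""
    runs_per_day = (24 * 60) // interval_minutes
    return calls_per_run * runs_per_day


new_questions = daily_calls(calls_per_run=20, interval_minutes=10)    # 2880
old_questions = daily_calls(calls_per_run=500, interval_minutes=120)  # 6000
total = new_questions + old_questions                                 # 8880
headroom = DAILY_QUOTA - total                                        # 1120
```

With the exact figures the plan uses 8880 calls, which leaves roughly 1100 calls of headroom under the 10000 quota.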

Conclusion

Thread 1: New questions

Runs on a 10 min interrupt with a throttle of 1 API call every 2 s (about 1 min to update, 9 min sleep).
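
A minimal sketch of such a throttle, assuming each API call is wrapped in a plain Python callable. `run_throttled` and its parameters are hypothetical names, not part of any existing bot code; the sleep and clock functions are injectable so the pacing can be tested without waiting.

```python
import time
from typing import Callable, Iterable


def run_throttled(
    calls: Iterable[Callable[[], None]],
    min_gap: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> int:
    """Run each call, keeping at least `min_gap` seconds between call starts.

    Returns the number of calls executed.
    """
    done = 0
    last_start = None
    for call in calls:
        if last_start is not None:
            wait = min_gap - (clock() - last_start)
            if wait > 0:
                sleep(wait)
        last_start = clock()
        call()
        done += 1
    return done
```

At a 2 s gap, one minute of updating allows up to 30 calls per cycle; the thread would then sleep until the next 10 min tick.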

Thread 2: DB update

Queries the database for the 100 questions with the oldest update date and executes one API call every 15 s (hence approx. 5760 calls per day).
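
The selection step could look roughly like the sketch below. It assumes the database can be represented as a mapping from question ID to last update time, and that the IDs are sent as a semicolon-delimited list on a `/questions/{ids}` style path; both function names are hypothetical.

```python
from datetime import datetime
from typing import Dict, List


def oldest_batch(last_updated: Dict[int, datetime], size: int = 100) -> List[int]:
    """Return up to `size` question IDs, least recently updated first."""
    ordered = sorted(last_updated, key=lambda qid: last_updated[qid])
    return ordered[:size]


def ids_path(question_ids: List[int]) -> str:
    """Build a request path with semicolon-delimited question IDs."""
    return "/questions/" + ";".join(str(qid) for qid in question_ids)
```

After each call the update date of the returned questions would be set to the current time, so the next batch automatically moves on to the next-oldest 100 questions.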

Final consideration

For questions older than 7 days, data.stackexchange.com could be used to update the database. This would save 3000-4000 API calls per day, or allow doubling the number of major tags that are tracked.