stack-code-finder

An automated scraper built on top of the Stack Exchange API.
It searches Stack Overflow for code fragments that match a given phrase, considering only the code snippets in the selected threads.
Report Bug


The example of a code snippet

About the project

Built With

Getting Started

Prerequisites

Set all of the following environment variables, e.g.:

HOST=0.0.0.0
PORT=3000

MONGO_DB_USER=root
MONGO_DB_PASSWORD=example
MONGO_DB_NAME=appdb
MONGO_DB_PORT=27017
MONGO_DB_SERVICE_NAME=mongodb

CODE_FRAGMENTS_FETCH_LIMIT=10
JWT_TOKEN_SECRET=access_token_secret
STACK_API_KEY=stack_api_key
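
The variables above could be read and validated at startup along the following lines. This is a minimal sketch, not the project's actual code: `AppConfig`, `requireVar`, `requireInt`, and `loadConfig` are hypothetical names, and the MongoDB URI is assembled in the standard connection-string format on the assumption that the application connects that way.

```typescript
// Hypothetical config shape; field names mirror the variables listed above.
interface AppConfig {
  host: string;
  port: number;
  mongoUri: string;
  codeFragmentsFetchLimit: number;
  jwtTokenSecret: string;
  stackApiKey: string;
}

type Env = Record<string, string | undefined>;

// Returns the value of a required variable or throws a descriptive error.
function requireVar(env: Env, name: string): string {
  const value = env[name];
  if (value === undefined || value === "") {
    throw new Error(`Missing required environment variable: ${name}`);
  }
  return value;
}

// Parses a required variable as a positive integer.
function requireInt(env: Env, name: string): number {
  const parsed = Number(requireVar(env, name));
  if (!Number.isInteger(parsed) || parsed <= 0) {
    throw new Error(`Environment variable ${name} must be a positive integer`);
  }
  return parsed;
}

function loadConfig(env: Env): AppConfig {
  const user = encodeURIComponent(requireVar(env, "MONGO_DB_USER"));
  const password = encodeURIComponent(requireVar(env, "MONGO_DB_PASSWORD"));
  const service = requireVar(env, "MONGO_DB_SERVICE_NAME");
  const dbPort = requireInt(env, "MONGO_DB_PORT");
  const dbName = requireVar(env, "MONGO_DB_NAME");
  return {
    host: requireVar(env, "HOST"),
    port: requireInt(env, "PORT"),
    // Assumption: plain connection string; the real app may add options.
    mongoUri: `mongodb://${user}:${password}@${service}:${dbPort}/${dbName}`,
    codeFragmentsFetchLimit: requireInt(env, "CODE_FRAGMENTS_FETCH_LIMIT"),
    jwtTokenSecret: requireVar(env, "JWT_TOKEN_SECRET"),
    stackApiKey: requireVar(env, "STACK_API_KEY"),
  };
}
```

In a Node.js process this would typically be called once as `loadConfig(process.env)`, so that a missing variable fails fast instead of surfacing later as a connection error.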

For details on obtaining the STACK_API_KEY, see https://api.stackexchange.com/docs/authentication.

Important: the Stack Exchange API throttles requests. Read https://api.stackexchange.com/docs/throttle before running larger jobs.
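
Stack Exchange API responses are wrapped in an object that may carry a `backoff` field (seconds to wait before calling the same method again) and a `quota_remaining` field. A client could honor them roughly as sketched below. The helper names `delayBeforeNextRequestMs` and `canKeepRequesting` are hypothetical, and whether this project handles throttling this way is not stated in the README.

```typescript
// Subset of the Stack Exchange response wrapper relevant to throttling.
interface StackWrapper {
  backoff?: number;         // seconds to wait before calling the same method again
  quota_remaining?: number; // requests left in the current quota window
}

// How long to pause before the next request to the same method.
function delayBeforeNextRequestMs(wrapper: StackWrapper): number {
  return wrapper.backoff !== undefined && wrapper.backoff > 0
    ? wrapper.backoff * 1000
    : 0;
}

// Stop issuing requests once the quota is exhausted.
function canKeepRequesting(wrapper: StackWrapper): boolean {
  return wrapper.quota_remaining === undefined || wrapper.quota_remaining > 0;
}
```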

Dev installation

Clone the repo

git clone https://github.com/adjaskam/stack-code-finder.git

Install NPM packages for the client
```
cd client
npm i
```
Start the project with `concurrently` (run this from the root directory)
```
npm run dev:fullstack
```

Note: The backend is built from a Dockerfile, and backend development takes place inside the container.

Endpoints

POST /api/codefragments - start a job for the given tag (includes the scraping procedure). The application supports:
- Preventing duplicate extracted fragments by comparing MD5 hashes computed from the code fragment's attributes.
- Handling user-specific documents: a single code fragment can be owned by multiple users.
- Web scraping with either Puppeteer or Cheerio.
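
The duplicate check and shared ownership described above can be sketched with an in-memory map keyed by the MD5 hash. This is only an illustration: the README does not say which attributes feed the hash, so hashing `questionId` plus the code text is an assumption, and `computeHashMessage`, `saveFragment`, and the `Map` store are hypothetical stand-ins for the real MongoDB-backed logic.

```typescript
import { createHash } from "node:crypto";

interface StoredFragment {
  questionId: string;
  codeFragment: string;
  hashMessage: string;
  usersOwn: string[];
}

// Assumption: the hash covers the question id and the code text.
function computeHashMessage(questionId: string, codeFragment: string): string {
  return createHash("md5").update(`${questionId}:${codeFragment}`).digest("hex");
}

// Stores a fragment once; later saves of the same content only add an owner.
function saveFragment(
  store: Map<string, StoredFragment>,
  questionId: string,
  codeFragment: string,
  userEmail: string
): StoredFragment {
  const hashMessage = computeHashMessage(questionId, codeFragment);
  const existing = store.get(hashMessage);
  if (existing) {
    if (!existing.usersOwn.includes(userEmail)) {
      existing.usersOwn.push(userEmail);
    }
    return existing;
  }
  const created: StoredFragment = {
    questionId,
    codeFragment,
    hashMessage,
    usersOwn: [userEmail],
  };
  store.set(hashMessage, created);
  return created;
}
```

Keying the store by hash means a second user who scrapes the same snippet does not create a new document but is appended to `usersOwn` of the existing one.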

Body of the example request:

{
   "tag": "Java",
   "searchPhrase": "int",
   "amount": 1,
   "scraperType": "cheerio"
}
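
A request body of this shape could be validated before a job starts, for example as in the sketch below. The names `CreateJobBody`, `isScraperType`, and `parseCreateJobBody` are hypothetical, and the accepted `scraperType` values are assumed to be `"cheerio"` and `"puppeteer"` (only `"cheerio"` appears in the example above).

```typescript
type ScraperType = "cheerio" | "puppeteer";

interface CreateJobBody {
  tag: string;
  searchPhrase: string;
  amount: number;
  scraperType: ScraperType;
}

// Type guard for the assumed set of scraper identifiers.
function isScraperType(value: unknown): value is ScraperType {
  return value === "cheerio" || value === "puppeteer";
}

// Returns a typed body or throws with the first validation problem found.
function parseCreateJobBody(body: unknown): CreateJobBody {
  if (typeof body !== "object" || body === null) {
    throw new Error("Body must be a JSON object");
  }
  const { tag, searchPhrase, amount, scraperType } = body as Record<string, unknown>;
  if (typeof tag !== "string" || tag.trim() === "") {
    throw new Error("tag must be a non-empty string");
  }
  if (typeof searchPhrase !== "string" || searchPhrase === "") {
    throw new Error("searchPhrase must be a non-empty string");
  }
  if (typeof amount !== "number" || !Number.isInteger(amount) || amount < 1) {
    throw new Error("amount must be a positive integer");
  }
  if (!isScraperType(scraperType)) {
    throw new Error('scraperType must be "cheerio" or "puppeteer"');
  }
  return { tag, searchPhrase, amount, scraperType };
}
```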

Response:

{
   "items":[
      {
         "questionId": "71860220",
         "tag": "Java",
         "searchPhrase": "int",
         "codeFragment": "public class TekuciRacun implements IRacun{\n private String vlasnik;\n private int isplate;\n private int kredit;\nthis.stanje = stanje;\n    }\n    \n    \n    \n}\n",
         "hashMessage": "2de6aac5afba3f6f44aa7f9e91cb9d8d",
         "usersOwn": [
            "example3@example.com"
         ],
         "_id": "625712efea1e61ff34001739",
         "createdAt": "2022-04-13T18:14:07.563Z",
         "updatedAt": "2022-04-13T18:14:07.563Z",
         "__v": 0
      }
   ],
   "amount": 1,
   "executionTime": 960
}

GET /api/codefragments/my - get all code fragments obtained by the current user
DELETE /api/codefragments/:hashMessage - delete a code fragment by its MD5 hash
- Available to authenticated users only.
- A soft delete is performed as long as other users still own the code fragment.
- Once the usersOwn array of the code fragment is empty, the item is hard-deleted.
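
The soft and hard delete rules above can be illustrated with an in-memory store. This is a sketch under assumptions: `OwnedFragment`, `DeleteResult`, and `deleteFragmentForUser` are hypothetical names, and a "soft delete" is interpreted here as removing the caller from `usersOwn` while keeping the document.

```typescript
interface OwnedFragment {
  hashMessage: string;
  usersOwn: string[];
}

type DeleteResult = "soft" | "hard" | "not-found";

function deleteFragmentForUser(
  store: Map<string, OwnedFragment>,
  hashMessage: string,
  userEmail: string
): DeleteResult {
  const fragment = store.get(hashMessage);
  if (!fragment || !fragment.usersOwn.includes(userEmail)) {
    return "not-found";
  }
  // Soft delete: the caller gives up ownership, the document stays.
  fragment.usersOwn = fragment.usersOwn.filter((owner) => owner !== userEmail);
  if (fragment.usersOwn.length === 0) {
    // Nobody owns the fragment any more, so remove it entirely.
    store.delete(hashMessage);
    return "hard";
  }
  return "soft";
}
```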

Authentication is required to handle user-specific documents and is based on the JWT standard. Registration requires no confirmation, but the email must be unique. All forms in the application are validated.

POST /api/register - register a new user
POST /api/login - log in to the service
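
To show what a JWT-based login involves, the sketch below signs and verifies an HS256 token using only Node's built-in `crypto` module. It is illustrative rather than the project's implementation: the README does not say which JWT library is used, and `issueToken`, `verifyToken`, and the `email`/`exp` payload layout are assumptions. A production service would normally rely on a maintained JWT library and also validate the token header.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

interface TokenPayload {
  email: string;
  exp: number; // expiry, in seconds since the Unix epoch
}

function base64url(input: string): string {
  return Buffer.from(input, "utf8").toString("base64url");
}

function sign(data: string, secret: string): string {
  return createHmac("sha256", secret).update(data).digest("base64url");
}

// Builds header.payload.signature, as defined for compact JWTs.
function issueToken(
  email: string,
  secret: string,
  ttlSeconds: number,
  nowMs: number = Date.now()
): string {
  const header = base64url(JSON.stringify({ alg: "HS256", typ: "JWT" }));
  const payload = base64url(
    JSON.stringify({ email, exp: Math.floor(nowMs / 1000) + ttlSeconds })
  );
  const signingInput = `${header}.${payload}`;
  return `${signingInput}.${sign(signingInput, secret)}`;
}

// Returns the payload for a valid, unexpired token, otherwise null.
function verifyToken(
  token: string,
  secret: string,
  nowMs: number = Date.now()
): TokenPayload | null {
  const parts = token.split(".");
  const [header, payload, signature] = parts;
  if (parts.length !== 3 || !header || !payload || !signature) {
    return null;
  }
  const expected = Buffer.from(sign(`${header}.${payload}`, secret));
  const actual = Buffer.from(signature);
  // Compare in constant time; lengths must match before timingSafeEqual.
  if (expected.length !== actual.length || !timingSafeEqual(expected, actual)) {
    return null;
  }
  const decoded = JSON.parse(
    Buffer.from(payload, "base64url").toString("utf8")
  ) as TokenPayload;
  return decoded.exp > Math.floor(nowMs / 1000) ? decoded : null;
}
```

The secret passed to both functions would correspond to JWT_TOKEN_SECRET from the environment variables.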

TODO

Handle user-specific documents - create authentication & owning the documents by the specific user
Work on performance - added cheerio as the main scraper
Make the searchPhrase matching in the obtained content more precise (currently, the search only checks whether the fragment includes the given searchPhrase)
Work on refresh tokens
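
One possible direction for the more precise search mentioned above is to match the phrase only as a whole token instead of as any substring. The sketch below contrasts the two. `includesPhrase` mirrors the substring check the TODO describes, while `containsWholeToken` and `escapeRegExp` are hypothetical helpers, not part of the project.

```typescript
// Escapes characters that have a special meaning in regular expressions.
function escapeRegExp(text: string): string {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Substring check, as described in the TODO item.
function includesPhrase(fragment: string, phrase: string): boolean {
  return fragment.includes(phrase);
}

// Stricter check: the phrase must not be glued to other identifier characters.
function containsWholeToken(fragment: string, phrase: string): boolean {
  const pattern = new RegExp(
    `(?<![A-Za-z0-9_])${escapeRegExp(phrase)}(?![A-Za-z0-9_])`
  );
  return pattern.test(fragment);
}
```

With the phrase `int`, the substring check accepts a fragment that only contains `printf`, whereas the token check accepts `private int kredit;` and rejects `printf`.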
