These were the deliverables asked according to the assignment description along with the status of completion from my end -
- Create a web interface where a user can (1) feed N resumes and (2) add some inputs.
- This interface must generate a specific output in JSON format (mentioned in the Requirements section). Showcase this output data on the web.
- You have to submit a link to your Github repository for this assignment submission.
- The Github repository should contain following essential things -
- Python (Django) code files (.py) containing your implementation.
-
requirements.txt
file consisting of all the python dependencies used in your submission - A
README
file with instructions to run the code in order to generate the JSON output as mentioned - A short report documenting your approach, including details of the implementation, hyperparameters, evaluation results, and any insights gained.
- Design a web interface (React) to showcase the working of this project as per the Figma design. Adhere to the Figma designs provided. [I've tried my best. I have relatively less experience with frontend development.]
Here is a side-by-side comparison of what was asked and what I delivered. It's not perfect by any means, but this is the best I could get.
Figma Design | My Implementation |
---|---|
PyMuPDF is the fastest Python library when it comes to data extraction from PDF files. It is a Python wrapper for the MuPDF library, which is written in C (main reason why it's so fast). It was a no-brainer for me to use it instead of any other Python library. You can read more about performance comparison and methodologies used.
I deployed multiple threads to extract text from the PDFs and dispatch the requests to the OpenAI api. This significantly brought down the latency of the backend. All the computation is parallelized. To bring into perspective, when I submitted 10 PDF files without multithreading, it took about 40 seconds to get the final response from the server and with multithreading it only took ~7-8 seconds, which is a huge improvement.
Nextjs offers several advantages over regular React. It simplifies React application development by providing built-in server-side rendering (SSR), automatic code splitting, and a straightforward file-based routing system. Nextjs simplifies the development process by handling many configuration details out of the box, allowing developers to focus more on building features and less on setup and optimization tasks. UI component libraries such as shadcn/ui
also work out-of-the-box with Nextjs since both are maintained by Vercel.
Axios is a popular choice for handling HTTP requests in Next.js apps due to its simplicity, flexibility, and widespread adoption in the JavaScript ecosystem. It provides a clean and concise API for making requests, supports promises, and has built-in features for interceptors, request/response transformations, and automatic JSON parsing.
Apparantly, Pages
-based routing is the preferred routing strategy when making client-side application.
When making API calls to OpenAI, I used the functions
parameter in the request, which gave a more refined output for direct use.
Source: How to call functions with chat models | OpenAI Cookbook
To avoid mistakenly uploading large PDF files and exhausting all your token credits in the process, I implemented a page limit up to which the text will be extracted. The text will only be extracted up to page 3 of the PDF file to avoid any catastrophic damage.
This is just a basic prototype of the product in the making. Here are a few things that can be improved/added -
- Refining the frontend
- Making the prompts more robust
- Adding support for GPT-4 Image prompt support to skip the text extraction process altogether
- Handling edge cases in resumes
- Adding additional security features