How to extract images and drawings from PDF with Python

Extracting images and drawings from PDF files can be a challenging task, but with the right tools and techniques, it’s entirely achievable. This blog post explores how to use the PyMuPDF library in Python to extract both images and drawings from PDF documents. We’ll dive into the nuances of handling transparency layers in images and clustering drawings to preserve embedded text. Whether you’re building a PDF summarizer or simply need to extract visual content from PDFs, these methods provide a robust solution to automate the process.

ARIMA and Online Learning in Financial Forecasting

I discuss the development of an online learning system using the Jane Street Real-Time Market Data Forecasting challenge as a practice ground for time-series forecasting. The project involves predicting the responder_6 variable using an ARIMA model, with a focus on adapting to new data by re-training the model whenever a new date_id is encountered. This approach leverages multiprocessing to meet strict time constraints.

Walk Forward Validation on Jane Street Real-Time Market Data Forecast

Walk Forward Validation (WFV) involves a training window that moves forward in time, training the model on historical data and then validating it on future, unseen data points. Unlike traditional cross-validation where data is randomly split, WFV respects the sequence of time, making it ideal for datasets with time-dependent features like stock prices, weather patterns, or sales figures.
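
As a rough illustration of the moving training window described above, here is a minimal sketch that only generates index splits. The function name `walk_forward_splits` and its parameters (`train_size`, `test_size`, `step`) are assumptions made for this example and are not taken from the post or from any library.

```python
def walk_forward_splits(n_samples, train_size, test_size, step=None):
    """Yield (train_indices, test_indices) pairs for a sliding training window.

    Hypothetical helper: every test index comes strictly after every
    train index, so the model never sees future data during training.
    """
    step = step or test_size  # by default, move forward by one test block
    start = 0
    while start + train_size + test_size <= n_samples:
        train = list(range(start, start + train_size))
        test = list(range(start + train_size, start + train_size + test_size))
        yield train, test
        start += step


# Ten time-ordered samples, 4 for training and 2 for validation per fold.
splits = list(walk_forward_splits(n_samples=10, train_size=4, test_size=2))
for train, test in splits:
    print("train:", train, "test:", test)
```

Each fold trains on a contiguous block of past observations and validates on the block immediately after it. Nothing is shuffled, which is the key difference from a random cross-validation split.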

How to override a method of an instantiated object in Python

In this post, I describe how I overcame AWS login challenges in a coding competition by using a method override trick. By defining a new function for authentication and dynamically replacing the existing method in an instantiated object, I was able to experiment with the Embedchain package without altering its class definition. This technique allowed for seamless integration with AWS services and added a valuable tool to my programming arsenal.
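
The general trick can be sketched with `types.MethodType` from the standard library. The `Client` class, its `authenticate` method, and the `profile` attribute below are hypothetical stand-ins, not Embedchain's actual API; the post's real code may differ.

```python
import types


class Client:
    """Hypothetical stand-in for a third-party class we cannot edit."""

    def authenticate(self):
        return "default credentials"


def authenticate_with_profile(self):
    # Replacement logic; `self` is the instance the function is bound to.
    return f"profile credentials for {self.profile}"


client = Client()
client.profile = "dev"  # assumed attribute, used only for this example

# Bind the new function to this one instance; the class stays untouched.
client.authenticate = types.MethodType(authenticate_with_profile, client)

other = Client()
print(client.authenticate())  # uses the replacement
print(other.authenticate())   # still uses the original method
```

Because the override lives on the instance rather than the class, other instances keep the original behavior.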

How to override Django-Allauth default templates

The Problem:

I chose the ‘django-allauth’ package to handle login management for a SaaS code base that I am building. All my installed packages live in the virtual environment folder (venv) inside my project folder.

I had already created a base layout for the landing page. However, after installing and configuring ‘django-allauth’, I noticed that the login page did not inherit the layout from my base template.