The web is an API (scrape it!)

How I used the Python library BeautifulSoup to enhance a website.

I currently live in Montreal, where we have very good public libraries for borrowing books and all sorts of other things. Each time you take an item home, the loan is recorded in a management system. That system comes with a public website where you can check when your books are due back or extend the loan period.

The catch is that I have 3 kids, and my wife also has an account. Yep, we have 5 accounts on the system, and things got difficult once I had to check each account separately!

[Figure: Nelligan catalog, the end-user library system]

The catalog system is named Nelligan, but if you look at the source of the website, you will easily find that it is a commercial product named III Encore. The public website doesn’t offer a multi-account option or anything similar.

My goal is to be able to:

  • Quickly check the date by which I need to return the books to the library, to avoid fines
  • Extend the loan period

So let’s check whether an API exists that I could build on! After a few days of searching, I hadn’t found anything. The company III seems to provide an API, but not one with a public interface… I only found a Python lib here that does this kind of work for several commercial library systems, but its implementation for III Encore / Web PAC Pro (which seems to be the same product) does not look finished.

I decided to work a bit on this GitHub repository, and I found it very worthwhile to invest time in BeautifulSoup, an HTML parsing library for web scraping (i.e. a script can read whatever you want from a web page!). Combined with the Requests library, I had all the tools I needed for my project. I decided to use Django for the web part, simply because everything else was written in Python and because I wanted to learn it!

The approach itself is not so hard:

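  • Find a way to log in via the Requests library:
import requests

# code and pin are the credentials of one library account
login = {'code': code, 'pin': pin}
r = requests.post(NELLIGAN_URL + '/patroninfo/?', data=login)
  • Then find a way to read the data I want from the page:
from bs4 import BeautifulSoup

# Grab loans (currently taken)
soup = BeautifulSoup(r.text, 'html.parser')
items = soup.select("tr.patFuncEntry")
for item in items:
    # do things on books, e.g. print the text of each loan row
    print(item.get_text(" ", strip=True))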
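To make the "do things on books" step more concrete, here is a minimal sketch of how each loan row could be turned into a title and a due date. Only the `tr.patFuncEntry` selector comes from the code above. The sample markup, the `td.patFuncTitle` and `td.patFuncStatus` cell classes, the `DUE YY-MM-DD` date format, and the `parse_loans` helper are all assumptions made for illustration, so they would need to be checked against the real page.

```python
import re
from datetime import date, datetime

from bs4 import BeautifulSoup

# Hypothetical markup: only the "tr.patFuncEntry" selector comes from the article.
SAMPLE_HTML = """
<table>
  <tr class="patFuncEntry">
    <td class="patFuncTitle">Le Petit Prince</td>
    <td class="patFuncStatus">DUE 17-03-05</td>
  </tr>
  <tr class="patFuncEntry">
    <td class="patFuncTitle">Python Crash Course</td>
    <td class="patFuncStatus">DUE 17-02-20</td>
  </tr>
</table>
"""

# Assumed date format inside the status cell: two-digit year, month, day.
DUE_PATTERN = re.compile(r"(\d{2}-\d{2}-\d{2})")


def parse_loans(html):
    """Return a list of {'title': str, 'due': date or None} dicts, one per loan row."""
    soup = BeautifulSoup(html, "html.parser")
    loans = []
    for row in soup.select("tr.patFuncEntry"):
        title_cell = row.select_one("td.patFuncTitle")
        status_cell = row.select_one("td.patFuncStatus")
        if title_cell is None or status_cell is None:
            continue  # skip rows that do not look like the assumed layout
        match = DUE_PATTERN.search(status_cell.get_text(strip=True))
        due = datetime.strptime(match.group(1), "%y-%m-%d").date() if match else None
        loans.append({"title": title_cell.get_text(strip=True), "due": due})
    return loans


if __name__ == "__main__":
    for loan in parse_loans(SAMPLE_HTML):
        print(loan["due"], loan["title"])
```

Returning plain dictionaries keeps the scraping code separate from whatever displays the loans later, so only this one function has to change if the site's markup changes.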

My main service file is here; it contains all the calls to the website. It lets me test a card, grab the current loans, extend the loan for a book, place requests for books, view the fines associated with each card…
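As a rough illustration of what one call in such a service layer could look like with several cards, here is a sketch that logs in once per card and keeps the pages apart. It is not the actual service file. The `/patroninfo/?` path and the `code` / `pin` form fields come from the login snippet above. The function names, the one-session-per-card design, and the placeholder base URL are assumptions.

```python
import requests

# Placeholder value: the real NELLIGAN_URL is not shown in the article.
BASE_URL = "https://library.example.org"


def fetch_patron_page(session, base_url, code, pin, timeout=10):
    """Log in with one card and return the HTML of the page the site answers with."""
    response = session.post(
        base_url + "/patroninfo/?",
        data={"code": code, "pin": pin},
        timeout=timeout,
    )
    response.raise_for_status()  # fail loudly on HTTP errors
    return response.text


def fetch_all_cards(base_url, cards):
    """cards maps a label (e.g. a first name) to a (code, pin) tuple."""
    pages = {}
    for label, (code, pin) in cards.items():
        # One session per card, so cookies from one account never leak into another.
        with requests.Session() as session:
            pages[label] = fetch_patron_page(session, base_url, code, pin)
    return pages
```

Passing the session in as a parameter also makes the login call easy to exercise with a fake session object, without touching the real website.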

It is then very easy to sort the books by end-of-loan date…

[Figure: Books ordered by end-of-loan date, from multiple cards]
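Sorting by end-of-loan date across cards can be sketched in a few lines. The data layout here (a dictionary of card label to a list of loans, each with a `title` and a `due` date) and the `merge_loans` name are assumptions for illustration, not the structure used in the real application.

```python
from datetime import date


def merge_loans(loans_by_card):
    """Flatten the loans of every card into one list, soonest due date first."""
    merged = []
    for card, loans in loans_by_card.items():
        for loan in loans:
            # Remember which card each loan belongs to.
            merged.append({"card": card, **loan})
    return sorted(merged, key=lambda loan: loan["due"])


if __name__ == "__main__":
    # Made-up sample data for two cards.
    sample = {
        "Dad": [{"title": "Python Crash Course", "due": date(2017, 3, 5)}],
        "Kid 1": [
            {"title": "Le Petit Prince", "due": date(2017, 2, 20)},
            {"title": "Astérix le Gaulois", "due": date(2017, 3, 12)},
        ],
    }
    for loan in merge_loans(sample):
        print(loan["due"].isoformat(), loan["card"], loan["title"])
```

Because `sorted` is stable, loans that share a due date stay grouped in the order the cards were listed.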

To learn more about this application, here are some links:

My next goal for this application? I really want to contribute to this project so that there is a common API for library management (not just one for a single commercial app!). Then I want to integrate that API into my web application, so that it is no longer focused only on Montreal’s Nelligan system. And one day I would also like to try reproducing the API in a different language, such as JS, for an equivalent Node.js system.