#3 Update on the extractors

Update: Both the Flipkart and Amazon extractors (Python) work fine, except for the aforementioned issue.

The Python extractors give quite accurate results for the purpose I made them for. Parsing the email to find the appropriate data was quite fun, but what worries me is the longevity of the semi-sketchy methods used to extract the data.

Scrapely worked beautifully, but there are still some kinks to iron out in how the information is parsed.

{   "id" : "OE1004125T3442...",
    "total": "Rs. 310",
    "name": "<div><span><b><a href=\"http:// .../../..\">ProductName</a></b></span></div>"
}

We can see here that the value of "name" is a bit messed up: it still carries the surrounding HTML markup.
The desired result was:

{"name" : "ProductName"}

Yeah, well, for now the only way that came to mind to parse this was a rather sketchy method 🙁. But rest assured, everything else works fine.
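For illustration, here is one less sketchy way the "name" value could be cleaned up, using only Python's standard `html.parser` module. This is a minimal sketch, not the method the extractor actually uses; the `TextOnly` class, the `strip_tags` helper and the placeholder URL are made up for the example.

```python
from html.parser import HTMLParser


class TextOnly(HTMLParser):
    """Collects only the text nodes of an HTML fragment, dropping all tags."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        # Called for every run of text between tags.
        self.parts.append(data)


def strip_tags(fragment):
    # Hypothetical helper: returns the visible text of an HTML fragment.
    parser = TextOnly()
    parser.feed(fragment)
    parser.close()
    return "".join(parser.parts).strip()


# Sample record shaped like the extractor output; the URL is a placeholder.
record = {
    "total": "Rs. 310",
    "name": '<div><span><b><a href="http://example.invalid/">ProductName</a></b></span></div>',
}
record["name"] = strip_tags(record["name"])
print(record["name"])  # prints: ProductName
```

Because this walks the markup with a real parser instead of pattern matching, it should keep working if the nesting of tags around the product name changes.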

#2 Writing the first extractor

So, it begins with the gibberish of a raw email from a shopping website, confirming that your order has been dispatched.

Thankfully, the KDE Now base framework decodes quoted-printable text to UTF-8 with ease 😀. Skimming through the decoded HTML, finding a pattern was a little tricky, and gathering these raw pieces of data into meaningful information looked like it would be very difficult. The XPath kept varying across emails. Even regex was not reliable in this situation.
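To give an idea of what that decoding step does, here is a small sketch using Python's standard `quopri` module. The real decoding happens inside the KDE Now base framework, so this is only an illustration; the `decode_quoted_printable` helper and the sample text are made up.

```python
import quopri


def decode_quoted_printable(raw):
    # Hypothetical helper: turn quoted-printable bytes into a UTF-8 string.
    # "=XX" escapes become raw bytes, and a "=" at the end of a line
    # (a soft line break) is removed so the wrapped line is rejoined.
    return quopri.decodestring(raw).decode("utf-8")


# Made-up sample: "=C3=A9" is the UTF-8 encoding of "é",
# and "=\n" is a soft line break in the middle of a word.
sample = b"Caf=C3=A9 order dis=\npatched"
print(decode_quoted_printable(sample))  # prints: Café order dispatched
```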

Thankfully, I chose to use Python embedded in the C++ base of KDE Now. Python has a vast number of libraries to suit your needs.

Scrapely and Scrapy are two such packages. They let me breeze through the process of extracting the valuable data: the scraper is trained on a sample page and then scrapes the corresponding data from freshly arrived emails.

#1 KDE Now Online Shopping Module

I found the KDE Now project on the GSoC 2016 project list; it is very similar in concept to Google Now. I got very interested in how KDE Now tries to mimic that functionality, which is just as important and valuable on the desktop and is currently missing from every desktop environment.

As the project was part of GSoC, the base platform that KDE Now needs already works quite well. By the time I found the project, a few of the most essential plugins for the system had already been built:

  • Event Reservation
  • Flight Reservation
  • Hotel Reservation
  • Restaurant Reservation

Something was definitely missing that Google Now handles flawlessly: online shopping!

So here begins my journey to develop a functional plugin that supports multiple shopping websites and gives the user all the necessary details at a glance.