BACON in the Agile Platform



This article is a spiritual continuation of A Tale of Installers. We won't talk about installation processes specifically here, but rather about tackling the problem of making sense of A LOT of automated reports, namely reports of failed installations. Along the way we’ll talk about search engines, NoSQL and other technologies, and how we came up with something useful beyond its original purpose.


So much information, so little time...

As we've seen previously, an installation process can fail for a very wide range of reasons, and predicting them all is, in practice, impossible. From Skype blocking port 80 (and disabling your IIS) to people who've encrypted their hard disks (thus preventing SQL Server from installing), the possibilities are endless.


With the release of our Community Edition, we started having thousands of installations and getting reports of installation failures due to unexpected conditions. And although we knew we couldn't shield the installer from all the edge cases, we attempted to automatically solve the most common situations and present useful messages for the remaining ones.


Looking at all error reports, grouping them, and identifying the largest group, is a task better done automatically. But this isn't something that can be done with a simple database query:


  • An error may occur in one of many sub installers
  • Each sub installer has its own logging format
  • Installers evolve over time and log fields end up having different names
  • Some log names vary with the name of the machine where the installation occurs
  • It's quite a lot of data to keep uncompressed (an SQL Server installation may generate more than 60 MB of logs)
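
To make the naming problems above concrete, here is a minimal sketch of normalizing log field names before comparing reports. The alias table, the field names and the `<machine>` placeholder are all hypothetical, chosen for illustration; they are not BACON's actual schema.

```python
import re

# Hypothetical aliases: names the same field may have had in
# different installer versions, mapped to one canonical name.
FIELD_ALIASES = {
    "errormsg": "error_message",
    "errmessage": "error_message",
    "exitcode": "exit_code",
    "return_code": "exit_code",
}


def normalize_field(name, machine_name):
    """Map a raw log field name to a canonical, machine-independent one."""
    cleaned = name
    if machine_name:
        # Replace the machine name so machine-specific names compare equal.
        cleaned = re.sub(re.escape(machine_name), "<machine>", cleaned,
                         flags=re.IGNORECASE)
    key = cleaned.strip().lower().replace(" ", "_").replace("-", "_")
    return FIELD_ALIASES.get(key, key)


def normalize_report(raw_fields, machine_name):
    """Return a copy of a report's fields with normalized names."""
    return {normalize_field(k, machine_name): v for k, v in raw_fields.items()}
```

A lookup table like this needs maintenance whenever an installer changes, which is part of why a plain database query is not enough on its own.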


I'm sure I saw a similar stack dump in here...

In the end, this problem ended up having a somewhat simple solution: “all” we had to do was to build a custom search engine!


Search Engine How-To

Most search engines' architecture can be split into three stages:


  1. Crawl available information
  2. Index relevant content
  3. Provide a search interface



The crawling step is usually easy. If you're crawling the web, you probably have to deal with issues such as duplicate content or never-ending automatically generated content. In our case, since we're crawling a limited and well-defined bug report database, our task is comparatively trivial.


Indexing content is usually the trickiest part. When doing this, you want to filter out the uninteresting information and keep track of the interesting stuff only, and that usually requires some intelligence regarding the format of the information. If you're crawling HTML documents, you might want to strip the HTML tags and leave just the text. In our case, we've built an indexing engine that has some knowledge about the installation reports, so if it sees an installation that failed due to a problem in IIS, it is able to discard the SQL logs when indexing that report. Getting rid of superfluous information cuts the data down to a reasonable size, which in turn lets us keep the index uncompressed so we can search it faster.
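
As a rough illustration of that filtering step, the sketch below keeps only the log of the sub-installer that failed. The report layout (`id`, `failed_component`, `logs`) is an assumption made for this example, not the real report format.

```python
def build_index_entry(report):
    """Build a compact index entry from a full installation report.

    Assumes a hypothetical report layout:
      {"id": ..., "failed_component": "iis", "logs": {"iis": "...", ...}}
    """
    failed = report["failed_component"]
    return {
        "report_id": report["id"],
        "failed_component": failed,
        # Keep only the failing sub-installer's log; the logs of the
        # other sub-installers are dropped from the index.
        "log": report["logs"].get(failed, ""),
    }
```

For an IIS failure, the large SQL Server log never reaches the index, so the entry stays small enough to store uncompressed.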


Providing a search interface is a trivial task, as long as your indexing did the right job.


Meet the BACON

Besides being delicious, BACON is our Better Automated Categorizer Of New reports!


BACON architecture:


  • Search Engine
  • Python crawler and indexer
  • MongoDB store
  • Web controller


We used Python for crawling and indexing because it's a very good language for performing operations with strings, regular expressions and dictionaries. And because we like to use the right tool for the right job.


We used MongoDB for the data store because our use case was better suited to a NoSQL document-store engine than to a standard relational database. We do mostly inserts and reads, so we need atomicity but we don't need transactions. We have a lot of data and many searches require full scans. Our data fits documents (with logs from the several sub-installers as attributes) better than rigidly structured tables, and given the size of our data and our search patterns, database joins would be painful. Again, the right tool for the right job (although we could have used other document-store NoSQL engines).
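
To show what "logs as attributes" could look like, here is a small sketch of a report document and of a query filter over one of its nested logs. The document layout and the helper names are assumptions for illustration; only the `$regex`/`$options` operators and the dotted field path are standard MongoDB query syntax.

```python
import re


def make_report_document(report_id, installer_version, logs):
    """Build one document per report; each sub-installer log is an attribute.

    Different reports may carry different sets of logs, and no schema
    change is needed when a new sub-installer appears.
    """
    return {
        "_id": report_id,
        "installer_version": installer_version,
        "logs": dict(logs),
    }


def log_contains_filter(component, text):
    """Build a MongoDB filter matching reports whose log contains `text`.

    The text is escaped so it is matched literally, case-insensitively.
    """
    return {
        f"logs.{component}": {"$regex": re.escape(text), "$options": "i"}
    }


# With pymongo, and assuming `reports` is a Collection object:
#   reports.insert_one(make_report_document(1, "6.0", {"iis": "..."}))
#   reports.find(log_contains_filter("iis", "0x80070005"))
```

An unanchored regex filter like this one generally cannot use an ordinary index, which matches the full-scan search pattern described above.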


We used the Agile Platform for developing the web application that enables users to search MongoDB through a browser without having to learn MongoDB-specific query commands, and where search patterns can be stored for automatically grouping similar reports. The web controller also orchestrates the crawler runs. Building web apps with the Agile Platform is easy and quick, and integrating them with MongoDB was no trouble. Again, the right tool for the job.
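
The idea of stored search patterns can be sketched in a few lines. The pattern names and regular expressions below are made up for this example (loosely based on the failures mentioned earlier) and run locally on plain strings; the real system stores its patterns in the web application and searches MongoDB.

```python
import re
from collections import Counter

# Hypothetical saved patterns: category name -> regular expression.
SAVED_PATTERNS = {
    "port-80-in-use": r"port 80 .*(in use|blocked)",
    "encrypted-disk": r"encrypt(ed|ion)",
}


def categorize(log_text, patterns=SAVED_PATTERNS):
    """Return the name of the first saved pattern that matches the log."""
    for name, pattern in patterns.items():
        if re.search(pattern, log_text, flags=re.IGNORECASE):
            return name
    return "uncategorized"


def largest_groups(log_texts, patterns=SAVED_PATTERNS):
    """Count reports per category, largest group first."""
    counts = Counter(categorize(text, patterns) for text in log_texts)
    return counts.most_common()
```

Sorting the groups by size points at the failure worth fixing first, and whatever lands in "uncategorized" is a candidate for a new saved pattern.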


My precious...


We've been so happy with the performance of the system that we've extended its usage to index not only installation error reports, but all error reports for all Agile Platform components, so BACON has been helping our maintenance team in solving all the problems that are reported to us (many of which you can see by checking the release changelogs).


BACON before the weekend!

In closing, it is worth mentioning that the BACON project was fully built under OutSystems' R&D myFriday initiative. If you've heard of Google's "20 percent time", you may already be guessing what myFriday is all about: we step off our roadmap for a moment and use that time to experiment with things that may lead to improvements to our product or internal processes. It took about 2 days to bring BACON to life. Originally meant to tackle an annoyance, the system became meaningful and useful for the whole R&D team, ultimately allowing us to better serve our customers.


This goes to show that if you are faced with a daily hurdle, and you decide to take some time to fix it rather than live with it, you may end up with a surprisingly powerful solution in a fairly short time. All the flavor with 0% fat.

Quick comment to let you know that the very last sentence of your article gave me the strength to go and solve an annoyance I have in my daily work. I'll be taking the rest of this Friday off to solve it, once and for all! Thanks for the inspiration! :)
Great post.

"because we like to use the right tool for the right job.", amen. :)
Hi Miguel,

You mentioned that you used the Agile Platform to develop the web application to search MongoDB without having to use MongoDB-specific queries. Does that mean the Agile Platform now has a MongoDB connector? Is there a way to access BACON to see how it functions?

Thanks in advance
Hi Jeyanthi,

The connector we've developed is simply an extension that issues the queries to MongoDB and converts the results into structures that can be used in the consumer eSpace. The API the extension exposes for querying doesn't require knowing about MongoDB-specific queries.

At the moment we still haven't polished the connector enough but I'll try to make it available on the forge before March comes.

Is it available for the Java platform, or is it .NET specific?
Hi, just wondered if this MongoDB connector is releasable yet in any form, even if it's only part way there? It'd be really interesting to have a look at.