Category Archives:

Talk Nerd to Me

Using Joins Improved My Code

Database Joining can be complicated and feel time consuming at first but when used properly can actually save time and improve your code.

Before I get into anything in-depth, I’m going say that I didn’t have much background in programming before this internship. This year, I’m a second year Computer Science Undergrad at UCSC who’s had only two quarters of programming experience before being given the task to make a Magento Extension.




Image taken from: http://websiteexpert.com/ecommerce/magento-ecommerce

Magento and its frameworks was already confusing enough let alone have to build a custom table. However, because of a previous experience with working on a WordPress Plugin, I thought it was doable. The biggest hurdle I had to overcome in this project though was trying to save the Offer ID for each product. After much trial and error, I finally was able to create my own table that would store the Offer IDs.

My table basically looked like this:

My Auto Increment Title Magento ProductID Status OfferID
1 Product1 1 null null

The title was the name of the product. So, in order to keep my table up to date, I created a listener that has to check everything that a user edited or created in a new table.

When the user creates a new product, my table looks like this:

My Auto Increment Title Magento ProductID Status OfferID
1 Product1 1 null null
2 Product2 2 null null

Copying data every time the user clicked “save” seems correct, but its not and unnecessary. I thought that this was the proper way of organizing a database until my boss, Michi, told me that I should have been using joins instead of copying information from one table to another. If you look at the second table you can see that the OfferID and the Status values are null. Since this table is meant to keep track of the OfferID and its status, I really didn’t need the Product1 and Product2 row to exist. Ideally, I should have only created a row for a value for OfferID

So I decided to do a Left Join between my OfferID table and Magento’s Product table, while using the Magento Product table as the anchor. By getting rid of the title column, I reduced my table to four columns and was able to use the Magento ProductID column as a bridge to join both tables.

A simplified Magento Product table:

Magento ProductID Title
1 Product1
2 Product2

The new OfferID table with a value in Magento ProductID 1:

My Auto Increment Magento ProductID Status OfferID
1 1 On 123

The Left Join Table with Mangeto Product as anchor:

Title Status OfferID
Product1 On 123
Product2 null null

After I joined the table, everything ran more smoothly. The OfferID table only stores a value when an OfferID is entered now, before it would store a null value when it was not needed. Now that I linked to the Magento Product table, there is no point in having a listener. The process was simplified because the program would check after every save if a change is made. My widget pulls information directly from the Magento Product table, therefore getting rid of my listener.

One of the lessons I learned from this? Had I spent a little more time researching about joins beforehand, I would have saved a lot more time coding the Magento Extension.

How Data Changed Our Engineering Priorities

Launching a new service or product can be daunting to say the least. Before publicly releasing our front-end embeddable widget, we went through many internal iterations. Eventually, we came to a design we thought was “good-enough.” We then proceeded to share this layout to a few private beta customers.

When the first few campaigns that were run did not yield what we considered acceptable norms, there was some healthy debate as to what might be causing potential conversion loss. There were many theories, but at the end of the day, they were simply our opinions.

While this was occurring, our team was working in parallel on our data tracking project that captures user behavior. Initially, this project was designed to simply capture the click and social sharing data that our product employs. Drawing on the lessons learned from those first campaigns, we decided to enhance the existing data tracking to include even more granular data points.

photo credit: Carsten Mueller

At the time of analysis, the unit we used to display advertiser copy resided in an expandable tab that was displayed at the top of a web page. One question that was raised was, “What if the user scrolls down the page?” Another concern was that the initial design did not emphasize the discount percentages and shipping options.

In order to answer these questions, we tracked the following:

• confirm that page element was loaded
• “visibility” of our object (i.e. can the user “see” the tab)
• conversion rate of prior campaigns vs. campaign using latest UX (user-experience) changes (a very simple A/B analysis)

At the end of the revised campaign we discovered that close to 60% of users were not even seeing the tab due to the fact that they were already scrolling down the page before it even loaded and was visible at the top. We are now currently exploring different mechanisms to engage the end-user with promotional material that will allow for greater visibility across the entire page.

One of our core engineering principles at Splurgy is to be “data-centric.” Starting the process of tracking and reporting can be very daunting, but at the end of the day, it helps define the product in a way that benefits not only Splurgy, but our clients as well.

Hadoop and HIVE and Array Map Reduce, wait, what?

One of the first tasks here at Splurgy was to track data. We all drew straws on who was going to lead the initial effort to put something in place to track user behavior. Somehow I ended up with the short end of the stick.

My background is primarily in web development. That means I push around some ruby code and maybe do the odd SQL-based query. When I started looking into HIVE and Hadoop and array map reduce on top of all the other projects that were due seemingly at the same time, I went to my quiet place and cried for a little bit. Then I happened upon a little beta project called “TD” (Treasure Data) that is an add-on for heroku (our cloud-based production deployment environment).

TD promised to take care of all the nasty implementation details and just start logging data. I was somewhat skeptical, but when I looked at the engineering cost of setting up our own data-store from scratch, I became less so. The total implementation from start to end took me 2 days. OK, that’s neat, but does it work? Well, during the first day of installation, I was able to immediately record development data and query it.

Of course we have enhanced and updated our code since our original implementation, but we are extremely impressed by the fact that a task that would traditionally require a dedicated engineering resource can effectively be mitigated by a third-party solution that is “good-enough”. We have already identified certain pain-points that will have to be addressed in the future, but for our current needs, this is a great solution that meets our near-term needs.